TY - JOUR
T1 - Performance of risk prediction for inflammatory bowel disease based on genotyping platform and genomic risk score method
AU - International IBD Genetics Consortium
AU - Chen, Guo Bo
AU - Lee, Sang Hong
AU - Montgomery, Grant W.
AU - Wray, Naomi R.
AU - Visscher, Peter M.
AU - Gearry, Richard B.
AU - Lawrance, Ian C.
AU - Andrews, Jane M.
AU - Bampton, Peter
AU - Mahy, Gillian
AU - Bell, Sally
AU - Walsh, Alissa
AU - Connor, Susan
AU - Sparrow, Miles
AU - Bowdler, Lisa M.
AU - Simms, Lisa A.
AU - Krishnaprasad, Krupa
AU - Radford-Smith, Graham L.
AU - Moser, Gerhard
PY - 2017/8/29
Y1 - 2017/8/29
N2 - Background: Predicting risk of disease from genotypes is increasingly proposed for a variety of diagnostic and prognostic purposes. Genome-wide association studies (GWAS) have identified a large number of genome-wide significant susceptibility loci for Crohn's disease (CD) and ulcerative colitis (UC), two subtypes of inflammatory bowel disease (IBD). Recent studies have demonstrated that a prediction model including only loci that are significantly associated with disease has low predictive power, and that power can be substantially improved using a polygenic approach. Methods: We performed a comprehensive analysis of risk prediction models using large case-control cohorts genotyped for 909,763 GWAS SNPs or 123,437 SNPs on the custom-designed Immunochip, using four prediction methods (polygenic score, best linear genomic prediction, elastic-net regularization and a Bayesian mixture model). We used the area under the curve (AUC) to assess prediction performance for discovery populations with different sample sizes and numbers of SNPs within cross-validation. Results: On average, the Bayesian mixture approach had the best prediction performance. Using cross-validation, we found little difference in prediction performance between GWAS and Immunochip, despite the GWAS array providing 10 times larger effective genome-wide coverage. The prediction performance using Immunochip is largely due to the power of the initial GWAS used for its marker selection and to its low cost, which enabled larger sample sizes. The predictive ability of the genomic risk score based on Immunochip was replicated in external data, with an AUC of 0.75 for CD and 0.70 for UC. CD patients with higher risk scores demonstrated clinical characteristics typically associated with a more severe disease course, including ileal location and earlier age at diagnosis.
Conclusions: Our analyses demonstrate that the power of genomic risk prediction for IBD is mainly due to strongly associated SNPs with considerable effect sizes. Additional SNPs that are only tagged by high-density GWAS arrays, and low-frequency or rare variants over-represented in the high-density regions on the Immunochip, contribute little to prediction accuracy. Although a quantitative assessment of IBD risk for an individual is not currently possible, we show sufficient power of genomic risk scores to stratify IBD risk among individuals at diagnosis.
KW - Case-control study
KW - Complex trait
KW - Crohn's disease
KW - Inflammatory bowel disease
KW - Risk score
KW - SNP array
KW - Ulcerative colitis
UR - http://www.scopus.com/inward/record.url?scp=85028458127&partnerID=8YFLogxK
UR - http://purl.org/au-research/grants/ARC/DP160102126
UR - http://purl.org/au-research/grants/ARC/FT160100229
U2 - 10.1186/s12881-017-0451-2
DO - 10.1186/s12881-017-0451-2
M3 - Article
C2 - 28851283
AN - SCOPUS:85028458127
SN - 1471-2350
VL - 18
JO - BMC Medical Genetics
JF - BMC Medical Genetics
IS - 1
M1 - 94
ER -