Genome-wide Modeling of Polygenic Risk Score in Colorectal Cancer Risk
Language English Country United States Medium print-electronic
Document type journal articles
Grant support
10589: Cancer Research UK - United Kingdom
P30 ES010126: NIEHS NIH HHS - United States
U10 CA037429: NCI NIH HHS - United States
R01 CA195789: NCI NIH HHS - United States
24390: Cancer Research UK - United Kingdom
U01 CA137088: NCI NIH HHS - United States
R01 CA223498: NCI NIH HHS - United States
UM1 CA186107: NCI NIH HHS - United States
001: World Health Organization - International
P30 CA047904: NCI NIH HHS - United States
UG1 CA189974: NCI NIH HHS - United States
19167: Cancer Research UK - United Kingdom
R01 CA206279: NCI NIH HHS - United States
T32 CA163177: NCI NIH HHS - United States
U01 CA206110: NCI NIH HHS - United States
P30 CA008748: NCI NIH HHS - United States
U01 CA167551: NCI NIH HHS - United States
P20 CA252728: NCI NIH HHS - United States
K07 CA188142: NCI NIH HHS - United States
PubMed: 32758450
PubMed Central: PMC7477007
DOI: 10.1016/j.ajhg.2020.07.006
PII: S0002-9297(20)30236-6
- Keywords
- cancer risk prediction, colorectal cancer, machine learning, polygenic risk score
- MeSH
- Asian People genetics MeSH
- Bayes Theorem MeSH
- Genome-Wide Association Study MeSH
- Genetic Predisposition to Disease * MeSH
- Genome, Human genetics MeSH
- Risk Assessment * MeSH
- Polymorphism, Single Nucleotide genetics MeSH
- Colorectal Neoplasms epidemiology genetics pathology MeSH
- Middle Aged MeSH
- Humans MeSH
- Multifactorial Inheritance genetics MeSH
- Risk Factors MeSH
- Aged MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
Accurate colorectal cancer (CRC) risk prediction models are critical for identifying individuals at low and high risk of developing CRC: those in a high-risk group can be offered targeted screening and interventions to address their risk, and those in a low-risk group can avoid unnecessary screening and interventions. Because thousands of genetic variants likely contribute to CRC risk, it is clinically important to investigate whether these variants can be used jointly for CRC risk prediction. In this paper, we derived and compared different approaches to generating predictive polygenic risk scores (PRS) from genome-wide association studies (GWASs) including 55,105 CRC-affected case subjects and 65,079 control subjects of European ancestry. We built the PRS in three ways, using (1) 140 previously identified and validated CRC loci; (2) SNP selection based on linkage disequilibrium (LD) clumping followed by machine-learning approaches; and (3) LDpred, a Bayesian approach for genome-wide risk prediction. We tested the PRS in an independent cohort of 101,987 individuals with 1,699 CRC-affected case subjects. The discriminatory accuracy, calculated by the age- and sex-adjusted area under the receiver operating characteristic curve (AUC), was highest for the LDpred-derived PRS (AUC = 0.654), which included nearly 1.2 million genetic variants (with the proportion of causal genetic variants for CRC assumed to be 0.003), whereas the PRS of the 140 known variants identified from GWASs had the lowest AUC (AUC = 0.629). Based on the LDpred-derived PRS, we are able to identify 30% of individuals without a family history as having risk for CRC similar to those with a family history of CRC, whereas the PRS based on known GWAS variants identified only the top 10% as having a similar relative risk. About 90% of these individuals have no family history and would have been considered average risk under current screening guidelines, but might benefit from earlier screening. The developed PRS offers a way for risk-stratified CRC screening and other targeted interventions.
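As a rough illustration of the quantities named in the abstract, the sketch below computes a PRS as a weighted sum of risk-allele dosages and an unadjusted AUC as the probability that a randomly chosen case scores higher than a randomly chosen control. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the three variant weights, and all scores are hypothetical toy values, the AUC here is not age- and sex-adjusted, and no LD clumping, machine learning, or LDpred reweighting is performed.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies) for one individual.

    `weights` stands in for per-variant effect sizes (for example log odds
    ratios from a GWAS); the values used below are made up for illustration.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(case_scores, control_scores):
    """Unadjusted AUC: fraction of case/control pairs in which the case
    has the higher score, with ties counted as half a win."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical effect sizes for three variants (not taken from the paper).
toy_weights = [0.10, 0.05, 0.20]

# One individual carrying 2, 1, and 0 risk alleles at the three variants.
toy_score = polygenic_risk_score([2, 1, 0], toy_weights)

# Toy PRS values for two cases and two controls.
toy_auc = auc([0.25, 0.40], [0.10, 0.25])
print(toy_auc)  # 0.875 (3.5 winning pairs out of 4)
```

The pairwise AUC is quadratic in sample size, so for a cohort of the size described above a rank-based computation would be the practical choice; the pairwise form is used here only because it states the definition directly.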
Behavioral and Epidemiology Research Group American Cancer Society Atlanta GA 30303 USA
Center for Applied Genomics Children's Hospital of Philadelphia Philadelphia PA 19104 USA
Center for Public Health Genomics University of Virginia Charlottesville VA 22903 USA
CIBER Epidemiología y Salud Pública University of León León 24071 Spain
Department of Biomedical Informatics Vanderbilt University Medical Center Nashville TN 37232 USA
Department of Cardiovascular Medicine Mayo Clinic Rochester MN 55905 USA
Department of Family Medicine University of Virginia Charlottesville VA 22903 USA
Department of General Surgery University Hospital Rostock Rostock 18051 Germany
Department of Health Science Research Mayo Clinic Scottsdale AZ 85260 USA
Department of Health Sciences Research Mayo Clinic Rochester MN 55905 USA
Department of Internal Medicine University of Utah Salt Lake City UT 84132 USA
Department of Medicine University of Washington Medical Center Seattle WA 98195 USA
Department of Public Health and Primary Care University of Cambridge Cambridge CB2 0SR UK
Department of Surgery University of Virginia Health System Charlottesville VA 22903 USA
Division of Cancer Epidemiology German Cancer Research Center Hamburg 20246 Germany
Division of Research Kaiser Permanente Northern California Oakland CA 94612 USA
Institute for Health Research Kaiser Permanente Colorado Denver CO 80014 USA
Institute of Cancer Research Department of Medicine 1 Medical University Vienna Vienna 1090 Austria
Institute of Environmental Medicine Karolinska Institutet Stockholm 17177 Sweden
Kaiser Permanente Washington Research Institute Seattle WA 98101 USA
Leeds Institute of Cancer and Pathology University of Leeds Leeds LS2 9JT UK
Memorial University of Newfoundland Discipline of Genetics St John's NL A1B 3R7 Canada
Ontario Institute for Cancer Research Toronto ON M5G0A3 Canada
Public Health Sciences Division Fred Hutchinson Cancer Research Center Seattle WA 98109 USA
School of Public Health Imperial College London London SW7 2AZ UK
School of Public Health Oregon Health and Science University Portland OR 97239 USA
Service de Génétique Médicale Centre Hospitalier Universitaire Nantes Nantes 44093 France
SWOG Statistical Center Fred Hutchinson Cancer Research Center Seattle WA 98109 USA
Translational Genomics Research Institute An Affiliate of City of Hope Phoenix AZ 85003 USA
University of Hawaii Cancer Center Honolulu HI 96813 USA
University of Southern California Preventative Medicine Los Angeles CA 90089 USA