Machine learning in Alzheimer's disease genetics
Language English Country England, United Kingdom Medium electronic
Document type journal articles
Grant support
UK DRI-3206
RCUK | Medical Research Council (MRC)
PubMed
40691194
PubMed Central
PMC12280214
DOI
10.1038/s41467-025-61650-z
PII: 10.1038/s41467-025-61650-z
- MeSH
- Algorithms MeSH
- Alzheimer Disease * genetics MeSH
- Genome-Wide Association Study MeSH
- Genetic Predisposition to Disease MeSH
- Polymorphism, Single Nucleotide MeSH
- Humans MeSH
- Neural Networks, Computer MeSH
- GTPase-Activating Proteins genetics MeSH
- Machine Learning * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Substance names
- GTPase-Activating Proteins MeSH
Traditional statistical approaches have advanced our understanding of the genetics of complex diseases, yet are limited to linear additive models. Here we applied machine learning (ML) to genome-wide data from 41,686 individuals in the largest European consortium on Alzheimer's disease (AD) to investigate the effectiveness of various ML algorithms in replicating known findings, discovering novel loci, and predicting individuals at risk. We utilised Gradient Boosting Machines (GBMs), biological pathway-informed Neural Networks (NNs), and Model-based Multifactor Dimensionality Reduction (MB-MDR) models. ML approaches successfully captured all genome-wide significant genetic variants identified in the training set and 22% of associations from larger meta-analyses. They highlight 6 novel loci which replicate in an external dataset, including variants which map to ARHGAP25, LY6H, COG7, SOD1 and ZNF597. They further identify a novel association in AP4E1, refining the genetic landscape of the known SPPL2A locus. Our results demonstrate that machine learning methods can achieve predictive performance comparable to classical approaches in genetic epidemiology and have the potential to uncover novel loci that remain undetected by traditional GWAS. These insights provide a complementary avenue for advancing the understanding of AD genetics.
ACE Alzheimer Center Barcelona Universitat Internacional de Catalunya Barcelona Spain
Alzheimer Research Center and Memory Clinic Andalusian Institute for Neuroscience Málaga Spain
Alzheimer's Centre Reina Sofia CIEN Foundation ISCIII 28031 Madrid Spain
BIO3 Systems Genetics GIGA R Molecular and Computational Biology University of Liege Liege Belgium
BIO3 Systems Medicine Department of Human Genetics KU Leuven Leuven Belgium
Centre for Artificial Intelligence in Precision Medicine University of Oxford Oxford United Kingdom
Centro de Biología Molecular Severo Ochoa Madrid Spain
CHU de Bordeaux Pole santé publique Bordeaux France
CHUV Old Age Psychiatry Department of Psychiatry Lausanne Switzerland
Clinic of Neurology UH Alexandrovska Medical University Sofia Sofia Bulgaria
Complex Genetics of Alzheimer's Disease Group VIB Center for Molecular Neurology VIB Antwerp Belgium
Department of Biomedical Sciences University of Antwerp Antwerp Belgium
Department of Clinical and Experimental Sciences University of Brescia Brescia Italy
Department of Clinical Biochemistry Copenhagen University Hospital Rigshospitalet Copenhagen Denmark
Department of Clinical Genetics VU University Medical Centre Amsterdam The Netherlands
Department of Clinical Medicine University of Copenhagen Copenhagen Denmark
Department of Clinical Sciences and Community Health University of Milan 20122 Milan Italy
Department of Epidemiology Erasmus MC Rotterdam The Netherlands
Department of Geriatric Psychiatry University Hospital of Psychiatry Zürich Zürich Switzerland
Department of Hematology and Stem Cell Transplant Vito Fazzi Hospital Lecce Italy
Department of Medicine and Surgery Unit of Neurology University Hospital of Parma Parma Italy
Department of Neurology Bordeaux University Hospital Bordeaux France
Department of Neurology Erasmus MC Rotterdam The Netherlands
Department of Neurology Hospital Universitario Donostia San Sebastian Spain
Department of Neurology Medical University Sofia Sofia Bulgaria
Department of Neuroscience Rita Levi Montalcini University of Torino Torino Italy
Department of Neuroscience Università Cattolica del Sacro Cuore Rome Italy
Department of Psychiatry and Psychotherapy University Medical Center Goettingen Goettingen Germany
Department of Radiology and Nuclear Medicine Erasmus MC Rotterdam The Netherlands
Dept of Biomedical Surgical and Dental Sciences University of Milan Milan Italy
Dept of Public Health and Caring Sciences Geriatrics Uppsala University Uppsala Sweden
Estudios en Neurociencias y Sistemas Complejos CONICET HEC UNAJ Buenos Aires Argentina
Faculty of Medicine University of Lisbon Lisbon Portugal
Fundació Docència i Recerca MútuaTerrassa Terrassa Barcelona Spain
Geriatric Unit Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico 20122 Milan Italy
German Center for Neurodegenerative Diseases Berlin Germany
German Center for Neurodegenerative Diseases Bonn Germany
German Center for Neurodegenerative Diseases Goettingen Germany
German Center for Neurodegenerative Diseases Magdeburg Germany
German Center for Neurodegenerative Diseases Munich Germany
Institut de Recerca Biomedica de Lleida Lleida Spain
Institute for Stroke and Dementia Research University Hospital LMU Munich Munich Germany
Institute for Urban Public Health University Hospital of University Duisburg Essen Essen Germany
Institute of Biomedicine University of Eastern Finland Kuopio Finland
Institute of Clinical Medicine Neurology University of Eastern Finland Kuopio Finland
Institute of Clinical Medicine University of Oslo Oslo Norway
Institute of Neurology Catholic University of the Sacred Heart Rome Italy
Instituto de Investigacion Sanitaria 'Hospital la Paz' Madrid Spain
Instituto de Investigación Sanitaria del Principado de Asturias Asturias Spain
International Clinical Research Center St Anne's University Hospital Brno Brno Czech Republic
IRCCS Fondazione Don Carlo Gnocchi Florence Italy
Krembil Brain Institute University Health Network Toronto Ontario Canada
Laboratorio de Genética Hospital Universitario Central de Asturias Oviedo Spain
Laboratory of Neuropsychiatry IRCCS Santa Lucia Foundation Rome Italy
Medical Science Department iBiMED Aveiro Portugal
Munich Cluster for Systems Neurology Munich Germany
Networking Research Center on Neurodegenerative Diseases Instituto de Salud Carlos III Madrid Spain
Neurodegenerative Diseases Unit Fondazione IRCCS Ca' Granda Ospedale Policlinico Milan Italy
Neurological Tissue Bank Biobank Hospital Clinic FRCB IDIBAPS Barcelona Spain
Neurology Service Marqués de Valdecilla University Hospital Santander Spain
Neurology Unit IRCCS Fondazione Policlinico Universitario A Gemelli Rome Italy
Neurology Unit San Gerardo hospital Monza and University of Milano Bicocca Milan Italy
Neuroscience Center Zurich University of Zurich and ETH Zurich Zurich Switzerland
Neurosciences Area Instituto Biogipuzkoa San Sebastian Spain
NORMENT Centre Division of Mental Health and Addiction Oslo University Hospital Oslo Norway
Nuffield Department of Population Health University of Oxford Oxford United Kingdom
Old Age Psychiatry Department of Psychiatry Lausanne University Hospital Lausanne Switzerland
School of Medicine Cardiff University Cardiff UK
Translational Health Sciences Bristol Medical School University of Bristol Bristol UK
Unit of Neurology 5 Neuropathology Fondazione IRCCS Istituto Neurologico Carlo Besta Milan Italy
Unitat de Genètica Molecular Institut de Biomedicina de València CSIC Valencia Spain
Unitat Trastorns Cognitius Hospital Universitari Santa Maria de Lleida Lleida Spain
Universidad Autónoma de Madrid Madrid Spain
Université de Paris EA 4468 APHP Hôpital Broca Paris France
Université Paris Saclay CEA Centre National de Recherche en Génomique Humaine 91057 Evry France
University Bordeaux Inserm Bordeaux Population Health Research Center Bordeaux France
Zurich Center for Integrative Human Physiology University of Zurich Zurich Switzerland