Delineation of functionally essential protein regions for 242 neurodevelopmental genes
Language: English; Country: England, United Kingdom; Medium: print
Document type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Grant support
R01 NS112499
NINDS NIH HHS - United States
PubMed
36256779
PubMed Central
PMC9924913
DOI
10.1093/brain/awac381
PII: 6763201
- Keywords
- bioinformatics, genetics, neurodevelopmental disorder
- MeSH
- Child MeSH
- Genetic Testing MeSH
- Humans MeSH
- Intellectual Disability * genetics MeSH
- Mutation, Missense MeSH
- Mutation genetics MeSH
- Neurodevelopmental Disorders * genetics MeSH
- Check Tag
- Child MeSH
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
Neurodevelopmental disorders (NDDs), including severe paediatric epilepsy, autism and intellectual disabilities, are heterogeneous conditions in which clinical genetic testing can often identify a pathogenic variant. For many of them, genetic therapies will be tested in clinical trials in this or the coming years. In contrast to first-generation symptomatic treatments, the new disease-modifying precision medicines require a genetic test-informed diagnosis before a patient can be enrolled in a clinical trial. However, even in 2022, most identified genetic variants in NDD genes are 'variants of uncertain significance'. To safely enrol patients in precision medicine clinical trials, it is important to increase our knowledge about which regions in NDD-associated proteins can 'tolerate' missense variants and which ones are 'essential' and will cause an NDD when mutated. In addition, knowledge about functionally indispensable regions in the 3D structure context of proteins can also provide insights into the molecular mechanisms of disease variants. We developed a novel consensus approach that overlays evolutionary and population-based genomic scores to identify 3D essential sites (Essential3D) on protein structures. After extensive benchmarking of AlphaFold-predicted and experimentally solved protein structures, we generated the currently largest expert-curated protein structure set for 242 NDDs and identified 14 377 Essential3D sites across 189 gene-disorder-associated proteins. We demonstrate that the consensus annotation of Essential3D sites improves prioritization of disease mutations over single annotations. The identified Essential3D sites were enriched for functional features such as intermembrane regions or active sites and revealed key inter-molecule interactions in protein complexes that were otherwise not annotated.
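The idea of overlaying several per-residue scores on a 3D structure can be illustrated with a small sketch. This is not the authors' Essential3D pipeline, whose scores, radius and cut-offs are not given in the abstract. The function names, the 8 Å neighbourhood radius, the 0.7 thresholds and the toy coordinates and scores below are all assumptions made for illustration. The sketch averages each score over the residues that lie close together in space and flags a residue only when every averaged score passes its threshold.

```python
import math

def spatial_neighbours(coords, radius):
    """For each residue, list the residues (itself included) whose
    C-alpha atom lies within `radius` angstroms."""
    return [
        [j for j, b in enumerate(coords) if math.dist(a, b) <= radius]
        for a in coords
    ]

def consensus_sites(coords, score_tracks, thresholds, radius=8.0):
    """Return indices of residues whose neighbourhood-averaged score
    meets the threshold in EVERY track (a simple consensus rule).

    coords       -- list of (x, y, z) C-alpha coordinates
    score_tracks -- dict: track name -> per-residue scores (higher = more constrained)
    thresholds   -- dict: track name -> minimum averaged score
    radius       -- hypothetical neighbourhood radius, not taken from the paper
    """
    sites = []
    for i, close in enumerate(spatial_neighbours(coords, radius)):
        passes_all = all(
            sum(scores[j] for j in close) / len(close) >= thresholds[name]
            for name, scores in score_tracks.items()
        )
        if passes_all:
            sites.append(i)
    return sites

# Toy data (made up): five residues, two spatial clusters.
coords = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (30, 0, 0), (35, 0, 0)]
score_tracks = {
    "evolutionary": [0.9, 0.8, 0.9, 0.9, 0.2],
    "population":   [0.9, 0.9, 0.7, 0.1, 0.2],
}
thresholds = {"evolutionary": 0.7, "population": 0.7}

sites = consensus_sites(coords, score_tracks, thresholds)
print(sites)  # [0, 1, 2]
```

Residue 3 has a high evolutionary score on its own, but its spatial neighbourhood does not pass either threshold, so it is not flagged. Requiring agreement between tracks is what makes this a consensus rule in contrast to a single annotation.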
Using the currently largest autism, developmental disorder and epilepsy exome sequencing studies, including >360 000 NDD patients and population controls, we found that missense variants at Essential3D sites are 8-fold enriched in patients. In summary, we developed a comprehensive protein structure set for 242 NDDs and identified 14 377 Essential3D sites in these. All data are available at https://es-ndd.broadinstitute.org for interactive visual inspection to enhance variant interpretation and the development of mechanistic hypotheses for 242 NDD genes. The provided resources will enhance clinical variant interpretation and in silico drug target development for NDD-associated genes and encoded proteins.
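A fold enrichment of this kind is commonly expressed as an odds ratio from a 2x2 table of variant counts. The abstract does not state which statistic was used, so the sketch below is an assumption: it computes an odds ratio with a Woolf (log-scale) 95% confidence interval. The counts are invented to give a round number and are not taken from the study.

```python
import math

def odds_ratio(case_in, case_out, ctrl_in, ctrl_out):
    """Odds ratio for missense variants falling inside vs outside a set
    of sites, in patients (case_*) vs population controls (ctrl_*)."""
    return (case_in * ctrl_out) / (case_out * ctrl_in)

def woolf_ci(case_in, case_out, ctrl_in, ctrl_out, z=1.96):
    """Approximate confidence interval on the log odds-ratio scale
    (Woolf method); z=1.96 gives a 95% interval."""
    log_or = math.log(odds_ratio(case_in, case_out, ctrl_in, ctrl_out))
    se = math.sqrt(1 / case_in + 1 / case_out + 1 / ctrl_in + 1 / ctrl_out)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts, chosen only for illustration:
# 80 patient variants at the sites, 120 elsewhere;
# 20 control variants at the sites, 240 elsewhere.
enrichment = odds_ratio(80, 120, 20, 240)
low, high = woolf_ci(80, 120, 20, 240)
print(enrichment)  # 8.0
```

With real data, cells containing zero counts would need a continuity correction or an exact test, which this sketch leaves out.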
Analytic and Translational Genetics Unit Massachusetts General Hospital Boston MA 02114 USA
Cologne Center for Genomics University of Cologne 50923 Köln Germany
Epilepsy Center Neurological Institute Cleveland Clinic Cleveland OH 44195 USA
Genomic Medicine Institute Lerner Research Institute Cleveland Clinic Cleveland OH 44106 USA
Luxembourg Centre for Systems Biomedicine University of Luxembourg 4365 Esch sur Alzette Luxembourg
Stanley Center for Psychiatric Research Broad Institute of MIT and Harvard Cambridge MA 02142 USA
The Paediatric Neurosciences Research Group Royal Hospital for Children Glasgow G12 8QQ UK