Delineation of functionally essential protein regions for 242 neurodevelopmental genes

2023 Feb 13;146(2):519-533.

Language: English; Country: England, Great Britain; Medium: print

Document type: journal article; Research Support, N.I.H., Extramural; grant-supported work

Persistent link: https://www.medvik.cz/link/pmid36256779

Grant support
R01 NS112499 NINDS NIH HHS - United States

Neurodevelopmental disorders (NDDs), including severe paediatric epilepsy, autism and intellectual disabilities, are heterogeneous conditions in which clinical genetic testing can often identify a pathogenic variant. For many of them, genetic therapies will enter clinical trials this year or in the coming years. In contrast to first-generation symptomatic treatments, the new disease-modifying precision medicines require a diagnosis informed by a genetic test before a patient can be enrolled in a clinical trial. However, even in 2022, most genetic variants identified in NDD genes are 'variants of uncertain significance'. To enrol patients safely in precision medicine clinical trials, it is important to learn which regions in NDD-associated proteins can 'tolerate' missense variants and which are 'essential' and will cause an NDD when mutated. In addition, knowing which regions are functionally indispensable in the context of a protein's 3D structure can provide insights into the molecular mechanisms of disease variants. We developed a novel consensus approach that overlays evolutionary and population-based genomic scores to identify 3D essential sites (Essential3D) on protein structures. After extensive benchmarking of AlphaFold-predicted and experimentally solved protein structures, we generated the largest expert-curated protein structure set currently available for 242 NDDs and identified 14 377 Essential3D sites across proteins associated with 189 gene disorders. We demonstrate that the consensus annotation of Essential3D sites improves prioritization of disease mutations over single annotations. The identified Essential3D sites were enriched for functional features such as intermembrane regions or active sites, and they revealed key intermolecular interactions in protein complexes that were otherwise not annotated.
Using the largest exome sequencing studies of autism, developmental disorders and epilepsies currently available, including >360 000 NDD patients and population controls, we found that missense variants at Essential3D sites are 8-fold enriched in patients. In summary, we developed a comprehensive protein structure set for 242 NDDs and identified 14 377 Essential3D sites in it. All data are available at https://es-ndd.broadinstitute.org for interactive visual inspection, to support variant interpretation and the development of mechanistic hypotheses for 242 NDD genes. The provided resources will enhance clinical variant interpretation and in silico drug target development for NDD-associated genes and their encoded proteins.
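As a rough illustration of what a consensus overlay of two score types on a structure could look like, the sketch below flags a residue only when both an evolutionary score and a population-based score, each averaged over its spatial neighbourhood, exceed a cutoff. This is not the authors' Essential3D implementation: the function name `essential_sites`, the 8 Å radius, the 0.8 cutoffs and the assumption that both scores are scaled to [0, 1] with higher meaning more constrained are all hypothetical choices made for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def essential_sites(coords, evo_score, pop_score,
                    radius=8.0, evo_cut=0.8, pop_cut=0.8):
    """Return indices of residues flagged by BOTH score types.

    coords    -- one (x, y, z) tuple per residue, e.g. C-alpha positions
    evo_score -- hypothetical per-residue evolutionary score in [0, 1]
    pop_score -- hypothetical per-residue population-constraint score in [0, 1]
    radius, evo_cut, pop_cut -- illustrative values, not taken from the paper
    """
    n = len(coords)
    flagged = []
    for i in range(n):
        # Spatial neighbourhood of residue i (always includes i itself).
        neighbours = [j for j in range(n)
                      if dist(coords[i], coords[j]) <= radius]
        evo_mean = sum(evo_score[j] for j in neighbours) / len(neighbours)
        pop_mean = sum(pop_score[j] for j in neighbours) / len(neighbours)
        # Consensus rule: both neighbourhood averages must pass their cutoff.
        if evo_mean >= evo_cut and pop_mean >= pop_cut:
            flagged.append(i)
    return flagged


# Toy input: two adjacent constrained residues and one distant residue
# that is conserved but tolerant of population variation.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
toy_evo = [0.9, 0.9, 0.9]
toy_pop = [0.9, 0.9, 0.1]
toy_sites = essential_sites(toy_coords, toy_evo, toy_pop)
```

Requiring agreement between the two score types, rather than taking either alone, mirrors the abstract's statement that the consensus annotation prioritizes disease mutations better than single annotations; the actual scores and combination rule are described in the paper itself.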

