Comprehensive characterization of amino acid positions in protein structures reveals molecular effect of missense variants

2020 Nov 10; 117 (45): 28201-28211. [epub] 20201026

Language: English. Country: United States. Medium: print-electronic

Document type: journal article, grant-supported work

Persistent link   https://www.medvik.cz/link/pmid33106425

Interpretation of the colossal number of genetic variants identified from sequencing applications is one of the major bottlenecks in clinical genetics, with the inference of the effect of amino acid-substituting missense variations on protein structure and function being especially challenging. Here we characterize the three-dimensional (3D) amino acid positions affected in pathogenic and population variants from 1,330 disease-associated genes using over 14,000 experimentally solved human protein structures. By measuring the statistical burden of variations (i.e., point mutations) from all genes on 40 3D protein features, accounting for the structural, chemical, and functional context of the variations' positions, we identify features that are generally associated with pathogenic and population missense variants. We then perform the same amino acid-level analysis individually for 24 protein functional classes, which reveals unique characteristics of the positions of the altered amino acids: We observe up to 46% divergence of the class-specific features from the general characteristics obtained by the analysis on all genes, which is consistent with the structural diversity of essential regions across different protein classes. We demonstrate that the function-specific 3D features of the variants match the readouts of mutagenesis experiments for BRCA1 and PTEN, and positively correlate with an independent set of clinically interpreted pathogenic and benign missense variants. Finally, we make our results available through a web server to foster accessibility and downstream research. Our findings represent a crucial step toward translational genetics, from highlighting the impact of mutations on protein structure to rationalizing the variants' pathogenicity in terms of the perturbed molecular mechanisms.
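The "statistical burden" of variants on a 3D feature can be pictured as a 2x2 enrichment test: do pathogenic variants fall on positions carrying a feature more often than population variants do? The sketch below is only an illustration of that idea. The abstract does not name the test used, so the choice of an odds ratio with a two-sided Fisher's exact test is an assumption, and the function names and counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    Rows: pathogenic / population variants (assumed layout).
    Columns: position has the feature / lacks the feature.
    """
    if b == 0 or c == 0:
        return float("inf")  # simplification: no continuity correction
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the same 2x2 table.

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 "feature" positions fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance so ties with the observed table are counted.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical counts for one feature (e.g. "buried residue"):
    # pathogenic: 3 with feature, 1 without; population: 1 with, 3 without.
    table = (3, 1, 1, 3)
    print("odds ratio:", odds_ratio(*table))
    print("p-value:", fisher_exact_two_sided(*table))
```

An odds ratio above 1 would indicate that the feature is enriched among pathogenic variants, below 1 among population variants. Repeating the test per feature and per protein class would, under the same assumption, give the kind of class-specific comparison the abstract describes.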
