Widespread evolutionary crosstalk among protein domains in the context of multi-domain proteins
Language: English
Country: United States
Media: electronic-ecollection
Document type: Journal Article; Research Support, Non-U.S. Gov't
PubMed: 30169546
PubMed Central: PMC6118372
DOI: 10.1371/journal.pone.0203085
PII: PONE-D-18-15590
MeSH
- Models, Biological *
- Information Theory
- Evolution, Molecular *
- Protein Domains *
- Proteins (genetics, metabolism)
Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
Names of Substances
- Proteins
Domains are distinct units within proteins that typically can fold independently into recognizable three-dimensional structures to facilitate their functions. The structural and functional independence of protein domains is reflected by their apparent modularity in the context of multi-domain proteins. In this work, we examined the evolution of domain sequences co-occurring within multi-domain proteins to determine whether it proceeds independently or in a coordinated manner. We used continuous information theory measures to assess the extent of correlated mutations among domains in multi-domain proteins from organisms across the tree of life. In all multi-domain architectures we examined, domains co-occurring within protein sequences had to some degree undergone concerted evolution. This finding challenges the notion of complete modularity and independence of protein domains, providing a new perspective on the evolution of protein sequence and function.
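To make the idea of measuring correlated mutations concrete, the sketch below computes the standard discrete mutual information (in bits) between two alignment columns. This is an illustration only: it is a simpler, swapped-in measure, not the continuous information theory measures used in the study, and the function name and toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Discrete mutual information (bits) between two alignment columns.

    col_a and col_b hold one residue per sequence, in the same sequence
    order, so position i in both columns comes from the same protein.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_a = Counter(col_a)                # marginal counts, column A
    count_b = Counter(col_b)                # marginal counts, column B
    count_ab = Counter(zip(col_a, col_b))   # joint counts of residue pairs

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with counts to limit rounding error
        mi += p_ab * log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy columns, not data from the study.
column_domain_1 = "AAAAGGGG"
column_coupled = "LLLLVVVV"      # changes in step with column_domain_1
column_uncoupled = "LLVVLLVV"    # varies independently of column_domain_1

print(mutual_information(column_domain_1, column_coupled))
print(mutual_information(column_domain_1, column_uncoupled))
```

In this toy case the perfectly coupled pair carries one bit of shared information and the independent pair carries none. On real alignments, raw mutual information is also inflated by shared phylogeny and small sample sizes, so it would need correction before being interpreted as evidence of coevolution.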