Random protein sequences can form defined secondary structures and are well-tolerated in vivo
Language: English. Country: England, United Kingdom. Medium: electronic
Document type: journal article, grant-supported work
PubMed: 29133927
PubMed Central: PMC5684393
DOI: 10.1038/s41598-017-15635-8
PII: 10.1038/s41598-017-15635-8
- MeSH
- Circular Dichroism MeSH
- Databases, Protein MeSH
- Datasets as Topic MeSH
- Models, Molecular * MeSH
- Nuclear Magnetic Resonance, Biomolecular MeSH
- Peptide Library * MeSH
- Protein Aggregates MeSH
- Recombinant Proteins / chemistry, isolation & purification, toxicity MeSH
- Solubility MeSH
- Protein Folding MeSH
- Protein Structure, Secondary * MeSH
- Amino Acid Sequence MeSH
- Computational Biology MeSH
- Publication type
- Journal Article MeSH
- Grant-Supported Work MeSH
- Substance names
- Peptide Library * MeSH
- Protein Aggregates MeSH
- Recombinant Proteins MeSH
The protein sequences found in nature represent a tiny fraction of the potential sequences that could be constructed from the 20-amino-acid alphabet. To help define the properties that shaped proteins to stand out from the space of possible alternatives, we conducted a systematic computational and experimental exploration of random (unevolved) sequences in comparison with biological proteins. In our study, combinations of secondary structure, disorder, and aggregation predictions are accompanied by experimental characterization of selected proteins. We found that the overall secondary structure and physicochemical properties of random and biological sequences are very similar. Moreover, random sequences can be well-tolerated by living cells. Contrary to early hypotheses about the toxicity of random and disordered proteins, we found that random sequences with high disorder have low aggregation propensity (unlike random sequences with high structural content) and were particularly well-tolerated. This direct structure content/aggregation propensity dependence differentiates random and biological proteins. Our study indicates that while random sequences can be both structured and disordered, the properties of the latter make them better suited as progenitors (in both in vivo and in vitro settings) for further evolution of complex, soluble, three-dimensional scaffolds that can perform specific biochemical tasks.
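As a rough illustration of the kind of computational comparison the abstract describes, the sketch below draws random sequences from the 20-amino-acid alphabet and computes two simple physicochemical descriptors. It is a minimal sketch under stated assumptions, not the study's pipeline: uniform residue frequencies, the sequence length, the library size, and the helper names (`random_sequence`, `mean_hydropathy`, `net_charge`) are all chosen here for illustration. The hydropathy values are the Kyte-Doolittle index.

```python
import random

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def random_sequence(length, rng):
    """Draw an unevolved sequence; uniform residue frequencies are an assumption."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over all residues of the sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def net_charge(seq):
    """Crude net charge: (K + R) minus (D + E); ignores histidine, termini, and pH."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the hypothetical library is reproducible
    library = [random_sequence(100, rng) for _ in range(1000)]
    average = sum(mean_hydropathy(s) for s in library) / len(library)
    print(f"mean hydropathy across library: {average:.2f}")
```

The same descriptors could be computed for a set of biological sequences and the two distributions compared. Predictions of secondary structure, disorder, and aggregation would need dedicated tools and are outside this sketch.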