Random protein sequences can form defined secondary structures and are well-tolerated in vivo

. 2017 Nov 13 ; 7 (1) : 15449. [epub] 20171113

Language: English. Country: England, Great Britain. Medium: electronic

Document type: journal article, grant-supported work

Persistent link: https://www.medvik.cz/link/pmid29133927
Links

PubMed 29133927
PubMed Central PMC5684393
DOI 10.1038/s41598-017-15635-8
PII: 10.1038/s41598-017-15635-8
Knihovny.cz e-resources

The protein sequences found in nature represent a tiny fraction of the potential sequences that could be constructed from the 20-amino-acid alphabet. To help define the properties that shaped proteins to stand out from the space of possible alternatives, we conducted a systematic computational and experimental exploration of random (unevolved) sequences in comparison with biological proteins. In our study, combined predictions of secondary structure, disorder, and aggregation are accompanied by experimental characterization of selected proteins. We found that the overall secondary structure and physicochemical properties of random and biological sequences are very similar. Moreover, random sequences can be well tolerated by living cells. Contrary to early hypotheses about the toxicity of random and disordered proteins, we found that random sequences with high disorder have low aggregation propensity (unlike random sequences with high structural content) and were particularly well tolerated. This direct dependence of aggregation propensity on structure content differentiates random from biological proteins. Our study indicates that while random sequences can be both structured and disordered, the properties of the latter make them better suited as progenitors (in both in vivo and in vitro settings) for further evolution of complex, soluble, three-dimensional scaffolds that can perform specific biochemical tasks.
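
To give a feel for the scale of the sequence space mentioned in the abstract, the following is a minimal Python sketch, not the authors' pipeline. It counts the sequences available at a given length, draws a few random sequences, and scores each with the mean Kyte-Doolittle hydropathy as one simple physicochemical property. The length of 100 residues, the uniform amino acid composition, the fixed seed, and all function names are assumptions made for illustration only; the study itself relied on dedicated secondary structure, disorder, and aggregation predictors.

```python
import math
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_space_log10(length, alphabet_size=20):
    """Return log10 of the number of possible sequences of this length."""
    return length * math.log10(alphabet_size)


def random_sequence(length, rng):
    """Draw one sequence with uniform composition (an assumption)."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over the whole sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical parameters: five sequences of 100 residues, fixed seed.
rng = random.Random(0)
library = [random_sequence(100, rng) for _ in range(5)]

for seq in library:
    print(seq[:20] + "...", round(mean_hydropathy(seq), 3))

print(f"log10(possible 100-residue sequences) = {sequence_space_log10(100):.1f}")
```

Working in log10 keeps the count printable: a 100-residue chain already allows roughly 10^130 distinct sequences, which is why any natural or experimental library can only sample the space sparsely.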



20 most recent citations...


Peptides En Route from Prebiotic to Biotic Catalysis

. 2024 Aug 06 ; 57 (15) : 2027-2037. [epub] 20240717

High-throughput Selection of Human de novo-emerged sORFs with High Folding Potential

. 2024 Apr 02 ; 16 (4) : .

Toxin rescue by a random sequence

. 2023 Dec ; 7 (12) : 1963-1964.

Experimental characterization of de novo proteins and their unevolved random-sequence counterparts

. 2023 Apr ; 7 (4) : 570-580. [epub] 20230406

Early Selection of the Amino Acid Alphabet Was Adaptively Shaped by Biophysical Constraints of Foldability

. 2023 Mar 08 ; 145 (9) : 5320-5329. [epub] 20230224

Modern and prebiotic amino acids support distinct structural profiles in proteins

. 2022 Jun ; 12 (6) : 220040. [epub] 20220622

Peptides before and during the nucleotide world: an origins story emphasizing cooperation between proteins and nucleic acids

. 2022 Feb ; 19 (187) : 20210641. [epub] 20220209

Enzyme catalysis prior to aromatic residues: Reverse engineering of a dephospho-CoA kinase

. 2021 May ; 30 (5) : 1022-1034. [epub] 20210326

CoLiDe: Combinatorial Library Design tool for probing protein sequence space

. 2021 May 01 ; 37 (4) : 482-489.

Sequence Versus Composition: What Prescribes IDP Biophysical Properties?

. 2019 Jul 03 ; 21 (7) : . [epub] 20190703
