Most cited article - PubMed ID 33267368

4 citations in PubMed

Sequence Versus Composition: What Prescribes IDP Biophysical Properties?

Entropy (Basel, Switzerland). 2019 Jul 03 ; 21 (7) : . [epub] 20190703

Entropy (Basel)
ISSN 1099-4300
Source

Article

Experimental characterization of de novo proteins and their unevolved random-sequence counterparts

Author: Heames, Brennen. Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
Author: Buchel, Filip. Department of Cell Biology, Charles University, BIOCEV, Prague, Czech Republic; Department of Biochemistry, Charles University, Prague, Czech Republic
Author: Aubel, Margaux. Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
Author: Tretyachenko, Vyacheslav. Department of Cell Biology, Charles University, BIOCEV, Prague, Czech Republic
Author: Loginov, Dmitry. Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
Author: Novák, Petr. Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic
Author: Lange, Andreas. Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
Author: Bornberg-Bauer, Erich. Institute for Evolution and Biodiversity, University of Münster, Münster, Germany; Department of Protein Evolution, MPI for Developmental Biology, Tübingen, Germany. ebb@wwu.de
Author: Hlouchová, Klára. Department of Cell Biology, Charles University, BIOCEV, Prague, Czech Republic; Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic. klara.hlouchova@natur.cuni.cz

Nature ecology & evolution. 2023 Apr ; 7 (4) : 570-580. [epub] 20230406

Nat Ecol Evol
ISSN 2397-334X
Source

De novo gene emergence provides a route for new proteins to be formed from previously non-coding DNA. Proteins born in this way are considered random sequences and typically assumed to lack defined structure. While it remains unclear how likely a de novo protein is to assume a soluble and stable tertiary structure, intersecting evidence from random sequence and de novo-designed proteins suggests that native-like biophysical properties are abundant in sequence space. Taking putative de novo proteins identified in human and fly, we experimentally characterize a library of these sequences to assess their solubility and structure propensity. We compare this library to a set of synthetic random proteins with no evolutionary history. Bioinformatic prediction suggests that de novo proteins may have remarkably similar distributions of biophysical properties to unevolved random sequences of a given length and amino acid composition. However, upon expression in vitro, de novo proteins exhibit moderately higher solubility which is further induced by the DnaK chaperone system. We suggest that while synthetic random sequences are a useful proxy for de novo proteins in terms of structure propensity, de novo proteins may be better integrated in the cellular system than random expectation, given their higher solubility.

Article

Early Selection of the Amino Acid Alphabet Was Adaptively Shaped by Biophysical Constraints of Foldability

Journal of the American Chemical Society. 2023 Mar 08 ; 145 (9) : 5320-5329. [epub] 20230224

J Am Chem Soc
ISSN 1520-5126 | 0002-7863
Source

Whereas modern proteins rely on a quasi-universal repertoire of 20 canonical amino acids (AAs), numerous lines of evidence suggest that ancient proteins relied on a limited alphabet of 10 "early" AAs and that the 10 "late" AAs were products of biosynthetic pathways. However, many nonproteinogenic AAs were also prebiotically available, which begs two fundamental questions: Why do we have the current modern amino acid alphabet and would proteins be able to fold into globular structures as well if different amino acids comprised the genetic code? Here, we experimentally evaluate the solubility and secondary structure propensities of several prebiotically relevant amino acids in the context of synthetic combinatorial 25-mer peptide libraries. The most prebiotically abundant linear aliphatic and basic residues were incorporated along with or in place of other early amino acids to explore these alternative sequence spaces. The results show that foldability was likely a critical factor in the selection of the canonical alphabet. Unbranched aliphatic amino acids were purged from the proteinogenic alphabet despite their high prebiotic abundance because they generate polypeptides that are oversolubilized and have low packing efficiency. Surprisingly, we find that the inclusion of a short-chain basic amino acid also decreases polypeptides' secondary structure potential, for which we suggest a biophysical model. Our results support the view that, despite lacking basic residues, the early canonical alphabet was remarkably adaptive at supporting protein folding and explain why basic residues were only incorporated at a later stage of protein evolution.

MeSH
amino acids * chemistry MeSH
peptide library MeSH
peptides genetics MeSH
proteins * chemistry MeSH
protein folding MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH
Research Support, N.I.H., Extramural MeSH
Substance names
amino acids * MeSH
peptide library MeSH
peptides MeSH
proteins * MeSH

Article

Modern and prebiotic amino acids support distinct structural profiles in proteins

Open biology. 2022 Jun ; 12 (6) : 220040. [epub] 20220622

Open Biol
ISSN 2046-2441
Source

The earliest proteins had to rely on amino acids available on early Earth before the biosynthetic pathways for more complex amino acids evolved. In extant proteins, a significant fraction of the 'late' amino acids (such as Arg, Lys, His, Cys, Trp and Tyr) belong to essential catalytic and structure-stabilizing residues. How (or if) early proteins could sustain an early biosphere has been a major puzzle. Here, we analysed two combinatorial protein libraries representing proxies of the available sequence space at two different evolutionary stages. The first is composed of the entire alphabet of 20 amino acids while the second one consists of only 10 residues (ASDGLIPTEV) representing a consensus view of plausibly available amino acids through prebiotic chemistry. We show that compact conformations resistant to proteolysis are surprisingly similarly abundant in both libraries. In addition, the early alphabet proteins are inherently more soluble and refoldable, independent of the general Hsp70 chaperone activity. By contrast, chaperones significantly increase the otherwise poor solubility of the modern alphabet proteins suggesting their coevolution with the amino acid repertoire. Our work indicates that while both early and modern amino acids are predisposed to supporting protein structure, they do so with different biophysical properties and via different mechanisms.

Keywords
amino acid alphabet, genetic code evolution, protein sequence space, protein structure, random proteins
MeSH
amino acids * chemistry MeSH
prebiotics * MeSH
proteins chemistry MeSH
protein folding MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH
Substance names
amino acids * MeSH
prebiotics * MeSH
proteins MeSH

Article

CoLiDe: Combinatorial Library Design tool for probing protein sequence space

Bioinformatics (Oxford, England). 2021 May 01 ; 37 (4) : 482-489.

Bioinformatics
ISSN 1367-4811 | 1367-4803
Source

MOTIVATION: Current techniques of protein engineering focus mostly on re-designing small targeted regions or defined structural scaffolds rather than constructing combinatorial libraries of versatile compositions and lengths. This is a missed opportunity because combinatorial libraries are emerging as a vital source of novel functional proteins and are of interest in diverse research areas. RESULTS: Here, we present a computational tool for Combinatorial Library Design (CoLiDe) offering precise control over protein sequence composition, length and diversity. The algorithm uses an evolutionary approach to provide solutions to combinatorial libraries of degenerate DNA templates. We demonstrate its performance and precision using four different input alphabet distributions on different sequence lengths. In addition, a model design and experimental pipeline for protein library expression and purification is presented, providing a proof-of-concept that our protocol can be used to prepare purified protein library samples of up to 10^11 to 10^12 unique sequences. CoLiDe presents a composition-centric approach to protein design towards different functional phenomena. AVAILABILITY AND IMPLEMENTATION: CoLiDe is implemented in Python and freely available at https://github.com/voracva1/CoLiDe. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
