Computational protein design
The ability to predict and design protein structures has led to numerous applications in medicine, diagnostics and sustainable chemical manufacture. In addition, the wealth of predicted protein structures has advanced our understanding of how life's molecules function and interact. Honouring the work that has fundamentally changed the way scientists research and engineer proteins, the Nobel Prize in Chemistry in 2024 was awarded to David Baker for computational protein design and jointly to Demis Hassabis and John Jumper, who developed AlphaFold for machine-learning-based protein structure prediction. Here, we highlight notable contributions to the development of these computational tools and their importance for the design of functional proteins that are applied in organic synthesis. Notably, both technologies have the potential to impact drug discovery as any therapeutic protein target can now be modelled, allowing the de novo design of peptide binders and the identification of small molecule ligands through in silico docking of large compound libraries. Looking ahead, we highlight future research directions in protein engineering, medicinal chemistry and material design that are enabled by this transformative shift in protein science.
- Keywords
- AlphaFold, Computational protein design, Nobel prize, Protein engineering, Protein structure prediction
- MeSH
- Biocatalysis MeSH
- Protein Conformation MeSH
- Protein Engineering MeSH
- Proteins * chemistry metabolism MeSH
- Machine Learning MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Proteins * MeSH
Protein tunnels connecting the functional buried cavities with bulk solvent and protein channels, enabling the transport through biological membranes, represent the structural features that govern the exchange rates of ligands, ions, and water solvent. Tunnels and channels are present in a vast number of known proteins and provide control over their function. Modification of these structural features by protein engineering frequently provides proteins with improved properties. Here we present a detailed computational protocol employing the CAVER software that is applicable for: (1) the analysis of tunnels and channels in protein structures, and (2) the selection of hot-spot residues in tunnels or channels that can be mutagenized for improved activity, specificity, enantioselectivity, or stability.
- Keywords
- Binding, CAVER, Channel, Gate, Protein, Rational design, Software, Transport, Tunnel
- MeSH
- Algorithms MeSH
- Protein Conformation MeSH
- Ligands MeSH
- Models, Molecular MeSH
- Protein Engineering methods MeSH
- Proteins chemistry MeSH
- Solvents chemistry MeSH
- Software MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- Ligands MeSH
- Proteins MeSH
- Solvents MeSH
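The hot-spot selection described in the CAVER abstract above can be illustrated with a minimal sketch. It assumes a tunnel has already been computed and is available as a radius profile along its centerline; the `TunnelPoint` type, the `margin` parameter, the residue labels, and all numbers are hypothetical and do not reproduce CAVER's own file formats or algorithms. The idea shown is only that residues lining the narrowest part of a tunnel (its bottleneck) are natural candidates for mutagenesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelPoint:
    """One sampled point along a tunnel centerline (hypothetical data model)."""
    distance: float                    # distance from the buried cavity, in angstroms
    radius: float                      # radius of the largest sphere that fits here, in angstroms
    lining_residues: tuple[str, ...]   # residues in contact with the tunnel at this point


def bottleneck(profile: list[TunnelPoint]) -> TunnelPoint:
    """Return the narrowest point of the tunnel."""
    if not profile:
        raise ValueError("tunnel profile is empty")
    return min(profile, key=lambda p: p.radius)


def hotspot_candidates(profile: list[TunnelPoint], margin: float = 0.3) -> list[str]:
    """List residues lining every point whose radius is within `margin`
    angstroms of the bottleneck radius, in order of first appearance."""
    threshold = bottleneck(profile).radius + margin
    candidates: list[str] = []
    for point in profile:
        if point.radius <= threshold:
            for residue in point.lining_residues:
                if residue not in candidates:
                    candidates.append(residue)
    return candidates


# Made-up profile for illustration only.
profile = [
    TunnelPoint(0.0, 2.1, ("D106",)),
    TunnelPoint(2.0, 1.2, ("F144", "W141")),
    TunnelPoint(4.0, 1.4, ("F144", "L177")),
    TunnelPoint(6.0, 2.5, ("A145",)),
]

if __name__ == "__main__":
    print(bottleneck(profile).radius)
    print(hotspot_candidates(profile))
```

In practice the candidate list would be filtered further, for example by excluding catalytic or highly conserved residues, before any mutagenesis is planned.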
Enzymes are the natural catalysts that execute biochemical reactions upholding life. Their natural effectiveness has been fine-tuned as a result of millions of years of natural evolution. Such catalytic effectiveness has prompted the use of biocatalysts from multiple sources in a range of applications, including the industrial production of goods (food and beverages, detergents, textile, and pharmaceutics), environmental protection, and biomedical applications. Natural enzymes often need to be improved by protein engineering to optimize their function in non-native environments. Recent technological advances have greatly facilitated this process by providing the experimental approaches of directed evolution or by enabling computer-assisted applications. Directed evolution mimics the natural selection process in a highly accelerated fashion at the expense of arduous laboratory work and economic resources. Theoretical methods provide predictions and represent an attractive complement to such experiments by waiving their inherent costs. Computational techniques can be used to engineer enzymatic reactivity, substrate specificity and ligand binding, access pathways and ligand transport, and global properties like protein stability, solubility, and flexibility. Theoretical approaches can also identify hotspots on the protein sequence for mutagenesis and predict suitable alternatives for selected positions with expected outcomes. This review covers the latest advances in computational methods for enzyme engineering and presents many successful case studies.
- Keywords
- Biocatalyst, Catalytic efficiency, Computational enzyme design, Enzyme biotechnologies, Protein dynamics, Protein engineering, Software, Solubility, Stability
- MeSH
- Biocatalysis MeSH
- Biotechnology * MeSH
- Enzymes genetics metabolism MeSH
- Mutagenesis MeSH
- Protein Engineering MeSH
- Directed Molecular Evolution * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Substance names
- Enzymes MeSH
There is great interest in increasing proteins' stability to enhance their utility as biocatalysts, therapeutics, diagnostics and nanomaterials. Directed evolution is a powerful, but experimentally strenuous approach. Computational methods offer attractive alternatives. However, due to the limited reliability of predictions and potentially antagonistic effects of substitutions, only single-point mutations are usually predicted in silico, experimentally verified and then recombined in multiple-point mutants. Thus, substantial screening is still required. Here we present FireProt, a robust computational strategy for predicting highly stable multiple-point mutants that combines energy- and evolution-based approaches with smart filtering to identify additive stabilizing mutations. FireProt's reliability and applicability were demonstrated by validating its predictions against 656 mutations from the ProTherm database. We demonstrate that thermostability of the model enzymes haloalkane dehalogenase DhaA and γ-hexachlorocyclohexane dehydrochlorinase LinA can be substantially increased (ΔTm = 24°C and 21°C) by constructing and characterizing only a handful of multiple-point mutants. FireProt can be applied to any protein for which a tertiary structure and homologous sequences are available, and will facilitate the rapid development of robust proteins for biomedical and biotechnological applications.
- MeSH
- Point Mutation genetics physiology MeSH
- Databases, Genetic MeSH
- Lyases chemistry genetics metabolism MeSH
- Models, Molecular MeSH
- Computer Simulation MeSH
- Protein Engineering methods MeSH
- Enzyme Stability genetics MeSH
- Temperature MeSH
- Computational Biology methods MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- Lyases MeSH
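The FireProt abstract above describes combining energy-based and evolution-based information with filtering to pick mutations for a multiple-point mutant. The following is a minimal sketch of that general idea only, not FireProt's actual pipeline: the `Candidate` type, the `ddg_cutoff` value of -1.0 kcal/mol, the boolean conservation flag, and the example mutations are all assumptions made for illustration. It keeps predicted stabilizing substitutions at non-conserved positions and retains at most one substitution per position so that the selected set can be combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A single-point mutation with hypothetical predictions attached."""
    position: int
    wild_type: str
    mutant: str
    ddg: float        # predicted change in folding free energy (kcal/mol); negative = stabilizing
    conserved: bool   # True if the position is conserved among homologous sequences


def select_mutations(candidates: list[Candidate], ddg_cutoff: float = -1.0) -> list[Candidate]:
    """Keep the most stabilizing mutation per non-conserved position,
    provided its predicted ddG is at or below the (assumed) cutoff."""
    best: dict[int, Candidate] = {}
    for cand in candidates:
        if cand.conserved or cand.ddg > ddg_cutoff:
            continue  # evolution-based filter, then energy-based filter
        current = best.get(cand.position)
        if current is None or cand.ddg < current.ddg:
            best[cand.position] = cand
    return [best[pos] for pos in sorted(best)]


def label(cand: Candidate) -> str:
    """Format a mutation in the usual wild type / position / mutant notation."""
    return f"{cand.wild_type}{cand.position}{cand.mutant}"


# Made-up candidates for illustration only.
candidates = [
    Candidate(10, "A", "P", -1.8, False),
    Candidate(10, "A", "L", -1.2, False),
    Candidate(42, "G", "A", -2.5, True),    # rejected: conserved position
    Candidate(57, "K", "R", -0.4, False),   # rejected: not stabilizing enough
    Candidate(88, "S", "T", -1.1, False),
]

if __name__ == "__main__":
    print([label(c) for c in select_mutations(candidates)])
```

A real workflow would additionally need to check that the selected substitutions do not interact antagonistically, which is the problem the abstract identifies with naive recombination of single-point mutations.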
Protein engineering is the discipline of developing useful proteins for applications in research, therapeutic, and industrial processes by modification of naturally occurring proteins or by invention of de novo proteins. Modern protein engineering relies on the ability to rapidly generate and screen diverse libraries of mutant proteins. However, design of mutant libraries is typically hampered by scale and complexity, necessitating development of advanced automation and optimization tools that can improve efficiency and accuracy. At present, automated library design tools are functionally limited or not freely available. To address these issues, we developed Mutation Maker, an open source mutagenic oligo design software for large-scale protein engineering experiments. Mutation Maker is not only specifically tailored to multisite random and directed mutagenesis protocols, but also pioneers bespoke mutagenic oligo design for de novo gene synthesis workflows. Enabled by a novel bundle of orchestrated heuristics, optimization, constraint-satisfaction and backtracking algorithms, Mutation Maker offers a versatile toolbox for gene diversification design at industrial scale. Supported by in silico simulations and compelling experimental validation data, Mutation Maker oligos produce diverse gene libraries at high success rates irrespective of genes or vectors used. Finally, Mutation Maker was created as an extensible platform on the notion that directed evolution techniques will continue to evolve and revolutionize current and future-oriented applications.
- Keywords
- PCR-based accurate synthesis, directed evolution, gene synthesis, multi site-directed mutagenesis, protein design, protein engineering, site-scanning saturation mutagenesis, synthetic biology
- MeSH
- Algorithms MeSH
- Escherichia coli genetics MeSH
- Gene Library MeSH
- Codon genetics MeSH
- Mutation * MeSH
- Mutagenesis, Site-Directed methods MeSH
- Mutagenesis * MeSH
- Mutant Proteins MeSH
- Oligonucleotides genetics MeSH
- Computer Simulation MeSH
- Proteins genetics MeSH
- Directed Molecular Evolution methods MeSH
- Software * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance names
- Codon MeSH
- Mutant Proteins MeSH
- Oligonucleotides MeSH
- Proteins MeSH
Recent advancements in deep learning and generative models have significantly expanded the applications of virtual screening for drug-like compounds. Here, we introduce a multitarget transformer model, PCMol, that leverages the latent protein embeddings derived from AlphaFold2 as a means of conditioning a de novo generative model on different targets. Incorporating rich protein representations allows the model to capture their structural relationships, enabling the chemical space interpolation of active compounds and target-side generalization to new proteins based on embedding similarities. In this work, we benchmark against other existing target-conditioned transformer models to illustrate the validity of using AlphaFold protein representations over raw amino acid sequences. We show that low-dimensional projections of these protein embeddings cluster appropriately based on target families and that model performance declines when these representations are intentionally corrupted. We also show that the PCMol model generates diverse, potentially active molecules for a wide array of proteins, including those with sparse ligand bioactivity data. The generated compounds display higher similarity to known active ligands of held-out targets and have comparable molecular docking scores while maintaining novelty. Additionally, we demonstrate the important role of data augmentation in bolstering the performance of generative models in low-data regimes. The software package and AlphaFold protein embeddings are freely available at https://github.com/CDDLeiden/PCMol.
- MeSH
- Protein Conformation MeSH
- Ligands MeSH
- Models, Molecular * MeSH
- Proteins * chemistry metabolism MeSH
- Drug Design * MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Ligands MeSH
- Proteins * MeSH
Current computational tools to assist experimentalists for the design and engineering of proteins with desired catalytic properties are reviewed. The applications of these tools for de novo design of protein active sites, optimization of substrate access and product exit pathways, redesign of protein-protein interfaces, identification of neutral/advantageous/deleterious mutations in the libraries from directed evolution and stabilization of protein structures are described. Remarkable progress is seen in de novo design of enzymes catalyzing a chemical reaction for which a natural biocatalyst does not exist. Yet, constructed biocatalysts do not match natural enzymes in their efficiency, suggesting that more research is needed to capture all the important features of natural biocatalysts in theoretical designs.
- MeSH
- Biocatalysis MeSH
- Catalytic Domain MeSH
- Ligands MeSH
- Mutation MeSH
- Protein Engineering methods trends MeSH
- Proteins chemistry genetics metabolism MeSH
- Protein Stability MeSH
- Computational Biology methods trends MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Substance names
- Ligands MeSH
- Proteins MeSH
- MeSH
- Computer-Aided Design MeSH
- Enzymes chemistry genetics metabolism MeSH
- Catalysis MeSH
- Protein Engineering methods MeSH
- Proteins chemistry genetics metabolism MeSH
- Directed Molecular Evolution * MeSH
- Publication type
- Congress MeSH
- Substance names
- Enzymes MeSH
- Proteins MeSH
Protein engineering strategies aimed at constructing enzymes with novel or improved activities, specificities, and stabilities greatly benefit from in silico methods. Computational methods can be principally grouped into three main categories: bioinformatics; molecular modelling; and de novo design. Particularly de novo protein design is experiencing rapid development, resulting in more robust and reliable predictions. A recent trend in the field is to combine several computational approaches in an interactive manner and to complement them with structural analysis and directed evolution. A detailed investigation of designed catalysts provides valuable information on the structural basis of molecular recognition, biochemical catalysis, and natural protein evolution.
- MeSH
- Enzymes genetics MeSH
- Humans MeSH
- Models, Molecular MeSH
- Mutation MeSH
- Protein Engineering methods MeSH
- Enzyme Stability MeSH
- Computational Biology methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Review MeSH
- Substance names
- Enzymes MeSH
β-sheet proteins carry out critical functions in biology, and hence are attractive scaffolds for computational protein design. Despite this potential, de novo design of all-β-sheet proteins from first principles lags far behind the design of all-α or mixed-αβ domains owing to their non-local nature and the tendency of exposed β-strand edges to aggregate. Through study of loops connecting unpaired β-strands (β-arches), we have identified a series of structural relationships between loop geometry, side chain directionality and β-strand length that arise from hydrogen bonding and packing constraints on regular β-sheet structures. We use these rules to de novo design jellyroll structures with double-stranded β-helices formed by eight antiparallel β-strands. The nuclear magnetic resonance structure of a hyperthermostable design closely matched the computational model, demonstrating accurate control over the β-sheet structure and loop geometry. Our results open the door to the design of a broad range of non-local β-sheet protein structures.
- MeSH
- Protein Conformation, beta-Strand MeSH
- Protein Conformation MeSH
- Models, Molecular MeSH
- Nuclear Magnetic Resonance, Biomolecular MeSH
- Computer Simulation MeSH
- Protein Engineering methods MeSH
- Proteins chemistry genetics MeSH
- Protein Folding MeSH
- Amino Acid Sequence MeSH
- Protein Stability MeSH
- Hydrogen Bonding MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
- Substance names
- Proteins MeSH