Design of proteins by parallel tempering in the sequence space

2025 Oct;34(10):e70246.

Language: English; Country: United States of America; Medium: print

Document type: journal article

Persistent link   https://www.medvik.cz/link/pmid40990840

Grant support
LM2023055 Ministerstvo Školství, Mládeže a Tělovýchovy
LUC 24136 Ministerstvo Školství, Mládeže a Tělovýchovy
CA21160 European Cooperation in Science and Technology
ML4NGP European Cooperation in Science and Technology

Computational design of new proteins is often performed by optimizing the amino acid sequence. Each sequence is characterized by an energy (a lower energy means a better propensity to form the desired 3D structure), and this energy is sampled and minimized. Here, we use the parallel tempering algorithm to accelerate this task. ESMFold was used to predict the structures of the sampled proteins and to calculate the energy. Starting from random amino acid sequences, each sequence was sampled by the Monte Carlo method at one of a series of temperatures, and the resulting replicas were exchanged according to the parallel tempering method. A series of 100- or 200-residue proteins was designed to maximize confidence in structure prediction and globularity and to minimize the number of surface hydrophobic residues. We show that parallel tempering is a viable alternative to Monte Carlo sampling without replica exchanges, to simulated annealing, and to related energy-based protein design methods, especially where a continuous flow of designed sequences is desired.
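The sampling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the energy function `toy_energy` is a hypothetical stand-in for the ESMFold-based energy (it only penalizes the fraction of hydrophobic residues), and the single-point mutation move, the temperature ladder, and the exchange interval are assumptions chosen for illustration. Exchanges between neighbouring replicas use the standard parallel tempering acceptance probability min(1, exp[(1/T_i - 1/T_j)(E_i - E_j)]).

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")  # assumed grouping, for illustration only


def toy_energy(seq):
    """Hypothetical stand-in for the ESMFold-based energy:
    the fraction of hydrophobic residues (lower is better)."""
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq)


def parallel_tempering(energy, length, temperatures, n_steps, swap_every, seed=0):
    """Sample sequences with one Monte Carlo replica per temperature and
    periodic exchange attempts between neighbouring temperatures."""
    rng = random.Random(seed)
    # One random starting sequence per temperature.
    seqs = [[rng.choice(AMINO_ACIDS) for _ in range(length)] for _ in temperatures]
    energies = [energy(s) for s in seqs]

    for step in range(1, n_steps + 1):
        # Monte Carlo move in every replica: a single point mutation
        # accepted or rejected by the Metropolis criterion.
        for i, temp in enumerate(temperatures):
            trial = seqs[i][:]
            trial[rng.randrange(length)] = rng.choice(AMINO_ACIDS)
            e_trial = energy(trial)
            delta = e_trial - energies[i]
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                seqs[i], energies[i] = trial, e_trial

        # Replica exchange attempts between neighbouring temperatures.
        if step % swap_every == 0:
            for i in range(len(temperatures) - 1):
                arg = (1.0 / temperatures[i] - 1.0 / temperatures[i + 1]) * (
                    energies[i] - energies[i + 1]
                )
                if arg >= 0 or rng.random() < math.exp(arg):
                    seqs[i], seqs[i + 1] = seqs[i + 1], seqs[i]
                    energies[i], energies[i + 1] = energies[i + 1], energies[i]

    return ["".join(s) for s in seqs], energies


# Example run with an assumed three-temperature ladder (coldest first).
final_seqs, final_energies = parallel_tempering(
    toy_energy, length=30, temperatures=[0.01, 0.05, 0.2], n_steps=500, swap_every=10
)
```

In a real design run, `toy_energy` would be replaced by a function that calls a structure predictor and combines terms such as prediction confidence, globularity, and surface hydrophobicity. Low-energy sequences can be collected from the coldest replica throughout the run, which is what makes a continuous flow of designs possible.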


