Solvent Accessibility Promotes Rotamer Errors during Protein Modeling with Major Side-Chain Prediction Programs
Language: English
Country: United States
Media: print-electronic
Document type: Journal Article; Research Support, Non-U.S. Gov't
PubMed: 37410883
PubMed Central: PMC10369486
DOI: 10.1021/acs.jcim.3c00134
MeSH
- Algorithms
- Amino Acids * chemistry
- Protein Conformation
- Proteins * chemistry
- Solvents
Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
Names of Substances
- Amino Acids *
- Proteins *
- Solvents
Side-chain rotamer prediction is one of the most critical late stages in building a protein 3D structure. Specialized algorithms (e.g., FASPR, RASP, SCWRL4, and SCWRL4v) optimize this step using rotamer libraries, combinatorial searches, and scoring functions. We seek to identify the sources of key rotamer errors as a basis for correcting them and improving the accuracy of protein modeling. To evaluate these programs, we process 2496 high-quality, single-chain, all-atom protein 3D structures filtered at 30% homology and use discretized rotamer analysis to compare the original structures with the calculated ones. Among 513,024 filtered residue records, residue-dependent rotamer errors, which are associated in particular with polar and charged residues (ARG, LYS, and GLN), clearly correlate with increased residue solvent accessibility and with an increased tendency of residues to adopt non-canonical off-rotamer conformations, which modeling programs struggle to predict accurately. Understanding the impact of solvent accessibility therefore appears key to improving side-chain prediction accuracy.
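The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the three 120-degree bins, the p/t/m labels, the 0.25 relative-accessibility cutoff, the record layout, and the toy values are all assumptions made for illustration only. Each chi angle is mapped to a discrete bin, the resulting rotamer strings of the original and predicted structures are compared, and the mismatch rate is tallied separately for exposed and buried residues.

```python
from collections import defaultdict


def discretize_chi(angle_deg):
    """Map a dihedral angle (degrees) to one of three 120-degree-wide bins.

    Bin edges and the p/t/m labels are an assumed convention, not taken
    from the paper.
    """
    a = (angle_deg + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    if 0.0 <= a < 120.0:
        return "p"  # gauche+, centred near +60
    if -120.0 <= a < 0.0:
        return "m"  # gauche-, centred near -60
    return "t"      # trans, centred near 180


def rotamer_label(chis):
    """Concatenate the per-chi bins into a rotamer string such as 'mtpt'."""
    return "".join(discretize_chi(c) for c in chis)


def error_rate_by_exposure(records, rsa_cutoff=0.25):
    """Fraction of residues whose predicted rotamer string differs from the
    original one, grouped into 'exposed' and 'buried' by a hypothetical
    relative solvent accessibility cutoff."""
    counts = defaultdict(lambda: [0, 0])  # class -> [errors, total]
    for rec in records:
        cls = "exposed" if rec["rsa"] >= rsa_cutoff else "buried"
        wrong = rotamer_label(rec["native_chi"]) != rotamer_label(rec["model_chi"])
        counts[cls][0] += int(wrong)
        counts[cls][1] += 1
    return {cls: err / tot for cls, (err, tot) in counts.items()}


# Invented toy records, purely to exercise the functions above.
toy_records = [
    {"res": "ARG", "rsa": 0.80,
     "native_chi": [-65, 180, 60, 175], "model_chi": [-70, 175, -60, 170]},
    {"res": "LEU", "rsa": 0.05,
     "native_chi": [-60, 175], "model_chi": [-72, 170]},
    {"res": "LYS", "rsa": 0.60,
     "native_chi": [180, 180, 180, 180], "model_chi": [-175, 178, -179, 182]},
]

rates = error_rate_by_exposure(toy_records)
```

Comparing discrete labels rather than raw angle differences means that small deviations within a bin do not count as errors. The sketch does not handle symmetric terminal groups (for example, the chi2 of PHE or ASP), which would need an extra equivalence rule in a real analysis.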
KP Therapeutics s.r.o., Purkyňova 649/127, CZ-612 00 Brno, Czech Republic
Veterinary Research Institute, Hudcova 296/70, CZ-621 00 Brno, Czech Republic