Outcomes of the EMDataResource Cryo-EM Ligand Modeling Challenge
Status PubMed-not-MEDLINE Language English Country United States Medium electronic
Document type preprints, journal articles
Grant support
MC_UP_A025_1012
Medical Research Council - United Kingdom
R24 GM141254
NIGMS NIH HHS - United States
R01 GM079429
NIGMS NIH HHS - United States
R01 GM071939
NIGMS NIH HHS - United States
P01 GM063210
NIGMS NIH HHS - United States
MR/V000403/1
Medical Research Council - United Kingdom
Wellcome Trust - United Kingdom
R35 GM131883
NIGMS NIH HHS - United States
R01 GM146340
NIGMS NIH HHS - United States
R01 GM123089
NIGMS NIH HHS - United States
R01 GM073919
NIGMS NIH HHS - United States
R01 GM133840
NIGMS NIH HHS - United States
PubMed
38343795
PubMed Central
PMC10854310
DOI
10.21203/rs.3.rs-3864137/v1
PII: rs.3.rs-3864137
- Publication type
- journal articles MeSH
- preprints MeSH
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein/nucleic-acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: E. coli beta-galactosidase with inhibitor, SARS-CoV-2 RNA-dependent RNA polymerase with covalently bound nucleotide analog, and SARS-CoV-2 ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. We found that (1) the quality of submitted ligand models and surrounding atoms varied, as judged by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics, and contact scores, and (2) a composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
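As a rough illustration of finding (2), the sketch below shows one way several per-model metrics could be combined into a composite ranking. It is not the Challenge's actual scoring procedure: the metric names, the numeric values, and the choice of an unweighted mean of z-scores are all assumptions made for this example.

```python
from statistics import mean, pstdev

# Hypothetical metrics for three hypothetical models; all numbers are invented.
models = {
    "model_A": {"local_map_quality": 0.80, "map_fit": 0.85, "clashes": 2.0, "strain": 5.0},
    "model_B": {"local_map_quality": 0.90, "map_fit": 0.70, "clashes": 9.0, "strain": 20.0},
    "model_C": {"local_map_quality": 0.60, "map_fit": 0.60, "clashes": 4.0, "strain": 8.0},
}

# Assumed direction of each metric: True means a larger value is better.
HIGHER_IS_BETTER = {
    "local_map_quality": True,
    "map_fit": True,
    "clashes": False,
    "strain": False,
}


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_scores(models, higher_is_better):
    """Average the sign-corrected z-scores of every metric for each model."""
    names = list(models)
    totals = {name: 0.0 for name in names}
    for metric, better_high in higher_is_better.items():
        zs = z_scores([models[name][metric] for name in names])
        sign = 1.0 if better_high else -1.0
        for name, z in zip(names, zs):
            totals[name] += sign * z
    n_metrics = len(higher_is_better)
    return {name: total / n_metrics for name, total in totals.items()}


scores = composite_scores(models, HIGHER_IS_BETTER)
ranking = sorted(scores, key=scores.get, reverse=True)
best_by_single_metric = max(models, key=lambda n: models[n]["local_map_quality"])
```

With these invented numbers, the model that ranks first on the single map-quality metric is not the one that ranks first on the composite, which is the kind of disagreement that motivates using more than one score.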
Biodesign Institute Arizona State University Tempe AZ USA
Center for Development of Therapeutics Broad Institute of MIT and Harvard Cambridge MA USA
Chemical Computing Group Montreal Quebec Canada
Department of Biochemistry and Institute for Protein Design University of Washington Seattle WA USA
Department of Biochemistry and Molecular Biology University of Chicago Chicago IL USA
Department of Biochemistry Duke University Durham NC USA
Department of Biochemistry University of Cambridge Cambridge UK
Department of Biological Sciences Purdue University West Lafayette IN USA
Department of Chemistry and Quantum Theory Project University of Florida Gainesville FL USA
Department of Chemistry Carleton University Ottawa ON Canada
Department of Computer Science Pacific Lutheran University Tacoma WA USA
Department of Computer Science Purdue University West Lafayette IN USA
Department of Computer Science Saint Louis University St Louis MO USA
Department of Electrical Engineering and Computer Science University of Missouri Columbia MO USA
Departments of Bioengineering and of Microbiology and Immunology Stanford University Stanford CA USA
Discovery Chemistry Genentech Inc South San Francisco USA
Division of Computing and Software Systems University of Washington Bothell WA USA
Division of Cryo EM and Bioimaging SSRL SLAC National Accelerator Laboratory Menlo Park CA USA
Electron Bio Imaging Centre Diamond Light Source Harwell Science and Innovation Campus Didcot UK
European Molecular Biology Laboratory Hamburg Unit Hamburg Germany
Genome Center University of California Davis CA USA
Institute for Quantitative Biomedicine Rutgers The State University of New Jersey Piscataway NJ USA
Institute of Biological Information Processing Forschungszentrum Jülich Jülich Germany
Institute of Biotechnology Czech Academy of Sciences Vestec
MRC Laboratory of Molecular Biology Cambridge UK
Nature's Toolbox Rio Rancho NM USA
Physics Department Heinrich Heine University Düsseldorf Düsseldorf Germany
San Diego Supercomputer Center University of California San Diego La Jolla CA USA
School of Advanced Sciences and Languages VIT Bhopal University Bhopal India
Structural Biology Center X-ray Science Division Argonne National Laboratory Argonne IL USA
Structural Biology Genentech Inc South San Francisco USA
York Structural Biology Laboratory Department of Chemistry University of York York UK