Outcomes of the EMDataResource cryo-EM Ligand Modeling Challenge
Language English Country United States Media print-electronic
Document type Journal Article
Grant support
R24GM141254
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
R01 GM079429
NIGMS NIH HHS - United States
R01GM123089
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
R01GM079429
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
R01GM146340
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
CRG/2022/002761
DST | Science and Engineering Research Board (SERB)
R01 GM071939
NIGMS NIH HHS - United States
DESC0019749
U.S. Department of Energy (DOE)
R35GM131883
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
CIBSS - EXC-2189 - 390939984
Deutsche Forschungsgemeinschaft (German Research Foundation)
DGE-1762114
National Science Foundation (NSF)
R01GM133198
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
MC_UP_A025_1012
RCUK | Medical Research Council (MRC)
IIS2211598
National Science Foundation (NSF)
R24 GM141254
NIGMS NIH HHS - United States
BB/S007083/1
RCUK | Biotechnology and Biological Sciences Research Council (BBSRC)
DE-AC02-05CH11231
U.S. Department of Energy (DOE)
BB/S005099/1
RCUK | Biotechnology and Biological Sciences Research Council (BBSRC)
P01GM063210
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
R01 GM146340
NIGMS NIH HHS - United States
CHE-2235785
National Science Foundation (NSF)
R01 GM133198
NIGMS NIH HHS - United States
R01 GM123089
NIGMS NIH HHS - United States
DE-AC02-06CH11357
U.S. Department of Energy (DOE)
208398/Z/17/Z
Wellcome Trust (Wellcome)
R01GM133840
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
DBI-1832184
National Science Foundation (NSF)
P01 GM063210
NIGMS NIH HHS - United States
RTG 2756
Deutsche Forschungsgemeinschaft (German Research Foundation)
Wellcome Trust - United Kingdom
R01GM073919
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
BB/T012935/1
RCUK | Biotechnology and Biological Sciences Research Council (BBSRC)
MR/V000403/1
RCUK | Medical Research Council (MRC)
R01 GM073919
NIGMS NIH HHS - United States
R01GM071939
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
R01 GM133840
NIGMS NIH HHS - United States
209407/Z/17/Z
Wellcome Trust (Wellcome)
R35 GM131883
NIGMS NIH HHS - United States
PubMed
38918604
PubMed Central
PMC11526832
DOI
10.1038/s41592-024-02321-7
PII: 10.1038/s41592-024-02321-7
MeSH
- beta-Galactosidase chemistry metabolism
- COVID-19 virology
- Cryoelectron Microscopy * methods
- Escherichia coli
- Protein Conformation
- Ligands
- Models, Molecular *
- Reproducibility of Results
- SARS-CoV-2
Publication type
- Journal Article
Names of Substances
- beta-Galactosidase
- Ligands
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein-nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms was analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
Biodesign Institute Arizona State University Tempe AZ USA
Center for Development of Therapeutics Broad Institute of MIT and Harvard Cambridge MA USA
Chemical Computing Group Montreal Quebec Canada
Computational Chemistry Vilya South San Francisco CA USA
Department of Biochemistry and Institute for Protein Design University of Washington Seattle WA USA
Department of Biochemistry and Molecular Biology University of Chicago Chicago IL USA
Department of Biochemistry Duke University Durham NC USA
Department of Biochemistry University of Cambridge Cambridge UK
Department of Biological Sciences Purdue University West Lafayette IN USA
Department of Chemistry and Quantum Theory Project University of Florida Gainesville FL USA
Department of Chemistry Carleton University Ottawa Ontario Canada
Department of Computer Science Pacific Lutheran University Tacoma WA USA
Department of Computer Science Purdue University West Lafayette IN USA
Department of Computer Science Saint Louis University St Louis MO USA
Department of Electrical Engineering and Computer Science University of Missouri Columbia MO USA
Departments of Bioengineering and of Microbiology and Immunology Stanford University Stanford CA USA
Discovery Chemistry Genentech Inc San Francisco CA USA
Division of Computing and Software Systems University of Washington Bothell WA USA
Division of Cryo EM and Bioimaging SSRL SLAC National Accelerator Laboratory Menlo Park CA USA
Electron Bio Imaging Centre Diamond Light Source Harwell Science and Innovation Campus Didcot UK
European Molecular Biology Laboratory Hamburg Unit Hamburg Germany
Genome Center University of California Davis CA USA
Institute of Biological Information Processing Forschungszentrum Jülich Jülich Germany
Institute of Biotechnology Czech Academy of Sciences Vestec Czech Republic
MRC Laboratory of Molecular Biology Cambridge UK
MSU DOE Plant Research Laboratory East Lansing MI USA
National Renewable Energy Laboratory Golden CO USA
Nature's Toolbox Rio Rancho NM USA
Physics Department Heinrich Heine University Düsseldorf Düsseldorf Germany
Protein Science Septerna South San Francisco CA USA
School of Advanced Sciences and Languages VIT Bhopal University Bhopal India
School of Molecular Sciences Arizona State University Tempe AZ USA
Structural Biology Center X-ray Science Division Argonne National Laboratory Argonne IL USA
Structural Biology Genentech Inc South San Francisco CA USA
The Chinese University of Hong Kong Hong Kong China
Trivedi School of Biosciences Ashoka University Sonipat India
York Structural Biology Laboratory Department of Chemistry University of York York UK