Annotating Macromolecular Complexes in the Protein Data Bank: Improving the FAIRness of Structure Data
Language English Country Great Britain, England Media electronic
Document type Journal Article
Grant support
Wellcome Trust - United Kingdom
PubMed
38040737
PubMed Central
PMC10692154
DOI
10.1038/s41597-023-02778-9
PII: 10.1038/s41597-023-02778-9
MeSH
- Databases, Protein
- Protein Conformation
- Macromolecular Substances
- Molecular Conformation
- Translational Research, Biomedical *
Publication type
- Journal Article
Names of Substances
- Macromolecular Substances
Macromolecular complexes are essential functional units in nearly all cellular processes, and understanding them at the atomic level is critical for elucidating and modulating molecular mechanisms. The Protein Data Bank (PDB) serves as the global repository for experimentally determined structures of macromolecules. Structural data in the PDB offer valuable insights into the dynamics, conformation, and functional states of biological assemblies. However, current annotation practices lack standardised naming conventions for assemblies in the PDB, which complicates identifying instances that represent the same assembly. In this study, we introduce a method that leverages resources external to the PDB, such as the Complex Portal, UniProt and Gene Ontology, to describe assemblies accurately and to contextualise them within their biological settings. Using this approach, we assigned standard names to over 90% of unique assemblies in the PDB and provided a persistent identifier for each assembly. This standardisation of assembly data enhances the PDB and facilitates a deeper understanding of macromolecular complexes. Furthermore, it improves the PDB's FAIR attributes, fostering more effective basic and translational research and scientific education.
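The problem of recognising instances of the same assembly can be illustrated with a minimal sketch. It assumes, purely for illustration, that an assembly is characterised by the set of its component accessions and their copy numbers, and that a stable identifier can be derived by hashing that composition. This is not the procedure used in the study; the function names, the `ASM-` prefix, the entry labels and the accession strings are all hypothetical placeholders.

```python
import hashlib
from collections import Counter


def composition_key(components):
    """Return an order-independent key for a list of (accession, copies) pairs.

    Repeated accessions are merged by summing their copy numbers.
    """
    counts = Counter()
    for accession, copies in components:
        counts[accession] += copies
    return tuple(sorted(counts.items()))


def assembly_identifier(components, prefix="ASM-"):
    """Derive a deterministic identifier from the composition (hypothetical scheme)."""
    key = composition_key(components)
    text = ";".join(f"{accession}:{copies}" for accession, copies in key)
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    return prefix + digest


def group_assemblies(entries):
    """Group entry labels that share the same composition-derived identifier."""
    groups = {}
    for entry_id, components in entries.items():
        groups.setdefault(assembly_identifier(components), []).append(entry_id)
    return groups


# Placeholder data: labels and accessions are invented for the example.
entries = {
    "entry_a": [("ACC_X", 2), ("ACC_Y", 2)],
    "entry_b": [("ACC_Y", 2), ("ACC_X", 2)],  # same composition, different order
    "entry_c": [("ACC_X", 1), ("ACC_Y", 1)],  # same components, different stoichiometry
}

groups = group_assemblies(entries)
for identifier, members in groups.items():
    print(identifier, members)
```

In this sketch `entry_a` and `entry_b` fall into one group because component order is ignored, while `entry_c` gets its own identifier because its stoichiometry differs. A real annotation pipeline would additionally need curated names and biological context from external resources such as those named in the abstract.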
CEITEC Central European Institute of Technology Masaryk University Brno Czech Republic
Univ Grenoble Alpes CNRS Grenoble INP LJK 38000 Grenoble France