Functional implications of glycans and their curation: insights from the workshop held at the 16th Annual International Biocuration Conference in Padua, Italy
Language English
Country Great Britain, England
Media print
Document type Congress, Research Support, N.I.H., Extramural, Journal Article
Grant support
R01 HL103411
NHLBI NIH HHS - United States
1R24GM146616
NIH HHS - United States
R24 GM146616
NIGMS NIH HHS - United States
U24 HG012212
NHGRI NIH HHS - United States
Society for Glycobiology
PubMed
39137905
PubMed Central
PMC11321244
DOI
10.1093/database/baae073
PII: 7732847
MeSH
- Biocuration
- Data Curation * methods
- Glycosylation
- Humans
- Polysaccharides * metabolism
Check Tag
- Humans
Publication type
- Journal Article
- Congress
- Research Support, N.I.H., Extramural
Geographicals
- Italy
Names of Substances
- Polysaccharides *
Dynamic changes in protein glycosylation impact human health and disease progression. However, current resources that capture disease and phenotype information focus primarily on the macromolecules within the central dogma of molecular biology (DNA, RNA, proteins). To gain a better understanding of organisms, there is a need to capture the functional impact of glycans and glycosylation on biological processes. A workshop titled "Functional impact of glycans and their curation" was held in conjunction with the 16th Annual International Biocuration Conference to discuss ongoing worldwide activities related to glycan function curation. This workshop brought together subject matter experts, tool developers, and biocurators from over 20 projects and bioinformatics resources. Participants discussed four key topics for each of their resources: (i) how they curate glycan function-related data from publications and other sources, (ii) what type of data they would like to acquire, (iii) what data they currently have, and (iv) what standards they use. Their answers provided a comprehensive overview of state-of-the-art glycan function curation and annotation. This report summarizes the outcome of the discussions, including potential solutions and areas where curators, data wranglers, and text mining experts can collaborate to address current gaps in glycan and glycosylation annotations, leveraging each other's work to improve their respective resources and encourage impactful data sharing among resources. Database URL: https://wiki.glygen.org/Glycan_Function_Workshop_2023
Swiss-Prot Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel Servet, Geneva 4 1211, Switzerland