MSnLib: efficient generation of open multi-stage fragmentation mass spectral libraries
Language English Country United States Medium print-electronic
Document type journal article
Grant support
R01 DK136117
NIDDK NIH HHS - United States
R01 GM107550
NIGMS NIH HHS - United States
891397
EC | EU Framework Programme for Research and Innovation H2020 | H2020 Priority Excellent Science | H2020 Marie Sklodowska-Curie Actions (H2020 Excellent Science - Marie Sklodowska-Curie Actions)
PubMed
40954295
PubMed Central
PMC12510872
DOI
10.1038/s41592-025-02813-0
PII: 10.1038/s41592-025-02813-0
- MeSH
- Mass Spectrometry * methods MeSH
- Small Molecule Libraries * MeSH
- Metabolomics * methods MeSH
- Software * MeSH
- Machine Learning MeSH
- Tandem Mass Spectrometry * methods MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Small Molecule Libraries * MeSH
Untargeted high-resolution mass spectrometry is a key tool in clinical metabolomics, natural product discovery and exposomics, with compound identification remaining the major bottleneck. Currently, the standard workflow applies spectral library matching against tandem mass spectrometry (MS2) fragmentation data. Multi-stage fragmentation (MSn) yields more profound insights into substructures, enabling validation of fragmentation pathways; however, the community lacks open MSn reference data of diverse natural products and other chemicals. Here we describe MSnLib, a machine learning-ready open resource of >2 million spectra in MSn trees of 30,008 unique small molecules, built with a high-throughput data acquisition and processing pipeline in the open-source software mzmine.
Department of Biochemistry University of California Riverside Riverside CA USA
Department of Cell Biology Faculty of Science Charles University Prague Czechia
Department of Pharmaceutical Sciences Faculty of Life Sciences University of Vienna Vienna Austria
Institute of Inorganic and Analytical Chemistry University of Münster Münster Germany
Institute of Microbiology of the Czech Academy of Sciences Prague Czechia
Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Prague Czechia
Interdisciplinary Institute for Artificial Intelligence Côte d'Azur Sophia Antipolis Valbonne France