MeSH: Data Mining-methods

26 hits in Medvik

Online article

MS-DIAL 5 multimodal mass spectrometry data mining unveils lipidome complexities

Takeda, Hiroaki: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan; RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama, 351-0106, Japan
Matsuzawa, Yuki: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
Takeuchi, Manami: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
Takahashi, Mikiko: RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
Nishida, Kozo: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
Harayama, Takeshi: Institut de Pharmacologie Moléculaire et Cellulaire, Université Côte d'Azur - CNRS UMR7275 - Inserm U1323, 660 Route des Lucioles, 06560, Valbonne, France; Institute of Global Innovation Research, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan; harayama@ipmc.cnrs.fr
Todoroki, Yoshimasa: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
Shimizu, Kuniyoshi: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
Sakamoto, Nami: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan
Oka, Takaki: Department of Biotechnology and Life Science, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan

Nature communications. 2024 ; 15 (1) : 9903. [pub] 20241128

Nat Commun
ISSN 2041-1723

Lipidomics and metabolomics communities comprise various informatics tools; however, software programs handling multimodal mass spectrometry (MS) data with structural annotations guided by the Lipidomics Standards Initiative are limited. Here, we provide MS-DIAL 5 for in-depth lipidome structural elucidation through electron-activated dissociation (EAD)-based tandem MS and determining their molecular localization through MS imaging (MSI) data using a species/tissue-specific lipidome database containing the predicted collision-cross section values. With the optimized EAD settings using 14 eV kinetic energy, the program correctly delineated lipid structures for 96.4% of authentic standards, among which 78.0% had the sn-, OH-, and/or C=C positions correctly assigned at concentrations exceeding 1 μM. We showcased our workflow by annotating the sn- and double-bond positions of eye-specific phosphatidylcholines containing very-long-chain polyunsaturated fatty acids (VLC-PUFAs), characterized as PC n-3-VLC-PUFA/FA. Using MSI data from the eye and n-3-VLC-PUFA-supplemented HeLa cells, we identified glycerol 3-phosphate acyltransferase as an enzyme candidate responsible for incorporating n-3 VLC-PUFAs into the sn1 position of phospholipids in mammalian cells, which was confirmed using EAD-MS/MS and recombinant proteins in a cell-free system. Therefore, the MS-DIAL 5 environment, combined with optimized MS data acquisition methods, facilitates a better understanding of lipid structures and their localization, offering insights into lipid biology.

MeSH
Data Mining* / methods
Phosphatidylcholines / metabolism, chemistry
HeLa Cells
Mass Spectrometry / methods
Humans
Lipidomics* / methods
Lipids / chemistry, analysis
Metabolomics / methods
Fatty Acids, Unsaturated / metabolism, chemistry
Software
Tandem Mass Spectrometry / methods
Animals
Check Tag
Humans
Animals
Publication type
Journal Article

Online article

The Allele Catalog Tool: a web-based interactive tool for allele discovery and analysis

BMC genomics. 2023 ; 24 (1) : 107. [pub] 20230310

BMC Genomics
ISSN 1471-2164

BACKGROUND: The advancement of sequencing technologies has made a plethora of whole-genome re-sequenced (WGRS) data publicly available. However, research utilizing the WGRS data without further configuration is nearly impossible. To solve this problem, our research group has developed an interactive Allele Catalog Tool to enable researchers to explore the coding region allelic variation present in over 1,000 re-sequenced accessions each for soybean, Arabidopsis, and maize.
RESULTS: The Allele Catalog Tool was designed originally with soybean genomic data and resources. The Allele Catalog datasets were generated using our variant calling pipeline (SnakyVC) and the Allele Catalog pipeline (AlleleCatalog). The variant calling pipeline processes raw sequencing reads in parallel to generate Variant Call Format (VCF) files, and the Allele Catalog pipeline takes VCF files, performs imputation and functional effect prediction, and assembles alleles for each gene to generate curated Allele Catalog datasets. Both pipelines were used to generate the data panels (VCF files and Allele Catalog files), in which the accessions of the WGRS datasets were collected from various sources, currently representing over 1,000 diverse accessions each for soybean, Arabidopsis, and maize. The main features of the Allele Catalog Tool include data query, visualization of results, categorical filtering, and download functions. Queries are performed from user input, and results are presented in tabular form as summary results by categorical description and genotype results of the alleles for each gene. The categorical information is specific to each species; additionally, available detailed meta-information is provided in modal popups. The genotypic information contains the variant positions, reference or alternate genotypes, the functional effect classes, and the amino-acid changes of each accession. The results can also be downloaded for other research purposes.
CONCLUSIONS: The Allele Catalog Tool is a web-based tool that currently supports three species: soybean, Arabidopsis, and maize. The Soybean Allele Catalog Tool is hosted on the SoyKB website (https://soykb.org/SoybeanAlleleCatalogTool/), while the Allele Catalog Tool for Arabidopsis and maize is hosted on the KBCommons website (https://kbcommons.org/system/tools/AlleleCatalogTool/Zmays and https://kbcommons.org/system/tools/AlleleCatalogTool/Athaliana). Researchers can use this tool to connect variant alleles of genes with meta-information of species.

Chapter

Big data, umělá inteligence a strojové učení [Big data, artificial intelligence and machine learning]

Lhotská, Lenka, 1961-
Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague

Digitální medicína 2022. 2022 ; () : 91-103.

ISBN 978-80-908638-8-0

Online article

A decision support system for the prediction of mortality in patients with acute kidney injury admitted in intensive care unit

Journal of Applied Biomedicine. 2020 ; 18 (1) : 26-32.

ISSN 1214-021X

The intensive care unit (ICU) is a specialized hospital unit where healthcare professionals provide treatment and, later, close follow-up to patients. Estimating mortality in ICU patients is crucial from many viewpoints. The purpose of this study is to classify the status of patients with acute kidney injury (AKI) in the ICU as early mortality, late mortality, or survival by applying the Classification and Regression Trees (CART) algorithm to patient attributes such as blood urea nitrogen, creatinine, serum and urine neutrophil gelatinase-associated lipocalin (NGAL), alkaline phosphatase, lactate dehydrogenase (LDH), gamma-glutamyl transferase, laboratory electrolytes, blood gas, mean arterial pressure, central venous pressure, and demographic details. The study was conducted on 50 patients with AKI who were followed up in the ICU. The study also aims to determine the significance of the relationship between the attributes used by CART to predict mortality and the patients' status by employing the Kruskal-Wallis H test. The classification accuracy, sensitivity, and specificity of CART for the tested attributes for the prediction of early mortality, late mortality, and survival of patients were 90.00%, 83.33%, and 91.67%, respectively. The values of both urine NGAL and LDH on day 7 showed a considerable difference according to the patients' status when examined by the Kruskal-Wallis H test.
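Accuracy, sensitivity, and specificity as reported in this abstract are standard confusion-matrix ratios. The following is a minimal sketch of how such figures are computed for one class treated as "positive" against the rest. The function name and the counts are illustrative assumptions, not data from the study, and the abstract does not state how the three-class results were reduced to single values.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / total,       # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),       # share of true positives detected
        "specificity": tn / (tn + fp),       # share of true negatives detected
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=8, fp=2, tn=18, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

For a three-class problem such as early mortality, late mortality, and survival, one common convention is to compute these ratios per class in a one-versus-rest fashion and then report or average them.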

MeSH
Acute Kidney Injury* / mortality
Algorithms
Biomarkers / analysis
Data Mining / methods
Adult
Classification
Lactate Dehydrogenases / analysis
Humans
Lipocalin-2 / analysis
Decision Support Techniques
Hospital Mortality*
Prognosis
Decision Support Systems, Management
Decision Trees
Statistics as Topic
Check Tag
Adult
Humans

Online article

Prot2HG: a database of protein domains mapped to the human genome

Database. 2020 ; 2020 (-) : . [pub] 20200101

ISSN 1758-0463

Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download as well as incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated, mapped to the reference genome hg19, and stored in a MySQL database. In total, 760 487 protein domains from 42 371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented into the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more often annotated onto a protein domain than common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency. Database URL: www.prot2hg.com.
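The reverse translation of residue positions to genome coordinates described in this abstract can be sketched roughly as follows. This is a simplified illustration and not the Prot2HG implementation: the function name is hypothetical, only plus-strand transcripts are handled, and coding segments are assumed to be given as 1-based inclusive intervals in transcript order.

```python
from typing import List, Tuple


def residue_to_genomic(residue: int, cds_exons: List[Tuple[int, int]]) -> List[int]:
    """Return the three genomic positions of the codon encoding a residue.

    residue:   1-based amino-acid index within the protein.
    cds_exons: coding segments on the plus strand as (start, end) pairs,
               1-based inclusive, listed in transcript order (an assumption).
    """
    if residue < 1:
        raise ValueError("residue index must be >= 1")
    first = (residue - 1) * 3                  # 0-based CDS offset of the codon's first base
    wanted = (first, first + 1, first + 2)
    positions: List[int] = []
    consumed = 0                               # coding bases covered by earlier exons
    for start, end in cds_exons:
        length = end - start + 1
        for offset in wanted:
            if consumed <= offset < consumed + length:
                positions.append(start + (offset - consumed))
        consumed += length
    if len(positions) != 3:
        raise ValueError("residue lies outside the coding sequence")
    return positions


# A codon may straddle an exon boundary, so the positions need not be contiguous.
print(residue_to_genomic(2, [(100, 104), (200, 210)]))
```

Returning all three positions rather than a single interval keeps codons that span an intron representable; a domain can then be mapped by converting its first and last residues and collecting the covered exon segments.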

MeSH
Molecular Sequence Annotation / methods
Data Mining / methods
Databases, Genetic*
Data Curation / methods
Genetic Variation*
Genome, Human / genetics
Genomics / methods
Internet
Humans
Protein Domains / genetics
Proteins / chemistry, genetics, metabolism
Computational Biology / methods
Check Tag
Humans
Publication type
Journal Article
Research Support, Non-U.S. Gov't

Article

Analysis of Microbial Siderophores by Mass Spectrometry

Methods in molecular biology. 2019 ; 1996 (-) : 131-153. [pub] -

Methods Mol Biol
ISSN 1940-6029

Siderophores represent important microbial virulence factors and infection biomarkers. Their monitoring in fermentation broths, bodily fluids, and tissues should be reproducible. Similar isolation, characterization, and quantitation studies can often have conflicting results, and without proper documentation of sample collection, data processing, and analysis methods, it is difficult to reexamine the data and reconcile these differences. In this Springer Nature Protocol, we present the procedure optimized for ferricrocin/triacetylfusarinine C extraction from biological material as well as for tissue fixation and cryosectioning for optical microscopy and for both elemental and molecular mass spectrometry imaging. Special attention is paid to siderophore data mining from conventional and product ion mass spectra, liquid chromatography, and mass spectrometry imaging datasets, performed here by our free software called CycloBranch.

MeSH
Aspergillus fumigatus / metabolism
Biomarkers / analysis
Chromatography, Liquid / methods
Data Mining / methods
Datasets as Topic
Ferrichrome / analogs & derivatives, isolation & purification, metabolism
Tissue Fixation / methods
Mass Spectrometry / methods
Invasive Pulmonary Aspergillosis / diagnosis, microbiology
Cryoultramicrotomy / methods
Rats
Hydroxamic Acids / isolation & purification, metabolism
Humans
Disease Models, Animal
Siderophores / isolation & purification, metabolism
Software
Ferric Compounds / isolation & purification, metabolism
Animals
Check Tag
Rats
Humans
Animals
Publication type
Journal Article
Research Support, Non-U.S. Gov't

Online article

A survey on biomedical named entity recognition and normalization

European Journal for Biomedical Informatics. 2019 ; 15 (2) : 11-16.

Eur. J. Biomed. Inform. (Praha)
ISSN 1801-5603

With a rapidly growing amount of biomedical information available only in textual form, there is considerable interest in applying NLP techniques to extract such information from the biomedical literature. Much of the research has paid special attention to extracting information about biomedical named entities. In this paper, we survey biomedical named entity recognition and normalization, focusing on gene mention recognition and normalization. We believe this can help researchers find work of interest to them and interpret their own research.

Article

Interpretation of QSAR Models: Mining Structural Patterns Taking into Account Molecular Context

Molecular informatics. 2019 ; 38 (3) : e1800084. [pub] 20181022

Mol Inform
ISSN 1868-1751

The study focused on QSAR model interpretation. The goal was to develop a workflow for the identification of molecular fragments in different contexts important for the property modelled. Using a previously established approach - Structural and physicochemical interpretation of QSAR models (SPCI) - fragment contributions were calculated and their relative influence on the compounds' properties characterised. Analysis of the distributions of these contributions using Gaussian mixture modelling was performed to identify groups of compounds (clusters) comprising the same fragment, where these fragments had substantially different contributions to the property studied. SMARTSminer was used to detect patterns discriminating groups of compounds from each other, with visual inspection applied when SMARTSminer did not help. The approach was applied to analyse the toxicity, in terms of 40-hour inhibition of growth, of 1984 compounds to Tetrahymena pyriformis. The results showed that the clustering technique correctly identified known toxicophoric patterns: it detected groups of compounds where fragments have a specific molecular context that makes them contribute substantially more to toxicity. The results show the applicability of the interpretation of QSAR models to retrieve reasonable patterns, even from data sets consisting of compounds having different mechanisms of action, something which is difficult to achieve using conventional pattern/data mining approaches.

MeSH
Antiprotozoal Agents / chemistry, toxicity
Data Mining / methods
Quantitative Structure-Activity Relationship*
Drug Design*
Molecular Docking Simulation / methods
Software
Tetrahymena / drug effects
Publication type
Journal Article
Research Support, Non-U.S. Gov't

Online article

Complete mitochondrial genomes from transcriptomes: assessing pros and cons of data mining for assembling new mitogenomes

Scientific reports. 2019 ; 9 (1) : 14806. [pub] 20191015

Sci Rep
ISSN 2045-2322

Thousands of eukaryote transcriptomes have been generated, mainly to investigate nuclear gene expression, and the amount of available data is constantly increasing. A neglected but promising use of this large amount of data is to assemble organelle genomes. To assess the reliability of this approach, we attempted to reconstruct complete mitochondrial genomes from RNA-Seq experiments of Reticulitermes termite species, for which transcriptomes and conspecific mitogenomes are available. We successfully assembled complete molecules, although a few gaps corresponding to tRNAs had to be filled manually. We also reconstructed, for the first time, the mitogenome of Reticulitermes banyulensis. The accuracy and completeness of mitogenome reconstruction appeared independent of transcriptome size, read length and sequencing design (single/paired end), and using reference genomes from congeneric or intra-familial taxa did not significantly affect the assembly. Transcriptome-derived mitogenomes were found to be highly similar to the conspecific ones obtained from genome sequencing (nucleotide divergence ranging from 0% to 3.5%) and yielded a congruent phylogenetic tree. Reads from contaminants and nuclear transcripts, although slowing down the process, did not result in chimeric sequence reconstruction. We suggest that the described approach has the potential to increase the number of available mitogenomes by exploiting the rapidly increasing number of transcriptomes.
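The abstract does not say how nucleotide divergence was measured. One simple measure, shown here purely as an assumed illustration, is the uncorrected p-distance: the fraction of differing sites between two aligned sequences, skipping positions with gaps or ambiguous bases. The function name and the example sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise divergence between two aligned nucleotide sequences.

    Positions where either sequence has a gap ('-') or an ambiguous base ('N')
    are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue                      # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


# Toy aligned fragments, not real mitogenome data.
print(f"{p_distance('ACGTACGT', 'ACGTACGA'):.1%}")
```

Model-corrected distances are often preferred for more divergent sequences; the uncorrected ratio is used here only because it is the simplest to read.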

MeSH
Molecular Sequence Annotation / methods
Data Mining / methods
Phylogeny
Genome, Mitochondrial*
Isoptera / genetics
Reproducibility of Results
Base Sequence / genetics
Sequence Analysis, DNA
RNA-Seq
Transcriptome / genetics
High-Throughput Nucleotide Sequencing
Animals
Check Tag
Animals
Publication type
Journal Article
Research Support, Non-U.S. Gov't
Validation Study

Online article

A deep learning genome-mining strategy for biosynthetic gene cluster prediction

Nucleic acids research. 2019 ; 47 (18) : e110. [pub] 20191010

Nucleic Acids Res
ISSN 1362-4962

Natural products represent a rich reservoir of small molecule drug candidates utilized as antimicrobial drugs, anticancer therapies, and immunomodulatory agents. These molecules are microbial secondary metabolites synthesized by co-localized genes termed Biosynthetic Gene Clusters (BGCs). The increase in full microbial genomes and similar resources has led to the development of BGC prediction algorithms, although their precision and ability to identify novel BGC classes could be improved. Here we present a deep learning strategy (DeepBGC) that offers reduced false positive rates in BGC identification and an improved ability to extrapolate and identify novel BGC classes compared to existing machine-learning tools. We supplemented this with random forest classifiers that accurately predicted BGC product classes and potential chemical activity. Application of DeepBGC to bacterial genomes uncovered previously undetectable putative BGCs that may code for natural products with novel biologic activities. The improved accuracy and classification ability of DeepBGC represent a major addition to in-silico BGC identification.
