Query: Genomic databases

288 hits in Medvik

Journal

BMC genomic data

London : BioMed Central, [2021]-

1 online resource

Conspectus
General genetics. General cytogenetics. Evolution
NML Fields
medical informatics
genetics, medical genetics

Online article

PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions

... An important message taken from human genome sequencing projects is that the human population exhibits ...

PLoS computational biology. 2016 ; 12 (5) : e1004962. [pub] 20160525

PLoS Comput Biol
ISSN 1553-7358
Medvik
Source

An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. 
To enable comprehensive evaluation of variants, the predictions are complemented with annotations from eight databases. The web server is freely available to the community at http://loschmidt.chemi.muni.cz/predictsnp2.
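
The category-based idea in this abstract (category-specific decision thresholds that turn numeric scores into binary calls, followed by a consensus) can be illustrated with a minimal Python sketch. The threshold values, the example scores and the majority-vote rule below are hypothetical placeholders chosen for illustration; they are not the thresholds or the consensus formula published for PredictSNP2.

```python
from typing import Dict

# Hypothetical per-category thresholds (illustrative numbers only,
# NOT the category-optimal thresholds derived in the paper).
THRESHOLDS: Dict[str, Dict[str, float]] = {
    "missense":   {"CADD": 20.0, "DANN": 0.96, "FATHMM": 0.50},
    "synonymous": {"CADD": 12.0, "DANN": 0.90, "FATHMM": 0.50},
}


def binary_calls(category: str, scores: Dict[str, float]) -> Dict[str, bool]:
    """Convert raw tool scores into deleterious (True) / neutral (False)
    calls using the thresholds of the variant's own category."""
    limits = THRESHOLDS[category]
    return {tool: score >= limits[tool] for tool, score in scores.items()}


def majority_consensus(calls: Dict[str, bool]) -> bool:
    """Assumed stand-in for a consensus score: a simple majority vote."""
    return sum(calls.values()) * 2 > len(calls)


# The same raw scores can lead to different calls in different categories.
example_scores = {"CADD": 15.0, "DANN": 0.97, "FATHMM": 0.30}
for category in ("missense", "synonymous"):
    calls = binary_calls(category, example_scores)
    print(category, calls, majority_consensus(calls))
```

With a single global threshold per tool, both variants would receive the same call; keeping one threshold set per category is what lets the output differ for otherwise identical scores.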

MeSH
Databases, Nucleic Acid MeSH
Databases, Protein MeSH
Genetic Variation MeSH
Genome, Human MeSH
Genomics statistics & numerical data MeSH
Polymorphism, Single Nucleotide * MeSH
Humans MeSH
Software * MeSH
Computational Biology MeSH
Check Tag
Humans MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH

Article

Genomic reanalysis of a pan-European rare-disease resource yields new diagnoses

... Genetic diagnosis of rare diseases requires accurate identification and interpretation of genomic variants ...

Nature medicine. 2025 ; 31 (2) : 478-489. [pub] 20250117

Nat Med
ISSN 1546-170X
Medvik
Source

Genetic diagnosis of rare diseases requires accurate identification and interpretation of genomic variants. Clinical and molecular scientists from 37 expert centers across Europe created the Solve-Rare Diseases Consortium (Solve-RD) resource, encompassing clinical, pedigree and genomic rare-disease data (94.5% exomes, 5.5% genomes), and performed systematic reanalysis for 6,447 individuals (3,592 male, 2,855 female) with previously undiagnosed rare diseases from 6,004 families. We established a collaborative, two-level expert review infrastructure that allowed a genetic diagnosis in 506 (8.4%) families. Of 552 disease-causing variants identified, 464 (84.1%) were single-nucleotide variants or short insertions/deletions. These variants were either located in recently published novel disease genes (n = 67), recently reclassified in ClinVar (n = 187) or reclassified by consensus expert decision within Solve-RD (n = 210). Bespoke bioinformatics analyses identified the remaining 15.9% of causative variants (n = 88). Ad hoc expert review, parallel to the systematic reanalysis, diagnosed 249 (4.1%) additional families for an overall diagnostic yield of 12.6%. The infrastructure and collaborative networks set up by Solve-RD can serve as a blueprint for future further scalable international efforts. The resource is open to the global rare-disease community, allowing phenotype, variant and gene queries, as well as genome-wide discoveries.
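
The percentages quoted in this abstract can be re-derived from the stated counts. The short Python check below uses only numbers from the abstract; the helper name `pct` is an illustrative choice, and the sketch assumes all family-level percentages are taken relative to the 6,004 families.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


families = 6004          # families in the reanalysis
systematic = 506         # families diagnosed by systematic reanalysis
ad_hoc = 249             # additional families diagnosed by ad hoc review
variants = 552           # disease-causing variants identified
snv_indel = 67 + 187 + 210   # novel genes + ClinVar + Solve-RD consensus

print(pct(systematic, families))            # 8.4
print(pct(ad_hoc, families))                # 4.1
print(pct(systematic + ad_hoc, families))   # 12.6
print(snv_indel, pct(snv_indel, variants))  # 464 84.1
print(pct(variants - snv_indel, variants))  # 15.9
```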

Online article

Genomic benchmarks: a collection of datasets for genomic sequence classification

... In Genomics, we have similar challenges (annotation of genomes and identification of functional elements ...

BMC genomic data. 2023 ; 24 (1) : 25. [pub] 20230501

BMC Genom Data
ISSN 2730-6844
Medvik
Source

BACKGROUND: Recently, deep neural networks have been successfully applied in many biological fields. In 2020, a deep learning model AlphaFold won the protein folding competition with predicted structures within the error tolerance of experimental methods. However, this solution to the most prominent bioinformatic challenge of the past 50 years has been possible only thanks to a carefully curated benchmark of experimentally predicted protein structures. In Genomics, we have similar challenges (annotation of genomes and identification of functional elements) but currently, we lack benchmarks similar to protein folding competition. RESULTS: Here we present a collection of curated and easily accessible sequence classification datasets in the field of genomics. The proposed collection is based on a combination of novel datasets constructed from the mining of publicly available databases and existing datasets obtained from published articles. The collection currently contains nine datasets that focus on regulatory elements (promoters, enhancers, open chromatin region) from three model organisms: human, mouse, and roundworm. A simple convolution neural network is also included in a repository and can be used as a baseline model. Benchmarks and the baseline model are distributed as the Python package 'genomic-benchmarks', and the code is available at https://github.com/ML-Bioinfo-CEITEC/genomic_benchmarks . CONCLUSIONS: Deep learning techniques revolutionized many biological fields but mainly thanks to the carefully curated benchmarks. For the field of Genomics, we propose a collection of benchmark datasets for the classification of genomic sequences with an interface for the most commonly used deep learning libraries, implementation of the simple neural network and a training framework that can be used as a starting point for future research. 
The main aim of this effort is to create a repository for shared datasets that will make machine learning for genomics more comparable and reproducible while reducing the overhead of researchers who want to enter the field, leading to healthy competition and new discoveries.
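
Sequence classification with a convolutional network, as described in this abstract, needs DNA strings turned into numeric input. A common way to do this is one-hot encoding; the Python sketch below shows that idea only. It is an assumption for illustration and does not reproduce the preprocessing used by the 'genomic-benchmarks' package; the function name `one_hot` and the handling of unknown bases are hypothetical choices.

```python
from typing import List

ALPHABET = "ACGT"  # column order of the encoding


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA string as one 4-element row per base.
    Bases outside ACGT (for example N) become an all-zero row."""
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


print(one_hot("acgn"))
```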

Online article

AmtDB: a database of ancient human mitochondrial genomes

... The number of published ancient mitochondrial genomes has increased in recent years, alongside with the ...

Nucleic acids research. 2019 ; 47 (D1) : D29-D32. [pub] 20190108

Nucleic Acids Res
ISSN 1362-4962
Medvik
Source

Ancient mitochondrial DNA is used for tracing past human demographic events due to its population-level variability. The number of published ancient mitochondrial genomes has increased in recent years, alongside the development of high-throughput sequencing and capture enrichment methods. Here, we present AmtDB, the first database of ancient human mitochondrial genomes. The release version contains 1107 hand-curated ancient samples, freely accessible for download, together with the individual descriptors, including geographic location, radiocarbon dating, and archaeological culture affiliation. The database also features an interactive map for sample location visualization. AmtDB is a key platform for ancient population genetic studies and is available at https://amtdb.org.

Online article

A corpus of GA4GH phenopackets: Case-level phenotyping for genomic diagnostics and discovery

... The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved ...

HGG advances. 2025 ; 6 (1) : 100371. [pub] 20241010

HGG Adv
ISSN 2666-2477
Medvik
Source

The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved by ISO as a standard for sharing clinical and genomic information about an individual, including phenotypic descriptions, numerical measurements, genetic information, diagnoses, and treatments. A phenopacket can be used as an input file for software that supports phenotype-driven genomic diagnostics and for algorithms that facilitate patient classification and stratification for identifying new diseases and treatments. There has been a great need for a collection of phenopackets to test software pipelines and algorithms. Here, we present Phenopacket Store. Phenopacket Store v.0.1.19 includes 6,668 phenopackets representing 475 Mendelian and chromosomal diseases associated with 423 genes and 3,834 unique pathogenic alleles curated from 959 different publications. This represents the first large-scale collection of case-level, standardized phenotypic information derived from case reports in the literature with detailed descriptions of the clinical data and will be useful for many purposes, including the development and testing of software for prioritizing genes and diseases in diagnostic genomics, machine learning analysis of clinical phenotype data, patient stratification, and genotype-phenotype correlations. This corpus also provides best-practice examples for curating literature-derived data using the GA4GH Phenopacket Schema.

Abstract

Enabling pathway analysis - connecting genomic experiment and gene ontology databases

Imrichová, Hana
Author Imrichová, Hana Faculty of Science, Masaryk University, Brno

Proceedings of the ... International Summer School on Computational Biology. 2010 ; () : 153-154.

Medvik
Source

Publication type
Overall MeSH

Online article

Lifetime risk of autosomal recessive neurodegeneration with brain iron accumulation (NBIA) disorders calculated from genetic databases

... The allele frequencies of these disease-causing variants were assessed in exome/genome collections: the ...

EBioMedicine. 2022 ; 77 (-) : 103869. [pub] 20220215

ISSN 2352-3964
Medvik
Source

BACKGROUND: Neurodegeneration with brain iron accumulation (NBIA) disorders are a group of clinically and genetically heterogeneous diseases characterized by iron overload in basal ganglia and progressive neurodegeneration. Little is known about the epidemiology of NBIA disorders. In the absence of large-scale population-based studies, obtaining reliable epidemiological data requires innovative approaches. METHODS: All pathogenic variants were collected from the 13 genes associated with autosomal recessive NBIA (PLA2G6, PANK2, COASY, ATP13A2, CP, AP4M1, FA2H, CRAT, SCP2, C19orf12, DCAF17, GTPBP2, REPS1). The allele frequencies of these disease-causing variants were assessed in exome/genome collections: the Genome Aggregation Database (gnomAD) and our in-house database. Lifetime risks were calculated from the sum of allele frequencies in the respective genes under the assumption of Hardy-Weinberg equilibrium. FINDINGS: The combined estimated lifetime risk of all 13 investigated NBIA disorders is 0.88 (95% confidence interval 0.70-1.10) per 100,000 based on the global gnomAD dataset (n = 282,912 alleles), 0.92 (0.65-1.29) per 100,000 in the European gnomAD dataset (n = 129,206), and 0.90 (0.48-1.62) per 100,000 in our in-house database (n = 44,324). Individually, the highest lifetime risks (>0.15 per 100,000) are found for disorders caused by variants in PLA2G6, PANK2 and COASY. INTERPRETATION: This population-genetic estimation of lifetime risks of recessive NBIA disorders reveals frequencies far exceeding previous population-based numbers. Importantly, our approach represents lifetime risks from conception, thus including prenatal deaths. Understanding the true lifetime risk of NBIA disorders is important in estimating disease burden, allocating resources and targeting specific interventions. FUNDING: This work was carried out in the framework of TIRCON ("Treat Iron-Related Childhood-Onset Neurodegeneration").
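
The calculation named in the METHODS part (lifetime risk from summed allele frequencies under Hardy-Weinberg equilibrium) can be sketched in a few lines of Python. For an autosomal recessive disorder, if q is the summed frequency of all pathogenic alleles in a gene, the expected frequency of affected genotypes is q squared. The gene labels and allele frequencies below are hypothetical placeholders, not values from the study, and the sketch omits the confidence intervals the authors report.

```python
from typing import Dict, List


def gene_lifetime_risk(allele_frequencies: List[float]) -> float:
    """Recessive lifetime risk for one gene under Hardy-Weinberg
    equilibrium: (sum of pathogenic allele frequencies) squared."""
    q = sum(allele_frequencies)
    return q * q


def combined_risk_per_100k(per_gene: Dict[str, List[float]]) -> float:
    """Sum the per-gene risks and express the total per 100,000."""
    return 1e5 * sum(gene_lifetime_risk(f) for f in per_gene.values())


# Hypothetical allele frequencies for two placeholder genes.
example = {
    "GENE_A": [0.001, 0.0005],   # q = 0.0015
    "GENE_B": [0.001],           # q = 0.001
}
print(round(combined_risk_per_100k(example), 3))
```

Summing frequencies before squaring matters: it counts compound heterozygotes (two different pathogenic alleles in the same gene) as affected, which squaring each allele separately would miss.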

MeSH
Databases, Genetic MeSH
Child MeSH
Nuclear Proteins MeSH
Ubiquitin-Protein Ligase Complexes MeSH
Humans MeSH
Mitochondrial Proteins genetics MeSH
Brain pathology MeSH
Neuroaxonal Dystrophies * epidemiology genetics pathology MeSH
Neurodegenerative Diseases * epidemiology genetics pathology MeSH
Iron Metabolism Disorders * genetics pathology MeSH
Calcium-Binding Proteins MeSH
Check Tag
Child MeSH
Humans MeSH
Publication type
Journal Article MeSH

Online article

Prot2HG: a database of protein domains mapped to the human genome

... all human protein domains, residue positions were reverse translated and mapped to the reference genome ...

Database. 2020 ; 2020 (-) : . [pub] 20200101

ISSN 1758-0463
Medvik
Source

Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download as well as incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated and mapped to the reference genome hg19 and stored in a MySQL database. In total, 760 487 protein domains from 42 371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented into the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more often annotated onto a protein domain in comparison to common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency. Database URL: www.prot2hg.com.
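
The core question this resource answers (does a genomic position fall inside an annotated protein domain?) is an interval lookup. The Python sketch below illustrates a multiple-site query against a tiny in-memory table. The `Domain` record, the domain names and all coordinates are hypothetical; the real resource stores its mappings in a MySQL database, whose schema is not described in the abstract.

```python
from typing import Dict, List, NamedTuple, Tuple


class Domain(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive (hypothetical coordinates)
    end: int     # 1-based, inclusive
    name: str


# Hypothetical domain intervals, for illustration only.
DOMAINS = [
    Domain("chr1", 1000, 1299, "example_domain_A"),
    Domain("chr1", 5000, 5449, "example_domain_B"),
    Domain("chr2", 200, 499, "example_domain_C"),
]


def domains_at(chrom: str, pos: int) -> List[str]:
    """Names of all domains whose interval contains the position."""
    return [d.name for d in DOMAINS
            if d.chrom == chrom and d.start <= pos <= d.end]


def annotate(sites: List[Tuple[str, int]]) -> Dict[Tuple[str, int], List[str]]:
    """Multiple-site query: map each (chrom, pos) to overlapping domains."""
    return {site: domains_at(*site) for site in sites}


print(annotate([("chr1", 1000), ("chr1", 1300), ("chr2", 350)]))
```

A linear scan is enough for a toy table; at the scale quoted in the abstract, an indexed database query or an interval tree would be the more realistic choice.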

MeSH
Molecular Sequence Annotation methods MeSH
Data Mining methods MeSH
Databases, Genetic * MeSH
Data Curation methods MeSH
Genetic Variation * MeSH
Genome, Human genetics MeSH
Genomics methods MeSH
Internet MeSH
Humans MeSH
Protein Domains genetics MeSH
Proteins chemistry genetics metabolism MeSH
Computational Biology methods MeSH
Check Tag
Humans MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH

Online article

Genomic surveillance of invasive meningococcal disease in the Czech Republic, 2015-2017

... INTRODUCTION: The study presents the results of the genomic surveillance of invasive meningococcal disease ...

PLoS One. 2019 ; 14 (7) : e0219477. [pub] 20190711

ISSN 1932-6203
Medvik
Source

INTRODUCTION: The study presents the results of the genomic surveillance of invasive meningococcal disease (IMD) in the Czech Republic for the period of 2015-2017. MATERIAL AND METHODS: The study set includes all available IMD isolates recovered in the Czech Republic and referred to the National Reference Laboratory for Meningococcal Infections in 2015-2017, a total of 89 Neisseria meningitidis isolates from 2015 (n = 20), 2016 (n = 27), and 2017 (n = 42). All isolates were studied by whole genome sequencing (WGS). RESULTS: Serogroup B (MenB) was the most common, followed by serogroups C, W, and Y. Altogether 17 clonal complexes were identified, the most common of which was hypervirulent complex cc11, followed by complexes cc32, cc41/44, cc269, and cc865. Over the three study years, hypervirulent cc11 (MenC) showed an upward trend. The WGS method showed two clearly differentiated clusters of N. meningitidis C: P1.5,2:F3-3:ST-11 (cc11). The first cluster is represented by nine isolates, all of which are from 2017. The second cluster consisted of five isolates from 2016 and eight isolates from 2017. Their genetic discordance is illustrated by the changing nadA allele and subsequently by the variance in BAST type. Clonal complex cc269 (MenB) also increased over the time frame. WGS identified the presence of MenB vaccine antigen genes in all B and non-B isolates of N. meningitidis. Altogether 49 different Bexsero antigen sequence types (BAST) were identified and 10 combinations of these have not been previously described in the PubMLST database. CONCLUSIONS: The genomic surveillance of IMD in the Czech Republic provides data needed to update immunisation guidelines for this disease. WGS showed a higher discrimination power and provided more accurate data on molecular characteristics and genetic relationships among invasive N. meningitidis isolates.

MeSH
Antigens, Bacterial genetics MeSH
Genome, Bacterial genetics MeSH
Genomics MeSH
Humans MeSH
Meningococcal Infections epidemiology genetics microbiology MeSH
Neisseria meningitidis, Serogroup B genetics pathogenicity MeSH
Neisseria meningitidis genetics pathogenicity MeSH
Whole Genome Sequencing MeSH
Vaccination MeSH
Check Tag
Humans MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH
Geographicals
Czech Republic MeSH
