The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved by ISO as a standard for sharing clinical and genomic information about an individual, including phenotypic descriptions, numerical measurements, genetic information, diagnoses, and treatments. A phenopacket can be used as an input file for software that supports phenotype-driven genomic diagnostics and for algorithms that facilitate patient classification and stratification for identifying new diseases and treatments. There has been a great need for a collection of phenopackets to test software pipelines and algorithms. Here, we present Phenopacket Store. Phenopacket Store v.0.1.19 includes 6,668 phenopackets representing 475 Mendelian and chromosomal diseases associated with 423 genes and 3,834 unique pathogenic alleles curated from 959 different publications. This represents the first large-scale collection of case-level, standardized phenotypic information derived from case reports in the literature with detailed descriptions of the clinical data and will be useful for many purposes, including the development and testing of software for prioritizing genes and diseases in diagnostic genomics, machine learning analysis of clinical phenotype data, patient stratification, and genotype-phenotype correlations. This corpus also provides best-practice examples for curating literature-derived data using the GA4GH Phenopacket Schema.
- MeSH
- algorithms MeSH
- databases, genetic MeSH
- phenotype * MeSH
- genomics * methods MeSH
- humans MeSH
- software * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
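The Phenopacket Store abstract above notes that a phenopacket can serve as an input file for phenotype-driven software. A minimal sketch of what reading one might look like follows, using only the Python standard library. The JSON document is a hypothetical example written for illustration and is not taken from Phenopacket Store; the helper name `split_features` is likewise an assumption. The field names (`phenotypicFeatures`, `type`, `excluded`) follow the JSON form of the GA4GH Phenopacket Schema.

```python
import json

# Hypothetical minimal phenopacket; identifiers and values are illustrative only
# and are NOT taken from Phenopacket Store.
PHENOPACKET_JSON = """
{
  "id": "example-phenopacket-1",
  "subject": {"id": "patient-1", "sex": "FEMALE"},
  "phenotypicFeatures": [
    {"type": {"id": "HP:0001250", "label": "Seizure"}},
    {"type": {"id": "HP:0001263", "label": "Global developmental delay"},
     "excluded": true}
  ],
  "metaData": {"phenopacketSchemaVersion": "2.0"}
}
"""


def split_features(packet):
    """Return (observed, excluded) lists of HPO term ids from a phenopacket dict."""
    observed, excluded = [], []
    for feature in packet.get("phenotypicFeatures", []):
        term_id = feature["type"]["id"]
        # A feature counts as observed unless "excluded" is explicitly true.
        if feature.get("excluded", False):
            excluded.append(term_id)
        else:
            observed.append(term_id)
    return observed, excluded


packet = json.loads(PHENOPACKET_JSON)
observed, excluded = split_features(packet)
print(packet["subject"]["id"], "observed:", observed, "excluded:", excluded)
```

Separating observed from explicitly excluded features matters for downstream prioritization tools, since an excluded term carries different diagnostic information than a term that was simply not recorded.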
Recent advances in protein 3D structure prediction using deep learning have focused on the importance of amino acid residue-residue connections (i.e., pairwise atomic contacts) for accuracy at the expense of mechanistic interpretability. Therefore, we decided to perform a series of analyses based on an alternative framework of residue-residue connections making primary use of the TOP2018 dataset. This framework of residue-residue connections is derived from amino acid residue pairing models both historic and new, all based on genetic principles complemented by relevant biophysical principles. Of these pairing models, three new models (named the GU, Transmuted and Shift pairing models) exhibit the highest observed-over-expected ratios and highest correlations in statistical analyses with various intra- and inter-chain datasets, in comparison to the remaining models. In addition, these new pairing models are universally frequent across different connection ranges, secondary structure connections, and protein sizes. Accordingly, following further statistical and other analyses described herein, we have come to a major conclusion that all three pairing models together could represent the basis of a universal proteomic code (second genetic code) sufficient, in and of itself, to "encode" for both protein folding mechanisms and protein-protein interactions.
- MeSH
- amino acids * chemistry genetics MeSH
- databases, protein MeSH
- humans MeSH
- models, molecular * MeSH
- proteins * chemistry genetics metabolism MeSH
- proteomics * MeSH
- protein folding * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
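The protein pairing abstract above ranks pairing models by observed-over-expected ratios but does not state how those ratios were computed. The sketch below shows one common way such a ratio can be defined, under the assumption that the expected count of a residue pair comes from independent sampling of residues at their overall contact frequencies. This is an illustrative stand-in, not the authors' procedure; the function name and the toy contact list are hypothetical.

```python
from collections import Counter


def observed_over_expected(contacts):
    """Compute observed/expected ratios for unordered residue pairs.

    contacts: list of (residue_a, residue_b) tuples, one per contact.
    Expected counts assume residues pair independently at their overall
    frequencies among contact participants (an assumption of this sketch).
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    residue_counts = Counter(res for pair in contacts for res in pair)
    total_pairs = len(contacts)
    total_residues = 2 * total_pairs

    ratios = {}
    for (a, b), observed in pair_counts.items():
        p_a = residue_counts[a] / total_residues
        p_b = residue_counts[b] / total_residues
        # An unordered pair of two different residues can arise in two orders.
        multiplicity = 1 if a == b else 2
        expected = total_pairs * p_a * p_b * multiplicity
        ratios[(a, b)] = observed / expected
    return ratios


# Toy data (hypothetical): three Ala-Leu contacts and one Ala-Ala contact.
toy_contacts = [("A", "L"), ("L", "A"), ("A", "L"), ("A", "A")]
ratios = observed_over_expected(toy_contacts)
for pair, ratio in sorted(ratios.items()):
    print(pair, round(ratio, 2))
```

A ratio above 1 means the pair occurs more often than the independence baseline predicts, and a ratio below 1 means it occurs less often.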
BACKGROUND: Advancements in artificial intelligence (AI) and machine learning (ML) have revolutionized the medical field and transformed translational medicine. These technologies enable more accurate disease trajectory models while enhancing patient-centered care. However, challenges such as heterogeneous datasets, class imbalance, and scalability remain barriers to achieving optimal predictive performance. METHODS: This study proposes a novel AI-based framework that integrates Gradient Boosting Machines (GBM) and Deep Neural Networks (DNN) to address these challenges. The framework was evaluated using two distinct datasets: MIMIC-IV, a critical care database containing clinical data of critically ill patients, and the UK Biobank, which comprises genetic, clinical, and lifestyle data from 500,000 participants. Key performance metrics, including Accuracy, Precision, Recall, F1-Score, and AUROC, were used to assess the framework against traditional and advanced ML models. RESULTS: The proposed framework demonstrated superior performance compared to classical models such as Logistic Regression, Random Forest, Support Vector Machines (SVM), and Neural Networks. For example, on the UK Biobank dataset, the model achieved an AUROC of 0.96, significantly outperforming Neural Networks (0.92). The framework was also efficient, requiring only 32.4 s for training on MIMIC-IV, with low prediction latency, making it suitable for real-time applications. CONCLUSIONS: The proposed AI-based framework effectively addresses critical challenges in translational medicine, offering superior predictive accuracy and efficiency. Its robust performance across diverse datasets highlights its potential for integration into real-time clinical decision support systems, facilitating personalized medicine and improving patient outcomes. Future research will focus on enhancing scalability and interpretability for broader clinical applications.
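The threshold-based metrics named in the abstract above (Accuracy, Precision, Recall, F1-Score) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not results from the study; AUROC is omitted because it requires ranked prediction scores rather than counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the formulas.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With class imbalance, which the abstract lists as a challenge, accuracy alone can be misleading, which is one reason precision, recall and F1 are reported alongside it.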
Today, MALDI-ToF MS is an established technique to characterize and identify pathogenic bacteria. The technique is increasingly applied by clinical microbiological laboratories that use commercially available complete solutions, including spectra databases covering clinically relevant bacteria. Such databases are validated for clinical or research applications but are often less comprehensive concerning highly pathogenic bacteria (HPB). To improve MALDI-ToF MS diagnostics of HPB, we initiated a program many years ago to develop protocols for reliable, MALDI-compatible microbial inactivation and to acquire mass spectra of the inactivated samples. As a result of this project, databases covering HPB, closely related bacteria, and bacteria of clinical relevance have been made publicly available on platforms such as ZENODO. This publication describes the most recent version of this database in detail. The dataset contains a total of 11,055 spectra from 1,601 microbial strains and 264 species and is primarily intended to improve the diagnosis of HPB. We hope that our MALDI-ToF MS data may also be a valuable resource for developing machine learning-based bacterial identification and classification methods.
Genetic diagnosis of rare diseases requires accurate identification and interpretation of genomic variants. Clinical and molecular scientists from 37 expert centers across Europe created the Solve-Rare Diseases Consortium (Solve-RD) resource, encompassing clinical, pedigree and genomic rare-disease data (94.5% exomes, 5.5% genomes), and performed systematic reanalysis for 6,447 individuals (3,592 male, 2,855 female) with previously undiagnosed rare diseases from 6,004 families. We established a collaborative, two-level expert review infrastructure that allowed a genetic diagnosis in 506 (8.4%) families. Of 552 disease-causing variants identified, 464 (84.1%) were single-nucleotide variants or short insertions/deletions. These variants were either located in recently published novel disease genes (n = 67), recently reclassified in ClinVar (n = 187) or reclassified by consensus expert decision within Solve-RD (n = 210). Bespoke bioinformatics analyses identified the remaining 15.9% of causative variants (n = 88). Ad hoc expert review, parallel to the systematic reanalysis, diagnosed 249 (4.1%) additional families for an overall diagnostic yield of 12.6%. The infrastructure and collaborative networks set up by Solve-RD can serve as a blueprint for future scalable international efforts. The resource is open to the global rare-disease community, allowing phenotype, variant and gene queries, as well as genome-wide discoveries.
- MeSH
- databases, genetic MeSH
- exome genetics MeSH
- genome, human genetics MeSH
- genomics * methods MeSH
- humans MeSH
- pedigree MeSH
- computational biology methods MeSH
- rare diseases * genetics diagnosis MeSH
- Check Tag
- humans MeSH
- male MeSH
- female MeSH
- Publication type
- journal article MeSH
- Geographic names
- Europe MeSH
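The percentages in the Solve-RD abstract above can be re-derived from the counts it reports. The short worked calculation below is only an arithmetic check on the published figures. The variable and function names are illustrative.

```python
# Counts as reported in the Solve-RD abstract.
FAMILIES = 6004
SYSTEMATIC_DIAGNOSED = 506   # families diagnosed by systematic reanalysis
AD_HOC_DIAGNOSED = 249       # additional families diagnosed by ad hoc review
VARIANTS_TOTAL = 552
VARIANTS_SNV_INDEL = 67 + 187 + 210  # novel genes + ClinVar + consensus
VARIANTS_BESPOKE = 88


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


systematic_yield = percent(SYSTEMATIC_DIAGNOSED, FAMILIES)
ad_hoc_yield = percent(AD_HOC_DIAGNOSED, FAMILIES)
overall_yield = percent(SYSTEMATIC_DIAGNOSED + AD_HOC_DIAGNOSED, FAMILIES)
snv_indel_share = percent(VARIANTS_SNV_INDEL, VARIANTS_TOTAL)
bespoke_share = percent(VARIANTS_BESPOKE, VARIANTS_TOTAL)

print(f"systematic yield: {systematic_yield:.1f}%")
print(f"ad hoc yield:     {ad_hoc_yield:.1f}%")
print(f"overall yield:    {overall_yield:.1f}%")
print(f"SNV/indel share:  {snv_indel_share:.1f}%")
print(f"bespoke share:    {bespoke_share:.1f}%")
```

The overall yield is computed from the summed family counts (755 of 6,004) rather than by adding the two rounded percentages, which would give 12.5% instead of the reported 12.6%.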
BACKGROUND: The Clinical Genome Resource (ClinGen) is an international collaborative effort among scientists and clinicians, diagnostic and research laboratories, and the patient community. Using a standardized framework, ClinGen has established guidelines to classify gene-disease relationships as definitive, strong, moderate, and limited on the basis of available scientific and clinical evidence. When the genetic and functional evidence for a gene-disease relationship has conflicting interpretations or is contradictory, the relationship can be classified as disputed or refuted. OBJECTIVE: We assessed genes related to primary antibody deficiencies. METHODS: The ClinGen Antibody Deficiencies Gene Curation Expert Panel, using the ClinGen framework, classified genes related to primary antibody deficiency that primarily affect B-cell development and/or function, and that account for the largest proportion of inborn errors of immunity or primary immunodeficiencies. RESULTS: The expert panel curated a total of 65 genes associated with humoral immune defects to validate 74 gene-disease relationships. Of these, 40 were classified as definitive, 1 as strong, 16 as moderate, 15 as limited, and 2 as disputed. The curation process involved reviewing 490 patient records and 3546 associated Human Phenotype Ontology entries. The 3 most frequently observed terms related to primary antibody deficiency were decreased circulating antibody level, pneumonia, and lymphadenopathy. CONCLUSIONS: These curations (publicly available at ClinicalGenome.org) represent the first effort to provide a comprehensive genetic and phenotypic revision of genetic disorders affecting humoral immunity, as reviewed and approved by experts in the field.
OBJECTIVE: The objective of this study is to evaluate whether adding oral glucocorticoids to immunosuppressive therapy improves skin scores and ensures safety in patients with early diffuse cutaneous systemic sclerosis (dcSSc). METHODS: We performed an emulated randomized trial comparing the changes from baseline to 12 ± 3 months of the modified Rodnan skin score (mRSS: primary outcome) in patients with early dcSSc receiving either oral glucocorticoids (≤20 mg/day prednisone equivalent) combined with immunosuppression (treated) or immunosuppression alone (controls), using data from the European Scleroderma Trials and Research Group. Secondary end points were the differences in the occurrence of progressive skin or lung fibrosis and of scleroderma renal crisis. Propensity score matching was used to adjust for baseline imbalances between groups. RESULTS: We matched 208 patients (mean age 49 years; 33% male; 59% anti-Scl70), 104 in each treatment group, obtaining comparable characteristics at baseline. In the treated group, patients received a median prednisone dose of 5 mg/day. Mean mRSS change at 12 ± 3 months was similar in the two groups (decrease of 2.7 [95% confidence interval {95% CI} 1.4-4.0] in treated vs 3.1 [95% CI 1.9-4.4] in control, P = 0.64). Similar results were observed in patients with shorter disease duration (≤24 months) or with mRSS ≤22. There was no between-group difference for any prespecified secondary outcome. One case of scleroderma renal crisis occurred in each group. CONCLUSION: We did not find any significant benefit of adding low-dose oral glucocorticoids to immunosuppression for skin fibrosis, and at this dosage, glucocorticoids did not increase the risk of scleroderma renal crisis.
- MeSH
- administration, oral MeSH
- databases, factual MeSH
- scleroderma, diffuse * drug therapy pathology diagnosis MeSH
- adult MeSH
- fibrosis MeSH
- glucocorticoids * administration & dosage adverse effects MeSH
- immunosuppressive agents * administration & dosage adverse effects MeSH
- drug therapy, combination MeSH
- skin * pathology drug effects MeSH
- middle aged MeSH
- humans MeSH
- prednisone * administration & dosage adverse effects MeSH
- treatment outcome MeSH
- Check Tag
- adult MeSH
- middle aged MeSH
- humans MeSH
- male MeSH
- female MeSH
- Publication type
- journal article MeSH
- randomized controlled trial MeSH
- Geographic names
- Europe MeSH
Specialized or secondary metabolites are small molecules of biological origin, often showing potent biological activities with applications in agriculture, engineering and medicine. Usually, the biosynthesis of these natural products is governed by sets of co-regulated and physically clustered genes known as biosynthetic gene clusters (BGCs). To share information about BGCs in a standardized and machine-readable way, the Minimum Information about a Biosynthetic Gene cluster (MIBiG) data standard and repository was initiated in 2015. Since its conception, MIBiG has been regularly updated to expand data coverage and remain up to date with innovations in natural product research. Here, we describe MIBiG version 4.0, an extensive update to the data repository and the underlying data standard. In a massive community annotation effort, 267 contributors performed 8304 edits, creating 557 new entries and modifying 590 existing entries, resulting in a new total of 3059 curated entries in MIBiG. Particular attention was paid to ensuring high data quality, with automated data validation using a newly developed custom submission portal prototype, paired with a novel peer-reviewing model. MIBiG 4.0 also takes steps towards a rolling release model and a broader involvement of the scientific community. MIBiG 4.0 is accessible online at https://mibig.secondarymetabolites.org/.
The European Chemical Biology Database (ECBD, https://ecbd.eu) serves as the central repository for data generated by the EU-OPENSCREEN research infrastructure consortium. It is developed according to FAIR principles, which emphasize findability, accessibility, interoperability and reusability of data. This data is made available to the scientific community following open access principles. The ECBD stores both positive and negative results from the entire chemical biology project pipeline, including data from primary or counter-screening assays. The assays utilize a defined and diverse library of over 107 000 compounds, the annotations of which are continuously enriched by external user-supported screening projects and by internal EU-OPENSCREEN bioprofiling efforts. These compounds were screened in 89 currently deposited datasets (assays), of which 48 are already publicly accessible, while the remainder will be published after a publication embargo period of up to 3 years. Together, these datasets encompass ∼4.3 million experimental data points. All public data within ECBD can be accessed through its user interface, API or by database dump under the CC-BY 4.0 license.
PURPOSE: The purpose of this narrative review is to provide a comparison of several countries with different legislation and approaches to pharmacovigilance and to point out how these impact the number of adverse drug reactions (ADRs) that are reported to national competent authorities. METHODS: Legislative and statistical data regarding ADR reporting from various national competent authorities' websites, databases, and pharmacovigilance centers were used, in combination with the WHO pharmacovigilance quantitative indicator, which was applied to evaluate the effectiveness of the national pharmacovigilance systems in our scope. RESULTS: The study compared pharmacovigilance systems in six countries, focusing on ADR reporting from 2010 onwards. All countries required marketing authorization holders (MAHs) to report ADRs, while healthcare professionals' obligations varied. Per-capita ADR reports increased in all countries with available data, with the United States having a significantly higher reporting rate, possibly due to FDA campaigns. Despite starting later, China's per-capita reporting rate surpassed that of the Czech Republic and Japan. The study highlighted various measures taken by countries to enhance ADR reporting systems since the inception of their programs, contributing to the overall increase in reporting rates. CONCLUSIONS: ADR reporting is a global priority, with efforts made by different countries to strengthen their pharmacovigilance systems. Some success can be seen in gradually improving per-capita ADR reporting rates. The varying reporting rates and measures taken by each country may serve as a basis for further research and exchange of best practices to improve drug safety monitoring worldwide.