Computational pipeline
INTRODUCTION: The histopathological classification for antineutrophil cytoplasmic autoantibody (ANCA)-associated glomerulonephritis (ANCA-GN) is a well-established tool to reflect the variety of patterns and severity of lesions that can occur in kidney biopsies. It was demonstrated previously that deep learning (DL) approaches can aid in identifying histopathological classes of kidney diseases; for example, of diabetic kidney disease. These models can potentially be used as decision support tools for kidney pathologists. Although they reach high prediction accuracies, their "black box" structure makes them nontransparent. Explainable (X) artificial intelligence (AI) techniques can be used to make the AI model decisions accessible for human experts. We have developed a DL-based model, which detects and classifies the glomerular lesions according to the Berden classification. METHODS: Kidney biopsy slides of 80 patients with ANCA-GN from 3 European centers, who underwent a diagnostic kidney biopsy between 1991 and 2011, were included. We also investigated the explainability of our model using Gradient-weighted Class Activation Mapping (Grad-CAM) heatmaps. These maps were analyzed by pathologists to compare the decision-making criteria of humans and the DL model and assess the impact of different training settings. RESULTS: The DL model shows a prediction accuracy of 93% for classifying lesions. The heatmaps from our trained DL models showed that the most predictive areas in the image correlated well with the areas deemed to be important by the pathologist. CONCLUSION: We present the first DL-based computational pipeline for classifying ANCA-GN kidney biopsies as per the Berden classification. XAI techniques helped us to make the decision-making criteria of the DL accessible for renal pathologists, potentially improving clinical decision-making.
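As a rough illustration of the Grad-CAM heatmaps described in this abstract, the sketch below computes a class activation map from toy numbers. It is not the authors' model: the function name `grad_cam` and the tiny 2x2 feature maps and gradients are assumptions made for illustration. In practice these arrays would be taken from a convolutional layer of the trained network.

```python
def grad_cam(activations, gradients):
    """Toy Grad-CAM on nested lists.

    activations: K feature maps (each H x W) from a convolutional layer.
    gradients:   K maps (each H x W) holding d(class score)/d(activation).
    Returns an H x W heatmap scaled to [0, 1].
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average of each gradient map.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]

    # Weighted sum of the feature maps.
    cam = [[0.0] * width for _ in range(height)]
    for weight, feature_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * feature_map[i][j]

    # ReLU keeps only regions that support the class.
    cam = [[max(0.0, value) for value in row] for row in cam]

    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Hypothetical values: channel 0 supports the class, channel 1 opposes it.
toy_activations = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
toy_gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(toy_activations, toy_gradients)
```

The resulting map highlights only the top-left cell, the region where the positively weighted channel is active. This is the kind of map a pathologist would compare against the areas they consider diagnostically relevant.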
- Publication type
- Journal Article MeSH
The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved by ISO as a standard for sharing clinical and genomic information about an individual, including phenotypic descriptions, numerical measurements, genetic information, diagnoses, and treatments. A phenopacket can be used as an input file for software that supports phenotype-driven genomic diagnostics and for algorithms that facilitate patient classification and stratification for identifying new diseases and treatments. There has been a great need for a collection of phenopackets to test software pipelines and algorithms. Here, we present Phenopacket Store. Phenopacket Store v.0.1.19 includes 6,668 phenopackets representing 475 Mendelian and chromosomal diseases associated with 423 genes and 3,834 unique pathogenic alleles curated from 959 different publications. This represents the first large-scale collection of case-level, standardized phenotypic information derived from case reports in the literature with detailed descriptions of the clinical data and will be useful for many purposes, including the development and testing of software for prioritizing genes and diseases in diagnostic genomics, machine learning analysis of clinical phenotype data, patient stratification, and genotype-phenotype correlations. This corpus also provides best-practice examples for curating literature-derived data using the GA4GH Phenopacket Schema.
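To make the idea of a phenopacket as an input file more concrete, here is a minimal sketch of a phenopacket-like JSON document built in Python. The field names follow the GA4GH Phenopacket Schema v2 JSON layout as far as we are aware; the case identifier, curator name, resource version and the helper `observed_terms` are hypothetical and are not taken from Phenopacket Store.

```python
import json

# Hypothetical case; identifiers and dates are placeholders.
phenopacket = {
    "id": "example-case-1",
    "subject": {"id": "proband-1", "sex": "FEMALE"},
    "phenotypicFeatures": [
        {"type": {"id": "HP:0001250", "label": "Seizure"}},
        {
            "type": {"id": "HP:0001263", "label": "Global developmental delay"},
            "excluded": True,  # explicitly reported as absent
        },
    ],
    "metaData": {
        "created": "2024-01-01T00:00:00Z",
        "createdBy": "example-curator",
        "resources": [
            {
                "id": "hp",
                "name": "human phenotype ontology",
                "url": "http://purl.obolibrary.org/obo/hp.owl",
                "version": "placeholder-version",
                "namespacePrefix": "HP",
                "iriPrefix": "http://purl.obolibrary.org/obo/HP_",
            }
        ],
        "phenopacketSchemaVersion": "2.0",
    },
}


def observed_terms(packet):
    """Return ontology term IDs of features that were observed (not excluded)."""
    return [
        feature["type"]["id"]
        for feature in packet.get("phenotypicFeatures", [])
        if not feature.get("excluded", False)
    ]


# Serialise to JSON text, as a downstream tool would read it from a file.
serialized = json.dumps(phenopacket, indent=2)
restored = json.loads(serialized)
```

A phenotype-driven prioritisation tool would typically start from the list returned by `observed_terms(restored)` and treat excluded terms as negative evidence.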
- MeSH
- Algorithms MeSH
- Databases, Genetic MeSH
- Phenotype * MeSH
- Genomics * methods MeSH
- Humans MeSH
- Software * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
BACKGROUND: Neuromuscular diseases (NMDs) are rare disorders characterized by progressive muscle fibre loss, leading to replacement by fibrotic and fatty tissue, muscle weakness and disability. Early diagnosis is critical for therapeutic decisions, care planning and genetic counselling. Muscle magnetic resonance imaging (MRI) has emerged as a valuable diagnostic tool by identifying characteristic patterns of muscle involvement. However, the increasing complexity of these patterns complicates their interpretation, limiting their clinical utility. Additionally, multi-study data aggregation introduces heterogeneity challenges. This study presents a novel multi-study harmonization pipeline for muscle MRI and an AI-driven diagnostic tool to assist clinicians in identifying disease-specific muscle involvement patterns. METHODS: We developed a preprocessing pipeline to standardize MRI fat content across datasets, minimizing source bias. An ensemble of XGBoost models was trained to classify patients based on intramuscular fat replacement, age at MRI and sex. The SHapley Additive exPlanations (SHAP) framework was adapted to analyse model predictions and identify disease-specific muscle involvement patterns. To address class imbalance, training and evaluation were conducted using class-balanced metrics. The model's performance was compared against four expert clinicians using 14 previously unseen MRI scans. RESULTS: Using our harmonization approach, we curated a dataset of 2961 MRI samples from genetically confirmed cases of 20 paediatric and adult NMDs. The model achieved a balanced accuracy of 64.8% ± 3.4%, with a weighted top-3 accuracy of 84.7% ± 1.8% and top-5 accuracy of 90.2% ± 2.4%. It also identified key features relevant for differential diagnosis, aiding clinical decision-making. Compared to four expert clinicians, the model obtained the highest top-3 accuracy (75.0% ± 4.8%). 
The diagnostic tool has been implemented as a free web platform, providing global access to the medical community. CONCLUSIONS: The application of AI in muscle MRI for NMD diagnosis remains underexplored due to data scarcity. This study introduces a framework for dataset harmonization, enabling advanced computational techniques. Our findings demonstrate the potential of AI-based approaches to enhance differential diagnosis by identifying disease-specific muscle involvement patterns. The developed tool surpasses expert performance in diagnostic ranking and is accessible to clinicians worldwide via the Myo-Guide online platform.
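The class-balanced and top-k metrics reported in this abstract can be illustrated with a small sketch. It assumes the common definitions (balanced accuracy as the mean of per-class recall, top-k accuracy as the share of cases whose true label is among the k highest-ranked predictions). The authors' exact weighting for "weighted top-3 accuracy" is not given in the abstract, so the version here is unweighted, and the labels `A`, `B`, `C` are placeholders rather than real diagnoses.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count as much as common ones."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == prediction)
    return sum(hits[label] / totals[label] for label in totals) / len(totals)


def top_k_accuracy(y_true, ranked_predictions, k):
    """Share of cases whose true label is among the first k ranked predictions."""
    found = sum(
        1 for truth, ranking in zip(y_true, ranked_predictions) if truth in ranking[:k]
    )
    return found / len(y_true)


# Imbalanced toy data: a model that always predicts "A".
truth = ["A", "A", "A", "A", "B"]
always_a = ["A"] * 5
plain = sum(t == p for t, p in zip(truth, always_a)) / len(truth)
balanced = balanced_accuracy(truth, always_a)

# Ranked differential diagnoses for three cases whose true label is "A".
rankings = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
top2 = top_k_accuracy(["A", "A", "A"], rankings, k=2)
```

In the toy data the plain accuracy is 0.8 while the balanced accuracy is 0.5, which shows why class-balanced metrics matter when some diseases are much rarer than others in the dataset.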
- MeSH
- Adult MeSH
- Internet MeSH
- Middle Aged MeSH
- Humans MeSH
- Magnetic Resonance Imaging * methods MeSH
- Neuromuscular Diseases * diagnosis, diagnostic imaging MeSH
- Machine Learning * MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
Tools for post-operative localization of deep brain stimulation (DBS) electrodes may be of major benefit in the evaluation of the stimulation area. However, little is known about their precision. This study compares 3 different software packages used for DBS electrode localization. T1-weighted MRI images before and after the implantation of the electrodes into the subthalamic nucleus for DBS in 105 Parkinson's disease patients were processed using the pipelines implemented in Lead-DBS, SureTune4, and Brainlab. Euclidean distance between active contacts determined by individual software packages and in repeated processing by the same and by a different operator was calculated. Furthermore, Dice coefficient for overlap of volume of tissue activated (VTA) was determined for Lead-DBS. Medians of Euclidean distances between estimated active contact locations in inter-software package comparison ranged between 1.5 mm and 2 mm. Euclidean distances in within-software package intra- and inter-rater assessments were 0.6-1 mm and 1-1.7 mm, respectively. Median intra- and inter-rater Dice coefficients for VTAs were 0.78 and 0.75, respectively. Since the median distances are close to the size of the target nucleus, any clinical use should be preceded by careful review of the outputs.
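The two agreement measures used in this abstract, Euclidean distance between contact positions and the Dice coefficient for VTA overlap, can be sketched as follows. The coordinates and voxel sets are invented for illustration and the function names are assumptions. None of the three software packages is called here.

```python
import math


def euclidean_distance(point_a, point_b):
    """Straight-line distance between two 3D coordinates (same units, e.g. mm)."""
    return math.dist(point_a, point_b)


def dice_coefficient(voxels_a, voxels_b):
    """Dice overlap of two volumes given as collections of voxel indices."""
    set_a, set_b = set(voxels_a), set(voxels_b)
    if not set_a and not set_b:
        return 1.0  # two empty volumes are treated as identical
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


# Hypothetical active-contact coordinates from two localization runs (mm).
contact_run_1 = (0.0, 0.0, 0.0)
contact_run_2 = (1.0, 2.0, 2.0)
distance_mm = euclidean_distance(contact_run_1, contact_run_2)

# Hypothetical VTAs as sets of voxel indices.
vta_run_1 = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
vta_run_2 = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}
overlap = dice_coefficient(vta_run_1, vta_run_2)
```

A Dice coefficient of 1 means identical volumes and 0 means no shared voxels, so the reported medians of 0.78 and 0.75 indicate substantial but incomplete overlap between repeated VTA estimates.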
- MeSH
- Deep Brain Stimulation * methods, instrumentation MeSH
- Electrodes, Implanted * MeSH
- Middle Aged MeSH
- Humans MeSH
- Magnetic Resonance Imaging MeSH
- Subthalamic Nucleus surgery MeSH
- Parkinson Disease * therapy MeSH
- Aged MeSH
- Software MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
OBJECTIVE: Thyroid cancer (TC) is the most common endocrine malignancy, with 90%-95% of the cases representing non-medullary thyroid cancer (NMTC). Familial cases account only for a few of all cases and the underlying genetic causes are still poorly understood. METHODS: We whole-genome sequenced affected and unaffected members of an Italian NMTC family and applied our in-house developed Familial Cancer Variant Prioritization Pipeline (FCVPPv2) which prioritized 12 coding variants. We refined this selection using the VarSome American College of Medical Genetics and Genomics (ACMG) implementation, SNAP2 predictions and further in silico scores. RESULTS: We prioritized 4 possibly pathogenic variants in 4 genes including Ret proto-oncogene (RET), polypeptide N-acetylgalactosaminyltransferase 10 (GALNT10), ubinuclein-1 (UBN1), and prostaglandin I2 receptor (PTGIR). The role of RET point mutations in medullary thyroid carcinoma is well established. Similarly, somatic rearrangements of RET are known in papillary TC, a specific histotype of NMTC. In contrast to RET, no germline variants in PTGIR, GALNT10, or UBN1 have been linked to the development of TC to date. However, alterations in these genes have been shown to affect pathways related to cell proliferation, apoptosis, growth, and differentiation, as well as posttranslational modification and gene regulation. A thorough review of the available literature together with computational evidence supported the interpretation of the 4 shortlisted variants as possibly disease-causing in this family. CONCLUSIONS: Our results implicate the first germline variant in RET in a family with NMTC as well as the first germline variants in PTGIR, GALNT10, and UBN1 in TC.
- MeSH
- Adult MeSH
- Genetic Predisposition to Disease * genetics MeSH
- Middle Aged MeSH
- Humans MeSH
- N-Acetylgalactosaminyltransferases genetics MeSH
- Thyroid Neoplasms * genetics MeSH
- Carcinoma, Neuroendocrine * genetics MeSH
- Polypeptide N-acetylgalactosaminyltransferase MeSH
- Proto-Oncogene Mas MeSH
- Proto-Oncogene Proteins c-ret genetics MeSH
- Pedigree MeSH
- Whole Genome Sequencing * methods MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
The European Chemical Biology Database (ECBD, https://ecbd.eu) serves as the central repository for data generated by the EU-OPENSCREEN research infrastructure consortium. It is developed according to FAIR principles, which emphasize findability, accessibility, interoperability and reusability of data. This data is made available to the scientific community following open access principles. The ECBD stores both positive and negative results from the entire chemical biology project pipeline, including data from primary or counter-screening assays. The assays utilize a defined and diverse library of over 107 000 compounds, the annotations of which are continuously enriched by external user supported screening projects and by internal EU-OPENSCREEN bioprofiling efforts. These compounds were screened in 89 currently deposited datasets (assays), with 48 already being publicly accessible, while the remaining will be published after a publication embargo period of up to 3 years. Together these datasets encompass ∼4.3 million experimental data points. All public data within ECBD can be accessed through its user interface, API or by database dump under the CC-BY 4.0 license.
PURPOSE: $K^{\mathrm{trans}}$ has often been proposed as a quantitative imaging biomarker for diagnosis, prognosis, and treatment response assessment for various tumors. None of the many software tools for $K^{\mathrm{trans}}$ quantification are standardized. The ISMRM Open Science Initiative for Perfusion Imaging-Dynamic Contrast-Enhanced (OSIPI-DCE) challenge was designed to benchmark methods to better help the efforts to standardize $K^{\mathrm{trans}}$ measurement. METHODS: A framework was created to evaluate $K^{\mathrm{trans}}$ values produced by DCE-MRI analysis pipelines to enable benchmarking. The perfusion MRI community was invited to apply their pipelines for $K^{\mathrm{trans}}$ quantification in glioblastoma from clinical and synthetic patients. Submissions were required to include the entrants' $K^{\mathrm{trans}}$ values, the applied software, and a standard operating procedure. These were evaluated using the proposed $\mathrm{OSIPI}_{\mathrm{gold}}$ score defined with accuracy, repeatability, and reproducibility components. RESULTS: Across the 10 received submissions, the $\mathrm{OSIPI}_{\mathrm{gold}}$ score ranged from 28% to 78% with a 59% median. The accuracy, repeatability, and reproducibility scores ranged from 0.54 to 0.92, 0.64 to 0.86, and 0.65 to 1.00, respectively (0-1 = lowest-highest). Manual arterial input function selection markedly affected the reproducibility and showed greater variability in $K^{\mathrm{trans}}$ analysis than automated methods. Furthermore, provision of a detailed standard operating procedure was critical for higher reproducibility. CONCLUSIONS: This study reports results from the OSIPI-DCE challenge and highlights the high inter-software variability within $K^{\mathrm{trans}}$ estimation, providing a framework for ongoing benchmarking against the scores presented.
Through this challenge, the participating teams were ranked based on the performance of their software tools in the particular setting of this challenge. In a real-world clinical setting, many of these tools may perform differently with different benchmarking methodology.
- MeSH
- Algorithms MeSH
- Contrast Media * MeSH
- Humans MeSH
- Magnetic Resonance Imaging * methods MeSH
- Reproducibility of Results MeSH
- Software MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
The inherent diversity of approaches in proteomics research has led to a wide range of software solutions for data analysis. These software solutions encompass multiple tools, each employing different algorithms for various tasks such as peptide-spectrum matching, protein inference, quantification, statistical analysis, and visualization. To enable an unbiased comparison of commonly used bottom-up label-free proteomics workflows, we introduce WOMBAT-P, a versatile platform designed for automated benchmarking and comparison. WOMBAT-P simplifies the processing of public data by utilizing the sample and data relationship format for proteomics (SDRF-Proteomics) as input. This feature streamlines the analysis of annotated local or public ProteomeXchange data sets, promoting efficient comparisons among diverse outputs. Through an evaluation using experimental ground truth data and a realistic biological data set, we uncover significant disparities and a limited overlap in the quantified proteins. WOMBAT-P not only enables rapid execution and seamless comparison of workflows but also provides valuable insights into the capabilities of different software solutions. These benchmarking metrics are a valuable resource for researchers in selecting the most suitable workflow for their specific data sets. The modular architecture of WOMBAT-P promotes extensibility and customization. The software is available at https://github.com/wombat-p/WOMBAT-Pipelines.
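One simple way to quantify the "limited overlap in the quantified proteins" mentioned in this abstract is a pairwise Jaccard index between the protein sets reported by each workflow. The sketch below is an assumption about how such a comparison could be done, not WOMBAT-P's own metric; the workflow names and protein identifiers are placeholders.

```python
from itertools import combinations


def jaccard(proteins_a, proteins_b):
    """Jaccard index: shared proteins divided by all proteins seen in either set."""
    set_a, set_b = set(proteins_a), set(proteins_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def pairwise_overlap(results):
    """Jaccard index for every pair of workflows, keyed by sorted name pair."""
    return {
        (first, second): jaccard(results[first], results[second])
        for first, second in combinations(sorted(results), 2)
    }


# Hypothetical quantified-protein sets from three workflows.
quantified = {
    "workflow_a": {"P1", "P2", "P3"},
    "workflow_b": {"P2", "P3", "P4"},
    "workflow_c": {"P3"},
}
overlaps = pairwise_overlap(quantified)
```

Values near 1 mean two workflows quantify almost the same proteins; low values indicate that the choice of workflow strongly affects which proteins appear in the final result.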
- MeSH
- Data Analysis MeSH
- Benchmarking * MeSH
- Proteins MeSH
- Proteomics * MeSH
- Workflow MeSH
- Software MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Up-to-date estimates of stroke burden and attributable risks and their trends at global, regional, and national levels are essential for evidence-based health care, prevention, and resource allocation planning. We aimed to provide such estimates for the period 1990-2021. METHODS: We estimated incidence, prevalence, death, and disability-adjusted life-year (DALY) counts and age-standardised rates per 100 000 people per year for overall stroke, ischaemic stroke, intracerebral haemorrhage, and subarachnoid haemorrhage, for 204 countries and territories from 1990 to 2021. We also calculated burden of stroke attributable to 23 risk factors and six risk clusters (air pollution, tobacco smoking, behavioural, dietary, environmental, and metabolic risks) at the global and regional levels (21 GBD regions and Socio-demographic Index [SDI] quintiles), using the standard GBD methodology. 95% uncertainty intervals (UIs) for each individual future estimate were derived from the 2·5th and 97·5th percentiles of distributions generated from propagating 500 draws through the multistage computational pipeline. FINDINGS: In 2021, stroke was the third most common GBD level 3 cause of death (7·3 million [95% UI 6·6-7·8] deaths; 10·7% [9·8-11·3] of all deaths) after ischaemic heart disease and COVID-19, and the fourth most common cause of DALYs (160·5 million [147·8-171·6] DALYs; 5·6% [5·0-6·1] of all DALYs). In 2021, there were 93·8 million (89·0-99·3) prevalent and 11·9 million (10·7-13·2) incident strokes. We found disparities in stroke burden and risk factors by GBD region, country or territory, and SDI, as well as a stagnation in the reduction of incidence from 2015 onwards, and even some increases in the stroke incidence, death, prevalence, and DALY rates in southeast Asia, east Asia, and Oceania, countries with lower SDI, and people younger than 70 years. 
Globally, ischaemic stroke constituted 65·3% (62·4-67·7), intracerebral haemorrhage constituted 28·8% (28·3-28·8), and subarachnoid haemorrhage constituted 5·8% (5·7-6·0) of incident strokes. There were substantial increases in DALYs attributable to high BMI (88·2% [53·4-117·7]), high ambient temperature (72·4% [51·1 to 179·5]), high fasting plasma glucose (32·1% [26·7-38·1]), diet high in sugar-sweetened beverages (23·4% [12·7-35·7]), low physical activity (11·3% [1·8-34·9]), high systolic blood pressure (6·7% [2·5-11·6]), lead exposure (6·5% [4·5-11·2]), and diet low in omega-6 polyunsaturated fatty acids (5·3% [0·5-10·5]). INTERPRETATION: Stroke burden has increased from 1990 to 2021, and the contribution of several risk factors has also increased. Effective, accessible, and affordable measures to improve stroke surveillance, prevention (with the emphasis on blood pressure, lifestyle, and environmental factors), acute care, and rehabilitation need to be urgently implemented across all countries to reduce stroke burden. FUNDING: Bill & Melinda Gates Foundation.
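The way a 95% uncertainty interval is read off from a set of draws, as described in the METHODS of this abstract, can be sketched with the standard library. The draws below are simulated from an arbitrary normal distribution and are not GBD data; the function name `uncertainty_interval` and the choice of the `inclusive` interpolation method are assumptions, since the abstract does not state how percentiles were interpolated.

```python
import random
import statistics


def uncertainty_interval(draws):
    """Return the 2.5th and 97.5th percentiles of a collection of draws.

    statistics.quantiles with n=40 yields cut points at 2.5%, 5%, ..., 97.5%,
    so the first and last cut points bound the central 95% of the draws.
    """
    cut_points = statistics.quantiles(draws, n=40, method="inclusive")
    return cut_points[0], cut_points[-1]


# Simulated stand-in for 500 draws of one estimate (arbitrary units).
rng = random.Random(0)
simulated_draws = [rng.gauss(100.0, 10.0) for _ in range(500)]
lower, upper = uncertainty_interval(simulated_draws)
point_estimate = statistics.mean(simulated_draws)
print(f"{point_estimate:.1f} (95% UI {lower:.1f}-{upper:.1f})")
```

Because the interval comes from the empirical distribution of draws rather than from a symmetric standard error, it can be asymmetric around the point estimate, which matches the uneven intervals seen in several of the reported figures.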
- MeSH
- Global Health * MeSH
- Stroke * epidemiology MeSH
- Global Burden of Disease * MeSH
- Incidence MeSH
- Quality-Adjusted Life Years MeSH
- Humans MeSH
- Disability-Adjusted Life Years MeSH
- Prevalence MeSH
- Risk Factors MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
Single-cell RNA sequencing (scRNA-seq) methods are widely used in life sciences, including immunology. Typical scRNA-seq analysis pipelines quantify the abundance of particular transcripts without accounting for alternative splicing. However, a well-established pan-leukocyte surface marker, CD45, encoded by the PTPRC gene, presents alternatively spliced variants that define different immune cell subsets. Information about some of the splicing patterns in particular cells in the scRNA-seq data can be obtained using isotype-specific DNA oligo-tagged anti-CD45 antibodies. However, this requires generation of an additional sequencing DNA library. Here, we present IDEIS, an easy-to-use software for CD45 isoform quantification that uses single-cell transcriptomic data as the input. We showed that IDEIS accurately identifies canonical human CD45 isoforms in datasets generated by 10× Genomics 5' sequencing assays. Moreover, we used IDEIS to determine the specificity of the Ptprc splicing pattern in mouse leukocyte subsets.
- MeSH
- Alternative Splicing MeSH
- Single-Cell Analysis methods MeSH
- Leukocyte Common Antigens * genetics, metabolism MeSH
- Leukocytes metabolism, immunology MeSH
- Humans MeSH
- Mice MeSH
- Protein Isoforms genetics MeSH
- Sequence Analysis, RNA methods MeSH
- Software * MeSH
- Gene Expression Profiling methods MeSH
- Transcriptome MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Mice MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH