supervised classification
Manual visual review, annotation, and categorization of electroencephalography (EEG) is a time-consuming task that is often associated with human bias and requires trained electrophysiology experts with specific domain knowledge. This challenge is now compounded by the development of measurement technologies and devices that allow large-scale, heterogeneous, multi-channel recordings spanning multiple brain regions over days or weeks. Supervised deep-learning techniques have been shown to be an effective tool for analyzing big data sets, including EEG. However, the most significant caveat in training supervised deep-learning models in a clinical research setting is the lack of adequate gold-standard annotations created by electrophysiology experts. Here, we propose a semi-supervised machine learning technique that utilizes deep-learning methods with a minimal amount of gold-standard labels. The method uses a temporal autoencoder for dimensionality reduction and a small number of expert-provided gold-standard labels to build kernel density estimation (KDE) maps. To validate the method, we used data from electrophysiological intracranial EEG (iEEG) recordings acquired in two hospitals with different recording systems across 39 patients. The method achieved iEEG classification (Pathologic vs. Normal vs. Artifacts) results with area under the receiver operating characteristic curve (AUROC) scores of 0.862 ± 0.037, 0.879 ± 0.042, and area under the precision-recall curve (AUPRC) scores of 0.740 ± 0.740, 0.714 ± 0.042. This demonstrates that semi-supervised methods can provide acceptable results while requiring only 100 gold-standard data samples in each classification category. Subsequently, we deployed the technique to 12 novel patients in a pseudo-prospective framework for detecting interictal epileptiform discharges (IEDs). We show that the proposed temporal autoencoder was able to generalize to novel patients while achieving an AUROC of 0.877 ± 0.067 and an AUPRC of 0.705 ± 0.154.
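The KDE step described in the abstract above can be illustrated with a minimal sketch. It assumes the autoencoder has already reduced each iEEG segment to a 2-D point; the embeddings, the bandwidth, and the function names `gaussian_kde_density` and `classify` are hypothetical and are not taken from the paper.

```python
import math

def gaussian_kde_density(point, samples, bandwidth):
    """Mean of isotropic Gaussian kernels centred on the labeled samples."""
    dim = len(point)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for sample in samples:
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, sample))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(samples) * norm)

def classify(point, labeled, bandwidth=0.5):
    """Pick the class whose KDE map has the highest density at `point`."""
    densities = {
        label: gaussian_kde_density(point, samples, bandwidth)
        for label, samples in labeled.items()
    }
    return max(densities, key=densities.get), densities

# Hypothetical 2-D embeddings standing in for autoencoder outputs;
# only a handful of expert-labeled points per class are needed.
labeled = {
    "pathologic": [(2.0, 2.1), (2.2, 1.9), (1.8, 2.0)],
    "normal": [(-2.0, -1.9), (-2.1, -2.2), (-1.9, -2.0)],
    "artifact": [(2.0, -2.0), (2.1, -1.8), (1.9, -2.2)],
}

label, densities = classify((1.9, 2.0), labeled)
print(label)  # prints "pathologic"
```

Each class gets its own density map built from a few labeled points, and an unlabeled point is assigned to the class with the highest density at its location.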
The scarcity of high-quality annotations in many application scenarios has recently led to an increasing interest in devising learning techniques that combine unlabeled data with labeled data in a network. In this work, we focus on the label propagation problem in multilayer networks. Our approach is inspired by the heat diffusion model, which has proven useful in machine learning problems such as classification and dimensionality reduction. We propose a novel boundary-based heat diffusion algorithm that guarantees a closed-form solution with an efficient implementation. We experimentally validated our method on synthetic networks and five real-world multilayer network datasets representing scientific coauthorship, spreading drug adoption among physicians, two bibliographic networks, and a movie network. The results demonstrate the benefits of the proposed algorithm: our boundary-based heat diffusion outperforms the state-of-the-art methods.
- MeSH
- algorithms MeSH
- supervised machine learning * MeSH
- machine learning MeSH
- hot temperature * MeSH
- Publication type
- journal article MeSH
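The general idea of heat-diffusion label propagation from the record above can be sketched as follows. This is not the authors' boundary-based closed-form algorithm: it is a plain iterative diffusion on a single-layer graph in which the labeled nodes act as fixed-temperature boundaries. The graph, the seed labels, and the name `propagate_labels` are assumptions for illustration.

```python
def propagate_labels(adjacency, seeds, steps=200, dt=0.2):
    """Diffuse one heat field per class; seed nodes stay clamped (boundary)."""
    classes = sorted(set(seeds.values()))
    heat = {
        c: {n: (1.0 if seeds.get(n) == c else 0.0) for n in adjacency}
        for c in classes
    }
    for _ in range(steps):
        for c in classes:
            u = heat[c]
            new_u = dict(u)
            for node, neighbours in adjacency.items():
                if node in seeds:
                    continue  # boundary condition: labeled nodes never change
                # Explicit Euler step of du/dt = sum_j w_ij * (u_j - u_i);
                # stable here because dt * (weighted degree) < 1.
                flow = sum(w * (u[nb] - u[node]) for nb, w in neighbours.items())
                new_u[node] = u[node] + dt * flow
            heat[c] = new_u
    # Each node takes the class whose heat field is largest there.
    return {n: max(classes, key=lambda c: heat[c][n]) for n in adjacency}

# Hypothetical path graph a-b-c-d-e-f with unit edge weights.
adjacency = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0, "e": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"e": 1.0},
}
seeds = {"a": "X", "f": "Y"}  # the only labeled nodes

labels = propagate_labels(adjacency, seeds)
print(labels)
```

On this path graph the heat fields settle into linear profiles between the two seeds, so nodes `b` and `c` take label `X` and nodes `d` and `e` take label `Y`.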
Precise classification of acute leukemia (AL) is crucial for adequate treatment. EuroFlow has previously designed an AL orientation tube (ALOT) to guide towards the relevant classification panel (T-cell acute lymphoblastic leukemia (T-ALL), B-cell precursor (BCP)-ALL and/or acute myeloid leukemia (AML)) and final diagnosis. We have now built a reference database with 656 typical AL samples (145 T-ALL, 377 BCP-ALL, 134 AML), processed and analyzed via standardized protocols. Using principal component analysis (PCA)-based plots and automated classification algorithms for direct comparison of single cells from individual patients against the database, another 783 cases were subsequently evaluated. Depending on the database-guided results, patients were categorized as: (i) typical T, B or myeloid without a transitional component to another lineage; (ii) typical T, B or myeloid with such a transitional component; (iii) atypical; or (iv) mixed-lineage. Using this automated algorithm, the right panel was selected in 781/783 cases (99.7%), and data comparable to the final WHO diagnosis was already provided in >93% of cases (85% T-ALL, 97% BCP-ALL, 95% AML and 87% mixed-phenotype AL patients), even without data on the full-characterization panels. Our results show that database-guided analysis facilitates standardized interpretation of ALOT results and allows accurate selection of the relevant classification panels, hence providing a solid basis for designing future WHO AL classifications.
- MeSH
- acute lymphoblastic leukemia pathology MeSH
- acute myeloid leukemia pathology MeSH
- acute disease MeSH
- child MeSH
- adult MeSH
- immunophenotyping methods MeSH
- infant MeSH
- middle aged MeSH
- humans MeSH
- adolescent MeSH
- young adult MeSH
- infant, newborn MeSH
- child, preschool MeSH
- aged, 80 and over MeSH
- aged MeSH
- Check Tag
- child MeSH
- adult MeSH
- infant MeSH
- middle aged MeSH
- humans MeSH
- adolescent MeSH
- young adult MeSH
- male MeSH
- infant, newborn MeSH
- child, preschool MeSH
- aged, 80 and over MeSH
- aged MeSH
- female MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
This paper presents a fully automated method for the identification of bone marrow infiltration in femurs in low-dose CT of patients with multiple myeloma. We automatically find the femurs and the bone marrow within them. In the next step, we create a probabilistic, spatially dependent density model of normal tissue. At test time, we detect unexpectedly high density voxels which may be related to bone marrow infiltration, as outliers to this model. Based on a set of global, aggregated features representing all detections from one femur, we classify the subjects as being either healthy or not. This method was validated on a dataset of 127 subjects with ground truth created from a consensus of two expert radiologists, obtaining an AUC of 0.996 for the task of distinguishing healthy controls and patients with bone marrow infiltration. To the best of our knowledge, no other automatic image-based method for this task has been published before.
- MeSH
- middle aged MeSH
- humans MeSH
- neoplasm metastasis MeSH
- multiple myeloma MeSH
- bone marrow neoplasms MeSH
- tomography, X-ray computed methods MeSH
- image processing, computer-assisted * MeSH
- aged MeSH
- machine learning * MeSH
- Check Tag
- middle aged MeSH
- humans MeSH
- male MeSH
- aged MeSH
- female MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
- randomized controlled trial MeSH
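The outlier-detection idea in the record above (a spatially dependent model of normal density, with unexpectedly dense voxels flagged and aggregated into a per-femur feature) can be sketched in a few lines. This is a simplified stand-in, assuming spatially aligned scans flattened to lists and an independent Gaussian per voxel; the data, the threshold, and the names `fit_normal_model` and `outlier_fraction` are hypothetical.

```python
import statistics

def fit_normal_model(healthy_scans):
    """Per-voxel mean and standard deviation across aligned healthy scans."""
    n_voxels = len(healthy_scans[0])
    model = []
    for v in range(n_voxels):
        values = [scan[v] for scan in healthy_scans]
        model.append((statistics.mean(values), statistics.stdev(values)))
    return model

def outlier_fraction(scan, model, z_threshold=3.0):
    """Fraction of voxels whose density is unexpectedly HIGH under the model."""
    flagged = sum(
        1
        for value, (mu, sigma) in zip(scan, model)
        if sigma > 0 and (value - mu) / sigma > z_threshold
    )
    return flagged / len(scan)

# Hypothetical densities for four healthy scans of five voxels each.
healthy = [
    [0, 10, 20, 10, 0],
    [2, 12, 22, 12, 2],
    [-2, 8, 18, 8, -2],
    [0, 10, 20, 10, 0],
]
model = fit_normal_model(healthy)

suspicious = [1, 11, 60, 50, 0]  # two voxels far denser than expected
print(outlier_fraction(suspicious, model))  # prints 0.4
```

A global feature such as this fraction could then feed a healthy-versus-infiltrated classifier; the features actually used in the paper are not specified in the abstract.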
BACKGROUND: A growing number of crystal and NMR structures reveals considerable structural polymorphism of DNA architecture, going well beyond the usual image of a double-helical molecule. DNA is highly variable, with dinucleotide steps exhibiting substantial flexibility in a sequence-dependent manner. An analysis of the conformational space of the DNA backbone and a better understanding of the conformational dependencies in DNA are therefore important for full comprehension of DNA structural polymorphism. RESULTS: A detailed classification of local DNA conformations based on the technique of Fourier averaging was published in our previous work. However, this procedure requires a considerable amount of manual work. To overcome this limitation, we developed an automatic classification method that combines supervised and unsupervised approaches. The proposed workflow is composed of a k-NN method followed by a non-hierarchical single-pass clustering algorithm. We applied this workflow to analyze 816 X-ray and 664 NMR DNA structures released up to February 2013. We identified and annotated six new conformers, and we assigned four of these conformers to two structurally important DNA families: guanine quadruplexes and Holliday (four-way) junctions. We also compared populations of the assigned conformers in the dataset of X-ray and NMR structures. CONCLUSIONS: In the present work we developed a machine learning workflow for the automatic classification of dinucleotide conformations. Dinucleotides with unassigned conformations can either be classified into one of the 24 already known classes or be flagged as unclassifiable. The proposed machine learning workflow permits identification of new classes among so far unclassifiable data, and we identified and annotated six new conformations in the X-ray structures released since our previous analysis. The results illustrate the utility of machine learning approaches in the classification of local DNA conformations.
- MeSH
- algorithms MeSH
- DNA chemistry MeSH
- G-quadruplexes MeSH
- classification methods MeSH
- nucleic acid conformation MeSH
- crystallography, X-ray MeSH
- nuclear magnetic resonance, biomolecular MeSH
- workflow MeSH
- cluster analysis MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
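The two-stage workflow named in the record above (k-NN for known classes, then single-pass clustering of whatever stays unclassified) can be sketched as below. The distance-based rejection rule, the plain Euclidean distance on 2-D vectors, the thresholds, and the function names are all assumptions; the paper works with backbone torsion angles and does not give these details in the abstract.

```python
import math
from collections import Counter

def knn_classify(x, training, k=3, max_dist=1.0):
    """Majority vote of the k nearest neighbours; None if they are too far."""
    nearest = sorted(training, key=lambda item: math.dist(x, item[0]))[:k]
    if math.dist(x, nearest[-1][0]) > max_dist:
        return None  # flagged as unclassifiable
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def single_pass_clusters(points, radius=1.0):
    """Leader clustering: one pass, each point joins the first close leader."""
    clusters = []  # list of (leader, members)
    for p in points:
        for leader, members in clusters:
            if math.dist(p, leader) <= radius:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))  # p starts a new candidate class
    return clusters

# Hypothetical 2-D feature vectors with two known classes.
training = [
    ((0.0, 0.0), "A"), ((0.3, 0.0), "A"), ((0.0, 0.3), "A"),
    ((5.0, 5.0), "B"), ((5.3, 5.0), "B"), ((5.0, 5.3), "B"),
]
queries = [(0.2, 0.1), (5.1, 4.9), (10.0, 0.0), (10.2, 0.1), (0.0, 10.0)]

assigned = {q: knn_classify(q, training) for q in queries}
unclassified = [q for q, label in assigned.items() if label is None]
new_classes = single_pass_clusters(unclassified)
print(assigned)
print(len(new_classes))  # prints 2
```

The first two queries fall into the known classes, and the remaining three form two candidate new classes that an expert could then inspect and annotate.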
Czech technical standard
139 p. : ill., tables + 30 cm
- MeSH
- organization and administration standards MeSH
- facility regulation and control classification standards MeSH
- Publication type
- guideline MeSH
- Conspectus
- Business management and administration
- NLK subject fields
- public administration
We examined how penalized linear discriminant analysis with resampling, which is a supervised, multivariate, whole-brain reduction technique, can help schizophrenia diagnostics and research. In an experiment with magnetic resonance brain images of 52 first-episode schizophrenia patients and 52 healthy controls, this method allowed us to select brain areas relevant to schizophrenia, such as the left prefrontal cortex, the anterior cingulum, the right anterior insula, the thalamus, and the hippocampus. Nevertheless, the classification performance based on such reduced data was not significantly better than the classification of data reduced by mass univariate selection using a t-test or unsupervised multivariate reduction using principal component analysis. Moreover, we found no important influence of the type of imaging features, namely local deformations or gray matter volumes, or of the classification method, specifically linear discriminant analysis or linear support vector machines, on the classification results. However, we ascertained a significant effect of the cross-validation setting on classification performance, as classification results were overestimated even though the resampling was performed during the selection of brain imaging features. Therefore, when no external validation set is available, it is critically important to perform cross-validation in all steps of the analysis (not only during classification) to avoid optimistically biased results in classification studies.
- Publication type
- journal article MeSH
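The cross-validation point made in the record above can be illustrated with a small sketch that selects features either inside each training fold or once on all data. It swaps in t-test selection and a nearest-centroid classifier for the penalized LDA and SVM used in the study, and it runs on synthetic pure-noise data; every name and parameter here is hypothetical.

```python
import random
import statistics

def t_scores(X, y):
    """Absolute Welch t statistic of each feature between classes 0 and 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / se if se else 0.0)
    return scores

def top_features(X, y, n_keep):
    scores = t_scores(X, y)
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:n_keep]

def nearest_centroid_predict(X_train, y_train, x, features):
    best = None
    for lab in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroid = [statistics.mean(r[j] for r in rows) for j in features]
        d = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
        if best is None or d < best[0]:
            best = (d, lab)
    return best[1]

def cv_accuracy(X, y, n_keep=5, n_folds=4, select_inside=True):
    idx = list(range(len(X)))
    leaky = top_features(X, y, n_keep)  # computed on ALL subjects: sees test data
    correct = 0
    for f in range(n_folds):
        test = idx[f::n_folds]
        train = [i for i in idx if i not in test]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        # Honest variant: the selection only ever sees the training fold.
        feats = top_features(X_tr, y_tr, n_keep) if select_inside else leaky
        correct += sum(nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i] for i in test)
    return correct / len(X)

# Pure-noise data: 40 hypothetical subjects, 200 features, no real signal.
rng = random.Random(0)
X_noise = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(40)]
y_noise = [0] * 20 + [1] * 20

print("selection inside folds:", cv_accuracy(X_noise, y_noise, select_inside=True))
print("selection on all data: ", cv_accuracy(X_noise, y_noise, select_inside=False))
```

Because the features carry no signal, any accuracy clearly above chance in the second variant comes from test subjects leaking into the feature selection, which is the kind of optimistic bias the abstract warns about.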
Precise classification of acute leukemia (AL) is crucial for adequate treatment. EuroFlow has previously designed an AL orientation tube (ALOT) to guide toward the relevant classification panel and final diagnosis. In this study, we designed and validated an algorithm for automated (database-supported) gating and identification (AGI tool) of cell subsets within samples stained with ALOT. A reference database of normal peripheral blood (PB, n = 41) and bone marrow (BM; n = 45) samples analyzed with the ALOT was constructed, and served as a reference for the AGI tool to automatically identify normal cells. Populations not unequivocally identified as normal cells were labeled as checks and were classified by an expert. Additional normal BM (n = 25) and PB (n = 43) and leukemic samples (n = 109), analyzed in parallel by experts and the AGI tool, were used to evaluate the AGI tool. Analysis of normal PB and BM samples showed low percentages of checks (<3% in PB, <10% in BM), with variations between different laboratories. Manual analysis and AGI analysis of normal and leukemic samples showed high levels of correlation between cell numbers (r² > 0.95 for all cell types in PB and r² > 0.75 in BM) and resulted in highly concordant classification of leukemic cells by our previously published automated database-guided expert-supervised orientation tool for immunophenotypic diagnosis and classification of acute leukemia (Compass tool). Similar data were obtained using alternative, commercially available tubes, confirming the robustness of the developed tools. The AGI tool represents an innovative step in minimizing human intervention and requirements in expertise, toward a "sample-in and result-out" approach which may result in more objective and reproducible data analysis and diagnostics. The AGI tool may improve quality of immunophenotyping in individual laboratories, since high percentages of checks in normal samples are an alert on the quality of the internal procedures.
- MeSH
- acute myeloid leukemia diagnosis MeSH
- algorithms * MeSH
- immunophenotyping methods MeSH
- leukocytes pathology MeSH
- humans MeSH
- flow cytometry MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy, which enables quantitative phase imaging, to the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be a valuable aid in refining the monitoring of live cells in an automated fashion. We believe that coherence-controlled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high spatiotemporal phase sensitivity.
- MeSH
- algorithms MeSH
- cells classification cytology MeSH
- holography methods MeSH
- humans MeSH
- microscopy methods MeSH
- pattern recognition, automated MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Background: Classifying diseases into ICD codes has mainly relied on humans reading large amounts of written material, such as discharge diagnoses, chief complaints, medical history, and operation records, as the basis for classification. Coding is both laborious and time-consuming: a professionally trained disease coder takes about 20 minutes per case on average. Therefore, an automatic code classification system can significantly reduce the human effort. Objectives: This paper aims to construct a machine learning model for ICD-10 coding that automatically determines the corresponding diagnosis codes solely from free-text medical notes. Methods: We apply Natural Language Processing (NLP) and a Recurrent Neural Network (RNN) architecture to classify ICD-10 codes from natural language texts with supervised learning. Results: In experiments on large hospital data, our predictions reach an F1-score of 0.62 on ICD-10-CM codes. Conclusion: The developed model can significantly reduce the manpower and time required for coding compared with a professional coder.
- MeSH
- electronic data processing methods MeSH
- deep learning * MeSH
- electronic health records MeSH
- international classification of diseases * MeSH
- neural networks, computer MeSH
- machine learning MeSH
- information storage and retrieval methods statistics and numerical data MeSH
- data visualization MeSH
- natural language processing MeSH
- Publication type
- research support, non-U.S. gov't MeSH