interpretable machine learning
Early detection of malignant thyroid nodules is crucial for effective treatment, but traditional diagnostic methods face challenges such as variability in expert opinions and limited integration of advanced imaging techniques. This prospective cohort study investigates a novel multimodal approach, integrating traditional methods with advanced machine learning techniques. We studied 181 patients who underwent fine-needle aspiration (FNA) biopsy, each contributing one nodule, for a total of 181 nodules. Data collection included sex, age, and ultrasound imaging, which incorporated elastography. Features extracted from these images included Thyroid Imaging Reporting and Data System (TIRADS) scores, elastography parameters, and radiomic features. The pathological results of the FNA biopsy, provided by the pathologists, served as our gold standard for nodule classification. Our methodology, termed ELTIRADS, combines these features with interpretable machine learning techniques. Performance evaluation showed that a Support Vector Machine (SVM) classifier using TIRADS, elastography data, and radiomic features achieved high accuracy (0.92), with sensitivity (0.89), specificity (0.94), precision (0.89), and F1 score (0.89). To enhance interpretability, we used hierarchical clustering, SHapley Additive exPlanations (SHAP), and partial dependence plots (PDP). This combined approach holds promise for enhancing the accuracy of thyroid nodule malignancy detection, thereby contributing to advancements in personalized and precision medicine in the field of thyroid cancer research.
- MeSH
- Adult MeSH
- Elasticity Imaging Techniques * methods MeSH
- Middle Aged MeSH
- Humans MeSH
- Thyroid Neoplasms diagnostic imaging classification pathology diagnosis MeSH
- Prospective Studies MeSH
- Radiomics MeSH
- Aged MeSH
- Thyroid Gland diagnostic imaging pathology MeSH
- Machine Learning * MeSH
- Support Vector Machine MeSH
- Biopsy, Fine-Needle MeSH
- Thyroid Nodule * diagnostic imaging pathology classification MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
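The performance figures quoted in the thyroid nodule abstract above (accuracy, sensitivity, specificity, precision, F1 score) are all derived from the four cells of a binary confusion matrix. The sketch below shows how they relate to one another. The function name `binary_metrics` and the example counts are hypothetical and chosen purely for illustration; they are not the study's confusion matrix, which the abstract does not report.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive case,
    one negative case, and one positive prediction).
    """
    sensitivity = tp / (tp + fn)                 # share of malignant cases detected
    specificity = tn / (tn + fp)                 # share of benign cases cleared
    precision = tp / (tp + fp)                   # share of positive calls that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of all calls that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # harmonic mean
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (NOT taken from the study).
metrics = binary_metrics(tp=8, fp=2, tn=18, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and sensitivity, it equals both whenever the two coincide, which is consistent with the abstract reporting 0.89 for all three.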
BACKGROUND: Interpretable machine learning (ML) for early detection of cancer has the potential to improve risk assessment and early intervention. METHODS: Data on 261 proteins related to inflammation and/or tumor processes were analyzed in 123 blood samples collected from persons who were healthy at sampling, a subgroup of whom later developed squamous cell carcinoma of the oral tongue (SCCOT). Samples from people who developed SCCOT within less than 5 years were classified as tumor-to-be and all other samples as tumor-free. The optimal ML algorithm for feature selection was identified and feature importance computed by the SHapley Additive exPlanations (SHAP) method. Five popular ML algorithms (AdaBoost, Artificial neural networks [ANNs], Decision Tree [DT], eXtreme Gradient Boosting [XGBoost], and Support Vector Machine [SVM]) were applied to establish prediction models, and decisions of the optimal models were interpreted by SHAP. RESULTS: Using the 22 selected features, the SVM prediction model showed the best performance (sensitivity = 0.867, specificity = 0.859, balanced accuracy = 0.863, area under the receiver operating characteristic curve [ROC-AUC] = 0.924). SHAP analysis revealed that the 22 features had varying person-specific impacts on the model decision and that the top three contributors to prediction were Interleukin 10 (IL10), TNF Receptor Associated Factor 2 (TRAF2), and Kallikrein Related Peptidase 12 (KLK12). CONCLUSION: Using multidimensional plasma protein analysis and interpretable ML, we outline a systematic approach for early detection of SCCOT before the appearance of clinical signs.
- MeSH
- Tongue MeSH
- Blood Proteins MeSH
- Humans MeSH
- Tongue Neoplasms * diagnosis MeSH
- Carcinoma, Squamous Cell * diagnosis MeSH
- Machine Learning MeSH
- Ubiquitin-Protein Ligases MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
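The SCCOT abstract above relies on SHAP to attribute each prediction to individual features. SHAP builds on Shapley values from cooperative game theory; the sketch below computes them exactly for a tiny toy model by enumerating all feature subsets. It is an illustration of the underlying idea only, not the study's pipeline: the linear toy model, its weights, the single-baseline way of "removing" a feature, and the names `shapley_values` and `make_value_fn` are all assumptions made here. Practical SHAP implementations use background datasets and approximations, since exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain feature i.
            weight = (factorial(k) * factorial(n_features - k - 1)
                      / factorial(n_features))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def make_value_fn(model, x, baseline):
    """Assumption: features outside the coalition are set to a fixed baseline."""
    def value_fn(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
        return model(mixed)
    return value_fn


# Hypothetical toy model: a linear score over three made-up features.
weights = [2.0, -1.0, 0.5]

def model(features):
    return sum(w * f for w, f in zip(weights, features))

x = [1.0, 3.0, 4.0]          # instance to explain (made-up values)
baseline = [0.0, 1.0, 2.0]   # reference point (made-up values)

phi = shapley_values(make_value_fn(model, x, baseline), n_features=3)
print(phi)
```

Two properties are worth checking on any such attribution: the values sum to the difference between the model output for the instance and for the baseline, and for a linear model each value reduces to the weight times the feature's deviation from the baseline.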
Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are often uncomputable, or lack practical implementations. In this paper we attempt to follow a big picture view while also providing a particular theory and its implementation to present a novel, purposely simple, and interpretable hierarchical architecture. This architecture incorporates the unsupervised learning of a model of the environment, learning the influence of one's own actions, model-based reinforcement learning, hierarchical planning, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations which are increasingly more abstract, but can retain details when needed. We demonstrate the universality of the architecture by testing it on a series of diverse environments ranging from audio/visual compression to discrete and continuous action spaces, to learning disentangled representations.
- MeSH
- Algorithms MeSH
- Humans MeSH
- Neural Networks, Computer MeSH
- Reinforcement, Psychology MeSH
- Unsupervised Machine Learning MeSH
- Learning physiology MeSH
- Artificial Intelligence * MeSH
- Environment * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
Supervised machine learning (ML) is used extensively in biology and deserves closer scrutiny. The Data Optimization Model Evaluation (DOME) recommendations aim to enhance the validation and reproducibility of ML research by establishing standards for key aspects such as data handling and processing, optimization, evaluation, and model interpretability. The recommendations help to ensure that key details are reported transparently by providing a structured set of questions. Here, we introduce the DOME registry (URL: registry.dome-ml.org), a database that allows scientists to manage and access comprehensive DOME-related information on published ML studies. The registry uses external resources like ORCID, APICURON, and the Data Stewardship Wizard to streamline the annotation process and ensure comprehensive documentation. By assigning unique identifiers and DOME scores to publications, the registry fosters a standardized evaluation of ML methods. Future plans include continuing to grow the registry through community curation, improving the DOME score definition and encouraging publishers to adopt DOME standards, and promoting transparency and reproducibility of ML in the life sciences.
- MeSH
- Databases, Factual MeSH
- Humans MeSH
- Registries * MeSH
- Reproducibility of Results MeSH
- Supervised Machine Learning * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
The use of predictive gene signatures to support clinical decisions is becoming increasingly important. Deep learning has great potential for predicting phenotype from gene expression profiles. However, neural networks are regarded as black boxes that deliver accurate predictions without any explanation. The demand for these models to be interpretable is growing, especially in the clinical field.
- MeSH
- Humans MeSH
- Neural Networks, Computer MeSH
- Machine Learning MeSH
- Artificial Intelligence * MeSH
- Computational Biology MeSH
- Check Tag
- Humans MeSH
Narcolepsy is a rare life-long disease that exists in two forms, narcolepsy type-1 (NT1) or type-2 (NT2), but only NT1 is accepted as a clearly defined entity. Both types of narcolepsy belong to the group of central hypersomnias (CH), a spectrum of poorly defined diseases with excessive daytime sleepiness as a core feature. Due to the considerable overlap of symptoms and the rarity of the diseases, it is difficult to identify distinct phenotypes of CH. Machine learning (ML) can help to identify phenotypes as it learns to recognize clinical features invisible to humans. Here we apply ML to data from the large European Narcolepsy Network (EU-NN); the data contain hundreds of mixed features of narcolepsy, making them difficult to analyze with classical statistics. Stochastic gradient boosting, a supervised learning model with built-in feature selection, achieves high performance on the testing set. While cataplexy features are recognized as the most influential predictors, the machine finds additional features, e.g. mean rapid-eye-movement sleep latency in the multiple sleep latency test, that contribute to classifying NT1 and NT2, as confirmed by classical statistical analysis. Our results suggest ML can identify features of CH on machine scale from complex databases, thus providing 'ideas' and promising candidates for future diagnostic classifications.
- MeSH
- Models, Biological * MeSH
- Databases, Factual statistics & numerical data MeSH
- Datasets as Topic MeSH
- Adult MeSH
- Data Interpretation, Statistical MeSH
- Humans MeSH
- Young Adult MeSH
- Narcolepsy classification diagnosis physiopathology MeSH
- Polysomnography statistics & numerical data MeSH
- Supervised Machine Learning * MeSH
- ROC Curve MeSH
- Sleep, REM physiology MeSH
- Sleep Latency physiology MeSH
- Stochastic Processes MeSH
- Rare Diseases classification diagnosis physiopathology MeSH
- Check Tag
- Adult MeSH
- Humans MeSH
- Young Adult MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Motor disability is a dominant and restricting symptom in multiple sclerosis, yet its neuroimaging correlates are not fully understood. We apply statistical and machine learning techniques to multimodal neuroimaging data to discriminate between multiple sclerosis patients and healthy controls and to predict motor disability scores in the patients. We examine the data of sixty-four multiple sclerosis patients and sixty-five controls, who underwent MRI examination and evaluation on motor disability scales. The modalities used comprised regional fractional anisotropy, regional grey matter volumes, and functional connectivity. For analysis, we employ two approaches: high-dimensional support vector machines run on features selected by Fisher Score (aiming for maximal classification accuracy), and low-dimensional logistic regression on the principal components of the data (aiming for increased interpretability). We apply analogous regression methods to predict symptom severity. Fractional anisotropy alone provides classification accuracies of 96.1% and 89.9% with the two approaches, respectively; including other modalities did not bring further improvement. Concerning the prediction of motor impairment, the low-dimensional approach performed more reliably. The first grey matter volume component was significantly correlated (R = 0.28-0.46, p < 0.05) with most clinical scales. In summary, we identified the relationship between both white and grey matter changes and motor impairment in multiple sclerosis. Furthermore, based on quantitative MRI measures of tissue integrity, we were able to achieve the highest classification accuracy between patients and controls yet reported, while also providing a low-dimensional classification approach with comparable results, paving the way to interpretable machine learning models of brain changes in multiple sclerosis.
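The multiple sclerosis abstract above mentions selecting features by Fisher Score before training a support vector machine. The abstract does not give the formula, so the sketch below assumes one common definition: for each feature, the between-class scatter of the class means divided by the pooled within-class variance, with a higher score indicating better class separation. The function names `fisher_scores` and `top_k_features` and the four-sample toy dataset are hypothetical.

```python
def fisher_scores(X, y):
    """Per-feature Fisher score: between-class scatter / within-class scatter.

    X is a list of samples (each a list of feature values); y holds class labels.
    """
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between, within = 0.0, 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        if within == 0.0:
            # Degenerate case: no spread inside any class.
            scores.append(float("inf") if between > 0.0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def top_k_features(X, y, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Made-up toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 4.0]]
y = [0, 0, 1, 1]

print(fisher_scores(X, y))
print(top_k_features(X, y, k=1))
```

Scoring each feature independently keeps the selection step cheap and easy to inspect, at the cost of ignoring redundancy between correlated features.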
SIGNIFICANCE: Machine learning is increasingly being applied to the classification of microscopic data. In order to detect some complex and dynamic cellular processes, time-resolved live-cell imaging might be necessary. Incorporating the temporal information into the classification process may allow for a better and more specific classification. AIM: We propose a methodology for cell classification based on time-lapse quantitative phase images (QPIs) acquired by digital holographic microscopy (DHM), with the goal of improving the classification of dynamic cellular processes. APPROACH: The methodology was demonstrated by studying epithelial-mesenchymal transition (EMT), which entails major and distinct time-dependent morphological changes. The time-lapse QPIs of EMT were obtained over a 48-h period and specific novel features representing the dynamic cell behavior were extracted. The two distinct end-state phenotypes were classified by several supervised machine learning algorithms and the results were compared with the classification performed on single-time-point images. RESULTS: In comparison to the single-time-point approach, our data suggest that incorporating temporal information into the classification of cell phenotypes during EMT improves performance by nearly 9% in terms of accuracy, and further indicate the potential of DHM to monitor cellular morphological changes. CONCLUSIONS: The proposed approach based on time-lapse images acquired by DHM could improve the monitoring of live cell behavior in an automated fashion and could be further developed into a tool for high-throughput automated analysis of unique cell behavior.
Covering: up to the end of 2020. The machine learning field can be defined as the study and application of algorithms that perform classification and prediction tasks through pattern recognition instead of explicitly defined rules. Among other areas, machine learning has excelled in natural language processing. As such methods have excelled at understanding written languages (e.g. English), they are also being applied to biological problems to better understand the "genomic language". In this review we focus on recent advances in applying machine learning to natural products and genomics, and how those advances are improving our understanding of natural product biology, chemistry, and drug discovery. We discuss machine learning applications in genome mining (identifying biosynthetic signatures in genomic data), predictions of what structures will be created from those genomic signatures, and the types of activity we might expect from those molecules. We further explore the application of these approaches to data derived from complex microbiomes, with a focus on the human microbiome. We also review challenges in leveraging machine learning approaches in the field, and how the availability of other "omics" data layers provides value. Finally, we provide insights into the challenges associated with interpreting machine learning models and the underlying biology and promises of applying machine learning to natural product drug discovery. We believe that the application of machine learning methods to natural product research is poised to accelerate the identification of new molecular entities that may be used to treat a variety of disease indications.
- MeSH
- Biological Products * chemistry pharmacology MeSH
- Biosynthetic Pathways genetics MeSH
- Genomics * MeSH
- Humans MeSH
- Microbiota MeSH
- Drug Discovery MeSH
- Machine Learning * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Review MeSH
TransCelerate reports on the results of 2019, 2020, and 2021 member company (MC) surveys on the use of intelligent automation in pharmacovigilance processes. MCs increased the number and extent of implementation of intelligent automation solutions throughout Individual Case Safety Report (ICSR) processing, especially with rule-based automations such as robotic process automation, lookups, and workflows, moving from planning to piloting to implementation over the 3 survey years. Companies remain highly interested in other technologies such as machine learning (ML) and artificial intelligence, which can deliver a human-like interpretation of data and decision making rather than just automating tasks. Intelligent automation solutions are usually used in combination, with more than one technology applied simultaneously to the same ICSR process step. Challenges to implementing intelligent automation solutions include finding/having appropriate training data for ML models and the need for harmonized regulatory guidance.
- MeSH
- Automation MeSH
- Pharmacovigilance * MeSH
- Humans MeSH
- Machine Learning MeSH
- Technology MeSH
- Artificial Intelligence * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH