ML, Machine Learning
AIMS: Takotsubo syndrome (TTS) is associated with a substantial rate of adverse events. We sought to design a machine learning (ML)-based model to predict the risk of in-hospital death and to perform a clustering of TTS patients to identify different risk profiles. METHODS AND RESULTS: A ridge logistic regression-based ML model for predicting in-hospital death was developed on 3482 TTS patients from the International Takotsubo (InterTAK) Registry, randomly split into a training cohort and an internal validation cohort (75% and 25% of the sample size, respectively) and evaluated in an external validation cohort (1037 patients). Thirty-one clinically relevant variables were included in the prediction model. Model performance represented the primary endpoint and was assessed according to area under the curve (AUC), sensitivity and specificity. As a secondary endpoint, a K-medoids clustering algorithm was designed to stratify patients into phenotypic groups based on the 10 most relevant features emerging from the main model. The overall incidence of in-hospital death was 5.2%. The InterTAK-ML model showed an AUC of 0.89 (0.85-0.92), a sensitivity of 0.85 (0.78-0.95) and a specificity of 0.76 (0.74-0.79) in the internal validation cohort and an AUC of 0.82 (0.73-0.91), a sensitivity of 0.74 (0.61-0.87) and a specificity of 0.79 (0.77-0.81) in the external cohort for in-hospital death prediction. By exploiting the 10 variables showing the highest feature importance, TTS patients were clustered into six groups associated with different risks of in-hospital death (28.8% vs. 15.5% vs. 5.4% vs. 1.0% vs. 0.8% vs. 0.5%), which were also consistent in the external cohort. CONCLUSION: An ML-based approach for the identification of TTS patients at risk of adverse short-term prognosis is feasible and effective. The InterTAK-ML model showed unprecedented discriminative capability for the prediction of in-hospital death.
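The three performance measures named in this abstract (AUC, sensitivity, specificity) can be illustrated with a minimal sketch. It is not the InterTAK-ML implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made only for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given score threshold.

    The 0.5 default threshold is an assumption; the abstract does not state one.
    Both classes must be present, otherwise a division by zero occurs.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = in-hospital death, 0 = survival.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={roc_auc(labels, scores):.3f}")
```

The pairwise definition of AUC used here is quadratic in the sample size, which is acceptable for a toy example; a rank-based formulation would be preferable for cohorts of thousands of patients.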
- Keywords
- Artificial intelligence, Machine learning, Mortality prediction, Outcome, Takotsubo syndrome
- MeSH
- humans MeSH
- hospital mortality MeSH
- prognosis MeSH
- heart failure * complications MeSH
- machine learning MeSH
- takotsubo cardiomyopathy * diagnosis complications MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- grant-supported work MeSH
Supervised machine learning (ML) is used extensively in biology and deserves closer scrutiny. The Data, Optimization, Model, Evaluation (DOME) recommendations aim to enhance the validation and reproducibility of ML research by establishing standards for key aspects such as data handling and processing, optimization, evaluation, and model interpretability. The recommendations help to ensure that key details are reported transparently by providing a structured set of questions. Here, we introduce the DOME registry (URL: registry.dome-ml.org), a database that allows scientists to manage and access comprehensive DOME-related information on published ML studies. The registry uses external resources like ORCID, APICURON, and the Data Stewardship Wizard to streamline the annotation process and ensure comprehensive documentation. By assigning unique identifiers and DOME scores to publications, the registry fosters a standardized evaluation of ML methods. Future plans include continuing to grow the registry through community curation, improving the DOME score definition and encouraging publishers to adopt DOME standards, and promoting transparency and reproducibility of ML in the life sciences.
- Keywords
- machine learning, reproducibility, standards, transparency
- MeSH
- databases, factual MeSH
- humans MeSH
- registries * MeSH
- reproducibility of results MeSH
- supervised machine learning * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Additive friction stir deposition (AFSD) is a novel solid-state additive manufacturing technique that circumvents the issues of porosity, cracking, and property anisotropy that plague traditional powder bed fusion and directed energy deposition approaches. However, correlations between process parameters, thermal profiles, and resulting microstructure in AFSD are still poorly understood, which hinders process optimization for target properties. This work employs a framework combining supervised machine learning (SML) and physics-informed neural networks (PINNs) to predict peak temperature distribution in AFSD from process parameters. Eight regression algorithms were implemented for SML modeling, while four PINNs leveraged governing equations for transport, wave propagation, heat transfer, and quantum mechanics. Across multiple statistical measures, ensemble techniques like gradient boosting proved superior for SML, with the lowest MSE of 165.78. The integrated ML approach was also applied to classify deposition quality from process factors, with logistic regression delivering robust accuracy. By fusing data-driven learning and fundamental physics, this dual methodology provides comprehensive insights into tailoring microstructure through thermal management in AFSD. The work demonstrates the power of bridging statistical and physics-based modeling for elucidating AM process-property relationships.
Classification problems in the small data regime (with small data statistic T and relatively large feature space dimension D) impose challenges for the common machine learning (ML) and deep learning (DL) tools. The standard learning methods from these areas tend to show a lack of robustness when applied to data sets with significantly fewer data points than dimensions and quickly reach the overfitting bound, thus leading to poor performance beyond the training set. To tackle this issue, we propose eSPA+, a significant extension of the recently formulated entropy-optimal scalable probabilistic approximation algorithm (eSPA). Specifically, we propose to change the order of the optimization steps and replace the most computationally expensive subproblem of eSPA with its closed-form solution. We prove that with these two enhancements, eSPA+ moves from the polynomial to the linear class of complexity scaling algorithms. On several small data learning benchmarks, we show that the eSPA+ algorithm achieves a many-fold speed-up with respect to eSPA and even better performance results when compared to a wide array of ML and DL tools. In particular, we benchmark eSPA+ against the standard eSPA and the main classes of common learning algorithms in the small data regime: various forms of support vector machines, random forests, and long short-term memory algorithms. In all the considered applications, the common learning methods and eSPA are markedly outperformed by eSPA+, which achieves significantly higher prediction accuracy with an orders-of-magnitude lower computational cost.
- MeSH
- algorithms * MeSH
- entropy MeSH
- machine learning * MeSH
- support vector machine MeSH
- Publication type
- journal article MeSH
BACKGROUND: Untargeted tandem mass spectrometry serves as a scalable solution for the organization of small molecules. One of the most prevalent techniques for analyzing the acquired tandem mass spectrometry data (MS/MS) - called molecular networking - organizes and visualizes putatively structurally related compounds. However, a key bottleneck of this approach is the comparison of MS/MS spectra used to identify nearby structural neighbors. Machine learning (ML) approaches have emerged as a promising technique to predict structural similarity from MS/MS that may surpass the current state-of-the-art algorithmic methods. However, the comparison between these different ML methods remains a challenge because there is a lack of standardization to benchmark, evaluate, and compare MS/MS similarity methods, and there are no methods that address data leakage between training and test data in order to analyze model generalizability. RESULT: In this work, we present the creation of a new evaluation methodology using a train/test split that allows for the evaluation of machine learning models at varying degrees of structural similarity between training and test sets. We also introduce a training and evaluation framework that measures prediction accuracy on domain-inspired annotation and retrieval metrics designed to mirror real-world applications. We further show how two alternative training methods that leverage MS specific insights (e.g., similar instrumentation, collision energy, adduct) affect method performance and demonstrate the orthogonality of the proposed metrics. We especially highlight the role that collision energy plays in prediction errors. Finally, we release a continually updated version of our dataset online along with our data cleaning and splitting pipelines for community use. 
CONCLUSION: It is our hope that this benchmark will serve as the basis of development for future machine learning approaches in MS/MS similarity and facilitate comparison between models. We anticipate that the introduced set of evaluation metrics allows for a better reflection of practical performance.
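The idea of evaluating models at varying degrees of structural similarity between training and test sets can be sketched as follows. This is only an illustration under stated assumptions, not the benchmark's released pipeline: structures are represented as hypothetical sets of integer feature IDs (a stand-in for molecular fingerprints), similarity is the Tanimoto (Jaccard) coefficient, and the bin edges 0.4 and 0.7 are arbitrary.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def bin_test_by_train_similarity(
    train: Dict[str, FrozenSet[int]],
    test: Dict[str, FrozenSet[int]],
    edges: Tuple[float, float] = (0.4, 0.7),  # assumed, arbitrary bin edges
) -> Dict[str, List[str]]:
    """Group test compounds by their maximum similarity to any training compound.

    Reporting metrics per bin shows how performance changes as test structures
    move further away from anything seen during training.
    """
    bins: Dict[str, List[str]] = {"low": [], "medium": [], "high": []}
    for name, features in test.items():
        max_sim = max(tanimoto(features, t) for t in train.values())
        if max_sim < edges[0]:
            bins["low"].append(name)
        elif max_sim < edges[1]:
            bins["medium"].append(name)
        else:
            bins["high"].append(name)
    return bins


# Hypothetical toy fingerprints (sets of feature IDs).
train_set = {"A": frozenset({1, 2, 3, 4}), "B": frozenset({5, 6, 7})}
test_set = {
    "X": frozenset({1, 2, 3, 4, 8}),  # close analogue of A
    "Y": frozenset({1, 2, 9, 10}),    # distant from both A and B
    "Z": frozenset({5, 6, 11}),       # moderately similar to B
}

print(bin_test_by_train_similarity(train_set, test_set))
```

A test compound that lands in the "high" bin is a potential source of leakage, since the model may have seen a near-identical structure during training.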
- Keywords
- Benchmark, Machine learning, Mass spectrometry, Metabolomics, Spectral similarity measure
- MeSH
- algorithms MeSH
- machine learning * MeSH
- tandem mass spectrometry * methods MeSH
- Publication type
- journal article MeSH
TransCelerate reports on the results of 2019, 2020, and 2021 member company (MC) surveys on the use of intelligent automation in pharmacovigilance processes. MCs increased the number and extent of implementation of intelligent automation solutions throughout Individual Case Safety Report (ICSR) processing, especially with rule-based automations such as robotic process automation, lookups, and workflows, moving from planning to piloting to implementation over the 3 survey years. Companies remain highly interested in other technologies such as machine learning (ML) and artificial intelligence, which can deliver a human-like interpretation of data and decision making rather than just automating tasks. Intelligent automation solutions are usually used in combination, with more than one technology applied simultaneously to the same ICSR process step. Challenges to implementing intelligent automation solutions include finding/having appropriate training data for ML models and the need for harmonized regulatory guidance.
- MeSH
- automation MeSH
- pharmacovigilance * MeSH
- humans MeSH
- machine learning MeSH
- technology MeSH
- artificial intelligence * MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- grant-supported work MeSH
In this study, machine learning (ML) models coupled with genetic algorithm (GA) and particle swarm optimization (PSO) were applied to predict the relative influence of experimental parameters of photocatalytic dye removal. Specifically, the impact of bandgap, dye concentration, photocatalyst dosage, solution volume, specific surface area, and time duration on the photocatalytic degradation rate constant of cationic dyes was discerned using selected ML models, i.e., ensembled learning tree (ELT), Gaussian process regression (GPR), support vector machine (SVM), and decision tree (DT). The data points were sourced from literature studies published in 2023 and 2024 on materials that operate on the fundamental principles of photocatalysis. The ELT-PSO hybrid model outperformed all models with R² = 0.992 and RMSE = 2.6408e-04, followed by DT, GPR, and SVM. The partial dependence plots and Shapley analysis demonstrate that the type of dye, bandgap, initial dye concentration, and time duration are essential parameters for photocatalytic degradation, while sensitivity analysis further showed solution volume and time duration to be the most influential parameters for rate constant determination. The optimized ML model's prediction was also experimentally validated using as-synthesized Cu2O/WO3 heterostructures of different compositions and ZnO nanoparticles. The results suggest that an ML-optimized study can be used in designing photocatalysts with optimum properties desired for the removal of cationic dyes at high rates from wastewater, thus saving energy and cost for a sustainable environment.
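The two regression metrics reported in this abstract, the coefficient of determination and the root mean square error, can be computed as in the minimal sketch below. The function names and the toy observed and predicted values are assumptions for illustration and are unrelated to the study's data.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed vs. predicted rate constants (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(f"R^2={r_squared(observed, predicted):.3f} RMSE={rmse(observed, predicted):.4f}")
```

Note that RMSE carries the units of the target variable, so a value such as 2.6408e-04 is only meaningful relative to the scale of the rate constants being predicted.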
- Keywords
- Cationic dyes, Machine learning, Optimization removal, Photocatalytic degradation, Wastewater treatment
- MeSH
- algorithms MeSH
- coloring agents * chemistry MeSH
- chemical water pollutants chemistry MeSH
- catalysis MeSH
- machine learning * MeSH
- Publication type
- journal article MeSH
- Substance names
- coloring agents * MeSH
- chemical water pollutants MeSH
The search for non-invasive, fast, and low-cost diagnostic tools has gained significant traction among researchers worldwide. Dielectric properties calculated from microwave signals offer unique insights into biological tissue. Material properties, such as relative permittivity (εr) and conductivity (σ), can vary significantly between healthy and unhealthy tissue types at a given frequency. Understanding this difference in properties is key for identifying the disease state. The frequency-dependent nature of the dielectric measurements results in large datasets, which can be postprocessed using artificial intelligence (AI) methods. In this work, the dielectric properties of liver tissues in three mouse models of liver disease are characterized using dielectric spectroscopy. The measurements are grouped into four categories based on the diets or disease state of the mice, i.e., healthy mice, mice with non-alcoholic steatohepatitis (NASH) induced by a choline-deficient high-fat diet, mice with NASH induced by a western diet, and mice with liver fibrosis. Multi-class classification machine learning (ML) models are then explored to differentiate the liver tissue groups based on dielectric measurements. The results show that the support vector machine (SVM) model was able to differentiate the tissue groups with an accuracy of up to 90%. This technology pipeline thus shows great potential for developing the next generation of non-invasive diagnostic tools.
- Keywords
- dielectric spectroscopy, fibrosis, machine learning, microwave, non-alcoholic steatohepatitis, relative permittivity
- MeSH
- liver cirrhosis MeSH
- liver pathology MeSH
- mice, inbred C57BL MeSH
- mice MeSH
- non-alcoholic fatty liver disease * diagnosis pathology MeSH
- machine learning MeSH
- artificial intelligence MeSH
- animals MeSH
- Check Tag
- mice MeSH
- animals MeSH
- Publication type
- journal article MeSH
This study tested whether machine learning (ML) methods can effectively separate individual plants from complex 3D canopy laser scans as a prerequisite to analyzing particular plant features. For this, we scanned mung bean and chickpea crops with PlantEye® laser scanners. First, we segmented the crop canopies from the background in 3D space using the Region Growing Segmentation algorithm. Then, Convolutional Neural Network (CNN)-based ML algorithms were fine-tuned for plant counting. Application of the CNN-based processing architecture was possible only after we reduced the dimensionality of the data to 2D. This allowed for the identification of individual plants and their counting with an accuracy of 93.18% and 92.87% for mung bean and chickpea plants, respectively. These steps were connected to the phenotyping pipeline, which can now replace manual counting operations that are inefficient, costly, and error-prone. The use of a CNN in this study was made possible by dimensionality reduction, the addition of height information as color, and the subsequent application of a 2D CNN-based approach. We found a wide gap in the use of ML on 3D information. This gap will have to be addressed, especially for more complex plant feature extractions, which we intend to implement through further research.
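The dimensionality reduction described in this abstract, projecting a 3D point cloud to 2D while keeping height as a color channel, can be sketched as below. This is a simplified illustration, not the authors' pipeline: the grid size, cell size, the choice of the maximum height per cell, and the 0-255 intensity scaling are all assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) with z as plant height


def height_map(
    points: Sequence[Point], cell: float = 1.0, width: int = 4, height: int = 4
) -> List[List[float]]:
    """Project 3D points onto a 2D grid, keeping the maximum z per cell."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        col, row = int(x // cell), int(y // cell)
        # Points outside the assumed grid extent are ignored.
        if 0 <= row < height and 0 <= col < width:
            grid[row][col] = max(grid[row][col], z)
    return grid


def to_intensity(grid: List[List[float]]) -> List[List[int]]:
    """Scale heights to 0-255 so they can be stored as a pixel (color) value."""
    z_max = max(max(row) for row in grid)
    if z_max == 0:
        return [[0] * len(row) for row in grid]
    return [[round(255 * z / z_max) for z in row] for row in grid]


# Hypothetical canopy points (units arbitrary).
cloud = [(0.2, 0.3, 2.0), (0.8, 0.1, 5.0), (2.5, 1.5, 20.0), (3.9, 3.9, 4.0)]

image = to_intensity(height_map(cloud))
for row in image:
    print(row)
```

The resulting 2D array can be fed to a standard image-based CNN, at the cost of discarding any structure hidden beneath the topmost point in each cell.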
- Keywords
- 3D point clouds, computer vision, machine learning, phenotyping, plant detection
- MeSH
- algorithms * MeSH
- neural networks MeSH
- machine learning * MeSH
- Publication type
- journal article MeSH
Acute heart failure (AHF) is a common and severe condition with a poor prognosis. Its course is often complicated by worsening renal function (WRF), which worsens the outcome. The population of AHF patients experiencing WRF is heterogeneous, and some novel possibilities for its analysis have recently emerged. Clustering is a machine learning (ML) technique that divides the population into distinct subgroups based on the similarity of cases (patients). Given that, we decided to use clustering to find subgroups inside the AHF population that differ in terms of WRF occurrence. We evaluated data from 312 AHF patients hospitalized in our institution who had creatinine assessed four times during hospitalization. Eighty-six variables evaluated at admission were included in the analysis. The k-medoids algorithm was used for clustering, and the quality of the procedure was judged by the Davies-Bouldin index. Three clinically and prognostically different clusters were distinguished. The groups had significantly (p = 0.004) different incidences of WRF. Inside the AHF population, we identified three groups that differed in renal prognosis. Our results provide novel insight into the AHF and WRF interplay and can be valuable for future trial construction and more tailored treatment.
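The two techniques named in this abstract, k-medoids clustering and the Davies-Bouldin index, can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not confirm: Euclidean distance, a simple alternating (assign, then update) k-medoids variant with user-supplied initial medoids, and toy 2D points in place of the 86 admission variables.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def assign(points: Sequence[Vector], medoids: Sequence[int]) -> List[int]:
    """Label each point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: math.dist(p, points[medoids[c]]))
        for p in points
    ]


def k_medoids(
    points: Sequence[Vector], init_medoids: Sequence[int], max_iter: int = 100
) -> Tuple[List[int], List[int]]:
    """Alternating k-medoids: returns (medoid indices, cluster labels)."""
    medoids = list(init_medoids)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(old)
                continue
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(
                min(members, key=lambda i: sum(math.dist(points[i], points[j]) for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


def davies_bouldin(points: Sequence[Vector], labels: Sequence[int]) -> float:
    """Davies-Bouldin index with centroids; lower values mean better separation.

    Assumes every label from 0 to max(labels) has at least one point.
    """
    k = max(labels) + 1
    clusters = [[p for p, lab in zip(points, labels) if lab == c] for c in range(k)]
    centroids = [tuple(sum(v) / len(cl) for v in zip(*cl)) for cl in clusters]
    scatter = [
        sum(math.dist(p, centroids[c]) for p in clusters[c]) / len(clusters[c])
        for c in range(k)
    ]
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


# Hypothetical toy data: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
medoid_idx, cluster_labels = k_medoids(data, init_medoids=[0, 1])
print(medoid_idx, cluster_labels, round(davies_bouldin(data, cluster_labels), 4))
```

In practice the index would be computed for several candidate numbers of clusters, and the value of k giving the lowest index would be preferred.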
- Keywords
- acute heart failure, artificial intelligence, cardiorenal syndrome, clustering, machine learning
- MeSH
- acute disease MeSH
- creatinine MeSH
- kidney physiology MeSH
- humans MeSH
- heart failure * MeSH
- machine learning MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- grant-supported work MeSH
- Substance names
- creatinine MeSH