Narcolepsy is a rare, lifelong disease that exists in two forms, narcolepsy type 1 (NT1) and type 2 (NT2), but only NT1 is accepted as a clearly defined entity. Both types of narcolepsy belong to the group of central hypersomnias (CH), a spectrum of poorly defined diseases with excessive daytime sleepiness as a core feature. Because of the considerable overlap of symptoms and the rarity of the diseases, it is difficult to identify distinct phenotypes of CH. Machine learning (ML) can help to identify phenotypes because it can learn to recognize clinical features that are not apparent to humans. Here we apply ML to data from the large European Narcolepsy Network (EU-NN) database, which contains hundreds of mixed features of narcolepsy, making it difficult to analyze with classical statistics. Stochastic gradient boosting, a supervised learning model with built-in feature selection, achieves high performance on the testing set. While cataplexy features are recognized as the most influential predictors, the model finds additional features; for example, mean rapid-eye-movement sleep latency in the multiple sleep latency test contributes to classifying NT1 and NT2, as confirmed by classical statistical analysis. Our results suggest that ML can identify features of CH at machine scale from complex databases, thus providing 'ideas' and promising candidates for future diagnostic classifications.
- MeSH
- Models, Biological * MeSH
- Databases, Factual statistics & numerical data MeSH
- Datasets as Topic MeSH
- Adult MeSH
- Data Interpretation, Statistical MeSH
- Humans MeSH
- Young Adult MeSH
- Narcolepsy classification diagnosis physiopathology MeSH
- Polysomnography statistics & numerical data MeSH
- Supervised Machine Learning * MeSH
- ROC Curve MeSH
- Sleep, REM physiology MeSH
- Sleep Latency physiology MeSH
- Stochastic Processes MeSH
- Rare Diseases classification diagnosis physiopathology MeSH
- Check Tag
- Adult MeSH
- Humans MeSH
- Young Adult MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
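The narcolepsy abstract above relies on stochastic gradient boosting with built-in feature selection. The following is a minimal sketch of that idea, not the study's implementation: it assumes squared-error loss, one-split trees (stumps), and a random subsample drawn without replacement in each round, and it scores features by accumulated split gain. All function names, parameters, and the toy data are hypothetical.

```python
import random


def fit_stump(X, residuals, rows):
    """Best single-feature threshold split (lowest squared error) on `rows`."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for f in range(len(X[0])):
        values = sorted({X[i][f] for i in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [residuals[i] for i in rows if X[i][f] <= thr]
            right = [residuals[i] for i in rows if X[i][f] > thr]
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    return best


def stochastic_gradient_boost(X, y, n_rounds=20, learning_rate=0.5,
                              subsample=0.75, seed=0):
    """Return (base, stumps, importance) for a toy boosted-stump model."""
    rng = random.Random(seed)
    n = len(X)
    base = sum(y) / n                      # initial prediction: the mean label
    pred = [base] * n
    stumps = []
    importance = [0.0] * len(X[0])
    k = max(2, int(subsample * n))
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [y[i] - pred[i] for i in range(n)]
        rows = rng.sample(range(n), k)     # the "stochastic" part
        stump = fit_stump(X, residuals, rows)
        if stump is None:                  # no feature varies in this subsample
            continue
        sse, f, thr, lm, rm = stump
        mean_r = sum(residuals[i] for i in rows) / k
        total = sum((residuals[i] - mean_r) ** 2 for i in rows)
        importance[f] += total - sse       # split gain credited to feature f
        for i in range(n):
            pred[i] += learning_rate * (lm if X[i][f] <= thr else rm)
        stumps.append((f, thr, learning_rate * lm, learning_rate * rm))
    total_gain = sum(importance)
    if total_gain > 0:
        importance = [g / total_gain for g in importance]
    return base, stumps, importance


def predict(model, row):
    """Sum the base value and every stump's (already scaled) leaf value."""
    base, stumps, _ = model
    return base + sum(lv if row[f] <= thr else rv for f, thr, lv, rv in stumps)
```

On toy data where only the first feature separates the classes, the normalized importance concentrates on that feature, which is the sense in which the model performs feature selection while it fits.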
BACKGROUND: Advancements in artificial intelligence (AI) and machine learning (ML) have revolutionized the medical field and transformed translational medicine. These technologies enable more accurate disease trajectory models while enhancing patient-centered care. However, challenges such as heterogeneous datasets, class imbalance, and scalability remain barriers to achieving optimal predictive performance. METHODS: This study proposes a novel AI-based framework that integrates Gradient Boosting Machines (GBM) and Deep Neural Networks (DNN) to address these challenges. The framework was evaluated using two distinct datasets: MIMIC-IV, a critical care database containing clinical data of critically ill patients, and the UK Biobank, which comprises genetic, clinical, and lifestyle data from 500,000 participants. Key performance metrics, including Accuracy, Precision, Recall, F1-Score, and AUROC, were used to assess the framework against traditional and advanced ML models. RESULTS: The proposed framework demonstrated superior performance compared to classical models such as Logistic Regression, Random Forest, Support Vector Machines (SVM), and Neural Networks. For example, on the UK Biobank dataset, the model achieved an AUROC of 0.96, significantly outperforming Neural Networks (0.92). The framework was also efficient, requiring only 32.4 s for training on MIMIC-IV, with low prediction latency, making it suitable for real-time applications. CONCLUSIONS: The proposed AI-based framework effectively addresses critical challenges in translational medicine, offering superior predictive accuracy and efficiency. Its robust performance across diverse datasets highlights its potential for integration into real-time clinical decision support systems, facilitating personalized medicine and improving patient outcomes. Future research will focus on enhancing scalability and interpretability for broader clinical applications.
- MeSH
- Databases, Factual MeSH
- Humans MeSH
- Neural Networks, Computer MeSH
- Patient-Centered Care * MeSH
- Machine Learning * MeSH
- Translational Science, Biomedical MeSH
- Translational Research, Biomedical MeSH
- Artificial Intelligence * MeSH
- Treatment Outcome MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
BACKGROUND: Interpretable machine learning (ML) for early detection of cancer has the potential to improve risk assessment and early intervention. METHODS: Data on 261 proteins related to inflammation and/or tumor processes were analyzed in 123 blood samples collected from healthy persons, a subgroup of whom later developed squamous cell carcinoma of the oral tongue (SCCOT). Samples from people who developed SCCOT in less than 5 years were classified as tumor-to-be and all other samples as tumor-free. The optimal ML algorithm for feature selection was identified, and feature importance was computed by the SHapley Additive exPlanations (SHAP) method. Five popular ML algorithms (AdaBoost, Artificial neural networks [ANNs], Decision Tree [DT], eXtreme Gradient Boosting [XGBoost], and Support Vector Machine [SVM]) were applied to establish prediction models, and decisions of the optimal models were interpreted by SHAP. RESULTS: Using the 22 selected features, the SVM prediction model showed the best performance (sensitivity = 0.867, specificity = 0.859, balanced accuracy = 0.863, area under the receiver operating characteristic curve [ROC-AUC] = 0.924). SHAP analysis revealed that the 22 features had varying person-specific impacts on the model's decisions, and the top three contributors to prediction were Interleukin 10 (IL10), TNF Receptor Associated Factor 2 (TRAF2), and Kallikrein Related Peptidase 12 (KLK12). CONCLUSION: Using multidimensional plasma protein analysis and interpretable ML, we outline a systematic approach for early detection of SCCOT before the appearance of clinical signs.
- MeSH
- Tongue MeSH
- Blood Proteins MeSH
- Humans MeSH
- Tongue Neoplasms * diagnosis MeSH
- Carcinoma, Squamous Cell * diagnosis MeSH
- Machine Learning MeSH
- Ubiquitin-Protein Ligases MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
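The SCCOT abstract above reports sensitivity, specificity, and balanced accuracy together. As a quick consistency check, balanced accuracy is the mean of the other two; the sketch below shows the arithmetic. The function names are illustrative, and any confusion-matrix counts passed to them would be hypothetical, since the abstract does not report counts.

```python
def sensitivity(tp, fn):
    """True-positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: share of actual negatives correctly cleared."""
    return tn / (tn + fp)


def balanced_accuracy(sens, spec):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    return (sens + spec) / 2.0


# Reported values from the abstract: (0.867 + 0.859) / 2 = 0.863
reported = round(balanced_accuracy(0.867, 0.859), 3)
```

The reported balanced accuracy of 0.863 matches this calculation.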
BACKGROUND: Machine learning (ML) approaches can significantly improve the classical Rout-based evaluation of the lumbar infusion test (LIT) and the clinical management of normal pressure hydrocephalus. OBJECTIVE: To develop an ML model that accurately identifies patients as candidates for permanent cerebrospinal fluid shunt implantation using only intracranial pressure and electrocardiogram signals recorded throughout LIT. METHODS: This was a single-center cohort study of prospectively collected data from 96 patients who underwent LIT and 5-day external lumbar cerebrospinal fluid drainage (external lumbar drainage) as a reference diagnostic method. A set of 48 selected intracranial pressure/electrocardiogram complex signal waveform features describing nonlinear behavior, wavelet transform spectral signatures, or recurrent map patterns was calculated for each patient. After applying a leave-one-out cross-validation training-testing split of the data set, we trained and evaluated the performance of various state-of-the-art ML algorithms. RESULTS: The highest performing ML algorithm was eXtreme Gradient Boosting. This model showed good calibration and discrimination on the testing data, with an area under the receiver operating characteristic curve of 0.891 (accuracy: 82.3%, sensitivity: 86.1%, and specificity: 73.9%) obtained for 8 selected features. Our ML model clearly outperforms the classical Rout-based manual classification commonly used in clinical practice, which had an accuracy of 62.5%. CONCLUSION: This study successfully used the ML approach to predict the outcome of a 5-day external lumbar drainage and hence which patients are likely to benefit from permanent shunt implantation. Our automated ML model thus enhances the diagnostic utility of LIT in the management of normal pressure hydrocephalus.
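The leave-one-out cross-validation split described in the hydrocephalus abstract can be sketched as follows. Each patient is held out once, the model is trained on the remaining patients, and the held-out prediction is scored. A nearest-class-mean classifier on a single feature stands in for the study's gradient-boosted model purely to keep the sketch dependency-free; the classifier, the function names, and the toy data are assumptions.

```python
def nearest_mean_fit(xs, ys):
    """Toy stand-in classifier: remember the mean feature value per class."""
    means = {}
    for label in set(ys):
        vals = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(vals) / len(vals)
    return means


def nearest_mean_predict(means, x):
    """Predict the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, score the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = nearest_mean_fit(train_x, train_y)
        correct += nearest_mean_predict(model, xs[i]) == ys[i]
    return correct / len(xs)


# Toy data: one feature, two well-separated classes.
toy_x = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
toy_y = [0, 0, 0, 1, 1, 1]
toy_accuracy = leave_one_out_accuracy(toy_x, toy_y)
```

With n samples this trains n models, which is affordable for a cohort of 96 patients and uses almost all of the data for training in every fold.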
Background: The prevalence of noncommunicable diseases (NCDs) is increasing throughout the world, including in developing countries. An NCD prevention program using information communication technology was implemented for 2 years in Bangladesh. Health checkup data were collected from 16,741 study subjects. However, the effectiveness of the prevention strategy has not yet been evaluated, and some subjects at risk of NCD have gone undetected. Objective: This study aimed to improve intervention strategies by analyzing the collected data and proposing a cost-effective personalized predictive model to identify subjects predicted to be at future risk of NCD. Methods: We selected 2,110 subjects who participated in both years of the program and used a machine learning algorithm, gradient boosting decision tree, to build models that would identify subjects who were at risk of future high blood pressure, blood sugar (BS), or body mass index (BMI). We used the area under the curve (AUC) of receiver operating characteristic curves and cumulative accuracy profile (CAP) curves to evaluate the performance of our models. Results: Models showed fairly good performance: the BMI model (AUC=0.910) yielded the greatest AUC, whereas the BS model (AUC=0.730) yielded the lowest. CAP curves indicated that the BMI model could correctly identify 98.0% of at-risk subjects at only 50% of the total time cost. Conclusions: Our models represent powerful tools with which to improve the effect of health intervention programs and the efficiency with which they are delivered under limited medical resources.
- MeSH
- Early Medical Intervention MeSH
- Medical Informatics * MeSH
- Humans MeSH
- Noncommunicable Diseases MeSH
- Developing Countries MeSH
- Public Health MeSH
- Check Tag
- Humans MeSH
- Publication type
- Review MeSH
- Geographicals
- Bangladesh MeSH
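The cumulative accuracy profile (CAP) used in the Bangladesh study above can be sketched as follows. Subjects are ranked by predicted risk, and the curve records what share of truly at-risk subjects has been found after examining the top fraction of the ranking. This sketch assumes that cost is proportional to the number of subjects examined; the function names and toy data are hypothetical.

```python
def cap_curve(scores, labels):
    """Return (fraction_examined, fraction_of_at_risk_captured) points."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one at-risk subject")
    # Examine subjects from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    captured = 0
    for rank, i in enumerate(order, start=1):
        captured += labels[i]
        points.append((rank / len(scores), captured / total_pos))
    return points


def captured_at(scores, labels, fraction):
    """Share of at-risk subjects found after examining the top `fraction`."""
    k = int(round(fraction * len(scores)))
    return cap_curve(scores, labels)[k][1]


# Toy data: 8 subjects, 3 of whom are truly at risk (label 1).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]
half = captured_at(toy_scores, toy_labels, 0.5)
```

Read this way, the abstract's figure means that examining the top half of the ranked subjects would already capture 98.0% of those at risk according to the BMI model.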
BACKGROUND: Variation in usual practice in fluid trials assessing lower versus higher volumes may affect overall comparisons. To address this, we will evaluate the effects of heterogeneity in treatment intensity in the Conservative versus Liberal Approach to Fluid Therapy of Septic Shock in Intensive Care trial. This will reflect the effects of differences in site-specific intensities of standard fluid treatment due to local practice preferences while considering participant characteristics. METHODS: We will assess the effects of heterogeneity in treatment intensity across one primary outcome (all-cause mortality) and three secondary outcomes (serious adverse events or reactions, days alive without life support, and days alive out of hospital) after 90 days. We will classify sites based on the site-specific intensity of standard fluid treatment, defined as the mean difference between observed and predicted intravenous fluid volumes in the first 24 h in the standard-fluid group while accounting for differences in participant characteristics. Predictions will be made using a machine learning model including 22 baseline predictors using the extreme gradient boosting algorithm. Subsequently, sites will be grouped into fluid treatment intensity subgroups containing at least 100 participants each. Subgroup differences will be assessed using hierarchical Bayesian regression models with weakly informative priors. We will present the full posterior distributions of relative (risk ratios and ratios of means) and absolute differences (risk differences and mean differences) in each subgroup. DISCUSSION: This study will provide data on the effects of heterogeneity in treatment intensity while accounting for patient characteristics in critically ill adult patients with septic shock. REGISTRATIONS: The European Clinical Trials Database (EudraCT): 2018-000404-42, ClinicalTrials.gov: NCT03668236.
- MeSH
- Bayes Theorem MeSH
- Humans MeSH
- Critical Care methods MeSH
- Shock, Septic * therapy MeSH
- Machine Learning MeSH
- Fluid Therapy * methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
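The site-specific intensity defined in the fluid therapy protocol above is a per-site mean of observed minus predicted fluid volume. A minimal sketch of that calculation follows; it assumes the predicted volumes have already been produced by some model, and the record layout, function name, and toy volumes are hypothetical. The protocol's later steps (grouping sites into subgroups and Bayesian modelling) are not sketched because the abstract does not specify them in enough detail.

```python
from collections import defaultdict


def site_intensity(records):
    """Mean (observed - predicted) 24-h fluid volume per site.

    `records` is an iterable of (site, observed_ml, predicted_ml) tuples.
    A positive value means the site gives more fluid than the model expects
    for its participants; a negative value means less.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for site, observed, predicted in records:
        sums[site] += observed - predicted
        counts[site] += 1
    return {site: sums[site] / counts[site] for site in sums}


# Toy records in millilitres (made-up values for illustration only).
toy_records = [
    ("A", 3000, 2500),
    ("A", 2000, 2500),
    ("B", 4000, 3000),
    ("B", 3500, 3000),
]
intensity = site_intensity(toy_records)
ranked_sites = sorted(intensity, key=intensity.get)  # lowest intensity first
```

Subtracting the prediction before averaging is what separates local practice preference from case mix: a site treating sicker participants is not counted as high-intensity merely because its participants would be expected to receive more fluid anywhere.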
BACKGROUND: Patients with squamous cell carcinoma of the head and neck (SCCHN) have a high risk of recurrence. We aimed to develop machine learning methods to identify transcriptomic and proteomic features that provide accurate classification models for predicting risk of early recurrence in SCCHN patients. METHODS: Clinical, genomic, transcriptomic and proteomic features distinguishing recurrence risk were examined in SCCHN patients from The Cancer Genome Atlas (TCGA). Recurrence within one year after treatment was classified as high-risk and no recurrence as low-risk. RESULTS: No significant differences in individual clinicopathological characteristics, mutation profiles or mRNA expression patterns were seen between the groups using conventional statistical analysis. Using the machine learning algorithm extreme gradient boosting (XGBoost), ten proteins (RAD50, 4E-BP1, MYH11, MAP2K1, BECN1, NF2, RAB25, ERRFI1, KDR, SERPINE1) and five mRNAs (PLAUR, DKK1, AXIN2, ANG and VEGFA) made the greatest contribution to classification. These features were used to build improved models in XGBoost, which achieved the best discrimination performance when combining transcriptomic and proteomic data, with an accuracy of 0.939 and an Area Under the ROC Curve (AUC) of 0.951. CONCLUSIONS: This study highlights the use of machine learning to identify transcriptomic and proteomic factors that play important roles in predicting risk of recurrence in patients with SCCHN and to develop such models by iterative cycles to enhance their accuracy, thereby aiding the introduction of personalized treatment regimens.
- MeSH
- Squamous Cell Carcinoma of Head and Neck genetics MeSH
- Humans MeSH
- RNA, Messenger genetics MeSH
- Head and Neck Neoplasms * genetics MeSH
- Proteomics MeSH
- rab GTP-Binding Proteins genetics MeSH
- Carcinoma, Squamous Cell * genetics MeSH
- Transcriptome genetics MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
BACKGROUND: Optimal management of outpatients with heart failure (HF) requires serially updating the estimates of their risk for adverse clinical outcomes to guide treatment. Patient-reported outcomes (PROs) are increasingly used in clinical care. The purpose of this study was to determine whether the inclusion of PROs can improve risk prediction for HF hospitalization and death in ambulatory patients with HF. METHODS AND RESULTS: We included consecutive patients with HF with reduced ejection fraction (HFrEF) and HF with preserved EF (HFpEF) seen in an HF clinic between 2015 and 2019 who completed PROs as part of routine care. Cox regression with least absolute shrinkage and selection operator regularization and gradient boosting machine analyses were used to estimate risk for a combined outcome of HF hospitalization, heart transplant, left ventricular assist device implantation, or death. The performance of the prediction models was evaluated with the time-dependent concordance index (Cτ). Among 1165 patients with HFrEF (mean age 59.1 ± 16.1 years, 68% male), the median follow-up was 487 days. Among 456 patients with HFpEF (mean age 64.2 ± 16.0 years, 55% male), the median follow-up was 494 days. Gradient boosting regression that included PROs had the best prediction performance (Cτ 0.73 for patients with HFrEF and 0.74 for patients with HFpEF) and showed very good stratification of risk in time-to-event analysis by quintile of risk. The Kansas City Cardiomyopathy Questionnaire overall summary score, the visual analogue scale, and the Patient Reported Outcomes Measurement Information System dimensions of satisfaction with social roles and physical function had high variable importance in the models. CONCLUSIONS: PROs improve risk prediction in both HFrEF and HFpEF, independent of traditional clinical factors. Routine assessment of PROs and leveraging the comprehensive data in the electronic health record in routine clinical care could help more accurately assess risk and support the intensification of treatment in patients with HF.
- MeSH
- Risk Assessment methods MeSH
- Patient Reported Outcome Measures * MeSH
- Hospitalization statistics & numerical data MeSH
- Quality of Life * psychology MeSH
- Middle Aged MeSH
- Humans MeSH
- Follow-Up Studies MeSH
- Retrospective Studies MeSH
- Aged MeSH
- Heart Failure * physiopathology psychology therapy diagnosis mortality MeSH
- Stroke Volume physiology MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
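The heart failure abstract above evaluates its models with a time-dependent concordance index (Cτ). The sketch below implements the simpler Harrell's concordance index instead, to show the underlying idea: among comparable patient pairs, how often does the patient with the earlier event also have the higher predicted risk? It is not the Cτ statistic used in the study, and the function name and toy data are hypothetical.

```python
def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means earlier expected event)

    A pair (i, j) is comparable when patient i has an observed event and a
    strictly shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in days; the third patient is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
perfect = harrell_c(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
reversed_ranking = harrell_c(toy_times, toy_events, [0.1, 0.5, 0.7, 0.9])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a rough scale for reading the reported values of 0.73 and 0.74.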