unsupervised machine learning
Identification of active electrodes that record task-relevant neurophysiological activity is needed for clinical and industrial applications as well as for investigating brain functions. We developed an unsupervised, fully automated approach to classify active electrodes showing event-related intracranial EEG (iEEG) responses from 115 patients performing a free recall verbal memory task. Our approach employed new interpretable metrics that quantify spectral characteristics of the normalized iEEG signal based on power-in-band and synchrony measures. Unsupervised clustering of the metrics identified distinct sets of active electrodes across different subjects. In the total population of 11,869 electrodes, our method achieved 97% sensitivity and 92.9% specificity with the most efficient metric. We validated our results with anatomical localization revealing significantly greater distribution of active electrodes in brain regions that support verbal memory processing. We propose our machine-learning framework for objective and efficient classification and interpretation of electrophysiological signals of brain activities supporting memory and cognition.
- MeSH
- Algorithms MeSH
- Biomedical Engineering methods trends MeSH
- Datasets as Topic MeSH
- Electroencephalography methods MeSH
- Electrophysiological Phenomena MeSH
- Electrocorticography * methods MeSH
- Epilepsy diagnosis physiopathology psychology MeSH
- Evoked Potentials physiology MeSH
- Electrodes, Implanted * MeSH
- Cognition physiology MeSH
- Memory, Short-Term physiology MeSH
- Humans MeSH
- Brain Mapping methods MeSH
- Brain diagnostic imaging physiology MeSH
- Task Performance and Analysis * MeSH
- Retrospective Studies MeSH
- Sensitivity and Specificity MeSH
- Unsupervised Machine Learning * MeSH
- Verbal Behavior physiology MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Validation Study MeSH
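The electrode-classification abstract above reports its results as sensitivity and specificity. As a reminder of how these two figures are derived from predicted versus reference labels, here is a minimal sketch. The function name and the toy labels are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for two equal-length boolean lists.

    truth[i] is the reference label (True = active electrode),
    predicted[i] is the label assigned by the classifier.
    """
    tp = sum(t and p for t, p in zip(truth, predicted))
    fn = sum(t and not p for t, p in zip(truth, predicted))
    tn = sum(not t and not p for t, p in zip(truth, predicted))
    fp = sum(not t and p for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels for illustration only (not data from the study).
truth = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, False, False, False, True, False]
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sensitivity is the share of truly active electrodes that were detected; specificity is the share of inactive electrodes that were correctly rejected.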
Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are often uncomputable, or lack practical implementations. In this paper we attempt to follow a big picture view while also providing a particular theory and its implementation to present a novel, purposely simple, and interpretable hierarchical architecture. This architecture incorporates the unsupervised learning of a model of the environment, learning the influence of one's own actions, model-based reinforcement learning, hierarchical planning, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations which are increasingly more abstract, but can retain details when needed. We demonstrate the universality of the architecture by testing it on a series of diverse environments ranging from audio/visual compression to discrete and continuous action spaces, to learning disentangled representations.
- MeSH
- Algorithms MeSH
- Humans MeSH
- Neural Networks, Computer MeSH
- Reinforcement, Psychology MeSH
- Unsupervised Machine Learning MeSH
- Learning physiology MeSH
- Artificial Intelligence * MeSH
- Environment * MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
The academic curriculum has been shown to promote sedentary behavior in college students. This study aimed to profile the physical fitness of physical education majors using unsupervised machine learning and to identify the differences between sexes, academic years, socioeconomic strata, and the generated profiles. A total of 542 healthy and physically active students (445 males, 97 females; 19.8 [2.2] years; 66.0 [10.3] kg; 169.5 [7.8] cm) participated in this cross-sectional study. Their indirect VO2max (Cooper and Shuttle-Run 20 m tests), lower-limb power (horizontal jump), sprint (30 m), agility (shuttle run), and flexibility (sit-and-reach) were assessed. The participants were profiled using clustering algorithms after setting the optimal number of clusters through an internal validation using R packages. Non-parametric tests were used to identify the differences (p < 0.05). Most participants were freshmen (51.4%) and middle-income students (64.0%). Seniors and juniors showed better physical fitness than first-year students. No significant differences were found between socioeconomic strata (p > 0.05). Two profiles were identified using hierarchical clustering (Cluster 1 = 318 vs. Cluster 2 = 224). The matching analysis revealed that physical fitness explained the variation in the data, with Cluster 2 as a sex-independent and more physically fit group. All variables differed significantly between the sexes (except the body mass index [p = 0.218]) and the generated profiles (except stature [p = 0.559] and flexibility [p = 0.115]). A multidimensional analysis showed that body mass, cardiorespiratory fitness, and agility contributed the most to the variation in the data, so they can be used as profiling variables. This profiling method accurately identified the relevant variables to reinforce exercise recommendations for majors with low physical performance and overweight.
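The abstract above names hierarchical clustering but not the linkage rule, and the authors worked in R. The following Python sketch is therefore only an illustration of the general technique, assuming average linkage and Euclidean distance. The function names and the toy scores are hypothetical.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b):
    """Mean pairwise Euclidean distance between two clusters of points."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # combinations() yields i < j, so deleting index j keeps index i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy one-dimensional "fitness scores" (illustrative values, not study data).
scores = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,)]
profiles = agglomerative(scores, n_clusters=2)
print(sorted(sorted(c) for c in profiles))
```

In practice the number of clusters would be chosen by an internal validation index, as the abstract describes, rather than fixed in advance as it is here.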
A major challenge in cancer treatment is predicting the clinical response to anti-cancer drugs on a personalized basis. The success of such a task largely depends on the ability to develop computational resources that integrate big "omic" data into effective drug-response models. Machine learning is both an expanding and an evolving computational field that holds promise to cover such needs. Here we provide a focused overview of: 1) the various supervised and unsupervised algorithms used specifically in drug response prediction applications, 2) the strategies employed to develop these algorithms into applicable models, 3) data resources that are fed into these frameworks and 4) pitfalls and challenges in maximizing model performance. In this context we also describe a novel in silico screening process, based on Association Rule Mining, for identifying genes as candidate drivers of drug response and compare it with relevant data mining frameworks, for which we generated a web application freely available at: https://compbio.nyumc.org/drugs/. This pipeline explores large sample spaces with high efficiency, can detect low-frequency events, and evaluates statistical significance even in multidimensional space, presenting the results in the form of easily interpretable rules. We conclude with future prospects and challenges of applying machine learning based drug response prediction in precision medicine.
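Association Rule Mining, mentioned in the abstract above, scores rules of the form "antecedent implies consequent" with standard measures such as support, confidence, and lift. The sketch below computes these for a single rule. The gene names, response labels, and transactions are invented for illustration, and this is not the authors' pipeline.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions is a list of sets; antecedent and consequent are sets of items.
    """
    n = len(transactions)
    n_ante = sum(antecedent <= t for t in transactions)
    n_cons = sum(consequent <= t for t in transactions)
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / n
    confidence = n_both / n_ante
    lift = confidence / (n_cons / n)
    return support, confidence, lift


# Hypothetical cell lines described by a mutation and a drug response.
transactions = [
    {"GENE_A_mut", "sensitive"},
    {"GENE_A_mut", "sensitive"},
    {"GENE_A_mut", "resistant"},
    {"GENE_B_mut", "resistant"},
    {"GENE_B_mut", "sensitive"},
]
support, confidence, lift = rule_metrics(
    transactions, {"GENE_A_mut"}, {"sensitive"}
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 indicates that the consequent occurs more often together with the antecedent than its overall frequency would suggest. A real screen would add a significance test on top of these descriptive measures.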
Acute heart failure (AHF) is a life-threatening, heterogeneous disease requiring urgent diagnosis and treatment. The clinical severity and medical procedures differ according to a complex interplay between the deterioration cause, underlying cardiac substrate, and comorbidities. This study aimed to analyze the natural phenotypic heterogeneity of the AHF population and evaluate the possibilities offered by clustering (an unsupervised machine-learning technique) in medical data assessment. We evaluated data from 381 AHF patients. Sixty-three clinical and biochemical features were assessed at patient admission and were included in the analysis after preprocessing. The K-medoids algorithm was used to create the clusters, with optimization based on the Davies-Bouldin index. The clustering was performed while blinded to the outcome. The outcome associations were evaluated using Kaplan-Meier curves and Cox proportional-hazards regression. The algorithm distinguished six clusters that differed significantly in 58 variables covering etiology, clinical status, comorbidities, laboratory parameters, and lifestyle factors. The clusters differed in terms of one-year mortality (p = 0.002). Using the clustering techniques, we extracted six phenotypes from AHF patients with distinct clinical characteristics and outcomes. Our results can be valuable for future trial design and customized treatment.
- Publication type
- Journal Article MeSH
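The acute heart failure abstract above combines K-medoids clustering with the Davies-Bouldin index. The following is a minimal sketch of both ideas under stated assumptions: it uses the simple alternating variant of k-medoids (not necessarily the variant the authors used), starts deterministically from the first k points, uses Euclidean distance, and requires distinct points. All names and the toy data are hypothetical.

```python
from math import dist


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, medoids[c]))
            for p in points]


def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids on distinct points; returns (labels, medoids)."""
    medoids = list(points[:k])  # assumption: deterministic initialisation
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # The medoid is the member with the smallest total distance
            # to all other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist(m, q) for q in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(points, medoids), medoids


def davies_bouldin(points, labels):
    """Davies-Bouldin index (lower is better) using cluster centroids."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cents = {k: tuple(sum(x) / len(v) for x in zip(*v))
             for k, v in clusters.items()}
    scatter = {k: sum(dist(p, cents[k]) for p in v) / len(v)
               for k, v in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


# Toy data: two well-separated squares (illustrative, not patient data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels, medoids = k_medoids(points, k=2)
print(labels, medoids, round(davies_bouldin(points, labels), 3))
```

One common way to use the index is to repeat the clustering for several candidate values of k and keep the one with the lowest score. The abstract does not say exactly what was optimized, so that usage is an assumption here.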
Current studies of gene × air pollution interaction typically seek to identify unknown heritability of common complex illnesses arising from variability in the host's susceptibility to environmental pollutants of interest. Accordingly, single-component generalized linear models are often used to model the risk posed by an environmental exposure variable of interest in relation to a priori determined DNA variants. However, reducing the phenotypic heterogeneity may further optimize such an approach, which is primarily represented by the modeled DNA variants. Here, we reduce phenotypic heterogeneity of asthma severity, and also identify single nucleotide polymorphisms (SNPs) associated with phenotype subgroups. Specifically, we first apply an unsupervised learning algorithm and a non-parametric regression to find a biclustering structure of children according to their allergy and asthma severity. We then identify a set of SNPs most closely correlated with each subgroup. We subsequently fit a logistic regression model for each group against the healthy controls using benzo[a]pyrene (B[a]P) as a representative airborne carcinogen. Applying this approach to a case-control data set shows that SNP clustering may help to partly explain heterogeneity in children's asthma susceptibility in relation to ambient B[a]P concentration with greater efficiency.
- MeSH
- Algorithms MeSH
- Benzo(a)pyrene toxicity MeSH
- Asthma chemically induced genetics MeSH
- Child MeSH
- Genetic Predisposition to Disease * MeSH
- Gene-Environment Interaction MeSH
- Polymorphism, Single Nucleotide MeSH
- Air Pollutants toxicity MeSH
- Humans MeSH
- Multifactorial Inheritance * MeSH
- Statistics as Topic MeSH
- Unsupervised Machine Learning MeSH
- Case-Control Studies MeSH
- Environmental Exposure adverse effects MeSH
- Air Pollution adverse effects MeSH
- Check Tag
- Child MeSH
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
OBJECTIVES: Circulating insulin concentrations mediate vascular-inflammatory and prothrombotic factors. However, it is unknown whether interindividual differences in circulating insulin levels are associated with different inflammatory and prothrombotic profiles in type 1 diabetes (T1D). We applied an unsupervised machine-learning approach to determine whether interindividual differences in rapid-acting insulin levels associate with parameters of vascular health in patients with T1D. METHODS: We re-analyzed baseline pretreatment meal-tolerance test data from 2 randomized controlled trials in which 32 patients consumed a mixed-macronutrient meal and self-administered a single dose of rapid-acting insulin individualized by carbohydrate counting. Postprandial serum insulin, tumour necrosis factor (TNF)-alpha, plasma fibrinogen, human tissue factor (HTF) activity and plasminogen activator inhibitor-1 (PAI-1) were measured. Two-step clustering categorized individuals based on shared clinical characteristics. For analyses, insulin pharmacokinetic summary statistics were normalized, allowing standardized intraindividual comparisons. RESULTS: Despite standardization of insulin dose, individuals exhibited marked interpersonal variability in peak insulin concentrations (48.63%), time to peak (64.95%) and insulin incremental area under the curve (60.34%). Two clusters were computed: cluster 1 (n=14), representing increased serum insulin concentrations; and cluster 2 (n=18), representing reduced serum insulin concentrations (cluster 1: 389.50±177.10 pmol/L/IU h-1; cluster 2: 164.29±41.91 pmol/L/IU h-1; p<0.001). Cluster 2 was characterized by increased levels of fibrinogen, PAI-1, TNF-alpha and HTF activity; higher glycated hemoglobin; increased body mass index; lower estimated glucose disposal rate (increased insulin resistance); older age; and longer diabetes duration (p<0.05 for all analyses). 
CONCLUSIONS: Reduced serum insulin concentrations are associated with insulin resistance and a prothrombotic milieu in individuals with T1D, and therefore may be a marker of adverse vascular outcome.
- MeSH
- Diabetes Mellitus, Type 1 * complications MeSH
- Fibrinogen therapeutic use MeSH
- Hypoglycemic Agents pharmacology therapeutic use MeSH
- Plasminogen Activator Inhibitor 1 therapeutic use MeSH
- Injections, Subcutaneous MeSH
- Insulin therapeutic use MeSH
- Insulin Resistance * MeSH
- Insulin, Short-Acting therapeutic use MeSH
- Blood Glucose analysis MeSH
- Humans MeSH
- Postprandial Period MeSH
- Machine Learning MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
BACKGROUND: A statistical pipeline was developed and used for determining candidate genes and candidate gene coexpression networks involved in 2 alcohol (i.e., ethanol [EtOH]) metabolism phenotypes, namely alcohol clearance and acetate area under the curve in a recombinant inbred (RI) (HXB/BXH) rat panel. The approach was also used to provide an indication of how EtOH metabolism can impact the normal function of the identified networks. METHODS: RNA was extracted from alcohol-naïve liver tissue of 30 strains of HXB/BXH RI rats. The reconstructed transcripts were quantitated, and data were used to construct gene coexpression modules and networks. A separate group of rats, comprising the same 30 strains, was injected with EtOH (2 g/kg) for measurement of blood EtOH and acetate levels. These data were used for quantitative trait loci (QTL) analysis of the rate of EtOH disappearance and circulating acetate levels. The analysis pipeline required calculation of the module eigengene values, the correlation of these values with EtOH metabolism rates and acetate levels across the rat strains, and the determination of the eigengene QTLs. For a module to be considered a candidate for determining phenotype, the module eigengene values had to have significant correlation with the strain phenotypic values and the module eigengene QTLs had to overlap the phenotypic QTLs. RESULTS: Of the 658 transcript coexpression modules generated from liver RNA sequencing data, a single module satisfied all criteria for being a candidate for determining the alcohol clearance trait. This module contained 2 alcohol dehydrogenase genes, including the gene whose product was previously shown to be responsible for the majority of alcohol elimination in the rat. This module was also the only module identified as a candidate for influencing circulating acetate levels. This module was also linked to the process of generation and utilization of retinoic acid as related to the autonomous immune response.
CONCLUSIONS: We propose that our analytical pipeline can successfully identify genetic regions and transcripts which predispose to a particular phenotype and our analysis provides functional context for coexpression module components.
- MeSH
- Ethanol administration & dosage metabolism MeSH
- Liver drug effects metabolism MeSH
- Rats MeSH
- Metabolic Clearance Rate drug effects physiology MeSH
- Multifactorial Inheritance drug effects physiology MeSH
- Alcohol Drinking genetics metabolism MeSH
- Rats, Inbred BN MeSH
- Rats, Inbred SHR MeSH
- Rats, Transgenic MeSH
- Unsupervised Machine Learning * MeSH
- Systems Biology methods MeSH
- Animals MeSH
- Check Tag
- Rats MeSH
- Male MeSH
- Animals MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
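The alcohol-metabolism abstract above states two criteria for a candidate module: its eigengene values must correlate with the strain phenotype, and its eigengene QTL must overlap the phenotypic QTL. The sketch below encodes that two-part filter. The correlation threshold stands in for the significance test, which the abstract does not specify, and all names, intervals, and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def intervals_overlap(a, b):
    """True if closed intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def is_candidate(eigengene, phenotype, module_qtl, phenotype_qtl,
                 min_abs_r=0.5):
    """Apply both criteria; min_abs_r is a placeholder for a significance test."""
    correlated = abs(pearson(eigengene, phenotype)) >= min_abs_r
    return correlated and intervals_overlap(module_qtl, phenotype_qtl)


# Toy per-strain values and QTL intervals on one chromosome (illustrative only).
eigengene = [1.0, 2.0, 3.0, 4.0]
clearance = [2.0, 4.0, 6.0, 8.0]
print(is_candidate(eigengene, clearance, (10, 20), (15, 30)))
print(is_candidate(eigengene, clearance, (10, 20), (25, 30)))
```

The second call fails the filter even though the correlation is perfect, because the two QTL intervals do not overlap.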
Manual and semi-automatic identification of artifacts and unwanted physiological signals in large intracerebral electroencephalographic (iEEG) recordings is time-consuming and inaccurate. To date, unsupervised methods to accurately detect iEEG artifacts are not available. This study introduces a novel machine-learning approach for detection of artifacts in iEEG signals in clinically controlled conditions using convolutional neural networks (CNN) and benchmarks the method's performance against expert annotations. The method was trained and tested on data obtained from St Anne's University Hospital (Brno, Czech Republic) and validated on data from Mayo Clinic (Rochester, Minnesota, U.S.A.). We show that the proposed technique can be used as a generalized model for iEEG artifact detection. Moreover, a transfer learning process might be used for retraining of the generalized version to form a data-specific model. The generalized model can be efficiently retrained for use with different EEG acquisition systems and noise environments. The generalized and specialized model F1 scores on the testing dataset were 0.81 and 0.96, respectively. The CNN model provides faster, more objective, and more reproducible iEEG artifact detection compared to manual approaches.
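The abstract above reports model performance as F1 scores. As a brief reminder of the definition, F1 is the harmonic mean of precision and recall, which simplifies to 2TP / (2TP + FP + FN). The sketch below uses made-up counts, not the study's data, and the function name is hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall = 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


# Made-up detection counts for illustration (not from the study).
tp, fp, fn = 8, 2, 2
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = f1_score(tp, fp, fn)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because F1 ignores true negatives, it suits artifact detection, where artifact-free segments usually far outnumber artifacts.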