classification and regression tree
The increasing prevalence of autism spectrum disorders (ASD) has led to worldwide interest in factors influencing the age of ASD diagnosis. Parents or caregivers of 237 ASD children (193 boys, 44 girls) diagnosed using the Autism Diagnostic Observation Schedule (ADOS) completed a simple descriptive questionnaire. The data were analyzed using variable-centered multiple regression analysis and the person-centered classification tree method. We believed that the concurrent use of these two methods could produce robust results. The mean age at diagnosis was 5.8 ± 2.2 years (median 5.3 years). In the multiple regression analysis, younger age at ASD diagnosis was predicted by higher scores in the ADOS social domain, higher scores in the ADOS restrictive and repetitive behaviors and interest domain, higher maternal education, and the shared household of parents. In the classification tree analysis, the subgroup with the lowest mean age at diagnosis comprised children whose summed ADOS communication and social domain scores were ≥ 17 and whose father was ≥ 29 years old at delivery. In contrast, the subgroup with the oldest mean age at diagnosis included children with summed ADOS communication and social domain scores < 17 and maternal education at the elementary school level. The severity of autism and maternal education played a significant role in both types of analysis of age at diagnosis.
- Keywords
- ADOS, Age at diagnosis, Autism spectrum disorders, Maternal education, Paternal age, Shared household
- MeSH
- Autistic Disorder * MeSH
- Child MeSH
- Adult MeSH
- Communication MeSH
- Humans MeSH
- Child Development Disorders, Pervasive * MeSH
- Autism Spectrum Disorder * diagnosis epidemiology MeSH
- Child, Preschool MeSH
- Regression Analysis MeSH
- Check Tag
- Child MeSH
- Adult MeSH
- Humans MeSH
- Male MeSH
- Child, Preschool MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
PURPOSE: The purposes of this study are to identify the clinical parameters most strongly related to in-hospital mortality that are available in the earliest phase of hospitalization, and to create an easy tool for the early identification of patients at risk. MATERIALS AND METHODS: Classification and regression tree analysis was applied to data from the Acute Heart Failure Database-Main registry, which comprises patients admitted to specialized cardiology centers with all syndromes of acute heart failure. The classification model was built on a derivation cohort (n = 2543) and evaluated on a validation cohort (n = 1387). RESULTS: The classification tree stratifies patients according to the presence of cardiogenic shock (CS), the creatinine level, and the systolic blood pressure (SBP) at admission into 5 risk groups with in-hospital mortality ranging from 2.8% to 66.2%. Patients without CS and with a creatinine level of 155 μmol/L or less were classified into the very-low-risk group; patients without CS, with a creatinine level greater than 155 μmol/L and SBP greater than 103 mm Hg, into the low-risk group; and patients without CS, with a creatinine level greater than 155 μmol/L and SBP of 103 mm Hg or lower, into the intermediate-risk group. The high-risk group comprised patients with CS and a creatinine level of 140 μmol/L or less; patients with CS and a creatinine level greater than 140 μmol/L belonged to the very-high-risk group. The area under the receiver operating characteristic curve was 0.823 and 0.832, and the Brier score was 0.091 and 0.084, for the derivation and the validation cohort, respectively. CONCLUSIONS: The presented classification model effectively stratified patients with all syndromes of acute heart failure into in-hospital mortality risk groups and might be of advantage for clinical practice.
- MeSH
- Risk Assessment methods MeSH
- Middle Aged MeSH
- Humans MeSH
- Hospital Mortality * MeSH
- Registries MeSH
- Risk Factors MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Heart Failure classification mortality MeSH
- Models, Statistical * MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
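The acute heart failure abstract above spells out every split of its classification tree, so the rules can be written down directly. The following is a minimal sketch, not the authors' implementation: the function and parameter names are hypothetical, and only the cut-offs (cardiogenic shock, creatinine 155 and 140 μmol/L, SBP 103 mm Hg) come from the abstract. The `brier_score` helper uses the standard definition of the Brier score (mean squared difference between predicted probability and binary outcome), which the abstract reports but does not define.

```python
def risk_group(cardiogenic_shock: bool, creatinine_umol_l: float, sbp_mm_hg: float) -> str:
    """Assign an in-hospital mortality risk group from the splits quoted in the abstract.

    Names are illustrative assumptions; only the thresholds are from the source.
    """
    if cardiogenic_shock:
        # With cardiogenic shock, creatinine (cut-off 140 umol/L) is the only further split.
        return "high" if creatinine_umol_l <= 140 else "very high"
    if creatinine_umol_l <= 155:
        return "very low"
    # No shock and creatinine > 155 umol/L: systolic blood pressure at admission decides.
    return "low" if sbp_mm_hg > 103 else "intermediate"


def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted_probs, outcomes)) / len(outcomes)
```

A tree of this kind is easy to apply at the bedside precisely because it reduces to three nested comparisons; a lower Brier score indicates better-calibrated probability estimates.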
The rising trend in adolescents' emotional symptoms has become a global public health problem. Adolescents with chronic diseases or disabilities are at particular risk of emotional problems. Ample evidence shows that the family environment is associated with adolescents' emotional health. However, which categories of family-related factors most strongly influence adolescents' emotional health has remained unclear. Additionally, it was not known whether the family environment influences emotional health differently in normally developed adolescents and in those with chronic condition(s). The Health Behaviours in School-aged Children (HBSC) database provides large-scale data on adolescents' self-reported health and social environmental backgrounds, which offers opportunities to apply data-driven approaches to determine critical family environmental factors that influence adolescents' health. Thus, based on the national HBSC data collected in the Czech Republic from 2017 to 2018, the current study adopted a data-driven method, classification and regression decision-tree analysis, to investigate the impact of family environmental factors, including demographic and psycho-social factors, on adolescents' emotional health. The results suggested that family psycho-social functions played a significant role in maintaining adolescents' emotional health. Both normally developed adolescents and adolescents with chronic condition(s) benefited from communication with parents, family support, and parental monitoring. In addition, for adolescents with chronic condition(s), school-related parental support was also meaningful for decreasing emotional problems. In conclusion, the findings suggest the necessity of interventions that strengthen family-school communication and cooperation to improve the mental health of adolescents with chronic disease. Interventions aiming to improve parent-adolescent communication, parental monitoring, and family support are essential for all adolescents.
- Keywords
- adolescent, chronic condition, decision tree, emotional health, family environment
- MeSH
- Chronic Disease MeSH
- Child MeSH
- Mental Health * MeSH
- Emotions MeSH
- Humans MeSH
- Adolescent MeSH
- Parents * psychology MeSH
- Decision Trees MeSH
- Check Tag
- Child MeSH
- Humans MeSH
- Adolescent MeSH
- Publication type
- Journal Article MeSH
- Keywords
- Brassicaceae, Classification & Regression Tree, Heliophila, Random Forests, climatic seasonality, drought regime, life history, phylogenetic tree
- MeSH
- Brassicaceae * MeSH
- Climate Change MeSH
- Forests MeSH
- Droughts * MeSH
- Trees MeSH
- Publication type
- Letter MeSH
Quadrupole inductively coupled plasma mass spectrometry (Q-ICP-MS) and direct mercury analysis were used to determine the elemental composition of 180 transformed (salt-ripened) anchovies from three different fishing areas before and after packaging. To predict fish origin, four decision-tree-based algorithms, namely C5.0, classification and regression trees (CART), chi-square automatic interaction detection (CHAID), and quick unbiased efficient statistical tree (QUEST), were applied to the elemental datasets to find the most accurate data mining procedure. Classification rules generated by the trained CHAID model optimally identified unlabelled testing bulk anchovies (93.9% F-score) by using just 6 out of 52 elements (As, K, P, Cd, Li, and Sr). The finished packaged product was better modelled by the QUEST algorithm, which recognised the origin of anchovies with an F-score of 97.7% using the information carried by 5 elements (B, As, K, Cd, and Pd). The results suggest that traceability systems in the fishery sector may be supported by simplified machine learning techniques applied to a limited but effective number of inorganic predictors of origin.
- Keywords
- Data mining, Decision trees, Engraulis encrasicolus, Fish products, Geographical origin, ICP-MS
- MeSH
- Algorithms MeSH
- Decision Trees MeSH
- Mercury analysis MeSH
- Fish Products analysis MeSH
- Fishes MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Mercury MeSH
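The anchovy study above reports model quality as an F-score over three fishing areas. The abstract does not say which variant or averaging scheme was used, so the sketch below assumes the common F1 (β = 1) with macro-averaging across classes; the function names and the example labels are hypothetical.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def macro_f1(y_true, y_pred) -> float:
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        scores.append(f_score(tp, fp, fn))
    return sum(scores) / len(scores)


# Toy labels for three hypothetical fishing areas A, B and C (not data from the study).
areas_true = ["A", "A", "B", "B", "C", "C"]
areas_pred = ["A", "A", "B", "C", "C", "C"]
score = macro_f1(areas_true, areas_pred)
```

Macro-averaging weights each origin equally regardless of sample count, which is one reasonable choice when every fishing area matters equally for traceability.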
In this paper, we present research on extracting informative gene expression profiles from a high-dimensional array of gene expressions, taking into account the state of patients' health, using a clustering method, ML-based binary classifiers and a fuzzy inference system. The proposed stepwise procedure allows us to extract the most informative genes, taking into account both the subtype of the disease and the state of the patient's health, for subsequent reconstruction of gene regulatory networks from the selected genes and simulation of the reconstructed models. As experimental data we used publicly available gene expression data obtained from DNA microarray experiments, containing two types of gene expression profiles: patients with a lung cancer tumor and healthy patients. The stepwise data processing procedure consists of the following steps. First, we reduce the number of genes by removing genes that are non-informative in terms of statistical criteria and Shannon entropy. Then, we perform stepwise hierarchical clustering of the gene expression profiles at hierarchical levels from 1 to 10 using the SOTA (Self-Organizing Tree Algorithm) clustering algorithm with the correlation distance metric. The quality of the resulting clustering was evaluated using a complex clustering quality criterion that considers both the distribution of the gene expression profiles relative to the centers of the clusters to which they are allocated and the distribution of the cluster centers. The result of this stage was the selection, at each hierarchical level, of the optimal cluster corresponding to the minimum value of the quality criterion. In the next step, we classified the examined objects using four well-known binary classifiers: logistic regression, support-vector machine, decision trees and a random forest classifier. The effectiveness of each technique was evaluated by ROC (Receiver Operating Characteristic) analysis using criteria that include both type I and type II errors as components. The final decision on the most informative subset of gene expression profiles was made using the fuzzy inference system, whose inputs are the results of the individual classifiers and whose output is the final decision on the state of the patient's health. In our view, the proposed stepwise procedure for extracting informative gene expression profiles creates the conditions for increasing the effectiveness of the subsequent reconstruction of gene regulatory networks and simulation of the reconstructed models, taking into account the subtypes of the disease and/or the state of the patient's health.
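Two quantities named in the gene expression abstract above, Shannon entropy and correlation distance, have standard definitions that can be sketched briefly. The abstract gives no thresholds, binning scheme or filtering direction, so everything procedural here is an assumption: the entropy is computed over an equal-width histogram with a hypothetical `bins` parameter, and the correlation distance is taken as 1 minus the Pearson correlation coefficient.

```python
import math
from collections import Counter


def shannon_entropy(values, bins: int = 10) -> float:
    """Shannon entropy (in bits) of an expression profile after equal-width binning.

    The binning scheme is an illustrative assumption, not taken from the paper.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant profile carries no information
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def correlation_distance(x, y) -> float:
    """1 - Pearson correlation; 0 for perfectly correlated profiles, 2 for anti-correlated.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)
```

Correlation distance groups profiles by the shape of their expression pattern rather than by absolute level, which is why it is a common choice for clustering expression data.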
INTRODUCTION: The concept of phenotyping has emerged to reflect the specific clinical, pulmonary and extrapulmonary features of each particular case of chronic obstructive pulmonary disease (COPD). Our aim was to analyze the prognostic utility of "Czech" COPD phenotypes and their most frequent combinations, "Spanish" phenotypes, and Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages and groups in relation to long-term mortality risk. METHODS: Data were extracted from the Czech Multicenter Research Database (CMRD) of COPD. Kaplan-Meier (KM) estimates (at 60 months from inclusion) were used for mortality assessment. Survival rates were calculated for the six elementary "Czech" phenotypes and their most frequent and relevant combinations, "Spanish" phenotypes, and GOLD grades and groups. Differences were tested for statistical significance by the log-rank test. Factors underlying mortality risk (the role of confounders) were assessed using classification and regression tree (CART) analysis. Basic factors showing significant differences between deceased and living patients were entered into the CART model. This yielded six different risk groups; the differences in risk between them were tested by the log-rank test. RESULTS: The cohort (n=720) was 73.1% men, with a mean age of 66.6 years and mean FEV1 of 44.4% pred. KM estimates showed that the bronchiectases/COPD overlap (HR 1.425, p=0.045), frequent exacerbator (HR 1.58, p<0.001), cachexia (HR 2.262, p<0.001) and emphysematous (HR 1.786, p=0.015) phenotypes were associated with higher mortality risk. Co-presence of multiple phenotypes in a single patient had an additive effect on risk; the combination of emphysema, cachexia and frequent exacerbations translated into the poorest prognosis (HR 3.075; p<0.001). Of the "Spanish" phenotypes, AE CB and AE non-CB were associated with greater risk of mortality (HR 1.787 and 2.001; both p=0.001). FEV1% pred., cachexia and chronic heart failure in the patient history were the major underlying factors determining mortality risk in our cohort. CONCLUSION: Certain phenotypes ("Czech" or "Spanish") of COPD are associated with a higher risk of death. Co-presence of multiple phenotypes (emphysematous plus cachectic plus frequent exacerbator) in a single individual was associated with an amplified risk of mortality.
- Keywords
- chronic obstructive pulmonary disease; COPD, classification and regression tree; CART, cluster, mortality, phenotypes
- MeSH
- Bronchitis, Chronic * MeSH
- Pulmonary Disease, Chronic Obstructive * diagnosis MeSH
- Phenotype MeSH
- Humans MeSH
- Disease Progression MeSH
- Prospective Studies MeSH
- Aged MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Multicenter Study MeSH
- Geographic names
- Spain MeSH
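The COPD record above relies on Kaplan-Meier estimates of survival at 60 months. As a minimal sketch of the standard product-limit estimator (not the authors' code), the function below multiplies, at each distinct event time, the running survival by one minus the fraction of at-risk patients who died. The function name, argument layout and the four-patient example are hypothetical.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time per patient (e.g. months from inclusion).
    events:    1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Toy example: four patients, the second one censored at month 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients contribute to the denominator until they leave follow-up, which is what lets the estimator use incomplete observations without bias under non-informative censoring.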
Climate change is expected to intensify bark beetle population outbreaks in forests globally, affecting biodiversity and trajectories of change. Aspects of individual tree resistance remain poorly quantified, particularly with regard to the role of phenolic compounds, hindering robust predictions of forest response to future conditions. In 2003, we conducted a mechanical wounding experiment in a Norway spruce forest that coincided with an outbreak of the bark beetle, Ips typographus. We collected phloem samples from 97 trees and monitored tree survival for 5 months. Using high-performance liquid chromatography, we quantified induced changes in the concentrations of phenolics. Classification and regression tools were used to evaluate relationships between phenolic production and bark beetle resistance, in the context of other survival factors. The proximity of beetle source populations was a principal determinant of survival. Proxy measures of tree vigor, such as crown defoliation, mediated tree resistance. Controlling for these factors, synthesis of catechin was found to exponentially increase tree survival probability. However, even resistant trees were susceptible in late season due to high insect population growth. Our results show that incorporating trait-mediated effects improves predictions of survival. Using an integrated analytical approach, we demonstrate that phenolics play a direct role in tree defense to herbivory.
- Keywords
- Bark beetle outbreak, Catechin, Crown defoliation, Primary attraction, Resistance, Tree survival
- MeSH
- Coleoptera * physiology MeSH
- Herbivory MeSH
- Phenols MeSH
- Phloem MeSH
- Picea * MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication type
- Journal Article MeSH
- Substance names
- Phenols MeSH
BACKGROUND: Overcoming boundaries is crucial for the incursion of alien plant species and their successful naturalization and invasion within protected areas. Previous work showed that in Kruger National Park (KNP), South Africa, this process can be quantified and that factors determining the incursion of invasive species can be identified and predicted confidently. Here we explore the similarity between determinants of incursions identified by the general model based on a multispecies assemblage, and those identified by species-specific models. We analyzed the presence and absence of six invasive plant species in 1.0×1.5 km segments along the border of the park as a function of environmental characteristics from outside and inside the KNP boundary, using two data-mining techniques: classification trees and random forests. PRINCIPAL FINDINGS: The occurrence of Ageratum houstonianum, Chromolaena odorata, Xanthium strumarium, Argemone ochroleuca, Opuntia stricta and Lantana camara can be reliably predicted based on landscape characteristics identified by the general multispecies model, namely water runoff from surrounding watersheds and road density in a 10 km radius. The presence of main rivers and species-specific combinations of vegetation types are reliable predictors from inside the park. CONCLUSIONS: The predictors from outside and inside the park are complementary, and are approximately equally reliable for explaining the presence/absence of current invaders; those from the inside are, however, more reliable for predicting future invasions. Landscape characteristics identified as crucial predictors from outside the KNP can guide management in enacting proactive interventions that manipulate landscape features near the KNP to prevent further incursions. Predictors from inside the KNP can be used reliably to identify high-risk areas, improving the cost-effectiveness of management in locating invasive plants and targeting them for eradication.
- MeSH
- Species Specificity MeSH
- Regression Analysis MeSH
- Rivers MeSH
- Plants classification MeSH
- Plant Development * MeSH
- Conservation of Natural Resources * MeSH
- Introduced Species * MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- South Africa MeSH
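The Kruger National Park record above models species presence/absence with classification trees. As a rough illustration of how such a tree chooses a single split, the sketch below searches one numeric predictor for the threshold that minimizes the weighted Gini impurity, the criterion commonly used by CART for classification. The abstract does not state which splitting criterion or software was used, so this is an assumption, and the function names and toy values are hypothetical.

```python
def gini(labels) -> float:
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Find the threshold t for the rule 'value <= t' with the lowest weighted Gini.

    Returns (threshold, weighted_impurity); threshold is None if no split helps.
    """
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy predictor (e.g. a road-density-like value) and presence (1) / absence (0) labels.
toy_split = best_split([1, 2, 3, 4], [0, 0, 1, 1])
```

A full tree applies this search recursively to each resulting subset and across all predictors; a random forest averages many such trees grown on resampled data.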
BACKGROUND: There is a need for early identification of children with immunoglobulin A nephropathy (IgAN) at risk of progression of kidney disease. METHODS: Data on 261 young patients [age <23 years; mean follow-up of 4.9 (range 2.5-8.1) years] enrolled in VALIGA, a study designed to validate the Oxford Classification of IgAN, were assessed. Renal biopsies were scored for the presence of mesangial hypercellularity (M1), endocapillary hypercellularity (E1), segmental glomerulosclerosis (S1), tubular atrophy/interstitial fibrosis (T1-2) (MEST score) and crescents (C1). Progression was assessed as end stage renal disease and/or a 50% loss of estimated glomerular filtration rate (eGFR) (combined endpoint) as well as the rate of renal function decline (slope of eGFR). Cox regression and tree classification binary models were used and compared. RESULTS: In this cohort of 261 subjects aged <23 years, Cox analysis validated the MEST M, S and T scores for predicting survival to the combined endpoint but failed to prove that these scores had predictive value in the sub-group of 174 children aged <18 years. The regression tree classification indicated that patients with M1 were at risk of developing higher time-averaged proteinuria (p < 0.0001) and the combined endpoint (p < 0.001). An initial proteinuria of ≥0.4 g/day/1.73 m² and an eGFR of <90 ml/min/1.73 m² were determined to be risk factors in subjects with M0. Children aged <16 years with M0 and well-preserved eGFR (>90 ml/min/1.73 m²) at presentation had a significantly high probability of proteinuria remission during follow-up and a higher remission rate following treatment with corticosteroid and/or immunosuppressive therapy. CONCLUSION: This new statistical approach has identified clinical and histological risk factors associated with outcome in children and young adults with IgAN.
- Keywords
- IgA nephropathy, Pathology classification, Progression, Proteinuria, Risk factors
- MeSH
- Survival Analysis MeSH
- Biopsy MeSH
- Kidney Failure, Chronic epidemiology pathology MeSH
- Child MeSH
- Glomerular Filtration Rate MeSH
- Adrenal Cortex Hormones therapeutic use MeSH
- Glomerulonephritis, IGA drug therapy epidemiology pathology MeSH
- Immunosuppressive Agents MeSH
- Cohort Studies MeSH
- Infant MeSH
- Kidney pathology MeSH
- Humans MeSH
- Child, Preschool MeSH
- Disease Progression MeSH
- Proteinuria epidemiology pathology MeSH
- Retrospective Studies MeSH
- Risk Factors MeSH
- Sex Factors MeSH
- Endpoint Determination MeSH
- Age Factors MeSH
- Check Tag
- Child MeSH
- Infant MeSH
- Humans MeSH
- Male MeSH
- Child, Preschool MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic names
- Europe epidemiology MeSH
- Substance names
- Adrenal Cortex Hormones MeSH
- Immunosuppressive Agents MeSH