Bagged neural network
The purpose of the study is to determine radon-prone areas in the Czech Republic based on measurements of indoor radon concentration and independent predictors (rock type and permeability of the bedrock, gamma dose rate, GPS coordinates and the average age of family houses). The relationship between the mean observed indoor radon concentrations in monitored areas (∼22% of municipalities) and the independent predictors was modelled using a bagged neural network. Levels of mean indoor radon concentration in the unmonitored areas were predicted using the bagged neural network model fitted for the monitored areas. The propensity to increased indoor radon was determined by the estimated probability of exceeding the action level of 300 Bq/m³.
- MeSH
- Radiation Monitoring * MeSH
- Neural Networks, Computer * MeSH
- Air Pollutants, Radioactive analysis MeSH
- Air Pollution, Radioactive statistics & numerical data MeSH
- Radon analysis MeSH
- Models, Theoretical MeSH
- Air Pollution, Indoor statistics & numerical data MeSH
- Publication type
- Journal Article MeSH
- Geographic names
- Czech Republic MeSH
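The bagging step in the radon abstract above can be illustrated with a minimal sketch. It rests on several assumptions that are not in the source: the base learner is a one-predictor least-squares line standing in for the neural network, the names `fit_line`, `bagged_fit` and `predict` are hypothetical, the data are synthetic, and the exceedance probability is taken as the fraction of ensemble members predicting above the action level (the abstract does not say how the probability was estimated).

```python
import random
from statistics import mean

ACTION_LEVEL = 300.0  # Bq/m^3, the action level quoted in the abstract


def fit_line(xs, ys):
    """Least-squares line y = a + b*x (stand-in for a neural network)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate bootstrap sample: fall back to a constant
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bagged_fit(xs, ys, n_models=200, seed=0):
    """Fit one base learner per bootstrap resample of the monitored areas."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict(models, x):
    """Return (bagged mean prediction, share of members above ACTION_LEVEL)."""
    preds = [a + b * x for a, b in models]
    exceed = sum(p > ACTION_LEVEL for p in preds) / len(preds)
    return mean(preds), exceed


# Synthetic example: one predictor value per monitored area and the
# corresponding mean indoor radon concentration (made-up numbers).
predictor = [0, 1, 2, 3, 4, 5]
radon = [100, 150, 200, 250, 300, 350]
ensemble = bagged_fit(predictor, radon)
mean_level, p_exceed = predict(ensemble, 10)  # an "unmonitored" area
```

Averaging over bootstrap resamples reduces the variance of an unstable learner, and the spread of the individual predictions gives a rough handle on uncertainty, which is what the exceedance share exploits here.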
The automated detection of arrhythmia in a Holter ECG signal is a challenging task due to its complex clinical content and data quantity. It is also challenging because Holter ECG is usually affected by noise. Such noise may result from the regular activity of patients wearing the Holter ECG: partially unplugged electrodes, brief disconnections due to movement, or disturbances caused by electrical devices or infrastructure. Furthermore, regular patient activities such as movement also affect the ECG signals and, in combination with artificial noise, may render the ECG unreadable or lead to its misinterpretation. OBJECTIVE: In accordance with the PhysioNet/CinC Challenge 2017, we propose a method for automated classification of 1-lead Holter ECG recordings. APPROACH: The proposed method classifies a tested record into one of four classes: 'normal', 'atrial fibrillation', 'other arrhythmia' or 'too noisy to classify'. It uses two machine learning methods in parallel. The first, a bagged tree ensemble (BTE), processes a set of 43 features based on QRS detection and PQRS morphology. The second, a convolutional neural network connected to a shallow neural network (CNN/NN), uses ECG filtered by nine different filters (8× envelograms, 1× band-pass). If the output of CNN/NN reaches a specific level of certainty, its output is used. Otherwise, the BTE output is preferred. MAIN RESULTS: The proposed method was trained using a reduced version of the public PhysioNet/CinC Challenge 2017 dataset (8183 records) and remotely tested on the hidden dataset on PhysioNet servers (3658 records). The method achieved F1 test scores of 0.92, 0.82 and 0.74 for normal recordings, atrial fibrillation and recordings containing other arrhythmias, respectively. The overall F1 score measured on the hidden test set was 0.83. SIGNIFICANCE: This F1 score led to shared rank #2 in the follow-up PhysioNet/CinC Challenge 2017 ranking.
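The rule for choosing between the two classifiers in the ECG abstract above can be sketched as follows. This is only an illustration under assumptions: the certainty measure is taken to be the highest class probability from CNN/NN, the threshold of 0.8 is an arbitrary placeholder, and the function name `combine_predictions` is hypothetical. The abstract specifies none of these.

```python
from typing import Sequence

# The four classes named in the abstract.
CLASSES = (
    "normal",
    "atrial fibrillation",
    "other arrhythmia",
    "too noisy to classify",
)


def combine_predictions(
    cnn_probs: Sequence[float],
    bte_label: str,
    threshold: float = 0.8,  # assumed value, not taken from the source
) -> str:
    """Use the CNN/NN class if it is certain enough, else the BTE class."""
    if len(cnn_probs) != len(CLASSES):
        raise ValueError("expected one probability per class")
    best = max(range(len(CLASSES)), key=lambda i: cnn_probs[i])
    if cnn_probs[best] >= threshold:
        return CLASSES[best]
    return bte_label  # fall back to the bagged tree ensemble
```

For example, `combine_predictions([0.05, 0.9, 0.03, 0.02], "normal")` keeps the CNN/NN decision, while a flat probability vector such as `[0.4, 0.3, 0.2, 0.1]` defers to whatever label the BTE produced.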
Automated sentiment analysis is becoming increasingly recognized due to the growing importance of social media and e-commerce platform review websites. Deep neural networks outperform traditional lexicon-based and machine learning methods by effectively exploiting contextual word embeddings to generate dense document representation. However, this representation model is not fully adequate to capture topical semantics and the sentiment polarity of words. To overcome these problems, a novel sentiment analysis model is proposed that utilizes richer document representations of word-emotion associations and topic models, which is the main computational novelty of this study. The sentiment analysis model integrates word embeddings with lexicon-based sentiment and emotion indicators, including negations and emoticons, and to further improve its performance, a topic modeling component is utilized together with a bag-of-words model based on a supervised term weighting scheme. The effectiveness of the proposed model is evaluated using large datasets of Amazon product reviews and hotel reviews. Experimental results prove that the proposed document representation is valid for the sentiment analysis of product and hotel reviews, irrespective of their class imbalance. The results also show that the proposed model improves on existing machine learning methods.
- MeSH
- Algorithms * MeSH
- Emotions MeSH
- Humans MeSH
- Neural Networks, Computer * MeSH
- Semantics MeSH
- Machine Learning MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
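The idea in the sentiment analysis abstract above of enriching word embeddings with lexicon-based indicators can be shown with a toy sketch. Everything concrete here is an assumption made for illustration: the tiny lexicon, the emoticon table, the two-dimensional embeddings, the rule that a negation flips only the next token, and the function names. The topic model and supervised term weighting components mentioned in the abstract are left out.

```python
# Toy resources (hypothetical values, for illustration only).
LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0}
NEGATIONS = {"not", "never", "no"}
EMOTICONS = {":)": 1.0, ":(": -1.0}
EMBEDDINGS = {"good": [0.9, 0.1], "bad": [-0.8, 0.2], "room": [0.0, 0.5]}
DIM = 2


def lexicon_features(tokens):
    """Return [polarity score, emoticon score]; a negation flips the next token."""
    polarity, emoticon, negate = 0.0, 0.0, False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        if tok in EMOTICONS:
            emoticon += EMOTICONS[tok]
        elif tok in LEXICON:
            polarity += -LEXICON[tok] if negate else LEXICON[tok]
        negate = False  # negation scope ends after one token
    return [polarity, emoticon]


def mean_embedding(tokens):
    """Average the embeddings of known tokens (zero vector if none are known)."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return [0.0] * DIM
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]


def document_vector(text):
    """Concatenate the dense embedding with the lexicon-based indicators."""
    tokens = text.lower().split()
    return mean_embedding(tokens) + lexicon_features(tokens)
```

The point of the concatenation is that a downstream classifier sees both views: in "not good :(" the embedding of "good" alone looks positive, while the negation-aware polarity and the emoticon score both point the other way.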
Random Forest is an ensemble of decision trees based on the bagging and random subspace concepts. As suggested by Breiman, the strength of the unstable learners and the diversity among them are the core strengths of ensemble models. In this paper, we propose two approaches known as oblique and rotation double random forests. In the first approach, we propose the rotation-based double random forest, in which a transformation or rotation of the feature space is generated at each node. A different random feature subspace is chosen for evaluation at each node, hence the transformation at each node is different. Different transformations result in better diversity among the base learners and hence better generalization performance. With the double random forest as the base learner, the data at each node are transformed via two different transformations, namely principal component analysis and linear discriminant analysis. In the second approach, we propose the oblique double random forest. Decision trees in random forest and double random forest are univariate, which results in axis-parallel splits that fail to capture the geometric structure of the data. Also, the standard random forest may not grow sufficiently large decision trees, resulting in suboptimal performance. To capture the geometric properties and to grow decision trees of sufficient depth, we propose the oblique double random forest, whose models are multivariate decision trees. At each non-leaf node, a multisurface proximal support vector machine generates the optimal plane for better generalization performance. Also, different regularization techniques (Tikhonov regularization, axis-parallel split regularization, null space regularization) are employed to tackle the small sample size problem in the decision trees of the oblique double random forest.
The proposed ensembles produce larger trees than the standard ensembles of decision trees because bagging is used at each non-leaf node, which results in improved performance. The baseline models and the proposed oblique and rotation double random forest models are evaluated on 121 benchmark UCI datasets and real-world fisheries datasets. Both statistical analysis and the experimental results demonstrate the efficacy of the proposed oblique and rotation double random forest models compared to the baseline models on the benchmark datasets.
- MeSH
- Algorithms * MeSH
- Principal Component Analysis MeSH
- Rotation MeSH
- Support Vector Machine * MeSH
- Publication type
- Journal Article MeSH
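The contrast between axis-parallel and oblique splits drawn in the random forest abstract above can be seen on a small example. The sketch below is only an illustration: the six points are synthetic, the oblique hyperplane weights are picked by hand rather than produced by a multisurface proximal support vector machine, and all function names are hypothetical.

```python
def oblique_goes_left(x, weights, threshold):
    """Multivariate test: a linear combination of features against a threshold."""
    return sum(w * v for w, v in zip(weights, x)) <= threshold


def split_accuracy(goes_left, labels):
    """Accuracy when one side is assigned class 0 and the other class 1."""
    correct = sum((lab == 0) == left for left, lab in zip(goes_left, labels))
    acc = correct / len(labels)
    return max(acc, 1.0 - acc)  # allow either side to be class 0


def best_axis_parallel_accuracy(points, labels):
    """Try every univariate test x[j] <= t with t taken from the observed values."""
    best = 0.0
    for j in range(len(points[0])):
        for t in sorted({p[j] for p in points}):
            sides = [p[j] <= t for p in points]
            best = max(best, split_accuracy(sides, labels))
    return best


# Synthetic data: the classes are separated by the line x0 + x1 = 1.
points = [(0.0, 0.0), (0.2, 0.6), (0.6, 0.2), (1.0, 1.0), (0.4, 0.8), (0.8, 0.4)]
labels = [0, 0, 0, 1, 1, 1]

axis_acc = best_axis_parallel_accuracy(points, labels)
oblique_acc = split_accuracy(
    [oblique_goes_left(p, (1.0, 1.0), 1.0) for p in points], labels
)
```

On these points no single axis-parallel test separates the classes, while the hand-picked oblique test does so in one node. A univariate tree would need additional levels to reach the same partition.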