support vector machine classification
Imbalanced datasets are prominent in real-world problems. In such problems, the number of samples in one class is significantly higher than in the other classes, even though the other classes might be more important. Standard classification algorithms may assign all the data to the majority class, which is a significant drawback of most standard learning algorithms, so imbalanced datasets need to be handled carefully. One of the traditional algorithms, the twin support vector machine (TSVM), performs well on balanced data but poorly on imbalanced datasets. To improve the TSVM algorithm's classification ability for imbalanced datasets, a reduced universum twin support vector machine for class imbalance learning (RUTSVM), driven by the universum twin support vector machine (UTSVM), was recently proposed. One of RUTSVM's key drawbacks is that solving the dual problem and finding the classifiers involve matrix inverse computation. In this paper, we improve the RUTSVM and propose an improved reduced universum twin support vector machine for class imbalance learning (IRUTSVM). In the suggested IRUTSVM approach, we offer alternative Lagrangian functions to tackle the primal problems of RUTSVM by inserting one of the terms in the objective function into the constraints. As a result, we obtain a new dual formulation for each optimization problem, so that we need not compute inverse matrices either in the training process or in finding the classifiers. Moreover, rectangular kernel matrices of smaller size are used to reduce the computational time. Extensive testing is carried out on a variety of synthetic and real-world imbalanced datasets, and the findings show that the IRUTSVM algorithm outperforms the TSVM, UTSVM, and RUTSVM algorithms in terms of generalization performance.
- MeSH
- algorithms * MeSH
- support vector machine * MeSH
- Publication type
- journal articles MeSH
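The majority-class problem described in the abstract above can be illustrated with a minimal sketch. The label counts (95 negatives, 5 positives) are an illustrative assumption, not data from the paper, and the "classifier" is simply a constant prediction of the majority class.

```python
def recall_per_class(y_true, y_pred):
    """Return {label: recall} for every label present in y_true."""
    recalls = {}
    for label in set(y_true):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total
    return recalls

# Hypothetical imbalanced labels: 95 majority samples (0), 5 minority samples (1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recalls = recall_per_class(y_true, y_pred)
# Balanced accuracy is the mean of the per-class recalls.
balanced_accuracy = sum(recalls.values()) / len(recalls)
```

Plain accuracy looks high here although no minority sample is ever detected, which is why per-class recall or balanced accuracy is usually reported for imbalanced problems.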
BACKGROUND: Atherosclerosis leads to coronary artery disease (CAD) and myocardial infarction (MI), a major cause of morbidity and mortality worldwide. Computer-aided analysis of electrocardiogram (ECG)-derived heart rate variability (HRV) can be a robust method for the prognosis of atherosclerotic events. METHODS: A total of 70 male subjects aged 55 ± 5 years participated in the study. The lead-II ECG was recorded and sampled at 200 Hz. The tachogram was obtained from the ECG signal and used to extract twenty-five HRV features. A one-way analysis of variance (ANOVA) test was performed to find significant differences between the CAD, MI, and control subjects. The features were used to train and test two-class artificial neural network (ANN) and support vector machine (SVM) classifiers. RESULTS: The obtained results revealed depressed HRV under atherosclerosis. An accuracy of 100% was obtained in classifying CAD and MI subjects from the controls using the ANN, and 99.6% using the SVM. In the classification of CAD from MI subjects, the SVM and ANN achieved 99.3% and 99.0% accuracy, respectively. CONCLUSIONS: Depressed HRV has been suggested to be a marker in the identification of atherosclerotic events. The good accuracy observed in classification between control, CAD, and MI subjects shows it to be a non-invasive, cost-effective approach to the prognosis of atherosclerotic events.
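As a rough illustration of extracting HRV features from a tachogram, the sketch below computes three widely used time-domain measures (SDNN, RMSSD, pNN50) from a list of RR intervals. The abstract does not say which twenty-five features were used, so the choice of these three, the function name `hrv_time_domain`, and the sample intervals are all assumptions.

```python
import math
import statistics

def hrv_time_domain(rr_ms):
    """Compute SDNN, RMSSD and pNN50 from RR intervals given in milliseconds."""
    # Successive differences between neighbouring RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    sdnn = statistics.stdev(rr_ms)  # sample standard deviation of the RR intervals
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # Percentage of successive differences larger than 50 ms in absolute value.
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return {"sdnn": sdnn, "rmssd": rmssd, "pnn50": pnn50}

# Hypothetical tachogram (RR intervals in ms), for illustration only.
rr = [800, 810, 790, 860, 800]
features = hrv_time_domain(rr)
```

A feature dictionary of this kind, computed per subject, is the sort of input vector that could then be passed to an ANN or SVM classifier.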
Early detection of malignant thyroid nodules is crucial for effective treatment, but traditional diagnostic methods face challenges such as variability in expert opinions and limited integration of advanced imaging techniques. This prospective cohort study investigates a novel multimodal approach, integrating traditional methods with advanced machine learning techniques. We studied 181 patients who underwent fine-needle aspiration (FNA) biopsy, each contributing one nodule, resulting in a total of 181 nodules for our analysis. Data collection included sex, age, and ultrasound imaging, which incorporated elastography. Features extracted from these images included Thyroid Imaging Reporting and Data System (TIRADS) scores, elastography parameters, and radiomic features. The pathological results of the FNA biopsy, provided by the pathologists, served as our gold standard for nodule classification. Our methodology, termed ELTIRADS, combines these features with interpretable machine learning techniques. Performance evaluation showed that a Support Vector Machine (SVM) classifier using TIRADS, elastography data, and radiomic features achieved high accuracy (0.92), with sensitivity (0.89), specificity (0.94), precision (0.89), and F1 score (0.89). To enhance interpretability, we used hierarchical clustering, Shapley additive explanations (SHAP), and partial dependence plots (PDP). This combined approach holds promise for enhancing the accuracy of thyroid nodule malignancy detection, thereby contributing to advancements in personalized and precision medicine in the field of thyroid cancer research.
- MeSH
- adult MeSH
- elasticity imaging techniques * methods MeSH
- middle aged MeSH
- humans MeSH
- thyroid neoplasms diagnostic imaging classification pathology diagnosis MeSH
- prospective studies MeSH
- radiomics MeSH
- aged MeSH
- thyroid gland diagnostic imaging pathology MeSH
- machine learning * MeSH
- support vector machine MeSH
- biopsy, fine-needle MeSH
- thyroid nodule * diagnostic imaging pathology classification MeSH
- Check Tag
- adult MeSH
- middle aged MeSH
- humans MeSH
- male MeSH
- aged MeSH
- female MeSH
- Publication type
- journal articles MeSH
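The performance measures quoted in the thyroid nodule abstract above (accuracy, sensitivity, specificity, precision, F1 score) all follow from the four cells of a binary confusion matrix. The sketch below spells out the standard definitions; the counts are hypothetical and are not the study's confusion matrix.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # recall on the positive (malignant) class
        "specificity": tn / (tn + fp),   # recall on the negative (benign) class
        "precision": tp / (tp + fp),     # positive predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts chosen only to exercise the formulas.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
```

The F1 expression used here, 2TP / (2TP + FP + FN), is algebraically the same as the harmonic mean of precision and sensitivity.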
Atrial fibrillation (AF) is a serious heart arrhythmia that significantly increases the risk of ischemic stroke. Clinically, an AF episode is recognized in an electrocardiogram. However, detection of asymptomatic AF, which requires long-term monitoring, is more efficient when based on the irregularity of beat-to-beat intervals estimated by heart rate (HR) features. Automated classification of heartbeats into AF and non-AF by means of the Lagrangian Support Vector Machine (LSVM) has been proposed. The classifier input vector consisted of sixteen features, including four coefficients very sensitive to beat-to-beat heart changes, taken from fetal heart rate analysis in perinatal medicine. The effectiveness of the proposed classifier has been verified on the MIT-BIH Atrial Fibrillation Database. Designing the LSVM classifier using a very large number of feature vectors requires extreme computational effort. Therefore, an original approach has been proposed to determine a training set of the smallest possible size that would still guarantee a high quality of AF detection. It makes it possible to obtain satisfactory results using only 1.39% of all heartbeats as the training data. A post-processing stage based on aggregation of classified heartbeats into AF episodes has been applied to provide more reliable information on patient risk. Results obtained during the testing phase showed a sensitivity of 98.94%, a positive predictive value of 98.39%, and a classification accuracy of 98.86%.
- MeSH
- algorithms MeSH
- databases, factual MeSH
- diagnosis, computer-assisted MeSH
- electrocardiography methods MeSH
- atrial fibrillation diagnosis physiopathology MeSH
- humans MeSH
- signal processing, computer-assisted MeSH
- heart rate physiology MeSH
- support vector machine MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
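The post-processing step mentioned in the atrial fibrillation abstract above, aggregating per-beat classifications into AF episodes, might look roughly like the sketch below. The abstract does not describe the aggregation rule, so the rule used here (keep only runs of at least `min_beats` consecutive AF beats), the parameter name, and the sample labels are assumptions made for illustration.

```python
from itertools import groupby

def af_episodes(beat_labels, min_beats=3):
    """Return (start, end) index pairs, end exclusive, of AF runs with at least min_beats beats."""
    episodes = []
    index = 0
    for is_af, run in groupby(beat_labels):
        length = len(list(run))
        # Keep only sufficiently long runs of beats labelled as AF (truthy labels).
        if is_af and length >= min_beats:
            episodes.append((index, index + length))
        index += length
    return episodes

# Hypothetical per-beat classifier output: 1 = AF, 0 = non-AF.
labels = [0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1]
episodes = af_episodes(labels)
```

With these labels the isolated AF beat at index 1 is discarded, while the two longer runs are reported as episodes, which shows how such a step can suppress single-beat misclassifications.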
BACKGROUND: Early diagnosis of schizophrenia could improve the outcome of the illness. Unlike classical between-group comparisons, machine learning can identify subtle disease patterns on a single-subject level, which could help realize the potential of MRI in establishing a psychiatric diagnosis. Machine learning has previously been predominantly tested on gray-matter structural or functional MRI data. In this paper we used a machine learning classifier to differentiate patients with a first episode of schizophrenia-spectrum disorder (FES) from healthy controls using diffusion tensor imaging. METHODS: We applied a linear support vector machine (SVM) and traditional tract-based spatial statistics between-group analyses to brain fractional anisotropy (FA) data from 77 FES and 77 age- and sex-matched healthy controls. We also evaluated the effects of medication and symptoms on the SVM classification. RESULTS: The SVM distinguished between patients and controls with a significant accuracy of 62.34% (p = 0.005). Participants with FES showed widespread FA reductions relative to controls in a large cluster (N = 56,647 voxels, corrected p = 0.002). The white matter regions that contributed to the correct identification of participants with FES overlapped with the regions that showed lower FA in patients relative to controls. There was no association between the classification performance and medication or symptoms. CONCLUSIONS: Our results provide a proof of concept that SVM might help differentiate FES patients early in the course of illness from healthy controls using white-matter fractional anisotropy. As there was no effect of medications or symptoms, the SVM classification seemed to be based on trait rather than state markers and appeared to capture the lower FA in FES participants relative to controls.
- MeSH
- anisotropy MeSH
- white matter pathology MeSH
- early diagnosis * MeSH
- adult MeSH
- humans MeSH
- young adult MeSH
- brain pathology MeSH
- schizophrenia diagnostic imaging pathology MeSH
- case-control studies MeSH
- support vector machine * MeSH
- diffusion tensor imaging MeSH
- Check Tag
- adult MeSH
- humans MeSH
- young adult MeSH
- male MeSH
- female MeSH
- Publication type
- journal articles MeSH
- grant-supported work MeSH
This work explores the design and implementation of an algorithm for the classification of magnetic resonance imaging data for computer-aided diagnosis of schizophrenia. Features for classification were first extracted using two morphometric methods: voxel-based morphometry (VBM) and deformation-based morphometry (DBM). These features were then transformed into a wavelet domain using the discrete wavelet transform with various numbers of decomposition levels. The number of features was then reduced by thresholding and subsequent selection by: Fisher's Discrimination Ratio (FDR), Bhattacharyya Distance, and Variances (Var.). A Support Vector Machine with a linear kernel was used for classification. The evaluation strategy was based on leave-one-out cross-validation.
- MeSH
- humans MeSH
- magnetic resonance imaging MeSH
- schizophrenia * MeSH
- support vector machine MeSH
- wavelet analysis MeSH
- Check Tag
- humans MeSH
- Publication type
- journal articles MeSH
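One of the feature-selection criteria named in the abstract above, Fisher's Discrimination Ratio (FDR), can be sketched for a two-class problem as the squared difference of class means divided by the sum of class variances. The abstract does not give the exact formula or thresholds the authors used, so the use of sample variances, the function names, and the toy data below are assumptions.

```python
import statistics

def fisher_ratio(values_a, values_b):
    """Fisher's discrimination ratio of a single feature for two classes."""
    mean_gap = statistics.mean(values_a) - statistics.mean(values_b)
    # Assumption: sample variances; a population-variance variant is equally common.
    return mean_gap ** 2 / (statistics.variance(values_a) + statistics.variance(values_b))

def rank_features(class_a, class_b):
    """Rank feature indices by decreasing FDR; inputs are lists of feature vectors."""
    n_features = len(class_a[0])
    scores = [
        fisher_ratio([row[j] for row in class_a], [row[j] for row in class_b])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores

# Hypothetical two-feature data: feature 1 separates the groups, feature 0 barely does.
patients = [[1.0, 5.0], [2.0, 6.0], [3.0, 7.0]]
controls = [[1.5, 1.0], [2.5, 2.0], [3.5, 3.0]]
order, scores = rank_features(patients, controls)
```

Keeping only the top-ranked indices from `order` would give the reduced feature set that is then passed to the linear classifier.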
BACKGROUND: Analysis of gene expression data in terms of a priori-defined gene sets has recently received significant attention as this approach typically yields more compact and interpretable results than those produced by traditional methods that rely on individual genes. The set-level strategy can also be adopted with similar benefits in predictive classification tasks accomplished with machine learning algorithms. Initial studies into the predictive performance of set-level classifiers have yielded rather controversial results. The goal of this study is to provide a more conclusive evaluation by testing various components of the set-level framework within a large collection of machine learning experiments. RESULTS: Genuine curated gene sets constitute better features for classification than sets assembled without biological relevance. For identifying the best gene sets for classification, the Global test outperforms the gene-set methods GSEA and SAM-GS as well as two generic feature selection methods. To aggregate expressions of genes into a feature value, the singular value decomposition (SVD) method as well as the SetSig technique improve on simple arithmetic averaging. Set-level classifiers learned with 10 features constituted by the Global test slightly outperform baseline gene-level classifiers learned with all original data features, although they are slightly less accurate than gene-level classifiers learned with a prior feature-selection step. CONCLUSION: Set-level classifiers do not boost predictive accuracy; however, they do achieve competitive accuracy if learned with the right combination of ingredients. AVAILABILITY: Open-source, publicly available software was used for classifier learning and testing. The gene expression datasets and the gene set database used are also publicly available. The full tabulation of experimental results is available at http://ida.felk.cvut.cz/CESLT.
One of the biggest problems in automated diagnosis of psychiatric disorders from medical images is the lack of sufficiently large samples for training. Sample size is especially important in the case of highly heterogeneous disorders such as schizophrenia, where machine learning models built on relatively low numbers of subjects may suffer from poor generalizability. Via multicenter studies and consortium initiatives, researchers have tried to solve this problem by combining data sets from multiple sites. The necessary sharing of (raw) data is, however, often hindered by legal and ethical issues. Moreover, in the case of very large samples, the computational complexity might become too large. The solution to this problem could be distributed learning. In this paper we investigated the possibility of creating a meta-model by combining support vector machine (SVM) classifiers trained on the local datasets, without the need for sharing medical images or any other personal data. Validation was done in a 4-center setup comprising 480 first-episode schizophrenia patients and healthy controls in total. We built SVM models to separate patients from controls based on three different kinds of imaging features derived from structural MRI scans, and compared models built on the joint multicenter data to the meta-models. The results showed that the combined meta-model had high similarity to the model built on all data pooled together and comparable classification performance on all three imaging features. Both similarity and performance were superior to those of the local models. We conclude that combining models is thus a viable alternative that facilitates data sharing and creating bigger and more informative models.
- MeSH
- datasets as topic * MeSH
- humans MeSH
- magnetic resonance imaging MeSH
- multicenter studies as topic * MeSH
- neuroimaging methods MeSH
- pattern recognition, automated methods MeSH
- schizophrenia diagnostic imaging MeSH
- support vector machine * MeSH
- Check Tag
- humans MeSH
- male MeSH
- female MeSH
- Publication type
- journal articles MeSH
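The idea in the abstract above of building a meta-model from locally trained classifiers, so that only model parameters leave each site, can be sketched for linear models as a weighted average of the per-site weight vectors. The abstract does not state how the SVM models were actually combined, so the sample-size-weighted averaging, the function names, and the site parameters below are all assumptions for illustration.

```python
def combine_linear_models(models):
    """Average (weights, bias, n_samples) triples, weighting each site by its sample count."""
    total = sum(n for _, _, n in models)
    dim = len(models[0][0])
    weights = [sum(w[j] * n for w, _, n in models) / total for j in range(dim)]
    bias = sum(b * n for _, b, n in models) / total
    return weights, bias

def predict(weights, bias, x):
    """Classify x with a linear decision function: +1 (patient) or -1 (control)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1

# Hypothetical linear models from two sites: (weight vector, bias, number of subjects).
site_models = [
    ([1.0, 0.0], -0.5, 100),
    ([0.0, 1.0], 0.5, 300),
]
meta_w, meta_b = combine_linear_models(site_models)
```

Only the weight vectors, biases, and sample counts are exchanged in this scheme; no images or subject-level data are needed to form `meta_w` and `meta_b`.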
Long QT syndrome (LQTS) is a group of inheritable channelopathies with prolonged ventricular repolarization, leading to syncope, ventricular tachycardia, and sudden death. Differentiating LQTS genotypes is crucial for targeted management and treatment, yet conventional genetic testing remains costly and time-consuming. This study aims to improve the distinction between LQTS genotypes, particularly LQT3, through a novel electrocardiogram (ECG)-based approach. Patients with LQT3 are at elevated risk due to arrhythmia triggers associated with rest and sleep. Employing a database of genotyped long QT syndrome E-HOL-03-0480-013 ECG signals, we introduced two innovative parameterization techniques (area under the ECG curve and wave transformation into the unit circle) to classify LQT3 against LQT1 and LQT2 genotypes. Our methodology utilized single-lead ECG data with a 200 Hz sampling frequency. The support vector machine (SVM) model demonstrated the ability to discriminate LQT3 with a recall of 90% and a precision of 81%, achieving an F1-score of 0.85. This parameterization offers a potential substitute for genetic testing and is practical for low frequencies. These single-lead ECG data could enhance the functionality of smartwatches and similar cardiovascular monitoring applications. The results underscore the viability of ECG morphology-based genotype classification, promising a significant step towards streamlined diagnosis and improved patient care in LQTS.
- MeSH
- adult MeSH
- electrocardiography * methods MeSH
- genotype MeSH
- humans MeSH
- machine learning * MeSH
- support vector machine MeSH
- long QT syndrome * genetics diagnosis physiopathology MeSH
- Check Tag
- adult MeSH
- humans MeSH
- male MeSH
- female MeSH
- Publication type
- journal articles MeSH
- comparative study MeSH
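The F1-score quoted in the long QT abstract above can be checked against the reported recall and precision, since F1 is their harmonic mean. The worked calculation below uses only the rounded values given in the abstract (recall R = 0.90, precision P = 0.81).

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.81 \times 0.90}{0.81 + 0.90}
    = \frac{1.458}{1.71}
    \approx 0.85
```

The result agrees with the F1-score of 0.85 stated in the abstract.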
BACKGROUND: Early diagnosis of schizophrenia could improve the outcomes and limit the negative effects of untreated illness. Although participants with schizophrenia show aberrant functional connectivity in brain networks, these between-group differences have a limited diagnostic utility. Novel methods of magnetic resonance imaging (MRI) analyses, such as machine learning (ML), may help bring neuroimaging from the bench to the bedside. Here, we used ML to differentiate participants with a first episode of schizophrenia-spectrum disorder (FES) from healthy controls based on resting-state functional connectivity (rsFC). METHOD: We acquired resting-state functional MRI data from 63 patients with FES who were individually matched by age and sex to 63 healthy controls. We applied linear kernel support vector machines (SVM) to rsFC within the default mode network, the salience network and the central executive network. RESULTS: The SVM applied to the rsFC within the salience network distinguished the FES from the control participants with an accuracy of 73.0% (p = 0.001), specificity of 71.4% and sensitivity of 74.6%. The classification accuracy was not significantly affected by medication dose, or by the presence of psychotic symptoms. The functional connectivity within the default mode or the central executive networks did not yield classification accuracies above chance level. CONCLUSIONS: Seed-based functional connectivity maps can be utilized for diagnostic classification, even early in the course of schizophrenia. The classification was probably based on trait rather than state markers, as symptoms or medications were not significantly associated with classification accuracy. Our results support the role of the anterior insula/salience network in the pathophysiology of FES.
- MeSH
- adult MeSH
- connectome methods MeSH
- humans MeSH
- magnetic resonance imaging MeSH
- young adult MeSH
- cerebral cortex diagnostic imaging physiopathology MeSH
- schizophrenia diagnostic imaging physiopathology MeSH
- support vector machine * MeSH
- Check Tag
- adult MeSH
- humans MeSH
- young adult MeSH
- male MeSH
- female MeSH
- Publication type
- journal articles MeSH
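The sensitivity, specificity, and accuracy reported in the resting-state connectivity abstract above can be cross-checked against the group sizes (63 patients, 63 controls). The sketch below searches for integer counts whose rate rounds to the reported percentage. The counts it finds are inferred from the published percentages, not taken from the paper, and the helper name `counts_matching` is hypothetical.

```python
def counts_matching(rate_percent, group_size, decimals=1):
    """List the integer counts k for which k / group_size rounds to rate_percent."""
    return [
        k for k in range(group_size + 1)
        if round(100.0 * k / group_size, decimals) == rate_percent
    ]

# Reported sensitivity 74.6% among 63 patients, specificity 71.4% among 63 controls.
true_positives = counts_matching(74.6, 63)
true_negatives = counts_matching(71.4, 63)

# Accuracy implied by the inferred counts over all 126 participants.
implied_accuracy = round(
    100.0 * (true_positives[0] + true_negatives[0]) / 126, 1
)
```

Each search yields a single candidate count, and the accuracy implied by those counts matches the reported 73.0%, so the three figures are mutually consistent.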