Imbalanced datasets are common in real-world problems. In such problems, the number of samples in one class is significantly higher than in the other classes, even though the other classes might be more important. Standard classification algorithms may assign all the data to the majority class, which is a significant drawback of most standard learning algorithms, so imbalanced datasets need to be handled carefully. One traditional algorithm, the twin support vector machine (TSVM), performs well on balanced data but poorly on imbalanced datasets. To improve the TSVM algorithm's classification ability on imbalanced datasets, a reduced universum twin support vector machine for class imbalance learning (RUTSVM) was recently proposed, driven by the universum twin support vector machine (UTSVM). One of RUTSVM's key drawbacks is that solving the dual problem and finding the classifiers involve matrix inverse computation. In this paper, we improve RUTSVM and propose an improved reduced universum twin support vector machine for class imbalance learning (IRUTSVM). In the suggested IRUTSVM approach, we offer alternative Lagrangian functions to tackle the primal problems of RUTSVM by inserting one of the terms in the objective function into the constraints. As a result, we obtain a new dual formulation for each optimization problem, so that we need not compute inverse matrices either in the training process or in finding the classifiers. Moreover, smaller rectangular kernel matrices are used to reduce the computational time. Extensive testing is carried out on a variety of synthetic and real-world imbalanced datasets, and the findings show that the IRUTSVM algorithm outperforms the TSVM, UTSVM, and RUTSVM algorithms in terms of generalization performance.
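The failure mode described in this abstract, where a classifier assigns everything to the majority class, can be illustrated with a minimal sketch. The toy labels, the 95:5 class ratio, and the choice of the geometric mean of per-class recalls (G-mean) as an imbalance-aware score are assumptions made here for illustration; none of them is taken from the paper.

```python
import math

def recall_for_class(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    total = sum(1 for t in y_true if t == label)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return hits / total

# Hypothetical toy labels: 95 majority-class (0) and 5 minority-class (1) samples.
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall_major = recall_for_class(y_true, y_pred, 0)
recall_minor = recall_for_class(y_true, y_pred, 1)
# Geometric mean of the two recalls; it collapses to 0 when one class is never found.
g_mean = math.sqrt(recall_major * recall_minor)

print(accuracy, recall_minor, g_mean)
```

Plain accuracy looks high here even though no minority sample is detected, which is why an imbalance-aware score is usually reported alongside it.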
- MeSH
- Algorithms * MeSH
- Support Vector Machine * MeSH
- Publication type
- Journal Article MeSH
BACKGROUND: Atherosclerosis leads to coronary artery disease (CAD) and myocardial infarction (MI), major causes of morbidity and mortality worldwide. Computer-aided analysis of heart rate variability (HRV) derived from the electrocardiogram (ECG) can be a robust method for the prognosis of atherosclerotic events. METHODS: A total of 70 male subjects aged 55 ± 5 years participated in the study. The lead-II ECG was recorded and sampled at 200 Hz. The tachogram was obtained from the ECG signal and used to extract twenty-five HRV features. A one-way analysis of variance (ANOVA) test was performed to find significant differences between the CAD, MI, and control subjects. The features were used to train and test a two-class artificial neural network (ANN) and a support vector machine (SVM). RESULTS: The results revealed depressed HRV under atherosclerosis. An accuracy of 100% was obtained in classifying CAD and MI subjects from the controls using the ANN, and 99.6% using the SVM. In the classification of CAD from MI subjects, the SVM and ANN achieved accuracies of 99.3% and 99.0%, respectively. CONCLUSIONS: Depressed HRV has been suggested to be a marker for the identification of atherosclerotic events. The good accuracy observed in classifying control, CAD, and MI subjects suggests that this is a non-invasive, cost-effective approach to the prognosis of atherosclerotic events.
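As a rough illustration of how HRV features can be derived from a tachogram, the sketch below computes two standard time-domain measures, SDNN and RMSSD, from R-peak positions. The abstract does not list its twenty-five features, so the choice of these two is an assumption; the function names and the example R-peak indices are hypothetical, and only the 200 Hz sampling rate comes from the text.

```python
import math

def rr_intervals_ms(r_peak_samples, fs_hz=200.0):
    """Convert R-peak sample indices into RR intervals in milliseconds (the tachogram)."""
    return [(b - a) * 1000.0 / fs_hz
            for a, b in zip(r_peak_samples, r_peak_samples[1:])]

def sdnn(rr):
    """Sample standard deviation of the RR intervals."""
    mean = sum(rr) / len(rr)
    return math.sqrt(sum((x - mean) ** 2 for x in rr) / (len(rr) - 1))

def rmssd(rr):
    """Root mean square of successive RR-interval differences."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical R-peak positions (sample indices at 200 Hz), not data from the study.
r_peaks = [0, 160, 340, 500, 700]
rr = rr_intervals_ms(r_peaks)
print(rr, sdnn(rr), rmssd(rr))
```

Lower values of both measures correspond to reduced beat-to-beat variability, which is the sense in which the abstract speaks of depressed HRV.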
Atrial fibrillation (AF) is a serious heart arrhythmia that significantly increases the risk of ischemic stroke. Clinically, an AF episode is recognized in an electrocardiogram. However, detection of asymptomatic AF, which requires long-term monitoring, is more efficient when based on the irregularity of beat-to-beat intervals estimated by heart rate (HR) features. Automated classification of heartbeats into AF and non-AF by means of the Lagrangian Support Vector Machine (LSVM) has been proposed. The classifier input vector consisted of sixteen features, including four coefficients very sensitive to beat-to-beat heart changes, taken from fetal heart rate analysis in perinatal medicine. The effectiveness of the proposed classifier has been verified on the MIT-BIH Atrial Fibrillation Database. Designing the LSVM classifier using a very large number of feature vectors requires extreme computational effort. Therefore, an original approach has been proposed to determine a training set of the smallest possible size that would still guarantee a high quality of AF detection. It makes it possible to obtain satisfactory results using only 1.39% of all heartbeats as the training data. A post-processing stage based on aggregation of classified heartbeats into AF episodes has been applied to provide more reliable information on patient risk. Results obtained during the testing phase showed a sensitivity of 98.94%, a positive predictive value of 98.39%, and a classification accuracy of 98.86%.
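The three figures reported at the end of this abstract follow from the usual confusion-matrix definitions. The sketch below shows those definitions; the function name and the example counts are hypothetical and are not the study's data.

```python
def detection_metrics(tp, fp, tn, fn):
    """Sensitivity, positive predictive value, and accuracy from confusion counts.

    tp/fn: AF beats classified as AF / as non-AF
    tn/fp: non-AF beats classified as non-AF / as AF
    """
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, ppv, accuracy

# Hypothetical counts chosen only to exercise the formulas.
sens, ppv, acc = detection_metrics(tp=90, fp=10, tn=880, fn=20)
print(sens, ppv, acc)
```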
- MeSH
- Algorithms MeSH
- Databases, Factual MeSH
- Diagnosis, Computer-Assisted MeSH
- Electrocardiography methods MeSH
- Atrial Fibrillation diagnosis physiopathology MeSH
- Humans MeSH
- Signal Processing, Computer-Assisted MeSH
- Heart Rate physiology MeSH
- Support Vector Machine MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
Fragmented QRS (fQRS) is an electrocardiographic (ECG) marker of myocardial conduction abnormality, characterized by additional notches in the QRS complex. The presence of fQRS has been associated with an increased risk of all-cause mortality and arrhythmia in patients with cardiovascular disease. However, current binary visual analysis is prone to intra- and inter-observer variability, and differing definitions are problematic in clinical practice. Therefore, objective quantification of fQRS is needed and could further improve risk stratification of these patients. We present an automated method for fQRS detection and quantification. First, a novel robust QRS complex segmentation strategy is proposed, which combines multi-lead information and excludes abnormal heartbeats automatically. Afterwards, extracted features based on variational mode decomposition (VMD), phase-rectified signal averaging (PRSA), and the number of baseline crossings of the ECG were used to train a machine learning classifier (Support Vector Machine) to discriminate fragmented from non-fragmented ECG traces, using multi-center data and combining different fQRS criteria used in clinical settings. The best model was trained on the combination of two independent, previously annotated datasets and, compared to these visual fQRS annotations, achieved Kappa scores of 0.68 and 0.44, respectively. We also show that the algorithm might be used in both regular sinus rhythm and irregular beats during atrial fibrillation. These results demonstrate that the proposed approach could be relevant for clinical practice by objectively assessing and quantifying fQRS. The study sets the path for further clinical application of the developed automated fQRS algorithm.
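The Kappa scores in this abstract measure agreement between the algorithm and the visual annotations beyond what chance would produce. Assuming the standard two-rater Cohen's kappa is meant (the abstract does not name the variant), it can be sketched as follows; the function name and the toy label lists are hypothetical.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    # Observed agreement: fraction of items with identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the raters labelled independently at their own base rates.
    p_expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical labels: 1 = fragmented, 0 = non-fragmented.
algorithm = [1, 1, 0, 0]
annotator = [1, 0, 0, 0]
kappa = cohens_kappa(algorithm, annotator)
print(kappa)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance.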
- MeSH
- Algorithms MeSH
- Electrocardiography * methods MeSH
- Atrial Fibrillation * diagnosis MeSH
- Humans MeSH
- Machine Learning MeSH
- Support Vector Machine MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Early detection of malignant thyroid nodules is crucial for effective treatment, but traditional diagnostic methods face challenges such as variability in expert opinions and limited integration of advanced imaging techniques. This prospective cohort study investigates a novel multimodal approach that integrates traditional methods with advanced machine learning techniques. We studied 181 patients who underwent fine-needle aspiration (FNA) biopsy, each contributing one nodule, for a total of 181 nodules in our analysis. Data collection included sex, age, and ultrasound imaging, which incorporated elastography. Features extracted from these images included Thyroid Imaging Reporting and Data System (TIRADS) scores, elastography parameters, and radiomic features. The pathological results of the FNA biopsy, provided by the pathologists, served as our gold standard for nodule classification. Our methodology, termed ELTIRADS, combines these features with interpretable machine learning techniques. Performance evaluation showed that a Support Vector Machine (SVM) classifier using TIRADS, elastography data, and radiomic features achieved an accuracy of 0.92, a sensitivity of 0.89, a specificity of 0.94, a precision of 0.89, and an F1 score of 0.89. To enhance interpretability, we used hierarchical clustering, Shapley additive explanations (SHAP), and partial dependence plots (PDP). This combined approach holds promise for enhancing the accuracy of thyroid nodule malignancy detection, thereby contributing to advancements in personalized and precision medicine in the field of thyroid cancer research.
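The reported F1 score is consistent with the reported precision and sensitivity under the standard definition of F1 as their harmonic mean. This is a check on the rounded values given in the abstract, not a reconstruction of the underlying counts:

```latex
F_1 = \frac{2 \cdot \text{precision} \cdot \text{sensitivity}}{\text{precision} + \text{sensitivity}}
    = \frac{2 \cdot 0.89 \cdot 0.89}{0.89 + 0.89}
    = 0.89
```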
- MeSH
- Adult MeSH
- Elasticity Imaging Techniques * methods MeSH
- Middle Aged MeSH
- Humans MeSH
- Thyroid Neoplasms diagnostic imaging classification pathology diagnosis MeSH
- Prospective Studies MeSH
- Radiomics MeSH
- Aged MeSH
- Thyroid Gland diagnostic imaging pathology MeSH
- Machine Learning * MeSH
- Support Vector Machine MeSH
- Biopsy, Fine-Needle MeSH
- Thyroid Nodule * diagnostic imaging pathology classification MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
One of the biggest problems in the automated diagnosis of psychiatric disorders from medical images is the lack of sufficiently large samples for training. Sample size is especially important in the case of highly heterogeneous disorders such as schizophrenia, where machine learning models built on relatively small numbers of subjects may suffer from poor generalizability. Through multicenter studies and consortium initiatives, researchers have tried to solve this problem by combining datasets from multiple sites. The necessary sharing of (raw) data is, however, often hindered by legal and ethical issues. Moreover, in the case of very large samples, the computational complexity might become too high. Distributed learning could be a solution to this problem. In this paper, we investigated the possibility of creating a meta-model by combining support vector machine (SVM) classifiers trained on the local datasets, without the need to share medical images or any other personal data. Validation was done in a 4-center setup comprising 480 first-episode schizophrenia patients and healthy controls in total. We built SVM models to separate patients from controls based on three different kinds of imaging features derived from structural MRI scans, and compared models built on the joint multicenter data to the meta-models. The results showed that the combined meta-model had high similarity to the model built on all data pooled together and comparable classification performance on all three imaging features. Both similarity and performance were superior to those of the local models. We conclude that combining models is thus a viable alternative that facilitates data sharing and the creation of bigger and more informative models.
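One simple way to combine locally trained models without exchanging raw data is to average the parameters of linear classifiers. The sketch below does this with a sample-size-weighted average. The abstract does not state how its meta-model is built, so the averaging rule, the function names, and the toy weights are all assumptions made for illustration and should not be read as the authors' method.

```python
def combine_linear_models(models, sizes):
    """Sample-size-weighted average of linear models given as (weights, bias) pairs.

    Only model parameters and sample counts are exchanged, never the training data.
    """
    total = sum(sizes)
    dim = len(models[0][0])
    weights = [
        sum(model[0][j] * n for model, n in zip(models, sizes)) / total
        for j in range(dim)
    ]
    bias = sum(model[1] * n for model, n in zip(models, sizes)) / total
    return weights, bias

def predict(model, x):
    """Sign of the linear decision function: +1 (patient) or -1 (control)."""
    weights, bias = model
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score >= 0 else -1

# Hypothetical local models from two sites with 100 and 300 subjects.
site_models = [([1.0, 0.0], 0.0), ([0.0, 1.0], -1.0)]
site_sizes = [100, 300]
meta_model = combine_linear_models(site_models, site_sizes)
print(meta_model, predict(meta_model, [1.0, 1.0]))
```

Parameter averaging only makes sense when every site uses the same features in the same order and the same preprocessing, which is a further assumption of this sketch.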
- MeSH
- Datasets as Topic * MeSH
- Humans MeSH
- Magnetic Resonance Imaging MeSH
- Multicenter Studies as Topic * MeSH
- Neuroimaging methods MeSH
- Pattern Recognition, Automated methods MeSH
- Schizophrenia diagnostic imaging MeSH
- Support Vector Machine * MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
BACKGROUND: Early diagnosis of schizophrenia could improve the outcome of the illness. Unlike classical between-group comparisons, machine learning can identify subtle disease patterns at the single-subject level, which could help realize the potential of MRI in establishing a psychiatric diagnosis. Machine learning has previously been tested predominantly on gray-matter structural or functional MRI data. In this paper, we used a machine learning classifier to differentiate patients with a first episode of schizophrenia-spectrum disorder (FES) from healthy controls using diffusion tensor imaging. METHODS: We applied a linear support vector machine (SVM) and traditional tract-based spatial statistics between-group analyses to brain fractional anisotropy (FA) data from 77 FES patients and 77 age- and sex-matched healthy controls. We also evaluated the effects of medication and symptoms on the SVM classification. RESULTS: The SVM distinguished between patients and controls with a significant accuracy of 62.34% (p = 0.005). Participants with FES showed widespread FA reductions relative to controls in a large cluster (N = 56,647 voxels, corrected p = 0.002). The white matter regions that contributed to the correct identification of participants with FES overlapped with the regions that showed lower FA in patients relative to controls. There was no association between classification performance and medication or symptoms. CONCLUSIONS: Our results provide a proof of concept that SVM might help differentiate FES patients from healthy controls early in the course of illness using white-matter fractional anisotropy. As there was no effect of medication or symptoms, the SVM classification seemed to be based on trait rather than state markers and appeared to capture the lower FA in FES participants relative to controls.
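To put the reported accuracy in concrete terms, assume it was computed over all 154 participants (77 FES and 77 controls); the abstract does not state the evaluation scheme, so this is an assumption. Under that assumption the figure corresponds to about 96 correctly classified participants:

```latex
0.6234 \times 154 \approx 96, \qquad \frac{96}{154} \approx 0.6234
```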
- MeSH
- Anisotropy MeSH
- White Matter pathology MeSH
- Early Diagnosis * MeSH
- Adult MeSH
- Humans MeSH
- Young Adult MeSH
- Brain pathology MeSH
- Schizophrenia diagnostic imaging pathology MeSH
- Case-Control Studies MeSH
- Support Vector Machine * MeSH
- Diffusion Tensor Imaging MeSH
- Check Tag
- Adult MeSH
- Humans MeSH
- Young Adult MeSH
- Male MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
Detection of grapes in real-life images is an important task addressed by researchers working in precision viticulture. In the case of white wine varieties, grape detectors based on SVM classifiers in combination with a HOG descriptor have proven to be very efficient. Simplified versions of the detectors seem to be the best solution for practical applications, as they offer the best known ratio of performance to time complexity. As our research showed, converting RGB images to grayscale format at the image preprocessing level is an ideal means of further improving the performance of the detectors. In order to enhance this ratio, we explored the relevance of the conversion in the context of a detector's potential sensitivity to the rotation of berries. For this purpose, we proposed a modification of the conversion and designed an appropriate method for tuning such modified detectors. To evaluate the effect of the new parameter space on their performance, we developed a specialized visualization method. In order to provide accurate results, we formed new datasets for both tuning and evaluation of the detectors. Our effort resulted in a robust grape detector that is less sensitive to image distortion.
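The grayscale preprocessing step can be sketched as a weighted sum of the colour channels. The abstract does not specify the authors' modified conversion, so the code below only shows a conventional luminance conversion with the ITU-R BT.601 weights as the default and exposes the weights as a tunable parameter; the function names and the sample pixels are hypothetical.

```python
def to_gray(pixel, weights=(0.299, 0.587, 0.114)):
    """Weighted sum of the R, G, B channels of one pixel.

    The default weights are the ITU-R BT.601 luma coefficients; other weight
    triples can be passed in to explore alternative conversions.
    """
    r, g, b = pixel
    wr, wg, wb = weights
    return wr * r + wg * g + wb * b

def image_to_gray(image, weights=(0.299, 0.587, 0.114)):
    """Convert an image given as rows of (R, G, B) tuples to rows of gray values."""
    return [[to_gray(px, weights) for px in row] for row in image]

# Hypothetical 1x2 image: one white pixel and one pure-green pixel.
image = [[(255, 255, 255), (0, 255, 0)]]
print(image_to_gray(image))
print(image_to_gray(image, weights=(0.0, 1.0, 0.0)))  # green channel only
```

Treating the weight triple as a parameter is one way a conversion could be tuned together with the detector, though whether the paper's modification takes this form is not stated in the abstract.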
Breast cancer survival prediction can strongly influence the selection of the best treatment protocols. Many approaches, such as statistical or machine learning models, have been employed to predict the survival prospects of patients, but newer algorithms such as deep learning can be tested with the aim of improving the models and their prediction accuracy. In this study, we used machine learning and deep learning approaches to predict breast cancer survival in 4,902 patient records from the University of Malaya Medical Centre Breast Cancer Registry. The results indicated that the multilayer perceptron (MLP), random forest (RF), and decision tree (DT) classifiers could predict survivorship with 88.2%, 83.3%, and 82.5% accuracy, respectively, in the tested samples. The support vector machine (SVM) performed worse, with 80.5% accuracy. In this study, tumour size turned out to be the most important feature for breast cancer survivability prediction. Both deep learning and machine learning methods produce desirable prediction accuracy, but other factors such as parameter configurations and data transformations affect the accuracy of the predictive model.
- MeSH
- Survival Analysis MeSH
- Deep Learning * MeSH
- Demography MeSH
- Adult MeSH
- Calibration MeSH
- Middle Aged MeSH
- Humans MeSH
- Young Adult MeSH
- Breast Neoplasms mortality MeSH
- Neural Networks, Computer MeSH
- Decision Trees MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Support Vector Machine MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Young Adult MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Female MeSH
- Publication type
- Journal Article MeSH
Parkinson's disease (PD) is a neurodegenerative disorder which impairs motor skills, speech, and other functions such as behavior, mood, and cognitive processes. One of the most typical clinical hallmarks of PD is handwriting deterioration, usually the first manifestation of PD. The aim of this study is twofold: (a) to find a subset of handwriting features suitable for identifying subjects with PD and (b) to build a predictive model to efficiently diagnose PD. We collected handwriting samples from 37 medicated PD patients and 38 age- and sex-matched controls. The handwriting samples were collected during seven tasks such as writing a syllable, word, or sentence. Every sample was used to extract the handwriting measures. In addition to conventional kinematic and spatio-temporal handwriting measures, we also computed novel handwriting measures based on entropy, signal energy, and empirical mode decomposition of the handwriting signals. The selected features were fed to the support vector machine classifier with radial Gaussian kernel for automated diagnosis. The accuracy of the classification of PD was as high as 88.13%, with the highest values of sensitivity and specificity equal to 89.47% and 91.89%, respectively. Handwriting may be a valuable marker as a diagnostic and screening tool.
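Two of the signal-based measures named in this abstract, signal energy and entropy, can be sketched for a one-dimensional signal as follows. The abstract does not give the exact definitions used, so the sum-of-squares energy, the histogram-based Shannon entropy, the bin count, the function names, and the sample values are all assumptions for illustration.

```python
import math

def signal_energy(signal):
    """Sum of squared sample values."""
    return sum(v * v for v in signal)

def shannon_entropy(signal, bins=8):
    """Shannon entropy (in bits) of the signal's amplitude histogram."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no amplitude uncertainty
    counts = [0] * bins
    for v in signal:
        # Map the value to a bin; the maximum value falls into the last bin.
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

# Hypothetical pen-velocity samples, not data from the study.
velocity = [0.0, 1.0, 2.0, 3.0]
print(signal_energy(velocity), shannon_entropy(velocity, bins=4))
```

Features like these would be computed per handwriting task and then passed, after feature selection, to the classifier.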
- MeSH
- Algorithms MeSH
- Biomarkers MeSH
- Biomechanical Phenomena MeSH
- Energy Metabolism MeSH
- Entropy MeSH
- Middle Aged MeSH
- Humans MeSH
- Neuropsychological Tests MeSH
- Normal Distribution MeSH
- Parkinson Disease * diagnosis psychology therapy MeSH
- Aged MeSH
- Support Vector Machine MeSH
- Decision Support Systems, Clinical * MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH