Verifying the speaker of a speech fragment can be crucial in attributing a crime to a suspect. The question can be addressed given disputed and reference speech material, adopting the recommended and scientifically accepted likelihood ratio framework for reporting evidential strength in court. In forensic practice, such a verification task is usually carried out through auditory and acoustic analyses that consider a diversity of features, such as language competence, pronunciation, or other linguistic features. Automated speaker comparison systems can also be used alongside those manual analyses. State-of-the-art automatic speaker comparison systems are based on deep neural networks that take acoustic features as input. Additional information, though, may be obtained from linguistic analysis. In this paper, we aim to answer whether, when, and how modern acoustic-based systems can be complemented by an authorship technique based on frequent words, within the likelihood ratio framework. We consider three different approaches to derive a combined likelihood ratio: using a support vector machine algorithm, fitting bivariate normal distributions, and passing the score of the acoustic system as additional input to the frequent-word analysis. We apply our method to the forensically relevant dataset FRIDA and the FISHER corpus, and we explore under which conditions fusion is valuable. We evaluate our results in terms of log likelihood ratio cost (Cllr) and equal error rate (EER). We show that fusion can be beneficial, especially in the case of intercepted phone calls with noise in the background.
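The log likelihood ratio cost (Cllr) named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of Cllr over same-speaker and different-speaker likelihood ratios. The function name `cllr` and all input values are hypothetical and are not taken from the paper.

```python
import math

def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log likelihood ratio cost for two lists of likelihood ratios (not logs)."""
    # Same-speaker trials are penalised when the LR is small.
    ss_cost = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs) / len(same_speaker_lrs)
    # Different-speaker trials are penalised when the LR is large.
    ds_cost = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs) / len(different_speaker_lrs)
    return 0.5 * (ss_cost + ds_cost)

# Hypothetical likelihood ratios, for illustration only.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])   # a system that always reports LR = 1
informative = cllr([20.0, 50.0], [0.05, 0.02])  # a system whose LRs point the right way
```

Under this definition, a system that always reports a likelihood ratio of 1 scores exactly 1, and lower values indicate better-calibrated, more discriminating output.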
- MeSH
- Speech Acoustics MeSH
- Algorithms MeSH
- Humans MeSH
- Linguistics MeSH
- Likelihood Functions MeSH
- Speech MeSH
- Forensic Sciences * methods MeSH
- Support Vector Machine MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
Imbalanced datasets are prominent in real-world problems. In such problems, the number of data samples in one class is significantly higher than in the other classes, even though the other classes might be more important. Standard classification algorithms may classify all the data into the majority class, which is a significant drawback of most standard learning algorithms, so imbalanced datasets need to be handled carefully. One of the traditional algorithms, the twin support vector machine (TSVM), performs well on balanced data but poorly on imbalanced datasets. To improve the TSVM algorithm's classification ability for imbalanced datasets, a reduced universum twin support vector machine for class imbalance learning (RUTSVM) was recently proposed, driven by the universum twin support vector machine (UTSVM). One of RUTSVM's key drawbacks is that solving the dual problem and finding the classifiers involve matrix inverse computation. In this paper, we improve the RUTSVM and propose an improved reduced universum twin support vector machine for class imbalance learning (IRUTSVM). In the suggested IRUTSVM approach, we offer alternative Lagrangian functions to tackle the primal problems of RUTSVM by inserting one of the terms in the objective function into the constraints. As a result, we obtain a new dual formulation for each optimization problem, so that we need not compute inverse matrices either in the training process or in finding the classifiers. Moreover, rectangular kernel matrices of smaller size are used to reduce the computational time. Extensive testing is carried out on a variety of synthetic and real-world imbalanced datasets, and the findings show that the IRUTSVM algorithm outperforms the TSVM, UTSVM, and RUTSVM algorithms in terms of generalization performance.
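The idea of a reduced, rectangular kernel matrix mentioned in the abstract can be sketched as follows. This is only an illustration of the general reduced-kernel technique: the RBF kernel, the `gamma` value, the data, and the subset size are assumptions, and the sketch does not reproduce the IRUTSVM formulation itself.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # hypothetical training data
subset = rng.choice(200, size=20, replace=False)   # hypothetical random reduced set

K_full = rbf_kernel(X, X)           # square kernel, 200 x 200
K_rect = rbf_kernel(X, X[subset])   # rectangular kernel, 200 x 20
```

The rectangular matrix holds one tenth of the entries of the square one here, which is where the saving in computation comes from.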
- MeSH
- Algorithms * MeSH
- Support Vector Machine * MeSH
- Publication Type
- Journal Article MeSH
BACKGROUND: Atherosclerosis leads to coronary artery disease (CAD) and myocardial infarction (MI), a major cause of morbidity and mortality worldwide. Computer-aided analysis of electrocardiogram (ECG)-derived heart rate variability (HRV) can be a robust method in the prognosis of atherosclerotic events. METHODS: A total of 70 male subjects aged 55 ± 5 years participated in the study. The lead-II ECG was recorded and sampled at 200 Hz. The tachogram was obtained from the ECG signal and used to extract twenty-five HRV features. A one-way analysis of variance (ANOVA) test was performed to find the significant differences between the CAD, MI, and control subjects. Features were used in the training and testing of a two-class artificial neural network (ANN) and support vector machine (SVM). RESULTS: The obtained results revealed depressed HRV under atherosclerosis. Accuracy of 100% was obtained in classifying CAD and MI subjects from the controls using ANN. Accuracy was 99.6% with SVM, and in the classification of CAD from MI subjects using SVM and ANN, 99.3% and 99.0% accuracy was obtained, respectively. CONCLUSIONS: Depressed HRV has been suggested to be a marker in the identification of atherosclerotic events. The good accuracy observed in classification between control, CAD, and MI subjects revealed it to be a non-invasive, cost-effective approach in the prognosis of atherosclerotic events.
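The step from tachogram to HRV features can be illustrated with a small sketch. The abstract does not list its twenty-five features, so the two standard time-domain measures used here (SDNN and RMSSD) are assumptions, as are the function names and the R-peak positions. Only the 200 Hz sampling rate comes from the abstract.

```python
import math
import statistics

def rr_intervals_ms(r_peak_samples, fs_hz=200):
    """Convert R-peak sample indices into RR intervals in milliseconds (the tachogram)."""
    return [(b - a) * 1000.0 / fs_hz for a, b in zip(r_peak_samples, r_peak_samples[1:])]

def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals."""
    return statistics.stdev(rr_ms)

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical R-peak positions (sample indices at 200 Hz).
peaks = [0, 200, 400, 620, 800]
rr = rr_intervals_ms(peaks)
features = {"sdnn_ms": sdnn(rr), "rmssd_ms": rmssd(rr)}
```

Feature values of this kind, computed per subject, would form the input vectors for a classifier such as the ANN or SVM named in the abstract.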
Random Forest is an ensemble of decision trees based on the bagging and random subspace concepts. As suggested by Breiman, the strength of unstable learners and the diversity among them are the ensemble models' core strength. In this paper, we propose two approaches known as oblique and rotation double random forests. In the first approach, we propose the rotation-based double random forest, in which a transformation or rotation of the feature space is generated at each node. At each node, a different random feature subspace is chosen for evaluation; hence, the transformation at each node is different. Different transformations result in better diversity among the base learners and hence better generalization performance. With the double random forest as base learner, the data at each node is transformed via two different transformations, namely principal component analysis and linear discriminant analysis. In the second approach, we propose the oblique double random forest. Decision trees in random forest and double random forest are univariate, which results in axis-parallel splits that fail to capture the geometric structure of the data. Also, the standard random forest may not grow sufficiently large decision trees, resulting in suboptimal performance. To capture the geometric properties and to grow decision trees of sufficient depth, we propose the oblique double random forest. The oblique double random forest models are multivariate decision trees. At each non-leaf node, a multisurface proximal support vector machine generates the optimal plane for better generalization performance. Also, different regularization techniques (Tikhonov regularization, axis-parallel split regularization, null space regularization) are employed to tackle the small sample size problems in the decision trees of the oblique double random forest. The proposed ensembles of decision trees produce larger trees than the standard ensembles of decision trees, because bagging is used at each non-leaf node, which results in improved performance. The baseline models and the proposed oblique and rotation double random forest models are evaluated on 121 benchmark UCI datasets and real-world fisheries datasets. Both statistical analysis and the experimental results demonstrate the efficacy of the proposed oblique and rotation double random forest models compared to the baseline models on the benchmark datasets.
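The node-level rotation described in the abstract can be sketched for the principal component analysis case. This is a simplified illustration, not the authors' implementation: the function name `pca_rotate_subspace`, the data, and the subspace size are hypothetical, and the split search that would follow the rotation is omitted.

```python
import numpy as np

def pca_rotate_subspace(X_node, feature_idx):
    """Rotate the samples at a node within a chosen feature subspace using PCA."""
    sub = X_node[:, feature_idx]
    centered = sub - sub.mean(axis=0)
    # Rows of vt are the principal directions of the centred subspace data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt.T, vt

rng = np.random.default_rng(0)
X_node = rng.normal(size=(50, 6)) @ rng.normal(size=(6, 6))  # hypothetical correlated data
feature_idx = rng.choice(6, size=3, replace=False)           # random feature subspace

rotated, components = pca_rotate_subspace(X_node, feature_idx)
cov = np.cov(rotated, rowvar=False)  # diagonal up to rounding after the rotation
```

Because a fresh subspace is drawn at every node, each node sees a different rotation, which is the source of diversity the abstract refers to.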
- MeSH
- Algorithms * MeSH
- Principal Component Analysis MeSH
- Rotation MeSH
- Support Vector Machine * MeSH
- Publication Type
- Journal Article MeSH
Fragmented QRS (fQRS) is an electrocardiographic (ECG) marker of myocardial conduction abnormality, characterized by additional notches in the QRS complex. The presence of fQRS has been associated with an increased risk of all-cause mortality and arrhythmia in patients with cardiovascular disease. However, current binary visual analysis is prone to intra- and inter-observer variability, and different definitions are problematic in clinical practice. Therefore, objective quantification of fQRS is needed and could further improve risk stratification of these patients. We present an automated method for fQRS detection and quantification. First, a novel robust QRS complex segmentation strategy is proposed, which combines multi-lead information and excludes abnormal heartbeats automatically. Afterwards, extracted features based on variational mode decomposition (VMD), phase-rectified signal averaging (PRSA), and the number of baseline crossings of the ECG were used to train a machine learning classifier (Support Vector Machine) to discriminate fragmented from non-fragmented ECG traces, using multi-center data and combining different fQRS criteria used in clinical settings. The best model was trained on the combination of two independent, previously annotated datasets and, compared to these visual fQRS annotations, achieved Kappa scores of 0.68 and 0.44, respectively. We also show that the algorithm might be used in both regular sinus rhythm and irregular beats during atrial fibrillation. These results demonstrate that the proposed approach could be relevant for clinical practice by objectively assessing and quantifying fQRS. The study sets the path for further clinical application of the developed automated fQRS algorithm.
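One of the features named in the abstract, the number of baseline crossings, can be illustrated with a short sketch. It assumes the segment has already been baseline-corrected so that the baseline sits at a known level. The function name and the sample values are hypothetical, and the paper's exact definition may differ.

```python
def count_baseline_crossings(segment, baseline=0.0):
    """Count how often a signal segment changes side relative to the baseline."""
    # Samples lying exactly on the baseline are skipped so they cannot create extra crossings.
    signs = [1 if x > baseline else -1 for x in segment if x != baseline]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Hypothetical QRS segments (arbitrary units), for illustration only.
smooth_qrs = [0.1, 0.8, 1.5, 0.6, -0.4, -0.2]
notched_qrs = [0.1, 0.8, -0.1, 0.9, -0.3, 0.2, -0.2]

smooth_count = count_baseline_crossings(smooth_qrs)
notched_count = count_baseline_crossings(notched_qrs)
```

A segment with extra notches tends to cross the baseline more often, which is why a count of this kind can serve as one input to a fragmentation classifier.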
- MeSH
- Algorithms MeSH
- Electrocardiography * methods MeSH
- Atrial Fibrillation * diagnosis MeSH
- Humans MeSH
- Machine Learning MeSH
- Support Vector Machine MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
The skull, along with the pelvic bone, serves as an important source of clues as to the sex of human skeletal remains. The frontal bone is one of the most significant sexually dimorphic structures employed in anthropological research, especially when studied by methods of virtual anthropology. For this reason, many new methods have been developed, but their utility for other populations remains to be verified. In the present study, we tested one such approach, the landmark-free method of Bulut et al. (2016) for quantifying sexually dimorphic differences in the shape of the frontal bone, developed using a sample of the Turkish population. Our study builds upon this methodology and tests its utility for the Czech population. We evaluated the shape of the male and female frontal bone using 3D morphometrics, comparing virtual models of frontal bones and corresponding software-generated spheres. To do so, we calculated the relative size of the frontal bone area deviating from the fitted sphere by less than 1 mm and used these data to estimate the sex of individuals. Using our sample of the Czech population, the method estimated the sex correctly in 72.8% of individuals. This success rate is about 5% lower than that achieved with the Turkish sample. This method is therefore not very suitable for estimating the sex of Czech individuals, especially considering the significantly greater success rates of other approaches.
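The sphere comparison described in the abstract can be approximated with a rough sketch. It assumes an algebraic least-squares sphere fit and uses the fraction of surface points within 1 mm as a stand-in for the relative surface area. Both are simplifications rather than the published procedure, and all names and data are hypothetical.

```python
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit; returns (center, radius)."""
    # |p|^2 = 2 p.c + (r^2 - |c|^2) is linear in the unknowns c and (r^2 - |c|^2).
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    return center, radius

def fraction_within(points, center, radius, tol_mm=1.0):
    """Fraction of points whose radial deviation from the sphere is below tol_mm."""
    deviation = np.abs(np.linalg.norm(points - center, axis=1) - radius)
    return float(np.mean(deviation < tol_mm))

rng = np.random.default_rng(0)
true_center = np.array([1.0, 2.0, 3.0])
directions = rng.normal(size=(100, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

on_sphere = true_center + 50.0 * directions[:80]   # synthetic points on a 50 mm sphere
bulging = true_center + 53.0 * directions[80:]     # synthetic points 3 mm off the sphere

center, radius = fit_sphere(on_sphere)
share = fraction_within(np.vstack([on_sphere, bulging]), center, radius)
```

In this synthetic case 80 of the 100 points lie within the tolerance. A sex-estimation rule would then place a threshold on such a share, and the abstract does not give that threshold.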
- MeSH
- Frontal Bone anatomy & histology diagnostic imaging MeSH
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Young Adult MeSH
- Tomography, X-Ray Computed MeSH
- Computer Simulation * MeSH
- Image Processing, Computer-Assisted MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Forensic Anthropology MeSH
- Support Vector Machine MeSH
- Sex Determination by Skeleton methods MeSH
- Imaging, Three-Dimensional * MeSH
- Check Tag
- Adult MeSH
- Middle Aged MeSH
- Humans MeSH
- Young Adult MeSH
- Male MeSH
- Aged, 80 and over MeSH
- Aged MeSH
- Publication Type
- Journal Article MeSH
- Geographic Names
- Czech Republic MeSH
Machine learning classifications of first-episode psychosis (FEP) using neuroimaging have predominantly analyzed brain volumes. Some studies examined cortical thickness, but most of them have used parcellation approaches with data from single sites, which limits claims of generalizability. To address these limitations, we conducted a large-scale, multi-site analysis of cortical thickness comparing parcellations and vertex-wise approaches. By leveraging the multi-site nature of the study, we further investigated how different demographic and site-dependent variables affected predictions. Finally, we assessed relationships between predictions and clinical variables. A total of 428 subjects with FEP (147 females, mean age 27.14) and 448 healthy controls (230 females, mean age 27.06) were enrolled in 8 centers by the ClassiFEP group. All subjects underwent a structural MRI and were clinically assessed. Cortical thickness parcellation (68 areas) and full cortical maps (20,484 vertices) were extracted. A linear Support Vector Machine was used for classification within a repeated nested cross-validation framework. Vertex-wise thickness maps outperformed parcellation-based methods with a balanced accuracy of 66.2% and an Area Under the Curve of 72%. By stratifying our sample for MRI scanner, we increased generalizability across sites. Temporal brain areas emerged as the most influential in the classification. The predictive decision scores significantly correlated with age at onset, duration of treatment, and positive symptoms. In conclusion, although far from the threshold of clinical relevance, temporal cortical thickness proved able to discriminate between FEP subjects and healthy individuals. The assessment of site-dependent variables permitted an increase in the across-site generalizability, thus attempting to address an important machine learning limitation.
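Balanced accuracy, the headline metric in the abstract, is the mean of the per-class recalls. The sketch below shows the calculation on made-up labels. The function name and the data are hypothetical and unrelated to the study's results.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of its size."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(1 for i in idx if y_pred[i] == cls) / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical labels: 1 = patient, 0 = control.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)                 # (3/4 + 1/2) / 2
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Plain accuracy weights the larger class more heavily, whereas balanced accuracy does not, which matters when group sizes differ across sites.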
- MeSH
- Adult MeSH
- Humans MeSH
- Magnetic Resonance Imaging methods MeSH
- Brain MeSH
- Neuroimaging MeSH
- Psychotic Disorders * diagnostic imaging MeSH
- Support Vector Machine MeSH
- Check Tag
- Adult MeSH
- Humans MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Multicenter Study MeSH
- Research Support, Non-U.S. Gov't MeSH
This work explores the design and implementation of an algorithm for the classification of magnetic resonance imaging data for computer-aided diagnosis of schizophrenia. Features for classification were first extracted using two morphometric methods: voxel-based morphometry (VBM) and deformation-based morphometry (DBM). These features were then transformed into a wavelet domain using the discrete wavelet transform with various numbers of decomposition levels. The number of features was then reduced by thresholding and subsequent selection by: Fisher's Discrimination Ratio (FDR), Bhattacharyya Distance, and Variances (Var.). A Support Vector Machine with a linear kernel was used for classification. The evaluation strategy was based on leave-one-out cross-validation.
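The feature-selection step can be illustrated for one of the criteria named in the abstract. The sketch assumes a common two-class form of Fisher's Discrimination Ratio, the squared difference of class means divided by the sum of class variances. The abstract does not state its exact formula, and the function name and data are hypothetical.

```python
import numpy as np

def fisher_discrimination_ratio(X, y):
    """Per-feature FDR for two classes labelled 0 and 1 (population variances assumed)."""
    X0, X1 = X[y == 0], X[y == 1]
    return (X0.mean(axis=0) - X1.mean(axis=0)) ** 2 / (X0.var(axis=0) + X1.var(axis=0))

# Hypothetical data: feature 0 separates the classes, feature 1 does not.
X = np.array([[0.0, 5.0],
              [1.0, 6.0],
              [10.0, 5.0],
              [11.0, 6.0]])
y = np.array([0, 0, 1, 1])

fdr = fisher_discrimination_ratio(X, y)
ranking = np.argsort(fdr)[::-1]  # feature indices, most discriminative first
```

In a leave-one-out setting such as the one described, the ranking would need to be recomputed on each training fold so that the held-out subject does not influence the selection.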
- MeSH
- Humans MeSH
- Magnetic Resonance Imaging MeSH
- Schizophrenia * MeSH
- Support Vector Machine MeSH
- Wavelet Analysis MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
OBJECTIVE: The clinical diagnosis of corticobasal syndrome (CBS) represents a challenge for physicians and reliable diagnostic imaging biomarkers would support the diagnostic work-up. We aimed to investigate the neural signatures of CBS using multimodal T1-weighted and resting-state functional magnetic resonance imaging (MRI). METHODS: Nineteen patients with CBS (age 67.0 ± 6.0 years; mean ± SD) and 19 matched controls (66.5 ± 6.0) were enrolled from the German Frontotemporal Lobar Degeneration Consortium. Changes in functional connectivity and structure were respectively assessed with eigenvector centrality mapping complemented by seed-based analysis and with voxel-based morphometry. In addition to mass-univariate statistics, multivariate support vector machine (SVM) classification tested the potential of multimodal MRI to differentiate patients and controls. External validity of SVM was assessed on independent CBS data from the 4RTNI database. RESULTS: A decrease in brain interconnectedness was observed in the right central operculum, middle temporal gyrus and posterior insula, while widespread connectivity increases were found in the anterior cingulum, medial superior-frontal gyrus and in the bilateral caudate nuclei. Severe and diffuse gray matter volume reduction, especially in the bilateral insula, putamen and thalamus, characterized CBS. SVM classification revealed that both connectivity (area under the curve 0.81) and structural abnormalities (0.80) distinguished CBS from controls, while their combination led to statistically non-significant improvement in discrimination power, questioning the additional value of functional connectivity over atrophy. SVM analyses based on structural MRI generalized moderately well to new data, which was decisively improved when guided by meta-analytically derived disease-specific regions-of-interest. CONCLUSIONS: Our data-driven results show impairment of functional connectivity and brain structure in CBS and explore their potential as imaging biomarkers.
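The area under the curve reported for the SVM classifiers can be read as the probability that a randomly chosen patient receives a higher decision score than a randomly chosen control. The sketch below computes it directly from that pairwise definition. The scores are made up and the function name is hypothetical.

```python
def auc_from_scores(patient_scores, control_scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))

# Hypothetical classifier decision scores, for illustration only.
patients = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]

auc = auc_from_scores(patients, controls)  # 8 of the 9 pairs are ordered correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.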
- MeSH
- Connectome methods MeSH
- Middle Aged MeSH
- Humans MeSH
- Magnetic Resonance Imaging * MeSH
- Cerebral Cortex diagnostic imaging pathology physiopathology MeSH
- Multimodal Imaging MeSH
- Basal Ganglia Diseases diagnostic imaging pathology physiopathology MeSH
- Nerve Net diagnostic imaging pathology physiopathology MeSH
- Neuroimaging methods MeSH
- Gray Matter diagnostic imaging pathology physiopathology MeSH
- Aged MeSH
- Support Vector Machine * MeSH
- Check Tag
- Middle Aged MeSH
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, N.I.H., Extramural MeSH
Speech is controlled by axial neuromotor systems; therefore, it is highly sensitive to the effects of neurodegenerative illnesses such as Parkinson's Disease (PD). Patients suffering from PD present important alterations in speech, which are manifested in phonation, articulation, prosody, and fluency. These alterations may be evaluated using statistical methods on features obtained from glottal, spectral, cepstral, or fractal descriptions of speech. This work introduces an evaluation paradigm based on Information Theory (IT) to differentiate the effects of PD and aging on glottal amplitude distributions. The study is conducted on a database including 48 PD patients (24 males, 24 females), 48 age-matched healthy controls (HC, 24 males, 24 females), and 48 mid-age normative subjects (NS, 24 males, 24 females). It may be concluded from the study that Hierarchical Clustering (HiCl) methods produce a clear separation between the phonation of PD patients and that of NS subjects (accuracy of 89.6% for both male and female subsets), but the separation between PD patients and HC subjects is less efficient (accuracy of 75.0% for the male subset and 70.8% for the female subset). Conversely, using feature selection and Support Vector Machine (SVM) classification, the differentiation between PD and HC is substantially improved (accuracy of 94.8% for the male subset and 92.8% for the female subset). This improvement was mainly boosted by feature selection, at a cost of information and generalization losses. The results point to the possibility that speech deterioration may affect HC phonation with aging, reducing its difference from PD phonation.
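An information-theoretic comparison of two amplitude distributions can be sketched with the Jensen-Shannon divergence. The abstract does not name the specific measure used in the study, so this choice is an assumption, as are the function names and the histograms.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; bins where p is zero contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical normalised amplitude histograms (each sums to 1).
speaker_a = [0.10, 0.40, 0.40, 0.10]
speaker_b = [0.25, 0.25, 0.25, 0.25]

distance = js_divergence(speaker_a, speaker_b)
```

A matrix of such pairwise distances between speakers is the kind of input a hierarchical clustering method could operate on.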
- MeSH
- Speech Acoustics MeSH
- Diagnosis, Differential MeSH
- Phonation physiology MeSH
- Humans MeSH
- Parkinson Disease complications physiopathology MeSH
- Speech Disorders etiology physiopathology MeSH
- Aged MeSH
- Aging physiology MeSH
- Support Vector Machine * MeSH
- Check Tag
- Humans MeSH
- Male MeSH
- Aged MeSH
- Female MeSH
- Publication Type
- Journal Article MeSH