
Query: Support vector regression

Exact match

44 records in PubMed

Article

Statistical sex determination from craniometrics: Comparison of linear discriminant analysis, logistic regression, and support vector machines

... used to test three different classification methods: linear discriminant analysis (LDA), logistic regression ...

Authors
Santos, Frédéric. University Bordeaux-CNRS-MCC, PACEA, UMR 5199, F-33615 Pessac, France. Electronic address: frederic.santos@u-bordeaux.fr
Guyomarc'h, Pierre. University Bordeaux-CNRS-MCC, PACEA, UMR 5199, F-33615 Pessac, France
Bruzek, Jaroslav. University Bordeaux-CNRS-MCC, PACEA, UMR 5199, F-33615 Pessac, France; Charles University, Faculty of Science, Department of Anthropology and Human Genetics, Prague, Czech Republic; West Bohemia University, Faculty of Humanities, Department of Anthropology, Plzen, Czech Republic

Forensic science international. 2014 Dec ; 245 : 204.e1-8. [epub] 20141013

Forensic Sci Int
ISSN 1872-6283 | 0379-0738
Source

The accuracy of identification tools in forensic anthropology relies primarily upon the variations inherent in the data upon which they are built. Sex determination methods based on craniometrics are widely used and known to be specific to several factors (e.g. sample distribution, population, age, secular trends, and measurement technique). The goal of this study is to discuss the potential variations linked to the statistical treatment of the data. Traditional craniometrics of four samples extracted from documented osteological collections (from Portugal, France, the U.S.A., and Thailand) were used to test three different classification methods: linear discriminant analysis (LDA), logistic regression (LR), and support vector machines (SVM). The Portuguese sample was set as a training model to which the other samples were applied in order to assess the validity and reliability of the different models. The tests were performed using different parameters: some included the selection of the best predictors; some included a strict decision threshold (sex assessed only if the related posterior probability was high, including the notion of an indeterminate result); and some used an unbalanced sex ratio. Results indicated that LR tends to perform slightly better than the other techniques and offers a better selection of predictors. Also, the use of a decision threshold (i.e. p > 0.95) is essential to ensure an acceptable reliability of sex determination methods based on craniometrics. Although the Portuguese, French, and American samples share a similar sexual dimorphism, application of Western models to the Thai sample (which displayed a lower degree of dimorphism) was unsuccessful.
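The strict decision threshold described in the abstract can be illustrated with a minimal sketch. It assumes a classifier that outputs a posterior probability of "male"; the function name `assess_sex` and the example probabilities are hypothetical and are not taken from the study.

```python
def assess_sex(p_male: float, threshold: float = 0.95) -> str:
    """Return a sex estimate only when one posterior probability exceeds
    the threshold; otherwise report the case as indeterminate."""
    if not 0.0 <= p_male <= 1.0:
        raise ValueError("p_male must be a probability in [0, 1]")
    if p_male > threshold:
        return "male"
    if 1.0 - p_male > threshold:
        return "female"
    return "indeterminate"


# Hypothetical posterior probabilities for three skulls (illustrative only).
posteriors = [0.99, 0.60, 0.03]
decisions = [assess_sex(p) for p in posteriors]
```

Raising the threshold trades coverage for reliability: fewer individuals receive an estimate, but the estimates that are given rest on stronger posterior support.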

Keywords
Accuracy, Forensic anthropology population data, Population, Reliability, Sex estimation, Statistics
MeSH
Discriminant Analysis MeSH
Cephalometry * MeSH
Humans MeSH
Logistic Models MeSH
Racial Groups MeSH
Reproducibility of Results MeSH
Forensic Anthropology MeSH
Support Vector Machine MeSH
Sex Determination by Skeleton / methods MeSH
Check Tag
Humans MeSH
Publication type
Journal Article MeSH
Comparative Study MeSH

Article

Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

... Four different classification methods were used to evaluate fish diets including Random forest (RF), Support ...

Sensors (Basel, Switzerland). 2018 Mar 29 ; 18 (4) : . [epub] 20180329

Sensors (Basel)
ISSN 1424-8220
Source

The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with a radial basis function kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification, with CCRs of 75% and 70%, respectively. The k-NN was the least accurate classification model (40%). Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.
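The two scores reported in the abstract, correct classification rate (CCR) and Cohen's Kappa, can both be derived from a confusion matrix. The sketch below uses the standard definitions; the function name `ccr_and_kappa` and the example matrix are hypothetical and do not reproduce the study's results.

```python
def ccr_and_kappa(confusion):
    """Compute the correct classification rate and Cohen's kappa from a
    square confusion matrix (rows: true class, columns: predicted class)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Made-up two-class example (two diets), not the study's data.
confusion = [[40, 10],
             [5, 45]]
ccr, kappa = ccr_and_kappa(confusion)
```

Kappa discounts the agreement that would be expected by chance, which is why it is lower than the CCR for the same classifier.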

Keywords
image colour properties, image processing, image texture properties, machine vision system, supervised classification
MeSH
Diet MeSH
Logistic Models MeSH
Oncorhynchus mykiss MeSH
Support Vector Machine * MeSH
Animals MeSH
Check Tag
Animals MeSH
Publication type
Journal Article MeSH
Comparative Study MeSH

Article

Bimolecular Nucleophilic Substitution Reactions: Predictive Models for Rate Constants and Molecular Reaction Pairs Analysis

... Consensus Support Vector Regression (SVR) model for the rate constant was prepared. ...

Molecular informatics. 2019 Apr ; 38 (4) : e1800104. [epub] 20181123

Mol Inform
ISSN 1868-1751 | 1868-1743
Source

Here, we report the data visualization, analysis and modeling for a large set of 4830 SN2 reactions, the rate constant of which (logk) was measured under different experimental conditions (solvent, temperature). The reactions were encoded by a single molecular graph, the Condensed Graph of Reaction, which allowed us to use conventional chemoinformatics techniques developed for individual molecules. Thus, a Matched Reaction Pairs approach was suggested and used to analyze substituent effects on the reactivity of substrates and nucleophiles. The data were visualized with the help of the Generative Topographic Mapping approach. A consensus Support Vector Regression (SVR) model for the rate constant was prepared. Unbiased estimation of the model's performance was made in cross-validation on reactions measured on unique structural transformations. The model's performance in cross-validation (RMSE = 0.61 logk units) and on the external test set (RMSE = 0.80) is close to the noise in the data. Performances of the local models obtained for selected subsets of reactions proceeding in particular solvents or with particular types of nucleophiles were similar to that of the model built on the entire set. Finally, four different definitions of the model's applicability domain for reactions were examined.
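One possible reading of "cross-validation on reactions measured on unique structural transformations" is a grouped split, in which all measurements of the same transformation (for example in different solvents or at different temperatures) stay in the same fold. The sketch below illustrates that idea only; the function `grouped_folds`, the greedy fold assignment and the labels `T1` to `T4` are assumptions, not the authors' procedure.

```python
from collections import defaultdict


def grouped_folds(groups, n_folds):
    """Yield (train_indices, test_indices) pairs such that no group label
    appears on both sides of a split."""
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place the largest groups first, each into the
    # currently smallest fold.
    for label in sorted(members, key=lambda g: len(members[g]), reverse=True):
        min(folds, key=len).extend(members[label])
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


# Hypothetical transformation labels for seven measured reactions.
groups = ["T1", "T1", "T2", "T3", "T3", "T3", "T4"]
splits = list(grouped_folds(groups, n_folds=2))
```

Keeping repeated measurements of one transformation together prevents the model from being scored on a reaction it has effectively already seen under other conditions.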

Keywords
Condensed Graph of Reaction, Generative Topographic Mapping, Matched Reaction Pairs, Support Vector Regression, bimolecular nucleophilic substitution reactions, models applicability domain
MeSH
Models, Chemical * MeSH
Hydrocarbons, Cyclic / chemistry MeSH
Kinetics MeSH
Oxidation-Reduction MeSH
Support Vector Machine * MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH
Substances
Hydrocarbons, Cyclic MeSH

Article

Prediction of nickel concentration in peri-urban and urban soils using hybridized empirical bayesian kriging and support vector machine regression

... The prediction results indicated that support vector machine regression (SVMR) performed well, although ...

Scientific reports. 2022 Feb 22 ; 12 (1) : 3004. [epub] 20220222

Sci Rep
ISSN 2045-2322
Source

Soil pollution is a major issue caused by anthropogenic activities. The spatial distribution of potentially toxic elements (PTEs) varies in most urban and peri-urban areas. As a result, spatially predicting the PTE content in such soil is difficult. A total of 115 samples were obtained from Frydek Mistek in the Czech Republic. Calcium (Ca), magnesium (Mg), potassium (K), and nickel (Ni) concentrations were determined using Inductively Coupled Plasma Optical Emission Spectroscopy. The response variable was Ni, while the predictors were Ca, Mg, and K. The correlation matrix between the response variable and the predictors revealed a satisfactory correlation between the elements. The prediction results indicated that support vector machine regression (SVMR) performed well, although its estimated root mean square error (RMSE) (235.974 mg/kg) and mean absolute error (MAE) (166.946 mg/kg) were higher when compared with the other methods applied. The hybridized model of empirical Bayesian kriging-multiple linear regression (EBK-MLR) performed poorly, as evidenced by a coefficient of determination value of less than 0.1. The empirical Bayesian kriging-support vector machine regression (EBK-SVMR) model was the optimal model, with low RMSE (95.479 mg/kg) and MAE (77.368 mg/kg) values and a high coefficient of determination (R2 = 0.637). The EBK-SVMR modelling technique output was visualized using a self-organizing map. The clustered neurons of the hybridized model CaKMg-EBK-SVMR component plane showed a diverse colour pattern predicting the concentration of Ni in the urban and peri-urban soil. The results proved that combining EBK and SVMR is an effective technique for predicting Ni concentrations in urban and peri-urban soil.
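The three scores used to rank the models (RMSE, MAE and the coefficient of determination) follow from the residuals between observed and predicted concentrations. The sketch below uses the textbook definitions; the function name `regression_metrics` and the numbers are illustrative assumptions, not the study's measurements.

```python
import math


def regression_metrics(observed, predicted):
    """Return (RMSE, MAE, R2) for paired observed and predicted values."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R2 compares the model's squared error with that of the mean predictor.
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Made-up concentrations in mg/kg, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
rmse, mae, r2 = regression_metrics(observed, predicted)
```

RMSE is never smaller than MAE for the same residuals, because squaring gives more weight to large errors; a wide gap between the two suggests a few poorly predicted samples.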

Article

Absorption Features in Soil Spectra Assessment

... techniques appropriate to relate spectra measurements with soil properties, partial least squares (PLS) regression ...

Applied spectroscopy. 2015 Dec ; 69 (12) : 1425-31.

Appl Spectrosc
ISSN 1943-3530 | 0003-7028
Source

From a wide range of techniques appropriate to relate spectra measurements with soil properties, partial least squares (PLS) regression and support vector machines (SVM) are most commonly used. This is due to their predictive power and the availability of software tools. Both represent exclusively statistically based approaches and, as such, benefit from multiple responses of soil material in the spectrum. However, physical-based approaches that focus only on a single spectral feature, such as simple linear regression using selected continuum-removed spectra values as a predictor variable, often provide accurate estimates. Furthermore, if this approach extends to multiple cases by taking into account three basic absorption feature parameters (area, width, and depth) of all occurring features as predictors and subjecting them to best subset selection, one can achieve even higher prediction accuracy compared with PLS regression. Here, we attempt to further extend this approach by adding two additional absorption feature parameters (left and right side area), as they can be important diagnostic markers, too. As a result, we achieved higher prediction accuracy compared with PLS regression and SVM for exchangeable soil pH, slightly higher or comparable for dithionite-citrate and ammonium oxalate extractable Fe and Mn forms, but slightly worse for oxidizable carbon content. Therefore, we suggest incorporating the multiple linear regression approach based on absorption feature parameters into existing working practices.
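The absorption feature parameters named in the abstract (depth, area, width, and left and right side area) can be sketched for a single feature in a continuum-removed spectrum. The definitions below are common conventions and are assumptions here: depth is one minus the minimum continuum-removed reflectance, area is the trapezoidal integral of band depth over wavelength, width is the full width at half depth, and the left and right areas are split at the band minimum. The function names and example spectrum are hypothetical.

```python
def trapezoid(x, y):
    """Trapezoidal integral of y over x."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def crossing(x0, y0, x1, y1, level):
    """Linearly interpolate the x at which a segment reaches `level`."""
    return x0 + (level - y0) * (x1 - x0) / (y1 - y0)


def absorption_features(wavelengths, cr_reflectance):
    """Describe one absorption feature in a continuum-removed spectrum."""
    band = [1.0 - r for r in cr_reflectance]  # band depth at each wavelength
    n = len(band)
    i_min = max(range(n), key=band.__getitem__)
    depth = band[i_min]
    if depth <= 0:
        raise ValueError("no absorption feature found")
    half = depth / 2
    # Walk outwards from the band minimum to the half-depth crossings.
    i = i_min
    while i > 0 and band[i - 1] > half:
        i -= 1
    left_x = wavelengths[0] if i == 0 else crossing(
        wavelengths[i - 1], band[i - 1], wavelengths[i], band[i], half)
    j = i_min
    while j < n - 1 and band[j + 1] > half:
        j += 1
    right_x = wavelengths[-1] if j == n - 1 else crossing(
        wavelengths[j], band[j], wavelengths[j + 1], band[j + 1], half)
    return {
        "depth": depth,
        "area": trapezoid(wavelengths, band),
        "width": right_x - left_x,
        "left_area": trapezoid(wavelengths[:i_min + 1], band[:i_min + 1]),
        "right_area": trapezoid(wavelengths[i_min:], band[i_min:]),
    }


# Made-up continuum-removed reflectance around a feature near 2200 nm.
wavelengths = [2100.0, 2150.0, 2200.0, 2250.0, 2300.0]
cr_reflectance = [1.0, 0.8, 0.6, 0.9, 1.0]
features = absorption_features(wavelengths, cr_reflectance)
```

The left and right areas differ whenever the feature is asymmetric, which is the kind of shape information the two additional parameters are meant to capture.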

MeSH
Spectroscopy, Near-Infrared / methods MeSH
Absorption, Physicochemical MeSH
Calibration MeSH
Linear Models MeSH
Least-Squares Analysis MeSH
Soil / chemistry MeSH
Support Vector Machine MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH
Substances
Soil MeSH

Article

Estimation of potentially toxic elements contamination in anthropogenic soils on a brown coal mining dumpsite by reflectance spectroscopy: a case study

... Partial Least Square Regression (PLSR) and Support Vector Machine Regression (SVMR) with cross-validation ...

PloS one. 2015 ; 10 (2) : e0117457. [epub] 20150218

PLoS One
ISSN 1932-6203
Source

In order to monitor Potentially Toxic Elements (PTEs) in anthropogenic soils on brown coal mining dumpsites, a large number of samples and cumbersome, time-consuming laboratory measurements are required. Due to its rapidity, convenience and accuracy, reflectance spectroscopy within the Visible-Near Infrared (Vis-NIR) region has been used to predict soil constituents. This study evaluated the suitability of Vis-NIR (350-2500 nm) reflectance spectroscopy for predicting PTE concentrations, using samples collected on large brown coal mining dumpsites in the Czech Republic. Partial Least Square Regression (PLSR) and Support Vector Machine Regression (SVMR) with cross-validation were used to relate PTE data to the reflectance spectral data by applying different preprocessing strategies. According to the criteria of minimal Root Mean Square Error of Prediction of Cross Validation (RMSEPcv), maximal coefficient of determination (R2cv) and maximal Residual Prediction Deviation (RPD), the SVMR models with the first derivative pretreatment provided the most accurate prediction for As (R2cv = 0.89, RMSEPcv = 1.89, RPD = 2.63). Less accurate but acceptable predictions for screening purposes were obtained for Cd and Cu (0.66 < R2cv < 0.81, RMSEPcv = 0.0.8 and 4.08 respectively, 2.0 < RPD < 2.5). The PLSR model for predicting Mn (R2cv = 0.44, RMSEPcv = 116.43, RPD = 1.45) was inadequate. Overall, SVMR models for the Vis-NIR spectra could be used indirectly for an accurate assessment of PTE concentrations.
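The Residual Prediction Deviation (RPD) reported in the abstract is commonly computed as the standard deviation of the reference values divided by the RMSEP. The sketch below assumes that definition with the sample standard deviation; whether the study used exactly this variant is not stated in the abstract, and the function name `rpd` and the numbers are illustrative.

```python
import math
import statistics


def rpd(observed, predicted):
    """Residual Prediction Deviation: SD of observed values / RMSEP."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmsep = math.sqrt(sum(squared_errors) / len(squared_errors))
    return statistics.stdev(observed) / rmsep


# Made-up reference and predicted concentrations, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 2.5, 4.5, 4.5]
rpd_value = rpd(observed, predicted)
```

Because RPD scales the error by the spread of the reference data, it allows models for elements with very different concentration ranges to be compared on one scale.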

MeSH
Environmental Monitoring MeSH
Soil / chemistry MeSH
Support Vector Machine MeSH
Coal Mining * MeSH
Environmental Pollution / analysis MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH
Substances
Soil MeSH

Article

Can in situ spectral measurements under disturbance-reduced environmental conditions help improve soil organic carbon estimation?

... Three separate multivariate models were used to predict SOC, namely Cubist, support vector machine regression ...

The Science of the total environment. 2022 Sep 10 ; 838 (Pt 3) : 156304. [epub] 20220529

Sci Total Environ
ISSN 1879-1026 | 0048-9697
Source

In situ visible and near-infrared (Vis-NIR) spectroscopy has proven to be a reliable tool for determining soil organic carbon (SOC) content with a small loss of precision as compared to laboratory measurements. The loss of precision is a result of external environmental factors that disrupt spectral measurements, for example roughness, changes in weather conditions, humidity, temperature, human factors, spectral noise and especially soil water. It has been assumed that in situ predictive capability could be improved if some of these factors are either minimized or eliminated during the in situ measurement. For this study, the prediction of SOC was carried out under two different in situ measurement conditions: less favourable environmental conditions (with disturbances) and more favourable site-specific conditions (disturbance-reduced conditions). The primary goal is to determine whether the estimate of SOC can be improved under more favourable site-specific conditions, as well as the impact of pre-treatment algorithms under both less and more favourable conditions. The study employed a large range of pre-treatment algorithms and their combinations. Three separate multivariate models were used to predict SOC, namely Cubist, support vector machine regression (SVMR), and partial least squares regression (PLSR). The result clearly shows that reduced disturbing factors (i.e., drier and unploughed soil as well as noise reduction) result in an improvement of SOC prediction with in situ Vis-NIR spectroscopy. The best overall result was achieved with SVMR (R2CV = 0.72, RMSEPcv = 0.21, RPIQ = 2.34). Although the combination of pre-treatment algorithms resulted in an improvement, overall, these pre-treatment algorithms could not compensate for the factors affecting the spectra measured with disturbance. Though the obtained result is promising, further study is still needed to disentangle the impacts and interactions of various disturbing factors for different soil types.
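RPIQ, reported alongside R2CV and RMSEPcv, is usually defined as the interquartile range of the observed values divided by the RMSEP. The sketch below assumes that definition and the "inclusive" quartile method of Python's `statistics.quantiles`; the quartile convention used in the study is not given in the abstract, and the function name `rpiq` and the values are illustrative.

```python
import math
import statistics


def rpiq(observed, predicted):
    """Ratio of performance to interquartile range: IQR(observed) / RMSEP."""
    q1, _median, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmsep = math.sqrt(sum(squared_errors) / len(squared_errors))
    return (q3 - q1) / rmsep


# Made-up SOC contents (%), for illustration only.
observed = [0.8, 1.0, 1.2, 1.4, 1.6]
predicted = [0.9, 0.9, 1.3, 1.3, 1.7]
rpiq_value = rpiq(observed, predicted)
```

Using the interquartile range instead of the standard deviation makes the score less sensitive to skewed distributions, which are common for soil properties.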

Keywords
Agricultural soil, In situ spectroscopy, Machine learning algorithms, Pre-treatment algorithms, SOC
MeSH
Spectroscopy, Near-Infrared / methods MeSH
Humans MeSH
Least-Squares Analysis MeSH
Soil * / chemistry MeSH
Support Vector Machine MeSH
Carbon * MeSH
Check Tag
Humans MeSH
Publication type
Journal Article MeSH
Substances
Soil * MeSH
Carbon * MeSH

Article

Dental age estimation and different predictive ability of various tooth types in the Czech population: data mining methods

Anthropologischer Anzeiger; Bericht uber die biologisch-anthropologische Literatur. 2013 ; 70 (3) : 331-45.

Anthropol Anz
ISSN 0003-5548
Source

Dental development is frequently used to estimate age in many anthropological specializations. The aim of this study was to extract an accurate predictive age system for the Czech population and to identify differences in the predictive ability of various tooth types and their ontogenetic stability during infancy and adolescence. A cross-sectional panoramic X-ray study was based on the assessment of developmental stages of mandibular teeth (Moorrees et al. 1963) in 1393 individuals aged from 3 to 17 years. Data mining methods were used for dental age estimation. These are based on nonlinear relationships between the predicted age and data sets. Compared with other tested predictive models, the GAME method predicted age with the highest accuracy. Age-interval estimations between the 10th and 90th percentiles ranged from -1.06 to +1.01 years in girls and from -1.13 to +1.20 in boys. Accuracy was expressed by RMS error, which is the average deviation between estimated and chronological age. The predictive value of individual teeth changed during the investigated period from 3 to 17 years. When we evaluated the whole period, the second molars exhibited the best predictive ability. When evaluating partial age periods, we found that the accuracy of biological age prediction declines with increasing age (from 0.52 to 1.20 years in girls and from 0.62 to 1.22 years in boys) and that the predictive importance of tooth types changes, depending on variability and the number of developmental stages in the age interval. GAME is a promising tool for age-interval estimation studies as it can provide reliable predictive models.

MeSH
Data Mining / methods MeSH
Child MeSH
Humans MeSH
Adolescent MeSH
Child, Preschool MeSH
Cross-Sectional Studies MeSH
Regression Analysis MeSH
Radiography, Panoramic MeSH
Models, Statistical MeSH
Support Vector Machine MeSH
Age Determination by Teeth / methods MeSH
Tooth / anatomy & histology / diagnostic imaging MeSH
Check Tag
Child MeSH
Humans MeSH
Adolescent MeSH
Male MeSH
Child, Preschool MeSH
Female MeSH
Publication type
Journal Article MeSH
Research Support, Non-U.S. Gov't MeSH

Article

Machine learning for cyanobacteria inversion via remote sensing and AlgaeTorch in the Třeboň fishponds, Czech Republic

... Partial Least Squares Regression (PLSR) and three machine learning algorithms, Random Forest (RF), Support ...

The Science of the total environment. 2024 Oct 15 ; 947 : 174504. [epub] 20240704

Sci Total Environ
ISSN 1879-1026 | 0048-9697
Source

Cyanobacteria blooms in fishponds, driven by climate change and anthropogenic activities, have become a critical concern for aquatic ecosystems worldwide. The diversity in fishpond sizes and fish densities further complicates their monitoring. This study addresses the challenge of accurately predicting cyanobacteria concentrations in turbid waters via remote sensing, hindered by optical complexities and diminished light signals. A comprehensive dataset of 740 sampling points was compiled, encompassing water quality metrics (cyanobacteria levels, total chlorophyll, turbidity, total cell count) and spectral data obtained through AlgaeTorch, alongside Sentinel-2 reflectance data from three Třeboň fishponds (UNESCO Man and Biosphere Reserve) in the Czech Republic over 2022-2023. Partial Least Squares Regression (PLSR) and three machine learning algorithms, Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost), were developed based on seasonal and annual data volumes. The SVM algorithm demonstrated commendable performance on the one-year data validation dataset from the Svět fishpond for the prediction of cyanobacteria, reflected by the key performance indicators: R2 = 0.88, RMSE = 15.07 μg Chl-a/L, and RPD = 2.82. Meanwhile, SVM displayed steady results in the unified one-year validation dataset from Naděje, Svět, and Vizír fishponds, with metrics showing R2 = 0.56, RMSE = 39.03 μg Chl-a/L, RPD = 1.50. Thus, Sentinel data proved viable for seasonal cyanobacteria monitoring across different fishponds. Overall, this study presents a novel approach for enhancing the precision of cyanobacteria predictions and long-term ecological monitoring in fishponds, contributing significantly to the water quality management strategies in the Třeboň region.

Keywords
Cyanobacteria, Fishponds, Machine learning, Remote sensing, Water quality inversion
MeSH
Eutrophication MeSH
Water Quality MeSH
Environmental Monitoring * / methods MeSH
Cyanobacteria * MeSH
Machine Learning * MeSH
Support Vector Machine MeSH
Remote Sensing Technology * MeSH
Publication type
Journal Article MeSH
Geographic names
Czech Republic MeSH

Article

Novel age estimation model based on development of permanent teeth compared with classical approach and other modern data mining methods

... These methods were used to build a tabular multiple linear regression model, an M5P tree model and support ...

Forensic science international. 2017 Oct ; 279 : 72-82. [epub] 20170814

Forensic Sci Int
ISSN 1872-6283 | 0379-0738
Source

In order to analyze and improve dental age estimation in children and adolescents for forensic purposes, 22 age estimation methods were compared on a sample of 976 orthopantomographs (662 males, 314 females) of healthy Czech children and adolescents aged between 2.7 and 20.5 years. All methods are compared in terms of accuracy and complexity and are based on various data mining methods or on simple mathematical operations. The winning method is presented in detail. The comparison showed that only three methods provide the best accuracy while remaining user-friendly. These methods were used to build a tabular multiple linear regression model, an M5P tree model and a support vector machine model with a first-order polynomial kernel. All of them have a mean absolute error (MAE) under 0.7 years for both males and females. The other well-performing data mining methods (RBF neural network, K-nearest neighbors, Kstar, etc.) have similar or slightly better accuracy, but they are not user-friendly as they require computing equipment and implementation as a computer program. The traditional model based on age averages provides the lowest estimation accuracy (MAE under 0.96 years). The relevance of various teeth for age estimation was found to differ. This finding also explains the lowest accuracy of the traditional averages-based model. In this paper, a technique for missing data replacement for cases with missing teeth is presented in detail, as well as the constrained tabular multiple regression model. Also, we provide free age prediction software based on this winning model.

Keywords
Age estimation, Data mining, Model, Population-specific standards
MeSH
Data Mining MeSH
Dentition, Permanent * MeSH
Child MeSH
Humans MeSH
Linear Models MeSH
Adolescent MeSH
Young Adult MeSH
Neural Networks, Computer MeSH
Child, Preschool MeSH
Radiography, Panoramic MeSH
Decision Trees MeSH
Software MeSH
Support Vector Machine MeSH
Age Determination by Teeth / methods MeSH
Tooth / growth & development MeSH
Check Tag
Child MeSH
Humans MeSH
Adolescent MeSH
Young Adult MeSH
Male MeSH
Child, Preschool MeSH
Female MeSH
Publication type
Journal Article MeSH
Comparative Study MeSH
