linear least square
In the realm of multi-class classification, the twin K-class support vector classification (Twin-KSVC) generates ternary outputs {-1,0,+1} by evaluating all training data in a "1-versus-1-versus-rest" structure. Recently, inspired by the least-squares version of Twin-KSVC and Twin-KSVC, a new multi-class classifier called improvements on least-squares twin multi-class classification support vector machine (ILSTKSVC) has been proposed. In this method, the concept of structural risk minimization is achieved by incorporating a regularization term in addition to the minimization of empirical risk. Twin-KSVC and its improvements have an influence on classification accuracy. Another aspect influencing classification accuracy is feature selection, which is a critical stage in machine learning, especially when working with high-dimensional datasets. However, most prior studies have not addressed this crucial aspect. In this study, motivated by ILSTKSVC and the cardinality-constrained optimization problem, we propose ℓp-norm least-squares twin multi-class support vector machine (PLSTKSVC) with 0
Potentiometric and spectrophotometric pH-titrations of the multiprotic cytostatic bosutinib were compared for the determination of its dissociation constants. Bosutinib is used to treat patients with positive chronic myeloid leukemia. Bosutinib exhibits four protonatable sites in the pH range from 2 to 11, where two pK values are well separated (ΔpK > 3), while the other two lie close together. In a neutral medium, bosutinib occurs in the slightly water-soluble form LH, which can be protonated to the soluble cation LH₄³⁺. The molecule LH can dissociate to the anion L⁻, which is still poorly soluble. The set of spectra recorded at pH 2 to 11 in the range 239.3-375.0 nm was divided into two absorption bands, the first from 239.3 to 290.5 nm and the second from 312.3 to 375.0 nm, which differ in the sensitivity of their chromophores to a change in pH. Estimates of pK from the entire set of spectra were compared with those from each absorption band. Because of the limited solubility of bosutinib, the protonation was studied in a mixed aqueous-methanolic medium. At a low methanol content of 3-6%, three dissociation constants can be reliably determined with SPECFIT/32 and SQUAD(84), and after extrapolation to zero methanol content they lead to pKc1=3.43(12), pKc2=4.54(10), pKc3=7.56(07) and pKc4=11.04(05) at 25 °C and pKc1=3.44(06), pKc2=5.03(08), pKc3=7.33(05) and pKc4=10.92(06) at 37 °C. With an increasing content of methanol in the solvent, the dissociation of bosutinib is suppressed, the percentage of LH₃²⁺ decreases and LH prevails. From the potentiometric pH-titration at 25 °C, the concentration dissociation constants were estimated with ESAB: pKc1=3.51(02), pKc2=4.37(02), pKc3=7.97(02) and pKc4=11.05(03), and with HYPERQUAD: pKc1=3.29(12), pKc2=4.24(10), pKc3=7.95(07) and pKc4=11.29(05).
- Keywords
- Bosutinib, Dissociation constants, ESAB2M, HYPERQUAD, INDICES, PALLAS, Potentiometric Titration, SQUAD(84), SPECFIT/32, Spectrophotometric titration
- MeSH
- aniline compounds analysis chemistry MeSH
- quinolines analysis chemistry MeSH
- cytostatic agents analysis chemistry MeSH
- hydrogen-ion concentration MeSH
- least-squares analysis MeSH
- nonlinear dynamics * MeSH
- nitriles analysis chemistry MeSH
- potentiometry MeSH
- spectrophotometry methods MeSH
- Publication type
- journal article MeSH
- Substance names
- aniline compounds MeSH
- bosutinib MeSH Browser
- quinolines MeSH
- cytostatic agents MeSH
- nitriles MeSH
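The reported constants can be turned into a species distribution, which makes the statements about which form prevails at a given pH easier to check. The following is a minimal sketch, not the authors' procedure: it assumes that pKc1 to pKc4 are successive dissociation constants numbered from the most protonated cation downwards, it uses the spectrophotometric estimates at 25 °C quoted in the abstract, and the function names and ASCII species labels are made up for illustration.

```python
# Sketch: mole fractions of the five protonation states of a tetraprotic
# molecule as a function of pH.
# Assumption (not stated in the abstract): pKc1..pKc4 are successive
# dissociation constants, numbered from the most protonated cation downwards.

PK_25C = [3.43, 4.54, 7.56, 11.04]  # spectrophotometric estimates at 25 °C
SPECIES = ["LH4(3+)", "LH3(2+)", "LH2(+)", "LH", "L(-)"]  # ASCII labels


def species_fractions(ph, pks):
    """Return mole fractions, most protonated species first."""
    # The species that has lost j protons has the relative amount
    # 10 ** (j * pH - sum of the first j pK values).
    weights = []
    cumulative_pk = 0.0
    for j in range(len(pks) + 1):
        if j > 0:
            cumulative_pk += pks[j - 1]
        weights.append(10.0 ** (j * ph - cumulative_pk))
    total = sum(weights)
    return [w / total for w in weights]


def dominant_species(ph, pks=PK_25C):
    """Label of the species with the largest mole fraction at this pH."""
    fractions = species_fractions(ph, pks)
    return SPECIES[fractions.index(max(fractions))]


if __name__ == "__main__":
    for ph in (2.0, 5.0, 9.0, 12.0):
        fractions = species_fractions(ph, PK_25C)
        row = ", ".join(f"{n}={f:.3f}" for n, f in zip(SPECIES, fractions))
        print(f"pH {ph:4.1f}: {row}")
```

Working with exponents of ten rather than with the constants themselves keeps the calculation readable and avoids multiplying very small numbers.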
The common approach to regression analysis with compositional variables is to express compositions in log-ratio coordinates (coefficients) and then perform standard statistical processing in real space. As when working in real space, the problem is that standard least squares regression fails when the total number of parts of all compositional covariates is higher than the number of observations. The aim of this study is to analyze in detail partial least squares (PLS) regression, which can deal with this problem. In this paper, we focus on PLS regression between more than one compositional response variable and more than one compositional covariate. First, we give the PLS regression model with log-ratio coordinates of compositional variables, then we express the PLS model directly in the simplex. We also prove that the PLS model is invariant under a change of coordinate system, such as ilr coordinates with a different contrast matrix or clr coefficients. Moreover, we give the estimation and inference for parameters in the PLS model. Finally, the PLS model with clr coefficients is used to analyze the relationship between the chemical metabolites of Astragali Radix and the rat plasma metabolites after administration of Astragali Radix.
- Keywords
- 62H12, 62H86, 62J05, Compositional data, centered log-ratio coefficients, coordinates, linear regression model, partial least squares
- Publication type
- journal article MeSH
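For readers unfamiliar with the clr coefficients mentioned in this abstract, the transform itself is short. The sketch below is only an illustration of the standard definition (the logarithm of each part minus the mean logarithm of all parts); the function name and the sample composition are hypothetical, and nothing here reproduces the paper's PLS model.

```python
import math


def clr(composition):
    """Centered log-ratio coefficients of one composition (all parts > 0)."""
    if any(part <= 0 for part in composition):
        raise ValueError("clr is defined only for strictly positive parts")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)  # logarithm of the geometric mean
    return [value - mean_log for value in logs]


if __name__ == "__main__":
    sample = [0.2, 0.3, 0.5]       # hypothetical 3-part composition
    rescaled = [20.0, 30.0, 50.0]  # the same composition in percent
    print(clr(sample))
    print(clr(rescaled))           # same coefficients up to rounding
```

Two properties are visible directly: the coefficients do not depend on the scale of the composition, and they always sum to zero, so they are linearly dependent.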
The mixed dissociation constants of four non-steroidal anti-inflammatory drugs (NSAIDs), ibuprofen, diclofenac sodium, flurbiprofen and ketoprofen, were determined at various ionic strengths I in the range 0.003-0.155 and at temperatures of 25 °C and 37 °C, using two different multiwavelength and multivariate treatments of spectral data: SPECFIT/32 and SQUAD(84) nonlinear regression analyses and INDICES factor analysis. The factor analysis in the INDICES program predicts the correct number of components, and even the presence of minor ones, when the data quality is high and the instrumental error is known. The thermodynamic dissociation constant pKₐᵀ was estimated by nonlinear regression of (pKₐ, I) data at 25 °C and 37 °C. Goodness-of-fit tests for various regression diagnostics made it possible to verify the reliability of the parameter estimates. PALLAS, MARVIN, SPARC, ACD/pKa and Pharma Algorithms predict pKₐ from the structural formulae of the drug compounds, in agreement with the experimental values. The best agreement with the experimentally found values seems to be obtained with the ACD/pKa program and with SPARC. PALLAS and MARVIN predicted pKₐ,pred values with larger bias errors relative to the experimental values for all four drugs.
- MeSH
- anti-inflammatory agents, non-steroidal chemistry MeSH
- models, chemical MeSH
- hydrogen-ion concentration MeSH
- least-squares analysis MeSH
- molecular structure MeSH
- nonlinear dynamics MeSH
- solubility MeSH
- spectrophotometry methods MeSH
- thermodynamics * MeSH
- titrimetry methods MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
- Substance names
- anti-inflammatory agents, non-steroidal MeSH
A computer-aided quantitative method for a complex analysis of gel electrophoretograms is presented. The analysis consists of several steps: (i) determination of the background image by methods of mathematical morphology and its subtraction from the gel image, (ii) selection of an appropriate part of the gel lane, including curved lanes and lanes with a nonuniform width, (iii) computation of the lane densitogram by averaging several lane-parallel scans, (iv) decomposition of the lane densitogram into component bands using a data-selecting algorithm and Marquardt's minimizer. Several different functions for component bands are utilized. It is shown that the densitogram can be decomposed into component bands with reasonable accuracy only if an appropriate model function is chosen. The algorithms are tested on several different gel electrophoretograms which show typical features such as a nonuniform background, curved lanes, an asymmetrical band shape and a superposition of small bands on the shoulders of big ones. It is shown that overlapped bands are best approximated by an asymmetrical Gaussian curve and an asymmetrical Gauss-Cauchy function. The linearity of the response to serial dilution of the protein sample is tested.
- MeSH
- densitometry MeSH
- DNA, bacterial analysis MeSH
- electrophoresis, agar gel methods MeSH
- electrophoresis, polyacrylamide gel methods MeSH
- Escherichia coli genetics MeSH
- mathematics MeSH
- least-squares analysis * MeSH
- molecular weight MeSH
- plasmids MeSH
- image processing, computer-assisted * MeSH
- proteins chemistry MeSH
- ribosomal proteins analysis MeSH
- Streptomyces aureofaciens chemistry MeSH
- Publication type
- journal article MeSH
- Substance names
- DNA, bacterial MeSH
- proteins MeSH
- ribosomal proteins MeSH
Although modern instrumentation delivers increasing amounts of data in a shorter time, computer-assisted spectra analysis is limited by the intelligence and the programmed logic of the software tools applied. The proposed tutorial covers all the main steps of data processing involved in building the chemical model, from calculating the concentration profiles to fitting, by spectra regression, the protonation constants of the chemical model to the measured multiwavelength and multivariate data. The suggested diagnostics are examined to see whether the chemical model hypothesis can be accepted, as an incorrect model with false stoichiometric indices may lead to slow convergence, cycling or divergence of the regression minimization process. The diagnostics concern the physical meaning of the unknown parameters β_qr and ε_qr, the physical sense of the associated species concentrations, parametric correlation coefficients, goodness-of-fit tests, error analyses and spectra deconvolution, and the determination of the correct number of light-absorbing species. All of the benefits of spectrophotometric data analysis are demonstrated on the protonation constants of the ionizable anticancer drug 7-ethyl-10-hydroxycamptothecine, using data double-checked with the SQUAD(84) and SPECFIT/32 regression programs and with factor analysis in the INDICES program. The experimental determination of the protonation constants was complemented by their computational prediction, based on knowledge of the chemical structure of the drug, with the combined MARVIN and PALLAS programs. If the proposed model adequately represents the data, the residuals should form a random pattern with a normal distribution N(0, s²), with the residual mean equal to zero and the standard deviation of the residuals close to the experimental noise. Examination of the residuals may be assisted by their graphical analysis, and systematic departures from randomness indicate that the model and parameter estimates are not satisfactory.
- Publication type
- journal article MeSH
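The residual criterion described in this abstract (mean close to zero, standard deviation close to the experimental noise) can be checked numerically in a few lines. The sketch below is a rough illustration only: the function name, the tolerance factor, the two-standard-error rule for the mean and the sample residuals are all assumptions, it does not test normality or randomness, and it is not how SQUAD(84) or SPECFIT/32 report their diagnostics.

```python
import statistics


def residual_check(residuals, instrumental_sd, tolerance=2.0):
    """Summarise the residuals of a spectra regression.

    `instrumental_sd` and `tolerance` are illustrative inputs, not values
    taken from the tutorial.
    """
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals)
    standard_error = sd / len(residuals) ** 0.5
    return {
        "mean": mean,
        "sd": sd,
        # mean within about two standard errors of zero
        "mean_near_zero": abs(mean) <= 2.0 * standard_error,
        # residual scatter of the same order as the instrumental noise
        "sd_near_noise": sd <= tolerance * instrumental_sd,
    }


if __name__ == "__main__":
    # Hypothetical absorbance residuals in mAU, with an assumed noise of 0.3 mAU.
    good = [0.4, -0.3, 0.1, -0.5, 0.2, 0.3, -0.2, 0.0]
    biased = [r + 1.0 for r in good]  # a systematic offset of 1 mAU
    print(residual_check(good, instrumental_sd=0.3))
    print(residual_check(biased, instrumental_sd=0.3))
```

A numerical summary of this kind complements, but does not replace, the residual plots the tutorial recommends, since a trend along the wavelength or pH axis can hide behind an acceptable mean and standard deviation.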
In this paper, we study the design aspects of an indoor visible light positioning (VLP) system that uses an artificial neural network (ANN) for position estimation by considering a multipath channel. Previous results usually rely on the simplistic line-of-sight model, which has limited validity. The study considers the influence of noise as a performance indicator for the comparison between different design approaches. Three different ANN training algorithms are considered, namely Levenberg-Marquardt, Bayesian regularization, and scaled conjugate gradient, to minimize the positioning error (εp) in the VLP system. The ANN design is optimized based on the number of neurons in the hidden layers, the number of training epochs, and the size of the training set. It is shown that the ANN with Bayesian regularization outperforms the traditional received signal strength (RSS) technique using non-linear least-squares estimation for all values of signal-to-noise ratio (SNR). Furthermore, in the inner region, which includes the area of the receiving plane within the transmitters, the positioning accuracy is improved by 43, 55, and 50% for SNRs of 10, 20, and 30 dB, respectively. In the outer region, which is the remaining area within the room, the positioning accuracy is improved by 57, 32, and 6% for SNRs of 10, 20, and 30 dB, respectively. Moreover, we also analyze the impact of different training dataset sizes in the ANN, and we show that it is possible to achieve a minimum εp of 2 cm for 30 dB of SNR using a random selection scheme. Finally, it is observed that εp is low even for lower values of SNR, i.e., εp values are 2, 11, and 44 cm for SNRs of 30, 20, and 10 dB, respectively.
- Keywords
- Bayesian regularization, artificial neural network (ANN), multipath reflections, non-linear least square, visible light communication (VLC), visible light positioning
- MeSH
- algorithms * MeSH
- Bayes theorem MeSH
- least-squares analysis MeSH
- neural networks, computer * MeSH
- light MeSH
- Publication type
- journal article MeSH
From a wide range of techniques appropriate to relate spectra measurements with soil properties, partial least squares (PLS) regression and support vector machines (SVM) are most commonly used. This is due to their predictive power and the availability of software tools. Both represent exclusively statistically based approaches and, as such, benefit from multiple responses of soil material in the spectrum. However, physical-based approaches that focus only on a single spectral feature, such as simple linear regression using selected continuum-removed spectra values as a predictor variable, often provide accurate estimates. Furthermore, if this approach extends to multiple cases by taking into account three basic absorption feature parameters (area, width, and depth) of all occurring features as predictors and subjecting them to best subset selection, one can achieve even higher prediction accuracy compared with PLS regression. Here, we attempt to further extend this approach by adding two additional absorption feature parameters (left and right side area), as they can be important diagnostic markers, too. As a result, we achieved higher prediction accuracy compared with PLS regression and SVM for exchangeable soil pH, slightly higher or comparable for dithionite-citrate and ammonium oxalate extractable Fe and Mn forms, but slightly worse for oxidizable carbon content. Therefore, we suggest incorporating the multiple linear regression approach based on absorption feature parameters into existing working practices.
The objective of this study is to evaluate the potential of high-throughput direct analysis in real time-high resolution mass spectrometry (DART-HRMS) fingerprinting and multivariate regression analysis for predicting the extent of acrylamide formation in biscuit samples prepared using various recipes and baking conditions. Information-rich mass spectral fingerprints were obtained by analysis of biscuit extracts prepared with aqueous methanol. Principal component analysis (PCA) of the acquired data revealed an apparent clustering of samples according to the extent of heat treatment applied during the baking of the biscuits. The regression model for prediction of acrylamide in biscuits was obtained by partial least squares regression (PLSR) analysis of the data matrix representing combined positive and negative ionization mode fingerprints. The model provided a lowest root mean square error of cross validation (RMSECV) equal to an acrylamide concentration of 5.4 μg kg⁻¹ and a standard error of prediction (SEP) of 14.8 μg kg⁻¹. The results obtained indicate that this strategy can be used to accurately predict the amounts of acrylamide formed during baking of biscuits. Such rapid estimation of acrylamide concentration can become a useful tool in evaluating the effectiveness of processes aimed at mitigating this food processing contaminant. However, the robustness of this approach with respect to variability in the chemical composition of the ingredients used for preparation of biscuits should be tested further.
- Keywords
- Acrylamide, Biscuits, Direct analysis in real time, Mass spectrometry, Multivariate regression analysis
- MeSH
- acrylamide analysis MeSH
- principal component analysis MeSH
- food analysis methods MeSH
- bread analysis MeSH
- mass spectrometry MeSH
- linear models MeSH
- least-squares analysis MeSH
- multivariate analysis MeSH
- tandem mass spectrometry MeSH
- cooking * MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
- Substance names
- acrylamide MeSH
The ISO 7029 (2000) standard defines normative hearing thresholds H (dB hearing level) as a function of age Y (years), given by H = α(Y − 18)^2, up to 8 kHz. The purpose of this study was to determine reference thresholds above 8 kHz. Hearing thresholds were examined using pure-tone audiometry over the extended frequency range 0.125-16 kHz, and the acquired values were used to specify the optimal approximation of the dependence of hearing thresholds on age. A sample of 411 otologically normal men and women 16-70 years of age was measured in both ears using a high-frequency audiometer and Sennheiser HDA 200 headphones. The coefficients of quadratic, linear, polynomial and power-law approximations were calculated using the least-squares fitting procedure. The approximation combining the square function H = α(Y − 18)^2 with a power-law function H = β(Y − 18)^1.5, both gender-independent, was found to be the most appropriate. Coefficient α was determined at frequencies of 9 kHz (α = 0.021), 10 kHz (α = 0.024), 11.2 kHz (α = 0.029), and coefficient β at frequencies of 12.5 kHz (β = 0.24), 14 kHz (β = 0.32), 16 kHz (β = 0.36). The results could be used to determine age-dependent normal hearing thresholds in an extended frequency range and to normalize hearing thresholds when comparing participants differing in age.
- MeSH
- acoustic stimulation MeSH
- audiometry, pure-tone standards MeSH
- adult MeSH
- middle aged MeSH
- humans MeSH
- linear models MeSH
- least-squares analysis MeSH
- adolescent MeSH
- young adult MeSH
- reference values MeSH
- aged MeSH
- hearing * MeSH
- auditory threshold * MeSH
- aging psychology MeSH
- age factors MeSH
- Check Tag
- adult MeSH
- middle aged MeSH
- humans MeSH
- adolescent MeSH
- young adult MeSH
- male MeSH
- female MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
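Because the threshold model in the hearing study above has a single coefficient, its least-squares estimate has a closed form: with x = (Y − 18)^p, minimising the sum of (H − c·x)² over c gives c = Σ x·H / Σ x². The sketch below illustrates this under stated assumptions: the function names are hypothetical, the data are generated from the published coefficient rather than measured, and ages below 18 are rejected because the abstract does not say how the formula was applied to the 16- and 17-year-old participants.

```python
def design_values(ages, exponent, onset=18.0):
    """Return (Y - onset) ** exponent for each age, rejecting ages below onset."""
    if any(age < onset for age in ages):
        raise ValueError("this sketch only covers ages at or above the onset age")
    return [(age - onset) ** exponent for age in ages]


def fit_coefficient(ages, thresholds, exponent=2.0, onset=18.0):
    """Least-squares coefficient c in H = c * (Y - onset) ** exponent."""
    x = design_values(ages, exponent, onset)
    numerator = sum(xi * hi for xi, hi in zip(x, thresholds))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator


def predict_threshold(age, coefficient, exponent=2.0, onset=18.0):
    """Threshold in dB HL predicted by the one-parameter model."""
    return coefficient * design_values([age], exponent, onset)[0]


if __name__ == "__main__":
    ages = [20, 30, 40, 50, 60, 70]
    # Synthetic thresholds generated from the published 9 kHz coefficient.
    synthetic = [predict_threshold(a, 0.021) for a in ages]
    print(fit_coefficient(ages, synthetic))            # recovers about 0.021
    print(predict_threshold(60, 0.36, exponent=1.5))   # 16 kHz model, age 60
```

With real audiometric data the fit would not be exact, and the residuals would show whether the quadratic or the power-law exponent describes a given frequency better.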