Partial least squares regression with compositional response variables and covariates
Status: PubMed-not-MEDLINE; Language: English; Country: Great Britain, England; Medium: electronic-eCollection
Document type: journal articles
PubMed
35707263
PubMed Central
PMC9041651
DOI
10.1080/02664763.2020.1795813
PII: 1795813
- Keywords
- 62H12, 62H86, 62J05, Compositional data, centered log-ratio coefficients, coordinates, linear regression model, partial least squares
- Publication type
- Journal articles
The common approach to regression analysis with compositional variables is to express the compositions in log-ratio coordinates (or coefficients) and then apply standard statistical methods in real space. As with real-valued data, standard least squares regression fails when the total number of parts across all compositional covariates exceeds the number of observations. The aim of this study is to analyze in detail partial least squares (PLS) regression, which can handle this problem. We focus on PLS regression between more than one compositional response variable and more than one compositional covariate. First, we give the PLS regression model in log-ratio coordinates of the compositional variables; then we express the PLS model directly in the simplex. We also prove that the PLS model is invariant under a change of coordinate system, such as ilr coordinates with a different contrast matrix or clr coefficients. Moreover, we give the estimation and inference for the parameters of the PLS model. Finally, the PLS model with clr coefficients is used to analyze the relationship between the chemical metabolites of Astragali Radix and the plasma metabolites of rats after administration of Astragali Radix.
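The workflow summarized in the abstract (clr coefficients, a PLS fit in real space, back-transformation to the simplex) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses a textbook PLS2 algorithm with deflation, synthetic Dirichlet data in place of the metabolite data, and a single pair of compositions rather than several. The function names `clr`, `clr_inv`, and `pls2_fit` are hypothetical and chosen for illustration only.

```python
import numpy as np

def clr(x):
    """Centered log-ratio coefficients of compositions stored row-wise."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def clr_inv(z):
    """Map clr coefficients back to the simplex (each row sums to 1)."""
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pls2_fit(X, Y, n_components):
    """Textbook PLS2 with deflation (illustrative, not the paper's code).

    Returns the coefficient matrix B and the column means of X and Y,
    so that Y is approximated by (X - x_mean) @ B + y_mean.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xk, Yk = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of Xk' Yk.
        w = np.linalg.svd(Xk.T @ Yk, full_matrices=False)[0][:, 0]
        t = Xk @ w                 # scores
        tt = t @ t
        p = Xk.T @ t / tt          # X loadings
        q = Yk.T @ t / tt          # Y loadings
        Xk = Xk - np.outer(t, p)   # deflate X
        Yk = Yk - np.outer(t, q)   # deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return B, x_mean, y_mean

# Synthetic example (assumed data): 10 observations, a 15-part covariate
# composition and a 4-part response composition, so parts > observations.
rng = np.random.default_rng(0)
X_comp = rng.dirichlet(np.full(15, 5.0), size=10)
Y_comp = rng.dirichlet(np.full(4, 5.0), size=10)

X_clr, Y_clr = clr(X_comp), clr(Y_comp)
B, x_mean, y_mean = pls2_fit(X_clr, Y_clr, n_components=2)

# Fitted values in clr space, then mapped back to compositions.
Y_clr_hat = (X_clr - x_mean) @ B + y_mean
Y_hat = clr_inv(Y_clr_hat)
```

Because every row of the clr-transformed response sums to zero, the fitted clr values inherit the same constraint, and `clr_inv` returns valid compositions. The number of components (2 here) is an arbitrary choice for the sketch; in practice it would typically be chosen by cross-validation.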