Abstract
Partial least squares (PLS) regression is a commonly used statistical technique for performing multivariate calibration, especially in situations where there are more variables than samples. Choosing the number of factors to include in a model is a decision that all users of PLS must make, but is complicated by the large number of empirical tests available. In most instances predictive ability is the most desired property of a PLS model and so interest has centred on making this choice based on an internal validation process. A popular approach is the calculation of a cross‐validated r² to gauge how much variance in the dependent variable can be explained from leave‐one‐out predictions. Using Monte Carlo simulations for different sizes of data set, the influence of chance effects on the cross‐validation process is investigated. The results are presented as tables of critical values which are compared against the values of cross‐validated r² obtained from the user's own data set. This gives a formal test for predictive ability of a PLS model with a given number of dimensions.
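The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the simulation design, so the choices below are assumptions. These include a single-response PLS fitted by NIPALS, the definition of cross-validated r² as 1 − PRESS/SS, standard-normal random data for the null simulations, and the 95% level. The function names `pls1_fit`, `loo_q2`, and `null_critical_value` are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (NIPALS); return means and coefficients."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean  # centred copies, deflated below
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)        # unit-length weight vector
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        c = (yr @ t) / tt             # y loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in X space
    return x_mean, y_mean, coef


def loo_q2(X, y, n_components):
    """Leave-one-out cross-validated r^2, assumed here to be 1 - PRESS/SS."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        x_mean, y_mean, coef = pls1_fit(X[keep], y[keep], n_components)
        pred = (X[i] - x_mean) @ coef + y_mean
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def null_critical_value(n_samples, n_vars, n_components,
                        n_sim=200, level=0.95, seed=0):
    """Monte Carlo critical value of cross-validated r^2 under pure chance.

    Assumption: X and y are independent standard-normal draws, so any
    apparent predictive ability is due to chance alone.
    """
    rng = np.random.default_rng(seed)
    sims = [
        loo_q2(rng.standard_normal((n_samples, n_vars)),
               rng.standard_normal(n_samples),
               n_components)
        for _ in range(n_sim)
    ]
    return float(np.quantile(sims, level))
```

Under these assumptions, a user would compute `loo_q2` on their own data and compare it with `null_critical_value` for the same number of samples, variables, and components; a value above the critical value suggests predictive ability beyond chance at the chosen level.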
Related Publications
Partial least squares regression and projection on latent structure regression (PLS Regression)
Partial least squares (PLS) regression (a.k.a. projection on latent structures) is a recent technique that combines features from and generalizes principal component a...
PLS, Small Sample Size, and Statistical Power in MIS Research
There is a pervasive belief in the Management Information Systems (MIS) field that Partial Least Squares (PLS) has special abilities that make it more appropriate than other tec...
MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package
Generalized linear mixed models provide a flexible framework for modeling a range of data, although with non-Gaussian response variables the likelihood cannot be obtained in clo...
On measurement of intangible assets: A study of robustness of partial least squares
The customer asset is an important intangible. Its value depends, for example, on the customer satisfaction level. Thus, it is important to monitor that level, and to identify c...
Regression methods for high dimensional multicollinear data
To compare their performance on high dimensional data, several regression methods are applied to data sets in which the number of exploratory variables greatly exceeds the sampl...
Publication Info
- Year: 1993
- Type: article
- Volume: 7
- Issue: 4
- Pages: 291-304
- Citations: 124
- Access: Closed
Identifiers
- DOI: 10.1002/cem.1180070407