A test of significance for partial least squares regression

Abstract

Partial least squares (PLS) regression is a commonly used statistical technique for performing multivariate calibration, especially in situations where there are more variables than samples. Choosing the number of factors to include in a model is a decision that all users of PLS must make, but is complicated by the large number of empirical tests available. In most instances predictive ability is the most desired property of a PLS model and so interest has centred on making this choice based on an internal validation process. A popular approach is the calculation of a cross-validated r² to gauge how much variance in the dependent variable can be explained from leave-one-out predictions. Using Monte Carlo simulations for different sizes of data set, the influence of chance effects on the cross-validation process is investigated. The results are presented as tables of critical values which are compared against the values of cross-validated r² obtained from the user's own data set. This gives a formal test for predictive ability of a PLS model with a given number of dimensions.
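The procedure summarised in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give their simulation design, so the null model used here (independent standard-normal X and y), the single-response NIPALS-style fit, the definition of cross-validated r² as 1 - PRESS/SS about the overall mean, and every function name (`pls1_fit`, `pls1_predict`, `loo_q2`, `critical_q2`) are assumptions made for illustration only.

```python
import random


def pls1_fit(X, y, n_comp):
    """Fit a single-response PLS model by successive deflation of centred data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    comps = []
    for _ in range(n_comp):
        # Weight vector: direction of maximal covariance with the residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict y for one new sample x using the stored components."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


def loo_q2(X, y, n_comp):
    """Leave-one-out cross-validated r^2, taken here as 1 - PRESS / SS."""
    n = len(y)
    y_mean = sum(y) / n
    press = 0.0
    for i in range(n):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        press += (y[i] - pls1_predict(model, X[i])) ** 2
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss


def critical_q2(n, p, n_comp, n_sim=200, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the cross-validated r^2 on pure noise."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_sim):
        # Assumed null model: X and y are independent standard-normal noise.
        X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        null.append(loo_q2(X, y, n_comp))
    null.sort()
    return null[min(n_sim - 1, int((1 - alpha) * n_sim))]
```

Under these assumptions, a cross-validated r² from real data would be compared against `critical_q2(n, p, n_comp)` computed for a matching number of samples, variables and components; a value above the threshold would suggest predictive ability beyond what chance alone tends to produce. The published tables should be preferred over this sketch for any actual test.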

Keywords

Partial least squares regression; Monte Carlo method; Regression analysis; Variance (accounting); Linear regression; Statistics; Set (abstract data type); Cross-validation; Multivariate statistics; Computer science; Mathematics

Affiliated Institutions

AstraZeneca (United Kingdom)

Related Publications

Partial least squares regression and projection on latent structure regression (PLS Regression)

Hervé Abdi

Partial least squares (PLS) regression (a.k.a. projection on latent structures) is a recent technique that combines features from and generalizes principal component a...

2010 Wiley Interdisciplinary Reviews Compu... 1363 citations

PLS, Small Sample Size, and Statistical Power in MIS Research

Dale L. Goodhue, William W. Lewis, Ron Thompson

There is a pervasive belief in the Management Information Systems (MIS) field that Partial Least Squares (PLS) has special abilities that make it more appropriate than other tec...

2006 299 citations

MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package

Jarrod D. Hadfield

Generalized linear mixed models provide a flexible framework for modeling a range of data, although with non-Gaussian response variables the likelihood cannot be obtained in clo...

2010 Journal of Statistical Software 4603 citations

On measurement of intangible assets: A study of robustness of partial least squares

Claes M. Cassel, Peter Hackl, Anders Westlund

The customer asset is an important intangible. Its value depends, for example, on the customer satisfaction level. Thus, it is important to monitor that level, and to identify c...

2000 Total Quality Management 210 citations

Regression methods for high dimensional multicollinear data

Lorna Aucott, Paul H. Garthwaite, James Currall

To compare their performance on high dimensional data, several regression methods are applied to data sets in which the number of exploratory variables greatly exceeds the sampl...

2000 Communications in Statistics - Simula... 13 citations

Publication Info

Year: 1993
Type: article
Volume: 7
Issue: 4
Pages: 291-304
Citations: 124
Access: Closed

Cite This

APA Style

Ian Wakeling, Jeff Morris (1993). A test of significance for partial least squares regression. Journal of Chemometrics, 7(4), 291-304. https://doi.org/10.1002/cem.1180070407

Identifiers

DOI: 10.1002/cem.1180070407