Abstract

Principal components analysis (PCA) is a classic method for the reduction of dimensionality of data in the form of n observations (or cases) of a vector with p variables. Contemporary datasets often have p comparable with or even much larger than n. Our main assertions, in such settings, are (a) that some initial reduction in dimensionality is desirable before applying any PCA-type search for principal modes, and (b) that the initial reduction in dimensionality is best achieved by working in a basis in which the signals have a sparse representation. We describe a simple asymptotic model in which the estimate of the leading principal component vector via standard PCA is consistent if and only if p(n)/n→0. We provide a simple algorithm for selecting a subset of coordinates with largest sample variances, and show that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ⪢ n.
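
The subset-selection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' published algorithm: the abstract does not state how many coordinates to keep, so the parameter `k` below is a hypothetical tuning choice, and the function name `subset_pca_leading_vector` is invented for illustration. The sketch also assumes the data are already expressed in a basis where the signal is sparse. It keeps the `k` coordinates with the largest sample variances, runs ordinary PCA on those coordinates, and pads the resulting leading eigenvector with zeros back to length p.

```python
import numpy as np


def subset_pca_leading_vector(X, k):
    """Leading principal component estimated from a variance-selected subset.

    X : (n, p) array whose rows are observations, assumed to be expressed
        in a basis where the signal is sparse.
    k : number of coordinates to keep (hypothetical tuning parameter;
        the abstract does not specify a selection rule).
    Returns the length-p estimate and the sorted indices that were kept.
    """
    n, p = X.shape

    # Sample variance of every coordinate.
    variances = X.var(axis=0, ddof=1)

    # Indices of the k coordinates with the largest sample variances.
    keep = np.sort(np.argsort(variances)[-k:])

    # Ordinary PCA restricted to the selected coordinates.
    Xc = X[:, keep] - X[:, keep].mean(axis=0)
    S = Xc.T @ Xc / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    v_sub = eigvecs[:, -1]                # eigenvector of the largest eigenvalue

    # Embed back into p dimensions; unselected coordinates are set to zero.
    v = np.zeros(p)
    v[keep] = v_sub
    return v, keep
```

Because the eigenvector is determined only up to sign, any comparison with a reference direction should use the absolute value of the inner product.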

Keywords

Principal component analysis; Dimensionality reduction; Curse of dimensionality; Sparse PCA; Consistency (knowledge bases); Mathematics; Representation (politics); Simple (philosophy); Pattern recognition (psychology); Reduction (mathematics); Statistics; Computer science; Artificial intelligence; Discrete mathematics

Related Publications

Principal component analysis

Abstract: Principal component analysis (PCA) is a multivariate technique that analyzes a data table in which observations are described by several inter‐correlated quantitative d...

2010 Wiley Interdisciplinary Reviews Compu... 9554 citations

Publication Info

Year: 2009
Type: article
Volume: 104
Issue: 486
Pages: 682-693
Citations: 858
Access: Closed

Cite This

Iain M. Johnstone, Arthur Yu Lu (2009). On Consistency and Sparsity for Principal Components Analysis in High Dimensions. Journal of the American Statistical Association, 104(486), 682-693. https://doi.org/10.1198/jasa.2009.0121

Identifiers

DOI: 10.1198/jasa.2009.0121