Linear Model Selection by Cross-Validation

Jun Shao
1993 · Journal of the American Statistical Association · 324 citations

Abstract

We consider the problem of selecting a model having the best predictive ability among a class of linear models. The popular leave-one-out cross-validation method, which is asymptotically equivalent to many other model selection methods such as the Akaike information criterion (AIC), the C_p, and the bootstrap, is asymptotically inconsistent in the sense that the probability of selecting the model with the best predictive ability does not converge to 1 as the total number of observations n → ∞. We show that the inconsistency of the leave-one-out cross-validation can be rectified by using a leave-n_v-out cross-validation with n_v, the number of observations reserved for validation, satisfying n_v/n → 1 as n → ∞. This is a somewhat shocking discovery, because n_v/n → 1 is totally opposite to the popular leave-one-out recipe in cross-validation. Motivations, justifications, and discussions of some practical aspects of the use of the leave-n_v-out cross-validation method are provided, and results from a simulation study are presented.
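The leave-n_v-out prescription in the abstract can be illustrated with a small Monte Carlo sketch: repeatedly hold out n_v observations chosen so that n_v/n is large, fit each candidate linear model on the remaining n_c = n - n_v points, and select the model with the smallest average squared prediction error on the held-out sets. The helper mccv_score, the choice n_c ≈ n^(3/4), and the simulated data below are illustrative assumptions for this sketch, not the paper's exact algorithm or simulation design.

```python
import itertools
import numpy as np

def mccv_score(X, y, cols, n_v, n_splits=200, seed=0):
    """Average held-out squared prediction error over random
    leave-n_v-out splits, for the submodel using columns `cols`."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errs = []
    for _ in range(n_splits):
        val = rng.choice(n, size=n_v, replace=False)   # validation set
        train = np.setdiff1d(np.arange(n), val)        # construction set
        beta, *_ = np.linalg.lstsq(X[np.ix_(train, cols)], y[train], rcond=None)
        errs.append(np.mean((y[val] - X[np.ix_(val, cols)] @ beta) ** 2))
    return np.mean(errs)

# Simulated data (illustrative): 5 candidate predictors, only the
# first two enter the true mean; column 0 is the intercept.
rng = np.random.default_rng(1)
n, p = 200, 5
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = 2.0 + 1.5 * X[:, 1] - 1.0 * X[:, 2] + rng.normal(size=n)

n_c = int(round(n ** 0.75))   # construction-set size (assumed choice)
n_v = n - n_c                 # validation size, so n_v / n is close to 1

# All submodels that contain the intercept.
candidates = [(0,) + c
              for k in range(1, p + 1)
              for c in itertools.combinations(range(1, p + 1), k)]
best = min(candidates, key=lambda cols: mccv_score(X, y, list(cols), n_v))
print("selected columns:", best)   # ideally (0, 1, 2)
```

The point of letting n_v grow with n, rather than using n_v = 1, is that a large validation set penalizes overfitted candidate models much more heavily, which is the intuition behind the consistency result described in the abstract.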

Keywords

Selection (genetic algorithm), Cross-validation, Model selection, Linear model, Statistics, Computer science, Mathematics, Econometrics, Artificial intelligence

Publication Info

Year: 1993
Type: Article
Volume: 88
Issue: 422
Pages: 486-494
Citations: 324
Access: Closed

Citation Metrics

Citations (OpenAlex): 324

Cite This

Jun Shao (1993). Linear Model Selection by Cross-Validation. Journal of the American Statistical Association, 88(422), 486-494. https://doi.org/10.2307/2290328

Identifiers

DOI: 10.2307/2290328