Abstract

The problem of selecting the best subset or subsets of independent variables in a multiple linear regression analysis is two-fold. The first, and most important, problem is the development of a criterion for choosing between two contending subsets. Applying such a criterion to all possible subsets may not be economically feasible if the number of independent variables is large, and so the second problem is concerned with decreasing the computational effort. This paper is concerned with the second question, using the C_p statistic of Mallows as the basic criterion for comparing two regressions. A procedure is developed which will indicate 'good' regressions with a minimum of computation.
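For orientation, the sketch below shows the exhaustive baseline that the abstract describes as potentially infeasible: computing Mallows' C_p for every subset of predictors. It is not the procedure developed in the paper, which aims to avoid this full enumeration. It assumes the usual definition C_p = RSS_p / s^2 - n + 2p, where s^2 is the residual mean square of the full model and p counts the intercept. The function names and the simulated data are hypothetical, chosen only for illustration.

```python
from itertools import combinations

import numpy as np


def residual_sum_of_squares(design, y):
    """Fit ordinary least squares and return the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def mallows_cp_all_subsets(X, y):
    """Return {subset of column indices: C_p} for all 2**k subsets.

    Every model includes an intercept. s2 is estimated from the full
    model, which is the usual convention (an assumption of this sketch).
    """
    n, k = X.shape
    intercept = np.ones((n, 1))
    full_design = np.hstack([intercept, X])
    s2 = residual_sum_of_squares(full_design, y) / (n - k - 1)

    results = {}
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            cols = np.array(subset, dtype=int)
            design = np.hstack([intercept, X[:, cols]])
            p = size + 1  # number of fitted parameters, intercept included
            rss = residual_sum_of_squares(design, y)
            results[subset] = rss / s2 - n + 2 * p
    return results


# Simulated data (hypothetical): only columns 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=50)

cp = mallows_cp_all_subsets(X, y)
for subset, value in sorted(cp.items(), key=lambda kv: kv[1])[:3]:
    print(subset, round(value, 2))
```

The cost grows as 2^k fits for k candidate variables, which is the computational burden the abstract refers to. Subsets with C_p close to p are conventionally regarded as having little bias.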

Keywords

Selection (genetic algorithm), Statistics, Regression analysis, Regression, Mathematics, Computer science, Artificial intelligence

Related Publications

Subset Selection in Regression

OBJECTIVES Prediction, Explanation, Elimination or What? How Many Variables in the Prediction Formula? Alternatives to Using Subsets 'Black Box' Use of Best-Subsets Techniques L...

2003 Technometrics 1482 citations

Publication Info

Year: 1967
Type: article
Volume: 9
Issue: 4
Pages: 531-540
Citations: 280
Access: Closed

Citation Metrics

280
OpenAlex

Cite This

R. R. Hocking, R. N. Leslie (1967). Selection of the Best Subset in Regression Analysis. Technometrics, 9(4), 531-540. https://doi.org/10.1080/00401706.1967.10490502

Identifiers

DOI
10.1080/00401706.1967.10490502