Abstract

This paper studies summary measures of the predictive power of a generalized linear model, paying special attention to a generalization of the multiple correlation coefficient from ordinary linear regression. The population value is the correlation between the response and its conditional expectation given the predictors, and the sample value is the correlation between the observed response and the model predicted value. We compare four estimators of the measure in terms of bias, mean squared error and behaviour in the presence of overparameterization. The sample estimator and a jack-knife estimator usually behave adequately, but a cross-validation estimator has a large negative bias with large mean squared error. One can use bootstrap methods to construct confidence intervals for the population value of the correlation measure and to estimate the degree to which a model selection procedure may provide an overly optimistic measure of the actual predictive power.
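
As a rough illustration of the quantities named in the abstract, the sketch below computes the sample correlation between the observed response and the model-predicted value for a logistic regression, a leave-one-out cross-validated version, and a bootstrap confidence interval. Everything here is an assumption made for illustration: the simulated data, the helper names (`fit_logistic`, `sample_correlation`, `loo_correlation`, `bootstrap_ci`), the choice of a logistic model, and the percentile bootstrap are not taken from the paper, whose exact estimators and interval construction may differ.

```python
import numpy as np

def predict(X, beta):
    """Fitted means of a logistic model (linear predictor clipped for numerical safety)."""
    return 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30.0, 30.0)))

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Newton-Raphson (IRLS) fit of a logistic regression; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = predict(X, beta)
        w = mu * (1.0 - mu)
        # Tiny ridge term (an illustrative safeguard) keeps the Hessian invertible.
        hessian = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - mu))
    return beta

def corr(a, b):
    """Pearson correlation between two vectors."""
    return float(np.corrcoef(a, b)[0, 1])

def sample_correlation(X, y):
    """Correlation between the observed response and the in-sample fitted values."""
    return corr(y, predict(X, fit_logistic(X, y)))

def loo_correlation(X, y):
    """Correlation between the response and leave-one-out predictions."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta = fit_logistic(X[keep], y[keep])
        preds[i] = predict(X[i:i + 1], beta)[0]
    return corr(y, preds)

def bootstrap_ci(X, y, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap interval for the sample correlation (one possible construction)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        stats.append(sample_correlation(X[idx], y[idx]))
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(stats, [alpha, 1.0 - alpha])
    return float(lo), float(hi)

# Hypothetical simulated data: one predictor, binary response.
rng = np.random.default_rng(42)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))).astype(float)

r_sample = sample_correlation(X, y)
r_loo = loo_correlation(X, y)
ci_low, ci_high = bootstrap_ci(X, y)
print(f"sample r = {r_sample:.3f}, leave-one-out r = {r_loo:.3f}")
print(f"95% percentile bootstrap CI: ({ci_low:.3f}, {ci_high:.3f})")
```

Comparing `r_sample` with `r_loo` on a given data set gives a feel for how the in-sample and cross-validated versions of the measure can differ; the sketch does not attempt to reproduce the bias or mean squared error comparisons reported in the paper.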

Keywords

Estimator, Statistics, Mean squared error, Mathematics, Linear model, Linear regression, Predictive power, Confidence interval, Measure (data warehouse), Generalization, Generalized linear model, Population, Sample size determination, Econometrics, Computer science

Publication Info

Year: 2000
Type: article
Volume: 19
Issue: 13
Pages: 1771-1781
Citations: 272
Access: Closed

Cite This

Beiyao Zheng, Alan Agresti (2000). Summarizing the predictive power of a generalized linear model. Statistics in Medicine, 19(13), 1771-1781. https://doi.org/10.1002/1097-0258(20000715)19:13<1771::aid-sim485>3.0.co;2-p

Identifiers

DOI: 10.1002/1097-0258(20000715)19:13<1771::aid-sim485>3.0.co;2-p