Abstract
Multiple regression models are widely applicable to predicting the outcome of patients with a variety of diseases. However, many researchers use such models without checking the necessary assumptions. All too frequently, researchers also "overfit" the data by developing models with too many predictor variables for the available sample size. Models developed in this way are unlikely to stand the test of validation on a separate patient sample, and without attempting such a validation the researcher remains unaware that overfitting has occurred. When the ratio of the number of patients suffering endpoints to the number of potential predictors is small (say, less than 10), data reduction methods are available that can greatly improve the performance of regression models. Regression models can make more accurate predictions than other methods, such as stratification and recursive partitioning, provided that three conditions hold: model assumptions are thoroughly examined; steps are taken (i.e., choosing another model or transforming the data) when assumptions are violated; and the method of model formulation does not result in overfitting the data.
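The ratio mentioned in the abstract (patients suffering endpoints divided by candidate predictors) can be checked before any model is fitted. The following is a minimal sketch, not the authors' procedure: the function names, the example counts, and the treatment of the threshold of 10 as a strict cutoff are all assumptions made for illustration.

```python
def events_per_predictor(n_events: int, n_predictors: int) -> float:
    """Return the number of endpoint events per candidate predictor."""
    if n_events < 0:
        raise ValueError("n_events must be non-negative")
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


def needs_data_reduction(n_events: int, n_predictors: int,
                         threshold: float = 10.0) -> bool:
    """Flag a study whose ratio falls below the threshold.

    The default of 10 follows the abstract's "say, less than 10";
    treating it as a hard cutoff is an assumption of this sketch.
    """
    return events_per_predictor(n_events, n_predictors) < threshold


# Hypothetical study: 60 patients reached the endpoint and
# 12 candidate predictors are under consideration.
ratio = events_per_predictor(60, 12)
flagged = needs_data_reduction(60, 12)
print(f"events per predictor: {ratio:.1f}, data reduction advised: {flagged}")
```

A flagged study is one where reducing the number of candidate predictors, or validating the model on a separate patient sample, deserves particular attention.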
Publication Info
- Year: 1985
- Type: article
- Volume: 69
- Issue: 10
- Pages: 1071-77
- Citations: 566
- Access: Closed