Abstract
Multiple imputation (MI) is now well established as a flexible, general method for the analysis of data sets with missing values. Most implementations assume the missing data are 'missing at random' (MAR), that is, given the observed data, the reason for the missing data does not depend on the unseen data. Although this is a helpful and simplifying working assumption, it is unlikely to be true in practice. Assessing the sensitivity of the analysis to the MAR assumption is therefore important, yet there is very limited MI software for this. Further, analysis of a data set with missing values that are not missing at random (NMAR) is complicated by the need to extend the MAR imputation model to include a model for the reason for dropout. Here, we propose a simple alternative. We first impute under MAR and obtain parameter estimates for each imputed data set. The overall NMAR parameter estimate is a weighted average of these parameter estimates, where the weights depend on the assumed degree of departure from MAR. In some settings, this approach gives results that closely agree with joint modelling as the number of imputations increases. In others, it provides ball-park estimates of the results of full NMAR modelling, indicating the extent to which full modelling is necessary and providing a check on its results. We illustrate our approach with a small simulation study and with the analysis of data from a trial of interventions to improve the quality of peer review.
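The reweighting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the form of the weights, so the exponential form used here (log-weight equal to minus `delta` times the sum of the imputed values) is an assumption, as are the function name `nmar_weighted_estimate`, its arguments, and the sign convention for `delta`. Setting `delta = 0` gives equal weights and so recovers the ordinary MAR average of the per-imputation estimates.

```python
import math


def nmar_weighted_estimate(estimates, imputed_missing_values, delta):
    """Weighted average of MAR-based estimates (illustrative sketch only).

    estimates: one parameter estimate per data set imputed under MAR.
    imputed_missing_values: for each imputed data set, the values that
        were filled in for the missing observations.
    delta: assumed degree of departure from MAR (hypothetical parameter;
        delta = 0 corresponds to MAR).
    """
    # ASSUMPTION: log-weight is -delta * (sum of imputed values).
    log_w = [-delta * sum(vals) for vals in imputed_missing_values]

    # Subtract the maximum before exponentiating, for numerical stability.
    shift = max(log_w)
    raw = [math.exp(lw - shift) for lw in log_w]

    # Normalise so the weights sum to one.
    total = sum(raw)
    weights = [r / total for r in raw]

    estimate = sum(w * e for w, e in zip(weights, estimates))
    return estimate, weights


# Toy inputs (made up for illustration): three imputed data sets.
toy_estimates = [1.0, 2.0, 3.0]
toy_imputed = [[0.5], [1.0], [1.5]]

mar_estimate, mar_weights = nmar_weighted_estimate(toy_estimates, toy_imputed, 0.0)
nmar_estimate, nmar_weights = nmar_weighted_estimate(toy_estimates, toy_imputed, 1.0)
```

Under this assumed form, a positive `delta` shifts weight towards imputations with smaller imputed values, so in the toy inputs the reweighted estimate falls below the equal-weight average. Repeating the calculation over a range of `delta` values gives a simple sensitivity analysis without re-imputing.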
Related Publications
Population‐calibrated multiple imputation for a binary/categorical covariate in categorical regression models
Multiple imputation (MI) has become popular for analyses with missing data in medical research. The standard implementation of MI is based on the assumption of data being missin...
Multiple imputation: review of theory, implementation and software
Abstract Missing data is a common complication in data analysis. In many medical settings missing data can cause difficulties in estimation, precision and inference. Multiple im...
Much Ado About Nothing
Missing data are a recurring problem that can cause bias or lead to inefficient analyses. Development of statistical methods to address missingness have been actively pursued in...
A comparison of inclusive and restrictive strategies in modern missing data procedures.
Two classes of modern missing data procedures, maximum likelihood (ML) and multiple imputation (MI), tend to yield similar results when implemented in comparable ways. In either...
Inference and missing data
When making sampling distribution inferences about the parameter of the data, θ, it is appropriate to ignore the process that causes missing data if the missing data are 'missin...
Publication Info
- Year: 2007
- Type: article
- Volume: 16
- Issue: 3
- Pages: 259-275
- Citations: 234
- Access: Closed
Identifiers
- DOI: 10.1177/0962280206075303