Abstract

The usual approach to handling missing data in a regression is to assume that the points are missing at random (MAR) and use either a fill-in method to replace the missing points or a method using maximally available pairs in the sample covariance matrix. We derive limits for the values of the least squares estimates of the coefficients (and their associated t statistics) when there are missing observations in one carrier. These limits are derived subject to a constraint on the relationship of the missing data to the present data. Calculating these limits while varying this constrained value results in a series of diagnostic plots that can be used to study the potential effect of the missing points on the regression (without assuming that the points are MAR). Simulations are performed to illustrate the use of the plots, and two real data sets are analyzed. The more general case of missing data in more than one carrier is also discussed.
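
The idea of tracing how far a coefficient can move as a constraint on the missing values is relaxed can be illustrated with a rough sketch. The code below is not the authors' derivation: the abstract does not state the constraint or the closed-form limits, so this example assumes a hypothetical constraint (each missing carrier value lies within `spread` observed standard deviations of the observed mean) and approximates the range of the slope by random fill-ins rather than by exact limits. The names `ols_slope`, `slope_range`, and `spread` are illustrative only.

```python
import random
import statistics


def ols_slope(x, y):
    """Least squares slope of y on x in a simple regression with intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_range(x, y, spread, n_draws=500, seed=0):
    """Approximate min and max slope over fill-ins of the missing x values.

    Missing entries of x are marked with None.
    ASSUMPTION: each missing x is drawn uniformly from
    mean +/- spread * sd of the observed x values. This is a hypothetical
    stand-in for the paper's constraint, which the abstract does not give.
    """
    rng = random.Random(seed)
    observed = [v for v in x if v is not None]
    center = statistics.fmean(observed)
    scale = statistics.stdev(observed)
    slopes = []
    for _ in range(n_draws):
        filled = [
            v if v is not None else center + spread * scale * rng.uniform(-1, 1)
            for v in x
        ]
        slopes.append(ols_slope(filled, y))
    return min(slopes), max(slopes)


# Synthetic data: y = 1 + 2x + noise, with the first five x values missing.
data_rng = random.Random(1)
x_full = [data_rng.gauss(0, 1) for _ in range(40)]
y = [1 + 2 * v + data_rng.gauss(0, 0.5) for v in x_full]
x = [None] * 5 + x_full[5:]

# Sweep the constraint value; each row is one point on a diagnostic curve.
for spread in (0.0, 0.5, 1.0, 2.0):
    lo, hi = slope_range(x, y, spread)
    print(f"spread={spread:.1f}  slope in [{lo:.3f}, {hi:.3f}]")
```

Plotting the lower and upper values against `spread` gives a curve in the spirit of the diagnostic plots described above: a slope range that stays narrow as the constraint is relaxed suggests the missing points have little potential influence, while a rapidly widening range suggests the opposite.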

Keywords

Missing data, Statistics, Mathematics, Regression analysis, Regression, Least-squares function approximation, Regression diagnostic, Partial least squares regression, Covariance matrix, Covariance, Polynomial regression, Estimator

Publication Info

Year: 1986
Type: article
Volume: 81
Issue: 394
Pages: 501-509
Citations: 24
Access: Closed

Cite This

Gary Simon, Jeffrey S. Simonoff (1986). Diagnostic Plots for Missing Data in Least Squares Regression. Journal of the American Statistical Association, 81(394), 501-509. https://doi.org/10.1080/01621459.1986.10478296

Identifiers

DOI
10.1080/01621459.1986.10478296