Abstract
We address the problem of smoothing parameter selection for nonparametric curve estimators in the specific context of kernel regression estimation. Call the "optimal bandwidth" the minimizer of the average squared error. We consider several automatically selected bandwidths that approximate the optimum. How far are the automatically selected bandwidths from the optimum? The answer is studied theoretically and through simulations. The theoretical results include a central limit theorem that quantifies the convergence rate and gives the difference's asymptotic distribution. The convergence rate turns out to be excruciatingly slow. This is not too disappointing, because this rate is of the same order as the convergence rate of the difference between the minimizers of the average squared error and the mean average squared error. In some simulations by John Rice, the selectors considered here performed quite differently from each other. We anticipated that these differences would be reflected in different asymptotic distributions for the various selectors. It is surprising that all of the selectors have the same limiting normal distribution. To provide insight into the gap between our theoretical results and these simulations, we did a further Monte Carlo study. Our simulations support the theoretical results and suggest that the differences observed by Rice were principally due to the choice of a very small error standard deviation and the choice of error criterion. In the example considered here, the asymptotic normality result describes the empirical distribution of the automatically chosen bandwidths quite well, even for small samples.
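As a concrete illustration of the setting, the following is a minimal sketch (not taken from the paper) of one common automatic bandwidth selector: leave-one-out cross-validation for a Nadaraya-Watson kernel regression estimator on a fixed design. The function names, the Gaussian kernel, and the bandwidth grid are illustrative assumptions, not the paper's specific selectors.

```python
import numpy as np

def cv_score(t, y, h):
    """Leave-one-out cross-validation score for bandwidth h
    under a Nadaraya-Watson estimator with a Gaussian kernel."""
    # Kernel weight matrix w[i, j] = K((t_i - t_j) / h)
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)  # leave-one-out: drop each point's own weight
    fit = (w @ y) / w.sum(axis=1)
    return np.mean((y - fit) ** 2)

# Fixed design on [0, 1] with i.i.d. Gaussian noise, matching the
# regression model Y_i = g(t_i) + eps_i (g and sigma chosen for illustration).
rng = np.random.default_rng(0)
n = 200
t = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(n)

# Grid-search the CV criterion; its minimizer is an automatically
# selected bandwidth approximating the ASE-optimal one.
grid = np.linspace(0.01, 0.2, 50)
h_cv = grid[np.argmin([cv_score(t, y, h) for h in grid])]
```

The paper's slow-convergence result concerns how far a data-driven choice such as `h_cv` sits from the bandwidth that actually minimizes the average squared error for the realized sample.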
Related Publications
Variable Bandwidth Kernel Estimators of Regression Curves
In the model $Y_i = g(t_i) + \varepsilon_i,\quad i = 1,\cdots, n,$ where $Y_i$ are given observations, $\varepsilon_i$ i.i.d. noise variables and $t_i$ nonrandom design poin...
Automatic Lag Selection in Covariance Matrix Estimation
We propose a nonparametric method for automatically selecting the number of autocovariances to use in computing a heteroskedasticity and autocorrelation consistent covariance ma...
On the Smoothing of Probability Density Functions
Summary We consider the estimation of a probability density function by linear smoothing of the observed density. A basis for estimation is obtained by assuming that the ordinat...
Goodness-of-Fit Testing for Latent Class Models
Latent class models with sparse contingency tables can present problems for model comparison and selection, because under these conditions the distributions of goodness-of-fit i...
Smoothing Parameter Selection in Nonparametric Regression Using an Improved Akaike Information Criterion
Summary Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoo...
Publication Info
- Year
- 1988
- Type
- article
- Volume
- 83
- Issue
- 401
- Pages
- 86-95
- Citations
- 361
- Access
- Closed
Identifiers
- DOI
- 10.1080/01621459.1988.10478568