Abstract
We propose a method for nonparametric density estimation that exhibits robustness to contamination of the training sample. This method achieves robustness by combining a traditional kernel density estimator (KDE) with ideas from classical M-estimation. We interpret the KDE based on a positive semi-definite kernel as a sample mean in the associated reproducing kernel Hilbert space. Since the sample mean is sensitive to outliers, we estimate it robustly via M-estimation, yielding a robust kernel density estimator (RKDE). An RKDE can be computed efficiently via a kernelized iteratively re-weighted least squares (IRWLS) algorithm. Necessary and sufficient conditions are given for kernelized IRWLS to converge to the global minimizer of the M-estimator objective function. The robustness of the RKDE is demonstrated with a representer theorem, the influence function, and experimental results for density estimation and anomaly detection.
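The kernelized IRWLS idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the loss function, kernel, or stopping rule, so the choices below are assumptions: a Gaussian kernel, Huber's loss with a threshold set to the median initial residual, and a simple tolerance on the weight change. The function names (`gaussian_gram`, `rkde_weights`, `rkde_evaluate`) are hypothetical and are not taken from the paper. The estimate is represented as a weighted sum of kernels, and each iteration down-weights points that lie far from the current estimate in the RKHS norm.

```python
import numpy as np


def gaussian_gram(X, Y, bandwidth):
    """Gaussian kernel matrix k(x_i, y_j) for X of shape (n, d), Y of shape (m, d)."""
    d = X.shape[1]
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (-d / 2.0)
    return norm * np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def rkde_weights(K, huber_a=None, max_iter=200, tol=1e-10):
    """Kernelized IRWLS sketch: returns weights w with f = sum_i w_i k(., x_i).

    Assumption: Huber's loss; if huber_a is None it is set to the median
    RKHS residual under the ordinary KDE (uniform weights).
    """
    n = K.shape[0]
    w = np.full(n, 1.0 / n)  # uniform weights give the ordinary KDE
    k_diag = np.diag(K)
    for _ in range(max_iter):
        Kw = K @ w
        # squared RKHS distance ||k(., x_i) - f||^2 for f = sum_j w_j k(., x_j)
        sq_norm = k_diag - 2.0 * Kw + w @ Kw
        r = np.sqrt(np.maximum(sq_norm, 0.0))
        if huber_a is None:
            huber_a = np.median(r)
        # Huber: psi(r) / r equals 1 below the threshold and a / r above it
        phi = np.where(r <= huber_a, 1.0, huber_a / np.maximum(r, 1e-12))
        w_new = phi / phi.sum()
        converged = np.abs(w_new - w).max() < tol
        w = w_new
        if converged:
            break
    return w


def rkde_evaluate(X_train, w, X_query, bandwidth):
    """Evaluate the weighted kernel estimate at the query points."""
    return gaussian_gram(X_query, X_train, bandwidth) @ w
```

Passing uniform weights to `rkde_evaluate` recovers the ordinary KDE, which makes it easy to compare the two estimates on a contaminated sample.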
Related Publications
Review Papers: Recent Developments in Nonparametric Density Estimation
Advances in computation and the fast and cheap computational facilities now available to statisticians have had a significant impact upon statistical research, and espe...
Robust Estimation of a Location Parameter
This paper contains a new approach toward a theory of robust estimation; it treats in detail the asymptotic theory of estimating a location parameter for contaminated normal dis...
Efficient R-Estimation of Principal and Common Principal Components
We propose rank-based estimators of principal components, both in the one-sample and, under the assumption of common principal components, in the m-sample cases. Those estimator...
Variable Bandwidth Kernel Estimators of Regression Curves
In the model $Y_i = g(t_i) + \varepsilon_i,\quad i = 1,\cdots, n,$ where $Y_i$ are given observations, $\varepsilon_i$ i.i.d. noise variables and $t_i$ nonrandom design poin...
Publication Info
- Year: 1992
- Type: article
- Volume: 20
- Issue: 3
- Citations: 872
- Access: Closed
Identifiers
- DOI: 10.1214/aos/1176348768