Abstract
Working in the context of the linear model y = Xβ + ε, we generalize the concept of variance inflation as a measure of collinearity to a subset of the parameters in β (denoted by β₁, with the associated columns of X given by X₁). The essential idea underlying this generalization is to examine the impact on the precision of estimation, in particular the size of an ellipsoidal joint confidence region for β₁, of less-than-optimal selection of other columns of the design matrix (X₂), treating still other columns (X₀) as unalterable, even hypothetically. In typical applications, X₁ contains a set of dummy regressors coding categories of a qualitative variable or a set of polynomial regressors in a quantitative variable; X₂ contains all other regressors in the model, save the constant, which is in X₀. If σ²V denotes the realized variance of β̂₁, and σ²U is the variance associated with an optimal selection of X₂, then the corresponding scaled dispersion ellipsoids to be compared are ℰ_V = {x : x′V⁻¹x ≤ 1} and ℰ_U = {x : x′U⁻¹x ≤ 1}, where ℰ_U is contained in ℰ_V. The two ellipsoids can be compared by considering the radii of ℰ_V relative to ℰ_U, obtained through the spectral decomposition of V relative to U. We proceed to explore the geometry of generalized variance inflation, to show the relationship of these measures to correlation-matrix determinants and canonical correlations, to consider X matrices structured by relations of marginality among regressor subspaces, to develop the relationship of generalized variance inflation to hypothesis tests in the multivariate normal linear model, and to present several examples.
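The relationship to correlation-matrix determinants mentioned above can be sketched numerically. The following is a minimal illustration (not the authors' code) of the determinant form GVIF = det(R₁₁)·det(R₂₂)/det(R), where R is the sample correlation matrix of all regressors with the constant excluded, R₁₁ is its block for the X₁ columns, and R₂₂ its block for the remaining columns; the function name `gvif` and the example data are hypothetical, chosen only for illustration.

```python
import numpy as np

def gvif(X, cols1):
    """Generalized variance-inflation factor for the columns of X indexed
    by cols1 (the X1 block), treating all remaining columns as X2.
    Computed via the correlation-determinant form
        GVIF = det(R11) * det(R22) / det(R),
    where R is the correlation matrix of the regressors (constant excluded).
    For a single column this reduces to the ordinary VIF.
    """
    R = np.corrcoef(X, rowvar=False)
    cols1 = np.asarray(cols1)
    cols2 = np.setdiff1d(np.arange(X.shape[1]), cols1)
    R11 = R[np.ix_(cols1, cols1)]
    R22 = R[np.ix_(cols2, cols2)]
    return np.linalg.det(R11) * np.linalg.det(R22) / np.linalg.det(R)

# Orthogonal design: X1 is uncorrelated with X2, so GVIF = 1 exactly.
X_orth = np.array([[ 1.,  1.,  1.],
                   [ 1., -1., -1.],
                   [-1.,  1., -1.],
                   [-1., -1.,  1.]])

# Collinear design: the third regressor is nearly the sum of the first two,
# so its GVIF (and that of the {0, 1} block) exceeds 1.
X_corr = np.array([[1., 1.,  2.1],
                   [2., 1.,  2.9],
                   [3., 2.,  5.2],
                   [4., 3.,  6.8],
                   [5., 5., 10.1]])
```

GVIF is always at least 1, with equality exactly when the X₁ block is uncorrelated with X₂ in the sample, which is the sense in which it measures the cost of a less-than-optimal X₂.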
Publication Info
- Year: 1992
- Type: article
- Volume: 87
- Issue: 417
- Pages: 178-183
- Citations: 1512
- Access: Closed
Identifiers
- DOI: 10.1080/01621459.1992.10475190