Abstract

We show that propensity score matching (PSM), an enormously popular method of preprocessing data for causal inference, often accomplishes the opposite of its intended goal—thus increasing imbalance, inefficiency, model dependence, and bias. The weakness of PSM comes from its attempts to approximate a completely randomized experiment, rather than, as with other matching methods, a more efficient fully blocked randomized experiment. PSM is thus uniquely blind to the often large portion of imbalance that can be eliminated by approximating full blocking with other matching methods. Moreover, in data balanced enough to approximate complete randomization, either to begin with or after pruning some observations, PSM approximates random matching which, we show, increases imbalance even relative to the original data. Although these results suggest researchers replace PSM with one of the other available matching methods, propensity scores have other productive uses.
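The abstract's claim that random matching increases imbalance can be illustrated with a small simulation. The sketch below is not the authors' code or their simulation design. It assumes a single standard-normal covariate, a completely randomized treatment, and absolute difference in means as the imbalance measure; the sample sizes and the function names `imbalance` and `simulate` are hypothetical choices for illustration. It compares imbalance in the full data with imbalance after randomly pruning to a smaller matched set.

```python
import random
from statistics import mean


def imbalance(x, t):
    """Absolute difference in covariate means between treated and control units."""
    treated = [xi for xi, ti in zip(x, t) if ti == 1]
    control = [xi for xi, ti in zip(x, t) if ti == 0]
    return abs(mean(treated) - mean(control))


def simulate(n=200, keep=25, reps=500, seed=0):
    """Average imbalance before and after random pruning (illustrative assumptions)."""
    rng = random.Random(seed)
    full, pruned = [], []
    for _ in range(reps):
        # One covariate, unrelated to treatment: a completely randomized experiment.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        t = [0, 1] * (n // 2)
        rng.shuffle(t)
        full.append(imbalance(x, t))

        # "Random matching": keep a random subset of treated and of control units.
        treated_idx = [i for i in range(n) if t[i] == 1]
        control_idx = [i for i in range(n) if t[i] == 0]
        kept = rng.sample(treated_idx, keep) + rng.sample(control_idx, keep)
        pruned.append(imbalance([x[i] for i in kept], [t[i] for i in kept]))
    return mean(full), mean(pruned)


full_imbalance, pruned_imbalance = simulate()
print(f"mean imbalance, full data:     {full_imbalance:.3f}")
print(f"mean imbalance, random pruning: {pruned_imbalance:.3f}")
```

Under these assumptions, the pruned sample should show larger average imbalance than the full data, because discarding observations at random only adds sampling noise to the difference in means.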

Keywords

Propensity score matching, Causal inference, Matching (statistics), Preprocessor, Inefficiency, Computer science, Pruning, Randomization, Inference, Average treatment effect, Econometrics, Statistics, Artificial intelligence, Mathematics, Randomized controlled trial

Publication Info

Year: 2019
Type: article
Volume: 27
Issue: 4
Pages: 435-454
Citations: 1505
Access: Closed

Citation Metrics

OpenAlex: 1505
Influential: 52
CrossRef: 1190

Cite This

Gary King, Richard A. Nielsen (2019). Why Propensity Scores Should Not Be Used for Matching. Political Analysis, 27(4), 435-454. https://doi.org/10.1017/pan.2019.11

Identifiers

DOI: 10.1017/pan.2019.11
