Abstract

A common mistake in analysis of cluster randomized trials is to ignore the effect of clustering and analyze the data as if each treatment group were a simple random sample. This typically leads to an overstatement of the precision of results and anticonservative conclusions about precision and statistical significance of treatment effects. This article gives a simple correction to the t statistic that would be computed if clustering were (incorrectly) ignored. The correction is a multiplicative factor depending on the total sample size, the cluster size, and the intraclass correlation ρ. The corrected t statistic has Student’s t distribution with reduced degrees of freedom. The corrected statistic reduces to the t statistic computed by ignoring clustering when ρ = 0. It reduces to the t statistic computed using cluster means when ρ = 1. If 0 < ρ < 1, it lies between these two, and the degrees of freedom are in between those corresponding to these two extremes.
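The abstract describes the correction without stating it. The sketch below illustrates the idea for the balanced case of equal cluster sizes, using the form in which the result is usually quoted: a multiplier c = sqrt(((N - 2) - 2(n - 1)ρ) / ((N - 2)(1 + (n - 1)ρ))) applied to the naive t statistic, with approximate degrees of freedom h = ((N - 2) - 2(n - 1)ρ)^2 / ((N - 2)(1 - ρ)^2 + n(N - 2n)ρ^2 + 2(N - 2n)ρ(1 - ρ)). These expressions do not appear in the abstract and should be checked against the article before use. The function and argument names are hypothetical.

```python
import math


def clustering_correction(total_n, cluster_size, rho):
    """Return (c, h) for a naive two-group t statistic.

    c is the multiplier applied to the t statistic computed as if
    clustering were absent; h is the approximate degrees of freedom.
    Assumes equal cluster sizes (an assumption of this sketch); the
    formulas are quoted from memory and should be verified against
    the article.
    """
    N, n = total_n, cluster_size
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    if n < 1 or N <= 2 * n:
        raise ValueError("need more than two clusters in total")

    # Shared numerator term: (N - 2) - 2(n - 1) * rho
    shrink = (N - 2) - 2 * (n - 1) * rho

    # Multiplier: 1 at rho = 0, smaller as rho grows.
    c = math.sqrt(shrink / ((N - 2) * (1 + (n - 1) * rho)))

    # Approximate degrees of freedom.
    denom = (
        (N - 2) * (1 - rho) ** 2
        + n * (N - 2 * n) * rho ** 2
        + 2 * (N - 2 * n) * rho * (1 - rho)
    )
    h = shrink ** 2 / denom
    return c, h


def corrected_t(naive_t, total_n, cluster_size, rho):
    """Scale a naive t statistic and return (t_adjusted, df)."""
    c, h = clustering_correction(total_n, cluster_size, rho)
    return c * naive_t, h
```

Under these formulas the limiting cases match the abstract: with ρ = 0 the multiplier is 1 and the degrees of freedom are N - 2, and with ρ = 1 the degrees of freedom equal the number of clusters minus 2, as in a t test on cluster means. The adjusted statistic would then be referred to a t distribution with h degrees of freedom.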

Keywords

Cluster analysis, Statistics, Computer science, Test (biology), Statistical hypothesis testing, Econometrics, Mathematics

Publication Info

Year
2007
Type
article
Volume
32
Issue
2
Pages
151-179
Citations
114
Access
Closed

Citation Metrics

114
OpenAlex

Cite This

Larry V. Hedges (2007). Correcting a Significance Test for Clustering. Journal of Educational and Behavioral Statistics, 32(2), 151-179. https://doi.org/10.3102/1076998606298040

Identifiers

DOI
10.3102/1076998606298040