Abstract

We develop a general framework for performing large-scale significance testing in the presence of arbitrarily strong dependence. We derive a low-dimensional set of random vectors, called a dependence kernel, that fully captures the dependence structure in an observed high-dimensional dataset. This result shows a surprising reversal of the “curse of dimensionality” in the high-dimensional hypothesis testing setting. We show theoretically that conditioning on a dependence kernel is sufficient to render statistical tests independent regardless of the level of dependence in the observed data. This framework for multiple testing dependence has implications in a variety of common multiple testing problems, such as in gene expression studies, brain imaging, and spatial epidemiology.
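As a rough illustration of the conditioning claim, the Python sketch below simulates many variables that share a few latent factors, then compares the average pairwise correlation before and after regressing those factors out. It is not the authors' estimation procedure. The additive model, the dimensions, and every name here (`simulate`, `residualize`, `mean_abs_offdiag_corr`, `G`, `Gamma`) are assumptions made for the example, and the kernel `G` is treated as known rather than estimated.

```python
import numpy as np

def simulate(m=200, n=50, r=2, seed=0):
    """Simulate m dependent variables over n samples (assumed toy model)."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((r, n))              # hypothetical low-dimensional kernel, r x n
    Gamma = 2.0 * rng.standard_normal((m, r))    # per-variable loadings on the kernel
    U = rng.standard_normal((m, n))              # independent noise across variables
    X = Gamma @ G + U                            # rows of X are strongly dependent through G
    return X, G

def mean_abs_offdiag_corr(X):
    """Mean absolute correlation between distinct rows of X."""
    C = np.corrcoef(X)                           # rows are treated as variables
    off_diagonal = C[~np.eye(C.shape[0], dtype=bool)]
    return float(np.mean(np.abs(off_diagonal)))

def residualize(X, G):
    """Remove the part of each row of X explained by G (least squares with intercept)."""
    n = X.shape[1]
    D = np.column_stack([np.ones(n), G.T])       # n x (r + 1) design matrix
    coef, *_ = np.linalg.lstsq(D, X.T, rcond=None)
    return X - (D @ coef).T

X, G = simulate()
before = mean_abs_offdiag_corr(X)
after = mean_abs_offdiag_corr(residualize(X, G))
print(f"mean |corr| before conditioning: {before:.3f}")
print(f"mean |corr| after conditioning:  {after:.3f}")
```

In this toy setting the remaining correlation after conditioning is only what sampling noise produces with a finite number of samples. In practice the kernel is not observed and would have to be estimated from the data.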

Keywords

Curse of dimensionality; Statistical hypothesis testing; Kernel (algebra); Multiple comparisons problem; Set (abstract data type); Computer science; Statistical physics; Significance testing; Mathematics; Statistics; Machine learning; Physics; Combinatorics

Related Publications

A Direct Approach to False Discovery Rates

Summary: Multiple-hypothesis testing involves guarding against much more complicated errors than single-hypothesis testing. Whereas we typically control the type I error rate for...

2002 Journal of the Royal Statistical Soci... 5607 citations

Publication Info

Year: 2008
Type: article
Volume: 105
Issue: 48
Pages: 18718-18723
Citations: 381
Access: Closed

Citation Metrics

381 citations (OpenAlex)

Cite This

Jeffrey T. Leek, John D. Storey (2008). A general framework for multiple testing dependence. Proceedings of the National Academy of Sciences, 105(48), 18718-18723. https://doi.org/10.1073/pnas.0808709105

Identifiers

DOI: 10.1073/pnas.0808709105