Abstract

We define a statistic, called the matching statistic, for locating regions of the genome that exhibit excess similarity among cases when compared to controls. Such regions are reasonable candidates for harboring disease genes. We find the asymptotic distribution of the statistic while accounting for correlations among sampled individuals. We then use the Benjamini and Hochberg false discovery rate (FDR) method for multiple hypothesis testing to find regions of excess sharing. The p-values for each region involve estimated nuisance parameters. Under appropriate conditions, we show that the FDR method based on p-values and with estimated nuisance parameters asymptotically preserves the FDR property. Finally, we apply the method to a pilot study on schizophrenia.
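
For readers unfamiliar with the Benjamini and Hochberg method mentioned in the abstract, the following is a minimal sketch of the standard step-up procedure applied to a list of per-region p-values. It is not the authors' implementation: it assumes the p-values are already computed, and it does not cover the matching statistic or the estimation of nuisance parameters. The function name `benjamini_hochberg` and the parameter `alpha` are illustrative choices, not taken from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard Benjamini-Hochberg step-up procedure (illustrative sketch).

    Returns a list of booleans, one per input p-value, where True marks
    a rejected null hypothesis at target FDR level `alpha`.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank

    # Reject the hypotheses with the k smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per genomic region (made up for illustration).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, alpha=0.05)
```

Because the procedure is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value meets a later threshold, which is why the loop records the largest passing rank rather than stopping at the first failure.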

Keywords

False discovery rate; Statistic; Scan statistic; Outlier; Multiple comparisons problem; Matching (statistics); Mathematics; Statistics; Nuisance parameter; Test statistic; Computational biology; Computer science; Statistical hypothesis testing; Biology; Genetics; Gene

Publication Info

Year: 2003
Type: article
Volume: 98
Issue: 461
Pages: 236-246
Citations: 33
Access: Closed

Cite This

Jung‐Ying Tzeng, William Byerley, Bernie Devlin et al. (2003). Outlier Detection and False Discovery Rates for Whole-Genome DNA Matching. Journal of the American Statistical Association, 98(461), 236-246. https://doi.org/10.1198/016214503388619256

Identifiers

DOI: 10.1198/016214503388619256