Abstract
A marker strongly associated with an outcome (or disease) is often assumed to be effective for classifying persons according to their current or future outcome. However, for this assumption to be true, the associated odds ratio must be of a magnitude rarely seen in epidemiologic studies. In this paper, an illustration of the relation between odds ratios and receiver operating characteristic curves shows, for example, that a marker with an odds ratio as high as 3 is in fact a very poor classification tool. If a marker identifies 10% of controls as positive (false positives) and has an odds ratio of 3, then it will correctly identify only 25% of cases as positive (true positives). The authors illustrate that a single measure of association such as an odds ratio does not meaningfully describe a marker's ability to classify subjects. Appropriate statistical methods for assessing and reporting the classification power of a marker are described. In addition, the serious pitfalls of using more traditional methods based on parameters in logistic regression models are illustrated.
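The 25% figure in the abstract can be checked with a few lines of arithmetic. The sketch below assumes a binary marker, for which the odds ratio equals the odds of a positive result among cases divided by the odds of a positive result among controls. The function names are illustrative and do not come from the paper.

```python
def tpr_from_odds_ratio(fpr: float, odds_ratio: float) -> float:
    """True-positive rate implied by a false-positive rate and an odds ratio.

    Assumes a binary marker, so that
    OR = [TPR / (1 - TPR)] / [FPR / (1 - FPR)].
    Solving for TPR gives the expression returned below.
    """
    if not 0.0 < fpr < 1.0:
        raise ValueError("fpr must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    return odds_ratio * fpr / (1.0 - fpr + odds_ratio * fpr)


def odds_ratio_from_rates(tpr: float, fpr: float) -> float:
    """Odds ratio implied by a (TPR, FPR) pair for a binary marker."""
    if not (0.0 < tpr < 1.0 and 0.0 < fpr < 1.0):
        raise ValueError("tpr and fpr must lie strictly between 0 and 1")
    return (tpr / (1.0 - tpr)) / (fpr / (1.0 - fpr))


# The abstract's example: 10% false positives and an odds ratio of 3.
print(f"{tpr_from_odds_ratio(0.10, 3.0):.2f}")  # prints 0.25

# Points along the curve traced by a fixed odds ratio of 3.
for fpr in (0.05, 0.10, 0.20, 0.50):
    print(f"FPR={fpr:.2f}  TPR={tpr_from_odds_ratio(fpr, 3.0):.3f}")
```

Run in the other direction, `odds_ratio_from_rates(0.80, 0.10)` shows that a marker detecting 80% of cases at a 10% false-positive rate corresponds to an odds ratio of about 36, which gives a sense of the magnitude the abstract calls rare.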
Related Publications
ROC Curves for Classification Trees
A common problem in medical diagnosis is to combine information from several tests or patient characteristics into a decision rule to distinguish diseased from healthy patients....
Interpretation of Genetic Association Studies: Markers with Replicated Highly Significant Odds Ratios May Be Poor Classifiers
Recent successful discoveries of potentially causal single nucleotide polymorphisms (SNPs) for complex diseases hold great promise, and commercialization of genomics in personal...
Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach
Methods of evaluating and comparing the performance of diagnostic tests are of increasing importance as new tests are developed and marketed. When a test is based on an observed...
Estimating the Relative Risk in Cohort Studies and Clinical Trials of Common Outcomes
Logistic regression yields an adjusted odds ratio that approximates the adjusted relative risk when disease incidence is rare (<10%), while adjusting for potential confounders. ...
Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction
The c statistic, or area under the receiver operating characteristic (ROC) curve, achieved popularity in diagnostic testing, in which the test characteristics of sensitivity and...
Publication Info
- Year: 2004
- Type: review
- Volume: 159
- Issue: 9
- Pages: 882-890
- Citations: 1227
- Access: Closed
Identifiers
- DOI: 10.1093/aje/kwh101