A Basis for Analyzing Test-Retest Reliability

Louis Guttman

doi:10.1007/bf02288892

Abstract

Three sources of variation in experimental results for a test are distinguished: trials, persons, and items. Unreliability is defined only in terms of variation over trials. This definition leads to a more complete analysis than does the conventional one; Spearman's contention is verified that the conventional approach, which was formulated by Yule, introduces unnecessary hypotheses. It is emphasized that at least two trials are necessary to estimate the reliability coefficient. This paper is devoted largely to developing lower bounds to the reliability coefficient that can be computed from but a single trial; these avoid the experimental difficulties of making two independent trials. Six different lower bounds are established, appropriate for different situations. Some of the bounds are easier to compute than are conventional formulas, and all the bounds assume less than do conventional formulas. The terminology used is that of psychological and sociological testing, but the discussion actually provides a general analysis of the reliability of the sum of n variables.
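As a rough illustration of what a single-trial lower bound looks like in practice, the sketch below computes three of the six bounds from one persons-by-items score table. The abstract does not state the formulas; the forms used here are the ones commonly given in the psychometric literature for Guttman's λ1, λ2, and λ3 (λ3 coincides with Cronbach's coefficient alpha). The function names `covariance_matrix` and `guttman_bounds`, the use of the sample covariance (divisor N - 1), and the toy scores are assumptions made for this example only.

```python
from math import sqrt


def covariance_matrix(scores):
    """Sample covariance matrix of item scores (rows = persons, columns = items)."""
    n_persons = len(scores)
    n_items = len(scores[0])
    means = [sum(row[j] for row in scores) / n_persons for j in range(n_items)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in scores)
            / (n_persons - 1)
            for j in range(n_items)
        ]
        for i in range(n_items)
    ]


def guttman_bounds(scores):
    """Return (lambda1, lambda2, lambda3) for a persons-by-items score table.

    Assumed textbook forms, with n items and total-score variance V:
      lambda1 = 1 - (sum of item variances) / V
      lambda2 = lambda1 + sqrt(n / (n - 1) * C2) / V
      lambda3 = n / (n - 1) * lambda1
    where C2 is the sum of squared off-diagonal covariances (all i != j).
    """
    cov = covariance_matrix(scores)
    n = len(cov)
    total_var = sum(sum(row) for row in cov)      # variance of the summed score
    item_var = sum(cov[i][i] for i in range(n))   # sum of the item variances
    off_sq = sum(cov[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
    lam1 = 1 - item_var / total_var
    lam2 = lam1 + sqrt(n / (n - 1) * off_sq) / total_var
    lam3 = n / (n - 1) * lam1
    return lam1, lam2, lam3


# Hypothetical data: five persons answering three items on a single trial.
scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 4, 4], [3, 3, 4]]
lam1, lam2, lam3 = guttman_bounds(scores)
print(f"lambda1={lam1:.4f}  lambda2={lam2:.4f}  lambda3={lam3:.4f}")
```

Under these formulas λ2 is never smaller than λ3 (this follows from the Cauchy-Schwarz inequality), and λ3 is never smaller than λ1 whenever λ1 is non-negative, so λ2 is the tightest of the three. The remaining bounds (λ4 to λ6) need a split of the items or regression residuals and are not sketched here.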

Keywords

Reliability (semiconductor), Terminology, Variation (astronomy), Mathematics, Test (biology), Basis (linear algebra), Statistics, Computer science, Econometrics

MeSH Terms

Humans, Intelligence Tests, Reproducibility of Results

Affiliated Institutions

Cornell University (US)

Related Publications

Methodological index for non‐randomized studies (MINORS): development and validation of a new instrument

K. Slim, Emile Nini, Damien Forestier, +3 more

Background: Because of specific methodological difficulties in conducting randomized trials, surgical research remains dependent predominantly on observational or non‐randomized...

2003 ANZ Journal of Surgery 6851 citations

A test–retest reliability study of child-reported psychiatric symptoms and diagnoses using the Child and Adolescent Psychiatric Assessment (CAPA-C)

Adrian Angold, E. Jane Costello

SYNOPSIS Seventy-seven 10–18-year-old psychiatric in-patients and out-patients took part in a test-retest study of the Child and Adolescent Psychiatric Assessment (CAPA). They w...

1995 Psychological Medicine 272 citations

A Series of Lower Bounds to the Reliability of a Test

J.M.F. ten Berge, Frits E. Zegers

Two well-known lower bounds to the reliability in classical test theory, Guttman’s λ2 and Cronbach’s coefficient alpha, are shown to be terms of an infinite series of lower bou...

1978 Psychometrika 74 citations

Measurements of acute cerebral infarction: a clinical examination scale.

Thomas Brott, Harold P. Adams, Charles P. Olinger, +7 more

We designed a 15-item neurologic examination stroke scale for use in acute stroke therapy trials. In a study of 24 stroke patients, interrater reliability for the scale was foun...

1989 Stroke 5596 citations

Uses and abuses of coefficient alpha.

Neal Schmitt

The article addresses some concerns about how coefficient alpha is reported and used. It also shows that alpha is not a measure of homogeneity or unidimensionality. This fact an...

1996 Psychological Assessment 2285 citations

Publication Info

Year: 1945
Type: article
Volume: 10
Issue: 4
Pages: 255-282
Citations: 1109
Access: Closed

Citation Metrics

OpenAlex: 1109
CrossRef: 765

Cite This

APA Style

Guttman, L. (1945). A Basis for Analyzing Test-Retest Reliability. Psychometrika, 10(4), 255-282. https://doi.org/10.1007/bf02288892

Identifiers

DOI: 10.1007/bf02288892
PMID: 21007983
