Abstract

This paper presents a case study of analyzing and improving intercoder reliability in discourse tagging using statistical techniques. Bias-corrected tags are formulated and successfully used to guide a revision of the coding manual and develop an automatic classifier.
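
The abstract does not say which statistic the authors use to measure intercoder reliability. As an illustration only, the sketch below computes Cohen's kappa, a common chance-corrected agreement measure for two coders. The function name `cohen_kappa`, the tag labels, and the sample annotations are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two coders who tagged the same items (illustrative)."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two equal-length, non-empty tag sequences")
    n = len(tags_a)
    # Observed agreement: fraction of items given the same tag by both coders.
    p_o = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Chance agreement, from each coder's own marginal tag distribution.
    freq_a, freq_b = Counter(tags_a), Counter(tags_b)
    p_e = sum(freq_a[t] * freq_b[t] for t in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical tag throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations, invented for this example.
coder_1 = ["subjective", "objective", "subjective",
           "objective", "subjective", "objective"]
coder_2 = ["subjective", "objective", "objective",
           "objective", "subjective", "subjective"]
kappa = cohen_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

Here the coders agree on four of six items, but half of that agreement is expected by chance given their tag frequencies, so kappa is well below the raw agreement rate.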

Keywords

Computer science, Classifier (UML), Subjectivity, Coding (social sciences), Artificial intelligence, Natural language processing, Reliability (semiconductor), Data mining, Machine learning, Information retrieval, Statistics, Mathematics

Publication Info

Year: 1999
Type: article
Pages: 246-253
Citations: 498
Access: Closed

Citation Metrics

OpenAlex: 498
Influential: 19
CrossRef: 208

Cite This

Janyce Wiebe, Rebecca Bruce, Thomas P. O'Hara (1999). Development and use of a gold-standard data set for subjectivity classifications. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics, 246-253. https://doi.org/10.3115/1034678.1034721

Identifiers

DOI: 10.3115/1034678.1034721
