Abstract
Even though considerable attention has been given to the polarity of words (positive and negative) and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper, we show how the combined strength and wisdom of the crowds can be used to generate a large, high‐quality, word–emotion and word–polarity association lexicon quickly and inexpensively. We enumerate the challenges in emotion annotation in a crowdsourcing scenario and propose solutions to address them. Most notably, in addition to questions about emotions associated with terms, we show how the inclusion of a word choice question can discourage malicious data entry, help to identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help to obtain annotations at sense level (rather than at word level). We conducted experiments on how to formulate the emotion‐annotation questions, and show that asking if a term is associated with an emotion leads to markedly higher interannotator agreement than that obtained by asking if a term evokes an emotion.
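As a rough illustration of the quality-control idea described in the abstract, the sketch below discards annotations whose word choice answer is wrong and then takes a majority vote over the remaining responses. It is a minimal sketch under stated assumptions, not the authors' pipeline: the record fields (`term`, `emotion`, `word_choice`, `gold_choice`, `associated`), the example data, and the majority-vote aggregation are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical annotation records. Field names and values are assumptions
# for illustration only; they do not come from the paper.
annotations = [
    {"term": "startle", "emotion": "surprise", "word_choice": "shake",
     "gold_choice": "shake", "associated": True},
    {"term": "startle", "emotion": "surprise", "word_choice": "shake",
     "gold_choice": "shake", "associated": True},
    {"term": "startle", "emotion": "surprise", "word_choice": "shake",
     "gold_choice": "shake", "associated": False},
    # These two annotators failed the word choice question, so their
    # emotion answers are rejected.
    {"term": "startle", "emotion": "surprise", "word_choice": "automobile",
     "gold_choice": "shake", "associated": False},
    {"term": "startle", "emotion": "surprise", "word_choice": "juice",
     "gold_choice": "shake", "associated": False},
]


def filter_by_word_choice(records):
    """Keep only records whose word choice answer matches the expected one."""
    return [r for r in records if r["word_choice"] == r["gold_choice"]]


def majority_label(records):
    """Return the most common yes/no answer for each (term, emotion) pair."""
    votes = defaultdict(Counter)
    for r in records:
        votes[(r["term"], r["emotion"])][r["associated"]] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


kept = filter_by_word_choice(annotations)
labels_filtered = majority_label(kept)
labels_unfiltered = majority_label(annotations)
```

In this toy data the two rejected annotations would otherwise tip the vote, which is the kind of noise a word choice check is meant to catch.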
Related Publications
Using Hashtags to Capture Fine Emotion Categories from Tweets
Detecting emotions in microblogs and social media posts has applications for industry, health, and security. Statistical, supervised automatic methods for emotion detection rely...
Sentiment analyzer: extracting sentiments about a given topic using natural language processing techniques
We present sentiment analyzer (SA) that extracts sentiment (or opinion) about a subject from online text documents. Instead of classifying the sentiment of an entire document ab...
Access to the Internal Lexicon
In order to understand how a word is read for meaning, we need to know how a reader proceeds from the printed representation of a word to the word's entry in the reader's intern...
CrowdWorkSheets: Accounting for Individual and Collective Identities Underlying Crowdsourced Dataset Annotation
Human annotated data plays a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go int...
Deep Contextualized Word Representations
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses ...
Publication Info
- Year: 2012
- Type: article
- Volume: 29
- Issue: 3
- Pages: 436-465
- Citations: 2425
- Access: Closed
Identifiers
- DOI: 10.1111/j.1467-8640.2012.00460.x
- arXiv: 1308.6297