Abstract
We address the rating-inference problem, wherein rather than simply decide whether a review is "thumbs up" or "thumbs down", as in previous sentiment analysis work, one must determine an author's evaluation with respect to a multi-point scale (e.g., one to five "stars"). This task represents an interesting twist on standard multi-class text categorization because there are several different degrees of similarity between class labels; for example, "three stars" is intuitively closer to "four stars" than to "one star".We first evaluate human performance at the task. Then, we apply a meta-algorithm, based on a metric labeling formulation of the problem, that alters a given n-ary classifier's output in an explicit attempt to ensure that similar items receive similar labels. We show that the meta-algorithm can provide significant improvements over both multi-class and regression versions of SVMs when we employ a novel similarity measure appropriate to the problem.
Keywords
Affiliated Institutions
Related Publications
Unsupervised Feature Learning via Non-parametric Instance Discrimination
Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether...
Gaussian Process Priors with Uncertain Inputs Application to Multiple-Step Ahead Time Series Forecasting
We consider the problem of multi-step ahead prediction in time series analysis using the non-parametric Gaussian process model. k-step ahead forecasting of a discrete-time non-l...
Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study
Abstract Mixture modeling is a widely applied data analysis technique used to identify unobserved heterogeneity in a population. Despite mixture models' usefulness in practice, ...
Combining Possibly Related Estimation Problems
Summary We have two sets of parameters we wish to estimate, and wonder whether the James-Stein estimator should be applied separately to the two sets or once to the combined pro...
MCMC Methods for Multi-Response Generalized Linear Mixed Models: The<b>MCMCglmm</b><i>R</i>Package
Generalized linear mixed models provide a flexible framework for modeling a range of data, although with non-Gaussian response variables the likelihood cannot be obtained in clo...
Publication Info
- Year
- 2005
- Type
- article
- Pages
- 115-124
- Citations
- 2121
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.3115/1219840.1219855