Cluster analysis of heterogeneous rank data
2007
102 citations
Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
The first unified account of the theory, methodology, and applications of the EM algorithm and its extensionsSince its inception in 1977, the Expectation-Maximization (EM) algor...
Abstract. The expectationâmaximization (EM) algorithm is a popular approach for obtaining maximum likelihood estimates in incomplete data problems because of its simplicity and ...
Part 1. An Introduction to Missing Data. 1.1 Introduction. 1.2 Chapter Overview. 1.3 Missing Data Patterns. 1.4 A Conceptual Overview of Missing Data heory. 1.5 A More Formal De...
SUMMARY Finite mixture models are a useful class of models for application to data. When sample sizes are not large and the number of underlying densities is in question, likeli...
Summary A broadly applicable algorithm for computing maximum likelihood estimates from incomplete data is presented at various levels of generality. Theory showing the monotone ...
Social media, news, blog, policy document mentions