Abstract

This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle of maximum entropy is that without external knowledge, one should prefer distributions that are uniform. Constraints on the distribution, derived from labeled training data, inform the technique where to be minimally non-uniform. The maximum entropy formulation has a unique solution which can be found by the improved iterative scaling algorithm. In this paper, maximum entropy is used for text classification by estimating the conditional distribution of the class variable given the document. In experiments on several text datasets we compare accuracy to naive Bayes and show that maximum entropy is sometimes significantly better, but also sometimes worse. Much future work remains, but the re...
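As a rough illustration of estimating the conditional distribution of the class given the document, the sketch below fits an exponential-form model in which P(c | d) is proportional to exp(sum of weight times feature). It is not the paper's implementation. It uses plain gradient ascent on the conditional log-likelihood in place of improved iterative scaling. The function names (`featurize`, `predict_proba`, `train_maxent`), the length-normalized word-count features, the learning rate, the small L2 penalty, and the toy reviews are all assumptions made for illustration.

```python
import math
from collections import Counter


def featurize(text):
    """Length-normalized word counts (an assumed feature choice)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: n / total for w, n in counts.items()}


def predict_proba(lam, classes, feats):
    """P(c | d) proportional to exp(sum_w lam[(w, c)] * feats[w])."""
    scores = {
        c: sum(lam.get((w, c), 0.0) * v for w, v in feats.items())
        for c in classes
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


def train_maxent(docs, labels, epochs=200, lr=0.5, l2=0.01):
    """Gradient ascent on the conditional log-likelihood.

    The gradient for each (word, class) weight is the observed feature
    total minus the total expected under the current model, so at the
    optimum (with l2=0) the model matches the training-data constraints.
    """
    classes = sorted(set(labels))
    feats = [featurize(d) for d in docs]
    n = len(docs)
    lam = {}  # all weights start at zero, i.e. a uniform distribution
    for _ in range(epochs):
        grad = {}
        for f, y in zip(feats, labels):
            p = predict_proba(lam, classes, f)
            for w, v in f.items():
                for c in classes:
                    observed = 1.0 if c == y else 0.0
                    key = (w, c)
                    grad[key] = grad.get(key, 0.0) + v * (observed - p[c])
        for key, g in grad.items():
            old = lam.get(key, 0.0)
            lam[key] = old + lr * (g / n - l2 * old)
    return lam, classes


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper's experiments.
    docs = ["great fun film", "dull boring film",
            "fun great plot", "boring dull plot"]
    labels = ["pos", "neg", "pos", "neg"]
    lam, classes = train_maxent(docs, labels)
    print(predict_proba(lam, classes, featurize("great fun")))
```

With every weight at zero the model assigns equal probability to each class, which is the uniform starting point the maximum entropy principle describes. Training moves it away from uniform only as far as the labeled data require.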

Keywords

Principle of maximum entropy, Maximum-entropy Markov model, Maximum entropy spectral estimation, Conditional entropy, Computer science, Maximum entropy probability distribution, Artificial intelligence, Entropy (arrow of time), Pattern recognition (psychology), Naive Bayes classifier, Mathematics, Natural language processing, Machine learning, Physics, Markov chain, Markov model, Support vector machine

Related Publications

Thumbs up?

We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data,...

2002 Proceedings of the ACL-02 conference ... 6965 citations

Support vector machines

My first exposure to Support Vector Machines came this spring when I heard Sue Dumais present impressive results on text categorization using this analysis technique. This issue's...

1998 IEEE Intelligent Systems and their Ap... 6431 citations

Publication Info

Year: 1999
Type: article
Citations: 756
Access: Closed

Citation Metrics

756 (OpenAlex)

Cite This

Kamal Nigam, John Lafferty, Andrew McCallum (1999). Using Maximum Entropy for Text Classification.