Abstract
Research in machine learning, statistics, and related fields has produced a wide variety of algorithms for classification. However, most of these algorithms assume that all errors have the same cost, which is seldom the case in KDD problems. Individually making each classification learner cost-sensitive is laborious, and often non-trivial. In this paper we propose a principled method for making an arbitrary classifier cost-sensitive by wrapping a cost-minimizing procedure around it. This procedure, called MetaCost, treats the underlying classifier as a black box, requiring no knowledge of its functioning or change to it. Unlike stratification, MetaCost is applicable to any number of classes and to arbitrary cost matrices. Empirical trials on a large suite of benchmark databases show that MetaCost almost always produces large cost reductions compared to the cost-blind classifier used (C4.5RULES) and to two forms of stratification. Further tests identify the key components of MetaCost and those that can be varied without substantial loss. Experiments on a larger database indicate that MetaCost scales well.
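The wrapper the abstract describes can be sketched roughly as follows: estimate class probabilities P(j|x) by bagging the base learner, relabel each training example with the class that minimizes expected cost, and retrain the learner once on the relabeled data. This is an illustrative sketch, not the paper's code: `NearestCentroid` is a hypothetical stand-in for the black-box learner (the paper uses C4.5RULES), and details such as the number of resamples and whether votes come only from models whose bags exclude the example differ in the published algorithm.

```python
import random

class NearestCentroid:
    """Toy stand-in for an arbitrary black-box classifier (illustrative only)."""
    def fit(self, X, y):
        self.centroids = {}
        for c in set(y):
            pts = [x for x, label in zip(X, y) if label == c]
            self.centroids[c] = tuple(sum(col) / len(pts) for col in zip(*pts))
        return self

    def predict(self, X):
        sqdist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
        return [min(self.centroids, key=lambda c: sqdist(x, self.centroids[c]))
                for x in X]

def metacost(X, y, cost, make_learner, n_bags=10, seed=0):
    """Hedged sketch of the MetaCost idea.
    cost[i][j] = cost of predicting class i when the true class is j;
    class labels are assumed to be integers 0..k-1 indexing the matrix."""
    rng = random.Random(seed)
    classes = sorted(set(y))
    n = len(X)
    # Bag the base learner and collect votes as crude estimates of P(j|x).
    votes = [{c: 0 for c in classes} for _ in range(n)]
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]
        model = make_learner().fit([X[i] for i in idx], [y[i] for i in idx])
        for i, pred in enumerate(model.predict(X)):
            votes[i][pred] += 1
    # Relabel each example: argmin_i  sum_j P(j|x) * cost[i][j]
    relabeled = [min(classes,
                     key=lambda i: sum((v[j] / n_bags) * cost[i][j]
                                       for j in classes))
                 for v in votes]
    # Retrain the same kind of learner once on the relabeled data.
    return make_learner().fit(X, relabeled)
```

Under a 0-1 cost matrix this reduces to bagged majority-vote relabeling; with an asymmetric matrix, borderline examples are relabeled toward the class that is cheaper to predict, which is how the cost-blind learner is made cost-sensitive without modifying it.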
Publication Info
- Year
- 1999
- Type
- article
- Pages
- 155-164
Identifiers
- DOI
- 10.1145/312129.312220