Abstract
This paper describes a new technique for solving multiclass learning problems by combining Freund and Schapire's boosting algorithm with the main ideas of Dietterich and Bakiri's method of error-correcting output codes (ECOC). Boosting is a general method of improving the accuracy of a given base or "weak" learning algorithm. ECOC is a robust method of solving multiclass learning problems by reducing them to a sequence of two-class problems. We show that our new hybrid method has the advantages of both: like ECOC, our method only requires that the base learning algorithm work on binary-labeled data; like boosting, we prove that the method comes with strong theoretical guarantees on the training and generalization error of the final combined hypothesis, assuming only that the base learning algorithm performs slightly better than random guessing. Although previous methods were known for boosting multiclass problems, the new method may be significantly faster and require less programming effort in creating the base learning algorithm. We also compare the new algorithm experimentally to other voting methods.
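The ECOC reduction that the abstract builds on can be sketched roughly as follows: each class is assigned a binary codeword, one two-class problem is trained per codeword bit, and a new example is labeled with the class whose codeword is nearest in Hamming distance to the vector of bit predictions. The code matrix, the synthetic data, and the single-feature "decision stump" learner below are illustrative assumptions for this sketch, not the paper's actual experimental setup.

```python
import numpy as np

# Hand-picked codeword matrix: 3 classes x 5 bits (an assumption for this sketch).
CODE = np.array([
    [0, 0, 1, 1, 0],   # codeword for class 0
    [1, 0, 0, 1, 1],   # codeword for class 1
    [0, 1, 0, 0, 1],   # codeword for class 2
])

def train_stump(X, y):
    """Exhaustively pick the best single-feature threshold rule for binary labels y."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = (sign * (X[:, j] - t) > 0).astype(int)
                acc = (pred == y).mean()
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    return best[1:]

def stump_predict(stump, X):
    j, t, sign = stump
    return (sign * (X[:, j] - t) > 0).astype(int)

def ecoc_train(X, y):
    # One binary learning problem per column (bit) of the code matrix:
    # example i gets binary label CODE[y[i], b] in problem b.
    return [train_stump(X, CODE[y, b]) for b in range(CODE.shape[1])]

def ecoc_predict(stumps, X):
    # Predict all bits, then decode to the class whose codeword is
    # nearest in Hamming distance.
    bits = np.stack([stump_predict(s, X) for s in stumps], axis=1)
    dists = np.abs(bits[:, None, :] - CODE[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)
```

The decoding step is what makes the reduction "error-correcting": as long as the binary hypotheses make fewer bit errors than half the minimum Hamming distance between codewords, the multiclass prediction is still correct. The paper's contribution is to let boosting, rather than a fixed code, drive which binary problems are posed to the weak learner.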
Publication Info
- Year: 1997
- Type: article
- Pages: 313-321
- Citations: 263
- Access: Closed