Abstract

We propose a new method for comparing learning algorithms on multiple tasks, based on a novel non-parametric test that we call the Poisson binomial test. The key aspect of this work is that we provide a formal definition of what it means for one algorithm to be better than another. We also take into account the dependencies induced by evaluating classifiers on the same test set. Finally, we make optimal use (in the Bayesian sense) of all the available test data. We demonstrate empirically that our approach is more reliable than the sign test and the Wilcoxon signed-rank test, the current state of the art for algorithm comparison.
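To give a rough feel for this kind of test (the paper's exact Bayesian model and decision rule are not reproduced here), the sketch below estimates, for each dataset, a probability that algorithm A beats algorithm B from hypothetical per-example win counts, and then aggregates those probabilities with a Poisson binomial distribution over the number of datasets won. The Beta-posterior model, the uniform prior, and the counts are illustrative assumptions, not the authors' construction.

```python
# A minimal sketch of the idea, not the authors' exact model: for each dataset,
# a Beta posterior (uniform prior, illustrative choice) over A's per-example win
# rate gives Pr(A beats B on that dataset); the number of datasets won by A is
# then modelled with a Poisson binomial distribution over those probabilities.
import numpy as np
from scipy.stats import beta

def dataset_win_probability(wins_a, wins_b, prior=1.0):
    """Posterior probability that A's per-example win rate exceeds 1/2,
    under a Beta(prior + wins_a, prior + wins_b) posterior."""
    return beta.sf(0.5, prior + wins_a, prior + wins_b)

def poisson_binomial_pmf(probs):
    """PMF of the number of successes in independent Bernoulli(p_i) trials,
    computed by dynamic programming."""
    pmf = np.array([1.0])
    for p in probs:
        nxt = np.zeros(len(pmf) + 1)
        nxt[:-1] += pmf * (1.0 - p)  # trial i fails
        nxt[1:] += pmf * p           # trial i succeeds
        pmf = nxt
    return pmf

# Hypothetical counts of test examples where A beats B / B beats A on 5 datasets.
counts = [(60, 40), (55, 45), (70, 30), (48, 52), (65, 35)]
p_win = [dataset_win_probability(a, b) for a, b in counts]
pmf = poisson_binomial_pmf(p_win)
k = np.arange(len(pmf))
majority = pmf[k > len(counts) / 2].sum()
print(f"Pr(A wins on a majority of datasets) = {majority:.3f}")
```

Unlike a sign test, which reduces each dataset to a hard win/loss, this aggregation keeps each dataset's uncertainty in the per-dataset probabilities before combining them.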

Keywords

Computer science, Sign test, Machine learning, Wilcoxon signed-rank test, Algorithm, Bayesian probability, Key (lock), Artificial intelligence, Set (abstract data type), Test data, Bayesian network, Mathematics, Statistics, Mann–Whitney U test

Publication Info

Year: 2012
Type: Article
Pages: 665-675
Citations: 29
Access: Closed

Citation Metrics

29 (OpenAlex)

Cite This

Alexandre Lacoste, François Laviolette, Mario Marchand (2012). Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets. 665-675.