Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets

Alexandre Lacoste; François Laviolette; Mario Marchand

Abstract

We propose a new method for comparing learning algorithms on multiple tasks, based on a novel non-parametric test that we call the Poisson binomial test. The key aspect of this work is that we provide a formal definition of what it means for one algorithm to be better than another. We are also able to take into account the dependencies induced when evaluating classifiers on the same test set. Finally, we make optimal use (in the Bayesian sense) of all the testing data we have. We demonstrate empirically that our approach is more reliable than the sign test and the Wilcoxon signed rank test, the current state of the art for algorithm comparisons.
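
The abstract does not spell out the test itself, so the following is only a minimal sketch of the distribution the test is named after, not the authors' procedure. It assumes we are already given, for each dataset, a probability that algorithm A beats algorithm B (the hypothetical `win_probs` list), treats the datasets as independent, and computes the Poisson binomial distribution of the number of datasets on which A wins. The function names are illustrative.

```python
from typing import List


def poisson_binomial_pmf(probs: List[float]) -> List[float]:
    """Return the PMF of the number of successes among independent
    Bernoulli trials with (possibly different) success probabilities.

    pmf[k] is the probability of exactly k successes.
    """
    pmf = [1.0]  # with zero trials, zero successes has probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def prob_majority_wins(probs: List[float]) -> float:
    """Probability that A wins on strictly more than half of the datasets."""
    pmf = poisson_binomial_pmf(probs)
    n = len(probs)
    return sum(mass for k, mass in enumerate(pmf) if 2 * k > n)


# Hypothetical per-dataset probabilities that A is better than B.
win_probs = [0.9, 0.6, 0.75, 0.4]
pmf = poisson_binomial_pmf(win_probs)
majority = prob_majority_wins(win_probs)
```

The dynamic-programming update costs O(n^2) for n datasets, which is adequate for the handful of datasets typical of algorithm comparisons. How the per-dataset probabilities are obtained is not stated in the abstract and is left open here.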

Keywords

Computer science; Sign test; Machine learning; Wilcoxon signed-rank test; Algorithm; Bayesian probability; Key (lock); Artificial intelligence; Set (abstract data type); Test data; Bayesian network; Mathematics; Statistics; Mann–Whitney U test

Affiliated Institutions

Université Laval CA

Publication Info

Year: 2012
Type: article
Pages: 665-675
Citations: 29
Access: Closed

Cite This

APA Style

Alexandre Lacoste, François Laviolette, Mario Marchand (2012). Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets. pp. 665-675.