Abstract

A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90's. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.
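As an illustration of the second calibration method named in the abstract, the following is a minimal sketch of isotonic regression fitted with the pool-adjacent-violators procedure. It is not the authors' code. The function names `isotonic_calibrate` and `predict_calibrated` and the toy scores and labels are hypothetical, and the sketch assumes binary labels and distinct scores. Platt Scaling would fit a sigmoid to the scores instead of the step function produced here.

```python
def isotonic_calibrate(scores, labels):
    """Fit a non-decreasing step function from raw scores to probabilities.

    Returns a list of (upper_score, probability) steps in increasing score order.
    Hypothetical helper for illustration; assumes labels in {0, 1} and distinct scores.
    """
    # Sort examples by score so the fitted values can be made monotone.
    pairs = sorted(zip(scores, labels))
    # Each block holds [sum of labels, number of examples, largest score in block].
    blocks = []
    for score, label in pairs:
        blocks.append([float(label), 1, score])
        # Pool adjacent blocks while the earlier block's mean exceeds the later one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def predict_calibrated(steps, score):
    """Map a raw score to the probability of the first step that covers it."""
    for upper, prob in steps:
        if score <= upper:
            return prob
    # Scores above every training score take the last (largest) probability.
    return steps[-1][1]


# Toy data (made up for this sketch): raw classifier scores and true labels.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
toy_labels = [0, 1, 0, 1, 1, 1]
steps = isotonic_calibrate(toy_scores, toy_labels)
```

In this toy run the out-of-order labels at scores 0.2 and 0.3 are pooled into a single step with probability 0.5, which is what keeps the calibrated mapping monotone in the raw score.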

Keywords

Machine learning, Artificial intelligence, Computer science, Decision tree, Isotonic regression, Random forest, Supervised learning, Naive Bayes classifier, Support vector machine, Logistic regression, Empirical research, Regression, Artificial neural network, Mathematics, Statistics

Publication Info

Year: 2006
Type: article
Pages: 161-168
Citations: 2655
Access: Closed

Citation Metrics

2655 (OpenAlex)

Cite This

Rich Caruana, Alexandru Niculescu-Mizil (2006). An empirical comparison of supervised learning algorithms. pp. 161-168. https://doi.org/10.1145/1143844.1143865

Identifiers

DOI: 10.1145/1143844.1143865