Learning Decision Trees Using the Area Under the ROC Curve

Cèsar Ferri; Peter Flach; José Hernández‐Orallo

Abstract

ROC analysis is increasingly being recognised as an important tool for evaluation and comparison of classifiers when the operating characteristics (i.e. class distribution and cost parameters) are not known at training time. Usually, each classifier is characterised by its estimated true and false positive rates and is represented by a single point in the ROC diagram. In this paper, we show how a single decision tree can represent a set of classifiers by choosing different labellings of its leaves, or equivalently, an ordering on the leaves. In this setting, rather than estimating the accuracy of a single tree, it makes more sense to use the area under the ROC curve (AUC) as a quality metric. We also propose a novel splitting criterion which chooses the split with the highest local AUC. To the best of our knowledge, this is the first probabilistic splitting criterion that is not based on weighted average impurity. We present experiments suggesting that the AUC splitting criterion leads to trees with equal or better AUC value, without sacrificing accuracy if a single labelling is chosen.
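The leaf-ordering idea in the abstract can be illustrated with a minimal sketch. It assumes a two-class problem in which each leaf is summarised by its training counts `(positives, negatives)`, ranks leaves by their share of positives, and measures the area under the resulting piecewise-linear ROC curve. The function names `leaf_ordering_auc` and `best_split_by_local_auc` are hypothetical and are not taken from the paper, and this is not the authors' implementation. The second function applies the same area computation to the children of each candidate split and keeps the split with the largest value, which is one plausible reading of "the split with the highest local AUC".

```python
from fractions import Fraction


def leaf_ordering_auc(leaves):
    """Area under the ROC curve induced by ranking leaves by positive share.

    leaves: list of (positives, negatives) counts, one pair per leaf.
    Hypothetical helper for illustration, not the paper's code.
    """
    total_pos = sum(pos for pos, _ in leaves)
    total_neg = sum(neg for _, neg in leaves)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one example of each class")

    # Ignore empty leaves, then rank by fraction of positives, highest first.
    non_empty = [leaf for leaf in leaves if leaf[0] + leaf[1] > 0]
    ranked = sorted(
        non_empty,
        key=lambda leaf: Fraction(leaf[0], leaf[0] + leaf[1]),
        reverse=True,
    )

    area = Fraction(0)
    true_pos_so_far = 0
    for pos, neg in ranked:
        # Each leaf adds one ROC segment; its trapezoid has width
        # neg / total_neg and mean height (true_pos_so_far + pos / 2) / total_pos.
        width = Fraction(neg, total_neg)
        mean_height = Fraction(2 * true_pos_so_far + pos, 2 * total_pos)
        area += width * mean_height
        true_pos_so_far += pos
    return area


def best_split_by_local_auc(candidate_splits):
    """Return the name of the candidate split whose children give the largest area.

    candidate_splits: dict mapping a split name to its list of child counts.
    """
    return max(candidate_splits, key=lambda name: leaf_ordering_auc(candidate_splits[name]))


if __name__ == "__main__":
    splits = {
        "split_a": [(3, 1), (1, 3)],
        "split_b": [(4, 1), (0, 3)],
    }
    for name, children in splits.items():
        print(name, float(leaf_ordering_auc(children)))
    print("chosen:", best_split_by_local_auc(splits))
```

Exact fractions are used so that leaves with equal positive share compare as ties. Such ties do not change the area, because their ROC segments have the same slope.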

Keywords

Receiver operating characteristic; Decision tree; Classifier (UML); Artificial intelligence; Mathematics; Probabilistic logic; Metric (unit); Pattern recognition (psychology); Cut-point; Decision tree learning; Computer science; Performance metric; Statistics; Machine learning; Data mining; Algorithm

Publication Info

Year: 2002
Type: article
Pages: 139-146
Citations: 266
Access: Closed

Cite This

APA Style

Cèsar Ferri, Peter Flach, José Hernández‐Orallo (2002). Learning Decision Trees Using the Area Under the ROC Curve, 139-146.