Abstract
It is often assumed that class imbalances are responsible for significant losses of performance in standard classifiers. The purpose of this paper is to question whether class imbalances are truly responsible for this degradation or whether it can be explained in some other way. Our experiments suggest that the problem is not directly caused by class imbalances, but rather, that class imbalances may yield small disjuncts which, in turn, will cause degradation. We argue that, in order to improve classifier performance, it may, then, be more useful to focus on the small disjuncts problem than it is to focus on the class imbalance problem. We experiment with a method that takes the small disjuncts problem into consideration, and show that, indeed, it yields a performance superior to the performance obtained using standard or advanced solutions to the class imbalance problem.
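The link between imbalance and small disjuncts can be illustrated with a little arithmetic. The following is a minimal sketch in Python, not the paper's experimental setup. It assumes a hypothetical minority class whose examples are spread as evenly as possible over a fixed number of subclusters, each of which a learner would have to cover with its own disjunct. The function name `disjunct_sizes` and all parameter values are illustrative assumptions.

```python
def disjunct_sizes(n_total, minority_fraction, n_subclusters):
    """Return the number of minority examples falling in each subcluster.

    Illustrative assumption: minority examples are spread as evenly as
    possible over the subclusters.
    """
    n_minority = round(n_total * minority_fraction)
    base, extra = divmod(n_minority, n_subclusters)
    # The first `extra` subclusters receive one additional example.
    return [base + (1 if i < extra else 0) for i in range(n_subclusters)]


if __name__ == "__main__":
    # Hold the data set size and the number of subclusters fixed,
    # and vary only the share of minority examples.
    for fraction in (0.5, 0.1, 0.01):
        sizes = disjunct_sizes(n_total=1000, minority_fraction=fraction, n_subclusters=4)
        print(f"minority fraction {fraction}: examples per subcluster = {sizes}")
```

Under these assumptions, a stronger imbalance at a fixed data set size leaves only a handful of examples in each minority subcluster, which is the kind of small disjunct the abstract identifies as the more direct source of degradation.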
Related Publications
The class imbalance problem: A systematic study
In machine learning problems, differences in prior class probabilities -- or class imbalances -- have been reported to hinder the performance of some standard classifiers, such ...
A systematic study of the class imbalance problem in convolutional neural networks
In this study, we systematically investigate the impact of class imbalance on classification performance of convolutional neural networks (CNNs) and compare frequently used meth...
Survey on deep learning with class imbalance
The purpose of this study is to examine existing deep learning techniques for addressing class imbalanced data. Effective classification with imbalanced data is an impo...
A Multiple Resampling Method for Learning from Imbalanced Data Sets
Resampling methods are commonly used for dealing with the class‐imbalance problem. Their advantage over other methods is that they are external and thus, easily transportable. A...
SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is considered "de facto" standard in the framework of learning from imbalanced data. This is due to...
Publication Info
- Year: 2004
- Type: article
- Volume: 6
- Issue: 1
- Pages: 40-49
- Citations: 669
- Access: Closed
Identifiers
- DOI: 10.1145/1007730.1007737