Abstract

In this paper, we present a novel massively parallel algorithm for accelerating the decision tree building procedure on GPUs (Graphics Processing Units), which is a crucial step in Gradient Boosted Decision Tree (GBDT) and random forests training. Previous GPU based tree building algorithms are based on parallel multi-scan or radix sort to find the exact tree split, and thus suffer from scalability and performance issues. We show that using a histogram based algorithm to approximately find the best split is more efficient and scalable on GPU. By identifying the difference between classical GPU-based image histogram construction and the feature histogram construction in decision tree training, we develop a fast feature histogram building kernel on GPU with carefully designed computational and memory access sequence to reduce atomic update conflict and maximize GPU utilization. Our algorithm can be used as a drop-in replacement for histogram construction in popular tree boosting systems to improve their scalability. As an example, to train GBDT on epsilon dataset, our method using a main-stream GPU is 7-8 times faster than histogram based algorithm on CPU in LightGBM and 25 times faster than the exact-split finding algorithm in XGBoost on a dual-socket 28-core Xeon server, while achieving similar prediction accuracy.

Keywords

Computer scienceHistogramScalabilityBoosting (machine learning)CUDAHistogram matchingParallel computingFeature (linguistics)Decision treeTree (set theory)Artificial intelligenceAlgorithmImage (mathematics)MathematicsDatabase

Related Publications

Best-first Decision Tree Learning

Decision trees are potentially powerful predictors and explicitly represent the structure of a dataset. Standard decision tree learners such as C4.5 expand nodes in depth-first ...

2007 Research Commons (University of Waikato) 229 citations

Publication Info

Year
2017
Type
preprint
Citations
61
Access
Closed

External Links

Social Impact

Social media, news, blog, policy document mentions

Citation Metrics

61
OpenAlex

Cite This

Huan Zhang, Si Si, Cho‐Jui Hsieh (2017). GPU-acceleration for Large-scale Tree Boosting. arXiv (Cornell University) . https://doi.org/10.48550/arxiv.1706.08359

Identifiers

DOI
10.48550/arxiv.1706.08359