Abstract

Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods. In the literature, however, most refinements are either briefly mentioned as implementation details or only visible in source code. In this paper, we will examine a collection of such refinements and empirically evaluate their impact on the final model accuracy through ablation study. We will show that, by combining these refinements together, we are able to improve various CNN models significantly. For example, we raise ResNet-50's top-1 validation accuracy from 75.3% to 79.29% on ImageNet. We will also demonstrate that improvement on image classification accuracy leads to better transfer learning performance in other application domains such as object detection and semantic segmentation.

Keywords

Computer science, Convolutional neural network, Artificial intelligence, Transfer of learning, Contextual image classification, Segmentation, Code (set theory), Machine learning, Pattern recognition (psychology), Image (mathematics), Object detection, Deep learning

Publication Info

Year
2019
Type
article
Pages
558-567
Citations
1478
Access
Closed

Citation Metrics

OpenAlex
1478
Influential
154
CrossRef
1067

Cite This

Tong He, Zhi Zhang, Hang Zhang et al. (2019). Bag of Tricks for Image Classification with Convolutional Neural Networks. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 558-567. https://doi.org/10.1109/cvpr.2019.00065

Identifiers

DOI
10.1109/cvpr.2019.00065
arXiv
1812.01187
