EfficientNetV2: Smaller Models and Faster Training

Mingxing Tan; Quoc V. Le

doi:10.48550/arxiv.2104.00298

Abstract

This paper introduces EfficientNetV2, a new family of convolutional networks that have faster training speed and better parameter efficiency than previous models. To develop this family of models, we use a combination of training-aware neural architecture search and scaling, to jointly optimize training speed and parameter efficiency. The models were searched from the search space enriched with new ops such as Fused-MBConv. Our experiments show that EfficientNetV2 models train much faster than state-of-the-art models while being up to 6.8x smaller. Our training can be further sped up by progressively increasing the image size during training, but it often causes a drop in accuracy. To compensate for this accuracy drop, we propose to adaptively adjust regularization (e.g., dropout and data augmentation) as well, such that we can achieve both fast training and good accuracy. With progressive learning, our EfficientNetV2 significantly outperforms previous models on ImageNet and CIFAR/Cars/Flowers datasets. By pretraining on the same ImageNet21k, our EfficientNetV2 achieves 87.3% top-1 accuracy on ImageNet ILSVRC2012, outperforming the recent ViT by 2.0% accuracy while training 5x-11x faster using the same computing resources. Code will be available at https://github.com/google/automl/tree/master/efficientnetv2.
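The progressive learning idea in the abstract (start with small images and weak regularization, then raise both together) can be illustrated with a minimal sketch. The function name `progressive_schedule`, the use of linear interpolation, the number of stages, and all numeric values below are illustrative assumptions; the abstract gives no concrete settings, and this is not the authors' implementation.

```python
def progressive_schedule(num_stages, size_start, size_end,
                         dropout_start, dropout_end):
    """Return one (image_size, dropout_rate) pair per training stage.

    Hypothetical helper: both values are linearly interpolated from their
    start to their end setting, so regularization grows with image size.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    schedule = []
    for stage in range(num_stages):
        # Fraction of the way through training: 0.0 at the first stage,
        # 1.0 at the last (a single stage uses the final settings).
        frac = stage / (num_stages - 1) if num_stages > 1 else 1.0
        image_size = int(round(size_start + (size_end - size_start) * frac))
        dropout = dropout_start + (dropout_end - dropout_start) * frac
        schedule.append((image_size, dropout))
    return schedule


# Illustrative values only; not taken from the paper.
schedule = progressive_schedule(
    num_stages=4, size_start=128, size_end=320,
    dropout_start=0.1, dropout_end=0.4,
)
for stage, (size, dropout) in enumerate(schedule):
    print(f"stage {stage}: image size {size}, dropout {dropout:.2f}")
```

In a real training loop, each stage would resize its input batches to that stage's image size and configure dropout (and the strength of any data augmentation) accordingly before running its share of the epochs.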

Keywords

Computer science; Regularization (linguistics); Convolutional neural network; Training (meteorology); Machine learning; Artificial intelligence; Code (set theory); Speedup; Training set; Dropout (neural networks); Computer engineering; Parallel computing

Publication Info

Year: 2021
Type: preprint
Citations: 1103
Access: Closed

Cite This

APA Style

Mingxing Tan, Quoc V. Le (2021). EfficientNetV2: Smaller Models and Faster Training. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2104.00298

Identifiers

DOI: 10.48550/arxiv.2104.00298