Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Abstract

We present Sparse R-CNN, a purely sparse method for object detection in images. Existing work on object detection relies heavily on dense object candidates, such as k anchor boxes pre-defined at every grid cell of an image feature map of size H × W. In our method, however, a fixed sparse set of N learned object proposals is provided to the object recognition head to perform classification and localization. By replacing HWk (up to hundreds of thousands) hand-designed object candidates with N (e.g. 100) learnable proposals, Sparse R-CNN completely avoids all effort related to object candidate design and many-to-one label assignment. More importantly, final predictions are output directly, without non-maximum suppression post-processing. Sparse R-CNN demonstrates accuracy, run-time, and training convergence performance on par with well-established detector baselines on the challenging COCO dataset, e.g., achieving 45.0 AP with the standard 3× training schedule and running at 22 fps using a ResNet-50 FPN model. We hope our work inspires a rethinking of the convention of dense priors in object detectors. The code is available at: https://github.com/PeizeSun/SparseR-CNN.
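
To make the HWk versus N comparison concrete, the following is a minimal sketch that counts dense anchor candidates over a feature pyramid and compares the total with a fixed set of N proposals. The image size (800 × 1333), the five strides, and k = 9 are illustrative assumptions, not values stated in the abstract; only N = 100 comes from the text. The function name is hypothetical.

```python
def dense_candidate_count(image_h, image_w, strides, k):
    """Count k anchors at every cell of each H x W feature map.

    Each stride s is assumed to yield a feature map of size
    ceil(image_h / s) x ceil(image_w / s).
    """
    total = 0
    for s in strides:
        feat_h = (image_h + s - 1) // s  # integer ceiling division
        feat_w = (image_w + s - 1) // s
        total += feat_h * feat_w * k
    return total


# Assumed, illustrative settings (not taken from the abstract).
strides = (8, 16, 32, 64, 128)
dense = dense_candidate_count(800, 1333, strides, k=9)

# N = 100 is the example value given in the abstract.
sparse = 100

print(f"dense candidates (HWk summed over levels): {dense}")
print(f"sparse learnable proposals (N): {sparse}")
print(f"ratio: about {dense // sparse}x fewer candidates")
```

Under these assumed settings the dense count lands in the hundreds of thousands, which matches the order of magnitude the abstract mentions, while the sparse set stays at N regardless of image size.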

Keywords

Computer science; Object detection; Object (grammar); Artificial intelligence; Feature (linguistics); Set (abstract data type); Detector; Benchmark (surveying); Computer vision; Pattern recognition (psychology); Convergence (economics); End-to-end principle; Training set; Schedule; Image (mathematics); Code (set theory)

Publication Info

Year: 2021
Type: article
Citations: 1292
Access: Closed

Cite This

APA Style

                            
Peize Sun, Rufeng Zhang, Yi Jiang et al. (2021). Sparse R-CNN: End-to-End Object Detection with Learnable Proposals. https://doi.org/10.1109/cvpr46437.2021.01422

Identifiers

DOI: 10.1109/cvpr46437.2021.01422