CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Abstract

We propose a network for Congested Scene Recognition called CSRNet, a data-driven deep learning method that can understand highly congested scenes, perform accurate count estimation, and produce high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN as the back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations. CSRNet is an easy-to-train model because of its purely convolutional structure. We demonstrate CSRNet on four datasets (the ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and deliver state-of-the-art performance. On the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We extend the targeted applications to counting other objects, such as vehicles in the TRANCOS dataset. Results show that CSRNet significantly improves the output quality, with 15.4% lower MAE than the previous state-of-the-art approach.
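The claim that dilated kernels enlarge the receptive field without pooling can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the layer counts and dilation rates are chosen for the example rather than taken from the CSRNet configuration. It assumes stride-1 convolutions, where a k x k kernel with dilation rate r spans k + (k - 1)(r - 1) pixels along each axis.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span of a dilated kernel along one axis: k + (k - 1) * (r - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a sequence of (kernel_size, dilation) pairs. Each stride-1
    layer widens the receptive field by (effective kernel size - 1).
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


# Illustrative comparison: three 3x3 layers, ordinary versus dilation rate 2.
plain = stacked_receptive_field([(3, 1)] * 3)
dilated = stacked_receptive_field([(3, 2)] * 3)
print(f"plain: {plain}x{plain}, dilated: {dilated}x{dilated}")
```

Both stacks use the same number of weights per layer and keep the feature map resolution, yet the dilated stack covers a wider input region, which is the property the abstract relies on when it replaces pooling with dilation.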

Keywords

Convolutional neural network, Computer science, Pooling, Artificial intelligence, Feature extraction, Pattern recognition (psychology), Feature (linguistics), Deep learning, State (computer science), Algorithm

Affiliated Institutions

University of Illinois Urbana-Champaign (US)

Publication Info

Year: 2018
Type: article
Pages: 1091-1100
Citations: 1556
Access: Closed

Citation Metrics

OpenAlex: 1556
Influential: 392
CrossRef: 1175

Cite This

APA Style

Yuhong Li, Xiaofan Zhang, Deming Chen (2018). CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1091-1100. https://doi.org/10.1109/cvpr.2018.00120

Identifiers

DOI: 10.1109/cvpr.2018.00120
arXiv: 1802.10062
