Abstract

We propose CSRNet, a network for Congested Scene Recognition: a data-driven deep learning method that can understand highly congested scenes, perform accurate count estimation, and produce high-quality density maps. CSRNet is composed of two major components: a convolutional neural network (CNN) front-end for 2D feature extraction and a dilated CNN back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations. CSRNet is easy to train because of its purely convolutional structure. We demonstrate CSRNet on four datasets (ShanghaiTech, UCF_CC_50, WorldEXPO'10, and UCSD) and achieve state-of-the-art performance. On the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We also extend the target applications to counting other objects, such as vehicles in the TRANCOS dataset, where CSRNet significantly improves output quality, with 15.4% lower MAE than the previous state-of-the-art approach.
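The abstract's claim that dilated kernels enlarge the receptive field without pooling can be illustrated with a small one-dimensional sketch. This is not the authors' implementation: the helper names `dilated_conv1d` and `receptive_field`, the toy signal, and the dilation rates are all assumptions chosen for illustration, and the network itself operates on 2D feature maps.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated convolution, cross-correlation form.

    Hypothetical helper for illustration; kernel taps are spaced
    `dilation` samples apart in the input.
    """
    span = (len(kernel) - 1) * dilation + 1  # input samples covered per output
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d  # each layer widens the field by (k - 1) * d
    return rf


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

# Same 3-tap kernel, same number of weights, wider coverage with dilation 2.
print(dilated_conv1d(signal, kernel, dilation=1))  # [6, 9, 12, 15, 18]
print(dilated_conv1d(signal, kernel, dilation=2))  # [9, 12, 15]

# Three stacked 3-tap layers: dilation 1 versus dilation 2 (assumed rates).
print(receptive_field(3, [1, 1, 1]))  # 7
print(receptive_field(3, [2, 2, 2]))  # 13
```

With stride 1 and no pooling, the dilated stack nearly doubles the receptive field here while keeping the same number of weights per layer, which is the trade-off the abstract attributes to the back-end.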

Keywords

Convolutional neural network, Computer science, Pooling, Artificial intelligence, Feature extraction, Pattern recognition (psychology), Feature (linguistics), Deep learning, State (computer science), Algorithm

Related Publications

Network In Network

Abstract: We propose a novel deep network structure called Network In Network (NIN) to enhance model discriminability for local patches within the receptive field. The conventional con...

2014 arXiv (Cornell University) 1037 citations

Publication Info

Year
2018
Type
article
Pages
1091-1100
Citations
1556
Access
Closed

Citation Metrics

OpenAlex
1556
Influential
392
CrossRef
1175

Cite This

Yuhong Li, Xiaofan Zhang, Deming Chen (2018). CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1091-1100. https://doi.org/10.1109/cvpr.2018.00120

Identifiers

DOI
10.1109/cvpr.2018.00120
arXiv
1802.10062
