Understanding Convolution for Semantic Segmentation

Abstract

Recent advances in deep learning, especially deep convolutional neural networks (CNNs), have led to significant improvement over previous semantic segmentation systems. Here we show how to improve pixel-wise semantic segmentation by manipulating convolution-related operations that are of both theoretical and practical value. First, we design dense upsampling convolution (DUC) to generate pixel-level prediction, which is able to capture and decode more detailed information that is generally missing in bilinear upsampling. Second, we propose a hybrid dilated convolution (HDC) framework in the encoding phase. This framework 1) effectively enlarges the receptive fields (RF) of the network to aggregate global information; 2) alleviates what we call the "gridding issue" caused by the standard dilated convolution operation. We evaluate our approaches thoroughly on the Cityscapes dataset, and achieve a state-of-the-art result of 80.1% mIoU on the test set at the time of submission. We have also achieved state-of-the-art results overall on the KITTI road estimation benchmark and the PASCAL VOC2012 segmentation task. Our source code can be found at https://github.com/TuSimple/TuSimple-DUC.
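
The two ideas named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation (see the repository linked above for that). The function names, the channel layout `(c*d + i)*d + j`, and the dilation rates `[2, 2, 2]` and `[1, 2, 3]` are assumptions chosen for illustration. `duc_rearrange` shows only the final rearrangement step of a DUC-style head, assuming a learned convolution has already produced `d*d*L` score channels at low resolution. `touched_inputs_1d` lists which input offsets can influence one output position after stacking 1-D dilated convolutions, which makes the "gridding" gaps visible.

```python
def duc_rearrange(scores, d):
    """Rearrange low-resolution scores[ch][y][x] into out[c][y*d + i][x*d + j].

    Assumed (hypothetical) channel layout: ch = (c*d + i)*d + j, where c is the
    class index and (i, j) is the position inside each d x d output block.
    """
    n_ch, h, w = len(scores), len(scores[0]), len(scores[0][0])
    if n_ch % (d * d) != 0:
        raise ValueError("channel count must be a multiple of d*d")
    n_classes = n_ch // (d * d)
    return [
        [
            [scores[(c * d + i) * d + j][y][x]
             for x in range(w) for j in range(d)]   # column index x*d + j
            for y in range(h) for i in range(d)     # row index y*d + i
        ]
        for c in range(n_classes)
    ]


def touched_inputs_1d(rates, kernel=3):
    """Input offsets that feed one output position after stacked dilated convs.

    Each layer is a 1-D convolution with `kernel` taps spaced `r` apart.
    """
    half = kernel // 2
    offsets = {0}
    for r in rates:
        offsets = {o + k * r for o in offsets for k in range(-half, half + 1)}
    return sorted(offsets)


if __name__ == "__main__":
    # Same rate in every layer: only every second input is ever seen.
    print(touched_inputs_1d([2, 2, 2]))
    # Mixed rates: the same span is covered without holes.
    print(touched_inputs_1d([1, 2, 3]))
```

With a constant rate of 2 the reachable offsets skip every other position, while the mixed rates reach every position in the same span; this is the kind of coverage gap the abstract refers to as gridding. In a real network the rearrangement would typically be a tensor reshape rather than nested lists.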

Keywords

Computer science, Upsampling, Segmentation, Artificial intelligence, Pascal (unit), Convolution (computer science), Convolutional neural network, Deep learning, Pixel, Benchmark (surveying), Convolutional code, Pattern recognition (psychology), Artificial neural network, Algorithm, Image (mathematics), Decoding methods

Publication Info

Year: 2018
Type: article
Citations: 1915
Access: Closed

Citation Metrics

OpenAlex: 1915
Influential: 119
CrossRef: 1347

Cite This

APA Style

Panqu Wang, Pengfei Chen, Ye Yuan et al. (2018). Understanding Convolution for Semantic Segmentation. 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). https://doi.org/10.1109/wacv.2018.00163

Identifiers

DOI: 10.1109/wacv.2018.00163
arXiv: 1702.08502
