Abstract
Large-scale labeled data are generally required to train deep neural networks to achieve strong performance in visual feature learning from images or videos for computer vision applications. To avoid the extensive cost of collecting and annotating large-scale datasets, self-supervised learning methods, a subset of unsupervised learning methods, have been proposed to learn general image and video features from large-scale unlabeled data without using any human-annotated labels. This paper provides an extensive review of deep learning-based self-supervised general visual feature learning methods from images or videos. First, the motivation, general pipeline, and terminology of this field are described. Then the common deep neural network architectures used for self-supervised learning are summarized. Next, the schema and evaluation metrics of self-supervised learning methods are reviewed, followed by the commonly used datasets for images, videos, audio, and 3D data, as well as the existing self-supervised visual feature learning methods. Then, quantitative performance comparisons of the reviewed methods on benchmark datasets are summarized and discussed for both image and video feature learning. Finally, the paper concludes with a set of promising future directions for self-supervised visual feature learning.
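As a concrete illustration of the general pipeline the abstract refers to, the sketch below trains a network on a pretext task (rotation prediction) using only pseudo-labels derived from the data itself, so no human annotation is needed; the learned backbone can then be transferred to downstream tasks. This is a minimal sketch assuming a PyTorch setup, not code from the paper, and the model choice, hyperparameters, and helper names are illustrative.

```python
# Minimal sketch of a self-supervised pretext-task pipeline (rotation prediction),
# one of the image-based method families such a survey covers. PyTorch is assumed;
# all names here are illustrative, not from the paper.
import torch
import torch.nn as nn
import torchvision

# Backbone whose features we want to learn without human labels.
# The 4 output classes correspond to the 4 rotation pseudo-labels.
backbone = torchvision.models.resnet18(num_classes=4)

def rotate_batch(images):
    """Create pseudo-labeled data: each image is rotated by 0/90/180/270 degrees."""
    rotated, labels = [], []
    for k in range(4):
        rotated.append(torch.rot90(images, k, dims=(2, 3)))
        labels.append(torch.full((images.size(0),), k, dtype=torch.long))
    return torch.cat(rotated), torch.cat(labels)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.1, momentum=0.9)

# One pretext-training step on an unlabeled batch `images` of shape (N, 3, H, W).
images = torch.randn(8, 3, 224, 224)          # stand-in for unlabeled data
inputs, pseudo_labels = rotate_batch(images)
loss = criterion(backbone(inputs), pseudo_labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# After pretext training, the backbone (minus its classification head) is
# transferred to downstream tasks and evaluated, e.g. by linear probing or fine-tuning.
```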
Related Publications
Unsupervised Feature Learning via Non-parametric Instance Discrimination
Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether...
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, ...
Momentum Contrast for Unsupervised Visual Representation Learning
We present Momentum Contrast (MoCo) for unsupervised visual representation learning. From a perspective on contrastive learning as dictionary look-up, we build a dynamic diction...
Is object localization for free? - Weakly-supervised learning with convolutional neural networks
Successful methods for visual object recognition typically rely on training datasets containing lots of richly annotated images. Detailed image annotation, e.g. by object boundi...
Emerging Properties in Self-Supervised Vision Transformers
In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond t...
Publication Info
- Year: 2020
- Type: article
- Volume: 43
- Issue: 11
- Pages: 4037-4058
- Citations: 1818
- Access: Closed
Identifiers
- DOI: 10.1109/tpami.2020.2992393