A survey on semi-supervised learning

Abstract

Semi-supervised learning is the branch of machine learning concerned with using labelled as well as unlabelled data to perform certain learning tasks. Conceptually situated between supervised and unsupervised learning, it permits harnessing the large amounts of unlabelled data available in many use cases in combination with typically smaller sets of labelled data. In recent years, research in this area has followed the general trends observed in machine learning, with much attention directed at neural network-based models and generative learning. The literature on the topic has also expanded in volume and scope, now encompassing a broad spectrum of theory, algorithms and applications. However, no recent surveys exist to collect and organize this knowledge, impeding the ability of researchers and engineers alike to utilize it. Filling this void, we present an up-to-date overview of semi-supervised learning methods, covering earlier work as well as more recent advances. We focus primarily on semi-supervised classification, where the large majority of semi-supervised learning research takes place. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches and algorithms developed over the past two decades, with an emphasis on the most prominent and currently relevant work. Furthermore, we propose a new taxonomy of semi-supervised classification algorithms, which sheds light on the different conceptual and methodological approaches for incorporating unlabelled data into the training process. Lastly, we show how the fundamental assumptions underlying most semi-supervised learning algorithms are closely connected to each other, and how they relate to the well-known semi-supervised clustering assumption.
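To make the idea of combining labelled and unlabelled data concrete, the following is a minimal sketch of self-training, one simple way of incorporating unlabelled data into training. It is not taken from the survey itself: the 1-D nearest-centroid classifier, the function names (`fit_centroids`, `predict`, `self_train`), the `margin` confidence rule and the toy data are all assumptions chosen for illustration.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Fit a 1-D nearest-centroid classifier: one mean per class."""
    return {
        label: mean(p for p, l in zip(points, labels) if l == label)
        for label in set(labels)
    }


def predict(centroids, x):
    """Return the label of the centroid closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def self_train(labelled, unlabelled, rounds=5, margin=1.0):
    """Illustrative self-training loop (assumes at least two classes).

    In each round, unlabelled points whose two nearest centroids differ
    in distance by at least `margin` are pseudo-labelled and added to
    the training set; the classifier is then refitted.
    """
    points = [x for x, _ in labelled]
    labels = [y for _, y in labelled]
    pool = list(unlabelled)
    for _ in range(rounds):
        centroids = fit_centroids(points, labels)
        confident = []
        for x in pool:
            dists = sorted(abs(x - c) for c in centroids.values())
            if dists[1] - dists[0] >= margin:
                confident.append(x)
        if not confident:
            break
        for x in confident:
            points.append(x)
            labels.append(predict(centroids, x))
        pool = [x for x in pool if x not in confident]
    return fit_centroids(points, labels)


# Toy data (hypothetical): two labelled points, seven unlabelled points.
labelled = [(0.0, "a"), (10.0, "b")]
unlabelled = [2.0, 3.0, 4.0, 3.5, 9.0, 9.5, 8.5]

supervised = fit_centroids([x for x, _ in labelled], [y for _, y in labelled])
semi_supervised = self_train(labelled, unlabelled)

print(predict(supervised, 5.5))       # labelled data only
print(predict(semi_supervised, 5.5))  # labelled plus unlabelled data
```

In this toy setting the unlabelled points shift the class centroids, so the query point 5.5 is assigned to a different class once they are used. Whether unlabelled data helps in practice depends on assumptions about the data distribution, such as those the abstract refers to.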

Keywords

Artificial intelligence, Machine learning, Computer science, Supervised learning, Unsupervised learning, Scope (computer science), Semi-supervised learning, Artificial neural network, Field (mathematics), Data science, Mathematics

Affiliated Institutions

Leiden University NL

Publication Info

Year: 2019
Type: article
Volume: 109
Issue: 2
Pages: 373-440
Citations: 2322
Access: Closed

Cite This

APA Style

Jesper E. van Engelen, Holger H. Hoos (2019). A survey on semi-supervised learning. Machine Learning, 109(2), 373-440. https://doi.org/10.1007/s10994-019-05855-6

Identifiers

DOI: 10.1007/s10994-019-05855-6