An Overview of Overfitting and its Solutions

Ying Xue
2019 · Journal of Physics: Conference Series · 2,055 citations

Abstract

Overfitting is a fundamental issue in supervised machine learning: it prevents a model from generalizing well, so that the model fits the observed training data closely but performs poorly on unseen test data. Overfitting arises from the presence of noise, the limited size of the training set, and the complexity of classifiers. This paper discusses overfitting from the perspectives of its causes and its solutions. To reduce the effects of overfitting, various strategies have been proposed to address these causes: 1) an "early-stopping" strategy prevents overfitting by stopping training before performance stops improving; 2) a "network-reduction" strategy excludes noise from the training set; 3) a "data-expansion" strategy lets complicated models fine-tune their hyper-parameter sets with a large amount of data; and 4) a "regularization" strategy preserves model performance on real-world problems to a great extent through feature selection, distinguishing more useful features from less useful ones.
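Two of the strategies listed above, early stopping (1) and regularization (4), can be illustrated together in a minimal sketch. The snippet below is not from the paper; it is a hypothetical example that trains a ridge-style linear model by gradient descent on synthetic noisy data, applies an L2 penalty (the regularization strategy), and halts when the held-out validation loss stops improving (the early-stopping strategy). All variable names and hyper-parameter values (`lam`, `lr`, `patience`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: a noisy linear target.
# (Noise in the data is one of the causes of overfitting named in the paper.)
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = X @ true_w + rng.normal(scale=0.5, size=200)

# Split into a training set and a validation set.
X_tr, y_tr = X[:150], y[:150]
X_va, y_va = X[150:], y[150:]

def mse(w, X, y):
    """Mean squared error of the linear model with weights w."""
    return float(np.mean((X @ w - y) ** 2))

# Gradient descent with an L2 penalty ("regularization" strategy)
# and a patience counter on validation loss ("early-stopping" strategy).
lam, lr, patience = 0.1, 0.01, 10      # illustrative hyper-parameters
w = np.zeros(10)
best_w, best_loss, wait = w.copy(), float("inf"), 0
for step in range(5000):
    # Ridge gradient: data-fit term plus the L2 regularization term.
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr) + 2 * lam * w
    w -= lr * grad
    val_loss = mse(w, X_va, y_va)
    if val_loss < best_loss:
        best_w, best_loss, wait = w.copy(), val_loss, 0
    else:
        wait += 1
        if wait >= patience:           # validation loss stopped improving
            break                      # stop training before overfitting sets in

print(f"validation MSE at stop: {best_loss:.3f}")
```

The early-stopping rule keeps the weights from the best validation step (`best_w`) rather than the final step, which matches the idea of stopping training before performance on unseen data degrades.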

Keywords

Overfitting, Early stopping, Computer science, Machine learning, Artificial intelligence, Set (abstract data type), Training set, Feature selection, Regularization (linguistics), Feature (linguistics), Selection (genetic algorithm), Data set, Data mining, Pattern recognition (psychology), Artificial neural network

Publication Info

Year: 2019
Type: article
Volume: 1168
Pages: 022022
Citations: 2,055
Access: Closed


Citation Metrics

OpenAlex: 2,055
Influential: 45
CrossRef: 1,478

Cite This

Ying Xue (2019). An Overview of Overfitting and its Solutions. Journal of Physics: Conference Series, 1168, 022022. https://doi.org/10.1088/1742-6596/1168/2/022022

Identifiers

DOI
10.1088/1742-6596/1168/2/022022
