Abstract

While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks, in spite of the hard non-linearity and non-differentiability at zero. They create sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra unlabeled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labeled datasets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty of training deep but purely supervised neural networks, and at closing the performance gap between neural networks learnt with and without unsupervised pre-training.
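
The rectifier activation the abstract refers to is f(x) = max(0, x). As a minimal illustrative sketch (not code from the paper; NumPy and the function names rectifier and rectifier_subgradient are assumptions made for this example), the following shows the hard non-linearity, the true zeros that make representations sparse, and one common convention for handling the non-differentiability at zero:

    # Illustrative sketch only; assumes NumPy. Function names are hypothetical.
    import numpy as np

    def rectifier(x):
        # Hard non-linearity: every negative pre-activation maps to an
        # exact zero, so hidden representations contain true zeros.
        return np.maximum(0.0, x)

    def rectifier_subgradient(x):
        # f(x) = max(0, x) is non-differentiable at x = 0; a common
        # convention uses 0 there (derivative is 1 for x > 0, 0 for x < 0).
        return (x > 0).astype(x.dtype)

    pre_activations = np.array([-1.5, -0.2, 0.0, 0.3, 2.0])
    print(rectifier(pre_activations))              # [0.  0.  0.  0.3 2. ]
    print(rectifier_subgradient(pre_activations))  # [0. 0. 0. 1. 1.]

Note how three of the five units are exactly zero after rectification; this exact sparsity, rather than the merely small values a sigmoid or tanh produces, is what the abstract means by "sparse representations with true zeros".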

Keywords

Sigmoid function, Rectifier (neural networks), Tangent, Artificial neural network, Hyperbolic function, Computer science, Yield (engineering), Layer (electronics), Work (physics), Artificial intelligence, Topology (electrical circuits), Mathematics, Engineering, Mathematical analysis, Recurrent neural network, Types of artificial neural networks, Materials science, Geometry, Mechanical engineering

Publication Info

Year: 2012
Type: preprint
Citations: 5408 (via OpenAlex)
Access: Closed

Cite This

Xavier Glorot, Antoine Bordes, Yoshua Bengio (2012). Deep Sparse Rectifier Neural Networks.