Deep Sparse Rectifier Neural Networks
2011
5,408 citations
While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks, in spite of the hard non-linearity and non-differentiability at zero. They create sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though rectifier networks can take advantage of semi-supervised setups with extra unlabeled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labeled datasets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty of training deep but purely supervised neural networks, and at closing the performance gap between neural networks learnt with and without unsupervised pre-training.
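To make the rectifier's behaviour concrete, here is a minimal NumPy sketch (illustrative, not from the paper): the activation max(0, x), the one-sided subgradient conventionally used at the non-differentiable point x = 0, and the exact zeros that yield sparse hidden representations. All names and values are hypothetical.

import numpy as np

def rectifier(x):
    # Hard non-linearity: max(0, x); not differentiable at x = 0.
    return np.maximum(0.0, x)

def rectifier_subgradient(x):
    # Gradient used in practice: 1 for x > 0, 0 for x <= 0
    # (a one-sided choice at the kink at zero).
    return (x > 0).astype(x.dtype)

# Hypothetical pre-activations for a layer of eight hidden units.
pre_activations = np.array([-1.3, 0.7, -0.2, 2.1, 0.0, -0.9, 1.5, -2.4])
hidden = rectifier(pre_activations)

print(hidden)                  # non-positive inputs map to true zeros
print(np.mean(hidden == 0.0))  # fraction of exactly-zero units: 0.625

Because the subgradient is exactly zero for inactive units, gradients flow only through the active part of the network, which is what allows the hard zeros, and hence the sparsity, to persist during training.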