Abstract
I present a new way to parallelize the training of convolutional neural networks across multiple GPUs. The method scales significantly better than all alternatives when applied to modern convolutional neural networks.
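The abstract does not spell out the mechanism, but for orientation, here is a minimal sketch of the general family of techniques involved: synchronous data-parallel CNN training across multiple GPUs. This is an illustrative assumption on my part (a toy model, random data, and PyTorch's torch.nn.DataParallel wrapper), not the paper's specific parallelization scheme or code.

```python
# Minimal sketch of synchronous data parallelism for CNN training.
# Assumptions: toy CNN, random data, single-process multi-GPU via
# nn.DataParallel (multi-process DistributedDataParallel is the more
# common production choice; this is kept simple for illustration).
import torch
import torch.nn as nn

# Small example CNN (illustrative, not the paper's architecture).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

device = "cuda" if torch.cuda.is_available() else "cpu"
if torch.cuda.device_count() > 1:
    # Splits each mini-batch across GPUs, runs replicas in parallel,
    # and accumulates gradients on the default device.
    model = nn.DataParallel(model)
model = model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One synthetic training step (random tensors stand in for a real loader).
inputs = torch.randn(64, 3, 32, 32, device=device)
targets = torch.randint(0, 10, (64,), device=device)

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```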
Publication Info
- Year: 2014
- Type: preprint
- Citations: 980
- Access: Closed
Identifiers
- DOI: 10.48550/arxiv.1404.5997