Abstract
Deep learning tools have gained tremendous attention in applied machine learning. However such tools for regression and classification do not capture model uncertainty. In comparison, Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost. In this paper we develop a new theoretical framework casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes. A direct result of this theory gives us tools to model uncertainty with dropout NNs -- extracting information from existing models that has been thrown away so far. This mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy. We perform an extensive study of the properties of dropout's uncertainty. Various network architectures and non-linearities are assessed on tasks of regression and classification, using MNIST as an example. We show a considerable improvement in predictive log-likelihood and RMSE compared to existing state-of-the-art methods, and finish by using dropout's uncertainty in deep reinforcement learning.
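The core practical recipe the abstract describes (Monte Carlo dropout) is simple: keep dropout active at test time, run several stochastic forward passes, and read the sample mean as the prediction and the sample spread as model uncertainty. A minimal sketch of that idea, with a hypothetical toy network (the weights, layer sizes, and dropout rate below are illustrative assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-hidden-layer regression network with random fixed weights.
W1 = rng.standard_normal((1, 50))
b1 = np.zeros(50)
W2 = rng.standard_normal((50, 1))
p_drop = 0.1  # assumed dropout probability

def stochastic_forward(x):
    """One forward pass with dropout left ON -- the key idea of MC dropout."""
    h = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop      # Bernoulli dropout mask
    h = h * mask / (1.0 - p_drop)             # inverted-dropout scaling
    return h @ W2

def mc_dropout_predict(x, T=100):
    """Average T stochastic passes: mean is the prediction, std the uncertainty."""
    samples = np.stack([stochastic_forward(x) for _ in range(T)])
    return samples.mean(axis=0), samples.std(axis=0)

x = np.array([[0.5]])
mean, std = mc_dropout_predict(x)
```

Because each pass draws a fresh dropout mask, the T outputs differ, and their standard deviation gives the uncertainty estimate that a single deterministic pass would throw away.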
Related Publications
Gaussian Processes for Machine Learning
We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over...
PILCO: A Model-Based and Data-Efficient Approach to Policy Search
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement l...
Gaussian Process Priors with Uncertain Inputs Application to Multiple-Step Ahead Time Series Forecasting
We consider the problem of multi-step ahead prediction in time series analysis using the non-parametric Gaussian process model. k-step ahead forecasting of a discrete-time non-l...
Surrogate Gradient Learning in Spiking Neural Networks: Bringing the Power of Gradient-Based Optimization to Spiking Neural Networks
Spiking neural networks (SNNs) are nature's versatile solution to fault-tolerant, energy-efficient signal processing. To translate these benefits into hardware, a growing number...
kernlab - An S4 Package for Kernel Methods in R
kernlab is an extensible package for kernel-based machine learning methods in R. It takes advantage of R's new S4 object model and provides a framework for creating and using k...
Publication Info
- Year: 2015
- Type: preprint
- Citations: 4015
- Access: Closed
Identifiers
- DOI: 10.48550/arxiv.1506.02142