Abstract
Grid search and manual search are the most widely used strategies for hyper-parameter optimization. This paper shows empirically and theoretically that randomly chosen trials are more efficient for hyper-parameter optimization than trials on a grid. Empirical evidence comes from a comparison with a large previous study that used grid search and manual search to configure neural networks and deep belief networks. Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as good or better within a small fraction of the computation time. Granting random search the same computational budget, random search finds better models by effectively searching a larger, less promising configuration space. Compared with deep belief networks configured by a thoughtful combination of manual search and grid search, purely random search over the same 32-dimensional configuration space found statistically equal performance on four of seven data sets, and superior performance on one of seven. A Gaussian process analysis of the function from hyper-parameters to validation set performance reveals that for most data sets only a few of the hyper-parameters really matter, but that different hyper-parameters are important on different data sets. This phenomenon makes grid search a poor choice for configuring algorithms for new data sets.
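The intuition behind the abstract's central claim can be illustrated with a small sketch (a hypothetical toy objective, not the paper's experimental setup): when only one of two hyper-parameters really matters, a grid of 16 trials tests just 4 distinct values of the important parameter, whereas 16 random trials test 16 distinct values of it.

```python
import random

# Toy validation "score" in which hyper-parameter x matters a thousand
# times more than y, mimicking the low effective dimensionality the
# paper reports. Higher is better; the optimum is at (0.5, 0.5).
def score(x, y):
    return -(x - 0.5) ** 2 - 0.001 * (y - 0.5) ** 2

def grid_search(n_per_axis):
    """Exhaustive grid: n_per_axis**2 trials in total, but only
    n_per_axis distinct values of the important parameter x."""
    pts = [i / (n_per_axis - 1) for i in range(n_per_axis)]
    return max(score(x, y) for x in pts for y in pts)

def random_search(n_trials, seed=0):
    """Random search: every trial contributes a fresh value of x."""
    rng = random.Random(seed)
    return max(score(rng.random(), rng.random()) for _ in range(n_trials))

best_grid = grid_search(4)     # 16 trials, x ∈ {0, 1/3, 2/3, 1}
best_rand = random_search(16)  # 16 trials, 16 distinct x values
```

With the same 16-trial budget, the grid can place x no closer than 1/6 to its optimum, while random search almost always lands a trial closer than that, which is the mechanism behind the paper's efficiency argument.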
Publication Info
- Year: 2012
- Type: article
- Volume: 13
- Issue: 1
- Pages: 281-305