Abstract

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
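
For orientation, below is a minimal sketch (in PyTorch) of the kind of model the abstract describes: a convolutional network mapping raw pixel frames to per-action value estimates, trained toward a one-step Q-learning target. The layer sizes, frame-stack depth, and hyperparameters here are illustrative assumptions rather than the paper's exact configuration, and the sketch omits components such as experience replay and frame preprocessing.

    # Sketch of a convolutional Q-network and a one-step Q-learning loss.
    # Sizes and constants are illustrative assumptions, not the paper's exact setup.
    import torch
    import torch.nn as nn

    class QNetwork(nn.Module):
        def __init__(self, num_actions: int, in_frames: int = 4):
            super().__init__()
            # Convolutional feature extractor over a stack of grayscale frames.
            self.features = nn.Sequential(
                nn.Conv2d(in_frames, 16, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            # Fully connected head producing one value estimate per action.
            self.head = nn.Sequential(
                nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
                nn.Linear(256, num_actions),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.head(self.features(x))

    def q_learning_loss(net, batch, gamma=0.99):
        """Squared error between Q(s, a) and the bootstrapped target
        r + gamma * max_a' Q(s', a'), with no bootstrap at terminal states."""
        s, a, r, s_next, done = batch
        q_sa = net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = r + gamma * (1.0 - done) * net(s_next).max(dim=1).values
        return nn.functional.mse_loss(q_sa, target)

    if __name__ == "__main__":
        net = QNetwork(num_actions=6)
        # Dummy batch of 84x84 four-frame stacks, just to exercise the shapes.
        s = torch.rand(8, 4, 84, 84)
        a = torch.randint(0, 6, (8,))
        r = torch.rand(8)
        done = torch.zeros(8)
        loss = q_learning_loss(net, (s, a, r, s, done))
        loss.backward()
        print(float(loss))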

Keywords

Reinforcement learning, Computer science, Artificial intelligence, Convolutional neural network, Deep learning, Function (biology), Bellman equation, Value (mathematics), Architecture, Control (management), Machine learning, Pixel, Q-learning, Reinforcement, Mathematics, Engineering, Mathematical optimization

Publication Info

Year: 2013
Type: Preprint
Citations: 5109 (OpenAlex)
Access: Closed

Cite This

Volodymyr Mnih, Koray Kavukcuoglu, David Silver et al. (2013). Playing Atari with Deep Reinforcement Learning. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1312.5602

Identifiers

DOI: 10.48550/arxiv.1312.5602