TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play

Gerald Tesauro

doi:10.1162/neco.1994.6.2.215

Abstract

TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, based on the TD(λ) reinforcement learning algorithm (Sutton 1988). Despite starting from random initial weights (and hence random initial strategy), TD-Gammon achieves a surprisingly strong level of play. With zero knowledge built in at the start of learning (i.e., given only a “raw” description of the board state), the network learns to play at a strong intermediate level. Furthermore, when a set of hand-crafted features is added to the network's input representation, the result is a truly staggering level of performance: the latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world's best human players.
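The TD(λ) update named in the abstract can be illustrated on a much smaller problem. The sketch below is not Tesauro's implementation: it assumes a tabular value function instead of a neural network, a toy five-state random walk instead of backgammon, and no self-play. The function name `td_lambda_random_walk` and all parameter values are hypothetical, chosen only for illustration. It shows how a temporal-difference error is spread over recently visited states through eligibility traces.

```python
import random


def td_lambda_random_walk(n_states=5, episodes=5000, alpha=0.02, lam=0.7, seed=0):
    """Estimate state values on a toy random walk with tabular TD(lambda).

    Assumed toy setup (not from the paper): states 0..n_states-1, start in
    the middle, step left or right with equal probability. Leaving on the
    right ends the episode with outcome 1, leaving on the left with outcome 0.
    No discounting is used.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # current value estimate for each state

    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces, reset every episode
        state = n_states // 2

        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                target, done = 0.0, True   # terminal outcome on the left
            elif next_state >= n_states:
                target, done = 1.0, True   # terminal outcome on the right
            else:
                target, done = values[next_state], False  # bootstrap from next estimate

            td_error = target - values[state]

            # Decay all traces by lambda, then mark the state just visited.
            for s in range(n_states):
                traces[s] *= lam
            traces[state] += 1.0

            # Move every estimate in proportion to its trace.
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]

            if done:
                break
            state = next_state

    return values


if __name__ == "__main__":
    print(td_lambda_random_walk())
```

For this symmetric walk the true values are 1/6, 2/6, ..., 5/6, so the returned estimates should settle near those numbers. In TD-Gammon the table would be replaced by a network's weights and the trace by a decaying sum of output gradients, but that substitution is outside what this sketch demonstrates.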

Keywords

Reinforcement learning, Set (abstract data type), Representation (politics), Computer science, Artificial intelligence, Artificial neural network

Affiliated Institutions

IBM Research - Thomas J. Watson Research Center US

Related Publications

Sigmoid-weighted linear units for neural network function approximation in reinforcement learning

Stefan Elfwing, Eiji Uchibe, Kenji Doya

In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon achieved near top-level hu...

2018 Neural Networks 1643 citations

Jonathan Baxter, Andrew Tridgell, Lex Weaver

In this paper we present TDLEAF(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our ...

2000 Machine Learning 132 citations

Temporal difference learning and TD-Gammon

Ever since the days of Shannon's proposal for a chess-playing algorithm [12] and Samuel's checkers-learning program [10] the domain of complex board games such as Go, chess, che...

1995 Communications of the ACM 1457 citations

Going deeper with convolutions

Christian Szegedy, Wei Liu, Yangqing Jia +6 more

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Sca...

2015 45596 citations

Learning hierarchical representations for face verification with convolutional deep belief networks

Guoyang Huang, Honglak Lee, Erik Learned-Miller

Most modern face recognition systems rely on a feature representation given by a hand-crafted image descriptor, such as Local Binary Patterns (LBP), and achieve improved perform...

2012 412 citations

Publication Info

Year: 1994
Type: article
Volume: 6
Issue: 2
Pages: 215-219
Citations: 783
Access: Closed

Cite This

APA Style

Gerald Tesauro (1994). TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play. Neural Computation, 6(2), 215-219. https://doi.org/10.1162/neco.1994.6.2.215
