Temporal difference learning and TD-Gammon

Communications of the ACM, 1995 (1,457 citations)

Abstract

Ever since the days of Shannon's proposal for a chess-playing algorithm [12] and Samuel's checkers-learning program [10], the domain of complex board games such as Go, chess, checkers, Othello, and backgammon has been widely regarded as an ideal testing ground for exploring a variety of concepts and approaches in artificial intelligence and machine learning. Such board games offer the challenge of tremendous complexity and sophistication required to play at expert level. At the same time, the problem inputs and performance measures are clear-cut and well defined, and the game environment is readily automated in that it is easy to simulate the board, the rules of legal play, and the rules for determining when the game is over and what the outcome is.
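For context on the paper's subject, the sketch below shows the TD(λ) update the work builds on, written for a linear value function with accumulating eligibility traces in the standard Sutton formulation. TD-Gammon itself trained a multilayer neural network on backgammon-specific features, so the function td_lambda_episode, its parameters, and the feature encoding here are illustrative assumptions rather than the paper's implementation.

import numpy as np

# Minimal TD(lambda) update over one game, using a linear value function
# and accumulating eligibility traces (gamma = 1, reward only at game end).
# This is the textbook formulation; TD-Gammon's network, features, and
# hyperparameters differ and are described in the paper itself.
def td_lambda_episode(states, outcome, w, alpha=0.1, lam=0.7):
    # states  : list of feature vectors x_1 .. x_T observed during one game
    # outcome : final result z (e.g. 1.0 for a win, 0.0 for a loss)
    # w       : weight vector of the linear value estimate V(x) = w . x
    e = np.zeros_like(w)                      # eligibility trace
    for t in range(len(states)):
        x_t = states[t]
        v_t = w @ x_t                         # current prediction Y_t
        # next prediction Y_{t+1}, or the actual outcome at the end of the game
        v_next = outcome if t == len(states) - 1 else w @ states[t + 1]
        e = lam * e + x_t                     # gradient of V w.r.t. w is x_t for a linear model
        w = w + alpha * (v_next - v_t) * e    # TD(lambda) weight update
    return w

In a self-play setup of the kind the paper describes, each game would supply its sequence of board encodings as states and its final result as outcome; repeating the update over many games drives the value estimates toward the observed outcomes.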

Keywords

Sophistication, Computer science, Variety (cybernetics), Artificial intelligence, Ideal (ethics), Domain (mathematical analysis), Outcome (game theory), Machine learning, Mathematics, Mathematical economics

Related Publications

In this paper we present TDLEAF(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our ...

Machine Learning, 2000 (132 citations)

Publication Info

Year: 1995
Type: Article
Volume: 38
Issue: 3
Pages: 58-68
Citations: 1,457
Access: Closed

Citation Metrics

1,457 citations (source: OpenAlex)

Cite This

Tesauro, G. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3), 58-68. https://doi.org/10.1145/203330.203343

Identifiers

DOI
10.1145/203330.203343