Abstract
A class of learning tasks is described that combines aspects of learning automaton tasks and supervised learning pattern-classification tasks. These tasks are called associative reinforcement learning tasks. An algorithm is presented, called the associative reward-penalty, or A_{R-P}, algorithm, for which a form of optimal performance is proved. This algorithm simultaneously generalizes a class of stochastic learning automata and a class of supervised learning pattern-classification methods related to the Robbins-Monro stochastic approximation procedure. The relevance of this hybrid algorithm is discussed with respect to the collective behaviour of learning automata and the behaviour of networks of pattern-classifying adaptive elements. Simulation results are presented that illustrate the associative reinforcement learning task and the performance of the A_{R-P} algorithm as compared with that of several existing algorithms.
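The abstract names the A_{R-P} algorithm but does not state its update rule. As a rough illustration only, the sketch below follows the form in which the rule is commonly summarized in later literature: a single two-action unit picks an action stochastically from a weighted sum of its input, then moves its weights toward the chosen action on reward and, at a reduced rate, toward the opposite action on penalty. The logistic action probability, the parameter names `rho` and `lam`, the helper names `arp_step` and `train`, and the two-context toy task are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def arp_step(theta, x, reward_fn, rho=0.5, lam=0.05, rng=random):
    """One update of a single two-action unit; returns (action, reward).

    Assumed form: a in {+1, -1}, P(a = +1) is logistic in theta . x,
    rho is the learning rate, lam scales the update on penalty.
    """
    s = sum(t * xi for t, xi in zip(theta, x))
    s = max(-30.0, min(30.0, s))             # clamp to keep exp() well behaved
    p_plus = 1.0 / (1.0 + math.exp(-s))      # P(a = +1 | theta, x)
    a = 1 if rng.random() < p_plus else -1   # stochastic action choice
    expected_a = 2.0 * p_plus - 1.0          # E[a | theta, x]
    b = reward_fn(x, a)                      # 1 = reward, 0 = penalty
    if b == 1:
        delta = rho * (a - expected_a)       # move toward the action taken
    else:
        delta = lam * rho * (-a - expected_a)  # move toward the other action
    for i, xi in enumerate(x):
        theta[i] += delta * xi
    return a, b


def train(steps=5000, seed=0):
    """Hypothetical toy task: two contexts, each with a different best action."""
    rng = random.Random(seed)
    contexts = [(1.0, 0.0), (0.0, 1.0)]
    best = {contexts[0]: 1, contexts[1]: -1}

    def reward_fn(x, a):
        # The better action is rewarded with probability 0.9, the other with 0.1.
        p_reward = 0.9 if a == best[x] else 0.1
        return 1 if rng.random() < p_reward else 0

    theta = [0.0, 0.0]
    for _ in range(steps):
        arp_step(theta, rng.choice(contexts), reward_fn, rng=rng)
    return theta
```

Calling `train()` returns the learned weight vector. In this toy setting the first weight would be expected to end up positive and the second negative, reflecting the different preferred action in each context; the task is far simpler than those studied in the paper.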
Related Publications
On the use of backpropagation in associative reinforcement learning
A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks. One way is to train a second ...
The Best Two Independent Measurements Are Not the Two Best
Consider an item that belongs to one of two classes, θ = 0 or θ = 1, with equal probability. Suppose also that there are two measurement experiments E ...
Maximum distance q-nary codes
A q-nary error-correcting code with ...
On two or more dimensional optimum quantizers
It is hard to compute the performance of an N-level K-dimensional optimum quantizer ...
Fast evaluation of logarithms in fields of characteristic two
A method for determining logarithms in GF(2^n) is presented. Its asymptot...
Publication Info
- Year: 1985
- Type: article
- Volume: SMC-15
- Issue: 3
- Pages: 360-375
- Citations: 309
- Access: Closed
Identifiers
- DOI: 10.1109/tsmc.1985.6313371