Abstract

A class of learning tasks is described that combines aspects of learning automaton tasks and supervised learning pattern-classification tasks. These tasks are called associative reinforcement learning tasks. An algorithm is presented, called the associative reward-penalty, or A_{R-P}, algorithm, for which a form of optimal performance is proved. This algorithm simultaneously generalizes a class of stochastic learning automata and a class of supervised learning pattern-classification methods related to the Robbins-Monro stochastic approximation procedure. The relevance of this hybrid algorithm is discussed with respect to the collective behaviour of learning automata and the behaviour of networks of pattern-classifying adaptive elements. Simulation results are presented that illustrate the associative reinforcement learning task and the performance of the A_{R-P} algorithm as compared with that of several existing algorithms.
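
The abstract does not state the update rule itself. As a rough illustration, the sketch below assumes the commonly cited form of the A_{R-P} rule for a single unit with actions in {+1, -1} and logistic action noise: on reward the weights move toward the action just taken, and on penalty they move, scaled down by a factor lam, toward the opposite action. The function names, the step sizes rho and lam, and the two-context task in run_demo are hypothetical choices made for illustration and are not taken from the paper.

```python
import math
import random


def expected_action(theta, x):
    """E[a | theta, x] for a in {+1, -1}, assuming logistic action noise."""
    s = sum(t * xi for t, xi in zip(theta, x))
    return math.tanh(s / 2.0)  # equals 2 * sigmoid(s) - 1


def select_action(theta, x, rng):
    """Sample +1 with probability sigmoid(theta . x), otherwise -1."""
    p_plus = (1.0 + expected_action(theta, x)) / 2.0
    return 1 if rng.random() < p_plus else -1


def arp_update(theta, x, a, rewarded, rho=0.1, lam=0.05):
    """One step of the assumed A_{R-P} form.

    Reward:  theta += rho * (a - E[a]) * x
    Penalty: theta += lam * rho * (-a - E[a]) * x
    rho and lam are illustrative values, not the paper's settings.
    """
    e = expected_action(theta, x)
    if rewarded:
        step = rho * (a - e)
    else:
        step = lam * rho * (-a - e)
    return [t + step * xi for t, xi in zip(theta, x)]


def run_demo(steps=5000, seed=0):
    """Hypothetical associative task: two context vectors, each with its
    own reward probability per action. Action +1 is better in the first
    context and action -1 is better in the second."""
    rng = random.Random(seed)
    contexts = [
        ([1.0, 0.0], {1: 0.9, -1: 0.1}),
        ([0.0, 1.0], {1: 0.2, -1: 0.8}),
    ]
    theta = [0.0, 0.0]
    for _ in range(steps):
        x, p_reward = rng.choice(contexts)
        a = select_action(theta, x, rng)
        rewarded = rng.random() < p_reward[a]
        theta = arp_update(theta, x, a, rewarded)
    return theta
```

Calling run_demo() returns the learned weight vector; in this toy task the sign of each weight indicates which action the unit has come to prefer in the corresponding context. Setting lam to 0 gives a reward-only variant of the same sketch.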

Keywords

Associative property; Computer science; Reinforcement learning; Learning automata; Artificial intelligence; Class (philosophy); Associative learning; Automaton; Machine learning; Relevance (law); Task (project management); Algorithm; Theoretical computer science; Mathematics

Publication Info

Year: 1985
Type: article
Volume: SMC-15
Issue: 3
Pages: 360-375
Citations: 309
Access: Closed

Citation Metrics

OpenAlex: 309

Cite This

Andrew G. Barto, P. Anandan (1985). Pattern-recognizing stochastic learning automata. IEEE Transactions on Systems, Man, and Cybernetics, SMC-15(3), 360-375. https://doi.org/10.1109/tsmc.1985.6313371

Identifiers

DOI: 10.1109/tsmc.1985.6313371