Pattern-recognizing stochastic learning automata

Andrew G. Barto; P. Anandan

doi:10.1109/tsmc.1985.6313371

Abstract

A class of learning tasks is described that combines aspects of learning automaton tasks and supervised learning pattern-classification tasks. These tasks are called associative reinforcement learning tasks. An algorithm is presented, called the associative reward-penalty, or A_{R-P}, algorithm, for which a form of optimal performance is proved. This algorithm simultaneously generalizes a class of stochastic learning automata and a class of supervised learning pattern-classification methods related to the Robbins-Monro stochastic approximation procedure. The relevance of this hybrid algorithm is discussed with respect to the collective behaviour of learning automata and the behaviour of networks of pattern-classifying adaptive elements. Simulation results are presented that illustrate the associative reinforcement learning task and the performance of the A_{R-P} algorithm as compared with that of several existing algorithms.
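The abstract names the associative reward-penalty algorithm but does not state its update rule. The following is a minimal Python sketch of one commonly cited form of such a rule, offered as an illustration rather than as the authors' exact formulation. Assumptions that are not taken from the text above: actions are +1 or -1, the action probability is a logistic function of the weighted input, `rho` is a constant step size, `lam` is a small penalty factor, and `toy_environment`, `train`, and the two input patterns are hypothetical names and data invented for this example.

```python
import math
import random


def prob_plus(s):
    """Numerically stable logistic: probability of choosing action +1."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def arp_step(theta, x, rewarded_fn, rng, rho=0.2, lam=0.05):
    """One trial of a reward-penalty style update (assumed form).

    theta: weight list, x: input pattern, rewarded_fn: hypothetical
    environment callback returning True on reward, False on penalty.
    """
    s = sum(t * xi for t, xi in zip(theta, x))
    p = prob_plus(s)
    a = 1 if rng.random() < p else -1   # stochastic action in {+1, -1}
    expected_a = 2.0 * p - 1.0          # E[a | theta, x]
    if rewarded_fn(x, a, rng):
        # Reward: move the expected action toward the action just taken.
        step = rho * (a - expected_a)
    else:
        # Penalty: move, more weakly (factor lam), toward the other action.
        step = lam * rho * (-a - expected_a)
    return [t + step * xi for t, xi in zip(theta, x)], a


def toy_environment(x, a, rng):
    """Hypothetical task: the better action depends on the input pattern."""
    best = 1 if x[0] == 1.0 else -1
    p_reward = 0.8 if a == best else 0.2
    return rng.random() < p_reward


def train(trials=4000, seed=0):
    """Run the update on two hypothetical patterns; return final weights."""
    rng = random.Random(seed)
    patterns = [[1.0, 0.0], [0.0, 1.0]]
    theta = [0.0, 0.0]
    for _ in range(trials):
        x = rng.choice(patterns)
        theta, _action = arp_step(theta, x, toy_environment, rng)
    return theta
```

In this toy setting, `train()` should tend to drive the first weight positive and the second negative, so that each input pattern becomes associated with the action that is rewarded more often. That is the sense in which the task is associative: the reward probabilities depend on the input as well as on the action.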

Keywords

Associative property; Computer science; Reinforcement learning; Learning automata; Artificial intelligence; Class (philosophy); Associative learning; Automaton; Machine learning; Relevance (law); Task (project management); Algorithm; Theoretical computer science; Mathematics

Affiliated Institutions

University of Massachusetts Amherst US

Related Publications

On the use of backpropagation in associative reinforcement learning

Williams

A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks. One way is to train a second ...

1988 IEEE International Conference on Neur... 74 citations

The Best Two Independent Measurements Are Not the Two Best

Thomas M. Cover

Consider an item that belongs to one of two classes, θ = 0 or θ = 1, with equal probability. Suppose also that there are two measurement experiments E ...

1974 IEEE Transactions on Systems Man and ... 250 citations

Maximum distance q-nary codes

R. C. Singleton

A q-nary error-correcting code with ...

1964 IEEE Transactions on Information Theory 508 citations

On two or more dimensional optimum quantizers

D. Chen

It is hard to compute the performance of an N-level K-dimensional optimum quantizer ...

2005 17 citations

Fast evaluation of logarithms in fields of characteristic two

Don Coppersmith

A method for determining logarithms in GF(2^n) is presented. Its asymptot...

1984 IEEE Transactions on Information Theory 299 citations

Publication Info

Year: 1985
Type: article
Volume: SMC-15
Issue: 3
Pages: 360-375
Citations: 309
Access: Closed

Cite This

APA Style

                            
Barto, A. G., & Anandan, P. (1985). Pattern-recognizing stochastic learning automata. IEEE Transactions on Systems, Man, and Cybernetics, SMC-15(3), 360-375. https://doi.org/10.1109/tsmc.1985.6313371

Identifiers

DOI: 10.1109/tsmc.1985.6313371