Abstract

A new class of reinforcement schemes for learning automata that makes use of estimates of the random characteristics of the environment is introduced. Both a single automaton and a hierarchy of learning automata are considered. It is shown that under small values for the parameters, these algorithms converge in probability to the optimal choice of actions. By simulation it is observed that, for both cases, these algorithms converge quite rapidly. Finally, the generality of this method of designing learning schemes is pointed out, and it is shown that a very minor modification will enable the algorithm to learn in a multiteacher environment as well.
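
The central idea in the abstract, using running estimates of the environment's reward probabilities to drive the action-probability update, can be illustrated with a minimal Python sketch. It is a simplified pursuit-style rule chosen for illustration and is not the update rule from the paper. The function name `run_estimator_automaton`, the parameters `step`, `n_steps`, `init_samples`, and `seed`, and the reward probabilities in the usage line are all assumptions introduced here.

```python
import random


def run_estimator_automaton(reward_probs, step=0.01, n_steps=20000,
                            init_samples=20, seed=0):
    """Simulate one learning automaton in a stationary random environment.

    reward_probs[i] is the (unknown to the automaton) probability that
    action i is rewarded. Illustrative sketch only, not the paper's scheme.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r       # action probabilities, start uniform
    rewards = [0.0] * r     # total reward received per action
    counts = [0] * r        # number of times each action was tried

    def respond(action):
        # Environment response: 1.0 (reward) or 0.0 (penalty).
        return 1.0 if rng.random() < reward_probs[action] else 0.0

    # Try every action a few times so each estimate is defined.
    for i in range(r):
        for _ in range(init_samples):
            counts[i] += 1
            rewards[i] += respond(i)

    estimates = [rewards[i] / counts[i] for i in range(r)]
    for _ in range(n_steps):
        a = rng.choices(range(r), weights=p)[0]
        counts[a] += 1
        rewards[a] += respond(a)
        estimates[a] = rewards[a] / counts[a]

        # Move the probability vector a small step toward the action
        # whose estimated reward probability is currently highest.
        best = max(range(r), key=lambda i: estimates[i])
        p = [(1.0 - step) * pi for pi in p]
        p[best] += step

    return p, estimates


p, estimates = run_estimator_automaton([0.1, 0.3, 0.9])
```

In this sketch a smaller `step` makes the probabilities move more slowly, which gives the estimates more time to become accurate before the automaton commits to an action. This mirrors, informally, the abstract's condition that convergence holds for small parameter values.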

Keywords

Generality, Learning automata, Reinforcement learning, Computer science, Automaton, Hierarchy, Class (philosophy), Theoretical computer science, Automata theory, Artificial intelligence, Algorithm

Publication Info

Year: 1985
Type: article
Volume: SMC-15
Issue: 1
Pages: 168-175
Citations: 173
Access: Closed

Citation Metrics

173 (OpenAlex)

Cite This

M. A. L. Thathachar, P. S. Sastry (1985). A new approach to the design of reinforcement schemes for learning automata. IEEE Transactions on Systems, Man, and Cybernetics, SMC-15(1), 168-175. https://doi.org/10.1109/tsmc.1985.6313407

Identifiers

DOI
10.1109/tsmc.1985.6313407