A new approach to the design of reinforcement schemes for learning automata

Abstract

A new class of reinforcement schemes for learning automata that makes use of estimates of the random characteristics of the environment is introduced. Both a single automaton and a hierarchy of learning automata are considered. It is shown that under small values for the parameters, these algorithms converge in probability to the optimal choice of actions. By simulation it is observed that, for both cases, these algorithms converge quite rapidly. Finally, the generality of this method of designing learning schemes is pointed out, and it is shown that a very minor modification will enable the algorithm to learn in a multiteacher environment as well.
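The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of an estimator-based scheme: the automaton keeps running estimates of each action's reward probability and shifts its action probabilities toward the action whose estimate is currently highest. It is written as a simplified pursuit-style update, which is an assumption made for illustration and not necessarily the scheme proposed in the paper. The function name `run_estimator_automaton`, the parameters `step`, `n_steps`, `init_samples`, and the simulated Bernoulli environment are all hypothetical.

```python
import random


def run_estimator_automaton(reward_probs, step=0.01, n_steps=20000,
                            init_samples=10, seed=0):
    """Illustrative estimator-based learning automaton (pursuit-style sketch).

    reward_probs: true reward probability of each action in a simulated
    stationary environment (unknown to the automaton itself).
    Returns the final action-probability vector.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    probs = [1.0 / r] * r   # action probabilities, initially uniform
    counts = [0] * r        # times each action was tried
    rewards = [0] * r       # rewards received per action

    def pull(action):
        # Simulated environment: reward 1 with probability reward_probs[action].
        counts[action] += 1
        rewards[action] += int(rng.random() < reward_probs[action])

    # Try every action a few times so each estimate is defined.
    for a in range(r):
        for _ in range(init_samples):
            pull(a)

    for _ in range(n_steps):
        # Choose an action according to the current probability vector.
        a = rng.choices(range(r), weights=probs)[0]
        pull(a)

        # Estimates of the environment's reward probabilities.
        estimates = [rewards[i] / counts[i] for i in range(r)]
        best = max(range(r), key=lambda i: estimates[i])

        # Move the probability vector a small step toward the unit vector
        # of the action with the highest current estimate.
        probs = [(1.0 - step) * p for p in probs]
        probs[best] += step

    return probs
```

Calling `run_estimator_automaton([0.1, 0.9, 0.3])` returns a probability vector over the three actions. A smaller `step` makes the update more conservative, which loosely mirrors the abstract's condition that convergence to the optimal action holds for small parameter values.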

Keywords

Generality, Learning automata, Reinforcement learning, Computer science, Automaton, Hierarchy, Class (philosophy), Theoretical computer science, Automata theory, Artificial intelligence, Algorithm

Affiliated Institutions

Indian Institute of Science, Bangalore, IN

Related Publications

Pattern-recognizing stochastic learning automata

Andrew G. Barto, P. Anandan

A class of learning tasks is described that combines aspects of learning automaton tasks and supervised learning pattern-classification tasks. These tasks are called associativ...

1985 IEEE Transactions on Systems Man and ... 309 citations

Reinforcement Learning: A Survey

Leslie Pack Kaelbling, Michael L. Littman, Andrew Moore

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both th...

1996 Journal of Artificial Intelligence Re... 8505 citations

Sequencing Aspects of Multiprogramming

Jack Heller

Author: J. Heller, Institute of Mathematical Sciences, New York University, New York, N. Y. ...

1961 Journal of the ACM 28 citations

Scheduling: Theory, Algorithms, and Systems

Michael Pinedo

Efficient scheduling of resources is critical to the proper functioning of businesses in today's competitive environment. Scheduling focuses on theoretical as well as applied as...

1996 IIE Transactions 6280 citations

Convergence of the Nelder-Mead Simplex Method to a Nonstationary Point

K. I. M. McKinnon

This paper analyzes the behavior of the Nelder-Mead simplex method for a family of examples which cause the method to converge to a nonstationary point. All the examples use co...

1998 SIAM Journal on Optimization 469 citations

Publication Info

Year: 1985
Type: article
Volume: SMC-15
Issue: 1
Pages: 168-175
Citations: 173
Access: Closed

Citation Metrics

173 (OpenAlex)

Cite This

APA Style

Thathachar, M. A. L., & Sastry, P. S. (1985). A new approach to the design of reinforcement schemes for learning automata. IEEE Transactions on Systems, Man, and Cybernetics, SMC-15(1), 168-175. https://doi.org/10.1109/tsmc.1985.6313407

Identifiers

DOI: 10.1109/tsmc.1985.6313407