Abstract
A new class of reinforcement schemes for learning automata is introduced that uses estimates of the random characteristics of the environment. Both a single automaton and a hierarchy of learning automata are considered. It is shown that, for sufficiently small values of the parameters, these algorithms converge in probability to the optimal choice of actions. Simulations indicate that the algorithms converge quite rapidly in both cases. Finally, the generality of this method of designing learning schemes is pointed out, and it is shown that a very minor modification enables the algorithm to learn in a multiteacher environment as well.
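To make the idea of an estimate-driven reinforcement scheme concrete, the following is a minimal sketch of a pursuit-style update, a well-known member of the estimator family. It is not the specific algorithm of this paper. The function name `run_pursuit`, the Bernoulli reward model, and all parameter values are assumptions chosen for illustration only.

```python
import random


def run_pursuit(reward_probs, steps=2000, rate=0.02, init_samples=30, seed=0):
    """Illustrative pursuit-style learning automaton (hypothetical helper).

    reward_probs: assumed Bernoulli reward probability of each action,
                  unknown to the automaton and used only to simulate feedback.
    rate:         small step size for the action-probability update.
    init_samples: initial trials per action (must be >= 1) to seed estimates.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    p = [1.0 / n] * n      # action probability vector
    rewards = [0] * n      # total reward received per action
    pulls = [0] * n        # number of times each action was tried

    def sample(i):
        # Simulated environment response: 1 = reward, 0 = penalty.
        return 1 if rng.random() < reward_probs[i] else 0

    # Seed the reward estimates by trying every action a few times.
    for i in range(n):
        for _ in range(init_samples):
            rewards[i] += sample(i)
            pulls[i] += 1

    for _ in range(steps):
        # Choose an action according to the current probability vector.
        action = rng.choices(range(n), weights=p)[0]
        rewards[action] += sample(action)
        pulls[action] += 1

        # Estimate each action's reward probability from its history.
        estimates = [rewards[i] / pulls[i] for i in range(n)]
        best = max(range(n), key=lambda i: estimates[i])

        # Move the probability vector a small step toward the unit
        # vector of the action with the highest current estimate.
        p = [(1 - rate) * p[i] + (rate if i == best else 0.0) for i in range(n)]

    estimates = [rewards[i] / pulls[i] for i in range(n)]
    return p, estimates


if __name__ == "__main__":
    probs, est = run_pursuit([0.1, 0.9, 0.3])
    print("action probabilities:", probs)
    print("reward estimates:", est)
```

The update depends on the running estimates rather than only on the most recent response, which is the feature the abstract highlights. A small `rate` mirrors the small-parameter condition under which convergence is claimed.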
Related Publications
Pattern-recognizing stochastic learning automata
A class of learning tasks is described that combines aspects of learning automaton tasks and supervised learning pattern-classification tasks. These tasks are called associativ...
Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both th...
Sequencing Aspects of Multiprogramming
Author: J. Heller, Institute of Mathematical Sciences, New York University, New York, N. Y. Institute of Mathematical Sc...
Scheduling: Theory, Algorithms, and Systems
Efficient scheduling of resources is critical to the proper functioning of businesses in today's competitive environment. Scheduling focuses on theoretical as well as applied as...
Convergence of the Nelder--Mead Simplex Method to a Nonstationary Point
This paper analyzes the behavior of the Nelder--Mead simplex method for a family of examples which cause the method to converge to a nonstationary point. All the examples use co...
Publication Info
- Year: 1985
- Type: article
- Volume: SMC-15
- Issue: 1
- Pages: 168-175
- Citations: 173
- Access: Closed
Identifiers
- DOI: 10.1109/tsmc.1985.6313407