Abstract
The principal contribution of this paper is a new result on the decentralized control of finite Markov chains with unknown transition probabilities and rewards. One decentralized decision maker is associated with each state in which two or more actions (decisions) are available. Each decision maker uses a simple learning scheme, requiring minimal information, to update its action choice. It is shown that, if updating is done in sufficiently small steps, the group will converge to the policy that maximizes the long-term expected reward per step. The analysis is based on learning in sequential stochastic games and on certain properties, derived in this paper, of ergodic Markov chains. A new result on convergence in identical payoff games with a unique equilibrium point is also presented.
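The abstract does not name the learning scheme, so the following is only an illustrative sketch of one classical small-step rule from the learning automata literature, the linear reward-inaction update. The function name `reward_inaction_update`, the normalization of the reward to [0, 1], and the step size value are assumptions made for this example, not details taken from the paper.

```python
def reward_inaction_update(probs, action, reward, step):
    """Return updated action probabilities for one decision maker.

    Assumed scheme (not confirmed by the abstract): linear reward-inaction.
    probs  : current probability of each action (sums to 1)
    action : index of the action just taken
    reward : feedback normalized to [0, 1] (assumption)
    step   : small step size in (0, 1)
    """
    if not 0.0 <= reward <= 1.0:
        raise ValueError("reward must lie in [0, 1]")
    if not 0.0 < step < 1.0:
        raise ValueError("step must lie in (0, 1)")
    updated = []
    for i, p in enumerate(probs):
        if i == action:
            # Move the chosen action's probability toward 1.
            updated.append(p + step * reward * (1.0 - p))
        else:
            # Shrink the other probabilities proportionally.
            updated.append(p - step * reward * p)
    return updated


# One update for a state with two available actions.
probs = reward_inaction_update([0.5, 0.5], action=0, reward=1.0, step=0.1)
# A zero reward leaves the probabilities unchanged ("inaction").
unchanged = reward_inaction_update([0.5, 0.5], action=1, reward=0.0, step=0.1)
```

In this sketch each decision maker needs only its own action and a scalar reward signal, which matches the "minimal information" requirement in spirit; a smaller `step` corresponds to the "sufficiently small steps" condition.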
Related Publications
An N-player sequential stochastic game with identical payoffs
A sequential stochastic game among an arbitrary number of players in which all players' payoffs are identical is analyzed. The players are unaware that they are in a game and he...
Learning Automata: An Introduction
This self-contained introductory text on the behavior of learning automata focuses on how a sequential decision-maker with a finite number of choices responds in a random enviro...
The Statistical Mechanics of Strategic Interaction
I study strategic interaction among players who live on a lattice. Each player interacts directly with only a finite set of neighbors, but any two players indirectly interact th...
Stochastic Petri net representation of discrete event simulations
In the context of discrete event simulation, the marking of a stochastic Petri net (SPN) corresponds to the state of the underlying stochastic process of the simulation and the ...
Partially Observable MDPs (POMDPS): Introduction and Examples
A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process where the states of the model are not completely observable by t...
Publication Info
- Year: 1986
- Type: article
- Volume: 31
- Issue: 6
- Pages: 519-526
- Citations: 100
- Access: Closed
Identifiers
- DOI: 10.1109/tac.1986.1104342