Markov: A methodology for the solution of infinite time horizon Markov decision processes

1988 · Applied Stochastic Models and Data Analysis · 11 citations

Abstract

Algorithms are described for determining optimal policies for finite state, finite action, infinite discrete time horizon Markov decision processes. Both value-improvement and policy-improvement techniques are used in the algorithms. Computing procedures are also described. The algorithms are appropriate for processes that are either finite or infinite, deterministic or stochastic, discounted or undiscounted, in any meaningful combination of these features. Computing procedures are described in terms of initial data processing, bound improvements, process reduction, and testing and solution. Application of the methodology is illustrated with an example involving natural resource management. Management implications of certain hypothesized relationships between mallard survival and harvest rates are addressed by applying the optimality procedures to mallard population models.
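The value-improvement and policy-improvement techniques mentioned in the abstract can be illustrated with a generic sketch for a finite-state, finite-action, discounted MDP. This is not the paper's MARKOV methodology or code; the transition probabilities, rewards, and discount factor below are illustrative placeholders.

```python
import numpy as np

def solve_mdp(P, R, gamma=0.95, tol=1e-8):
    """Solve a discounted finite MDP by value iteration with a final
    greedy (policy-improvement) step.

    P[a] is the (n_states x n_states) transition matrix under action a;
    R[a] is the vector of expected immediate rewards under action a.
    """
    n_actions = len(P)
    n_states = P[0].shape[0]
    V = np.zeros(n_states)
    while True:
        # Value improvement: one Bellman backup per state and action.
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    # Policy improvement: choose the greedy action under the converged values.
    policy = Q.argmax(axis=0)
    return V, policy

# Toy 2-state, 2-action process with made-up numbers (not from the paper).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.1, 0.9]])]
R = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
V, policy = solve_mdp(P, R)
```

The sketch covers only the discounted stochastic case; the paper's methodology also handles undiscounted and deterministic variants, along with bound improvement and process reduction steps not shown here.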

Keywords

Markov decision process, Mathematical optimization, Partially observable Markov decision process, Markov process, Markov chain, Time horizon, Population, Computer science, Reduction (mathematics), Action (physics), Markov model, Decision problem, Mathematics, Discrete time and continuous time, Algorithm, Machine learning, Statistics

Publication Info

Year: 1988
Type: article
Volume: 4
Issue: 4
Pages: 253-271
Citations: 11
Access: Closed

Citation Metrics

Citations: 11 (OpenAlex)

Cite This

Byron K. Williams (1988). Markov: A methodology for the solution of infinite time horizon Markov decision processes. Applied Stochastic Models and Data Analysis, 4(4), 253-271. https://doi.org/10.1002/asm.3150040405

Identifiers

DOI
10.1002/asm.3150040405