
Markov Decision Processes

Bäuerle, N.; Rieder, U.

Abstract:

The theory of Markov Decision Processes is the theory of controlled Markov chains. Its origins can be traced back to R. Bellman and L. Shapley in the 1950s. Over the following decades the theory grew dramatically and found applications in areas such as computer science, engineering, operations research, biology and economics. In this article we give a short introduction to parts of this theory. We treat Markov Decision Processes with finite and infinite time horizon, restricting the presentation to the so-called (generalized) negative case. Solution algorithms such as Howard's policy improvement and linear programming are also explained. Various examples illustrate the application of the theory: stochastic linear-quadratic control problems, bandit problems and dividend pay-out problems.


Full text
DOI: 10.5445/IR/1000032907
Original publication
DOI: 10.1365/s13291-010-0007-2
Citations (Dimensions): 10
Affiliated KIT institution(s): Institut für Stochastik (STOCH)
Publication type: Journal article
Publication year: 2010
Language: English
Identifier: ISSN: 0012-0456
urn:nbn:de:swb:90-329075
KITopen-ID: 1000032907
Published in: Jahresbericht der deutschen Mathematiker-Vereinigung (DMV)
Publisher: Springer
Volume: 112
Issue: 4
Pages: 217-243
Keywords: Markov Decision Process, Markov Chain, Bellman Equation, Policy Improvement, Linear Programming
Indexed in: Scopus, Dimensions