Fahiem Bacchus
Department of Computer Science
University of Waterloo
Waterloo, ON, CANADA N2L 3G1
email: fbacchus@logos.uwaterloo.ca
Craig Boutilier
Department of Computer Science
University of British Columbia
Vancouver, BC, CANADA, V6T 1Z4
email: cebly@cs.ubc.ca
Adam Grove
NEC Research Institute
4 Independence Way
Princeton NJ 08540, USA
email: grove@research.nj.nec.com
Abstract
Markov Decision Processes (MDPs), currently a popular method for
modeling and solving decision-theoretic planning problems, are
limited by the Markovian assumption: rewards and dynamics depend
only on the current state, not on previous history.
Non-Markovian decision processes (NMDPs) can also be defined,
but the more tractable solution techniques developed for MDPs
cannot be applied to them directly.
In this paper, we show how an NMDP,
in which temporal logic is used to specify history dependence,
can be
automatically converted into an equivalent MDP by adding appropriate
temporal variables. The resulting MDP can be represented in a
structured fashion and solved using structured policy construction
methods. In many cases, this offers significant computational
advantages over previous proposals for solving NMDPs.
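The conversion the abstract describes can be illustrated with a minimal sketch (not the paper's actual algorithm; all names and the example formula are hypothetical). Suppose a reward depends on history: it pays 1 when proposition p holds now and proposition q has held at some point in the past (the past temporal formula "p and once-q"). Adding a single boolean temporal variable `once_q` to the state makes the reward Markovian:

```python
# Hedged illustration: converting a simple history-dependent reward into a
# Markovian one by augmenting the state with a temporal variable.
# Propositions p, q and the variable name 'once_q' are illustrative only.

def nonmarkovian_reward(history):
    """Reward on the full history: 1 if p holds in the current state
    and q held in some state so far (including the current one)."""
    current = history[-1]
    return 1 if current["p"] and any(s["q"] for s in history) else 0

def augment(state, prev_once_q=False):
    """Add a temporal variable 'once_q' recording whether q has held yet;
    it is updated from the previous augmented state alone (Markovian)."""
    return {**state, "once_q": prev_once_q or state["q"]}

def markovian_reward(aug_state):
    """Equivalent reward depending only on the current augmented state."""
    return 1 if aug_state["p"] and aug_state["once_q"] else 0

# Check equivalence on a sample trajectory.
trajectory = [{"p": False, "q": False},
              {"p": False, "q": True},
              {"p": True,  "q": False}]

once_q = False
for t in range(len(trajectory)):
    aug = augment(trajectory[t], once_q)
    once_q = aug["once_q"]
    assert markovian_reward(aug) == nonmarkovian_reward(trajectory[:t + 1])
```

The key point, reflected in `augment`, is that the new variable's update rule reads only the previous augmented state, so the enlarged process satisfies the Markov property while reproducing the original history-dependent rewards.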
To appear, AAAI-97, Providence, RI