Craig Boutilier and David Poole
Department of Computer Science
University of British Columbia
Vancouver, BC, CANADA, V6T 1Z4
email: cebly@cs.ubc.ca, poole@cs.ubc.ca
Abstract
Partially-observable Markov decision processes provide a
very general model for decision-theoretic planning problems,
allowing the trade-offs between various courses of action to
be determined under conditions of uncertainty, and incorporating
partial observations made by an agent.
Dynamic programming algorithms based on the information or
belief state of an agent can be used to construct optimal
policies without explicit consideration of past history, but at
high computational cost. In this paper, we
discuss how structured representations of the system dynamics
can be incorporated in classic POMDP solution algorithms.
We use Bayesian networks with structured conditional probability
matrices to represent POMDPs, and use this representation to
structure the belief space for POMDP algorithms. This allows
irrelevant distinctions to be ignored. Apart from speeding up
optimal policy construction, we suggest that such representations
can be exploited to a great extent in the development of useful
approximation methods.
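
As a minimal illustration of the belief state mentioned above, the following sketch performs the standard Bayesian belief update for a flat (unstructured) POMDP. It is not the structured algorithm of this paper; the function name belief_update, the table layouts T[a][s][s'] and O[a][s'][o], and the two-state numbers are all assumptions chosen for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Bayes update of a belief state after taking an action and observing.

    Assumed (hypothetical) table layouts:
      T[a][s][s2] = P(s2 | s, a)   transition probabilities
      O[a][s2][o] = P(o | s2, a)   observation probabilities
    Computes b'(s2) proportional to O[a][s2][o] * sum_s T[a][s][s2] * b(s).
    """
    n = len(belief)
    unnormalized = []
    for s_next in range(n):
        # Prediction step: probability of reaching s_next under the action.
        predicted = sum(T[action][s][s_next] * belief[s] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        unnormalized.append(O[action][s_next][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Hypothetical two-state, one-action, two-observation model.
T = [[[0.9, 0.1],
      [0.2, 0.8]]]
O = [[[0.8, 0.2],
      [0.3, 0.7]]]

new_belief = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
print(new_belief)
```

The updated belief depends only on the previous belief, the action, and the observation, which is why policies can be built without explicit reference to past history. The flat tables grow with the number of states, which is the cost that structured representations aim to reduce.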
To appear, AAAI-96