Pascal Poupart
Department of Computer Science
University of Toronto
Toronto, ON M5S 3H5
email: ppoupart@cs.toronto.edu
Luis E. Ortiz
Computer Science Department, Box 1910
Brown University
Providence, RI 02912-1210, U.S.A.
email: leo@cs.brown.edu
Craig Boutilier
Department of Computer Science
University of Toronto
Toronto, ON M5S 3H5
email: cebly@cs.toronto.edu
Abstract
We consider the problem of approximate belief-state monitoring
using particle filtering for the purposes of implementing a policy for
a partially observable Markov decision process (POMDP). While
particle filtering has become a widely used tool in AI for monitoring
dynamical systems, rather scant attention has been paid to its
use in the context of decision making. Assuming the existence of a
value function, we derive error bounds on decision quality associated
with filtering using importance sampling. We also describe an adaptive
procedure that can be used to dynamically determine the number of
samples required to meet specific error bounds. Empirical evidence
is offered supporting this technique as a profitable means of directing
sampling effort where it is needed to distinguish policies.
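
As a rough illustration of the setting described above, the following minimal sketch monitors a belief state with a particle filter and then selects an action from a given value function. It is not the paper's algorithm: the two-state model, the `TRANSITION` and `OBSERVATION` tables, the `ALPHA` vectors, and the function names `filter_step` and `best_action` are all hypothetical, and the sketch uses a fixed number of particles rather than the adaptive procedure or error bounds mentioned in the abstract.

```python
import random

# Hypothetical two-state model (an assumption, not taken from the paper).
TRANSITION = [[0.9, 0.1], [0.2, 0.8]]    # TRANSITION[s][s2] = P(s2 | s)
OBSERVATION = [[0.8, 0.2], [0.3, 0.7]]   # OBSERVATION[s2][o] = P(o | s2)
# Hypothetical value function: one alpha vector per action, indexed by state.
ALPHA = {"stay": [1.0, 0.0], "switch": [0.0, 1.0]}


def filter_step(particles, obs, rng):
    """One importance-sampling update of a particle-based belief state."""
    # Propagate each particle through the transition model.
    proposed = [rng.choices([0, 1], weights=TRANSITION[s])[0] for s in particles]
    # Weight each propagated particle by the likelihood of the observation.
    weights = [OBSERVATION[s][obs] for s in proposed]
    # Resample in proportion to the weights to get an unweighted particle set.
    return rng.choices(proposed, weights=weights, k=len(particles))


def best_action(particles):
    """Pick the action whose alpha vector has the highest estimated value."""
    n = len(particles)
    values = {a: sum(vec[s] for s in particles) / n for a, vec in ALPHA.items()}
    return max(values, key=values.get), values


rng = random.Random(0)                    # fixed seed for repeatability
particles = [rng.choice([0, 1]) for _ in range(500)]  # uniform prior belief
for obs in [1, 1, 1]:                     # a made-up observation sequence
    particles = filter_step(particles, obs, rng)
action, values = best_action(particles)
```

With few particles, the estimated values of two actions can be close enough that sampling noise changes which one is chosen; that is the kind of situation in which more sampling effort helps distinguish policies.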
To appear, UAI-01