Nicolas Meuleau, Milos Hauskrecht, Kee-Eung Kim, Leonid Peshkin,
Leslie Pack Kaelbling, Thomas Dean
Computer Science Department, Box 1910
Brown University
Providence, RI 02912-1210, U.S.A.
email: nm,milos,lpk,tld@cs.brown.edu
Craig Boutilier
Department of Computer Science
University of British Columbia
Vancouver, BC, CANADA, V6T 1Z4
email: cebly@cs.ubc.ca
Abstract
We present a technique for computing approximately optimal solutions
to stochastic resource allocation problems modeled as Markov decision
processes (MDPs). We exploit two key properties to avoid
explicitly enumerating the very large state and action spaces
associated with these problems. First, the problems are composed of
multiple tasks whose utilities are independent. Second, the actions
taken with respect to (or resources allocated to) a task do not
influence the status of any other task. We can therefore view each
task as an MDP. However, these MDPs are weakly coupled by
resource constraints: actions selected for one MDP restrict the
actions available to others. We describe heuristic techniques for
dealing with several classes of constraints that use the solutions for
individual MDPs to construct an approximate global solution. We
demonstrate this technique on problems involving hundreds of state
variables and thousands of tasks, approximating the solution to
problems that are far beyond the reach of standard methods.
Appeared, AAAI 1998
Return to List of Papers