Craig Boutilier
Department of Computer Science
University of British Columbia
Vancouver, BC, CANADA, V6T 1Z4
email: cebly@cs.ubc.ca
Abstract
Fully cooperative multiagent systems---those in which
agents share a joint utility model---are of special interest in AI.
A key problem is that of ensuring that the actions of individual
agents are coordinated, especially in settings where the agents
are autonomous decision makers. We investigate approaches to
learning coordinated strategies in stochastic domains where an
agent's actions are not directly observable by others. Much recent
work in game theory has adopted a Bayesian learning perspective on
the more general problem of equilibrium selection, but it tends to
assume that actions can be observed. We discuss the special
problems that arise when actions are not observable, including
effects on rates of convergence, and the effect of action
failure probabilities and asymmetries. We also use likelihood
estimates as a means
of generalizing fictitious play learning models in our setting.
Finally, we propose the use of maximum likelihood as a means of
removing strategies from consideration, with the aim of convergence
to a conventional equilibrium, at which point learning
and deliberation can cease.
To appear, UAI-96, Portland, OR