Caroline Claus and Craig Boutilier
Department of Computer Science
University of British Columbia
Vancouver, BC, CANADA, V6T 1Z4
email: {cclaus,cebly}@cs.ubc.ca
Abstract
Reinforcement learning can provide a robust and natural means
for agents to learn how to coordinate their action choices in
multiagent systems. We examine some of the factors that can
influence the dynamics of the learning process in such a setting.
We first distinguish reinforcement learners that are unaware of (or
ignore) the presence of other agents from those that explicitly
attempt to learn the value of joint actions and the strategies
of their counterparts. We study (a simple form of)
Q-learning in cooperative multiagent systems under these two
perspectives, focusing on the influence of game structure
and exploration strategies on convergence to (optimal and suboptimal)
Nash equilibria. We then propose alternative optimistic
exploration strategies that increase the likelihood of convergence to
an optimal equilibrium.
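
To make the distinction between the two kinds of learner concrete, the following is a minimal sketch in Python, not the implementation used in the paper. It assumes a stateless (single-state) repeated game, a hypothetical 2x2 common-payoff matrix, Boltzmann exploration with a decaying temperature, and illustrative parameter values; the names PAYOFF, IndependentLearner, JointActionLearner, boltzmann, and run are introduced here for illustration only. The independent learner keeps one value per own action, while the joint action learner keeps one value per joint action and weights them by the empirical frequency of the other agent's choices.

```python
import math
import random

# Hypothetical 2x2 cooperative game (an assumption for illustration):
# both agents receive the same reward PAYOFF[a][b].
PAYOFF = [[10.0, 0.0],
          [0.0, 10.0]]


def boltzmann(values, temperature, rng):
    """Sample an action index with probability proportional to exp(value / T)."""
    top = max(values)  # subtract the max for numerical stability
    weights = [math.exp((v - top) / temperature) for v in values]
    return rng.choices(range(len(values)), weights=weights)[0]


class IndependentLearner:
    """Ignores the other agent: one Q-value per own action."""

    def __init__(self, n_actions, alpha=0.1):
        self.q = [0.0] * n_actions
        self.alpha = alpha

    def values(self):
        return list(self.q)

    def update(self, own, other, reward):
        # The other agent's action is deliberately unused.
        self.q[own] += self.alpha * (reward - self.q[own])


class JointActionLearner:
    """Learns Q-values of joint actions and a frequency model of the other agent."""

    def __init__(self, n_actions, alpha=0.1):
        self.n = n_actions
        self.q = [[0.0] * n_actions for _ in range(n_actions)]
        self.counts = [1] * n_actions  # uniform prior over the other's actions
        self.alpha = alpha

    def values(self):
        # Expected value of each own action under the empirical model.
        total = sum(self.counts)
        return [
            sum(self.counts[b] / total * self.q[a][b] for b in range(self.n))
            for a in range(self.n)
        ]

    def update(self, own, other, reward):
        self.q[own][other] += self.alpha * (reward - self.q[own][other])
        self.counts[other] += 1


def run(make_agent, steps=2000, seed=0):
    """Let two learners of the same kind play the repeated game."""
    rng = random.Random(seed)
    agents = [make_agent(2), make_agent(2)]
    temperature = 16.0  # illustrative starting temperature
    for _ in range(steps):
        a = boltzmann(agents[0].values(), temperature, rng)
        b = boltzmann(agents[1].values(), temperature, rng)
        reward = PAYOFF[a][b]
        agents[0].update(a, b, reward)
        agents[1].update(b, a, reward)
        temperature = max(0.05, temperature * 0.99)  # illustrative decay schedule
    return agents
```

Calling run(IndependentLearner) or run(JointActionLearner) returns the two trained agents; comparing the index of the largest entry of each agent's values() shows whether the pair settled on a coordinated joint action under these assumed settings.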
Appeared in AAAI, 1998