3D People Tracking

Research Overview

The inference of human shape and motion in 3D has become a topic of great interest in the vision community. The problem is difficult because people move in complex ways, with many degrees of freedom. Their appearance is similarly hard to model because of variations in lighting, deformations of clothing, and occlusions. To make tracking tractable, most existing methods rely on one or more simplifying assumptions, such as a known static background, multiple views of the person, or the use of color to find skin regions. Hedvig Sidenbladh, Michael Black and I proposed a Bayesian approach to tracking people in 3D from 2D video. The approach combines a motion-based likelihood function built on a robust form of intensity conservation, a particle filter to deal with nonlinear dynamics and observations, a learned parameterized model of human walking motion, and manual initialization of the model. With these components, we were able to infer the time-varying 3D structure of a single person against unknown cluttered backgrounds in monocular, greyscale video.
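As a rough illustration of the particle filtering idea mentioned above, the following is a minimal sketch of a bootstrap particle filter on a toy scalar state. It is not the tracker itself: the random-walk dynamics, the Gaussian likelihood, and the names `particle_filter_step` and `track` are assumptions made for this example, standing in for the learned walking model and the motion-based likelihood.

```python
import math
import random


def particle_filter_step(particles, observation, rng,
                         process_std=0.5, obs_std=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: push each particle through the (assumed) random-walk dynamics.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: unnormalized (assumed) Gaussian likelihood of the observation.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    # Guard against numerical underflow of every weight.
    if sum(weights) == 0.0:
        weights = [1.0] * len(predicted)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def track(observations, n_particles=500, seed=0):
    """Return the posterior-mean estimate of the state after each observation."""
    rng = random.Random(seed)
    # Broad initial prior, playing the role of a rough manual initialization.
    particles = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles = particle_filter_step(particles, z, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Calling `track([3.0] * 30)` returns one estimate per observation, and the estimates should settle near the observed value. In a real tracker the state would be a high-dimensional pose vector, which is where plain resampling of this kind becomes expensive.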

More recently, in an attempt to provide more efficient stochastic sampling so that we could handle weaker models of human dynamics, Kiam Choo and I began to consider the use of particle filters with MCMC updates. When we applied this to the inference of 3D joint configuration from 2D motion capture point data, we found that a particle filter with hybrid Monte Carlo updates produced an estimator more than 2,000 times faster than a conventional particle filter, with similar estimator variance. Combined with richer likelihood functions that use both motion and edge information, I hope this will lead to more effective tracking in high-dimensional spaces with complex dynamics and observation equations.
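To make the hybrid Monte Carlo update concrete, here is a minimal sketch of a single update (leapfrog integration followed by a Metropolis accept/reject test) applied to each particle in a set. The toy Gaussian target, the step size, the number of leapfrog steps, and the names `hmc_update`, `toy_log_prob`, `toy_grad_log_prob` and `refine_particles` are all assumptions for illustration. The posterior over 3D joint configurations would replace the toy target, and how these updates are interleaved with resampling is not shown here.

```python
import math
import random

TARGET_MEAN = [1.0, -2.0]  # hypothetical toy posterior mean (unit variance)


def toy_log_prob(q):
    """Log density (up to a constant) of the assumed toy Gaussian target."""
    return -0.5 * sum((qi - mi) ** 2 for qi, mi in zip(q, TARGET_MEAN))


def toy_grad_log_prob(q):
    """Gradient of toy_log_prob with respect to q."""
    return [-(qi - mi) for qi, mi in zip(q, TARGET_MEAN)]


def hmc_update(x, log_prob, grad_log_prob, rng, step_size=0.1, n_leapfrog=20):
    """One hybrid (Hamiltonian) Monte Carlo update of a single particle x."""
    # Draw an auxiliary momentum from a standard normal.
    p0 = [rng.gauss(0.0, 1.0) for _ in x]
    q, p = list(x), list(p0)
    # Leapfrog integration: initial half step for the momentum.
    g = grad_log_prob(q)
    p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, g)]
    for i in range(n_leapfrog):
        # Full step for the position.
        q = [qi + step_size * pi for qi, pi in zip(q, p)]
        g = grad_log_prob(q)
        # Full momentum step, except a half step at the very end.
        scale = step_size if i < n_leapfrog - 1 else 0.5 * step_size
        p = [pi + scale * gi for pi, gi in zip(p, g)]
    # Metropolis test on the total energy H = -log_prob + kinetic energy.
    h_old = -log_prob(x) + 0.5 * sum(pi * pi for pi in p0)
    h_new = -log_prob(q) + 0.5 * sum(pi * pi for pi in p)
    if math.log(1.0 - rng.random()) < h_old - h_new:
        return q
    return list(x)


def refine_particles(particles, n_updates=10, seed=0):
    """Apply several HMC updates to every particle in the set."""
    rng = random.Random(seed)
    refined = []
    for x in particles:
        for _ in range(n_updates):
            x = hmc_update(x, toy_log_prob, toy_grad_log_prob, rng)
        refined.append(x)
    return refined
```

For example, `refine_particles([[5.0, 5.0]] * 200)` moves particles that start far from the target toward its high-probability region. Because each update follows the gradient of the log posterior, far fewer particles may be needed than with blind diffusion, which is the intuition behind the speedup reported above.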

Related Publications

Return to David Fleet's home page.