Physics-Based Human Motion Modelling for People Tracking

Tutorial at ICCV 2009, Kyoto, Japan, September 2009


General Information

Instructors: Marcus A. Brubaker, University of Toronto
Leonid Sigal, University of Toronto / Disney Research
David J. Fleet, University of Toronto

Time: September 28th, Morning
Duration: Half-day (4 hours)
Location: Faculty of Engineering Bldg.#3, 2F, Room W2


Course Description

Physics-based models have proved to be effective in modeling how people move in, and interact with, their environment. Such dynamical models are prevalent in computer graphics and robotics, where they allow physically plausible animation and/or simulation of humanoid motion. Similar models have also proved useful in biomechanics, allowing clinically meaningful analysis of human motion in terms of muscle and ground reaction forces.

In computer vision, the use of such models (e.g., as priors for video-based human pose tracking) has been limited. Rather, most prior models in vision, to date, take the form of kinematic priors that can effectively be learned from motion capture data, but are inherently unable to explicitly account for the physical plausibility of recovered motions (e.g., consistency with gravity, ground interactions, inertia, etc.). As a result, many current methods suffer from visually unpleasant artifacts (e.g., out-of-plane rotations, foot skate, etc.), especially when one is limited to monocular observations.

Recently, physics-based prior models have been shown to successfully address some of these issues (see references below). We posit that physics-based prior models are among the next important steps in developing more robust methods to track human motion over time. That said, the models involved are conceptually challenging and carry a high overhead for those unfamiliar with Newtonian mechanics; furthermore, good references that address practical issues of importance (particularly as they apply to vision problems) are scarce.

This tutorial will cover the motivation for the use of physics-based models for tracking articulated objects (e.g., people), as well as the formalism required for someone unfamiliar with these models to get started easily. We will provide the slides, notes, and MATLAB code that will allow a capable novice to proceed along this innovative research path.



Course Outline

  • Introduction and Motivation
    • Human tracking and challenges
    • Kinematics-based human pose tracking (Bayesian filtering, latent variable models, etc.)
    • Physics-based human pose tracking and importance of context
    • Robotics and Biomechanics
    • Physics-based animation (control, motion re-targeting, dynamic interactions)
    • Physics-based rigid tracking and scene dynamics
  • Classical and Constrained Mechanics
    • Pose of a rigid body (position, quaternion orientation, etc.)
    • Rigid-body mechanics (velocity, mass, momentum, etc.)
    • Newton's laws of motion
    • Newton-Euler equations of motion for a rigid body
    • Constrained dynamics (explicit constraints)
    • Generalized coordinates and the principle of virtual work
    • Equations of motion for complex, articulated systems
  • Biomechanical Models of Human Locomotion
    • Biomechanical properties (body segment parameters, joint kinematics, dynamics, etc.)
    • Passive dynamics (monopode, anthropometric walker, etc.)
    • Impulsive contact models
    • Learning controllers without data
    • Kneed walker (equations of motion, control, etc.)
    • Case study: Bayesian people tracking with kneed walker
  • 3D Control and Optimization
    • P/PD control
    • State-space controllers (for complex cyclic and non-cyclic motions)
    • Balance feedback (ZMP, SIMBICON, etc.)
    • Estimating parameters of controllers
    • Combining controllers
    • Trajectory control with motion capture data (PD-based, constraint-based)
    • Case study: Bayesian people tracking with motion capture constraint-based controller
    • Space-time optimization
  • Interactions and Contact Dynamics
    • Inferring contact (formulation and optimization)
    • Case study: Inferring contact parameters from motion capture and video
  • Discussion: What's next?
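Several of the topics above (Newton's laws of motion, P/PD control, numerical simulation) can be illustrated with a toy example. The following sketch, written in Python rather than the MATLAB code distributed with the tutorial, drives a single pendulum to a target angle using a PD controller with gravity compensation, integrated with symplectic Euler; all function names and parameter values are illustrative, not taken from the course materials.

```python
import math

def simulate_pd_pendulum(theta0, target, kp=40.0, kd=8.0,
                         dt=0.001, steps=5000,
                         m=1.0, l=1.0, g=9.81):
    """Drive a pendulum to a target angle with PD control.

    Dynamics (point mass m on a massless rod of length l):
        m * l^2 * theta_ddot = -m * g * l * sin(theta) + tau
    Control (PD plus gravity compensation):
        tau = m * g * l * sin(theta) + kp * (target - theta) - kd * theta_dot
    """
    theta, omega = theta0, 0.0
    inertia = m * l * l
    for _ in range(steps):
        # Control torque: cancel gravity, then apply PD feedback on the error.
        tau = m * g * l * math.sin(theta) + kp * (target - theta) - kd * omega
        # Angular acceleration from the Newton-Euler equation of motion.
        alpha = (-m * g * l * math.sin(theta) + tau) / inertia
        omega += dt * alpha   # symplectic Euler: update velocity first,
        theta += dt * omega   # then position with the new velocity
    return theta

final_angle = simulate_pd_pendulum(theta0=0.0, target=math.pi / 4)
```

Note that without the gravity-compensation term, a pure PD controller would settle with a steady-state error, since a constant torque is needed to hold the pendulum against gravity; this is one of the practical control issues the tutorial addresses in far greater depth.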

Course Materials


Motivating Papers


Instructor Biographies


Marcus A. Brubaker is a Ph.D. student in the Department of Computer Science at the University of Toronto, where he received his Hon. B.Sc. (2004) and M.Sc. (2006). Supervised by David Fleet, he works on human motion estimation using physically realistic models of motion. He has also worked on problems in protein structure estimation, Bayesian inference, and Markov chain Monte Carlo. Research interests beyond the areas noted include physics-based animation, humanoid robotics, and machine learning.


Leonid Sigal is a postdoctoral research associate at Disney Research Pittsburgh, in conjunction with Carnegie Mellon University; before that he was a postdoctoral fellow in the Department of Computer Science at the University of Toronto. He completed his Ph.D. under the supervision of Michael J. Black at Brown University; he received his B.Sc. degrees in Computer Science and Mathematics from Boston University (1999), his M.A. from Boston University (1999), and his M.S. from Brown University (2003). From 1999 to 2001, he worked as a senior vision engineer at Cognex Corporation, where he developed industrial vision applications for pattern analysis and verification. In 2002, he spent a semester as a research intern at Siemens Corporate Research (SCR) working with Dorin Comaniciu on autonomous obstacle detection and avoidance for vehicle navigation. During the summers of 2005 and 2006, he worked as a research intern at the Intel Applications Research Lab (ARL) on human pose estimation and tracking. His work received the Best Paper Award at the Articulated Motion and Deformable Objects Conference in 2006 (with Michael J. Black).


David J. Fleet is a professor of computer science at the University of Toronto. He received his PhD in Computer Science from the University of Toronto in 1991. He held a faculty position at Queen's University in Kingston (1991-1998), and managed the Digital Video Analysis Group and the Perceptual Document Analysis Group at Xerox PARC (1999-2003). He then returned to the University of Toronto in 2004. His research interests include computer vision, machine learning, image processing, visual perception, and visual neuroscience. He has published research articles and one book on a wide variety of topics, including the estimation of optical flow and stereoscopic disparity, probabilistic methods in motion analysis, visual tracking, 3D human pose tracking, and appearance modeling in image sequences. In 1996 he was awarded an Alfred P. Sloan Research Fellowship. His paper awards include Honorable Mention for the Marr Prize at ICCV 1999 (with M. Black), runner-up for best paper at CVPR 2001 (with A. Jepson and T. El-Maraghi), and best paper at ACM UIST '03 (with E. Saund, J. Mahoney and D. Larner). He was Associate Editor of IEEE Trans. PAMI (2000-2004), Program Co-Chair for CVPR 2003, and Associate Editor-in-Chief for IEEE Trans. PAMI (2005-2008). He is a Fellow of the Canadian Institute for Advanced Research.