Estimating 3D Articulated Human Motion in Monocular Video

Tracking using Discriminative Models, Generative Models based on Manifold Learning, Variational Mixture Smoothing, Kinematic Jumps and Covariance Scaled Sampling

The Discriminative Human Tracking Page

 

Learning to Reconstruct 3D Human Motion from Bayesian Mixtures of Experts. A Probabilistic Discriminative Approach (C. Sminchisescu, A. Kanaujia, Z. Li, D. Metaxas), Technical Report CSRG-502, University of Toronto, October 2004.

Density Propagation for Continuous Temporal Chains. Generative and Discriminative Models (C. Sminchisescu, A. Jepson), Technical Report CSRG-501, University of Toronto, October 2004.

Generative Modeling for Continuous Non-Linearly Embedded Visual Inference (C. Sminchisescu, A. Jepson), poster, talk, International Conference on Machine Learning, ICML 2004.

Generalized Darting Monte-Carlo (C. Sminchisescu, M. Welling, G. Hinton), submitted to Machine Learning Journal, 2004.

Variational Mixture Smoothing for Non-Linear Dynamical Systems (C. Sminchisescu, A. Jepson), poster, talk, IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2004.

Kinematic Jump Processes for Monocular 3D Human Tracking (C. Sminchisescu, B. Triggs), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2003. (See the VIDEOS and talks I've given at the Royal Swedish Academy of Sciences, Mittag-Leffler Institute, and the one at the CVPR 2003 conference.)
Estimating Articulated Human Motion with Covariance Scaled Sampling (C. Sminchisescu, B. Triggs), International Journal of Robotics Research, vol. 22, No. 6, pages 371-393, 2003.
Covariance Scaled Sampling for Monocular 3D Body Tracking (C. Sminchisescu, B. Triggs), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2001, Hawaii.
A Robust Multiple Hypothesis Approach to Monocular Human Motion Tracking (C. Sminchisescu, B. Triggs), INRIA Research Report No. 4208, June 2001.

 

Human Pose Estimation and Optimization Papers

Mapping Minima and Transitions in Visual Models (C. Sminchisescu, B. Triggs), International Journal of Computer Vision, vol. 61, No. 1, 2005.

Building Roadmaps of Local Minima of Visual Models (C. Sminchisescu, B. Triggs), poster, European Conference on Computer Vision, ECCV 2002, Copenhagen.
Hyperdynamics Importance Sampling (C. Sminchisescu, B. Triggs), European Conference on Computer Vision, ECCV 2002, Copenhagen.
Consistency and Coupling in Human Model Likelihoods (C. Sminchisescu), IEEE International Conference on Automatic Face and Gesture Recognition, FGR 2002, Washington.
Human Pose Estimation from Silhouettes. A Consistent Approach Using Distance Level Sets (C. Sminchisescu, A. Telea), International Conference on Computer Graphics, Visualization and Computer Vision, WSCG 2002, Czech Republic.

My PhD thesis

ESTIMATION ALGORITHMS FOR AMBIGUOUS VISUAL MODELS - Three Dimensional Human Modeling and Motion Reconstruction in Monocular Video Sequences (C. Sminchisescu), Doctoral Thesis, Institut National Polytechnique de Grenoble (INRIA), July 2002.
 


Covariance Scaled Sampling Algorithm

Covariance Scaled Sampling is a novel high-dimensional search strategy that we introduce to deal with the depth ambiguities and modeling constraints associated with monocular human motion estimation. The method involves both local/continuous and global/discrete search. It is based on robust error distributions, is fully constraint-consistent (joint angle limits, body-part non-penetration constraints), and employs parameter stabilization and anthropometric model priors. The likelihood term involves robustly extracted edge and intensity descriptors. The sampling is prior/covariance-based, and therefore sensitive to the local shape of the cost surface. This is in contrast with CONDENSATION-based methods, which use a high dynamical noise process as an empirical search-focusing parameter.
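To make the idea of covariance-based sampling concrete, here is a minimal Python sketch. It is not the implementation from the papers above. It assumes a generic scalar cost function, approximates the local covariance by the inverse of a finite-difference Hessian at a cost minimum, inflates that covariance by a hypothetical `scale` factor, draws Gaussian samples from it, clips them to box bounds standing in for joint angle limits, and refines each sample with a simple projected gradient step (a stand-in for the constrained continuous optimization described above). All function names, the toy cost, and the parameter values are illustrative assumptions.

```python
import numpy as np


def numerical_hessian(cost, x, eps=1e-4):
    """Central finite-difference Hessian of a scalar cost at x."""
    n = x.size
    H = np.zeros((n, n))
    basis = np.eye(n) * eps
    for i in range(n):
        for j in range(i, n):
            ei, ej = basis[i], basis[j]
            H[i, j] = H[j, i] = (
                cost(x + ei + ej) - cost(x + ei - ej)
                - cost(x - ei + ej) + cost(x - ei - ej)
            ) / (4.0 * eps ** 2)
    return H


def covariance_scaled_samples(cost, mode, lower, upper,
                              scale=4.0, n_samples=50, rng=None, floor=1e-6):
    """Sample around a cost minimum using an inflated local covariance.

    `scale` multiplies the covariance (an assumed convention); `lower` and
    `upper` are box bounds standing in for joint angle limits.
    """
    rng = np.random.default_rng() if rng is None else rng
    H = numerical_hessian(cost, mode)
    # Invert the Hessian through its eigen-decomposition, flooring small or
    # negative eigenvalues so the covariance stays positive definite.
    w, V = np.linalg.eigh(H)
    w = np.maximum(w, floor)
    cov = (V / w) @ V.T
    cov = 0.5 * (cov + cov.T)  # remove round-off asymmetry
    samples = rng.multivariate_normal(mode, scale * cov, size=n_samples)
    return np.clip(samples, lower, upper)


def refine(cost, x, lower, upper, step=0.005, iters=100, eps=1e-6):
    """Projected gradient descent that only accepts cost-decreasing steps."""
    x = np.clip(x, lower, upper)
    for _ in range(iters):
        grad = np.array([(cost(x + eps * e) - cost(x - eps * e)) / (2 * eps)
                         for e in np.eye(x.size)])
        candidate = np.clip(x - step * grad, lower, upper)
        if cost(candidate) >= cost(x):
            break
        x = candidate
    return x


# Toy usage: a robustified, ill-conditioned quadratic cost in two dimensions.
# The second coordinate is weakly constrained, like a poorly observed depth.
A = np.diag([100.0, 1.0])


def toy_cost(x):
    return float(np.log1p(0.5 * x @ A @ x))


lower = np.array([-3.0, -3.0])
upper = np.array([3.0, 3.0])
samples = covariance_scaled_samples(toy_cost, np.zeros(2), lower, upper,
                                    n_samples=200,
                                    rng=np.random.default_rng(0))
refined = np.array([refine(toy_cost, s, lower, upper) for s in samples])
```

In this toy setting the samples spread much further along the weakly constrained coordinate than along the well-constrained one, which is the behaviour that distinguishes covariance-based sampling from adding isotropic dynamical noise. The Gaussian sampler and the clipping to bounds are simplifications chosen for brevity.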

Abstract: We present a method for recovering 3D human body motion from monocular video sequences, based on a robust image matching metric, incorporation of joint limits and non-self-intersection constraints, and a new sample-and-refine search strategy guided by rescaled cost-function covariances. Monocular 3D body tracking is challenging: besides the difficulty of matching an imperfect, highly flexible, self-occluding model to cluttered image features, realistic body models have at least 30 joint parameters subject to highly nonlinear physical constraints, and about a third of these degrees of freedom are essentially unobservable in any given monocular image. For image matching we use a carefully designed robust cost metric combining robust optical flow, edge energy, and motion boundaries. The nonlinearities and matching ambiguities make the parameter-space cost surface multi-modal, ill-conditioned and highly nonlinear, so searching it is difficult. Addressing the limitations of CONDENSATION-like samplers, we introduce a novel hybrid search algorithm that combines inflated-covariance-scaled sampling and robust continuous optimization subject to physical constraints and model priors. Our experiments on challenging monocular sequences show that robust cost modeling, joint and self-intersection constraints, and informed sampling are all essential for reliable monocular 3D motion estimation.

Keywords: 3D human body tracking, particle filtering, high-dimensional search, constrained optimization, robust matching.