Copyright Notice: The papers are made available for personal use only, subject to author's and publisher's copyright.
ESTIMATION ALGORITHMS FOR AMBIGUOUS VISUAL MODELS - Three Dimensional Human Modeling and Motion Reconstruction in Monocular Video Sequences (C.Sminchisescu), Doctoral Thesis, Institut National Polytechnique de Grenoble (INRIA), July 2002 (Jury: A. Blake, M. Black, R. Hartley, R. Mohr, L. Quan, W.Triggs; passed with highest rank, 'Très honorable avec félicitations du jury'). My doctoral thesis includes an overview of the difficulties of monocular human tracking, the state of the art, 3D human modeling, the image features and their likelihood models, and three novel multiple hypothesis search algorithms for attacking human tracking and other non-convex optimization problems in vision.
Books and Chapters
Optimization Algorithms for Non-linear Time Series Models (C. Sminchisescu), Matrix Publishing Ltd., ISBN 978-973-755-211-2, June 2007.
3D Human Motion Reconstruction in Monocular Video. Techniques and Challenges (C. Sminchisescu), chapter in Human Motion Capture: Modeling, Analysis, Animation, Metaxas, Rosenhahn and Klette Eds., Springer, 2007.
Learning Decompositional Models from Examples (A. Levinshtein, C. Sminchisescu, S. Dickinson), chapter in Energy Minimization Methods in Computer Vision and Pattern Recognition, Rangarajan et al. Eds., Springer-Verlag, 2005.
BM3E: Discriminative Density Propagation for Visual Tracking (C. Sminchisescu, A. Kanaujia, D. Metaxas), IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
Conditional Models for Contextual Human Motion Recognition (C. Sminchisescu, A. Kanaujia, D. Metaxas), Computer Vision and Image Understanding, Vol. 104, Issues 2-3, Nov-Dec, 2006.
Fast Mixing Hyperdynamic Sampling (C. Sminchisescu, B. Triggs), Image and Vision Computing, Special Issue on Selected Papers from ECCV02 Conference, 2006.
Mapping Minima and Transitions in Visual Models (C.Sminchisescu, B.Triggs), International Journal of Computer Vision, vol. 61, No. 1, 2005.
Estimating Articulated Human Motion with Covariance Scaled Sampling (C.Sminchisescu, B.Triggs), International Journal of Robotics Research, vol. 22, No. 6, pages 371-393, 2003.
Estimating Deformable Shape and Motion with Geometric Constraints (C.Sminchisescu, D.Metaxas, S.Dickinson), IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005.
People Tracking with the Laplacian Eigenmaps Latent Variable Model (Z. Lu, M. C. Perpinan, C. Sminchisescu), Neural Information Processing Systems, NIPS, 2007, Vancouver.
Support Kernel Machines for Object Recognition (A. Kumar, C. Sminchisescu), IEEE International Conference on Computer Vision, ICCV, 2007, Rio de Janeiro.
Spectral Latent Variable Models for Perceptual Inference (A. Kanaujia, C. Sminchisescu, D. Metaxas), IEEE International Conference on Computer Vision, ICCV, 2007, Rio de Janeiro.
Semi-supervised Hierarchical Models for 3D Human Pose Reconstruction (A. Kanaujia, C. Sminchisescu, D. Metaxas), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2007, Minneapolis.
Generalized Darting Monte-Carlo (C. Sminchisescu, M. Welling), 9th International Conference on Artificial Intelligence and Statistics, AISTATS, 2007, Puerto Rico (see also longer Technical Report CSRG-543, November 2006).
Learning Joint Top-Down and Bottom-Up Processes for 3D Visual Inference (C. Sminchisescu, A. Kanaujia, D. Metaxas), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2006, New York.
Bidirectional Model Learning for Visual Inference (C. Sminchisescu, A. Kanaujia, D. Metaxas), Snowbird Machine Learning Workshop, April 2006.
Training Deformable Models for Localization (D. Ramanan, C. Sminchisescu), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2006, New York.
Canonical Skeletons for Shape Matching (M. van Eede, D. Macrini, A. Telea, C. Sminchisescu, and S. Dickinson), IEEE International Conference on Pattern Recognition, ICPR, 2006, Hong Kong.
Conditional Visual Tracking in Kernel Space (C. Sminchisescu, A. Kanaujia, Z. Li, D. Metaxas), Neural Information Processing Systems, NIPS, 2005, Vancouver.
Conditional Models for Human Motion Recognition (C. Sminchisescu, A. Kanaujia, Z. Li, D. Metaxas), IEEE International Conference on Computer Vision, ICCV, 2005, Beijing.
Learning Decompositional Models from Examples (A. Levinshtein, C. Sminchisescu, S. Dickinson), Energy Minimization Methods in Computer Vision and Pattern Recognition, EMMCVPR, 2005, Florida.
Discriminative Density Propagation for 3D Human Motion Estimation (C. Sminchisescu, A. Kanaujia, Z. Li, D. Metaxas), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2005, San Diego. See the Discriminative Human Tracking Page for details. A probabilistic treatment of discriminative temporal chain models allowing fast, robust inference and automatic model initialization. The framework has three key ingredients: (1) a graphical model structure that is no longer generative: it conditions on the observation instead of modeling it, which avoids the independence assumptions often required with generative models; (2) modeling of local (per node, per state) conditional distributions, for which we use conditional Bayesian mixtures of experts in order to compactly represent complex multimodal distributions, arising e.g. in cases where similar image observations can be produced by very different state configurations; (3) temporal state inference using a parametric belief propagation algorithm.
Learning to Reconstruct 3D Human Motion from Bayesian Mixtures of Experts. A Probabilistic Discriminative Approach (C. Sminchisescu, A. Kanaujia, Z. Li, D. Metaxas), Technical Report CSRG-502, University of Toronto, October 2004. A mixture density propagation algorithm to estimate 3D human motion in monocular video sequences, based on observations encoding the appearance of image silhouettes. An extended version of the CVPR05 paper above.
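The three ingredients listed above can be written compactly. The block below is only a notational sketch: the symbols (state x_t, observation r_t, observation history R_t, gates g_i, expert parameters W_i and Sigma_i, number of experts M) are assumptions introduced here for illustration and are not taken from this page. It shows a conditional mixture of experts for a multimodal state distribution given an observation, followed by a discriminative filtering recursion that conditions on observations rather than modeling them.

```latex
% Assumed notation: x_t = state, r_t = observation, R_t = (r_1, ..., r_t).
% Local conditional: mixture of M experts with input-dependent gates g_i.
p(x \mid r) = \sum_{i=1}^{M} g_i(r)\, \mathcal{N}\!\left(x;\, W_i r,\, \Sigma_i\right),
\qquad g_i(r) \ge 0, \quad \sum_{i=1}^{M} g_i(r) = 1 .

% Temporal inference in a discriminative chain (filtering form):
p(x_t \mid R_t) = \int p(x_t \mid x_{t-1}, r_t)\; p(x_{t-1} \mid R_{t-1})\, \mathrm{d}x_{t-1} .
```

If both factors under the integral are Gaussian mixtures with linear expert means, the integral is again a Gaussian mixture, which is one way a parametric propagation scheme could stay in closed form.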
Density Propagation for Continuous Temporal Chains. Generative and Discriminative Models (C. Sminchisescu, A. Jepson), Technical Report CSRG-501, University of Toronto, October 2004. A short technical note on various independence assumptions and propagation rules for filtering and smoothing in both generative and discriminative chain models, including extensions to windows of observations with arbitrary size.
Variational Mixture Smoothing for Non-Linear Dynamical Systems (C. Sminchisescu, A. Jepson), poster, talk, IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2004. A compact, computationally efficient mixture smoother for non-linear, non-Gaussian dynamical systems. Combines dynamic programming, sparse robust non-linear optimization and variational refinement to construct a bounded Bayesian approximation (in a KL sense) to the true joint state posterior, given an entire observation sequence. The motivation is twofold. First, for non-linear dynamical systems iterated Kalman smoothing is not applicable, and direct MCMC or particle smoothers tend not to be efficient. Second, an accurate, compact multi-modal representation is desirable in many applications where a sample-based mean state / trajectory output is uninformative, even if it is correct. Practical applications include high-dimensional systems with temporal state distributions that have persistent, as opposed to transient, multi-modality, e.g. monocular 3D human tracking. We give a quantitative study for this problem, and demonstrate how the algorithm reconstructs multiple plausible high-quality 3D human motion trajectories from difficult monocular video.
Generative Modeling for Continuous Non-Linearly Embedded Visual Inference (C. Sminchisescu, A. Jepson), poster, talk, International Conference on Machine Learning, ICML 2004. Many difficult visual perception problems like 3D human motion estimation can be formulated as inference using complex generative models defined over high-dimensional state spaces. Optimizing such models is difficult because prior knowledge cannot be flexibly integrated in order to reshape an initially designed representation space. We present a learning and inference algorithm that restricts visual tracking to automatically learned, non-linearly embedded, low-dimensional spaces. This formulation produces a layered generative model with reduced state representation. Inference can conveniently be based on efficient continuous optimization methods. We introduce a prior flattening method to allow a simple analytic treatment of low-dimensional intrinsic curvature constraints and the computation of consistent geodesics. We analyze reduced manifolds for human interaction activities and demonstrate that the algorithm learns models that are useful for tracking and for the reconstruction of 3D human motion in monocular video.
Optimal Inference for Hierarchical Skeleton Abstraction (A. Telea, C. Sminchisescu, S. Dickinson), IEEE International Conference on Pattern Recognition, ICPR 2004. Generating an abstracted skeleton hierarchy based on both boundary and internal structure simplifications, so that objects that are perceptually similar will have more similar shock graphs (and therefore will be easier to match). Formulated as 2D parameter inference over the boundary and internal branch simplification dimensions of a skeleton. The criterion is a Bayesian-inspired energy function that trades off silhouette-from-skeleton reconstruction accuracy and skeleton parsimony (small number of internal branches, encoded in the eccentricity), as inspired by an MDL principle. An earlier version of the above appeared as Hierarchical Skeleton Abstraction (A. Telea, C. Sminchisescu, S. Dickinson), Technical Report CSRG-480, University of Toronto, January 2004.
A Mode-Hopping MCMC Sampler (C. Sminchisescu, M.Welling, G. Hinton), Technical Report CSRG-478, University of Toronto, September 2003. Accelerating fair sampling from an equilibrium distribution using advanced knowledge of its dominant maxima. It combines local moves with inter-maxima jumps in a way that obeys detailed balance, in an MCMC algorithm. Applied to learning random fields, using contrastive-divergence, and to sampling human poses in monocular images, using local gradient-based Hybrid MC moves and deterministic long range jumps, based on maxima position and covariance. These can be pre-computed using a minima enumeration method. For general search methods that apply to any differentiable probability density, see Building Roadmaps of Local Minima of Visual Models (C.Sminchisescu, B.Triggs), ECCV 2002. For enumeration methods that further exploit the symmetric structural ambiguities of the monocular 3D articulated pose, see Kinematic Jump Processes for Monocular 3D Human Tracking (C.Sminchisescu, B.Triggs), CVPR 2003.
Kinematic Jump Processes for Monocular 3D Human Tracking (C.Sminchisescu, B.Triggs), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2003. (See the VIDEOS and talks I've given at the Royal Swedish Academy of Sciences, Mittag-Leffler Institute, and the one at the CVPR conference). Using interpretation trees and inverse kinematic reasoning in a covariance-scaled diffusion framework to generate highly accurate hypotheses for difficult multi-modal search problems with long-range structural ambiguities like monocular 3D articulated tracking. See Variational Mixture Smoothing for Non-Linear Dynamical Systems (C. Sminchisescu, A. Jepson), CVPR'04, for a multiple hypothesis smoother that can be effectively used in tandem with this kinematic jump tracker.
Building Roadmaps of Local Minima of Visual Models (C.Sminchisescu, B.Triggs), poster, European Conference on Computer Vision, ECCV 2002, Copenhagen. Two novel local optimization based methods for finding nearby saddle points and hence nearby local minima of a convoluted high-dimensional cost surface. Eigenvector Tracking is a modified (min-max) form of Newton minimization based on eigenvector following. Hypersurface Sweeping propagates a moving hypersurface through space, tracking minima within it. Applied to 3D human tracking from monocular video.
Hyperdynamics Importance Sampling (C.Sminchisescu, B.Triggs), European Conference on Computer Vision, ECCV 2002, Copenhagen. MCMC sampling in a modified potential function that focuses samples on nearby saddle points based on the local gradient and curvature of the input distribution. Used to find nearby local minima of a convoluted high-dimensional cost surface and applied to 3D human tracking from monocular video.
Consistency and Coupling in Human Model Likelihoods (C.Sminchisescu), IEEE International Conference on Automatic Face and Gesture Recognition, FGR 2002, Washington. Building more coherent likelihood models based on silhouette and contour features for human localization. Avoids the spurious local minima and singularities resulting from inconsistent model/image explanation or independent model/image matching.
Human Pose Estimation from Silhouettes. A Consistent Approach Using Distance Level Sets (C.Sminchisescu, A.Telea), International Conference on Computer Graphics, Visualization and Computer Vision WSCG 2002, Czech Republic. Estimating monocular human pose using continuous optimization and a silhouette likelihood model built with distance level sets.
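As a rough illustration of the distance-map idea mentioned above, the following sketch computes a brute-force Euclidean distance map for a tiny binary silhouette and scores predicted model pixels by their squared distance to it. The function names (`distance_map`, `silhouette_cost`) and the toy mask are hypothetical, and the cost shown is only one plausible ingredient, not the likelihood model used in the paper.

```python
import math

def distance_map(mask):
    """Brute-force Euclidean distance from every pixel to the nearest silhouette pixel."""
    fg = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    return [[min(math.hypot(r - fr, c - fc) for fr, fc in fg)
             for c in range(len(mask[0]))]
            for r in range(len(mask))]

def silhouette_cost(dist, model_pixels):
    """Sum of squared distances from predicted model pixels to the silhouette."""
    return sum(dist[r][c] ** 2 for r, c in model_pixels)

# Toy 4x5 silhouette (1 = foreground); purely illustrative data.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
dist = distance_map(mask)

inside = [(1, 1), (2, 2)]    # predicted pixels that fall on the silhouette
shifted = [(1, 3), (2, 4)]   # predicted pixels displaced to the right

print(silhouette_cost(dist, inside))   # zero cost: the prediction explains the silhouette
print(silhouette_cost(dist, shifted))  # cost grows smoothly with displacement
```

Because the distance map varies smoothly away from the silhouette, a cost of this kind gives a continuous optimizer a useful signal even when the model projection does not yet overlap the silhouette.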
Covariance Scaled Sampling for Monocular 3D Body Tracking (C.Sminchisescu, B.Triggs), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2001, Hawaii. Using local optimization, covariance estimates and oversized cost-sensitive sampling to build more reliable probabilistic trackers for ill-conditioned high-dimensional problems such as 3D human tracking.
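The sampling step described above can be sketched as drawing Gaussian samples around a local optimum using its covariance estimate, inflated by a scale factor so that samples reach further along poorly constrained directions. This is a minimal sketch under stated assumptions: the function names, the toy covariance and the scale value are hypothetical, and the local optimization and cost-sensitive reweighting that the paper combines with sampling are omitted.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L * L^T == a, for a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def covariance_scaled_samples(mode, cov, scale, n, rng):
    """Draw n Gaussian samples around `mode`, with standard deviations inflated by `scale`."""
    L = cholesky(cov)
    dim = len(mode)
    samples = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # x = mode + scale * L z, so the sample covariance is scale^2 * cov.
        x = [mode[i] + scale * sum(L[i][k] * z[k] for k in range(i + 1))
             for i in range(dim)]
        samples.append(x)
    return samples

# Toy ill-conditioned problem: the first coordinate is far less constrained than the second.
rng = random.Random(0)
mode = [1.0, -2.0]
cov = [[9.0, 0.0], [0.0, 0.01]]
samples = covariance_scaled_samples(mode, cov, scale=4.0, n=2000, rng=rng)
```

In a tracker, each sample would typically seed a further local optimization; the point of the scaling is that the search effort follows the shape of the uncertainty instead of spreading uniformly.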
Improving the Scope of Deformable Model Shape and Motion Estimation (C.Sminchisescu, D.Metaxas, S.Dickinson), IEEE International Conference on Computer Vision and Pattern Recognition, CVPR 2001, Hawaii. Using trilinear geometric constraints to increase the shape coverage of a deformable model with new line features, incrementally, during tracking. The newly detected features are fused as soft constraints in order to obtain a motion estimate that is both robust and unbiased.
A Framework for Generic State Estimation in Computer Vision Applications (C.Sminchisescu, A.Telea), International Conference on Computer Vision Systems, ICVS 2001, Vancouver. A modeling framework for structuring optimal state estimation applications in computer vision, and an architecture based on object-orientation and dataflow principles to implement it. Includes various examples ranging from articulated human tracking to reduced modeling and tracking in fluid dynamics datasets.
A Robust Multiple Hypothesis Approach to Monocular Human Motion Tracking (C.Sminchisescu, B.Triggs), INRIA Research Report No. 4208, June 2001. A summary of our initial work on estimating 3D human articulated motion from monocular video, including robust feature extraction and a combined optimization + sampling framework.
Incremental Model-Based Estimation Using Geometric Consistency Constraints (C.Sminchisescu, D.Metaxas, S.Dickinson), INRIA Research Report No. 4209, June 2001. A summary of our initial work on building flexible deformable models that increase their shape representation on the fly.
Learning in Medical Image DataBases (C.Sminchisescu), Rutgers University Report, 1998. Applying Bayesian and nearest neighbour techniques for pathology classification in a database of dental x-rays.
Indexing Methods in High Dimensional Spaces (C.Sminchisescu), Rutgers University Report, 1998. A study of randomized divide and conquer algorithms for high-dimensional indexing in spaces of moderate dimension (10-14).
Physics-Based Deformable Models for Graphics and Vision (C.Sminchisescu), presentation slides, 1999. Formulation of deformable models for vision and graphics, including shape modeling, kinematics and dynamics, extracting `forces' from images and multi-camera visual tracking applications.
On the Role of the Hippocampus and Neocortex in Learning and Memory. An Overview on Present Modeling Accounts (C.Sminchisescu), Technical Report, Department of Electrical Engineering and Computer Science, "Politehnica" University of Bucharest, July 1997. An overview of the interactions between hippocampus and neocortex in memory formation, from an information processing perspective: hippocampus as a `teacher' versus `representation provider'.
Languages and Compiler Design
An Object-Oriented Approach to C++ Compiler Technology (C.Sminchisescu, A.Telea), European Conference on Object Oriented Programming-PHOOS Workshop 1999, Lisbon. An approach for building compiler front ends for languages like C++ that involve complex syntax (arbitrary lookahead and backtracking during parsing) and semantics (virtual object model).
A Component-Based Dataflow Framework for Simulation and Visualization (A.Telea, C.Sminchisescu), European Conference on Object Oriented Programming-PHOOS Workshop 1999, Lisbon. Description of a system that allows seamless integration of object-orientation and data-flow for building complex scientific simulation (like FEM) or visualization applications.
On the Separation of Interfaces and Implementations. The Mutual Interaction between Programming Languages and Object-Oriented Design Patterns (C.Sminchisescu), PCReport, volume 8, August 1997, Agora Press, pp. 25-28. A note on the importance of separating interfaces and implementations and on flexibly checking interface conformance at run-time in object-oriented languages.
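To make the idea of run-time interface conformance concrete, here is a small sketch in Python (chosen for brevity; the article itself does not specify this language or mechanism). It uses the standard `typing.Protocol` with `runtime_checkable`, so a class conforms to an interface by having the right methods, without inheriting from it. The names `Shape`, `Square`, `Label` and `total_area` are hypothetical.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Shape(Protocol):
    """Interface: anything that can report an area."""
    def area(self) -> float: ...

class Square:
    """Implementation; note that it does not inherit from Shape."""
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side

class Label:
    """Unrelated class that lacks the `area` method."""
    def __init__(self, text: str) -> None:
        self.text = text

def total_area(items):
    """Sum areas, checking conformance to the interface at run time."""
    total = 0.0
    for item in items:
        # isinstance on a runtime_checkable Protocol checks that the
        # required methods are present, not that the class inherits from it.
        if not isinstance(item, Shape):
            raise TypeError(f"{type(item).__name__} does not conform to Shape")
        total += item.area()
    return total

print(total_area([Square(2.0), Square(3.0)]))
```

The check only verifies that the required method names exist, not their signatures, so it complements rather than replaces static type checking.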
An Object Oriented Approach to Semantic Analysis for C++ (C.Sminchisescu), International Conference on Computer Science and Control, May 1997, pp 120-127 vol.2.
Code and Model Translations for C++. A Stage Inside the Semantic Transformer (I.Athanasiu, C.Sminchisescu), International Conference on Computer Science and Control, May 1997, pp 102-109, vol.2.
Software Reuse. From Design Patterns to Program Generalization (C.Sminchisescu), Technical Report, Department of Electrical Engineering and Computer Science, "Politehnica" University of Bucharest, 1997.
Developing an Object-Oriented Semantic Analyzer for C++ (C.Sminchisescu), Engineering Diploma Thesis, Politehnic Institute Bucharest, Romania, Department of Computer Science, and University of Sunderland, UK, Department of Computing and Information Systems, September 1996.