Fall 2006 Talk Descriptions

A Dynamic Prior for Human Pose Tracking
Marcus Brubaker
Abstract: Good motion models for human pose tracking have been elusive. Previous efforts have focused on learning priors from motion capture data. Such solutions generalize well close to the original data but tend to generalize poorly to new dynamical situations such as shorter or longer stride lengths or walking on inclines. In this talk I will present a physics-based prior which uses abstract physical dynamics as its basis. These abstract models express many salient features of human locomotion but remain simple and manageable.

Optimization of fMRI Processing Pipelines
Stephen Strother
Abstract: I will introduce the components of an fMRI processing pipeline (i.e., experimental design, data acquisition, preprocessing and data analysis modeling), as an example of a scientific workflow, and briefly describe the basic steps of fMRI imaging. Using several fMRI studies I will demonstrate that components of this pipeline strongly interact (particularly with different data-analysis model choices) and that fMRI researchers are far from optimizing, or perhaps even understanding what it means to optimise, such pipelines. I will then describe our experience with a framework (dubbed NPAIRS) that utilises split-half resampling to obtain prediction and global-pattern reproducibility metrics. In addition to providing these metrics, this framework allows us to convert arbitrary spatial distributions of modeled image parameters to standard Gaussian parametric maps, and identify heterogeneous data observations.
Using NPAIRS I will illustrate non-standard uses of prediction (e.g., minimisation) as a function of global pattern reproducibility to measure (1) the dependence of activation patterns on processing pipeline choices in humans (3.125 mm^3) at 1.5 Tesla for a task involving application of manual, parametric static force, and (2) the spatial scale of visual orientation columns in a cat with ultra-high resolution fMRI (0.15 x 0.15 x 1 mm^3) at 9 Tesla. I will end by briefly discussing the possible clinical benefits of optimising fMRI processing pipelines.

The role of Manifold learning in Human Motion Analysis
Ahmed Elgammal
Abstract: The human body is an articulated object with high degrees of
freedom. Despite the high dimensionality of the configuration space, many human motion activities lie intrinsically on low dimensional
manifolds. Although the intrinsic body configuration manifolds might be very low in dimensionality, the resulting appearance manifolds are
challenging to model given the various factors that affect appearance, such as the shape and appearance of the person performing the motion,
or variation in viewpoint or illumination. Our objective is to learn representations for the shape and the appearance of moving
(dynamic) objects that support tasks such as synthesis, pose recovery, reconstruction, and tracking. We studied various approaches for
representing global deformation manifolds that preserve their geometric structure. Given such representations, we can learn
generative models for dynamic shape and appearance. We also address the fundamental question of separating style and content on nonlinear
manifolds representing dynamic objects. We learn factorized generative models that explicitly decompose the intrinsic body configuration
(content) as a function of time from the appearance/shape (style factors) of the person performing the action as time-invariant
parameters. We show results on pose recovery, body tracking, gait recognition, as well as facial expression tracking and recognition.

Is face recognition 'special'? An examination of psychological and neural mechanisms supporting face recognition
Marlene Behrmann
Abstract: Face recognition is often considered to be a special instance of visual recognition, demanding specialized, perhaps even dedicated psychological and neural mechanisms. To address this issue, behavioral data will be presented from three different populations of individuals all of whom are impaired at face recognition. Thereafter, functional and structural imaging data will be presented to explore the neural correlate of face processing in normal and impaired individuals. Taken together, the findings will support the view that face recognition is not 'special' and, instead, engages general visual processes which represent other classes of objects as well. Additionally, face recognition is supported by an underlying distributed network of cortical regions rather than being mediated by a particular, specialized cortical area.

Investigating Blur in the Framework of Reverse Projection
Scott McCloskey
Abstract: I present a reverse projection model for image formation, which is particularly useful for explaining blur. The model is used to develop methods for seeing "around" occluding objects and also recovering depth from defocus. The appearance of severely defocused occluding objects is modeled, giving rise to an image processing method to recover the radiance of the background in the partially-occluded region. The model also shows that, when out of focus, nearby pixels record light emitted from overlapping regions of the scene. This gives rise to a measurable increase in the correlation between such pixels, with the increase being proportional to scene depth. This principle is used to motivate a method of recovering depth from defocus.

Computer Vision for Panoramic Viewing and Augmented Reality
Mark Fiala
Abstract: The Computational Video Group (CVG) at the National Research Council of
Canada is involved in several areas of image processing and computer vision research and applications. Two such areas are the application of
panoramic cameras for robotics and "pano-presence", and fiducial marker systems used with non-panoramic cameras for "augmented reality"
visualization of 3D content.

Abstract: While vision systems that actively explore their environment have been an
established research field for many years, the interaction of vision systems with humans is still much less understood. In recent years,
two different trends have pushed forward computer vision techniques for human-machine interaction.
On the one hand, hand-held interactive devices are becoming smaller and smaller or disappearing completely into an ambient
intelligence environment. On the other hand, artificial communication partners are becoming embodied in a shared virtual or physical
environment. In both cases, the interaction space is extended from the display that is controlled by the computer system to the external
environment that has to be perceived through sensors. Intuitive and seamless communication needs to be established in a human-to-human-like
fashion using speech, gestures, and interpreting the actions of the user in his/her own natural environment.

Visual Recognition and Tracking for Perceptive Interfaces
Trevor Darrell
Abstract: Devices should be perceptive, and respond directly to their human user and/or environment. In this talk I'll present new computer vision algorithms for fast recognition, indexing, and tracking that make this possible, enabling multimodal interfaces which respond to users' conversational gesture and body language, robots which recognize common object categories, and mobile devices which can search using visual cues of specific objects of interest. As time permits, I'll describe recent advances in real-time human pose tracking for multimodal interfaces, including new methods which exploit fast computation of approximate likelihood with a pose-sensitive image embedding. I'll also present our linear-time approximate correspondence kernel, the Pyramid Match, and its use for image indexing, object recognition, and discovery of object categories. Throughout the talk, I'll show interface examples including grounded multimodal conversation as well as mobile image-based information retrieval applications based on these techniques.

Photometric Invariants from Color Subspaces
Todd Zickler
Abstract: Complex reflectance phenomena such as specular reflections confound many vision problems because they produce image 'features' that do not correspond directly to intrinsic surface properties such as shape and spectral reflectance. One approach to mitigating these effects is to explore functions of an image that are invariant to these complex photometric events. In this talk, I describe a family of such invariants that result from exploiting color information in images of dichromatic surfaces. These invariants are derived from subspaces of RGB color space, and they enable the application of Lambertian-based vision techniques (for stereo, shape from shading, motion estimation, photometric stereo, etc.) to a broad class of specular, non-Lambertian scenes.

Towards Robots that Learn by Communicating
Gerhard Sagerer
Abstract: One common goal of an increasing number of projects is the development
of autonomous personal robots that are able to acquire knowledge from the real world by interacting with humans and the environment. Starting
from paradigms of situated communication, cooperative construction and optimization of behaviors, we are focusing on learning by communication
based on the idea of robots as companions. In this approach, a robot is viewed as a communicating and learning agent in the real world.