Winter 2004 Talk Descriptions
Automated Gesture Recognition
within a Linguistics-Based Framework
Abstract: Automated gesture recognition has the potential to create powerful interfaces between man and his artifacts. Computer vision offers a potential means of acquiring and interpreting gesture information while being minimally obtrusive to the human participant. Toward this end, an approach is being developed for recognizing human hand gestures from a monocular temporal sequence of images. Of particular concern is the representation and recognition of hand movements that are used in single-handed American Sign Language (ASL). The approach exploits previous linguistic analyses of manual languages that decompose dynamic gestures into their static and dynamic components. The first level of decomposition is in terms of three sets of primitives: hand shape, location and movement. Further levels of decomposition involve lexical and sentence-level analysis. In the current work, we propose and demonstrate that, given a monocular gesture sequence, kinematic features can be recovered from the apparent motion that provide distinctive signatures for 14 primitive movements of ASL. The approach has been implemented in software and evaluated on a database of over 700 gesture sequences. The results suggest that the approach is applicable to the analysis and interpretation of complex gesture sequences.
This research was funded by IRIS and was performed in collaboration with R. Wildes and J. Tsotsos.
The Deep Structure of Images;
Critical Points, Paths and Hierarchy
Abstract: In this presentation I will give an introduction to the deep structure of images. I will explain the ideas behind scale space and discuss important structures in the deep structure of images. The main focus will be on critical points and critical paths in this deep structure and how we can use these to describe images. Finally, I will suggest some methods to capture the hierarchy implied by the deep structure in a graph or rooted tree structure. These graphs or trees must describe the image in such a way that we can use them for image matching, segmentation or recognition.
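As a rough illustration of the scale-space idea mentioned in this abstract, the sketch below smooths a 1D signal with Gaussians of increasing width and lists the critical points (local extrema) that survive at each scale. It is a minimal sketch, not the speaker's method: the talk concerns 2D images, and the function names, the 1D setting, the 3-sigma kernel truncation and the border replication are all assumptions made here for brevity.

```python
import math

def gaussian_kernel(sigma):
    """Sampled, normalized 1D Gaussian truncated at about 3 sigma (assumed cutoff)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve the signal with a Gaussian, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the signal
            acc += w * signal[j]
        out.append(acc)
    return out

def critical_points(signal):
    """Indices of strict interior local maxima and minima."""
    points = []
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            points.append(i)
    return points

def scale_space_critical_points(signal, sigmas):
    """Map each scale sigma to the critical points of the smoothed signal."""
    return {sigma: critical_points(smooth(signal, sigma)) for sigma in sigmas}

if __name__ == "__main__":
    # Hypothetical test signal: a slow oscillation plus fine detail.
    demo = [math.sin(0.2 * i) + 0.5 * math.sin(2.0 * i) for i in range(100)]
    for sigma, points in scale_space_critical_points(demo, [0.5, 2.0, 4.0]).items():
        print(sigma, len(points))
```

Following how individual extrema move and disappear as sigma grows gives a 1D analogue of the critical paths the abstract refers to; the scales at which they vanish suggest one possible ordering for a hierarchy.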
Content Based Image Retrieval
using Multi Scale Top Points
Abstract: A feasibility study for a new method for content based image retrieval is presented. First, an image representation using multiscale top points is introduced. This representation is validated using a minimal variance reconstruction algorithm. The image retrieval problem can now be translated into comparing distances between point sets. For this purpose, the proportional transportation distance (PTD) is used. A method is proposed using multiscale top points and their reconstruction coefficients in the PTD to define these distances between images. We present some experiments with promising results on a database with face images.
Decomposing biological motion:
Analysis and synthesis of human gait.
Abstract: Biological motion contains information about the identity of an agent, and about his or her actions, intentions and expressions. The human visual system is highly sensitive to animate motion patterns and capable of extracting socially relevant information from them. Here, I outline a framework to artificially obtain gender and other characteristics of the agent from human motion patterns and subsequently use this information to synthesize motion with particular, well-defined biological and psychological attributes. The proposed model is based on the statistics of a database of motion capture data. Based on linearization of the motion data, a motion space is defined which is spanned by the first few principal components obtained from a database of input walkers. Using biological and psychological traits attributed to the input walkers, linear discriminant functions are computed which define vectors in the motion space that generalize the respective trait. The linear discriminant functions are in turn used to generate walking patterns with the respective properties.
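The analysis-then-synthesis loop described in this abstract can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the speaker's model: it assumes each walker is already a short coefficient vector in the motion space (the principal component step is skipped), and it uses the difference of the two class means as the discriminant axis, which matches a Fisher linear discriminant only when the within-class scatter is isotropic. All names and data are hypothetical.

```python
import math

def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

def discriminant_axis(class_a, class_b):
    """Unit vector pointing from the mean of class_a to the mean of class_b.

    Simplifying assumption: a mean-difference axis, not a full Fisher discriminant.
    """
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    diff = [b - a for a, b in zip(mean_a, mean_b)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("class means coincide; no discriminant axis")
    return [d / norm for d in diff]

def score(walker, axis, origin):
    """Analysis: signed position of a walker along the trait axis."""
    return sum((w - o) * a for w, o, a in zip(walker, origin, axis))

def synthesize(origin, axis, amount):
    """Synthesis: a point in motion space displaced along the trait axis."""
    return [o + amount * a for o, a in zip(origin, axis)]

if __name__ == "__main__":
    # Hypothetical motion-space coefficients for two groups of walkers.
    group_a = [[-1.0, 0.0], [-1.0, 2.0]]
    group_b = [[1.0, 0.0], [1.0, 2.0]]
    axis = discriminant_axis(group_a, group_b)
    origin = mean_vector(group_a + group_b)
    exaggerated = synthesize(origin, axis, 2.0)
    print(axis, origin, exaggerated, score(exaggerated, axis, origin))
```

In a fuller pipeline the synthesized coefficient vector would be mapped back through the principal components to joint trajectories; that step is omitted here.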
The application of computer vision in assisting
older adults with dementia
Abstract: Older adults constitute the fastest growing population group in Canada. As such, finding ways of supporting older adults who wish to continue living independently in their own homes, as opposed to moving to a long-term care facility, is a growing social problem. However, the goal of "aging-in-place" is becoming increasingly difficult as more older adults are living alone in their homes (often in rural areas), and as the proportion of this population with a cognitive disability such as dementia increases. It has been hypothesized, however, that through the careful placement of technological support, such as computer vision aided artificially intelligent systems, these difficulties can be reduced, and older adults will be able to continue living safely in their own homes longer.
This presentation, with a focus on the application of computer vision, will provide an overview of some of the technologies currently under development in the Intelligent Assistive Technology and Systems Lab. Specifically, two systems will be discussed: an intelligent environment that provides prompting to older adults with dementia during the completion of various activities, and an intelligent emergency response system that recognizes if a person has fallen and then calls for assistance.
Retinal Thickness Measurements and Optic Nervehead Tracking
in Optical Coherence Tomography
Abstract: This talk will open with a brief overview of some of the research activities underway at the Signal Analysis and Machine Perception Laboratory (SAMPL) at The Ohio State University. SAMPL was founded in 1986 by Prof. Boyer to conduct fundamental and applied research in computer vision and related topics. Current research activities include face and pattern recognition, aerial and satellite image understanding, perceptual organization, object recognition, target detection, and medical image analysis. Following the introductory material, the talk will focus on our recent research in the analysis of optical coherence tomography to map the human retina for the diagnosis and monitoring of glaucoma, macular edema, and other eye diseases. In particular, we will discuss our use of an autoregressive Markov model to trace retinal boundaries through extremely heavy speckle noise and a dual eigenspace approach to nervehead tracking in the associated video. This system is likely to become the "standard of care" in the clinical assessment of the retina.
Real-Time Implementation of Phase-Correlation
Stereo Algorithm using FPGAs
Abstract: This talk describes a real-time (30 fps) implementation of a phase-correlation based, dense-stereo algorithm using Field Programmable Gate Arrays (FPGAs). Reconfigurable hardware, including FPGAs, is an attractive platform for implementing vision algorithms because it can exploit the parallelism often found in these algorithms, and because of the speed with which applications can be developed as compared to hardware. Further, a single hardware substrate can be re-used in the implementation of different modules. The system described in this talk outputs 8-bit, subpixel disparity estimates for 256x360 pixel images at 30 frames/s, using a phase correlation algorithm for stereo disparity. The phase information is derived from the output of a set of multi-scale, steerable, complex filters applied to the input images. Despite the complexity of performing correlations on multi-scale, multi-orientation phase data, the system runs 30-300 times faster in hardware than its software implementation. This talk describes the hardware platform used, the algorithm, the design methodology and the issues encountered during its hardware implementation. Of particular interest is the implementation of the filters, which are widely used in computer vision algorithms. Several trade-offs were required both to fit the design into the available hardware and to achieve video-rate processing. Finally, results from the system are given both for synthetic data sets and for several standard stereo-pair test images.
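For readers unfamiliar with phase correlation, the sketch below shows the basic idea on a single 1D scanline: the shift between two signals appears as a linear phase ramp in their normalized cross-power spectrum, whose inverse transform peaks at the shift. This is only the textbook, global Fourier form with integer shifts and circular boundaries. It is not the system described in the talk, which derives local phase from multi-scale steerable filters and produces subpixel estimates in hardware. The function names and the naive O(n^2) DFT are choices made here to keep the example self-contained.

```python
import cmath

def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for short scanlines."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, value in enumerate(values):
            acc += value * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def phase_correlation_shift(reference, shifted, eps=1e-12):
    """Estimate the integer circular shift d such that shifted[n] == reference[n - d]."""
    ref_spectrum = dft(reference)
    shifted_spectrum = dft(shifted)
    cross_power = []
    for f, g in zip(ref_spectrum, shifted_spectrum):
        c = g * f.conjugate()
        # Keep only the phase; eps guards against division by zero.
        cross_power.append(c / max(abs(c), eps))
    correlation = [v.real for v in dft(cross_power, inverse=True)]
    n = len(correlation)
    peak = max(range(n), key=correlation.__getitem__)
    # Report shifts beyond half the length as negative displacements.
    return peak if peak <= n // 2 else peak - n

if __name__ == "__main__":
    # Hypothetical scanline intensities.
    left = [0, 1, 3, 2, 5, 4, 1, 0, 2, 7, 3, 1, 0, 4, 2, 6]
    right = left[-3:] + left[:-3]  # circularly shifted copy of the scanline
    print(phase_correlation_shift(left, right))
```

Because only phase is kept, the estimate is insensitive to overall brightness scaling between the two signals, which is part of the appeal of phase-based matching for stereo.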
Inference of Visual Motion Boundaries
Abstract: A long-standing problem in motion analysis has been the detection and estimation of motion in the neighborhoods of surface boundaries, where motion in the image is discontinuous and occlusions cause image structure to appear or disappear from one image to the next. While a problem for flow estimation, motion boundaries provide a rich source of information about the location of surface boundaries and the surface depth ordering.
This talk outlines a probabilistic framework for representing and estimating image motion in terms of multiple motion models, including both smooth motion and local motion discontinuities. We compute the posterior probability distribution over models and model parameters, given the image data, using a particle filter for propagating beliefs over space and time.
More seriously, this talk summarizes some work done over 2 or 3 years with Michael Black and Oscar Nestares. The talk was recycled for a small meeting in Vancouver a month ago... and now again here.
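The particle filter mentioned in this abstract can be illustrated with a minimal bootstrap filter for a single scalar quantity, for example one component of image velocity. This is a sketch under strong assumptions and not the framework from the talk: it uses one motion model rather than a mixture of smooth-motion and discontinuity models, propagates over time only, and assumes Gaussian random-walk dynamics and Gaussian observation noise. All names and parameter values are hypothetical.

```python
import math
import random

def gaussian_likelihood(observation, predicted, sigma):
    """Unnormalized Gaussian likelihood of an observation given a prediction."""
    return math.exp(-0.5 * ((observation - predicted) / sigma) ** 2)

def particle_filter_step(particles, observation, rng, process_sigma, obs_sigma):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    Returns the resampled particle set and the weighted posterior mean.
    """
    # Predict: diffuse each particle with the assumed random-walk dynamics.
    predicted = [p + rng.gauss(0.0, process_sigma) for p in particles]
    # Weight: score each particle against the new observation.
    weights = [gaussian_likelihood(observation, p, obs_sigma) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw a new equally weighted set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate

if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(-5.0, 5.0) for _ in range(500)]  # broad prior
    true_velocity = 2.0
    for _ in range(20):
        observation = true_velocity + rng.gauss(0.0, 0.3)
        particles, estimate = particle_filter_step(
            particles, observation, rng, process_sigma=0.1, obs_sigma=0.3)
    print(estimate)
```

The resampled set is a sample-based approximation of the posterior, so multi-modal beliefs (such as several competing motion hypotheses near a boundary) can be carried forward without committing to a single estimate.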
Extending approximated maximum likelihood estimation
for computer vision
Abstract: Several estimation problems in computer vision can be modelled using a unified statistical framework. Applying a maximum likelihood argument, and adopting necessary approximations, estimation algorithms can be developed. In this talk, a thorough analysis is made of such approximated-ML estimation schemes. This affords three benefits. Firstly, existing estimation algorithms can be easily understood and compared by being placed in a cohesive context. Secondly, a new algorithm which is straightforward to understand (and implement) can be derived from the formulation. Thirdly, estimators can be extended using the framework. For example, extra geometric constraints on parameters can be included directly in the estimation algorithms.

Page last modified on Saturday, November 20, 2004