Winter 2005 Talk Descriptions

Attending to Visual Motion

John Tsotsos

Abstract: Visual motion analysis has focused on decomposing image sequences into their component features. There has been little success at re-combining those features into moving objects. Here, a novel model of attentive visual motion processing is presented that addresses both the decomposition of the signal into constituent features and the re-combination, or binding, of those features into wholes. A new feed-forward motion-processing pyramid, motivated by the neurobiology of primate motion processes, is presented. On this structure the Selective Tuning (ST) model for visual attention is demonstrated. There are three main contributions: 1) a new feed-forward motion processing hierarchy, the first to include a multi-level decomposition with local spatial derivatives of velocity; 2) examples of how ST operates on this hierarchy to attend to motion and to localize and label motion patterns; and 3) a new solution to the feature binding problem sufficient for grouping motion features into coherent object motion. Binding is accomplished using a top-down attentive selection mechanism that does not depend on a single location-based saliency representation. Several human experimental studies are summarized in support of the overall model.

An Overview of Recent Computer Vision Research
in the Georgia Tech Biotracking Project

Frank Dellaert

Abstract: The Georgia Tech Biotracking project is an effort funded by the US National Science Foundation to investigate the interplay between visual tracking and behavioral modeling. The project is currently focused on the behavior of social animals such as honeybees and ants, of which we have several colonies in our lab for study. The underlying hypothesis is that better tracking will lead to better behavioral models, and better models will in turn lead to better tracking. While we have yet to fully close the loop, to date we have made significant progress both in computer vision techniques for tracking single and multiple targets and in behavioral modeling of individual and interacting agents. In particular, we have investigated novel multi-target tracking algorithms based on MCMC sampling, novel appearance models for bee-tracking and Rao-Blackwellized particle filters to implement them efficiently, as well as non-parametric learning methods to model complex behavior. In addition to tracking work, we have also investigated various methods to interpret the behavior of single animals, in particular the stylized dancing behavior exhibited by bees, and have looked at both HMM approaches and switching linear dynamical systems. In my talk I will give an overview of the computer vision research in the project, including the latest exciting but as yet unpublished results.

Some of our work was prominently featured on CNN, and an overview can be found at http://www.cc.gatech.edu/~borg/biotracking/

Credits: the Georgia Tech Biotracking project leads are Tucker Balch and Frank Dellaert. Current graduate students include Zia Khan, Grant Schindler, Adam Feldman, and Sang Min Oh.

Graph Algorithms for Computer Vision

Olga Veksler

Abstract: Many tasks in computer vision can be formulated as an instance of a labeling problem. In a labeling problem we wish to assign to every pixel a label from a certain label class. A label class encodes scene properties such as color, stereo depth, motion vector, etc. A natural constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, for example, at object boundaries. For such problems, graph algorithms have recently gained popularity as they provide powerful combinatorial optimization tools.

Traditional approaches to a labeling problem can be roughly divided into two groups: local and global. In this talk, I will show how to use graph algorithms to improve both the local and the global approaches. In the local approach, to determine the label of a pixel, only the pixels in the surrounding patch (or window) are considered. A central problem for the local method is selecting an appropriate window shape. I will show how to use the minimum ratio cycle algorithm to optimize the search for window shapes.

In the global approach, the desirable constraints on a labeling are encoded in a global objective function. Thus the central problem in the global approach is the optimization of this objective function. I will show how to use the graph cut algorithm to efficiently compute provably good approximations to a certain class of objective functions which are of particular interest for vision.
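
For readers unfamiliar with the setup, a global objective function for labeling is commonly written in the following form. This is a generic illustration rather than the specific class of functions treated in the talk; the symbols f, D_p, V_pq, P, and N are assumptions introduced here, not taken from the abstract.

```latex
% Generic labeling energy (illustrative; symbols are assumptions):
%   f      : a labeling, assigning label f_p to each pixel p
%   P      : the set of pixels
%   N      : the set of neighbouring pixel pairs
%   D_p    : data cost of giving pixel p the label f_p
%   V_{pq} : smoothness cost for neighbouring labels f_p, f_q
E(f) \;=\; \sum_{p \in P} D_p(f_p) \;+\; \sum_{(p,q) \in N} V_{pq}(f_p, f_q)
```

The first sum rewards agreement with the observed image data, and the second penalizes label changes between neighbours, which is where the preference for smooth labelings with occasional sharp discontinuities enters.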

Finally, I will present a very fast global approach based on dynamic programming. Even though this approach is very fast, it does not perform a true 2D optimization, and therefore is not as accurate as the global approach based on graph cuts. However, it does give a nice tradeoff in terms of speed vs. accuracy.
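
To make the idea of dynamic programming on a labeling problem concrete, here is a minimal sketch that exactly optimizes the labels of a single scanline (a 1D chain of pixels). It is not the speaker's algorithm: the function name `label_scanline`, the cost table layout, and the constant penalty for any label change between neighbours are all assumptions made for illustration. Because each scanline would be solved independently, a method of this kind does not perform a true 2D optimization.

```python
def label_scanline(data_cost, smooth_penalty):
    """Exactly minimise, over labelings f of one scanline,
        sum_i data_cost[i][f_i] + smooth_penalty * (number of i with f_i != f_{i+1}).

    data_cost[i][l] is the (hypothetical) cost of giving pixel i the label l.
    Returns (labels, total_cost).
    """
    num_pixels = len(data_cost)
    num_labels = len(data_cost[0])

    # best[l] = minimum cost of labeling pixels 0..i with pixel i taking label l
    best = list(data_cost[0])
    back = []  # back[i-1][l] = best label for pixel i-1 given pixel i has label l

    for i in range(1, num_pixels):
        new_best = []
        choices = []
        for l in range(num_labels):
            # Pick the cheapest predecessor label, paying the penalty on a change.
            prev = min(
                range(num_labels),
                key=lambda m: best[m] + (0 if m == l else smooth_penalty),
            )
            transition = best[prev] + (0 if prev == l else smooth_penalty)
            new_best.append(transition + data_cost[i][l])
            choices.append(prev)
        best = new_best
        back.append(choices)

    # Trace the optimal labeling backwards from the cheapest final label.
    label = min(range(num_labels), key=lambda l: best[l])
    total = best[label]
    labels = [label]
    for choices in reversed(back):
        label = choices[label]
        labels.append(label)
    labels.reverse()
    return labels, total


if __name__ == "__main__":
    # A clean step edge: the discontinuity is kept.
    print(label_scanline([[0, 5], [0, 5], [3, 2], [5, 0], [5, 0]], 2))
    # A single noisy pixel: the smoothness penalty removes it.
    print(label_scanline([[0, 1], [1, 0], [0, 1]], 2))
```

The running time is proportional to the number of pixels times the square of the number of labels, which is why chain-structured dynamic programming is fast compared with a full 2D optimization.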

3D vision sensors and their application
to wheelchair collision avoidance

Jesse Hoey

Abstract: This will be an informal talk primarily about two state-of-the-art 3D vision sensors, the Bumblebee stereo camera from Point Grey Research, and the time-of-flight infrared depth sensor from Canesta, Inc. I will first discuss an application project in the Intelligent Assistive Technology and Systems Laboratory (IATSL) for wheelchair collision avoidance using 3D sensors. I will give a brief overview of the project and our proposed approach. Then I will talk about how both of the sensors operate, and discuss the pros and cons of each. I will then demonstrate both sensors, and will expect participation and critical discussion from the audience. I will show some preliminary results using the Bumblebee for collision avoidance on the wheelchair. Time permitting, I will discuss some related autonomous robotics projects at UBC.

For more info see: http://www.cs.toronto.edu/~jhoey/wheel/index.html

Recent Results in Graph Cuts: Illumination-Invariant Tracking and k-Pixel Interactions

Daniel Freedman

Abstract: Over the last few years, techniques based on combinatorial optimization have gained a following in computer vision. In particular, a variety of vision algorithms have been formulated which rely on min-cut / max-flow methods. In this talk, we introduce two new results in this vein. First, we show an application of graph-cut methods to the problem of tracking under large variations in illumination. This is a challenging problem which is relevant for a number of surveillance and military applications. Second, we present new results on sufficient conditions for min-cut optimization of energy functions with k-wise interactions of pixels. These conditions should prove useful in a variety of MRF-style problems.

Graph Cuts in Vision: Theory and Applications

Yuri Boykov

Abstract: Min-Cut algorithms on graphs have emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. We review a number of applications and discuss some theoretical properties that motivate the use of graph cuts. Intuitively, graph cuts can be presented as a "generalization" of Dynamic Programming and shortest path algorithms to N-D optimization problems. The geometric interpretation of a cut as a hypersurface is one of the key insights. We discuss our recent results connecting min-cuts with Riemannian minimal surfaces. We also discuss limitations of graph cuts and review some new directions of research.

Confocal Stereo

Sam Hasinoff

Abstract: At close scales, hair and many other scenes exhibit sufficient texture to suggest that classic depth-from-focus techniques could prove useful for 3D reconstruction. Our goal is to recover highly detailed geometry from such scenes (e.g., to the level of individual strands of hair), using images from a very high-resolution 12MP digital SLR camera. Compared to previous research on depth-from-focus, our approach differs in two significant ways. First, by manipulating both aperture and focus setting, we can use richer models of appearance variation to estimate depth, and potentially reconstruct occluded layers as well. Second, because of our high demands on accuracy and the high-resolution, zoomed-in nature of the scenes, we require a more extensive calibration model than others. Here I discuss some of our current ideas (and explain why we call this technique "confocal stereo"), present our new calibration technique, and demonstrate some preliminary results.

Page last modified on Saturday, April 30, 2005