Winter 2009 Talk Descriptions
Abstract:
We present a method for learning hierarchical class-specific features for recognition. Recently, a greedy layerwise procedure was proposed to initialize the weights of deep belief networks by viewing each layer as a separate Restricted Boltzmann Machine (RBM). We develop the Convolutional RBM (C-RBM), a variant of the RBM model in which weights are shared to respect the spatial structure of images. This framework learns a set of features that can generate the images of a specific object class, and serves as the building block of our hierarchical model. This hierarchical feature detector is a four-layer structure of alternating filtering and maximum subsampling. We learn the feature parameters of the first and third layers by viewing them as separate C-RBMs. The output of our feature extraction hierarchy is fed as input to a discriminative classifier. We demonstrate experimentally that the extracted features are effective for object detection, obtaining performance comparable to the state of the art on handwritten digit recognition and pedestrian detection.
Abstract:
Most face recognition algorithms use "distance-based" methods: feature vectors are extracted from each face and distances in feature space are compared to determine matches. In this talk I will argue for a fundamentally different approach. We consider each image as having been generated from an underlying cause (a latent identity variable, or LIV). In recognition we evaluate the probability that two faces have the same underlying cause. Since image generation is noisy, we can never be exactly certain what this cause was, so we integrate (marginalize) over all possible causes. We present examples of identification and verification and show that our approach outperforms equivalent distance-based algorithms. We will also show how to build models that can cope with wide pose and illumination variation and how to leverage these models for creating realistic images of faces.
Abstract:
Light sources and cameras are optical duals: sources emit light rays while
the cameras capture them. This talk will argue that light sources can
serve as better cameras in many applications, advancing the state of the
art in computer vision. First, by moving a light source instead of a camera, we
show how to reconstruct highly intricate shapes like wreaths, corals and
tree branches from hundreds of 'views'. Second, we leverage the
'illumination dithering' in the micromirror array of DLP projectors to
speed up virtually any active vision algorithm, resulting in high-speed 3D
reconstruction, photometric stereo, appearance capture and high frequency
preserving motion blur. Finally, we show what can be learned about the
camera, by observing the main outdoor illuminants (sky and sun) in a
time-lapse image sequence, and how it can be used for appearance transfer
across different scenes.
Abstract:
We report on a continuing assessment of the imaging performance of an operator-independent breast imaging device based
on the principles of acoustic tomography. The data were collected with a clinical prototype at the Karmanos Cancer
Institute in Detroit, MI, from patients recruited at our breast center. Tomographic sets of images were constructed from
the data and used to form 3-D image stacks corresponding to the volume of the breast. Our techniques generated whole
breast reflection images as well as images of the acoustic parameters of sound speed and attenuation. The combination
of these images reveals major breast anatomy, including fat, parenchyma, fibrous stroma and masses. The three types of
images are intrinsically co-registered because the reconstructions are performed using a common data set acquired by
the prototype. Fusion imaging, utilizing thresholding, is shown to visualize mass characterization and to facilitate
separation of cancer from benign masses. These initial results indicate that operator-independent whole-breast imaging
and the detection and characterization of cancerous breast masses are feasible using acoustic tomography techniques.
Future improvements in image processing, including denoising and segmentation, will be discussed as a next step in
improving the clinical performance of the prototype.
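The abstract does not say how the thresholds are chosen or how the three image types are combined, so the following is only a minimal sketch of one way threshold-based fusion of co-registered slices might look. The function name `fuse_by_threshold`, the rule that a pixel is flagged when both sound speed and attenuation exceed their thresholds, and the numeric values are all assumptions made for illustration, not the speakers' method.

```python
def fuse_by_threshold(reflection, sound_speed, attenuation, speed_min, atten_min):
    """Pair each reflection pixel with a flag from the two acoustic parameters.

    The three inputs are equal-sized 2-D lists representing co-registered
    slices. A pixel is flagged when BOTH sound speed and attenuation exceed
    their thresholds (an assumed rule, chosen only for illustration).
    """
    fused = []
    for refl_row, speed_row, atten_row in zip(reflection, sound_speed, attenuation):
        fused_row = []
        for refl, speed, atten in zip(refl_row, speed_row, atten_row):
            flagged = speed > speed_min and atten > atten_min
            fused_row.append((refl, flagged))
        fused.append(fused_row)
    return fused


# Hypothetical 2x2 slices; units and values are placeholders, not clinical data.
reflection = [[0.1, 0.2], [0.3, 0.4]]
sound_speed = [[1450, 1550], [1560, 1440]]
attenuation = [[0.2, 0.9], [0.1, 0.8]]

fused = fuse_by_threshold(reflection, sound_speed, attenuation,
                          speed_min=1500, atten_min=0.5)
for row in fused:
    print(row)
```

Because the slices share one pixel grid, the fusion needs no resampling or alignment step, which is the practical benefit of the intrinsic co-registration described above.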