Samuel W. Hasinoff and Kiriakos N. Kutulakos, A Layer-Based Restoration Framework for Variable-Aperture Photography. Proc. 11th IEEE International Conference on Computer Vision, ICCV 2007, 8 pp. (DVD proceedings). [pdf] [poster]
To test our approach on real data, we captured sequences using a Canon EOS 1Ds Mark II, secured on a tripod, with an 85mm f/1.2L lens set to manual focus. In all our experiments we use the three-image "aperture bracketing" mode set to ±2 stops, and select the shutter speed so that the images are captured at f/8, f/4, and f/2 (yielding relative exposure levels of roughly 1, 4, and 16, respectively). We captured RAW images for increased dynamic range, and demonstrate our results for downsampled 500x333 pixel images.
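The relative exposure levels quoted above follow from the aperture area scaling as the inverse square of the f-number. The snippet below is a minimal sketch of that arithmetic, assuming a fixed shutter speed across the bracket and ideal nominal f-numbers; the function name `relative_exposure` is ours, for illustration only.

```python
import math


def relative_exposure(f_numbers):
    """Exposure of each image relative to the narrowest aperture.

    Assumes a fixed shutter speed, so exposure is proportional to
    aperture area, which scales as 1 / N**2 for f-number N.
    """
    n_max = max(f_numbers)
    return [(n_max / n) ** 2 for n in f_numbers]


levels = relative_exposure([8.0, 4.0, 2.0])
stops = [math.log2(level) for level in levels]

print(levels)  # [1.0, 4.0, 16.0]
print(stops)   # [0.0, 2.0, 4.0], i.e. steps of 2 stops
```

Real lenses only approximate their nominal f-numbers, which is why the levels are described as "roughly" 1, 4, and 16.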
All the result videos (MPEG-2) below include a side panel with three sliders, to
help visualize the camera settings used to synthesize new images. The red zones
on the sliders indicate extrapolation.
Outdoor sequence, composed of three layers: a rusty dumpster, a pebbled wall, and a building. The foreground dumpster is darker and nearly in focus.
Indoor sequence, backlit with available light from the window. The nearly-focused subject is dark compared to the background buildings, and a very dark defocused chair sits in the foreground. Because the chair is under-exposed even in the widest-aperture image, we see artifacts at its boundary, due to posterization and over-smoothing.
Outdoor sequence, composed of two differently exposed structures—a dark wall is occluded by several bright stone pillars. Note how the method assigns slightly different depths to the two segments containing the gradually sloping background wall. Although not as noticeable in the synthesized results, the initial segmentation misassigns the lower-rightmost portion of the foreground ledge to the background layer.
Macro sequence (using a 10mm extension tube), composed of a miniature glass bottle whose inner surface is painted, and a dried bundle of green tea leaves. This is a challenging dataset for several reasons: the level of defocus is severe outside the very narrow depth-of-field, the scene consists of both smooth and intricate geometry (bottle and tea leaves, respectively), and reflections on the glass surface actually lead to focusing at "virtual" depths. The initial segmentation leads to a very coarse decomposition into layers, which is not improved by the optimization. The worst resynthesis artifacts occur at layer boundaries—the bright "cracks" visible when refocusing are due to the incorrect segmentation combined with our diffusion-based inpainting algorithm.
This synthetic dataset consists of an HDR version of the 512x512 pixel Lena image, where we simulate HDR by dividing the image into three vertical bands and artificially exposing each band. We decompose the image into layers by assigning different depths to each of three horizontal bands, and generate the input images by applying the forward image formation model. Finally, we add Gaussian noise to the input with a standard deviation of 1% of the intensity range.
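The band layout and noise step described above can be sketched as follows. This is only an illustration under stated assumptions: the per-band gains (1, 4, 16) are hypothetical, since the page does not give the values used, the "intensity range" is taken to be the max minus min of the image, and the forward image formation model (defocus and compositing) is deliberately left out. All function names are ours.

```python
import random


def band_index(position, size, n_bands):
    """Index of the equal-width band containing `position` in [0, size)."""
    return position * n_bands // size


def vertical_band_gains(width, gains):
    """Per-column exposure gain for len(gains) vertical bands."""
    return [gains[band_index(x, width, len(gains))] for x in range(width)]


def horizontal_band_labels(height, n_layers):
    """Per-row layer label for n_layers horizontal bands."""
    return [band_index(y, height, n_layers) for y in range(height)]


def add_gaussian_noise(image, sigma_fraction=0.01, seed=0):
    """Add zero-mean Gaussian noise with std = sigma_fraction * range."""
    rng = random.Random(seed)
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    sigma = sigma_fraction * (hi - lo)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in image]


# Tiny 6x6 stand-in for the 512x512 image, with a constant base intensity.
size = 6
gains = vertical_band_gains(size, [1.0, 4.0, 16.0])  # hypothetical gains
labels = horizontal_band_labels(size, 3)             # layer label per row
hdr = [[0.5 * gains[x] for x in range(size)] for _ in range(size)]
noisy = add_gaussian_noise(hdr, sigma_fraction=0.01, seed=0)
```

Here `gains` varies along columns (the three exposure bands) while `labels` varies along rows (the three depth layers), so every exposure band crosses every layer.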