2019-06-03-demosaicing
Reports
Interpolation-based demosaicing
- 2002_demosaicking_methods_for_bayer_color_arrays
- original method that
- bilinear interpolate G channel
- constant-hue assumption: interpolate R,B to maintain constant chrominance
- intro
- green is representative of luminance (response curve peaks ~550nm)
- linear interpolation fails at edges
- performance metrics
- MSE in RBB color space
- delta E error in CIELAB color space
- imaging model
- locally, red/blue perfectly correlated with green over a small neighborhood and differ from green by only an offset
- constant hue assumption
- hue: the he degree to which a stimulus can be described as similar to or different from stimuli that are described as red, green, blue, and yellow
- chrominance: red, blue
- lumimnance: green
- hue
R/G, B/G
- allowed to change gradually (reduce color fringe)
- if constant hue
R_ij / R_kl = G_ij / G_kl
- ratio
log(A/B)=logA-logB
, since exposure is logarithmic
- known R_ij, G_ij as measured values, G_lk is interpolated
- the missing chrominance
R_kl = G_kl (R_ij / G_ij)
- idea
- interpolate G using bilinear interpolation
- compute hue in R,B (
R<-R/G, B<-B/G
)
- interpolate hue in R,B with bilinear interpolation
- determine chrominance (B,R) from hue (
R = hue[0]*G
)
- 2004_high_quality_linear_interpolation_for_demosaicing_of_bayer_patterned_color_images
- 2006_color_demosaicing_using_variance_of_color_differences
- abstract
- missing green first estimated based on variances of color dfferences alone edge directions, missing red/blue estimated based on interpolated green plane
- goodness
- preserves textures details, reduce color artifacts
- previously
- (ACPI) adaptive color plane interpolation
- a base method
- extensions
- (PCSD) same interpolation direction for each color component of a pixel reduces color artifacts
- (AHDDA) local homogeneity as indicator to pick direction for interpolation
- ACPI
- steps
- interpolate green along direction of smallest gradient
- 2014_rethinking_color_cameras
- traditionally
- camera sub-sample measurements at alternating pixel locations and then demosaick measurements to create full color image by up-sampling
- but blocks majority of incident light
- each color filter blocks roughly 2/3 of incident light, color camera 3x slower than grayscale counterpart
- prone to artifact during reconstruction
- idea
- new computational approach for co-designed sampling pattern and reconstruction leading to reduction in noise and aliasing artifact
- pattern panchromatic
- no filter for majority of pixels -> measures luminance without subsampling
- avoid loss of light, and aliasing and high spatial-frequency variations
- color sampled sparsely
- 2x2 bayer block sparsely placed (reconsturction prevents loss of color-fidelity )
- then propagated via guidance from un-aliased luminance channel
- reconstruction
- infer missing luminance samples by hole-filling
- chromaticity propagated by colorization with a spatio-spectral image model
- faster and sharper color camera
- why it works
- redundancies in spatio-spectral content of natural images
- boundaries are sparse, exists contiguous regions where variations in colors primarily due to shading
- luminance and chrominance vary
- chromaticities, relative ratios between different channels, stays constant
- at boundaries, both luminance and chromaticities change abruptly
- steps
- Recovering luminance (at bayers blocks)
- wavelet transform, minimization problem
- estimating chromaticities at bayer blocks
- chromaticity (for R,G,B)
- ratio between a color channel value and the luminance value at the same pixel
c = m / l
c
chromaticity
m
measured value of r,g,b color channel
l
luminance
- assumption
- bayer block does not span material boundary
- so chromaticity assumed to be same for all 4 pixels in a block
- colro and luminance related
- luminance is a weighted sum of color r,g,b channels +formulation
- least squared minimization chromaticity of r,g,b separately
- minimizing
lc - m
, and
- regularized by a noise variance,
- biasing chromaticity at dark pixels toward gray
- remove outliers, using median filtering, on estimates on blocks that occur at boundaries or in dark regions
- propagating chromaticities
- partition
- KxK patches s.t. 4 corners of each patch includes a bayer block.
- chromaticity inin each patch as a linear combination of chromaticity at the 4 corners, where the weights based on luminance
- material affinity
alpha
- encode affinity between a bayer block and pixels around it
- for each bayer block
j
and sites n
around it, compute affinity
alpha_j[n]
within overlapping (2K+1) x (2K+1)
regions around the block
- optimization
- minimize a
- constrained with
alpha=1
at center and alpha=0
at other 8 bayer blocks
- combination weights
weight_j[n] \propsto alpha^2 * l
j \in {1,2,3,4}
, i.e. the 4 corners
- gives initial estimates
- the apply non-local kernel to refine chromaticity estimates
- experiments
- full-color image, then added gaussian noise
- color-sampling frequency K, trade-off between light-efficiency and sharpness.
- performance
- the good
- better reconstruction even in not-noisy scenarios
- a lot better in images with lots of noise
- the bad
- not able to reconstruct hue with very fine image structures
- changes in chromaticity happen at a rate faster than color sampling frequency K
- but still able to recover texture info
- questions
- recovering luminance
- confused with in-painting with wavelet-based approaches
- use wavelet decomposition and l1 minimization to infer luminance values
- not entirely sure of the inner workings
- propagating chromaticities
- confused about formulation of c as well,
- what does e or c represent.
- why so many convolutions…
- does not really relate to the project?
- more similar to spatially-varying exposure (SVE) images.
- no color captured in sparse locations?
- so what is the take-away idea from this paper?
Data-driven learning based
- 2003_enhancing_resolution_along_multiple_imaging_dimensions_using_assorted_pixels
- goal
- enhance resolution using local structure models (polynomial funtion of intensities) learnt offline from images
- multisampling
- a framework for using pixels to sample simultaneously multiple dimension of an image (space, time, spectrum - color, brightness - dynamic range, and polarization)
- learn structured model
- https://en.wikipedia.org/wiki/Nyquist_frequency
- interpolation methods enhances spatial resolution but introduce errors in blurring and aliasing
- different dimension of imaging highly correlated with reach other (due to reflectance and illumination of scene)
- local structured model
- learn a local mapping function from data,
- inputs: low-res multisampled images
- labels: correct high-resolution images
- model is a polynomial function of brightness measured within a local neighborhood
- learn local mapping
f
H(i, j) = f(M(X, Y))
where
H
is high quality value at pixel i, j
M
is measured low-res value at pixel x,y
in a neighbhorhood X,Y
around i,j
- use a single structured model for each type of local sampling pattern
- previous methods
- Markov model, bayesian, kernel
- novel
- resolution enhancement over multiple dimensions
- SVC and models
- SVC (spatially varying color)
- bayes color mosaic
- mosaic of R, G, B in digital color sensors
- have 4 different sampling pattern for a neighborhood of 3x3
- goal
- compute value of (R,G,B) at each pixel
- model
M_p(\lambda) = A_p C_p(\lambda)
p
represent pattern, \lambda
represent color
A_P
are measurements of neighborhood having pattern P
- where each row has all relevant powers of measured data
M_P
within a neighborhood
C_P(\lambda)
coefficients of polynomial mapping function
- optimize with least squared
- performance
- visual and luminance error
- stability of learnt model
- good on training data, or test data similar to training data
- if trainnig data is large, then random test data has larger error but is stable
- SVEC (spatially varying exposure and color)
- simutaneous sampling of space, color, and exposure
- given 8 bit single color -> construct 12 bit for each 3 color values at each pixel
- pattern
- still the same model,
- questions
- SVC model
- why is the polynomial model for SVC accounts for correlation between different color channels, seems the coefficients for the model is computed independently
- to get high quality value from low-res value in the neighborhood, why the model multiplies measurement for each pixel and every other pixel
- to model pairwise correlation between pixels?
- compute number of coefficients, why + P not * P
- is the project currently using bicubic interpolation?
- 2013_joint_demosacing_and_denoising_via_learned_nonparametric_random_fields
- 2016_klatzer_learning_joint_demosaicing_and_denoising_based_on_sequential_energy_minimization
- abstract
- demosacing+denoising as image restoration problem
- method learngs efficient regularization by a variational energy minimization
- introduction
- challenges
- interpolating edges/corners
- error propagation with interpolating channels separately/sequentially
- noise maybe non-Gaussian, maybe a complex distribution
- dataset
- images already processed,
- [Khashbi] introduces the Microsoft Demosaicing dataset
- related work
- demosaicing
- interpolation
- heuristic-based,
- some jointly demosaic+denoise but only in Gaussian settings
- learning based
- [Khashbi] regression tree fields
- inverse problem
- priors for regularization: TV, color difference, hue smoothness, denoiser (BM3D)
- hand-crafted demosaicing not able to capture image statistics
- this method
- learning+reconstrction based
- demosaicing in linRBG space
- adaptable to different CFA pattern and camera types
- jointly denoise and demosaic under non-Gaussian noise
- regularization does not rely on handcrafted correlation, but learns
- method
- upper level
- L2 loss between ground truth
g
and a sequence of reconstructed images u^s
- lower level quadratic energy
Q
- gradient descent given
grad f
- backpropagation
- energy function
f
f(u) = R(u) + D(u)
R
fields of experts prior
- nonlinearity with radial basis function (to learn image priors from data)
D
data fidelity
- nonlinearity with radial basis function (to learn non-Gaussian noise)
- can compute
grad(f)
- noise model
- mixture of poisson+gaussian (Microsoft dataset)
- optimization
- experiments & results
- mean of psnr for 200 images
- slightly better (1dB) over FlexISP
- 2016_deep_joint_demosaicking_and_denoising
- slides (https://groups.csail.mit.edu/graphics/demosaicnet/data/demosaicnet_slides.pdf)
- abstract
- data-driven (instead of hand-crafted priors) approach to demosaicing and denoising using deep neural nets
- goal
- reduce computation speed
- reduce artifacts
- dataset
- millions of sRGB images, mosaicked, and added noise
- metrics to identify difficult patches
- inject noise
- introduction
- compare to flexISP (2014)
- non-local natural image priors is still handcrafted, and incoporation of optimization and non-local priors leads to increase in computation cost
- contribution
demosaicnet
uses data-driven local filtering approach for efficiency
- capable of handling wide range of noise
- method of building training set rich in challenging artifacts (moire, etc)
- SOTA results
- runs faster
- related work
- demosaicing
- filters, smooth hue priors
- replace hand-crafted filters with deep learning
- self-similarity
- some methods uses neural nets
- but trained on small datasets
- this work builds a larger dataset
- joint denoising and demosaicking
- Heide [2014] use a global primal-dual optimization with self-similarity prior but is too slow
- Khashabi [2014] (joint via learned nonparametric random fields) learning approach
- Klatzer [2016] (learning joint demosaicing and denoising baesd on sequential energy minimization) sequential energy minimization
- this work expose noise level as a parameter
- network
- convert Bayer input to quarter-resolution multi-channel image
- each 2x2 patch -> 4 channel feature
- makes spatial pattern invariant with a period of 1 pixel
- D conv layer, with ReLU nonlinearity
- first D-1
- last filter
- 12W weights
- last output has 12 channels
- upsample 12 channel -> 3 channel full res image, then concatenate with masked input, totals to 6 channels
- 1 last convolution at full resolution
- hyperparameter
- joint denoising with multiple noise levels
- motivation
- want 1 network for different noise levels
- idea
- train a single network on a range of noise levels and explicitly add noise level as an input parameter to the network
- training
- for each input image, sample a noise level in a range
sigma \in [a,b]
,
- corrupt image with a centered additive Gaussian noise with
sigma^2
before feeding into the network
- also feed noise level
sigma
as the 5th channel
- training
- D = 15
- W = 64
- 3x3 filters
- optimize normalized L2 loss
- 64 batch size
- learning rate 10^-4
- weight decay 10^-8
- ~600,000 trainable parameters
- training on Titan X, taking 2-3 weeks…
- training data
- problem
- hard cases are rare and diluted by vastly more common easy areas.
- L2 norm fail to notice demosaicking artifacts
- solution
- detect challenging cases, and tune network to learn the hard examples
- just use the network trained on standard network and look for classes of images that the network prone to produce artifacts,
- luminance around thin structures
- detected via HDR-VDP2 … PSNR does not capture this artifact properly
- color moire
- transform input/output to Lab space and compute 2D fourier transform
- quantify gain in frequencies
- then fine-tune or train network from scratch to improve performance on difficult cases
- i.e. training done on problematic patches (with high HDR-VDP2), not the entire image
- results
- metric
- PSNR, error averaged over pixels and color channels before taking the average
- test set
- the HDR-VDP mined dataset and the color moire dataset
- demosaicking noise-free images
- better results … +joint denoising and demosaicking on noisy images
- better results …
- also non-Bayer mosaicks
- novelty
- model tailered to zippering and moire artifact
- metrics for finding hard dataset and fine-tune network on those
- questions
- https://github.com/mgharbi/demosaicnet
- code available
- took a look, seems pretty straightforward
- although training would take days and weeks, and input dataset is not as abundant.
- so might not be the most feasible ….
- relating to the project
- 2018_deep_image_demosaicking_using_a_cascade_of_convolutional_residual_denoising_networks
- basically 2016 paper but extended with majority maximization …
- 2018_deep_residual_network_for_joint_demosaicing_and_superresolution
Inverse problem
- 2014_flexISP_a_flexible_camera_image_processing_framework
- abstract
- end-to-end image processing that
- enforce image priors as proximal operators
- ADMM/primaldual optimization
- idea: end-to-end reduces error introduecd in each step of image processing
- each stage is ill-posed
- individual stage not independent
- applied to demosaicking, denoising, deconvolution, etc.
- related work
- denoising
- priors: self-similarity and sparsity
- BM3D and TV (total variations)
- joint optimization
- addressing subproblems does not yield best-quality reconstructions
- demosaiking+denoising
- demosaiking+denoising
- …
- ADMM applied to image processing literature
- optimization
- inverse problem
z = Ax + \eta
min_x 1/2 ||z-Ax||^2 + R(x)
- converted to standard constrained optimization
- solver (primal dual faster than ADMM)
- regularization
- image gradient sparsity (TV)
- denoising (NLM,BM3D)
- cross channel gradient correlation for edge consistency
- combination of regularizers
- design choices
- optimization
- priors as proximal operators
- prior choices
- external priors
- tested TV, curvelet, EPLL
- TV is most cost-effective
- internal priors (Denoiser)
- tested BM3D, NLM, DCT
- BM3D best
- applications & results
- demosaiking
- 2.2 dB over best previous method
- deblurring
- interlased HDR
- color array camera
- burst denoising and demosaicking
- process JPEG compressed image
- discussion
- priors
- choice of prior importants
- failure cases
- inputs with no self-similarity (i.e. random noie)
- misconfiguring solver parameter
- observation
- regularizer weights influence reconstruction
- did not prove convexity / guaranteed convergence
- 2017_joint_demosaicing_and_denoising_of_nioisy_bayer_images_with_ADMM
- codebase: https://github.com/TomHeaven/Joint-Demosaic-and-Denoising-with-ADMM
- abstract
- unified objective function with hidden priors and optimize with ADMM for recovering color image from noisy Bayer input
- perform better in PSNR and human vision
- more robust to variations of noise levels
- intro
- recent research in jointly demosaic and denoise (good place to see what is SOTA)
- problem formulation
- image formation
b = Ax + \eta
b
bayer iamge
A
downsampling operator
x \in \R^{3n}
\eta
noise vector
- image recovery
- inverse problem
\min || Ax-b || + T(x)
- where
T
are prior functions
- priors
- flat areas: smoothness prior
- edge: edge-preserving denoising prior
- bayer mosaic pattern: structural information
- total variation
- (CBM3D) attenuation of additive white Gaussian noise from
- cross-channel prior (basically 2004 paper that MATLAB
demosaic
uses)
- handwavy about parts of objective function as hidden function, i.e. did not establish convexity of BM3D/demosaocing prior (ability to do follow up work)
- optimization with ADMM
- experiments
- Kodak + McM
- inputs
- downsample -> Gaussian white noise
- noise level is known
- methods compared
- results
- better result for noisy inputs
- slow (900s vs 4s for deepjoint)
- follow up
- faster algo
- convergence of ADMM not determined
Reviews