Inverse problem and optimization

Inverse problem and optimization. Laurent Condat, Nelly Pustelnik. CNRS, Gipsa-lab; CNRS, Laboratoire de Physique de l'ENS de Lyon. December 15th, 2016

Inverse problem and optimization 2/36 Plan
1. Examples of inverse problems
2. Direct model: linear operator, noise
3. Ill-posed problem
4. Inversion
5. Regularization
6. Gradient descent

Inverse problem and optimization 3/36 Example: denoising. (Figure: observations and denoised image.)

Inverse problem and optimization 4/36 Example: motion blur. (Figure: observations and restored image.)

Inverse problem and optimization 5/36 Example: structured illumination microscopy

Inverse problem and optimization 6/36 Example: positron emission tomography. (Figure: measurements and reconstructed images.)

Inverse problem and optimization 7/36 Example: cartoon-texture decomposition. (Figure: observations decomposed into texture and cartoon components.)

Inverse problem and optimization 8/36 Notations. Image $x \in \mathbb{R}^{N_1 \times N_2}$, $x = (x_{n_1,n_2})_{1 \le n_1 \le N_1,\, 1 \le n_2 \le N_2}$. Vector consisting of the values of the image, of size $N = N_1 N_2$, arranged column-wise: $x \in \mathbb{R}^N$ (with $N = N_1 N_2$), $x = (x_n)_{1 \le n \le N}$.
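The following lines are a minimal illustration of this column-wise vectorization in NumPy (not part of the original slides; the image size is an arbitrary assumption).

```python
import numpy as np

# Minimal sketch of the notation above: an N1 x N2 image stored as a matrix,
# and the same values arranged column-wise into a vector of size N = N1*N2.
N1, N2 = 4, 3
x_img = np.arange(N1 * N2, dtype=float).reshape(N1, N2)   # image x in R^{N1 x N2}
x_vec = x_img.flatten(order="F")                           # column-wise vector in R^N
assert x_vec.size == N1 * N2
x_back = x_vec.reshape(N1, N2, order="F")                  # back to the image
assert np.array_equal(x_img, x_back)
```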

Inverse problem and optimization 9/36 Direct model: $z = D_\alpha(Hx)$, where
- $x = (x_n)_{1 \le n \le N} \in \mathbb{R}^N$: vector consisting of the (unknown) values of the original image of size $N = N_1 N_2$,
- $z = (z_j)_{1 \le j \le M} \in \mathbb{R}^M$: vector containing the observed values, of size $M = M_1 M_2$,
- $H \in \mathbb{R}^{M \times N}$: matrix associated with a linear degradation operator,
- $D_\alpha \colon \mathbb{R}^M \to \mathbb{R}^M$: models other degradations, such as nonlinear ones or the effect of the noise, parameterized by $\alpha$ (e.g. additive noise with variance $\alpha$, Poisson noise with scaling parameter $\alpha$).

Inverse problem and optimization 10/36 Direct model: convolution. $z = D_\alpha(Hx) = D_\alpha(h * x)$, where $h * x$ is the convolution of $x$ with the point spread function (PSF) $h$ of size $Q_1 \times Q_2$. Link between $h$ and $H$: under zero-end conditions, the unknown image $x$ is zero outside its domain $[0, N_1-1] \times [0, N_2-1]$, and the kernel $h$ is zero outside its domain $[0, Q_1-1] \times [0, Q_2-1]$.

Inverse problem and optimization 11/36 Direct model: convolution. Link between $h$ and $H$: zero-pad $x$ and $h$ into an extended image $x^e$ and kernel $h^e$ of size $M_1 \times M_2$,
$x^e_{i_1,i_2} = x_{i_1,i_2}$ if $0 \le i_1 \le N_1-1$ and $0 \le i_2 \le N_2-1$, and $0$ otherwise,
$h^e_{i_1,i_2} = h_{i_1,i_2}$ if $0 \le i_1 \le Q_1-1$ and $0 \le i_2 \le Q_2-1$, and $0$ otherwise.
This yields
$(Hx)_{j_1,j_2} = \sum_{i_1=0}^{M_1-1} \sum_{i_2=0}^{M_2-1} h^e_{j_1-i_1,\,j_2-i_2}\, x^e_{i_1,i_2}$,
where $j_1 \in \{0,\ldots,M_1-1\}$ and $j_2 \in \{0,\ldots,M_2-1\}$.

Inverse problem and optimization 12/36 Direct model: convolution. Link between $h$ and $H$:
$H = \begin{pmatrix} H_0 & H_{M_1-1} & H_{M_1-2} & \cdots & H_1 \\ H_1 & H_0 & H_{M_1-1} & \cdots & H_2 \\ H_2 & H_1 & H_0 & \cdots & H_3 \\ \vdots & & & \ddots & \vdots \\ H_{M_1-1} & H_{M_1-2} & H_{M_1-3} & \cdots & H_0 \end{pmatrix}$,
where $H_{j_1} \in \mathbb{R}^{M_2 \times M_2}$ denotes the circulant matrix with $M_2$ columns
$H_{j_1} = \begin{pmatrix} h^e_{j_1,0} & h^e_{j_1,M_2-1} & h^e_{j_1,M_2-2} & \cdots & h^e_{j_1,1} \\ h^e_{j_1,1} & h^e_{j_1,0} & h^e_{j_1,M_2-1} & \cdots & h^e_{j_1,2} \\ \vdots & & & \ddots & \vdots \\ h^e_{j_1,M_2-1} & h^e_{j_1,M_2-2} & h^e_{j_1,M_2-3} & \cdots & h^e_{j_1,0} \end{pmatrix}$.

Inverse problem and optimization 13/36 Direct model: convolution. If $H$ is a block-circulant matrix with circulant blocks, then $H = U^* D U$, where $D$ is a diagonal matrix, $U$ is a unitary matrix (i.e. $U^* = U^{-1}$) representing the discrete Fourier transform, and $^*$ denotes the conjugate transpose. Efficient computation of $Hx$: $Hx = U^* D U x = U^* D X$, where $X = Ux$ denotes the Fourier transform of $x$.
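As a small illustration of this diagonalization (a sketch, not the authors' code), the snippet below computes $Hx$ in the Fourier domain with NumPy; the image size and the Gaussian PSF are assumptions.

```python
import numpy as np

# Minimal sketch: circular convolution Hx computed in the Fourier domain,
# illustrating H = U* D U (U: 2-D DFT, D: diagonal with the DFT of the padded PSF).
rng = np.random.default_rng(0)
N1, N2 = 64, 64
x = rng.random((N1, N2))             # "original image" (assumption)

# Small Gaussian PSF, zero-padded to the image size (kernel placed at [0,Q-1]^2)
q = 7
g = np.exp(-0.5 * (np.arange(q) - q // 2) ** 2 / 2.0)
h = np.outer(g, g)
h /= h.sum()
h_e = np.zeros((N1, N2))
h_e[:q, :q] = h

# Hx = U* D U x  <=>  ifft2( fft2(h_e) * fft2(x) )
D = np.fft.fft2(h_e)                 # eigenvalues of the block-circulant H
Hx = np.real(np.fft.ifft2(D * np.fft.fft2(x)))
print(Hx.shape)
```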

Inverse problem and optimization 14/36 Direct model: convolution. (Figure: Gaussian filter $h$, original image $x$, their Fourier transforms $F(h)$ and $F(x)$, and the blurred image $h * x$.)

Inverse problem and optimization 14/36 Direct model: convolution. (Figure: uniform filter $h$, original image $x$, their Fourier transforms $F(h)$ and $F(x)$, and the blurred image $h * x$.)

Inverse problem and optimization 15/36 Direct model: tomography (without noise). $z = Hx$, where
- $x = (x_n)_{1 \le n \le N} \in \mathbb{R}^N$: vector consisting of the (unknown) values of the original image of size $N = N_1 N_2$,
- $H = (H_{jn})_{1 \le j \le M,\, 1 \le n \le N}$: $H_{jn}$ is the probability of detecting an event from pixel $n$ in the tube/line of response $j$,
- $z = (z_j)_{1 \le j \le M} \in \mathbb{R}^M$: vector containing the observed values (sinogram).
(Figure: a tube of response $j$ crossing pixel $n$.)

Inverse problem and optimization 16/36 Direct model: compressed sensing. $z = Hx$, where
- $x = (x_n)_{1 \le n \le N} \in \mathbb{R}^N$: vector consisting of the (unknown) values of the original image of size $N = N_1 N_2$,
- $z = (z_j)_{1 \le j \le M} \in \mathbb{R}^M$: vector containing the observed values (size $M \ll N$),
- $H = (H_{jn})_{1 \le j \le M,\, 1 \le n \le N}$: random measurement matrix (size $M \times N$).

Inverse problem and optimization 17/36 Direct model: super-resolution. $(\forall b \in \{1,\ldots,B\})\ z_b = D_b T W x + \varepsilon_b$, where
- $z$: $B$ multicomponent images at low resolution (size $M$),
- $x$: (high-resolution) image to be recovered (size $N$),
- $D_b$: downsampling matrix (size $M \times N$ with $M < N$),
- $T$: matrix associated with the blur (size $N \times N$),
- $W$: warp matrix (size $N \times N$),
- $\varepsilon_b \sim \mathcal{N}(0, \sigma^2 \mathrm{Id}_M)$: noise, often assumed to be zero-mean white Gaussian additive noise.
A hedged sketch of this forward model is given below; it only illustrates the structure of the model.
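```python
import numpy as np

# Hedged sketch of the super-resolution forward model z_b = D_b T W x + eps_b.
# Assumptions (not from the slides): identity warp W, a simple box blur for T,
# downsampling by a factor of 2 for D_b, and arbitrary sizes.
rng = np.random.default_rng(1)
n1, n2 = 32, 32                        # high-resolution size, N = n1*n2
x = rng.random((n1, n2))               # high-resolution image to recover

def blur(img, q=3):
    """Simple q x q box blur, a stand-in for the blur matrix T."""
    h, w = img.shape
    pad = np.pad(img, q // 2, mode="wrap")
    out = np.zeros_like(img)
    for i in range(q):
        for j in range(q):
            out += pad[i:i + h, j:j + w]
    return out / (q * q)

sigma = 0.01
B = 3
z = []
for b in range(B):
    Wx = x                             # identity warp (assumption)
    TWx = blur(Wx)                     # blur T
    Db_TWx = TWx[::2, ::2]             # downsampling D_b (M < N)
    z.append(Db_TWx + sigma * rng.standard_normal(Db_TWx.shape))   # + eps_b
print(len(z), z[0].shape)
```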

Inverse problem and optimization 18/36 Direct model (recap): $z = D_\alpha(Hx)$, with $x \in \mathbb{R}^N$ the unknown original image, $z \in \mathbb{R}^M$ the observed values, $H \in \mathbb{R}^{M \times N}$ a linear degradation operator, and $D_\alpha \colon \mathbb{R}^M \to \mathbb{R}^M$ the remaining degradations (e.g. noise) parameterized by $\alpha$, as on slide 9.

Inverse problem and optimization 19/36 Direct model: Gaussian noise. $z = D_\alpha(Hx) = Hx + b$, where $b$ is additive white Gaussian noise with variance $\alpha = \sigma^2$. (Figures: original image and degraded images with $\sigma = 10$, $\sigma = 30$, and $\sigma = 50$.)

Inverse problem and optimization 20/36 Direct model: Poisson noise. $z = D_\alpha(Hx)$, where $D_\alpha$ is Poisson noise with scaling parameter $\alpha$; the noise variance varies with the image intensity. (Figures: original image and Poisson-degraded images with $\alpha = 1$, $\alpha = 0.1$, and $\alpha = 0.01$.)
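The snippet below (an illustration with assumed sizes, not the authors' code) draws such Poisson observations and checks empirically that the variance grows as the intensity divided by $\alpha$.

```python
import numpy as np

# Hedged sketch of the Poisson degradation model z = D_alpha(Hx): counts are
# drawn from a Poisson law with mean alpha*(Hx), so the noise variance grows
# with the image intensity. H is taken as the identity here (pure noise),
# and the test image is an assumption.
rng = np.random.default_rng(0)
x = np.clip(rng.random((64, 64)), 1e-3, None)   # "original image" in (0, 1]

for alpha in (1.0, 0.1, 0.01):
    z = rng.poisson(alpha * x) / alpha          # rescaled Poisson observation
    # empirical check: variance of (z - x) scales roughly like mean(x)/alpha
    print(alpha, np.var(z - x), np.mean(x) / alpha)
```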

Inverse problem and optimization 21/36 Inverse problem: from the observations $z = D_\alpha(Hx)$, find an estimate $\bar{x}$ as close as possible to $x$. (Figure: observations $z \in \mathbb{R}^M$ and restored image $\bar{x} \in \mathbb{R}^N$.)

Inverse problem and optimization 22/36 Hadamard conditions. The problem $z = Hx$ is said to be well-posed if it fulfills the Hadamard conditions (1902):
1. existence of a solution, i.e. the range $\operatorname{ran} H$ of $H$ is equal to $\mathbb{R}^M$,
2. uniqueness of the solution, i.e. the nullspace $\ker H$ of $H$ is equal to $\{0\}$,
3. stability of the solution $x$ with respect to the observation, i.e. $(\forall (z, z') \in (\mathbb{R}^M)^2)\ \|z - z'\| \to 0 \Rightarrow \|x(z) - x(z')\| \to 0$.

Inverse problem and optimization 23/36 Hadamard conditions. The problem $z = Hx$ is said to be well-posed if it fulfills the Hadamard conditions:
1. existence of a solution, i.e. every vector $z$ in $\mathbb{R}^M$ is the image of a vector $x$ in $\mathbb{R}^N$,
2. uniqueness of the solution, i.e. if $x(z)$ and $x'(z)$ are two solutions, they are necessarily equal, since $x(z) - x'(z)$ belongs to $\ker H = \{0\}$,
3. stability of the solution $x$ with respect to the observation, i.e. a small perturbation of the observed image leads to a small variation of the recovered image.

Inverse problem and optimization 24/36 Inversion. Inverse filtering (if $M = N$ and $H$ is invertible):
$\bar{x} = H^{-1} z = H^{-1}(Hx + b) = x + H^{-1} b$, for additive noise $b \in \mathbb{R}^M$.
Remark: closed-form expression, but noise amplification if $H$ is ill-conditioned (ill-posed problem).

Inverse problem and optimization 24/36 Inversion. Inverse filtering (if $M \ge N$ and $H$ has rank $N$):
$\bar{x} = (H^\top H)^{-1} H^\top z = (H^\top H)^{-1} H^\top (Hx + b) = x + (H^\top H)^{-1} H^\top b$, for additive noise $b \in \mathbb{R}^M$.
Remark: closed-form expression, but noise amplification if $H$ is ill-conditioned (ill-posed problem).
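The following sketch illustrates this noise amplification numerically on a synthetic, ill-conditioned $H$ (the decaying singular values and the noise level are assumptions, not values from the slides).

```python
import numpy as np

# Hedged illustration of noise amplification by inverse filtering: invert an
# ill-conditioned H and compare the error with and without a tiny amount of noise.
rng = np.random.default_rng(0)
M = N = 50
U, _ = np.linalg.qr(rng.standard_normal((M, M)))
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
s = 10.0 ** np.linspace(0, -6, N)               # singular values: condition number ~ 1e6
H = U @ np.diag(s) @ V.T

x = rng.standard_normal(N)
b = 1e-4 * rng.standard_normal(M)               # small additive noise
x_hat_clean = np.linalg.solve(H, H @ x)         # exact data: recovery is fine
x_hat_noisy = np.linalg.solve(H, H @ x + b)     # noisy data: error is amplified
print(np.linalg.norm(x_hat_clean - x), np.linalg.norm(x_hat_noisy - x))
```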

Inverse problem and optimization 25/36 Regularization. Variational approach:
$\bar{x} \in \operatorname{Argmin}_{x \in \mathbb{R}^N} \|z - Hx\|_2^2 + \lambda \Omega(x)$, where
- $\|z - Hx\|_2^2$: data term,
- $\Omega(x)$: regularization term (e.g. $\Omega(x) = \|x\|_2^2$),
- $\lambda \ge 0$: regularization parameter.
Remark: if $\lambda = 0$, this reduces to inverse filtering.
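For the quadratic choice $\Omega(x) = \|x\|_2^2$, the minimizer has the closed form $\bar{x} = (H^\top H + \lambda I)^{-1} H^\top z$; the snippet below evaluates it on a small synthetic problem (sizes, noise level and $\lambda$ are assumptions).

```python
import numpy as np

# Hedged sketch of the variational approach with Omega(x) = ||x||_2^2:
# the minimizer of ||z - Hx||^2 + lambda*||x||^2 is (H^T H + lambda*I)^{-1} H^T z.
rng = np.random.default_rng(0)
M, N = 80, 60
H = rng.standard_normal((M, N))
x_true = rng.standard_normal(N)
z = H @ x_true + 0.1 * rng.standard_normal(M)

lam = 1.0
x_reg = np.linalg.solve(H.T @ H + lam * np.eye(N), H.T @ z)
print(np.linalg.norm(x_reg - x_true) / np.linalg.norm(x_true))   # relative error
```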

Inverse problem and optimization 26/36 Maximum A Posteriori (MAP). Let $x$ and $z$ be realizations of random vectors $X$ and $Z$. The MAP estimate
$\hat{x} \in \operatorname{Argmax}_{x \in \mathbb{R}^N} \mu_{X|Z=z}(x)$
finds the $x$ that maximizes the posterior $\mu_{X|Z=z}(x)$. By Bayes' rule,
$\max_{x \in \mathbb{R}^N} \mu_{X|Z=z}(x) \;\Leftrightarrow\; \max_{x \in \mathbb{R}^N} \mu_{Z|X=x}(z)\, \mu_X(x) \;\Leftrightarrow\; \min_{x \in \mathbb{R}^N} \big\{ \underbrace{-\log \mu_{Z|X=x}(z)}_{\text{data term}} \; \underbrace{-\log \mu_X(x)}_{\text{a priori}} \big\} \;\Leftrightarrow\; \min_{x \in \mathbb{R}^N} f_1(x) + f_2(x)$.
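As an illustrative special case (assuming Gaussian noise and a zero-mean Gaussian prior; this is one choice among those discussed in the following slides), the MAP estimate reduces to the regularized least-squares problem of slide 25:

```latex
% Assumed Gaussian likelihood and Gaussian prior:
%   \mu_{Z|X=x}(z) \propto \exp\big(-\tfrac{1}{2\alpha}\|Hx - z\|_2^2\big),
%   \mu_X(x)       \propto \exp\big(-\tfrac{1}{2\tau}\|x\|_2^2\big).
\hat{x} \in \operatorname*{Argmin}_{x \in \mathbb{R}^N}
  \underbrace{\tfrac{1}{2\alpha}\|Hx - z\|_2^2}_{f_1(x)}
  + \underbrace{\tfrac{1}{2\tau}\|x\|_2^2}_{f_2(x)}
  = \operatorname*{Argmin}_{x \in \mathbb{R}^N} \|Hx - z\|_2^2 + \lambda \|x\|_2^2,
  \qquad \lambda = \tfrac{\alpha}{\tau}.
```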

Inverse problem and optimization 27/36 Data-term: Gaussian noise. $(\forall x \in \mathbb{R}^N)\ f_1(x) = -\log \mu_{Z|X=x}(z)$. Let $z = Hx + b$ with $b \sim \mathcal{N}(0, \alpha)$.
Gaussian likelihood: $\mu_{Z|X=x}(z) = \prod_{i=1}^{M} \frac{1}{\sqrt{2\pi\alpha}} \exp\!\Big( -\frac{((Hx)^{(i)} - z^{(i)})^2}{2\alpha} \Big)$.
Data term: $f_1(x) = \sum_{i=1}^{M} \frac{1}{2\alpha} \big((Hx)^{(i)} - z^{(i)}\big)^2$.

Inverse problem and optimization 28/36 Data-term: Poisson noise. $(\forall x \in \mathbb{R}^N)\ f_1(x) = -\log \mu_{Z|X=x}(z)$. Let $z = D_\alpha(Hx)$ where $D_\alpha$ is Poisson noise with parameter $\alpha$.
Poisson likelihood: $\mu_{Z|X=x}(z) = \prod_{i=1}^{M} \frac{\exp\!\big(-\alpha (Hx)^{(i)}\big) \big(\alpha (Hx)^{(i)}\big)^{z^{(i)}}}{z^{(i)}!}$.
Data term: $f_1(x) = \sum_{i=1}^{M} \Psi_i\big((Hx)^{(i)}\big)$, where, for every $\upsilon \in \mathbb{R}$,
$\Psi_i(\upsilon) = \alpha\upsilon - z^{(i)} \ln(\alpha\upsilon)$ if $z^{(i)} > 0$ and $\upsilon > 0$; $\Psi_i(\upsilon) = \alpha\upsilon$ if $z^{(i)} = 0$ and $\upsilon \ge 0$; $\Psi_i(\upsilon) = +\infty$ otherwise.
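A minimal sketch of this data term is given below (not the authors' code); `poisson_data_term` is a hypothetical helper name, and the inputs `Hx`, `z`, `alpha` are assumed to be given.

```python
import numpy as np

# Hedged sketch of the Poisson data term f_1 above, implementing the
# case-by-case definition of Psi_i (a generalized Kullback-Leibler fidelity).
def poisson_data_term(Hx, z, alpha):
    Hx = np.asarray(Hx, dtype=float)
    z = np.asarray(z, dtype=float)
    f = np.full_like(Hx, np.inf)                 # default case: +infinity
    ok_pos = (z > 0) & (Hx > 0)                  # z_i > 0 and (Hx)_i > 0
    ok_zero = (z == 0) & (Hx >= 0)               # z_i = 0 and (Hx)_i >= 0
    f[ok_pos] = alpha * Hx[ok_pos] - z[ok_pos] * np.log(alpha * Hx[ok_pos])
    f[ok_zero] = alpha * Hx[ok_zero]
    return f.sum()

print(poisson_data_term(Hx=[1.0, 0.5, 0.0], z=[2.0, 0.0, 0.0], alpha=1.0))
```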

Inverse problem and optimization 29/36 Prior: Tikhonov / TV. $(\forall x \in \mathbb{R}^N)\ f_2(x) = -\log \mu_X(x)$.
Tikhonov [Tikhonov, 1963]: $f_2(x) = \|Lx\|^2 = \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} \{l * x\}_{i,j}^2$ with $l = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}$.
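The snippet below is a small sketch of this regularizer, computing $\|Lx\|^2$ by applying the Laplacian stencil $l$ through shifts; circular boundary handling and the test image are assumptions.

```python
import numpy as np

# Hedged sketch of the Tikhonov regularizer f_2(x) = ||Lx||^2 = sum |(l*x)_{i,j}|^2,
# with l the discrete Laplacian stencil shown above (circular boundaries assumed).
def tikhonov_laplacian(x):
    lap = (np.roll(x, 1, axis=0) + np.roll(x, -1, axis=0)
           + np.roll(x, 1, axis=1) + np.roll(x, -1, axis=1) - 4.0 * x)   # l * x
    return np.sum(lap ** 2)

x = np.random.default_rng(0).random((32, 32))
print(tikhonov_laplacian(x))
```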

Inverse problem and optimization 30/36 Minimisation problem. Variational approach:
$\bar{x} \in \operatorname{Argmin}_{x \in \mathbb{R}^N} L(x) + \lambda \Omega(x)$, where
- $L(x) = f(z, Hx)$: data term,
- $\Omega(x)$: regularization term,
- $\lambda \ge 0$: regularization parameter.

Inverse problem and optimization 31/36 Convex optimization? Let $f \colon \mathbb{R}^N \to \,]-\infty, +\infty]$. An optimization problem consists in solving
$\bar{x} \in \operatorname{Argmin}_{x \in \mathbb{R}^N} f(x)$.
$\bar{x}$ is a global solution if, for every $x \in \mathbb{R}^N$, $f(\bar{x}) \le f(x)$.
Course objective: build a sequence $(x_n)_{n \in \mathbb{N}}$ that converges to $\bar{x}$.
(Figure: a convex function $f$ with iterates $x_0, \ldots, x_n, x_{n+1}$ approaching the minimizer $\bar{x}$.)

Inverse problem and optimization 32/36 Gradient descent. (Illustration: $f$ is displayed with its level lines.) Remarks:
Fermat's rule: $\bar{x} \in \operatorname{Argmin}_{x \in \mathbb{R}^N} f(x) \Leftrightarrow \nabla f(\bar{x}) = 0$, i.e. solve a system of $N$ equations with $N$ unknowns.
A closed-form expression exists for very few $f$. If no closed-form expression is available $\to$ iterative method.

Inverse problem and optimization 33/36 Solving the mean square problem. Find $\bar{x} \in \operatorname{Argmin}_{x \in \mathbb{R}^N} \|Hx - z\|_2^2$ with $H \in \mathbb{R}^{M \times N}$ and $z \in \mathbb{R}^M$.
Optimality conditions: $\nabla f(\bar{x}) = 0 \Leftrightarrow H^\top (H\bar{x} - z) = 0 \Leftrightarrow \bar{x} = (H^\top H)^{-1} H^\top z$.
Difficulty: invert $H^\top H$.
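A minimal numerical sketch of these normal equations is given below (random full-column-rank problem; the sizes are assumptions). `np.linalg.lstsq` gives a numerically safer route that avoids forming $H^\top H$ explicitly.

```python
import numpy as np

# Hedged sketch of the normal equations H^T (H x - z) = 0 on a synthetic problem.
rng = np.random.default_rng(0)
M, N = 100, 40
H = rng.standard_normal((M, N))
z = rng.standard_normal(M)

x_normal = np.linalg.solve(H.T @ H, H.T @ z)        # explicit normal equations
x_lstsq, *_ = np.linalg.lstsq(H, z, rcond=None)     # least-squares solver
print(np.allclose(x_normal, x_lstsq))               # both give the same minimizer
```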

Inverse problem and optimization 34/36 Solving logistic regression. Find $\bar{x} \in \operatorname{Argmin}_{x \in \mathbb{R}} \log\big(1 + \exp(-yx)\big)$ with $y \in \mathbb{R}$.
Optimality conditions: $\nabla f(\bar{x}) = 0 \Leftrightarrow \dfrac{-y \exp(-y\bar{x})}{1 + \exp(-y\bar{x})} = 0$.
Difficulty: no closed-form expression.
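Since there is no closed form, an iterative method is needed; the sketch below runs a few gradient-descent steps on this one-dimensional loss (the value of $y$ and the step size are assumptions, and for a single sample the infimum is only approached as $x \to +\infty$, so we simply watch the loss decrease).

```python
import numpy as np

# Hedged sketch: minimize f(x) = log(1 + exp(-y*x)) by gradient descent,
# using grad f(x) = -y * exp(-y*x) / (1 + exp(-y*x)).
y = 1.0
x = 0.0
gamma = 0.5                                    # step size (assumption)
for k in range(20):
    grad = -y * np.exp(-y * x) / (1.0 + np.exp(-y * x))
    x = x - gamma * grad
print(x, np.log1p(np.exp(-y * x)))             # iterate and decreased loss
```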

Inverse problem and optimization 35/36 Gradient descent. (Illustration.) Iterations: $(\forall k \in \mathbb{N})\ x^{[k+1]} = x^{[k]} + \gamma^{[k]} d^{[k]}$, where
- $d^{[k]} \in \mathbb{R}^N$: descent direction,
- $\gamma^{[k]} > 0$: step-size.
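As a closing sketch (not the authors' code), the iteration below uses $d^{[k]} = -\nabla f(x^{[k]})$ with a constant step size on the regularized least-squares problem of slide 25; the problem data, $\lambda$, and the step-size rule (below the classical $2/L$ bound, with $L$ the Lipschitz constant of the gradient) are assumptions.

```python
import numpy as np

# Hedged sketch of gradient descent x^{k+1} = x^k + gamma * d^k, with
# d^k = -grad f(x^k) and f(x) = ||Hx - z||^2 + lambda*||x||^2.
rng = np.random.default_rng(0)
M, N = 80, 60
H = rng.standard_normal((M, N))
z = rng.standard_normal(M)
lam = 1.0

L = 2.0 * (np.linalg.norm(H, 2) ** 2 + lam)    # Lipschitz constant of grad f
gamma = 1.0 / L                                # constant step size (< 2/L)
x = np.zeros(N)
for k in range(2000):
    grad = 2.0 * H.T @ (H @ x - z) + 2.0 * lam * x
    x = x - gamma * grad                       # descent direction d = -grad

x_closed = np.linalg.solve(H.T @ H + lam * np.eye(N), H.T @ z)
print(np.linalg.norm(x - x_closed))            # should be close to zero
```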