Image restoration: numerical optimisation
Short and partial presentation
Jean-François Giovannelli
Groupe Signal Image
Laboratoire de l'Intégration du Matériau au Système
Univ. Bordeaux, CNRS, BINP
Context and topic
- Image restoration, deconvolution
- Missing information: ill-posedness and regularisation
- Previous / next lectures, exercises and practical works
Three types of regularised inversion
- Quadratic penalties
- Convex non-quadratic penalties
- Constraints: positivity and support
- Bayesian strategy: an incursion for hyperparameter tuning
A basic component: quadratic criterion and Gaussian model
- Circulant approximation and computations based on FFT
Other approaches
- Numerical linear system solvers
- Numerical quadratic optimizers
Direct / Inverse, Convolution / Deconvolution, ...
$y = Hx + \varepsilon = h \star x + \varepsilon$
[Block diagram: $x \to H \to +\,(\varepsilon) \to y$, then $\hat{x} = \hat{X}(y)$]
Reconstruction, restoration, deconvolution, denoising
General problem: ill-posed inverse problems, i.e., lack of information
Methodology: regularisation, i.e., information compensation
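A minimal numerical sketch of this forward model (the signal, kernel and noise level below are illustrative choices, not taken from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1D object and Gaussian blur kernel, for illustration only
x = np.zeros(64)
x[20:28] = 1.0                                   # a "box" to restore
h = np.exp(-0.5 * (np.arange(-4, 5) / 1.5) ** 2)
h /= h.sum()                                     # normalised blur

# Direct model: y = h * x + noise (convolution plus additive noise)
y = np.convolve(x, h, mode="same") + 0.05 * rng.standard_normal(x.size)
```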
Various quadratic criteria
Quadratic criterion and linear solution
$J(x) = \|y - Hx\|^2 + \mu \sum_{p \sim q} (x_p - x_q)^2 = \|y - Hx\|^2 + \mu \|Dx\|^2$
Huber penalty, extended criterion and half-quadratic approaches
$J(x) = \|y - Hx\|^2 + \mu \sum_{p \sim q} \varphi(x_p - x_q)$
$\tilde{J}(x, b) = \|y - Hx\|^2 + \mu \sum_{p \sim q} \left[ (x_p - x_q) - b_{pq} \right]^2 + \zeta(b_{pq})$
Constraints: augmented Lagrangian and ADMM
$J(x) = \|y - Hx\|^2 + \mu \|Dx\|^2$ subject to $x_p = 0$ for $p \notin S$ (support) and $x_p \geq 0$ for $p \in M$ (positivity)
$L(x, s, \ell) = \|y - Hx\|^2 + \mu \|Dx\|^2 + \rho \|x - s\|^2 + \ell^t (x - s)$
Various solutions (no circulant approximation)...
[Figure: 1D profiles comparing True, Observation, Quadratic penalty, Huber penalty (left) and True, Observation, Quadratic penalty, Constrained (right)]
A reminder of the calculi
The criterion...
$J(x) = \|y - Hx\|^2 + \mu \|Dx\|^2$
... its gradient...
$\frac{\partial J}{\partial x} = -2 H^t (y - Hx) + 2 \mu D^t D x = 2 (H^t H + \mu D^t D) x - 2 H^t y$
... and its Hessian...
$\frac{\partial^2 J}{\partial x^2} = 2 (H^t H + \mu D^t D)$
The multivariate linear system of equations...
$(H^t H + \mu D^t D) \, \hat{x} = H^t y$
... and the minimiser...
$\hat{x} = (H^t H + \mu D^t D)^{-1} H^t y$
Reformulation, notations...
Rewrite the criterion...
$J(x) = \|y - Hx\|^2 + \mu \|Dx\|^2 = \tfrac{1}{2} x^t Q x + q^t x + q_0$
Gradient: $g(x) = Qx + q$
Hessian: $Q = 2 (H^t H + \mu D^t D)$
$q = g(0) = -2 H^t y$: the gradient at $x = 0$
... the system of equations...
$(H^t H + \mu D^t D) \, \hat{x} = H^t y \iff Q \hat{x} = -q$
... and the minimiser...
$\hat{x} = (H^t H + \mu D^t D)^{-1} H^t y = -Q^{-1} q$
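As a sketch, the closed-form minimiser can be computed directly with NumPy (the matrices H and D and the value of µ below are illustrative; one solves the system rather than inverting the matrix):

```python
import numpy as np

def quadratic_minimiser(H, D, y, mu):
    """Minimiser of ||y - Hx||^2 + mu ||Dx||^2, i.e. (H'H + mu D'D) x = H'y."""
    Q2 = H.T @ H + mu * D.T @ D          # half the Hessian
    return np.linalg.solve(Q2, H.T @ y)  # solving beats explicit inversion

# Toy instance: identity "blur" and first-order differences (illustrative)
N = 64
H = np.eye(N)
D = np.eye(N) - np.eye(N, k=-1)
y = np.random.default_rng(0).standard_normal(N)
x_hat = quadratic_minimiser(H, D, y, mu=0.1)
```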
Quadratic criterion: illustration
One variable: $\alpha (x - \bar{x})^2 + \gamma$
Two variables: $\alpha_1 (x_1 - \bar{x}_1)^2 + \alpha_2 (x_2 - \bar{x}_2)^2 - 2 \rho \sqrt{\alpha_1 \alpha_2} (x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + \gamma$
[Figure: criterion surfaces as a function of the variables $x_1$ and $x_2$]
Comment: convex, concave, saddle point, proper direction
Computational and algorithmic aspects
Various options and many relationships...
- Direct calculus, compact (closed) form, matrix inversion
- Circulant approximation and diagonalisation by FFT
- Special algorithms for the 1D case: recursive least squares; Kalman smoother or filter (and fast versions, ...); and Levinson, ...
- Algorithms for linear system solving: splitting idea; Gauss, Gauss-Jordan; substitution, triangularisation, ...
- Algorithms for criterion optimisation: component-wise...; gradient descent... and various modifications
A family of linear system solvers
Matrix splitting algorithms
Linear solver: illustration
Criterion
$J(x) = \alpha_1 (x_1 - \bar{x}_1)^2 + \alpha_2 (x_2 - \bar{x}_2)^2 - 2 \rho \sqrt{\alpha_1 \alpha_2} (x_1 - \bar{x}_1)(x_2 - \bar{x}_2)$
Nullification of the gradient
$\partial J / \partial x_1 = 2 \alpha_1 (x_1 - \bar{x}_1) - 2 \rho \sqrt{\alpha_1 \alpha_2} (x_2 - \bar{x}_2) = 0$
$\partial J / \partial x_2 = 2 \alpha_2 (x_2 - \bar{x}_2) - 2 \rho \sqrt{\alpha_1 \alpha_2} (x_1 - \bar{x}_1) = 0$
An example of linear solver: matrix splitting
A simple and fruitful idea
- Solve $Q \hat{x} + q = 0$, that is $Q \hat{x} = -q$
- Matrix splitting: $Q = A - B$
- Take advantage...
$Q \hat{x} = -q \iff (A - B) \hat{x} = -q \iff A \hat{x} = B \hat{x} - q \iff \hat{x} = A^{-1} [ B \hat{x} - q ]$
Iterative solver
$x^{[k]} = A^{-1} \left[ B x^{[k-1]} - q \right]$
If it exists, a fixed point satisfies $x^{\star} = A^{-1} [ B x^{\star} - q ]$, so $Q x^{\star} = -q$
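A direct transcription of this fixed-point iteration (a sketch; the 2x2 system below mirrors the two-variable criterion of the illustrations, with assumed values $\alpha_1 = \alpha_2 = 1$ and $\rho = 0.7$):

```python
import numpy as np

def splitting_solver(A, B, q, x0, n_iter=50):
    """Iterate x[k] = A^{-1} (B x[k-1] - q) for the splitting Q = A - B."""
    x = x0.copy()
    for _ in range(n_iter):
        x = np.linalg.solve(A, B @ x - q)
    return x

# Two-variable criterion with alpha1 = alpha2 = 1, rho = 0.7 (assumed values)
rho = 0.7
Q = np.array([[1.0, -rho],
              [-rho, 1.0]])
q = -Q @ np.array([2.0, 3.0])        # so the minimiser is (2, 3)

A = np.diag(np.diag(Q))              # diagonal splitting Q = A - B
B = A - Q
print(np.abs(np.linalg.eigvals(np.linalg.solve(A, B))).max())  # rho(M) < 1?
print(splitting_solver(A, B, q, np.zeros(2)))                  # -> [2. 3.]
```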
Matrix splitting: sketch of proof (1)
Notations
$x^{[k]} = A^{-1} \left[ B x^{[k-1]} - q \right] = A^{-1} B \, x^{[k-1]} - A^{-1} q = M x^{[k-1]} - m$
Iterates
$x^{[K]} = M x^{[K-1]} - m$
$= M (M x^{[K-2]} - m) - m = M^2 x^{[K-2]} - (M + I) m$
$= M^2 (M x^{[K-3]} - m) - (M + I) m = M^3 x^{[K-3]} - (M^2 + M + I) m$
$= M^K x^{[0]} - (M^{K-1} + \dots + M^2 + M + I) m = M^K x^{[0]} - \left( \sum_{k=0}^{K-1} M^k \right) m$
Matrix splitting: sketch of proof (2)
Convergence... in relation with eigenvalues of $M$
Increasing powers... diagonalise $M = B \Lambda B^{-1}$:
$M^2 = B \Lambda B^{-1} B \Lambda B^{-1} = B \Lambda^2 B^{-1}$, ..., $M^K = B \Lambda^K B^{-1} \to O$
... and a well known series
$\sum_{k=0}^{K-1} M^k = \sum_{k=0}^{K-1} \left( B \Lambda B^{-1} \right)^k = B \left( \sum_{k=0}^{K-1} \Lambda^k \right) B^{-1} \to B (I - \Lambda)^{-1} B^{-1} = (I - M)^{-1}$
Convergence if $\rho(M) < 1$: spectral radius of $M$ smaller than 1
Matrix splitting: sketch of proof (3)
Iterates
$x^{[K]} = M^K x^{[0]} - \left( \sum_{k=0}^{K-1} M^k \right) m$
Convergence if $\rho(M) < 1$: spectral radius of $M$ smaller than 1
Limit for $K$ tends to $+\infty$
$x^{[\infty]} = \lim M^K x^{[0]} - \sum_{k=0}^{\infty} M^k m = O - (I - M)^{-1} m = -(I - A^{-1} B)^{-1} A^{-1} q = -(A - B)^{-1} q = -Q^{-1} q$
Matrix splitting: recap and examples
Matrix splitting recap
- Solve / compute: $Q \hat{x} = -q$, $\hat{x} = -Q^{-1} q$
- Matrix splitting: $Q = A - B$
- Iterate: $x^{[k]} = A^{-1} \left[ B x^{[k-1]} - q \right]$, i.e., $A x^{[k]} = B x^{[k-1]} - q$
- Computability: efficient inversion of $A$, efficient solving of $A u = v$
- Convergence requires $\rho(A^{-1} B) < 1$
Matrix splitting: recap and examples
Matrix splitting examples
- Diagonal: $Q = D - R$, $x^{[k]} = D^{-1} \left[ R x^{[k-1]} - q \right]$
- Lower and Upper: $Q = L + U$, $x^{[k]} = L^{-1} \left[ -U x^{[k-1]} - q \right]$, i.e., $L x^{[k]} = -U x^{[k-1]} - q$
- Circulant: $Q = C - R$, $x^{[k]} = C^{-1} \left[ R x^{[k-1]} - q \right]$
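The first two splittings are the classical Jacobi and Gauss-Seidel schemes; a minimal side-by-side sketch (the matrix and right-hand side are assumed for illustration):

```python
import numpy as np

Q = np.array([[2.0, -0.9],
              [-0.9, 2.0]])          # illustrative SPD matrix
q = np.array([-1.0, 0.5])

def iterate(A, B, q, n=100):
    """Run x[k] = A^{-1} (B x[k-1] - q) from x = 0."""
    x = np.zeros(q.size)
    for _ in range(n):
        x = np.linalg.solve(A, B @ x - q)
    return x

# Jacobi: Q = D - R, with D the diagonal part
D = np.diag(np.diag(Q))
x_jacobi = iterate(D, D - Q, q)

# Gauss-Seidel: Q = L + U, L lower triangular (diagonal included), A = L, B = -U
L = np.tril(Q)
x_gs = iterate(L, -(Q - L), q)

print(x_jacobi, x_gs, -np.linalg.solve(Q, q))  # the three should agree
```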
Matrix splitting: numerical results (1-3)
[Figures: iterates in the $(x_1, x_2)$ plane and component trajectories (top: $x_1$, bottom: $x_2$) versus iteration]
Matrix splitting: numerical results (all of them)
[Figures: iterates for coefficients $\rho = 0.7$, $\rho = 0.9$, $\rho = \dots$]
Two families of quadratic optimizers
- Component-wise
- Gradient direction
Quadratic criterion: illustration
One variable: $\alpha (x - \bar{x})^2 + \gamma$
Two variables: $\alpha_1 (x_1 - \bar{x}_1)^2 + \alpha_2 (x_2 - \bar{x}_2)^2 - 2 \rho \sqrt{\alpha_1 \alpha_2} (x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + \gamma$
[Figure: criterion surfaces as a function of the variables $x_1$ and $x_2$]
Comment: convex, concave, saddle point, proper direction
Minimisation, a first approach: component-wise
Criterion
$J(x) = \tfrac{1}{2} x^t Q x + q^t x + q_0$
Iterative component-wise update
For $k = 1, 2, \dots$
  For $p = 1, 2, \dots, P$
    Update $x_p$ by minimisation of $J(x)$ w.r.t. $x_p$ given the other variables
  End $p$
End $k$
Properties
- Fixed point algorithm
- Convergence is proved
Component-wise minimisation (1)
Criterion
$J(x) = \tfrac{1}{2} x^t Q x + q^t x + q_0$
Two ingredients
- Select pixel $p$: vector $\mathbb{1}_p = [0, \dots, 0, 1, 0, \dots, 0]^t$
- Nullify pixel $p$: matrix $I - \mathbb{1}_p \mathbb{1}_p^t$
Rewrite current image...
$x_p = \mathbb{1}_p^t x$, $\quad x_{/p} = (I - \mathbb{1}_p \mathbb{1}_p^t) x$, $\quad x = (I - \mathbb{1}_p \mathbb{1}_p^t) x + x_p \mathbb{1}_p = x_{/p} + x_p \mathbb{1}_p$
... and rewrite criterion
$J_p(x_p) = J \left[ x_{/p} + x_p \mathbb{1}_p \right]$
Component-wise minimisation (2)
Consider the criterion as a function of pixel $p$
$J_p(x_p) = J \left[ x_{/p} + x_p \mathbb{1}_p \right] = \tfrac{1}{2} (x_{/p} + x_p \mathbb{1}_p)^t Q (x_{/p} + x_p \mathbb{1}_p) + q^t (x_{/p} + x_p \mathbb{1}_p) + q_0 = \tfrac{1}{2} \mathbb{1}_p^t Q \mathbb{1}_p \, x_p^2 + (Q x_{/p} + q)^t \mathbb{1}_p \, x_p + \dots$
... and minimise: update pixel $p$
$x_p^{\text{opt}} = - \dfrac{(Q x_{/p} + q)^t \mathbb{1}_p}{\mathbb{1}_p^t Q \mathbb{1}_p}$
Computation load
- Practicability of computing $\mathbb{1}_p^t Q \mathbb{1}_p$ (think about side-effects)
- Possible efficient update of $\phi_p = (Q x_{/p} + q)^t \mathbb{1}_p$ itself
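A sketch of this component-wise update on a dense toy $Q$ (for the restoration criterion one would exploit the structure of $H$ and $D$, as the computation-load remarks above suggest):

```python
import numpy as np

def component_wise(Q, q, x0, n_sweeps=30):
    """Minimise (1/2) x'Qx + q'x by exact single-component updates."""
    x = x0.copy()
    for _ in range(n_sweeps):
        for p in range(x.size):
            # phi_p = (Q x_{/p} + q)_p, i.e. the partial gradient with x_p zeroed
            phi_p = Q[p] @ x - Q[p, p] * x[p] + q[p]
            x[p] = -phi_p / Q[p, p]          # exact minimiser along x_p
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Q = A @ A.T + 5 * np.eye(5)                   # random SPD matrix (illustrative)
q = rng.standard_normal(5)
print(component_wise(Q, q, np.zeros(5)))
print(-np.linalg.solve(Q, q))                 # reference minimiser
```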
Component-wise: numerical results (1-3)
[Figures: iterates in the $(x_1, x_2)$ plane, component trajectories (top: $x_1$, bottom: $x_2$) and criterion value versus iteration]
Component-wise: numerical results (all of them)
[Figures: iterates for coefficients $\rho = 0.7$, $\rho = 0.9$, $\rho = \dots$]
Extensions: multipixel and separability
Two families of quadratic optimizers
- Component-wise
- Gradient direction
Quadratic criterion: generalities
$J(x) = \alpha_1 (x_1 - \bar{x}_1)^2 + \alpha_2 (x_2 - \bar{x}_2)^2 - 2 \rho \sqrt{\alpha_1 \alpha_2} (x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + \gamma$
Optimality over $\mathbb{R}^N$ (no constraint)
- Null gradient: $0 = \left. \dfrac{\partial J}{\partial x} \right|_{\hat{x}} = g(\hat{x})$
- Positive Hessian: $Q = \dfrac{\partial^2 J}{\partial x^2} > 0$
Iterative algorithms, fixed point, ...
$\hat{x} = \arg\min_x J(x)$
Iterative update
- Initialisation $x^{[0]}$
- Iteration $k = 1, 2, \dots$: $x^{[k+1]} = x^{[k]} + \tau^{[k]} \delta^{[k]}$
  - Direction $\delta \in \mathbb{R}^N$
  - Step length $\tau \in \mathbb{R}_+$
- $x^{[k]} \to \hat{x}$ as $k \to \infty$
- Stopping rule, e.g. $\| g(x^{[k]}) \| < \varepsilon$
Iterative algorithms, fixed point, descent, ...
Direction $\delta$
- Optimal: opposite of the gradient
- Newton: inverse Hessian
- Preconditioned
- Corrected directions: bisector, Vignes, Polak-Ribière, ..., conjugate direction, ...
Step length $\tau$
- Optimal
- Over-relaxed / under-relaxed
- Armijo, Goldstein, Wolfe
- ... and also fixed step
... optimal, yes, ... and no ...
Gradient with optimal step length
Strategy: optimal direction, optimal step length
Iteration
$x^{[k+1]} = x^{[k]} + \tau^{[k]} \delta^{[k]}$
Direction $\delta \in \mathbb{R}^N$: $\delta^{[k]} = -g(x^{[k]})$
Step length $\tau \in \mathbb{R}_+$: $J_\delta(\tau) = J(x^{[k]} + \tau \delta^{[k]})$, a second order polynomial $J_\delta(\tau) = \dots \tau^2 + \dots \tau + \dots$
Optimal step length
$\tau^{[k]} = \dfrac{g(x^{[k]})^t g(x^{[k]})}{g(x^{[k]})^t Q \, g(x^{[k]})}$
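A sketch of the resulting algorithm (the stopping threshold and the toy problem are assumed):

```python
import numpy as np

def gradient_optimal_step(Q, q, x0, n_iter=500, eps=1e-10):
    """Gradient descent with exact line search on J = (1/2) x'Qx + q'x."""
    x = x0.copy()
    for _ in range(n_iter):
        g = Q @ x + q                     # gradient g(x) = Qx + q
        if np.linalg.norm(g) < eps:       # stopping rule ||g|| < eps
            break
        tau = (g @ g) / (g @ (Q @ g))     # optimal step length
        x = x - tau * g                   # direction delta = -g
    return x

rho = 0.9
Q = np.array([[1.0, -rho], [-rho, 1.0]])  # ill-conditioned when rho -> 1
q = np.array([1.0, -2.0])
print(gradient_optimal_step(Q, q, np.zeros(2)), -np.linalg.solve(Q, q))
```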
Illustration
Two readings: optimisation along the given direction (line-search), constrained optimisation
Remark: orthogonality of successive directions (see end of exercise)
Sketch of convergence proof (1)
A reminder for notations
$J(x) = \tfrac{1}{2} x^t Q x + q^t x + q_0$, $\quad g(x) = Qx + q$
Some preliminary results
$\hat{x} = \arg\min_x J(x) = -Q^{-1} q$
$J(\hat{x}) = -\tfrac{1}{2} q^t Q^{-1} q + q_0$
$J(x) - J(\hat{x}) = \tfrac{1}{2} g(x)^t Q^{-1} g(x)$
$J(x^{\star} + h) - J(x^{\star}) = g(x^{\star})^t h + \tfrac{1}{2} h^t Q h$
Sketch of convergence proof (2)
Notational convenience: $x^{[k]} = x$, $x^{[k+1]} = x^{+}$, $\tau^{[k]} = \tau$, $g(x^{[k]}) = g$
Criterion increase / decrease
$J(x^{+}) - J(x) = J(x + \tau \delta) - J(x) = g^t [\tau \delta] + \tfrac{1}{2} [\tau \delta]^t Q [\tau \delta] = \dots = -\tfrac{1}{2} \dfrac{\left[ g^t g \right]^2}{g^t Q g}$
since $\delta = -g$ and $\tau = \dfrac{g^t g}{g^t Q g}$
Comment: it decreases, it reduces...
Sketch of convergence proof (3)
Distance to minimiser
$J(x^{+}) - J(\hat{x}) = J(x) - J(\hat{x}) - \tfrac{1}{2} \dfrac{\left[ g^t g \right]^2}{g^t Q g}$
$= \left[ J(x) - J(\hat{x}) \right] \left[ 1 - \tfrac{1}{2} \dfrac{\left[ g^t g \right]^2}{g^t Q g} \dfrac{1}{J(x) - J(\hat{x})} \right] = \left[ J(x) - J(\hat{x}) \right] \left[ 1 - \dfrac{\left[ g^t g \right]^2}{\left[ g^t Q g \right] \left[ g^t Q^{-1} g \right]} \right]$
$\leq \left[ J(x) - J(\hat{x}) \right] \left( \dfrac{M - m}{M + m} \right)^2$
... so it converges
$M$ and $m$: maximal and minimal eigenvalues of $Q$
... and comment: Kantorovich inequality (see next slide)
Kantorovich inequality
Result
$\left[ u^t Q u \right] \left[ u^t Q^{-1} u \right] \leq \dfrac{(M + m)^2}{4 m M} \, \|u\|^4$
- $Q$ symmetric and positive-definite
- $M$ and $m$: maximal and minimal eigenvalues of $Q$
Short sketch of proof
- Quite long and complex
- Case $\|u\| = 1$
- Diagonalise $Q$: $Q = P^t \Lambda P$ and $Q^{-1} = P^t \Lambda^{-1} P$
- Convex combination of the eigenvalues and their inverses
- Convexity of $t \mapsto 1/t$
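A quick numerical sanity check of the inequality on a random symmetric positive-definite matrix (a sketch, not a proof; the matrix and sample count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
Q = A @ A.T + 6 * np.eye(6)                  # random SPD matrix
m, M = np.linalg.eigvalsh(Q)[[0, -1]]        # extreme eigenvalues
bound = (M + m) ** 2 / (4 * m * M)

worst = 0.0
for _ in range(10_000):
    u = rng.standard_normal(6)
    lhs = (u @ Q @ u) * (u @ np.linalg.solve(Q, u)) / (u @ u) ** 2
    worst = max(worst, lhs)

print(worst, "<=", bound)                    # never exceeds the bound
```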
Gradient with optimal step: numerical results (1-3)
[Figures: iterates in the $(x_1, x_2)$ plane, component trajectories (top: $x_1$, bottom: $x_2$) and criterion value versus iteration]
Gradient with optimal step: numerical results (all of them)
[Figures: iterates for coefficients $\rho = 0.7$, $\rho = 0.9$, $\rho = \dots$]
Preconditioned gradient: short digest...
Strategy: modified direction, optimal step length
Iteration
$x^{[k+1]} = x^{[k]} + \tau^{[k]} \delta^{[k]}$
Direction $\delta \in \mathbb{R}^N$: $\delta^{[k]} = -P \, g(x^{[k]})$
Step length $\tau \in \mathbb{R}_+$: $J_\delta(\tau) = J(x^{[k]} + \tau \delta^{[k]})$, a second order polynomial $J_\delta(\tau) = \dots \tau^2 + \dots \tau + \dots$
Optimal step length
$\tau^{[k]} = \dfrac{g(x^{[k]})^t P \, g(x^{[k]})}{g(x^{[k]})^t P Q P \, g(x^{[k]})}$
Preconditioned gradient: two special cases...
Non-preconditioned: $P = I$, standard gradient algorithm
- Direction $\delta \in \mathbb{R}^N$: $\delta^{[k]} = -g(x^{[k]})$
- Step length $\tau \in \mathbb{R}_+$: $\tau^{[k]} = \dfrac{g(x^{[k]})^t g(x^{[k]})}{g(x^{[k]})^t Q \, g(x^{[k]})}$
Perfect preconditioner: $P = Q^{-1}$, one step optimal
- Direction $\delta \in \mathbb{R}^N$: $\delta^{[k]} = -Q^{-1} g(x^{[k]})$
- Step length $\tau \in \mathbb{R}_+$: $\tau^{[k]} = \dfrac{g(x^{[k]})^t P \, g(x^{[k]})}{g(x^{[k]})^t P Q P \, g(x^{[k]})} = 1$
- First iterate: $x^{[1]} = x^{[0]} - Q^{-1} g(x^{[0]}) = x^{[0]} - Q^{-1} (Q x^{[0]} + q) = -Q^{-1} q = \hat{x}$ !
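A sketch covering both special cases (the matrices are assumed; $P = I$ reproduces the standard gradient, $P = Q^{-1}$ converges in one iteration):

```python
import numpy as np

def preconditioned_gradient(Q, q, P, x0, n_iter=500, eps=1e-10):
    """Descent with direction -P g and exact step length."""
    x = x0.copy()
    for k in range(n_iter):
        g = Q @ x + q
        if np.linalg.norm(g) < eps:
            return x, k                            # converged in k iterations
        tau = (g @ P @ g) / (g @ P @ Q @ (P @ g))  # optimal step length
        x = x - tau * (P @ g)
    return x, n_iter

rho = 0.9
Q = np.array([[1.0, -rho], [-rho, 1.0]])
q = np.array([1.0, -2.0])

print(preconditioned_gradient(Q, q, np.eye(2), np.zeros(2)))         # P = I
print(preconditioned_gradient(Q, q, np.linalg.inv(Q), np.zeros(2)))  # P = Q^{-1}
```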
Preconditioned gradient: numerical results
Better and better approximation of the inverse Hessian
Synthesis
Image restoration, deconvolution and other inverse problems
Three types of regularised inversion...
- Quadratic penalties
- Convex non-quadratic penalties
- Constraints: positivity and support
... based on unconstrained quadratic optimisation
A basic component: quadratic criterion and Gaussian model
- Circulant approximation and computations based on FFT
- Numerical linear system solvers: matrix splitting
- Numerical quadratic optimizers: component-wise, gradient methods
Decreasing criterion...
[Figure: criterion value decreasing along iterations]
Various solutions...
[Figure: 1D profiles comparing True, Observation, Quadratic penalty, Huber penalty (left) and True, Observation, Quadratic penalty, Constrained (right)]