Image restoration: numerical optimisation

Image restoration: numerical optimisation
Short and partial presentation
Jean-François Giovannelli
Groupe Signal Image, Laboratoire de l'Intégration du Matériau au Système, Univ. Bordeaux, CNRS, BINP

Context and topic
- Image restoration, deconvolution
- Missing information: ill-posedness and regularisation
- Previous / next lectures, exercises and practical works

Three types of regularised inversion
- Quadratic penalties
- Convex non-quadratic penalties
- Constraints: positivity and support
- Bayesian strategy: an incursion for hyperparameter tuning

A basic component: quadratic criterion and Gaussian model
- Circulant approximation and computations based on FFT
- Other approaches:
  - Numerical linear system solvers
  - Numerical quadratic optimizers

Direct / Inverse, Convolution / Deconvolution, ...
  Direct model:  y = Hx + ε = h ⋆ x + ε
  Inversion:     x̂ = X̂(y)
Reconstruction, restoration, deconvolution, denoising
General problem: ill-posed inverse problems, i.e., lack of information
Methodology: regularisation, i.e., information compensation
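
To make the direct model concrete, here is a minimal numpy sketch that builds a small 1D convolution operator H, draws a noise realisation ε and produces data y = Hx + ε. The kernel, noise level and box-shaped test signal are illustrative choices, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Direct model y = H x + eps, with H the (circulant, for simplicity) matrix of a blur kernel h.
N = 128
x = np.zeros(N)
x[40:60] = 1.0                                    # toy "object": a box signal

offsets = np.arange(-7, 8)
h = np.exp(-0.5 * (offsets / 2.0) ** 2)
h /= h.sum()                                      # normalised Gaussian blur kernel

H = np.zeros((N, N))
for offset, hk in zip(offsets, h):
    H += hk * np.roll(np.eye(N), offset, axis=1)  # one circulant shift per kernel tap

eps = 0.01 * rng.standard_normal(N)               # additive white Gaussian noise
y = H @ x + eps                                   # observed data: blurred and noisy
```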

Various quadratic criteria

Quadratic criterion and linear solution
  J(x) = ‖y − Hx‖^2 + µ Σ_{p∼q} (x_p − x_q)^2 = ‖y − Hx‖^2 + µ ‖Dx‖^2

Huber penalty, extended criterion and half-quadratic approaches
  J(x) = ‖y − Hx‖^2 + µ Σ_{p∼q} ϕ(x_p − x_q)
  J(x, b) = ‖y − Hx‖^2 + µ Σ_{p∼q} [ ((x_p − x_q) − b_{pq})^2 + ζ(b_{pq}) ]

Constraints: augmented Lagrangian and ADMM
  J(x) = ‖y − Hx‖^2 + µ ‖Dx‖^2   subject to   x_p = 0 for p ∉ S (support),  x_p ≥ 0 for p ∈ M (positivity)
  L(x, s, l) = ‖y − Hx‖^2 + µ ‖Dx‖^2 + ρ ‖x − s‖^2 + l^t (x − s)

Various solutions (no circulant approximation) ...
[Figures: true signal, observation, and restorations with quadratic penalty, Huber penalty, and constraints]

A reminder of the calculi

The criterion ...
  J(x) = ‖y − Hx‖^2 + µ ‖Dx‖^2

... its gradient ...
  ∂J/∂x = −2 H^t (y − Hx) + 2µ D^t D x = 2 (H^t H + µ D^t D) x − 2 H^t y

... and its Hessian ...
  ∂^2 J/∂x^2 = 2 (H^t H + µ D^t D)

The multivariate linear system of equations ...
  (H^t H + µ D^t D) x̂ = H^t y

... and the minimiser ...
  x̂ = (H^t H + µ D^t D)^{−1} H^t y
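
As a sanity check of these formulas, the sketch below forms H^t H + µ D^t D for a small 1D problem and solves the linear system with a dense solver; H, D, µ and the test signal are illustrative stand-ins rather than the lecture's data.

```python
import numpy as np

rng = np.random.default_rng(1)

N = 64
H = sum(np.eye(N, k=k) for k in range(-3, 4)) / 7.0   # toy blur: moving average
D = np.eye(N) - np.eye(N, k=1)                        # first differences x_p - x_{p+1}
mu = 0.5

x_true = np.cumsum(rng.standard_normal(N)) / 5.0      # smooth-ish ground truth
y = H @ x_true + 0.02 * rng.standard_normal(N)

# Minimiser of J(x) = ||y - Hx||^2 + mu ||Dx||^2 via the normal equations:
#   (H^t H + mu D^t D) x_hat = H^t y
A = H.T @ H + mu * D.T @ D
x_hat = np.linalg.solve(A, H.T @ y)

# The gradient 2 (A x_hat - H^t y) vanishes at the minimiser.
assert np.allclose(A @ x_hat, H.T @ y)
```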

Reformulation, notations ...

Rewrite the criterion ...
  J(x) = ‖y − Hx‖^2 + µ ‖Dx‖^2 = (1/2) x^t Q x + q^t x + q_0

  Gradient:  g(x) = Q x + q
  Hessian:   Q = 2 (H^t H + µ D^t D)
             q = g(0) = −2 H^t y   ... the gradient at x = 0 ...

... the system of equations ...
  (H^t H + µ D^t D) x̂ = H^t y,   i.e.   Q x̂ = −q

... and the minimiser ...
  x̂ = (H^t H + µ D^t D)^{−1} H^t y = −Q^{−1} q

Quadratic criterion: illustration

  α (x − x̄)^2 + γ
  α_1 (x_1 − x̄_1)^2 + α_2 (x_2 − x̄_2)^2 − 2ρ √(α_1 α_2) (x_1 − x̄_1)(x_2 − x̄_2) + γ

[Figures: criterion surfaces as functions of x_1 and x_2]
Comment: convex, concave, saddle point, proper direction

Computational and algorithmic aspects

Various options and many relationships ...
- Direct calculus, compact (closed) form, matrix inversion
  - Circulant approximation and diagonalisation by FFT
  - Special algorithms for the 1D case: recursive least squares, Kalman smoother or filter (and fast versions, ...)
- Algorithms for linear system solving
  - Splitting idea
  - Gauss, Gauss-Jordan, substitution, triangularisation, ...
  - And Levinson, ...
- Algorithms for criterion optimisation
  - Component-wise ...
  - Gradient descent ... and various modifications

A family of linear system solvers
Matrix splitting algorithms

Linear solver: illustration

Criterion
  J(x) = α_1 (x_1 − x̄_1)^2 + α_2 (x_2 − x̄_2)^2 − 2ρ √(α_1 α_2) (x_1 − x̄_1)(x_2 − x̄_2)

Nullification of the gradient
  ∂J/∂x_1 = 2 α_1 (x_1 − x̄_1) − 2ρ √(α_1 α_2) (x_2 − x̄_2) = 0
  ∂J/∂x_2 = 2 α_2 (x_2 − x̄_2) − 2ρ √(α_1 α_2) (x_1 − x̄_1) = 0
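
For this two-variable example, nullifying the gradient is just a 2-by-2 linear solve; the α, ρ and x̄ values below are illustrative.

```python
import numpy as np

a1, a2, rho = 1.0, 3.0, 0.7          # illustrative coefficients
xb1, xb2 = 2.0, -1.0                 # the centre point (x_bar_1, x_bar_2)
c = rho * np.sqrt(a1 * a2)

# Nullified gradient (after dividing by 2):
#   a1 (x1 - xb1) - c (x2 - xb2) = 0
#   a2 (x2 - xb2) - c (x1 - xb1) = 0
A = np.array([[a1, -c], [-c, a2]])
b = np.array([a1 * xb1 - c * xb2, a2 * xb2 - c * xb1])
print(np.linalg.solve(A, b))         # recovers (xb1, xb2) since rho^2 < 1
```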

An example of linear solver: matrix splitting

A simple and fruitful idea
  Solve  g(x) = Q x + q = 0,  that is  Q x̂ = −q
  Matrix splitting:  Q = A − B
  Take advantage ...
    Q x̂ = −q  ⟺  (A − B) x̂ = −q  ⟺  A x̂ = B x̂ − q  ⟺  x̂ = A^{−1} [ B x̂ − q ]

Iterative solver
  x^{[k]} = A^{−1} [ B x^{[k−1]} − q ]
  If it exists, a fixed point satisfies x = A^{−1} [ B x − q ], so Q x = −q.
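
A minimal sketch of the splitting iteration x^{[k]} = A^{−1}[B x^{[k−1]} − q] on a small symmetric positive-definite Q, taking A as the diagonal of Q (one convenient choice; other splittings are listed a few slides further on). The matrix, right-hand side and tolerance are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# A symmetric, strictly diagonally dominant (hence positive-definite) Q and a vector q.
N = 20
S = rng.standard_normal((N, N))
S = S @ S.T
Q = S + np.diag(np.abs(S).sum(axis=1) + 1.0)
q = rng.standard_normal(N)

# Splitting Q = A - B with A = diag(Q) (trivial to invert) and B = A - Q.
A = np.diag(np.diag(Q))
B = A - Q

# Fixed-point iteration x[k] = A^{-1} (B x[k-1] - q).
x = np.zeros(N)
for k in range(500):
    x_new = np.linalg.solve(A, B @ x - q)
    done = np.linalg.norm(x_new - x) < 1e-12
    x = x_new
    if done:
        break

# The fixed point solves Q x = -q.
assert np.allclose(Q @ x, -q, atol=1e-8)
```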

Matrix splitting: sketch of proof

Notations
  x^{[k]} = A^{−1} [ B x^{[k−1]} − q ] = A^{−1} B x^{[k−1]} − A^{−1} q = M x^{[k−1]} − m
  with  M = A^{−1} B  and  m = A^{−1} q

Iterates
  x^{[K]} = M x^{[K−1]} − m
          = M (M x^{[K−2]} − m) − m = M^2 x^{[K−2]} − (M + I) m
          = M^2 (M x^{[K−3]} − m) − (M + I) m = M^3 x^{[K−3]} − (M^2 + M + I) m
          = ...
          = M^K x^{[0]} − (M^{K−1} + ... + M^2 + M + I) m
          = M^K x^{[0]} − ( Σ_{k=0}^{K−1} M^k ) m

Matrix splitting: sketch of proof (continued)

Convergence ... in relation with the eigenvalues of M

Increasing powers ... (eigendecomposition M = V Λ V^{−1}, Λ diagonal)
  M = V Λ V^{−1}
  M^2 = V Λ V^{−1} V Λ V^{−1} = V Λ^2 V^{−1}
  ...
  M^K = V Λ^K V^{−1} → O

... and a well-known series
  Σ_{k=0}^{K−1} M^k = Σ_{k=0}^{K−1} (V Λ V^{−1})^k = V ( Σ_{k=0}^{K−1} Λ^k ) V^{−1} → V (I − Λ)^{−1} V^{−1} = (I − M)^{−1}

Convergence if ρ(M) < 1: spectral radius of M smaller than 1

Matrix splitting: sketch of proof (continued)

Iterates
  x^{[K]} = M^K x^{[0]} − ( Σ_{k=0}^{K−1} M^k ) m

Convergence if ρ(M) < 1: spectral radius of M smaller than 1

Limit as K tends to +∞
  x^{[∞]} = lim M^K x^{[0]} − lim ( Σ_{k=0}^{K−1} M^k ) m
          = O x^{[0]} − (I − M)^{−1} m
          = −(I − A^{−1} B)^{−1} A^{−1} q = −(A − B)^{−1} q = −Q^{−1} q = x̂

Matrix splitting: recap and examples

Matrix splitting recap
  Solve / compute:    Q x̂ = −q,  i.e.  x̂ = −Q^{−1} q
  Matrix splitting:   Q = A − B
  Iterate:            x^{[k]} = A^{−1} [ B x^{[k−1]} − q ],  i.e.  A x^{[k]} = B x^{[k−1]} − q

Computability: efficient inversion of A, or efficient solving of A u = v
Convergence requires ρ(A^{−1} B) < 1

Matrix splitting: recap and examples

Matrix splitting examples
  Diagonal: Q = D_Q − R, with D_Q the diagonal part of Q
    x^{[k]} = D_Q^{−1} [ R x^{[k−1]} − q ]
  Lower and upper: Q = L + U, with L lower triangular (including the diagonal) and U strictly upper triangular
    x^{[k]} = −L^{−1} [ U x^{[k−1]} + q ],  i.e.  L x^{[k]} = −U x^{[k−1]} − q
  Circulant: Q = C − R, with C circulant
    x^{[k]} = C^{−1} [ R x^{[k−1]} − q ]
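
The diagonal and lower/upper splittings above correspond to what is commonly called Jacobi and Gauss-Seidel iterations. The sketch below applies both to a small Q built from the quadratic criterion and checks the convergence condition ρ(A^{−1}B) < 1 before iterating; the 1D operators and µ are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# A small Q = 2 (H^t H + mu D^t D) and q = -2 H^t y from a 1D toy problem.
N = 30
H = sum(np.eye(N, k=k) for k in range(-2, 3)) / 5.0
D = np.eye(N) - np.eye(N, k=1)
mu = 1.0
Q = 2.0 * (H.T @ H + mu * D.T @ D)
q = -2.0 * H.T @ rng.standard_normal(N)

x_ref = np.linalg.solve(Q, -q)                     # reference solution of Q x = -q

splittings = {
    "diagonal (Jacobi-type)": np.diag(np.diag(Q)),
    "lower/upper (Gauss-Seidel-type)": np.tril(Q),
}

for name, A in splittings.items():
    B = A - Q
    rho = np.max(np.abs(np.linalg.eigvals(np.linalg.solve(A, B))))
    if rho >= 1:
        print(f"{name}: rho(A^-1 B) = {rho:.3f} >= 1, iteration may diverge")
        continue
    x = np.zeros(N)
    for _ in range(5000):
        x = np.linalg.solve(A, B @ x - q)
    print(f"{name}: rho(A^-1 B) = {rho:.3f}, error = {np.linalg.norm(x - x_ref):.2e}")
```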

Matrix splitting: numerical results
[Figures: iterates x_1 (top) and x_2 (bottom) versus iteration, for three runs]

Numerical results (all of them)
[Figures: iterate trajectories for correlation coefficients ρ = .7, ρ = .9, ...]

Two families of quadratic optimizers
- Component-wise
- Gradient direction

Quadratic criterion: illustration

  α (x − x̄)^2 + γ
  α_1 (x_1 − x̄_1)^2 + α_2 (x_2 − x̄_2)^2 − 2ρ √(α_1 α_2) (x_1 − x̄_1)(x_2 − x̄_2) + γ

[Figures: criterion surfaces as functions of x_1 and x_2]
Comment: convex, concave, saddle point, proper direction

Minimisation, a first approach: component-wise

Criterion
  J(x) = (1/2) x^t Q x + q^t x + q_0

Iterative component-wise update
  For k = 1, 2, ...
    For p = 1, 2, ..., P
      Update x_p by minimisation of J(x) w.r.t. x_p, given the other variables
    End p
  End k

Properties
- Fixed-point algorithm
- Convergence is proved

Component-wise minimisation

Criterion
  J(x) = (1/2) x^t Q x + q^t x + q_0

Two ingredients
  Select pixel p:   vector 1_p = [0, ..., 0, 1, 0, ..., 0]^t
  Nullify pixel p:  matrix I − 1_p 1_p^t

Rewrite the current image ...
  x_p = 1_p^t x
  x_{/p} = (I − 1_p 1_p^t) x
  x = (I − 1_p 1_p^t) x + x_p 1_p = x_{/p} + x_p 1_p

... and rewrite the criterion
  J_p(x_p) = J[ x_{/p} + x_p 1_p ]

Component-wise minimisation (continued)

Consider the criterion as a function of pixel p
  J_p(x_p) = J[ x_{/p} + x_p 1_p ]
           = (1/2) (x_{/p} + x_p 1_p)^t Q (x_{/p} + x_p 1_p) + q^t (x_{/p} + x_p 1_p) + q_0
           = (1/2) (1_p^t Q 1_p) x_p^2 + (Q x_{/p} + q)^t 1_p x_p + ...

... and minimise: update pixel p
  x_p^{opt} = − (Q x_{/p} + q)^t 1_p / ( 1_p^t Q 1_p )

Computational load
- Practicability of computing 1_p^t Q 1_p (think about side effects)
- Possible efficient update of φ_p = (Q x_{/p} + q)^t 1_p itself
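
A direct transcription of this update into numpy, sweeping the components cyclically on a small illustrative quadratic; the efficient update of φ_p mentioned above is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(4)

# Small symmetric positive-definite quadratic J(x) = 0.5 x^t Q x + q^t x + q0.
N = 15
G = rng.standard_normal((N, N))
Q = G @ G.T + np.eye(N)
q = rng.standard_normal(N)

x = np.zeros(N)
for sweep in range(300):
    for p in range(N):
        x_over_p = x.copy()
        x_over_p[p] = 0.0                      # x_{/p}: current image with pixel p set to zero
        # x_p^opt = -(Q x_{/p} + q)^t 1_p / (1_p^t Q 1_p)
        x[p] = -(Q @ x_over_p + q)[p] / Q[p, p]

# The sweeps converge towards the global minimiser x_hat = -Q^{-1} q.
x_hat = -np.linalg.solve(Q, q)
print(np.linalg.norm(x - x_hat))
```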

Component-wise: numerical results
[Figures: iterates x_1 (top) and x_2 (bottom), and criterion value, versus iteration]

Component-wise: numerical results (all of them)
[Figures: iterate trajectories for correlation coefficients ρ = .7, ρ = .9, ...]

Extensions: multipixel and separability

Two families of quadratic optimizers
- Component-wise
- Gradient direction

Quadratic criterion: generalities

  J(x) = α_1 (x_1 − x̄_1)^2 + α_2 (x_2 − x̄_2)^2 − 2ρ √(α_1 α_2) (x_1 − x̄_1)(x_2 − x̄_2) + γ

Optimality over R^N (no constraint)
  Null gradient:     g(x̂) = ∂J/∂x |_{x = x̂} = 0
  Positive Hessian:  Q = ∂^2 J/∂x^2 > 0

Iterative algorithms, fixed point, ...

Goal:  x̂ = arg min_x J(x)

Iterative update
  Initialisation: x^{[0]}
  Iteration k = 0, 1, 2, ...:  x^{[k+1]} = x^{[k]} + τ^{[k]} δ^{[k]}
    Direction δ ∈ R^N
    Step length τ ∈ R_+
  Expected behaviour: x^{[k]} → x̂ as k → +∞
  Stopping rule, e.g. ‖g(x^{[k]})‖ < ε
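
The generic update and stopping rule can be written as a small reusable loop taking the direction and step-length rules as functions; the tolerance, iteration cap and the toy quadratic used in the example are illustrative choices.

```python
import numpy as np


def descend(grad, direction, step, x0, tol=1e-8, max_iter=10_000):
    """Generic iteration x[k+1] = x[k] + tau[k] * delta[k], stopped when ||g|| < tol."""
    x = np.asarray(x0, dtype=float).copy()
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:       # stopping rule
            return x, k
        delta = direction(x, g)           # direction delta in R^N
        tau = step(x, g, delta)           # step length tau in R_+
        x = x + tau * delta
    return x, max_iter


# Example on J(x) = 0.5 x^t Q x + q^t x, with a (deliberately crude) fixed step.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, -1.0])
x_hat, n_iter = descend(
    grad=lambda x: Q @ x + q,
    direction=lambda x, g: -g,
    step=lambda x, g, d: 0.1,
    x0=np.zeros(2),
)
print(n_iter, x_hat, -np.linalg.solve(Q, q))
```

The gradient, optimal-step and preconditioned variants of the next slides only differ in the direction and step arguments passed to such a loop.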

Iterative algorithms, fixed point, descent, ...

Direction δ
- Optimal: opposite of the gradient
- Newton: inverse Hessian
- Preconditioned
- Corrected directions: bisector, Vignes, Polak-Ribière, ... conjugate direction, ...

Step length τ
- Optimal
- Over-relaxed / under-relaxed
- Armijo, Goldstein, Wolfe
- ... and also fixed step
- ... optimal, yes, ... and no ...

Gradient with optimal step length

Strategy: optimal direction + optimal step length

Iteration:  x^{[k+1]} = x^{[k]} + τ^{[k]} δ^{[k]}

Direction δ ∈ R^N:  δ^{[k]} = −g(x^{[k]})

Step length τ ∈ R_+:  J_δ(τ) = J(x^{[k]} + τ δ^{[k]}), a second-order polynomial
  J_δ(τ) = ... τ^2 + ... τ + ...

Optimal step length
  τ^{[k]} = g(x^{[k]})^t g(x^{[k]}) / ( g(x^{[k]})^t Q g(x^{[k]}) )
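
A minimal numpy sketch of the gradient algorithm with the optimal step length on a two-variable quadratic whose coupling mimics the correlated example of the illustrations; the values of ρ, α and q are illustrative.

```python
import numpy as np

# Quadratic criterion J(x) = 0.5 x^t Q x + q^t x with a correlation-like coupling rho.
rho, a1, a2 = 0.9, 1.0, 3.0
c = rho * np.sqrt(a1 * a2)
Q = np.array([[a1, -c], [-c, a2]])
q = np.array([1.0, -2.0])

x = np.zeros(2)
for k in range(10_000):
    g = Q @ x + q                          # gradient g(x) = Q x + q
    if np.linalg.norm(g) < 1e-10:
        break
    tau = (g @ g) / (g @ Q @ g)            # optimal step length along -g
    x = x - tau * g

print(k, x, -np.linalg.solve(Q, q))        # iterate vs exact minimiser -Q^{-1} q
```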

Illustration

Two readings:
- optimisation along the given direction (line search)
- constrained optimisation (restricted to the line)

Remark: orthogonality of successive directions (see end of exercise)

Sketch of convergence proof

A reminder for notations
  J(x) = (1/2) x^t Q x + q^t x + q_0
  g(x) = Q x + q

Some preliminary results
  x̂ = arg min_x J(x) = −Q^{−1} q
  J(x̂) = −(1/2) q^t Q^{−1} q + q_0
  J(x) − J(x̂) = (1/2) g(x)^t Q^{−1} g(x)
  J(x + h) − J(x) = g(x)^t h + (1/2) h^t Q h

Sketch of convergence proof (continued)

Notational convenience: x = x^{[k]}, x⁺ = x^{[k+1]}, τ = τ^{[k]}, g = g(x^{[k]})

Criterion increase / decrease
  J(x⁺) − J(x) = J(x + τ δ) − J(x)            with δ = −g and τ = g^t g / (g^t Q g)
               = g^t (τ δ) + (1/2) (τ δ)^t Q (τ δ)
               = ...
               = −(1/2) [g^t g]^2 / (g^t Q g)

Comment: it decreases, it reduces ...

Sketch of convergence proof (continued)

Distance to the minimiser
  J(x⁺) − J(x̂) = [J(x) − J(x̂)] − (1/2) [g^t g]^2 / (g^t Q g)
               = [J(x) − J(x̂)] ( 1 − (1/2) [g^t g]^2 / ( [g^t Q g] [J(x) − J(x̂)] ) )
               = [J(x) − J(x̂)] ( 1 − [g^t g]^2 / ( [g^t Q g] [g^t Q^{−1} g] ) )
               ≤ [J(x) − J(x̂)] ( (M − m) / (M + m) )^2

... so it converges (M and m: maximal and minimal eigenvalues of Q)
... and the key ingredient is the Kantorovich inequality (see next slide)

Kantorovich inequality

Result
  [ u^t Q u ] [ u^t Q^{−1} u ] ≤ ( (M + m)^2 / (4 m M) ) ‖u‖^4
  Q symmetric and positive-definite; M and m: maximal and minimal eigenvalues of Q

Short sketch of proof (quite long and complex)
- Case ‖u‖ = 1
- Diagonalise Q: Q = P^t Λ P and Q^{−1} = P^t Λ^{−1} P
- Convex combination of the eigenvalues and of their inverses
- Convexity of t ↦ 1/t
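
A quick numerical check, with illustrative values, that each optimal-step gradient iteration reduces J(x) − J(x̂) at least by the factor ((M − m)/(M + m))^2 obtained above via the Kantorovich inequality.

```python
import numpy as np

rng = np.random.default_rng(5)

# Random symmetric positive-definite Q, its extreme eigenvalues, and the guaranteed factor.
N = 8
G = rng.standard_normal((N, N))
Q = G @ G.T + np.eye(N)
q = rng.standard_normal(N)
eigs = np.linalg.eigvalsh(Q)
m, M = eigs[0], eigs[-1]
factor = ((M - m) / (M + m)) ** 2

J = lambda x: 0.5 * x @ Q @ x + q @ x
x_hat = -np.linalg.solve(Q, q)

x = rng.standard_normal(N)
for _ in range(30):
    g = Q @ x + q
    if np.linalg.norm(g) < 1e-9:
        break
    tau = (g @ g) / (g @ Q @ g)
    x_next = x - tau * g
    # Per-iteration bound: J(x+) - J(x_hat) <= factor * (J(x) - J(x_hat)).
    assert J(x_next) - J(x_hat) <= factor * (J(x) - J(x_hat)) + 1e-12
    x = x_next
```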

Numerical results
[Figures: gradient iterates x_1 (top) and x_2 (bottom), and criterion value, versus iteration]

Numerical results (all of them)
[Figures: iterate trajectories for correlation coefficients ρ = .7, ρ = .9, ...]

Preconditioned gradient: short digest ...

Strategy: modified direction + optimal step length

Iteration:  x^{[k+1]} = x^{[k]} + τ^{[k]} δ^{[k]}

Direction δ ∈ R^N:  δ^{[k]} = −P g(x^{[k]})

Step length τ ∈ R_+:  J_δ(τ) = J(x^{[k]} + τ δ^{[k]}), a second-order polynomial
  J_δ(τ) = ... τ^2 + ... τ + ...

Optimal step length
  τ^{[k]} = g(x^{[k]})^t P g(x^{[k]}) / ( g(x^{[k]})^t P Q P g(x^{[k]}) )

Preconditioned gradient: two special cases ...

Non-preconditioned: P = I, standard gradient algorithm
  Direction:    δ^{[k]} = −g(x^{[k]})
  Step length:  τ^{[k]} = g(x^{[k]})^t g(x^{[k]}) / ( g(x^{[k]})^t Q g(x^{[k]}) )

Perfect preconditioner: P = Q^{−1}, one-step optimum
  Direction:    δ^{[k]} = −Q^{−1} g(x^{[k]})
  Step length:  τ^{[k]} = g(x^{[k]})^t P g(x^{[k]}) / ( g(x^{[k]})^t P Q P g(x^{[k]}) ) = 1
  First iterate:
    x^{[1]} = x^{[0]} − Q^{−1} g(x^{[0]}) = x^{[0]} − Q^{−1} (Q x^{[0]} + q) = −Q^{−1} q = x̂ !
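
A sketch of the preconditioned iteration comparing the two special cases above with an intermediate, rough approximation of the inverse Hessian (here (Q + 5I)^{−1}, an arbitrary illustrative choice); the quadratic itself is a random symmetric positive-definite example.

```python
import numpy as np

rng = np.random.default_rng(6)

N = 12
G = rng.standard_normal((N, N))
Q = G @ G.T + np.eye(N)
q = rng.standard_normal(N)
x_hat = -np.linalg.solve(Q, q)


def preconditioned_gradient(P, n_iter=5000, tol=1e-10):
    """x[k+1] = x[k] - tau[k] P g(x[k]) with the optimal step length."""
    x = np.zeros(N)
    for k in range(n_iter):
        g = Q @ x + q
        if np.linalg.norm(g) < tol:
            return x, k
        tau = (g @ P @ g) / (g @ P @ Q @ P @ g)   # optimal step for direction -P g
        x = x - tau * (P @ g)
    return x, n_iter


cases = [
    ("P = I (plain gradient)", np.eye(N)),
    ("P ~ Q^-1 (rough approx.)", np.linalg.inv(Q + 5.0 * np.eye(N))),
    ("P = Q^-1 (perfect)", np.linalg.inv(Q)),
]
for name, P in cases:
    x, k = preconditioned_gradient(P)
    print(f"{name:28s} iterations: {k:5d}  error: {np.linalg.norm(x - x_hat):.1e}")
```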

Preconditioned gradient: numerical results
[Figures: convergence for better and better approximations of the inverse Hessian]

Synthesis

Image restoration, deconvolution and other inverse problems

Three types of regularised inversion ...
- Quadratic penalties
- Convex non-quadratic penalties
- Constraints: positivity and support
... based on unconstrained quadratic optimisation

A basic component: quadratic criterion and Gaussian model
- Circulant approximation and computations based on FFT
- Numerical linear system solvers: matrix splitting
- Numerical quadratic optimizers: component-wise, gradient methods

Decreasing criterion ...
[Figure: criterion value versus iteration]

Various solutions ...
[Figures: true signal, observation, and restorations with quadratic penalty, Huber penalty, and constraints]