SIGNAL AND IMAGE RESTORATION: SOLVING ILL-POSED INVERSE PROBLEMS - ESTIMATING PARAMETERS
Rosemary Renaut, http://math.asu.edu/~rosie
Cornell, May 10, 2013

Outline
- Background
- Parameter Estimation: Simple 1D Example, Signal Restoration, Feature Extraction (Gradients/Edges)
- Tutorial: LS Solution
  - Standard Analysis by the SVD
  - Importance of the Basis and Noise
  - Picard Condition for Ill-Posed Problems
- Generalized Regularization
  - GSVD for Examining the Solution
  - Revealing the Noise in the GSVD Basis
- Applying to TV and the SB Algorithm
- Parameter Estimation for the TV
- Conclusions and Future Work

Ill-conditioned Least Squares: Tikhonov Regularization
Solve ill-conditioned $Ax \approx b$. Standard Tikhonov, with $L$ approximating a derivative operator:
$x(\lambda) = \arg\min_x \{ \tfrac{1}{2}\|Ax - b\|_2^2 + \tfrac{\lambda^2}{2}\|Lx\|_2^2 \}$
Provided $\mathrm{null}(L) \cap \mathrm{null}(A) = \{0\}$, $x(\lambda)$ solves the normal equations
$(A^T A + \lambda^2 L^T L)\, x(\lambda) = A^T b.$
- This is not good for preserving edges in solutions.
- Not good for extracting features in images.
- But multiple approaches exist for estimating the parameter $\lambda$.
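For concreteness, a minimal MATLAB sketch of this solve; the variable names and the choice of $L$ as a forward-difference operator are illustrative, not from the talk:

    % Hedged sketch: Tikhonov solution for a given lambda.
    % Assumes A (m x n) and b (m x 1) are already in the workspace.
    n = size(A, 2);
    L = diff(eye(n));                      % (n-1) x n forward-difference operator
    lambda = 0.1;                          % illustrative value
    % Stable form: augmented least squares via backslash
    x_lam = [A; lambda * L] \ [b; zeros(n - 1, 1)];
    % Equivalent normal equations (valid when null(L) and null(A) meet only in 0):
    % x_lam = (A'*A + lambda^2 * (L'*L)) \ (A'*b);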

An Example: Problem Phillips, 10% Noise

Problem Phillips, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Generalized Cross Validation.

Problem Phillips, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: L-curve.

Problem Phillips, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Unbiased Predictive Risk.

Problem Phillips, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Optimal $\chi^2$ - no parameter estimation!

An Example: Problem Shaw, 10% Noise

Problem Shaw, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Generalized Cross Validation.

Problem Shaw, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: L-curve.

Problem Shaw, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Unbiased Predictive Risk.

Problem Shaw, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Optimal $\chi^2$ - no parameter estimation!

Simple Example: Blurred Signal Restoration
$b(t) = \int_{-\pi}^{\pi} h(t - s)\, x(s)\, ds, \qquad h(s) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{s^2}{2\sigma^2}\right), \quad \sigma > 0.$
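A discretization sketch, assuming midpoint quadrature on a uniform grid; the grid size, kernel width, test signal, and noise construction are all illustrative:

    % Hedged sketch: discretize the Gaussian blur on [-pi, pi], add ~10% noise.
    n = 80; sigma = 0.5;
    t = linspace(-pi, pi, n)'; dt = t(2) - t(1);
    [T, S] = meshgrid(t, t);
    A = dt / (sigma * sqrt(2*pi)) * exp(-(T - S).^2 / (2 * sigma^2));
    x_true = double(abs(t) < 1);                  % piecewise-constant test signal
    b = A * x_true;
    b_noisy = b + 0.1 * (norm(b) / sqrt(n)) * randn(n, 1);   % ~10% noise level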

Example: Sampled Blurred Signal, 10% Noise

Sampled Signal, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Generalized Cross Validation.

Sampled Signal, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: L-curve.

Sampled Signal, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Unbiased Predictive Risk.

Sampled Signal, 10% Noise: Tikhonov regularized solutions $x(\lambda)$ and derivative $Lx$ for optimal $\lambda$. Figure: Optimal $\chi^2$ - no parameter estimation!

Observations
- Solutions are clearly parameter dependent.
- For smooth signals we may be able to recover reasonable solutions.
- For non-smooth signals we have a problem.
- Parameter estimation is more difficult in the non-smooth case.
- What is best? What adapts to other situations?

Alternative Regularization: Total Variation
Consider a general regularizer $R(x)$ suited to the properties of $x$:
$x(\lambda) = \arg\min_x \{ \tfrac{1}{2}\|Ax - b\|_2^2 + \tfrac{\lambda^2}{2} R(x) \}$
Suppose $R$ is the total variation of $x$ (other choices are possible), for example $p$-norm regularization with $p < 2$:
$x(\lambda) = \arg\min_x \{ \|Ax - b\|_2^2 + \lambda^2 \|Lx\|_p \}$
$p = 1$ approximates the total variation of the solution. How do we solve the optimization problem at large scale? One approach, iteratively reweighted norms (Rodriguez and Wohlberg), is suitable for $0 < p < 2$ but depends on two parameters.

Augmented Lagrangian - Split Bregman (Goldstein and Osher, 2009)
Introduce $d \approx Lx$ and let $R(x) = \tfrac{\lambda^2}{2}\|d - Lx\|_2^2 + \mu\|d\|_1$:
$(x, d)(\lambda, \mu) = \arg\min_{x,d} \{ \tfrac{1}{2}\|Ax - b\|_2^2 + \tfrac{\lambda^2}{2}\|d - Lx\|_2^2 + \mu\|d\|_1 \}$
Alternating minimization separates the step for $d$ from the step for $x$. Various versions of the iteration can be defined; fundamentally:
S1: $x^{(k+1)} = \arg\min_x \{ \tfrac{1}{2}\|Ax - b\|_2^2 + \tfrac{\lambda^2}{2}\|Lx - (d^{(k)} - g^{(k)})\|_2^2 \}$
S2: $d^{(k+1)} = \arg\min_d \{ \tfrac{\lambda^2}{2}\|d - (Lx^{(k+1)} + g^{(k)})\|_2^2 + \mu\|d\|_1 \}$
S3: $g^{(k+1)} = g^{(k)} + Lx^{(k+1)} - d^{(k+1)}.$
Notice the dimension increase of the problem.

Advantages of the Formulation
Update for $g$ (S3) updates the Lagrange multiplier:
$g^{(k+1)} = g^{(k)} + Lx^{(k+1)} - d^{(k+1)}.$
This is just a vector update.
Update for $d$ (S2):
$d = \arg\min_d \{ \mu\|d\|_1 + \tfrac{\lambda^2}{2}\|d - c\|_2^2 \}, \quad c = Lx + g$
$\;\; = \arg\min_d \{ \|d\|_1 + \tfrac{\gamma}{2}\|d - c\|_2^2 \}, \quad \gamma = \lambda^2/\mu.$
This is achieved using soft thresholding.
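A compact sketch of S1-S3, assuming A, b, L, lambda, and mu are already set; the Cholesky reuse and the iteration count are illustrative choices, not prescribed in the talk:

    % Hedged sketch of the split Bregman iteration.
    shrink = @(c, kappa) sign(c) .* max(abs(c) - kappa, 0);  % soft threshold
    p = size(L, 1);
    d = zeros(p, 1); g = zeros(p, 1);
    M = A' * A + lambda^2 * (L' * L);   % SPD when null spaces meet only in 0
    R = chol(M);                        % factor once, reuse in every S1 solve
    for k = 1:50
        rhs = A' * b + lambda^2 * (L' * (d - g));
        x = R \ (R' \ rhs);                        % S1: Tikhonov step
        d = shrink(L * x + g, mu / lambda^2);      % S2: soft thresholding
        g = g + L * x - d;                         % S3: multiplier update
    end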

Focus: Tikhonov Step of the Algorithm
S1: $x^{(k+1)} = \arg\min_x \{ \tfrac{1}{2}\|Ax - b\|_2^2 + \tfrac{\lambda^2}{2}\|Lx - (d^{(k)} - g^{(k)})\|_2^2 \}$
Update for $x$: introduce $h^{(k)} = d^{(k)} - g^{(k)}$. Then
$x^{(k+1)} = \arg\min_x \{ \tfrac{1}{2}\|Ax - b\|_2^2 + \tfrac{\lambda^2}{2}\|Lx - h^{(k)}\|_2^2 \}.$
This is a standard least squares update using a Tikhonov regularizer. It depends on a changing right-hand side and on the parameter $\lambda$.

The Tikhonov Update
Disadvantages of the formulation (update for $x$):
- A Tikhonov LS update at each step.
- Changing right-hand side.
- Regularization parameter $\lambda$: dependent on $k$?
- Threshold parameter $\mu$: dependent on $k$?
Advantages of the formulation:
- Extensive literature on Tikhonov LS problems.
- To determine stability, analyze the Tikhonov LS step of the algorithm.
- Understand the impact of the basis on the solution: Picard condition.
- Use the Generalized Singular Value Decomposition for analysis.

What are the issues?
1. Inverse problem: we need regularization.
2. For feature extraction we need more than Tikhonov regularization, e.g. TV.
3. The TV iteration solves many Tikhonov problems.
4. Both techniques are parameter dependent.
5. Moreover, the parameters are needed.
6. We need to fully understand Tikhonov and ill-posed problems.
7. Can we build black-box solvers?
8. Be careful.

Tutorial: LS Solutions
- Regularization of the solution $x$
- SVD for the LS solution
- Basis for the LS solution
- The Picard condition
- Modified regularization:
  - GSVD for the regularized solution
  - Changes the basis
  - Generalized Picard condition

Spectral Decomposition of the Solution: The SVD
Consider the general overdetermined discrete problem
$Ax = b, \quad A \in R^{m \times n},\; b \in R^m,\; x \in R^n,\; m \ge n.$
The singular value decomposition (SVD) of $A$ (full column rank)
$A = U\Sigma V^T = \sum_{i=1}^{n} \sigma_i u_i v_i^T, \qquad \Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_n),$
gives an expansion for the solution
$x = \sum_{i=1}^{n} \frac{u_i^T b}{\sigma_i} v_i.$
Here $u_i$, $v_i$ are the left and right singular vectors of $A$; the solution is a weighted linear combination of the basis vectors $v_i$.
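In MATLAB this expansion, and the truncated version used on the next slide, might be computed as follows; a sketch, with the truncation level chosen only for illustration:

    % Hedged sketch: naive and truncated SVD solutions.
    [U, Sig, V] = svd(A, 'econ');
    s = diag(Sig);
    beta = U' * b;                    % coefficients u_i' * b
    x_naive = V * (beta ./ s);        % full expansion; noise blows up small s
    k = 9;                            % truncation level, cf. the CP analysis below
    x_tsvd = V(:, 1:k) * (beta(1:k) ./ s(1:k));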

The Solutions with Truncated SVD: Problem shaw
Figure: Truncated SVD solutions; the data enter through the coefficients $u_i^T b$. On the left, no noise in $b$ (inverse crime); on the right, tiny noise of size $10^{-4}$. How is this impacted by the basis?

Some Basis Vectors for shaw
Figure: The first few basis vectors $v_i$. Are they accurate?

Left Singular Vectors and Basis Depend on the Kernel (Matrix A)
Figure: The first few left singular vectors $u_i$ and basis vectors $v_i$. Are the basis vectors accurate?

Use Matlab High Precision to Examine the SVD
- Matlab digits allows high precision; the standard is 32.
- The Symbolic Toolbox allows operations on high-precision variables via vpa.
- svd for vpa variables calculates the singular values symbolically, but not the singular vectors.
- Higher accuracy for the singular values generates higher-accuracy singular vectors.
- Solutions with high precision can take advantage of Matlab's Symbolic Toolbox.
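As a sketch of that workflow (Symbolic Math Toolbox required; the digit count is an illustrative choice):

    % Hedged sketch: high-precision singular values via the symbolic toolbox.
    digits(64);                 % working precision in decimal digits
    s_vpa = svd(vpa(A));        % singular values, computed symbolically
    % As noted above, this returns values only; recovering higher-accuracy
    % singular vectors requires extra work beyond this call.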

Left Singular Vectors and Basis Calculated in High Precision
Figure: The first few left singular vectors $u_i$ and basis vectors $v_i$. Higher precision preserves the frequency content of the basis. How many can we use in the solution for $x$? How does an inaccurate basis impact regularized solutions?

The Truncated Solutions (Noise-Free Data b): Inverse Crime
Figure: Truncated SVD solutions, standard-precision $u_i^T b$. Error in the basis contaminates the solution.

The Truncated Solutions (Noise-Free Data b): Inverse Crime
Figure: Truncated SVD solutions, VPA calculation of $u_i^T b$. A larger number of accurate terms.

Technique to Detect the True Basis: the Power Spectrum for Detecting White Noise (a Time Series Analysis Technique)
Suppose a given vector $y$ is a time series indexed by position, i.e. by index $i$.
Diagnostic 1: Does the histogram of the entries of $y$ look consistent with $y_i \sim N(0, 1)$, i.e. independent normally distributed with mean 0 and variance 1? It is not practical to automatically look at a histogram and make an assessment.
Diagnostic 2: Test the expectation that the $y_i$ are selected from a white noise time series. Take the Fourier transform of $y$ and form the cumulative periodogram $z$ from the power spectrum $c$:
$c_j = |(\mathrm{dft}(y))_j|^2, \qquad z_j = \frac{\sum_{i=1}^{j} c_i}{\sum_{i=1}^{q} c_i}, \quad j = 1, \dots, q.$
Automatic test: is the line $(j/q, z_j)$ close to a straight line with slope 1?
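A sketch of Diagnostic 2, assuming y is a real column vector; the 1.36/sqrt(q) cutoff is the standard 5% Kolmogorov-Smirnov approximation:

    % Hedged sketch: cumulative periodogram white-noise diagnostic.
    n = numel(y);
    q = floor(n / 2);                   % frequencies up to Nyquist
    c = abs(fft(y)).^2;
    c = c(2:q + 1);                     % power spectrum, DC term dropped
    z = cumsum(c) / sum(c);             % cumulative periodogram
    dev = max(abs(z - (1:q)' / q));     % deviation from the white-noise line
    is_white = dev < 1.36 / sqrt(q);    % approximate KS test at the 5% level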

Cumulative Periodogram for the Left Singular Vectors
Figure: Standard precision on the left, high precision on the right. On the left more vectors are close to white than on the right. Vectors are white if the CP is close to the diagonal of the plot: low-frequency vectors lie above the diagonal, high-frequency vectors below, and white noise follows the diagonal.

Cumulative Periodogram for the Basis Vectors
Figure: Standard precision on the left, high precision on the right. On the left more vectors are close to white than on the right; if you count, there are 9 vectors with true frequency content on the left.

Measure Deviation from the Straight Line: Basis Vectors
Figure: Testing the standard-precision vectors for white noise. Calculate the cumulative periodogram and measure the deviation from the white-noise line, or assess the proportion of the vector outside the Kolmogorov-Smirnov bounds at the 5% level for white noise. In this case the test suggests that about 9 vectors are noise free. We cannot expect to use more than 9 vectors in the expansion for $x$; additional terms are contaminated by noise, independent of the noise in $b$.

Standard Analysis
The discrete Picard condition assesses the impact of noise in the data. Recall
$x = \sum_{i=1}^{n} \frac{u_i^T b}{\sigma_i} v_i.$
Here $|u_i^T b| / \sigma_i = O(1)$: the ratios are not large, but are the values correct? Considering only the discrete Picard condition does not tell us whether the expansion for the solution is correct.

Observations
- Even when committing the inverse crime, we will not achieve the solution if we cannot approximate the basis correctly.
- We need all the basis vectors that contain high-frequency terms in order to approximate a solution with high-frequency components, e.g. edges.
- Reminder: this is independent of the data, but it is an indication of an ill-posed problem. In this case the contamination exhibits itself in the decomposition of the matrix $A$.
- To do any uncertainty estimation one must understand the noise throughout the model and the data.
- Why is this relevant to TV regularization?

Tikhonov Regularization is Spectral Filtering
$x_{\mathrm{Tik}} = \sum_{i=1}^{n} \phi_i \frac{u_i^T b}{\sigma_i} v_i$
For Tikhonov regularization $\phi_i = \frac{\sigma_i^2}{\sigma_i^2 + \lambda^2}$, $i = 1, \dots, n$, where $\lambda$ is the regularization parameter, and the solution is
$x_{\mathrm{Tik}}(\lambda) = \arg\min_x \{ \|b - Ax\|^2 + \lambda^2 \|x\|^2 \}.$
More general formulation:
$x_{\mathrm{Tik}}(\lambda) = \arg\min_x \{ \|b - Ax\|_W^2 + \lambda^2 \|Lx\|^2 \} = \arg\min_x J(x)$
$\chi^2$ distribution: find $\lambda$ such that $E(J(x(\lambda))) = m + p - n$. This yields an unregularized estimate for $x$.
It is exactly Tikhonov regularization that is used in the SB algorithm, but for a change of basis determined by $L$.
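A sketch of the filtered expansion for the $L = I$ case (names illustrative; lambda assumed set):

    % Hedged sketch: Tikhonov solution via SVD filter factors, L = I.
    [U, Sig, V] = svd(A, 'econ');
    s = diag(Sig);
    phi = s.^2 ./ (s.^2 + lambda^2);        % Tikhonov filter factors
    x_tik = V * (phi .* (U' * b) ./ s);     % filtered spectral expansion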

The Generalized Singular Value Decomposition
Introduce a generalization of the SVD to obtain an expansion for
$x(\lambda) = \arg\min_x \{ \|Ax - b\|^2 + \lambda^2 \|L(x - x_0)\|^2 \}.$
Lemma (GSVD). Assume invertibility and $m \ge n \ge p$. There exist unitary matrices $U \in R^{m \times m}$, $V \in R^{p \times p}$, and a nonsingular matrix $Z \in R^{n \times n}$ such that
$A = U \begin{bmatrix} \Upsilon \\ 0_{(m-n) \times n} \end{bmatrix} Z^T, \qquad L = V [M,\; 0_{p \times (n-p)}] Z^T,$
with $\Upsilon = \mathrm{diag}(\nu_1, \dots, \nu_p, 1, \dots, 1) \in R^{n \times n}$, $M = \mathrm{diag}(\mu_1, \dots, \mu_p) \in R^{p \times p}$,
$0 \le \nu_1 \le \dots \le \nu_p \le 1, \quad 1 \ge \mu_1 \ge \dots \ge \mu_p > 0, \quad \nu_i^2 + \mu_i^2 = 1, \; i = 1, \dots, p.$
(The rectangular factors above are the matrices containing $\Upsilon$ and $M$.) The generalized singular values are $\rho_i = \nu_i / \mu_i$.
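MATLAB's built-in gsvd uses related but not identical conventions to the lemma; a sketch:

    % Hedged sketch: MATLAB's gsvd returns A = U*C*X', L = V*S*X',
    % with C'*C + S'*S = I; the ordering conventions differ from the lemma.
    [U, V, X, C, S] = gsvd(A, L);
    rho = gsvd(A, L);    % single-output form: the generalized singular values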

Solution of the Generalized Problem using the GSVD
For $h = 0$:
$x(\lambda) = \sum_{i=1}^{p} \frac{\nu_i}{\nu_i^2 + \lambda^2 \mu_i^2} (u_i^T b)\, z_i + \sum_{i=p+1}^{n} (u_i^T b)\, z_i,$
where $z_i$ is the $i$th column of $(Z^T)^{-1}$. Equivalently, with filter factor $\phi_i$:
$x(\lambda) = \sum_{i=1}^{p} \phi_i \frac{u_i^T b}{\nu_i} z_i + \sum_{i=p+1}^{n} (u_i^T b)\, z_i, \qquad Lx(\lambda) = \sum_{i=1}^{p} \phi_i \frac{u_i^T b}{\rho_i} v_i, \qquad \phi_i = \frac{\rho_i^2}{\rho_i^2 + \lambda^2}.$
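Assuming factors ordered as in the lemma (U and Z available, with p-vectors nu and mu already extracted; all names illustrative), the expansion might be evaluated as:

    % Hedged sketch: x(lambda) from the GSVD expansion, h = 0.
    Zi = inv(Z');                          % columns z_i of (Z')^{-1}
    beta = U(:, 1:n)' * b;                 % coefficients u_i' * b
    f = nu ./ (nu.^2 + lambda^2 * mu.^2);  % filtered weights for i <= p
    x_lam = Zi(:, 1:p) * (f .* beta(1:p)) + Zi(:, p+1:n) * beta(p+1:n);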

Does Noise Enter the GSVD Basis?
Form the truncated matrix $A_k$ using only $k$ basis vectors $v_i$.
Figure: Contrast the GSVD bases $u$ (left) and $z$ (right); trade-off between $U$ and $Z$.

The TIK Update of the TV Algorithm
Update for $x$ using the GSVD (basis $Z$):
$x^{(k+1)} = \sum_{i=1}^{p} \left( \frac{\nu_i\, u_i^T b}{\nu_i^2 + \lambda^2 \mu_i^2} + \frac{\lambda^2 \mu_i\, v_i^T h^{(k)}}{\nu_i^2 + \lambda^2 \mu_i^2} \right) z_i + \sum_{i=p+1}^{n} (u_i^T b)\, z_i$
A weighted combination of basis vectors $z_i$, with weights
(i) $\frac{\nu_i^2}{\nu_i^2 + \lambda^2 \mu_i^2} \cdot \frac{u_i^T b}{\nu_i}$, (ii) $\frac{\lambda^2 \mu_i^2}{\nu_i^2 + \lambda^2 \mu_i^2} \cdot \frac{v_i^T h^{(k)}}{\mu_i}$.
Notice that (i) is fixed by $b$ while (ii) depends on the updates $h^{(k)}$, i.e. (i) is iteration independent. If (i) dominates (ii), the solution will converge slowly or not at all; $\lambda$ impacts the solution and must not over-damp (ii).

Update of the Mapped Data
$x^{(k+1)} = \sum_{i=1}^{p} \left( \phi_i \frac{u_i^T b}{\nu_i} + (1 - \phi_i) \frac{v_i^T h^{(k)}}{\mu_i} \right) z_i + \sum_{i=p+1}^{n} (u_i^T b)\, z_i$
$Lx^{(k+1)} = \sum_{i=1}^{p} \left( \phi_i \frac{u_i^T b}{\rho_i} + (1 - \phi_i)\, v_i^T h^{(k)} \right) v_i.$
Now the stability depends on the coefficients
$\frac{\phi_i}{\rho_i}, \quad 1 - \phi_i, \quad \frac{\phi_i}{\nu_i}, \quad \frac{1 - \phi_i}{\mu_i}$
relative to the inner products $u_i^T b$ and $v_i^T h^{(k)}$.

Examining the Weights as λ Increases
Coefficients $u_i^T b$ and $v_i^T h$ for $\lambda = 0.001, 0.1$, and $10$, at $k = 1$, $10$, and the final iteration.

Heuristic Observations
- It is important to analyze the components of the solution. This leads to a generalized Picard condition analysis, more than can be given here.
- The basis is important.
- Regularization parameters should be adjusted dynamically.
- Theoretical results on parameter estimation apply: we can use $\chi^2$, UPRE, etc.
- It is important to have information on the noise levels.
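For the $L = I$ case, for instance, UPRE can be evaluated cheaply from the SVD; a sketch assuming the noise variance sigma2 is known (all names illustrative, bracket for the minimizer chosen arbitrarily):

    % Hedged sketch: unbiased predictive risk estimator, Tikhonov with L = I.
    [U, Sig, ~] = svd(A, 'econ');
    s = diag(Sig); beta = U' * b; m = size(A, 1);
    b_perp2 = norm(b)^2 - norm(beta)^2;        % residual outside range(A)
    upre = @(lam) sum(((lam^2 ./ (s.^2 + lam^2)) .* beta).^2) + b_perp2 ...
        + 2 * sigma2 * sum(s.^2 ./ (s.^2 + lam^2)) - m * sigma2;
    lam_opt = fminbnd(upre, 1e-6, 1e2);        % minimize over a bracket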

Theoretical Results: Using Unbiased Predictive Risk for the SB Tikhonov Step
Lemma. Suppose the noise in $h^{(k)}$ is stochastic, and both the data fit $Ax - b$ and the derivative data fit $Lx - h$ are weighted by their inverse covariance matrices for normally distributed noise in $b$ and $h$; then the optimal choice for $\lambda$ at all steps is $\lambda = 1$.
Remark. Can we expect $h^{(k)}$ to be stochastic?
Lemma. Suppose $h^{(k+1)}$ is regarded as deterministic; then UPRE applied to find the optimal $\lambda$ at each step leads to a different optimum at each step, namely one that depends on $h$.
Remark. Because $h$ changes at each step, the optimal $\lambda$ from UPRE changes with each iteration; $\lambda$ varies over all steps.

Example Solutions: 2D, Various Examples (SNR in the Figure Titles)
Figures: four 2D restoration examples, with the SNR given in each figure title.

Further Observations and Future Work
- The results demonstrate that basic analysis of the problem is worthwhile.
- Extend standard parameter estimation from LS.
- The overhead of finding the optimal $\lambda$ for the first step is reasonable.
- With the stochastic interpretation, use a fixed $\lambda = 1$ after the optimal $\lambda$ for the first step is found.
- Practically, use LSQR algorithms rather than the GSVD; this changes the basis to the Krylov basis. Or use a randomized SVD to generate a randomized GSVD.
- Noise in the basis is relevant for LSQR/CGLS iterative algorithms.
- Convergence testing is based on $h$.
- Initial work; much to do.