Preconditioning Noisy, Ill-Conditioned Linear Systems

James G. Nagy
Emory University, Atlanta, GA

Outline
1. The Basic Problem
2. Regularization / Iterative Methods
3. Preconditioning
4. Example: Image Restoration
5. Summary
Basic Problem

Linear system of equations b = A x + e, where
- A, b are known
- A is large, structured, ill-conditioned

Goal: compute an approximation of x.

Applications: ill-posed inverse problems
- geomagnetic prospecting
- tomography
- image restoration

For image restoration:
b = observed image, A = blurring matrix (structured), e = noise, x = true image

Example
Basic Problem: Properties

Computational difficulties are revealed through the SVD. Let A = U Σ V^T, where

Σ = diag(σ_1, σ_2, ..., σ_n),  σ_1 ≥ σ_2 ≥ ... ≥ σ_n ≥ 0,  U^T U = I,  V^T V = I.

For ill-posed inverse problems:
- σ_1 ≈ 1, and the small singular values cluster at 0
- the singular vectors associated with small singular values are highly oscillating
Basic Problem: Properties

Inverse solution for noisy, ill-posed problems. If A = U Σ V^T, then

x̂ = A^{-1}(b + e) = V Σ^{-1} U^T (b + e) = ∑_{i=1}^n [u_i^T (b + e) / σ_i] v_i
  = ∑_{i=1}^n (u_i^T b / σ_i) v_i + ∑_{i=1}^n (u_i^T e / σ_i) v_i
  = x + error

The noise components are divided by the small σ_i, so the error term swamps x.

Example
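The noise amplification above is easy to see numerically. A minimal sketch, using a Hilbert test matrix and a noise level of my own choosing (not from the talk):

```python
import numpy as np

# Severely ill-conditioned test matrix: the 12x12 Hilbert matrix.
n = 12
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

x_true = np.ones(n)
b = A @ x_true
rng = np.random.default_rng(0)
e = 1e-8 * rng.standard_normal(n)        # tiny noise

U, s, Vt = np.linalg.svd(A)
# x_hat = sum_i (u_i^T (b + e) / sigma_i) v_i
x_hat = Vt.T @ ((U.T @ (b + e)) / s)

rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(rel_err)   # enormous, even though ||e|| is about 1e-8
```

Division by the smallest σ_i (here around 1e-16) turns 1e-8 noise into an error many orders of magnitude larger than the solution itself.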
Regularization

Basic idea: filter out the effects of small singular values.

x_reg = ∑_{i=1}^n φ_i (u_i^T b / σ_i) v_i

where the filter factors satisfy
φ_i ≈ 1 if σ_i is large,  φ_i ≈ 0 if σ_i is small.
Regularization

Some regularization methods:

1. Truncated SVD:  x_tsvd = ∑_{i=1}^k (u_i^T b / σ_i) v_i

2. Tikhonov:  x_tik = ∑_{i=1}^n [σ_i^2 / (σ_i^2 + α^2)] (u_i^T b / σ_i) v_i

3. Wiener:  x_wien = ∑_{i=1}^n [δ_i σ_i^2 / (δ_i σ_i^2 + γ_i^2)] (u_i^T b / σ_i) v_i
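The TSVD and Tikhonov filters can be applied directly to the SVD expansion. A hedged sketch; the test matrix, truncation index k, and parameter α are illustrative choices of mine, not from the talk:

```python
import numpy as np

n = 12
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x_true = np.ones(n)
rng = np.random.default_rng(1)
b = A @ x_true + 1e-8 * rng.standard_normal(n)

U, s, Vt = np.linalg.svd(A)
coef = U.T @ b / s                       # unfiltered expansion coefficients

k = 6                                    # TSVD truncation index (assumed)
phi_tsvd = (np.arange(n) < k).astype(float)
x_tsvd = Vt.T @ (phi_tsvd * coef)

alpha = 1e-6                             # Tikhonov parameter (assumed)
phi_tik = s**2 / (s**2 + alpha**2)
x_tik = Vt.T @ (phi_tik * coef)

err = lambda x: np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(err(x_tsvd), err(x_tik))
```

Both filtered solutions are vastly more accurate than the naive inverse solution `Vt.T @ coef`, because the terms with tiny σ_i are damped out.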
Iterative Regularization

Basic idea: use an iterative method (e.g., conjugate gradients) and terminate the iteration before theoretical convergence:
- early iterations reconstruct the solution
- later iterations reconstruct the noise

Some important methods:
- CGLS, LSQR, GMRES
- MR2 (Hanke, 95)
- MRNSD (Kaufman, 93; N., Strakos)
Iterative Regularization

Efficient for large problems, provided:
1. Multiplication with A is not expensive.
   Image restoration: use FFTs.
2. Convergence is rapid enough.
   CGLS, LSQR, GMRES, MR2: often fast, especially for severely ill-posed, noisy problems.
   MRNSD: based on steepest descent, typically converges very slowly.

Example
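Semiconvergence can be demonstrated with a bare-bones CGLS implementation. This is generic textbook CGLS applied to a synthetic 1-D Gaussian blur of my own construction, not code from the talk:

```python
import numpy as np

def cgls(A, b, iters):
    """Conjugate gradients on the normal equations A^T A x = A^T b."""
    x = np.zeros(A.shape[1])
    r = b.copy()
    s = A.T @ r
    p = s.copy()
    gamma = s @ s
    for _ in range(iters):
        q = A @ p
        alpha = gamma / (q @ q)
        x = x + alpha * p
        r = r - alpha * q
        s = A.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x

n = 64
t = np.exp(-0.1 * np.arange(n) ** 2)       # Gaussian blur profile
A = np.array([[t[abs(i - j)] for j in range(n)] for i in range(n)])
x_true = np.sin(np.linspace(0.0, np.pi, n))
rng = np.random.default_rng(2)
b = A @ x_true + 1e-4 * rng.standard_normal(n)

# relative errors after early termination vs. iterating "to convergence"
errs = [np.linalg.norm(cgls(A, b, k) - x_true) / np.linalg.norm(x_true)
        for k in (5, 500)]
print(errs)
```

Early iterations pick up the smooth (large singular value) components of the solution; the long run keeps going and reconstructs the noise, so the early-terminated solution is the more accurate one.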
Preconditioning

Purposes of preconditioning:
1. Accelerate convergence.
   Apply the iterative method to P^{-1} A x = P^{-1} b.
2. Enforce a regularization constraint on the solution. (Hanke, 92; Hansen, 98)
   Apply the iterative method to A L^{-1} (L x) = b.
   Without preconditioning we minimize ||x||_2; in this case we minimize ||L x||_2.
Preconditioning for Regularization

Basic idea: find a matrix L to enforce a smoothness constraint,

min ||L x||_2

Typically L approximates a derivative operator.
Preconditioning for Speed

Typical approach for A x = b: find a matrix P so that P^{-1} A ≈ I.
Ideal choice: P = A; then we converge in one iteration to x = A^{-1} b.

For ill-conditioned, noisy problems, that inverse solution is corrupted with noise.

Ideal regularized preconditioner (Hanke, N., Plemmons, 93): if A = U Σ V^T, take

P = U Σ_k V^T = U diag(σ_1, ..., σ_k, 1, ..., 1) V^T
Preconditioning for Speed

Notice that the preconditioned system is

P^{-1} A = (U Σ_k V^T)^{-1} (U Σ V^T) = V Σ_k^{-1} Σ V^T = V Δ V^T,

where Δ = Σ_k^{-1} Σ = diag(1, ..., 1, σ_{k+1}, ..., σ_n). That is:
- large (good) singular values clustered at 1
- small (bad) singular values not clustered
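The clustering claim is easy to check numerically. A sketch using an illustrative Hilbert test matrix (the matrix and k are my choices, not from the talk):

```python
import numpy as np

n, k = 12, 5
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
U, s, Vt = np.linalg.svd(A)

# P = U diag(sigma_1, ..., sigma_k, 1, ..., 1) V^T
s_k = np.where(np.arange(n) < k, s, 1.0)
P = U @ np.diag(s_k) @ Vt

PinvA = np.linalg.solve(P, A)
sv = np.linalg.svd(PinvA, compute_uv=False)
print(sv)   # k values equal to 1, followed by the untouched small sigma_i
```

Note that P only inverts the k large singular values, so applying P^{-1} is well conditioned, while the small (noise-dominated) directions are left alone.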
Preconditioning for Speed

Remaining questions:
1. How to choose the truncation index k?
   Use regularization parameter choice methods, e.g., GCV, L-curve, Picard condition.
2. We can't compute the SVD, so now what?
   Use an SVD approximation.
Preconditioning for Speed

An SVD approximation (Van Loan and Pitsianis, 93). Decompose A as

A = C_1 ⊗ D_1 + C_2 ⊗ D_2 + ... + C_k ⊗ D_k,

where C_1 ⊗ D_1 = argmin ||A − C ⊗ D||_F. Then:
- choose structured (or sparse) U and V
- let Σ = argmin ||A − U Σ V^T||_F over diagonal Σ; that is,

  Σ = diag(U^T A V) = diag(U^T (∑_{i=1}^k C_i ⊗ D_i) V)
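The fact that Σ = diag(U^T A V) is the Frobenius-optimal diagonal for fixed orthonormal U and V can be verified on a small random example. A sketch (the random U, V here stand in for the structured transforms used in practice):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
A = rng.standard_normal((n, n))

# any orthonormal U, V (here from QR of random matrices; in practice they
# would be structured, e.g. FFT/DCT-based, or singular vectors of C_1 (x) D_1)
U = np.linalg.qr(rng.standard_normal((n, n)))[0]
V = np.linalg.qr(rng.standard_normal((n, n)))[0]

sig = np.diag(U.T @ A @ V)               # optimal diagonal entries
best = np.linalg.norm(A - U @ np.diag(sig) @ V.T)

# any perturbed diagonal does strictly worse
worse = np.linalg.norm(A - U @ np.diag(sig + 0.1) @ V.T)
print(best, worse)
```

Since U and V are orthonormal, ||A − U Σ V^T||_F = ||U^T A V − Σ||_F, and the best diagonal Σ simply copies the diagonal of U^T A V.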
Preconditioning for Speed

Choices of U and V depend on the problem (application). Since

A = C_1 ⊗ D_1 + C_2 ⊗ D_2 + ... + C_k ⊗ D_k,  with  C_1 ⊗ D_1 = argmin ||A − C ⊗ D||_F,

we might use the singular vectors of C_1 ⊗ D_1.

For image restoration, we also use:
- Fourier transforms (FFTs)
- discrete cosine transforms (DCTs)
Example: Image Restoration 1. Matrix Structure 2. Efficiently computing SVD approximation
Matrix Structure in Image Restoration

First, how do we get the matrix A?

Using linear algebra notation, the i-th column of A is

A e_i = [a_1 ... a_i ... a_n] e_i = a_i.

In an imaging system:
e_i = point source
A e_i = point spread function (PSF)
Matrix Structure in Image Restoration

[Figure: a point source (left) and the corresponding PSF (right)]
Matrix Structure in Image Restoration

Spatially invariant PSF implies:

[Figure: point sources e_i, e_j, e_k and responses A e_i, A e_j, A e_k, identical up to a shift]
Matrix Structure in Image Restoration

That is, spatially invariant implies:
- each column of A is identical, modulo a shift
- one PSF is enough to fully describe A
- A has Toeplitz structure
Matrix Structure in Image Restoration

e_5 (a point source at the center pixel) blurs to the PSF:

          [ p11 p12 p13 ]
A e_5  =  [ p21 p22 p23 ]   (as a 3×3 image)
          [ p31 p32 p33 ]

and A is block Toeplitz with Toeplitz blocks (BTTB):

    [ p22 p21  0   p12 p11  0    0   0   0  ]
    [ p23 p22 p21  p13 p12 p11   0   0   0  ]
    [  0  p23 p22   0  p13 p12   0   0   0  ]
    [ p32 p31  0   p22 p21  0   p12 p11  0  ]
A = [ p33 p32 p31  p23 p22 p21  p13 p12 p11 ]
    [  0  p33 p32   0  p23 p22   0  p13 p12 ]
    [  0   0   0   p32 p31  0   p22 p21  0  ]
    [  0   0   0   p33 p32 p31  p23 p22 p21 ]
    [  0   0   0    0  p33 p32   0  p23 p22 ]
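One way to see this structure is to build A column by column, by blurring each point source. A hedged numpy sketch, assuming zero boundary conditions and a made-up 3×3 PSF:

```python
import numpy as np

P = np.arange(1.0, 10.0).reshape(3, 3)    # entries p11..p33, illustrative PSF
m = n = 3

def blur_column(img, P):
    """Zero-boundary blur of a 3x3 image: convolution with the PSF,
    computed as correlation with the flipped PSF on a zero-padded image."""
    padded = np.pad(img, 1)
    out = np.zeros_like(img)
    for r in range(m):
        for c in range(n):
            out[r, c] = np.sum(P[::-1, ::-1] * padded[r:r + 3, c:c + 3])
    return out

# column j of A = blurred point source at pixel j (row-major ordering)
A = np.column_stack([
    blur_column(np.eye(m * n)[j].reshape(m, n), P).ravel()
    for j in range(m * n)
])

print(A[:, 4])   # equals vec(P): blurring the center point source gives the PSF
```

Each column is the same PSF shifted to a different pixel (with entries falling off the edge under zero boundary conditions), which is exactly the BTTB pattern shown above.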
Matrix Structure in Image Restoration

Matrix summary:

Boundary condition                   Matrix structure
zero                                 BTTB
periodic                             BCCB (use FFT)
reflexive (strongly symmetric PSF)   BTTB + BTHB + BHTB + BHHB (use DCT)

B = block, T = Toeplitz, C = circulant, H = Hankel
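For the periodic case, "use FFT" means the BCCB matrix is diagonalized by the 2-D DFT, so a matrix-vector product A x reduces to pointwise multiplication in the Fourier domain. A sketch (the image size and PSF are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
P = np.zeros((n, n))
P[:3, :3] = rng.random((3, 3))            # small PSF embedded in an n x n array
P = np.roll(P, (-1, -1), axis=(0, 1))     # move the PSF center to pixel (0, 0)

X = rng.random((n, n))                    # "true image"
S = np.fft.fft2(P)                        # eigenvalues of the BCCB matrix
B = np.real(np.fft.ifft2(S * np.fft.fft2(X)))   # blurred image A x, via FFTs

# check one pixel against the direct (slow) circular convolution sum
direct = sum(P[i, j] * X[(0 - i) % n, (0 - j) % n]
             for i in range(n) for j in range(n))
print(abs(B[0, 0] - direct))
```

This is why multiplication with A is cheap (O(N log N)) in image restoration, as noted under the iterative methods above.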
Matrix Structure in Image Restoration

For a separable PSF we get P = c d^T, i.e. p_ij = c_i d_j:

[ p11 p12 p13 ]   [ c1 d1  c1 d2  c1 d3 ]
[ p21 p22 p23 ] = [ c2 d1  c2 d2  c2 d3 ]
[ p31 p32 p33 ]   [ c3 d1  c3 d2  c3 d3 ]

and the blurring matrix becomes a Kronecker product, A = C ⊗ D, where

    [ c2 c1 0  ]        [ d2 d1 0  ]
C = [ c3 c2 c1 ]    D = [ d3 d2 d1 ]
    [ 0  c3 c2 ]        [ 0  d3 d2 ]

are Toeplitz matrices built from c and d.
Matrix Structure in Image Restoration

If the PSF is not separable, we can still compute

P = ∑_{i=1}^r c_i d_i^T   (a sum of rank-1 matrices)

and therefore get

A = ∑_{i=1}^r C_i ⊗ D_i   (a sum of Kronecker products).

In fact, we can get optimal decompositions. (Kamm, N.; N., Ng, Perrone, '03)
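A hedged sketch of how such a splitting might be formed: take the rank-1 terms from the SVD of the PSF array and expand each into a Kronecker pair of Toeplitz factors. The Toeplitz construction below assumes zero boundary conditions and a 3×3 PSF; this illustrates the idea, not the optimal weighted decomposition of the cited papers:

```python
import numpy as np

rng = np.random.default_rng(5)
P = rng.random((3, 3))                    # generic (non-separable) 3x3 PSF

# SVD of the PSF array gives P = sum_i c_i d_i^T
Uc, sv, Vd = np.linalg.svd(P)
terms = [(np.sqrt(s) * Uc[:, i], np.sqrt(s) * Vd[i, :])
         for i, s in enumerate(sv)]
P_rebuilt = sum(np.outer(c, d) for c, d in terms)

def toeplitz3(v):
    """3x3 Toeplitz blur factor from a length-3 vector (v[1] on the diagonal)."""
    return np.array([[v[1], v[0], 0.0],
                     [v[2], v[1], v[0]],
                     [0.0,  v[2], v[1]]])

# A = sum_i C_i (x) D_i
A = sum(np.kron(toeplitz3(c), toeplitz3(d)) for c, d in terms)
print(np.linalg.norm(P - P_rebuilt))
```

As in the separable case, the center column of the assembled A reproduces the PSF, and each Kronecker term costs only two small Toeplitz factors to store.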
SVD Approximation for Image Restoration

Use A ≈ U Σ V^T, where:

- If F (∑ C_i ⊗ D_i) F* is best:  U = V = F*,  Σ = diag(F (∑ C_i ⊗ D_i) F*)
- If C (∑ C_i ⊗ D_i) C^T is best:  U = V = C^T,  Σ = diag(C (∑ C_i ⊗ D_i) C^T)
- If (U_c ⊗ U_d)^T (∑ C_i ⊗ D_i) (V_c ⊗ V_d) is best:  U = U_c ⊗ U_d,  V = V_c ⊗ V_d,
  Σ = diag((U_c ⊗ U_d)^T (∑ C_i ⊗ D_i) (V_c ⊗ V_d))

Example
The End

- Preconditioning ill-posed problems is difficult, but possible.
- Can build an approximate SVD from Kronecker product approximations.
- Can implement efficiently for image restoration.

Matlab software: RestoreTools (Lee, N., Perrone, '02)
An object-oriented approach for image restoration.
http://www.mathcs.emory.edu/~nagy/restoretools/

Related software for ill-posed problems (Hansen, Jacobsen):
http://www.imm.dtu.dk/~pch/regutools/