Statistically-Based Regularization Parameter Estimation for Large Scale Problems
1 Statistically-Based Regularization Parameter Estimation for Large Scale Problems
Rosemary Renaut, joint work with Jodi Mead and Iveta Hnetynkova
March 1, 2010
National Science Foundation: Division of Computational Mathematics
2 Outline
3 Signal/Image Restoration: Integral Model of Signal Degradation
b(t) = ∫ K(t, s) x(s) ds
K(t, s) describes the blur of the signal.
Convolutional (shift-invariant) model: K(t, s) = K(t − s) is the point spread function (PSF).
Typically sampling includes noise e(t), so the model is
b(t) = ∫ K(t − s) x(s) ds + e(t)
Discrete model: given discrete samples b, find samples x of x(s). Let A discretize K, assumed known; the model is
b = Ax + e.
Naïvely invert the system to find x!
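The discrete model and the danger of naive inversion can be checked directly. A minimal NumPy sketch, not from the talk: the signal size, Gaussian width, and noise level are arbitrary illustrative choices.

```python
import numpy as np

def gaussian_blur_matrix(n, sigma):
    # Discretize an invariant kernel K(t - s): a Gaussian PSF with rows normalized.
    t = np.arange(n)
    A = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2.0 * sigma ** 2))
    return A / A.sum(axis=1, keepdims=True)

n = 64
A = gaussian_blur_matrix(n, sigma=3.0)
x_true = np.zeros(n)
x_true[20:40] = 1.0                                # piecewise-constant test signal
rng = np.random.default_rng(0)
b = A @ x_true + 1e-3 * rng.standard_normal(n)     # b = A x + e

x_naive = np.linalg.solve(A, b)                    # "naive" inversion
print(np.linalg.cond(A))                           # huge: A is severely ill-conditioned
print(np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true))
```

Even noise at the 0.1% level is amplified enormously, because the small singular values of the blur matrix divide the noise.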
5 Example: 1-D Original and Blurred Noisy Signal
Original signal x. Blurred and noisy signal b, Gaussian PSF.
6 The Solution: Regularization is Needed
Naïve solution. A regularized solution.
7 Two-Dimensional Image Deblurring
How do we obtain the deblurred image?
9 Least Squares for Ax = b: A Quick Review
Consider discrete systems: A ∈ R^{m×n}, b ∈ R^m, x ∈ R^n, with b = Ax + e.
Classical approach: linear least squares (A full rank)
x_LS = argmin_x ||Ax − b||_2^2
Difficulty: x_LS is sensitive to changes in the right-hand side b when A is ill-conditioned. For convolutional models the system is ill-posed.
13 Introduce Regularization to Pick a Solution
Weighted fidelity with regularization:
x_RLS(λ) = argmin_x { ||b − Ax||^2_{W_b} + λ^2 R(x) }
W_b is a weighting matrix.
R(x) is a regularization term.
λ is a regularization parameter, which is unknown.
The solution x_RLS(λ) depends on λ, on the regularization operator R, and on the weighting matrix W_b.
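For the simplest choice R(x) = ||x||_2^2 (standard Tikhonov), the regularized solution has a closed form via the normal equations. A hedged sketch, not code from the talk; the default W_b = I and the test sizes are illustrative assumptions.

```python
import numpy as np

def regularized_ls(A, b, lam, W_b=None):
    """x_RLS(lam) = argmin_x ||b - A x||^2_{W_b} + lam^2 ||x||^2  (R(x) = ||x||_2^2)."""
    m, n = A.shape
    W_b = np.eye(m) if W_b is None else W_b
    return np.linalg.solve(A.T @ W_b @ A + lam ** 2 * np.eye(n),
                           A.T @ W_b @ b)

# Larger lam shrinks the solution toward zero.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
norms = [np.linalg.norm(regularized_ls(A, b, lam)) for lam in (0.1, 1.0, 10.0)]
print(norms)   # decreasing in lam
```

The monotone shrinkage of ||x_RLS(λ)|| as λ grows is exactly the trade-off the parameter-choice methods below must resolve.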
21 The Weighting Matrix: Some Assumptions for Multiple Data Measurements
Given multiple measurements of the data b:
Usually the error e in b is an m-vector of random measurement errors with mean 0 and positive definite covariance matrix C_b = E(ee^T).
For uncorrelated measurements, C_b is the diagonal matrix of the variances of the errors (colored noise).
For white noise, C_b = σ^2 I.
Weighting by W_b = C_b^{-1} in the data-fit term: theoretically, the weighted errors ẽ = W_b^{1/2} e are uncorrelated.
Difficulty: W_b may increase the ill-conditioning of A!
27 Formulation: Generalized Tikhonov Regularization With Weighting
Use R(x) = ||D(x − x_0)||^2:
x̂ = argmin J(x) = argmin { ||Ax − b||^2_{W_b} + λ^2 ||D(x − x_0)||^2 }.   (1)
D is a suitable operator, often a derivative approximation. Assume N(A) ∩ N(D) = {0}.
x_0 is a reference solution, often x_0 = 0.
Question: given D and W_b, how do we find λ?
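Problem (1) is equivalent to an ordinary least-squares problem with A and λD stacked, which is a convenient way to solve it numerically. A minimal sketch under stated assumptions: W_b defaults to the identity, D is a first-difference operator, and the problem sizes are arbitrary.

```python
import numpy as np

def gen_tikhonov(A, b, D, lam, x0=None, W_b=None):
    """Solve (1): argmin ||A x - b||^2_{W_b} + lam^2 ||D(x - x0)||^2
    via the equivalent stacked least-squares system."""
    m, n = A.shape
    x0 = np.zeros(n) if x0 is None else x0
    # W_b = L L^T (Cholesky), so ||v||^2_{W_b} = ||L^T v||_2^2.
    Wb_half = np.eye(m) if W_b is None else np.linalg.cholesky(W_b).T
    K = np.vstack([Wb_half @ A, lam * D])
    rhs = np.concatenate([Wb_half @ b, lam * (D @ x0)])
    return np.linalg.lstsq(K, rhs, rcond=None)[0]

rng = np.random.default_rng(2)
n = 50
A = rng.standard_normal((60, n))
x_true = 0.1 * np.cumsum(rng.standard_normal(n))     # smooth-ish test signal
b = A @ x_true + 0.01 * rng.standard_normal(60)
D = np.diff(np.eye(n), axis=0)                       # (n-1) x n first differences
xhat = gen_tikhonov(A, b, D, lam=1.0)
```

The stacked form avoids forming A^T W_b A explicitly, which is preferable for ill-conditioned A.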
33 Example: solution for increasing λ, D = I.
46 The Choice of λ is Crucial
Different algorithms yield different solutions:
Discrepancy Principle
L-Curve
Generalized Cross Validation (GCV)
Unbiased Predictive Risk Estimation (UPRE)
51 The Discrepancy Principle
Suppose the noise is white: C_b = σ_b^2 I. Find λ such that the regularized residual satisfies
σ_b^2 = (1/m) ||b − A x(λ)||_2^2.   (2)
Can be implemented by a Newton root-finding algorithm. But the discrepancy principle typically oversmooths.
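Since the regularized residual grows monotonically with λ, equation (2) can be solved with any bracketing root-finder. A sketch under stated assumptions (the talk suggests Newton; Brent's method is used here for simplicity, with a synthetic problem, D = I, and arbitrary sizes and noise level):

```python
import numpy as np
from scipy.optimize import brentq

# Synthetic problem with known white-noise level sigma_b.
rng = np.random.default_rng(3)
m, n = 80, 40
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = rng.standard_normal(n)
sigma_b = 0.1
b = A @ x_true + sigma_b * rng.standard_normal(m)

def residual_gap(lam):
    # (1/m)||b - A x(lam)||^2 - sigma_b^2, with x(lam) the Tikhonov solution (D = I).
    x = np.linalg.solve(A.T @ A + lam ** 2 * np.eye(n), A.T @ b)
    return np.mean((b - A @ x) ** 2) - sigma_b ** 2

# residual_gap is negative for tiny lam and positive for huge lam,
# so a bracketing root-finder locates the discrepancy-principle lambda.
lam_dp = brentq(residual_gap, 1e-8, 1e4)
```

At the root, the mean squared residual matches the assumed noise variance exactly.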
55 Some Standard Approaches I: L-Curve - Find the Corner
Let r(λ) = (A(λ) − I_m) b, with influence matrix A(λ) = A(A^T W_b A + λ^2 D^T D)^{-1} A^T W_b, so that r(λ) = A x(λ) − b for x_0 = 0.
Plot log(||Dx||) against log(||r(λ)||) and trade off the two contributions.
Expensive: requires a range of λ; the GSVD makes the calculations efficient.
Not statistically based.
(Figures: an L-curve with a clear corner; an L-curve with no corner.)
56 Generalized Cross-Validation (GCV)
Let A(λ) = A(A^T W_b A + λ^2 D^T D)^{-1} A^T W_b; one can pick W_b = I. Minimize the GCV function
||b − A x(λ)||^2_{W_b} / [trace(I_m − A(λ))]^2,
which estimates the predictive risk.
Expensive: requires a range of λ; the GSVD makes the calculations efficient.
A minimum is required; there may be multiple minima, and the function is sometimes flat.
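With W_b = I and D = I, the GCV function can be evaluated cheaply on a grid once the SVD of A is available, since the filter factors s_i^2/(s_i^2 + λ^2) give both the residual and the trace. A hedged sketch; the demo problem (a discrete integration operator) and the grid are illustrative choices, not from the talk.

```python
import numpy as np

def gcv_lambda(A, b, lams):
    """Evaluate the GCV function on a grid of lambda values (W_b = I, D = I),
    using the SVD of A so each lambda costs only O(n) work."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    m = A.shape[0]
    scores = []
    for lam in lams:
        f = s ** 2 / (s ** 2 + lam ** 2)              # Tikhonov filter factors
        resid2 = np.sum(((1 - f) * beta) ** 2) + (b @ b - beta @ beta)
        scores.append(resid2 / (m - np.sum(f)) ** 2)  # trace(I - A(lam)) = m - sum(f)
    scores = np.array(scores)
    return lams[int(np.argmin(scores))], scores

# Mildly ill-posed demo: a discrete integration operator.
rng = np.random.default_rng(4)
n = 60
A = np.tril(np.ones((n, n))) / n
x_true = np.sin(np.linspace(0.0, np.pi, n))
b = A @ x_true + 1e-3 * rng.standard_normal(n)
lams = np.logspace(-6, 1, 50)
lam_gcv, scores = gcv_lambda(A, b, lams)
```

The grid search makes the flat-minimum failure mode mentioned above easy to inspect: just plot `scores` against `lams`.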
57 Unbiased Predictive Risk Estimation (UPRE)
Minimize the expected value of the predictive risk, i.e. minimize the UPRE function
||b − A x(λ)||^2_{W_b} + 2 trace(A(λ)) − m.
Expensive: requires a range of λ; the GSVD makes the calculations efficient.
Needs an estimate of the trace; a minimum is needed.
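The UPRE function admits the same SVD-based evaluation as GCV. A minimal sketch, assuming white noise of known variance sigma2 (so W_b = I/sigma2), D = I, and an illustrative test problem:

```python
import numpy as np

def upre_scores(A, b, sigma2, lams):
    """UPRE(lam) = ||b - A x(lam)||^2 / sigma2 + 2 trace(A(lam)) - m,
    for white noise of variance sigma2 (W_b = I / sigma2), with D = I."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    m = A.shape[0]
    scores = []
    for lam in lams:
        f = s ** 2 / (s ** 2 + lam ** 2)
        resid2 = np.sum(((1 - f) * beta) ** 2) + (b @ b - beta @ beta)
        scores.append(resid2 / sigma2 + 2 * np.sum(f) - m)   # trace(A(lam)) = sum(f)
    return np.array(scores)

rng = np.random.default_rng(5)
n = 50
A = np.tril(np.ones((n, n))) / n
x_true = np.ones(n)
sigma2 = 1e-4
b = A @ x_true + np.sqrt(sigma2) * rng.standard_normal(n)
lams = np.logspace(-5, 1, 40)
scores = upre_scores(A, b, sigma2, lams)
lam_upre = lams[int(np.argmin(scores))]
```

Unlike GCV, UPRE needs the noise variance explicitly, which is why it requires the trace estimate noted above.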
58 An Alternative: Iterative Methods with Stopping Criteria
Iterate to find an approximate solution of Ax ≈ b, e.g. using LSQR.
Introduce stopping criteria based on detecting noise in the solution, e.g. the residual periodogram (O'Leary and Rust); stop when noise dominates (Hnetynkova et al.).
Hybrid methods combine iteration with regularization of the projected subproblem (Nagy et al.):
The cost of regularizing the reduced system is minimal.
Advantage: any regularization may be used for the subproblem.
But how can we be sure when to stop the LSQR iteration? And how are the statistics included in the subproblem?
Here: include the statistics directly.
60 Background: Statistics of the Least Squares Problem
Theorem (Rao 1973: First Fundamental Theorem)
Let r be the rank of A and b ~ N(Ax, σ_b^2 I) (errors in the measurements are normally distributed with mean 0 and covariance σ_b^2 I). Then
J = min_x ||Ax − b||^2 ~ σ_b^2 χ^2(m − r),
i.e. J follows a χ^2 distribution with m − r degrees of freedom. This is basically the discrepancy principle.
Corollary (Weighted Least Squares)
For b ~ N(Ax, C_b) and W_b = C_b^{-1},
J = min_x ||Ax − b||^2_{W_b} ~ χ^2(m − r).
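The theorem is easy to verify by Monte Carlo. A small sketch (sizes, noise level, and trial count are arbitrary choices): the least-squares residual is the projection of the noise onto the orthogonal complement of range(A), so J/σ^2 averages to m − r.

```python
import numpy as np

# Empirical check: for b ~ N(A x, sigma^2 I) and A of full column rank n,
# J = min_x ||A x - b||^2 is sigma^2 chi^2(m - n), so E(J / sigma^2) = m - n.
rng = np.random.default_rng(6)
m, n, sigma = 40, 15, 0.5
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
P = A @ np.linalg.pinv(A)                  # orthogonal projector onto range(A)

trials = 4000
J = np.empty(trials)
for t in range(trials):
    b = A @ x + sigma * rng.standard_normal(m)
    resid = b - P @ b                      # least-squares residual
    J[t] = resid @ resid / sigma ** 2

print(J.mean())    # close to m - n = 25
```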
62 Extension: Statistics of the Regularized Least Squares Problem
Theorem: χ^2 distribution of the regularized functional (Renaut/Mead 2008). (Note the weighting matrix on the regularization term.)
x̂ = argmin J_D(x) = argmin { ||Ax − b||^2_{W_b} + ||x − x_0||^2_{W_D} },  W_D = D^T W_x D.   (3)
Assume:
W_b and W_x are symmetric positive definite.
The problem is uniquely solvable: N(A) ∩ N(D) = {0}.
The Moore-Penrose generalized inverse of W_D is C_D.
Statistics: the errors in the right-hand side satisfy e ~ N(0, C_b), and x_0 is known, so that x − x_0 = f ~ N(0, C_D); x_0 is the mean vector of the model parameters.
Then J_D(x̂(W_D)) ~ χ^2(m + p − n).
67 Significance of the χ^2 Result
For sufficiently large m̃ = m + p − n, J_D ~ χ^2(m + p − n) gives
E(J_D(x̂(W_D))) = m̃,  Var(J_D(x̂(W_D))) = 2m̃.
Moreover, with high probability,
m̃ − sqrt(2m̃) z_{α/2} < J_D(x̂(W_D)) < m̃ + sqrt(2m̃) z_{α/2},   (4)
where z_{α/2} is the relevant critical value for a χ^2 distribution with m̃ = m + p − n degrees of freedom.
68 Key Aspects of the Proof I: The Functional J
Algebraic simplifications: rewrite the functional as a quadratic form. The regularized solution is given in terms of the resolution matrix R(W_D):
x̂ = x_0 + (A^T W_b A + D^T W_x D)^{-1} A^T W_b r,  r = b − A x_0,   (5)
  = x_0 + R(W_D) W_b^{1/2} r = x_0 + y(W_D),   (6)
where
R(W_D) = (A^T W_b A + D^T W_x D)^{-1} A^T W_b^{1/2}.   (7)
The functional is given in terms of the influence matrix A(W_D):
A(W_D) = W_b^{1/2} A R(W_D),   (8)
J_D(x̂) = r^T W_b^{1/2} (I_m − A(W_D)) W_b^{1/2} r;  let r̃ = W_b^{1/2} r,   (9)
       = r̃^T (I_m − A(W_D)) r̃: a quadratic form.   (10)
69 Key Aspects of the Proof II: Properties of a Quadratic Form
χ^2 distribution of quadratic forms x^T P x for normal variables (Fisher-Cochran Theorem):
Suppose the components x_i are independent normal variables, x_i ~ N(0, 1), i = 1 : n.
A necessary and sufficient condition for x^T P x to have a central χ^2 distribution is that P is idempotent, P^2 = P. In that case the number of degrees of freedom of the χ^2 distribution is rank(P) = trace(P).
When the means of the x_i are μ_i ≠ 0, x^T P x has a non-central χ^2 distribution with non-centrality parameter c = μ^T P μ.
A χ^2 random variable with n degrees of freedom and centrality parameter c has mean n + c and variance 2(n + 2c).
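The Fisher-Cochran characterization is simple to demonstrate numerically: any orthogonal projector is idempotent, and the quadratic form then averages to its trace. A minimal sketch with arbitrary dimensions.

```python
import numpy as np

# Sanity check: for x ~ N(0, I_n) and idempotent P, x^T P x ~ chi^2(trace(P)).
rng = np.random.default_rng(8)
n, k = 20, 7
B = rng.standard_normal((n, k))
P = B @ np.linalg.pinv(B)           # orthogonal projector: P @ P = P, trace(P) = k

samples = np.array([x @ P @ x for x in rng.standard_normal((5000, n))])
print(np.trace(P))                  # close to k = 7
print(samples.mean())               # close to trace(P) = 7
```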
70 Key Aspects of the Proof III: Requires the GSVD
Lemma: assume invertibility and m ≥ n ≥ p. There exist unitary matrices U ∈ R^{m×m}, V ∈ R^{p×p} and a nonsingular matrix X ∈ R^{n×n} such that
A = U [Υ; 0_{(m−n)×n}] X^T,  D = V [M, 0_{p×(n−p)}] X^T,   (11)
Υ = diag(υ_1, ..., υ_p, 1, ..., 1) ∈ R^{n×n},  M = diag(μ_1, ..., μ_p) ∈ R^{p×p},
0 ≤ υ_1 ≤ ... ≤ υ_p ≤ 1,  1 ≥ μ_1 ≥ ... ≥ μ_p > 0,  υ_i^2 + μ_i^2 = 1, i = 1, ..., p.   (12)
The functional with the GSVD: let Q = diag(μ_1, ..., μ_p, 0_{n−p}, I_{m−n}); then
J = r̃^T (I_m − A(W_D)) r̃ = ||Q U^T r̃||_2^2 = ||k||_2^2.
71 Proof IV: Statistical Distribution of the Weighted Residual
Covariance structure: the errors in b are e ~ N(0, C_b). Since b depends on x via b = Ax, we can show
b ~ N(A x_0, C_b + A C_D A^T)  (x_0 is the mean of x),
so the residual r = b − A x_0 ~ N(0, C_b + A C_D A^T) and
r̃ = W_b^{1/2} r ~ N(0, I + Ã C_D Ã^T),  Ã = W_b^{1/2} A.
Use the GSVD:
I + Ã C_D Ã^T = U Q̃^2 U^T,  Q̃ = diag(μ_1^{-1}, ..., μ_p^{-1}, I_{n−p}, I_{m−n}).
Now for k = Q U^T r̃,
cov(k) = Q U^T (U Q̃^2 U^T) U Q = Q Q̃^2 Q = diag(I_p, 0_{n−p}, I_{m−n}).
But J = ||Q U^T r̃||^2 = ||k̃||^2, where k̃ is the vector k excluding the (zero) components p + 1 : n. Thus k̃ ~ N(0, I_{m+p−n}) and J_D ~ χ^2(m + p − n).
72 When the Mean of the Parameters is Not Known, or x_0 = 0 is Not the Mean
Corollary: non-central χ^2 distribution of the regularized functional.
Recall x̂ = argmin J_D(x) = argmin { ||Ax − b||^2_{W_b} + ||x − x_0||^2_{W_D} },  W_D = D^T W_x D.
Assume all assumptions as before, but now x̄ ≠ x_0, where x̄ is the mean vector of the model parameters. Let
c = ||c̃||_2^2 = ||Q U^T W_b^{1/2} A (x̄ − x_0)||_2^2.
Then J_D ~ χ^2(m + p − n, c): the functional at the optimum follows a non-central χ^2 distribution.
73 A Further Result when A is Not of Full Column Rank
The rank-deficient solution: suppose A does not have full column rank, with rank r. Then the filtered solution can be written in terms of the GSVD as
x_FILT(λ) = Σ_{i=p−r+1}^{p} [γ_i^2 / (υ_i (γ_i^2 + λ^2))] s_i x_i + Σ_{i=p+1}^{n} s_i x_i
          = Σ_{i=1}^{p} (f_i / υ_i) s_i x_i + Σ_{i=p+1}^{n} s_i x_i,
where f_i = 0 for i = 1 : p − r and f_i = γ_i^2 / (γ_i^2 + λ^2) for i = p − r + 1 : p.
This yields J(x_FILT(λ)) ~ χ^2(m − n + r, c); notice that the degrees of freedom are reduced.
74 General Result
The cost functional follows a χ^2 statistical distribution: with m̃ degrees of freedom and centrality parameter c,
E(J_D) = m̃ + c,  Var(J_D) = 2m̃ + 4c.
This suggests: try to find W_D so that J_D(x̂(W_D)) matches E(J_D) = m̃ + c.
Mead: an expensive nonlinear algorithm for general W_D with c = 0.
Here: first find λ only, i.e. find W_x = λ^2 I.
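With W_x = λ^2 I the suggestion becomes a scalar root-finding problem: choose λ so that J_D(x̂(λ)) equals its expected value m̃. A hedged sketch, not the authors' implementation: D = I (so m̃ = m), x_0 is taken as the prior mean with c = 0, J is monotone increasing in λ, and Brent's method replaces the Newton iteration the slides imply.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(9)
m, n = 60, 40
A = rng.standard_normal((m, n))
x0 = np.zeros(n)                               # x0 taken as the prior mean
sig_b = 0.1
x_true = x0 + rng.standard_normal(n)
b = A @ x_true + sig_b * rng.standard_normal(m)
Wb = 1 / sig_b ** 2
m_tilde = m                                    # m + p - n = m, since p = n for D = I

def J(lam):
    # J_D(xhat(lam)) with W_x = lam^2 I; monotone increasing in lam.
    xhat = np.linalg.solve(Wb * A.T @ A + lam ** 2 * np.eye(n),
                           Wb * A.T @ b + lam ** 2 * x0)
    rb, rx = b - A @ xhat, xhat - x0
    return Wb * rb @ rb + lam ** 2 * rx @ rx

# Solve J(lam) = m_tilde: small lam gives J near chi^2(m - n) < m_tilde,
# huge lam gives J near ||b - A x0||^2_{W_b} >> m_tilde, so a root is bracketed.
lam_chi2 = brentq(lambda lam: J(lam) - m_tilde, 1e-6, 1e6)
```

Unlike GCV or UPRE, no sweep over λ is needed: a single monotone root-find suffices, which is what makes the χ^2 approach attractive for large-scale problems.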
78 What Do We Need to Apply the Theory?
Requirements:
The covariance C_b of the data parameters b (or of the model parameters x!).
A priori information: x_0, the mean x̄. But x̄ (and hence x_0) is not known.
If not known, use repeated data measurements to calculate C_b and the mean b̄, and hence estimate the centrality parameter: E(b) = A E(x) implies b̄ = A x̄, hence
c = ||c̃||_2^2 = ||Q U^T W_b^{1/2} (b̄ − A x_0)||_2^2,
E(J_D) = E(||Q U^T W_b^{1/2} (b − A x_0)||_2^2) = m + p − n + ||c̃||_2^2.
Given the GSVD, estimate the degrees of freedom m̃. Then we can use E(J_D) to find λ.
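Estimating C_b and b̄ from repeated measurements is a standard sample-statistics computation. A minimal sketch with synthetic repeated measurements; the dimensions, noise levels, and repetition count are illustrative assumptions.

```python
import numpy as np

# Repeated measurements of the same exact data vector, with per-component noise.
rng = np.random.default_rng(10)
m, reps = 8, 5000
true_std = np.array([0.1, 0.2, 0.3, 0.1, 0.5, 0.2, 0.4, 0.3])
b_exact = rng.standard_normal(m)
B = b_exact + true_std * rng.standard_normal((reps, m))   # reps x m samples

b_bar = B.mean(axis=0)                  # estimate of E(b) = A x-bar
C_b = np.cov(B, rowvar=False)           # sample covariance; diagonal if uncorrelated
W_b = np.linalg.inv(C_b)                # weighting matrix W_b = C_b^{-1}

print(np.sqrt(np.diag(C_b)))            # close to true_std
```

The estimated b̄ and C_b then feed directly into the centrality parameter and weighting matrix above.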
DESIGNING THE ALGORITHM: I

Assume x_0 is the mean (experimentalists know something about the model parameters).

Recall: if C_b and C_x are good estimates of the covariance, then J_D(x̂) − (m + p − n) should be small. Thus, let m̃ = m + p − n; then we want

  m̃ − sqrt(2 m̃) z_{α/2} < J(x(W_D)) < m̃ + sqrt(2 m̃) z_{α/2},

where z_{α/2} is the relevant z-value for a χ²-distribution with m̃ degrees of freedom.

GOAL: Find W_x to make this interval tight. In the single-variable case, find λ such that J_D(x̂(λ)) ≈ m̃.
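The interval test above is simple to compute. A minimal sketch in Python (the use of `statistics.NormalDist` for z_{α/2} is my choice, not from the slides):

```python
import math
from statistics import NormalDist

def chi2_interval(m, n, p, alpha=0.05):
    """Interval m_tilde +/- sqrt(2*m_tilde)*z_{alpha/2} in which
    J_D(x(W_D)) should lie when C_b and C_x are good covariance estimates."""
    m_tilde = m + p - n                       # degrees of freedom
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided z-value
    half_width = math.sqrt(2 * m_tilde) * z
    return m_tilde - half_width, m_tilde + half_width

lo, hi = chi2_interval(m=100, n=50, p=50)     # interval centered at m_tilde = 100
```

A value of J_D(x̂(λ)) falling outside this interval signals that λ (equivalently W_x) should be adjusted.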
A Newton Line-Search Algorithm to Find λ = 1/σ (Basic Algebra)

Newton to solve F(σ) = J_D(σ) − m̃ = 0.

We use σ = 1/λ, and y(σ^{(k)}) is the current solution, for which x(σ^{(k)}) = y(σ^{(k)}) + x_0. Then

  ∂J/∂σ = −(2/σ^3) ||D y(σ)||^2 < 0.

Hence we have the basic Newton iteration

  σ^{(k+1)} = σ^{(k)} (1 + (1/2) (σ^{(k)}/||D y||)^2 (J_D(σ^{(k)}) − m̃)).
Algorithm Using the GSVD

Use the GSVD of [W_b^{1/2} A; D]. For γ_i the generalized singular values, and s = U^T W_b^{1/2} r, m̃ = m − n + p, set

  s̃_i = s_i/(γ_i^2 σ_x^2 + 1), i = 1, ..., p,   t_i = s̃_i γ_i.

Find the root of

  F(σ_x) = Σ_{i=1}^{p} s_i^2/(γ_i^2 σ_x^2 + 1) + Σ_{i=n+1}^{m} s_i^2 − m̃ = 0.

Equivalently: solve F = 0, where F(σ_x) = s̃^T s − m̃ and F'(σ_x) = −2 σ_x ||t||_2^2.
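In this notation F and F' are inexpensive to evaluate once the GSVD quantities are in hand. A sketch with illustrative inputs (the arrays below are made up for demonstration, not taken from an actual GSVD):

```python
import numpy as np

def F_and_Fprime(sigma_x, gamma, s, m_tilde, s_tail=None):
    """F(sigma_x) and F'(sigma_x) from generalized singular values gamma
    and rotated residual coefficients s_1..s_p (s_tail holds s_{n+1}..s_m)."""
    d = gamma**2 * sigma_x**2 + 1.0
    s_tilde = s / d                 # s~_i = s_i / (gamma_i^2 sigma_x^2 + 1)
    t = gamma * s_tilde             # t_i = s~_i gamma_i
    F = np.sum(s**2 / d) - m_tilde
    if s_tail is not None:
        F += np.sum(s_tail**2)
    Fprime = -2.0 * sigma_x * np.sum(t**2)
    return F, Fprime

gamma = np.array([2.0, 1.0, 0.5])
s = np.full(3, 2.0)
F0, _ = F_and_Fprime(0.0, gamma, s, m_tilde=3.0)   # F(0) = sum(s_i^2) - m_tilde
F1, Fp1 = F_and_Fprime(0.5, gamma, s, m_tilde=3.0)
```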
Discussion on Convergence

- F is monotonically decreasing (F'(σ_x) = −2 σ_x ||t||_2^2 < 0).
- Hence a solution, when it exists, is unique for positive σ_x.
- No solution exists when F(0) < 0; this implies the statistics of the model are incorrect.
- Theoretically, lim_{σ→∞} F > 0 is possible. This is equivalent to λ = 0: no regularization is needed.
Practical Details of the Algorithm: Finding the Parameter

Step 1: Bracket the root by a logarithmic search on σ to handle the asymptotes; this yields sigmamax and sigmamin.

Step 2: Calculate the step, with steepness controlled by told. Let t = ||D y||/σ^{(k)}, where y is the current update; then

  step = (1/2) (1/max{t, told})^2 (J_D(σ^{(k)}) − m̃).

Step 3: Introduce a line search α^{(k)} in the Newton update,

  sigmanew = σ^{(k)} (1 + α^{(k)} step),

with α^{(k)} chosen such that sigmanew lies within the bracket.
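A simplified sketch of the three steps above, with the bracketing and line search collapsed into a bisection safeguard; the F used at the bottom is a small stand-in with the same monotone decreasing shape, not a real GSVD evaluation:

```python
import numpy as np

def solve_sigma(F, Fprime, sigma_lo, sigma_hi, tol=1e-10, maxit=100):
    """Root of the monotonically decreasing F on [sigma_lo, sigma_hi]:
    take the Newton step when it stays inside the bracket, bisect otherwise."""
    assert F(sigma_lo) > 0 > F(sigma_hi), "root must be bracketed"
    sigma = 0.5 * (sigma_lo + sigma_hi)
    for _ in range(maxit):
        f = F(sigma)
        if abs(f) < tol:
            break
        if f > 0:                 # F decreasing: root lies to the right
            sigma_lo = sigma
        else:
            sigma_hi = sigma
        step = sigma - f / Fprime(sigma)       # plain Newton proposal
        if not (sigma_lo < step < sigma_hi):   # safeguard: bisect instead
            step = 0.5 * (sigma_lo + sigma_hi)
        sigma = step
    return sigma

# stand-in F: sum_i s_i^2/(gamma_i^2 sigma^2 + 1) - m_tilde
gamma = np.array([2.0, 1.0, 0.5])
s2 = np.array([4.0, 4.0, 4.0])
m_tilde = 3.0
F = lambda sg: np.sum(s2 / (gamma**2 * sg**2 + 1)) - m_tilde
Fp = lambda sg: -2 * sg * np.sum(gamma**2 * s2 / (gamma**2 * sg**2 + 1) ** 2)
root = solve_sigma(F, Fp, 1e-3, 1e3)
```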
Practical Details of the Algorithm: Large Scale Problems

Algorithm Initialization
- Convert the generalized Tikhonov problem to standard form. (If L is not invertible, you only need to know how to form Ax and A^T x, and the null space of L.)
- Use the LSQR algorithm to find the bidiagonal matrix for the projected problem.
- Obtain a solution of the bidiagonal problem for the given initial σ.

Subsequent Steps
- Increase the dimension of the projected space if needed, reusing the existing bidiagonalization. A smaller system may also be used when appropriate.
- Each σ update of the algorithm reuses saved information from the Lanczos bidiagonalization.
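The bidiagonal matrix referred to here comes from Golub-Kahan (Lanczos) bidiagonalization, the factorization underlying LSQR. A minimal sketch with full reorthogonalization (a simplification for clarity; LSQR proper uses short recurrences and never stores the full U and V):

```python
import numpy as np

def golub_kahan(A, b, k):
    """k steps of Golub-Kahan bidiagonalization started from b.
    Returns U (m x (k+1)), B ((k+1) x k, lower bidiagonal), V (n x k)
    satisfying A @ V = U @ B."""
    m, n = A.shape
    U = np.zeros((m, k + 1))
    V = np.zeros((n, k))
    B = np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    U[:, 0] = b / beta
    for j in range(k):
        r = A.T @ U[:, j] - (beta * V[:, j - 1] if j > 0 else 0.0)
        r -= V[:, :j] @ (V[:, :j].T @ r)            # reorthogonalize
        alpha = np.linalg.norm(r)
        V[:, j] = r / alpha
        p = A @ V[:, j] - alpha * U[:, j]
        p -= U[:, :j + 1] @ (U[:, :j + 1].T @ p)    # reorthogonalize
        beta = np.linalg.norm(p)
        U[:, j + 1] = p / beta
        B[j, j], B[j + 1, j] = alpha, beta
    return U, B, V

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 12))
b = rng.standard_normal(20)
U, B, V = golub_kahan(A, b, 5)
```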
Comparison with the Standard Hybrid LSQR Algorithm
- The algorithm can concurrently regularize and find λ.
- The standard hybrid LSQR solves the projected system and then adds regularization. This is also possible here; results of both approaches are shown.

Advantages
- Cost: only that of the standard LSQR algorithm, with some solution updates for the iterated σ.
- The regularization introduced by the LSQR projection may be useful for preventing problems with the GSVD expansion.
- Makes the algorithm viable for large scale problems.
Illustrating the Results for Problem Size 512: Two Standard Test Problems

Comparison for noise level 10%; on the left D = I, and on the right D is the first derivative operator. Notice that the L-curve and χ²-LSQR perform well, while UPRE does not.
Real Data: Seismic Signal Restoration

The Data Set and Goal
- A real data set of 48 signals. The point spread function is derived from the signals.
- Calculate the signal variance pointwise over all 48 signals.
- Goal: restore the signal x from Ax = b, where A is the PSF matrix and b is a given blurred signal.
- Method of comparison (no exact solution is known): use convergence with respect to downsampling.
Comparison: High Resolution, White Noise
- Greater contrast with χ². UPRE is insufficiently regularized; the L-curve severely undersmooths (not shown).
- Parameters are not consistent across resolutions.
THE UPRE SOLUTION: x_0 = 0, White Noise
- The regularization parameters are consistent: the same σ is obtained at all resolutions.
THE LSQR Hybrid SOLUTION: White Noise
- The regularization is quite consistent from resolution 2 to 100.
Illustrating the Deblurring Result

Example taken from RestoreTools (Nagy et al.), 15% noise. The computational cost is minimal: the projected problem size is 15, λ = .58.
Problem Grain, 15% noise added, for increasing subproblem size. Signal-to-noise ratio 10 log_10(1/e) for relative error e; the regularization parameter σ is reported in each case.
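The quality measure quoted here can be computed as follows (assuming e is the 2-norm relative error; that reading is mine, the slide does not say):

```python
import numpy as np

def snr_db(x_true, x_est):
    """Signal-to-noise ratio 10*log10(1/e), where e is the relative error."""
    e = np.linalg.norm(x_est - x_true) / np.linalg.norm(x_true)
    return 10.0 * np.log10(1.0 / e)

val = snr_db(np.ones(4), 1.1 * np.ones(4))   # relative error e = 0.1
```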
Illustrating the progress of the Newton algorithm post LSQR.
Illustrating the progress of the Newton algorithm with LSQR.
Problem Grain, 15% noise added, for increasing subproblem size. Signal-to-noise ratio 10 log_10(1/e) for relative error e.
Conclusions

Observations
- A new statistical method for estimating the regularization parameter.
- Compares favorably with UPRE in performance, and with the L-curve. (GCV is not competitive.)
- The method can be used for large scale problems: the problem size of the hybrid LSQR used is very small.
- The method is very efficient; the Newton iteration is robust and fast.
- A priori information is needed, but it can be obtained directly from the data, e.g. using local statistical information of the image.
Future Work

Other Results and Future Work
- Software package!
- Diagonal weighting schemes.
- Edge-preserving regularization: total variation.
- Better handling of colored noise.
- Bound constraints (with Mead; accepted).
118 THANK YOU! National Science Foundation: Division of Computational Mathematics 48 / 1
More informationOne Picture and a Thousand Words Using Matrix Approximtions October 2017 Oak Ridge National Lab Dianne P. O Leary c 2017
One Picture and a Thousand Words Using Matrix Approximtions October 2017 Oak Ridge National Lab Dianne P. O Leary c 2017 1 One Picture and a Thousand Words Using Matrix Approximations Dianne P. O Leary
More informationDownloaded 06/11/15 to Redistribution subject to SIAM license or copyright; see
SIAM J. SCI. COMPUT. Vol. 37, No. 2, pp. B332 B359 c 2015 Society for Industrial and Applied Mathematics A FRAMEWORK FOR REGULARIZATION VIA OPERATOR APPROXIMATION JULIANNE M. CHUNG, MISHA E. KILMER, AND
More informationGeneralized Local Regularization for Ill-Posed Problems
Generalized Local Regularization for Ill-Posed Problems Patricia K. Lamm Department of Mathematics Michigan State University AIP29 July 22, 29 Based on joint work with Cara Brooks, Zhewei Dai, and Xiaoyue
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More information10. Multi-objective least squares
L Vandenberghe ECE133A (Winter 2018) 10 Multi-objective least squares multi-objective least squares regularized data fitting control estimation and inversion 10-1 Multi-objective least squares we have
More informationLinear models. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. October 5, 2016
Linear models Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark October 5, 2016 1 / 16 Outline for today linear models least squares estimation orthogonal projections estimation
More informationc 1999 Society for Industrial and Applied Mathematics
SIAM J. MATRIX ANAL. APPL. Vol. 21, No. 1, pp. 185 194 c 1999 Society for Industrial and Applied Mathematics TIKHONOV REGULARIZATION AND TOTAL LEAST SQUARES GENE H. GOLUB, PER CHRISTIAN HANSEN, AND DIANNE
More informationGaussian and Linear Discriminant Analysis; Multiclass Classification
Gaussian and Linear Discriminant Analysis; Multiclass Classification Professor Ameet Talwalkar Slide Credit: Professor Fei Sha Professor Ameet Talwalkar CS260 Machine Learning Algorithms October 13, 2015
More informationOpen Problems in Mixed Models
xxiii Determining how to deal with a not positive definite covariance matrix of random effects, D during maximum likelihood estimation algorithms. Several strategies are discussed in Section 2.15. For
More information8 The SVD Applied to Signal and Image Deblurring
8 The SVD Applied to Signal and Image Deblurring We will discuss the restoration of one-dimensional signals and two-dimensional gray-scale images that have been contaminated by blur and noise. After an
More informationLanczos Bidiagonalization with Subspace Augmentation for Discrete Inverse Problems
Lanczos Bidiagonalization with Subspace Augmentation for Discrete Inverse Problems Per Christian Hansen Technical University of Denmark Ongoing work with Kuniyoshi Abe, Gifu Dedicated to Dianne P. O Leary
More information1. Nonlinear Equations. This lecture note excerpted parts from Michael Heath and Max Gunzburger. f(x) = 0
Numerical Analysis 1 1. Nonlinear Equations This lecture note excerpted parts from Michael Heath and Max Gunzburger. Given function f, we seek value x for which where f : D R n R n is nonlinear. f(x) =
More informationEmpirical Risk Minimization as Parameter Choice Rule for General Linear Regularization Methods
Empirical Risk Minimization as Parameter Choice Rule for General Linear Regularization Methods Frank Werner 1 Statistical Inverse Problems in Biophysics Group Max Planck Institute for Biophysical Chemistry,
More informationAdaptive Filtering. Squares. Alexander D. Poularikas. Fundamentals of. Least Mean. with MATLABR. University of Alabama, Huntsville, AL.
Adaptive Filtering Fundamentals of Least Mean Squares with MATLABR Alexander D. Poularikas University of Alabama, Huntsville, AL CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is
More informationDUAL REGULARIZED TOTAL LEAST SQUARES SOLUTION FROM TWO-PARAMETER TRUST-REGION ALGORITHM. Geunseop Lee
J. Korean Math. Soc. 0 (0), No. 0, pp. 1 0 https://doi.org/10.4134/jkms.j160152 pissn: 0304-9914 / eissn: 2234-3008 DUAL REGULARIZED TOTAL LEAST SQUARES SOLUTION FROM TWO-PARAMETER TRUST-REGION ALGORITHM
More informationLikelihood Methods. 1 Likelihood Functions. The multivariate normal distribution likelihood function is
Likelihood Methods 1 Likelihood Functions The multivariate normal distribution likelihood function is The log of the likelihood, say L 1 is Ly = π.5n V.5 exp.5y Xb V 1 y Xb. L 1 = 0.5[N lnπ + ln V +y Xb
More informationLeast-Squares Performance of Analog Product Codes
Copyright 004 IEEE Published in the Proceedings of the Asilomar Conference on Signals, Systems and Computers, 7-0 ovember 004, Pacific Grove, California, USA Least-Squares Performance of Analog Product
More informationLecture 1: Numerical Issues from Inverse Problems (Parameter Estimation, Regularization Theory, and Parallel Algorithms)
Lecture 1: Numerical Issues from Inverse Problems (Parameter Estimation, Regularization Theory, and Parallel Algorithms) Youzuo Lin 1 Joint work with: Rosemary A. Renaut 2 Brendt Wohlberg 1 Hongbin Guo
More informationJACOBI S ITERATION METHOD
ITERATION METHODS These are methods which compute a sequence of progressively accurate iterates to approximate the solution of Ax = b. We need such methods for solving many large linear systems. Sometimes
More information10. Linear Models and Maximum Likelihood Estimation
10. Linear Models and Maximum Likelihood Estimation ECE 830, Spring 2017 Rebecca Willett 1 / 34 Primary Goal General problem statement: We observe y i iid pθ, θ Θ and the goal is to determine the θ that
More informationOptimal State Estimators for Linear Systems with Unknown Inputs
Optimal tate Estimators for Linear ystems with Unknown Inputs hreyas undaram and Christoforos N Hadjicostis Abstract We present a method for constructing linear minimum-variance unbiased state estimators
More informationIntroduction. Chapter One
Chapter One Introduction The aim of this book is to describe and explain the beautiful mathematical relationships between matrices, moments, orthogonal polynomials, quadrature rules and the Lanczos and
More informationDeconvolution. Parameter Estimation in Linear Inverse Problems
Image Parameter Estimation in Linear Inverse Problems Chair for Computer Aided Medical Procedures & Augmented Reality Department of Computer Science, TUM November 10, 2006 Contents A naive approach......with
More informationLinear Inverse Problems
Linear Inverse Problems Ajinkya Kadu Utrecht University, The Netherlands February 26, 2018 Outline Introduction Least-squares Reconstruction Methods Examples Summary Introduction 2 What are inverse problems?
More informationLinear Regression Linear Regression with Shrinkage
Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle
More informationMODULE 8 Topics: Null space, range, column space, row space and rank of a matrix
MODULE 8 Topics: Null space, range, column space, row space and rank of a matrix Definition: Let L : V 1 V 2 be a linear operator. The null space N (L) of L is the subspace of V 1 defined by N (L) = {x
More informationAn l 2 -l q regularization method for large discrete ill-posed problems
Noname manuscript No. (will be inserted by the editor) An l -l q regularization method for large discrete ill-posed problems Alessandro Buccini Lothar Reichel the date of receipt and acceptance should
More informationLINEARIZED BREGMAN ITERATIONS FOR FRAME-BASED IMAGE DEBLURRING
LINEARIZED BREGMAN ITERATIONS FOR FRAME-BASED IMAGE DEBLURRING JIAN-FENG CAI, STANLEY OSHER, AND ZUOWEI SHEN Abstract. Real images usually have sparse approximations under some tight frame systems derived
More informationThe Kalman filter is arguably one of the most notable algorithms
LECTURE E NOTES «Kalman Filtering with Newton s Method JEFFREY HUMPHERYS and JEREMY WEST The Kalman filter is arguably one of the most notable algorithms of the 0th century [1]. In this article, we derive
More informationInverse scattering problem from an impedance obstacle
Inverse Inverse scattering problem from an impedance obstacle Department of Mathematics, NCKU 5 th Workshop on Boundary Element Methods, Integral Equations and Related Topics in Taiwan NSYSU, October 4,
More informationOn the regularization properties of some spectral gradient methods
On the regularization properties of some spectral gradient methods Daniela di Serafino Department of Mathematics and Physics, Second University of Naples daniela.diserafino@unina2.it contributions from
More informationChapter 3 Numerical Methods
Chapter 3 Numerical Methods Part 2 3.2 Systems of Equations 3.3 Nonlinear and Constrained Optimization 1 Outline 3.2 Systems of Equations 3.3 Nonlinear and Constrained Optimization Summary 2 Outline 3.2
More informationSystem Identification
System Identification Lecture : Statistical properties of parameter estimators, Instrumental variable methods Roy Smith 8--8. 8--8. Statistical basis for estimation methods Parametrised models: G Gp, zq,
More information6 The SVD Applied to Signal and Image Deblurring
6 The SVD Applied to Signal and Image Deblurring We will discuss the restoration of one-dimensional signals and two-dimensional gray-scale images that have been contaminated by blur and noise. After an
More informationSPARSE SIGNAL RESTORATION. 1. Introduction
SPARSE SIGNAL RESTORATION IVAN W. SELESNICK 1. Introduction These notes describe an approach for the restoration of degraded signals using sparsity. This approach, which has become quite popular, is useful
More informationPhysics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester
Physics 403 Numerical Methods, Maximum Likelihood, and Least Squares Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Quadratic Approximation
More information