Singular value decomposition

If only the first p singular values are nonzero we write

G = [U_p \; U_o] \begin{bmatrix} S_p & 0 \\ 0 & 0 \end{bmatrix} [V_p \; V_o]^T

U_p represents the first p columns of U
U_o represents the last N - p columns of U
V_p represents the first p columns of V
V_o represents the last M - p columns of V

A data null space is created. A model null space is created.

Properties:
U_p^T U_o = 0, U_o^T U_p = 0, V_p^T V_o = 0, V_o^T V_p = 0
U_p^T U_p = I, U_o^T U_o = I, V_p^T V_p = I, V_o^T V_o = I

Since the columns of V_o and U_o multiply by zeros, we get the compact form for G:

G = U_p S_p V_p^T
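A minimal NumPy sketch (toy matrix, not from the notes) of this partition: split the SVD into U_p, S_p, V_p and the null-space blocks U_o, V_o, then check the orthogonality properties and the compact form.

```python
import numpy as np

# Assumed toy kernel: force one dependent column so a model null space exists.
rng = np.random.default_rng(0)
G = rng.standard_normal((5, 4))
G[:, 3] = G[:, 0] + G[:, 1]            # rank 3

U, s, Vt = np.linalg.svd(G, full_matrices=True)
p = int(np.sum(s > 1e-10 * s[0]))      # number of "nonzero" singular values
Up, Uo = U[:, :p], U[:, p:]            # first p / last N-p columns of U
Vp, Vo = Vt[:p, :].T, Vt[p:, :].T      # first p / last M-p columns of V
Sp = np.diag(s[:p])

# Orthogonality properties and the compact form G = U_p S_p V_p^T
assert np.allclose(Up.T @ Uo, 0) and np.allclose(Vp.T @ Vo, 0)
assert np.allclose(Up.T @ Up, np.eye(p)) and np.allclose(Vp.T @ Vp, np.eye(p))
assert np.allclose(G, Up @ Sp @ Vp.T)
```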

Model null space

Consider a vector made up of a linear combination of the columns of V_o:

m_v = \sum_{i=p+1}^{M} \lambda_i v_i

The model m_v lies in the space spanned by the columns of V_o, and

G m_v = \sum_{i=p+1}^{M} \lambda_i U_p S_p V_p^T v_i = 0

So any model of this type has no effect on the data: it lies in the model null space! Where have we seen this before?

Consequence: if any solution exists to the inverse problem, then an infinite number will. Assume the model m_ls fits the data, G m_ls = d_obs. Then

G(m_ls + m_v) = G m_ls + G m_v = d_obs + 0

This is the uniqueness question of Backus and Gilbert: the data cannot constrain models in the model null space.
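A short sketch, with an assumed rank-deficient toy kernel: any combination of the V_o columns predicts zero data, so it can be added to a fitting model without changing the fit.

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((5, 4))
G[:, 3] = G[:, 0] + G[:, 1]                      # rank 3: one-dimensional model null space
d_obs = rng.standard_normal(5)

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-10 * s[0]))
Vo = Vt[p:, :].T                                 # model null-space basis

m_ls = np.linalg.lstsq(G, d_obs, rcond=None)[0]  # one model that fits (least squares sense)
m_v = Vo @ rng.standard_normal(Vo.shape[1])      # m_v = sum_{i>p} lambda_i v_i

print(np.allclose(G @ m_v, 0))                   # True: m_v lies in the model null space
print(np.allclose(G @ (m_ls + m_v), G @ m_ls))   # True: predicted data are unchanged
```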

Data null space

Consider a data vector with at least one component in U_o:

d_obs = d_o + \lambda_i u_i   (i > p)

For any model-space vector m we have

d_pre = G m = U_p S_p V_p^T m = U_p a

For the model to fit the data (d_obs = d_pre) we would need

d_o + \lambda_i u_i = \sum_{j=1}^{p} a_j u_j

Where have we seen this before? Data of this type cannot be fit by any model: the data has a component in the data null space!

Consequence: no model exists that can fit the data exactly. This is the existence question of Backus and Gilbert.

All this depends on the structure of the kernel matrix G!
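A sketch with an assumed toy kernel: if the data are given a component along U_o, no model can fit them, and the least squares residual is exactly that U_o component.

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((5, 4))                  # more data than model parameters
U, s, Vt = np.linalg.svd(G, full_matrices=True)
p = int(np.sum(s > 1e-10 * s[0]))
Up, Uo = U[:, :p], U[:, p:]

d_obs = Up @ rng.standard_normal(p) + 0.3 * Uo[:, 0]   # d_o plus a data null-space part

m_best = np.linalg.lstsq(G, d_obs, rcond=None)[0]
residual = d_obs - G @ m_best
print(np.allclose(residual, 0))                  # False: the U_o part can never be fit
print(np.allclose(Up.T @ residual, 0))           # True: what remains lies entirely in U_o
```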

Moore-Penrose generalized inverse

G^{\dagger} = V_p S_p^{-1} U_p^T

The generalized inverse combines the features of the least squares and minimum length solutions.

In a purely over-determined problem it is equivalent to the least squares solution:

m^{\dagger} = G^{\dagger} d = (G^T G)^{-1} G^T d

In a purely under-determined problem it is equivalent to the minimum length solution:

m^{\dagger} = G^{\dagger} d = G^T (G G^T)^{-1} d

In general problems it minimizes the data prediction error

\phi(m^{\dagger}) = (d - G m^{\dagger})^T (d - G m^{\dagger})

while also minimizing the model length

L(m^{\dagger}) = m^{\dagger T} m^{\dagger}
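A sketch (assumed random test matrices) building G^{\dagger} = V_p S_p^{-1} U_p^T and comparing it with NumPy's pinv and with the two limiting formulas.

```python
import numpy as np

rng = np.random.default_rng(3)

# Purely over-determined case: full column rank, G-dagger = (G^T G)^{-1} G^T
G = rng.standard_normal((6, 3))
U, s, Vt = np.linalg.svd(G, full_matrices=False)
G_dag = Vt.T @ np.diag(1.0 / s) @ U.T            # V_p S_p^{-1} U_p^T
print(np.allclose(G_dag, np.linalg.pinv(G)))
print(np.allclose(G_dag, np.linalg.inv(G.T @ G) @ G.T))

# Purely under-determined case: full row rank, G-dagger = G^T (G G^T)^{-1}
G = rng.standard_normal((3, 6))
U, s, Vt = np.linalg.svd(G, full_matrices=False)
G_dag = Vt.T @ np.diag(1.0 / s) @ U.T
print(np.allclose(G_dag, G.T @ np.linalg.inv(G @ G.T)))
```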

Covariance and resolution of the pseudo inverse

How does data noise propagate into the model? What is the model covariance matrix for the generalized inverse?

C_M = G^{\dagger} C_d (G^{\dagger})^T, with G^{\dagger} = V_p S_p^{-1} U_p^T

For the case C_d = \sigma^2 I:

C_M = \sigma^2 G^{\dagger} (G^{\dagger})^T = \sigma^2 V_p S_p^{-2} V_p^T   (prove this)

Recall that S_p is a diagonal matrix of ordered singular values, S_p = diag[s_1, s_2, ..., s_p], so

C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2}   (prove this)

What is the effect of the singular values on the model covariance? As the number of singular values, p, increases, the variance of the model parameters increases!
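A sketch (assumed test matrix and noise level) checking that the two expressions for C_M agree.

```python
import numpy as np

rng = np.random.default_rng(4)
G = rng.standard_normal((6, 4))
sigma = 0.1                                            # assumed data noise level, C_d = sigma^2 I

U, s, Vt = np.linalg.svd(G, full_matrices=False)
p = int(np.sum(s > 1e-10 * s[0]))
Up, Sp, Vp = U[:, :p], np.diag(s[:p]), Vt[:p, :].T

G_dag = Vp @ np.linalg.inv(Sp) @ Up.T
C_M = sigma**2 * G_dag @ G_dag.T                       # G-dagger C_d G-dagger^T
C_M_sum = sigma**2 * sum(np.outer(Vp[:, i], Vp[:, i]) / s[i]**2 for i in range(p))

print(np.allclose(C_M, sigma**2 * Vp @ np.diag(1.0 / s[:p]**2) @ Vp.T))  # V_p S_p^{-2} V_p^T
print(np.allclose(C_M, C_M_sum))                                         # sum over v_i v_i^T / s_i^2
```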

Covariance and resolution of the pseudo inverse

How is the estimated model related to the true model? The model resolution matrix:

m^{\dagger} = R m_{true}, where R = G^{\dagger} G = V_p S_p^{-1} U_p^T U_p S_p V_p^T = V_p V_p^T

As p increases the model null space shrinks: as p \to M, V_p becomes square and orthogonal (V_p^T = V_p^{-1}), so R \to I.

What is the effect of the singular values on the resolution matrix? As the number of singular values, p, increases, the resolution of the model parameters increases!

We see the trade-off between variance and resolution.
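A sketch (assumed rank-deficient test matrix) forming R = V_p V_p^T and confirming that the noise-free recovered model is R m_true rather than m_true.

```python
import numpy as np

rng = np.random.default_rng(5)
G = rng.standard_normal((5, 4))
G[:, 3] = G[:, 0] - G[:, 2]                      # rank 3, so R cannot be the identity

U, s, Vt = np.linalg.svd(G, full_matrices=False)
p = int(np.sum(s > 1e-10 * s[0]))
Vp = Vt[:p, :].T
R = Vp @ Vp.T                                    # model resolution matrix

m_true = rng.standard_normal(4)
G_dag = Vp @ np.diag(1.0 / s[:p]) @ U[:, :p].T   # pseudo inverse from the truncated SVD
m_dag = G_dag @ (G @ m_true)                     # noise-free recovery

print(np.allclose(m_dag, R @ m_true))            # True: m-dagger = R m_true
print(np.allclose(R, np.eye(4)))                 # False: the null space limits resolution
```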

Worked example: tomography

Using rays 1-4:

G = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & \sqrt{2} & \sqrt{2} & 0 \\ \sqrt{2} & 0 & 0 & \sqrt{2} \end{bmatrix}, \qquad \delta d = G \, \delta m

G^T G = \begin{bmatrix} 3 & 0 & 1 & 2 \\ 0 & 3 & 2 & 1 \\ 1 & 2 & 3 & 0 \\ 2 & 1 & 0 & 3 \end{bmatrix}

This has eigenvalues 6, 4, 2 and 0, i.e. s_1^2 = 6, s_2^2 = 4, s_3^2 = 2, s_4^2 = 0.

V_p = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}, \qquad V_o = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \end{bmatrix}, \qquad G v_o = 0
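A sketch reproducing these numbers; it assumes the non-unit entries of G are the lengths \sqrt{2} of the diagonal rays, which is what the quoted G^T G and its eigenvalues imply.

```python
import numpy as np

r2 = np.sqrt(2.0)
G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, r2,  r2,  0.0],
              [r2,  0.0, 0.0, r2 ]])

eigvals = np.linalg.eigvalsh(G.T @ G)
print(np.round(eigvals, 6))          # [0. 2. 4. 6.] -> s^2 = 6, 4, 2 and one zero

U, s, Vt = np.linalg.svd(G)
p = int(np.sum(s > 1e-10 * s[0]))    # p = 3 nonzero singular values
Vp, Vo = Vt[:p, :].T, Vt[p:, :].T
print(np.round(Vo.ravel(), 3))       # null-space vector, entries of size 0.5
print(np.allclose(G @ Vo, 0))        # True
```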

Worked example: eigenvectors

[Figure: the model-space eigenvectors of V_p plotted on the 2x2 cell grid for s_1^2 = 6, s_2^2 = 4 and s_3^2 = 2, together with the null-space vector of V_o.]

Worked example: tomography

Using all non-zero eigenvalues (s_1, s_2 and s_3) the resolution matrix becomes

\delta m^{\dagger} = R \, \delta m_{true} = V_p V_p^T \, \delta m_{true}

R = V_p V_p^T = \begin{bmatrix} 0.75 & -0.25 & 0.25 & 0.25 \\ -0.25 & 0.75 & 0.25 & 0.25 \\ 0.25 & 0.25 & 0.75 & -0.25 \\ 0.25 & 0.25 & -0.25 & 0.75 \end{bmatrix}

[Figure: input model and recovered model on the 2x2 cell grid.]

Worked example: tomography

Using eigenvalues s_1, s_2 and s_3 the model covariance becomes

C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2}
    = \frac{\sigma^2}{4}\left( \frac{1}{2}\begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix}
    + \frac{1}{4}\begin{bmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}
    + \frac{1}{6}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} \right)

where the three terms correspond to s^2 = 2, 4 and 6. Summing,

C_M = \frac{\sigma^2}{48}\begin{bmatrix} 11 & -7 & 5 & -1 \\ -7 & 11 & -1 & 5 \\ 5 & -1 & 11 & -7 \\ -1 & 5 & -7 & 11 \end{bmatrix}
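A sketch checking the resolution and covariance matrices quoted above for p = 3 (same assumed G with \sqrt{2} entries).

```python
import numpy as np

r2 = np.sqrt(2.0)
G = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, r2,  r2,  0.0],
              [r2,  0.0, 0.0, r2 ]])
U, s, Vt = np.linalg.svd(G)
p = 3
Vp = Vt[:p, :].T

R = Vp @ Vp.T                                             # resolution matrix V_p V_p^T
C_M_over_sigma2 = sum(np.outer(Vp[:, i], Vp[:, i]) / s[i]**2 for i in range(p))

print(np.round(R, 2))                                     # diagonal 0.75, off-diagonal entries of size 0.25
print(np.round(48 * C_M_over_sigma2, 2))                  # entries of size 11, 7, 5 and 1
```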

Worked example: tomography

Repeat using only the single largest singular value (s^2 = 6), so that

V_p = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}

Model resolution matrix:

R = V_p V_p^T = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}

Model covariance matrix:

C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2} = \frac{\sigma^2}{24}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}

[Figure: input and output models on the 2x2 cell grid.]

Recap: Singular value decomposition

- There may exist a model null space: models that cannot be constrained by the data.
- There may exist a data null space: data that cannot be fit by any model.
- The general linear discrete inverse problem may be simultaneously under- and over-determined (mix-determined).
- Singular value decomposition is a framework for dealing with ill-posed problems.
- The pseudo inverse is constructed using the SVD and provides a unique model with desirable properties: it fits the data in a least squares sense and gives a minimum length model (no component in the null space).
- Model resolution and covariance can be traded off by choosing the number of eigenvalues used in the reconstruction.

Ill-posedness = sensitivity to noise

Look what happens when the eigenvalues are small and positive.

Truncated SVD:

m^{\dagger} = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i^T d}{s_i} \right) v_i

Noise in the data is amplified in the model if s_i << 1. The eigenvalue spectrum needs to be truncated by reducing p (discrete Picard condition). This is the stability question of Backus and Gilbert.

TSVD: choose the smallest p such that the data fit is acceptable,

\| G m - d \|^2 \le \delta

As N or M increase the computational cost increases significantly! (See example 4.3 of Aster et al., 2005.)
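A sketch of the truncated SVD solution and of the "smallest acceptable p" rule; the function names and tolerance handling are illustrative assumptions, not part of the notes.

```python
import numpy as np

def tsvd_solution(G, d, p):
    """Truncated-SVD model built from the largest p singular values."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return sum((U[:, i] @ d) / s[i] * Vt[i, :] for i in range(p))

def smallest_acceptable_p(G, d, delta):
    """Smallest p such that ||G m - d|| <= delta (TSVD rule sketched above)."""
    for p in range(1, min(G.shape) + 1):
        m = tsvd_solution(G, d, p)
        if np.linalg.norm(G @ m - d) <= delta:
            return p, m
    return min(G.shape), m   # fall back to the full expansion
```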

SVD example: the Shaw problem

m(\theta) = intensity of light incident on a slit at angle \theta, with -\pi/2 \le \theta \le \pi/2
d(s) = measurement of diffracted light intensity at angle s, with -\pi/2 \le s \le \pi/2

Shaw problem: given d(s), find m(\theta), where

d(s) = \int_{-\pi/2}^{\pi/2} (\cos s + \cos\theta)^2 \left( \frac{\sin(\pi(\sin s + \sin\theta))}{\pi(\sin s + \sin\theta)} \right)^2 m(\theta) \, d\theta

Is this a continuous or discrete inverse problem? Is this a linear or nonlinear inverse problem?

SVD example: the Shaw problem

Let's discretize the inverse problem. Sample the data d(s) and model m(\theta) at n equal angles:

s_i = \theta_i = \frac{(i - 0.5)\pi}{n} - \frac{\pi}{2}, \quad i = 1, 2, ..., n

d_i = d(s_i), \quad m_j = m(\theta_j), \quad i, j = 1, ..., n

This gives a system of n x n linear equations, d = G m, where

G_{i,j} = \Delta s \, (\cos s_i + \cos\theta_j)^2 \left( \frac{\sin(\pi(\sin s_i + \sin\theta_j))}{\pi(\sin s_i + \sin\theta_j)} \right)^2, \quad \Delta s = \frac{\pi}{n}

See the MATLAB routine `shaw`.
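A NumPy sketch of this discretization, written independently of the MATLAB `shaw` routine referenced above; np.sinc is used so the samples where sin(s_i) + sin(\theta_j) = 0 are handled cleanly.

```python
import numpy as np

def shaw_matrix(n=20):
    """Discretized Shaw kernel G_{ij} as defined on this slide."""
    theta = (np.arange(1, n + 1) - 0.5) * np.pi / n - np.pi / 2   # model angles
    s = theta.copy()                                              # data angles (same grid)
    ds = np.pi / n
    a = np.sin(s)[:, None] + np.sin(theta)[None, :]
    # np.sinc(x) = sin(pi x)/(pi x), so the a = 0 entries take the limiting value 1
    return ds * (np.cos(s)[:, None] + np.cos(theta)[None, :])**2 * np.sinc(a)**2

G = shaw_matrix(20)
print(G.shape)          # (20, 20)
```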

Example: ill-posedness

Ill-posedness means the solution is sensitive to noise.

d = G m, \quad m^{\dagger} = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i^T d}{s_i} \right) v_i

20 data, 20 unknowns (N = M = 20).

[Figure: eigenvalue (singular value) spectrum s_i versus index i for the Shaw problem.]

The condition number is the ratio of the largest to smallest singular value, here about 10^14. A large condition number means severe ill-posedness.
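A sketch computing the singular value spectrum and condition number of the n = 20 Shaw kernel, built inline as in the previous sketch.

```python
import numpy as np

n = 20
theta = (np.arange(1, n + 1) - 0.5) * np.pi / n - np.pi / 2
a = np.sin(theta)[:, None] + np.sin(theta)[None, :]
G = (np.pi / n) * (np.cos(theta)[:, None] + np.cos(theta)[None, :])**2 * np.sinc(a)**2

s = np.linalg.svd(G, compute_uv=False)
print(f"s_max = {s[0]:.3e}, s_min = {s[-1]:.3e}")
print(f"condition number = {s[0] / s[-1]:.3e}")   # extremely large: severe ill-posedness
```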

Example: ill-posedness

Eigenvectors for different singular values (Shaw problem):

m^{\dagger} = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i^T d}{s_i} \right) v_i

[Figure: eigenvector v_18, for the smallest non-zero singular value, and eigenvector v_1, for the largest singular value; amplitude versus model units.]

Test inversion without noise

d = G m, \quad m^{\dagger} = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i^T d}{s_i} \right) v_i

[Figure: input spike model, data from the input spike, and recovered model.]

Test inversion with noise

d = G m, \quad m^{\dagger} = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i^T d}{s_i} \right) v_i

Add Gaussian noise to the data, \sigma = 10^{-6}.

[Figure: input spike model, data from the spike model, and recovered model.]

The presence of small eigenvalues means the solution is sensitive to noise.

Shaw problem with p = 10

d = G m, \quad m^{\dagger} = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i^T d}{s_i} \right) v_i

Use the first 10 eigenvalues only.

[Figure: input spike model, no-noise solution and noise solution; amplitude versus model units.]

Truncating eigenvalues reduces the sensitivity to noise, but also the resolving power of the data.
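A sketch of the spike-model experiments on the last three slides: full-series and p = 10 reconstructions from noise-free and noisy data, with noise \sigma = 10^{-6} as above (spike position and random seed are assumptions).

```python
import numpy as np

n = 20
theta = (np.arange(1, n + 1) - 0.5) * np.pi / n - np.pi / 2
a = np.sin(theta)[:, None] + np.sin(theta)[None, :]
G = (np.pi / n) * (np.cos(theta)[:, None] + np.cos(theta)[None, :])**2 * np.sinc(a)**2

m_true = np.zeros(n)
m_true[n // 2] = 1.0                              # input spike model
d = G @ m_true
d_noisy = d + np.random.default_rng(0).normal(0.0, 1e-6, size=n)

U, s, Vt = np.linalg.svd(G)
def tsvd(d, p):
    return sum((U[:, i] @ d) / s[i] * Vt[i, :] for i in range(p))

print(np.linalg.norm(tsvd(d, n) - m_true))        # all terms, noise-free data
print(np.linalg.norm(tsvd(d_noisy, n) - m_true))  # all terms, noisy data: small s_i amplify the noise
print(np.linalg.norm(tsvd(d_noisy, 10) - m_true)) # first 10 terms, noisy data: amplification suppressed
```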

Shaw problem: Picard plot

A guide to choosing the SVD truncation level p (the number of retained eigenvalues):

m^{\dagger} = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i^T d}{s_i} \right) v_i

[Figure: Picard plot for the Shaw problem, showing the eigenvalue spectrum and the chosen truncation level.]
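A sketch of a Picard plot for the same setup: compare the decay of the singular values s_i with the coefficients |u_i^T d| and their ratio; the truncation level is usually read off where the coefficients stop decaying faster than the singular values (plotting details are assumptions).

```python
import numpy as np
import matplotlib.pyplot as plt

n = 20
theta = (np.arange(1, n + 1) - 0.5) * np.pi / n - np.pi / 2
a = np.sin(theta)[:, None] + np.sin(theta)[None, :]
G = (np.pi / n) * (np.cos(theta)[:, None] + np.cos(theta)[None, :])**2 * np.sinc(a)**2

m_true = np.zeros(n); m_true[n // 2] = 1.0
d = G @ m_true + np.random.default_rng(0).normal(0.0, 1e-6, size=n)

U, s, Vt = np.linalg.svd(G)
coeffs = np.abs(U.T @ d)                      # |u_i^T d|

plt.semilogy(s, "o-", label="singular values $s_i$")
plt.semilogy(coeffs, "s-", label=r"$|u_i^T d|$")
plt.semilogy(coeffs / s, "^-", label=r"$|u_i^T d| / s_i$")
plt.xlabel("index $i$"); plt.legend(); plt.show()
```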