Linearized inverse problems (weakly nonlinear problems)
Using a Taylor expansion, then away we go...
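Linearized inverse problems
Nonlinear inverse problem: $d_{obs,i} = g_i(m)$
Choose a reference model $m_o$ and perform a Taylor expansion of $g(m)$ with $m = m_o + \delta m$:
$g_i(m_o + \delta m) = g_i(m_o) + \nabla g_i^T \delta m + \dots$, where $\nabla g_i = \left[ \frac{\partial g_i}{\partial m_1}, \frac{\partial g_i}{\partial m_2}, \dots \right]^T$
Linearized inverse problem:
$\delta d = d_{obs} - g(m_o)$
$\delta d = G\,\delta m$, with $G_{i,j} = \frac{\partial g_i}{\partial m_j}$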
Linearized inverse problems
Data prediction error: $\phi(m) = (d - g(m))^T C_d^{-1} (d - g(m))$
Linearized problem: $\delta d = G\,\delta m$, with $\phi_0(\delta m) = (\delta d - G\,\delta m)^T C_d^{-1} (\delta d - G\,\delta m)$
Least squares solution: $\delta m = (G^T C_d^{-1} G)^{-1} G^T C_d^{-1}\,\delta d$
It can be shown that $\phi_0$ is a quadratic approximation to $\phi(m)$ about the reference model $m_o$.
Linearized problems need to be solved iteratively:
$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$
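The iteration above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production solver: the function name `gauss_newton`, the toy forward problem $g_i(m) = m_0^2 t_i$, the sample points `t`, and the starting model are all assumptions introduced here. The model is updated as $m_{n+1} = m_n + \delta m_{n+1}$ after each linearized solve.

```python
import numpy as np

def gauss_newton(g, jacobian, m0, d_obs, Cd_inv, n_iter=10):
    """Repeat delta_m = (G^T Cd^-1 G)^-1 G^T Cd^-1 delta_d, updating m each step."""
    m = np.asarray(m0, dtype=float).copy()
    for _ in range(n_iter):
        G = jacobian(m)              # G_ij = d g_i / d m_j at the current model
        delta_d = d_obs - g(m)       # data residual for the current model
        lhs = G.T @ Cd_inv @ G
        rhs = G.T @ Cd_inv @ delta_d
        m = m + np.linalg.solve(lhs, rhs)
    return m

# Hypothetical toy problem with one unknown: g_i(m) = m_0^2 * t_i
t = np.array([1.0, 2.0, 3.0])

def g(m):
    return m[0] ** 2 * t

def jacobian(m):
    return (2.0 * m[0] * t).reshape(-1, 1)

m_true = np.array([2.0])
d_obs = g(m_true)                    # noise-free synthetic data
Cd_inv = np.eye(len(t))              # assumed unit data covariance

m_est = gauss_newton(g, jacobian, m0=[1.0], d_obs=d_obs, Cd_inv=Cd_inv)
```

For this toy problem each step reduces to $m \leftarrow (m + 4/m)/2$, so a start at $m_0 = 1$ converges to $2$, a start at $m_0 = -1$ converges to $-2$, and a start at $m_0 = 0$ makes $G$ vanish and the solve fail. The outcome depends on where the iteration begins.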
Linearized inverse problems
Linearization can succeed... and linearization can fail.
The starting point for an iterative procedure can be all-important.
Example: Earthquake location
$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$
$m = [x, y, z, t_o]^T$
$d = [t_{arr,1}, t_{arr,2}, \dots, t_{arr,n}]^T$
$t_{arr,i} = t_o + \int_{R_i} \frac{1}{v(x)}\,dl$
$G_{i,j} = \frac{\partial g_i}{\partial m_j}$: derivative of the i-th arrival time with respect to the j-th hypocentral coordinate.
Example: Earthquake location
$t_r = \int_R \frac{1}{v(x)}\,dl$
$m = [x, y, z, t_o]^T$, $d = [t_1, t_2, \dots, t_N]^T$
What is the data-model parameter relationship? Assume a homogeneous 3-D Earth model:
$t_r = \frac{D(m)}{v}$, so $t_i = t_o + \frac{D_i(x, y, z)}{v}$
What are the Fréchet derivatives? $G_{i,j} = \frac{\partial d_i}{\partial m_j} = \,?$
$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$
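One way to work out the derivatives asked for above, assuming straight rays in the homogeneous model and writing the i-th station position as $(x_i, y_i, z_i)$ (station symbols introduced here for illustration):

```latex
D_i(x, y, z) = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2}, \qquad
t_i = t_o + \frac{D_i}{v}

G_{i,1} = \frac{\partial t_i}{\partial x} = \frac{x - x_i}{v\,D_i}, \quad
G_{i,2} = \frac{\partial t_i}{\partial y} = \frac{y - y_i}{v\,D_i}, \quad
G_{i,3} = \frac{\partial t_i}{\partial z} = \frac{z - z_i}{v\,D_i}, \quad
G_{i,4} = \frac{\partial t_i}{\partial t_o} = 1
```

Each row of $G$ depends on the current hypocentre estimate, which is why $G_n$ has to be recomputed at every iteration.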
Example: Linearized inversion
$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$
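A minimal sketch of this iteration for the earthquake location problem, assuming a homogeneous medium with straight rays. The station coordinates, wave speed, true hypocentre, and starting model below are hypothetical values chosen for illustration, and the data are noise-free synthetics.

```python
import numpy as np

# Hypothetical station coordinates (km) and constant wave speed (km/s)
stations = np.array([
    [0.0, 0.0, 0.0],
    [10.0, 0.0, 0.0],
    [0.0, 10.0, 0.0],
    [10.0, 10.0, 0.0],
    [30.0, 0.0, 0.0],
    [-20.0, 25.0, 0.0],
])
v = 5.0

def predict(m):
    """Arrival times t_i = t_o + D_i / v for m = [x, y, z, t_o]."""
    dist = np.linalg.norm(stations - m[:3], axis=1)
    return m[3] + dist / v

def frechet(m):
    """G_ij = d t_i / d m_j for the homogeneous model."""
    diff = m[:3] - stations                  # shape (N, 3)
    dist = np.linalg.norm(diff, axis=1)
    G = np.ones((len(stations), 4))          # last column: d t_i / d t_o = 1
    G[:, :3] = diff / (v * dist[:, None])
    return G

m_true = np.array([4.0, 6.0, 8.0, 1.0])      # assumed true hypocentre and origin time
d_obs = predict(m_true)                      # noise-free synthetic arrival times

m = np.array([5.0, 5.0, 5.0, 0.0])           # assumed starting model
for n in range(20):
    G = frechet(m)
    delta_d = d_obs - predict(m)
    # With C_d = sigma^2 I the data weights cancel in the update
    delta_m = np.linalg.solve(G.T @ G, G.T @ delta_d)
    m = m + delta_m
```

Printing `m` at each pass shows how quickly the estimate settles for a given starting model.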
Example: Earthquake location
$C_M = (G^T C_d^{-1} G)^{-1}$, with $C_d = \sigma^2 I$
Where do the significant trade-offs occur?
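One way to look for trade-offs is to convert $C_M$ into a correlation matrix and inspect the off-diagonal entries. The sketch below assumes a homogeneous medium with straight rays. The station layout, wave speed, hypocentre, and picking error `sigma` are hypothetical.

```python
import numpy as np

# Hypothetical surface stations (km), wave speed (km/s), and picking error (s)
stations = np.array([
    [0.0, 0.0, 0.0],
    [10.0, 0.0, 0.0],
    [0.0, 10.0, 0.0],
    [10.0, 10.0, 0.0],
    [30.0, 0.0, 0.0],
    [-20.0, 25.0, 0.0],
])
v = 5.0
sigma = 0.1

m = np.array([4.0, 6.0, 8.0, 1.0])           # assumed solution [x, y, z, t_o]
diff = m[:3] - stations
dist = np.linalg.norm(diff, axis=1)
G = np.hstack([diff / (v * dist[:, None]), np.ones((len(stations), 1))])

# With C_d = sigma^2 I, C_M = (G^T C_d^-1 G)^-1 = sigma^2 (G^T G)^-1
C_M = sigma ** 2 * np.linalg.inv(G.T @ G)

std = np.sqrt(np.diag(C_M))                  # standard errors of x, y, z, t_o
corr = C_M / np.outer(std, std)              # correlation matrix, unit diagonal
```

An off-diagonal entry of `corr` close to ±1 flags a pair of parameters that trade off strongly. With surface-only stations, the depth and origin-time entry is the usual place to look.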
Discrete non-unique inverse problems
Non-uniqueness: when there is no one answer to the question...
Example: Travel time tomography
Seismic travel times are observed at the surface, and we want to learn about the Earth's structure at depth.
Travel times are related to the wave speeds of rocks through the expression
$t = \int_R \frac{1}{v(x)}\,dl = \int_R s(x)\,dl$
The raypath $R$ also depends on the velocity structure $v(x)$. $R$ can be found using ray tracing methods.
Is this a continuous or discrete inverse problem?
Is it linear or nonlinear?
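Travel time tomography example
We can linearize the problem about a reference model $s_o(x)$ or $v_o(x)$. We get either
$\delta t = \int_{R_o} \delta s(x)\,dl$ or $\delta t = -\int_{R_o} \frac{1}{v_o^2}\,\delta v(x)\,dl$
Parametrize the perturbation with block basis functions:
$\delta m(x) = \sum_{j=1}^{M} \delta m_j\,\phi_j(x)$, where $\phi_j(x) = 1$ if $x$ is in block $j$ and $0$ otherwise
$\delta t_i = \sum_{j=1}^{M} \delta m_j \int_{R_{o,i}} \phi_j(x)\,dl = \sum_{j=1}^{M} G_{i,j}\,\delta m_j$
How do the elements of the matrix $G_{i,j}$ relate to the rays?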
Travel time tomography example
The element $G_{i,j}$ of the matrix is the integral of the j-th basis function along the i-th ray. Hence for our chosen basis functions it is the length of the i-th ray in the j-th block.
$\delta t_i = G_{i,j}\,\delta m_j$ (sum over $j$ implied), i.e. $\delta d = G\,\delta m$
$G = \begin{pmatrix} l_{1,1} & l_{1,2} & \cdots & l_{1,M} \\ l_{2,1} & l_{2,2} & \cdots & l_{2,M} \\ \vdots & \vdots & & \vdots \\ l_{N,1} & l_{N,2} & \cdots & l_{N,M} \end{pmatrix}$
$\delta d_i = t_i^o - t_i^c(s_o)$: travel time residual for the i-th path
$\delta m_j = s_j - s_{o,j}$: slowness perturbation in the j-th cell
$l_{i,j}$: length of the i-th ray in the j-th cell
Travel time tomography example
One ray and two blocks
$\delta t_i = G_{i,j}\,\delta m_j$
Non-uniqueness:
$\delta t_1 = l_{1,1}\,\delta s_1 + l_{1,2}\,\delta s_2$
Travel time tomography example
Many rays and two blocks
$\delta t_i = G_{i,j}\,\delta m_j$
Uniqueness? NO!
$\delta t_i = l_{i,1}\,\delta s_1 + l_{i,2}\,\delta s_2 \quad (i = 1, \dots, N)$
Travel time tomography example
Can we resolve both slowness perturbations?
$\delta t_1 = l_{1,1}\,\delta s_1 + l_{1,2}\,\delta s_2$
$\delta t_2 = l_{2,1}\,\delta s_1 + l_{2,2}\,\delta s_2$
$\delta d = G\,\delta m$
$\frac{l_{1,1}}{l_{1,2}} = \frac{l_{2,1}}{l_{2,2}} \;\Rightarrow\; |G| = 0$
$G$ has a zero determinant and hence the problem is under-determined.
Zero eigenvalues => linear dependence between equations => no unique solution. An infinite number of solutions exist!
The same argument applies to all rays that enter and exit through the same pair of sides.
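The zero-determinant case can be checked numerically. In this sketch the ray lengths are hypothetical, chosen so that both rays have the same ratio $l_{i,1}/l_{i,2}$, as happens when they cross the same pair of sides.

```python
import numpy as np

# Hypothetical ray lengths: rows are rays, columns are blocks.
# Both rays have l_{i,1} / l_{i,2} = 1/2, so the rows are proportional.
G = np.array([[1.0, 2.0],
              [1.5, 3.0]])

det = np.linalg.det(G)               # zero up to rounding
rank = np.linalg.matrix_rank(G)      # 1: only one independent equation

# A direction that no ray can see: G @ null_vec = 0
null_vec = np.array([2.0, -1.0])

# Two different slowness perturbations predicting identical travel times
ds_a = np.array([0.10, 0.05])
ds_b = ds_a + 3.0 * null_vec
dt_a = G @ ds_a
dt_b = G @ ds_b
```

Any multiple of `null_vec` can be added to a solution without changing the predicted residuals, which is the infinite family of solutions referred to above.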
Travel time tomography example
Two rays and two blocks
$\delta t_i = G_{i,j}\,\delta m_j$
Uniqueness? YES
$\delta t_i = l_{i,1}\,\delta s_1 + l_{i,2}\,\delta s_2 \quad (i = 1, 2)$
Travel time tomography example
Many rays and two blocks
$\delta t_i = G_{i,j}\,\delta m_j$
$C_M = (G^T C_d^{-1} G)^{-1}$
Model variance is low but cell size is large.
Over-determined linear least squares problem:
$\delta t_i = l_{i,1}\,\delta s_1 + l_{i,2}\,\delta s_2 \quad (i = 1, \dots, N)$
Travel time tomography example
Many rays and many blocks
$\delta t_i = G_{i,j}\,\delta m_j$
Model variance is higher but cell size is smaller.
Model variance and resolution trade off.
Simultaneously over- and under-determined linear least squares problem: a mix-determined problem.
Recap:
In a linear problem, if the number of data is less than the number of unknowns then the problem will be under-determined.
If the number of data is more than the number of unknowns the system may not be over-determined. The number of linearly independent data is what matters. This is the true number of pieces of information.
Linear discrete problems can be simultaneously over- and under-determined. This is a mix-determined problem.
There is a trade-off between the variance (of the solution) and the resolution (of the parametrization).
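The point about linearly independent data can be illustrated with the rank and singular values of $G$. The matrix below is a hypothetical example with four rays and three blocks, where the third block is never sampled and two of the rays are proportional.

```python
import numpy as np

# Hypothetical ray-length matrix: 4 rays (rows), 3 blocks (columns).
# Row 2 is twice row 1, and no ray passes through block 3.
G = np.array([
    [1.0, 1.0, 0.0],
    [2.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.5, 1.5, 0.0],
])

N, M = G.shape
rank = np.linalg.matrix_rank(G)                   # number of independent data
singular_values = np.linalg.svd(G, compute_uv=False)
```

Here `N > M`, yet `rank` is smaller than `M` and one singular value is zero. Blocks 1 and 2 are over-determined while block 3 is not constrained at all, so the problem is mix-determined.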
Discrete ill-posed problems
What does the data misfit function look like in a non-unique problem?
$\psi(m) = \frac{1}{2}(d - Gm)^T C_d^{-1} (d - Gm)$
If $G m_1 = 0$, then $d = G(m_o + m_1) = G m_o$.
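A small numerical check of this statement: the misfit takes the same value at every point along a null-space direction, so $\psi$ has a flat-bottomed valley instead of a single minimum. The matrix `G`, the data `d`, and the models below are hypothetical, with $C_d = I$ assumed.

```python
import numpy as np

# Hypothetical singular forward operator and data
G = np.array([[1.0, 2.0],
              [1.5, 3.0]])
Cd_inv = np.eye(2)                   # assumed unit data covariance
d = np.array([0.6, 0.7])

def misfit(m):
    """psi(m) = 1/2 (d - G m)^T Cd^-1 (d - G m)."""
    r = d - G @ m
    return 0.5 * r @ Cd_inv @ r

m_o = np.array([0.3, 0.1])           # some model
m_1 = np.array([2.0, -1.0])          # null-space vector: G @ m_1 = 0

# Misfit evaluated at several points along the line m_o + a * m_1
values = [misfit(m_o + a * m_1) for a in (-1.0, 0.0, 2.5)]
```

All entries of `values` agree, since moving along `m_1` does not change the predicted data.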
Discrete non-unique problems
What happens if the normal equations have no solution?
$m_{LS} = (G^T C_d^{-1} G)^{-1} G^T C_d^{-1} d = G^{-g} d$
Recall that the inverse of a matrix is proportional to the reciprocal of the determinant:
$G = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $|G| = ad - cb$, $G^{-1} = \frac{1}{|G|} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$
The determinant is the product of the eigenvalues. Hence the inverse does not exist if any of the eigenvalues of $G^T C_d^{-1} G$ are zero.
We have seen examples of this in the tomography problem.
This is an ill-posed or under-determined problem with no unique solution.
The Minimum Length solution
If the problem is completely under-determined we can minimize the length of the solution subject to it fitting the data:
minimize $L(m) = m^T m$ subject to $d = Gm$
The method of Lagrange multipliers says minimize $\phi(m, \lambda)$:
$\phi(m, \lambda) = m^T m + \lambda^T (d - Gm)$
...and we get
$m_{ML} = G^T (G G^T)^{-1} d$
Example: $T = l_1 s_1 + l_2 s_2$, so $G = [\,l_1 \;\; l_2\,]$
$\phi = s_1^2 + s_2^2 + \lambda (T - l_1 s_1 - l_2 s_2)$
$\frac{s_1}{s_2} = \frac{l_1}{l_2}$, $s_1 = \frac{l_1 T}{l_1^2 + l_2^2}$, $s_2 = \frac{l_2 T}{l_1^2 + l_2^2}$
We get the same solution from the general formula for $m_{ML}$.
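The agreement between the general formula and the worked example can be checked numerically. The ray lengths `l1`, `l2` and travel time `T` below are hypothetical values.

```python
import numpy as np

# Hypothetical one-ray, two-block example: T = l1*s1 + l2*s2
l1, l2, T = 3.0, 4.0, 10.0
G = np.array([[l1, l2]])             # shape (1, 2)
d = np.array([T])

# General minimum length formula: m_ML = G^T (G G^T)^-1 d
m_ml = G.T @ np.linalg.solve(G @ G.T, d)

# Closed form from the Lagrange multiplier example
s_closed = np.array([l1, l2]) * T / (l1 ** 2 + l2 ** 2)

# For a full-row-rank G the Moore-Penrose pseudo-inverse gives the same answer
m_pinv = np.linalg.pinv(G) @ d
```

All three vectors agree and reproduce the datum exactly (`G @ m_ml` equals `d`). The slowness is shared between the two blocks in proportion to the ray length in each.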
Minimum Length and least squares solutions
$m_{LS} = (G^T G)^{-1} G^T d$
$m_{ML} = G^T (G G^T)^{-1} d$
$m_{est} = G^{-g} d$
Model resolution matrix: $m_{est} = R\,m_{true}$, with $R = G^{-g} G$
Least squares: $R = (G^T G)^{-1} G^T G = I$
Minimum length: $R = G^T (G G^T)^{-1} G$
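Example: Minimum Length resolution matrix
Model resolution matrix: $m_{ML} = G^T (G G^T)^{-1} d$ and $m_{est} = G^{-g} d = G^{-g} G\,m_{true}$, so $m_{est} = R\,m_{true}$ with $R = G^{-g} G$.
For $G = [\,l_1 \;\; l_2\,]$:
$R = G^T (G G^T)^{-1} G = \begin{pmatrix} l_1 \\ l_2 \end{pmatrix} \left[ \begin{pmatrix} l_1 & l_2 \end{pmatrix} \begin{pmatrix} l_1 \\ l_2 \end{pmatrix} \right]^{-1} \begin{pmatrix} l_1 & l_2 \end{pmatrix} = \frac{1}{l_1^2 + l_2^2} \begin{pmatrix} l_1^2 & l_1 l_2 \\ l_1 l_2 & l_2^2 \end{pmatrix}$
If $l_1 = l_2$: $R = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$
Unlike the least squares case, the model resolution matrix is not the identity.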
Minimum Length and least squares solutions
$m_{LS} = (G^T G)^{-1} G^T d$
$m_{ML} = G^T (G G^T)^{-1} d$
$m_{est} = G^{-g} d$
Data resolution matrix: $d_{pre} = D\,d_{obs}$, with $D = G G^{-g}$
Least squares: $D = G (G^T G)^{-1} G^T$
Minimum length: $D = G G^T (G G^T)^{-1} = I$
There is symmetry between the least squares and minimum length solutions. Least squares completely solves the over-determined problem and has perfect model resolution, while minimum length solves the completely under-determined problem and has perfect data resolution. For mix-determined problems all solutions will be between these two extremes.
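The symmetry can be seen by computing both resolution matrices for one under-determined and one over-determined case. Both `G` matrices below are hypothetical: one ray through two blocks with $l_1 = l_2$, and three rays through two blocks.

```python
import numpy as np

# Completely under-determined: one ray, two blocks, l1 = l2 (hypothetical)
G_u = np.array([[1.0, 1.0]])
Gg_ml = G_u.T @ np.linalg.inv(G_u @ G_u.T)    # minimum length generalized inverse
R_ml = Gg_ml @ G_u                            # model resolution: not the identity
D_ml = G_u @ Gg_ml                            # data resolution: identity (1 x 1)

# Over-determined: three rays, two blocks (hypothetical)
G_o = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
Gg_ls = np.linalg.inv(G_o.T @ G_o) @ G_o.T    # least squares generalized inverse
R_ls = Gg_ls @ G_o                            # model resolution: identity (2 x 2)
D_ls = G_o @ Gg_ls                            # data resolution: a projection, not identity
```

`R_ml` has every entry equal to 0.5, so each estimated slowness is the average of the two true values. `D_ls` satisfies `D_ls @ D_ls == D_ls` up to rounding: it projects the observations onto the data that the model can actually predict.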