Linearized Inverse Problems

Linearized inverse problems (weakly nonlinear problems)
Using a Taylor expansion, then away we go...

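Linearized inverse problems
Nonlinear inverse problem:
$$d_{obs,i} = g_i(m)$$
Choose a reference model $m_o$ and perform a Taylor expansion of $g(m)$ with $m = m_o + \delta m$:
$$g_i(m_o + \delta m) = g_i(m_o) + \nabla g_i \cdot \delta m + \dots, \qquad \nabla g_i = \left[ \frac{\partial g_i}{\partial m_1}, \frac{\partial g_i}{\partial m_2}, \dots \right]^T$$
Linearized inverse problem:
$$\delta d = d_{obs} - g(m_o), \qquad \delta d = G\,\delta m, \qquad G_{i,j} = \frac{\partial g_i}{\partial m_j}$$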

Linearized inverse problems
Data prediction error:
$$\phi(m) = (d - g(m))^T C_d^{-1} (d - g(m))$$
Linearized problem:
$$\delta d = G\,\delta m, \qquad \phi_0(\delta m) = (\delta d - G\,\delta m)^T C_d^{-1} (\delta d - G\,\delta m)$$
It can be shown that $\phi_0$ is a quadratic approximation to $\phi(m)$ about the reference model $m_o$.
Least squares solution:
$$\delta m = (G^T C_d^{-1} G)^{-1} G^T C_d^{-1}\,\delta d$$
Linearized problems need to be solved iteratively:
$$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$$
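The iteration above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `linearized_inversion`, the exponential forward model, and the starting model are assumptions made for the example and do not come from the slides.

```python
import numpy as np

def linearized_inversion(g, jacobian, d_obs, m0, Cd_inv, n_iter=20):
    """Repeat m <- m + (G^T Cd^-1 G)^-1 G^T Cd^-1 (d_obs - g(m))."""
    m = np.asarray(m0, dtype=float)
    for _ in range(n_iter):
        G = jacobian(m)                  # G_ij = dg_i/dm_j at the current model
        delta_d = d_obs - g(m)           # residual about the current model
        lhs = G.T @ Cd_inv @ G
        rhs = G.T @ Cd_inv @ delta_d
        m = m + np.linalg.solve(lhs, rhs)  # add the perturbation delta_m
    return m

# Hypothetical weakly nonlinear forward problem: g_i(m) = m_1 * exp(-m_2 * x_i)
x = np.linspace(0.0, 2.0, 8)

def g(m):
    return m[0] * np.exp(-m[1] * x)

def jac(m):
    return np.column_stack([np.exp(-m[1] * x),
                            -m[0] * x * np.exp(-m[1] * x)])

m_true = np.array([2.0, 0.7])
d_obs = g(m_true)                        # noise-free synthetic data
m_est = linearized_inversion(g, jac, d_obs, m0=[1.5, 0.5],
                             Cd_inv=np.eye(x.size))
```

With a starting model this close to the truth the iteration settles on `m_true`; a poor starting model need not converge, which is the point made on the next slide.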

Linearized inverse problems
Linearization can succeed... and linearization can fail.
The starting point for an iterative procedure can be all important.

Example: Earthquake location
$$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$$
$$m = [x, y, z, t_o]^T, \qquad d = [t_{arr,1}, t_{arr,2}, \dots, t_{arr,N}]^T$$
$$t_{arr,i} = t_o + \int_{R_i} \frac{1}{v(x)}\,dl$$
$$G_{i,j} = \frac{\partial g_i}{\partial m_j}$$
$G_{i,j}$ is the derivative of the $i$-th arrival time with respect to the $j$-th hypocentral coordinate.

Example: Earthquake location
$$t_r = \int_R \frac{1}{v(x)}\,dl, \qquad m = [x, y, z, t_o]^T, \qquad d = [t_1, t_2, \dots, t_N]^T$$
What is the relationship between the data and the model parameters?
Assume a homogeneous 3-D Earth model:
$$t_r = \frac{D(m)}{v}, \qquad t_i = t_o + \frac{D_i(x, y, z)}{v}$$
What are the Frechet derivatives?
$$G_{i,j} = \frac{\partial d_i}{\partial m_j} = \;?$$
$$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$$
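For the homogeneous case, $D_i$ is the straight-line distance from the hypocentre to station $i$, so differentiating $t_i = t_o + D_i / v$ gives $\partial t_i / \partial x = (x - x_i) / (v D_i)$ (similarly for $y$ and $z$) and $\partial t_i / \partial t_o = 1$. The sketch below builds this matrix and checks it against central finite differences. The station coordinates, the wave speed, and the function names are hypothetical values chosen for illustration.

```python
import numpy as np

def travel_times(m, stations, v):
    """Arrival times t_i = t_o + D_i / v for a uniform wave speed v."""
    x, y, z, t0 = m
    dist = np.linalg.norm(np.array([x, y, z]) - stations, axis=1)
    return t0 + dist / v

def frechet(m, stations, v):
    """G_ij = d t_i / d m_j for m = [x, y, z, t_o]."""
    diff = np.asarray(m[:3]) - stations          # hypocentre minus station
    dist = np.linalg.norm(diff, axis=1)
    spatial = diff / (v * dist[:, None])         # derivatives w.r.t. x, y, z
    return np.column_stack([spatial, np.ones(len(stations))])

# Hypothetical network (km) and wave speed (km/s), for illustration only
stations = np.array([[0.0, 0.0, 0.0],
                     [15.0, 2.0, 0.0],
                     [-8.0, 12.0, 0.0],
                     [5.0, -14.0, 0.0],
                     [-11.0, -6.0, 0.0]])
v = 6.0
m = np.array([3.0, -2.0, 8.0, 1.0])              # x, y, z (depth), t_o

G = frechet(m, stations, v)

# Central finite-difference check of the analytic derivatives
h = 1e-6
G_fd = np.column_stack([
    (travel_times(m + h * e, stations, v)
     - travel_times(m - h * e, stations, v)) / (2 * h)
    for e in np.eye(4)
])
```

`G` has one row per station and one column per hypocentral parameter; the last column is all ones because every arrival time shifts equally with the origin time.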

Example: Linearized inversion
$$\delta m_{n+1} = (G_n^T C_d^{-1} G_n)^{-1} G_n^T C_d^{-1}\,\delta d_n$$

Example: Earthquake location
$$C_M = (G^T C_d^{-1} G)^{-1}, \qquad C_d = \sigma^2 I$$
Where do the significant trade-offs occur?
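One way to look for trade-offs is to normalize $C_M$ into a correlation matrix and inspect the off-diagonal terms. The sketch below does this for a hypothetical symmetric surface network above a source at 10 km depth in a uniform medium; the geometry, wave speed, and $\sigma$ are assumptions made for illustration, and $z$ is taken positive downward.

```python
import numpy as np

v = 6.0        # assumed uniform wave speed (km/s)
sigma = 0.1    # assumed arrival-time standard deviation (s)

# Hypothetical surface stations placed symmetrically about the epicentre (km)
stations = np.array([[10, 0, 0], [-10, 0, 0], [0, 10, 0], [0, -10, 0],
                     [20, 0, 0], [-20, 0, 0], [0, 20, 0], [0, -20, 0]],
                    dtype=float)
source = np.array([0.0, 0.0, 10.0])   # x, y, z with z positive downward

# Frechet derivatives for t_i = t_o + D_i / v
diff = source - stations
dist = np.linalg.norm(diff, axis=1)
G = np.column_stack([diff / (v * dist[:, None]), np.ones(len(stations))])

# C_M = (G^T Cd^-1 G)^-1 with Cd = sigma^2 I
C_M = sigma**2 * np.linalg.inv(G.T @ G)

# Correlation matrix: parameter order is x, y, z, t_o
std = np.sqrt(np.diag(C_M))
corr = C_M / np.outer(std, std)
depth_origin_corr = corr[2, 3]
```

For this particular geometry the depth and origin-time errors are strongly negatively correlated, while the horizontal coordinates are uncorrelated with everything else. Other networks will give different numbers.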

Discrete non-unique inverse problems
Non-uniqueness: When there is no one answer to the question...

Example: Travel time tomography
Seismic travel times are observed at the surface, and we want to learn about the Earth's structure at depth. Travel times are related to the wave speeds of rocks through the expression
$$t = \int_R \frac{1}{v(x)}\,dl = \int_R s(x)\,dl$$
The raypath $R$ also depends on the velocity structure $v(x)$. $R$ can be found using ray tracing methods.
Is this a continuous or discrete inverse problem?
Is it linear or nonlinear?

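Travel time tomography example
We can linearize the problem about a reference model $s_o(x)$ or $v_o(x)$. We get either
$$\delta t = \int_{R_o} \delta s(x)\,dl \qquad \text{or} \qquad \delta t = -\int_{R_o} \frac{1}{v_o^2}\,\delta v(x)\,dl$$
Expand the perturbation in block basis functions:
$$\delta m(x) = \sum_{j=1}^{M} \delta m_j\,\phi_j(x), \qquad \phi_j(x) = \begin{cases} 1 & \text{if } x \text{ in block } j \\ 0 & \text{otherwise} \end{cases}$$
$$\delta t_i = \sum_{j=1}^{M} \delta m_j \int_{R_{o,i}} \phi_j(x)\,dl = \sum_{j=1}^{M} \delta m_j\,G_{i,j}$$
How do the elements of the matrix $G_{i,j}$ relate to the rays?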

Travel time tomography example
The element $G_{i,j}$ of the matrix is the integral of the $j$-th basis function along the $i$-th ray. Hence for our chosen basis functions it is the length of the $i$-th ray in the $j$-th block.
$$\delta t_i = G_{i,j}\,\delta m_j, \qquad \delta d = G\,\delta m$$
$$G = \begin{pmatrix} l_{1,1} & l_{1,2} & \cdots & l_{1,M} \\ l_{2,1} & l_{2,2} & \cdots & l_{2,M} \\ \vdots & \vdots & & \vdots \\ l_{N,1} & l_{N,2} & \cdots & l_{N,M} \end{pmatrix}$$
$\delta d_i = t_i^o - t_i^c(s_o)$: travel time residual for the $i$-th path
$\delta m_j = s_j - s_{o,j}$: slowness perturbation in the $j$-th cell
$l_{i,j}$: length of the $i$-th ray in the $j$-th cell
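A rough numerical way to build such a matrix is to integrate each block basis function along each ray by sampling the ray at many midpoints. The sketch below does this for straight rays on a regular grid. The function `ray_length_matrix`, the two-block grid, and the ray endpoints are hypothetical, and the sampling is an approximation rather than an exact ray-cell intersection.

```python
import numpy as np

def ray_length_matrix(rays, nx, ny, cell=1.0, n_samples=2000):
    """Approximate G_ij = length of straight ray i inside cell j."""
    G = np.zeros((len(rays), nx * ny))
    for i, (start, end) in enumerate(rays):
        start = np.asarray(start, dtype=float)
        end = np.asarray(end, dtype=float)
        dl = np.linalg.norm(end - start) / n_samples     # sub-segment length
        frac = (np.arange(n_samples) + 0.5) / n_samples  # midpoints along ray
        pts = start + frac[:, None] * (end - start)
        ix = np.clip((pts[:, 0] // cell).astype(int), 0, nx - 1)
        iy = np.clip((pts[:, 1] // cell).astype(int), 0, ny - 1)
        np.add.at(G[i], iy * nx + ix, dl)   # accumulate phi_j along the ray
    return G

# Two unit blocks side by side, crossed by two hypothetical straight rays
rays = [((0.0, 0.5), (2.0, 0.5)),    # horizontal ray
        ((0.0, 0.2), (2.0, 0.8))]    # slightly inclined ray
G = ray_length_matrix(rays, nx=2, ny=1)

ray_lengths = np.array([np.hypot(e[0] - s[0], e[1] - s[1]) for s, e in rays])

# Travel time residuals predicted by a slowness perturbation in each block
delta_s = np.array([0.01, -0.02])
delta_t = G @ delta_s
```

Each row of `G` sums to the total length of the corresponding ray, and `delta_t` is the linearized forward prediction $\delta d = G\,\delta m$.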

Travel time tomography example
One ray and two blocks
$$\delta t_i = G_{i,j}\,\delta m_j$$
Non-uniqueness:
$$\delta t_1 = l_{1,1}\,\delta s_1 + l_{1,2}\,\delta s_2$$

Travel time tomography example
Many rays and two blocks
$$\delta t_i = G_{i,j}\,\delta m_j$$
Uniqueness? NO!
$$\delta t_i = l_{i,1}\,\delta s_1 + l_{i,2}\,\delta s_2 \qquad (i = 1, \dots, N)$$

Travel time tomography example
Can we resolve both slowness perturbations?
$$\delta t_1 = l_{1,1}\,\delta s_1 + l_{1,2}\,\delta s_2, \qquad \delta t_2 = l_{2,1}\,\delta s_1 + l_{2,2}\,\delta s_2, \qquad \delta d = G\,\delta m$$
$$\frac{l_{1,1}}{l_{1,2}} = \frac{l_{2,1}}{l_{2,2}} \;\Rightarrow\; |G| = 0$$
$G$ has a zero determinant and hence the problem is under-determined.
Zero eigenvalues => linear dependence between equations => no unique solution. An infinite number of solutions exist!
The same argument applies to all rays that enter and exit through the same pair of sides.
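A small numerical check of this argument, with hypothetical ray lengths chosen so that both rays have the same length ratio in the two blocks:

```python
import numpy as np

# Hypothetical ray lengths: both rays split their length equally between blocks
G = np.array([[1.00, 1.00],
              [1.25, 1.25]])

det = np.linalg.det(G)
rank = np.linalg.matrix_rank(G, tol=1e-10)
eigvals = np.linalg.eigvalsh(G.T @ G)     # eigenvalues of the normal matrix

# A perturbation that raises slowness in one block and lowers it in the other
null_vec = np.array([1.0, -1.0])
predicted = G @ null_vec                  # produces no travel time change
```

The determinant vanishes, the rank is 1 rather than 2, and one eigenvalue of $G^T G$ is zero, so any multiple of `null_vec` can be added to a solution without changing the predicted data.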

Travel time tomography example
Two rays and two blocks
$$\delta t_i = G_{i,j}\,\delta m_j$$
Uniqueness? YES
$$\delta t_i = l_{i,1}\,\delta s_1 + l_{i,2}\,\delta s_2 \qquad (i = 1, 2)$$

Travel time tomography example
Two rays and two blocks
$$\delta t_i = G_{i,j}\,\delta m_j, \qquad C_M = (G^T C_d^{-1} G)^{-1}$$
Model variance is low but cell size is large.
Over-determined linear least squares problem:
$$\delta t_i = l_{i,1}\,\delta s_1 + l_{i,2}\,\delta s_2 \qquad (i = 1, \dots, N)$$

Travel time tomography example
Many rays and many blocks
$$\delta t_i = G_{i,j}\,\delta m_j$$
Model variance is higher but cell size is smaller.
Model variance and resolution trade off.
Simultaneously over- and under-determined linear least squares problem.
Mix-determined problem.

Recap:
In a linear problem, if the number of data is less than the number of unknowns then the problem will be under-determined.
If the number of data is more than the number of unknowns the system may not be over-determined. The number of linearly independent data is what matters. This is the true number of pieces of information.
Linear discrete problems can be simultaneously over- and under-determined. This is a mix-determined problem.
There is a trade-off between the variance (of the solution) and the resolution (of the parametrization).
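The recap can be illustrated with a tiny mix-determined system. In this hypothetical three-block example, two rays sample only block 1 (over-determined) and one ray crosses blocks 2 and 3 with equal length (under-determined). The singular values count the linearly independent data. The matrix, the data values, and the use of the pseudoinverse as the generalized inverse are assumptions made for illustration.

```python
import numpy as np

# Rows are rays, columns are blocks (hypothetical unit ray lengths)
G = np.array([[1.0, 0.0, 0.0],    # ray 1 samples block 1 only
              [1.0, 0.0, 0.0],    # ray 2 repeats the same measurement
              [0.0, 1.0, 1.0]])   # ray 3 crosses blocks 2 and 3

n_data, n_unknowns = G.shape
singular_values = np.linalg.svd(G, compute_uv=False)
rank = np.linalg.matrix_rank(G, tol=1e-10)   # linearly independent data

# Pseudoinverse solution: least squares where the data over-determine the
# model, minimum length where they under-determine it
d = np.array([1.0, 1.2, 2.0])
m_est = np.linalg.pinv(G, rcond=1e-10) @ d
```

There are three data and three unknowns, yet only two independent pieces of information. Block 1 receives the average of its two measurements, while blocks 2 and 3 share the third datum equally because the data cannot separate them.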

Discrete ill-posed problems
What does the data misfit function look like in a non-unique problem?
$$\psi(m) = \frac{1}{2} (d - Gm)^T C_d^{-1} (d - Gm)$$
$$G m_1 = 0 \;\Rightarrow\; d = G(m_o + m_1) = G m_o$$
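Because $G m_1 = 0$, the misfit is unchanged along the direction $m_1$, so $\psi$ has a flat-bottomed valley rather than a single minimum. A minimal sketch for one ray crossing two blocks, with hypothetical ray lengths and reference model:

```python
import numpy as np

def misfit(m, G, d, Cd_inv):
    """psi(m) = 1/2 (d - G m)^T Cd^-1 (d - G m)."""
    r = d - G @ m
    return 0.5 * r @ Cd_inv @ r

G = np.array([[2.0, 1.0]])        # one ray, two blocks (hypothetical lengths)
Cd_inv = np.eye(1)
m_o = np.array([0.3, 0.1])        # a model that fits the data
d = G @ m_o
m_1 = np.array([1.0, -2.0])       # null-space direction: G @ m_1 = 0

# Misfit at several points along the null-space direction
steps = (-2.0, 0.0, 5.0)
values = [misfit(m_o + a * m_1, G, d, Cd_inv) for a in steps]

# Moving off the null-space direction does change the misfit
off_valley = misfit(m_o + np.array([0.1, 0.0]), G, d, Cd_inv)
```

All entries of `values` are zero to rounding error, whereas `off_valley` is positive.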

Discrete non-unique problems
What happens if the normal equations have no solution?
$$m_{LS} = (G^T C_d^{-1} G)^{-1} G^T C_d^{-1} d = G^{-g} d$$
Recall that the inverse of a matrix is proportional to the reciprocal of the determinant:
$$G = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad |G| = ad - cb, \qquad G^{-1} = \frac{1}{|G|} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
The determinant is the product of the eigenvalues. Hence the inverse does not exist if any of the eigenvalues of $G^T C_d^{-1} G$ are zero.
We have seen examples of this in the tomography problem.
This is an ill-posed or under-determined problem with no unique solution.

The Minimum Length solution
If the problem is completely under-determined we can minimize the length of the solution subject to it fitting the data:
$$\min L(m) = m^T m \quad \text{subject to} \quad d = Gm$$
The method of Lagrange multipliers says minimize $\phi(m, \lambda)$:
$$\phi(m, \lambda) = m^T m + \lambda^T (d - Gm)$$
...and we get
$$m_{ML} = G^T (G G^T)^{-1} d$$
Example: $G = \begin{pmatrix} l_1 & l_2 \end{pmatrix}$
$$T = l_1 s_1 + l_2 s_2, \qquad \phi = s_1^2 + s_2^2 + \lambda (T - l_1 s_1 - l_2 s_2)$$
$$\frac{s_1}{s_2} = \frac{l_1}{l_2}, \qquad s_1 = \frac{l_1 T}{l_1^2 + l_2^2}, \qquad s_2 = \frac{l_2 T}{l_1^2 + l_2^2}$$
We get the same solution from the general formula for $m_{ML}$.
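The step behind "and we get" is short: setting $\partial \phi / \partial m = 2m - G^T \lambda = 0$ gives $m = G^T \lambda / 2$; substituting into $d = Gm$ gives $\lambda = 2 (G G^T)^{-1} d$, and hence $m_{ML} = G^T (G G^T)^{-1} d$. The sketch below checks the one-ray, two-block example numerically. The values of $l_1$, $l_2$ and $T$ are hypothetical.

```python
import numpy as np

l1, l2, T = 3.0, 4.0, 5.0             # hypothetical ray lengths and travel time
G = np.array([[l1, l2]])
d = np.array([T])

# General minimum length formula m_ML = G^T (G G^T)^-1 d
m_ml = G.T @ np.linalg.solve(G @ G.T, d)

# Closed-form result from the Lagrange multiplier example
s1 = l1 * T / (l1**2 + l2**2)
s2 = l2 * T / (l1**2 + l2**2)

# NumPy's pseudoinverse gives the same minimum-norm solution here
m_pinv = np.linalg.pinv(G) @ d
```

All three routes agree, and the solution fits the single datum exactly while keeping $s_1 / s_2 = l_1 / l_2$.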

Minimum Length and least squares solutions
$$m_{LS} = (G^T G)^{-1} G^T d, \qquad m_{ML} = G^T (G G^T)^{-1} d, \qquad m_{est} = G^{-g} d$$
Model resolution matrix:
$$m_{est} = R\,m_{true}, \qquad R = G^{-g} G$$
Least squares: $R = (G^T G)^{-1} G^T G = I$
Minimum length: $R = G^T (G G^T)^{-1} G$

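Example: Minimum Length resolution matrix
Model resolution matrix:
$$m_{ML} = G^T (G G^T)^{-1} d, \qquad m_{est} = G^{-g} d = G^{-g} G\,m_{true}, \qquad m_{est} = R\,m_{true}, \qquad R = G^{-g} G$$
For $G = \begin{pmatrix} l_1 & l_2 \end{pmatrix}$:
$$R = G^T (G G^T)^{-1} G = \begin{pmatrix} l_1 \\ l_2 \end{pmatrix} \frac{1}{l_1^2 + l_2^2} \begin{pmatrix} l_1 & l_2 \end{pmatrix} = \frac{1}{l_1^2 + l_2^2} \begin{pmatrix} l_1^2 & l_1 l_2 \\ l_1 l_2 & l_2^2 \end{pmatrix}$$
If $l_1 = l_2$:
$$R = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$
Unlike the least squares case, the model resolution matrix is not the identity.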
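The resolution matrix above is easy to evaluate numerically. The sketch below uses hypothetical ray lengths and a hypothetical true model to show how the estimate is a smeared version of the truth.

```python
import numpy as np

def model_resolution_min_length(G):
    """R = G^T (G G^T)^-1 G for the minimum length solution."""
    return G.T @ np.linalg.inv(G @ G.T) @ G

# Equal ray lengths in the two blocks
R_equal = model_resolution_min_length(np.array([[1.0, 1.0]]))

# Unequal ray lengths (hypothetical values l1 = 3, l2 = 4)
R_unequal = model_resolution_min_length(np.array([[3.0, 4.0]]))

# A true model with an anomaly in block 1 only is recovered as an average
m_true = np.array([1.0, 0.0])
m_est = R_equal @ m_true
```

With equal lengths every entry of `R_equal` is one half, so the anomaly in block 1 is spread evenly over both blocks in `m_est`.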

Minimum Length and least squares solutions
$$m_{LS} = (G^T G)^{-1} G^T d, \qquad m_{ML} = G^T (G G^T)^{-1} d, \qquad m_{est} = G^{-g} d$$
Data resolution matrix:
$$d_{pre} = D\,d_{obs}, \qquad D = G G^{-g}$$
Least squares: $D = G (G^T G)^{-1} G^T$
Minimum length: $D = G G^T (G G^T)^{-1} = I$
There is symmetry between the least squares and minimum length solutions. Least squares completely solves the over-determined problem and has perfect model resolution, while minimum length solves the completely under-determined problem and has perfect data resolution. For mix-determined problems all solutions will lie between these two extremes.
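The symmetry can be checked numerically by forming both resolution matrices for one purely over-determined and one purely under-determined system. Both matrices `G_over` and `G_under` are hypothetical examples.

```python
import numpy as np

# Over-determined: three data, two unknowns -> least squares inverse
G_over = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
Gg_ls = np.linalg.inv(G_over.T @ G_over) @ G_over.T
R_ls = Gg_ls @ G_over             # model resolution
D_ls = G_over @ Gg_ls             # data resolution

# Under-determined: one datum, two unknowns -> minimum length inverse
G_under = np.array([[3.0, 4.0]])
Gg_ml = G_under.T @ np.linalg.inv(G_under @ G_under.T)
R_ml = Gg_ml @ G_under            # model resolution
D_ml = G_under @ Gg_ml            # data resolution

perfect_model_resolution = np.allclose(R_ls, np.eye(2))
perfect_data_resolution = np.allclose(D_ml, np.eye(1))
```

Least squares gives `R_ls` equal to the identity but a `D_ls` that is only a projection, so the data are not fit exactly. Minimum length gives `D_ml` equal to the identity but an `R_ml` that smears the model.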