Inverse Theory Course: LTU Kiruna. Hugh Pumphrey, March 6.

1 Preamble

These are the notes for the course Inverse Theory to be taught at Luleå Tekniska Universitet, Kiruna, in February. They are not exhaustive; rather, they are a collection of the formulæ that you will need to be familiar with, printed up so that you don't have to struggle to copy them off the board or the projector screen.

2 Inverse Theory: what is it?

Inverse theory is a term used for the tools used to attack a class of problems common in various branches of Earth and space science, but which occur in other fields as well. The main thing that links these problems is that you cannot make direct measurements of the thing x you want to measure, but you can measure another thing, y, which is related to x in a way that you understand. The quantities x and y usually consist of more than one number: they are vectors. It is common for x to be referred to as the state vector or the model vector, and for y to be referred to as the measurement vector or the data vector.

3 Setting up a linear problem

We suppose that we can measure y, we want x, and we know that they are related by some function F:

    y = F(x)

F is called the forward model and in general it could be any kind of function. For a lot of problems we can approximate F by a Taylor series about x = x_L:

    y ≈ F(x_L) + K(x − x_L)

where the matrix K is given by

    K = ∂y/∂x |_{x = x_L}

These names and symbols are not universal: see appendix A for details.
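As an illustration of this linearisation step, the sketch below estimates K by finite differences for a small made-up forward model. The function F, the linearisation point and the step sizes are invented for the example (they are not from the notes); NumPy is assumed to be available.

```python
import numpy as np

# A hypothetical nonlinear forward model F: R^2 -> R^3 (invented for illustration).
def F(x):
    return np.array([x[0] + x[1], x[0] * x[1], np.exp(0.1 * x[0])])

def jacobian(F, x_L, h=1e-6):
    """Estimate K = dy/dx at x = x_L, column by column, with finite differences."""
    y0 = F(x_L)
    K = np.zeros((y0.size, x_L.size))
    for j in range(x_L.size):
        dx = np.zeros_like(x_L)
        dx[j] = h
        K[:, j] = (F(x_L + dx) - y0) / h
    return K

x_L = np.array([1.0, 2.0])
K = jacobian(F, x_L)

# Linearised model: y ~ F(x_L) + K (x - x_L), good for x close to x_L.
x = x_L + np.array([0.01, -0.02])
y_lin = F(x_L) + K @ (x - x_L)
```

For states x close to x_L, y_lin agrees with the full forward model F(x) to within the neglected second-order terms of the Taylor series.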
Just as a tidying job, we choose new variables so that x ← x − x_L and y ← y − F(x_L). The problem therefore becomes

    y = Kx

which is a set of simultaneous equations to be solved for x.

4 Solving the linear problem

Let y have length m and x have length n. Solving our set of simultaneous equations is straightforward in principle if m = n, as we can immediately write

    x = K⁻¹y

Such a problem is called equi-determined or well-determined. But real-world inverse theory problems often have n > m or n < m. If n > m then we have more unknowns than equations. There will not be a single solution: there will be many solutions, and the problem is said to be under-determined. If n < m there will be no solution and the equations are said to be over-determined. Note that a set of equations may have n > m but have some equations that contradict each other, so that there is no solution instead of infinitely many solutions; such a problem is called mixed-determined. And a problem with n < m may actually be under-determined or mixed-determined if sufficient of the equations are effectively duplicates of each other. We need to know how to go about solving all these sorts of systems of equations.

4.1 The over-determined problem

The over-determined problem has no exact solution, so we have to look for the next best thing: a value of x which is less bad in some sense than any other value of x. We can define the error in our solution as e = y − Kx, but e is a vector, so we need a single number that is a measure of its length. We choose the sum of the squares of its elements,

    E = eᵀe = (y − Kx)ᵀ(y − Kx)

and look for the solution that makes E as small as possible. This idea occurs over and over again in inverse theory. The thing that we want to minimise is sometimes referred to as a cost function or a penalty function. We can find the x that minimises E by differentiating E with respect to x and setting the result to 0.
Differentiating with respect to a vector is a bit tricky; the end result in this case is

    KᵀKx = Kᵀy                                                        (1)

This is a (usually) equi-determined set of equations for x which can be solved in the usual manner. They are known as the normal equations. For the case where x has two elements we can draw a contour plot of E(x); this can be quite helpful in understanding the nature of the solution that we have obtained.
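The normal equations (1) are easy to check numerically. The sketch below solves an invented over-determined toy system (the numbers are my own, chosen only for illustration) via the normal equations, and compares with NumPy's least-squares routine, which minimises the same cost E:

```python
import numpy as np

# Over-determined toy system: m = 3 equations, n = 2 unknowns (invented numbers).
K = np.array([[1.0,  1.0],
              [1.0, -1.0],
              [2.0,  1.0]])
y = np.array([2.0, 0.1, 3.0])

# Solve the normal equations K^T K x = K^T y.
x_hat = np.linalg.solve(K.T @ K, K.T @ y)

# np.linalg.lstsq minimises the same cost E = (y - Kx)^T (y - Kx).
x_ref, *_ = np.linalg.lstsq(K, y, rcond=None)

# At the minimum the residual e = y - K x_hat is orthogonal to the columns of K.
grad = K.T @ (y - K @ x_hat)
```

Forming KᵀK explicitly works for small well-conditioned problems like this one; library routines such as lstsq avoid it for numerical reasons, but both give the same minimiser here.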
[Figure 1 appears here.]

Figure 1: Error surfaces (contours of E = (y − Kx)ᵀ(y − Kx)) for two sets of simultaneous equations. The heavy black lines represent the simultaneous equations and the black dot is the solution. The left-hand figure is over-determined and the solution is a least-squares solution. The right-hand figure is equi-determined and the solution is exact, so E is zero at the solution.

4.2 The under-determined problem

We'll look at this in great detail later in the course. For now, we note that a truly under-determined problem has an infinite number of exact solutions. It is sometimes useful to have a formula which will give one of these solutions, i.e. any vector x for which y = Kx. To do this, we need an n × m matrix D for which KD = I, so that x = Dy satisfies Kx = KDy = y. Now, KKᵀ is an m × m matrix which we can probably take the inverse of, so we can try D = Kᵀ(KKᵀ)⁻¹; this works, since KD = KKᵀ(KKᵀ)⁻¹ = I. The resulting solution

    x = Kᵀ(KKᵀ)⁻¹y

is sometimes called the minimum-norm solution. Figure 2 shows a simple example.

5 Dealing with measurement errors

5.1 Definition of the covariance matrix

Typically, x and y are lists of numbers and can therefore be handled using the techniques of matrix algebra. Because they are measured quantities (or related to measured quantities) they have random errors in them, so we need some of the tools for handling random variables. For our purposes a random variable is a thing for which you get a different result every time you measure it. Suppose we have a scalar random variable v and we measure it N times,
[Figure 2 appears here.]

Figure 2: Error surfaces (contours of E = (y − Kx)ᵀ(y − Kx)) for a set of simultaneous equations with one equation and two unknowns. The black dot is the minimum-norm solution x = Kᵀ(KKᵀ)⁻¹y.

calling the jth sample v_j. We define the mean value of v as

    v̄ = (1/N) Σ_{j=1}^{N} v_j

The spread of the measurements about the mean is often summarised by the standard deviation σ:

    σ = sqrt[ (1/(N−1)) Σ_{j=1}^{N} (v_j − v̄)² ]

The square of the standard deviation is called the variance. If we have two random variables, v and u, then we can calculate an additional quantity, the covariance:

    cov(v, u) = (1/(N−1)) Σ_{j=1}^{N} (v_j − v̄)(u_j − ū)

For a vector random variable v we define the mean in a similar way to the scalar case:

    v̄ = (1/N) Σ_{j=1}^{N} v_j

To express how the individual samples v_j vary about the mean v̄ we calculate a quantity called the covariance matrix, defined as:

    S = (1/(N−1)) Σ_{j=1}^{N} (v_j − v̄)(v_j − v̄)ᵀ
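The covariance matrix definition above can be checked directly on synthetic data. In this sketch the mixing matrix and sample count are invented for the example; NumPy's np.cov uses the same (N − 1)-normalised definition:

```python
import numpy as np

rng = np.random.default_rng(0)

# N samples of a 2-element random vector with correlated components
# (synthetic data: the mixing matrix A is invented for illustration).
N = 10000
A = np.array([[1.0, 0.0],
              [0.8, 0.5]])
v = (A @ rng.standard_normal((2, N))).T   # shape (N, 2): one sample per row

vbar = v.mean(axis=0)

# S = 1/(N-1) * sum_j (v_j - vbar)(v_j - vbar)^T
S = (v - vbar).T @ (v - vbar) / (N - 1)

# np.cov with rowvar=False (samples in rows) applies the same formula.
S_ref = np.cov(v, rowvar=False)
```

The diagonal of S holds the variances of the two components, and the off-diagonal entry reflects the correlation introduced by the mixing matrix.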
The diagonal elements of the covariance matrix are the variances of the individual elements of v. Each off-diagonal element is the covariance of two different elements of v.

5.2 Weighted least-squares

Suppose that some of our measurements y are more noisy than others, i.e. they have larger errors. We can describe these errors by a covariance matrix S, with the diagonal terms being the squared error on each element of y. If the errors are correlated, then the off-diagonal terms describe those correlations. The least-squares approach is now not quite appropriate, as it gives the same importance to all elements of y. Instead of minimising E = (y − Kx)ᵀ(y − Kx), we minimise

    E = (y − Kx)ᵀS⁻¹(y − Kx)

This weights the elements of y by the inverse of their squared errors, so the elements with the largest error get the smallest weight. Any correlations are also correctly accounted for. The normal equations now become

    KᵀS⁻¹Kx = KᵀS⁻¹y

so that the least-squares solution now becomes

    x̂ = (KᵀS⁻¹K)⁻¹KᵀS⁻¹y

Note that the matrix (KᵀS⁻¹K)⁻¹ can be shown to be the covariance matrix of x̂; by explicitly stating the errors on y we get an estimate of how good our solution is.

A Notation

Table 1: Different names and notations used for the same things in various inverse theory texts.

                                     Menke                  Gubbins                Rodgers
Your data                            Data vector d of       Data vector d of       Measurement vector y
                                     length N               length D               of length m
Your model parameters                Model vector m of      Model vector m of      State vector x of
                                     length M               length P               length n
Matrix in linear forward model       Data kernel G          Data kernel A          Influence function
                                                                                   matrix K
Covariance matrix of a               [cov a]                C(a) or C_a            S_a
random vector a
Matrix relating true model           Model resolution       Resolution matrix R    Averaging kernel
params to estimated ones             matrix R                                      matrix A
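A small numerical sketch of the weighted solution x̂ = (KᵀS⁻¹K)⁻¹KᵀS⁻¹y, using an invented over-determined system in which one measurement is far noisier than the others (all numbers are made up for illustration):

```python
import numpy as np

# Toy over-determined problem: three measurements of two unknowns.
K = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.5])

# Measurement covariance S: the third measurement has a much larger
# error variance, so it should get a much smaller weight.
S = np.diag([0.01, 0.01, 4.0])
Si = np.linalg.inv(S)

# Weighted normal equations: K^T S^-1 K x = K^T S^-1 y
A = K.T @ Si @ K
x_hat = np.linalg.solve(A, K.T @ Si @ y)

# Covariance of the solution: (K^T S^-1 K)^-1
S_xhat = np.linalg.inv(A)
```

Because the third measurement is heavily down-weighted, x̂ lands close to (1, 2), the values implied by the two precise measurements, rather than being dragged toward the noisy third equation.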
Inverse theory has been developed in a variety of contexts. You will therefore find textbooks that describe essentially the same mathematics but using different notation, different names for things and different examples. Table 1 shows the different names and symbols that three different textbooks use for the same things. I attempt here to stick to the notation of Rodgers.