Inverse Theory Course: LTU Kiruna. Day 1


Hugh Pumphrey

March 6, 0

1 Preamble

These are the notes for the course Inverse Theory to be taught at Luleå Tekniska Universitet, Kiruna in February 00. They are not exhaustive; rather, they are a collection of the formulæ that you will need to be familiar with, printed up so that you don't have to struggle to copy them off the board or the projector screen.

2 Inverse Theory: what is it?

Inverse theory is a term used for the tools used to attack a class of problems common in various branches of Earth and space science, but which occur in other fields as well. The main thing that links these problems is that you cannot make direct measurements of the thing x you want to measure, but you can measure another thing, y, which is related to x in a way that you understand. The quantities x and y usually consist of more than one number: they are vectors. It is common for x to be referred to as the state vector or the model vector, and for y to be referred to as the measurement vector or the data vector. (These names and symbols are not universal: see Appendix A for details.)

3 Setting up a linear problem

We suppose that we can measure y, we want x, and we know that they are related by some function F:

y = F(x)

F is called the forward model and in general it could be any kind of function. For a lot of problems we can approximate F by a Taylor series about x = x_L:

y ≈ F(x_L) + K(x − x_L)

where the matrix K is given by K = ∂y/∂x, evaluated at x = x_L.
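To make the linearisation concrete, here is a minimal sketch of estimating K = ∂y/∂x at x = x_L by finite differences. The forward model F here is made up for illustration; it is not one of the problems from the course.

```python
# Sketch: estimate K[i][j] = dF_i/dx_j at x = xL by forward differences.
# The forward model F below is hypothetical: 3 measurements from 2 state elements.

def F(x):
    x1, x2 = x
    return [x1 + x2, x1 * x2, x1 ** 2]

def jacobian(F, xL, h=1e-6):
    """Approximate the m x n matrix K = dy/dx at xL."""
    y0 = F(xL)
    K = [[0.0] * len(xL) for _ in range(len(y0))]
    for j in range(len(xL)):
        xp = list(xL)
        xp[j] += h                      # perturb one state element at a time
        yp = F(xp)
        for i in range(len(y0)):
            K[i][j] = (yp[i] - y0[i]) / h
    return K

K = jacobian(F, [2.0, 3.0])
# Analytic Jacobian at (2, 3) is [[1, 1], [3, 2], [4, 0]]
```

In a real problem F would be a physical model (e.g. a radiative-transfer calculation) and K would often be computed analytically rather than by differencing.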

Just as a tidying job, we choose new variables so that x → x − x_L and y → y − F(x_L). The problem therefore becomes

y = Kx

which is a set of simultaneous equations to be solved for x.

4 Solving the linear problem

Let y have length m and x have length n. Solving our set of simultaneous equations is straightforward in principle if m = n, as we can immediately write

x = K^{-1} y

Such a problem is called equi-determined or well-determined. But real-world inverse theory problems often have n > m or n < m. If n > m then we have more unknowns than equations. There will not be a single solution; there will be many solutions, and the problem is said to be under-determined. If n < m there will be no solution and the equations are said to be over-determined. Note that a set of equations may have n > m but have some equations that contradict each other, so that there is no solution instead of infinitely many solutions; such a problem is called mixed-determined. And a problem with n < m may actually be under-determined or mixed-determined if sufficiently many of the equations are effectively duplicates of each other. We need to know how to go about solving all these sorts of systems of equations.

4.1 The over-determined problem

The over-determined problem has no exact solution, so we have to look for the next best thing: a value of x which is less bad in some sense than any other value of x. We can define the error in our solution as e = y − Kx, but e is a vector, so we need a single number that is a measure of its length. We choose the sum of the squares of its elements,

E = e^T e = (y − Kx)^T (y − Kx)

and look for the solution that makes E as small as possible. This idea occurs over and over again in inverse theory. The thing that we want to minimise is sometimes referred to as a cost function or a penalty function. We can find the x that minimises E by differentiating E with respect to x and setting the result to 0.
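The cost function can be evaluated directly. The sketch below (with made-up numbers, not from the notes) computes E for two trial values of x in an over-determined system; neither solves the equations exactly, and the smaller E identifies the "less bad" trial.

```python
# Sketch: evaluate the cost E = (y - Kx)^T (y - Kx) for a 3-equation,
# 2-unknown over-determined system at two trial values of x.

K = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
y = [1.0, 1.0, 3.0]

def cost(x):
    # residual vector e = y - Kx, then E = e^T e
    e = [y[r] - sum(K[r][j] * x[j] for j in range(2)) for r in range(3)]
    return sum(ei * ei for ei in e)

E1 = cost([1.0, 1.0])   # residual e = [0, 0, 1], so E1 == 1.0
E2 = cost([0.0, 0.0])   # residual e = [1, 1, 3], so E2 == 11.0
```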
Differentiating with respect to a vector is a bit tricky; the end result in this case is

K^T K x = K^T y    (1)

This is a (usually) equi-determined set of equations for x, which can be solved in the usual manner. They are known as the normal equations. For the case where x has two elements we can draw a contour plot of E(x); this can be quite helpful in understanding the nature of the solution that we have obtained.
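For a small system the normal equations can be formed and solved by hand. A minimal sketch (made-up numbers) for a 3-equation, 2-unknown system:

```python
# Sketch: form K^T K x = K^T y and solve the resulting 2x2 system
# for an over-determined problem (3 equations, 2 unknowns).

K = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
y = [1.0, 1.0, 3.0]

# K^T K (2x2) and K^T y (length 2)
KtK = [[sum(K[r][i] * K[r][j] for r in range(3)) for j in range(2)]
       for i in range(2)]
Kty = [sum(K[r][i] * y[r] for r in range(3)) for i in range(2)]

# Solve the 2x2 normal equations by Cramer's rule
det = KtK[0][0] * KtK[1][1] - KtK[0][1] * KtK[1][0]
x = [(Kty[0] * KtK[1][1] - KtK[0][1] * Kty[1]) / det,
     (KtK[0][0] * Kty[1] - Kty[0] * KtK[1][0]) / det]
# x == [4/3, 4/3]: no exact solution exists; this x minimises E
```

For anything bigger than a toy problem one would of course use a library solver rather than Cramer's rule.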

Figure 1: Error surfaces (contours of E = (y − Kx)^T (y − Kx)) for two sets of simultaneous equations. The heavy black lines represent the simultaneous equations and the black dot is the solution. The left-hand figure is over-determined and the solution is a least-squares solution. The right-hand figure is equi-determined and the solution is exact, so E is zero at the solution.

4.2 The under-determined problem

We'll look at this in great detail later in the course. For now, we note that a truly under-determined problem has an infinite number of exact solutions. It is sometimes useful to have a formula which will give one of these solutions, i.e. any vector x for which y = Kx. To do this, we need an n × m matrix D for which DK = I. Multiply on the right by K^T to give DKK^T = K^T. Now, KK^T is an m × m matrix which we can probably take the inverse of. We can therefore write D = D(KK^T)(KK^T)^{-1} = K^T(KK^T)^{-1}. This solution is sometimes called the minimum-norm solution. Figure 2 shows a simple example.

5 Dealing with measurement errors

5.1 Definition of the covariance matrix

Typically, x and y are lists of numbers and can therefore be handled using the techniques of matrix algebra. Because they are measured quantities (or related to measured quantities) they have random errors in them, so we need some of the tools for handling random variables. For our purposes a random variable is a thing for which you get a different result every time you measure it. Suppose we have a scalar random variable v and we measure it N times,

Figure 2: Error surfaces (contours of E = (y − Kx)^T (y − Kx)) for a set of simultaneous equations with one equation and two unknowns. The black dot is the minimum-norm solution x = K^T(KK^T)^{-1} y.

calling the jth sample v_j. We define the mean value of v as

v̄ = (1/N) Σ_{j=1}^{N} v_j

The spread of the measurements about the mean is often summarised by the standard deviation σ:

σ = √( (1/(N−1)) Σ_{j=1}^{N} (v_j − v̄)² )

The square of the standard deviation is called the variance. If we have two random variables, v and u, then we can calculate an additional quantity, the covariance:

cov(v, u) = (1/(N−1)) Σ_{j=1}^{N} (v_j − v̄)(u_j − ū)

For a vector random variable v we define the mean in a similar way to the scalar case:

v̄ = (1/N) Σ_{j=1}^{N} v_j

To express how the individual samples v_j vary about the mean v̄ we calculate a quantity called the covariance matrix, defined as

S = (1/(N−1)) Σ_{j=1}^{N} (v_j − v̄)(v_j − v̄)^T
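The definitions above can be sketched numerically. The samples below are made up; the normalisation is the 1/(N−1) sample form.

```python
# Sketch: sample mean and covariance matrix of a two-element random
# vector v from N = 4 made-up samples.

samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [2.0, 4.0]]
N = len(samples)

mean = [sum(s[i] for s in samples) / N for i in range(2)]
S = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (N - 1)
      for j in range(2)] for i in range(2)]
# Diagonal of S: variances of v_1 and v_2; off-diagonal: their covariance.
# Here v_2 = 2 * v_1 exactly, so (up to rounding)
# S[0][1]**2 equals S[0][0] * S[1][1] (perfect correlation).
```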

The diagonal elements of the covariance matrix are the variances of the individual elements of v. Each off-diagonal element is the covariance of two different elements of v.

5.2 Weighted least-squares

Suppose that some of our measurements y are more noisy than others, i.e. they have larger errors. We can describe these errors by a covariance matrix S, with the diagonal terms being the squared error on each element of y. If the errors are correlated, then the off-diagonal terms describe those correlations. The least-squares approach is now not quite appropriate, as it gives the same importance to all elements of y. Instead of minimising E = (y − Kx)^T (y − Kx), we minimise E = (y − Kx)^T S^{-1} (y − Kx). This weights the elements of y by the inverse of their squared errors, so the elements with the largest error get the smallest weight. Any correlations are also correctly accounted for. The normal equations now become

K^T S^{-1} K x = K^T S^{-1} y

so that the least-squares solution now becomes

x̂ = (K^T S^{-1} K)^{-1} K^T S^{-1} y

Note that the matrix (K^T S^{-1} K)^{-1} can be shown to be the covariance matrix of x̂; by explicitly stating the errors on y we get an estimate of how good our solution is.

A Notation

Quantity | Menke | Gubbins | Rodgers
Your data | data vector d of length N | data vector d of length D | measurement vector y of length m
Your model parameters | model vector m of length M | model vector m of length P | state vector x of length n
Matrix in linear forward model | data kernel G | data kernel A | influence function matrix K
Covariance matrix of a random vector a | [cov a] | C(a) or C_a | S_a
Matrix relating true model params to estimated ones | model resolution matrix R | resolution matrix R | averaging kernel matrix A

Table 1: Different names and notations used for the same things in various inverse theory texts.

Inverse theory has been developed in a variety of contexts. You will therefore find textbooks that describe essentially the same mathematics but using different notation, different names for things and using different examples. Table 1 shows the different names and symbols that three different textbooks use for the same things. I attempt here to stick to the notation of Rodgers.
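Finally, returning to the weighted least-squares estimate x̂ = (K^T S^{-1} K)^{-1} K^T S^{-1} y: a minimal numerical sketch (made-up numbers) for the simplest possible case, two direct measurements of a single scalar with a diagonal S.

```python
# Sketch: weighted least squares for two measurements of one scalar x.
# K is a column of ones (each measurement observes x directly);
# the error variances 1 and 4 are the diagonal of S (off-diagonals zero).

K = [[1.0], [1.0]]
y = [2.0, 7.0]
var = [1.0, 4.0]          # diagonal of S

KtSinvK = sum(K[r][0] * K[r][0] / var[r] for r in range(2))   # 1x1, a scalar
KtSinvy = sum(K[r][0] * y[r] / var[r] for r in range(2))
xhat = KtSinvy / KtSinvK
# xhat == 3.0: the noisier measurement (variance 4) gets 1/4 the weight,
# and 1 / KtSinvK == 0.8 is the variance of xhat, as stated above.
```

This reduces to the familiar inverse-variance weighted mean, which is a useful sanity check on the general formula.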