An introduction to data assimilation
Eric Blayo, University of Grenoble and INRIA

Data assimilation, the science of compromises

Context: characterizing a (complex) system and/or forecasting its evolution, given several heterogeneous and uncertain sources of information. Widely used for geophysical fluids (meteorology, oceanography, atmospheric chemistry...), but also in numerous other domains (e.g. nuclear energy, medicine, agricultural planning...). Closely linked to inverse methods, control theory, estimation theory, filtering...

A daily example: numerical weather forecast

Data assimilation, the science of compromises

Numerous possible aims:
- Forecast: estimation of the present state (initial condition)
- Model tuning: parameter estimation
- Inverse modeling: estimation of parameter fields
- Data analysis: re-analysis (model = interpolation operator)
- OSSE: optimization of observing systems
- ...

Objectives for this lecture
- introduce data assimilation from several points of view
- give an overview of the main families of methods
- point out the main difficulties and the current corresponding answers

Outline
1. Data assimilation for dummies: a simple model problem
2. Generalization: linear estimation theory, variational and sequential approaches
3. Main current challenges

Some references
1. Blayo E. and M. Nodet, 2012: Introduction à l'assimilation de données variationnelle. Lecture notes for UJF Master course on data assimilation. https://team.inria.fr/moise/files/2012/03/methodes-inverses-var-m2-math-2009.pdf
2. Bocquet M., 2014: Introduction aux principes et méthodes de l'assimilation de données en géophysique. Lecture notes for ENSTA - ENPC data assimilation course. http://cerea.enpc.fr/homepages/bocquet/doc/assim-mb.pdf
3. Bocquet M., E. Cosme and E. Blayo (Eds.), 2014: Advanced Data Assimilation for Geosciences. Oxford University Press.
4. Bouttier F. and P. Courtier, 1999: Data assimilation, concepts and methods. Meteorological training course lecture series, ECMWF, European Centre for Medium-Range Weather Forecasts, Reading, UK. http://www.ecmwf.int/newsevents/training/rcourse_notes/DATA_ASSIMILATION/ASSIM_CONCEPTS/Assim_concepts21.html
5. Cohn S., 1997: An introduction to estimation theory. Journal of the Meteorological Society of Japan, 75, 257-288.
6. Daley R., 1993: Atmospheric Data Analysis. Cambridge University Press.
7. Evensen G., 2009: Data Assimilation: The Ensemble Kalman Filter. Springer.
8. Kalnay E., 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press.
9. Lahoz W., B. Khattatov and R. Menard (Eds.), 2010: Data Assimilation. Springer.
10. Rodgers C., 2000: Inverse Methods for Atmospheric Sounding. World Scientific, Series on Atmospheric, Oceanic and Planetary Physics.
11. Tarantola A., 2005: Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM. http://www.ipgp.fr/~tarantola/files/professional/books/inverseproblemtheory.pdf

A simple but fundamental example

Model problem: least squares approach

Two different measurements of a single quantity are available. Which estimate of its true value? The least squares approach.

Example: 2 observations y_1 = 19°C and y_2 = 21°C of the (unknown) present temperature x.
Let J(x) = (1/2) [ (x - y_1)² + (x - y_2)² ].
Minimizing J gives x̂ = (y_1 + y_2)/2 = 20°C.

Model problem: least squares approach

Observation operator. If the units differ: y_1 = 66.2°F and y_2 = 69.8°F.
Let H(x) = (9/5) x + 32 and J(x) = (1/2) [ (H(x) - y_1)² + (H(x) - y_2)² ].
Minimizing J gives x̂ = 20°C.

Drawback #1: if observation units are inhomogeneous. With y_1 = 66.2°F and y_2 = 21°C:
J(x) = (1/2) [ (H(x) - y_1)² + (x - y_2)² ]  gives  x̂ = 19.47°C !!

Drawback #2: if observation accuracies are inhomogeneous. If y_1 is twice as accurate as y_2, one should obtain x̂ = (2 y_1 + y_2)/3 = 19.67°C, so J should be
J(x) = (1/2) [ (x - y_1)²/(1/2) + (x - y_2)²/1 ]
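A small numerical sketch of these three cost functions, using the example's values (a brute-force grid search stands in for the analytic minimization):

```python
import numpy as np

y1_C, y2_C = 19.0, 21.0          # the two Celsius observations
H = lambda x: 9/5 * x + 32       # Celsius -> Fahrenheit observation operator

x = np.linspace(15, 25, 100001)  # grid search over candidate temperatures

# Homogeneous units: J(x) = 1/2 [(x - y1)^2 + (x - y2)^2] -> arithmetic mean
J = 0.5 * ((x - y1_C)**2 + (x - y2_C)**2)
print(x[J.argmin()])             # ~20.0

# Drawback #1: mixing degF and degC distorts the estimate
J_mix = 0.5 * ((H(x) - H(y1_C))**2 + (x - y2_C)**2)
print(x[J_mix.argmin()])         # ~19.47, pulled toward the degF observation

# Drawback #2: weighting by (known) variances restores sensitivity to accuracy
# y1 twice as accurate as y2: variances 1/2 and 1
J_w = 0.5 * ((x - y1_C)**2 / 0.5 + (x - y2_C)**2 / 1.0)
print(x[J_w.argmin()])           # ~19.67 = (2*y1 + y2)/3
```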

Model problem: statistical approach

Reformulation in a probabilistic framework: the goal is to estimate a scalar value x, and each y_i is a realization of a random variable Y_i. One looks for an estimator (i.e. a random variable) X̂ that is:
- linear: X̂ = α_1 Y_1 + α_2 Y_2 (in order to be simple)
- unbiased: E(X̂) = x (it seems reasonable)
- of minimal variance: Var(X̂) minimum (optimal accuracy)
This is the BLUE (Best Linear Unbiased Estimator).

Model problem: statistical approach

Let Y_i = x + ε_i, with the hypotheses:
- E(ε_i) = 0 (i = 1, 2): unbiased measurement devices
- Var(ε_i) = σ_i² (i = 1, 2): known accuracies
- Cov(ε_1, ε_2) = 0: independent measurement errors

Then, since X̂ = α_1 Y_1 + α_2 Y_2 = (α_1 + α_2) x + α_1 ε_1 + α_2 ε_2:

E(X̂) = (α_1 + α_2) x + α_1 E(ε_1) + α_2 E(ε_2) = (α_1 + α_2) x, so unbiasedness requires α_1 + α_2 = 1

Var(X̂) = E[ (X̂ - x)² ] = E[ (α_1 ε_1 + α_2 ε_2)² ] = α_1² σ_1² + (1 - α_1)² σ_2²

Setting d Var(X̂)/d α_1 = 0 gives α_1 = σ_2² / (σ_1² + σ_2²)

Model problem: statistical approach

In summary, the BLUE is

X̂ = ( Y_1/σ_1² + Y_2/σ_2² ) / ( 1/σ_1² + 1/σ_2² )

and its accuracy is

[Var(X̂)]⁻¹ = 1/σ_1² + 1/σ_2²   (accuracies are added)

Remarks:
- The hypothesis Cov(ε_1, ε_2) = 0 is not compulsory at all. With Cov(ε_1, ε_2) = c, one gets α_i = (σ_j² - c) / (σ_1² + σ_2² - 2c).
- Statistical hypotheses on the first two moments of ε_1, ε_2 lead to statistical results on the first two moments of X̂.
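A quick Monte Carlo sketch of these two formulas (the truth x = 20 and the standard deviations σ_1 = 0.5, σ_2 = 1 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x_true, s1, s2 = 20.0, 0.5, 1.0            # truth and known std devs

# BLUE weight: alpha1 = s2^2/(s1^2 + s2^2), i.e. inverse-variance weighting
w1 = s2**2 / (s1**2 + s2**2)
Y1 = x_true + s1 * rng.standard_normal(100_000)
Y2 = x_true + s2 * rng.standard_normal(100_000)
X_hat = w1 * Y1 + (1 - w1) * Y2

print(X_hat.mean())                        # ~20: unbiased
print(X_hat.var())                         # ~0.2 = 1/(1/s1^2 + 1/s2^2)
```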

Model problem: statistical approach

Variational equivalence. This is equivalent to the problem:

Minimize J(x) = (1/2) [ (x - y_1)²/σ_1² + (x - y_2)²/σ_2² ]

Remarks:
- This answers the previous problems of sensitivity to inhomogeneous units and insensitivity to inhomogeneous accuracies.
- This gives a rationale for choosing the norm defining J.
- J''(x̂) = 1/σ_1² + 1/σ_2², i.e. convexity = [Var(X̂)]⁻¹ = accuracy.

Model problem: alternative formulation, background + observation

If one considers that y_1 is a prior (or background) estimate x_b of x, and y_2 = y is an independent observation, then:

J(x) = (1/2) (x - x_b)²/σ_b² + (1/2) (x - y)²/σ_o²  =  J_b + J_o

and

x̂ = ( x_b/σ_b² + y/σ_o² ) / ( 1/σ_b² + 1/σ_o² ) = x_b + [ σ_b² / (σ_b² + σ_o²) ] (y - x_b)

where σ_b²/(σ_b² + σ_o²) is the gain and (y - x_b) the innovation.

Model problem: interpretation

If the background error and the observation error are uncorrelated, E(e_o e_b) = 0, then one can show that the estimation error and the innovation are uncorrelated: E(e_a (Y - X_b)) = 0. This is an orthogonal projection for the scalar product ⟨Z_1, Z_2⟩ = E(Z_1 Z_2).

Model problem: Bayesian approach

One can also consider x as a realization of a random variable X, and be interested in the pdf p(X | Y). Several optimality criteria:
- minimum variance: X̂_MV such that the spread around it is minimal: X̂_MV = E(X | Y)
- maximum a posteriori: the most probable value of X given Y: X̂_MAP such that ∂p(X | Y)/∂X = 0
- maximum likelihood: X̂_ML that maximizes p(Y | X)

These are based on the Bayes rule:

P(X = x | Y = y) = P(Y = y | X = x) P(X = x) / P(Y = y)

which requires additional hypotheses on the prior pdf for X and on the pdf of Y | X. In the Gaussian case, these estimations coincide with the BLUE.

Model problem: Bayesian approach

Back to our example: observations y_1 and y_2 of an unknown value x. The simplest approach: maximum likelihood (no prior on X).

Hypotheses:
- Y_i ~ N(x, σ_i²): unbiased, known accuracies + known pdf
- Cov(Y_1, Y_2) = 0: independent measurement errors

Likelihood function: L(x) = dP(Y_1 = y_1 and Y_2 = y_2 | X = x). One looks for the maximum likelihood estimate x̂_ML = Argmax L(x).

L(x) = ∏_{i=1}^{2} dP(Y_i = y_i | X = x) = ∏_{i=1}^{2} [ 1/(√(2π) σ_i) ] exp( -(y_i - x)² / (2σ_i²) )

Argmax L(x) = Argmin (-ln L(x)) = Argmin (1/2) [ (x - y_1)²/σ_1² + (x - y_2)²/σ_2² ]

Hence x̂_ML = ( y_1/σ_1² + y_2/σ_2² ) / ( 1/σ_1² + 1/σ_2² ): the BLUE again (because of the Gaussian hypothesis).
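A sketch checking numerically that the ML estimate coincides with the BLUE in this Gaussian case (the σ_i values are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize_scalar

y1, y2, s1, s2 = 19.0, 21.0, 0.5, 1.0

def neg_log_likelihood(x):
    # -ln L(x) up to an additive constant, for Gaussian Y_i ~ N(x, s_i^2)
    return 0.5 * ((x - y1)**2 / s1**2 + (x - y2)**2 / s2**2)

x_ml = minimize_scalar(neg_log_likelihood).x
x_blue = (y1/s1**2 + y2/s2**2) / (1/s1**2 + 1/s2**2)
print(x_ml, x_blue)   # identical in the Gaussian case: ~19.4
```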

Model problem: synthesis

Data assimilation methods are often split into two families: variational methods and statistical methods.
- Variational methods: minimization of a cost function (least squares approach).
- Statistical methods: algebraic computation of the BLUE (with hypotheses on the first two moments), or approximation of the pdfs (with hypotheses on the pdfs) and computation of the MAP estimator.
There are strong links between these approaches, depending on the case (linear? Gaussian?).

Theorem: if you have understood the previous stuff, you have (almost) understood everything about data assimilation.

Generalization: variational approach

Generalization: arbitrary number of unknowns and observations

To be estimated: x = (x_1, ..., x_n)^T ∈ R^n
Observations: y = (y_1, ..., y_p)^T ∈ R^p
Observation operator: y ≃ H(x), with H: R^n → R^p

A simple example of observation operator: if x = (x_1, x_2, x_3, x_4)^T and y = (an observation of (x_1 + x_2)/2, an observation of x_4)^T, then H(x) = Hx with

H = ( 1/2  1/2  0  0
       0    0   0  1 )

Cost function: J(x) = (1/2) ‖H(x) - y‖², with the norm ‖·‖ to be chosen.

Reminder: norms and scalar products. For u = (u_1, ..., u_n)^T ∈ R^n:
- Euclidean norm: ‖u‖² = u^T u = Σ_i u_i², with associated scalar product (u, v) = u^T v = Σ_i u_i v_i
- Generalized norm: let M be an n × n symmetric positive definite matrix; the M-norm is ‖u‖²_M = u^T M u = Σ_i Σ_j m_ij u_i u_j, with associated scalar product (u, v)_M = u^T M v = Σ_i Σ_j m_ij u_i v_j
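A minimal sketch of the two norms (the vector u and the SPD matrix M are arbitrary illustrative choices):

```python
import numpy as np

u = np.array([1.0, -2.0])
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])    # symmetric positive definite

norm2_euclid = u @ u           # u^T u
norm2_M      = u @ M @ u       # u^T M u
print(norm2_euclid, norm2_M)   # 5.0, 4.0
```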

Remark: an (intuitive) necessary, but not sufficient, condition for the existence of a unique minimum is p ≥ n.

Formalism: background value + new observations

z = ( x_b ; y ): background + new observations. The cost function becomes:

J(x) = (1/2) ‖x - x_b‖²_b + (1/2) ‖H(x) - y‖²_o = J_b + J_o

The necessary condition for the existence of a unique minimum (p ≥ n) is then automatically fulfilled.

If the problem is time dependent

Observations are distributed in time, y = y(t), and the observation cost function becomes:

J_o(x) = (1/2) Σ_{i=0}^{N} ‖H_i(x(t_i)) - y(t_i)‖²_o

There is a model describing the evolution of x: dx/dt = M(x), with x(t = 0) = x_0. J is then often no longer minimized w.r.t. x, but w.r.t. x_0 only, or w.r.t. some other parameters:

J_o(x_0) = (1/2) Σ_{i=0}^{N} ‖H_i(x(t_i)) - y(t_i)‖²_o = (1/2) Σ_{i=0}^{N} ‖H_i(M_{0→t_i}(x_0)) - y(t_i)‖²_o

If the problem is time dependent

J(x_0) = (1/2) ‖x_0 - x_0^b‖²_b + (1/2) Σ_{i=0}^{N} ‖H_i(x(t_i)) - y(t_i)‖²_o

i.e. a background term J_b plus an observation term J_o.

Uniqueness of the minimum?

J(x_0) = J_b(x_0) + J_o(x_0) = (1/2) ‖x_0 - x_b‖²_b + (1/2) Σ_{i=0}^{N} ‖H_i(M_{0→t_i}(x_0)) - y(t_i)‖²_o

If H and M are linear, then J_o is quadratic. However J_o generally does not have a unique minimum, since the number of observations is generally smaller than the size of x_0 (the problem is underdetermined: p < n).

Example: let (x_1^t, x_2^t) = (1, 1) and y = 1.1 an observation of (x_1 + x_2)/2:

J_o(x_1, x_2) = (1/2) ( (x_1 + x_2)/2 - 1.1 )²   (minimized along a whole line)

Adding J_b makes the problem of minimizing J = J_o + J_b well posed. With the background (x_1^b, x_2^b) = (0.9, 1.05):

J(x_1, x_2) = (1/2) ( (x_1 + x_2)/2 - 1.1 )² + (1/2) [ (x_1 - 0.9)² + (x_2 - 1.05)² ]

whose unique minimizer is (x_1, x_2) = (0.94166..., 1.09166...), as checked below.
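Setting ∇J = H^T(Hx - y) + (x - x_b) = 0 gives the linear system (H^T H + I) x = H^T y + x_b, which a few lines of Python confirm (a sketch with the example's values):

```python
import numpy as np

# J(x) = 1/2 ((x1+x2)/2 - 1.1)^2 + 1/2 ||x - xb||^2, with xb = (0.9, 1.05)
H = np.array([[0.5, 0.5]])
y = np.array([1.1])
xb = np.array([0.9, 1.05])

# grad J = H^T (H x - y) + (x - xb) = 0  =>  (H^T H + I) x = H^T y + xb
A = H.T @ H + np.eye(2)
x_hat = np.linalg.solve(A, H.T @ y + xb)
print(x_hat)   # [0.94166..., 1.09166...]
```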

Uniqueness of the minimum?

If H and/or M are nonlinear, then J_o is no longer quadratic.

Example: the Lorenz system (1963):

dx/dt = α(y - x)
dy/dt = βx - y - xz
dz/dt = -γz + xy

(animations: http://www.chaos-math.org)

with the cost function J_o(y_0) = (1/2) Σ_{i=0}^{N} ( x(t_i) - x_obs(t_i) )², controlled by the initial value y_0 only.
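A sketch of this cost function for the Lorenz model: the classical parameter values α = 10, β = 28, γ = 8/3, the observation window and the initial states are all illustrative assumptions, not from the slides. Evaluating J_o for a few values of y_0 already shows its non-quadratic behaviour.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, X, a=10.0, b=28.0, c=8/3):
    x, y, z = X
    return [a*(y - x), b*x - y - x*z, -c*z + x*y]

t_obs = np.linspace(0, 2, 21)
# synthetic observations of the x-component from a "true" initial state
truth = solve_ivp(lorenz, (0, 2), [1.0, 1.0, 20.0], t_eval=t_obs, rtol=1e-8)
x_obs = truth.y[0]

def Jo(y0):
    # cost as a function of the initial y-component only
    sol = solve_ivp(lorenz, (0, 2), [1.0, y0, 20.0], t_eval=t_obs, rtol=1e-8)
    return 0.5 * np.sum((sol.y[0] - x_obs)**2)

for y0 in [0.5, 0.9, 1.0, 1.1, 1.5]:
    print(y0, Jo(y0))   # non-quadratic, possibly multi-modal in y0
```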

Uniqueness of the minimum?

If H and/or M are nonlinear, J_o is no longer quadratic. Adding J_b makes it more quadratic (J_b is a regularization term), but J = J_o + J_b may nevertheless have several local minima.

A fundamental remark before going into minimization aspects

Once J is defined (i.e. once all the ingredients are chosen: control variables, norms, observations...), the problem is entirely defined, and hence so is its solution. The physical (i.e. the most important) part of data assimilation lies in the definition of J. The rest of the job, i.e. minimizing J, is only technical work.

Minimizing J

J(x) = J_b(x) + J_o(x) = (1/2) (x - x_b)^T B⁻¹ (x - x_b) + (1/2) (Hx - y)^T R⁻¹ (Hx - y)

Optimal estimation in the linear case:

x̂ = x_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (y - H x_b)

where (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ is the gain matrix and (y - H x_b) the innovation vector.

Given the sizes n and p, it is generally impossible to handle H, B and R explicitly, so the direct computation of the gain matrix is impossible. Even in the linear case (for which we have an explicit expression for x̂), the computation of x̂ is therefore performed with an optimization algorithm.
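A small-dimension sketch of this analysis formula (all matrices and values below are illustrative assumptions; at realistic sizes the gain matrix is never formed explicitly, as noted above):

```python
import numpy as np

n, p = 3, 2
xb = np.array([1.0, 2.0, 3.0])
H  = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.5, 0.5]])
y  = np.array([1.2, 2.6])
B  = 0.5 * np.eye(n)
R  = 0.1 * np.eye(p)

Binv, Rinv = np.linalg.inv(B), np.linalg.inv(R)
# gain matrix K = (B^-1 + H^T R^-1 H)^-1 H^T R^-1
K = np.linalg.inv(Binv + H.T @ Rinv @ H) @ H.T @ Rinv
x_hat = xb + K @ (y - H @ xb)        # analysis = background + gain * innovation
print(x_hat)
```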

Minimizing J: descent methods

Descent methods for minimizing the cost function require the knowledge of (an estimate of) its gradient: x_{k+1} = x_k + α_k d_k, with
- d_k = -∇J(x_k): gradient method
- d_k = -[Hess(J)(x_k)]⁻¹ ∇J(x_k): Newton method
- d_k = -B_k ∇J(x_k): quasi-Newton methods (BFGS, ...)
- d_k = -∇J(x_k) + ( ‖∇J(x_k)‖² / ‖∇J(x_{k-1})‖² ) d_{k-1}: conjugate gradient
- ...
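A sketch of the simplest of these, the fixed-step gradient method, applied to the same small quadratic J as above (the step size and iteration count are ad hoc choices):

```python
import numpy as np

xb = np.array([1.0, 2.0, 3.0])
H  = np.array([[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]])
y  = np.array([1.2, 2.6])
Binv = np.linalg.inv(0.5 * np.eye(3))
Rinv = np.linalg.inv(0.1 * np.eye(2))

def grad_J(x):
    # gradient of J(x) = 1/2 (x-xb)^T B^-1 (x-xb) + 1/2 (Hx-y)^T R^-1 (Hx-y)
    return Binv @ (x - xb) + H.T @ Rinv @ (H @ x - y)

x = xb.copy()                 # start from the background
for k in range(1000):
    x = x - 0.04 * grad_J(x)  # d_k = -grad J(x_k), fixed step alpha_k
print(x)                      # same x_hat as the direct formula above
```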

Getting the gradient is not obvious

It is often difficult (or even impossible) to obtain the gradient through the computation of growth rates. Example:

dx/dt = M(x(t)), t ∈ [0, T], x(t = 0) = u, with u = (u_1, ..., u_N)^T

J(u) = (1/2) ∫_0^T ‖x(t) - x_obs(t)‖² dt   requires one model run

∇J(u) = ( ∂J/∂u_1 (u), ..., ∂J/∂u_N (u) )^T ≈ ( [J(u + α e_1) - J(u)]/α, ..., [J(u + α e_N) - J(u)]/α )^T   requires N + 1 model runs

In actual applications like meteorology / oceanography, N = dim(u) = O(10^6 - 10^9), so this method cannot be used. Alternatively, the adjoint method provides a very efficient way to compute ∇J.

Example: an adjoint for the Burgers equation

∂u/∂t + u ∂u/∂x - ν ∂²u/∂x² = f,  x ∈ ]0, L[, t ∈ [0, T]
u(0, t) = ψ_1(t), u(L, t) = ψ_2(t),  t ∈ [0, T]
u(x, 0) = u_0(x),  x ∈ [0, L]

with u_obs(x, t) an observation of u(x, t), and the cost function

J(u_0) = (1/2) ∫_0^T ∫_0^L ( u(x, t) - u_obs(x, t) )² dx dt

Adjoint model:

∂p/∂t + u ∂p/∂x + ν ∂²p/∂x² = u - u_obs,  x ∈ ]0, L[, t ∈ [0, T]
p(0, t) = 0, p(L, t) = 0,  t ∈ [0, T]
p(x, T) = 0,  x ∈ [0, L]   (a final condition!! → backward integration)

Gradient of J: ∇J = p(·, 0), a function of x.
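The same forward/backward mechanics can be sketched on a small discrete linear model x_{k+1} = M x_k, for which the adjoint recursion p_k = H^T(H x_k - y_k) + M^T p_{k+1} delivers ∇J(x_0) = p_0 in one forward and one backward sweep. Everything below is an illustrative toy (not the Burgers adjoint), with a finite-difference check:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 4, 10
M = 0.95 * rng.standard_normal((n, n)) / np.sqrt(n)   # linear model x_{k+1} = M x_k
H = rng.standard_normal((2, n))                        # linear observation operator
x0 = rng.standard_normal(n)
ys = [rng.standard_normal(2) for _ in range(N + 1)]    # synthetic observations

def forward(u):
    xs = [u]
    for _ in range(N):
        xs.append(M @ xs[-1])
    return xs

def J(u):
    return 0.5 * sum(np.sum((H @ x - y)**2) for x, y in zip(forward(u), ys))

# Adjoint: one forward + one backward integration give the full gradient
xs = forward(x0)
p = H.T @ (H @ xs[N] - ys[N])                 # "final condition" of the adjoint
for k in range(N - 1, -1, -1):
    p = H.T @ (H @ xs[k] - ys[k]) + M.T @ p   # backward recursion
grad_adjoint = p

# Finite differences need one extra model run per control component (here n+1 runs)
eps = 1e-6
grad_fd = np.array([(J(x0 + eps*e) - J(x0)) / eps for e in np.eye(n)])
print(np.max(np.abs(grad_adjoint - grad_fd)))   # agrees to ~1e-5
```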

Getting the gradient is not obvious

The adjoint method requires writing a tangent linear code and an adjoint code (beyond the scope of this lecture). This task:
- obeys systematic rules
- is not the most interesting one you can imagine
- can be helped by automatic differentiation software: cf. http://www.autodiff.org

On the contrary, do not forget that, if the size of the control variable is very small (< 10), ∇J can easily be estimated by the computation of growth rates.

Generalization: statistical approach

Generalization: statistical approach

To be estimated: x = (x_1, ..., x_n)^T ∈ R^n. Observations: y = (y_1, ..., y_p)^T ∈ R^p. Observation operator: y ≃ H(x), with H: R^n → R^p.

Statistical framework: y is a realization of a random vector Y. One looks for the BLUE, i.e. a random vector X̂ that is:
- linear: X̂ = AY, with size(A) = (n, p)
- unbiased: E(X̂) = x
- of minimal variance: Var(X̂) = Σ_{i=1}^{n} Var(X̂_i) = Tr(Cov(X̂)) minimum

Generalization: statistical approach

Hypotheses:
- Linear observation operator: H(x) = Hx
- Y = Hx + ε, with ε a random vector in R^p
- E(ε) = 0: unbiased measurement devices
- Cov(ε) = E(εε^T) = R: known accuracies and covariances

BLUE: X̂ = (H^T R⁻¹ H)⁻¹ H^T R⁻¹ Y, with Cov(X̂) = (H^T R⁻¹ H)⁻¹

Reminder, the variational approach in the linear case:

J_o(x) = (1/2) ‖Hx - y‖²_o = (1/2) (Hx - y)^T R⁻¹ (Hx - y)

min over x ∈ R^n of J_o(x)  →  x̂ = (H^T R⁻¹ H)⁻¹ H^T R⁻¹ y

Formalism: background value + new observations

Background: X_b = x + ε_b, and new observations: Y = Hx + ε_o.

Hypotheses:
- E(ε_b) = 0: unbiased background
- E(ε_o) = 0: unbiased measurement devices
- Cov(ε_b, ε_o) = 0: independent background and observation errors
- Cov(ε_b) = B and Cov(ε_o) = R: known accuracies and covariances

BLUE:

X̂ = X_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (Y - H X_b)

with gain matrix (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹, innovation vector (Y - H X_b), and

[Cov(X̂)]⁻¹ = B⁻¹ + H^T R⁻¹ H   (accuracies are added)

Link with the variational approach

Statistical approach (BLUE):

X̂ = X_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (Y - H X_b), with Cov(X̂) = (B⁻¹ + H^T R⁻¹ H)⁻¹

Variational approach in the linear case:

J(x) = (1/2) ‖x - x_b‖²_b + (1/2) ‖H(x) - y‖²_o = (1/2) (x - x_b)^T B⁻¹ (x - x_b) + (1/2) (Hx - y)^T R⁻¹ (Hx - y)

min over x ∈ R^n of J(x)  →  x̂ = x_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (y - H x_b)

Same remarks as previously. The statistical approach rationalizes the choice of the norms for J_o and J_b in the variational approach, and

[Cov(X̂)]⁻¹ = B⁻¹ + H^T R⁻¹ H = Hess(J)   (accuracy = convexity)

If the problem is time dependent

Dynamical system:

x^t(t_{k+1}) = M(t_k, t_{k+1}) x^t(t_k) + e(t_k)

where x^t(t_k) is the true state at time t_k, M(t_k, t_{k+1}) the model (assumed linear) between t_k and t_{k+1}, and e(t_k) the model error at time t_k.

At every observation time t_k, we have an observation y_k and a model forecast x^f(t_k), so the BLUE can be applied, under the hypotheses:
- e(t_k) is unbiased, with covariance matrix Q_k
- e(t_k) and e(t_l) are independent (k ≠ l)
- the observation y_k is unbiased, with error covariance matrix R_k
- e(t_k) and the analysis error x^a(t_k) - x^t(t_k) are independent

If the problem is time dependent

Kalman filter (Kalman and Bucy, 1961)

Initialization: x^a(t_0) = x_0 (approximate initial state), P^a(t_0) = P_0 (error covariance matrix).

Step k (prediction - correction, or forecast - analysis):

x^f(t_{k+1}) = M(t_k, t_{k+1}) x^a(t_k)                                     [forecast]
P^f(t_{k+1}) = M(t_k, t_{k+1}) P^a(t_k) M^T(t_k, t_{k+1}) + Q_k
x^a(t_{k+1}) = x^f(t_{k+1}) + K_{k+1} [ y_{k+1} - H_{k+1} x^f(t_{k+1}) ]    [BLUE]
K_{k+1} = P^f(t_{k+1}) H^T_{k+1} [ H_{k+1} P^f(t_{k+1}) H^T_{k+1} + R_{k+1} ]⁻¹
P^a(t_{k+1}) = P^f(t_{k+1}) - K_{k+1} H_{k+1} P^f(t_{k+1})

where the exponents f and a stand for forecast and analysis respectively.
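A sketch of this forecast-analysis loop on a toy linear system (the model, covariances and initial guesses below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
M = np.array([[1.0, 0.1], [0.0, 1.0]])    # linear model (constant-velocity toy)
H = np.array([[1.0, 0.0]])                # observe the first component only
Q = 0.01 * np.eye(2)                      # model error covariance
R = np.array([[0.25]])                    # observation error covariance

x_true = np.array([0.0, 1.0])
xa, Pa = np.array([0.5, 0.0]), np.eye(2)  # x_a(t0), P_a(t0)

for k in range(50):
    x_true = M @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    y = H @ x_true + rng.multivariate_normal(np.zeros(1), R)
    # forecast
    xf = M @ xa
    Pf = M @ Pa @ M.T + Q
    # analysis (BLUE)
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    xa = xf + K @ (y - H @ xf)
    Pa = Pf - K @ H @ Pf

print(x_true, xa)   # the analysis tracks the true state
```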

If the problem is time dependent

Equivalence with the variational approach. If H_k and M(t_k, t_{k+1}) are linear, and if the model is perfect (e_k = 0), then the Kalman filter and the variational method minimizing

J(x) = (1/2) (x - x_0)^T P_0⁻¹ (x - x_0) + (1/2) Σ_{k=0}^{N} ( H_k M(t_0, t_k) x - y_k )^T R_k⁻¹ ( H_k M(t_0, t_k) x - y_k )

lead to the same solution at t = t_N.

In summary

In summary

Variational approach:
- least squares minimization (with non-dimensionalized terms)
- no particular hypothesis
- works for either stationary or time dependent problems
- if M and H are linear, the cost function is quadratic: a unique solution if p ≥ n; adding a background term ensures this property
- if things are nonlinear, the approach is still valid, with possibly several minima

Statistical approach:
- hypotheses on the first two moments
- time independent + H linear + p ≥ n: BLUE (first two moments)
- time dependent + M and H linear: Kalman filter (based on the BLUE)
- hypotheses on the pdfs: Bayesian approach (pdf) + ML or MAP estimator

In summary

The statistical approach gives a rationale for the choice of the norms, and gives an estimation of the uncertainty.
- Time independent problems: if H is linear, the variational and statistical approaches lead to the same solution (provided ‖·‖_b is based on B⁻¹ and ‖·‖_o is based on R⁻¹).
- Time dependent problems: if H and M are linear and the model is perfect, both approaches lead to the same solution at the final time: 4D-Var ↔ Kalman filter.

To go further...

Common main methodological difficulties
- Nonlinearities: J is non-quadratic / what about the Kalman filter?
- Huge dimensions, dim(x) = O(10^6 - 10^9): minimization of J / management of huge matrices
- Poorly known error statistics: choice of the norms / of B, R, Q
- Scientific computing issues (data management, code efficiency, parallelization...)

In short

Variational methods: a series of approximations of the cost function, corresponding to a series of methods: 4D-Var, incremental 4D-Var, 3D-FGAT, 3D-Var. The more sophisticated ones (4D-Var, incremental 4D-Var) require the tangent linear and adjoint models, the development of which is a real investment.

Statistical methods:
- the extended Kalman filter handles (weakly) nonlinear problems (it requires the tangent linear model)
- reduced order Kalman filters address huge dimension problems
- a quite efficient family addressing both problems: ensemble Kalman filters (EnKF), the so-called Gaussian filters (a sketch of the analysis step follows)
- particle filters: currently being developed - a fully Bayesian approach - still limited to low dimension problems
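For illustration, a minimal sketch of the stochastic (perturbed-observations) EnKF analysis step, in which the ensemble replaces B by a sample covariance; all sizes and values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def enkf_analysis(E, y, H, R):
    """Stochastic EnKF analysis step: E is an (n, m) ensemble of states."""
    n, m = E.shape
    Xf = E - E.mean(axis=1, keepdims=True)         # forecast anomalies
    Pf = Xf @ Xf.T / (m - 1)                       # sample covariance replaces B
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    # perturbed observations, one per member
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
    return E + K @ (Y - H @ E)

# tiny example: 3 state variables, 20 members, observe the first variable
E = rng.standard_normal((3, 20))
H = np.array([[1.0, 0.0, 0.0]])
y = np.array([0.7])
R = np.array([[0.1]])
Ea = enkf_analysis(E, y, H, R)
print(Ea.mean(axis=1))   # analysis ensemble mean pulled toward y
```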

Some present research directions
- new methods: less expensive, more robust w.r.t. nonlinearities and/or non-Gaussianity (particle filters, En4DVar, BFN...)
- better management of errors (prior statistics, identification, a posteriori validation...)
- complex observations (images, Lagrangian data...)
- new application domains (often leading to new methodological questions)
- definition of observing systems, sensitivity analysis...

Two announcements
- CNA 2014: 5ème Colloque National d'Assimilation de données, Toulouse, December 1-3, 2014
- Doctoral course "Introduction to data assimilation", Grenoble, January 5-9, 2015