An introduction to data assimilation
Eric Blayo, University of Grenoble and INRIA

Data assimilation, the science of compromises

Context: characterizing a (complex) system and/or forecasting its evolution, given several heterogeneous and uncertain sources of information. Widely used for geophysical fluids (meteorology, oceanography, atmospheric chemistry...), but also in numerous other domains (e.g. nuclear energy, medicine, agricultural planning...). Closely linked to inverse methods, control theory, estimation theory, filtering...

A daily example: numerical weather forecast

Data assimilation, the science of compromises

Numerous possible aims:
- Forecast: estimation of the present state (initial condition)
- Model tuning: parameter estimation
- Inverse modeling: estimation of parameter fields
- Data analysis: re-analysis (model = interpolation operator)
- OSSE: optimization of observing systems
- ...

Objectives for this lecture
- introduce data assimilation from several points of view
- give an overview of the main families of methods
- point out the main difficulties and the current corresponding answers

Outline
1. Data assimilation for dummies: a simple model problem
2. Generalization: linear estimation theory, variational and sequential approaches
3. Main current challenges

Some references
1. Blayo E. and M. Nodet, 2012: Introduction à l'assimilation de données variationnelle. Lecture notes for UJF Master course on data assimilation. https://team.inria.fr/moise/files/2012/03/methodes-inverses-var-m2-math-2009.pdf
2. Bocquet M., 2014: Introduction aux principes et méthodes de l'assimilation de données en géophysique. Lecture notes for ENSTA - ENPC data assimilation course. http://cerea.enpc.fr/homepages/bocquet/doc/assim-mb.pdf
3. Bocquet M., E. Cosme and E. Blayo (Eds.), 2014: Advanced Data Assimilation for Geosciences. Oxford University Press.
4. Bouttier F. and P. Courtier, 1999: Data assimilation, concepts and methods. Meteorological training course lecture series, ECMWF, European Centre for Medium-Range Weather Forecasts, Reading, UK. http://www.ecmwf.int/newsevents/training/rcourse_notes/DATA_ASSIMILATION/ASSIM_CONCEPTS/Assim_concepts21.html
5. Cohn S., 1997: An introduction to estimation theory. Journal of the Meteorological Society of Japan, 75, 257-288.
6. Daley R., 1993: Atmospheric Data Analysis. Cambridge University Press.
7. Evensen G., 2009: Data Assimilation: The Ensemble Kalman Filter. Springer.
8. Kalnay E., 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press.
9. Lahoz W., B. Khattatov and R. Menard (Eds.), 2010: Data Assimilation. Springer.
10. Rodgers C., 2000: Inverse Methods for Atmospheric Sounding. World Scientific, Series on Atmospheric, Oceanic and Planetary Physics.
11. Tarantola A., 2005: Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM. http://www.ipgp.fr/~tarantola/files/professional/books/inverseproblemtheory.pdf

A simple but fundamental example

Model problem: least squares approach

Two different measurements of a single quantity are available. Which estimate of its true value? The least squares approach.

Example: 2 observations y_1 = 19°C and y_2 = 21°C of the (unknown) present temperature x.
Let J(x) = (1/2) [ (x - y_1)² + (x - y_2)² ].
Minimizing J gives x̂ = (y_1 + y_2)/2 = 20°C.

Model problem: least squares approach

Observation operator. If the units differ: y_1 = 66.2°F and y_2 = 69.8°F.
Let H(x) = (9/5) x + 32 and J(x) = (1/2) [ (H(x) - y_1)² + (H(x) - y_2)² ].
Minimizing J gives x̂ = 20°C.

Drawback #1: if observation units are inhomogeneous. With y_1 = 66.2°F and y_2 = 21°C:
J(x) = (1/2) [ (H(x) - y_1)² + (x - y_2)² ]  gives  x̂ = 19.47°C !!

Drawback #2: if observation accuracies are inhomogeneous. If y_1 is twice as accurate as y_2, one should obtain x̂ = (2 y_1 + y_2)/3 = 19.67°C, so J should be
J(x) = (1/2) [ (x - y_1)²/(1/2) + (x - y_2)²/1 ]
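A small numerical sketch of these three cost functions, using the example's values (a brute-force grid search stands in for the analytic minimization):

```python
import numpy as np

y1_C, y2_C = 19.0, 21.0          # the two Celsius observations
H = lambda x: 9/5 * x + 32       # Celsius -> Fahrenheit observation operator

x = np.linspace(15, 25, 100001)  # grid search over candidate temperatures

# Homogeneous units: J(x) = 1/2 [(x - y1)^2 + (x - y2)^2] -> arithmetic mean
J = 0.5 * ((x - y1_C)**2 + (x - y2_C)**2)
print(x[J.argmin()])             # ~20.0

# Drawback #1: mixing degF and degC distorts the estimate
J_mix = 0.5 * ((H(x) - H(y1_C))**2 + (x - y2_C)**2)
print(x[J_mix.argmin()])         # ~19.47, pulled toward the degF observation

# Drawback #2: weighting by (known) variances restores sensitivity to accuracy
# y1 twice as accurate as y2: variances 1/2 and 1
J_w = 0.5 * ((x - y1_C)**2 / 0.5 + (x - y2_C)**2 / 1.0)
print(x[J_w.argmin()])           # ~19.67 = (2*y1 + y2)/3
```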

Model problem: statistical approach

Reformulation in a probabilistic framework: the goal is to estimate a scalar value x, and each y_i is a realization of a random variable Y_i. One looks for an estimator (i.e. a random variable) X̂ that is:
- linear: X̂ = α_1 Y_1 + α_2 Y_2 (in order to be simple)
- unbiased: E(X̂) = x (it seems reasonable)
- of minimal variance: Var(X̂) minimum (optimal accuracy)
This is the BLUE (Best Linear Unbiased Estimator).

Model problem: statistical approach

Let Y_i = x + ε_i, with the hypotheses:
- E(ε_i) = 0 (i = 1, 2): unbiased measurement devices
- Var(ε_i) = σ_i² (i = 1, 2): known accuracies
- Cov(ε_1, ε_2) = 0: independent measurement errors

Then, since X̂ = α_1 Y_1 + α_2 Y_2 = (α_1 + α_2) x + α_1 ε_1 + α_2 ε_2:

E(X̂) = (α_1 + α_2) x + α_1 E(ε_1) + α_2 E(ε_2) = (α_1 + α_2) x, so unbiasedness requires α_1 + α_2 = 1

Var(X̂) = E[ (X̂ - x)² ] = E[ (α_1 ε_1 + α_2 ε_2)² ] = α_1² σ_1² + (1 - α_1)² σ_2²

Setting d Var(X̂)/d α_1 = 0 gives α_1 = σ_2² / (σ_1² + σ_2²)

Model problem: statistical approach

In summary, the BLUE is

X̂ = ( Y_1/σ_1² + Y_2/σ_2² ) / ( 1/σ_1² + 1/σ_2² )

and its accuracy is

[Var(X̂)]⁻¹ = 1/σ_1² + 1/σ_2²   (accuracies are added)

Remarks:
- The hypothesis Cov(ε_1, ε_2) = 0 is not compulsory at all. With Cov(ε_1, ε_2) = c, one gets α_i = (σ_j² - c) / (σ_1² + σ_2² - 2c).
- Statistical hypotheses on the first two moments of ε_1, ε_2 lead to statistical results on the first two moments of X̂.
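A quick Monte Carlo sketch of these two formulas (the truth x = 20 and the standard deviations σ_1 = 0.5, σ_2 = 1 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x_true, s1, s2 = 20.0, 0.5, 1.0            # truth and known std devs

# BLUE weight: alpha1 = s2^2/(s1^2 + s2^2), i.e. inverse-variance weighting
w1 = s2**2 / (s1**2 + s2**2)
Y1 = x_true + s1 * rng.standard_normal(100_000)
Y2 = x_true + s2 * rng.standard_normal(100_000)
X_hat = w1 * Y1 + (1 - w1) * Y2

print(X_hat.mean())                        # ~20: unbiased
print(X_hat.var())                         # ~0.2 = 1/(1/s1^2 + 1/s2^2)
```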

Model problem: statistical approach

Variational equivalence. This is equivalent to the problem:

Minimize J(x) = (1/2) [ (x - y_1)²/σ_1² + (x - y_2)²/σ_2² ]

Remarks:
- This answers the previous problems of sensitivity to inhomogeneous units and insensitivity to inhomogeneous accuracies.
- This gives a rationale for choosing the norm defining J.
- J''(x̂) = 1/σ_1² + 1/σ_2², i.e. convexity = [Var(X̂)]⁻¹ = accuracy.

Model problem: alternative formulation, background + observation

If one considers that y_1 is a prior (or background) estimate x_b of x, and y_2 = y is an independent observation, then:

J(x) = (1/2) (x - x_b)²/σ_b² + (1/2) (x - y)²/σ_o²  =  J_b + J_o

and

x̂ = ( x_b/σ_b² + y/σ_o² ) / ( 1/σ_b² + 1/σ_o² ) = x_b + [ σ_b² / (σ_b² + σ_o²) ] (y - x_b)

where σ_b²/(σ_b² + σ_o²) is the gain and (y - x_b) the innovation.

Model problem: interpretation

If the background error and the observation error are uncorrelated, E(e_o e_b) = 0, then one can show that the estimation error and the innovation are uncorrelated: E(e_a (Y - X_b)) = 0. This is an orthogonal projection for the scalar product ⟨Z_1, Z_2⟩ = E(Z_1 Z_2).

Model problem: Bayesian approach

One can also consider x as a realization of a random variable X, and be interested in the pdf p(X | Y). Several optimality criteria:
- minimum variance: X̂_MV such that the spread around it is minimal: X̂_MV = E(X | Y)
- maximum a posteriori: the most probable value of X given Y: X̂_MAP such that ∂p(X | Y)/∂X = 0
- maximum likelihood: X̂_ML that maximizes p(Y | X)

These are based on the Bayes rule:

P(X = x | Y = y) = P(Y = y | X = x) P(X = x) / P(Y = y)

which requires additional hypotheses on the prior pdf for X and on the pdf of Y | X. In the Gaussian case, these estimations coincide with the BLUE.

Model problem: Bayesian approach

Back to our example: observations y_1 and y_2 of an unknown value x. The simplest approach: maximum likelihood (no prior on X).

Hypotheses:
- Y_i ~ N(x, σ_i²): unbiased, known accuracies + known pdf
- Cov(Y_1, Y_2) = 0: independent measurement errors

Likelihood function: L(x) = dP(Y_1 = y_1 and Y_2 = y_2 | X = x). One looks for the maximum likelihood estimate x̂_ML = Argmax L(x).

L(x) = ∏_{i=1}^{2} dP(Y_i = y_i | X = x) = ∏_{i=1}^{2} [ 1/(√(2π) σ_i) ] exp( -(y_i - x)² / (2σ_i²) )

Argmax L(x) = Argmin (-ln L(x)) = Argmin (1/2) [ (x - y_1)²/σ_1² + (x - y_2)²/σ_2² ]

Hence x̂_ML = ( y_1/σ_1² + y_2/σ_2² ) / ( 1/σ_1² + 1/σ_2² ): the BLUE again (because of the Gaussian hypothesis).
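A sketch checking numerically that the ML estimate coincides with the BLUE in this Gaussian case (the σ_i values are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize_scalar

y1, y2, s1, s2 = 19.0, 21.0, 0.5, 1.0

def neg_log_likelihood(x):
    # -ln L(x) up to an additive constant, for Gaussian Y_i ~ N(x, s_i^2)
    return 0.5 * ((x - y1)**2 / s1**2 + (x - y2)**2 / s2**2)

x_ml = minimize_scalar(neg_log_likelihood).x
x_blue = (y1/s1**2 + y2/s2**2) / (1/s1**2 + 1/s2**2)
print(x_ml, x_blue)   # identical in the Gaussian case: ~19.4
```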

Model problem: synthesis

Data assimilation methods are often split into two families: variational methods and statistical methods.
- Variational methods: minimization of a cost function (least squares approach).
- Statistical methods: algebraic computation of the BLUE (with hypotheses on the first two moments), or approximation of the pdfs (with hypotheses on the pdfs) and computation of the MAP estimator.
There are strong links between these approaches, depending on the case (linear? Gaussian?).

Theorem: if you have understood the previous stuff, you have (almost) understood everything about data assimilation.

Generalization: variational approach

Generalization: arbitrary number of unknowns and observations

To be estimated: x = (x_1, ..., x_n)^T ∈ R^n
Observations: y = (y_1, ..., y_p)^T ∈ R^p
Observation operator: y ≃ H(x), with H: R^n → R^p

A simple example of observation operator: if x = (x_1, x_2, x_3, x_4)^T and y = (an observation of (x_1 + x_2)/2, an observation of x_4)^T, then H(x) = Hx with

H = ( 1/2  1/2  0  0
       0    0   0  1 )

Cost function: J(x) = (1/2) ‖H(x) - y‖², with the norm ‖·‖ to be chosen.

Reminder: norms and scalar products. For u = (u_1, ..., u_n)^T ∈ R^n:
- Euclidean norm: ‖u‖² = u^T u = Σ_i u_i², with associated scalar product (u, v) = u^T v = Σ_i u_i v_i
- Generalized norm: let M be an n × n symmetric positive definite matrix; the M-norm is ‖u‖²_M = u^T M u = Σ_i Σ_j m_ij u_i u_j, with associated scalar product (u, v)_M = u^T M v = Σ_i Σ_j m_ij u_i v_j
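A minimal sketch of the two norms (the vector u and the SPD matrix M are arbitrary illustrative choices):

```python
import numpy as np

u = np.array([1.0, -2.0])
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])    # symmetric positive definite

norm2_euclid = u @ u           # u^T u
norm2_M      = u @ M @ u       # u^T M u
print(norm2_euclid, norm2_M)   # 5.0, 4.0
```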

Remark: an (intuitive) necessary, but not sufficient, condition for the existence of a unique minimum is p ≥ n.

Formalism: background value + new observations

z = ( x_b ; y ): background + new observations. The cost function becomes:

J(x) = (1/2) ‖x - x_b‖²_b + (1/2) ‖H(x) - y‖²_o = J_b + J_o

The necessary condition for the existence of a unique minimum (p ≥ n) is then automatically fulfilled.

If the problem is time dependent

Observations are distributed in time, y = y(t), and the observation cost function becomes:

J_o(x) = (1/2) Σ_{i=0}^{N} ‖H_i(x(t_i)) - y(t_i)‖²_o

There is a model describing the evolution of x: dx/dt = M(x), with x(t = 0) = x_0. J is then often no longer minimized w.r.t. x, but w.r.t. x_0 only, or w.r.t. some other parameters:

J_o(x_0) = (1/2) Σ_{i=0}^{N} ‖H_i(x(t_i)) - y(t_i)‖²_o = (1/2) Σ_{i=0}^{N} ‖H_i(M_{0→t_i}(x_0)) - y(t_i)‖²_o

If the problem is time dependent

J(x_0) = (1/2) ‖x_0 - x_0^b‖²_b + (1/2) Σ_{i=0}^{N} ‖H_i(x(t_i)) - y(t_i)‖²_o

i.e. a background term J_b plus an observation term J_o.

Uniqueness of the minimum?

J(x_0) = J_b(x_0) + J_o(x_0) = (1/2) ‖x_0 - x_b‖²_b + (1/2) Σ_{i=0}^{N} ‖H_i(M_{0→t_i}(x_0)) - y(t_i)‖²_o

If H and M are linear, then J_o is quadratic. However J_o generally does not have a unique minimum, since the number of observations is generally smaller than the size of x_0 (the problem is underdetermined: p < n).

Example: let (x_1^t, x_2^t) = (1, 1) and y = 1.1 an observation of (x_1 + x_2)/2:

J_o(x_1, x_2) = (1/2) ( (x_1 + x_2)/2 - 1.1 )²   (minimized along a whole line)

Adding J_b makes the problem of minimizing J = J_o + J_b well posed. With the background (x_1^b, x_2^b) = (0.9, 1.05):

J(x_1, x_2) = (1/2) ( (x_1 + x_2)/2 - 1.1 )² + (1/2) [ (x_1 - 0.9)² + (x_2 - 1.05)² ]

whose unique minimizer is (x_1, x_2) = (0.94166..., 1.09166...), as checked below.
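Setting ∇J = H^T(Hx - y) + (x - x_b) = 0 gives the linear system (H^T H + I) x = H^T y + x_b, which a few lines of Python confirm (a sketch with the example's values):

```python
import numpy as np

# J(x) = 1/2 ((x1+x2)/2 - 1.1)^2 + 1/2 ||x - xb||^2, with xb = (0.9, 1.05)
H = np.array([[0.5, 0.5]])
y = np.array([1.1])
xb = np.array([0.9, 1.05])

# grad J = H^T (H x - y) + (x - xb) = 0  =>  (H^T H + I) x = H^T y + xb
A = H.T @ H + np.eye(2)
x_hat = np.linalg.solve(A, H.T @ y + xb)
print(x_hat)   # [0.94166..., 1.09166...]
```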

Uniqueness of the minimum?

If H and/or M are nonlinear, then J_o is no longer quadratic.

Example: the Lorenz system (1963):

dx/dt = α(y - x)
dy/dt = βx - y - xz
dz/dt = -γz + xy

(animations: http://www.chaos-math.org)

with the cost function J_o(y_0) = (1/2) Σ_{i=0}^{N} ( x(t_i) - x_obs(t_i) )², controlled by the initial value y_0 only.
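A sketch of this cost function for the Lorenz model: the classical parameter values α = 10, β = 28, γ = 8/3, the observation window and the initial states are all illustrative assumptions, not from the slides. Evaluating J_o for a few values of y_0 already shows its non-quadratic behaviour.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, X, a=10.0, b=28.0, c=8/3):
    x, y, z = X
    return [a*(y - x), b*x - y - x*z, -c*z + x*y]

t_obs = np.linspace(0, 2, 21)
# synthetic observations of the x-component from a "true" initial state
truth = solve_ivp(lorenz, (0, 2), [1.0, 1.0, 20.0], t_eval=t_obs, rtol=1e-8)
x_obs = truth.y[0]

def Jo(y0):
    # cost as a function of the initial y-component only
    sol = solve_ivp(lorenz, (0, 2), [1.0, y0, 20.0], t_eval=t_obs, rtol=1e-8)
    return 0.5 * np.sum((sol.y[0] - x_obs)**2)

for y0 in [0.5, 0.9, 1.0, 1.1, 1.5]:
    print(y0, Jo(y0))   # non-quadratic, possibly multi-modal in y0
```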

Uniqueness of the minimum?

If H and/or M are nonlinear, J_o is no longer quadratic. Adding J_b makes it more quadratic (J_b is a regularization term), but J = J_o + J_b may nevertheless have several local minima.

A fundamental remark before going into minimization aspects

Once J is defined (i.e. once all the ingredients are chosen: control variables, norms, observations...), the problem is entirely defined, and hence so is its solution. The physical (i.e. the most important) part of data assimilation lies in the definition of J. The rest of the job, i.e. minimizing J, is only technical work.

Minimizing J

J(x) = J_b(x) + J_o(x) = (1/2) (x - x_b)^T B⁻¹ (x - x_b) + (1/2) (Hx - y)^T R⁻¹ (Hx - y)

Optimal estimation in the linear case:

x̂ = x_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (y - H x_b)

where (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ is the gain matrix and (y - H x_b) the innovation vector.

Given the sizes n and p, it is generally impossible to handle H, B and R explicitly, so the direct computation of the gain matrix is impossible. Even in the linear case (for which we have an explicit expression for x̂), the computation of x̂ is therefore performed with an optimization algorithm.
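A small-dimension sketch of this analysis formula (all matrices and values below are illustrative assumptions; at realistic sizes the gain matrix is never formed explicitly, as noted above):

```python
import numpy as np

n, p = 3, 2
xb = np.array([1.0, 2.0, 3.0])
H  = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.5, 0.5]])
y  = np.array([1.2, 2.6])
B  = 0.5 * np.eye(n)
R  = 0.1 * np.eye(p)

Binv, Rinv = np.linalg.inv(B), np.linalg.inv(R)
# gain matrix K = (B^-1 + H^T R^-1 H)^-1 H^T R^-1
K = np.linalg.inv(Binv + H.T @ Rinv @ H) @ H.T @ Rinv
x_hat = xb + K @ (y - H @ xb)        # analysis = background + gain * innovation
print(x_hat)
```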

Minimizing J: descent methods

Descent methods for minimizing the cost function require the knowledge of (an estimate of) its gradient: x_{k+1} = x_k + α_k d_k, with
- d_k = -∇J(x_k): gradient method
- d_k = -[Hess(J)(x_k)]⁻¹ ∇J(x_k): Newton method
- d_k = -B_k ∇J(x_k): quasi-Newton methods (BFGS, ...)
- d_k = -∇J(x_k) + ( ‖∇J(x_k)‖² / ‖∇J(x_{k-1})‖² ) d_{k-1}: conjugate gradient
- ...
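A sketch of the simplest of these, the fixed-step gradient method, applied to the same small quadratic J as above (the step size and iteration count are ad hoc choices):

```python
import numpy as np

xb = np.array([1.0, 2.0, 3.0])
H  = np.array([[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]])
y  = np.array([1.2, 2.6])
Binv = np.linalg.inv(0.5 * np.eye(3))
Rinv = np.linalg.inv(0.1 * np.eye(2))

def grad_J(x):
    # gradient of J(x) = 1/2 (x-xb)^T B^-1 (x-xb) + 1/2 (Hx-y)^T R^-1 (Hx-y)
    return Binv @ (x - xb) + H.T @ Rinv @ (H @ x - y)

x = xb.copy()                 # start from the background
for k in range(1000):
    x = x - 0.04 * grad_J(x)  # d_k = -grad J(x_k), fixed step alpha_k
print(x)                      # same x_hat as the direct formula above
```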

Getting the gradient is not obvious

It is often difficult (or even impossible) to obtain the gradient through the computation of growth rates. Example:

dx/dt = M(x(t)), t ∈ [0, T], x(t = 0) = u, with u = (u_1, ..., u_N)^T

J(u) = (1/2) ∫_0^T ‖x(t) - x_obs(t)‖² dt   requires one model run

∇J(u) = ( ∂J/∂u_1 (u), ..., ∂J/∂u_N (u) )^T ≈ ( [J(u + α e_1) - J(u)]/α, ..., [J(u + α e_N) - J(u)]/α )^T   requires N + 1 model runs

In actual applications like meteorology / oceanography, N = dim(u) = O(10^6 - 10^9), so this method cannot be used. Alternatively, the adjoint method provides a very efficient way to compute ∇J.

Example: an adjoint for the Burgers equation

∂u/∂t + u ∂u/∂x - ν ∂²u/∂x² = f,  x ∈ ]0, L[, t ∈ [0, T]
u(0, t) = ψ_1(t), u(L, t) = ψ_2(t),  t ∈ [0, T]
u(x, 0) = u_0(x),  x ∈ [0, L]

with u_obs(x, t) an observation of u(x, t), and the cost function

J(u_0) = (1/2) ∫_0^T ∫_0^L ( u(x, t) - u_obs(x, t) )² dx dt

Adjoint model:

∂p/∂t + u ∂p/∂x + ν ∂²p/∂x² = u - u_obs,  x ∈ ]0, L[, t ∈ [0, T]
p(0, t) = 0, p(L, t) = 0,  t ∈ [0, T]
p(x, T) = 0,  x ∈ [0, L]   (a final condition!! → backward integration)

Gradient of J: ∇J = p(·, 0), a function of x.
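The same forward/backward mechanics can be sketched on a small discrete linear model x_{k+1} = M x_k, for which the adjoint recursion p_k = H^T(H x_k - y_k) + M^T p_{k+1} delivers ∇J(x_0) = p_0 in one forward and one backward sweep. Everything below is an illustrative toy (not the Burgers adjoint), with a finite-difference check:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 4, 10
M = 0.95 * rng.standard_normal((n, n)) / np.sqrt(n)   # linear model x_{k+1} = M x_k
H = rng.standard_normal((2, n))                        # linear observation operator
x0 = rng.standard_normal(n)
ys = [rng.standard_normal(2) for _ in range(N + 1)]    # synthetic observations

def forward(u):
    xs = [u]
    for _ in range(N):
        xs.append(M @ xs[-1])
    return xs

def J(u):
    return 0.5 * sum(np.sum((H @ x - y)**2) for x, y in zip(forward(u), ys))

# Adjoint: one forward + one backward integration give the full gradient
xs = forward(x0)
p = H.T @ (H @ xs[N] - ys[N])                 # "final condition" of the adjoint
for k in range(N - 1, -1, -1):
    p = H.T @ (H @ xs[k] - ys[k]) + M.T @ p   # backward recursion
grad_adjoint = p

# Finite differences need one extra model run per control component (here n+1 runs)
eps = 1e-6
grad_fd = np.array([(J(x0 + eps*e) - J(x0)) / eps for e in np.eye(n)])
print(np.max(np.abs(grad_adjoint - grad_fd)))   # agrees to ~1e-5
```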

Getting the gradient is not obvious

The adjoint method requires writing a tangent linear code and an adjoint code (beyond the scope of this lecture). This task:
- obeys systematic rules
- is not the most interesting one you can imagine
- can be helped by automatic differentiation software: cf. http://www.autodiff.org

On the contrary, do not forget that, if the size of the control variable is very small (< 10), ∇J can easily be estimated by the computation of growth rates.

Generalization: statistical approach

Generalization: statistical approach

To be estimated: x = (x_1, ..., x_n)^T ∈ R^n. Observations: y = (y_1, ..., y_p)^T ∈ R^p. Observation operator: y ≃ H(x), with H: R^n → R^p.

Statistical framework: y is a realization of a random vector Y. One looks for the BLUE, i.e. a random vector X̂ that is:
- linear: X̂ = AY, with size(A) = (n, p)
- unbiased: E(X̂) = x
- of minimal variance: Var(X̂) = Σ_{i=1}^{n} Var(X̂_i) = Tr(Cov(X̂)) minimum

Generalization: statistical approach

Hypotheses:
- Linear observation operator: H(x) = Hx
- Y = Hx + ε, with ε a random vector in R^p
- E(ε) = 0: unbiased measurement devices
- Cov(ε) = E(εε^T) = R: known accuracies and covariances

BLUE: X̂ = (H^T R⁻¹ H)⁻¹ H^T R⁻¹ Y, with Cov(X̂) = (H^T R⁻¹ H)⁻¹

Reminder, the variational approach in the linear case:

J_o(x) = (1/2) ‖Hx - y‖²_o = (1/2) (Hx - y)^T R⁻¹ (Hx - y)

min over x ∈ R^n of J_o(x)  →  x̂ = (H^T R⁻¹ H)⁻¹ H^T R⁻¹ y

Formalism: background value + new observations

Background: X_b = x + ε_b, and new observations: Y = Hx + ε_o.

Hypotheses:
- E(ε_b) = 0: unbiased background
- E(ε_o) = 0: unbiased measurement devices
- Cov(ε_b, ε_o) = 0: independent background and observation errors
- Cov(ε_b) = B and Cov(ε_o) = R: known accuracies and covariances

BLUE:

X̂ = X_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (Y - H X_b)

with gain matrix (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹, innovation vector (Y - H X_b), and

[Cov(X̂)]⁻¹ = B⁻¹ + H^T R⁻¹ H   (accuracies are added)

Link with the variational approach

Statistical approach (BLUE):

X̂ = X_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (Y - H X_b), with Cov(X̂) = (B⁻¹ + H^T R⁻¹ H)⁻¹

Variational approach in the linear case:

J(x) = (1/2) ‖x - x_b‖²_b + (1/2) ‖H(x) - y‖²_o = (1/2) (x - x_b)^T B⁻¹ (x - x_b) + (1/2) (Hx - y)^T R⁻¹ (Hx - y)

min over x ∈ R^n of J(x)  →  x̂ = x_b + (B⁻¹ + H^T R⁻¹ H)⁻¹ H^T R⁻¹ (y - H x_b)

Same remarks as previously. The statistical approach rationalizes the choice of the norms for J_o and J_b in the variational approach, and

[Cov(X̂)]⁻¹ = B⁻¹ + H^T R⁻¹ H = Hess(J)   (accuracy = convexity)

If the problem is time dependent

Dynamical system:

x^t(t_{k+1}) = M(t_k, t_{k+1}) x^t(t_k) + e(t_k)

where x^t(t_k) is the true state at time t_k, M(t_k, t_{k+1}) the model (assumed linear) between t_k and t_{k+1}, and e(t_k) the model error at time t_k.

At every observation time t_k, we have an observation y_k and a model forecast x^f(t_k), so the BLUE can be applied, under the hypotheses:
- e(t_k) is unbiased, with covariance matrix Q_k
- e(t_k) and e(t_l) are independent (k ≠ l)
- the observation y_k is unbiased, with error covariance matrix R_k
- e(t_k) and the analysis error x^a(t_k) - x^t(t_k) are independent

If the problem is time dependent

Kalman filter (Kalman and Bucy, 1961)

Initialization: x^a(t_0) = x_0 (approximate initial state), P^a(t_0) = P_0 (error covariance matrix).

Step k (prediction - correction, or forecast - analysis):

x^f(t_{k+1}) = M(t_k, t_{k+1}) x^a(t_k)                                     [forecast]
P^f(t_{k+1}) = M(t_k, t_{k+1}) P^a(t_k) M^T(t_k, t_{k+1}) + Q_k
x^a(t_{k+1}) = x^f(t_{k+1}) + K_{k+1} [ y_{k+1} - H_{k+1} x^f(t_{k+1}) ]    [BLUE]
K_{k+1} = P^f(t_{k+1}) H^T_{k+1} [ H_{k+1} P^f(t_{k+1}) H^T_{k+1} + R_{k+1} ]⁻¹
P^a(t_{k+1}) = P^f(t_{k+1}) - K_{k+1} H_{k+1} P^f(t_{k+1})

where the exponents f and a stand for forecast and analysis respectively.
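A sketch of this forecast-analysis loop on a toy linear system (the model, covariances and initial guesses below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
M = np.array([[1.0, 0.1], [0.0, 1.0]])    # linear model (constant-velocity toy)
H = np.array([[1.0, 0.0]])                # observe the first component only
Q = 0.01 * np.eye(2)                      # model error covariance
R = np.array([[0.25]])                    # observation error covariance

x_true = np.array([0.0, 1.0])
xa, Pa = np.array([0.5, 0.0]), np.eye(2)  # x_a(t0), P_a(t0)

for k in range(50):
    x_true = M @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    y = H @ x_true + rng.multivariate_normal(np.zeros(1), R)
    # forecast
    xf = M @ xa
    Pf = M @ Pa @ M.T + Q
    # analysis (BLUE)
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    xa = xf + K @ (y - H @ xf)
    Pa = Pf - K @ H @ Pf

print(x_true, xa)   # the analysis tracks the true state
```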

If the problem is time dependent

Equivalence with the variational approach. If H_k and M(t_k, t_{k+1}) are linear, and if the model is perfect (e_k = 0), then the Kalman filter and the variational method minimizing

J(x) = (1/2) (x - x_0)^T P_0⁻¹ (x - x_0) + (1/2) Σ_{k=0}^{N} ( H_k M(t_0, t_k) x - y_k )^T R_k⁻¹ ( H_k M(t_0, t_k) x - y_k )

lead to the same solution at t = t_N.

In summary

In summary

Variational approach:
- least squares minimization (with non-dimensionalized terms)
- no particular hypothesis
- works for either stationary or time dependent problems
- if M and H are linear, the cost function is quadratic: a unique solution if p ≥ n; adding a background term ensures this property
- if things are nonlinear, the approach is still valid, with possibly several minima

Statistical approach:
- hypotheses on the first two moments
- time independent + H linear + p ≥ n: BLUE (first two moments)
- time dependent + M and H linear: Kalman filter (based on the BLUE)
- hypotheses on the pdfs: Bayesian approach (pdf) + ML or MAP estimator

In summary

The statistical approach gives a rationale for the choice of the norms, and gives an estimation of the uncertainty.
- Time independent problems: if H is linear, the variational and statistical approaches lead to the same solution (provided ‖·‖_b is based on B⁻¹ and ‖·‖_o is based on R⁻¹).
- Time dependent problems: if H and M are linear and the model is perfect, both approaches lead to the same solution at the final time: 4D-Var ↔ Kalman filter.

To go further...

Common main methodological difficulties
- Nonlinearities: J is non-quadratic / what about the Kalman filter?
- Huge dimensions, dim(x) = O(10^6 - 10^9): minimization of J / management of huge matrices
- Poorly known error statistics: choice of the norms / of B, R, Q
- Scientific computing issues (data management, code efficiency, parallelization...)

In short

Variational methods: a series of approximations of the cost function, corresponding to a series of methods: 4D-Var, incremental 4D-Var, 3D-FGAT, 3D-Var. The more sophisticated ones (4D-Var, incremental 4D-Var) require the tangent linear and adjoint models, the development of which is a real investment.

Statistical methods:
- the extended Kalman filter handles (weakly) nonlinear problems (it requires the tangent linear model)
- reduced order Kalman filters address huge dimension problems
- a quite efficient family addressing both problems: ensemble Kalman filters (EnKF), the so-called Gaussian filters (a sketch of the analysis step follows)
- particle filters: currently being developed - a fully Bayesian approach - still limited to low dimension problems
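For illustration, a minimal sketch of the stochastic (perturbed-observations) EnKF analysis step, in which the ensemble replaces B by a sample covariance; all sizes and values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def enkf_analysis(E, y, H, R):
    """Stochastic EnKF analysis step: E is an (n, m) ensemble of states."""
    n, m = E.shape
    Xf = E - E.mean(axis=1, keepdims=True)         # forecast anomalies
    Pf = Xf @ Xf.T / (m - 1)                       # sample covariance replaces B
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
    # perturbed observations, one per member
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=m).T
    return E + K @ (Y - H @ E)

# tiny example: 3 state variables, 20 members, observe the first variable
E = rng.standard_normal((3, 20))
H = np.array([[1.0, 0.0, 0.0]])
y = np.array([0.7])
R = np.array([[0.1]])
Ea = enkf_analysis(E, y, H, R)
print(Ea.mean(axis=1))   # analysis ensemble mean pulled toward y
```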

Some present research directions
- new methods: less expensive, more robust w.r.t. nonlinearities and/or non-Gaussianity (particle filters, En4DVar, BFN...)
- better management of errors (prior statistics, identification, a posteriori validation...)
- complex observations (images, Lagrangian data...)
- new application domains (often leading to new methodological questions)
- definition of observing systems, sensitivity analysis...

Two announcements
- CNA 2014: 5ème Colloque National d'Assimilation de données, Toulouse, December 1-3, 2014
- Doctoral course "Introduction to data assimilation", Grenoble, January 5-9, 2015