Fundamentals of Data Assimilation


2014 GSI Community Tutorial, NCAR Foothills Campus, Boulder, CO, July 14-16, 2014. Fundamentals of Data Assimilation. Milija Zupanski, Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, Colorado

Outline
o Motivation for data assimilation
o Basic data assimilation
o General challenges of data assimilation
o Data assimilation methodologies
o Relevance of forecast error covariance
o Future of DA

Motivation for Data Assimilation

Models and observations
o Models are described by a set of equations used to simulate real-world processes and predict their future behavior.
o In geosciences, these equations typically form a system of partial differential equations (PDEs).
o Various parameters can impact the performance of a PDE model: initial conditions (IC), model errors (ME), empirical parameters (EP), ...
o Our knowledge of these parameters is never perfect, implying uncertainty!

State vector (x)
o A (smallest) subset of variables defining a dynamical/physical system.
o Typically it refers to the initial conditions only.
o In general, it may include initial conditions, model errors, and empirical parameters.

$$x = (\,p \;\; T \;\; q \;\; u \;\; v \;\; q_{cloud} \;\; q_{snow} \;\; O_3 \;\; T_{soil} \;\; q_{soil}\,)^T, \qquad p = (p_1 \ldots p_N)^T, \quad T = (T_1 \ldots T_N)^T, \;\ldots$$

Uncertainty
o Defines how reliable the state vector estimate is.
o A measure of missing knowledge.
o Important for decision-making.
Classical information theory:
- uncertainty is related to probability and entropy
- probability and entropy are measures of incomplete knowledge
[Figures: Hurricane Ike (2008) wind speed probability; entropy and order]

What is data assimilation?

How can we improve model prediction?
o By improving general parameters of the modeling system (IC, ME, EP).
o Also, by improving the model equations: include missing processes, coupling, spatiotemporal resolution (if discrete), ...

How to improve IC, ME, EP? Use observations (measurements) as a source of information about the real world. If we believe that the model has skill, we can also use past model performance as additional information:
(1) If the model has no skill, use observations as the only source of information about the real world and employ statistics.
(2) If the model plays a meaningful role, use both the model and observations as sources of information, employing a combination of physics/dynamics, probability, and statistics.

The mathematical method used to blend the information from models and observations is called data assimilation. Data assimilation has the goal of producing optimal estimates of the state and its uncertainty.

Data assimilation process

[Figure: observations + guess forecast → analysis]
Various observation types and a model forecast are combined to obtain an improved model state (the analysis).

Forecast

Assume the forecast is a dynamic-stochastic process:

$$dx_t = m(x_t,t)\,dt + g(x_t,t)\,d\beta_t$$

where m = dynamics (model time evolution), g = stochastic forcing, x = state vector, and β = random vector.

The evolution of the probability density p of the process is described by the Fokker-Planck equation:

$$\frac{\partial p(x,t)}{\partial t} + \frac{\partial}{\partial x}\big[p(x,t)\,m(x,t)\big] = \frac{1}{2}\,\frac{\partial^2}{\partial x^2}\big[p(x,t)\,g^2(x,t)\big]$$

- a diffusion equation for pdfs
- huge dimensions and numerous unknowns, so it is not used directly in practice
Information from the model is fundamentally probabilistic.
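As a concrete illustration of this probabilistic view, here is a minimal sketch (not part of the original tutorial) that propagates an ensemble through a scalar dynamic-stochastic model with the Euler-Maruyama scheme; the ensemble histogram approximates the pdf whose exact evolution the Fokker-Planck equation describes. The drift m, forcing g, and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def m(x, t):
    # Toy drift (dynamics): relaxation toward zero; a hypothetical
    # stand-in for a real forecast model.
    return -x

def g(x, t):
    # Constant stochastic forcing amplitude (model error), hypothetical.
    return 0.5

# Euler-Maruyama integration of dx = m(x,t) dt + g(x,t) dbeta for an ensemble.
n_members, n_steps, dt = 10_000, 100, 0.01
x = rng.normal(loc=1.0, scale=0.2, size=n_members)   # sample the initial pdf

for k in range(n_steps):
    t = k * dt
    x += m(x, t) * dt + g(x, t) * np.sqrt(dt) * rng.normal(size=n_members)

print(f"ensemble mean {x.mean():.3f}, spread {x.std():.3f}")
```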

Observations

Assume that observed variables are nonlinearly related to model variables, and that observations include instrument and representativeness errors:

$$y = h(x_t) + \varepsilon$$

where h = nonlinear mapping from model to observation space, y = observation vector, and ε = observation error.

Given the probabilistic character of the model state $x_t$ and the existence of observation errors, the observation equation implies that observations are also probabilistic in character.

Data assimilation is probabilistic
o Model equations are imperfect. Also, IC, ME, and EP are not perfectly known, implying the model forecast will have an error.
o Observations are imperfect: instrument errors and representativeness errors (features unexplained by the model).
o Since observations and the model forecast are the inputs to data assimilation, DA also has an error (insufficient knowledge of the input implies insufficient knowledge of the output).
Uncertainties and imperfect knowledge are best measured by probability:
o make an assumption about an adequate probability distribution, or
o (if possible) form a histogram and deduce the best-fitting probability distribution.

Bayesian principle in data assimilation

Bayes' theorem relates the posterior to the prior and conditional probabilities:

$$p(X \mid Y) = \frac{p(Y \mid X)\,p(X)}{p(Y)}$$

o X = state variable, Y = observations
o p(X) = prior PDF
o p(X|Y) = conditional (posterior) PDF
It is implicitly assumed that it is easier to calculate the prior and the conditional PDFs than the joint PDF p(X,Y).
A learning algorithm: the probability estimate is updated as additional evidence is acquired.
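A minimal numerical sketch of this update, assuming Gaussian shapes and a one-dimensional state discretized on a grid (all values hypothetical):

```python
import numpy as np

# Discretize the state X on a grid: p(X | Y) ∝ p(Y | X) p(X).
x = np.linspace(-5.0, 15.0, 2001)
dx = x[1] - x[0]

prior = np.exp(-0.5 * (x - 0.0) ** 2 / 2.0**2)        # p(X): N(0, 2^2)
likelihood = np.exp(-0.5 * (x - 10.0) ** 2 / 1.0**2)  # p(Y=10 | X): N(X, 1^2)

posterior = prior * likelihood
posterior /= posterior.sum() * dx                     # normalize: divide by p(Y)

# The posterior mean lands between the prior mean (0) and the observation (10).
print(f"posterior mean ~ {(x * posterior).sum() * dx:.2f}")   # ~8.00
```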

Prior and conditional probability density functions

Prior: defines the knowledge about the dynamical state before new observations ($Y_N$) are assimilated:

$$p(X) = p(X \mid Y_{N-1}, \ldots, Y_1)$$

o $Y_{N-1}, \ldots, Y_1$ = old observations
o $Y_N$ = new observations not yet used

Conditional probability of the new observations with respect to the prior state: $p(Y_N \mid X)$.

Gaussian assumption

A probability density function (PDF) can be highly nonlinear, or have numerous unknown parameters that need to be estimated. One of the simplest and most widely applicable PDFs is the Gaussian PDF:

$$p(z) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\left[-\frac{(z-\mu)^2}{2\sigma^2}\right]$$

o Errors of physical processes tend to accumulate near zero (i.e., small errors dominate).
o The Gaussian PDF has the smallest number of unknown parameters: mean and covariance, the first two moments.

Other relevant non-Gaussian PDFs

Double exponential (Laplacian) PDF:

$$p_L(x \mid \mu, b) = \frac{1}{2b}\,\exp\left(-\frac{|x-\mu|}{b}\right)$$

o Wind speed errors can be described using a Laplacian PDF.
o Sharp gradient fields (e.g., atmospheric fronts) exhibit Laplacian PDFs.
There are other, skewed non-Gaussian PDFs that are relevant:
- lognormal PDF (humidity, cloud variables)
- gamma-function PDFs (precipitation)
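For comparison, a small sketch that evaluates the two densities side by side; matching the variances via b = σ/√2 (the Laplace variance is 2b²) is my addition, not from the slides. The Laplacian has a sharper peak and heavier tails than the Gaussian.

```python
import numpy as np

def gaussian_pdf(z, mu, sigma):
    # p(z) = exp(-(z - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))
    return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def laplace_pdf(z, mu, b):
    # p_L(z | mu, b) = exp(-|z - mu| / b) / (2 b)
    return np.exp(-np.abs(z - mu) / b) / (2 * b)

z = np.linspace(-5, 5, 11)
print(gaussian_pdf(z, 0.0, 1.0))
print(laplace_pdf(z, 0.0, 1.0 / np.sqrt(2)))   # same variance as the Gaussian
```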

Gaussian prior and conditional PDFs

Multivariate Gaussian PDFs:

prior: $p(x) \propto \exp\left[-\frac{1}{2}\,(x-x_f)^T P_f^{-1}\,(x-x_f)\right]$

conditional: $p(y \mid x) \propto \exp\left[-\frac{1}{2}\,\big(y-h(x)\big)^T R^{-1}\,\big(y-h(x)\big)\right]$

o x = state variable, P_f = forecast error covariance
o y = observations, R = observation error covariance
o h = nonlinear observation operator (mapping from model to observations)
Notes: the covariance is independent of x; only the mean and covariance are required; a zero error mean is implicitly assumed in the equations above.

Data assimilation with Gaussian PDFs

Maximum a posteriori estimate: find the optimal state $x_{opt}$ that maximizes the posterior probability density function p(x|y):

$$x_{opt} = \arg\max_x\, p(x \mid y)$$

Minimum variance estimate: find the optimal state $x_{opt}$ with the smallest error (variance), where L is a loss function of the conditional mean:

$$x_{opt} = \arg\min_x E\big(L[E(x \mid y)]\big)$$

The two estimates are identical for Gaussian PDFs; otherwise they differ.

Gaussian posterior PDF and cost function

Given the prior and conditional PDFs, the posterior is

$$p(x \mid y) \propto \exp\left[-\frac{1}{2}\,\big(y-h(x)\big)^T R^{-1}\big(y-h(x)\big) - \frac{1}{2}\,(x-x_f)^T P_f^{-1}(x-x_f)\right]$$

The negative log-likelihood (also referred to as the cost function) is

$$f(x) = -\log p(x \mid y) = \frac{1}{2}\,(x-x_f)^T P_f^{-1}(x-x_f) + \frac{1}{2}\,\big(y-h(x)\big)^T R^{-1}\big(y-h(x)\big)$$

Consequence: minimizing the cost function maximizes the posterior PDF:

$$x_{opt} = \arg\max_x\, p(x \mid y) = \arg\min_x f(x)$$
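A minimal sketch of the maximum a posteriori estimate obtained by minimizing this cost function numerically, here with scipy's general-purpose minimizer and a hypothetical nonlinear observation operator h (all dimensions and values are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

# Toy sizes: 3-dimensional state, 2 observations (hypothetical values).
x_f = np.array([1.0, 0.0, -1.0])            # first guess
P_f = np.diag([0.5, 0.5, 0.5])              # forecast error covariance
R = np.diag([0.1, 0.1])                     # observation error covariance
y = np.array([1.5, 0.8])                    # observations

def h(x):
    # Hypothetical nonlinear observation operator: observes x0 and x1*x2.
    return np.array([x[0], x[1] * x[2]])

Pf_inv, R_inv = np.linalg.inv(P_f), np.linalg.inv(R)

def cost(x):
    # f(x) = 1/2 (x-x_f)^T Pf^-1 (x-x_f) + 1/2 (y-h(x))^T R^-1 (y-h(x))
    dx, dy = x - x_f, y - h(x)
    return 0.5 * dx @ Pf_inv @ dx + 0.5 * dy @ R_inv @ dy

x_a = minimize(cost, x_f).x                 # maximum a posteriori estimate
print("analysis:", x_a)
```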

Optimality conditions for the minimum

(1) First variation equal to zero: $\delta f(x) = 0$
(2) Second variation greater than zero: $\delta^2 f(x) > 0$

Taylor expansion of a function with higher-order derivatives:

$$f(x+\delta x) - f(x) = \delta f(x) + \frac{1}{2!}\,\delta^2 f(x) + \frac{1}{3!}\,\delta^3 f(x) + \cdots$$

Data assimilation typically utilizes only the first two variations; keeping more terms improves the nonlinear capability of DA.

First and second variation of a function

First variation:

$$\delta f(x) = \left\langle \frac{\partial f(x)}{\partial x},\, \delta x \right\rangle = \left(\frac{\partial f(x)}{\partial x}\right)^T \delta x$$

Second variation, neglecting the second variation of x itself ($\delta^2 x \approx 0$):

$$\delta^2 f(x) = \delta[\delta f(x)] \approx \left\langle \delta x,\, \frac{\partial^2 f(x)}{\partial x^2}\,\delta x \right\rangle = (\delta x)^T\,\frac{\partial^2 f(x)}{\partial x^2}\,\delta x$$

The first and second variations of a function imply the use of inner products.

Minimize cost function: optimal solution

$$f(x) = \frac{1}{2}\,(x-x_f)^T P_f^{-1}(x-x_f) + \frac{1}{2}\,[y-h(x)]^T R^{-1}[y-h(x)]$$

(1) $\delta f(x) = 0$:

$$\delta f(x) = (\delta x)^T\left\{P_f^{-1}(x-x_f) - H^T R^{-1}[y-h(x)]\right\}, \qquad\text{hence}\qquad P_f^{-1}(x-x_f) - H^T R^{-1}[y-h(x)] = 0$$

(2) $\delta^2 f(x) > 0$:

$$\delta^2 f(x) = (\delta x)^T\left\{P_f^{-1} + H^T R^{-1} H\right\}\delta x$$

Since $P_f^{-1} + H^T R^{-1} H$ is positive definite and symmetric, the minimum exists.
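The gradient expression derived here can be checked numerically. The sketch below, assuming a linear observation operator H and hypothetical dimensions, compares the analytic gradient $P_f^{-1}(x-x_f) - H^T R^{-1}(y-Hx)$ against central finite differences of the cost function:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 2                                      # hypothetical sizes
x_f, y = rng.normal(size=n), rng.normal(size=m)
H = rng.normal(size=(m, n))                      # linear(ized) obs operator
Pf_inv = np.eye(n) / 0.5
R_inv = np.eye(m) / 0.1

def cost(x):
    dx, dy = x - x_f, y - H @ x
    return 0.5 * dx @ Pf_inv @ dx + 0.5 * dy @ R_inv @ dy

def grad(x):
    # Analytic gradient: Pf^-1 (x - x_f) - H^T R^-1 (y - H x)
    return Pf_inv @ (x - x_f) - H.T @ R_inv @ (y - H @ x)

x0, eps = rng.normal(size=n), 1e-6
fd = np.array([(cost(x0 + eps * e) - cost(x0 - eps * e)) / (2 * eps)
               for e in np.eye(n)])              # central finite differences
print(np.allclose(fd, grad(x0), atol=1e-5))      # True: the formulas agree
```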

One-point DA algorithm derivation

Minimize the quadratic cost function:

$$f(x) = \frac{1}{2}\,(x-x_f)^T P_f^{-1}(x-x_f) + \frac{1}{2}\,[y-Hx]^T R^{-1}[y-Hx]$$

(1) $\delta f(x) = 0$: $\;P_f^{-1}(x-x_f) - H^T R^{-1}[y-Hx] = 0$, which gives

$$x = \big(I + P_f H^T R^{-1} H\big)^{-1}\big(x_f + P_f H^T R^{-1} y\big)$$

For one-point DA with an observation at a grid point, $H \to I$, $P_f \to \sigma_f^2$, $R \to \sigma_R^2$:

$$x_a = \left(1 + \frac{\sigma_f^2}{\sigma_R^2}\right)^{-1}\left(x_f + \frac{\sigma_f^2}{\sigma_R^2}\,y\right)$$
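A small consistency check (my addition, hypothetical values): for a linear operator H, the solution above coincides with the familiar Kalman-gain form $x_a = x_f + K(y - Hx_f)$ with $K = P_f H^T (HP_f H^T + R)^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 2
x_f, y = rng.normal(size=n), rng.normal(size=m)
H = rng.normal(size=(m, n))
P_f = np.diag(rng.uniform(0.5, 1.5, size=n))
R = np.diag(rng.uniform(0.1, 0.3, size=m))

# Form derived above: x_a = (I + Pf H^T R^-1 H)^-1 (x_f + Pf H^T R^-1 y)
A = np.eye(n) + P_f @ H.T @ np.linalg.inv(R) @ H
x_a1 = np.linalg.solve(A, x_f + P_f @ H.T @ np.linalg.inv(R) @ y)

# Equivalent Kalman-gain form: x_a = x_f + K (y - H x_f)
K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)
x_a2 = x_f + K @ (y - H @ x_f)

print(np.allclose(x_a1, x_a2))   # True: the two solutions coincide
```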

One-point DA (1)

$$x_a = \left(1 + \frac{\sigma_f^2}{\sigma_R^2}\right)^{-1}\left(x_f + \frac{\sigma_f^2}{\sigma_R^2}\,y\right) = \frac{\sigma_R^2}{\sigma_f^2+\sigma_R^2}\,x_f + \frac{\sigma_f^2}{\sigma_f^2+\sigma_R^2}\,y$$

$$x_a = \alpha\,x_f + \beta\,y, \qquad \alpha + \beta = \,?$$

The analysis is a linear combination of the first guess and observation vectors; in other words, the analysis is an interpolation between the observation and the first guess, and uncertainty defines the interpolation weights.

One-point DA (2)

Note that the interpolation weights are normalized:

$$\frac{\sigma_R^2}{\sigma_f^2+\sigma_R^2} + \frac{\sigma_f^2}{\sigma_f^2+\sigma_R^2} = \frac{\sigma_R^2+\sigma_f^2}{\sigma_f^2+\sigma_R^2} = 1 \quad\Rightarrow\quad \alpha + \beta = 1$$

$$x_a = \alpha\,x_f + (1-\alpha)\,y, \qquad \alpha = \frac{\sigma_R^2}{\sigma_f^2+\sigma_R^2}$$

Normalization of the weights ensures that the analysis lies between the guess and the observation:

$$\alpha = \frac{\sigma_R^2}{\sigma_f^2+\sigma_R^2} = \frac{1}{\sigma_f^2/\sigma_R^2 + 1} = \frac{1}{1 + \sigma_f^2/\sigma_R^2} \le 1$$

Only the ratio between the uncertainties matters!

One-point DA (3)

$$x_a = \frac{\sigma_R^2}{\sigma_f^2+\sigma_R^2}\,x_f + \left(1 - \frac{\sigma_R^2}{\sigma_f^2+\sigma_R^2}\right)y$$

[Figure: the optimal analysis $x_a$ lies on the segment between the first guess $x_b$ and the observation y; the distances are set by the observation error $a = \sigma_R^2$ and the forecast error $b = \sigma_f^2$. Panel (1): larger confidence in the observations draws the analysis toward the observation. Panel (2): equal confidence in observations and first guess (a = b) places the analysis midway.]

The interpretation of data assimilation is simple; the complexity comes from high-dimensional states and nonlinear operators.

Example 1

$$x_a = \frac{1}{1+\sigma_f^2/\sigma_R^2}\,x_f + \frac{\sigma_f^2/\sigma_R^2}{1+\sigma_f^2/\sigma_R^2}\,y, \qquad x_f = (0,0), \quad y = (10,0)$$

(1) Given $\sigma_f/\sigma_R = 1$, what are the coordinates of $x_a$?

$$x_a = \frac{1}{1+1}(0,0) + \frac{1}{1+1}(10,0) = \frac{1}{2}(0,0) + \frac{1}{2}(10,0) = \left(\tfrac{1}{2}\cdot 0 + \tfrac{1}{2}\cdot 10,\; 0\right) = (5,0)$$

Equal confidence implies the analysis is in the middle.

Example 2

$$x_a = \frac{1}{1+\sigma_f^2/\sigma_R^2}\,x_f + \frac{\sigma_f^2/\sigma_R^2}{1+\sigma_f^2/\sigma_R^2}\,y, \qquad x_f = (0,0), \quad y = (10,0)$$

(2) Given $\sigma_f/\sigma_R = 3$, what are the coordinates of $x_a$?

$$x_a = \frac{1}{1+9}(0,0) + \frac{9}{1+9}(10,0) = \frac{1}{10}(0,0) + \frac{9}{10}(10,0) = \left(\tfrac{1}{10}\cdot 0 + \tfrac{9}{10}\cdot 10,\; 0\right) = (9,0)$$

More confidence in the observation implies the analysis is close to the observation.

Example 3

$$x_a = \frac{1}{1+\sigma_f^2/\sigma_R^2}\,x_f + \frac{\sigma_f^2/\sigma_R^2}{1+\sigma_f^2/\sigma_R^2}\,y, \qquad x_f = (0,0), \quad y = (10,0)$$

(3) Given $\sigma_f/\sigma_R = 1/3$, what are the coordinates of $x_a$?

$$x_a = \frac{1}{1+\frac{1}{9}}(0,0) + \frac{\frac{1}{9}}{1+\frac{1}{9}}(10,0) = \frac{9}{10}(0,0) + \frac{1}{10}(10,0) = \left(\tfrac{9}{10}\cdot 0 + \tfrac{1}{10}\cdot 10,\; 0\right) = (1,0)$$

More confidence in the first guess implies the analysis is close to the guess.
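The three examples can be reproduced with a few lines; `one_point_analysis` is a hypothetical helper built directly on the weight formula $\alpha = 1/(1 + \sigma_f^2/\sigma_R^2)$:

```python
import numpy as np

def one_point_analysis(x_f, y, ratio):
    # ratio = sigma_f^2 / sigma_R^2; alpha = 1 / (1 + ratio) weights the guess.
    alpha = 1.0 / (1.0 + ratio)
    return alpha * x_f + (1.0 - alpha) * y

x_f, y = np.array([0.0, 0.0]), np.array([10.0, 0.0])

print(one_point_analysis(x_f, y, ratio=1.0))      # [5. 0.]  equal confidence
print(one_point_analysis(x_f, y, ratio=9.0))      # [9. 0.]  trust the observation
print(one_point_analysis(x_f, y, ratio=1.0 / 9))  # [1. 0.]  trust the first guess
```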

Challenges of realistic data assimilation

High dimensionality of state and observations
o impacts the degrees of freedom of the forecast error covariance and the acceptable choices of DA methodology
Nonlinearity of simulated physical processes and observation operators
o need the capability to handle nonlinearities
Computation
o costly integration of realistic forecast models; matrix inversion
Observation errors
o bias correction; correlated observation errors
Multivariate character of the DA problem
o dynamical stability of the analysis

Practical data assimilation algorithms: basic methods

Variational data assimilation (3D-Var, 4D-Var)
o maximum a posteriori estimate
o iterative minimization has an advantage for nonlinear operators
o forecast uncertainty is pre-defined (i.e., static)
o forecast uncertainty has all degrees of freedom
o employs an adjoint (i.e., transpose) operator

Ensemble Kalman filter data assimilation (EnKF, EnSRF)
o minimum variance estimate
o assumes the linear KF solution
o statistical sampling of the forecast error covariance
o forecast uncertainty is flow-dependent (from ensemble forecasts)
o reduced number of degrees of freedom
o no need for an adjoint; uses differences of nonlinear functions
(A sketch of the EnKF analysis step follows this list.)
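Below is a minimal sketch of one stochastic (perturbed-observation) EnKF analysis step, with hypothetical dimensions and values; operational EnKFs add localization and inflation on top of this.

```python
import numpy as np

def enkf_analysis(X_f, y, H, R, rng):
    """Stochastic (perturbed-observation) EnKF analysis step.

    X_f : (n, N) forecast ensemble; y : (m,) observations;
    H : (m, n) observation operator; R : (m, m) obs error covariance.
    """
    n, N = X_f.shape
    A = X_f - X_f.mean(axis=1, keepdims=True)          # ensemble anomalies
    P_f = A @ A.T / (N - 1)                            # sampled forecast covariance
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)   # Kalman gain
    # Perturb observations so the analysis ensemble keeps the right spread.
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    return X_f + K @ (Y - H @ X_f)

rng = np.random.default_rng(3)
X_f = rng.normal(size=(5, 20))                 # 5 variables, 20 members
H = np.eye(2, 5)                               # observe the first two variables
y = np.array([1.0, -0.5])
R = 0.1 * np.eye(2)
X_a = enkf_analysis(X_f, y, H, R, rng)
print(X_a.mean(axis=1))                        # analysis mean
```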

Variational cost function

3D-Var cost function (one observation time):

$$f(x) = \frac{1}{2}\,(x-x_f)^T P_f^{-1}(x-x_f) + \frac{1}{2}\,[y-h(x)]^T R^{-1}[y-h(x)]$$

4D-Var cost function (sum over observation times k):

$$f(x) = \frac{1}{2}\,(x-x_f)^T P_f^{-1}(x-x_f) + \frac{1}{2}\,\sum_{k=1}^{T}\,[y-h(m(x))]_k^T\,R_k^{-1}\,[y-h(m(x))]_k$$

o 4D-Var allows a smooth transition to the forecast after data assimilation
o the 4D-Var analysis is more costly to calculate than 3D-Var
o 3D-Var can improve considerably with a better definition of the background error covariance ($P_f$)
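To make the sum over observation times concrete, here is a toy sketch of the 4D-Var cost with a linear propagator M standing in for the model m (all matrices and observation values are hypothetical; a real 4D-Var would minimize this with an adjoint-computed gradient):

```python
import numpy as np

M = np.array([[1.0, 0.1], [0.0, 0.9]])   # toy linear model propagator (hypothetical)
H = np.array([[1.0, 0.0]])               # observe the first variable only
x_f = np.array([0.5, 0.0])
Pf_inv = np.linalg.inv(0.2 * np.eye(2))
R_inv = np.array([[1.0 / 0.05]])
y_k = [np.array([0.62]), np.array([0.71]), np.array([0.77])]  # obs at k = 1..3

def cost_4dvar(x0):
    # f(x) = 1/2 (x-x_f)^T Pf^-1 (x-x_f) + 1/2 sum_k [y - H m_k(x)]^T Rk^-1 [...]
    J = 0.5 * (x0 - x_f) @ Pf_inv @ (x0 - x_f)
    x = x0.copy()
    for y in y_k:
        x = M @ x                         # m(x): advance to observation time k
        d = y - H @ x
        J += 0.5 * d @ R_inv @ d
    return J

print(cost_4dvar(x_f))                    # cost of the first guess itself
```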

What is ensemble data assimilation?

(1) Forecast uncertainty is calculated from multiple model forecasts (ensembles).

[Figure: ensemble forecasts M(x) start from an initial uncertainty at time t and spread into a forecast uncertainty at time t+1.]

(2) The analysis employs the ensemble information and produces its own uncertainty estimate.

[Figure: the analysis update K(x) maps between the dynamical model (phase) space and the observation space.]

Flow-dependent forecast error covariance

[Figure: at grid point x, the flow-dependent covariance stretches along the flow, connecting the grid point to the distant observation obs 1 in a dynamically significant region rather than to the nearby observation obs 2.]

Geographically distant observations can bring more information than close-by observations if they lie in a dynamically significant region.

Impact of static error covariance

[Figure: with a static covariance, an isotropic correlation length scale around grid point x determines which observations influence the analysis.]

Low-valued information (obs 2) will be assimilated instead of high-valued information (obs 1).

Reduced-rank aspect of (ensemble) data assimilation

Only a limited number of ensemble members can be calculated, because of:
- the high cost of forecast model integration
- the high-dimensional state vector

[Figure: the ensemble (phase) space is a small subspace of the full model (phase) space.]

Observations outside the ensemble space cannot be assimilated. Hybrid variational-ensemble methods mitigate this degrees-of-freedom problem by creating uncertainty in all parts of the model space (e.g., by combining flow-dependent and static error covariances).

Why is forecast error covariance important?

Singular value decomposition [Golub and van Loan (1989)]:

$$P_f^{1/2} = V\Sigma W^T = \sum_i \sigma_i\,v_i w_i^T$$

$$P_f = P_f^{1/2}\big(P_f^{1/2}\big)^T = V\Sigma W^T\,W\Sigma V^T = V\Sigma^2 V^T = \sum_i \sigma_i^2\,v_i v_i^T$$

The analysis update is

$$x_a = x_f + P_f H^T\big(HP_f H^T + R\big)^{-1}\,[y - h(x_f)] = x_f + P_f\,z_{obs}, \qquad z_{obs} = H^T\big(HP_f H^T + R\big)^{-1}\,[y - h(x_f)]$$

$$x_a - x_f = \sum_i \sigma_i^2\,v_i v_i^T z_{obs} = \sum_i \mu_i v_i, \qquad \mu_i = \sigma_i^2\,v_i^T z_{obs}$$

The analysis update is a linear combination of the forecast error covariance singular vectors: analysis increments are defined in the subspace spanned by the singular vectors of the forecast error covariance.
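A numerical sketch of this subspace property, with a hypothetical rank-deficient $P_f$: the increment has no component outside the span of the singular vectors of $P_f^{1/2}$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r, m = 50, 5, 10                       # state dim, covariance rank, obs count
S = rng.normal(size=(n, r))               # square root: P_f = S S^T has rank r
P_f = S @ S.T
H = rng.normal(size=(m, n))
R = 0.5 * np.eye(m)
innov = rng.normal(size=m)                # stands in for y - h(x_f)

# x_a - x_f = P_f z_obs with z_obs = H^T (H P_f H^T + R)^-1 [y - h(x_f)]
z_obs = H.T @ np.linalg.solve(H @ P_f @ H.T + R, innov)
increment = P_f @ z_obs

# Left singular vectors v_i of P_f^{1/2} span the range of S.
V = np.linalg.svd(S, full_matrices=False)[0]
residual = increment - V @ (V.T @ increment)      # component outside the span
print(np.linalg.norm(residual))                   # ~0: nothing outside the span
```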

Practical data assimilation algorithms: hybrid methods

Hybrid variational-ensemble data assimilation:

$$J(x,\alpha) = \beta_f\,\frac{1}{2}\,(\delta x_f)^T P_{VAR}^{-1}\,(\delta x_f) + \beta_e\,\frac{1}{2}\,\alpha^T\big(P_{ENS} \circ L\big)^{-1}\alpha + \frac{1}{2}\,\big(y - H\delta x_{tot}\big)^T R^{-1}\big(y - H\delta x_{tot}\big)$$

$$\delta x_{tot} = \delta x_f + \sum_k \alpha_k \circ \big[P_{ENS}^{1/2}\big]_k, \qquad \frac{1}{\beta_f} + \frac{1}{\beta_e} = 1$$

o combines static and flow-dependent error covariances
o iterative minimization
o sequential method

4D-EN-VAR
o 4-D control variable: simultaneous adjustment in time and space
o sequential method
o increased dimension of the control vector
o reduced rank, but can be used as a hybrid
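The covariance implied by this hybrid cost function is the weighted combination $(1/\beta_f)\,P_{VAR} + (1/\beta_e)\,(P_{ENS} \circ L)$. A sketch with hypothetical values, using a simple Gaussian taper in place of an operational localization function such as Gaspari-Cohn, shows how the hybrid restores full rank to the reduced-rank ensemble covariance:

```python
import numpy as np

rng = np.random.default_rng(5)
n, N = 40, 10
X = rng.normal(size=(n, N))                        # ensemble of N members
A = X - X.mean(axis=1, keepdims=True)
P_ens = A @ A.T / (N - 1)                          # flow-dependent, rank <= N-1

# Simple Gaussian taper L as a stand-in for an operational localization.
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
L = np.exp(-0.5 * (dist / 5.0) ** 2)
P_loc = P_ens * L                                  # Schur (element-wise) product

P_var = np.eye(n)                                  # toy static covariance
beta_f, beta_e = 2.0, 2.0                          # satisfy 1/b_f + 1/b_e = 1
P_hybrid = P_var / beta_f + P_loc / beta_e         # implied hybrid covariance

print(np.linalg.matrix_rank(P_ens),                # N-1 = 9: reduced rank
      np.linalg.matrix_rank(P_hybrid))             # n = 40: full rank restored
```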

Future of data assimilation

Practice:
o great need for blending information from observations and models
o coupled data assimilation (DA for coupled modeling systems)
o high temporal frequency observations (e.g., geostationary satellites)

Theory:
o more general mathematical formalism (fewer auxiliary parameters)
o reduced number of assumptions (e.g., Gaussian pdf), ...