Data assimilation. Polyphemus Training Session. June 9, Introduction 2

Similar documents
Assimilation de données dans Polyphemus

A Comparison Study of Data Assimilation Algorithms for Ozone Forecasts

A comparison study of data assimilation algorithms for ozone forecasts

First SAEMC-IAI, STIC-AmSud Workshop. 5 June 2007

Ensemble Kalman Filter

Enhancing information transfer from observations to unobserved state variables for mesoscale radar data assimilation

Brian J. Etherton University of North Carolina

Inter-comparison of 4D-Var and EnKF systems for operational deterministic NWP

Ensemble forecasting and flow-dependent estimates of initial uncertainty. Martin Leutbecher

The Ensemble Kalman Filter:

A Stochastic Collocation based. for Data Assimilation

Ensemble prediction and strategies for initialization: Tangent Linear and Adjoint Models, Singular Vectors, Lyapunov vectors

A data-driven method for improving the correlation estimation in serial ensemble Kalman filter

Improved analyses and forecasts with AIRS retrievals using the Local Ensemble Transform Kalman Filter

Some Applications of WRF/DART

Ensemble Kalman Filter based snow data assimilation

The ECMWF Hybrid 4D-Var and Ensemble of Data Assimilations

Gaussian Filtering Strategies for Nonlinear Systems

Introduction to Data Assimilation

4. DATA ASSIMILATION FUNDAMENTALS

Ting Lei, Xuguang Wang University of Oklahoma, Norman, OK, USA. Wang and Lei, MWR, Daryl Kleist (NCEP): dual resolution 4DEnsVar

GSI Tutorial Background and Observation Errors: Estimation and Tuning. Daryl Kleist NCEP/EMC June 2011 GSI Tutorial

Towards operationalizing ensemble DA in hydrologic forecasting

Data Assimilation Research Testbed Tutorial

Data Assimilation Research Testbed Tutorial. Section 7: Some Additional Low-Order Models

Estimating Observation Impact in a Hybrid Data Assimilation System: Experiments with a Simple Model

Relative Merits of 4D-Var and Ensemble Kalman Filter

An Efficient Ensemble Data Assimilation Approach To Deal With Range Limited Observation

Application of the Ensemble Kalman Filter to History Matching

Data assimilation didactic examples. User s guide - David Cugnet, 2011

Fundamentals of Data Assimilation

The hybrid ETKF- Variational data assimilation scheme in HIRLAM

Cross-validation methods for quality control, cloud screening, etc.

Methods of Data Assimilation and Comparisons for Lagrangian Data

Data Assimilation in Geosciences: A highly multidisciplinary enterprise

DATA ASSIMILATION FOR FLOOD FORECASTING

ICON. Limited-area mode (ICON-LAM) and updated verification results. Günther Zängl, on behalf of the ICON development team

Numerical Weather Prediction: Data assimilation. Steven Cavallo

Variational ensemble DA at Météo-France Cliquez pour modifier le style des sous-titres du masque

4DEnVar. Four-Dimensional Ensemble-Variational Data Assimilation. Colloque National sur l'assimilation de données

Estimation of State Noise for the Ensemble Kalman filter algorithm for 2D shallow water equations.

Introduction to Data Assimilation. Saroja Polavarapu Meteorological Service of Canada University of Toronto

Kalman Filter and Ensemble Kalman Filter

Applications of an ensemble Kalman Filter to regional ocean modeling associated with the western boundary currents variations

SPECIAL PROJECT PROGRESS REPORT

(Extended) Kalman Filter

The Local Ensemble Transform Kalman Filter (LETKF) Eric Kostelich. Main topics

Can hybrid-4denvar match hybrid-4dvar?

Title. Syntax. Description. Options. stata.com. forecast estimates Add estimation results to a forecast model

1. Current atmospheric DA systems 2. Coupling surface/atmospheric DA 3. Trends & ideas

Comparing Variational, Ensemble-based and Hybrid Data Assimilations at Regional Scales

Hybrid variational-ensemble data assimilation. Daryl T. Kleist. Kayo Ide, Dave Parrish, John Derber, Jeff Whitaker

Model error and parameter estimation

Coupled atmosphere-ocean data assimilation in the presence of model error. Alison Fowler and Amos Lawless (University of Reading)

Juli I. Rubin. NRC Postdoctoral Research Associate Naval Research Laboratory, Monterey, CA

Validation of Complex Data Assimilation Methods

Hierarchical Bayes Ensemble Kalman Filter

Assimilation of SEVIRI data II

Report on the Joint SRNWP workshop on DA-EPS Bologna, March. Nils Gustafsson Alex Deckmyn.

Ensemble-based Chemical Data Assimilation I: General Approach

CS Homework 3. October 15, 2009

Mesoscale Predictability of Terrain Induced Flows

Summary of activities with SURFEX data assimilation at Météo-France. Jean-François MAHFOUF CNRM/GMAP/OBS

EnKF Review. P.L. Houtekamer 7th EnKF workshop Introduction to the EnKF. Challenges. The ultimate global EnKF algorithm

Data Assimilation Research Testbed Tutorial

Improving GFS 4DEnVar Hybrid Data Assimilation System Using Time-lagged Ensembles

Experiences from implementing GPS Radio Occultations in Data Assimilation for ICON

JOURNAL OF GEOPHYSICAL RESEARCH, VOL.???, XXXX, DOI: /,

Data assimilation in coastal ocean modelling and forecasting. Application aspects - beyond the user manual

Estimating Observation Impact in a Hybrid Data Assimilation System: Experiments with a Simple Model

Relationship between Singular Vectors, Bred Vectors, 4D-Var and EnKF

Re-analysis activities at SMHI

AMPS Update June 2016

Practical Aspects of Ensemble-based Kalman Filters

Short tutorial on data assimilation

WRF-LETKF The Present and Beyond

Development of the NCAR 4D-REKF System and Comparison with RTFDDA, DART-EaKF and WRFVAR

Relationship between Singular Vectors, Bred Vectors, 4D-Var and EnKF

Numerical Weather prediction at the European Centre for Medium-Range Weather Forecasts (2)

Generating climatological forecast error covariance for Variational DAs with ensemble perturbations: comparison with the NMC method

COMPARATIVE EVALUATION OF ENKF AND MLEF FOR ASSIMILATION OF STREAMFLOW DATA INTO NWS OPERATIONAL HYDROLOGIC MODELS

Current Limited Area Applications

Met Office convective-scale 4DVAR system, tests and improvement

Data assimilation; comparison of 4D-Var and LETKF smoothers

WRF Modeling System Overview

GEOS Chem v7 Adjoint User's Guide

In the derivation of Optimal Interpolation, we found the optimal weight matrix W that minimizes the total analysis error variance.

Variational data assimilation of lightning with WRFDA system using nonlinear observation operators

Current Issues and Challenges in Ensemble Forecasting

Ergodicity in data assimilation methods

Report on stay at ZAMG

Background Error, Observation Error, and GSI Hybrid Analysis

Numerical Weather prediction at the European Centre for Medium-Range Weather Forecasts

OOPC-GODAE workshop on OSE/OSSEs Paris, IOCUNESCO, November 5-7, 2007

Tangent linear and adjoint models for variational data assimilation

DART_LAB Tutorial Section 2: How should observations impact an unobserved state variable? Multivariate assimilation.

Regional services and best use for boundary conditions

Update on the KENDA project

Lecture 1 Data assimilation and tropospheric chemistry. H. Elbern

Retrospective and near real-time tests of GSIbased EnKF-Var hybrid data assimilation system for HWRF with airborne radar data

Transcription:

Data assimilation Polyphemus Training Session June 9, 2009 About Purpose: introduction to optimal interpolation, Kalman filtering and variational data assimilation with Polair3D; application to photochemistry Authors: Vivien Mallet, Vivien.Mallet@inria.fr; Lin Wu, Lin.Wu@cerea.enpc.fr Polyphemus version: 1.5 Location: http://www.enpc.fr/cerea/polyphemus/sessions.html Contents Introduction 2 1 Test Case 3 1.1 Introduction....................................... 3 1.2 Main Configuration File................................ 4 1.3 Simulation and Observations............................. 5 2 Optimal Interpolation 6 2.1 Introduction....................................... 6 2.2 Configuration...................................... 7 2.3 Experiment....................................... 7 3 Ensemble Kalman Filter 8 3.1 Introduction....................................... 8 3.2 Configuration...................................... 8 3.3 Experiment....................................... 8 4 4D-Var 8 4.1 Introduction....................................... 8 4.2 Configuration and Experiments............................ 9 5 Reduced-Rank Kalman Filter 9 5.1 Introduction....................................... 9 5.2 Configuration and Experiment............................ 10 6 Optional Questions 10 1

Preparing the Session Extracting the archive creates a directory assimilation/ in polyphemus-sessions/, which has also been created if necessary. Place the source Polyphemus-1.5 in polyphemus-sessions/ also. Introduction In Polyphemus, data assimilation algorithms are implemented as drivers independent from the models. Actually each driver expects a few methods in the model interface. In the base driver, used for forward simulations without assimilation, the main method (BaseDriver::Run, see include/driver/common/basedriver.cxx) reads /*** Initializations ***/ Model.Init(); if (rank == 0) OutputSaver.Init(Model); /*** Time loop ***/ for (int i = 0; i < Model.GetNt(); i++) { if (rank == 0 && option_display["show_iterations"]) cout << "Performing iteration #" << i << endl; if (rank == 0 && option_display["show_date"]) cout << "Current date: " << Model.GetCurrentDate().GetDate("%y-%m-%d %h:%i") << endl; Model.InitStep(); if (rank == 0) OutputSaver.InitStep(Model); Model.Forward(); if (rank == 0) OutputSaver.Save(Model); } The main calls are Model.Init() for initialization (reading configuration and memory allocation), Model.InitStep() for initialization at each timestep (reading input data) and Model.Forward (time integration over one timestep). Now assume that observations are read in the driver and that a method allows to modify model concentrations. In include/driver/assimilation/optimalinterpolationdriver.cxx, the following lines are added in the main time loop: 2

// Retrieves observations. this->obsmanager.setdate(this->model.getcurrentdate()); if (this->obsmanager.isavailable()) { cout << "Performing optimal interpolation at date [" << this->model.getcurrentdate().getdate("%y-%m-%d %h:%i") << "]..."; cout.flush(); this->model.getstate(state_vector); Analyze(state_vector); this->model.setstate(state_vector); this->outputsaver.setgroup("analysis"); this->outputsaver.save(this->model); } cout << " done." << endl; The key methods are this->model.getstate to retrieve the state vector in the model (concentrations, maybe emissions or whatsoever) and this->model.setstate to replace the forecasted state with the analyzed state. A few other methods in the model are required by the driver. Eulerian models Castor and Polair3D do not have these methods in their base versions. Hence derived versions of them (C++ inheritance) should add the required methods. In directory include/models/, Polair3DChemistry is derived into Polair3DChemistryAssimConc in order to implement GetState, SetState, GetState ccl (adjoint variable), SetState ccl, BackgroundErrorCovariance, ModelErrorCovariance and a few other methods. All drivers for data assimilation are in the directory include/driver/assimilation/. They implement the optimal interpolation algorithm (Section 2), Kalman filters (ensemble Section 3 and RRSQRT Section 5), 4D-Var assimilation (Section 4). 1 Test Case 1.1 Introduction The test case is a simulation with RACM (72 chemical species, with ozone, O 3, among targets) over Europe from 1 st July 2001 at 00:00 UT to 2 nd July at 23:00 UT. There is enough data to extend the simulation period up to 10 days. In order to speed up the computations, a very coarse mesh is used: the horizontal resolution is 2 with 11 cells along latitude and 16 cells along longitude, and there are 3 vertical levels (up to 1200 m). This is enough to reproduce the main 3

physical phenomena and to test the assimilation algorithms. The domain and its discretization is shown in Figure 1. 55 Latitude (degrees) 50 45 40 35-10 -5 0 5 10 15 20 Longitude (degrees) Figure 1: Horizontal discretization for the reference simulation. The circles show the locations of EMEP monitoring stations, and the disc shows the location of the monitoring station Montandon. 1.2 Main Configuration File The configuration files are in assimilation/config/. Move to this directory. All commands should be launched from this directory. Open polair3d.cfg. It contains several sections: [domain]; [data assimilation]: data assimilation parameters shared by all or several algorithms. [EnKF]: options for ensemble Kalman filter; [RRSQRT]: options for reduced rank square root Kalman filter; [4DVar]: options for variational assimilation; [optimizer]: options for the minimizer (in 4D-Var); [display], [options] and [data]; [output]: note Configuration file is set to polair3d.cfg, hence output savers configuration is put in polair3d.cfg too; [state]: defines the state vector in the assimilation algorithm; [observation management]: path to observations description; [perturbation management]: path to perturbations description; 4

[save]: typical output saver configuration; sections [@save]: you can insert any section whose name is not expected by the code, and the code will ignore it; here, sections [@save] are therefore skipped rename them into [save] in order to activate them; [adjoint]: options about adjoint management. Further details are available primarily in Polyphemus user s guide and in next sections. 1.3 Simulation and Observations Launch the reference simulation and display an ozone map on 1 st July at 10:00 UT. The command is../../polyphemus-1.5/processing/racm/polair3d-unsplit polair3d.cfg 1. You should get the map in Figure 2. You may use getm and getd with config/disp.cfg. 200 55 175 150 50 125 45 100 75 40 50 25 35-10 -5 0 5 10 15 20 0 Figure 2: Ozone map of the reference simulation on 1 st July at 10:00 UT. Comparison with observations is performed with AtmoPy. The file config/score.py shows an example. Launch run -i score in a IPython shell (launched in assimilation/config/. Browse score.py. In this program, the main variables are sim3d: concentration field (same as sim3d = getd("disp.cfg")); sim: list (indexed by 87 monitoring stations) of hourly simulated concentrations at monitoring stations sim[i] is therefore a vector of simulated concentrations at station #i; obs: list (indexed by 87 monitoring stations) of hourly measured concentrations at monitoring stations obs has the same structure as sim; rmse: root mean square error as function of time (hours). 1 Note that compilingpolair3d-unsplit with SCons (recommended) will require to slightly modify SConstruct in processing/racm/. Explanations are available in processing/racm/sconstruct. 5

Compute the root mean square error (RMSE) over the whole simulation period (two days) using ensemble.collect. Help about ensemble.collect may be found in AtmoPy documentation (http://cerea.enpc.fr/polyphemus/doc/atmopy/index.html, module ensemble.combine) or in IPython with command help ensemble.collect. Display observations and simulated concentrations at Montandon, France. Use sim and obs in which Montandon is the first station and command atmopy.display.plot. You should get Figure 3. Note that Montandon is in cell (j = 6, i = 8). We chose not to interpolate concentrations at the station; hence concentrations at Montandon are the same as concentrations in cell (6, 8). 140 Polyphemus Observation 120 100 Concentration 80 60 40 20 0 10 20 30 40 50 Observation number Figure 3: Ozone concentrations at Montandon (EMEP station in France). 2 Optimal Interpolation 2.1 Introduction Optimal interpolation (OI) improves the state vector c each time a new vector o of observations is available: c = c b + PH T (HPH T + R) 1 (o Hc b ) (1) where c b is the background value of the state vector, H is the observation operator that maps the state vector to the observation space, R is the covariance matrix of observation errors and P is the covariance matrix of background error. In this session, hourly ozone measurements from the EMEP network are assimilated. The implementation of the algorithm may be found in../../polyphemus-1.5/include/driver/assimilation/optimalinterpolationdriver.cxx. From a technical point of view, data assimilation involves (1) a model with the suitable interface, (2) an observation manager, (3) a driver that gathers both and implements 6

the assimilation algorithm. The model here derives from Polair3DChemistry and is called Polair3DChemistryAssimConc. This derived model implements a few additional methods needed for data assimilation with concentrations as state variable. The observation manager may be GroundObservationManager or SimObservationManager. In this session, only GroundObservationManager is used. Consequently, in../../polyphemus-1.5/processing/assimilation/polair3d-oi.cpp, the driver is declared this way: OptimalInterpolationDriver<real, ClassModel, BaseOutputSaver<real, ClassModel>, GroundObservationManager<real> > Driver(argv[1]); 2.2 Configuration The covariance matrix of background error is either diagonal or in Balgovind form. In the latter option, the error covariance between two points depends on the distance r between these two points: ( f(r) = 1 + r ) ( exp r ) v (2) L L where L is a characteristic length (Balgovind scale background) and v is a variance (Background error variance). In the configuration, section [data assimilation] describes error statistics. Refer to the user s guide about Polair3DChemistryAssimConc for further details. The configuration of Polair3DChemistryAssimConc includes the section [state] which describes the state variables for the assimilation procedure. The observation manager GroundObservationManager is configured in observation.cfg. 2.3 Experiment It is advocated to save the results of the reference simulation, otherwise they will be replaced with the new results (in directory results/). For instance, execute cp../results/o3.bin../results/save/o3-ref.bin. Later, you may reload the results with run score../results/save/o3-ref.bin. Launch the assimilation experiment. The command is../../polyphemus-1.5/processing/assimilation/polair3d-oi polair3d.cfg. Estimate the performances of the new simulation. In IPython, launch the command run score. In polair3d.cfg, rename the first two sections[@save] into[save]. This will add the saver unit of type domain assimilation and domain prediction (entry Type) which saves analyzed and predicted concentrations and their dates. Display the analyzed and predicted concentrations. In order to check the algorithm behavior, assimilate only observations from Montandon. Since Montandon is the first station, we select it if you set Nstation to 1 in observation.cfg. Compare two simulations with Montandon observations solely assimilated and with different Balgovind parameters. 7

3 Ensemble Kalman Filter 3.1 Introduction Ensemble Kalman filter (EnKF) differs with optimal interpolation in that the error covariance matrix is flow-dependent and approximated by ensemble statistics. The model applies to the r member ensemble {c i, i = 1,...,r}, and produces forecast {c f i, i = 1,...,r}. Whenever observations are available, each ensemble member c f i is updated according to the OI formula (1). The statistics of the updated ensemble {c a i, i = 1,...,r} gives the estimation of model status conditioned to observations. The forecast error covariance matrix P f is approximated by P f = 1 r 1 r (c ( f i cf) c f i cf) T i=1 where c f is the ensemble mean. After the update, the estimated error covariance matrix P a is approximated by P a = 1 r 1 (3) r (c a i c a )(c a i c a ) T (4) i=1 The implementation of the algorithm may be found in../../polyphemus-1.5/include/driver/assimilation/enkfdriver.cxx. 3.2 Configuration The configurations for Polair3DChemistryAssimConc and GroundObservationManager are similar to those of OI. In polair3d.cfg, the ensemble number r can be modified in section [EnKF]. The observations in the update formula can be perturbed to preserve consistent statistics. The ensemble is generated by perturbation of fields defined in perturbation.cfg. 3.3 Experiment The experiment settings are similar to those of OI. Launch the assimilation experiment. The command is../../polyphemus-1.5/processing/assimilation/polair3d-enkf polair3d.cfg. In IPython, launch the command run score to estimate the performances of the new simulation. In polair3d.cfg, rename the third section [@save] into [save]. This will add the saver unit of type domain ensemble which saves ensemble forecast during assimilation. Program load ensemble.py is provided to load ensemble date. Browse the program. The main variable is ens; it stores concentrations of all ensemble members. Display concentrations of all members at Montandon. Analyze the spatial distribution of uncertainties. You may use atmopy.stat.spatial distribution (refer to user s guide). Try with other fields perturbations. Evaluate the impact of the random-generator seed. 4 4D-Var 4.1 Introduction In 4DVar, the optimal initial condition is obtained by minimizing a cost function, 8

J(c) = 1 2 (c cb ) T B 1 (c c b ) + 1 }{{} 2 J b N {}}{ (o t H t (c t )) T Rt 1 (o t H t (c t )) t=0 J ot } {{ } J o under the constraint c t = M 0 t (c) = M t 1 t... M 0 1 c where M t 1 t denotes integrations of the underlying chemistry-transport model. The gradient is calculated with the adjoint model. Based on the optimal initial condition, the assimilated concentrations are simulated from time step 0 to N. Further integrations (after time step N) provide a subsequent forecast. Note that in this version, only diagonal B is supported. The implementation of the algorithm may be found in FourDimVarDriver.cxx. 4.2 Configuration and Experiments The configuration and experiment setting are similar to those of EnKF. The tuning parameters for 4DVar can be adjusted in polair3d.cfg. Launch the assimilation experiment. The command is../../polyphemus-1.5/processing/assimilation/polair3d-4dvar polair3d.cfg. In IPython, launch the command run score to estimate the performances of the new simulation. Display concentrations at Montandon. Add the second model layer in the state vector and compare with previous results. 5 Reduced-Rank Kalman Filter 5.1 Introduction Reduced-rank square root Kalman filter (RRSQRT) uses a low-rank representation LL T of error covariances matrix P. L = [l 1,...,l q ] is the mode matrix whose columns (modes) are the dominant directions of the forecast error. The modes can also be interpreted as bases for how the true states might differ from mean state. The evolution of a mode can thus be approximated by the differences of the forecasts based on the mean state and its perturbation by this mode, that is, l i t = M t 1 t ( c t 1 + l i t 1) M t 1 t ( c t 1 ) (6) where t 1, t are consecutive time index. Note that in extended Kalman filter, the forecast error covariance matrix is calculated by L t L T t = P t = MP t 1 M T + Q t 1 = (5) [ML t 1 Q 1 2t 1 ] [ ML t 1 Q 1 2 t 1 ] T (7) where M is the tangent linear model of M t 1 t, Q 1 2 t 1 is the square root of model error covariance matrix. We do not use tangent linear model, but employ formula (6) to simulate ML t 1. Q 1 2 t 1 is obtained by perturbing fields as in the EnKF implementation. The update formula for the mode matrix can be sketched as L a t = L t Π. where Π is the square root of I KH, I is the identity matrix, K is the Kalman gain calculated from L t, and H is the linearized observation operator. Implementation details can be found in:../../polyphemus-1.5/include/driver/assimilation/rrsqrtdriver.cxx. 9

5.2 Configuration and Experiment The configuration and experiment setting are similar to those of EnKF. The tuning parameters for RRSQRT can be adjusted in polair3d.cfg (section (RRSQRT]. Launch the assimilation experiment. The command is../../polyphemus-1.5/processing/assimilation/polair3d-rrsqrt polair3d.cfg. In IPython, launch the command run score to estimate the performances of the new simulation. 6 Optional Questions Try to use the drivers AdjointDriver and GradientDriver. In the previous questions, the assimilation is performed over the first day and prediction occurs the second day. Assimilate over the first two days and predict for the third day. Compare with results where assimilation is performed only over the second day. 10