Tangent linear and adjoint models for variational data assimilation


Data Assimilation Training Course, Reading. Tangent linear and adjoint models for variational data assimilation. Angela Benedetti, with contributions from: Marta Janisková, Philippe Lopez, Lars Isaksen, Gabor Radnoti and Yannick Tremolet

Introduction

4D-Var is based on the minimization of a cost function which measures the distance between the model and the observations, and between the model and the background state. Both the cost function and its gradient are needed in the minimization. The tangent linear model provides a computationally efficient (although approximate) way to calculate the model trajectory, and from it the cost function. The adjoint model is a very efficient tool to compute the gradient of the cost function.

Overview:
- Brief introduction to 4D-Var with focus on TL/AD aspects
- General definitions of tangent linear and adjoint models and why they are extremely useful in variational assimilation
- Writing TL and AD models and testing them
- Brief mention of automatic differentiation software

4D-Var

In 4D-Var the cost function can be expressed as follows:

J(x) = \frac{1}{2} (x - x_b)^T B^{-1} (x - x_b) + \frac{1}{2} \sum_{i=0}^{n} (H_i(M_i[x]) - y_i)^T R_i^{-1} (H_i(M_i[x]) - y_i) = J_b + J_o

where B is the background error covariance matrix, R_i the observation error covariance matrix (instrumental + interpolation + observation operator error), M_i the forward nonlinear forecast model (time evolution of the model state, index i), and H_i the observation operator (model space → observation space).

min J  ⟹  \nabla J = B^{-1}(x - x_b) + \sum_{i=0}^{n} M'^T[t_i, t_0] \, H_i'^T R_i^{-1} (H_i(M_i[x]) - y_i) = 0

where H_i'^T is the adjoint of the observation operator and M'^T the adjoint of the forecast model.
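This structure can be sketched numerically with a toy scalar system (all numbers hypothetical, chosen for illustration, not from the lecture): a one-variable linear model M_i(x) = a^i x, an identity observation operator, and scalar variances B and R.

```python
# Toy scalar 4D-Var cost function (illustrative sketch, not ECMWF code):
# J(x) = 1/2 (x - xb)^2 / B + 1/2 * sum_i (M_i(x) - y_i)^2 / R,
# with a linear one-variable model M_i(x) = a^i * x and H = identity.
def cost(x, xb=1.0, B=1.0, R=0.5, a=0.9, obs=(0.95, 0.85, 0.75)):
    Jb = 0.5 * (x - xb) ** 2 / B     # background term
    Jo = 0.0
    xi = x
    for y in obs:                    # loop over observation times t_i
        xi = a * xi                  # one forward-model step
        Jo += 0.5 * (xi - y) ** 2 / R
    return Jb + Jo

print(round(cost(1.0), 6))
```

Evaluating J for several trial states in this way is exactly what a minimizer does; the adjoint discussed below supplies the matching gradient.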

Incremental 4D-Var at ECMWF

In incremental 4D-Var, the cost function is minimized in terms of increments \delta x(t_0), with the model state defined at any time t_i as:

x(t_i) = x_b(t_i) + \delta x(t_i),   \delta x(t_i) = M'[t_0, t_i] \, \delta x(t_0)

where x_b is the trajectory around which the linearization is performed. The 4D-Var cost function can then be approximated to first order by:

J(\delta x) = \frac{1}{2} \delta x^T B^{-1} \delta x + \frac{1}{2} \sum_{i=0}^{n} (H_i' M'[t_0, t_i] \delta x - d_i)^T R_i^{-1} (H_i' M'[t_0, t_i] \delta x - d_i)

where d_i = y_i - H_i(x_b(t_i)) is the so-called departure, computed using the nonlinear model and observation operator. The gradient of the cost function to be minimized is:

min J  ⟹  \nabla J = B^{-1} \delta x + \sum_{i=0}^{n} M'^T[t_i, t_0] \, H_i'^T R_i^{-1} (H_i' M'[t_0, t_i] \delta x - d_i) = 0

M' and H_i' are the tangent linear models, used in the computation of the incremental updates during the (iterative) minimization. M'^T and H_i'^T are the adjoint models, used to obtain the gradient of the cost function with respect to the initial condition.

Details on the linearisation

In the first-order approximation, a perturbation of the control variable (initial condition) evolves according to the tangent linear model:

\delta x_i = M_i' \delta x_{i-1} = M_i' M_{i-1}' \cdots M_1' \delta x_0

where i is the time-step index. The perturbation of the cost function around the initial state x_0 is:

J(x_0 + \delta x_0) = \frac{1}{2} \delta x_0^T B^{-1} \delta x_0 + \frac{1}{2} \sum_{i=0}^{n} (H_i' \delta x_i - d_i)^T R_i^{-1} (H_i' \delta x_i - d_i)
  = \frac{1}{2} \delta x_0^T B^{-1} \delta x_0 + \frac{1}{2} \sum_{i=0}^{n} (H_i' M_i' \cdots M_1' \delta x_0 - d_i)^T R_i^{-1} (H_i' M_i' \cdots M_1' \delta x_0 - d_i)

where H_i' is the linearised version of H_i about x_i, and d_i = y_i - H_i(x_i) are the departures from the observations.

Details of the linearisation (cont.)

The gradient of the cost function with respect to \delta x_0 is given by:

\nabla_{\delta x_0} J = B^{-1} \delta x_0 + \sum_{i=0}^{n} (H_i' M_i' \cdots M_1')^T R_i^{-1} (H_i' M_i' \cdots M_1' \delta x_0 - d_i)

Remembering that (AB)^T = B^T A^T, this becomes:

\nabla_{\delta x_0} J = B^{-1} \delta x_0 + \sum_{i=0}^{n} M_1'^T \cdots M_i'^T H_i'^T R_i^{-1} (H_i' M_i' \cdots M_1' \delta x_0 - d_i)
  = B^{-1} \delta x_0 + \sum_{i=0}^{n} M'^T[t_i, t_0] \, H_i'^T R_i^{-1} (H_i' M'[t_0, t_i] \delta x_0 - d_i)

The optimal initial perturbation is obtained by finding the value of \delta x_0 for which \nabla_{\delta x_0} J = 0. The gradient of the cost function with respect to the initial condition is provided by the adjoint solution at time t_0. Let's see how.

Definition of the adjoint operator

For any linear operator M' there exists an adjoint operator M'^* such that:

\langle M' x, y \rangle = \langle x, M'^* y \rangle

where \langle \cdot, \cdot \rangle is an inner (scalar) product and x, y are vectors (or functions) of the space in which this product is defined. It can be shown that, for the canonical inner product of the Euclidean space, the adjoint is simply the transpose: M'^* = M'^T.

We will now show that the gradient of the cost function at time t_0 is provided by the solution of the adjoint equations at the same time.
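The defining identity is easy to verify numerically. The sketch below (with a hypothetical 2x3 matrix L) checks that \langle L x, y \rangle = \langle x, L^T y \rangle holds exactly for the Euclidean inner product:

```python
# Check <L x, y> = <x, L^T y> for the Euclidean inner product.
# L is an arbitrary (hypothetical) 2x3 matrix; its adjoint is the transpose.
L = [[1.0, 2.0, -1.0],
     [0.0, 3.0, 4.0]]
x = [1.0, -2.0, 0.5]
y = [2.0, 1.0]

Lx = [sum(L[i][j] * x[j] for j in range(3)) for i in range(2)]    # L x
LTy = [sum(L[i][j] * y[i] for i in range(2)) for j in range(3)]   # L^T y

lhs = sum(a * b for a, b in zip(Lx, y))    # <L x, y>
rhs = sum(a * b for a, b in zip(x, LTy))   # <x, L^T y>
print(lhs, rhs)
```

With these dyadic numbers the two inner products agree exactly, not just to rounding error.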

Adjoint solution

Usually the initial guess is chosen to be equal to the background x_b, so that the initial perturbation \delta x_0 = 0. The gradient of the cost function is hence simplified to:

\nabla_{\delta x_0} J = - \sum_{i=0}^{n} M'^T[t_i, t_0] \, H_i'^T R_i^{-1} d_i

We choose the solution of the adjoint system as follows:

\tilde{x}_{n+1} = 0   (arbitrary final condition)
\tilde{x}_i = M_{i+1}'^T \tilde{x}_{i+1} + H_i'^T R_i^{-1} d_i,   i = n, ..., 0   (iterative backward solution)

We then substitute progressively the solution \tilde{x}_{i+1} into the expression for \tilde{x}_i.

Adjoint solution (cont.)

Substituting progressively (written out here for n = 2):

\tilde{x}_0 = H_0'^T R_0^{-1} d_0 + M_1'^T (H_1'^T R_1^{-1} d_1 + M_2'^T (H_2'^T R_2^{-1} d_2))

Finally, regrouping and remembering that M'^T[t_i, t_0] = M_1'^T M_2'^T \cdots M_i'^T, we obtain the following equality:

\tilde{x}_0 = \sum_{i=0}^{n} M'^T[t_i, t_0] \, H_i'^T R_i^{-1} d_i = - \nabla_{\delta x_0} J

The gradient of the cost function with respect to the control variable (initial condition) is obtained by a backward integration of the adjoint model.

Iterative steps in the 4D-Var algorithm

1. Integrate the forward model: gives J(x).
2. Integrate the adjoint model backwards: gives \nabla J(x).
3. If \nabla J(x) \approx 0, stop.
4. Compute a descent direction D (Newton, conjugate gradient, ...).
5. Compute a step size \rho.
6. Update the initial condition: x = x + \rho D.
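The six steps can be sketched as steepest descent on a hypothetical scalar quadratic cost (the gradient is coded directly here, whereas in 4D-Var it would come from the backward adjoint integration):

```python
# Steepest-descent sketch of the iteration above on a toy quadratic cost
# J(x) = 0.5*(x-1)^2 + 0.5*(2x-3)^2 (hypothetical numbers, for illustration).
def J(x):
    return 0.5 * (x - 1.0) ** 2 + 0.5 * (2.0 * x - 3.0) ** 2

def gradJ(x):
    return (x - 1.0) + 2.0 * (2.0 * x - 3.0)

x = 0.0                          # first guess of the initial condition
for _ in range(100):
    g = gradJ(x)                 # step 2: gradient (adjoint in real 4D-Var)
    if abs(g) < 1e-10:           # step 3: convergence check
        break
    D = -g                       # step 4: descent direction (steepest descent)
    rho = 0.1                    # step 5: fixed step size, for simplicity
    x = x + rho * D              # step 6: update the initial condition
print(round(x, 6), round(J(x), 6))   # step 1 at each pass gave J(x)
```

A fixed step size is the crudest choice; practical minimizers (conjugate gradient, quasi-Newton) choose both D and \rho adaptively.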

Finding the minimum of the cost function J is an iterative minimization procedure.

[Figure: schematic of the cost function J, starting from J(x_b) and descending towards the minimum through successive updates x_{m+1} = x_m + \rho_m D_m along descent directions D_m.]

An analysis cycle in 4D-Var

1st ifstraj: the nonlinear model is used to compute the high-res trajectory (T1279 operational, 12-h forecast). High-res departures are computed at the exact observation time and location. The trajectory is interpolated to low resolution (T159). (Two minimizations in the old configuration; now 3 minimizations are operational!)

1st ifsmin (70 iterations): iterative minimization at T159 resolution. The tangent linear model with simplified physics is used to calculate the increments \delta x_i. The adjoint is used to compute the gradient of the cost function with respect to the initial condition. The analysis increment at initial time is interpolated back linearly from low-res to high-res and provides a new initial state for the 2nd trajectory run.

2nd ifstraj: repeats the 1st ifstraj and interpolates at T255 resolution.

2nd ifsmin (30 iterations): repeats the 1st ifsmin at T255.

Last ifstraj: uses the updated initial condition to run another 12-h forecast and stores the analysis departures in the Observational Data Base (ODB).

Recap on TL and AD models

Simple example of adjoint writing

Simple example of adjoint writing (cont.)

Often the adjoint variables in mathematical formulations are indicated with an asterisk. Do not forget the last equation! That too is part of the adjoint! As an alternative to the matrix method, adjoint coding can be carried out using a line-by-line approach.

More practical examples of adjoint coding: the Lorenz model

dx/dt = -p (x - y)
dy/dt = x (r - z) - y
dz/dt = x y - b z

where p is the Prandtl number, r the Rayleigh number, and b the aspect ratio; x is the intensity of convection, y is the maximum temperature difference, and z is the stratification change due to convection. Details on the Lorenz model can be found in the references.
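As a reference point for the coding examples that follow, here is the nonlinear Lorenz tendency written in Python rather than the course's Fortran (a sketch; the classical parameter values p = 10, r = 28, b = 8/3 are an assumption):

```python
# Nonlinear Lorenz (1963) tendency, mirroring the lecture's statements:
# y(1) = -p*x(1) + p*x(2); y(2) = x(1)*(r - x(3)) - x(2); y(3) = x(1)*x(2) - b*x(3).
def tendency(x, p=10.0, r=28.0, b=8.0 / 3.0):
    return [-p * x[0] + p * x[1],
            x[0] * (r - x[2]) - x[1],
            x[0] * x[1] - b * x[2]]

def euler_step(x, dt=0.01):
    """One explicit Euler step of the nonlinear model (simplest scheme)."""
    y = tendency(x)
    return [x[i] + dt * y[i] for i in range(3)]

print(euler_step([1.0, 1.0, 1.0]))
```

Any time-stepping scheme would do; explicit Euler keeps the later tangent-linear and adjoint code as short as possible.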

The linear code in Fortran

Linearize each line of the code one by one, setting y = dx/dt for simplicity:

y(1) = -p*x(1) + p*x(2)                          ! nonlinear statement
yd(1) = -p*xd(1) + p*xd(2)                       ! tangent linear

y(2) = x(1)*(r - x(3)) - x(2)                    ! nonlinear statement
yd(2) = xd(1)*(r - x(3)) - x(1)*xd(3) - xd(2)    ! tangent linear

etc.

Remember that p, r, b are constants; x(1), x(2) and x(3) are the independent variables; y(1), y(2) and y(3) are the dependent variables. We choose the suffix d for the tangent linear variables for consistency with the automatic differentiation software TAPENADE (see the optional practical exercise). Adjoint variables are indicated with the suffix b. This is just a convention. Note that in the ECMWF Integrated Forecast System (IFS) the tangent linear and adjoint variables are indicated without any suffix and the nonlinear trajectory x is indicated with the suffix 5 (x5).
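The same statements transcribe directly to Python (an illustrative sketch, not the course's Fortran); each tangent-linear line is the differential of the corresponding nonlinear line around the trajectory x:

```python
# Tangent linear of the Lorenz tendency: perturbation tendency yd from the
# input perturbation xd, linearized around the trajectory x.
def tl_tendency(x, xd, p=10.0, r=28.0, b=8.0 / 3.0):
    yd = [0.0, 0.0, 0.0]
    yd[0] = -p * xd[0] + p * xd[1]                     # from y(1) = -p*x(1) + p*x(2)
    yd[1] = xd[0] * (r - x[2]) - x[0] * xd[2] - xd[1]  # from y(2) = x(1)*(r-x(3)) - x(2)
    yd[2] = xd[0] * x[1] + x[0] * xd[1] - b * xd[2]    # from y(3) = x(1)*x(2) - b*x(3)
    return yd

print(tl_tendency([1.0, 1.0, 1.0], [0.1, 0.0, 0.0]))
```

Note that the trajectory values x(1) and x(3) appear as coefficients of the perturbations: the TL model cannot be run without the nonlinear trajectory.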

Adjoint of one instruction

We start from the tangent linear code:

yd(1) = -p*xd(1) + p*xd(2)

In matrix form, treating (y(1), x(1), x(2)) as the state, it can be written as:

[ yd(1) ]   [ 0  -p   p ] [ yd(1) ]
[ xd(1) ] = [ 0   1   0 ] [ xd(1) ]
[ xd(2) ]   [ 0   0   1 ] [ xd(2) ]

which can easily be transposed (an asterisk indicates the adjoint variables):

[ y*(1) ]   [  0   0   0 ] [ y*(1) ]
[ x*(1) ] = [ -p   1   0 ] [ x*(1) ]
[ x*(2) ]   [  p   0   1 ] [ x*(2) ]

The corresponding adjoint code in Fortran is:

xb(1) = xb(1) - p*yb(1)
xb(2) = xb(2) + p*yb(1)
yb(1) = 0.

Adjoint of one instruction (II)

We start again from the tangent linear code:

yd(2) = xd(1)*(r - x(3)) - xd(2) - x(1)*xd(3)

In matrix form, it can be written as:

[ yd(2) ]   [ 0  (r-x(3))  -1  -x(1) ] [ yd(2) ]
[ xd(1) ]   [ 0      1      0    0   ] [ xd(1) ]
[ xd(2) ] = [ 0      0      1    0   ] [ xd(2) ]
[ xd(3) ]   [ 0      0      0    1   ] [ xd(3) ]

which can easily be transposed (an asterisk indicates the adjoint variables):

[ y*(2) ]   [    0      0   0   0 ] [ y*(2) ]
[ x*(1) ]   [ (r-x(3))  1   0   0 ] [ x*(1) ]
[ x*(2) ] = [   -1      0   1   0 ] [ x*(2) ]
[ x*(3) ]   [  -x(1)    0   0   1 ] [ x*(3) ]

The corresponding adjoint code in Fortran is:

xb(1) = xb(1) + (r - x(3))*yb(2)
xb(2) = xb(2) - yb(2)
xb(3) = xb(3) - x(1)*yb(2)
yb(2) = 0.

The terms (r - x(3)) and x(1) come from the trajectory! It needs to be stored in memory or recomputed.
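Applying the same transposition to the full tendency gives a line-by-line adjoint. In this Python sketch (same assumptions as the earlier Python transcriptions) each commented group is the transpose of one tangent-linear statement, taken in reverse order:

```python
# Adjoint of the tangent-linear Lorenz tendency: from the adjoint forcing yb,
# accumulate the sensitivities xb. Trajectory values x come from the nonlinear run.
def ad_tendency(x, yb, p=10.0, r=28.0, b=8.0 / 3.0):
    xb = [0.0, 0.0, 0.0]            # local adjoint variables initialized to zero
    # adjoint of yd(3) = xd(1)*x(2) + x(1)*xd(2) - b*xd(3)
    xb[0] += x[1] * yb[2]
    xb[1] += x[0] * yb[2]
    xb[2] += -b * yb[2]
    # adjoint of yd(2) = xd(1)*(r - x(3)) - x(1)*xd(3) - xd(2)
    xb[0] += (r - x[2]) * yb[1]
    xb[2] += -x[0] * yb[1]
    xb[1] += -yb[1]
    # adjoint of yd(1) = -p*xd(1) + p*xd(2)
    xb[0] += -p * yb[0]
    xb[1] += p * yb[0]
    return xb

print(ad_tendency([1.0, 1.0, 1.0], [1.0, 0.0, 0.0]))
```

Feeding in a unit forcing on one output component, as here, returns one row of the Jacobian transpose — a cheap way to eyeball the adjoint against the TL coefficients.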

Trajectory

The trajectory has to be available. It can be:
- saved, which costs memory;
- recomputed, which costs CPU time.
Depending on the complexity of the code, one option or the other is adopted (or both at the same time).

The Adjoint Code

Property of adjoints (transposition): (A B C)^T = C^T B^T A^T.

Application: if the tangent linear model is M' = A_n A_{n-1} \cdots A_2 A_1, where each A_i represents one line of the tangent linear code, then its adjoint is M'^T = A_1^T A_2^T \cdots A_{n-1}^T A_n^T. The adjoint code is made of the transpose of each line of the tangent linear code, in reverse order.

Adjoint of loops

In the TL code for the Lorenz model we have:

DO i = 1, 3
  xd(i) = xd(i) + dt*yd(i)
ENDDO

dt is a constant for our purposes. This loop can be written explicitly:

xd(1) = xd(1) + dt*yd(1)
xd(2) = xd(2) + dt*yd(2)
xd(3) = xd(3) + dt*yd(3)

We can now transpose and reverse the lines to get the adjoint:

yb(3) = yb(3) + dt*xb(3)
yb(2) = yb(2) + dt*xb(2)
yb(1) = yb(1) + dt*xb(1)

which is equivalent to:

DO i = 3, 1, -1   ! reverse order of indices!
  yb(i) = yb(i) + dt*xb(i)
ENDDO

Conditional statements (IF statements)

What we want is the adjoint of the statements which were actually executed in the direct model. We need to know which branch of the IF statement was executed. The result of the conditional statement therefore has to be stored: it is part of the trajectory!

Summary of basic rules for line-by-line adjoint coding (1)

Adjoint statements are derived from tangent linear ones in reverse order.

Tangent linear code                      Adjoint code
x = A*y + B*z                            y* = y* + A*x*
                                         z* = z* + B*x*
                                         x* = 0

x = x + B*z                              z* = z* + B*x*
(variable updated in place)              (x* is NOT reset)

do k = 1, N                              do k = N, 1, -1   (reverse the loop!)
  x(k) = A*x(k-1) + B*y(k)                 x*(k-1) = x*(k-1) + A*x*(k)
end do                                     y*(k) = y*(k) + B*x*(k)
                                           x*(k) = 0
                                         end do

if (condition) tangent linear code       if (condition) adjoint code

The order of operations is important when a variable is updated! And do not forget to initialize local adjoint variables to zero!
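The loop rule can be checked numerically with the adjoint identity \langle L u, v \rangle = \langle u, L^T v \rangle; the sketch below uses hypothetical scalar coefficients A and B:

```python
# Verify the loop rule: TL loop dx(k) = A*dx(k-1) + B*dy(k), and its adjoint
# run with the loop reversed. L maps (dx0, dy) to dxN; check <L u, v> = <u, L^T v>.
A, B, N = 0.5, 2.0, 3

def tl(dx0, dy):
    dx = [dx0] + [0.0] * N
    for k in range(1, N + 1):
        dx[k] = A * dx[k - 1] + B * dy[k - 1]
    return dx[N]

def ad(dxN_star):
    dx_star = [0.0] * N + [dxN_star]
    dy_star = [0.0] * N
    for k in range(N, 0, -1):              # reverse the loop!
        dx_star[k - 1] += A * dx_star[k]
        dy_star[k - 1] += B * dx_star[k]
        dx_star[k] = 0.0                   # do not forget the reset
    return dx_star[0], dy_star

dx0, dy = 1.0, [0.5, -1.0, 2.0]
lhs = tl(dx0, dy) * 3.0                    # <L u, v> with v = 3.0
dx0_star, dy_star = ad(3.0)                # L^T v
rhs = dx0 * dx0_star + sum(a * b for a, b in zip(dy, dy_star))
print(lhs, rhs)
```

With these dyadic values the two sides match exactly; any mismatch would point to a wrong coefficient, a forgotten reset, or an unreversed loop.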

Summary of basic rules for line-by-line adjoint coding (2)

To save memory, the trajectory can be recomputed just before the adjoint calculations (again, it depends on the complexity of the model).

Tangent linear code:

if (x > x0) then
  Ad = xd / x
  A = Log(x)
end if

Trajectory and adjoint code:

! ------------- Trajectory ----------------
x_store = x            ! storage for use in adjoint
if (x > x0) then
  A = Log(x)
end if
! --------------- Adjoint ------------------
if (x_store > x0) then
  xb = xb + Ab / x_store
  Ab = 0.
end if

The most common sources of error in adjoint coding are:
1) Pure coding errors
2) Forgotten initialization of local adjoint variables to zero
3) Mismatching trajectories in tangent linear and adjoint (even slightly)
4) Bad identification of trajectory updates

More facts about adjoints

The adjoint always exists, and it is unique, assuming spaces of finite dimension. Hence, coding the adjoint does not raise questions about its existence, only questions related to the technical implementation.

In the meteorological literature, the term adjoint is often improperly used to denote the adjoint of the tangent linear of a nonlinear operator. In reality, the adjoint can be defined for any linear operator. One must be aware that discussions about the existence of the adjoint should usually address the existence of the tangent linear model.

Without recomputation, the cost of the TL is usually about 1.5 times that of the nonlinear code, and the cost of the adjoint between 2 and 3 times.

The tangent linear model is not strictly necessary to run a 4D-Var system (but it is needed in the incremental 4D-Var formulation in use operationally at ECMWF). It is also needed as an intermediate step to write and test the adjoint.

Test for the tangent linear model

[Figure: the ratio between a finite-difference perturbation of the nonlinear model and its tangent-linear evolution, plotted as a function of the perturbation scaling factor; the ratio tends to 1 as the perturbation decreases, until machine precision is reached.]
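A minimal version of this test on a hypothetical scalar function f(x) = x^2, whose tangent linear is df = 2 x dx: the ratio approaches 1 as the scaling factor decreases, until rounding error eventually takes over.

```python
# Tangent-linear (Taylor) test: (f(x + l*dx) - f(x)) / (l * f'(x)*dx) -> 1
# as the perturbation scaling factor l -> 0 (down to machine precision).
def f(x):
    return x * x

def tl(x, dx):
    return 2.0 * x * dx

x, dx = 3.0, 1.0
ratios = []
for lam in (1.0, 1e-2, 1e-4, 1e-6):
    ratios.append((f(x + lam * dx) - f(x)) / (lam * tl(x, dx)))
print(ratios)
```

In a real model the same loop is run on norms of full state vectors; plotting the ratio against the scaling factor reproduces the figure described above.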

Test for the adjoint model

The adjoint test is truly unforgiving. If the ratio of the norms is not equal to 1 to within the precision of the machine, you know there is a bug in your adjoint. At the end of your debugging you will have a perfect adjoint. If properly tested, the adjoint is the only piece of code on Earth to be entirely bug-free (although you may still have an imperfect tangent linear)!

Test of the adjoint in practice

Compute the perturbed variable \delta y from perturbations of the input variables (\delta x, \delta z) with the tangent linear code, and compute the TL norm:

NORM_TL = \langle \delta y, \delta y \rangle

Call the adjoint routine, with \delta y as input, to obtain the gradients (x^*, z^*) with respect to the initial perturbations. Compute the norm from the adjoint calculation, using the input perturbations and the gradients:

NORM_AD = \langle \delta x, x^* \rangle + \langle \delta z, z^* \rangle

According to the test of the adjoint, NORM_TL must be equal to NORM_AD to machine precision!
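For a single linear statement the whole procedure fits in a few lines (a Python sketch with hypothetical perturbation values; the document's own code is Fortran):

```python
# Adjoint test for the single statement dy = -p*dx1 + p*dx2:
# NORM_TL = <dy, dy> must equal NORM_AD = <dx, x*> with y* = dy fed
# into the adjoint statements.
p = 10.0
dx1, dx2 = 0.25, -0.75           # input perturbations (hypothetical values)

dy = -p * dx1 + p * dx2          # tangent linear
norm_tl = dy * dy                # NORM_TL = <dy, dy>

yb = dy                          # feed y* = dy into the adjoint
x1b = -p * yb                    # adjoint statements (transpose, reverse order)
x2b = p * yb
norm_ad = dx1 * x1b + dx2 * x2b  # NORM_AD = <dx, x*>

print(norm_tl, norm_ad)
```

With dyadic inputs the two norms agree exactly; in a full model with real arithmetic they should agree to roughly 13-14 significant digits in double precision.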

Automatic differentiation

Because of the strict rules of tangent linear and adjoint coding, automatic differentiation is possible. Existing tools: TAF (TAMC), TAPENADE (Odyssée), ... They reverse the order of instructions and transpose instructions instantly, without typos, and they are especially good at deriving tangent linear codes.

There are still unresolved issues:
- It is NOT a black-box tool.
- It cannot handle non-differentiable instructions (the TL is wrong).
- It can create huge arrays to store the trajectory.
- The codes often need to be cleaned up and optimised.

Look in the supplementary material for more information!

Useful References

Variational data assimilation:
- Lorenc, A., 1986, Quarterly Journal of the Royal Meteorological Society, 112, 1177-1194.
- Courtier, P., et al., 1994, Quarterly Journal of the Royal Meteorological Society, 120, 1367-1387.
- Rabier, F., et al., 2000, Quarterly Journal of the Royal Meteorological Society, 126, 1143-1170.

The adjoint technique:
- Errico, R. M., 1997, Bulletin of the American Meteorological Society, 78, 2577-2591.

Tangent-linear approximation:
- Errico, R. M., et al., 1993, Tellus, 45A, 462-477.
- Errico, R. M., and K. Reader, 1999, Quarterly Journal of the Royal Meteorological Society, 125, 169-195.
- Janisková, M., et al., 1999, Monthly Weather Review, 127, 26-45.
- Mahfouf, J.-F., 1999, Tellus, 51A, 147-166.

Lorenz model:
- Huang, X. Y., and X. Yang, 1996. Variational data assimilation with the Lorenz model. Technical Report 26, HIRLAM, April 1996. Available on the ftp site (see notes for the practical session).
- Lorenz, E., 1963. Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130-141.

Automatic differentiation:
- Giering, R., 1997. Tangent Linear and Adjoint Model Compiler, Users Manual. Center for Global Change Sciences, Department of Earth, Atmospheric, and Planetary Science, MIT.
- Giering, R., and T. Kaminski, 1998. Recipes for Adjoint Code Construction. ACM Transactions on Mathematical Software.
- TAMC: http://www.autodiff.org/
- TAPENADE: http://www-sop.inria.fr/tropics/tapenade.html

Sensitivity studies using the adjoint technique:
- Janisková, M., and J.-J. Morcrette, 2005. Investigation of the sensitivity of the ECMWF radiation scheme to input parameters using the adjoint technique. Quart. J. Roy. Meteor. Soc., 131, 1975-1996.