M.Sc. in Meteorology. Numerical Weather Prediction


M.Sc. in Meteorology (UCD), Numerical Weather Prediction. Prof. Peter Lynch, Meteorology & Climate Centre, School of Mathematical Sciences, University College Dublin. Second Semester, 2005–2006.

Text for the Course: The lectures will be based closely on the text Atmospheric Modeling, Data Assimilation and Predictability by Eugenia Kalnay, published by Cambridge University Press (2002).

Data Assimilation (Kalnay, Ch. 5)

NWP is an initial/boundary value problem: given an estimate of the present state of the atmosphere (initial conditions) and appropriate surface and lateral boundary conditions, the model simulates or forecasts the evolution of the atmosphere. The more accurate the estimate of the initial conditions, the better the quality of the forecasts. Operational NWP centers produce initial conditions through a statistical combination of observations and short-range forecasts. This approach is called data assimilation.


Insufficiency of Data Coverage

Modern primitive equation models have a number of degrees of freedom of the order of 10^7. For a time window of ±3 hours, there are typically 10,000 to 100,000 observations of the atmosphere, two orders of magnitude fewer than the number of degrees of freedom of the model. Moreover, they are distributed nonuniformly in space and time. It is therefore necessary to use additional information, called the background field, first guess or prior information. A short-range forecast is used as the first guess in operational data assimilation systems. Present-day operational systems typically use a 6-h cycle performed four times a day.

Typical 6-hour analysis cycle.

Suppose the background field is a model 6-h forecast, x_b. To obtain the background or first-guess observations, the model forecast is interpolated to the observation locations. If the observed quantities are not the same as the model variables, the model variables are converted to the observed variables y_o. The first guess of the observations is denoted H(x_b), where H is called the observation operator. The difference between the observations and the background, y_o − H(x_b), is called the observational increment or innovation.

The analysis x_a is obtained by adding the innovations to the background field with weights W that are determined from the estimated statistical error covariances of the forecast and the observations:

x_a = x_b + W [y_o − H(x_b)]

Different analysis schemes (SCM, OI, 3D-Var, and KF) are based on this equation, but differ in the approach taken to combine the background and the observations to produce the analysis. Earlier methods such as the SCM used weights which were determined empirically: the weights were a function of the distance between the observation and the grid point, and the analysis was iterated several times.

In Optimal Interpolation (OI), the matrix of weights W is determined from the minimization of the analysis error variance at each grid point.

In the 3D-Var approach, one defines a cost function proportional to the square of the distance between the analysis and both the background and the observations. This cost function is minimized to obtain the analysis.

Lorenc (1986) showed that OI and the 3D-Var approach are equivalent if the cost function is defined as

J(x) = (1/2) { [y_o − H(x)]^T R^{-1} [y_o − H(x)] + (x − x_b)^T B^{-1} (x − x_b) }

The cost function J measures the distance of a field x to the observations (first term) and the distance to the background x_b (second term).

The distances are scaled by the observation error covariance R and by the background error covariance B, respectively. The minimum of the cost function is obtained for x = x_a, which is defined as the analysis.

The analysis obtained by OI and 3D-Var is the same if the weight matrix is given by

W = B H^T (H B H^T + R)^{-1}

The difference between OI and the 3D-Var approach is in the method of solution: in OI, the weights W are obtained for each grid point or grid volume, using suitable simplifications; in 3D-Var, the minimization of J is performed directly, allowing for additional flexibility and a simultaneous global use of the data.

Schematic of the size of matrix operations in OI and 3D-Var (from www.ecmwf.int)
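The equivalence of the OI weight matrix and the 3D-Var minimum can be checked directly on a small toy problem. The sketch below (Python with NumPy; the dimensions, the linear operator H and the covariance values are illustrative assumptions, not taken from the lectures) builds the analysis both ways: once with the explicit weight matrix W = B H^T (H B H^T + R)^{-1}, and once by minimizing the cost function J, which for a linear H has a closed-form minimum.

import numpy as np

# Toy dimensions (assumed for illustration): n model variables, p observations.
n, p = 4, 2
rng = np.random.default_rng(0)

# Background state, a linear observation operator H, and synthetic observations.
x_b = rng.standard_normal(n)              # background (first guess)
H = rng.standard_normal((p, n))           # observation operator (linear here)
y_o = H @ x_b + rng.standard_normal(p)    # synthetic observations

# Error covariances (illustrative): B for the background, R for the observations.
B = 0.5 * np.eye(n)
R = 0.2 * np.eye(p)

# OI form: x_a = x_b + W [y_o - H x_b], with W = B H^T (H B H^T + R)^{-1}.
W = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_a_oi = x_b + W @ (y_o - H @ x_b)

# 3D-Var form: minimize J(x); for linear H the gradient condition is
# (B^{-1} + H^T R^{-1} H)(x - x_b) = H^T R^{-1} (y_o - H x_b).
A = np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H
x_a_var = x_b + np.linalg.solve(A, H.T @ np.linalg.inv(R) @ (y_o - H @ x_b))

print(np.allclose(x_a_oi, x_a_var))       # True: both routes give the same analysis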

Recently, the variational approach has been extended to four dimensions, by including within the cost function the distance to observations over a time interval (the assimilation window). This is called four-dimensional variational assimilation (4DVar).

In the analysis cycle, the importance of the model cannot be overemphasized: it transports information from data-rich to data-poor regions, and it provides a complete estimation of the four-dimensional state of the atmosphere.

The introduction of 4DVar at ECMWF has resulted in marked improvements in the quality of medium-range forecasts.

End of Introduction.


Background Field

For operational models, it is not enough to perform spatial interpolation of observations onto regular grids: there are not enough data available to define the initial state. The number of degrees of freedom in a modern NWP model is of the order of 10^7, while the total number of conventional observations is of the order of 10^4 to 10^5. There are many new types of data, such as satellite and radar observations, but they do not measure the variables used in the models, and their distribution in space and time is very nonuniform.

In addition to observations, it is necessary to use a first-guess estimate of the state of the atmosphere at the grid points. The first guess (also known as the background field or prior information) is our best estimate of the state of the atmosphere prior to the use of the observations. A short-range forecast is normally used as the first guess in operational systems, in what is called an analysis cycle. If a forecast is unavailable (e.g., if the cycle is broken), we may have to use climatological fields, but they are normally a poor estimate of the initial state.

Global 6-h analysis cycle (00, 06, 12, and 18 UTC).

Regional analysis cycle, performed (perhaps) every hour.

Intermittent data assimilation is used in most global operational systems, typically with a 6-h cycle performed four times a day. The model forecast plays a very important role: over data-rich regions, the analysis is dominated by the information contained in the observations; in data-poor regions, the forecast benefits from the information upstream. For example, 6-h forecasts over the North Atlantic Ocean are relatively good because of the information coming from North America. The model is able to transport information from data-rich to data-poor areas.

Least Squares Method (Kalnay, Section 5.3)

We start with a toy model example, the two-temperatures problem. We use two methods to solve it, a sequential and a variational approach, and find that they are equivalent: they yield identical results. The problem is important because the methodology and results carry over to multivariate OI, Kalman filtering, and 3D-Var and 4D-Var assimilation. If you fully understand the toy model, you should find the more realistic applications straightforward.

Statistical Estimation: Introduction

Each of you: guess the temperature in this room right now. How can we get a best estimate of the temperature?

The best estimate of the state of the atmosphere is obtained by combining prior information about the atmosphere (background or first guess) with observations. In order to combine them optimally, we also need statistical information about the errors in these pieces of information.

As an introduction to statistical estimation, we consider a simple problem that we call the two-temperatures problem: given two independent observations T_1 and T_2, determine the best estimate of the true temperature T_t.

Simple (Toy) Example

Let the two observations of temperature be

T_1 = T_t + ε_1,  T_2 = T_t + ε_2

(for example, we might have two iffy thermometers). The observations have errors ε_i, which we do not know. Let E(·) represent the expected value, i.e., the average of many similar measurements. We assume that the measurements T_1 and T_2 are unbiased:

E(T_1 − T_t) = 0,  E(T_2 − T_t) = 0,  or equivalently  E(ε_1) = E(ε_2) = 0

We also assume that we know the variances of the observational errors:

E(ε_1²) = σ_1²,  E(ε_2²) = σ_2²

We next assume that the errors of the two measurements are uncorrelated:

E(ε_1 ε_2) = 0

This implies, for example, that there is no systematic tendency for one thermometer to read high (ε_1 > 0) when the other is high (ε_2 > 0). The above equations represent the statistical information that we need about the actual observations.

We estimate T_t as a linear combination of the observations:

T_a = a_1 T_1 + a_2 T_2

The analysis T_a should be unbiased: E(T_a) = E(T_t). This implies a_1 + a_2 = 1.

T_a will be the best estimate of T_t if the coefficients are chosen to minimize the mean squared error of T_a:

σ_a² = E[(T_a − T_t)²] = E{[a_1 (T_1 − T_t) + a_2 (T_2 − T_t)]²}

subject to the constraint a_1 + a_2 = 1. This may be written

σ_a² = E[(a_1 ε_1 + a_2 ε_2)²]

Expanding this expression for σ_a², and noting that the cross term vanishes because E(ε_1 ε_2) = 0, we get

σ_a² = a_1² σ_1² + a_2² σ_2²

To minimize σ_a² with respect to a_1, we require ∂σ_a²/∂a_1 = 0.

Naïve solution: ∂σ_a²/∂a_1 = 2 a_1 σ_1² = 0, so a_1 = 0. Similarly, ∂σ_a²/∂a_2 = 0 implies a_2 = 0. But we have forgotten the constraint a_1 + a_2 = 1, so a_1 and a_2 are not independent.

Substituting a_2 = 1 − a_1, we get

σ_a² = a_1² σ_1² + (1 − a_1)² σ_2²

Equating the derivative with respect to a_1 to zero, ∂σ_a²/∂a_1 = 0, gives

a_1 = σ_2² / (σ_1² + σ_2²),  a_2 = σ_1² / (σ_1² + σ_2²)

Thus, we have expressions for the weights a_1 and a_2 in terms of the variances (which are assumed to be known).
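A quick numerical check of this result: the sketch below (Python with NumPy; the variances and sample size are illustrative choices, deliberately matching the worked example later in these notes) simulates many pairs of noisy measurements and compares the mean squared error of the optimally weighted combination against a naive equal-weights average.

import numpy as np

rng = np.random.default_rng(42)

T_t = 20.0                 # "true" temperature (assumed for the experiment)
sigma1, sigma2 = 2.0, 1.0  # observational error standard deviations (illustrative)
N = 100_000                # number of simulated measurement pairs

# Simulate unbiased, uncorrelated observation errors.
T1 = T_t + sigma1 * rng.standard_normal(N)
T2 = T_t + sigma2 * rng.standard_normal(N)

# Optimal weights from the least squares derivation.
a1 = sigma2**2 / (sigma1**2 + sigma2**2)
a2 = sigma1**2 / (sigma1**2 + sigma2**2)

T_opt = a1 * T1 + a2 * T2      # optimally weighted analysis
T_avg = 0.5 * (T1 + T2)        # naive equal-weights analysis

print("MSE optimal :", np.mean((T_opt - T_t)**2))   # ~ sigma1^2 sigma2^2/(sigma1^2+sigma2^2) = 0.8
print("MSE average :", np.mean((T_avg - T_t)**2))   # ~ (sigma1^2+sigma2^2)/4 = 1.25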

We define the precision to be the inverse of the variance. It is a measure of the accuracy of the observations. (Note: the term precision, while a good one, does not have universal currency, so it should be defined when used.)

Substituting the coefficients into σ_a² = a_1² σ_1² + a_2² σ_2², we obtain

σ_a² = σ_1² σ_2² / (σ_1² + σ_2²)

This can be written in the alternative form

1/σ_a² = 1/σ_1² + 1/σ_2²

Thus, if the coefficients are optimal, the precision of the analysis is the sum of the precisions of the measurements.

Variational Approach

We can also obtain the same best estimate of T_t by minimizing a cost function. The cost function is defined as the sum of the squares of the distances of T to the two observations, weighted by their observational error precisions:

J(T) = (1/2) [ (T − T_1)²/σ_1² + (T − T_2)²/σ_2² ]

The minimum of the cost function J is obtained by requiring ∂J/∂T = 0.

Exercise: prove that ∂J/∂T = 0 gives the same value for T_a as the least squares method.
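As a sketch of the variational route (Python; the observation values and variances are the illustrative numbers used in the worked example below, and the brute-force grid search is just a simple stand-in for a proper minimizer), minimizing J(T) numerically returns the same T_a as the closed-form weighted combination:

import numpy as np

# Two observations and their error standard deviations (illustrative values).
T1, sigma1 = 2.0, 2.0
T2, sigma2 = 0.0, 1.0

def J(T):
    """Two-temperature cost function: misfit to each observation, weighted by its precision."""
    return 0.5 * ((T - T1)**2 / sigma1**2 + (T - T2)**2 / sigma2**2)

# Brute-force minimization of J over a fine grid of candidate temperatures.
grid = np.linspace(-5.0, 5.0, 200_001)
T_var = grid[np.argmin(J(grid))]

# Least squares answer: weights proportional to the other observation's error variance.
a1 = sigma2**2 / (sigma1**2 + sigma2**2)
T_ls = a1 * T1 + (1 - a1) * T2

print(T_var, T_ls)   # both are 0.4 (up to the grid spacing)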

The control variable for the minimization of J (i.e., the variable with respect to which we are minimizing the cost function) is the temperature; for the least squares method, the control variables were the weights.

The equivalence between the minimization of the analysis error variance and the variational cost-function approach is important. This equivalence also holds true for multidimensional problems, in which case we use the covariance matrix rather than the scalar variance. It indicates that OI and 3D-Var are solving the same problem by different means.

Example: Suppose T_1 = 2, σ_1 = 2, T_2 = 0, σ_2 = 1. Show that T_a = 0.4 and σ_a² = 0.8.

σ_1² + σ_2² = 5

a_1 = σ_2² / (σ_1² + σ_2²) = 1/5,  a_2 = σ_1² / (σ_1² + σ_2²) = 4/5

Check: a_1 + a_2 = 1.

T_a = a_1 T_1 + a_2 T_2 = (1/5)(2) + (4/5)(0) = 0.4

σ_a² = σ_1² σ_2² / (σ_1² + σ_2²) = (4)(1)/(4 + 1) = 0.8

This solution is illustrated in the next figure.

The probability distribution for a simple case. The analysis has a pdf with a maximum closer to T_2, and a smaller standard deviation than either observation.

Conclusion of the foregoing.

Simple Sequential Assimilation

We consider the toy example as a prototype of a full multivariate OI. Recall that we wrote the analysis as a linear combination

T_a = a_1 T_1 + a_2 T_2

The requirement that the analysis be unbiased led to a_1 + a_2 = 1, so

T_a = T_1 + a_2 (T_2 − T_1)

Now assume that one of the two temperatures, say T_1 = T_b, is not an observation but a background value, such as a forecast, and that the other value is an observation, T_2 = T_o. We can then write the analysis as

T_a = T_b + W (T_o − T_b)

where W = a_2 can be expressed in terms of the variances.

The least squares method gave us the optimal weight:

W = σ_b² / (σ_b² + σ_o²)

When the analysis is written as T_a = T_b + W (T_o − T_b), the quantity (T_o − T_b) is called the observational innovation, i.e., the new information brought by the observation. It is also known as the observational increment (with respect to the background).

The analysis error variance is, as before, given by

1/σ_a² = 1/σ_b² + 1/σ_o²,  or  σ_a² = σ_b² σ_o² / (σ_b² + σ_o²)

The analysis variance can also be written as

σ_a² = (1 − W) σ_b²

Exercise: verify all the foregoing formulæ.

We have shown that the simple two-temperatures problem serves as a paradigm for the problem of objective analysis of the atmospheric state.

Collection of Main Equations

We gather the principal equations here:

T_a = T_b + W (T_o − T_b)

W = σ_b² / (σ_b² + σ_o²)

σ_a² = σ_b² σ_o² / (σ_b² + σ_o²) = W σ_o²

σ_a² = (1 − W) σ_b²
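These four scalar relations translate directly into a few lines of code. The sketch below (Python; the function name and return structure are my own choices for illustration) packages them as a single update step, which is reused in the analysis-cycle sketch later in these notes.

def analysis_update(T_b, var_b, T_o, var_o):
    """One scalar analysis step from the four main equations.

    T_b, var_b : background value and its error variance
    T_o, var_o : observation and its error variance
    Returns the analysis value and its error variance.
    """
    W = var_b / (var_b + var_o)          # optimal weight
    T_a = T_b + W * (T_o - T_b)          # analysis = background + weighted innovation
    var_a = (1.0 - W) * var_b            # analysis error variance (equivalently W * var_o)
    return T_a, var_a

# Example: the two-temperatures numbers used earlier (T_b = 2, sigma_b^2 = 4; T_o = 0, sigma_o^2 = 1).
print(analysis_update(2.0, 4.0, 0.0, 1.0))   # (0.4, 0.8)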

These four equations have been derived for the simplest scalar case, but they are important for the problem of data assimilation because they have exactly the same form as more general equations: the least squares sequential estimation method is used for real multidimensional problems (OI, 3D-Var, and even Kalman filtering). Therefore we will interpret these four equations in detail.

The first equation

T_a = T_b + W (T_o − T_b)

This says: the analysis is obtained by adding to the background value, or first guess, the innovation (the difference between the observation and the first guess), weighted by the optimal weight.

The second equation

W = σ_b² / (σ_b² + σ_o²)

This says: the optimal weight is the background error variance multiplied by the inverse of the total error variance (the sum of the background and the observation error variances). Note that the larger the background error variance, the larger the correction to the first guess. Look at the limits: if σ_o² = 0 the observation is perfect, so W = 1 and the analysis equals the observation; if σ_b² = 0 the background is perfect, so W = 0 and the observation is ignored.

The third equation

The variance of the analysis is

σ_a² = σ_b² σ_o² / (σ_b² + σ_o²)

This can also be written

1/σ_a² = 1/σ_b² + 1/σ_o²

This says: the precision of the analysis (inverse of the analysis error variance) is the sum of the precisions of the background and the observation.

The fourth equation

σ_a² = (1 − W) σ_b²

This says: the error variance of the analysis is the error variance of the background, reduced by a factor equal to one minus the optimal weight. It can also be written σ_a² = W σ_o².

All the above statements are important because they also hold true for sequential data assimilation systems (OI and Kalman filtering) for multidimensional problems. In these problems, in which T_b and T_a are three-dimensional fields of size of order 10^7 and T_o is a set of observations (typically of size 10^5), we replace the error variances by error covariance matrices and the optimal weight by the optimal gain matrix.

Note that there is one essential tuning parameter in OI: the ratio of the observational error variance to the background error variance, (σ_o/σ_b)². Indeed, the weight can be rewritten as W = 1 / [1 + (σ_o/σ_b)²], so only this ratio matters.

Application to Analysis

If the background is a forecast, we can use the four equations to create a simple sequential analysis cycle. Each observation is used once, at the time it appears, and then discarded. Assume that we have completed the analysis at time t_i (e.g., at 06 UTC) and we want to proceed to the next cycle (time t_{i+1}, or 12 UTC). The analysis cycle has two phases: a forecast phase, to update the background T_b and its error variance σ_b², and an analysis phase, to update the analysis T_a and its error variance σ_a².

Typical 6-hour analysis cycle.

Forecast Phase

In the forecast phase of the analysis cycle, the background is first obtained through a forecast:

T_b(t_{i+1}) = M[T_a(t_i)]

where M represents the forecast model. We also need the error variance of the background. In OI, this is obtained by making a suitable simple assumption, such as that the model integration increases the initial error variance by a fixed factor a somewhat greater than 1:

σ_b²(t_{i+1}) = a σ_a²(t_i)

This allows the new weight W(t_{i+1}) to be estimated using W = σ_b² / (σ_b² + σ_o²).

Analysis Phase

In the analysis phase of the cycle we obtain the new observation T_o(t_{i+1}) and derive the new analysis T_a(t_{i+1}) using

T_a = T_b + W (T_o − T_b)

The estimate of σ_b² comes from σ_b²(t_{i+1}) = a σ_a²(t_i). The new analysis error variance σ_a²(t_{i+1}) comes from

σ_a² = (1 − W) σ_b²

and is smaller than the background error variance. After the analysis, the cycle for time t_{i+1} is completed, and we can proceed to the next cycle.
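Putting the forecast and analysis phases together, a complete toy assimilation cycle fits in a short script. The sketch below (Python with NumPy) restates the analysis_update step from the earlier sketch for completeness; the stand-in forecast model M, the inflation factor a, the true evolution, and the observation errors are all invented purely for illustration.

import numpy as np

rng = np.random.default_rng(1)

a = 1.5            # assumed inflation of the error variance by the forecast step
var_o = 1.0        # observation error variance (illustrative)
n_cycles = 8       # number of 6-h cycles to run

def M(T):
    """A stand-in forecast model (purely illustrative): relax the state towards 15 degrees."""
    return 15.0 + 0.8 * (T - 15.0)

def analysis_update(T_b, var_b, T_o, var_o):
    """Scalar analysis step from the four main equations."""
    W = var_b / (var_b + var_o)
    return T_b + W * (T_o - T_b), (1.0 - W) * var_b

# Initial analysis and its error variance (arbitrary starting values).
T_a, var_a = 20.0, 4.0
T_true = 20.0

for i in range(n_cycles):
    # Forecast phase: advance the state and inflate the error variance.
    T_b = M(T_a)
    var_b = a * var_a
    T_true = M(T_true)        # in this perfect-model toy, the "truth" follows the same model

    # Analysis phase: blend the background with a new (noisy) observation of the truth.
    T_o = T_true + np.sqrt(var_o) * rng.standard_normal()
    T_a, var_a = analysis_update(T_b, var_b, T_o, var_o)

    print(f"cycle {i+1}: background {T_b:6.2f}  obs {T_o:6.2f}  analysis {T_a:6.2f}  var_a {var_a:4.2f}")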

Reading Assignment: Study the Remarks in Kalnay, Section 5.3.1.