M.Sc. in Meteorology, UCD
Numerical Weather Prediction
Prof. Peter Lynch
Meteorology & Climate Centre
School of Mathematical Sciences
University College Dublin
Second Semester, 2005-2006.
Text for the Course
The lectures will be based closely on the text Atmospheric Modeling, Data Assimilation and Predictability by Eugenia Kalnay, published by Cambridge University Press (2002).
Data Assimilation (Kalnay, Ch. 5)
NWP is an initial/boundary value problem.
Given an estimate of the present state of the atmosphere (initial conditions), and appropriate surface and lateral boundary conditions, the model simulates or forecasts the evolution of the atmosphere.
The more accurate the estimate of the initial conditions, the better the quality of the forecasts.
Operational NWP centers produce initial conditions through a statistical combination of observations and short-range forecasts.
This approach is called data assimilation.
Insufficiency of Data Coverage
Modern primitive equation models have a number of degrees of freedom of the order of 10^7.
For a time window of ±3 hours, there are typically 10,000 to 100,000 observations of the atmosphere, two orders of magnitude fewer than the number of degrees of freedom of the model.
Moreover, they are distributed nonuniformly in space and time.
It is necessary to use additional information, called the background field, first guess or prior information.
A short-range forecast is used as the first guess in operational data assimilation systems.
Present-day operational systems typically use a 6-h cycle performed four times a day.
Typical 6-hour analysis cycle.
Suppose the background field is a model 6-h forecast: x_b.
To obtain the background or first guess of the observations, the model forecast is interpolated to the observation location.
If the observed quantities are not the same as the model variables, the model variables are converted to the observed variables y_o.
The first guess of the observations is denoted H(x_b), where H is called the observation operator.
The difference between the observations and the background, y_o - H(x_b), is called the observational increment or innovation.
The analysis x_a is obtained by adding the innovations to the background field with weights W that are determined based on the estimated statistical error covariances of the forecast and the observations:

    x_a = x_b + W[y_o - H(x_b)]

Different analysis schemes (SCM, OI, 3D-Var, and KF) are based on this equation, but differ in the approach taken to combine the background and the observations to produce the analysis.
Earlier methods such as the SCM used weights which were determined empirically. The weights were a function of the distance between the observation and the grid point, and the analysis was iterated several times.
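The analysis equation can be illustrated with a minimal numerical sketch. All values below (a three-point state, two point observations, and the weight matrix W) are invented for illustration; in practice W is derived from the error covariances, as discussed next.

```python
import numpy as np

# Hypothetical 3-point model state, e.g. temperature (K) at three grid points
x_b = np.array([280.0, 282.0, 284.0])   # background (6-h forecast)

# Two observations located at grid points 0 and 2, so H simply selects them
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
y_o = np.array([279.0, 285.0])

# Illustrative weight matrix (assumed here; normally computed from B and R)
W = np.array([[0.5, 0.0],
              [0.2, 0.2],
              [0.0, 0.5]])

innovation = y_o - H @ x_b              # observational increments
x_a = x_b + W @ innovation              # the analysis
```

Note how the middle grid point, which is not observed, is still corrected: the weight matrix spreads information from observed locations to unobserved ones.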
In Optimal Interpolation (OI), the matrix of weights W is determined from the minimization of the analysis errors at each grid point.
In the 3D-Var approach, one defines a cost function proportional to the square of the distance between the analysis and both the background and the observations. This cost function is minimized to obtain the analysis.
Lorenc (1986) showed that OI and the 3D-Var approach are equivalent if the cost function is defined as:

    J = (1/2) { [y_o - H(x)]^T R^{-1} [y_o - H(x)] + (x - x_b)^T B^{-1} (x - x_b) }

The cost function J measures:
The distance of a field x to the observations (first term).
The distance to the background x_b (second term).
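A minimal sketch of minimizing this cost function directly, in the spirit of 3D-Var, using an invented two-variable state with one observation (all numbers are assumptions for illustration). Steepest descent stands in for the sophisticated minimization algorithms used operationally.

```python
import numpy as np

# Toy 2-variable state, one observation of the first variable (assumed values)
x_b = np.array([1.0, 2.0])
B = np.array([[1.0, 0.5],
              [0.5, 1.0]])              # background error covariance
H = np.array([[1.0, 0.0]])              # observe the first variable only
y_o = np.array([2.0])
R = np.array([[0.5]])                   # observation error covariance
Binv, Rinv = np.linalg.inv(B), np.linalg.inv(R)

def J(x):
    d_o = y_o - H @ x                   # distance to the observations
    d_b = x - x_b                       # distance to the background
    return 0.5 * (d_o @ Rinv @ d_o + d_b @ Binv @ d_b)

def gradJ(x):
    return -H.T @ Rinv @ (y_o - H @ x) + Binv @ (x - x_b)

# Simple steepest-descent minimization, starting from the background
x = x_b.copy()
for _ in range(2000):
    x = x - 0.1 * gradJ(x)
x_a = x
```

The unobserved second variable is also adjusted, because the off-diagonal term in B couples it to the observed one.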
The distances are scaled by the observation error covariance R and by the background error covariance B, respectively. The minimum of the cost function is obtained for x = x_a, which is defined as the analysis.
The analysis obtained by OI and 3D-Var is the same if the weight matrix is given by

    W = B H^T (H B H^T + R)^{-1}

The difference between OI and the 3D-Var approach is in the method of solution:
In OI, the weights W are obtained for each grid point or grid volume, using suitable simplifications.
In 3D-Var, the minimization of J is performed directly, allowing for additional flexibility and a simultaneous global use of the data.
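The stated equivalence can be checked numerically: the OI weight matrix gives the same analysis as solving the normal equations of the cost function J. The small matrices below are invented for illustration; for linear H, the minimum of J satisfies (B^{-1} + H^T R^{-1} H) x = B^{-1} x_b + H^T R^{-1} y_o.

```python
import numpy as np

# Assumed toy problem: 2-variable state, one observation of the first variable
B = np.array([[1.0, 0.5],
              [0.5, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
x_b = np.array([1.0, 2.0])
y_o = np.array([2.0])

# OI route: closed-form weight matrix applied to the innovation
W = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
x_a_oi = x_b + W @ (y_o - H @ x_b)

# Variational route: solve the normal equations of the cost function J
Binv, Rinv = np.linalg.inv(B), np.linalg.inv(R)
x_a_var = np.linalg.solve(Binv + H.T @ Rinv @ H,
                          Binv @ x_b + H.T @ Rinv @ y_o)
```

Both routes produce the same analysis; what differs operationally is the cost and structure of the computation.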
Schematic of the size of matrix operations in OI and 3D-Var (from www.ecmwf.int)
Recently, the variational approach has been extended to four dimensions, by including within the cost function the distance to observations over a time interval (the assimilation window). This is called four-dimensional variational assimilation (4D-Var).
In the analysis cycle, the importance of the model cannot be overemphasized:
It transports information from data-rich to data-poor regions.
It provides a complete estimation of the four-dimensional state of the atmosphere.
The introduction of 4D-Var at ECMWF has resulted in marked improvements in the quality of medium-range forecasts.
End of Introduction
Background Field
For operational models, it is not enough to perform spatial interpolation of observations onto regular grids: there are not enough data available to define the initial state.
The number of degrees of freedom in a modern NWP model is of the order of 10^7. The total number of conventional observations is of the order of 10^4 to 10^5.
There are many new types of data, such as satellite and radar observations, but:
they don't measure the variables used in the models;
their distribution in space and time is very nonuniform.
Background Field
In addition to observations, it is necessary to use a first-guess estimate of the state of the atmosphere at the grid points.
The first guess (also known as the background field or prior information) is our best estimate of the state of the atmosphere prior to the use of the observations.
A short-range forecast is normally used as a first guess in operational systems, in what is called an analysis cycle.
If a forecast is unavailable (e.g., if the cycle is broken), we may have to use climatological fields... but they are normally a poor estimate of the initial state.
Global 6-h analysis cycle (00, 06, 12, and 18 UTC).
Regional analysis cycle, performed (perhaps) every hour.
Intermittent data assimilation is used in most global operational systems, typically with a 6-h cycle performed four times a day.
The model forecast plays a very important role:
Over data-rich regions, the analysis is dominated by the information contained in the observations.
In data-poor regions, the forecast benefits from the information upstream.
For example, 6-h forecasts over the North Atlantic Ocean are relatively good because of the information coming from North America.
The model is able to transport information from data-rich to data-poor areas.
Least Squares Method (Kalnay, 5.3)
We start with a toy model example, the two temperatures problem.
We use two methods to solve it, a sequential and a variational approach, and find that they are equivalent: they yield identical results.
The problem is important because the methodology and results carry over to multivariate OI, Kalman filtering, and 3D-Var and 4D-Var assimilation.
If you fully understand the toy model, you should find the more realistic application straightforward.
Statistical estimation
Introduction. Each of you: guess the temperature in this room right now. How can we get a best estimate of the temperature?
The best estimate of the state of the atmosphere is obtained by combining prior information about the atmosphere (background or first guess) with observations.
In order to combine them optimally, we also need statistical information about the errors in these pieces of information.
As an introduction to statistical estimation, we consider a simple problem that we call the two temperatures problem:
Given two independent observations T_1 and T_2, determine the best estimate of the true temperature T_t.
Simple (toy) Example
Let the two observations of temperature be

    T_1 = T_t + ε_1
    T_2 = T_t + ε_2

[For example, we might have two iffy thermometers.]
The observations have errors ε_i, which we don't know.
Let E(·) represent the expected value, i.e., the average of many similar measurements.
We assume that the measurements T_1 and T_2 are unbiased:

    E(T_1 - T_t) = 0,   E(T_2 - T_t) = 0

or equivalently,

    E(ε_1) = E(ε_2) = 0
We also assume that we know the variances of the observational errors:

    E(ε_1^2) = σ_1^2,   E(ε_2^2) = σ_2^2

We next assume that the errors of the two measurements are uncorrelated:

    E(ε_1 ε_2) = 0

This implies, for example, that there is no systematic tendency for one thermometer to read high (ε_1 > 0) when the other is high (ε_2 > 0).
The above equations represent the statistical information that we need about the actual observations.
We estimate T_t as a linear combination of the observations:

    T_a = a_1 T_1 + a_2 T_2

The analysis T_a should be unbiased: E(T_a) = E(T_t). This implies

    a_1 + a_2 = 1

T_a will be the best estimate of T_t if the coefficients are chosen to minimize the mean squared error of T_a:

    σ_a^2 = E[(T_a - T_t)^2] = E{[a_1 (T_1 - T_t) + a_2 (T_2 - T_t)]^2}

subject to the constraint a_1 + a_2 = 1. This may be written

    σ_a^2 = E[(a_1 ε_1 + a_2 ε_2)^2]
Expanding this expression for σ_a^2, we get

    σ_a^2 = a_1^2 σ_1^2 + a_2^2 σ_2^2

To minimize σ_a^2 w.r.t. a_1, we require ∂σ_a^2/∂a_1 = 0.
Naïve solution: ∂σ_a^2/∂a_1 = 2 a_1 σ_1^2 = 0, so a_1 = 0. Similarly, ∂σ_a^2/∂a_2 = 0 implies a_2 = 0.
We have forgotten the constraint a_1 + a_2 = 1. So a_1 and a_2 are not independent.
Substituting a_2 = 1 - a_1, we get

    σ_a^2 = a_1^2 σ_1^2 + (1 - a_1)^2 σ_2^2

Equating the derivative w.r.t. a_1 to zero, ∂σ_a^2/∂a_1 = 0, gives

    a_1 = σ_2^2 / (σ_1^2 + σ_2^2),   a_2 = σ_1^2 / (σ_1^2 + σ_2^2)

Thus, we have expressions for the weights a_1 and a_2 in terms of the variances (which are assumed to be known).
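The weight formulas above are a one-liner to compute. A minimal sketch, using assumed variances σ_1^2 = 4 and σ_2^2 = 1:

```python
def analysis_weights(sigma1_sq, sigma2_sq):
    """Optimal weights for combining two unbiased, uncorrelated observations."""
    total = sigma1_sq + sigma2_sq
    a1 = sigma2_sq / total   # the less accurate observation gets the smaller weight
    a2 = sigma1_sq / total
    return a1, a2

a1, a2 = analysis_weights(4.0, 1.0)
# Resulting analysis error variance is smaller than either input variance
sigma_a_sq = a1**2 * 4.0 + a2**2 * 1.0
```

Note that each weight is proportional to the variance of the other observation: the more uncertain measurement contributes less.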
We define the precision to be the inverse of the variance. It is a measure of the accuracy of the observations.
Note: the term precision, while a good one, does not have universal currency, so it should be defined when used.
Substituting the coefficients in σ_a^2 = a_1^2 σ_1^2 + a_2^2 σ_2^2, we obtain

    σ_a^2 = σ_1^2 σ_2^2 / (σ_1^2 + σ_2^2)

This can be written in the alternative form:

    1/σ_a^2 = 1/σ_1^2 + 1/σ_2^2

Thus, if the coefficients are optimal, the precision of the analysis is the sum of the precisions of the measurements.
Variational approach
We can also obtain the same best estimate of T_t by minimizing a cost function.
The cost function is defined as the sum of the squares of the distances of T to the two observations, weighted by their observational error precisions:

    J(T) = (1/2) [ (T - T_1)^2 / σ_1^2 + (T - T_2)^2 / σ_2^2 ]

The minimum of the cost function J is obtained by requiring ∂J/∂T = 0.
Exercise: Prove that ∂J/∂T = 0 gives the same value for T_a as the least squares method.
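A numerical check of this equivalence (not a substitute for the proof asked for in the exercise): minimize J by brute-force scan and compare with the precision-weighted average from the least squares method. The observation values and variances are assumed for illustration.

```python
import numpy as np

T1, s1_sq = 2.0, 4.0   # observation 1 and its error variance (assumed)
T2, s2_sq = 0.0, 1.0   # observation 2 and its error variance (assumed)

def J(T):
    return 0.5 * ((T - T1)**2 / s1_sq + (T - T2)**2 / s2_sq)

# Variational route: scan candidate temperatures for the minimum of J
Ts = np.linspace(-5.0, 5.0, 100001)
T_var = Ts[np.argmin(J(Ts))]

# Least squares route: precision-weighted average of the two observations
T_ls = (T1 / s1_sq + T2 / s2_sq) / (1.0 / s1_sq + 1.0 / s2_sq)
```

Both routes land on the same analysis temperature, up to the resolution of the scan.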
The control variable for the minimization of J (i.e., the variable with respect to which we are minimizing the cost function) is the temperature. For the least squares method, the control variables were the weights.
The equivalence between the minimization of the analysis error variance and the variational cost function approach is important.
This equivalence also holds for multidimensional problems, in which case we use the covariance matrix rather than the scalar variance. It indicates that OI and 3D-Var are solving the same problem by different means.
Example: Suppose T_1 = 2, σ_1 = 2, T_2 = 0, σ_2 = 1. Show that T_a = 0.4 and σ_a² = 0.8. σ_1² + σ_2² = 5 a_1 = σ_2² / (σ_1² + σ_2²) = 1/5 a_2 = σ_1² / (σ_1² + σ_2²) = 4/5 CHECK: a_1 + a_2 = 1. T_a = a_1 T_1 + a_2 T_2 = (1/5)·2 + (4/5)·0 = 0.4 σ_a² = σ_1² σ_2² / (σ_1² + σ_2²) = (4·1)/(4+1) = 0.8 This solution is illustrated in the next figure. 10
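The arithmetic of this worked example can be verified in a few lines (an illustrative check, not part of the notes):

```python
# Worked example: two observations with different error variances.
T1, s1 = 2.0, 2.0   # first observation and its error standard deviation
T2, s2 = 0.0, 1.0   # second observation and its error standard deviation

a1 = s2 ** 2 / (s1 ** 2 + s2 ** 2)   # weight on T1: 1/5
a2 = s1 ** 2 / (s1 ** 2 + s2 ** 2)   # weight on T2: 4/5
assert abs(a1 + a2 - 1.0) < 1e-12    # CHECK: weights sum to one

Ta = a1 * T1 + a2 * T2                       # analysis: 0.4
var_a = s1 ** 2 * s2 ** 2 / (s1 ** 2 + s2 ** 2)   # analysis error variance: 0.8
print(Ta, var_a)   # 0.4 0.8
```

Note that the analysis variance (0.8) is smaller than either observational variance (4 and 1), as the figure below illustrates.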
The probability distribution for a simple case. The analysis has a pdf with a maximum closer to T_2, and a smaller standard deviation than either observation. 11
Conclusion of the foregoing. 12
Simple Sequential Assimilation We consider the toy example as a prototype of a full multivariate OI. Recall that we wrote the analysis as a linear combination T_a = a_1 T_1 + a_2 T_2 The requirement that the analysis be unbiased led to a_1 + a_2 = 1, so T_a = T_1 + a_2 (T_2 − T_1) Assume that one of the two temperatures, say T_1 = T_b, is not an observation, but a background value, such as a forecast. Assume that the other value is an observation, T_2 = T_o. We can write the analysis as T_a = T_b + W (T_o − T_b) where W = a_2 can be expressed in terms of the variances. 13
The least squares method gave us the optimal weight: W = σ_b² / (σ_b² + σ_o²) When the analysis is written as T_a = T_b + W (T_o − T_b) the quantity (T_o − T_b) is called the observational innovation, i.e., the new information brought by the observation. It is also known as the observational increment (with respect to the background). 14
The analysis error variance is, as before, given by 1/σ_a² = 1/σ_b² + 1/σ_o² or σ_a² = σ_b² σ_o² / (σ_b² + σ_o²) The analysis variance can also be written as σ_a² = (1 − W) σ_b² Exercise: Verify all the foregoing formulæ. We have shown that the simple two-temperature problem serves as a paradigm for the problem of objective analysis of the atmospheric state. 15
Collection of Main Equations We gather the principal equations here: T_a = T_b + W (T_o − T_b) W = σ_b² / (σ_b² + σ_o²) σ_a² = σ_b² σ_o² / (σ_b² + σ_o²) = W σ_o² σ_a² = (1 − W) σ_b² 16
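These four scalar equations fit in a single small function; a minimal sketch (the function name `assimilate` is an assumption for illustration):

```python
def assimilate(Tb, var_b, To, var_o):
    """One scalar analysis step: return the analysis and its error variance."""
    W = var_b / (var_b + var_o)                # optimal weight
    Ta = Tb + W * (To - Tb)                    # background plus weighted innovation
    var_a = var_b * var_o / (var_b + var_o)    # equals (1 - W)*var_b and W*var_o
    return Ta, var_a

# Same numbers as the earlier worked example, now read as background + observation:
Ta, var_a = assimilate(Tb=2.0, var_b=4.0, To=0.0, var_o=1.0)
print(round(Ta, 10), round(var_a, 10))   # 0.4 0.8
```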
These four equations have been derived for the simplest scalar case... but they are important for the problem of data assimilation because they have exactly the same form as more general equations: The least squares sequential estimation method is used for real multidimensional problems (OI, 3D-Var and even Kalman filtering). Therefore we will interpret these four equations in detail. 17
The first equation T_a = T_b + W (T_o − T_b) This says: The analysis is obtained by adding to the background value, or first guess, the innovation (the difference between the observation and first guess), weighted by the optimal weight. 18
The second equation W = σ_b² / (σ_b² + σ_o²) This says: The optimal weight is the background error variance multiplied by the inverse of the total error variance (the sum of the background and the observation error variances). Note that the larger the background error variance, the larger the correction to the first guess. Look at the limits: σ_o² = 0 gives W = 1, so the analysis equals the observation; σ_b² = 0 gives W = 0, so the analysis equals the background. 19
The third equation The variance of the analysis is σ_a² = σ_b² σ_o² / (σ_b² + σ_o²) This can also be written 1/σ_a² = 1/σ_b² + 1/σ_o² This says: The precision of the analysis (inverse of the analysis error variance) is the sum of the precisions of the background and the observation. 20
The fourth equation σ_a² = (1 − W) σ_b² This says: The error variance of the analysis is the error variance of the background, reduced by a factor equal to one minus the optimal weight. It can also be written σ_a² = W σ_o² 21
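The identity (1 − W) σ_b² = W σ_o² follows by substituting W = σ_b²/(σ_b² + σ_o²). A quick numeric spot-check over random variances (illustrative only):

```python
import random

random.seed(0)
for _ in range(1000):
    var_b = random.uniform(0.1, 10.0)   # random background error variance
    var_o = random.uniform(0.1, 10.0)   # random observation error variance
    W = var_b / (var_b + var_o)         # optimal weight
    # Both forms of the analysis error variance agree:
    assert abs((1.0 - W) * var_b - W * var_o) < 1e-9
print("identity holds")
```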
All the above statements are important because they also hold true for sequential data assimilation systems (OI and Kalman filtering) for multidimensional problems. In these problems, in which T_b and T_a are three-dimensional fields of size order 10⁷ and T_o is a set of observations (typically of size 10⁵), we have to replace expressions as follows: error variance → error covariance matrix optimal weight → optimal gain matrix. Note that there is one essential tuning parameter in OI: it is the ratio of the observational error variance to the background error variance, (σ_o/σ_b)². 22
Application to Analysis If the background is a forecast, we can use the four equations to create a simple sequential analysis cycle. The observation is used once, at the time it appears, and then discarded. Assume that we have completed the analysis at time t_i (e.g., at 06 UTC), and we want to proceed to the next cycle (time t_{i+1}, or 12 UTC). The analysis cycle has two phases: a forecast phase, to update the background T_b and its error variance σ_b², and an analysis phase, to update the analysis T_a and its error variance σ_a². 23
Typical 6-hour analysis cycle. 24
Forecast Phase In the forecast phase of the analysis cycle, the background is first obtained through a forecast: T_b(t_{i+1}) = M[T_a(t_i)] where M represents the forecast model. We also need the error variance of the background. In OI, this is obtained by making a suitably simple assumption, such as that the model integration increases the initial error variance by a fixed factor a somewhat greater than 1: σ_b²(t_{i+1}) = a σ_a²(t_i) This allows the new weight W(t_{i+1}) to be estimated using W = σ_b² / (σ_b² + σ_o²) 25
Analysis Phase In the analysis phase of the cycle we get the new observation T_o(t_{i+1}), and we derive the new analysis T_a(t_{i+1}) using T_a = T_b + W (T_o − T_b) The estimate of σ_b² comes from σ_b²(t_{i+1}) = a σ_a²(t_i) The new analysis error variance σ_a²(t_{i+1}) comes from σ_a² = (1 − W) σ_b² It is smaller than the background error variance. After the analysis, the cycle for time t_{i+1} is completed, and we can proceed to the next cycle. 26
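The forecast and analysis phases above can be sketched as a short loop. This is a minimal illustration, not the operational algorithm: the persistence "model" M and the inflation factor a = 1.5 are stand-in assumptions.

```python
def M(T):
    """Stand-in forecast model (persistence); a real NWP model goes here."""
    return T

def cycle(Ta, var_a, observations, var_o, a=1.5):
    """One forecast phase + analysis phase per observation; yield (Ta, var_a)."""
    for To in observations:
        Tb = M(Ta)                    # forecast phase: advance the analysis
        var_b = a * var_a             # inflate error variance by the factor a
        W = var_b / (var_b + var_o)   # new optimal weight
        Ta = Tb + W * (To - Tb)       # analysis phase: absorb the innovation
        var_a = (1.0 - W) * var_b     # analysis error < background error
        yield Ta, var_a

results = list(cycle(Ta=2.0, var_a=4.0, observations=[0.0, 1.0, 0.5], var_o=1.0))
for Ta, var_a in results:
    print(round(Ta, 3), round(var_a, 3))
```

Note that each observation is used once and discarded, exactly as the text describes, and that the analysis error variance stays below the inflated background variance at every step.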
Reading Assignment Study the Remarks in Kalnay, Section 5.3.1. 27