Simple Examples

Let's look at a few simple examples of OI analysis.

Example 1: Consider a scalar problem. We have one observation y which is located at the analysis point. We also have a background estimate x_b. In addition, we assume that we know the error statistics of y and x_b:

<x_b - x_t> = 0,    <(x_b - x_t)^2> = sigma_b^2,
<y - y_t> = 0,      <(y - y_t)^2> = sigma_o^2,
<(y - y_t)(x_b - x_t)> = 0.

Using the notation of the OI solution, n = 1, p = 1, H = (1), R = (sigma_o^2), B = (sigma_b^2).
x_a = x_b + sigma_b^2 (1) [(1) sigma_b^2 (1) + sigma_o^2]^(-1) [y - (1) x_b]
    = x_b + sigma_b^2 / (sigma_b^2 + sigma_o^2) (y - x_b),

sigma_a^2 = sigma_b^2 - sigma_b^2 (sigma_b^2 + sigma_o^2)^(-1) sigma_b^2
          = sigma_b^2 sigma_o^2 / (sigma_b^2 + sigma_o^2),

i.e.,

1/sigma_a^2 = 1/sigma_b^2 + 1/sigma_o^2.

The analysis equation can be rewritten as

x_a = sigma_o^2 / (sigma_b^2 + sigma_o^2) x_b + sigma_b^2 / (sigma_b^2 + sigma_o^2) y,

then

x_a = (x_b / sigma_b^2 + y / sigma_o^2) / (1/sigma_b^2 + 1/sigma_o^2),

therefore

x_a = sigma_a^2 (x_b / sigma_b^2 + y / sigma_o^2).

We see that the analysis is a linear combination of the background and the observation, with weighting coefficients proportional to the inverse of the variances. In fact, if we have N samples of
observations, or more generally, N measured estimates of the state variable x, called y_n, plus one background estimate x_b, then the final estimate combining all available pieces of information is

x_a = sigma_a^2 ( x_b / sigma_b^2 + sum_{n=1}^{N} y_n / sigma_on^2 ),

where

1/sigma_a^2 = 1/sigma_b^2 + sum_{n=1}^{N} 1/sigma_on^2.

The variance of the analysis is smaller than that of both the background and the observations, i.e.,

sigma_a^2 <= min[sigma_b^2, sigma_o1^2, sigma_o2^2, ..., sigma_oN^2].

This is because the variance is always positive: 1/sigma_a^2 must be larger than the largest term on the right-hand side of the analysis covariance equation above, therefore sigma_a^2 must be smaller than the smallest variance on the right-hand side, the one corresponding to that largest term.
Now assume that the analysis is for temperature and

y = 2.0,   R = (sigma_o^2) = 1.0,
x_b = 0.5, B = (sigma_b^2) = 2.0;

we obtain the analysis and its variance

x_a = 1.5,   A = sigma_a^2 ~ 0.67  (sigma_a ~ 0.82).

The following figure shows the background, observation and analysis probability distribution functions, assuming the errors have Gaussian distributions. It is clear that the analysis lies between the background and the observation, and in this particular case is closer to the observation because of the observation's smaller expected error. It is also shown that the analysis has a higher probability and a smaller expected error compared to both the background and the observation. Read also section 5.3.3 of Kalnay's book.
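The scalar analysis above can be checked with a few lines of code. This is a minimal sketch using the numbers of this example; the variable names are illustrative:

```python
import math

# Scalar OI: one observation y and one background xb at the same point,
# with the values used in the text (y = 2.0, sigma_o^2 = 1.0, xb = 0.5,
# sigma_b^2 = 2.0).
y, var_o = 2.0, 1.0
xb, var_b = 0.5, 2.0

# The weight W = B H^T (H B H^T + R)^(-1) reduces to var_b / (var_b + var_o).
w = var_b / (var_b + var_o)
xa = xb + w * (y - xb)

# Analysis variance from 1/var_a = 1/var_b + 1/var_o.
var_a = 1.0 / (1.0 / var_b + 1.0 / var_o)

print(round(xa, 2))                 # 1.5
print(round(var_a, 2))              # 0.67
print(round(math.sqrt(var_a), 2))   # 0.82
```

Note that the analysis variance 0.67 is indeed smaller than both input variances, as argued above.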
Probability distribution function for the analysis, given the observation and the background.
Example. Now, suppose we have one oservation, y, located etween two analysis points. We have ackground information on the two analysis points, denoted y x 1 and x, and we can linearly interpolate the ackground information to the oservation location as Hx x 1 1 x 1 (1 ) x x where z z For linear interpolation,, where z is the spatial coordinate. z z1 The assumed oservation error is as given efore, R = ( o ), while the ackground error covariance matrix now takes the form of 11 1 1 B. 1 1 Here we have assumed that 6
b_11 = b_22 = sigma_b^2,   b_12 = b_21 = rho sigma_b^2,

with rho being the background error correlation coefficient between the two grid points. The OI solution is then

x_a1 = x_b1 + sigma_b^2 [alpha + (1 - alpha) rho] { y - [alpha x_b1 + (1 - alpha) x_b2] } / D,
x_a2 = x_b2 + sigma_b^2 [alpha rho + (1 - alpha)] { y - [alpha x_b1 + (1 - alpha) x_b2] } / D,     (5)

where D = sigma_b^2 [alpha^2 + 2 alpha (1 - alpha) rho + (1 - alpha)^2] + sigma_o^2.

Let's consider three cases:

Case 1. The observation is collocated with analysis grid point 1 (alpha = 1) and the background errors are not correlated between points 1 and 2 (rho = 0). The above solution reduces to

x_a1 = x_b1 + sigma_b^2 (y - x_b1) / (sigma_b^2 + sigma_o^2),
x_a2 = x_b2 + 0.

In this case, the solution at point 1 is identical to that in Example 1. The solution at point 2 is equal to the background, and no information from the observation is added there.

Case 2. The observation is collocated with analysis grid point 1 (alpha = 1) as in Case 1. However, the background errors are correlated between point 1 and point 2 (rho != 0). The solution now reduces to
x_a1 = x_b1 + sigma_b^2 (y - x_b1) / (sigma_b^2 + sigma_o^2),
x_a2 = x_b2 + rho sigma_b^2 (y - x_b1) / (sigma_b^2 + sigma_o^2).

In this case, the solution at point 1 is unchanged from Case 1, but the solution at point 2 is equal to the background plus rho times the analysis increment added at point 1. Here we can see the role of background error correlation in spreading observational information (or, more strictly, the analysis increment).

Case 3. The observation is located in between the analysis points (alpha != 1) but the background errors are not correlated between points 1 and 2 (rho = 0). Now the solution becomes

x_a1 = x_b1 + alpha sigma_b^2 { y - [alpha x_b1 + (1 - alpha) x_b2] } / { sigma_b^2 [alpha^2 + (1 - alpha)^2] + sigma_o^2 },
x_a2 = x_b2 + (1 - alpha) sigma_b^2 { y - [alpha x_b1 + (1 - alpha) x_b2] } / { sigma_b^2 [alpha^2 + (1 - alpha)^2] + sigma_o^2 }.

In this case, the analysis increments at point 1 and point 2 are proportional to alpha and 1 - alpha, respectively, i.e., the analysis result depends on the distance of the observation from the grid point. This also says that even in the absence of background error correlation, the grid points involved in the forward observation operator are usually influenced directly by the observations, through the link in the observation operator.
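Solution (5) and the special cases can be verified numerically with the full matrix form of OI. A minimal sketch (the numeric values are illustrative):

```python
import numpy as np

# Two grid points, one observation: evaluate the OI solution in matrix form.
def oi_two_points(xb, y, var_b, var_o, alpha, rho):
    H = np.array([[alpha, 1.0 - alpha]])            # 1x2 linear interpolator
    B = var_b * np.array([[1.0, rho], [rho, 1.0]])  # background covariance
    R = np.array([[var_o]])                         # observation variance
    W = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)    # 2x1 weight matrix
    return xb + W @ (np.array([y]) - H @ xb)        # xb + W * innovation

xb = np.array([0.5, 1.0])
# Case 1: alpha = 1, rho = 0 -> point 1 as in Example 1, point 2 untouched.
xa1 = oi_two_points(xb, 2.0, 2.0, 1.0, alpha=1.0, rho=0.0)
# Case 2: alpha = 1, rho = 0.5 -> point 2 gets rho times point 1's increment.
xa2 = oi_two_points(xb, 2.0, 2.0, 1.0, alpha=1.0, rho=0.5)

print(xa1)   # [1.5 1. ]
print(xa2)   # [1.5 1.5]
```

In Case 2 the increment at point 2 (0.5) is exactly rho = 0.5 times the increment at point 1 (1.0), as the reduced formulas predict.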
Final Comments: From the full solution (5), we can see that both the observation operator and the error correlation have made contributions. However, when generalizing the solution from two analysis points to n points, the linear interpolation operator will only influence the analysis points around the observation, while the error correlations may spread information to all analysis points that have error correlation with the first-guess observation H(x_b).
Approximations with Practical OI Implementation

The OI analysis is given by

x_a = x_b + W [y_o - H(x_b)],
W = B H^T (R + H B H^T)^(-1).

The actual implementation requires simplifications in the computation of the weight W. The equation for x_a can be regarded as a list of scalar analysis equations, one per model variable in the vector x. For each model variable, the analysis increment is given by the corresponding line of W times the vector of background departures (y_o - H(x_b)). Given
W^T = (w_1, w_2, ..., w_n), i.e., w_i^T is the i-th row of W, we have

x_a1 = x_b1 + w_1^T [y_o - H(x_b)],
x_a2 = x_b2 + w_2^T [y_o - H(x_b)],
  :
x_an = x_bn + w_n^T [y_o - H(x_b)],

therefore

x_ai = x_bi + w_i^T [y_o - H(x_b)].

The fundamental hypothesis in the typical implementation of OI is: for each model variable, only a few observations are important in determining the analysis increment. Based on this assumption, the problem of matrix products and inversion is reduced by including only a small number of observations in the analysis at a given grid point. The following two figures show two data selection strategies.
The actual implementation can be as follows:

1) For each model variable x_i, select a small number p_i of observations using empirical selection criteria.
2) Form the corresponding list of background departures (y_o - H(x_b))_i.
3) Form the p_i background error covariances between the model variable x_i and the model state interpolated to the p_i observation points (i.e., the relevant p_i coefficients of the i-th line of B H^T).
4) Form the p_i x p_i background and observation error covariance submatrices formed by the restrictions of H B H^T and R to the selected observations.
5) Invert the positive definite matrix formed by the restriction of (H B H^T + R) to the selected observations.
6) Multiply it by the i-th line of B H^T to get the necessary line of W.

It is possible to save some computer time on the matrix inversion by directly solving a symmetric positive definite linear system, since we know in advance the vector of departures to which the inverse matrix will be applied. Also, if the same set of observations is used to analyze several model variables, then the same matrix inverse (or factorization) can be reused.
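The six steps can be sketched in code. This is a toy 1-D setup, not the operational algorithm: the Gaussian covariance model, the grid, the nearest-neighbor selection rule and all numbers are illustrative assumptions; observations are taken to sit exactly on grid points so that H is a simple selection:

```python
import numpy as np

def local_oi(xb, z, yo, obs_idx, var_b, L, var_o, p_sel=3):
    """OI analysis of xb on grid z, using only the p_sel nearest obs per point."""
    xa = xb.copy()
    d = yo - xb[obs_idx]                       # 2) background departures y - H(xb)
    bcov = lambda za, zb: var_b * np.exp(-0.5 * ((za - zb) / L) ** 2)
    for i in range(len(xb)):
        # 1) select the p_sel observations nearest to grid point i
        sel = np.argsort(np.abs(z[obs_idx] - z[i]))[:p_sel]
        zo = z[obs_idx][sel]
        bh = bcov(z[i], zo)                    # 3) i-th line of B H^T, restricted
        hbh = bcov(zo[:, None], zo[None, :])   # 4) restriction of H B H^T
        Rsub = var_o * np.eye(len(sel))        #    and of R
        # 5)-6) solve the symmetric positive definite system instead of inverting
        w = np.linalg.solve(hbh + Rsub, d[sel])
        xa[i] = xb[i] + bh @ w
    return xa

# One observation y = 1.0 at grid point 2 of a 5-point grid, zero background.
xa = local_oi(np.zeros(5), np.arange(5.0), np.array([1.0]), np.array([2]),
              var_b=1.0, L=1.0, var_o=1.0, p_sel=1)
# The increment peaks at the observed point and decays with distance.
```

The `np.linalg.solve` call reflects the remark above: the inverse matrix is never formed explicitly, because we know the departure vector it will be applied to.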
Models of Error Covariances

Correct specification of the background and observation error covariances is crucial: they determine the relative weights of the background and the observations.

- Variances are essential for determining the magnitude of the errors, and therefore the relative weights.
- Covariances determine how observation information is spread in model space (when the model resolution does not match that of the observations).

Observation error variances include instrument errors and representativeness errors.

- Systematic observation biases should be removed before using the data.
- Observation error correlations/covariances are often assumed to be zero, i.e., the measurements are assumed uncorrelated. Observation error correlation can show up when
  o sets of observations are taken by the same platform, e.g., radar, rawinsonde, aircraft, satellite;
  o data preprocessing introduces systematic errors;
  o representativeness errors of close-by observations overlap;
  o the forward operator has errors, e.g., an interpolator that produces similar errors at nearby points.
- The presence of (positive) observation error correlation reduces the weight given to the average of the observations, which is reasonable because these observations are alike.
Observation error correlations are difficult to estimate and account for. In practice, efforts are made to minimize them through reducing bias, by
  o avoiding unnecessary preprocessing;
  o thinning dense data (data denser than the grid resolution);
  o improving the model and observation operators (the model plays the role of the forward operator in the case of 4DVAR).

After these are done, it is safer to assume the observation error correlation to be zero, i.e., the observation error covariance matrix R is diagonal.

Background error variances: they are usually estimates of the error variances in the forecast used to produce x_b. This is a difficult problem, because they are never observed directly; they can only be estimated in a statistical sense. If the analysis is of good quality (i.e., if there are a lot of observations), an estimate can be provided by the variance of the differences between the forecast and a verifying analysis. If the observations can be assumed to be uncorrelated, much better averaged background error variances can be obtained by using the observational method (or the Hollingsworth-Lönnberg method; Tellus, 1986).
However, in a system like the atmosphere, the actual background errors are expected to depend a lot on the weather situation, and ideally the background errors should be flow-dependent. This can be achieved by the Kalman filter, by 4D-Var to some extent, or by some empirical laws of error growth based on physical grounds. If the background error variances are badly specified, the analysis increments will be too large or too small. In least-squares analysis algorithms, only the relative magnitude of the background and observation error variances is important.

Background error correlations: they are essential because of

Information spreading.
  o In data-sparse areas, the shape of the analysis increment is completely determined by the covariance structures.
  o The correlations in B perform the spatial spreading of information from the observation points to a finite domain surrounding them.

Information smoothing.
  o In data-dense areas, the amount of smoothing of the observed information is governed by the correlations in B, which can be understood by noting that the leftmost term in W is B.
  o The smoothing of the increments is important in ensuring that the analysis contains scales that are statistically compatible with the smoothness properties of the physical fields. For instance, when analyzing stratospheric or anticyclonic air masses, it is desirable to smooth the increments a lot in the horizontal in order to average and spread the measurements efficiently. When doing a low-level analysis in frontal, coastal or mountainous areas, or near temperature inversions, it is on the contrary desirable to limit the extent of the increments so as not to produce an unphysically smooth analysis. This has to be reflected in the specification of the background error correlations.
  o Both of the above address only correlations of a variable with itself, or autocorrelations.

Balance properties: correlations across variables, or cross-correlations.
  o There are sometimes more degrees of freedom in a model than in reality (i.e., not all model variables are free from each other). For instance, the large-scale atmosphere is usually hydrostatic and almost geostrophic; these relationships introduce balances among the fields.
  o These balance properties show up as correlations in the background errors.
  o Because of these correlations/balances, observations can be used more effectively, i.e., observing one model variable yields information about all variables that are in balance with it. For example, a low-level wind observation allows one to correct the surface pressure field by assuming some amount of geostrophy. When combined with the spatial smoothing of increments, this can lead to a considerable impact on the quality of the analysis; e.g., a properly spread observation of geopotential height can produce a complete three-dimensional correction to the geostrophic wind field (see Figure 2). The relative amplitudes of the increments in the various model fields depend directly on the specified amount of correlation, as well as on the assumed error variances of all the parameters concerned.
  o Accurate estimation and use of background error (cross-) correlations can do the magic of retrieving quantities not directly observed, something that the ensemble Kalman filter attempts to do in, e.g., the assimilation of radar data.
  o Accurate background errors are flow-dependent.
The observational (or Hollingsworth-Lönnberg) method for estimating background error

This method relies on the use of background departures (y - H(x_b)) in an observing network that is dense and large enough to provide information on many scales, and that can be assumed to consist of uncorrelated and discrete observations. The principle (illustrated in Fig. 8) is to calculate a histogram of background departure covariances, stratified by separation distance (for instance). At zero separation the histogram provides averaged information about the background and observation errors; at nonzero separation it gives the averaged background error correlation.
In most systems the background error covariances should go to zero for very large separations. If this is not the case, it is usually a sign of biases in the background and/or in the observations, and the method may not work correctly (Hollingsworth and Lönnberg 1986). The formula is as follows. For the innovation covariance gamma_ij between points i and j,
gamma_ij = E[(y_i - H_i x_b)(y_j - H_j x_b)^T]
         = E[{(y_i - H_i x_t) - (H_i x_b - H_i x_t)}{(y_j - H_j x_t) - (H_j x_b - H_j x_t)}^T]
         = E[(y_i - H_i x_t)(y_j - H_j x_t)^T] + H_i E[(x_b - x_t)(x_b - x_t)^T] H_j^T + 0 + 0
         = R_ij + H_i B H_j^T.

The cross terms vanish because the observation and background errors are assumed uncorrelated. Since the observation errors are assumed mutually uncorrelated, R_ij = 0 for i != j, so the nonzero-separation covariances sample H_i B H_j^T alone.

Reference: Hollingsworth, A. and P. Lönnberg, 1986: The statistical structure of short-range forecast errors as determined from radiosonde data. Part I: The wind field. Tellus, 38A, 111-136.
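A synthetic illustration of this identity can be built with made-up numbers: a dense 1-D network, a Gaussian background error covariance, and uncorrelated observation errors (all values below are illustrative assumptions):

```python
import numpy as np

# Innovations y - H(xb) = eps_o - eps_b: at zero separation their covariance
# contains sigma_b^2 + sigma_o^2; at nonzero separation the uncorrelated
# observation error drops out, leaving only H_i B H_j^T.
rng = np.random.default_rng(0)
n_obs, n_samples = 50, 20000
z = np.arange(n_obs, dtype=float)
var_b, var_o, L = 2.0, 1.0, 3.0
B = var_b * np.exp(-0.5 * ((z[:, None] - z[None, :]) / L) ** 2)

# Correlated background errors plus independent observation errors.
eps_b = rng.multivariate_normal(np.zeros(n_obs), B + 1e-8 * np.eye(n_obs),
                                size=n_samples)
eps_o = rng.normal(0.0, np.sqrt(var_o), size=(n_samples, n_obs))
innov = eps_o - eps_b

cov = innov.T @ innov / n_samples
zero_sep = np.mean(np.diag(cov))       # ~ var_b + var_o = 3.0
one_sep = np.mean(np.diag(cov, k=1))   # ~ var_b * exp(-0.5/L^2) ~ 1.89
```

Extrapolating the nonzero-separation curve back to zero separation thus isolates sigma_b^2, and the remaining jump at zero separation gives sigma_o^2, which is the essence of the method.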
The NMC method for estimating background errors

The so-called "NMC method" (Parrish and Derber, 1992) estimates the forecast error covariance according to

B ~ E{[x_f(48 hr) - x_f(24 hr)][x_f(48 hr) - x_f(24 hr)]^T},

i.e., the structure of the forecast or background error covariance is estimated as the average over many (e.g., 50) differences between two short-range model forecasts verifying at the same time. The magnitude of the covariance is then appropriately scaled. In this approximation, rather than estimating the structure of the forecast error covariance from differences with observations, the model-forecast differences themselves provide a multivariate global forecast difference covariance. Strictly speaking, this is the covariance of the forecast differences and is only a proxy for the structure of the forecast errors. Nevertheless, it has been shown to produce better results than previous estimates computed from forecast-minus-observation statistics. An important reason is that the rawinsonde observational network is not dense enough to allow a proper estimate of the global structures. The NMC method has been in use at most operational centers because of its simplicity and comparative effectiveness.
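In code, the NMC estimate is just the sample covariance of the forecast difference fields. A sketch with synthetic stand-in data (the state size, number of pairs, and the 0.5 rescaling factor are all illustrative):

```python
import numpy as np

# Each row stands in for one x_f(48 hr) - x_f(24 hr) difference field,
# the two forecasts verifying at the same time.
rng = np.random.default_rng(1)
n, n_pairs = 4, 500                        # tiny state, many verification times
diffs = rng.standard_normal((n_pairs, n))  # stand-in for forecast differences
diffs = diffs - diffs.mean(axis=0)         # remove the mean (bias) difference

B_nmc = diffs.T @ diffs / (n_pairs - 1)    # covariance of the differences
B_nmc = 0.5 * B_nmc                        # "appropriately scaled" magnitude
```

The result is symmetric positive semidefinite by construction, which is one practical attraction of the method.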
Being based on many past forecasts (over, e.g., 1-2 months), the estimate is at best seasonally dependent, however.

Reference: Parrish, D. F. and J. C. Derber, 1992: The National Meteorological Center's spectral statistical-interpolation analysis system. Mon. Wea. Rev., 120, 1747-1763.

The Modeling of background correlations

The full B matrix is usually too big to be specified explicitly.

The variances are just the n diagonal terms of B, and are usually specified completely.
  o The off-diagonal terms are more difficult to specify. They must generate a symmetric positive definite matrix.
  o Additionally, B is often required to have some physical properties, which must be reflected in the analysis:
  o the correlations must be smooth in physical space, on sensible scales;
  o the correlations should go to zero for very large separations if it is believed that observations should only have a local effect on the increments;
  o the correlations should not exhibit physically unjustifiable variations with direction or location;
  o the most fundamental balance properties, like geostrophy, must be reasonably well enforced;
  o the correlations should not lead to unreasonable effective background error variances for any parameter that is observed, used in the subsequent model forecast, or output to the users as an analysis product.

The complexity and subtlety of these requirements mean that the specification of background error covariances is a problem similar to physical parameterization. Some of the more popular techniques are listed below.
  o Correlation models can be specified independently from the variance fields, under the condition that the scales of variation of the variances are much larger than the correlation scales.
  o Vertical autocorrelation matrices for each parameter are usually small enough to be specified explicitly.
  o Horizontal autocorrelations cannot be specified explicitly, but they can be reduced to sparse matrices by assuming that they are homogeneous and isotropic to some extent.
  o Three-dimensional multivariate correlation models can be built by carefully combining separability, homogeneity and independence hypotheses, such as: zero correlations in the vertical between distinct spectral wavenumbers; homogeneity of the vertical correlations in the horizontal and/or of the horizontal correlations in the vertical; the property that the correlations are products of horizontal and vertical correlations. Numerically, these imply that the correlation matrix is sparse because it is made of block matrices which are themselves block-diagonal.
  o Balance constraints can be enforced by transforming the model variables into suitably defined complementary spaces of balanced and unbalanced variables. The latter are supposed to have smaller background error variances than the former, meaning that they will contribute less to the increment structures.
  o The geostrophic balance constraint can be enforced using the classical f-plane or beta-plane balance equations, or projections onto subspaces spanned by so-called Rossby and gravity normal modes.
  o More general kinds of balance properties can be expressed using linear regression operators calibrated on actual background error fields, if no analytical formulation is available.

Many such treatments are done in the NCEP operational 3DVAR system. Good discussions can be found in:

Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part II: Spatially inhomogeneous and anisotropic general covariances. Mon. Wea. Rev., 131, 1536-1548.

Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances. Mon. Wea. Rev., 131, 1524-1535.
Correlation coefficients between Z at the black dot and the model state variables at t = 80 min, from an experiment assimilating Z (> 10 dBZ) only (see Tong and Xue, MWR, 2005). Color shows the variables; contours show the correlations.
Comment on OI (and 3DVAR) versus other schemes

Perhaps the most important advantage of statistical interpolation schemes such as Optimal Interpolation and 3D-Var over empirical schemes such as the successive correction method (SCM) is that the correlations between observational increments can be taken into account. With SCM, the weights of the observational increments depend only on their distance to the grid point. Therefore, if a number of observations are "bunched up" in one quadrant, with just a single observation in a different quadrant, all the observations will be given similar weight. In Optimal Interpolation (or 3D-Var), by contrast, the isolated observational increment will be given more weight in the analysis than the observations that are close together and therefore less independent.

When several observations are too close together, the OI solution becomes an ill-posed problem. In those cases, it is common to compute a "superobservation" combining the close individual observations. This has the advantage of removing the ill-posedness, while at the same time reducing, by averaging, the random errors of the individual observations. The superobservation should be a weighted average that takes into account the relative observation errors of the original close observations.
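A minimal sketch of such a superobservation for two close observations, weighted by inverse error variances (the numbers are illustrative):

```python
# Two nearby observations of the same quantity with different error variances.
y1, var1 = 2.0, 1.0
y2, var2 = 2.4, 2.0

w1, w2 = 1.0 / var1, 1.0 / var2
y_super = (w1 * y1 + w2 * y2) / (w1 + w2)   # inverse-variance weighted average
var_super = 1.0 / (w1 + w2)                 # smaller than var1 and var2

print(round(y_super, 3))    # 2.133
print(round(var_super, 3))  # 0.667
```

The superobservation's error variance is smaller than that of either member, illustrating the error reduction by averaging mentioned above.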
The role of the observation operator H

Earlier, we said that the H matrix (a p x n matrix) transforms vectors in model space (e.g., x, a vector of length n) into their corresponding values in observation space (vectors of length p). The transpose or adjoint of H, H^T (an n x p matrix), transforms vectors in observation space (e.g., y, a vector of length p) into vectors in model space (vectors of length n).

The observation operator and its adjoint can also operate on the error covariance matrices B and R, and when they do so, they have a similar effect as on vectors. For example, we indicated earlier that B H^T contains the background error covariances between the grid points and the observation points; H^T therefore plays the role of taking the background error from grid points to observation points, though only partially, because the product still represents covariances between grid and observation points. The background error is completely brought into observation space by the H operator and its adjoint (transpose), so that H B H^T represents the error covariances of the background in terms of the observed quantities. This can be seen from below:
Let y_b = H(x_b) and y_t = H(x_t). Then

y_b - y_t = H(x_t + (x_b - x_t)) - H(x_t) ~ H(x_b - x_t),

E[(y_b - y_t)(y_b - y_t)^T] ~ H E[(x_b - x_t)(x_b - x_t)^T] H^T = H B H^T,

where an approximation has been made to linearize the observation operator H. For a linear observation operator, the result is exact.

Further illustration of the point

Suppose we have three observations, y_1^o, y_2^o and y_3^o, taken between two grid points with background values x_1 and x_2:

x_1 ... y_1^o ... y_2^o ... y_3^o ... x_2

The forward operator is simply the linear interpolator, so that

H x = | alpha_1 x_1 + (1 - alpha_1) x_2 |
      | alpha_2 x_1 + (1 - alpha_2) x_2 |
      | alpha_3 x_1 + (1 - alpha_3) x_2 |.

The background error covariance matrix B is
B = | b_11  b_12 |
    | b_21  b_22 |,

therefore

B H^T = | alpha_1 b_11 + (1 - alpha_1) b_12   alpha_2 b_11 + (1 - alpha_2) b_12   alpha_3 b_11 + (1 - alpha_3) b_12 |
        | alpha_1 b_21 + (1 - alpha_1) b_22   alpha_2 b_21 + (1 - alpha_2) b_22   alpha_3 b_21 + (1 - alpha_3) b_22 |.

Indeed, the background error covariances have been interpolated by the observation operator (an interpolator in this case). For example, the first element of B H^T, alpha_1 b_11 + (1 - alpha_1) b_12, represents the background error covariance between grid point 1 and observation 1: it is the interpolation, to the y_1^o observation point, of the background error covariance of the x_1 point with itself (b_11) and the background error covariance between the x_1 point and the x_2 point (b_12), using interpolation coefficients alpha_1 and 1 - alpha_1.

Let's denote the matrix

B H^T = | c_11  c_12  c_13 |
        | c_21  c_22  c_23 |.

The first index of c denotes the grid point location and the second index indicates the observation point.
Applying the H operator again to B H^T takes the other (grid point) end of the covariances to the observation points as well, so that we are left with covariances between observation points, but still of background errors:

H B H^T = | alpha_1 c_11 + (1 - alpha_1) c_21   alpha_1 c_12 + (1 - alpha_1) c_22   alpha_1 c_13 + (1 - alpha_1) c_23 |
          | alpha_2 c_11 + (1 - alpha_2) c_21   alpha_2 c_12 + (1 - alpha_2) c_22   alpha_2 c_13 + (1 - alpha_2) c_23 |
          | alpha_3 c_11 + (1 - alpha_3) c_21   alpha_3 c_12 + (1 - alpha_3) c_22   alpha_3 c_13 + (1 - alpha_3) c_23 |

        = | d_11  d_12  d_13 |
          | d_21  d_22  d_23 |
          | d_31  d_32  d_33 |.

Here the covariances c_ij are interpolated again, from the grid points (indicated by the first index) to the observation points. For example, d_12 represents the background error covariance between the y_1^o and y_2^o points, and is equal to the interpolated value (using weights alpha_1 and 1 - alpha_1) of the covariance between the x_1 point and the y_2^o point (c_12) and the covariance between the x_2 point and the y_2^o point (c_22).

The error variances and covariances for x_b are defined as
b_11 = 1/(N-1) sum_{i=1}^{N} (x_1i - x_1t)(x_1i - x_1t) = 1/(N-1) sum_{i=1}^{N} eps_1i eps_1i,
b_12 = 1/(N-1) sum_{i=1}^{N} (x_1i - x_1t)(x_2i - x_2t) = 1/(N-1) sum_{i=1}^{N} eps_1i eps_2i,
b_21 = 1/(N-1) sum_{i=1}^{N} (x_2i - x_2t)(x_1i - x_1t) = 1/(N-1) sum_{i=1}^{N} eps_2i eps_1i,
b_22 = 1/(N-1) sum_{i=1}^{N} (x_2i - x_2t)(x_2i - x_2t) = 1/(N-1) sum_{i=1}^{N} eps_2i eps_2i,

where the summations are over the number of samples, and eps_1i = x_1i - x_1t, eps_2i = x_2i - x_2t are the errors of x_b at points 1 and 2; x_t denotes the truth. The covariance d_12 is then
d_12 = alpha_1 c_12 + (1 - alpha_1) c_22
     = 1/(N-1) [ alpha_1 sum eps_1i (alpha_2 eps_1i + (1 - alpha_2) eps_2i)
                + (1 - alpha_1) sum eps_2i (alpha_2 eps_1i + (1 - alpha_2) eps_2i) ]
     = 1/(N-1) sum (alpha_1 eps_1i + (1 - alpha_1) eps_2i)(alpha_2 eps_1i + (1 - alpha_2) eps_2i)
     = 1/(N-1) sum (y_1i - y_1t)(y_2i - y_2t),

where y_1i = alpha_1 x_1i + (1 - alpha_1) x_2i and y_2i = alpha_2 x_1i + (1 - alpha_2) x_2i are the backgrounds interpolated to observation points 1 and 2. This is clearly the covariance between the x_b errors interpolated to observation points 1 and 2.

For this reason, in the ensemble Kalman filter procedure, where we need H B H^T from ensemble samples, we apply the observation operator H to x_b first, then directly calculate the background error covariance between observation points, which is much cheaper than calculating B first and then multiplying by H on the left and H^T on the right. When the observation operator H is linear, the two approaches are identical. When H is nonlinear, the results are not identical, but if the linearization approximation is reasonably good, the difference is small.

Here, we have two grid points and three observations, therefore n = 2 and p = 3; B is a 2x2 array while H B H^T is a 3x3 array. Suppose the background error covariance between the two grid points is zero, i.e., there is no correlation between the errors at different grid points; then
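The equivalence claimed here for a linear H can be checked directly with a small ensemble. A sketch (the ensemble values and interpolation weights are made up):

```python
import numpy as np

# For a linear H, applying H to the ensemble first and then taking sample
# covariances gives the same H B H^T as forming B and multiplying by H, H^T.
rng = np.random.default_rng(2)
N = 400                                       # ensemble size
X = rng.standard_normal((2, N))               # samples at the two grid points
alphas = np.array([1.0, 0.4, 0.0])            # weights for y1, y2, y3
H = np.column_stack([alphas, 1.0 - alphas])   # 3x2 linear interpolator

# Route 1: form the 2x2 B from anomalies, then map it to observation space.
Xp = X - X.mean(axis=1, keepdims=True)
B = Xp @ Xp.T / (N - 1)
HBHt_full = H @ B @ H.T

# Route 2: map the ensemble to observation space first, then take covariances.
HX = H @ X
HXp = HX - HX.mean(axis=1, keepdims=True)
HBHt_direct = HXp @ HXp.T / (N - 1)
# The two 3x3 matrices agree (exactly, since H is linear).
```

Route 2 never forms B explicitly, which is why it is the one used in ensemble Kalman filters when the state dimension is large.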
B = | b_11    0   |
    |  0    b_22  |,

B H^T = | alpha_1 b_11          alpha_2 b_11          alpha_3 b_11        |
        | (1 - alpha_1) b_22    (1 - alpha_2) b_22    (1 - alpha_3) b_22  |,

and

H B H^T = | alpha_1^2 b_11 + (1 - alpha_1)^2 b_22                  alpha_1 alpha_2 b_11 + (1 - alpha_1)(1 - alpha_2) b_22   alpha_1 alpha_3 b_11 + (1 - alpha_1)(1 - alpha_3) b_22 |
          | alpha_2 alpha_1 b_11 + (1 - alpha_2)(1 - alpha_1) b_22   alpha_2^2 b_11 + (1 - alpha_2)^2 b_22                  alpha_2 alpha_3 b_11 + (1 - alpha_2)(1 - alpha_3) b_22 |
          | alpha_3 alpha_1 b_11 + (1 - alpha_3)(1 - alpha_1) b_22   alpha_3 alpha_2 b_11 + (1 - alpha_3)(1 - alpha_2) b_22   alpha_3^2 b_11 + (1 - alpha_3)^2 b_22               |.

Consider the special case where y_1^o is located at the x_1 point, y_3^o is located at the x_2 point, and y_2^o is located between x_1 and x_2; then

alpha_1 = 1, 1 - alpha_1 = 0,   alpha_3 = 0, 1 - alpha_3 = 1,

and

B H^T = | b_11   alpha_2 b_11          0    |
        |  0     (1 - alpha_2) b_22   b_22  |.
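The structure of these matrices is easy to verify numerically. A sketch (b_11 = 2, b_22 = 3 and alpha_2 = 0.4 are illustrative values):

```python
import numpy as np

# Diagonal B with y1 at the x1 point (alpha1 = 1), y3 at the x2 point
# (alpha3 = 0), and y2 in between (alpha2 = 0.4 here).
b11, b22, a2 = 2.0, 3.0, 0.4
B = np.diag([b11, b22])
alphas = np.array([1.0, a2, 0.0])
H = np.column_stack([alphas, 1.0 - alphas])   # 3x2 interpolation operator

BHt = B @ H.T    # 2x3: covariances between grid points and obs points
HBHt = H @ BHt   # 3x3: background error covariances in observation space
# The (1,3) and (3,1) corners of HBHt vanish: the collocated observations
# y1 and y3 sit on grid points whose background errors are uncorrelated.
```

Evaluating `BHt` reproduces the matrix above row by row: (b_11, alpha_2 b_11, 0) and (0, (1 - alpha_2) b_22, b_22).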
H B H^T = | b_11              alpha_2 b_11                                  0                  |
          | alpha_2 b_11      alpha_2^2 b_11 + (1 - alpha_2)^2 b_22       (1 - alpha_2) b_22  |
          |  0                (1 - alpha_2) b_22                           b_22               |.

Assuming

R = | sigma_o^2     0          0       |
    |    0       sigma_o^2     0       |
    |    0          0       sigma_o^2  |,

derive the formulas for the analyses x_a1 and x_a2 (your homework; no need to turn it in, but make sure you do it!).

Consider three situations: (1) y_2^o is located at an equal distance from x_1 and x_2; (2) y_2^o is located at the same point as y_1^o and x_1; (3) y_2^o does not exist. Discuss your results.