Evaluation of Analysis by Cross-Validation, Part II: Diagnostic and Optimization of Analysis Error Covariance


Richard Ménard * and Martin Deshaies-Jacques

Air Quality Research Division, Environment and Climate Change Canada, 2121 Transcanada Highway, Dorval, QC H9P 1J3, Canada; martin.deshaies-jacques@canada.ca
* Correspondence: richard.menard@canada.ca

Received: 7 November 2017; Accepted: 13 February 2018; Published: 15 February 2018

Abstract: We present a general theory of the estimation of analysis error covariances based on cross-validation, as well as a geometric interpretation of the method. In particular, we use the variance of passive observation-minus-analysis residuals and show that the true analysis error variance can be estimated without relying on the optimality assumption. This approach is used to obtain near-optimal analyses, which are then used to evaluate the air quality analysis error using several different methods at active and passive observation sites. We compare the estimates according to the method of Hollingsworth-Lönnberg, the method of Desroziers et al., a new diagnostic we developed, and the perceived analysis error computed from the analysis scheme, and conclude that, as long as the analysis is near optimal, all estimates agree within a certain error margin.

Keywords: data assimilation; statistical diagnostics of analysis residuals; estimation of analysis error; air quality model diagnostics; Desroziers et al. method; cross-validation

1. Introduction

At Environment and Climate Change Canada (ECCC) we have been producing hourly surface pollutant analyses covering North America [1-3] using an optimum interpolation scheme that combines the operational air quality forecast model GEM-MACH output [4] with real-time hourly observations of O3, PM2.5, PM10, NO2, and SO2 from the AirNow gateway, with additional observations from Canada. These analyses are not used to initialize the air quality model, and we wish to evaluate them by cross-validation, that is, by leaving out a subset of observations from the analysis and using them for verification. Observations used to produce the analysis are called active observations, while those used for verification are called passive observations.

In the first part of this study, Ménard and Deshaies-Jacques [5], we examined different verification metrics using either active or passive observations. As we changed the ratio of observation error to background error variances γ = σ_o²/σ_b², while keeping the sum σ_o² + σ_b² equal to var(O − B), we found a minimum in var(O − A) in passive observation space. In this second part, we formalize this result, develop the principles of estimation of the analysis error covariance by cross-validation, and apply them to estimate and optimize the analysis error covariance of ECCC's surface analyses of O3 and PM2.5.

When we refer to the analysis error, or the analysis error covariance, it is important to distinguish the perceived analysis error from the true analysis error [6]. The perceived analysis error is the analysis error that results from the analysis algorithm itself, whereas the true analysis error is the difference between the analysis and the true state. Analysis schemes are usually derived from an optimization of some sort. In a variational analysis scheme, for example, the analysis is obtained by minimizing a cost function with some given or prescribed observation error and background error covariances, R̃ and B̃

respectively. In a linear unbiased analysis scheme, the gain matrix K̃ is obtained by minimum variance estimation, yielding an expression of the form K̃ = B̃H^T(HB̃H^T + R̃)^{-1}, where H is the observation operator. The perceived analysis error covariance is then derived as à = (I − K̃H)B̃. In deriving this expression for the perceived analysis error covariance we in fact assume that the given error covariances, R̃ and B̃, are the error covariances with respect to the true state, i.e., the true error covariances. We also assume that the observation operator is not an approximation with some error, but is the true, error-free observation operator. Of course, in real applications neither R̃ nor B̃ is a covariance measure with respect to the true state, but only a more or less accurate estimate of one.

Daley [6] argued that, in principle, for an arbitrary gain matrix K̃ the true analysis error covariance A can be computed as

A = (I − K̃H) B (I − K̃H)^T + K̃ R K̃^T,

provided that we know the true observation and background error covariances, R and B. This expression is a quadratic matrix equation, and has the property that the true analysis error variance tr(A) is minimum when K̃ = BH^T(HBH^T + R)^{-1} = K. In that sense, the analysis is then truly optimal. The optimal gain matrix K is called the Kalman gain. This illustrates that although an analysis is obtained through some minimization principle, the resulting analysis error is not necessarily the true analysis error.

One of the main sources of information used to obtain the true R and B is the var(O − B) statistic. However, it has always been argued that this is not possible without making some assumptions [7-9], the most useful one being that background errors are spatially correlated while observation errors are spatially uncorrelated, or at least correlated on a much shorter length-scale. Even under those assumptions, different estimation methods, such as the Hollingsworth-Lönnberg method [10] and maximum likelihood, give different error variances and different correlation lengths [11]. Other methods use var(O − B) for rescaling but assume that the observation error is known. The assumption that the observation error is known is also debated, as observation errors contain representativeness errors [12] that include observation operator errors. How to obtain an optimal analysis is thus unclear.

The evaluation of the true or perceived analysis error covariance using its own active observations is also a misleading problem, unless the analysis is already optimal. Hollingsworth and Lönnberg [13] first addressed this issue, noting that in the case of an optimal gain (i.e., an optimal analysis), the statistics of the observation-minus-analysis residuals O − Â are related to the analysis error by

E[(O − Â)(O − Â)^T] = R − H Â H^T,

where Â is the optimal analysis error covariance, and H and R are the observation operator and observation error covariance, respectively. The caret (ˆ) over A indicates that the analysis uses an optimal gain. In the context of spatially uncorrelated observation errors, the off-diagonal elements of E[(O − Â)(O − Â)^T] would then give the analysis error covariance in observation space. Hollingsworth and Lönnberg [13] argued that for most practical purposes the negative intercept of E[(O − Â)(O − Â)^T] at zero distance and the prescribed observation weight should be nearly equal, and thus could be used as an assessment of the optimality of an analysis. However, in cases where such agreement does not exist, an estimate of the actual analysis error is not possible.
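To make the distinction between perceived and true analysis error concrete, the following minimal NumPy sketch (our own illustration; the 1D grid, covariance values and function names are invented, not from the paper) computes the perceived covariance à = (I − K̃H)B̃ alongside Daley's true covariance for the same, possibly suboptimal, gain:

```python
import numpy as np

def perceived_analysis_cov(B_p, R_p, H):
    """Perceived analysis error covariance A~ = (I - K~H) B~, computed from
    the prescribed covariances B~ and R~ only."""
    K = B_p @ H.T @ np.linalg.inv(H @ B_p @ H.T + R_p)
    return (np.eye(B_p.shape[0]) - K @ H) @ B_p, K

def true_analysis_cov(K, B, R, H):
    """Daley's formula A = (I - KH) B (I - KH)^T + K R K^T, valid for an
    arbitrary gain K when B and R are the *true* error covariances."""
    ImKH = np.eye(B.shape[0]) - K @ H
    return ImKH @ B @ ImKH.T + K @ R @ K.T

# Invented 1D example: 3 grid points, observations at grid points 0 and 2.
dist = np.abs(np.subtract.outer(np.arange(3), np.arange(3)))
B_true = 4.0 * np.exp(-dist / 2.0)
R_true = 1.0 * np.eye(2)
H = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])

# Prescribed covariances with a deliberately wrong variance ratio gamma.
A_perceived, K_tilde = perceived_analysis_cov(0.5 * B_true, 2.0 * np.eye(2), H)
A_true = true_analysis_cov(K_tilde, B_true, R_true, H)
print(np.trace(A_perceived), np.trace(A_true))  # the two can differ widely
```

With a misspecified gain, tr(Ã) and tr(A) diverge, which is precisely why diagnostics that do not assume optimality are needed.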
Another method, proposed by Desroziers et al. [14], argues that the diagnostic E[(O − Â)(Â − B)^T] should be equal to the analysis error covariance in observation space, but, again, only if the gain is optimal and the innovation covariance consistency is respected [15].

The impasse in the estimation of the true analysis error seems to be tied to using active observations, i.e., using the same observations as those used to create the analysis. A robust approach that does not require an optimal analysis is to use observations whose errors are uncorrelated with the analysis error. For example, if we assume that observation errors are temporally (serially) uncorrelated, an estimate of the analysis error can be made with the help of a forecast model initialized by the analysis, by verifying the forecast against these observations. This is the essential assumption used traditionally in meteorological data assimilation to assess the analysis error indirectly, by comparing the resulting forecast with observations valid at the forecast time. As the forecast error grows with time, the observation-minus-forecast statistics can be used to assess whether one analysis is better than another. In a somewhat different method, but making the same assumption, Daley [6]

used the temporal (serial) correlation of innovations to diagnose the optimality of the gain matrix. This property was first established in the context of Kalman filter estimation theory by Kailath [16]. However, both the traditional meteorological forecast approach and the Daley method [6] are subject to limitations: they assume that the model forecast has no bias and that the analysis corrections are made correctly on all variables needed to initialize the model. In practice, improper initialization of unobserved meteorological variables gives rise to spin-up problems or imbalances. Furthermore, with the traditional meteorological approach, compensation due to model error can occur, so that an optimal analysis does not necessarily yield an optimal forecast [7].

An alternative approach, introduced by Marseille et al. [17], which we will follow here, is to use independent or passive observations to assess the analysis error. The essential assumption of this method is that the observations have spatially uncorrelated errors, so that the observations used for verification, i.e., the passive observations, have errors uncorrelated with the analysis. The advantage of this approach is that it does not involve any model to propagate the analysis information to a later time. Marseille et al. [17] then showed that by multiplying the Kalman gain by an appropriate scalar value, one can reduce the analysis error. In this paper, we go further by using the principles of error covariance estimation to obtain a near-optimal Kalman gain. In addition, we impose the innovation covariance consistency [15] and show that all diagnostics of the analysis error variance nearly agree with one another. These include those of Hollingsworth and Lönnberg [13], Desroziers et al. [14], and new diagnostics that we will introduce.

The paper is organized as follows. In Section 2 we first present the theory and diagnostics of the analysis error covariance in both passive and active observation spaces, as well as a geometrical representation. This leads us to a method to minimize the true analysis error variance. In Section 3, we present the experimental setup, describe how we obtain near-optimal analyses, present the results of several diagnostics in active and passive observation spaces, and compare them with the analysis error variance obtained from the optimum interpolation scheme itself. In Section 4, we discuss the statistical assumptions being used, how and whether they can be extended, and how this formalism can be used in other applications, such as the estimation of correlated observation errors with satellite observations. Finally, we draw some conclusions in Section 5.

2. Theoretical Framework

This section is composed of three main parts. In Sections 2.1 and 2.2, we first describe the diagnostics that yield the true error covariances using passive observations, whether the analysis is optimal or not. We then give a geometric interpretation in Section 2.3 that indicates the way to obtain an optimal analysis, that is, one that minimizes the true analysis error. Lastly, in Sections 2.4 and 2.5, we formulate diagnostics of the analysis error for optimal analyses.

2.1. Diagnostic of the Analysis Error Covariance in Passive Observation Space

Let us decompose the observation space into two disjoint sets: the active observation set, or training set, {y}, used to create the analysis, and the independent, or passive, observation set {y_c}, used to evaluate the analysis.
An analysis built from the prescribed background and observation error covariances, B̃ and R̃ respectively, is given by

x^a = x^f + B̃H^T (HB̃H^T + R̃)^{-1} d = x^f + K̃ d,   (1)

where K̃ is the gain matrix built from the prescribed error covariances, H is the observation operator for the active observation set, d is the active innovation vector, d = y − Hx^f = ε^o − Hε^f, and x^f is the

background state or model forecast, ε^o is the active observation error, and ε^f the background error. The observation-minus-analysis residual (O − A) for the active set is given by [14,15,18]

(O − A) = y − Hx^a = ε^o − Hε^a = d − HB̃H^T (HB̃H^T + R̃)^{-1} d = R̃(HB̃H^T + R̃)^{-1} d,   (2)

where ε^a is the analysis error. The analysis interpolated at the passive observation sites can be denoted by H_c x^a, where H_c is the observation operator at the passive observation sites. The observation-minus-analysis residual at the passive observation sites, (O − A)_c, is then given by

(O − A)_c = y_c − H_c x^a = ε^o_c − H_c ε^a = d_c − H_c B̃H^T (HB̃H^T + R̃)^{-1} d,   (3)

where d_c = y_c − H_c x^f = ε^o_c − H_c ε^f is the innovation at the passive observation sites. Note that the formalism introduced here is general and can be used with any set of independent observations, such as different instruments or observation networks, as long as the H_c operator is properly defined. Consequently, for generality, we distinguish the passive or independent observation error, ε^o_c, from the active observation error ε^o.

There are two important statistical assumptions from which we derive the cross-validation diagnostics. Assuming that observation errors are spatially uncorrelated, it follows that

E[ε^o (ε^o_c)^T] = 0,   (4)

where E[·] is the mathematical expectation, representing the mean over an ensemble of realizations. It has been argued by Marseille et al. [17] that representativeness error can violate this assumption for a close pair of active-passive observations, but we will neglect this effect. Also, assuming that observation errors are uncorrelated with the background error, we have

E[ε^o (Hε^f)^T] = 0,   E[ε^o_c (H_c ε^f)^T] = 0.   (5)

We come now to the most important property: since the analysis is a linear combination of the active observations and the background state, the analysis error is then uncorrelated with the passive observation errors,

E[(H_c ε^a)(ε^o_c)^T] = 0,   (6)

and thus we get the following cross-validation diagnostic in passive observation space,

E[(O − A)_c (O − A)_c^T] = R_c + H_c A H_c^T,   (7)

similarly to Marseille et al. [17]. A very important point to note is that A is the true analysis error covariance; the diagnostic does not assume that the gain is optimal. The matrices in Equation (7) have the dimension of the passive observation space, and R_c is the observation error covariance matrix of the passive or independent observations.
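For illustration, Equation (7) translates directly into a sample estimator. The following sketch (our own; the array shapes, function name and synthetic residuals are hypothetical) recovers H_c A H_c^T from passive residual statistics without any optimality assumption:

```python
import numpy as np

def passive_diagnostic(omA_c, R_c):
    """Sample version of Eq. (7): E[(O-A)_c (O-A)_c^T] = R_c + H_c A H_c^T.
    omA_c: (n_samples, n_passive) array of passive observation-minus-analysis
    residuals; R_c: (n_passive, n_passive) passive observation error covariance.
    Returns the implied *true* analysis error covariance H_c A H_c^T."""
    resid_cov = np.cov(omA_c, rowvar=False)  # E[(O-A)_c (O-A)_c^T]
    return resid_cov - R_c                   # optimality of the gain not assumed

# Hypothetical usage: 60 daily residuals at 3 passive sites, sigma_oc^2 = 1.
rng = np.random.default_rng(0)
omA_c = rng.normal(size=(60, 3))             # synthetic stand-in data
HcAHc = passive_diagnostic(omA_c, np.eye(3))
print(np.trace(HcAHc) / 3)  # mean analysis error variance at passive sites
```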

2.2. A Complete Set of Diagnostics of Error Covariances in Passive Observation Space

It is also possible to define a set of diagnostics that would determine, in principle, the true error covariances R, B and A. From Equations (4) and (5) we get another cross-validation diagnostic,

E[(O − B)_c (O − B)^T] = H_c B H^T.   (8)

This diagnostic is related to the Hollingsworth-Lönnberg [10] estimation of the spatially correlated part of the innovation, and in practice it can be dominated by sampling error. Note that it is not a square matrix, and an estimation of the parameters of B may not be trivial. We can also obtain the innovation covariance matrix in passive observation space (identical in fact to the one in active observation space) as

E[(O − B)_c (O − B)_c^T] = R_c + H_c B H_c^T.   (9)

The system of Equations (7)-(9) gives a complete set of equations to determine the true R, B and A at the passive observation sites, provided that by interpolation/extrapolation we can obtain H_c B H_c^T from H_c B H^T. Nevertheless, and for the sake of completeness, we also investigated the meaning of the statistic E[(O − A)_c (O − B)^T] and came to the conclusion that it can be interpreted as the misfit between the Desroziers et al. estimate of B [14] and the true value B. We recall that the first iterate of the Desroziers et al. estimate for B in active observation space is given by H B^D H^T = E[(Hx^a − Hx^f)(y − Hx^f)^T] [15], where we use the superscript D to indicate the Desroziers et al. first-iterate estimate. We can actually generalize this diagnostic to be a cross-covariance between state space and active observation space as

B^D H^T = E[(x^a − x^f)(y − Hx^f)^T].   (10)

By applying H_c to Equation (10) we can then introduce a generalized Desroziers et al. estimate of the background error covariance B between the passive and active observation spaces as

E[(A − B)_c (O − B)^T] = H_c B^D H^T.   (11)

Since (O − A)_c = (O − B)_c − (A − B)_c, we get with Equations (8) and (11),

E[(O − A)_c (O − B)^T] = H_c (B − B^D) H^T,   (12)

that is, the difference between the true B and the Desroziers et al. first estimate B^D in the cross passive-active observation spaces. Similarly to Equation (8), this diagnostic requires spatial interpolation of error covariances from the active sites to the passive sites of basically noisy statistics. The estimation of B from this diagnostic is further complicated by the fact that what is being interpolated, that is B − B^D, may not even be positive definite, although the augmented matrix

cov( [ (O − B) ; (O − A)_c ] ) = [ R + HBH^T ,  H(B − B^D)H_c^T ;  H_c(B − B^D)H^T ,  R_c + H_c A H_c^T ]   (13)

is positive definite. Except for the diagonal of Equation (13), we have not attempted in this study to conduct this complete estimation of R, B and A, but rather focused on getting a reliable estimate of the analysis error covariance A.

2.3. Geometrical Interpretation

A geometrical illustration of some of the relationships obtained above can be made by using a Hilbert space representation of random variables in observation space. A 2D representation for the analysis of a scalar quantity was used in Desroziers et al. [14] to illustrate their a posteriori diagnostics. We will generalize this approach to include passive observations by considering a 3D representation. As in Desroziers et al. [14], let us consider the analysis of a scalar quantity. Several variables are to be considered in this observation space: y^o, the active observation (or measurement) of the scalar quantity; y^b, the background (or prior) value equivalent in observation space (i.e., y^b = Hx^b); y^a, the analysis in observation space (i.e., y^a = Hx^a); and, for verification, y^c, an independent observation (or passive

observation) that is not used to compute the analysis. Each of these quantities is a random variable, as they each contain random errors, and any linear combination of random variables in observation space also belongs to observation space. For example, y^o − y^b is the innovation (commonly denoted by O-B), which belongs to observation space; y^a − y^b is the analysis increment in observation space (commonly denoted by A-B); and y^o − y^a is the analysis residual in observation space (commonly denoted by O-A). We can also define an inner product of any two random variables y_1, y_2 in observation space as

⟨y_1, y_2⟩ := E[(y_1 − E(y_1))(y_2 − E(y_2))].   (14)

The squared norm then represents the variance, so the inner product has the following geometric interpretation:

‖y‖² := ⟨y, y⟩ = σ_y²,   (15)

⟨y_1, y_2⟩ = ‖y_1‖ ‖y_2‖ cos θ,   (16)

where cos θ is the correlation coefficient. Uncorrelated random variables are thus statistically orthogonal. With this inner product, the observation space forms a Hilbert space of random variables.

Figure 1 illustrates the statistical relationship in observation space between: the active observation y^o (illustrated as O in the figure), the prior or background y^b (i.e., B), the analysis y^a (i.e., A), and the independent observation y^c (i.e., O_c). The origin T corresponds to the truth of the scalar quantity, and also corresponds to the zero of the central moment of each random variable, e.g., y − E[y], since each variable is assumed to be unbiased. We also assume that the background, active and passive observation errors are uncorrelated with one another, so the three axes, ε^o for the active observation error, ε^b for the background error, and ε^o_c for the passive observation error, are orthogonal. The plane defined by the ε^o and ε^b axes is the space where the analysis takes place, and is called the analysis plane. However, since we define the analysis to be linear and unbiased, only linear combinations of the form y^a = k y^o + (1 − k) y^b, where k is a constant, are allowed. The analysis A then lies on the line (B, O).

The thick lines in Figure 1 represent the norm of the associated error. For example, the thick line along the ε^o axis depicts the (active) observation standard deviation σ_o, and similarly for the other axes and the other random variables. Since the active observation error is uncorrelated with the background error, the triangle OTB is a right triangle, and by the Pythagoras theorem we have ‖y^o − y^b‖² := ⟨(O − B), (O − B)⟩ = σ_o² + σ_b². This is the usual statement that the innovation variance is the sum of the background and observation error variances. The analysis is optimum when the analysis error ‖ε^a‖² = σ_a² is minimum, in which case the line (T, A) is perpendicular to the line (O, B).

Now let us consider the passive observation O_c. The passive observation error is perpendicular to the analysis plane, thus the triangle O_c T A is a right triangle, and

‖y^c − y^a‖² := ⟨(O − A)_c, (O − A)_c⟩ = σ_c² + σ_a²,   (17)

where σ_c² is the passive observation error variance. The most important fact to stress here is that the orthogonality expressed in Equation (17) holds whether or not the analysis is optimal. Furthermore, as the distance ‖y^c − y^a‖² varies with the position of A along the line (O, B), it reaches a minimum value when σ_a² is minimum, that is, when the analysis is optimal. We thus also argue from this representation that there is always a minimum, and the minimum is unique. Finally, we note that T B O_c is also a right triangle, so that ‖y^c − y^b‖² := ⟨(O − B)_c, (O − B)_c⟩ = σ_c² + σ_b², which is the scalar version of Equation (9).
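The scalar geometry can be checked numerically. In this Monte Carlo sketch (illustrative values, ours, not from the paper), the identity of Equation (17) holds for every weight k, and var(O − A)_c is smallest at the optimal weight:

```python
import numpy as np

# Check of Eq. (17) for a scalar analysis y_a = k*y_o + (1-k)*y_b (truth = 0):
# var(y_c - y_a) = sigma_c^2 + sigma_a^2 for *any* weight k, and is minimized
# at the optimal weight k = sigma_b^2 / (sigma_b^2 + sigma_o^2).
rng = np.random.default_rng(1)
n = 200_000
sig_o2, sig_b2, sig_c2 = 1.0, 2.0, 1.5        # invented error variances
eps_o = rng.normal(scale=np.sqrt(sig_o2), size=n)   # active obs error
eps_b = rng.normal(scale=np.sqrt(sig_b2), size=n)   # background error
eps_c = rng.normal(scale=np.sqrt(sig_c2), size=n)   # passive obs error

for k in (0.3, sig_b2 / (sig_b2 + sig_o2), 0.9):
    eps_a = k * eps_o + (1 - k) * eps_b             # analysis error
    lhs = np.var(eps_c - eps_a)                     # var(O - A)_c
    rhs = sig_c2 + np.var(eps_a)                    # sigma_c^2 + sigma_a^2
    print(f"k={k:.3f}  var(O-A)_c={lhs:.3f}  sig_c^2+sig_a^2={rhs:.3f}")
# The middle k (the optimal weight) yields the smallest var(O-A)_c.
```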

Figure 1. Hilbert space representation of a scalar analysis and of the cross-validation problem. The arrows indicate the directions of variability of the random variables, and the plane defined by the background and observation errors ε^b, ε^o defines the analysis plane. The thick lines represent the norm associated with the different random variables. T indicates the truth, O the active observation, B the background, A the analysis and O_c the passive observation.

To extend this formalism to random vectors requires defining a proper matricial inner product, as briefly described in Caines [19]. It is important here to distinguish the stochastic metric space from the observation vector space. In the stochastic metric space a scalar need not be a number, but only a non-random quantity invariant with respect to E[·]. Hence, we can define a Hilbert space with the following (stochastic scalar) matricial inner product: ⟨y, w⟩ = E[(y − E(y))(w − E(w))^T] = cov(y, w). The matrix nature of cov(y, w) pertains to the observation vector space, but it remains a scalar with respect to the stochastic Hilbert space herein defined. In order to obtain a true scalar (in ℝ), one would need to define a metric on the observation space matrices as well, such as the trace. Also, to be able to compare active and passive observations, projectors need to be introduced on the observation space, implying yet another metric structure on the observation space. We do not carry out this formalism here, as it would represent a rather lengthy development that would distract us from the main purpose of this paper; this will be considered in a future manuscript. Finally, we remark that a Hilbert space representation of random variables in infinite dimensional space (i.e., continuous space) can also be defined; see Appendix 1 of Cohn [20].

2.4. Error Covariance Diagnostics in Active Observation Space for Optimal Analysis
An analysis is optimal if the analysis error E[(ε^a)^T ε^a] is minimum. This implies that the gain matrix using the prescribed error covariances, K̃ in Equation (1), must be identical to the gain using the true error covariances, i.e., K̃ = BH^T(HBH^T + R)^{-1} [14,15]. It is important to mention that the necessary and sufficient conditions to obtain the true error covariances HBH^T and R in observation space are: (1) the Kalman gain condition, HK̃ = HK, and (2) the innovation covariance consistency, E[(O − B)(O − B)^T] = HBH^T + R. For a proof, see the theorem on error covariance estimates in Ménard [15].

From the optimality of the analysis (or Kalman gain) alone, we derive that E[(Â − T)(O − B)^T] = 0 or E[(Â − T)(O − Â)^T] = 0. Indeed, from Equation (2) we get (O − Â) = R(HBH^T + R)^{-1} d, and for the analysis error in observation space we get

(Â − T) = R(HBH^T + R)^{-1} Hε^f + HBH^T (HBH^T + R)^{-1} ε^o,

from which we derive the expectations above. Using the geometrical representation of Section 2.3, the distance between A and T is minimum when the triangles TÂO and TÂB (Figure 1) are right triangles. We should also note that for the scalar problem the Kalman gain depends

only on the ratio of the observation to background error variances, and thus the scalar Kalman gain is optimal if the ratio of the prescribed error variances is equal to the ratio of the true error variances.

If, in addition to the optimality of the analysis or Kalman gain, we add the innovation covariance consistency, then we get three different statistical diagnostics of the (optimal) analysis error covariance. Hollingsworth and Lönnberg [13] were the first to introduce a statistical diagnostic of the analysis error in active observation space:

E[(O − Â)(O − Â)^T] = R − H Â_HL H^T.   (18)

Here we use the subscript HL to indicate that this is the Hollingsworth-Lönnberg estimate. Equation (18) can be obtained from the covariance of (O − Â) and the fact that, for an optimal gain matrix, HÂH^T = R − R(HBH^T + R)^{-1}R, which derives from the usual formula Â = B − BH^T(HBH^T + R)^{-1}HB. Geometrically, it derives from the fact that the triangle TÂO is a right triangle, and from the innovation covariance consistency, which implies that the triangle TOB is a right triangle. Using data to construct E[(O − Â)(O − Â)^T], an estimated analysis error covariance obtained from Equation (18) is symmetric but could be non-positive definite, as it is obtained by subtracting two positive definite matrices. The effect of misspecification of the prescribed error covariances resulting from a lack of innovation covariance consistency will be discussed in the results Section 3 and in Appendix B.

Inspired by the geometrical interpretation that TÂB is also a right triangle, we derived the following diagnostic,

E[(Â − B)(Â − B)^T] = HBH^T − H Â_MDJ H^T = H(B − Â_MDJ)H^T,   (19)

where MDJ stands for Ménard-Deshaies-Jacques. This relationship is obtained by using the expression (Â − B) = HBH^T(HBH^T + R)^{-1} d, the innovation covariance consistency, and the formula for the optimal analysis error covariance Â = B − BH^T(HBH^T + R)^{-1}HB. As for the HL diagnostic, the estimated error covariance obtained from this diagnostic is symmetric by construction but may not be positive definite. Another way of looking at Equation (19) is that it expresses, in observation space, the error reduction due to the use of observations.

Another diagnostic of the analysis error covariance was proposed by Desroziers et al. [14]. By combining (O − Â) and (Â − B) we get

E[(O − Â)(Â − B)^T] = H Â_D H^T,   (20)

where the subscript D denotes the Desroziers et al. estimate. By construction, this estimated analysis error covariance is not necessarily symmetric. A geometrical derivation is provided in Appendix A. We also provide in Appendix B a sensitivity analysis on the departure from innovation covariance consistency for each diagnostic in both active and passive observation spaces.

2.5. Error Covariance Diagnostics in Passive Observation Space for Optimal Analysis

We can also derive optimal analysis diagnostics in passive observation space. Considering the 3D geometric interpretation, and in particular the tetrahedron (O_c, T, A, B), we notice that since TÂB is a right triangle, so is O_c ÂB, which is the projection of the triangle TÂB on the plane passing through O_c, O and B. We thus have E[(Â − B)_c (Â − B)_c^T] + E[(O − Â)_c (O − Â)_c^T] = E[(O − B)_c (O − B)_c^T]. Combining this result with Equation (7) and using E[(O − B)_c (O − B)_c^T] = H_c B H_c^T + R_c, we then get

E[(Â − B)_c (Â − B)_c^T] = H_c B H_c^T − H_c Â_MDJ H_c^T.   (21)

Note that our analysis diagnostic is the only diagnostic that is valid in both active and passive observation spaces, i.e., Equation (21) is similar to Equation (19).
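In sample form, the active-space diagnostics of Equations (18)-(20) amount to simple covariances of the residual time series. A sketch follows (our array names and shapes; the residuals are assumed unbiased, which the paper arranges by construction):

```python
import numpy as np

def analysis_error_estimates(omB, omA, amB, R):
    """Sample versions of Eqs. (18)-(20) in active observation space.
    omB, omA, amB: (n_samples, n_obs) time series of O-B, O-A and A-B
    residuals (assumed unbiased); R: prescribed observation error covariance.
    Each return value is an estimate of H A^ H^T."""
    n = omB.shape[0]
    HL = R - (omA.T @ omA) / n         # Eq. (18), Hollingsworth-Lonnberg
    HBHt = (omB.T @ omB) / n - R       # from innovation covariance consistency
    MDJ = HBHt - (amB.T @ amB) / n     # Eq. (19), Menard-Deshaies-Jacques
    D = (omA.T @ amB) / n              # Eq. (20), Desroziers et al.; may be asymmetric
    return HL, MDJ, D
```

When the analysis is near optimal and the innovation covariance consistency holds, the three returned matrices should nearly coincide, which is the agreement verified in Section 3.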

The other, less direct, diagnostic for an optimal analysis is simply based on Equation (7), that is,

argmin_{γ, L_c} {E[(O − A)_c (O − A)_c^T]} = R_c + H_c Â(γ, L_c) H_c^T.   (22)

These five diagnostics will be used later in the results section.

3. Results with Near Optimal Analyses

3.1. Experimental Setup

We give here only a short summary of the experimental setup used in this study; more details can be found in the Part I paper (Ménard and Deshaies-Jacques [5]). A series of hourly analyses of O3 and PM2.5 at 21 UTC for a period of 60 days (14 June to 12 August 2014) were performed using an optimum interpolation scheme combining the operational air quality model GEM-MACH forecast and real-time AirNow observations (see Section 2 of [5] for further details). The analyses are made off-line, so they are not used to initialize the model. As input error covariances, we use uniform observation and background error variances, with R̃ = σ_o² I and B̃ = σ_b² C, where C is a homogeneous isotropic error correlation based on a second-order autoregressive model. The correlation length is estimated by a maximum likelihood method using, at first, error variances obtained from a local Hollingsworth-Lönnberg fit [11] (and only for the purpose of obtaining a first estimate of the correlation length).

We then conduct a series of analyses by changing the error variance ratio γ = σ_o²/σ_b², while at the same time respecting the innovation variance consistency condition, σ_o² + σ_b² = var(O − B). This basically amounts to searching for the minimum of the trace (tr) of Equation (7) while the trace of the innovation covariance consistency, tr[R̃ + HB̃H^T] = tr{E[(O − B)(O − B)^T]}, is respected. First, the observations are separated into 3 sets of equal number, distributed randomly in space. By leaving out one set of observations for verification and constructing analyses with the remaining 2 sets, we construct a cross-validation setup from which we can evaluate the diagnostic var(O − A)_c = tr{E[(O − A)_c (O − A)_c^T]} in passive observation space. This constitutes our first-guess experiment, which we will refer to as iter 0. No observation or model bias correction was applied, nor were the means of the innovations at the stations removed prior to performing the analysis. The variance statistics are first computed at each station using the 60 daily members, and then averaged over the domain to give the equivalent of tr{E[(O − A)_c (O − A)_c^T]}. We repeat the procedure by exhausting all possible permutations, in this case 3, and the mean value statistics for the three verifying subsets are then averaged. More details can be found in Section 2 and the beginning of Section 3 of Part I [5].

The red curve in Figure 2 illustrates how this diagnostic varies with γ and exhibits a minimum. This minimum can easily be understood by referring to Figure 1: as the analysis point A changes position along the line (O, B), the distance O_c A reaches a minimum, and this is what we observe in Figure 2. In our next step, iter 1, we first re-estimate the correlation length by applying a maximum likelihood method, as in Ménard [15], using the iter 0 error variances that are consistent with the optimal ratio γ̂ obtained in iter 0 and the innovation variance consistency. Then, with this new correlation length, we estimate a new optimal ratio γ̂ (iter 1), which turns out to be very close to the value obtained in iter 0. We recall that we use uniform error variances, both to keep things simple and because the optimal ratio is obtained by minimizing a domain-averaged variance var(O − A)_c. A summary of the error covariance parameters obtained for iter 0 and iter 1 is presented in Table 1.
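A pseudo-implementation of the iter 0 scan may clarify the procedure; in this sketch, oi_analysis stands in for the optimum interpolation scheme of the paper and is assumed, not shown, and all names are ours:

```python
import numpy as np

def scan_gamma(y, H, yc, Hc, xf, C, var_omb, gammas, oi_analysis):
    """Scan gamma = sigma_o^2/sigma_b^2 subject to the innovation variance
    consistency sigma_o^2 + sigma_b^2 = var(O-B), and return the gamma that
    minimizes var(O-A) over the passive (held-out) sites, per Eq. (22)."""
    scores = []
    for g in gammas:
        sig_b2 = var_omb / (1.0 + g)      # so that sig_o2 + sig_b2 = var(O-B)
        sig_o2 = var_omb - sig_b2         # and sig_o2 = g * sig_b2
        xa = oi_analysis(xf, y, H, B=sig_b2 * C, R=sig_o2 * np.eye(len(y)))
        scores.append(np.mean((yc - Hc @ xa) ** 2))  # var(O-A)_c, trace of Eq. (7)
    return gammas[int(np.argmin(scores))], scores
```

In the actual experiments this scan is repeated over all three passive/active permutations and the scores averaged before locating the minimum.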

Figure 2. Red line: variance of observation-minus-analysis of O3 in passive observation space. Blue line: variance of observation-minus-model.

Table 1. Input error statistics for the first experiment and for the optimized variance ratio experiment.

Experiment      L_c (km)   var(O − B)   γ̂ = σ̂_o²/σ̂_b²   σ̂_o²   σ̂_b²   χ²/N_s
O3 iter 0
O3 iter 1
PM2.5 iter 0
PM2.5 iter 1

We have also added to Table 1 the χ²/N_s diagnostic values [15], where χ²/N_s is the 60-day mean value of χ²_k/N_s(k), N_s(k) being the total number of observations available at time k, and

χ²_k = d_k^T (HB̃H^T + R̃)^{-1} d_k.

The χ²/N_s diagnostic [15] is closely related to the J_min diagnostic used in variational methods, and should have a value of 1 when the innovations are consistent with the prescribed error statistics. When there is innovation covariance consistency, then χ²/N_s = 1, but the reverse is not true. We observe that there was a significant improvement from iter 0 to iter 1 in terms of χ²/N_s, but it is still not equal to one. We thus refer to the analysis of iter 1 as near optimal.

The repeated application of this sequence of estimation methods, i.e., find the correlation length by maximum likelihood and then optimize the variance ratio, converges very fast, and in practice there is no need to go beyond iter 1. Figure 3 displays iterates 0 to 4 of our estimation procedure for O3; with one iteration update we nearly converge. A similar procedure was used in Ménard [15], where the variance and the correlation length (estimated by maximum likelihood) were estimated in sequence, which taught us that a slow and fictitious drift in the estimated variances and correlation length can occur when the correlation model is not the true correlation. In view of similar considerations that may occur here, we do not extend our iteration procedure beyond the first iterate.
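For reference, the χ²/N_s diagnostic quoted in Table 1 can be computed as in this sketch (our notation; a fixed observation network is assumed for simplicity, whereas in the experiments N_s varies with time):

```python
import numpy as np

def chi2_per_obs(innovations, HBHt, R):
    """Mean of chi2_k / N_s(k) with chi2_k = d_k^T (H B~ H^T + R~)^{-1} d_k.
    innovations: (n_times, n_obs) array of innovation vectors d_k;
    HBHt, R: prescribed covariances in active observation space.
    The result should be ~1 when innovations match the prescribed statistics."""
    S_inv = np.linalg.inv(HBHt + R)
    vals = [d @ S_inv @ d / d.size for d in innovations]
    return float(np.mean(vals))
```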
3.2. Statistical Diagnostics of the Analysis Error Variance

For each of these experiments, the statistics related to the diagnostics of the analysis error variance discussed in Section 2 are computed, and the results are presented in Table 2 for the verification made against active observations, and in Table 3 for the verification made against passive observations. If the analysis were truly optimal, the different diagnostics would all agree.

Figure 3. Optimal estimates of σ_o², σ_b², and maximum likelihood estimate of the correlation length L_c for the first four iterates. Blue: the optimal background error variance; green: the optimal observation error variance; red: the correlation length (in km, with labels on the right side of the figure).

Table 2. Analysis statistics against the active observations.

Experiment      var(Â − B)   tr(HÂ_MDJ H^T)/N_s   tr(HÂ_D H^T)/N_s   var(O − Â)   tr(HÂ_HL H^T)/N_s
O3 iter 0
O3 iter 1
PM2.5 iter 0
PM2.5 iter 1

In the second, third and last columns of Table 2 are tabulated the estimates of the analysis error variance at the active location sites, i.e., tr(HÂH^T)/N_s, obtained by three different methods. The second column is the estimate given by our method, σ_b² − var(Â − B) = tr(HÂ_MDJ H^T)/N_s. The third column is the Desroziers et al. estimate of the analysis error [14], Equation (20), and the last column is the estimate using the method proposed by Hollingsworth and Lönnberg [13], Equation (18). We note that the analysis error variance estimates provided by the first two methods are fairly consistent for an updated correlation length estimate, i.e., iter 1 (but not iter 0). We also note that χ²/N_s is closer to one for iter 1. These two facts indicate that the updated correlation length (iter 1) with uniform error variances is closer to innovation covariance consistency. The Hollingsworth and Lönnberg [13] method, however, is very sensitive and negatively biased in the absence of innovation covariance consistency.
The estimates of the analysis error variance at the passive observation locations, i.e., tr(H_c A H_c^T)/N_s, provided by two different methods, are given by Equation (21) in column 3 and by Equation (22) in column 5 of Table 3. As for the estimates at the active locations (Table 2), there is a general agreement on the analysis error estimates with the updated correlation length (iter 1), although this distinction is not that clear for PM2.5.

Table 3. Analysis statistics against the passive observations.

Experiment      var[(Â − B)_c]   tr(H_c Â_MDJ H_c^T)/N_s   var[(O − Â)_c]   var[(O − Â)_c] − σ_oc²
O3 iter 0
O3 iter 1
PM2.5 iter 0
PM2.5 iter 1

We note also that the analysis error variance at the active sites is smaller than the analysis error variance at the passive observation sites. This reflects in particular the fact that, since the passive

observations are away from the active observation sites, the reduction of variance at the passive observation sites is smaller than at the active observation sites.

3.3. Comparison with the Perceived Analysis Error Variance

We computed the analysis error covariance à resulting from the analysis scheme, the so-called perceived analysis error covariance [6], using the expression

à = B̃ − B̃H^T (HB̃H^T + R̃)^{-1} HB̃ = B̃ − GG^T.   (23)

Contrary to the statistical diagnostics above, the perceived analysis error variance is obtained at each model grid point. We then compared the perceived analysis error variance at the active observation sites with the estimated active analysis error variance obtained from the previous diagnostics. Since we showed that the different diagnostics agree in both active and passive observation spaces when the analysis is optimal, this would indicate that if the perceived analysis error variance agrees with the analysis error variance of the diagnostics, then the whole map of analysis error variance is credible.

In order to calculate the perceived analysis error covariance, Equation (23), we first perform a Cholesky decomposition of HB̃H^T + R̃ = LL^T, where L is a lower triangular matrix. Then, with a forward substitution, we obtain L^{-1}, from which we compute G = B̃H^T L^{-T}. The perceived analysis error variance for the ozone optimal analysis (i.e., O3 iter 1) is displayed in Figure 4 (a similar figure for PM2.5 is given in the supplementary material). We note that although the input statistics used for the analysis are uniform (i.e., uniform background and observation error variances, and a homogeneous correlation model), the computed analysis error variance at the active observation locations displays large variations, which is attributed to the non-uniform spatial distribution of the active observations.

In Figure 5 we display a histogram of those variances for the ozone optimal analysis O3 iter 1 (panel b) and for the first experiment O3 iter 0 (panel a) without optimization (a similar figure for PM2.5 is given in the supplementary material). Note that the median or mean values of the variances are significantly different between the optimal and non-optimal analysis cases. Although the observation and background errors are uniform in both the optimal and non-optimal analyses, in the optimal analysis case the perceived analysis uncertainty at the observation locations is distributed more equally across all values, indicating that there is a better propagation of information, as measured by the analysis error variance, across the different observation sites. The exception is the isolated observation sites, for which we observe a maximum on the high end of the histograms. At those sites the analysis error variance is simply obtained by the scalar equation 1/σ_a² = 1/σ_o² + 1/σ_b². For O3 iter 1 the scalar analysis error variance gives 16.2, and for O3 iter 0 we get 15.0, thus explaining the secondary maxima on the high end of the histograms.

Figure 4. Analysis error variance for the ozone optimal analysis case O3 iter 1. (a) Analysis error variance on the model grid; (b) at the active observation sites. Note that the color bars of the left and right panels are different; the maximum of the color bar for the left panel corresponds to σ_o² + σ_b².
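The Cholesky computation of Equation (23) described above can be sketched as follows (SciPy notation; the function and variable names are ours):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def perceived_variance(B, H, R):
    """Perceived analysis error variance, Eq. (23): diag(A~) = diag(B~ - G G^T),
    with H B~ H^T + R~ = L L^T and G = B~ H^T L^{-T}.
    B: (n_grid, n_grid) prescribed background covariance; H: (n_obs, n_grid)
    observation operator; R: (n_obs, n_obs) prescribed observation covariance."""
    L = cholesky(H @ B @ H.T + R, lower=True)
    # Forward substitution: solve L X = H B~, so that G = X^T = B~ H^T L^{-T}.
    G = solve_triangular(L, H @ B, lower=True).T
    return np.diag(B) - np.sum(G * G, axis=1)  # grid-point perceived variance
```

Only the diagonal of GG^T is formed, which keeps the cost linear in the number of grid points rather than quadratic.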

Figure 5. Distribution (histogram) of the ozone analysis error variance at the active observation locations. (a) First analysis experiment O3 iter 0 (no optimization), left panel; (b) optimal analysis case O3 iter 1. Note that the scales are different between the left and right panels.

The mean perceived analysis error variance for all experiments is presented in Table 4. Comparing these values with the estimated values of the analysis error variance based on the diagnostics in Table 2, we note that for both optimal experiments, O3 iter 1 and PM2.5 iter 1, the perceived analysis error variance roughly agrees with all the analysis error variances estimated with the diagnostics (Table 2).

Table 4. Perceived analysis error variance. Mean over the active observation sites.

Experiment      Perceived tr(HÃ_P H^T)/N_s
O3 iter 0
O3 iter 1
PM2.5 iter 0
PM2.5 iter 1

However, for the non-optimal analyses, O3 iter 0 and PM2.5 iter 0, there is a general disagreement between all the estimated values. Looking more closely, however, we note that the agreement in the optimal case is not perfect. The perceived analysis error variance is about 20% lower than the best estimates, tr(HÂ_MDJ H^T)/N_s and tr(HÂ_D H^T)/N_s. The χ²/N_s values in the optimal cases are slightly above one, thus indicating that the innovation covariance consistency is not exact and some further tuning of the error statistics could be done. More on that matter will be presented in Section 4.5.

4. Discussion on Statistical Assumptions and Practical Applications

4.1. Representativeness Error with In Situ Observations
The statistical diagnostics presented in Section 2 derive from the assumption that observation errors are horizontally uncorrelated and uncorrelated with the background error. Although this assumption is never entirely satisfied in reality, there are ways to work around it. In the case of in situ observations, and assuming that any systematic error has been removed, random errors are still present due to the difference between the observation and the model's equivalent of the observation, called the representativeness error (see Janjic et al. [12] for a review). Representativeness error is due to unresolved scales and processes in the model and to interpolation or forward observation model errors. These errors are typically roughly at the scale of the model grid [21,22], so typically a few tens of kilometers for air quality models. This should not be confused with the representativeness of an observation, where, for example, remote stations are representative of a large area (e.g., several

hundreds of kilometers), whereas urban and suburban stations are at the scale of human activity in cities, traffic, industries, etc., and are, depending on the chemical species, representative of a few kilometers or less. The representativeness error of in situ measurements can be discarded altogether by simply filtering out any pair of observations that are within a range of a few model grid sizes, both in the assimilation and the estimation of error statistics [1] and in the pairs of passive-active observations for cross-validation [17]. Once this filtering is done, the assumption that observation errors are spatially uncorrelated and uncorrelated with the background error then applies.

4.2. Correlated Observation-Background Errors

In any case, it is interesting to show how the different diagnostics introduced in Section 2 depend on the statistical assumptions about the observation error. One way to get an understanding of the effect of these assumptions is to look at the problem from a geometrical point of view, using the representation introduced in Section 2.3. Note that the same results can be obtained analytically, but the geometrical interpretation gives a simple and appealing way of looking at the problem.

Let us consider the effect on the analysis of an observation error correlated with the background error. The case where the observation error is uncorrelated with the background error is represented in panel (a) of Figure 6, and the correlated case in panel (b).

Figure 6. Geometrical representation of the analysis. (a) For observation errors uncorrelated with the background error; (b) with correlated errors. T indicates the truth, O the observation, B the background and Â the optimal analysis.

The observation and background error variances are kept unchanged, with the same (O, T) length and (B, T) length in both panels. In the case of correlated errors, the angle BTO is no longer a right angle. Yet, it is still possible to obtain an optimal analysis, Â, as a linear combination of the observation and the background on the line (O, B), for which the distance from Â to T (i.e., the analysis error variance) is minimum. In this case, (Â, T) ⊥ (O, B).
Note that for strongly correlated errors and when σ_b² > σ_o², although Â is still on the line (O, B), it may actually lie outside the segment [O, B]. Yet, the principles and the theory still hold in that case.

When the observation error is uncorrelated with the background error, (O, T) ⊥ (B, T), the triangles OTÂ and BTÂ are similar, and it follows that ‖O − Â‖ · ‖Â − B‖ = ‖Â − T‖², which is the Desroziers et al. [14] diagnostic for the analysis error variance. However, when the observation error is correlated with the background error (right panel of Figure 6), the triangles OTÂ and BTÂ are no longer similar triangles, and the Desroziers et al. [14] diagnostic for the analysis error does not hold (see the derivation in Appendix A). The HL diagnostic, Equation (18), and the MDJ diagnostic, Equation (19), depend only on having the right triangles OTÂ and BTÂ, and not on the orthogonality of (B, T) with (O, T). Therefore, the HL and MDJ diagnostics are valid with or without correlated observation-background errors.
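This behavior is easy to reproduce numerically. In the following Monte Carlo sketch (illustrative numbers, ours, not from the paper), the HL and MDJ scalar diagnostics recover the true analysis error variance while the Desroziers et al. diagnostic is offset by the error cross-covariance c:

```python
import numpy as np

# Scalar optimal analysis with cov(eps_o, eps_b) = c; truth = 0.
rng = np.random.default_rng(2)
n, sig_o2, sig_b2, c = 500_000, 1.0, 2.0, 0.4
cov = np.array([[sig_o2, c], [c, sig_b2]])
eps_o, eps_b = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

k = (sig_b2 - c) / (sig_o2 + sig_b2 - 2 * c)   # optimal scalar weight
eps_a = k * eps_o + (1 - k) * eps_b            # analysis error

true_A = np.var(eps_a)
HL = sig_o2 - np.var(eps_o - eps_a)            # Eq. (18), scalar form
MDJ = sig_b2 - np.var(eps_a - eps_b)           # Eq. (19), scalar form
D = np.mean((eps_o - eps_a) * (eps_a - eps_b)) # Eq. (20), scalar form
print(true_A, HL, MDJ, D)  # HL ~ MDJ ~ true_A, while D ~ true_A - c
```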

4.3. Estimation of Satellite Observation Errors with In Situ Observation Cross-Validation

One of the important problems in satellite assimilation is the estimation of the satellite observation error, which could be addressed with a simple modification of our cross-validation procedure. Let us assume that we have in situ observations whose errors are uncorrelated among themselves (or use a filter with a minimum distance, as discussed in Section 4.1), with the background errors, and with the satellite observation errors. Yet, the satellite observation errors could be correlated with the background error. The satellite observations could come from a multi-channel instrument with channel-correlated observation errors, as found with many instruments, and yet our validation procedure can still be used.

Let us consider analyses that comprise satellite and in situ observations but, for the purpose of cross-validation, use only two-thirds of the in situ observations in the analysis, keeping the remaining one-third passive to carry out the cross-validation procedure. The first thing to note is that the passive in situ observations have errors uncorrelated with the analysis error (the analysis being composed of the satellite observations and two-thirds of the in situ observations). We then use Equation (7), where the interpolation of the analysis is made only at the in situ active observations. Minimizing E[(O − A)_c^T (O − A)_c] (i.e., the trace of the left-hand side of Equation (7)) results in finding the optimal in situ observation weight. Then, computing the analysis error covariance in satellite observation space from the analysis scheme (either from a Hessian of a variational cost function, or with an explicit gain as in Equation (23)), i.e., H_sat Â H_sat^T, we use the HL formulation, Equation (18), to obtain the satellite observation error covariance,

H_sat Â H_sat^T + E[(O − A)_sat (O − A)_sat^T] = R_sat    (24)

Equation (24) has the important property that the estimated observation error covariance is symmetric and positive definite by construction. Then, a new analysis could be carried out to obtain a more realistic H_sat Â H_sat^T, with a resulting updated R_sat, and so forth until convergence; a sketch of this iteration is given below.
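A minimal sketch of this fixed-point iteration, under simplifying assumptions (a synthetic one-dimensional grid, satellite observations only, a known B, and the expected residual covariance used in place of sample statistics; all variable names are ours, not from the paper), could look as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 40, 15                          # hypothetical state and satellite obs-space sizes
x = np.linspace(0.0, 1.0, n)
B = 0.5 * np.exp(-((x[:, None] - x[None, :]) / 0.1) ** 2)   # background error covariance
H = np.zeros((m, n))                                        # observation operator
H[np.arange(m), rng.choice(n, m, replace=False)] = 1.0

R_true = 0.3 * np.eye(m)               # the (unknown) target satellite R
R_est = 1.5 * np.eye(m)                # deliberately wrong first guess
G = H @ B @ H.T                        # HBH^T, held fixed for simplicity

for k in range(20):
    S = G + R_est                                    # prescribed innovation covariance
    HK = G @ np.linalg.inv(S)                        # HK, with gain K = BH^T S^{-1}
    HAH = G - HK @ G                                 # perceived H A H^T in obs space
    # Expected covariance of the O - A residuals; in practice this term would be
    # estimated from actual observation-minus-analysis samples.
    res_cov = (np.eye(m) - HK) @ (G + R_true) @ (np.eye(m) - HK).T
    R_est = HAH + res_cov                            # the Equation (24) update
    print(k, float(np.abs(np.diag(R_est - R_true)).max()))
```

The target is a fixed point of this update: when the gain is computed with the correct R_sat, H Â H^T + (I − HK)(HBH^T + R_sat)(I − HK)^T = R_sat exactly, and the printed diagonal error shrinks toward zero.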
4.4. Remark on Cross-Validation of Satellite Retrievals

As a last remark, it appears that the cross-validation of satellite retrieval observations using a k-fold approach, where observations are used as passive observations to validate the analysis, can be a difficult problem. Retrievals from passive remote sensing at nadir generally involve a prior, a climatology, or a model assumption over different regions, and are thus likely to have spatially correlated errors and errors correlated with the background error. It does not mean, however, that nothing can be done in that case. For example, for certain sensors, such as infrared sensors, it is possible to disentangle the prior from the retrieval, so that by an appropriate transformation of the measurements, the observations can be practically decorrelated from the background [23,24]. However, to the authors' knowledge, such an approach has never been undertaken for visible measurements such as those for NO2 or AODs.

4.5. Lack of Innovation Covariance Consistency and Its Relevance to Statistical Diagnostics

The error covariance diagnostics for an optimal analysis, presented in Sections 2.4 and 2.5, depend on the innovation covariance consistency, E[(O − B)(O − B)^T] = HBH^T + R, and our results presented in Section 3 have shown that the different estimates of the optimal analysis error variance are close, but do not strictly agree with each other. This disagreement is related to the lack of innovation consistency as follows. Let us introduce a departure matrix Δ from the innovation covariance consistency as,

E[dd^T](HBH^T + R)^{-1} = I + Δ    (25)

The trace of Equation (25), which is related to χ², is given by

E[χ²] = tr{E[dd^T](HBH^T + R)^{-1}} = N_s + tr(Δ)    (26)

We recall that in the experiment iter 1 we obtained χ²/N_s values of 1.36 for O3 and 1.25 for PM2.5 (see Table 1), indicating that the innovation covariance consistency is deficient, although less seriously than in the experiment iter 0, where values of 2 and higher were obtained. If we take into account the fact that there can be a difference between E[dd^T] and (HBH^T + R), and we rederive the (active) analysis error covariance for the HL, MDJ and D schemes, we get (see Appendix B)

tr{HÂ_HL H^T} = tr{HÂ_true H^T} + tr{ΔR} − tr{Δ HBH^T} + tr{error_MDJ}    (27)

tr{HÂ_MDJ H^T} = tr{HÂ_true H^T} − tr{error_MDJ}    (28)

tr{HÂ_D H^T} = tr{HÂ_true H^T} − tr{error_D}    (29)

where tr{error_MDJ} = tr{(HBH^T)(HBH^T + R)^{-1} Δ (HBH^T)} and tr{error_D} = tr{R(HBH^T + R)^{-1} Δ (HBH^T)}. We note that although the error terms are complex expressions, they all depend linearly on Δ. Thus, the disagreement between the HL, MDJ and D analysis error variance estimates is due to the lack of innovation covariance consistency; a small numerical illustration is given below.
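As a worked scalar illustration (our own numbers, with assumed σ_o^2 = 1, σ_b^2 = 2, hence γ = 0.5), the following few lines evaluate the scalar error terms (A13), (A16) and (A19) of Appendix B and combine them following the signs of Equations (27)-(29), showing the three estimates drifting apart linearly in Δ:

```python
sig_o2, sig_b2 = 1.0, 2.0                 # assumed scalar error variances
gamma = sig_o2 / sig_b2                   # gamma = sigma_o^2 / sigma_b^2
true_var = sig_o2 * sig_b2 / (sig_o2 + sig_b2)   # optimal analysis error variance

for delta in (0.0, 0.1, 0.25):            # departure from innovation consistency
    err_mdj = delta * sig_b2 / (1.0 + gamma)                      # Equation (A16)
    err_d = delta * sig_o2 / (1.0 + gamma)                        # Equation (A19)
    err_hl = delta * (sig_b2 / (1.0 + gamma) - sig_b2 + sig_o2)   # Equation (A13)
    # HL, MDJ and D estimates following Equations (27)-(29)
    print(delta, true_var + err_hl, true_var - err_mdj, true_var - err_d)
```

At Δ = 0 all three estimates equal the true value (2/3 here); at Δ = 0.25 they have already spread to roughly 0.75, 0.33 and 0.5, even though each error term is only linear in Δ.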
5. Conclusions

We showed that the analysis error variance can be estimated and optimized, without using a model forecast, by partitioning the original observation data set into a training set, used to create the analysis, and an independent (or passive) set, used to evaluate the analysis. This kind of evaluation by partitioning is called cross-validation. The method derives from assuming that the observations have spatially uncorrelated errors or, minimally, that the independent (or passive) observations have errors uncorrelated with those of the active observations and uncorrelated with the background error. This leads to the important property that the passive observations are uncorrelated with the analysis error and can then be used to evaluate the analysis [17]. We have developed a theoretical framework and a geometric interpretation that have allowed us to derive a number of statistical estimation formulas for the analysis error covariance that can be used in both passive and active observation spaces. It is shown that by minimizing the variance of the observation-minus-analysis residuals in passive observation space we actually identify the optimal analysis. This has been done with respect to a single parameter, namely the ratio of observation to background error variances, to obtain a near optimal Kalman gain. The optimization is also done under the constraint of innovation covariance consistency [14,15]. This optimization could have been done with more than one error covariance parameter, but this has not been attempted here. The theory does suggest, however, that the minimum is unique. Once an optimal analysis is identified, we conduct an evaluation of the analysis error covariance using several different formulas: Desroziers et al. [14], Hollingsworth-Lönnberg [13], and one that we developed in this paper which works in either active or passive observation spaces. As a way to validate the analysis error variance computed by the analysis scheme itself, the so-called perceived analysis error variance [6], we compare it with the values obtained from the different statistical diagnostics of the analysis error variance.

This methodology arises from a need to assess and improve ECCC's surface air quality analyses using our operational air quality model GEM-MACH and real-time surface observations of O3 and PM2.5. Our method applied the theory in a simplified way: first, by considering averaged observation and background error variances and finding an optimal ratio γ = σ_o^2/σ_b^2 using as a constraint the trace of the innovation covariance consistency [15]; second, using a single-parameter correlation model, its correlation length, we used maximum likelihood estimation [11] to obtain near

optimal analyses. Also, we did not attempt to account for the representativeness error of the observations by, for example, filtering out observations that are close to one another. Despite all these limitations, our results show that with near optimal analyses, all estimates of the analysis error variance roughly agree with each other, while disagreeing strongly when the input error statistics are not optimal. This check on estimating the analysis error variance gives us confidence that the method we propose is reliable, and it provides us with an objective method to evaluate different analysis configurations, such as the type of background error correlation model, the spatial distribution of error variances, and possibly the use of observation thinning to circumvent the effects of representativeness errors.

The methodology introduced here for estimating analysis error variances is general and not restricted to the case of surface pollutant analysis. It would be desirable to investigate other areas of application, such as surface analysis in meteorology and oceanography. The method could, in principle, provide guidance for any assimilation system. By considering observation-space subdomains [25], proper scaling, local averaging [26], or other methods discussed in Janjić et al. [12], it may also be possible to extend this methodology to spatially varying error statistics. Based on our verification results in Part I [5], we found that there is a dependence between model values and error variances, which we will investigate further in view of our next operational implementation of the Canadian surface air quality analysis and assimilation.

One strong limitation of the optimum interpolation scheme we are using (i.e., homogeneous isotropic error correlations and uniform error variances), which is also the case for most 3D-Var implementations, is the lack of innovation covariance consistency. Ensemble Kalman filters seem, however, much better in that regard, although they have their own issues with localization and inflation. Experiments with chemical data assimilation using an ensemble Kalman filter do give χ²/N_s values very close to unity after simple adjustments of the observation and model error variances [27]. We thus argue that ensemble methods, such as the ensemble Kalman filter, would produce analysis error variance estimates that are much more consistent between the different diagnostics.

Estimates of analysis uncertainties can also be obtained by resampling techniques, such as the jackknife method and bootstrapping [28]. In bootstrapping with replacement, the distribution of the analysis error is obtained by creating new analyses by replacing and duplicating observations from an existing set of observations [28]. This technique relies on the assumption that each member of the dataset is independent and identically distributed. For surface ozone analyses, where there is persistence to the next day and the statistics are spatially inhomogeneous, the assumption of statistical independence may not be adequate. These resampling estimates of analysis uncertainties could be compared with our analysis error variance estimates to help us identify limitations and areas of improvement.

Supplementary Materials: The following are available online. Figure S1: Analysis error variance for the ozone optimal analysis case PM2.5 iter 1. The left panel shows the analysis error on the model grid, and the right panel the analysis error at the active observation sites. Note that the color bars of the left and right panels are different.
The maximum of the color bar for the left panel corresponds to σ_o^2 + σ_b^2. Figure S2: Distribution (histogram) of the ozone analysis error variance at the active observation locations. The first analysis experiment PM2.5 iter 0 (no optimization) is on the left panel, and the optimal analysis case PM2.5 iter 1 on the right panel.

Acknowledgments: We are grateful to the US/EPA for the use of the AirNow database of surface pollutants, and to all provincial governments and territories of Canada for kindly transmitting their data to the Canadian Meteorological Centre to produce the surface analysis of atmospheric pollutants. We are also thankful to Kerill Semeniuk for proofreading, and to two anonymous reviewers for their comments and help in improving the manuscript.

Author Contributions: This research was conducted as a joint effort by both authors. R.M. contributed to the theoretical development and wrote the paper, and M.D.-J. conducted all the experiment design and execution, did proofreading, and introduced a new diagnostic of the optimal analysis error that was further extended to passive observation space.

Conflicts of Interest: The authors declare no conflict of interest. The founding sponsor, which is the government of Canada, had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. A Geometrical Derivation of the Desroziers et al. Diagnostic

Let u and v be two random variables of a real Hilbert space as defined in Section 2.3. Two properties of Hilbert spaces are the polarization identity

⟨u, v⟩ = (1/4){‖u + v‖² − ‖u − v‖²}    (A1)

and the parallelogram identity [19]

‖u + v‖² + ‖u − v‖² = 2(‖u‖² + ‖v‖²)    (A2)

Combining these two equations, we get:

⟨u, v⟩ = (1/2){‖u + v‖² − ‖u‖² − ‖v‖²}    (A3)

Let u = O − A and v = A − B, then u + v = O − B, and so we have

⟨(O − A), (A − B)⟩ = (1/2){‖O − B‖² − ‖O − A‖² − ‖A − B‖²}    (A4)

If we assume that the analysis is optimal, so that OÂT is a right triangle (see Figure 1), then

‖Â − T‖² + ‖O − Â‖² = ‖O − T‖²    (A5)

and similarly that BÂT is a right triangle,

‖Â − T‖² + ‖Â − B‖² = ‖B − T‖²    (A6)

and substituting these expressions into Equation (A4), we get

⟨(O − Â), (Â − B)⟩ = (1/2){‖O − B‖² − ‖O − T‖² − ‖B − T‖² + 2‖Â − T‖²}    (A7)

If in addition we assume that we have uncorrelated observation-background errors, that is,

‖O − B‖² = ‖O − T‖² + ‖B − T‖²    (A8)

we then get

⟨(O − Â), (Â − B)⟩ = ‖Â − T‖²    (A9)

Note the difference with the result obtained in Appendix A (property 6) of Ménard [15], where it is shown that the necessary and sufficient condition for the perceived analysis error covariance in active observation space to meet the Desroziers et al. diagnostic for the analysis error is to have innovation covariance consistency and uncorrelated observation-background errors. The result obtained here is different; it concerns the optimal analysis error covariance.

Appendix B. Diagnostics of Analysis Error Covariance and Innovation Covariance Consistency

Let us introduce a departure matrix Δ from the innovation covariance consistency as,

E[dd^T](HBH^T + R)^{-1} = I + Δ    (A10)

The HL diagnostic, Equation (18), can be expanded as

E[(O − Â)(O − Â)^T] = E[dd^T] + HBH^T(HBH^T + R)^{-1} E[dd^T] (HBH^T + R)^{-1} HBH^T
  − E[dd^T](HBH^T + R)^{-1} HBH^T − HBH^T(HBH^T + R)^{-1} E[dd^T]
  = R − H(B − BH^T(HBH^T + R)^{-1} HB)H^T + error_HL
  = R − HÂH^T + error_HL    (A11)

where error_HL is given as

error_HL = ΔR − Δ(HBH^T) + HBH^T(HBH^T + R)^{-1} Δ(HBH^T)    (A12)

which includes three error terms. A scalar version of Equation (A12) is given as

error_HL = Δ(σ_b^2/(1 + γ) − σ_b^2 + σ_o^2)    (A13)

The error analysis for the MDJ diagnostic, Equation (19), is

E[(Â − B)(Â − B)^T] = HK E[dd^T] K^T H^T = HBH^T(HBH^T + R)^{-1} HBH^T + error_MDJ    (A14)

where error_MDJ is given as

error_MDJ = HBH^T(HBH^T + R)^{-1} Δ(HBH^T)    (A15)

There is only one error term, and its scalar version is given as,

error_MDJ = Δ σ_b^2/(1 + γ)    (A16)

The error analysis for the Desroziers et al. diagnostic, Equation (20), is

E[(O − Â)(Â − B)^T] = E[dd^T](HBH^T + R)^{-1} HBH^T − HBH^T(HBH^T + R)^{-1} E[dd^T] (HBH^T + R)^{-1} HBH^T
  = HBH^T − HBH^T(HBH^T + R)^{-1} HBH^T + error_D    (A17)

where error_D is given as

error_D = {I − HBH^T(HBH^T + R)^{-1}} Δ(HBH^T) = R(HBH^T + R)^{-1} Δ(HBH^T)    (A18)

Again there is only one error term, similar to that of the MDJ diagnostic, and its scalar version is given as,

error_D = Δ σ_o^2/(1 + γ)    (A19)

Error analyses for the diagnostics using passive observations can also be derived. For the passive MDJ diagnostic, Equation (21), we have, similarly to (A14),

E[(Â − B)_c (Â − B)_c^T] = H_c K E[dd^T] K^T H_c^T = H_c BH^T(HBH^T + R)^{-1} HBH_c^T + error_MDJ_passive    (A20)

with a single error term given as,

error_MDJ_passive = H_c BH^T(HBH^T + R)^{-1} Δ(HBH_c^T)    (A21)

To express a scalar version of this equation we need to account for the background error correlation ρ between the active observation location and the passive observation location; it is thus expressed as,

error_MDJ_passive = Δ ρ² σ_b^2/(1 + γ)    (A22)

Finally, the fundamental diagnostic of cross-validation, Equation (7), does not depend explicitly on the innovation covariance consistency. However, attaining its true minimum by tuning only γ and L_c, as Equation (22) would suggest, does introduce some innovation inconsistency, which all the other optimal diagnostics, Equations (18)-(21), have to account for.

References

1. Ménard, R.; Robichaud, A. The chemistry-forecast system at the Meteorological Service of Canada. In Proceedings of the ECMWF Seminar on Global Earth-System Monitoring, Reading, UK, 5-9 September 2005.
2. Robichaud, A.; Ménard, R. Multi-year objective analyses of warm season ground-level ozone and PM2.5 over North America using real-time observations and Canadian operational air quality models. Atmos. Chem. Phys. 2014, 14. [CrossRef]
3. Robichaud, A.; Ménard, R.; Zaïtseva, Y.; Anselmo, D. Multi-pollutant surface objective analyses and mapping of air quality health index over North America. Air Qual. Atmos. Health 2016, 9. [CrossRef] [PubMed]
4. Moran, M.D.; Ménard, S.; Pavlovic, R.; Anselmo, D.; Antonopoulos, S.; Robichaud, A.; Gravel, S.; Makar, P.A.; Gong, W.; Stroud, C.; et al. Recent advances in Canada's national operational air quality forecasting system. In Proceedings of the 32nd NATO-SPS ITM, Utrecht, The Netherlands, 7-11 May.
5. Ménard, R.; Deshaies-Jacques, M. Evaluation of analysis by cross-validation. Part I: Using verification metrics. Atmosphere 2018, in press.
6. Daley, R. The lagged-innovation covariance: A performance diagnostic for atmospheric data assimilation. Mon. Weather Rev. 1992, 120. [CrossRef]
7. Daley, R. Atmospheric Data Analysis; Cambridge University Press: New York, NY, USA, 1991.
8. Talagrand, O. A posteriori evaluation and verification of analysis and assimilation algorithms. In Proceedings of the Workshop on Diagnosis of Data Assimilation Systems, November 1998; European Centre for Medium-Range Weather Forecasts: Reading, UK, 1999.
9. Todling, R. Notes and Correspondence: A complementary note to "A lag-1 smoother approach to system-error estimation": The intrinsic limitations of residual diagnostics. Q. J. R. Meteorol. Soc. 2015, 141. [CrossRef]
10. Hollingsworth, A.; Lönnberg, P. The statistical structure of short-range forecast errors as determined from radiosonde data. Part I: The wind field. Tellus 1986, 38A. [CrossRef]
11. Ménard, R.; Deshaies-Jacques, M.; Gasset, N. A comparison of correlation-length estimation methods for the objective analysis of surface pollutants at Environment and Climate Change Canada. J. Air Waste Manag. Assoc. 2016, 66. [CrossRef] [PubMed]
12. Janjić, T.; Bormann, N.; Bocquet, M.; Carton, J.A.; Cohn, S.E.; Dance, S.L.; Losa, S.N.; Nichols, N.K.; Potthast, R.; Waller, J.A.; et al. On the representation error in data assimilation. Q. J. R. Meteorol. Soc. [CrossRef]
13. Hollingsworth, A.; Lönnberg, P. The verification of objective analyses: Diagnostics of analysis system performance. Meteorol. Atmos. Phys. 1989, 40. [CrossRef]
14. Desroziers, G.; Berre, L.; Chapnik, B.; Poli, P. Diagnosis of observation-, background-, and analysis-error statistics in observation space. Q. J. R. Meteorol. Soc. 2005, 131. [CrossRef]
15. Ménard, R.
Error covariance estimation methods based on analysis residuals: Theoretical foundation and convergence properties derived from simplified observation networks. Q. J. R. Meteorol. Soc. 2016, 142. [CrossRef]
16. Kailath, T. An innovations approach to least-squares estimation. Part I: Linear filtering in additive white noise. IEEE Trans. Autom. Control 1968, 13.

17. Marseille, G.-J.; Barkmeijer, J.; de Haan, S.; Verkley, W. Assessment and tuning of data assimilation systems using passive observations. Q. J. R. Meteorol. Soc. 2016, 142. [CrossRef]
18. Waller, J.A.; Dance, S.L.; Nichols, N.K. Theoretical insight into diagnosing observation error correlations using observation-minus-background and observation-minus-analysis statistics. Q. J. R. Meteorol. Soc. 2016, 142. [CrossRef]
19. Caines, P.E. Linear Stochastic Systems; John Wiley and Sons: New York, NY, USA, 1988.
20. Cohn, S.E. The principle of energetic consistency in data assimilation. In Data Assimilation; Lahoz, W., Khattatov, B., Ménard, R., Eds.; Springer: Berlin/Heidelberg, Germany.
21. Mitchell, H.L.; Daley, R. Discretization error and signal/error correlation in atmospheric data assimilation: (I). All scales resolved. Tellus 1997, 49A. [CrossRef]
22. Mitchell, H.L.; Daley, R. Discretization error and signal/error correlation in atmospheric data assimilation: (II). The effect of unresolved scales. Tellus 1997, 49A. [CrossRef]
23. Joiner, J.; da Silva, A. Efficient methods to assimilate remotely sensed data based on information content. Q. J. R. Meteorol. Soc. 1998, 124. [CrossRef]
24. Migliorini, S. On the equivalence between radiance and retrieval assimilation. Mon. Weather Rev. 2012, 140. [CrossRef]
25. Chapnik, B.; Desroziers, G.; Rabier, F.; Talagrand, O. Properties and first application of an error-statistics tuning method in variational assimilation. Q. J. R. Meteorol. Soc. 2005, 130. [CrossRef]
26. Ménard, R.; Deshaies-Jacques, M. Error covariance estimation methods based on analysis residuals and its application to air quality surface observation networks. In Air Pollution Modeling and Its Application XXV; Mensink, C., Kallos, G., Eds.; Springer International AG: Cham, Switzerland.
27. Skachko, S.; Errera, Q.; Ménard, R.; Christophe, Y.; Chabrillat, S. Comparison of the ensemble Kalman filter and 4D-Var assimilation methods using a stratospheric tracer transport model. Geosci. Model Dev. 2014, 7. [CrossRef]
28. Efron, B. An Introduction to the Bootstrap; Chapman & Hall: New York, NY, USA, 1993.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
