Data Assimilation Research Testbed Tutorial Section 3: Hierarchical Group Filters and Localization Version 2.: September, 26 Anderson: Ensemble Tutorial 9//6
Ways to deal with regression sampling error:
3. Use additional a priori information about the relation between observations and state variables.
[Figure: regression weight as a function of distance from the observation (km?).]
Can use other functions to weight the regression.
It is unclear what distance means for some obs./state variable pairs.
This is referred to as LOCALIZATION.
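One distance-based weighting function that later slides compare against is the Gaspari Cohn function. The sketch below is a minimal Python version of the standard fifth-order piecewise polynomial, not DART's own code. The function names, the use of NumPy, and the periodic unit-length domain in `periodic_distance` are assumptions made for illustration.

```python
import numpy as np

def gaspari_cohn(distance, half_width):
    """Gaspari Cohn weight: 1 at zero distance, 0 at twice the half-width."""
    # Always work on a 1-D array, so a scalar input returns a length-1 array.
    r = np.abs(np.atleast_1d(np.asarray(distance, dtype=float))) / half_width
    weight = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri = r[inner]
    weight[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                     - (5.0 / 3.0) * ri**2 + 1.0)
    ro = r[outer]
    weight[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                     + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return weight

def periodic_distance(a, b, domain_length=1.0):
    """Shortest separation on a periodic domain (assumed layout, hypothetical helper)."""
    d = np.abs(np.asarray(a, dtype=float) - b) % domain_length
    return np.minimum(d, domain_length - d)

# Weights for a few state locations relative to an observation at 0.64.
locations = np.linspace(0.0, 1.0, 5)
print(gaspari_cohn(periodic_distance(locations, 0.64), half_width=0.2))
```

The weight multiplies the regression coefficient, so state variables farther than twice the half-width from the observation receive no increment.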
Localization is a function of the expected correlation between obs and state.
Often, we don't know much about this.
Horizontal distance between the same type of variable may be okay.
What is the expected correlation for co-located temperature and pressure?
What about vertical localization? Looks pretty complex.
What about complicated forward operators: what is the expected correlation of a satellite radiance and a wind component?
Note: DART does allow vertical localization for more complex models.
Ways to deal with regression sampling error:
4. Try to determine the amount of sampling error and correct for it:
A. Could weight regressions based on sample correlation. Limited success in tests. For small true correlations, can still get large sample correlations.
B. Do a bootstrap with the sample correlation to measure sampling error. Limited success. Repeatedly compute the sample correlation with a sample removed.
C. Use hierarchical Monte Carlo. Have a sample of samples. Compute the expected error in the regression coefficients and weight accordingly.
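The remark in item A, that small true correlations can still produce large sample correlations, can be illustrated with a short Monte Carlo sketch. It is only an illustration under assumed settings: the function name, the Gaussian draws, and the ensemble sizes are hypothetical and are not taken from the tutorial.

```python
import numpy as np

def sample_correlations(true_corr, ens_size, num_trials, seed=0):
    """Sample correlation from many independent ensembles of a given size."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, true_corr], [true_corr, 1.0]])
    # draws has shape (num_trials, ens_size, 2): one (x, y) pair per member.
    draws = rng.multivariate_normal(np.zeros(2), cov, size=(num_trials, ens_size))
    x = draws[..., 0] - draws[..., 0].mean(axis=1, keepdims=True)
    y = draws[..., 1] - draws[..., 1].mean(axis=1, keepdims=True)
    return (x * y).sum(axis=1) / np.sqrt((x**2).sum(axis=1) * (y**2).sum(axis=1))

# True correlation is zero, yet small ensembles give a wide spread of estimates.
for n in (10, 100):
    r = sample_correlations(0.0, n, num_trials=2000)
    print(n, r.std(), np.abs(r).max())
```

The spread of the sample correlation shrinks roughly like one over the square root of the ensemble size, which is why small ensembles need some form of correction.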
Ways to deal with regression sampling error:
4C. Use hierarchical Monte Carlo: ensemble of ensembles.
[Figure: M independent N-member ensembles, with time labels t_k and t_{k+2}; each group applies the forward operator H to obtain y and a regression coefficient β, and the M values of β give the regression confidence factor α.]
M groups of N-member ensembles. Compute obs. increments for each group.
For a given obs./state pair:
1. Have M samples of the regression coefficient, β.
2. Uncertainty in β implies state variable increments should be reduced.
3. Compute the regression confidence factor, α.
4C. Use hierarchical Monte Carlo: ensemble of ensembles.
Split the ensemble into M independent groups. For instance, 8 ensemble members becomes 4 groups of 2.
With M groups, get M estimates of the regression coefficient, β_i.
Find the regression confidence factor α (weight) that minimizes:
$$\sum_{j=1}^{M} \sum_{i=1,\, i \neq j}^{M} \left[ \alpha \beta_i - \beta_j \right]^2$$
This minimizes the RMS error in the regression (and state increments).
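Because the quantity being minimized is quadratic in α, setting its derivative to zero gives a closed form. The sketch below is a minimal Python version under that reading of the slide. The function name is hypothetical, and clipping negative values to zero is an assumption (a negative weight would reverse the sign of the increment), not something the slide states.

```python
import numpy as np

def regression_confidence_factor(beta):
    """Alpha minimizing sum_j sum_{i != j} (alpha * beta_i - beta_j)**2."""
    beta = np.asarray(beta, dtype=float)
    m = beta.size
    if m < 2:
        raise ValueError("need at least two groups")
    sum_sq = np.sum(beta**2)
    if sum_sq == 0.0:
        return 0.0  # all coefficients are zero: no regression to weight
    # The cross terms sum to (sum beta)^2 - sum beta^2; the squared terms
    # appear (m - 1) times each in the double sum.
    alpha = (np.sum(beta)**2 - sum_sq) / ((m - 1) * sum_sq)
    return float(max(alpha, 0.0))  # assumption: never use a negative weight

print(regression_confidence_factor([1.0, 1.0, 1.0, 1.0]))   # groups agree
print(regression_confidence_factor([1.0, -1.0, 0.5, -0.5])) # groups disagree
```

When all groups return the same β the factor is one and the regression is used in full; when the group estimates scatter around zero the factor drops to zero.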
4C. Use hierarchical Monte Carlo: ensemble of ensembles.
[Figure: regression confidence factor α versus Q, the ratio of sample standard deviation to mean, with one curve per group size.]
Weight the regression by α.
If one has repeated observations, can generate sample mean or median statistics for α.
The mean α can be used in subsequent assimilations as a localization.
α is a function of M and $Q = \Sigma_\beta / \bar{\beta}$ (sample SD / sample mean of the regression coefficient).
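The dependence of α on M and Q alone can be made explicit with a short derivation from the minimized double sum of squared differences between αβ_i and β_j. This is a sketch under two assumptions that the slides do not state: the sample standard deviation Σ_β uses the M − 1 denominator, and the cost function is written J(α) for convenience.

```latex
\begin{align*}
J(\alpha) &= \sum_{j=1}^{M}\sum_{i=1,\,i\neq j}^{M}\left[\alpha\beta_i-\beta_j\right]^2,\\
\frac{dJ}{d\alpha}=0 \;&\Rightarrow\;
\alpha=\frac{\sum_{j}\sum_{i\neq j}\beta_i\beta_j}{(M-1)\sum_{i}\beta_i^2}
=\frac{\left(\sum_i\beta_i\right)^2-\sum_i\beta_i^2}{(M-1)\sum_i\beta_i^2},\\
\sum_i\beta_i &= M\bar{\beta},\qquad
\sum_i\beta_i^2=(M-1)\,\Sigma_\beta^2+M\bar{\beta}^2,\\
\alpha &= \frac{M-Q^2}{(M-1)\,Q^2+M},\qquad Q=\frac{\Sigma_\beta}{\bar{\beta}}.
\end{align*}
```

Under these assumptions α equals one when the groups agree exactly (Q = 0) and falls to zero at Q = √M, so larger M tolerates more scatter among the group estimates before the regression is discarded.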
4C. Use hierarchical Monte Carlo: ensemble of ensembles.
The hierarchical filter is controlled by setting the number of groups, M: num_groups in filter_nml.
If we don't know how to localize to start with, can use groups to help.
Retrieve the original values for input.nml from the archive copy (we'll have to determine where this is for this workshop).
Try splitting 8 ensemble members into 4 groups for Lorenz-96 (num_groups=4, ens_size=8, 4 groups of 2 each).
Use adaptive inflation (.5 lower bounds) to make things nice (inf_flavor=3, inf_sd_initial=.5, inf_sd_lower_bound=.5).
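What splitting an ensemble into groups means for a single obs./state pair can be sketched in a few lines of Python. This is an illustration only and not how DART's filter is implemented. The function name, the contiguous split of members, and the synthetic ensemble (sizes chosen for illustration, not the tutorial's settings) are all assumptions.

```python
import numpy as np

def group_regression_coefficients(state, obs, num_groups):
    """Split an ensemble into equal groups and return one beta per group.

    state, obs: 1-D arrays holding one state variable and the prior
    observation estimate for each ensemble member.
    """
    state = np.asarray(state, dtype=float)
    obs = np.asarray(obs, dtype=float)
    if state.shape != obs.shape or state.ndim != 1:
        raise ValueError("state and obs must be 1-D arrays of equal length")
    if state.size % num_groups != 0:
        raise ValueError("ensemble size must be divisible by num_groups")
    betas = []
    for x, y in zip(np.split(state, num_groups), np.split(obs, num_groups)):
        cov_xy = np.cov(x, y)  # 2x2 sample covariance matrix of (x, y)
        betas.append(cov_xy[0, 1] / cov_xy[1, 1])  # beta = cov(x, y) / var(y)
    return np.array(betas)

# Synthetic ensemble: the state is linearly related to the obs plus noise.
rng = np.random.default_rng(0)
obs_prior = rng.normal(size=80)
state_prior = 0.5 * obs_prior + 0.1 * rng.normal(size=80)
print(group_regression_coefficients(state_prior, obs_prior, num_groups=4))
```

The scatter among the returned coefficients is the raw material for the regression confidence factor: the more the groups disagree, the less the regression should be trusted.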
4C. Use hierarchical Monte Carlo: ensemble of ensembles.
Turn on regression factor diagnostics: save_reg_diagnostics=.true.
After running the 8 by 4 group filter, look at plots of α. It is essentially an estimate of a good localization for a given observation.
Use plot_reg_factor in matlab. Select the default input file name.
Only observations, 2, 3, and 4 are available, located at .39, .7, .64, .86.
Think about the value of the time median vs. the time mean. Could use the time mean or median as prior localization functions.
Play around with model error again. What happens to localization?
Lorenz 96 Experimental Design
Initial ensemble members are random draws from climatology.
Observations every time step.
4 step assimilations, results shown from second 2 steps.
Covariance inflation tuned for minimum RMS.
4 groups of ensembles used unless otherwise noted.
EXPERIMENT SET : 4 randomly located observations.
Error variance -7 (SMALL!).
ERROR comes almost entirely from degeneracy of ensemble covariance.
All errors shown are prior ensemble mean estimates.
Time Mean Regression Confidence Envelopes: Small Error Limit
[Figure, built up over six slides: time mean regression confidence factor versus state variable location, with the locations of the 4 randomly located observations marked. Curves: ENS. 3, ENS. 8, ENS. 5, GC.2, ENS. > 3.]
Most results are for the observation at .64.
Envelope (localization) for 3 member ensemble (barely degenerate).
Envelope for 8 member ensemble.
Envelope for 5 member ensemble.
Compare to Gaspari Cohn with half-width .2.
Ensemble sizes > 3 require NO LOCALIZATION (no significant error).
Time Median Regression Confidence Envelopes: Small Error Limit
[Figure: time median regression confidence factor versus state variable location. Curves: ENS. 3, ENS. 8, ENS. 5, GC.2, ENS. > 3.]
Median reduces noise for small expected correlations.
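Why the median suppresses noise where the expected correlation is small can be seen in a synthetic sketch. It assumes a regression confidence factor computed from M group coefficients in closed form and clipped at zero, and it feeds in pure-noise coefficients (true regression of zero). The function name, the Gaussian noise model, and the sizes are assumptions for illustration, not the tutorial's experiment.

```python
import numpy as np

def alpha_from_betas(betas):
    """Regression confidence factor per row (rows = times, columns = groups)."""
    betas = np.asarray(betas, dtype=float)
    m = betas.shape[1]
    sum_sq = np.sum(betas**2, axis=1)
    alpha = (np.sum(betas, axis=1)**2 - sum_sq) / ((m - 1) * sum_sq)
    return np.clip(alpha, 0.0, None)  # assumption: negative weights set to zero

# Pure-noise coefficients: the true regression between obs and state is zero.
rng = np.random.default_rng(1)
num_times, num_groups = 2000, 4
noise_betas = rng.normal(0.0, 1.0, size=(num_times, num_groups))
alpha_noise = alpha_from_betas(noise_betas)

time_mean = alpha_noise.mean()
time_median = np.median(alpha_noise)
print("time mean:", time_mean, "time median:", time_median)
```

In this setup fewer than half of the time steps yield a positive factor, so the time median sits at zero while the time mean stays positive. That matches the qualitative point of the slide, although the real envelopes depend on the model and observations.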
Additional Experiments: Experimental Design
1. Base case: plain ensemble, optimized Gaspari Cohn localization half-width.
2. Time mean case: plain ensemble, but with localization using the time mean regression confidence envelope from the group assimilation.
3. Time median case: as above, but with the time median from the group assimilation.
All start from the group ensemble at time step 2.
Covariance inflation tuned independently for each case.
Time Mean global mean RMS Error: Small Error Results
[Figure: time mean global mean RMS error for the base, group, mean and median cases at Ensemble 3, Ensemble 8 and Ensemble 5.]
Error grows with reduced ensemble size.
Group filters always best.
Time mean and median close to tuned base case.
Mean/median cost same as base case.
Experimental Design: Varying Observational Error Variance
Observation set as before.
Observation error variance -5, -3,.,.,., 7
4 member ensembles; not degenerate.
Error source is now from observation limitations, etc.
Claim: behavior is similar; the error source is irrelevant for the correction.
Understand degeneracy and other error sources as sampling error.
Time Median Envelopes: Varying Obs. Error Variance
[Figure, built up over six slides: time median regression confidence factor versus state variable location, one curve per observation error variance.]
Small error implies no need for localization.
Increasing error implies increasing localization.
A single Gaspari Cohn half-width can't deal with this range of errors.
Climatological case is unique: looks like time mean coherence.
Time Median Envelopes: Varying Obs. Error Variance
[Figure: error for the Base, 4x4, Mean and Median cases at each observation error variance.]
Error scaled by obs. error standard deviation.
As error grows, filter becomes less efficient.
Things are more nonlinear for large errors.
Group, mean and median compare favorably with tuned base for all cases!
Sensitivity of results to group size
[Figure: regression confidence factor versus state variable location for 2 Groups, 4 Groups, 8 Groups and 6 Groups.]
Obs. error variance., 4 member ensemble case.
Sensitivity of results to group size
[Figure: error for 2 groups, 4 groups, 8 groups and 6 groups, shown for the Group, Time Mean and Time Median cases.]
Gradual improvement for increased group size.
Time mean/median improve more slowly.
Important that small groups give good results (for cost).
Group better than time mean implies some time varying information (time means should reduce noise).
Obs. error variance., 4 member ensemble case.
Time variation of regression confidence factor
[Figure: two panels, 6 groups and 2 groups, showing the regression confidence factor as a function of time and state variable location.]
Some consistent time variation; noise dominates far from obs.
Notice behavior around step 33, for instance.
Challenging Traditional Localization: Varying spatial obs. density
[Figure: time mean prior RMS error as a function of spatial location (state variable location) for 4 Groups, Base Filter and Time Mean; 39 dense obs., isolated ob.]
Group filter does better in sparse region.
Time mean filter performance in between.
Challenging Traditional Localization: Varying spatial obs. density
[Figure: regression confidence factor versus state variable location for OB, OB 9, OB 39 and OB 4.]
Time median envelopes vary lots with spatial location of obs.
Challenging traditional localization: Spatial-mean obs.
Forward observation operator is a fifteen grid point mean.
[Figure: time mean and time median regression confidence factor versus state variable location.]
Envelope has become less Gaussian.
In the one-d L96 domain, everything stays pretty Gaussian.
But..., the group filter does eliminate the need for tuning localization!
Assimilating observations at times different from state estimate
Ensemble smoothers: use future observations.
Targeted observations: examine impact of obs. in the past.
Real-time assimilation: use of late arriving observations in forecast.
Expect correlations to diminish as time separation increases.
Need a localization in time, too.
Group filter can provide this.
Time localization: Experimental design
4 group, 4 ensemble member filter.
4 random obs. with . error variance.
Additional observation at location .642.
The additional observation is from a prior time step.
Time mean regression confidence envelope as a function of time lag.
Regression confidence factor as function of obs. lag time
[Figure: regression confidence factor versus state variable location and time lag.]
Moves with group velocity (approximately); dies off with lead.
Assimilation in Idealized AGCM: GFDL FMS B-Grid Dynamical Core (Havana)
Held-Suarez configuration (no zonal variation, fixed forcing).
Low-resolution (6 lons, 3 lats, 5 levels); Timestep hour.
[Figure: T at Level 3 (middle of atmosphere) and cross section of zonal mean U.]
Has baroclinic instability.
[Figure sequence, ten slides: Surface Pressure and Temperature, Level 3, one pair of maps per day.]
Experimental Design Details: Bgrid AGCM
Results for 4x2 group filter.
Assimilation for 4 days, starting from climatological distribution.
Summary results are from last 2 days.
No covariance inflation.
8 randomly located surface pressure stations observe once every 24 hours.
Observational error variance is mb.
Baseline Case: 8 PS Obs every 24 hours
[Figure: map of PS error after 4 days (Min = 85.243, Max = 244.9236, RMS ERROR = 37.5623), and time series in hectopascals of RMS ensemble mean error and ensemble spread versus days.]
Largest error in midlatitudes, synoptic scales, after 4 days.
PS error reduces by about x.
Asymptotes after about 5 days.
Ensemble spread is approximately correlated with RMS error.
Baseline Case: 8 PS Obs every 24 hours
[Figure: map of T error (level 3, day 4 shown; Min = .92, Max = 2.3893, RMS ERROR = .2435), and time series in degrees K of error versus days for LEVEL (Top), LEVEL2, LEVEL3, LEVEL4 and LEVEL5 (Bottom).]
Largest T error in tropics for interior levels.
T error also reduced by about x.
Asymptotes about day 7.
Final error about .25 K for interior levels.
Hierarchical Filter Regression Confidence Factors: PS Obs. at 2N, 6E
[Figure, PS to PS: ps mean factor and ps median factor maps, with a cross section at row 9 and a cross section at column.]
[Figure, PS to T: t mean factor at each model level, with a cross section at row 9.]
[Figure, PS to U: u mean factor at each model level, with a cross section at row 8.]
[Figure, PS to V: v mean factor at each model level, with a cross section at row 9.]
Running the bgrid dynamical core
Go to models/bgrid_solo/work and run workshop_setup.csh.
This assimilates hourly observations over a two-hour period.
Observations are 6 randomly located column observations.
Each column observes surface pressure, plus T, u and v at each level.
Use diagnostic tools to examine output.
Useful to compute and examine the innovation netcdf field.
Try: ncdiff Prior_Diag.nc Posterior_Diag.nc Innov.nc