Adaptive Data Assimilation and Multi-Model Fusion
Pierre F.J. Lermusiaux, Oleg G. Logoutov and Patrick J. Haley Jr.
Mechanical Engineering and Ocean Science and Engineering, MIT
We thank: Allan R. Robinson, Wayne G. Leslie, the AOSN-II and MB06 teams, and ONR
http://modelseas.mit.edu/
Adaptive Data Assimilation
Here, we review and illustrate several of the ESSE adaptable components (see Lermusiaux, Physica D, 2007). Other adaptive DA references: Blanchet et al., 1997; Menemenlis and Chechelnitsky, 2000; etc.
- Most DA schemes used for realistic studies approximate fundamental principles
- These DA schemes involve parameters, options and heuristic algorithms whose specifics impact results
- In the Error Subspace Statistical Estimation (ESSE) system, specifics:
  - vary with each application and with user inputs
  - are adapted with time, as a function of the available data, regional dynamics or other considerations
- Data assimilation is said to be adaptive when the parameters, functionals or schemes used for DA are a (quantitative) function of the measurements: the DA learns from the data
1) Adaptive Error Covariance Estimation: Adaptive Learning of the Dominant Errors in ESSE
Real-time 1996 example; see Lermusiaux, DAO (1999)
2) Adaptive Error Scaling in ESSE
- The dominant error covariance estimate can be scaled by a block-diagonal matrix Γ: B = Γ E Π E^T Γ^T
- Each block corresponds to a state variable and is defined by one scaling factor (in atmospheric DA, Γ is often set to a scalar: covariance inflation)
- The scaling is used in the error initialization
- Presently, the scaling is tuned by trial and error (shooting)
- At t_0, values in Γ are usually set within 0.3 to 0.7 (when E Π E^T is set to the variability, e.g. Lermusiaux, Anderson and Lozano, 2000)
- At t_0 and the t_k's, the tuning of Γ is done by batch: successive batches are compared to data-model misfits and the initial error is increased/decreased accordingly
- For the mesoscale coastal ocean, Γ stabilizes to I after 2-7 DA cycles (days to a week)
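The block-diagonal scaling above can be sketched as follows; the sizes, subspace rank and the γ values per state variable are illustrative, not taken from the system:

```python
import numpy as np

# Hypothetical sizes: 2 state variables (e.g. T and S), 4 grid points each.
n_per_var, n_vars = 4, 2
n = n_per_var * n_vars
rank = 3  # error subspace rank (illustrative)

rng = np.random.default_rng(0)
E = np.linalg.qr(rng.standard_normal((n, rank)))[0]   # orthonormal error subspace basis
Pi = np.diag([3.0, 2.0, 1.0])                          # dominant error (co)variances

# One scaling factor per state variable, replicated over that variable's block.
gammas = [0.5, 0.7]                                    # values in the typical 0.3-0.7 range
Gamma = np.diag(np.repeat(gammas, n_per_var))

B = Gamma @ E @ Pi @ E.T @ Gamma.T                     # scaled error covariance estimate
```

Because Γ is diagonal here, the Schur structure is simple: entry (i, j) of E Π E^T is multiplied by γ_i γ_j.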
3) Adaptive Parameterization of the Truncated Errors
- Uncertainties not represented by the error subspace are modeled by random noise n_k^j
- For each state variable v, the random noise is the sum of an additive and a multiplicative term: n_k^j = α_k^j + ε_k^j x_k^j
  - α_k^j: white noise reddened by a Shapiro filter
  - ε_k^j: non-dimensional white-noise factor of small amplitude (1 to 5%)
- Presently, these parameters are scaled (increased/decreased) by trial and error (shooting) on future data-model misfits
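A minimal sketch of this noise model, assuming a 1-D periodic field, a 1-2-1 Shapiro smoothing stencil and illustrative amplitudes:

```python
import numpy as np

def shapiro_filter(x, passes=4):
    """Redden white noise with a simple 1-D, second-order Shapiro filter
    (1-2-1 smoothing stencil, periodic boundaries; number of passes assumed)."""
    for _ in range(passes):
        x = x + 0.25 * (np.roll(x, 1) - 2.0 * x + np.roll(x, -1))
    return x

rng = np.random.default_rng(1)
n = 128
x_k = rng.standard_normal(n)                 # stand-in for the state variable field

sigma_alpha = 0.1                            # additive-noise amplitude (illustrative)
alpha_k = sigma_alpha * shapiro_filter(rng.standard_normal(n))
eps_k = 0.03 * rng.standard_normal(n)        # multiplicative factor, in the 1-5% range

n_k = alpha_k + eps_k * x_k                  # truncated-error noise: additive + multiplicative
```

The Shapiro passes damp the highest wavenumbers, so the additive term is smooth (red) rather than grid-scale white noise.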
4) Adaptive Ensemble Size, Error Subspace Rank and Stochastic Forcing
- The size of the ensemble is controlled in real time by quantitative convergence criteria
- The error subspace rank is selected based on the singular value spectrum
- The stochastic forcing parameters should be a function of the data-model misfits d
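One common way to pick a rank from a singular value spectrum is to keep enough modes to explain a target fraction of the ensemble variance; the criterion form and threshold below are illustrative, not the specific ESSE criteria:

```python
import numpy as np

def select_rank(ensemble_anomalies, frac=0.99):
    """Pick the error-subspace rank explaining a target fraction of the
    ensemble variance (a sketch of one possible quantitative criterion)."""
    s = np.linalg.svd(ensemble_anomalies, compute_uv=False)
    var = s**2
    cum = np.cumsum(var) / var.sum()
    return int(np.searchsorted(cum, frac) + 1)

rng = np.random.default_rng(2)
# Synthetic ensemble: 200 members of a 50-dim state with a fast-decaying spectrum.
modes = rng.standard_normal((50, 50))
scales = 2.0 ** -np.arange(50)
A = modes @ np.diag(scales) @ rng.standard_normal((50, 200))
A -= A.mean(axis=1, keepdims=True)           # anomalies about the ensemble mean

rank = select_rank(A, frac=0.99)
```

Monitoring how `rank` and the retained variance change as members are added gives a quantitative stopping rule for the ensemble size.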
Sensitivity of T. error correlation estimates to error subspace rank Ens. Size of 500 Ens. Size of 500, Subspace of Rank 100 Ens. Size of 500, Subspace of Rank 300 Ens. Size of 500, Subspace of Rank 20
Sensitivity of T. err. cor. estimates to ensemble size and subspace rank Ens. Size of 100 Ens. Size of 500, Subspace of Rank 100 Ens. Size of 100, Subspace of Rank 20 Ens. Size of 500, Subspace of Rank 20
Stochastic Primitive Equation Model (see Lermusiaux, JCP, 2006)
- The diagonal matrices of time-decorrelations and of noise variances are specified
- Here, the noise variances are chosen as a function of z only, with amplitude set to a small fraction ε of a geostrophic scaling
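The structure of such forcing (one decorrelation timescale, depth-dependent amplitude) can be sketched with first-order autoregressive noise in time; the timescale, depth profile and amplitudes are assumptions for illustration only:

```python
import numpy as np

def stochastic_forcing(n_steps, dt, tau, sigma_z, rng):
    """Red (AR(1)) noise in time with decorrelation timescale tau and
    depth-dependent standard deviation sigma_z (all values illustrative)."""
    a = np.exp(-dt / tau)                    # AR(1) coefficient
    q = sigma_z * np.sqrt(1.0 - a**2)        # keeps the stationary std at sigma_z
    n = np.zeros((n_steps, sigma_z.size))
    for k in range(1, n_steps):
        n[k] = a * n[k - 1] + q * rng.standard_normal(sigma_z.size)
    return n

rng = np.random.default_rng(3)
z = np.linspace(0.0, 200.0, 21)              # depths [m]
sigma_z = 0.05 * np.exp(-z / 100.0)          # amplitude decaying with depth (assumed form)
forcing = stochastic_forcing(n_steps=2000, dt=300.0, tau=3600.0, sigma_z=sigma_z, rng=rng)
```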
Sensitivity of T. err. cor. estimates to stochastic forcing (and error subspace rank) Ens. Size of 500 Ens. Size of 500, with stochastic forcing Ens. Size of 500, Subspace of Rank 100 Ens. Size of 500, stochas. frc., Rank 100
Percentage of Variance Explained (normalized, cumulative error variance)
Red dash-dotted: without stochastic forcing; blue: with stochastic forcing
5) Effect of an (Adaptive) Schur Product of ESSE Covariances with a Matrix Whose Values Decay with Distance
- T and S profiles assimilated on Aug 28, 2003
- ESSE error standard deviation prediction for surface T
- Reduction (prior minus posterior) of the error standard deviation due to DA of T and S, without tapering by the Schur product
- Error reduction, with tapering by the Schur product
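Schur-product tapering can be sketched on a 1-D grid; the Gaussian taper shape, cutoff and length scale are illustrative stand-ins (Gaspari-Cohn functions are another common choice):

```python
import numpy as np

def taper_matrix(coords, L):
    """Correlation taper whose values decay with distance and vanish
    beyond ~2L (shape and cutoff are assumptions for illustration)."""
    d = np.abs(coords[:, None] - coords[None, :])
    return np.exp(-0.5 * (d / L) ** 2) * (d < 2.0 * L)

rng = np.random.default_rng(4)
n, rank = 60, 5
E = rng.standard_normal((n, rank))
B = E @ E.T / rank                        # low-rank sampled covariance: noisy long-range values

coords = np.linspace(0.0, 100.0, n)       # 1-D grid positions [km], illustrative
C = taper_matrix(coords, L=15.0)
B_tapered = C * B                         # Schur (elementwise) product suppresses
                                          # spurious long-range covariances
```

The taper leaves short-range covariances nearly intact while zeroing the sampling noise between distant points, which is what localizes the DA increments.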
DA increments (surface T), No tapering DA increments (surface T), With tapering Error Reduction no tapering by Schur product Error reduction, with tapering by Schur product
Multi-Model Fusion for Ocean Prediction Based on Adaptive Uncertainty Estimation
- A methodology for multi-model forecast fusion
- Adaptive uncertainty estimation schemes: bias correction followed by error variance estimation
- Capable of operating with observational data that are limited and sparse (in space and in time) with respect to the dominant ocean scales
- Adaptive/sequential, using small samples of error estimation events (possibly a single event)
Bayesian Multi-Model Fusion: Approach and Assumptions
- Errors = systematic + random components
- Markovian behavior (past errors are at least partially relevant to future errors)
Steps:
1) Estimate and correct the biases of each model
2) Estimate the error (co)variances of each bias-corrected model
3) Optimally combine the states or forecasts
Sequential/Adaptive Schemes: Uncertainty Estimation from Incomplete Data-Model Misfits
1) Linear sequential bias estimator (of minimum error variance)
2) Error variance estimator of minimum mean-square error
Multi-model fusion based on minimum error variance
Bayesian Multi-Model Fusion (at any fixed time)
- Seek the multi-model (central) forecast as a linear combination of the individual forecasts (interpolated onto the same grid), with spatially varying weights
- The weight matrices D are found to ensure that the central forecast has minimum error variance and is unbiased (for uncorrelated model errors, with the unbiased-estimator constraint) (Gorokhov and Stoica, IEEE Trans. Sig. Proc., 2000)
- Logutov, O.G., Multi-Model Fusion and Uncertainty Estimation for Ocean Prediction. Ph.D. dissertation, Harvard University, 2007
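For uncorrelated model errors and diagonal weights, the minimum-error-variance unbiased combination reduces pointwise to inverse-variance weighting; the forecasts and variances below are made-up numbers:

```python
import numpy as np

def fuse_forecasts(forecasts, error_vars):
    """Minimum-error-variance, unbiased combination of bias-corrected
    forecasts, assuming uncorrelated model errors: pointwise inverse-variance
    weights that sum to one. Shapes are (n_models, n_points)."""
    w = 1.0 / error_vars
    w /= w.sum(axis=0, keepdims=True)          # unbiasedness: weights sum to 1
    central = (w * forecasts).sum(axis=0)
    central_var = 1.0 / (1.0 / error_vars).sum(axis=0)
    return central, central_var

# Two hypothetical SST forecasts on a tiny grid, with spatially varying variances.
f = np.array([[14.0, 15.0, 16.0],
              [14.5, 15.5, 15.0]])
v = np.array([[0.25, 1.00, 0.25],
              [1.00, 0.25, 0.25]])
central, central_var = fuse_forecasts(f, v)
```

Note that the central forecast's variance is never larger than that of the best individual model at each point, which is the payoff of the fusion.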
Employing Bayesian Multi-Model Fusion to Integrate Multiple Models into a Single Ocean Prediction System
Example: 24-hour HOPS/ROMS SST forecast, valid Aug 28, 2003
- Consists of combining the individual forecasts based on their relative error variances (defined here as the uncertainty)
- Minimizes the error variance of the multi-model (central) forecast
- The spatially varying diagonal weights have a clear interpretation as Bayes factors associated with the individual models
Optimal Multi-Model Fusion: we need unbiased forecasts and the forecast error variances (main diagonal of B_k^(i))
Characteristics of Coastal Ocean Data Assimilation and Prediction
- Observational data are sparse in space and in time
- Data are collected at different locations for different validation events
- The volume of data changes with time
[Maps: data-model misfits [°C] at z = 10 m on 8/7/2003, 8/11/2003 and 8/24/2003, over the Monterey Bay domain (36°N-37°N, 123°W-122°W)]
A sequence of validation events => sequential/adaptive schemes
Sequential/Adaptive Bias Estimation
- Model-data misfits consist of the bias and of the random forecast and observational errors (m data points)
- Practical bias model: sequential, level-averaged, linear misfit update
- Using the above misfit definition, the error variance of the bias estimate follows
- The weights w are chosen to minimize the error variance of the bias estimate, subject to the unbiased-estimator constraint
Sequential/Adaptive Bias Estimation (continued)
The solution to this constrained minimum-error-variance problem (see Gorokhov and Stoica, IEEE Trans. Sig. Proc., 2000) yields the optimal bias model and the corresponding weights w.
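The structure of such a sequential, level-averaged update can be sketched as a minimum-variance blend of the prior bias estimate with the mean of each new batch of misfits; the formulas below capture that structure but are not the exact operational expressions, and all numbers are synthetic:

```python
import numpy as np

def update_bias(b_prior, var_prior, misfits, var_misfit):
    """One sequential bias update for a single level: combine the prior bias
    estimate with the average of the new data-model misfits using
    minimum-error-variance weights (sketch under the stated assumptions)."""
    m = len(misfits)
    var_new = var_misfit / m                   # variance of the misfit average
    w = var_new / (var_prior + var_new)        # weight kept on the prior estimate
    b_post = w * b_prior + (1.0 - w) * np.mean(misfits)
    var_post = w * var_prior                   # error variance of the updated estimate
    return b_post, var_post

# Three validation events for one level, with a true bias of about 0.5 C.
rng = np.random.default_rng(5)
b, v = 0.0, 4.0                                # vague prior
for _ in range(3):
    misfits = 0.5 + 0.3 * rng.standard_normal(20)
    b, v = update_bias(b, v, misfits, var_misfit=0.3**2)
```

The prior error variance `var_prior` controls the adaptiveness: a large value lets new data dominate, a small value retains more of the past estimate.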
Example of bias estimation for MREA 2003 Exercises (Ligurian Sea) Bias corrected 24-hour forecast profiles Bias correction from one validating event Bias correction from three validating events
Bias estimation for AOSN-2 Z=10 m Z=150 m HOPS ROMS
Error (Co)variance Estimation
- Given q realizations of the (random) forecast error, the unconstrained maximum-likelihood error covariance estimate has a Wishart distribution of order q
- This classic estimator has a large variance for small q
- Consider the mean-squared error (MSE) of any estimator of B
Error (Co)variance Estimator
- Look for the error (co)variance estimate as a linear combination of the classic (ML) unconstrained estimate and of a spatially constrained estimate
- The unconstrained estimate is asymptotically unbiased but can have a large estimation error variance
- The constrained estimate (e.g. constant on fixed depth/density levels) has a bias coming from the structural assumption, but a smaller estimation error variance
- The MSE of the combined estimator can be expressed via the expectation and variance of quadratic forms in normal variables: given x ~ N(μ, Σ), consider the quadratic form y = x^T A x; then E[y] = tr(AΣ) + μ^T A μ and Var[y] = 2 tr(AΣAΣ) + 4 μ^T A Σ A μ
Presently: Error Variance Estimator
- Look for the uncertainty estimate as a linear combination (with weights λ) of the unconstrained and constrained estimates, here using three validation events
- The optimal λ's are found from a small linear system whose coefficients are simple expressions in terms of traces
- Example of uncertainty estimate for AOSN-II
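The bias-variance tradeoff behind this combined estimator can be sketched as follows; the layer structure and sample sizes are synthetic, and λ is fixed to an illustrative value rather than computed from the trace expressions:

```python
import numpy as np

rng = np.random.default_rng(6)
n, q = 30, 5                                  # state size, few error realizations

# Synthetic "true" error variances: constant on each of two depth layers.
true_var = np.concatenate([np.full(15, 1.0), np.full(15, 4.0)])
errors = rng.standard_normal((q, n)) * np.sqrt(true_var)

v_ml = errors.var(axis=0)                     # unconstrained (ML) estimate: noisy for small q
v_layer = np.concatenate([                    # constrained estimate: level-averaged
    np.full(15, v_ml[:15].mean()),
    np.full(15, v_ml[15:].mean()),
])

lam = 0.3                                     # illustrative weight; in the scheme the optimal
v_hat = lam * v_ml + (1.0 - lam) * v_layer    # lambdas come from the trace-based system
```

Shrinking toward the level-averaged estimate trades a small structural bias for a large reduction in estimation variance, which pays off when q is small.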
Example of Central Forecast for AOSN-2 Central Forecast HOPS ROMS 24-hour T forecast for Aug 14, 2003
RMSE of HOPS, ROMS, and Two-model (Central) 24-hour Temperature Forecasts Z=10 m Z=150 m
Combine uncertainty estimation from data-model misfits with the adaptive ensemble-based ESSE uncertainty modeling. By analyzing the expectation and variance of quadratic forms, we can compute the error variance of uncertainty estimates generated from data-model misfits (the uncertainty of the uncertainty).
CONCLUSIONS
- Even though much more research on adaptive DA is needed, results indicate that error estimates, ensemble sizes, error subspace ranks, covariance tapering parameters and stochastic error models can and should be calibrated by quantitative adaptation to observational data
- New Bayesian-based fusion of multiple model estimates, based on:
  - estimation of the uncertainties (bias + variance) of ocean models from the comparison of past model estimates to measurements
  - a subsequent sum of the model estimates with minimum-error-variance-based weights
- Much work remains, including:
  - combinations of adaptive DA and multi-model fusion schemes
  - inferring the improvements needed in the models (adaptive modeling)
Conclusions
- The formalisms of Bayesian multi-model fusion, sequential bias estimation, and forecast uncertainty estimation, suited for ocean prediction, provide a methodology for integrating multiple ocean models into a single ocean prediction system
- Multi-model fusion combines the individual forecasts based on their relative uncertainties; it minimizes the error variance of the central forecast, and the spatially varying weights have a clear interpretation as Bayes factors associated with the individual models
- Sequential bias estimation differs from Dee et al.-type algorithms in that we explicitly compute the error variance of the bias estimate and use that error variance once new data become available. The adaptiveness of the algorithm is controlled through the prior estimate error variance, which determines the effect of the prior data on the current bias estimate
- Uncertainty estimation: by analyzing the expectation and variance of quadratic forms, we can compute the error variance of uncertainty estimates generated from data-model misfits. These estimates can therefore be combined with model-propagated uncertainties using Bayesian principles
Error Subspace Statistical Estimation (ESSE)
- Uncertainty forecasts (with a dynamic error subspace and error learning)
- Ensemble-based (with a nonlinear and stochastic primitive equation model, HOPS)
- Multivariate, non-homogeneous and non-isotropic data assimilation (DA)
- Consistent DA and adaptive sampling schemes
- Software: not tied to any model, but specifics currently tailored to HOPS
STOCHASTIC FORCING MODEL: Sub-grid-scales
Example of bias estimation for MREA 2003 exercises: bias estimation from a single validation profile; 24-hour forecast profiles in the 2nd half of the experiment with the bias model trained on the 1st half
Tidal Inversion
- New HU code implemented in Matlab
- Shallow-water equations in the frequency domain, with open-boundary forcing
- The inverse solution is found using the adjoint of the dynamics, the dynamic error covariance, and the observational error covariance
- Reference: Egbert, G.D. and S. Erofeeva (2002). Efficient Inverse Modeling of Barotropic Ocean Tides. J. Atmos. Ocean. Tech., 19, 183-204.
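The structure of such a generalized-inverse solution (prior solution corrected by a minimum-error-variance weighting of the data misfit through the dynamic and observational error covariances) can be sketched as follows; the grid, observation operator and covariance shapes are toy assumptions, not the Egbert-Erofeeva implementation:

```python
import numpy as np

def generalized_inverse(u0, H, P, R, d):
    """Minimum-error-variance correction of a prior solution u0 given
    observations d = H u + noise; P is the dynamic error covariance,
    R the observational error covariance (illustrative matrices)."""
    innovation = d - H @ u0
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    return u0 + K @ innovation

rng = np.random.default_rng(7)
n, m = 12, 4                                  # state and observation sizes
u_true = np.sin(np.linspace(0.0, np.pi, n))
H = np.zeros((m, n)); H[np.arange(m), [1, 4, 7, 10]] = 1.0   # point observations
d = H @ u_true + 0.01 * rng.standard_normal(m)

u0 = np.zeros(n)                              # crude prior solution
P = np.exp(-0.5 * ((np.arange(n)[:, None] - np.arange(n)[None, :]) / 3.0) ** 2)
R = 0.01**2 * np.eye(m)
u_inv = generalized_inverse(u0, H, P, R, d)
```

With small observational error the inverse solution fits the data closely at the observed points, while the covariance P spreads the correction to neighboring points.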