Advanced uncertainty evaluation of climate models by Monte Carlo methods

Advanced uncertainty evaluation of climate models by Monte Carlo methods
Marko Laine, marko.laine@fmi.fi
Pirkka Ollinaho, Janne Hakkarainen, Johanna Tamminen, Heikki Järvinen (FMI)
Antti Solonen, Heikki Haario (LUT)
Alexander Ilin, Erkki Oja (Aalto)
FMI: Finnish Meteorological Institute. LUT: Lappeenranta University of Technology.
SIAM UQ 2012

M.Laine: Advanced uncertainty evaluation of climate models by Monte Carlo methods 2/27
Contents
Adaptive Markov chain Monte Carlo method
Efficient MCMC for short chains
Parallel chains
Early rejection
MCMC for ECHAM5 climate model
Formulating the cost function
Stochastic Lorenz 95 test case
EPPES ensemble prediction and parameter estimation
Lorenz 95 experiment with EPPES
ECHAM5 experiment with EPPES

Adaptive MCMC
The current work builds on our success in applying and developing adaptive Markov chain Monte Carlo (MCMC) methods for chemical kinetics, ecological models, and satellite retrieval. In 2010 we started a project on applying MCMC to climate model closure parameter uncertainty evaluation.
Haario H., Saksman E., Tamminen J.: An adaptive Metropolis algorithm. Bernoulli 7(2), pp. 223-242, 2001.
Haario H., M. Laine, M. Lehtinen, E. Saksman, J. Tamminen: Markov chain Monte Carlo methods for high dimensional inversion in remote sensing, with discussion, J.R. Statist. Soc. B, 66, part 3, pp. 591-607, 2004.
Haario H., M. Laine, A. Mira, E. Saksman: DRAM: Efficient adaptive MCMC. Stat. Comput. 16, pp. 339-354, ISSN 0960-3174, 2006.
Laine, M., J. Tamminen: Aerosol model selection and uncertainty modelling by adaptive MCMC technique. Atmos. Chem. Phys., 8, 2008.

Terminology for modeling with MCMC methods
The observation model in general form is
$$y = f(x; \theta) + \epsilon, \qquad \text{observations} = \text{model} + \text{error}.$$
The likelihood function for Gaussian errors corresponds to a quadratic cost function, with
$$p(y \mid \theta) \propto \exp\left\{ -\frac{1}{2} \sum_{i=1}^{n} \frac{(y_i - f(x_i; \theta))^2}{\sigma^2} \right\} = \exp\left\{ -\frac{1}{2} \frac{SS(\theta)}{\sigma^2} \right\},$$
where $SS(\theta) = -2 \log(p(y \mid \theta))$ (up to the factor $\sigma^2$ and an additive constant) is the log-likelihood in "sum-of-squares" cost function format.
For calculating the posterior, we also need to account for $SS_{\mathrm{pri}}(\theta) = -2 \log(p(\theta))$, the prior "sum-of-squares".
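As a minimal sketch of the "sum-of-squares" format, assuming a hypothetical straight-line model `line` and made-up data (neither comes from the talk), SS(θ) and the Gaussian log-likelihood kernel could be computed like this:

```python
import numpy as np

def sum_of_squares(theta, x, y, model):
    """SS(theta): sum of squared residuals between observations and model."""
    residuals = y - model(x, theta)
    return float(np.sum(residuals ** 2))

def log_likelihood_kernel(theta, x, y, model, sigma2):
    """Gaussian log-likelihood up to an additive constant: -SS(theta) / (2 sigma^2)."""
    return -0.5 * sum_of_squares(theta, x, y, model) / sigma2

# Hypothetical toy model, for illustration only: f(x; theta) = theta0 + theta1 * x.
def line(x, theta):
    return theta[0] + theta[1] * x

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])          # made-up observations
theta = np.array([1.0, 2.0])

ss = sum_of_squares(theta, x, y, line)
loglik = log_likelihood_kernel(theta, x, y, line, sigma2=0.1)
```

Keeping SS(θ) separate from σ² means the same cost function can be reused if the error variance is changed or estimated.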

Metropolis-Hastings algorithm
Random walk Metropolis-Hastings algorithm with Gaussian proposal distribution (and Gaussian likelihood).
Propose a new parameter value $\theta_{\mathrm{prop}} = \theta_{\mathrm{curr}} + \xi$, where $\xi \sim N(0, \Sigma_{\mathrm{prop}})$ is drawn from the proposal distribution.
Accept $\theta_{\mathrm{prop}}$ with probability $\alpha$,
$$\alpha(\theta_{\mathrm{curr}}, \theta_{\mathrm{prop}}) = \min\left(1, \exp\left\{ -\frac{1}{2} \frac{SS(\theta_{\mathrm{prop}}) - SS(\theta_{\mathrm{curr}})}{\sigma^2} - \frac{1}{2} \left( SS_{\mathrm{pri}}(\theta_{\mathrm{prop}}) - SS_{\mathrm{pri}}(\theta_{\mathrm{curr}}) \right) \right\} \right).$$
Efficient proposal distribution: adaptive tuning of $\Sigma_{\mathrm{prop}}$, AM and DRAM algorithms.
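The random walk step and the acceptance rule in sum-of-squares form can be sketched as follows. This is an illustrative implementation, not the authors' code: the function name, the toy Gaussian target, the flat prior and the proposal covariance are all assumptions made for the example.

```python
import numpy as np

def metropolis_hastings(ss, ss_pri, theta0, prop_cov, sigma2, n_steps, rng):
    """Random walk Metropolis-Hastings with a fixed Gaussian proposal."""
    theta = np.asarray(theta0, dtype=float)
    chol = np.linalg.cholesky(prop_cov)
    cost = ss(theta) / sigma2 + ss_pri(theta)
    chain = np.empty((n_steps, theta.size))
    accepted = 0
    for i in range(n_steps):
        # theta_prop = theta_curr + xi, with xi ~ N(0, prop_cov)
        prop = theta + chol @ rng.standard_normal(theta.size)
        cost_prop = ss(prop) / sigma2 + ss_pri(prop)
        # alpha = min(1, exp(-0.5 * (cost_prop - cost))), compared on the log scale
        if np.log(rng.uniform()) < -0.5 * (cost_prop - cost):
            theta, cost = prop, cost_prop
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_steps

# Hypothetical target: SS(theta) = |theta - target|^2 with sigma2 = 1 and a flat prior.
target = np.array([1.0, -1.0])
rng = np.random.default_rng(0)
chain, rate = metropolis_hastings(
    ss=lambda th: float(np.sum((th - target) ** 2)),
    ss_pri=lambda th: 0.0,
    theta0=np.zeros(2),
    prop_cov=2.88 * np.eye(2),   # assumed value, roughly 2.4^2 / d times the target covariance
    sigma2=1.0,
    n_steps=20000,
    rng=rng,
)
```

Working with the cost difference on the log scale avoids overflow in the exponential when a proposal is much better than the current point.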

Short chains and adaptation
It is important to make short chains as efficient as possible.
Efficient: produce estimates with small Monte Carlo error.
[Figure: two panels, "mean of the 1. parameter" and "95% quantile", each plotted against simulation index for the mh, am, dram and ram algorithms.]
Short MCMC chain repeated 1 times with different algorithms. Gaussian 1 dimensional target, too large initial covariance.

Short chains and adaptation
But adaptation might slow the convergence.
[Figure: two panels, "mean of the 1. parameter" and "95% quantile", each plotted against simulation index for the mh, am, dram and ram algorithms.]
Same as in the previous slide, but now with a more optimal initial proposal. Gaussian 1 dimensional target, near optimal initial covariance.

Faster MCMC: parallel chains
Random walk MCMC is by nature sequential, and it is generally more efficient to run one long chain than many short independent chains.
In parallel adaptive MCMC, the adaptation is done over the points in all chains, and the chains share one common adapted proposal covariance.
Communication between the chains can be asynchronous.
[Diagram: three parallel chains, each with proposal $\theta_{i+1} \sim N(\theta_i, \Sigma_i)$, target $\pi(\theta_{i+1}) \propto p(\theta_{i+1})\, l(y \mid \theta_{i+1})$ and acceptance probability $\alpha = \min(1, \pi(\theta_{i+1})/\pi(\theta_i))$, all feeding a common adaptation step:]
$$\bar{\theta}_{i+1} = \bar{\theta}_i + \frac{1}{i+1} (\theta_i - \bar{\theta}_i), \qquad \Sigma_{i+1} = \frac{i-1}{i} \Sigma_i + \frac{1}{i} (\theta_i - \bar{\theta}_i)(\theta_i - \bar{\theta}_i)^T$$
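A minimal sketch of the shared adaptation, assuming the points from all chains are simply pooled into one running mean and covariance. The code uses the exact textbook recursion for the unbiased sample covariance, whose indexing may differ slightly from the slide; the random stand-in "chains" and the 2.4²/d proposal scaling (the usual adaptive Metropolis choice) are assumptions of this example.

```python
import numpy as np

def update_mean_cov(mean, cov, n, theta_new):
    """Fold one new point into the running mean and unbiased covariance of n points."""
    delta = theta_new - mean
    new_mean = mean + delta / (n + 1)
    new_cov = (n - 1) / n * cov + np.outer(delta, delta) / (n + 1)
    return new_mean, new_cov, n + 1

rng = np.random.default_rng(1)
chains = rng.normal(size=(3, 50, 2))       # stand-in for points from 3 chains, 50 steps, 2 parameters
# Interleave the chains: step 0 of every chain, then step 1 of every chain, and so on.
pooled = chains.transpose(1, 0, 2).reshape(-1, 2)

# Start from a single point: its mean is the point itself and its covariance is zero.
mean, cov, n = pooled[0].copy(), np.zeros((2, 2)), 1
for theta in pooled[1:]:
    mean, cov, n = update_mean_cov(mean, cov, n, theta)

# Common proposal covariance shared by all chains (small jitter keeps it positive definite).
d = mean.size
proposal_cov = (2.4 ** 2 / d) * cov + 1e-8 * np.eye(d)
```

Because the update only needs the current mean, covariance and count, chains can contribute points in any order, which is what makes asynchronous communication possible.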

Faster MCMC: early rejection
Idea: evaluate the likelihood in parts and check after each part whether the proposed parameter value can already be rejected.
[Figure: left panel, cumulative cost function evaluated after each month during a one year climate model simulation (cost function value against month), marking the time to stop the simulation; right panel, proportion of stopped runs by early rejection month.]
This simple trick saved 10%-80% of CPU time in different test cases.

Early rejection
In many cases $SS(\theta)$ is a monotonically increasing function with respect to adding new observations or simulating the model further in time.
With the acceptance probability $\alpha(\theta_{\mathrm{curr}}, \theta_{\mathrm{prop}})$ defined in the previous slides, we draw $u \sim U(0, 1)$ and reject if
$$-2 \log(u) < \frac{SS(\theta_{\mathrm{prop}}) - SS(\theta_{\mathrm{curr}})}{\sigma^2} + SS_{\mathrm{pri}}(\theta_{\mathrm{prop}}) - SS_{\mathrm{pri}}(\theta_{\mathrm{curr}}).$$
If we write this as
$$SS_{\mathrm{crit}} = -2 \log(u) + SS(\theta_{\mathrm{curr}})/\sigma^2 + SS_{\mathrm{pri}}(\theta_{\mathrm{curr}}) < SS(\theta_{\mathrm{prop}})/\sigma^2 + SS_{\mathrm{pri}}(\theta_{\mathrm{prop}}),$$
then we can stop evaluating the model as soon as
$$SS(\theta_{\mathrm{prop}}) > \left( SS_{\mathrm{crit}} - SS_{\mathrm{pri}}(\theta_{\mathrm{prop}}) \right) \sigma^2.$$
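The stopping rule can be sketched as below, assuming the cost function arrives as a sequence of nonnegative partial contributions (for example one per simulated month). The function names and the made-up monthly values are hypothetical, chosen only to illustrate the rule.

```python
import numpy as np

def rejection_threshold(u, ss_curr, ss_pri_curr, sigma2):
    """SS_crit = -2 log(u) + SS(theta_curr) / sigma^2 + SS_pri(theta_curr)."""
    return -2.0 * np.log(u) + ss_curr / sigma2 + ss_pri_curr

def evaluate_with_early_rejection(ss_parts, ss_crit, ss_pri_prop, sigma2):
    """Accumulate SS(theta_prop) part by part and stop once rejection is certain.

    ss_parts may be a generator, so the model is only advanced while needed.
    Returns (accepted, number_of_parts_evaluated).
    """
    limit = (ss_crit - ss_pri_prop) * sigma2
    total = 0.0
    used = 0
    for part in ss_parts:
        total += part
        used += 1
        if total > limit:          # SS can only grow, so the proposal is already rejected
            return False, used
    return True, used

# Made-up example: 12 monthly contributions of 1.0 each, flat prior, sigma2 = 1.
monthly = [1.0] * 12
crit = rejection_threshold(u=0.5, ss_curr=5.0, ss_pri_curr=0.0, sigma2=1.0)
accepted, months_run = evaluate_with_early_rejection(iter(monthly), crit, 0.0, 1.0)
```

Drawing u before the model run is the key design choice: it turns the acceptance test into a fixed threshold that the growing cost can be compared against after every part.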

MCMC for ECHAM5 climate model
Studies on the feasibility of MCMC for large scale climate models, on the formulation of the likelihood, and on the problem of chaotic behaviour of the models.
H. Järvinen, P. Räisänen, M. Laine, J. Tamminen, A. Ilin, E. Oja, A. Solonen, H. Haario: Estimation of ECHAM5 climate model closure parameters with adaptive MCMC, Atmospheric Chemistry and Physics, 10(1), pages 9993-10002, 2010. doi:10.5194/acp-10-9993-2010
J. Hakkarainen, A. Ilin, A. Solonen, M. Laine, H. Haario, J. Tamminen, E. Oja, H. Järvinen: On closure parameter estimation in chaotic systems, Nonlinear Processes in Geophysics, 19(1), pages 127-143, 2012. doi:10.5194/npg-19-127-2012
A. Solonen, P. Ollinaho, M. Laine, H. Haario, J. Tamminen, H. Järvinen: Efficient MCMC for climate model parameter estimation: parallel adaptive chains and early rejection, Bayesian Analysis, 7(2), 1-22, 2012.

Climate model parametrization
In climate models, the atmosphere is divided into cells.
The scale of many important processes is smaller than a single cell (e.g. clouds and rain).
These processes are parametrized: for example, clouds and rain are calculated based on the knowledge of the humidity and temperature in the cell.
Figure 2.2: The grid points that ECHAM5 uses in horizontal resolution T21.

Stochastic Lorenz 95 test case
We have 40 slow state variables and 320 fast state variables, whose effect is parametrized in the forecast model.
Good test case to study:
Estimation methodologies.
Different parameterizations $g(x_k, \theta)$.
Modeling error.
Filtering and ensemble methods.
NATURE:
$$\frac{dx_k}{dt} = -x_{k-1}(x_{k-2} - x_{k+1}) - x_k + F - \frac{hc}{b} \sum_{j=J(k-1)+1}^{Jk} y_j$$
$$\frac{dy_j}{dt} = -cb\, y_{j+1}(y_{j+2} - y_{j-1}) - c\, y_j + \frac{c}{b} F_y + \frac{hc}{b}\, x_{1 + \lfloor (j-1)/J \rfloor}$$
FORECAST MODEL:
$$\frac{dx_k}{dt} = -x_{k-1}(x_{k-2} - x_{k+1}) - x_k + F - g(x_k, \theta)$$
Wilks, D.: Effects of stochastic parametrizations in the Lorenz 96 system, Quart. J. Roy. Meteor. Soc., 131(606), 389-407, 2005. doi:10.1256/qj.04.03
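The two right-hand sides can be written compactly with cyclic shifts. The sketch below is only an illustration: the constants F, F_y, h, c, b are set to commonly used values that are not stated on the slide, and the linear form g(x, θ) = θ0 + θ1·x is an assumed parameterization.

```python
import numpy as np

def lorenz95_nature(x, y, F=10.0, Fy=10.0, h=1.0, c=10.0, b=10.0):
    """Tendencies of the full system: K slow variables x and K*J fast variables y."""
    K = x.size
    J = y.size // K
    # np.roll(x, 1)[k] is x[k-1]; np.roll(x, -1)[k] is x[k+1] (cyclic boundaries).
    dx = (-np.roll(x, 1) * (np.roll(x, 2) - np.roll(x, -1)) - x + F
          - (h * c / b) * y.reshape(K, J).sum(axis=1))
    # np.repeat(x, J) couples each fast variable to the slow variable of its sector.
    dy = (-c * b * np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1)) - c * y
          + (c / b) * Fy + (h * c / b) * np.repeat(x, J))
    return dx, dy

def lorenz95_forecast(x, theta, F=10.0):
    """Tendencies of the forecast model, with the fast variables replaced by g(x, theta)."""
    g = theta[0] + theta[1] * x            # assumed linear parameterization
    return -np.roll(x, 1) * (np.roll(x, 2) - np.roll(x, -1)) - x + F - g

x = np.full(40, 2.0)                       # 40 slow variables
y = np.zeros(320)                          # 8 fast variables per slow variable
dx, dy = lorenz95_nature(x, y)
dx_forecast = lorenz95_forecast(x, theta=np.array([1.0, 0.5]))
```

These functions return tendencies only; any standard ODE integrator (for instance a fourth-order Runge-Kutta step) can be used to advance the state in time.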

Lorenz 95 true forcing vs. parameterization
[Figure: forcing due to fast variables plotted against the slow variables.]
The true effect of the fast variables with respect to the values of the slow variables in a full model simulation. Red lines give the estimated optimal linear parameterization and its uncertainty.

Climate model parametrization
Currently, best expert knowledge is used to define the optimal closure parameter values, based on observations, process studies, etc.
Closure parameters act as tuning handles of the simulated climate.
Our goal is to come up with an objective, algorithmic way to determine the closure parameters.
ECHAM5 parameters estimated by MCMC and ensemble methods:
Parameter   Description
CAULOC      A parameter influencing the accretion of cloud droplets by precipitation (rain formation in stratiform clouds).
CMFCTOP     Relative cloud mass flux at the level above non-buoyancy (in cumulus mass flux scheme).
CPRCON      A coefficient for determining conversion from cloud water to rain (in convective clouds).
ENTRSCV     Entrainment rate for shallow convection.

Cost function - likelihood
For climate model parameter estimation, it is essential to define a metric with which we can measure the goodness of the parameters.
In our MCMC experiments, we have specified different cost functions based on the net radiation at the top of the atmosphere.
Example: global $F$ and zonal $F_{t,x}$ modeled mean fluxes are compared to the observed ones, $F^o$ and $F^o_{t,x}$.
$$J(\theta) = \frac{(F - F^o)^2}{\sigma^2} + \sum_{t=1}^{12} \sum_{y} \frac{(F_{t,x} - F^o_{t,x})^2}{\sigma^2_{t,x}}$$
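A cost function of this shape is straightforward to evaluate once the modeled and observed fluxes are available as arrays. The sketch below assumes the zonal fluxes are stored as a 12 x n_lat array (months by latitude bands); the function name, array layout and numbers are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def radiation_cost(f_global, f_global_obs, sigma2_global,
                   f_zonal, f_zonal_obs, sigma2_zonal):
    """Global term plus the sum of variance-scaled monthly zonal mean terms."""
    global_term = (f_global - f_global_obs) ** 2 / sigma2_global
    zonal_term = np.sum((f_zonal - f_zonal_obs) ** 2 / sigma2_zonal)
    return float(global_term + zonal_term)

# Made-up numbers: 12 months, 3 latitude bands.
cost = radiation_cost(
    f_global=1.0, f_global_obs=0.0, sigma2_global=4.0,
    f_zonal=np.ones((12, 3)), f_zonal_obs=np.zeros((12, 3)),
    sigma2_zonal=np.full((12, 3), 2.0),
)
```

Each term is scaled by its own variance, so the relative weight of the global and zonal parts is set entirely by the choice of those variances.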

ECHAM5 MCMC results
[Figure: chain histograms and pairwise chains for CAULOC, CMFCTOP, CPRCON, ENTRSCV and sqrt(costf.).]
Conducting an MCMC experiment of about 3 one year climate model runs (ECHAM5 in T21 resolution) required 3 months' worth of supercomputer time.

EPPES ensemble prediction and parameter estimation system
Estimating closure parameters using ensemble runs.
Initially aimed at estimating numerical weather prediction (NWP) model closure parameters using existing operational ensemble prediction system (EPS) infrastructure.
But usable for climate model uncertainty analysis, too.
M. Laine, A. Solonen, H. Haario, H. Järvinen: Ensemble prediction and parameter estimation system: the method, Quarterly Journal of the Royal Meteorological Society, 138(663), 2012. doi:10.1002/qj.922
H. Järvinen, M. Laine, A. Solonen, H. Haario: Ensemble prediction and parameter estimation system: the concept, Quarterly Journal of the Royal Meteorological Society, 138(663), 2012. doi:10.1002/qj.923
P. Ollinaho, H. Järvinen, M. Laine, A. Solonen, H. Haario: NWP model forecast skill optimization via closure parameter variations, Quarterly Journal of the Royal Meteorological Society, in revision, 2012.

The EPPES concept for NWP
EPPES = EPS + parameter estimation.
In addition to initial value perturbations, model parameters $\theta$ are sampled from a proposal distribution.
The parameters are weighted according to a cost function that depends on forecast skill.
The background uncertainty in $\theta$ is modeled as Gaussian, $\theta \sim N(\mu, \Sigma)$, with $\mu$ and $\Sigma$ estimated sequentially.
[Figure: schematic with "model space" and "parameter space" panels.]

Cost function and likelihood
Cost function for model $F$, observations $y_k$, initial values $x_k$, and parameter $\theta_k$ for forecast time window $k$:
$$(y_k - F(x_k; \theta_k))' \Sigma_{\mathrm{obs}}^{-1} (y_k - F(x_k; \theta_k)) + (\theta_k - \mu)' \Sigma^{-1} (\theta_k - \mu) = J_{\mathrm{obs}} + J_{\mathrm{pri}},$$
interpreted as a statistical inverse problem:
$$p(\theta_k \mid y_k) \propto \exp(-\tfrac{1}{2} J_{\mathrm{obs}}) \exp(-\tfrac{1}{2} J_{\mathrm{pri}}) \propto p(y_k \mid \theta_k)\, p(\theta_k) = \text{likelihood} \times \text{prior}.$$
In EPPES we estimate the local parameters $\theta_k$ by importance sampling for $p(\theta_k \mid y_k)$, and also, by sequential updates, the optimal global hyperparameters $\mu$ and $\Sigma$ that describe the variability in $\theta$ between the time windows.
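If the ensemble parameters are drawn from the Gaussian background N(µ, Σ), the prior factor cancels in the importance ratio and each member's weight is proportional to exp(-J_obs/2). A minimal sketch of that weighting, with made-up cost values and a hypothetical function name:

```python
import numpy as np

def importance_weights(j_obs):
    """Normalized weights w_i proportional to exp(-0.5 * J_obs_i).

    Assumes the parameters were sampled from the prior N(mu, Sigma), so the
    weight of each member reduces to its likelihood.
    """
    logw = -0.5 * np.asarray(j_obs, dtype=float)
    logw -= logw.max()                     # stabilize the exponential
    w = np.exp(logw)
    return w / w.sum()

rng = np.random.default_rng(2)
j_obs = np.array([0.0, 2.0 * np.log(2.0), 50.0])    # made-up forecast skill costs
w = importance_weights(j_obs)
effective_size = 1.0 / np.sum(w ** 2)                # effective sample size of the ensemble
resampled = rng.choice(len(w), size=len(w), p=w)     # indices of members kept after resampling
```

The effective sample size gives a quick check on weight degeneracy: a value close to 1 means a single member dominates the ensemble.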

Random effect model
The true value of the parameter $\theta$ is treated as random. The distribution generating the individual $\theta_k$ for each forecast time window is assumed Gaussian, and the parameters of this "meta" distribution are estimated sequentially.
The model:
$$\theta_k \sim N(\mu_k, \Sigma_k), \qquad \mu_k \sim N(\mu_{k-1}, W_{k-1}), \qquad \Sigma_k \sim \mathrm{iWish}(\Sigma_{k-1}, n_{k-1}),$$
with the following update formulas:
$$W_k = \left( W_{k-1}^{-1} + \Sigma_{k-1}^{-1} \right)^{-1}$$
$$\mu_k = W_k \left( W_{k-1}^{-1} \mu_{k-1} + \Sigma_{k-1}^{-1} \theta_k \right)$$
$$n_k = n_{k-1} + 1$$
$$\Sigma_k = \left( n_{k-1} \Sigma_{k-1} + (\theta_k - \mu_k)(\theta_k - \mu_k)' \right) / n_k$$
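The four update formulas translate directly into a few lines of linear algebra. The sketch below applies one update for a single sampled θ_k, as the formulas are written; the function name and the starting values are assumptions for illustration.

```python
import numpy as np

def eppes_update(mu, W, Sigma, n, theta):
    """One sequential update of the hyperparameters (mu, W, Sigma, n) with a new theta."""
    W_inv = np.linalg.inv(W)
    Sigma_inv = np.linalg.inv(Sigma)
    W_new = np.linalg.inv(W_inv + Sigma_inv)
    mu_new = W_new @ (W_inv @ mu + Sigma_inv @ theta)
    n_new = n + 1
    d = theta - mu_new
    Sigma_new = (n * Sigma + np.outer(d, d)) / n_new
    return mu_new, W_new, Sigma_new, n_new

# Made-up starting point: zero mean, identity covariances, n = 1.
mu, W, Sigma, n = np.zeros(2), np.eye(2), np.eye(2), 1
mu, W, Sigma, n = eppes_update(mu, W, Sigma, n, theta=np.array([2.0, 0.0]))
```

With these starting values the mean moves halfway towards the new sample and W halves, which shows how the uncertainty of µ shrinks as time windows accumulate.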

Stochastic Lorenz 95 test case
We parameterize the effect of the fast states using a linear model with two closure parameters in $g(x_k, \theta)$ of the forecast model.
We run 50 member ensembles sequentially by perturbing both the initial values and the closure parameters.
We observe some of the slow variables every 2nd "day" of the truth run with added observation error.
The cost function is constructed using 6 day forecast skill.
The NATURE and FORECAST MODEL equations are the same as on the earlier Stochastic Lorenz 95 test case slide.

Stochastic Lorenz 95 experiment
Evolution of two parameters in the L95 experiment. On the left, each column of points corresponds to the proposed parameter values in one time window of the sequential estimation procedure. On the right, forecast skill is calculated over a grid, with an ellipse and dot showing the final estimated $\Sigma$ and $\mu$.
[Figure: left panel "parameter evolution", $\theta_1$ and $\theta_2$ against ensemble number; right panel ""6 day" forecast skill".]

ECHAM5 climate model experiment
The ECHAM5 atmospheric general circulation model was run in ensemble prediction (EPS) mode to estimate 4 closure parameters related to cloud formation.
Control run and 50 perturbed members.
Initial states from the ECMWF EPS system, with data from Jan 2011 to March 2011, at 00 and 12 UTC daily.
Model resolution: T42 truncation with 31 levels.
Total of 2 × 90 × 51 = 9180 parameter sample points.
Cost function: RMS of 500 hPa geopotential height 10 day forecast error.

ECHAM5 experiment
First iteration vs. last iteration as 2 dimensional plots showing the proposed parameter values, the weights they have received, and the matrices $\Sigma$ and $W$ of the EPPES algorithm.
[Figure: two sets of pairwise scatter plots of CAULOC, CMFCTOP, CPRCON and ENTRSCV, one for the first iteration (1 January 2011) and one for the last (31 March 2011, 12 UTC).]

ECHAM5 validation
RMSE variability using default parameters in blue and using the EPPES optimized parameter values in red. Independent data, but with the same cost function as in the optimization.
[Figure: April 2011, 500 hPa geopotential height RMSE against forecast length (days) for default and optimized parameters; annotation: about 6 hour increase in the forecast skill.]

Conclusions and ongoing research
MCMC can be applied to climate models with the help of adaptation and other speed-up tricks, but the problem lies in the formulation of the cost function.
Existing ensemble run infrastructures can be used to make inferences about model parameters in NWP, and this methodology can also be used with climate models.
MCMC runs, even short ones, are useful for pinpointing problems in parametrizations, in the formulation of the cost function, etc.
Which climate model fields should be included in the cost function, and how should multi-criteria cost function terms be scaled and weighted?
Can filtering methods be used in defining a climate model cost function?
How well do locally tuned parameters work in longer simulations?