Model parameters of chaotic dynamics: metrics for comparing trajectories


Model parameters of chaotic dynamics: metrics for comparing trajectories
H. Haario (1,2), L. Kalachev (3), J. Hakkarainen (2), A. Bibov (1), FMI people
June 25, 2015
(1) Lappeenranta University of Technology, Finland; (2) Finnish Meteorological Institute (FMI); (3) University of Montana, USA

Contents
- No standard likelihood due to chaoticity
- Likelihoods based on summary statistics
- State space filtering
- Metrics based on fractal dimension concepts
- Examples: Lorenz, shallow water

Chaotic systems: construction of the likelihood (cost function)?
[Figure: a state variable of a chaotic trajectory plotted against time.]
Small changes in the initial values (or solver settings) can lead to unpredictable deviations from the observations.

Likelihood: 1st attempts, summary statistics

Observations and simulations are transformed to summary statistics: $s = s(y_{1:n})$, $s_\theta = s(z_{\theta,1:n})$. The likelihood is formulated for $s$, which yields the posterior $p(\theta \mid s) \propto p(\theta)\, p(s \mid \theta)$.

The approach was implemented for the ECHAM5 climate model, using a likelihood based on monthly global and zonal net radiation averages. MCMC was used to estimate four parameters related to cloud formation and precipitation.
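To make the summary-statistics construction concrete, here is a minimal sketch in Python. It assumes a hypothetical toy setup where the summary is a pair of time-averaged statistics and $p(s \mid \theta)$ is Gaussian around the observed summary; the function names and the noise level sigma are illustrative assumptions, not the ECHAM5 configuration described above.

```python
import numpy as np

def summary(traj):
    """Reduce a trajectory (n_time, n_state) to a low-dimensional summary s."""
    return np.array([traj.mean(), traj.std(axis=0).mean()])

def log_likelihood(s_theta, s_obs, sigma=0.1):
    """Gaussian log p(s | theta) around the observed summary (sigma assumed)."""
    r = (s_theta - s_obs) / sigma
    return -0.5 * np.sum(r ** 2)

# Posterior (up to a constant): log p(theta | s) = log prior(theta)
#   + log_likelihood(summary(simulate(theta)), summary(observations))
```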

Climate model MCMC results
[Figure: pairwise MCMC scatter plots of the ECHAM5 closure parameters CAULOC, CMFCTOP, CPRCON and ENTRSCV.]
The summary statistics do not identify the parameters.

Nonsmooth cost function
[Figure: cost functions J1–J4 plotted against log(CAULOC) and log(CMFCTOP), (a) for large perturbations of the parameters and for small perturbations.]
Projections of the summary cost functions help but do not solve the problem of getting informative data for the parameters.

EKF likelihood: by state space filtering
Tame chaos by filtering the state space: use state estimation methods to keep the model close to the data. Filtering integrates out the state space; what remains gives a likelihood for the parameters. This is the standard way for linear time series (DLM, dynamic linear models) and SDE (stochastic differential equation) systems. It is less standard for chaotic dynamics, but can be implemented with the EKF.
[Figure: a state variable of the Lorenz 95 system over time, with and without assimilation, together with the observations.]
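A minimal sketch of the filter-likelihood idea for a linear-Gaussian state-space model: the innovations of a Kalman filter are accumulated into log p(y_{1:n} | θ). For chaotic dynamics the fixed transition matrix M would be replaced by an EKF linearization of the model around the current state; all matrices and names here are placeholders, not values from the talk.

```python
import numpy as np

def kalman_filter_loglik(y, M, H, Q, R, x0, P0):
    """Accumulate the innovation log-likelihood log p(y_1:n | theta)."""
    x, P, loglik = x0, P0, 0.0
    for yk in y:
        # prediction step
        x = M @ x
        P = M @ P @ M.T + Q
        # innovation and its covariance
        v = yk - H @ x
        S = H @ P @ H.T + R
        loglik += -0.5 * (v @ np.linalg.solve(S, v)
                          + np.linalg.slogdet(2 * np.pi * S)[1])
        # update step
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ v
        P = (np.eye(len(x)) - K @ H) @ P
    return loglik
```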

The toy model: parameterized Lorenz 95

NATURE:
$\frac{dx_k}{dt} = -x_{k-1}(x_{k-2} - x_{k+1}) - x_k + F - \frac{hc}{b} \sum_{j=J(k-1)+1}^{Jk} y_j$
$\frac{dy_j}{dt} = -c b\, y_{j+1}(y_{j+2} - y_{j-1}) - c y_j + \frac{c}{b} F_y + \frac{hc}{b}\, x_{1+\lfloor (j-1)/J \rfloor}$

FORECAST MODEL:
$\frac{dx_k}{dt} = -x_{k-1}(x_{k-2} - x_{k+1}) - x_k + F - g(x_k, \theta)$

We use a polynomial parameterization: $g(x_k, \theta) = \sum_{i=1}^{d} \theta_i\, x_k^{\,i-1}$
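A sketch of the forecast-model right-hand side with the polynomial closure, assuming the usual cyclic boundary conditions of Lorenz 95; the default forcing F = 10 is only an illustrative value, not a quoted setting from the talk.

```python
import numpy as np

def lorenz95_forecast_rhs(x, theta, F=10.0):
    """dx_k/dt = -x_{k-1}(x_{k-2} - x_{k+1}) - x_k + F - g(x_k, theta)."""
    # cyclic shifts implement the periodic boundary conditions
    xm1 = np.roll(x, 1)    # x_{k-1}
    xm2 = np.roll(x, 2)    # x_{k-2}
    xp1 = np.roll(x, -1)   # x_{k+1}
    # polynomial closure g(x, theta) = theta_1 + theta_2 x + theta_3 x^2 + ...
    g = sum(theta[i] * x ** i for i in range(len(theta)))
    return -xm1 * (xm2 - xp1) - x + F - g

# usage: integrate with a fixed-step RK4 or scipy.integrate.solve_ivp,
# e.g. theta = [2.0, 0.1] for a linear closure g(x) = 2.0 + 0.1 x.
```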

The toy model: parameterized Lorenz 95
[Figure: forecast-model, slow-state and fast-state variables as functions of time.]

Results: filter likelihood
[Figure: scatter plots of parameter pairs (θ parameters and log(σ²)) from MCMC runs using 1 (blue), 2 (red), 5 (black) and 5 (green) day simulations.]

Comments, so far
- The summary statistics approach has problems in properly identifying the parameters of chaotic dynamical systems.
- The likelihood can be computed by integrating out the uncertain model states using filtering methods.
- Each filter algorithm has built-in tuning parameters (model error covariance, linearization, ...). How much bias do they introduce?
- Filtering avoids the problem rather than solves it.

Metric based on fractal dimension concepts
Several concepts exist for defining the fractal dimension of a chaotic trajectory, such as the Hausdorff dimension or box-counting. We employ the Correlation Dimension (CD), as it is computationally simple. Recall first the correlation integral:

Denote by $s_i$, $i = 1, 2, \ldots, N$ the points of a trajectory vector $s \in R^n$, evaluated at time points $t_i$. For $R > 0$ set

$C(R, N) = \frac{1}{N^2} \sum_{i,j} \#(\|s_i - s_j\| < R)$

and define the correlation integral as the limit $C(R) = \lim_{N \to \infty} C(R, N)$. So we take the total number of point pairs closer than $R$, normalize by the number of pairs $N^2$, and take the limit. Note that for each $N$ we have $1/N \le C(R, N) \le 1$. If $\nu$ is the dimension of the trajectory, we should have $C(R) \sim R^\nu$, and the Correlation Dimension $\nu$ is defined as the limit $\nu = \lim_{R \to 0} \log C(R) / \log R$.

Numerical estimation of the Correlation Dimension
In numerical practice, we have a finite time interval $[0, T]$, and the trajectory vector $s_i$ is evaluated at a finite number of time instants $t_i$, $i = 1, 2, \ldots, N$. The above limit may be approximated by a log-log plot obtained by computing $\log C(R)$ at various values of $R$. A few constants have to be selected first:
- The number of points, $N$.
- The maximum radius $R_0$, large enough for each ball $B(s_i, R_0)$ to contain all the points $s_j$, $j \ne i$.
- The set of smaller radii $R_k = b^{-k} R_0$, $k = 1, 2, \ldots, M$. Select $M$ and the base $b$; typical defaults are $M = 10$, $b = 2$.
Then:
- For each $k$, compute $C(R_k, N)$.
- Create the log-log curve $\log R_k$ vs $\log C(R_k, N)$, $k = 1, 2, \ldots, M$.
- Estimate $\nu$ from the linear part of the slope.
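A sketch of the numerical recipe above, assuming the trajectory is given as an (N, n) array; the index range used for the linear fit (k_lin) is an assumption that would normally be chosen by inspecting the log-log curve.

```python
import numpy as np
from scipy.spatial.distance import pdist

def correlation_sums(s, M=10, b=2.0):
    """C(R_k, N) on the dyadic radius grid R_k = b**(-k) * R0, k = 1..M."""
    d = pdist(s)                      # all pairwise distances |s_i - s_j|
    R0 = d.max()                      # largest radius: every pair is counted
    radii = R0 * b ** (-np.arange(1, M + 1, dtype=float))
    N = len(s)
    # pdist gives each unordered pair once, hence the factor 2; the diagonal
    # (i = j) pairs contribute the term N and give the 1/N lower bound.
    C = np.array([(2.0 * np.sum(d < R) + N) / N ** 2 for R in radii])
    return radii, C

def correlation_dimension(radii, C, k_lin=slice(2, 8)):
    """Slope of log C vs log R over an assumed linear range of indices."""
    p = np.polyfit(np.log(radii[k_lin]), np.log(C[k_lin]), 1)
    return p[0]
```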

CD for distance between trajectories
The above CD is a standard method to calculate the dimension of a (chaotic) trajectory. Here we want to modify it to obtain a measure of the distance between two model trajectories, as given, e.g., by different model parameters. Due to chaoticity, even small differences in initial values or numerical solvers change the trajectories. We want to separate this variability with fixed model parameters from that due to different model parameters.

Distance via a generalized correlation sum
Fix again the numerical tuning factors $(T, N, R_0, b, M)$ to cover the range of the trajectory $s$. The generalized correlation sum between trajectories $s = s(\theta, x)$ and $\tilde s = s(\tilde\theta, \tilde x)$ is then defined as

$C(R, N, \theta, x, \tilde\theta, \tilde x) = \frac{1}{N^2} \sum_{i,j} \#(\|s_i - \tilde s_j\| < R), \qquad (1)$

where $\theta, \tilde\theta$ denote the respective model parameters and $x, \tilde x$ the initial values. For $\theta = \tilde\theta$, $x = \tilde x$ the formula reduces to the original definition of the correlation sum.
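The generalized correlation sum (1) is a direct extension of the ordinary one: count cross-pairs between the two trajectories instead of pairs within a single trajectory. A sketch, assuming both trajectories are sampled at N time points and stored as (N, n) arrays:

```python
import numpy as np
from scipy.spatial.distance import cdist

def generalized_correlation_sum(s, s_tilde, radii):
    """C(R, N, theta, x, theta~, x~) of eq. (1) for each radius R."""
    d = cdist(s, s_tilde)             # all cross distances |s_i - s~_j|
    N = len(s)
    return np.array([np.sum(d < R) / N ** 2 for R in radii])

# For s_tilde = s this reduces to the ordinary correlation sum above.
```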

Correlation curve variability with fixed model parameters
First, characterize the "within" variability of a chaotic dynamical system with a fixed model parameter vector:
1. Repeatedly simulate the trajectory, with varying initial values (and solver tolerances), but fixed model parameter $\theta$.
2. Compute the distance matrix between (all) different trajectory pairs, to get the values $C(R, N, \theta, x, \theta, \tilde x)$.
[Figure: an example of the resulting correlation curves for the Lorenz 63 system, with a log scale for R.]

Cost function for parameter estimation
We treat the above vectors $y = \{C(R_k, N, \theta, x, \theta, \tilde x)\}$, $k = 1, \ldots, M$, as measurements of the variability of a chaotic trajectory with a given fixed model parameter. Construct the respective likelihood (see the sketch below):
1. Empirically estimate the statistics of $y = C(R, N, \theta, x, \theta, \tilde x)$ from repeated simulations.
2. Create the empirical likelihood function.
3. For any trajectory $s(\theta)$ compute the distance matrix from the reference trajectory, and the respective $C(R_k, N, \theta, x, \tilde\theta, \tilde x)$. Evaluate the likelihood.
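A sketch of steps 1–3, reusing generalized_correlation_sum from the previous sketch. Here simulate and draw_initial_value are placeholders for the model-specific simulation and for sampling the initial values, and K = 50 repeats is only an illustrative choice.

```python
import numpy as np

def train_statistics(simulate, draw_initial_value, theta_ref, radii, K=50):
    """Steps 1-2: empirical mean and covariance of the correlation vectors."""
    trajs = [simulate(theta_ref, draw_initial_value()) for _ in range(K)]
    # distance vectors for all K(K-1)/2 trajectory pairs
    ys = [generalized_correlation_sum(trajs[i], trajs[j], radii)
          for i in range(K) for j in range(i + 1, K)]
    Y = np.array(ys)
    mu = Y.mean(axis=0)
    Sigma = np.cov(Y, rowvar=False)
    return mu, Sigma

def log_likelihood(y, mu, Sigma):
    """Step 3: Gaussian log-likelihood of a new correlation vector y."""
    r = y - mu
    return -0.5 * r @ np.linalg.solve(Sigma, r)
```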

Example: likelihood for 3D Lorenz

$\frac{dX}{dt} = \sigma(Y - X), \quad \frac{dY}{dt} = X(\rho - Z) - Y, \quad \frac{dZ}{dt} = XY - \beta Z. \qquad (2)$

[Figure: observation samples of the 3D Lorenz system over a time window.]
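For concreteness, a sketch of simulating system (2) with scipy; the standard parameter values σ = 10, ρ = 28, β = 8/3 and the integration settings are illustrative defaults, not values quoted in the talk.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz63_rhs(t, u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    X, Y, Z = u
    return [sigma * (Y - X), X * (rho - Z) - Y, X * Y - beta * Z]

def simulate_lorenz63(u0, t_end=100.0, dt=0.01):
    """Integrate the 3D Lorenz system and return the trajectory as (N, 3)."""
    t_eval = np.arange(0.0, t_end, dt)
    sol = solve_ivp(lorenz63_rhs, (0.0, t_end), u0, t_eval=t_eval,
                    rtol=1e-8, atol=1e-8)
    return sol.t, sol.y.T
```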

Example: likelihood for 3D Lorenz
The values $y_k = C(R_k, N, \theta, x, \theta, \tilde x)$, $k = 1, \ldots, M$, are averages of distances between state vectors. In analogy with the central limit theorem, test a Gaussian distribution for the vector $y$: calculate the mean value $\mu$ and covariance matrix $\Sigma$ of the training set, and compute the statistics of the expression $(\mu - y)^T \Sigma^{-1} (\mu - y)$, which should obey the $\chi^2$ distribution for a Gaussian $y$:

$(\mu - y)^T \Sigma^{-1} (\mu - y) \sim \chi^2_M \qquad (3)$
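A sketch of the normality check (3): compute the quadratic form for each correlation vector in a test set and compare against the χ²_M distribution, here via a Kolmogorov–Smirnov test (the choice of test is an assumption; the slides compare the empirical histogram with the χ² density, as in the figure below).

```python
import numpy as np
from scipy import stats

def mahalanobis_squared(Y, mu, Sigma):
    """(mu - y)^T Sigma^{-1} (mu - y) for each row y of Y."""
    R = Y - mu
    return np.einsum('ij,ij->i', R, np.linalg.solve(Sigma, R.T).T)

def chi2_check(Y, mu, Sigma):
    """Test the quadratic forms against chi^2 with M = len(mu) dof."""
    d2 = mahalanobis_squared(Y, mu, Sigma)
    M = len(mu)
    return stats.kstest(d2, stats.chi2(df=M).cdf)
```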

Example: Likelihood for 3D Lorenz.12 χ 2 distribution, dof: 1.35 χ 2 distribution, dof: 92.1.3.8.25.2.6.15.4.1.2.5 5 1 15 2 25 3 35 4 45 4 6 8 1 12 14 16 Figure: Normality check of the correlation integral vector by the χ 2 test for the Lorenz 63 system. Left: with 1 radius values used. Right: with 92 radius values

Inference as a pseudo-marginal MCMC algorithm
Due to chaoticity and the randomised $x$, the likelihood is non-deterministic. But sampling from it can be interpreted as sampling from the joint distribution of the initial values and model parameters. Denote the likelihood function of $y$, evaluated for an arbitrary $\theta$, by $T_\theta(\theta, x)$. The target distribution for $\theta$ is given as

$\pi(\theta) = \int T_\theta(\theta, x)\, \lambda(x)\, dx,$

where $\lambda(x)$ is the distribution of the initial values $x$. In our situation $T_\theta(\theta, x)$ is unknown, but an empirical approximation can be created as above. The method we implement is a bivariate Markov chain $(\theta_n, T_n)_{n \ge 0}$, where the $T_n$ are auxiliary variables that are non-negative, unbiased estimators of the underlying intractable target density $\pi(\theta_n)$. In other words, the method is a pseudo-marginal algorithm targeting $\pi$.

Pseudo-marginal MCMC
Start from a pair $(\theta_0, T_0)$ and iterate the following steps for $n \ge 0$ (see the sketch below):
1. Propose $\theta^* = \theta_n + Z$, where $Z$ is sampled from a Gaussian proposal distribution.
2. Propose $x^* \sim \lambda$ and calculate $T^* = T_\theta(\theta^*, x^*)$.
3. With probability $\min\{1, T^*/T_n\}$ accept and set $(\theta_{n+1}, T_{n+1}) = (\theta^*, T^*)$; otherwise reject and set $(\theta_{n+1}, T_{n+1}) = (\theta_n, T_n)$.
In our case $T^*$ is non-negative (Gaussian likelihood), and the conditional expectation of $T^*$ given $\theta^*$ is $\pi(\theta^*)$. Therefore the method provides correct simulation in the sense that the ergodic averages $n^{-1} \sum_{k=1}^{n} f(\theta_k)$ converge to $\pi(f)$ almost surely, given minimal irreducibility and aperiodicity assumptions.
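A sketch of the loop. Here estimate_T and sample_initial_value are placeholders for the correlation-integral likelihood estimate T_θ(θ, x) and for drawing x ~ λ, and the acceptance test is written multiplicatively to avoid dividing by a possibly tiny T_n.

```python
import numpy as np

def pseudo_marginal_mcmc(estimate_T, sample_initial_value, theta0,
                         proposal_cov, n_iter=10000, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    theta = np.asarray(theta0, dtype=float)
    chol = np.linalg.cholesky(proposal_cov)      # Gaussian random-walk proposal
    T = estimate_T(theta, sample_initial_value())
    chain = np.empty((n_iter, theta.size))
    for n in range(n_iter):
        # 1. propose theta* = theta_n + Z, Z ~ N(0, proposal_cov)
        theta_prop = theta + chol @ rng.standard_normal(theta.size)
        # 2. propose x* ~ lambda and evaluate the likelihood estimate
        T_prop = estimate_T(theta_prop, sample_initial_value())
        # 3. accept with probability min(1, T*/T_n);  u < T*/T_n  <=>  u*T_n < T*
        if rng.random() * T < T_prop:
            theta, T = theta_prop, T_prop
        chain[n] = theta
    return chain
```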

Example: 3D Lorenz
[Figure: marginal distributions of the model parameters (σ, ρ, β) obtained using MCMC simulations for the three-dimensional Lorenz system; they are very close to Gaussian.]

Example: L95
[Figure: Lorenz 95 MCMC results, no splitting; sampled values of (F, h).]
The sampled values of the parameters (F, h) of L95 (only these two, keeping the other model parameters fixed) when the correlation integral vector is computed from the whole system $s = (x_k, y_j)$, i.e., the slow and fast, weakly coupled subsystems together.

Example: L95
[Figure: Lorenz 95 MCMC results with splitting; sampled values of (F, h).]
The sampled values of the parameters (F, h) of L95 when the correlation integral vector calculations are split, computed separately for the slow and fast subsystems, and both vectors are used as data for the sampling cost function.

High dimension: shallow water model
Given as

$h_t + (hu)_x + (hv)_y = 0,$
$(hu)_t + (hu^2 + \tfrac{1}{2} g h^2)_x + (huv)_y = -g h B_x,$
$(hv)_t + (huv)_x + (hv^2 + \tfrac{1}{2} g h^2)_y = -g h B_y.$

Here $h$ denotes the water elevation, $u$ and $v$ are the horizontal and vertical velocity components, $B_x$ and $B_y$ denote the gradient of the bottom topography $B$, and $g$ is the acceleration of gravity. Additional phenomena (e.g. wind stresses, friction, etc.) can be accounted for by modifying the right-hand side of the equations.
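This is not the Kurganov–Petrova scheme used in the talk (described on the next slide), but a much simpler sketch that only illustrates the structure of a finite-volume update for the 1D shallow-water equations with a flat bottom (B_x = 0) and periodic boundaries, using a Lax–Friedrichs flux.

```python
import numpy as np

g = 9.81  # acceleration of gravity

def flux(q):
    """Flux f(q) for q = (h, hu) in the 1D shallow-water equations."""
    h, hu = q
    u = hu / h
    return np.array([hu, hu * u + 0.5 * g * h ** 2])

def lax_friedrichs_step(q, dx, dt):
    """One explicit time step on a periodic 1D grid; q has shape (2, nx)."""
    qL, qR = np.roll(q, 1, axis=1), np.roll(q, -1, axis=1)   # q_{i-1}, q_{i+1}
    fL, fR = flux(qL), flux(qR)
    return 0.5 * (qL + qR) - 0.5 * dt / dx * (fR - fL)
```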

Discretization by finite volumes
Numerics: the Kurganov-Petrova second-order well-balanced positivity-preserving central-upwind scheme. The problem is solved on a huge set of discretization cells that form a staggered grid. Each cell describes the solution by 5 components, where each component contains the related data for the velocity field and the water elevation.

Introduction to CUDA
Idea: move computations to the GPU side.
- A GPU computes information for a huge number of screen pixels almost independently: GPUs are highly parallel.
- Modern GPUs comprise thousands of universal computation cores.
- GPUs can be arranged into arrays forming a multi-GPU node.
- Simple application interfaces exist that allow general-purpose GPU programming (e.g. OpenCL, OpenGL Compute Shaders, CUDA, etc.).

CPU vs CUDA GPU implementation
- Both implementations run at a resolution of 256-by-256 grid cells.
- Time step is 0.6 units of model time.
- Time cost of a single model step, CPU implementation: 0.365 sec.
- Time cost of a single model step, GPU implementation: ,5 sec.

SWE: can you distinguish a change of flow pattern?
A stone in a river, with von Kármán vortex shedding. Dimension of the state (h, u, v) is around 2·10⁵. Two slightly different cases. Example:

SWE: can you distinguish a change of flow pattern?
Snapshots at 10 time points, repeating the simulations 5 times (producing a training set of 1225 pairs). Split the simulated vectors in two parts, using one half as the training set and one half as the test set.
[Figure: χ² normality test of the training vectors (top) and χ² similarity test for the new cases (bottom).]

Where is the real data?
No measured data is directly used for parameter estimation. Instead, we assume the basic model parameters are given and want to determine the posterior of parameters that would produce essentially the same chaotic dynamics.
A real example: reanalysis studies of weather and climate models (e.g., the ERA-40 data and ECHAM5), which combine past real data and model predictions to achieve the best understanding of the systems. The aim here: characterize the parameter distributions of the reanalyzed models that fit the climatology of long time runs of a given climate model. Further, use them to quantify the uncertainty of model predictions with respect to the given parameters, by parameter ensemble simulations under various scenarios, such as increased CO2 levels.

Summary, next
- The correlation distance is a promising way to characterize distances between chaotic trajectories. It is quite insensitive to varying initial values, solver numerics, etc., in small systems.
- Next: more applications to high-dimensional systems. No technical obstacles, in principle: only L2 norms between vectors are computed after the model simulation, and this can be done in parallel, during the simulation. From K simulations we get K(K−1)/2 trajectory pairs to create the empirical distributions: a moderate K is enough.
- Problems/modifications expected for multifractal situations and integration times: be careful with rare but large outliers in the training set.
- Distinguish, more generally, a difference from normal behaviour: use for various classification and pattern recognition problems?

References
Järvinen, H., Räisänen, P., Laine, M., Tamminen, J., Ilin, A., Oja, E., Solonen, A., and Haario, H.: Estimation of ECHAM5 climate model closure parameters with adaptive MCMC, Atmos. Chem. Phys., 10, 9993-10002, 2010.
Hakkarainen, J., Ilin, A., Solonen, A., Laine, M., Haario, H., Tamminen, J., Oja, E., and Järvinen, H.: On closure parameter estimation in chaotic systems, Nonlin. Processes Geophys., 19, 127-143, 2012.
Solonen, A., Ollinaho, P., Laine, M., Haario, H., Tamminen, J., and Järvinen, H.: Efficient MCMC for climate model parameter estimation: parallel adaptive chains and early rejection, Bayesian Analysis, 7(2), 1-22, 2012.
Hakkarainen, J., Solonen, A., Ilin, A., Susiluoto, J., Laine, M., Haario, H., and Järvinen, H.: A dilemma of the uniqueness of weather and climate model closure parameters, Tellus A, 65, 20147, 2013.
Ollinaho, P., Laine, M., Solonen, A., Haario, H., and Järvinen, H.: NWP model forecast skill optimization via closure parameter variations, Q. J. R. Meteorol. Soc., 139(675), 1520-1532, 2013.
Ollinaho, P., Bechtold, P., Leutbecher, M., Laine, M., Solonen, A., Haario, H., and Järvinen, H.: Parameter variations in prediction skill optimization at ECMWF, Nonlin. Processes Geophys., 20, 1001-1010, 2013.
Haario, H., Kalachev, L., and Hakkarainen, J.: Generalized correlation integral vectors: A new distance concept for chaotic dynamical systems, UM Tech. Rep. 8/2014 (http://cas.umt.edu/math/reports/).
Haario, H., Kalachev, L., and Hakkarainen, J.: Generalized correlation integral vectors: A distance concept for chaotic dynamical systems, Chaos: An Interdisciplinary Journal of Nonlinear Science, 25, 063102, 2015; doi: 10.1063/1.4921939.