Bayesian covariate models in extreme value analysis

Similar documents
Efficient adaptive covariate modelling for extremes

MULTIDIMENSIONAL COVARIATE EFFECTS IN SPATIAL AND JOINT EXTREMES

On covariate smoothing for non-stationary extremes

Multidimensional covariate effects in spatial and joint extremes

Bayesian inference for non-stationary marginal extremes

Statistics of extreme ocean environments: Non-stationary inference for directionality and other covariate effects

Consistent Design Criteria for South China Sea with a Large-Scale Extreme Value Model

Proceedings of the ASME nd International Conference on Ocean, Offshore and Arctic Engineering OMAE2013 June 9-14, 2013, 2013, Nantes, France

Distributions of return values for ocean wave characteristics in the South China Sea using directional-seasonal extreme value analysis

Monte Carlo in Bayesian Statistics

Accommodating measurement scale uncertainty in extreme value analysis of. northern North Sea storm severity

Fast Computation of Large Scale Marginal Extremes with Multi-Dimensional Covariates

Threshold estimation in marginal modelling of spatially-dependent non-stationary extremes

Riemann Manifold Methods in Bayesian Statistics

Estimation of storm peak and intra-storm directional-seasonal design conditions in the North Sea

GLAM An Introduction to Array Methods in Statistics

Statistical modelling of extreme ocean environments for marine design : a review

Environmental data science

Uncertainty quantification in estimation of extreme environments

Modelling Operational Risk Using Bayesian Inference

On the spatial dependence of extreme ocean storm seas

Manifold Monte Carlo Methods

Modelling trends in the ocean wave climate for dimensioning of ships

Modelling geoadditive survival data

Flexible Spatio-temporal smoothing with array methods

P -spline ANOVA-type interaction models for spatio-temporal smoothing

Generalized additive modelling of hydrological sample extremes

Analysing geoadditive regression data: a mixed model approach

Bayesian Estimation of Input Output Tables for Russia

Kevin Ewans Shell International Exploration and Production

Approximate Bayesian computation for spatial extremes via open-faced sandwich adjustment

Log Gaussian Cox Processes. Chi Group Meeting February 23, 2016

CPSC 540: Machine Learning

Multidimensional Density Smoothing with P-splines

Bayesian Methods for Machine Learning

Modelling spatially-dependent non-stationary extremes with application to hurricane-induced wave heights

Gaussian processes for spatial modelling in environmental health: parameterizing for flexibility vs. computational efficiency

Bayesian Modelling of Extreme Rainfall Data

Bayesian Inference for Clustered Extremes

Notes on pseudo-marginal methods, variational Bayes and ABC

Multi-Dimensional Predictive Analytics for Risk Estimation of Extreme Events

Markov Chain Monte Carlo, Numerical Integration

Bayesian density estimation from grouped continuous data

Spatial Statistics with Image Analysis. Outline. A Statistical Approach. Johan Lindström 1. Lund October 6, 2016

Control Variates for Markov Chain Monte Carlo

Sub-kilometer-scale space-time stochastic rainfall simulation

Stat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC

Currie, Iain Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh EH14 4AS, UK

Space-time modelling of air pollution with array methods

Introduction to Machine Learning

Nonparametric Drift Estimation for Stochastic Differential Equations

Estimating surge in extreme North Sea storms

Bayesian model selection in graphs by using BDgraph package

19 : Slice Sampling and HMC

R-INLA. Sam Clifford Su-Yun Kang Jeff Hsieh. 30 August Bayesian Research and Analysis Group 1 / 14

Markov Chain Monte Carlo methods

Afternoon Meeting on Bayesian Computation 2018 University of Reading

Beyond Mean Regression

Computational statistics

Bayesian Point Process Modeling for Extreme Value Analysis, with an Application to Systemic Risk Assessment in Correlated Financial Markets

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model

Wrapped Gaussian processes: a short review and some new results

Paul Karapanagiotidis ECO4060

ICES REPORT March Tan Bui-Thanh And Mark Andrew Girolami

Bayesian Inference and MCMC

On Markov chain Monte Carlo methods for tall data

Large Scale Bayesian Inference

Default Priors and Effcient Posterior Computation in Bayesian

Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo

Creating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach. Radford M. Neal, 28 February 2005

List of projects. FMS020F NAMS002 Statistical inference for partially observed stochastic processes, 2016

Fitting the Bartlett-Lewis rainfall model using Approximate Bayesian Computation

Contents. Part I: Fundamentals of Bayesian Inference 1

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements

Nonparametric Bayesian Methods - Lecture I

17 : Markov Chain Monte Carlo

EVA Tutorial #2 PEAKS OVER THRESHOLD APPROACH. Rick Katz

IT S TIME FOR AN UPDATE EXTREME WAVES AND DIRECTIONAL DISTRIBUTIONS ALONG THE NEW SOUTH WALES COASTLINE

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J.

Bayesian inference for multivariate extreme value distributions

Der SPP 1167-PQP und die stochastische Wettervorhersage

Statistics for extreme & sparse data

A short introduction to INLA and R-INLA

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

A spatio-temporal model for extreme precipitation simulated by a climate model

Riemann manifold Langevin and Hamiltonian Monte Carlo methods

Session 5B: A worked example EGARCH model

Forward Problems and their Inverse Solutions

Tutorial on Probabilistic Programming with PyMC3

Lecture 2 APPLICATION OF EXREME VALUE THEORY TO CLIMATE CHANGE. Rick Katz

A Review of Pseudo-Marginal Markov Chain Monte Carlo

SMSTC: Probability and Statistics

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

CTDL-Positive Stable Frailty Model

Bayesian Quadrature: Model-based Approximate Integration. David Duvenaud University of Cambridge

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method

A STATISTICAL TECHNIQUE FOR MODELLING NON-STATIONARY SPATIAL PROCESSES

Deposited on: 07 September 2010

Transcription:

Bayesian covariate models in extreme value analysis David Randell, Philip Jonathan, Kathryn Turnbull, Mathew Jones EVA 2015 Ann Arbor Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 1

Acknowledgement Kevin Ewans, Graham Feld and other Shell colleagues. Colleagues in academia, especially at Lancaster University. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 3

Hurricane Katrina Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 4

Hurricane Katrina Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 5

Motivation: extremes in met-ocean Rational and consistent design an assessment of marine structures Reduce bias and uncertainty in estimation of return values. Non-stationary marginal and conditional extremes Multiple locations, multiple variables, time-series, Multidimensional covariates. Improved understanding and communication of risk Incorporation within well-established engineering design practices, Knock-on effects of improved inference, New and existing structures. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 6

Marginal directional-seasonal extremes Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 7

Marginal directional-seasonal extremes Marginal model: single location. Response: storm peak significant wave height, H sp S. Wave climate: monsoonal. Southwest monsoon ( August, to northwest). Northeast monsoon ( January, to east-northeast). Long fetches to Makassar Strait, Java Sea. Land shadows of Borneo (northwest), Sulawesi (northeast), Java (south). www.westernpacificweather.com Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 8

Directional and seasonal variability 3.5 3 2.5 H S 2 1.5 1 0.5 0 90 180 270 360 Direction 3.5 3 2.5 H S 2 1.5 1 0.5 J F M A M J J A S O N D Season Figure: Hindcast storm peak significant wave height H sp for 1956 2012 (black) on direction θ (upper panel, to S which waves propogate) and season φ (lower panel). Also shown is sea-state significant wave height H S (grey) on direction θ (upper panel) and season φ (lower panel). Northeast monsoon: August to northwest (315). Southwest monsoon: January to east-northeast (110). Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 9

Storm model Figure: H S 4 standard deviation of ocean surface profile at a location corresponding to a specified period (typically three hours) Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 10

Model components Linear wave theory suggests ocean waves Rayleigh distributed, Longuet-Higgins [1952]. Response y not exceeding the extreme value threshold ψ follows a truncated Weibull distribution with density similar to Frigessi et al. [2002] { τ f TW (y α, γ) for y ψ f (y ξ, σ, α, γ, ψ, τ) = (1 τ) f GP (y ξ, σ) for y > ψ ψ is defined by τ, α and γ ψ τ, α, γ = α ( log(1 τ)) γ. No imposition of continuity in the density. Rate of occurrence ρ r is modelled as a Poisson distribution as in Chavez-Demoulin and Davison [2005], where r is observation counts in covariate bins. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 11

Bayesian P-Splines Physical considerations suggest model parameters α, γ, ρ, ξ and σ vary smoothly with covariates θ, φ Values of (η =)α, γ, ρ, ξ and σ all take the form η = Bβ η Priors: β η λexp ( 12 ) β ηd Dβ η λ Gamma(Λ η ) τ Beta(Λ τ ) B is a spline basis and D is a difference matrix of order k. Smoothnesses λ can be estimated very easily using Gibbs sampling see Brezger and Lang [2006]. Λ η are smoothness hyper-parameters giving diffuse prior. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 12

DAG for Poisson Rate of Occurrence Λ ρ λ ρ β ρ r Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 13

DAG for Weibull GP Λ σ Λ ξ λ σ Λ α λ ξ λ α β σ β ξ β α Λ τ τ y β γ λ γ Λ γ Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 14

Penalised B-splines Wrapped bases for periodic covariates (direction, season). Multidimensional bases easily constructed using tensor products, Eilers and Marx [2010]. GLAMs, Currie et al. [2006] for efficient computation in high dimensions. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 15

Sampling for P-Splines Cannot write full conditionals for generalised Pareto likelihood, so no Gibbs sampling; simple Metropolis Hastings methods don t mix well. Neighbouring spline parameters are highly correlated due to smoothness prior. Correlated spline proposals made from β N(β, G 1 ) where G = B B + λd D. Gradient based MCMC methods also help to improve mixing. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 16

MCMC proposals exploiting gradient Simple Metropolis Hastings sampling is a stochastic analogue of penalised likelihood optimisation, but does not exploit gradient information. MALA and mmala use gradient information for MCMC proposal generation. These are stochastic analogues of back-fitting and IRLS, see Roberts and Stramer [2002]. They provide random walks downhill, particularly important with high numbers of correlated parameters. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 17

Season Season Parameter plots J F M A Weibull Shape. 1.2 1.1 GP Shape 9-0.05-0.1 Poisson Rate ; #10-3 5 4 M J 1-0.15 3 J A 0.9-0.2 2 S O N D 0.8 0.7-0.25-0.3 1 90 180 270 360 Direction 90 180 270 360 90 180 270 360 J F M A M J Weibull Scale, 0.45 0.4 0.35 0.3 GP Scale < 0.6 0.5 0.4 14 12 10 8 NEP: = Posterior Prior J A S O N D 90 180 270 360 Direction 0.25 0.2 0.15 0.1 90 180 270 360 0.3 0.2 0.1 6 4 2 0 0 0.2 0.4 0.6 0.8 1 Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 18

Density Smoothness 0.2 Weibull Shape 6. 0.1 GP Shape 6 9 10 Poisson Rate 6 ; 0.15 0.08 8 0.06 6 0.1 0.04 4 0.05 0.02 2 0 10-2 10 0 10 2 0 10-2 10 0 10 2 Smoothness 0 10-2 10 0 10 2 0.06 0.05 Weibull Scale 6, 0.1 0.08 GP Scale 6 < Direction Season 0.04 0.06 0.03 0.02 0.04 0.01 0.02 0 10-2 10 0 10 2 0 10-2 10 0 10 2 Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 19

Return values Estimated by simulation from sample of posterior. H S100 is the maximum value of H sp S in a simulation period of 100 years. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 20

Directional 100yr return values Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 21

Directional seasonal 100yr return values Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 22

Directional Seasonal 100Yr Return Values Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 23

Summary Modelling non-stationarity essential for understanding extremes and development of design conditions. Non-parametric covariate models flexible, but need to estimate roughness. P-splines simple to implement and extend to periodic and higher dimensional domains. Bayesian P-spline for extremes Roughness estimated using Gibbs sampling, Different roughness for each covariate dimension, Correlated MCMC proposals exploiting gradient, Computationally efficient, and generally more stable than optimisation to point solution. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 24

References Andreas Brezger and Stefan Lang. Generalized structured additive regression based on Bayesian P-splines. Computational Statistics and Data Analysis, 50(October 2004):967 991, 2006. V. Chavez-Demoulin and A.C. Davison. Generalized additive modelling of sample extremes. J. Roy. Statist. Soc. Series C: Applied Statistics, 54:207, 2005. I. D. Currie, M. Durban, and P. H. C. Eilers. Generalized linear array models with applications to multidimensional smoothing. J. Roy. Statist. Soc. B, 68:259 280, 2006. P H C Eilers and B D Marx. Splines, knots and penalties. Wiley Interscience Reviews: Computational Statistics, 2: 637 653, 2010. Arnoldo Frigessi, Ola Haug, and Hå vard Rue. A Dynamic Mixture Model for Unsupervised Tail Estimation without Threshold Selection. Extremes, 5:219 235, 2002. ISSN 1572-915X. doi: 10.1023/A:1024072610684. URL http://link.springer.com.libproxy1.nus.edu.sg/article/10.1023/a%3a1024072610684. M. S. Longuet-Higgins. On the statistical distribution of the height of sea waves. J. Mar. Res., 11:245 266, 1952. G. O. Roberts and O. Stramer. Langevin Diffusions and Metropolis-Hastings Algorithms. pages 337 357, 2002. ISSN 13875841. doi: 10.1023/A:1023562417138. URL http: //www.springerlink.com/content/nv57r5gq18088257/?p=6129beac0358495db95e120f3372c576&pi=1. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 25

Parameters τ Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 28

Parameters ψ Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 29

Parameters ρ Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 30

Parameters α Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 31

Parameters γ Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 32

Parameters σ Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 33

Parameters ξ Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 34

Directional Seasonal 100Yr Return Values Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 35

Gradient Based MCMC HMC: Hamiltonian Monte Carlo: uses first derivatives of parameters have momentum based on gradient. This approach can be unstable so several leapfrog steps are taken instead of single step. Riemann manifold HMC: uses second derivatives of parameters. Here 2 leapfrog steps are needs so this is computationally challenging MALA Metropolis adjusted Langevin algorithm: uses first derivatives steps. Proposal calulated α by sampling from a Normal distribution N(µ, Σ) where µ = α ɛ 2 α (L + L prior) Σ = ɛi (1) and then implement standard MH based on this proposal. Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 36

mmala Given a current state α a proposal α is sampled from N(µ(α), Σ), where µ(α) = α ɛ 2 G 1 (α) α (L + L prior) Σ = ɛg 1 (α) (2) and then MH is carried through as before. As in MALA we again do not have symmetric proposals and so we must calculate the full acceptance probability. it is also interesting to notice the similarities between IWLS and mmala. To see this compare G(α ξ ) 1 = (B 2 L ξ 2 B + λ ξp) 1 (3) ˆα t+1 = (B Ŵ t B + λd D) 1 B Ŵ t ẑ t (4) Copyright 2015 Shell Global Solutions (UK) EVA 2015 Ann Arbor June 2015 37