Extreme value statistics: from one dimension to many. Lecture 1: one dimension Lecture 2: many dimensions

Similar documents
Modelling Multivariate Peaks-over-Thresholds using Generalized Pareto Distributions

Variations. ECE 6540, Lecture 02 Multivariate Random Variables & Linear Algebra

Overview of Extreme Value Theory. Dr. Sawsan Hilal space

On the occurrence times of componentwise maxima and bias in likelihood inference for multivariate max-stable distributions

Bayesian inference for multivariate extreme value distributions

The extremal elliptical model: Theoretical properties and statistical inference

PREPRINT 2005:38. Multivariate Generalized Pareto Distributions HOLGER ROOTZÉN NADER TAJVIDI

Statistics of Extremes

Multivariate generalized Pareto distributions

Lecture 3. STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher

Estimation of spatial max-stable models using threshold exceedances

Bivariate generalized Pareto distribution

Multivariate peaks over thresholds models

General Strong Polarization

Modelling extreme-value dependence in high dimensions using threshold exceedances

Bayesian Modelling of Extreme Rainfall Data

Classical Extreme Value Theory - An Introduction

A Conditional Approach to Modeling Multivariate Extremes

MFM Practitioner Module: Quantitiative Risk Management. John Dodson. October 14, 2015

Worksheets for GCSE Mathematics. Algebraic Expressions. Mr Black 's Maths Resources for Teachers GCSE 1-9. Algebra

Multivariate Pareto distributions: properties and examples

Extreme Value Analysis and Spatial Extremes

General Linear Models. with General Linear Hypothesis Tests and Likelihood Ratio Tests

Bayesian Model Averaging for Multivariate Extreme Values

Bayesian nonparametrics for multivariate extremes including censored data. EVT 2013, Vimeiro. Anne Sabourin. September 10, 2013

Financial Econometrics and Volatility Models Extreme Value Theory

Models and estimation.

Non-Parametric Weighted Tests for Change in Distribution Function

Some conditional extremes of a Markov chain

Simulation of Max Stable Processes

Estimate by the L 2 Norm of a Parameter Poisson Intensity Discontinuous

ESTIMATING BIVARIATE TAIL

ECE 6540, Lecture 06 Sufficient Statistics & Complete Statistics Variations

General Strong Polarization

Bayesian Inference for Clustered Extremes

Support Vector Machines. CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

APPLICATION OF EXTREMAL THEORY TO THE PRECIPITATION SERIES IN NORTHERN MORAVIA

Peaks over thresholds modelling with multivariate generalized Pareto distributions

Expectation Propagation performs smooth gradient descent GUILLAUME DEHAENE

Gradient expansion formalism for generic spin torques

Multiple Regression Analysis: Inference ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Module 7 (Lecture 27) RETAINING WALLS

Bayesian Point Process Modeling for Extreme Value Analysis, with an Application to Systemic Risk Assessment in Correlated Financial Markets

Work, Energy, and Power. Chapter 6 of Essential University Physics, Richard Wolfson, 3 rd Edition

MULTIVARIATE EXTREMES AND RISK

Lecture 11. Kernel Methods

Nonparametric Estimation of the Dependence Function for a Multivariate Extreme Value Distribution

Introduction to Algorithmic Trading Strategies Lecture 10

Lecture 6. Notes on Linear Algebra. Perceptron

Lecture 3. Linear Regression

On The Cauchy Problem For Some Parabolic Fractional Partial Differential Equations With Time Delays

On the Estimation and Application of Max-Stable Processes

Terms of Use. Copyright Embark on the Journey

Extreme Value Theory and Applications

Math 171 Spring 2017 Final Exam. Problem Worth

CS Lecture 8 & 9. Lagrange Multipliers & Varitional Bounds

Worksheets for GCSE Mathematics. Quadratics. mr-mathematics.com Maths Resources for Teachers. Algebra

A new procedure for sensitivity testing with two stress factors

Module 7 (Lecture 25) RETAINING WALLS

Models for Spatial Extremes. Dan Cooley Department of Statistics Colorado State University. Work supported in part by NSF-DMS

HIERARCHICAL MODELS IN EXTREME VALUE THEORY

Revision : Thermodynamics

Chapter 22 : Electric potential

Translator Design Lecture 16 Constructing SLR Parsing Tables

CDS 101/110: Lecture 6.2 Transfer Functions

Uncertain Compression & Graph Coloring. Madhu Sudan Harvard

PHY103A: Lecture # 9

Conditioned limit laws for inverted max-stable processes

Jasmin Smajic1, Christian Hafner2, Jürg Leuthold2, March 23, 2015

A brief summary of the chapters and sections that we cover in Math 141. Chapter 2 Matrices

Review for Exam Hyunse Yoon, Ph.D. Adjunct Assistant Professor Department of Mechanical Engineering, University of Iowa

Multivariate generalized Pareto distributions

New Classes of Multivariate Survival Functions

Assessing Dependence in Extreme Values

Random Vectors Part A

Haar Basis Wavelets and Morlet Wavelets

Chapter 5: Spectral Domain From: The Handbook of Spatial Statistics. Dr. Montserrat Fuentes and Dr. Brian Reich Prepared by: Amanda Bell

General Strong Polarization

Variable Selection in Predictive Regressions

Question 1 carries a weight of 25%; question 2 carries 25%; question 3 carries 20%; and question 4 carries 30%.

Fermi Surfaces and their Geometries

Statistics of Extremes

On the spatial dependence of extreme ocean storm seas

1. The graph of a function f is given above. Answer the question: a. Find the value(s) of x where f is not differentiable. Ans: x = 4, x = 3, x = 2,

General Strong Polarization

Exam Programme VWO Mathematics A

Spatial data analysis

Solving Fuzzy Nonlinear Equations by a General Iterative Method

Estimating Bivariate Tail: a copula based approach

" = Y(#,$) % R(r) = 1 4& % " = Y(#,$) % R(r) = Recitation Problems: Week 4. a. 5 B, b. 6. , Ne Mg + 15 P 2+ c. 23 V,

Comparing Different Estimators of Reliability Function for Proposed Probability Distribution

arxiv: v2 [stat.me] 25 Sep 2012

On the Estimation and Application of Max-Stable Processes

arxiv: v1 [stat.ap] 28 Nov 2014

Max-stable processes: Theory and Inference

Max-stable Processes for Threshold Exceedances in Spatial Extremes

Data Preprocessing. Jilles Vreeken IRDM 15/ Oct 2015

SECTION 7: FAULT ANALYSIS. ESE 470 Energy Distribution Systems

Summary. Forward modeling of the magnetic fields using rectangular cells. Introduction

Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, NC

Transcription:

Extreme value statistics: from one dimension to many Lecture 1: one dimension Lecture 2: many dimensions

The challenge for extreme value statistics right now: to go from 1 or 2 dimensions to 50 or more The challenge for computation: To maximize fifty-dimensional likelihoods with hundreds of parameters The challenge for probability: To construct parametric models which makes this possible and to understand the models (and then model validation, asymptotics, ) Extreme rainfall at many catchments, flooding of many dykes, storm insurance, landslide risk assessment, consistent risk estimation for financial portfolios, extreme influenza epidemics,

Notation Bold symbols are d-variate vectors. For instance, γγ = (γγ 1,..., γγ dd ) and 00 = (0,..., 0) RR dd. Operations and relations are componentwise, with shorter vectors recycled. For instance aaaa + bb = (aa 1 xx 1 + bb 1,..., aa dd xx dd + bb dd ), xx yy iiii xx jj yy jj for jj = 1,..., dd, and tt γγ = (tt γγ 1,, tt γγ dd)

dd-dimensional generalized extreme value (GEV) distributions No finite-dimensional parametrization Marginal distributions GEV: GG ii xx = ee 1+xx μμ ii σσ ii + Can be expressed through spectral representations. 1/γγ ii Can be expressed in terms of point processes, see later slides

Let ηη, dd be the vector of lower endpoints of the marginal distributions of GG The GEV distributions are the limit distributions of maxima: Let XX 1, XX 2, be an i.i.d. sequence of dd-dimensional vectors with cdf FF. If for some scaling and location sequences aa nn > 00 and bb nn PPPP aa 1 nn nn ii=1 XX ii bb n ηη xx dd GG xx, as nn where GG has non-degenerate margins, then GG is a GEV distribution. Conversely all GEV distributions can be obtained in this way.

The GEV distributions are the max-stable distributions: Let XX 1, XX 2, be an i.i.d. sequence with nondegenerate marginal cdf-s GG ii. If for some scaling and location sequences αα nn > 0 and ββ nn 1 nn Pr αα nn XX ii ββ nn xx = GG xx, for nn 1, 2, ii=1 then GG is a GEV distribution. Conversely all GEV distributions can be obtained in this way.

Think of XX 1, XX 2,, XX n as points in RR dd. Subtract aa nn and divide by bb nn to get a point process with points XX ii aa nn bb nn ηη; ii = 1, nn. It converges as nn iff XX DD(GG), to a limit point process xx 2 xx 1 {WW ii RR ii γγ } where (in sequel think of γγ > 00) WW ii are i.i.d. copies of some ( any ) positive stochastic dd-dimensional vector and the RR ii are points of a unit rate Poisson process on RR + WW 1 /RR 1 γγ WW 2 /RR 2 γγ WW 3 /RR 3 γγ 1 2 3

All d-dimensional GEV distributions may be obtained as γγ G xx = PPPP(max{WW ii RR ii ii } xx) Often the vector WW = WW (dd) is thought of as obtained by sampling from a process WW(ss) with ss RR dd (as in figure on previous slide) Often G is defined on a standardized scale, say γγ = 11, and model then is transformed to the real scale Smith process: WW is a non-random Gaussian density function Brown-Resnick process: WW(ss) = exp{xx ss σσ 2 /2}, with XX a Gaussian process with variance σσ 2 Schlater model

The block maxima method As in one dimension: Get observations XX 1,, XX nn of block maxima; assume the observations are i.i.d and follow a GEV distribution; use XX 1,, XX nn to estimate the parameters of the GEV distribution Likelihood estimation: Distribution functions often are possible to calculate, but differentiation with respect to the dd component variables often lead to combinatorial explosion of number of terms (because often GG xx = exp{ l(xx)}) Composite likelihood: Use sum of likelihoods of pairs, or triplets, or Stephenson-Tawn likelihoods: in between block maxima and PoT A. C. Davison, S. A. Padoan and M. Ribatet Statistical Modeling of Spatial Extremes, Statistical Science 2012 R. Huser, A. C. Davison, M. G. Genton Likelihood estimators for multivariate extremes, Extremes 2016

Copula modelling (i.e. modeling after transforming marginal distributions to uniformity) is often used Popular copulas: logistic = Gumbel; asymmetric logistic; inverted logistic; Gaussian; Husler-Reiss; bilogistic; Dirichlet; distributions with same copula, but different margins: Uniform; Fréchet; Gumbel; and Laplace

pointwise 25-year return levels for rainfall (mm) obtained from latent variable and max-stable models. Top and bottom rows: lower and upper bounds of 95% pointwise credible/confidence intervals. Middle row: predictive pointwise posterior mean and pointwise estimates. Left column: latent variable model. Middle column: Another latent variable model. Right column: Extremal t copula model. Copied from Davison, Padoan & Ribatet paper

Multivariate peaks over thresholds modelling and likelihood inference Or: News from the exciting exploration of GP-territory Joint with Anna Kiriliouk, Johan Segers, Maud Thomas, Jennifer Wadsworth

The philosophy is simple Extreme episodes are often quite different from ordinary everyday behavior, and ordinary behavior then has little to say about extremes, so that only other extreme events give useful information about future extreme events Operationally: choose thresholds uu 1,, uu dd, say that an extreme episode occurs if at least one of the components of the observation YY = (YY 1,, Y dd ) exceeds its threshold, only model times of occurence and undershoots and overshoots, XX = YY uu, of extreme episodes Theory: times of occurrence follows a Poisson process, XX is a generalized Pareto (GP) distribution

Generalized Pareto(GP) distributions HH xx = 1 log GG(00) log GG(xx) GG(xx 00), where GG is a GEV cdf with GG 00 > 0

The GP distributions are the limit distributions threshold excesses: Let XX~ FF. If there exist continuous threshold and scaling functions uu tt and ss tt > 0, with FF uu tt < 1 and FF uu tt 1 as tt, such that PPPP ss tt 1 XX uu tt xx XX uu tt dd HH xx, as nn, where HH has non-degenerate margins, then HH is a GP distribution. Conversely all GP distributions can be obtained in this way. maxima of FF are in the domain of attraction of a GEV cdf if and only if threshold excesses of FF are in the threshold domain of attraction of a GP cdf HHH

The GP distributions are the threshold-stable distribution: Let XX~HH, with HH nondegenerate. If there exist continuous threshold and scaling functions uu tt, ss tt 00 and with FF uu tt < 1 and FF uu tt 1 as tt, such that PPPP ss tt 1 XX uu tt xx XX > uu tt = HH xx, for all tt 1, then HH is a GP distribution. Conversely all GP distributions has this property.

Closed under Properties of the class of GP distributions Scaling Increase of level Taking limits Mixing Going to conditional margins Not closed under Location changes Going to (unconditional) margins Conditional 1-d marginals GP: HH jj + xx = 1 1 + γγ jj σσ jj xx (= 1 ee xx/σσ jj if γγ jj = 0) 1/γγ jj

Multivariate PoT modelling Should ideally contain three components: 1) A model for the behavior of extreme episodes in the physical world 2) Understanding of how conditioning on threshold exceedance changes the distribution obtained in 1) 3) The possibility to incorporate trends in the models ( likelihood inference) Inference should be for the original observations, and not after (approximate) transformation of margins of observations to some standard form, say standard Pareto or standard GP

Think of XX 1, XX 2,, XX n as points in RR dd. Subtract aa nn and divide by bb nn to get a point process with points XX ii aa nn bb nn ηη; ii = 1, nn. It converges as nn iff XX DD(GG), to a limit point process xx 2 xx 1 {WW ii RR ii γγ } where (in sequel think of γγ > 00) WW ii are i.i.d. copies of some ( any ) positive stochastic dd-dimensional vector and the RR ii are points of a unit rate Poisson process on RR + WW 1 /RR 1 γγ WW 2 /RR 2 γγ WW 3 /RR 3 γγ 1 2 3

A typical point (positive shape) RR TT γγ RR an arbitrary dd-dimensional random vector, subject to EE RR jj <, and RR jj 0 if γγ jj > 0 TT improper random variable, has density = 1 with respect to Lebesgue measure on RR + RR and TT are mutually independent If γγ jj > 0 then TT γγ jj has density xx 1 γγ jj 1 with respect to Lebesgue measure on RR + If γγ jj = 0 then RR jj TT γγ jj is defined to mean RR jj log TT and log TT has density ee xx with respect to Lebesgue measure on RR

Conditioning on threshold exceedance FF d.f. of RR in typical point RR TT γγ. XX uu = XX uu, PPPP XX uu xx XX uu 00 = PPrr XX uu xx PPrr XX uu xx 00 PPrr XX uu 00 Set XX = WW RR γγ and uu = σσ γγ and condition (formally) on TT uu 2 xx 2 1 not here 1 uu 1 xx 1 (R) HH xx = 0 FF tt γγ xx+ σσ ddtt FF tt γγ xx 00+ σσ γγ γγ 0 FF(tt γγσσ γγ )ddtt is a GP and all GP-s can be obtained this way dddd

(R) HH rr xx = Three representations 0 FFRR tt γγ xx+ σσ ddtt FF γγ RR tt γγ xx 00+ σσ dddd γγ 0 FF RR (tt γγσσ γγ )ddtt is a GP and all GP-s can be obtained this way (R ss ) HH ss xx = 0 FFSS 1 γγ log xx+σσ γγ +log tt ddtt FF SS( 1 γγ log xx 00+σσ γγ 0 FF SS (log tt)ddtt is a GP and all GP-s can be obtained this way +log tt) dddd (R ss ) HH SS xx = 0 1 FF SS = FFSS 0 1 γγ log xx + σσ γγ + log tt dddd 1 γγ log xx + σσ γγ tt ee tt dddd for SS = SS max SS jj and all GP-s can be obtained this way Ferreira de Haan (2014) representation

Densities Explicit formulas for densities for all three representations Formula for (R ss ) density simplest!! Censored likelihood contributions computable

Estimation Likelihood for (R): NN dd 0 tt jj=1 γγ jj ff tt γγ xx ii + γγ σσ ii=1 0 FF(tt γγγγ σσ )ddtt dddd One-dimensional integrals, so relatively easy to compute if ff, FF are easy to compute. In some cases integrals can be computed analytically Censoring makes computation harder Parenthesis: Coefficient of asymptotic dependence: Pr(FF 1 (XX 1 ) > qq,, FF dd (XX dd ) > qq) χχ dd : = lim. qq 1 1 qq GEV and GP modelling aimed at asymptotic dependence, i.e. χχ > 0 for some subset of components. Substantial literature (Heffernan, Tawn, Resnick, Wadsworth, ) on modelling under asymptotic independence, when χχ = 0

Parametrization and identifiability Understanding is on the (R) scale Choice between (R), (R ss ), and (R ss ) a choice of parametrization σσ and γγ parameters of conditional margins Replacing RR by ZZ γγ RR doesn t change HH rr. Similar properties of HH ss and HH ss (R ss ) and (R ss ) continuous at γγ ii = 0. Not so for (R) Much left to explore

Probabilities and conditional probabilities Probabilities of general events given by same formula as HH rr Conditional densities same type of formulas Conditional probabilities same type of formulas (prediction) if γγ 1 = = γγ dd =: γγ then weighted sums of components, conditioned to be positive, are also GP

If γγ 1 = = γγ dd γγ then weighted sums of components, conditioned to be positive, are also GP RR Pf. If σσ = TT γγ γγ (RR 1 σσ 1, RR 2 σσ 2 ) then the sum of the TT γγ γγ TT γγ γγ components equals RR 1+RR 2 σσ 1+σσ 2 =: RR σσ, and by the TT γγ γγ TT γγ γγ onedimensional (R) representation, the conditional distribution of RR σσ given that it is positive is a one-dimensional GP TT γγ γγ distribution with parameters σσ and γγ

Simulation 1. Direct for (R ss ) 2. Change of measure + rejection sampling for (R) and (R ss ) 3. Change of measure + MCMC for (R) and (R ss ) 4. Approximate method for (R) and (R ss )

Before Gudrun After Gudrun SEK 2,8 billion loss (LF) Windstorm Gudrun, January 2005 55% of total loss 1982-2005 Remeber: analysis based on data 1982-1993: 1% chance biggest damage next 15-year period will be more than SEK 2,5 billion??? Bivariate logistic PoT analysis basered on data 1982-2005: 10% chance biggest damage next 15-year period will be more than SEK 7 billion (this time we didn t miss damage to forrest but did we miss something else?) Brodin & Rootzén, Insurance 2009

Landslides Landslide caused by one day with very extreme rainfall, or by two consequtive days with extreme rainfall. Simplest model: XX 1, XX 2, independent positive variables, γγ 1 = γγ 2 = γγ RR TT γγ = XX 1 XX 2 TT γγ, XX 1 + XX 2 TT γγ = XX 1 XX 2 TT γγ, XX 1 XX 2 + XX 1 XX 2 TT γγ = rainiest day, rainest day + rainest adjacent day

Slide: Anna Kiriliouk

Slides: Jennifer Wadsworth

Modelling strategy

Prediction of influenza epidemics in France yy 1 =#cases week 3,, yy 8 =#cases week 11 γγ = 00 (from data), RR ii NN(mm ii, ss ii 2 ), mm 1 = 0 Model: conditional distribution of RR σσ log TT given that RR 1 > σσ log TT Total #cases GP with γγ = 0, σσ = σσ 1 + σσ 8 LR-test of parabolic form for the mm ii, equality of the ss ii 2 #cases per week normal epidemics extreme epidemics

Challenges Model construction Parametrization Time series Prediction Asymptotics H. Rootzén, J. Segers, and J.L. Wadsworth Multivariate peaks over thresholds models, ArXive 2016