HIERARCHICAL MODELS IN EXTREME VALUE THEORY


Richard L. Smith
Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, and Statistical and Applied Mathematical Sciences Institute

Department of Statistics, North Carolina State University
February 19, 2015

OUTLINE OF TALK

I. An Example of Hierarchical Models Applied to Insurance Extremes
II. Attribution of Climate Extremes
III. The Attribution Problem for Joint Distributions of Climate Extremes (introduction)

I. AN EXAMPLE OF HIERARCHICAL MODELS APPLIED TO INSURANCE EXTREMES

From the book chapter Bayesian Risk Analysis by R.L. Smith and D.J. Goodman (2000)
http://www.stat.unc.edu/postscript/rs/pred/inex1.pdf

See also: R.L. Smith (2003), Statistics of Extremes, With Applications in Environment, Insurance and Finance. In Extreme Values in Finance, Telecommunications and the Environment, edited by B. Finkenstädt and H. Rootzén, Chapman and Hall/CRC Press, London, pp. 1-78.
http://www.stat.unc.edu/postscript/rs/semstatrls.pdf

The data consist of all insurance claims experienced by a large international oil company over a threshold of 0.5 during a 15-year period, a total of 393 claims.

Seven types:

Type   Description   Number   Mean
1      Fire          175      11.1
2      Liability     17       12.2
3      Offshore      40       9.4
4      Cargo         30       3.9
5      Hull          85       2.6
6      Onshore       44       2.7
7      Aviation      2        1.6

Total of all 393 claims: 2989.6
10 largest claims: 776.2, 268.0, 142.0, 131.0, 95.8, 56.8, 46.2, 45.2, 40.4, 30.7.

[Figure: Some plots of the insurance data. (a) Claim size against years from start; (b) total number of claims against years from start; (c) total size of claims against years from start; (d) mean excess over threshold against threshold.]

Some problems:

1. What is the distribution of very large claims?
2. Is there any evidence of a change of the distribution over time?
3. What is the influence of the different types of claim?
4. How should one characterize the risk to the company? More precisely, what probability distribution can one put on the amount of money that the company will have to pay out in settlement of large insurance claims over a future time period of, say, three years?

Introduction to Univariate Extreme Value Theory

EXTREME VALUE DISTRIBUTIONS

$X_1, X_2, \ldots$ i.i.d., $F(x) = \Pr\{X_i \le x\}$, $M_n = \max(X_1, \ldots, X_n)$,
\[
\Pr\{M_n \le x\} = F(x)^n.
\]
For non-trivial results we must renormalize: find $a_n > 0$, $b_n$ such that
\[
\Pr\left\{ \frac{M_n - b_n}{a_n} \le x \right\} = F(a_n x + b_n)^n \to H(x).
\]
The Three Types Theorem (Fisher-Tippett, Gnedenko) asserts that if a nondegenerate $H$ exists, it must be one of three types:
\[
H(x) = \exp(-e^{-x}), \quad \text{all } x \quad \text{(Gumbel)}
\]
\[
H(x) = \begin{cases} 0, & x < 0 \\ \exp(-x^{-\alpha}), & x > 0 \end{cases} \quad \text{(Fréchet)}
\]
\[
H(x) = \begin{cases} \exp(-|x|^{\alpha}), & x < 0 \\ 1, & x > 0 \end{cases} \quad \text{(Weibull)}
\]
In Fréchet and Weibull, $\alpha > 0$.

The three types may be combined into a single generalized extreme value (GEV) distribution:
\[
H(x) = \exp\left\{ -\left(1 + \xi \frac{x - \mu}{\psi}\right)_+^{-1/\xi} \right\}, \qquad (y_+ = \max(y, 0))
\]
where $\mu$ is a location parameter, $\psi > 0$ is a scale parameter and $\xi$ is a shape parameter. $\xi \to 0$ corresponds to the Gumbel distribution, $\xi > 0$ to the Fréchet distribution with $\alpha = 1/\xi$, $\xi < 0$ to the Weibull distribution with $\alpha = -1/\xi$.

$\xi > 0$: long-tailed case, $1 - F(x) \propto x^{-1/\xi}$,
$\xi = 0$: exponential tail,
$\xi < 0$: short-tailed case, finite endpoint at $\mu - \psi/\xi$ (the point where $1 + \xi(x - \mu)/\psi = 0$).

EXCEEDANCES OVER THRESHOLDS

Consider the distribution of $X$ conditionally on exceeding some high threshold $u$:
\[
F_u(y) = \frac{F(u + y) - F(u)}{1 - F(u)}.
\]
As $u \to \omega_F = \sup\{x : F(x) < 1\}$, we often find a limit
\[
F_u(y) \approx G(y; \sigma_u, \xi)
\]
where $G$ is the generalized Pareto distribution (GPD)
\[
G(y; \sigma, \xi) = 1 - \left(1 + \xi \frac{y}{\sigma}\right)_+^{-1/\xi}.
\]
Equivalence to the three types theorem was established by Pickands (1975).

The Generalized Pareto Distribution
\[
G(y; \sigma, \xi) = 1 - \left(1 + \xi \frac{y}{\sigma}\right)_+^{-1/\xi}.
\]
$\xi > 0$: long-tailed (equivalent to the usual Pareto distribution), tail like $x^{-1/\xi}$,
$\xi = 0$: take the limit as $\xi \to 0$ to get
\[
G(y; \sigma, 0) = 1 - \exp\left(-\frac{y}{\sigma}\right),
\]
i.e. the exponential distribution with mean $\sigma$,
$\xi < 0$: finite upper endpoint at $-\sigma/\xi$.
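As a minimal sketch of the GPD formulas above, the following Python functions evaluate the distribution function, invert it, and draw samples by inversion. The function names and the example parameter values are illustrative assumptions, not part of the talk; ξ = 0 is handled as the exponential limit.

```python
import math
import random

def gpd_cdf(y, sigma, xi):
    """G(y; sigma, xi) = 1 - (1 + xi*y/sigma)_+^(-1/xi) for y >= 0."""
    if y <= 0:
        return 0.0
    if abs(xi) < 1e-12:                 # exponential limit as xi -> 0
        return 1.0 - math.exp(-y / sigma)
    z = 1.0 + xi * y / sigma
    if z <= 0:                          # beyond the endpoint -sigma/xi (xi < 0)
        return 1.0
    return 1.0 - z ** (-1.0 / xi)

def gpd_quantile(p, sigma, xi):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if abs(xi) < 1e-12:
        return -sigma * math.log(1.0 - p)
    return sigma * ((1.0 - p) ** (-xi) - 1.0) / xi

def gpd_sample(n, sigma, xi, rng):
    """Draw n GPD excesses by inverting uniform random numbers."""
    return [gpd_quantile(rng.random(), sigma, xi) for _ in range(n)]

# Illustrative short-tailed case: sigma = 1, xi = -0.5, so the endpoint is -sigma/xi = 2.
rng = random.Random(0)
excesses = gpd_sample(1000, sigma=1.0, xi=-0.5, rng=rng)
```

With ξ < 0 every simulated excess stays below the endpoint −σ/ξ, which is a quick way to see the short-tailed case in action.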

POISSON-GPD MODEL FOR EXCEEDANCES

1. The number, $N$, of exceedances of the level $u$ in any one year has a Poisson distribution with mean $\lambda$,
2. Conditionally on $N \ge 1$, the excess values $Y_1, \ldots, Y_N$ are IID from the GPD.

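Relation to GEV for annual maxima: Suppose $x > u$. With excesses $Y_i$, the exceedance values are $u + Y_i$, and the probability that the annual maximum of the Poisson-GPD process is less than $x$ is
\[
\begin{aligned}
\Pr\Big\{ \max_{1 \le i \le N} (u + Y_i) \le x \Big\}
&= \Pr\{N = 0\} + \sum_{n=1}^{\infty} \Pr\{N = n,\ u + Y_1 \le x, \ldots, u + Y_n \le x\} \\
&= e^{-\lambda} + \sum_{n=1}^{\infty} \frac{\lambda^n e^{-\lambda}}{n!} \left\{ 1 - \left(1 + \xi \frac{x - u}{\sigma}\right)^{-1/\xi} \right\}^n \\
&= \exp\left\{ -\lambda \left(1 + \xi \frac{x - u}{\sigma}\right)^{-1/\xi} \right\}.
\end{aligned}
\]
This is GEV with
\[
\sigma = \psi + \xi(u - \mu), \qquad \lambda = \left(1 + \xi \frac{u - \mu}{\psi}\right)^{-1/\xi}.
\]
Thus the GEV and GPD models are entirely consistent with one another above the GPD threshold, and moreover this shows exactly how the Poisson-GPD parameters $\sigma$ and $\lambda$ vary with $u$.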
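The identity above can be checked numerically. The sketch below, with illustrative parameter values chosen only for the check (they are not from the talk) and assuming ξ ≠ 0, maps GEV parameters (µ, ψ, ξ) to Poisson-GPD parameters (λ, σ) at a threshold u and compares the two distribution functions at several points x > u.

```python
import math

def gev_cdf(x, mu, psi, xi):
    """GEV distribution function, assuming xi != 0."""
    z = 1.0 + xi * (x - mu) / psi
    if z <= 0:
        return 0.0 if xi > 0 else 1.0   # below lower / above upper endpoint
    return math.exp(-z ** (-1.0 / xi))

def poisson_gpd_max_cdf(x, u, lam, sigma, xi):
    """Pr{annual maximum <= x} for x > u under the Poisson-GPD model."""
    z = 1.0 + xi * (x - u) / sigma
    if z <= 0:
        return 1.0                      # x beyond the GPD upper endpoint (xi < 0)
    return math.exp(-lam * z ** (-1.0 / xi))

def gev_to_poisson_gpd(mu, psi, xi, u):
    """sigma = psi + xi (u - mu), lambda = (1 + xi (u - mu) / psi)^(-1/xi)."""
    sigma = psi + xi * (u - mu)
    lam = (1.0 + xi * (u - mu) / psi) ** (-1.0 / xi)
    return lam, sigma

# Illustrative values only.
mu, psi, xi, u = 10.0, 2.0, 0.2, 12.0
lam, sigma = gev_to_poisson_gpd(mu, psi, xi, u)
max_diff = max(
    abs(gev_cdf(x, mu, psi, xi) - poisson_gpd_max_cdf(x, u, lam, sigma, xi))
    for x in (13.0, 15.0, 20.0, 40.0)
)
print(lam, sigma, max_diff)
```

The two functions agree to floating-point accuracy for every x above the threshold, which is exactly the consistency claimed on the slide.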

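ALTERNATIVE PROBABILITY MODELS

1. The r largest order statistics model

If $Y_{n,1} \ge Y_{n,2} \ge \cdots \ge Y_{n,r}$ are the $r$ largest order statistics of an IID sample of size $n$, and $a_n$ and $b_n$ are the EVT normalizing constants, then
\[
\left( \frac{Y_{n,1} - b_n}{a_n}, \ldots, \frac{Y_{n,r} - b_n}{a_n} \right)
\]
converges in distribution to a limiting random vector $(X_1, \ldots, X_r)$, whose density is
\[
h(x_1, \ldots, x_r) = \psi^{-r} \exp\left\{ -\left(1 + \xi \frac{x_r - \mu}{\psi}\right)^{-1/\xi} - \left(1 + \frac{1}{\xi}\right) \sum_{j=1}^{r} \log\left(1 + \xi \frac{x_j - \mu}{\psi}\right) \right\}.
\]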

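2. Point process approach (Smith 1989)

The two-dimensional plot of exceedance times and exceedance levels forms a nonhomogeneous Poisson process with intensity measure, for sets of the form $A = (t_1, t_2) \times (y, \infty)$,
\[
\Lambda(A) = (t_2 - t_1)\, \Psi(y; \mu, \psi, \xi),
\]
\[
\Psi(y; \mu, \psi, \xi) = \left(1 + \xi \frac{y - \mu}{\psi}\right)^{-1/\xi} \qquad (1 + \xi(y - \mu)/\psi > 0).
\]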

Illustration of point process model.

An extension of this approach allows for nonstationary processes in which the parameters $\mu$, $\psi$ and $\xi$ are all allowed to be time-dependent, denoted $\mu_t$, $\psi_t$ and $\xi_t$. This is the basis of the extreme value regression approaches introduced later.

Comment. The point process approach is almost equivalent to the following: assume the GEV (not GPD) distribution is valid for exceedances over the threshold, and that all observations under the threshold are censored. Compared with the GPD approach, the parameterization directly in terms of $\mu$, $\psi$, $\xi$ is often easier to interpret, especially when trends are involved.

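ESTIMATION

GEV log likelihood:
\[
\ell_Y(\mu, \psi, \xi) = -N \log \psi - \left(\frac{1}{\xi} + 1\right) \sum_i \log\left(1 + \xi \frac{Y_i - \mu}{\psi}\right) - \sum_i \left(1 + \xi \frac{Y_i - \mu}{\psi}\right)^{-1/\xi}
\]
provided $1 + \xi(Y_i - \mu)/\psi > 0$ for each $i$.

Poisson-GPD model:
\[
\ell_{N,Y}(\lambda, \sigma, \xi) = N \log \lambda - \lambda T - N \log \sigma - \left(1 + \frac{1}{\xi}\right) \sum_{i=1}^{N} \log\left(1 + \xi \frac{Y_i}{\sigma}\right)
\]
provided $1 + \xi Y_i/\sigma > 0$ for all $i$, where $T$ is the length of the observation period in years.

Usual asymptotics valid if $\xi > -\frac{1}{2}$ (Smith 1985).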
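The two log likelihoods translate directly into code. The following is a sketch under the assumptions ξ ≠ 0, ψ > 0, σ > 0 and λ > 0; the function names and the toy excess values are hypothetical and serve only to exercise the formulas. Parameter values outside the support return −∞ so that an optimizer or sampler can reject them.

```python
import math

def gev_loglik(y, mu, psi, xi):
    """GEV log likelihood for a list of maxima y (xi != 0)."""
    total = -len(y) * math.log(psi)
    for yi in y:
        z = 1.0 + xi * (yi - mu) / psi
        if z <= 0:                      # support constraint violated
            return -math.inf
        total -= (1.0 / xi + 1.0) * math.log(z) + z ** (-1.0 / xi)
    return total

def poisson_gpd_loglik(excesses, T, lam, sigma, xi):
    """Poisson-GPD log likelihood for excesses observed over T years (xi != 0)."""
    n = len(excesses)
    total = n * math.log(lam) - lam * T - n * math.log(sigma)
    for yi in excesses:
        z = 1.0 + xi * yi / sigma
        if z <= 0:
            return -math.inf
        total -= (1.0 + 1.0 / xi) * math.log(z)
    return total

# Hypothetical toy data: 5 excesses in T = 10 years.
toy_excesses = [0.3, 1.2, 0.7, 4.5, 2.1]
T = 10.0
ll_at_rate = poisson_gpd_loglik(toy_excesses, T, lam=0.5, sigma=1.5, xi=0.3)
ll_low = poisson_gpd_loglik(toy_excesses, T, lam=0.4, sigma=1.5, xi=0.3)
ll_high = poisson_gpd_loglik(toy_excesses, T, lam=0.6, sigma=1.5, xi=0.3)
```

Because λ enters only through N log λ − λT, the likelihood in λ peaks at N/T (here 5/10 = 0.5) whatever the values of σ and ξ, which is why the middle evaluation is the largest of the three.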

Bayesian approaches

An alternative approach to extreme value inference is Bayesian, using vague priors for the GEV parameters and MCMC samples for the computations. Bayesian methods are particularly useful for predictive inference, e.g. if $Z$ is some as yet unobserved random variable whose distribution depends on $\mu$, $\psi$ and $\xi$, estimate $\Pr\{Z > z\}$ by
\[
\int \Pr\{Z > z; \mu, \psi, \xi\}\, \pi(\mu, \psi, \xi \mid Y)\, d\mu\, d\psi\, d\xi
\]
where $\pi(\cdot \mid Y)$ denotes the posterior density given past data $Y$.
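To illustrate the predictive integral, here is a small sketch that replaces the integral by an average over MCMC draws. Several choices are assumptions made for brevity rather than the method of the talk: the model is a GPD for threshold excesses instead of the GEV, the prior is flat on (log σ, ξ), the sampler is a plain random-walk Metropolis algorithm, and the data are simulated with σ = 1 and ξ = 0.3.

```python
import math
import random

def gpd_loglik(y, log_sigma, xi):
    """GPD log likelihood in terms of (log sigma, xi), xi != 0."""
    sigma = math.exp(log_sigma)
    total = -len(y) * log_sigma
    for yi in y:
        z = 1.0 + xi * yi / sigma
        if z <= 0:
            return -math.inf
        total -= (1.0 + 1.0 / xi) * math.log(z)
    return total

def metropolis(y, n_iter, step, rng):
    """Random-walk Metropolis on (log sigma, xi) under a flat prior (assumption)."""
    state = (math.log(sum(y) / len(y)), 0.1)
    current = gpd_loglik(y, *state)
    draws = []
    for _ in range(n_iter):
        prop = (state[0] + rng.gauss(0.0, step), state[1] + rng.gauss(0.0, step))
        if abs(prop[1]) > 1e-6:         # skip xi ~ 0, where the formula is singular
            cand = gpd_loglik(y, *prop)
            if rng.random() < math.exp(min(0.0, cand - current)):
                state, current = prop, cand
        draws.append(state)
    return draws

def gpd_survival(z, sigma, xi):
    """Pr{Z > z} for a GPD excess Z."""
    t = 1.0 + xi * z / sigma
    return t ** (-1.0 / xi) if t > 0 else 0.0

rng = random.Random(1)
# Simulated excesses from a GPD with sigma = 1, xi = 0.3 (illustrative).
y = [((1.0 - rng.random()) ** (-0.3) - 1.0) / 0.3 for _ in range(200)]
draws = metropolis(y, n_iter=4000, step=0.1, rng=rng)
kept = draws[1000:]                     # discard burn-in
# Monte Carlo version of the predictive integral for Pr{Z > 5}.
pred = sum(gpd_survival(5.0, math.exp(ls), xi) for ls, xi in kept) / len(kept)
print(pred)
```

The key design point is the last step: the predictive probability averages Pr{Z > z; parameters} over posterior draws, so parameter uncertainty is carried into the tail probability instead of being fixed at a point estimate.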

Plots of women's 3000 metre records, and profile log-likelihood for ultimate best value based on pre-1993 data.

Example. The left figure shows the five best running times by different athletes in the women's 3000 metre track event for each year from 1972 to 1992. Also shown on the plot is Wang Junxia's world record from 1993. Many questions were raised about possible illegal drug use. We approach this by asking how implausible Wang's performance was, given all data up to 1992. Robinson and Tawn (1995) used the r largest order statistics method (with r = 5, translated to smallest order statistics) to estimate an extreme value distribution, and hence computed a profile likelihood for $x_{\text{ult}}$, the lower endpoint of the distribution, based on data up to 1992 (right plot of previous figure).

Alternative Bayesian calculation (Smith 1997): Compute the (Bayesian) predictive probability that the 1993 performance is equal to or better than Wang's, given the data up to 1992, and conditional on the event that there is a new world record. The answer is approximately 0.0006.

Insurance Extremes Dataset

We return to the oil company data set discussed earlier. Prior to any of the analysis, some examination was made of clustering phenomena, but this only reduced the original 425 claims to 393 independent claims (Smith & Goodman 2000).

GPD fits to various thresholds:

u      N_u    Mean Excess   σ        ξ
0.5    393    7.11          1.02     1.01
2.5    132    17.89         3.47     0.91
5      73     28.9          6.26     0.89
10     42     44.05         10.51    0.84
15     31     53.60         5.68     1.44
20     17     91.21         19.92    1.10
25     13     113.7         74.46    0.93
50     6      37.97         150.8    0.29

Point process approach:

u      N_u    µ             log ψ          ξ
0.5    393    26.5 (4.4)    3.30 (0.24)    1.00 (0.09)
2.5    132    26.3 (5.2)    3.22 (0.31)    0.91 (0.16)
5      73     26.8 (5.5)    3.25 (0.31)    0.89 (0.21)
10     42     27.2 (5.7)    3.22 (0.32)    0.84 (0.25)
15     31     22.3 (3.9)    2.79 (0.46)    1.44 (0.45)
20     17     22.7 (5.7)    3.13 (0.56)    1.10 (0.53)
25     13     20.5 (8.6)    3.39 (0.66)    0.93 (0.56)

Standard errors are in parentheses.
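As a rough consistency check between the two tables, the sketch below converts point-process estimates (µ, log ψ, ξ) into Poisson-GPD parameters using σ = ψ + ξ(u − µ) and λ = (1 + ξ(u − µ)/ψ)^(−1/ξ). It assumes that time is measured in years, so that λ is an annual exceedance rate comparable with N_u/15; the inputs are the rounded values printed in the table, so only approximate agreement should be expected.

```python
import math

def point_process_to_poisson_gpd(mu, log_psi, xi, u):
    """Map point-process parameters to (lambda, sigma) at threshold u."""
    psi = math.exp(log_psi)
    sigma = psi + xi * (u - mu)
    lam = (1.0 + xi * (u - mu) / psi) ** (-1.0 / xi)
    return lam, sigma

# Rounded estimates copied from the point process table: (u, N_u, mu, log psi, xi).
rows = [
    (0.5, 393, 26.5, 3.30, 1.00),
    (5.0, 73, 26.8, 3.25, 0.89),
]
YEARS = 15  # length of the observation period stated for the data

results = []
for u, n_u, mu, log_psi, xi in rows:
    lam, sigma = point_process_to_poisson_gpd(mu, log_psi, xi, u)
    results.append((lam, sigma))
    print(f"u={u}: implied rate {lam:.2f}/yr vs observed {n_u / YEARS:.2f}/yr, "
          f"implied sigma {sigma:.2f}")
```

The implied rates come out close to the observed claim rates, and the implied σ values are close to the separately fitted GPD scales (1.02 and 6.26). Note that σ is a difference of two much larger numbers, so it is sensitive to the rounding of µ and log ψ.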

Predictive Distributions of Future Losses

What is the probability distribution of future losses over a specific time period, say 1 year? Let $Y$ be the future total loss, with distribution function $G(y; \mu, \psi, \xi)$; in practice this must itself be simulated.

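Traditional frequentist approach:
\[
\hat{G}(y) = G(y; \hat\mu, \hat\psi, \hat\xi)
\]
where $\hat\mu$, $\hat\psi$, $\hat\xi$ are MLEs.

Bayesian:
\[
\tilde{G}(y) = \int G(y; \mu, \psi, \xi)\, d\pi(\mu, \psi, \xi \mid X)
\]
where $\pi(\cdot \mid X)$ denotes the posterior density given data $X$.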
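Since G must itself be simulated, both versions can be sketched with the same routine: simulate one year of claims above a threshold u for a given (µ, ψ, ξ), and average over parameter draws. A single draw gives the plug-in estimate; several draws mimic the Bayesian mixture. In this sketch the plug-in values are the rounded u = 5 estimates from the point process table, while the list `hypothetical_draws` is invented purely for illustration and is not a real posterior sample. Only claims above u are counted in the total, and ξ ≠ 0 is assumed.

```python
import math
import random

def poisson(lam, rng):
    """Poisson variate: count unit-rate exponential arrivals in [0, lam]."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n

def simulate_total_loss(mu, psi, xi, u, rng):
    """One simulated year of total claims above threshold u."""
    sigma = psi + xi * (u - mu)
    lam = (1.0 + xi * (u - mu) / psi) ** (-1.0 / xi)
    total = 0.0
    for _ in range(poisson(lam, rng)):
        excess = sigma * ((1.0 - rng.random()) ** (-xi) - 1.0) / xi
        total += u + excess
    return total

def predictive_cdf(y, draws, u, n_sim, rng):
    """Monte Carlo estimate of Pr{total loss <= y}, averaged over parameter draws."""
    hits = 0
    for _ in range(n_sim):
        mu, psi, xi = rng.choice(draws)
        if simulate_total_loss(mu, psi, xi, u, rng) <= y:
            hits += 1
    return hits / n_sim

rng = random.Random(2)
u = 5.0
plug_in = [(26.8, math.exp(3.25), 0.89)]            # rounded table estimates
hypothetical_draws = [(26.8, 25.79, 0.89),          # invented, NOT a posterior sample
                      (25.0, 24.0, 0.80),
                      (28.0, 28.0, 0.95)]
g_hat = predictive_cdf(100.0, plug_in, u, 5000, rng)
g_tilde = predictive_cdf(100.0, hypothetical_draws, u, 5000, rng)
p_zero = predictive_cdf(0.0, plug_in, u, 20000, rng)  # Pr{no claims above u}
print(g_hat, g_tilde, p_zero)
```

A simple sanity check on the simulation is that the probability of a zero total equals the Poisson probability of no exceedances, exp(−λ), since every simulated claim is at least u.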

[Figure: Estimated posterior densities for the three parameters, and for the predictive distribution function. Four independent Monte Carlo runs are shown for each plot. (a) Posterior density of mu; (b) posterior density of log psi; (c) posterior density of xi; (d) amount of loss against N = 1/(probability of loss).]

Hierarchical models for claim type and year effects

Further features of the data:

1. When separate GPDs are fitted to each of the 6 main types, there are clear differences among the parameters.
2. The rate of high-threshold crossings does not appear uniform, but peaks around years 10 to 12.

A Hierarchical Model:

Level I. Parameters $m_\mu$, $m_\psi$, $m_\xi$, $s^2_\mu$, $s^2_\psi$, $s^2_\xi$ are generated from a prior distribution.

Level II. Conditional on the parameters in Level I, parameters $\mu_1, \ldots, \mu_J$ (where $J$ is the number of types) are independently drawn from $N(m_\mu, s^2_\mu)$, the normal distribution with mean $m_\mu$, variance $s^2_\mu$. Similarly, $\log \psi_1, \ldots, \log \psi_J$ are drawn independently from $N(m_\psi, s^2_\psi)$, and $\xi_1, \ldots, \xi_J$ are drawn independently from $N(m_\xi, s^2_\xi)$.

Level III. Conditional on Level II, for each $j \in \{1, \ldots, J\}$, the point process of exceedances of type $j$ is generated from the Poisson process with parameters $\mu_j$, $\psi_j$, $\xi_j$.
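Read generatively, Levels II and III can be simulated in a few lines. The sketch below fixes the Level I hyperparameters at hypothetical values (rather than drawing them from a prior), draws type-level parameters, and then simulates exceedances of a threshold u over T years for each type. All numerical values, the threshold and the function names are assumptions for illustration; fitting the model to data would require MCMC, which is not shown.

```python
import math
import random

def poisson(lam, rng):
    """Poisson variate: count unit-rate exponential arrivals in [0, lam]."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n

def draw_type_parameters(J, hyper, rng):
    """Level II: draw (mu_j, psi_j, xi_j), j = 1..J, given Level I values."""
    params = []
    for _ in range(J):
        mu = rng.gauss(hyper["m_mu"], hyper["s_mu"])
        psi = math.exp(rng.gauss(hyper["m_psi"], hyper["s_psi"]))  # normal on log psi
        xi = rng.gauss(hyper["m_xi"], hyper["s_xi"])
        params.append((mu, psi, xi))
    return params

def simulate_exceedances(mu, psi, xi, u, T, rng):
    """Level III: exceedances of u over T years from the point process (xi != 0)."""
    z = 1.0 + xi * (u - mu) / psi
    if z <= 0:
        raise ValueError("threshold u lies outside the support for these parameters")
    lam = z ** (-1.0 / xi)              # annual rate of exceedances of u
    sigma = psi * z                     # equals psi + xi * (u - mu)
    n = poisson(lam * T, rng)
    return [u + sigma * ((1.0 - rng.random()) ** (-xi) - 1.0) / xi for _ in range(n)]

# Hypothetical Level I values (standard deviations, not variances).
hyper = {"m_mu": 5.0, "s_mu": 0.5, "m_psi": math.log(5.0), "s_psi": 0.1,
         "m_xi": 0.7, "s_xi": 0.05}
rng = random.Random(42)
J, u, T = 6, 2.5, 15
params = draw_type_parameters(J, hyper, rng)
data = [simulate_exceedances(mu, psi, xi, u, T, rng) for mu, psi, xi in params]
print([len(d) for d in data])
```

The point of the hierarchy is visible in the structure: each claim type gets its own (µ_j, ψ_j, ξ_j), but all types share the same Level I distribution, so sparse types borrow strength from the others when the model is fitted.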

This model may be further extended to include a year effect, as follows. Suppose the extreme value parameters for type $j$ in year $k$ are not $\mu_j, \psi_j, \xi_j$ but $\mu_j + \delta_k, \psi_j, \xi_j$. We fix $\delta_1 = 0$ to ensure identifiability, and let $\{\delta_k, k > 1\}$ follow an AR(1) process:
\[
\delta_k = \rho\, \delta_{k-1} + \eta_k, \qquad \eta_k \sim N(0, s^2_\eta)
\]
with a vague prior on $(\rho, s^2_\eta)$.

We show boxplots for each of $\mu_j$, $\log \psi_j$, $\xi_j$, $j = 1, \ldots, 6$ and for $\delta_k$, $k = 2, \ldots, 15$.
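The year-effect recursion is easy to simulate. In this sketch the values of ρ and s_η and the type-level locations are hypothetical, chosen only to show how δ_k shifts the location parameter from year to year.

```python
import random

def simulate_year_effects(K, rho, s_eta, rng):
    """delta_1 = 0; delta_k = rho * delta_{k-1} + eta_k with eta_k ~ N(0, s_eta^2)."""
    delta = [0.0]
    for _ in range(1, K):
        delta.append(rho * delta[-1] + rng.gauss(0.0, s_eta))
    return delta

rng = random.Random(0)
delta = simulate_year_effects(K=15, rho=0.8, s_eta=0.1, rng=rng)

# Hypothetical type-level locations mu_j; the location for type j in year k is mu_j + delta_k.
mu_types = [5.0, 4.2, 6.1]
mu_jk = [[mu_j + d for d in delta] for mu_j in mu_types]

# With no innovation variance the effects stay at zero, recovering the model without year effects.
no_effect = simulate_year_effects(K=15, rho=0.8, s_eta=0.0, rng=rng)
```

Fixing δ_1 = 0 means the first-year location is exactly µ_j, which is what makes the µ_j and δ_k separately identifiable.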

[Figure: Posterior means, quartiles for $\mu_j$, $\log \psi_j$, $\xi_j$ ($j = 1, \ldots, 6$) and $\delta_k$ ($k = 2, \ldots, 15$). (a) mu by type; (b) log psi by type; (c) xi by type; (d) delta by year.]

[Figure: Posterior predictive distribution functions (log-log scale) for homogeneous model (curve A) and three versions of hierarchical model. Amount of loss against N = 1/(probability of loss). A: All data combined; B: Separate types; C: Separate types and years; D: As C, outliers omitted.]

II. ATTRIBUTION OF CLIMATE EXTREMES (Joint work with Michael Wehner, Lawrence Berkeley National Laboratory)

Superstorm Sandy on October 27 2012 (Scott Sistek)

Superstorm Sandy (www.guardian.co.uk; October 30, 2012)

Superstorm Sandy (www.cnn.com; October 31, 2012)

Motivating Question:

Concern over increasing frequency of extreme meteorological events
    Is the increasing frequency a result of anthropogenic influence?
    How much more rapidly will they increase in the future?
Focus on three specific events: heatwaves in Europe 2003, Russia 2010 and Central USA 2011
Identify meteorological variables of interest: JJA temperature averages over a region
    Europe: 10° W to 40° E, 30° to 50° N
    Russia: 30° to 60° E, 45° to 65° N
    Central USA: 90° to 105° W, 25° to 45° N
Probabilities of crossing thresholds (respectively 1.92K, 3.65K, 2.01K) in any year from 1990 to 2040.

Background

Stott, Stone and Allen (2004) defined the fraction of attributable risk (FAR):
    Observe some extreme event.
    Let $P_1$ be the probability of this event estimated from models including anthropogenic forcings, and $P_0$ the corresponding probability under natural forcings.
\[
FAR = 1 - \frac{P_0}{P_1} \qquad \left(\text{also consider } RR = \frac{P_1}{P_0}\right)
\]
For the Europe 2003 event they claimed $P_1 \approx \frac{1}{250}$, $P_0 \approx \frac{1}{1000}$, so $RR = 4$ and $FAR = 1 - \frac{1}{4} = 0.75$.
Very likely (confidence level at least 90%) that the FAR was at least 0.5 ($RR \ge 2$).
The method used a combination of extreme value theory, and detection and attribution methodology from the climate literature.
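The arithmetic behind these two summaries is a one-liner each. The sketch below uses the probabilities quoted for the Europe 2003 event; the function names are illustrative.

```python
def relative_risk(p1, p0):
    """RR = P1 / P0: how many times more likely the event is with anthropogenic forcings."""
    return p1 / p0

def fraction_attributable_risk(p1, p0):
    """FAR = 1 - P0 / P1, equivalently 1 - 1 / RR."""
    return 1.0 - p0 / p1

p1, p0 = 1.0 / 250.0, 1.0 / 1000.0   # values quoted for the Europe 2003 event
rr = relative_risk(p1, p0)
far = fraction_attributable_risk(p1, p0)
print(rr, far)
```

Because FAR = 1 − 1/RR, the two thresholds on the slide are the same statement: FAR ≥ 0.5 holds exactly when RR ≥ 2.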