Modelling Multivariate Peaks-over-Thresholds using Generalized Pareto Distributions

Anna Kiriliouk (1), Holger Rootzén (2), Johan Segers (1), Jennifer L. Wadsworth (3)
(1) Université catholique de Louvain (BE); (2) Chalmers University of Technology (S); (3) Lancaster University (UK)
Risk, Extremes and Contagion, Université de Nanterre, May 25-26, 2016
Kiriliouk/Rootzén/Segers/Wadsworth, Multivariate Generalized Pareto Distributions, Nanterre, 26 May 2016

Multivariate Peaks-over-Thresholds
$\xi$ exceeds a multivariate threshold $u$:
$$\xi \not\le u \iff \exists\, j = 1, \ldots, d : \xi_j > u_j$$
Distribution of the excess $\xi - u$ given $\xi \not\le u$?
[Figure: exceedance region in the $(\xi_1, \xi_2)$ plane, with thresholds $u_1$ and $u_2$.]
Other exceedances (Thibaud and Opitz, 2013; Dombry and Ribatet, 2015):
$\forall\, j = 1, \ldots, d : \xi_j > u_j$
$\xi_1 + \cdots + \xi_d > u_1 + \cdots + u_d$
...

Multivariate Generalized Pareto Distributions
Maxima and exceedances
Parametrization
Representations and densities
Stability
Inference

Generalized Extreme-Value distributions
Random vector $\xi$ in $\mathbb{R}^d$, cdf $F_\xi(x) = P(\xi \le x)$.
Assumption: max-domain of attraction. There exist scaling sequences $a_n > 0$ and location sequences $b_n \in \mathbb{R}^d$ such that
$$F_\xi^n(a_n x + b_n) \to G(x), \qquad n \to \infty$$
$F^n$ is the cdf of the vector of componentwise maxima of an iid sample of size $n$ from $F$.
The limit $G$ is necessarily Generalized Extreme-Value = Max-Stable.

Multivariate Generalized Pareto distribution
If $F_\xi^n(a_n x + b_n) \to G(x)$ as $n \to \infty$, then, provided $0 < G(0) < 1$ (w.l.o.g.),
$$\mathcal{L}\bigl( a_n^{-1}(\xi - b_n) \vee \eta \mid \xi \not\le b_n \bigr) \xrightarrow{d} H \quad \text{(Multivariate Generalized Pareto)}$$
where $\eta_j \in [-\infty, \infty)$ is the lower endpoint of $G_j$.
Cumulative distribution function $H = \mathrm{GP}(G)$: for $x$ such that $G(x) > 0$,
$$H(x) = \frac{-\log G(x \wedge 0) - (-\log G(x))}{-\log G(0)}$$
Beirlant et al. (2004); Rootzén and Tajvidi (2006); Falk et al. (2010); Ferreira and de Haan (2014)

Independent maxima: single-component exceedances
Generalized Extreme-Value $G$:
$$G(x_1, x_2) = \exp\Bigl( -\frac{1}{x_1 + 1} - \frac{1}{x_2 + 1} \Bigr), \qquad x_1 > -1,\ x_2 > -1$$
Generalized Pareto $H$: law of
$$(X_1, X_2) = \begin{cases} (-1, Y) & \text{wp } 1/2, \\ (Y, -1) & \text{wp } 1/2, \end{cases}$$
with $P(Y \le y) = 1 - \frac{1}{1 + y}$ for $y \ge 0$ (univariate Generalized Pareto).
[Figure: support of $H$ in the $(X_1, X_2)$ plane.]
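The claim on this slide can be checked numerically. The sketch below evaluates $H = \mathrm{GP}(G)$ through the general formula $H(x) = \{-\log G(x \wedge 0) + \log G(x)\} / \{-\log G(0)\}$ and compares it with the cdf of the two-point mixture. The function names are illustrative choices, not taken from the talk, and the comparison assumes $x_1, x_2 > -1$.

```python
import math

def neg_log_G(x1, x2):
    """-log G(x) for the independent example; requires x1, x2 > -1."""
    return 1.0 / (x1 + 1.0) + 1.0 / (x2 + 1.0)

def H_from_G(x1, x2):
    """H(x) = {-log G(min(x, 0)) + log G(x)} / {-log G(0)}."""
    num = neg_log_G(min(x1, 0.0), min(x2, 0.0)) - neg_log_G(x1, x2)
    return num / neg_log_G(0.0, 0.0)

def F_Y(y):
    """cdf of the univariate GP variable Y: 1 - 1/(1 + y) for y >= 0."""
    return 1.0 - 1.0 / (1.0 + y) if y >= 0.0 else 0.0

def H_mixture(x1, x2):
    """cdf of (-1, Y) wp 1/2 and (Y, -1) wp 1/2, for x1, x2 > -1."""
    return 0.5 * F_Y(x2) + 0.5 * F_Y(x1)

for point in [(0.5, 2.0), (-0.5, 3.0), (-0.2, -0.3), (4.0, -0.9)]:
    print(point, H_from_G(*point), H_mixture(*point))
```

Both columns agree because, with independent maxima, the stable tail dependence function is additive, so each coordinate contributes its own univariate term.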

Completely dependent maxima and exceedances
Generalized Extreme-Value $G$:
$$G(x_1, x_2) = \exp\Bigl( -\frac{1}{\min(x_1, x_2) + 1} \Bigr), \qquad x_1 > -1,\ x_2 > -1$$
Generalized Pareto $H$: law of $(Y, Y)$ with $P(Y \le y) = 1 - \frac{1}{1 + y}$ for $y \ge 0$ (univariate Generalized Pareto).
[Figure: support of $H$ in the $(X_1, X_2)$ plane.]

Example: logistic
Generalized Extreme-Value:
$$G(x_1, x_2) = \exp\Bigl[ -\Bigl\{ \sum_{j=1}^{2} (1 + x_j)^{-1/\theta} \Bigr\}^{\theta} \Bigr], \qquad x_1 > -1,\ x_2 > -1$$
Generalized Pareto $H$, pdf:
$$h(x_1, x_2) = -2^{-\theta} \frac{\partial^2}{\partial x_1 \partial x_2} \Bigl\{ \sum_{j=1}^{2} (1 + x_j)^{-1/\theta} \Bigr\}^{\theta}$$
Support: $x_1 > -1$, $x_2 > -1$, $x_1 \vee x_2 > 0$.
[Figure: contour plot of $h$ on $[-1, 1]^2$, $\theta = 0.4$.]

Generalized Extreme-Value distributions
Margins: shape $\gamma_j \in \mathbb{R}$, location $\mu_j \in \mathbb{R}$, scale $\alpha_j \in (0, \infty)$:
$$G_j(x_j) = \begin{cases} \exp[-\{1 + \gamma_j (x_j - \mu_j)/\alpha_j\}^{-1/\gamma_j}] & \text{if } \gamma_j \ne 0, \\ \exp[-\exp\{-(x_j - \mu_j)/\alpha_j\}] & \text{if } \gamma_j = 0. \end{cases}$$
Stable tail dependence function (stdf) $\ell : [0, \infty)^d \to [0, \infty)$:
$$G(x) = \exp[-\ell\{-\log G_1(x_1), \ldots, -\log G_d(x_d)\}]$$
Notation: $G = \mathrm{GEV}(\gamma, \mu, \alpha, \ell)$

Identifiability issue
Max-stability: $G = \mathrm{GEV}(\gamma, \mu, \alpha, \ell) \implies G^t = \mathrm{GEV}(\gamma, \mu(t), \alpha(t), \ell)$, $0 < t < \infty$, with $\mu(t) = \ldots$ and $\alpha(t) = \ldots$ (exercise).
However: $\mathrm{GP}(G^t) = \mathrm{GP}(G)$.
Therefore: not all GEV parameters are identifiable from $H = \mathrm{GP}(G)$.
Solution?
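The invariance $\mathrm{GP}(G^t) = \mathrm{GP}(G)$ follows in one line from the cdf formula $H(x) = \{\log G(x) - \log G(x \wedge 0)\} / \{-\log G(0)\}$ given earlier in the talk, since $\log G^t = t \log G$ and the factor $t$ cancels. A short worked version, written out here as a reading aid:

```latex
\begin{align*}
\mathrm{GP}(G^t)(x)
  &= \frac{\log G^t(x) - \log G^t(x \wedge 0)}{-\log G^t(0)} \\
  &= \frac{t \,\{\log G(x) - \log G(x \wedge 0)\}}{-t \log G(0)}
   = \mathrm{GP}(G)(x), \qquad 0 < t < \infty .
\end{align*}
```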

Cumulative distribution function
Let $G = \mathrm{GEV}(\gamma, \mu, \alpha, \ell)$. Suppose $\sigma = \alpha - \gamma\mu \in (0, \infty)^d$. Then $0 < G(0) < 1$. Let $H = \mathrm{GP}(G)$.
For $x \in \mathbb{R}^d$ such that $\sigma + \gamma x > 0$ (all operations componentwise),
$$H(x) = \ell\bigl( \pi \,(1 + \gamma (x \wedge 0)/\sigma)^{-1/\gamma} \bigr) - \ell\bigl( \pi \,(1 + \gamma x/\sigma)^{-1/\gamma} \bigr)$$
where, in case also $x \ge 0$,
$$\bar{H}_j(0) = \pi_j, \qquad \frac{\bar{H}_j(x_j)}{\bar{H}_j(0)} = (1 + \gamma_j x_j/\sigma_j)^{-1/\gamma_j}, \qquad H(x) = 1 - \ell\bigl( \bar{H}_1(x_1), \ldots, \bar{H}_d(x_d) \bigr)$$
Parametrization: $H = \mathrm{GP}(\gamma, \sigma, \pi, \ell)$. Constraint: $\ell(\pi) = 1$.

Standardization
We have $X \sim \mathrm{GP}(\gamma, \sigma, \pi, \ell)$ if and only if
$$X = \sigma \frac{e^{\gamma Z} - 1}{\gamma} \quad \text{and} \quad Z \sim \mathrm{GP}(0, 1, \pi, \ell).$$
The support of $Z$ is contained in $[-\infty, \infty)^d \setminus [-\infty, 0]^d$ and
$$P[Z \le z] = \ell\bigl( \pi e^{-(z \wedge 0)} \bigr) - \ell\bigl( \pi e^{-z} \bigr)$$
How to construct $Z$?
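The marginal transformation between the standardized vector $Z$ and $X$ can be sketched as follows. The helper names `z_to_x` and `x_to_z` are hypothetical, the case $\gamma_j = 0$ is read as the limit $X_j = \sigma_j Z_j$, and only finite components are handled in this sketch.

```python
import math

def z_to_x(z, gamma, sigma):
    """Componentwise X_j = sigma_j * (exp(gamma_j * Z_j) - 1) / gamma_j."""
    x = []
    for zj, gj, sj in zip(z, gamma, sigma):
        if gj == 0.0:
            x.append(sj * zj)  # limit of the formula as gamma_j -> 0
        else:
            x.append(sj * math.expm1(gj * zj) / gj)
    return x

def x_to_z(x, gamma, sigma):
    """Inverse map; requires sigma_j + gamma_j * x_j > 0 for every j."""
    z = []
    for xj, gj, sj in zip(x, gamma, sigma):
        if gj == 0.0:
            z.append(xj / sj)
        else:
            z.append(math.log1p(gj * xj / sj) / gj)
    return z

gamma = [0.2, 0.0, -0.3]
sigma = [1.0, 2.0, 0.5]
z = [1.5, -0.7, 0.4]
x = z_to_x(z, gamma, sigma)
print(x, x_to_z(x, gamma, sigma))
```

Using `expm1` and `log1p` keeps the map accurate when $\gamma_j Z_j$ is close to zero, where the naive formula loses precision.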

Spectral representation
A random vector $S$ in $[-\infty, 0]^d$ is a spectral random vector if:
$P[\max(S_1, \ldots, S_d) = 0] = 1$
$P[S_j > -\infty] > 0$, $j = 1, \ldots, d$
Let $E$ be a unit exponential random variable independent of $S$. Then
$$Z := S + E \sim \mathrm{GP}(0, 1, \pi, \ell)$$
where
$$\pi_j = \mathrm{E}[e^{S_j}], \qquad \ell(y) = \mathrm{E}\Bigl[ \max_{j = 1, \ldots, d} \Bigl\{ y_j \frac{e^{S_j}}{\pi_j} \Bigr\} \Bigr], \qquad H(z) = 1 - \mathrm{E}[1 \wedge e^{\max(S - z)}]$$
Conversely: given $\pi$ and $\ell$ with $\ell(\pi) = 1$, there exists a unique (in law) spectral random vector $S$ such that the above holds.
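For a spectral vector with finitely many atoms, all expectations on this slide are finite sums, so the two expressions for the cdf, $H(z) = 1 - \mathrm{E}[1 \wedge e^{\max(S - z)}]$ and $H(z) = \ell(\pi e^{-(z \wedge 0)}) - \ell(\pi e^{-z})$, can be compared exactly. The atoms and probabilities below are made up for illustration; each atom has maximum 0, and one has a component equal to $-\infty$.

```python
import math

# Hypothetical discrete spectral vector in d = 2 (illustrative values only).
ATOMS = [(0.0, -1.0), (-2.0, 0.0), (0.0, 0.0), (0.0, -math.inf)]
PROBS = [0.4, 0.3, 0.2, 0.1]
D = 2

def expect(f):
    """Expectation of f(S) under the discrete law above."""
    return sum(p * f(s) for s, p in zip(ATOMS, PROBS))

# pi_j = E[exp(S_j)]
PI = [expect(lambda s, j=j: math.exp(s[j])) for j in range(D)]

def ell(y):
    """Stable tail dependence function l(y) = E[max_j y_j exp(S_j) / pi_j]."""
    return expect(lambda s: max(y[j] * math.exp(s[j]) / PI[j] for j in range(D)))

def H_spectral(z):
    """H(z) = 1 - E[min(1, exp(max_j (S_j - z_j)))]."""
    return 1.0 - expect(
        lambda s: min(1.0, math.exp(max(s[j] - z[j] for j in range(D))))
    )

def H_stdf(z):
    """H(z) = l(pi * exp(-min(z, 0))) - l(pi * exp(-z))."""
    lower = [PI[j] * math.exp(-min(z[j], 0.0)) for j in range(D)]
    full = [PI[j] * math.exp(-z[j]) for j in range(D)]
    return ell(lower) - ell(full)

print("l(pi) =", ell(PI))
for z in [(0.5, 1.0), (-0.3, 0.8), (2.0, -1.5)]:
    print(z, H_spectral(z), H_stdf(z))
```

The agreement rests on $\max(S) = 0$, which gives $e^{\max(S - z \wedge 0)} = 1 \vee e^{\max(S - z)}$.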

Constructing spectral random vectors (1)
Let $T$ be a random vector in $[-\infty, \infty)^d$ such that
$0 < \mathrm{E}[e^{T_j}] < \infty$, $j = 1, \ldots, d$
$P[\max(T_1, \ldots, T_d) > -\infty] = 1$
Then
$$S = T - \max(T)$$
is a spectral random vector. The associated $\mathrm{GP}(0, 1, \pi, \ell) = \mathrm{GP}_S(0, 1, \mathcal{L}(S))$ distribution is determined by
$$\pi_j = \mathrm{E}[e^{T_j - \max(T)}], \qquad \ell(y) = \mathrm{E}\Bigl[ \max_{j = 1, \ldots, d} \Bigl\{ y_j \frac{e^{T_j - \max(T)}}{\mathrm{E}[e^{T_j - \max(T)}]} \Bigr\} \Bigr], \qquad H(z) = 1 - \mathrm{E}[1 \wedge e^{\max(T - z) - \max(T)}]$$
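This construction translates directly into a sampler: draw $T$, subtract its maximum, and add an independent unit exponential. The sketch below assumes a bivariate Gaussian $T$ with unit variances and correlation `rho`, chosen purely for illustration; the function name `sample_gp_T` is hypothetical.

```python
import math
import random

def sample_gp_T(n, rho, rng):
    """Draw n points Z = T - max(T) + E with T ~ N_2(0, [[1, rho], [rho, 1]])."""
    draws = []
    for _ in range(n):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        t = (g1, rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        m = max(t)                # max(T)
        e = rng.expovariate(1.0)  # unit exponential, independent of T
        draws.append(tuple(tj - m + e for tj in t))
    return draws

rng = random.Random(2016)
zs = sample_gp_T(20000, 0.5, rng)
# max(Z) equals E, so every draw has at least one positive component
# and P[max(Z) > 1] should be close to exp(-1).
frac = sum(1 for z in zs if max(z) > 1.0) / len(zs)
print(frac, math.exp(-1.0))
```

Because $\max(S) = 0$, the largest component of $Z$ is exactly the exponential variable, which gives a cheap sanity check on any sampler built this way.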

Density (1)
Suppose $T$ has support included in $\mathbb{R}^d$ and is absolutely continuous with Lebesgue density $f_T$. Then the $\mathrm{GP}_T(0, 1, \mathcal{L}(T))$ distribution has Lebesgue density $h$ given by
$$h(z) = \mathbf{1}(z \not\le 0)\, \frac{1}{e^{\max(z)}} \int_0^\infty f_T(z + \log t)\, t^{-1}\, dt.$$

Example: max-normalized log-Gaussian generators (1)
Pdf of $H = \mathrm{GP}_T(0, 1, N_d(0, \Sigma))$:
$$h(z) = \frac{(2\pi)^{(1-d)/2}\, |\Sigma|^{-1/2}\, (\mathbf{1}^T \Sigma^{-1} \mathbf{1})^{-1/2}}{e^{\max(z)}} \exp\Bigl\{ -\tfrac{1}{2}\, z^T \Bigl( \Sigma^{-1} - \frac{\Sigma^{-1} \mathbf{1} \mathbf{1}^T \Sigma^{-1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} \Bigr) z \Bigr\}, \qquad z \in \mathbb{R}^d \setminus (-\infty, 0]^d$$
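The closed form can be cross-checked against the general density formula $h(z) = \mathbf{1}(z \not\le 0)\, e^{-\max(z)} \int_0^\infty f_T(z + \log t)\, t^{-1}\, dt$. The sketch below takes $d = 2$ with unit variances and correlation `rho` (an illustrative choice), substitutes $s = \log t$, and applies a plain trapezoidal rule; the grid width and step are ad hoc assumptions.

```python
import math

def gauss2_pdf(x1, x2, rho):
    """Density of N_2(0, [[1, rho], [rho, 1]])."""
    det = 1.0 - rho * rho
    q = (x1 * x1 - 2.0 * rho * x1 * x2 + x2 * x2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def h_quadrature(z1, z2, rho, half_width=30.0, step=0.01):
    """exp(-max z) * integral over s of f_T(z + s), s = log t, trapezoidal rule."""
    if max(z1, z2) <= 0.0:
        return 0.0
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        s = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * gauss2_pdf(z1 + s, z2 + s, rho)
    return total * step * math.exp(-max(z1, z2))

def h_closed_form(z1, z2, rho):
    """Closed-form density for d = 2, obtained by completing the square in s."""
    if max(z1, z2) <= 0.0:
        return 0.0
    det = 1.0 - rho * rho
    a11, a12 = 1.0 / det, -rho / det          # entries of Sigma^{-1}
    one_A_one = 2.0 * (a11 + a12)             # 1' Sigma^{-1} 1
    one_A_z = (a11 + a12) * (z1 + z2)         # 1' Sigma^{-1} z
    z_A_z = a11 * (z1 * z1 + z2 * z2) + 2.0 * a12 * z1 * z2
    quad = z_A_z - one_A_z ** 2 / one_A_one
    const = (2.0 * math.pi) ** -0.5 * det ** -0.5 * one_A_one ** -0.5
    return const * math.exp(-max(z1, z2)) * math.exp(-0.5 * quad)

print(h_quadrature(1.0, -0.5, 0.5), h_closed_form(1.0, -0.5, 0.5))
```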

Example: max-normalized log-Gaussian generators (2)
Pdf of $(Z_1, Z_2) \sim \mathrm{GP}_T$ with $\gamma_1 = \gamma_2 = 0$, $\sigma_1 = \sigma_2 = 1$, and
$$(T_1, T_2) \sim N_2\Bigl( 0, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \Bigr)$$
[Figure: contour plot of the pdf, $\rho = 0.5$.]

Example: max-normalized log-Gaussian generators (2)
Pdf of $(Z_1, Z_2) \sim \mathrm{GP}_T$ with $\gamma_1 = \gamma_2 = 0$, $\sigma_1 = \sigma_2 = 1$, and
$$(T_1, T_2) \sim N_2\Bigl( 0, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \Bigr)$$
[Figure: contour plot of the pdf, $\rho = 0$.]

Constructing spectral random vectors (2)
Let $U$ be a random vector in $[-\infty, \infty)^d$ such that $0 < \mathrm{E}[e^{U_j}] < \infty$ for all $j$.
Let $S$ be the spectral random vector defined in distribution by
$$P[S \in \cdot\,] = \frac{\mathrm{E}\bigl[ \mathbf{1}\{U - \max(U) \in \cdot\,\}\, e^{\max(U)} \bigr]}{\mathrm{E}[e^{\max(U)}]}$$
The associated $\mathrm{GP}(0, 1, \pi, \ell) = \mathrm{GP}_S(0, 1, \mathcal{L}(S))$ distribution is given by
$$\pi_j = \frac{\mathrm{E}[e^{U_j}]}{\mathrm{E}[e^{\max(U)}]}, \qquad \ell(y) = \mathrm{E}\Bigl[ \max_{j = 1, \ldots, d} \Bigl\{ y_j \frac{e^{U_j}}{\mathrm{E}[e^{U_j}]} \Bigr\} \Bigr], \qquad H(z) = 1 - \frac{\mathrm{E}[e^{\max(U)} \wedge e^{\max(U - z)}]}{\mathrm{E}[e^{\max(U)}]}$$

Density (2)
Suppose $U$ has support included in $\mathbb{R}^d$ and is absolutely continuous with Lebesgue density $f_U$. Then the $\mathrm{GP}_U(0, 1, \mathcal{L}(U))$ distribution has Lebesgue density $h$ given by
$$h(z) = \mathbf{1}(z \not\le 0)\, \frac{1}{\mathrm{E}[e^{\max(U)}]} \int_0^\infty f_U(z + \log t)\, dt.$$

Example: independent Fréchet generators
$e^{U_j}$ independent Fréchet, shape $\alpha > 0$, scales $\lambda_j > 0$.
$\mathrm{GP}_U(0, 1, \mathcal{L}(U))$ density (explicit proportionality constant):
$$h(z) \propto \frac{e^{-\alpha \sum_{j=1}^{d} z_j}}{\Bigl( \sum_{j=1}^{d} (e^{z_j}/\lambda_j)^{-\alpha} \Bigr)^{d - 1/\alpha}}, \qquad z \in \mathbb{R}^d \setminus (-\infty, 0]^d$$
[Figure: contour plot of the bivariate density. Margins: $\gamma_1 = \gamma_2 = 0$, $\sigma_1 = \sigma_2 = 1$. Dependence: $\alpha = 2$, $\lambda_1 = 2$, $\lambda_2 = 1$.]
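The Fréchet kernel can be checked against the general formula $h(z) = \mathbf{1}(z \not\le 0)\, \mathrm{E}[e^{\max(U)}]^{-1} \int_0^\infty f_U(z + \log t)\, dt$. The sketch below assumes $\alpha > 1$ so that $\mathrm{E}[e^{U_j}]$ is finite. The normalising constant and the Gamma-function factor are worked out here from standard Fréchet facts and are not quoted from the slides; the integration grid is an ad hoc choice.

```python
import math

def frechet_U_pdf(u, alpha, lam):
    """Density of U = log F, where F is Frechet with shape alpha and scale lam."""
    w = (lam ** alpha) * math.exp(-alpha * u)
    return alpha * w * math.exp(-w)

def norm_const(alpha, lams):
    """E[exp(max U)]: the max of independent Frechets is Frechet with scale (sum lam^alpha)^(1/alpha)."""
    return math.gamma(1.0 - 1.0 / alpha) * sum(l ** alpha for l in lams) ** (1.0 / alpha)

def h_quadrature(z, alpha, lams, lo=-10.0, hi=30.0, step=0.005):
    """Trapezoidal rule after the substitution t = exp(s), so dt = exp(s) ds."""
    if max(z) <= 0.0:
        return 0.0
    n = int(round((hi - lo) / step))
    total = 0.0
    for k in range(n + 1):
        s = lo + k * step
        val = math.exp(s)
        for zj, lj in zip(z, lams):
            val *= frechet_U_pdf(zj + s, alpha, lj)
        total += (0.5 if k in (0, n) else 1.0) * val
    return total * step / norm_const(alpha, lams)

def h_closed_form(z, alpha, lams):
    """Kernel from the slide, with the constant derived in this sketch."""
    if max(z) <= 0.0:
        return 0.0
    d = len(z)
    c = sum((math.exp(zj) / lj) ** (-alpha) for zj, lj in zip(z, lams))
    prod = 1.0
    for zj, lj in zip(z, lams):
        prod *= (lj ** alpha) * math.exp(-alpha * zj)
    integral = alpha ** (d - 1) * math.gamma(d - 1.0 / alpha) * prod / c ** (d - 1.0 / alpha)
    return integral / norm_const(alpha, lams)

z = (1.0, -0.5)
print(h_quadrature(z, 2.0, (2.0, 1.0)), h_closed_form(z, 2.0, (2.0, 1.0)))
```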

Example: independent Beta generators
$e^{U_j}$ independent Beta, shapes $(\alpha_j, 1)$, scales $\lambda_j$.
$\mathrm{GP}_U(0, 1, \mathcal{L}(U))$ density (explicit proportionality constant):
$$h(z) \propto \frac{\prod_{j=1}^{d} e^{z_j/\alpha_j}}{\max(\lambda e^{z})^{\sum_{j=1}^{d} 1/\alpha_j + 1}}, \qquad z \in \mathbb{R}^d \setminus (-\infty, 0]^d$$
[Figure: contour plot of the bivariate density. Margins: $\gamma_1 = \gamma_2 = 0$, $\sigma_1 = \sigma_2 = 1$. Dependence: $\alpha_1 = 2$, $\alpha_2 = 3$, $\lambda_1 = 2$, $\lambda_2 = 1$.]

Point process representation
iid $U, U_1, U_2, \ldots$ in $[-\infty, \infty)^d$ such that $0 < \mathrm{E}[e^{U_j}] < \infty$ for all $j$
unit-rate Poisson process $0 < R_1 < R_2 < \ldots$
$(U_i)_i$ and $(R_i)_i$ are independent
Point process consisting of points $\xi_i$ ($i = 1, 2, \ldots$) where
$$\xi_i = U_i - \log R_i$$
The law, $G$, of $\max_i \xi_i$ is GEV (de Haan, 1984):
$$-\log G(z) = \mathrm{E}[e^{\max(U - z)}]$$
Associated GP: $H = \mathrm{GP}(G) = \mathrm{GP}_U(0, 1, \mathcal{L}(U))$

Lower-dimensional margins
Let $X$ be GP and let $J \subset \{1, \ldots, d\}$.
The law of $X_J = (X_j)_{j \in J}$ is not GP.
The law of $X_J$ given that $\max_{j \in J} X_j > 0$ is GP:
$$X \sim \mathrm{GP}(\gamma, \sigma, \pi, \ell) \implies \mathcal{L}(X_J \mid X_J \not\le 0) = \mathrm{GP}\bigl( \gamma_J, \sigma_J, (P[X_j > 0 \mid X_J \not\le 0])_{j \in J}, \ell_J \bigr)$$
$$X \sim \mathrm{GP}_U(\gamma, \sigma, \mathcal{L}(U)) \implies \mathcal{L}(X_J \mid X_J \not\le 0) = \mathrm{GP}_U(\gamma_J, \sigma_J, \mathcal{L}(U_J))$$
If $J = \{j\}$, we find that $\mathcal{L}(X_j \mid X_j > 0) = \mathrm{GP}(\gamma_j, \sigma_j)$.

Threshold stability
Univariate GP distributions are threshold stable. What if multivariate?
Let $X \sim \mathrm{GP}(\gamma, \sigma, \pi, \ell)$. Let $u \in [0, \infty)^d$ be such that $P(X_j > u_j) > 0$ for all $j$. Then
$$\mathcal{L}(X - u \mid X \not\le u) = \mathrm{GP}\bigl( \gamma, \sigma + \gamma u, (P[X_j > u_j \mid X \not\le u])_j, \ell \bigr)$$
Change of marginal parameters as in the univariate case
Change from $\pi_j = P[X_j > 0]$ to $P[X_j > u_j \mid X \not\le u]$
Same stdf $\ell$
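The marginal parameter change $\sigma \mapsto \sigma + \gamma u$ mirrors univariate threshold stability, which is easy to verify directly: for a univariate $\mathrm{GP}(\gamma, \sigma)$ variable, $P[X - u > x \mid X > u]$ equals the $\mathrm{GP}(\gamma, \sigma + \gamma u)$ survival function. The function names and parameter values in this sketch are illustrative.

```python
import math

def gp_sf(x, gamma, sigma):
    """Survival function P[X > x] of the univariate GP(gamma, sigma), for x >= 0."""
    if gamma == 0.0:
        return math.exp(-x / sigma)
    base = 1.0 + gamma * x / sigma
    return base ** (-1.0 / gamma) if base > 0.0 else 0.0

def excess_sf(x, u, gamma, sigma):
    """P[X - u > x | X > u], computed from the original GP(gamma, sigma)."""
    return gp_sf(u + x, gamma, sigma) / gp_sf(u, gamma, sigma)

sigma, u, x = 1.5, 0.7, 0.4
for gamma in (0.3, 0.0, -0.2):
    lhs = excess_sf(x, u, gamma, sigma)
    rhs = gp_sf(x, gamma, sigma + gamma * u)  # GP(gamma, sigma + gamma * u)
    print(gamma, lhs, rhs)
```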

Linear combinations
Let $X \sim \mathrm{GP}_S(\gamma, \sigma, \pi, \ell)$ be such that $\gamma_1 = \ldots = \gamma_d =: \gamma$.
Let $a \in [0, \infty)^d$ and write $a^T X = \sum_{j=1}^{d} a_j X_j$. If $P[a^T X > 0] > 0$, then
$$\mathcal{L}(a^T X \mid a^T X > 0) = \mathrm{GP}(\gamma, a^T \sigma).$$

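Linear transformations
Let $X \sim \mathrm{GP}_S(\gamma, \sigma, \mathcal{L}(S))$ be such that $\gamma_1 = \ldots = \gamma_d =: \gamma$.
Let $A = (a_{i,j})_{i,j} \in [0, \infty)^{m \times d}$ be such that $P[A_i X > 0] > 0$ for all $i$.
For $x \in \mathbb{R}^m$ such that $A_i \sigma + \gamma x_i > 0$ for all $i = 1, \ldots, m$,
$$P[AX \le x] = \mathrm{E}\Bigl[ \Bigl\{ 1 - \max_{i = 1, \ldots, m} \bigl( (1 + \gamma x_i / A_i \sigma)^{-1/\gamma}\, e^{U_i} \bigr) \Bigr\}_+ \Bigr]$$
where $U = (U_1, \ldots, U_m)$ is given by
$$U_i = \begin{cases} \gamma^{-1} \log\bigl( \sum_{j=1}^{d} p_{i,j}\, e^{\gamma S_j} \bigr) & \text{if } \gamma \ne 0, \\ \sum_{j=1}^{d} p_{i,j}\, S_j & \text{if } \gamma = 0, \end{cases}$$
where $p_{i,j} = a_{i,j} \sigma_j / A_i \sigma$.
As a consequence, $\mathcal{L}(AX \mid AX \not\le 0) = \mathrm{GP}_U(\gamma, A\sigma, \mathcal{L}(U))$.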

Inference problem
Data $\xi_1, \ldots, \xi_n$. Threshold $u$. Threshold exceedances: $\{\xi_i : \xi_i \not\le u\}$.
Model:
$$\mathcal{L}(\xi \mid \xi \not\le u) = \mathcal{L}\Bigl( u + \sigma \frac{e^{\gamma Z} - 1}{\gamma} \Bigr), \qquad Z \sim \mathrm{GP}(0, 1, \pi, \ell) \text{ or } \mathrm{GP}_T(0, 1, \mathcal{L}(T)) \text{ or } \mathrm{GP}_U(0, 1, \mathcal{L}(U))$$
Parametric model for $\ell$ or $\mathcal{L}(T)$ or $\mathcal{L}(U)$, parameter $\alpha$
$\Longrightarrow$ parametric model for $f_Z$
$\Longrightarrow$ likelihood inference on $\theta = (\gamma, \sigma, \alpha)$?
Nonparametric inference: empirical stdf $\hat{\ell}_n$?

Data points with some components far beneath the threshold
Exceedance $x \not\le u$ (on original scale): excess $x - u \not\le 0$.
Model by Generalized Pareto? Ill-justified if $x_j \ll u_j$ for some $j$.
Censor components that are too low.
[Figure: scatterplot in the $(X_1, X_2)$ plane with thresholds $u_1$ and $u_2$.]

Censored likelihood
1. Choose a second, lower threshold $v < u$.
2. If $x_j \le v_j$, replace $x_j$ by $v_j$: censoring from below.
3. Likelihood contribution of a point $x \not\le u$ but $x \not> v$:
$$\int_{(-\infty, v_C]} h(w_C - u_C, x_D - u_D)\, dw_C$$
$C = \{j = 1, \ldots, d : x_j \le v_j\}$: censored variables; $D = \{j = 1, \ldots, d : x_j > v_j\}$: uncensored variables.
Comparative study of likelihood-based estimators: Huser et al. (2014).
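Steps 1 and 2 amount to a small preprocessing routine that selects exceedances, censors low components, and records the index sets $C$ and $D$. The sketch below is one possible reading of those steps; the function name `censor_point` and the 0-based indexing are choices made here, and the integral in step 3 is not implemented.

```python
def censor_point(x, u, v):
    """Censor one observation x against thresholds v < u (componentwise).

    Returns None if x is not an exceedance (x <= u in every component);
    otherwise returns (censored_x, C, D) with 0-based index lists.
    """
    d = len(x)
    if all(x[j] <= u[j] for j in range(d)):
        return None
    C = [j for j in range(d) if x[j] <= v[j]]  # censored variables
    D = [j for j in range(d) if x[j] > v[j]]   # uncensored variables
    censored_x = [v[j] if x[j] <= v[j] else x[j] for j in range(d)]
    return censored_x, C, D

u = [1.0, 1.0, 1.0]
v = [0.0, 0.0, 0.0]
print(censor_point([2.5, -3.0, 0.4], u, v))  # second component is censored
print(censor_point([0.5, 0.2, -1.0], u, v))  # not an exceedance
```

A point with empty $C$ contributes the plain density $h(x - u)$, so the censored likelihood reduces to the ordinary one whenever every component lies above $v$.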

Censored likelihood: Proof of concept
$X \sim \mathrm{GP}_U(\gamma, \sigma, \text{log Beta})$
[Figure: boxplots of normalized parameter estimates for $\alpha_1, \alpha_2, \alpha_3, \lambda_1, \lambda_2, \sigma_1, \sigma_2, \gamma_1, \gamma_2$; 30 repetitions.]
Censoring at $v = u = 0$

Overall summary
Modelling the excess $X = \xi - u$ conditionally on $\xi \not\le u$: $\Longrightarrow$ Multivariate Generalized Pareto distribution $H$
Q: What does $H$ look like?
A: $X = \sigma \frac{e^{\gamma Z} - 1}{\gamma}$ with $Z \sim \mathrm{GP}(0, 1, \pi, \ell)$
Q: How to construct parametric models for it?
A: Via spectral representations
Q: How to fit such models?
A: Censored likelihood

Outlook
Computational challenges
Likelihood: proportionality constant $\mathrm{E}[e^{\max(U)}]$?
Simulation: change of measure $f_U(u) \mapsto e^{\max(u)} f_U(u)$?
Likelihood optimisation in case of many parameters?
...
Statistical challenges
Graceful blending with models for the bulk of the distribution?
Covariates?
Threshold choice?
Model construction: parsimony vs flexibility?
...
Thank you!

Bibliography
Beirlant, J., Y. Goegebeur, J. Segers, and J. Teugels (2004). Statistics of Extremes: Theory and Applications. John Wiley & Sons.
de Haan, L. (1984). A spectral representation for max-stable processes. The Annals of Probability 12(4), 1194-1204.
Dombry, C. and M. Ribatet (2015). Functional regular variations, Pareto processes and peaks over threshold. Statistics and Its Interface 8, 9-17.
Falk, M., J. Hüsler, and R.-D. Reiss (2010). Laws of Small Numbers: Extremes and Rare Events. Springer Science & Business Media.
Ferreira, A. and L. de Haan (2014). The generalized Pareto process; with a view towards application and simulation. Bernoulli 20(4), 1717-1737.
Huser, R., A. C. Davison, and M. G. Genton (2014). A comparative study of likelihood estimators for multivariate extremes. Extremes (arXiv:1411.3448).
Rootzén, H. and N. Tajvidi (2006). Multivariate generalized Pareto distributions. Bernoulli 12(5), 917-930.
Thibaud, E. and T. Opitz (2013). Efficient inference and simulation for elliptical Pareto processes. arXiv preprint arXiv:1401.0168.