Bivariate generalized Pareto distribution practice: Models and estimation. Eötvös Loránd University, Budapest, Hungary 7 June 2011, ASMDA Conference, Rome, Italy
Problem How can we properly estimate the tail of a distribution only few observations are known / estimates are needed over the maximum as well etc. Fortunately there is well applicable asymptotic theory Several packages in R are available What is new here? mgpd 1.0 (2.0 is coming soon)
Extreme value theory (EVT) - limit results for extremes Generalized extreme value distribution (EVD or GEV) for block maxima (monthly, yearly,...) { } ( G(x) = exp 1+ξ x µ ) 1 ξ, (1) σ 1+ξ x µ σ > 0, µ R, σ > 0 and ξ R Generalized Pareto distribution (GPD) for threshold exceedances (e.g. over a high quantile of the observations) ( H(x) = 1 1+ξ x ) 1 ξ, (2) σ where 1+ξ x σ > 0 and σ > 0.
Limit for Maximum Theorem (Fisher and Tippet, 1928) If there exist {a n } and {b n } > 0 sequences such that ( Mn a n P b n ) z G(z) as n for some non-degenerate distribution G, then { ( G(x) = exp 1+ξ x µ σ ) 1 ξ where 1+ξ x µ σ > 0 and in such a case X belongs to the max-domain of attraction of a GEV distribution. },
Limit for Threshold Exceedances How to model large values over a given high threshold? Theorem (Balkema and de Haan, 1974) i.i.d. Suppose, that X 1,...,X n F, where F belongs to the max-domain of attraction of GEV for some ξ,µ and σ > 0. Then ( P(X i u z X i > u) H(z) = 1 1+ ξz σ ) 1 ξ where σ = σ +ξ(u µ). NOTE: STRONG LINK BETWEEN GEV AND GPD! as u z +,
Limit for Multivariate Maxima Component-wise maxima of X i = (X i,1,...,x i,d ) is ( n M n = X i,1,..., i=1 n X i,d ). i=1 If X F and there exist a n and b n > 0 sequences of normalizing vectors, such that ( Mn a n P b n ) z = F n (b n z+a n ) G(z), where G i margins are non-degenerate, then G is said to be a multivariate extreme value distribution (MEVD).
Characterization I. - Resnick (1987) MEVD with unit Fréchet margins can be written as ( ) G Fréchet (t 1,...,t d ) = exp V(t 1,...,t d ) V is called exponent measure V(t 1,...,t d ) = ν(([0,t 1 ] [0,t d ]) c ) = d S d i=1 ( wi t i ) S(dw), S spectral measure is a finite measure on the d-dimensional simplex and S d w i S(dw) = 1 for i = 1,...,d.
Some properties of the characterization I. Total mass of S is always S(S d ) = d Density not only on the interior of S d but also on each of the lower-dimensional subspaces of S d For independent margins S({e i }) = 1 for all e i vertices of the simplex For completely dependent margins S((1/d,..., 1/d)) = d
Characterization II. - Pickands (1981) The dependence function A(t) must satisfy the following three properties: (P) 1. A(t) is convex 2. max{(1 t),t} A(t) t 3. A(0) = A(1) = 1 Then the exponent measure can be formulated by A(t) as V(t 1,t 2 ) = ( 1 + 1 ) ( t1 ) A t 1 t 2 t 1 +t 2
An Example: Logistic Dependence Function A very common parametric form to use for A(t) is the so-called logistic dependence function A logistic (t) = ( (1 t) α +t α) 1/α. The strength of dependence is modelled by a single parameter α 1. The independence case corresponds to α = 1. Further details: Coles and Tawn (1991).
More dependence functions by evd Parameteric Families 1. Parameteric Families 2. A(t) 0.70 0.75 0.80 0.85 0.90 0.95 1.00 Logistic Asy. Logistic Coles Tawn A(t) 0.70 0.75 0.80 0.85 0.90 0.95 1.00 Neg. Logistic Asy. Neg. Logistic. Asy. Mixed 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 t t
Multivariate Threshold Exceedances (of type II.) Taking maxima hides the time structure within the given period Investigate all observations exceeding a given high threshold Theorem (Tajvidi and Rootzén, 2006) Suppose, that G is a d-dimensional MEVD with 0 < G(0) < 1. If F is in the domain of attraction of G then there exist an increasing continuous curve u with F(u(t)) 1 as t, and a function σ(u) > 0 such that as t, for all x. P(X u x X u 0) 1 logg(0) log G(x) G(x 0)
What is type I. then? Logistic bgpd by evd Logistic bgpd by mgpd density curves density curves Hannover m/s 0 5 10 15 20 25 30 No Inference! No Inference! No Inference! Hannover m/s 0 5 10 15 20 25 30 No Inference! 0 5 10 15 20 25 Fehmarn m/s 0 5 10 15 20 25 Fehmarn m/s Figure: BGPD of type I. and II.
Construction of absolutely continuous MGPDs Theorem Let H be a MGPD represented by an absolutely continuous MEVD G with spectral measure S. The MGPD H is absolutely continuous if and only if S(int(S d )) = d holds, i.e all mass is put on the interior of the simplex. Corollary Asymmetric dependence models either do not lead to a.c. MGPDs or complicated to compute.
Parametric dependence models Model Asym. Density Complications Sym. Logistic Asym. Logistic (Tawn, 1988) Ψ-logistic convexity constr. Sym. Neg. Logistic Asym. Neg. Logistic (Joe, 1990) Ψ-negative logistic convexity constr. Bilogistic (Smith, 1990) not explicit Neg. Bilogistic (C. & T., 1994) not explicit Tajvidi s (Tajvidi, 1996) C-T (C. & T., 1991) only s(w) is given Mixed (Tawn, 1988) if ψ = 1 Asym. Mixed
Construction of absolutely continuous BGPDs 1. take an absolutely continuous baseline dependence model; 2. take a strictly monotonic transformation Ψ(x) : [0,1] [0,1], such that Ψ(0) = 0, Ψ(1) = 1; 3. construct a new dependence model from the baseline model A Ψ (t) = A(Ψ(t)); 4. check the constraints: A Ψ (0) = 1, A Ψ (1) = 1 and A Ψ 0 (convexity).
An example for Ψ(x) Feasible transformation if Ψ(t) = t +f(t) (A(Ψ(t))) = A (Ψ(t)) [1+f (t)] so choosing f such that f (0) = f (1) = 0, the A Ψ (0) = 1 and A Ψ (1) = 1 are fulfilled. f ψ1,ψ 2 (t) = ψ 1 (t(1 t)) ψ 2, for t [0,1], where ψ 1 R and ψ 2 1 are asymmetry parameters.
Graphical examples δ 2 ta αlog=2(ψ(t) δ 2 ta αneglog=1.2(ψ(t) 0.0 1.0 2.0 3.0 Psi Logistic ψ 1 = 0.3 ψ 1 = 0.15 ψ 1 = 0 ψ 1 = 0.15 ψ 1 = 0.3 0.0 1.0 2.0 3.0 Psi NegLog ψ 1 = 0.3 ψ 1 = 0.15 ψ 1 = 0 ψ 1 = 0.15 ψ 1 = 0.3 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 t t Constraints of convexity for psi logistic models Constraints of convexity for psi negative logistic models ψ 1 0.4 0.2 0.0 0.2 0.4 ψ 2 = 1.8 ψ 2 = 2 ψ 2 = 2.2 ψ 1 4 2 0 2 4 ψ 2 = 1.8 ψ 2 = 2 ψ 2 = 2.2 1.4 1.6 1.8 2.0 2.2 0.9 1.0 1.1 1.2 1.3 1.4 1.5 α α
Introduction Extreme value models Applications Wind data Further applications Acknowledgements Tail dependent wind data fbvpot bgpd by evd 5 10 15 20 density curves 20 15 25???? 10 15 20 25 Chi Plot Chi Bar Plot 30 0.4 0.0 0.4 Chi Bar 0.8 Bremerhaven m/s 0.8 Bremerhaven m/s 0.0 Chi 10 Schleswig m/s 15 10 5 Schleswig m/s 20 25 Daily wind speed 0.0 0.2 0.4 0.6 0.8 1.0 Quantile Pa l Rakonczai 0.0 0.2 0.4 0.6 0.8 1.0 Quantile
Wind data Further applications Acknowledgements Prediction regions for the 4 best models (having the smallest χ 2 -statistics) NegLog Psi NegLog Schleswig(m/s) 5 10 15 20 25 Regions γ= 0.99 γ= 0.95 γ= 0.9 Schleswig(m/s) 5 10 15 20 25 Regions γ= 0.99 γ= 0.95 γ= 0.9 5 10 15 20 25 30 5 10 15 20 25 30 Bremerhaven(m/s) Bremerhaven(m/s) Coles Tawn Tajvidi s Schleswig(m/s) 5 10 15 20 25 Regions γ= 0.99 γ= 0.95 γ= 0.9 Schleswig(m/s) 5 10 15 20 25 Regions γ= 0.99 γ= 0.95 γ= 0.9 5 10 15 20 25 30 Bremerhaven(m/s) 5 10 15 20 25 30 Bivariate generalized Bremerhaven(m/s) Pareto distribution
Wind data Further applications Acknowledgements Goodness-of-fit: Bremerhaven és Schleswig γ 0.99 0.95-0.99 0.9-0.95 0.75-0.9 0.5-0.75 0.5 χ 2 Exp. 12.6 50.3 62.8 188.6 314.2 628.5 - Log 7 30 66 216 350 588 21.5 Ψ-L 9 33 60 215 349 591 16.9 NL 11 31 66 215 334 600 14.0 Ψ-NL 11 34 63 210 337 602 10.7 Mix 12 46 71 247 331 550 30.3 C-T 12 34 65 209 330 607 9.1 Tajv. 11 35 62 212 340 597 11.5 BL 6 29 62 220 354 586 25.6 NBL 13 32 61 220 337 594 15.5
Wind data Further applications Acknowledgements Further models - polinomial transformations f a,p (t) = 64a(p 5 p 4 2p 3 2p 2 +5p 2)p 5 (p +1) 2 (p 1) 2 ( 1+2p)(1+2p 2 3p) t6 + 32a(5p6 2p 5 13p 4 12p 3 +15p 2 +6p 6)p 4 (p +1) 2 (p 1) 2 ( 1+2p)(1+2p 2 t 5 3p) + 32a(4p7 +3p 6 14p 5 15p 4 +23p 2 8p 2)p 3 (p +1) 2 (p 1) 2 ( 1+2p)(1+2p 2 t 4 3p) + 32a(p7 +4p 6 5p 5 10p 4 5p 3 +12p 2 +2p 4)p 3 (1+2p 2 3p)(p +1) 2 (4p 1+2p 3 5p 2 t 3 ) 32a(p 6 3p 4 p 2 +4p 2)p 3 + (1+2p 2 3p)(p +1) 2 (4p 1+2p 3 5p 2 ) t2.
Wind data Further applications Acknowledgements Further models - polinomial transformations 4 2 0 2 4 0.30 0.25 0.20 0.15 0.10 0.05 0.00 4 2 0 2 4 0.30 0.25 0.20 0.15 0.10 0.05 0.00 4 2 0 2 4 0.30 0.25 0.20 0.15 0.10 0.05 0.00 4 2 0 2 4 4 2 0 2 4 4 2 0 2 4
Wind data Further applications Acknowledgements BEVD case 0.0 0.2 0.4 0.6 0.8 1.0 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
Wind data Further applications Acknowledgements The European Union and the European Social Fund have provided financial support to the project under the grant agreement no. TÁMOP 4.2.1./B-09/KMR-2010-0003. Thank you for your attention!