ESTIMATING BIVARIATE TAIL - PDF Free Download

Elena DI BERNARDINO b joint work with Clémentine PRIEUR a and Véronique MAUME-DESCHAMPS b a LJK, Université Joseph Fourier, Grenoble 1 b Laboratoire SAF, ISFA, Université Lyon 1

Framework Goal: estimating the tail of a bivariate distribution function. Idea: a general extension of the Peaks-Over-Threshold method. Tools: a two-dimensional version of the Pickands-Balkema-de Haan Theorem, Yuri & Wüthrich s approach of the tail dependence. We present real data examples which illustrate our theoretical results. Key words: Extreme Value Theory, Peaks Over Threshold method, Pickands-Balkema-de Haan Theorem, tail dependence.

Contents 1 One-dimensional results The univariate POT method 2 The framework 2D Pickands-Balkema-de Haan Theorem 3 4

Contents One-dimensional results The univariate POT method 1 One-dimensional results The univariate POT method 2 The framework 2D Pickands-Balkema-de Haan Theorem 3 4

Generalized Pareto distribution The univariate POT method Main idea of POT: use of the generalized Pareto distribution (1) to approximate the distribution of excesses over thresholds. { 1 ( ) 1 V k,σ (x) := kx 1 k σ, if k 0, σ > 0, 1 e x σ, if k = 0, σ > 0, (1) and x 0 for k 0 or 0 x < σ k for k > 0. A distribution is in the domain of attraction of an extreme value distribution if and only if the distribution of excesses over high thresholds is asymptotically generalized Pareto (e.g. Balkema and de Haan, 1974, Pickands, 1975).

Generalized Pareto distribution The univariate POT method Let X 1, X 2,... be a sequence of i.i.d random variables with unknown distribution function F. Fix a threshold u. For x > u, decompose F as F (x) = P[X x] = (1 P[X u]) F u (x u) + P[X u], where F u (x) = P[X x + u X > u]. POT estimate for x > u, F (x) = (1 F X (u))v k, σ (x u) + F X (u). This univariate modeling is well understood, and has been discussed by Davison (1984), Davison and Smith (1990), McNeil (1997, 1999) and other papers of these authors.

Contents One-dimensional results The framework 2D Pickands-Balkema-de Haan Theorem 1 One-dimensional results The univariate POT method 2 The framework 2D Pickands-Balkema-de Haan Theorem 3 4

Framework One-dimensional results The framework 2D Pickands-Balkema-de Haan Theorem Setting: X, Y two real valued r.v. with continuous df F X and F Y, the dependence between X and Y is described by a continuous and symmetric copula C. Notation and definitions: Survival Copula (u 1, u 2 ) [0, 1] 2, C (u 1, u 2 ) = u 1 + u 2 1 + C(1 u 1, 1 u 2 ). Upper-tail dependence copula X,Y U[0, 1], with symmetric C, u [0, 1) / C (1 u, 1 u) > 0. Then, (x, y) [0, 1] 2, one defines Cu up (x, y) := P[X 1 1 F u (x), Y F u (y) X > u, Y > u] with F u (x) := P[X x X > u, Y > u] = 1 C (1 x u,1 u) C (1 u,1 u).

The framework 2D Pickands-Balkema-de Haan Theorem Modeling upper tail, Yuri & Wütrich s approach Theorem (Upper-tail Theorem; Juri and Wüthrich 2003) Let C be a symmetric copula such that C (1 u, 1 u) > 0, for all u > 0. Furthermore, assume that there is a strictly increasing continuous function g : [0, ) [0, ) such that C (x(1 u), 1 u) lim u 1 C = g(x), x [0, ). (1 u, 1 u) Then, there exists a θ > 0 such that g(x) = x θ g ( 1 x ) for all x (0, ). Further, for all (x, y) [0, 1] 2 lim C u up (x, y) = x +y 1+G(g 1 (1 x), g 1 (1 y)) := C G (x, y), (2) u 1 ( ) with G(x, y) := y θ g (x, y) (0, 1] 2 and G : 0 on [0, 1] 2 \ (0, 1] 2. x y

Auxiliary result The framework 2D Pickands-Balkema-de Haan Theorem Proposition (de Haan 1970) F X MDA(H k ) is equivalent to the existence of a positive measurable function a( ) such that, for 1 k x > 0 and k R, { 1 1 F X (u + x a(u)) (1 k x) k, if k 0, lim = u x F 1 F X (u) e x, if k = 0. (3) [(2)and(3)] [ a 2D version of the Pickands-Balkema-de Haan Theorem] Juri & Wüthrich (2003) for a symmetric C and if F X = F Y, Di Bernardino, Maume-Deschamps and Prieur (2010) for a symmetric C even if F X F Y.

The framework 2D Pickands-Balkema-de Haan Theorem 2D version Pickands-Balkema-de Haan Theorem Theorem (Di Bernardino, Maume-Deschamps and Prieur 2010) X, Y real valued r.v. with continuous df F X F Y, C a symmetric copula. Assume F X MDA(H k1 ), F Y MDA(H k2 ) and C satisfies the assumptions of the Upper-tail Theorem for some g. Define u Y = F 1 Y (F X (u)), x FX := sup{x R F X (x) < 1}, x FY := sup{y R F Y (y) < 1}, A := {(x, y) : 0 < x x FX u, 0 < y x FY u Y }. Then sup P[ X u x, Y u Y y ] X > u, Y > uy A C G ( 1 g(1 V k1,a 1(u)(x)), 1 g(1 V k2,a 2(u Y )(y)) ) u x FX 0, where V ki,a i are the GPDs and a i ( ) as in (3), i = 1, 2.

Contents One-dimensional results 1 One-dimensional results The univariate POT method 2 The framework 2D Pickands-Balkema-de Haan Theorem 3 4

A new bivariate tail estimator Context: F bivariate df with continuous marginals F X, F Y. F is assumed to have a stable tail dependence function l that is x, y 0, the following limit exists lim t 0 t 1 P (1 F X (X ) tx or 1 F Y (Y ) ty) = l(x, y), see Huang (1992). Then define lim t 0 t 1 P (1 F X (X ) tx, 1 F Y (Y ) ty) = R(x, y). We have x, y 0, R(x, y) = x + y l(x, y). Asymptotic dependence λ = R(1, 1) 0, where λ is the upper tail dependence coefficient. Asymptotic independence x, y 0, l(x, y) = x + y. It is equivalent to λ = R(1, 1) = 0.

Asymptotic dependence: λ > 0 Upper-Tail Theorem of Juri & Wüthrich (2003) holds with g(x) = x + 1 l(x, 1) 2 l(1, 1) = Moreover x > 0, g(x) = x g(1/x) that is θ = 1. R(x, 1) x + y l(x, y) R(x, y), G(x, y) = = R(1, 1) 2 l(1, 1) R(1, 1). We estimate g(x) with the estimator of l in Einmahl, Krajina, Serger (2008): ln (x, y) = 1 k n n 1 {R(Xi )>n k nx+1 or R(Y i )>n k ny+1}, i=1 where R(X i ) is the rank of X i among (X 1,..., X n ), and R(Y i ) is the rank of Y i among (Y 1,..., Y n ), i = 1,..., n.

Estimating g, G and θ We estimate g(x) by ĝ(x) = x+1 ˆl n(x,1). 2 ˆl n(1,1) We estimate G(x, y) by Ĝ(x, y) = x+y ˆl n(x,y). 2 ˆl n(1,1) Finally, we estimate the unknown parameter θ by ˆθ x = log ĝ(x) log ĝ(1/x). log x In practice, k is "optimized" for each value of x.

On simulations 100 samples of size n = 1000; mean curve (full line), empirical standard deviation (dashed lines). Figure: Estimators for θ, (k, θ x) (left) x = 0.07, Survival Clayton copula with parameter 1 (right) x = 5, Logistic copula with parameter 0.5

Asymptotic independence: λ = 0 λ = R(1, 1) = 0 C(u, u) = 1 2(1 u) + o(1 u), for u 1 Proposition If λ = 0 and C is a twice continuously differentiable symmetric copula with the determinant of the Hessian matrix of C at (1, 1) different to zero, then C (x(1 u), 1 u) lim u 1 C = b/2(x 2 + 1) + cx (1 u, 1 u) b + c x [0, ), and θ = 2, with b = 2 C u (1, 1), c = C 2 u v (1, 1). Remark: The assumptions of Proposition are satisfied, for example, in the case of Ali Mikhail-Haq, Frank, Clayton with a 0, Independent and Fairlie-Gumbel-Morgenstern copulas. Remark: one may need going further in the asymptotic development.

On simulations 100 samples of size n = 1000; mean curve (full line), empirical standard deviation (dashed lines). Figure: Estimators for θ, (k, θ x) (left) x = 0.8, Independent copula (right) x = 0.7, Clayton copula with parameter 0.05.

Estimating bivariate tail in literature From Ledford and Tawn (1996): F (y 1, y 2 ) = exp{ l( ln( F Y 1 (y 1 )), ln( F Y 2 (y 2 )))}, (4) for high values of y 1 and y 2, where, for instance, F Y 1 (y 1 ) (resp. F Y 2 (y 2 )) comes from the univariate POT method. Approach based on the univariate dependence function of Pickands, 1981; e.g. see Capéraà and Fougères (2000). In the case λ = 0 these methods produce a significant bias. Basically this happens because a bivariate extreme value (G) type dependence structure is assumed to hold in the joint tail of F above the marginal thresholds. In order to overcome this problem, Ledford and Tawn (1996, 1997, 1998) introduced this model: P[Z 1 > z, Z 2 > z] L(z)P[Z 1 > z] 1 η, where L is a slowly varying function at infinity, η (0, 1], (Z 1, Z 2) with unit Fréchet marginals.

New tail estimator: general idea Contrary to Ledford and Tawn s method, we will propose a model based on regularity conditions of the Copula and on the explicit description of the dependence structure in the joint tail (using g, G and θ). Our estimator covers situations less restrictive than dependence or perfect independence above the thresholds (in a different way from the Ledford and Tawn s method). Our method is free from the pre-treatment of data because we can work directly with the original general samples without the transformation in Fréchet marginal distributions.

New tail estimator: construction 1/2 Let x > u X, y > u Y. For (s, t) in lateral regions ], x] ], u Y ] and ], u x ] ], y] we approximate the tail by F (s, t) = exp { l ( log F X (s), log F Y (t))}. where l is the stable tail dependence function. In the joint tail ]u X, x] ]u Y, y] we use Juri & Wüthrich s non parametric modeling (it also works with asymptotic independence λ = 0).

New tail estimator: construction 2/2 For a threshold u define û Y = 1 F Y ( F X (u)). Then, for k X, σ X (resp. k Y, σ Y ) the MLE based on the excesses of X (resp. Y), we estimate F (x, y) by F (x, y) = A n (B n + C n ) + F 1 (u, y) + F 2 (x, û Y ) 1 n n i=1 1 {X i u, Y i û Y } with A n = 1 n n i=1 1 {X i >u, Y i >û Y }, B n = 1 ĝ n (1 V kx, σ X (x u)) ĝ n (1 V ky, σ Y (y û Y )), C n = Ĝn( 1 V kx, σ X (x u), 1 V ky, σ Y (y û Y ) ), F 1 (u, y) = exp{ l n ( log( F X (u)), log( F Y (y)))}, F 2 (x, û Y ) = exp{ l n ( log( F X (x)), log( FY (û Y )))}.

Assumptions on the marginals The assumptions below are assumed both for F X and F Y. First order assumptions F is in the maximum domain of attraction of Fréchet, that is α > 0 such that F (x) = x α L(x) with L a slowly varying function. Second order assumptions as in Smith (1987), we assume that L satisfies SR2: L(tx) L(x) = 1 + k(t)φ(x) + o(φ(x)), t > 0, as x with φ positive and φ(x) x + 0.

Convergence results In the asymptotic dependence setting: For convenient x n, y n, we prove a Gaussian approximation result for the absolute error (with convergence rate) F (x n, y n ) F (x n, y n ) In the asymptotic independent setting: C (x(1 u), 1 u) lim u 1 C = g(x), x [0, ). (1 u, 1 u) refined by a second order assumption on copula C (Draisma et al. 2004): lim t 0 C (tx,ty) C (t,t) G(x, y) := Q(x, y) q 1 (t) with x, y 0, x + y > 0, where q 1 is some positive function and Q is neither a constant nor a multiple of G.

Contents 1 One-dimensional results The univariate POT method 2 The framework 2D Pickands-Balkema-de Haan Theorem 3 4

Loss ALAE and Storm insurance Real cases to illustrate the finite sample properties of our estimator. In order to study these data we will follow some classical models which make use of a symmetric copula structure. Figure: Logarithmic scale (left) ALAE versus Loss (right) Storm damages.

Loss ALAE We estimate θ and F (2.10 5, 10 5 ). References: Frees and Valdez (1998); Beirlant et al. (2010). Figure: (left) θ 0.04 (right) F (2.10 5, 10 5 ) (full line), F (2.10 5, 10 5 ) (dashed line), with the empirical probability indicated with a horizontal line.

Storm insurance We estimate θ and F (8.10 3, 950). Reference: Lescourret and Robert (2006). Figure: (left) θ 0.05 (right) F (8.10 3, 950) (full line), F (8.10 3, 950) (dashed line), with the empirical probability indicated with a horizontal line.

Wawe surge data Wave surge data comprising 2894 bivariate events (1971-1977) in Cornwall (England). (Symmetric Copula?) Figure: Wave Height (m) versus Surge (m).

Wawe surge data We estimate θ and F (8.32, 0.51). References: Cole & Tawn (1994); Ramos & Ledford (2009). Figure: (left) θ 0.02 (right) F (8.32, 0.51) (full line), F (8.32, 0.51) (dashed line), with the empirical probability indicated with a horizontal line.

Summary a new and different approach for estimating bivariate tails, we need neither Ledford & Tawn assumptions nor unit Fréchet margins, as for L & T estimate, it also works with asymptotic independence.

Ideas for future developments use the bivariate tail estimator F (x, y) to obtain estimation of bivariate upper-quantile curves, for high levels α. application to the estimation of bivariate Value-at-Risk for large α: VaR α ( F ) := {(x, y) (f 1 (n), + ) ( f 2 (n), + ) : F (x, y) = α}.

Thank for your attention.

Balkema A. A. and de Haan L., Residual life time at great age. Annals of Probability 2, 792-804, (1974). Beirlant J., Dierckx G. and Guillou A., Reduced bias estimators in joint tail modelling, submitted, (2010). Capéraà P. and Fougères A.-L., Estimation of a bivariate extreme value distribution. Extremes 3, 311-329, (2000). Coles, S. G. and Tawn, J. A., Statistical methods for multivariate extremes: an application to structural design (with discussion). Appl. Statist., 43, 1-48, (1994). Davison A. C., A statistical model for contamination due to long-range atmospheric transport of radionuclides. PhD Thesis. Department of Mathematics, Imperial College of Science and Technology, London, (1984). Davison A. C. and Smith R. L., Models for exceedances over high thresholds. Journal of the Royal Statistical Society, Series B, 52, 393-442, (1990). Draisma G., Drees H., Ferreira A. and de Haan L., Bivariate tail estimation: dependence in asymptotic independence, Bernoulli, 10, 251-280, (2004).

Einmahl J.H.J., Krajina A. and Segers J., A Method of Moments Estimator of Tail Dependence, Bernoulli, 14, 1003-1026, (2008). Frees E.W. and Valdez E.A. Understanding relationships using copulas. N. Am. Actuar. J. 2, 1-15, (1998). de Haan L., On regular variation and its application to the weak convergence of sample extremes, Mathematical Centre Tract, 32, Amsterdam, (1970). Huang X., Statistics of bivariate extreme values, Thesis, Erasmus University Rotterdam, Tinbergen Institute. Research Series 22, (1992). Juri A. and Wüthrich M.V., Tail Dependence From a Distributional Point of View, Extremes, 3, 213-246, (2003). Ledford A.W. and Tawn J.A., Statistics for near independence in multivariate extreme values, Biometrika, 83, 169-187, (1996). Ledford A.W. and Tawn J.A., Modelling dependence within joint tail regions. Journal of the Royal Statistical Society, Series B, 59, 475-499, (1997).

Ledford A.W. and Tawn J.A., Concomitant Tail Behaviour for Extremes. Advances in Applied Probability 30, 197-215, (1998). Lescourret L. and Robert C.Y., Extreme dependence of multivariate catastrophic losses, Scandinavian Actuarial Journal, 4, 203-225, (2006). McNeil A. J., Estimating the Tails of loss severity distributions using extreme value theory, ASTIN Bulletin, 27, 1117-1137, (1997). McNeil A. J., Extreme value theory for risk managers, preprint, ETH, Zurich, (1999). Pickands J., Multivariate extreme value distributions. Bulletin of International Statistical Institute, Buenos Aires, 859-878, (1981). Pickands J., Statistical inference using extreme order statistics. Annals of Statistics 3, 119-131, (1975). Ramos A. and Ledford A., A new class of models for bivariate joint tails. Journal Of The Royal Statistical Society Series B, 1, 219-241, (2009). Smith R., Estimating tails of probability distributions, Annals of Statistics 15, 1174-1207, (1987).