Monash University, Australia
Department of Econometrics and Business Statistics
http://www.buseco.monash.edu.au/depts/ebs/pubs/wpapers/

Minimum Variance Unbiased Maximum Likelihood Estimation of the Extreme Value Index

Roger Gay

April 2005

Working Paper 8/05
MINIMUM VARIANCE UNBIASED MAXIMUM LIKELIHOOD ESTIMATION OF THE EXTREME VALUE INDEX

By ROGER GAY
Dept. of Accounting and Finance, Monash University, Clayton, Australia, 3800
roger.gay@buseco.monash.edu.au

SUMMARY

New results for ratios of extremes from distributions with a regularly varying tail at infinity are presented. If the appropriately normalized order statistic X_(n-j+1) from a distribution with tail exponent δ (δ > 0) converges weakly to X*(j), then:

(i) For ρ = 1/δ, the ratio X*(j)/X*(j+k) is distributed like {U(j,k)}^(-ρ), where U(j,k) has beta density B(u; j, k) = B^(-1)(j,k) u^(j-1) (1-u)^(k-1), 0 ≤ u ≤ 1, with B(a,b) = Γ(a)Γ(b)/Γ(a+b), a > 0, b > 0, Γ(·) denoting the gamma function.

(ii) Consecutive ratios of extremes X*(1)/X*(2), X*(2)/X*(3), ..., X*(m)/X*(m+1), ... are independently distributed.

(iii) The maximum likelihood estimator (mle) based on the first k ratios is

ρ̂ = k^(-1) Σ_{m=1}^{k} m log{X*(m)/X*(m+1)}

(iv) The mle ρ̂ is unbiased, has variance ρ²/k (the Cramer-Rao minimum variance bound) and is asymptotically normally distributed.

(v) It has moments of all orders and moment generating function M_ρ̂(θ) = (1-ρθ/k)^(-k), θ < k/ρ.

KEYWORDS: Tail-index, Minimum variance unbiased, Maximum likelihood, Asymptotically normal

JEL CLASSIFICATION: C13
1. INTRODUCTION AND MOTIVATION

Considerable research effort has been devoted over the last thirty years to estimation of the exponent of regular variation, described, depending on context, as the tail-index or the extreme value index. Popular estimators based on independent random samples, such as those due to Hill (1975) and Pickands (1975) and the maximum likelihood estimators of Smith (1987) and Drees et al. (2004), are consistent, asymptotically normal, but biased estimators. They are beset by an inherent dilemma: large variability when the number of extremes used is too low, large bias when it is too high. A survey of bias problems is contained in Beirlant et al. (1999).

Bias reduction of tail-index estimators, including maximum likelihood estimators, has generally been approached by applying second order properties of regularly varying functions to the tail-quantile function of the distribution. Invariably the bias depends on the unknown exponent, and is distribution-specific (e.g. Drees et al. (2004), Teugels and Vanroelen (2004)). Another approach focuses on optimal threshold selection, trading off bias reduction against variability (e.g. Matthys (2001)).

"Nowadays, ... applications of extreme value theory can be seen in a large variety of fields such as hydrology, engineering, economics, astronomy and finance. The estimation of the extreme value index is the first and main statistical challenge" (Teugels and Vanroelen (2004)).

2. MAIN RESULTS

The class F of distributions F(·) having a regularly varying tail with index δ (equivalently, belonging to the maximum domain of attraction of the Frechet distribution; see e.g. Embrechts et al. (1997), p.131) is characterized by

1 - F(x) = L(x) x^(-δ), x > 0, δ > 0.

The function L(x) is slowly varying at infinity (see for instance Feller (1971), p.276).

Theorem 1 (distribution of a ratio of extremes)

Denote by X_(1), X_(2), ..., X_(n) the ascending order statistics of a common parent F ∈ F.
The variables X*(1), X*(2), ... are descending Frechet extremes, i.e. X*(j) is the weak limit of the normalized order statistic X_(n-j+1)/v_n from the parent F ∈ F, the normalizing sequence {v_n} obtained from the parent tail-quantile function, satisfying n[1 - F(v_n)] = 1 (see for instance David and Nagaraja (2003), Chapter 10). Then for any F ∈ F,

X*(j)/X*(j+k) = {U*(j,k)}^(-ρ)

where U*(j,k) has beta density B(u; j, k) = B^(-1)(j,k) u^(j-1) (1-u)^(k-1), (0 ≤ u ≤ 1), and where B(a,b) = Γ(a)Γ(b)/Γ(a+b), (a > 0, b > 0), Γ(·) representing the gamma function.
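Before turning to the proof, Theorem 1 admits a quick numerical illustration. The sketch below is not part of the paper: it simulates a pure Pareto parent 1 - F(x) = x^(-δ), for which L(x) ≡ 1 and the beta representation of the ratio holds exactly at every n, and compares the simulated mean of {X*(j)/X*(j+k)}^φ with the value B(j-ρφ, k)/B(j, k) implied by the theorem. All parameter choices (δ, n, j, k, φ) are illustrative assumptions.

```python
# Monte Carlo illustration of Theorem 1 (sketch; all parameter choices are
# illustrative assumptions, not taken from the paper).
import math
import random

random.seed(1)
delta = 2.0              # tail exponent; rho = 1/delta = 0.5
rho = 1.0 / delta
n, reps = 100, 10000     # sample size per replication, number of replications
j, k = 2, 3              # ratio under study: X*(j)/X*(j+k)
phi = 1.0                # moment order (requires rho*phi < j)

ratios = []
for _ in range(reps):
    # Pareto(delta) variates by inverse transform: X = U^(-1/delta)
    xs = sorted(random.random() ** (-1.0 / delta) for _ in range(n))
    ratios.append(xs[n - j] / xs[n - j - k])   # X_(n-j+1) / X_(n-j-k+1)

# Theorem 1 implies E[(X*(j)/X*(j+k))^phi] = B(j - rho*phi, k) / B(j, k)
theory = (math.gamma(j - rho * phi) * math.gamma(j + k)
          / (math.gamma(j) * math.gamma(j + k - rho * phi)))
empirical = sum(r ** phi for r in ratios) / reps
print(f"beta-moment value {theory:.4f} vs simulated mean {empirical:.4f}")
```

For the Pareto parent the slowly varying correction vanishes, so close agreement is seen even at moderate n; for general F ∈ F the agreement is asymptotic in n, as the proof of Theorem 1 shows.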
Proof (see Appendix, Note 1)

Theorem 2 (independence of consecutive ratios of extremes)

Using the same notation as in Theorem 1, for any F ∈ F,

X*(1)/X*(j+1) = [X*(1)/X*(2)] [X*(2)/X*(3)] ... [X*(j)/X*(j+1)]

where the right side is a product of independent {U(m,1)}^(-ρ) variables, the U(m,1) having beta density B(u; m, 1) = m u^(m-1), (0 ≤ u ≤ 1).

Proof (see Appendix, Note 2)

Theorem 3 (minimum variance maximum likelihood estimation based on ratios of extremes)

(i) The maximum likelihood estimator ρ̂ based on the observed ratios Y_m = X*(m)/X*(m+1), m = 1, 2, ..., k, is given by

ρ̂ = k^(-1) Σ_{m=1}^{k} m log{X*(m)/X*(m+1)}

(ii) It is unbiased, with variance ρ²/k, and is asymptotically normally distributed. The variance ρ²/k is the Cramer-Rao minimum variance bound for ρ.

(iii) It has moment generating function M_ρ̂(θ) = (1-ρθ/k)^(-k), θ < k/ρ.

Proof

(i) Note that L = Π_{m=1}^{k} f_m(y_m), while

l = ln L = const. + k ln δ - Σ_{m=1}^{k} (mδ + 1) ln y_m

since X*(m)/X*(m+1) has the distribution of {U(m,1)}^(-ρ), where U(m,1) has density B(u; m, 1); i.e.

f_m(y_m) = mδ (y_m)^(-mδ-1), y_m ≥ 1.

Thus
∂l/∂δ = k/δ - Σ_{m=1}^{k} m ln y_m

i.e., setting ∂l/∂δ = 0,

ρ̂ = k^(-1) Σ_{m=1}^{k} m ln{X*(m)/X*(m+1)}

Note that E[-∂²l/∂δ²] = k/δ² = kρ².

(ii) Since Y_m = X*(m)/X*(m+1) is distributed like {U(m,1)}^(-ρ), where U(m,1) has density B(u; m, 1) = m u^(m-1), (integer m ≥ 1),

E[{m ln(X*(m)/X*(m+1))}^j] = Γ(j+1) ρ^j, (integer j ≥ 1)

Note that

I_j = E[{m ln(X*(m)/X*(m+1))}^j] = (-ρ)^j E[{m ln U(m,1)}^j]

since X*(m)/X*(m+1) is distributed like {U(m,1)}^(-ρ)

= (-ρ)^j (-j) E[{m ln U(m,1)}^(j-1)] = jρ (-ρ)^(j-1) E[{m ln U(m,1)}^(j-1)] = jρ I_(j-1)

while I_0 = 1, leading to

E[{m ln(X*(m)/X*(m+1))}^j] = Γ(j+1) ρ^j, (integer j ≥ 1)

In particular,

E[{m ln(X*(m)/X*(m+1))}] = ρ
E[{m ln(X*(m)/X*(m+1))}²] = 2ρ² (so that Var[m ln(X*(m)/X*(m+1))] = ρ²)
E[{m ln(X*(m)/X*(m+1))}³] = 6ρ³

From these results,

E[ρ̂] = k^(-1) Σ_{m=1}^{k} E[m ln{X*(m)/X*(m+1)}] = ρ

Var[ρ̂] = k^(-2) Σ_{m=1}^{k} Var[m ln{X*(m)/X*(m+1)}] = ρ²/k
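The unbiasedness and variance results admit a direct simulation check. The sketch below is my own code, not part of the paper; the function name and all parameter choices are illustrative assumptions. It computes ρ̂ = k^(-1) Σ_{m=1}^{k} m ln{X*(m)/X*(m+1)} from the k+1 largest observations of each sample, using a Pareto parent so that the ratio distributions hold exactly:

```python
# Sketch implementation of the ratio-based MLE of rho = 1/delta
# (function name and parameters are illustrative, not from the paper).
import math
import random


def rho_hat(sample, k):
    """rho_hat = k^(-1) * sum_{m=1}^{k} m*ln(X*(m)/X*(m+1)),
    built from the k+1 largest observations of the sample."""
    top = sorted(sample, reverse=True)[:k + 1]   # X*(1) >= ... >= X*(k+1)
    return sum(m * math.log(top[m - 1] / top[m]) for m in range(1, k + 1)) / k


random.seed(2)
delta, k = 4.0, 20                      # true rho = 1/delta = 0.25
rho = 1.0 / delta
estimates = []
for _ in range(2000):
    # Pareto(delta) sample via inverse transform: X = U^(-1/delta)
    xs = [random.random() ** (-1.0 / delta) for _ in range(1000)]
    estimates.append(rho_hat(xs, k))

mean = sum(estimates) / len(estimates)
var = sum((e - mean) ** 2 for e in estimates) / len(estimates)
print(f"mean {mean:.4f} (rho = {rho}), "
      f"variance {var:.5f} (rho^2/k = {rho ** 2 / k:.5f})")
```

Since each m ln Y_m is exponentially distributed with mean ρ under the Pareto parent, the simulated mean and variance of ρ̂ should sit near ρ = 0.25 and ρ²/k = 0.003125 respectively.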
Since E[{m ln(X*(m)/X*(m+1))}³] = 6ρ³ < ∞, asymptotic normality follows, for example, from Liapunov's Central Limit Theorem (Rao (1973), p.127).

Since E[-∂²l/∂δ²] = kρ², the Cramer-Rao minimum variance bound for g(δ) = δ^(-1) is given by

Var[ρ̂] ≥ {g'(δ)}² / E[-∂²l/∂δ²] = ρ⁴/(kρ²) = ρ²/k

(iii) Recalling the independence of the ratios {X*(m)/X*(m+1)}, m = 1, 2, ..., k, the moment generating function of ρ̂ is

E[exp(θρ̂)] = Π_{m=1}^{k} E[{X*(m)/X*(m+1)}^(mθ/k)] = (1-ρθ/k)^(-k)

since E[{X*(m)/X*(m+1)}^(mθ/k)] = E[{U(m,1)}^(-ρmθ/k)], where U(m,1) has density B(u; m, 1) = m u^(m-1), (0 ≤ u ≤ 1), so that each expectation is (1-ρθ/k)^(-1). This completes the proof.

3. SUMMARY AND CONCLUSIONS

Fundamental new results for heavy-tailed extremes are formulated in terms of ratios of extremes. Such ratios are not distribution-specific, being independent of normalizing constants. Individual ratios of the form {X*(j)/X*(j+k)} are distributed like powers of beta variates. Consecutive ratios X*(1)/X*(2), X*(2)/X*(3), ... are independently distributed. These results effect considerable simplification in dealing with samples of extremes. One consequence is that unbiased estimators of the tail-index are available. The maximum likelihood estimator ρ̂ = k^(-1) Σ_{m=1}^{k} m ln{X*(m)/X*(m+1)} based on observed ratios is itself unbiased. It achieves the minimum variance ρ²/k among unbiased estimators, and is asymptotically normal. Moreover ρ̂ has moments of all orders, with moment generating function M_ρ̂(θ) = (1-ρθ/k)^(-k).

APPENDIX

Note 1: Theorem 1 (the distribution of a ratio of extremes)

Denote by X_(1), X_(2), ..., X_(n) the ascending order statistics of a common parent F ∈ F.
The variables X*(1), X*(2), ... are descending Frechet extremes, i.e. X*(j) is the weak limit of the normalized order statistic X_(n-j+1)/v_n from the parent F ∈ F, the normalizing sequence {v_n} obtained from the parent tail-quantile function, satisfying n[1 - F(v_n)] = 1 (see for instance David and Nagaraja (2003), Chapter 10). Then for any F ∈ F,

X*(j)/X*(j+k) = {U*(j,k)}^(-ρ)

where U*(j,k) has beta density B(u; j, k) = B^(-1)(j,k) u^(j-1) (1-u)^(k-1), (0 < u < 1), and where B(a,b) = Γ(a)Γ(b)/Γ(a+b), (a > 0, b > 0), Γ(·) representing the gamma function.

Proof: For the order statistics from a common distribution F ∈ F with corresponding density f(·), the joint density of (X_(n-j-k+1), X_(n-j+1)), (j + k ≤ n), denoted by f#(x_(n-j-k+1), x_(n-j+1)), transforms to

B(u; j, k) B(v; j+k, n-j-k+1) = B^(-1)(j,k) u^(j-1) (1-u)^(k-1) B^(-1)(j+k, n-j-k+1) v^(j+k-1) (1-v)^(n-j-k)

i.e. that of independent random variables U and V, deriving from the transformations (see for instance Arnold et al. (1993), Chapter 2)

(i) u(x_(n-j-k+1), x_(n-j+1)) = {1-F(x_(n-j+1))}/{1-F(x_(n-j-k+1))}, u ∈ [0,1]
(ii) v(x_(n-j-k+1), x_(n-j+1)) = x_(n-j-k+1)

Note that

|∂(u, v)/∂(x_(n-j-k+1), x_(n-j+1))| = f(x_(n-j+1))/{1-F(x_(n-j-k+1))}

where |·| represents absolute value.

The implication of the transformation u(x_(n-j-k+1), x_(n-j+1)) = {1-F(x_(n-j+1))}/{1-F(x_(n-j-k+1))} for F ∈ F, i.e. 1 - F(x) = L(x) x^(-δ), when n is large, can be determined as follows. Put

(i) x_(n-j+1) = v_n x*(j)
(ii) x_(n-j-k+1) = v_n x*(j+k)

Note that for 1 - F(x) = L(x) x^(-δ), v_n^δ = n L(v_n), and

u = {1-F(x_(n-j+1))}/{1-F(x_(n-j-k+1))}
i.e.

u = L(v_n x*(j)) v_n^(-δ) (x*(j))^(-δ) / {L(v_n x*(j+k)) v_n^(-δ) (x*(j+k))^(-δ)} = (x*(j))^(-δ) / (x*(j+k))^(-δ)

using, for example, L(v_n x*(j))/L(v_n) → 1 as n → ∞. Thus the u-transformation implies, for large n, that

(A.1) {U*(j,k)}^(-ρ) = X*(j)/X*(j+k)

where U*(j,k) has density B(u; j, k), independent of X*(j+k).

Check: The independence can be checked by showing equivalence of the moments of {X*(j)}^φ and those of {X*(j)/X*(j+k)}^φ {X*(j+k)}^φ on any dense φ-set interior to (0, j/ρ), i.e. with j - ρφ > 0. Note that

E[{X*(j)}^φ] = Γ(j-ρφ)/Γ(j)

This is the same as

E[{X*(j)/X*(j+k)}^φ] E[{X*(j+k)}^φ] = E[{U(j,k)}^(-ρφ)] E[{X*(j+k)}^φ]
= {B(j-ρφ, k)/B(j,k)} Γ(j+k-ρφ)/Γ(j+k)
= Γ(j-ρφ)/Γ(j)

This completes the proof of Theorem 1.

Note 2: Theorem 2 (independence of consecutive ratios of extremes)

Using the same notation as in Theorem 1, for any F ∈ F,

(A.2) X*(1)/X*(j+1) = [X*(1)/X*(2)] [X*(2)/X*(3)] ... [X*(j)/X*(j+1)]
where the right side is a product of independent {U*(m,1)}^(-ρ) variables, the U*(m,1) being beta (m,1) random variables, for m = 1, 2, ..., j.

Proof

We re-write the product (A.2) as

(A.3) {X*(1)/X*(j+1)}^(-δ) = {X*(1)/X*(2)}^(-δ) {X*(2)/X*(3)}^(-δ) ... {X*(j)/X*(j+1)}^(-δ)

Each of the ratios on the right side is a B(m,1) variable, m = 1, 2, ..., j, using Theorem 1. Their independence then follows using Kendall and Stuart (1969), Exercise 11.8, and noting that the left side cannot be beta if the betas on the right side are not independent.

An alternative proof by equivalence of moments on a dense set (as outlined in the proof of Theorem 1) is also available, using the uniqueness of moments for distributions (like the beta) confined to closed intervals (see for instance Feller (1971), p.227). In fact Theorem 2 also succumbs to a proof based on extending the proof of Theorem 1, as follows.

Consider the joint distribution of X_(n-i-j-k+1), X_(n-i-j+1) and X_(n-i+1), jointly distributed order statistics with common parent F(·) ∈ F, with density denoted by f#, i.e.

f#(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) =
Γ(n+1)/{Γ(n-i-j-k+1)Γ(k)Γ(j)Γ(i)} F(x_(n-i-j-k+1))^(n-i-j-k) f(x_(n-i-j-k+1))
{F(x_(n-i-j+1)) - F(x_(n-i-j-k+1))}^(k-1) f(x_(n-i-j+1))
{F(x_(n-i+1)) - F(x_(n-i-j+1))}^(j-1) f(x_(n-i+1)) {1-F(x_(n-i+1))}^(i-1)

This can be re-written as

f#(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) =
Γ(n+1)/{Γ(i+j+k)Γ(n-i-j-k+1)} F(x_(n-i-j-k+1))^(n-i-j-k) {1-F(x_(n-i-j-k+1))}^(i+j+k-1) f(x_(n-i-j-k+1))
Γ(i+j+k)/{Γ(k)Γ(i+j)} [1 - {1-F(x_(n-i-j+1))}/{1-F(x_(n-i-j-k+1))}]^(k-1) [{1-F(x_(n-i-j+1))}/{1-F(x_(n-i-j-k+1))}]^(i+j-1) f(x_(n-i-j+1))/{1-F(x_(n-i-j-k+1))}
Γ(i+j)/{Γ(j)Γ(i)} [1 - {1-F(x_(n-i+1))}/{1-F(x_(n-i-j+1))}]^(j-1) [{1-F(x_(n-i+1))}/{1-F(x_(n-i-j+1))}]^(i-1) f(x_(n-i+1))/{1-F(x_(n-i-j+1))}

Now make the substitutions:

(i) u(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) = {1-F(x_(n-i+1))}/{1-F(x_(n-i-j+1))}
(ii) v(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) = {1-F(x_(n-i-j+1))}/{1-F(x_(n-i-j-k+1))}
(iii) w(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) = x_(n-i-j-k+1)

Note that

|∂(u, v, w)/∂(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1))| = {f(x_(n-i+1))/{1-F(x_(n-i-j+1))}} {f(x_(n-i-j+1))/{1-F(x_(n-i-j-k+1))}}

So

f#(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) =
B^(-1)(n-i-j-k+1, i+j+k) w^(n-i-j-k) (1-w)^(i+j+k-1)
B^(-1)(i+j, k) v^(i+j-1) (1-v)^(k-1)
B^(-1)(i, j) u^(i-1) (1-u)^(j-1)
|∂(u, v, w)/∂(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1))|

showing U, V and W to be independently distributed. As in the proof of Theorem 1, consider the implications as n → ∞ when 1 - F(x) = L(x) x^(-δ), and

(i') x_(n-i-j-k+1) = v_n x*(i+j+k)
(ii') x_(n-i-j+1) = v_n x*(i+j)
(iii') x_(n-i+1) = v_n x*(i)

As outlined in the proof of Theorem 1, under these changes the transformation
u(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) = {1-F(x_(n-i+1))}/{1-F(x_(n-i-j+1))}

becomes

u(x*(i+j+k), x*(i+j), x*(i)) = {x*(i)/x*(i+j)}^(-δ)

while the transformation

v(x_(n-i-j-k+1), x_(n-i-j+1), x_(n-i+1)) = {1-F(x_(n-i-j+1))}/{1-F(x_(n-i-j-k+1))}

becomes

v(x*(i+j+k), x*(i+j), x*(i)) = {x*(i+j)/x*(i+j+k)}^(-δ)

These transformations imply that {X*(i)/X*(i+j)} has the distribution of {U(i,j)}^(-ρ), where U(i,j) has beta density B(u; i, j) and is distributed independently of {X*(i+j)/X*(i+j+k)}, which has the distribution of {V(i+j,k)}^(-ρ), where V(i+j,k) has beta density B(v; i+j, k).

Thus the dependent set of extremes {X*(i), X*(i+j), X*(i+j+k)} is transformed to the independent random variables {(X*(i)/X*(i+j))^(-δ), (X*(i+j)/X*(i+j+k))^(-δ), X*(i+j+k)}, the first two random variables having densities B(u; i, j) and B(v; i+j, k) respectively.

A special case of Theorem 2 then follows on putting i = j = k = 1; i.e. (X*(1)/X*(2)), (X*(2)/X*(3)) are independently distributed with distributions like {U(1,1)}^(-ρ), {U(2,1)}^(-ρ), where U(m,1) has beta density B(u; m, 1), m = 1, 2. Evidently the mechanism extends to k ratios and Theorem 2 follows.

REFERENCES

ARNOLD, B., BALAKRISHNAN, N. AND NAGARAJA, H. (1993) A First Course in Order Statistics, J. Wiley.

BEIRLANT, J., DIERCKX, G., GOEGEBEUR, Y. AND MATTHYS, G. (1999) Tail-index estimation and an exponential regression model, Extremes, 2, 177-200.

DAVID, H.A. AND NAGARAJA, H.N. (2003) Order Statistics, 3rd Edition, Wiley-Interscience.
DREES, H., FERREIRA, A. AND DE HAAN, L. (2004) On maximum likelihood estimation of the extreme value index, Annals of Applied Probability, 14, 1179-1201.

EMBRECHTS, P., KLUPPELBERG, C. AND MIKOSCH, T. (1997) Modelling Extremal Events for Insurance and Finance, Springer-Verlag.

FELLER, W. (1971) An Introduction to Probability Theory and Its Applications, Volume II, J. Wiley.

HILL, B.M. (1975) A simple general approach to inference about the tail of a distribution, Ann. Statist., 3, 1163-1174.

KENDALL, M.G. AND STUART, A. (1969) The Advanced Theory of Statistics, Vol. 1, Griffin.

MATTHYS, G. (2001) Estimating the extreme-value index and high quantiles: bias reduction and optimal threshold selection. Doctoral Thesis, Katholieke Universiteit Leuven.

PICKANDS, J. (1975) Statistical inference using extreme order statistics, Ann. Statist., 3, 119-131.

RAO, C.R. (1973) Linear Statistical Inference and Its Applications, J. Wiley.

SMITH, R.L. (1987) Estimating the tails of probability distributions, Ann. Statist., 15, 1174-1207.

TEUGELS, J.L. AND VANROELEN, G. (2004) Box-Cox transformations and heavy-tailed distributions, in 'Stochastic Methods and Their Applications', Journal of Applied Probability, Special Volume, 41A, 213-227.