Multivariate generalized Pareto distributions


Holger Rootzén and Nader Tajvidi

Chalmers, SE-41296, Göteborg, Sweden, rootzen@math.chalmers.se
Lund Institute of Technology, Box 118, SE-22100, Lund, Sweden, tajvidi@maths.lth.se

Abstract

Statistical inference for extremes has been a subject of intensive research during the past couple of decades. One approach is based on modeling exceedances of a random variable over a high threshold with the Generalized Pareto (GP) distribution. This has proved to be an important way to apply extreme value theory in practice and is widely used. In this paper we introduce a multivariate analogue of the GP distribution and show that it is characterized by each of the following two properties: (i) exceedances asymptotically have a multivariate GP distribution if and only if maxima asymptotically are Extreme Value (EV) distributed, and (ii) the multivariate GP distribution is the only one which is preserved under change of exceedance levels. We also give a number of examples and discuss lower-dimensional marginal distributions.

Keywords: Generalized Pareto distribution, Multivariate extreme value theory, Multivariate Pareto distribution, Non-homogeneous Poisson process, Peaks over threshold method.

AMS 1991 subject classification: 62E20, 60F17, 62E10, 60G70, 62H12.

1 Introduction

Statistical modeling of extreme values has developed extensively during the last decades. This is witnessed by several recent books (Coles (2001), Embrechts et al. (1997), Kotz and Nadarajah (2000), Kowaka (1994), Beirlant et al. (2005), Reiss and Thomas (2005)) and a large journal literature. The latter includes both theoretical papers and many articles which apply the methods to a wide range of important problems such as extreme windspeeds, waveheights, floods, insurance claims, and price fluctuations. For references to some of this literature, see e.g. Kotz and Nadarajah (2000) and Beirlant et al. (2005). The main emphasis has been on univariate extremes, and so far the univariate results are the most complete and directly usable ones.

Two main sets of methods, the Block Maxima

method and the Peaks over Thresholds method, have been developed (Coles (2001)). We here only consider independent and identically distributed variables. However, the methods are also widely useful for dependent and non-stationary situations.

In the Block Maxima method one is supposed to have observed the maximum values of some quantities over a number of blocks, a typical example being that a block is a year and the observed quantity is some environmental quantity such as the wind speed at a specific location. In this method, the block maxima are modeled by an Extreme Value (EV) distribution with distribution function (d.f.)

$$G(x) = \exp\left\{-\left(1 + \gamma\,\frac{x-\mu}{\sigma}\right)_+^{-1/\gamma}\right\}.$$

This choice of distribution is motivated by the two facts that (i) the EV distributions are the only ones which can appear as the limit of linearly normalized maxima, and that (ii) they are the only ones which are max-stable, i.e. such that a change of block size only leads to a change of location and scale parameter in the distribution.

In the Peaks over Thresholds (PoT) method one is instead supposed to have observed all values which are larger than some suitable threshold, e.g. all windspeeds higher than 20 m/s. These values are then assumed to follow the Generalized Pareto (GP) distribution with d.f.

$$H(x) = 1 - \left(1 + \gamma\,\frac{x}{\sigma}\right)_+^{-1/\gamma}.$$

This choice of distribution is motivated by characterizations due to Balkema and de Haan (1974) and Pickands (1975). One characterization is (i) that the distribution of a scale normalized exceedance over a threshold asymptotically (as the threshold tends to the righthand endpoint of the distribution) converges to a GP distribution if and only if the distribution of block maxima converges (as the block length tends to infinity) to an EV distribution. The other one is (ii) that the GP distributions are the only stable ones, i.e. the only ones for which the conditional distribution of an exceedance is a scale transformation of the original distribution.
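The two univariate stability properties just described can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the parameter values are assumptions chosen for illustration. It evaluates the EV and GP distribution functions given above and verifies that (a) raising an EV d.f. to a power `t` only changes location and scale, and (b) the conditional distribution of a GP exceedance over a level `u` is again GP with scale `sigma + gamma * u`.

```python
import math


def ev_cdf(x, mu=0.0, sigma=1.0, gamma=0.2):
    """EV d.f. G(x) = exp{-(1 + gamma (x - mu) / sigma)_+^(-1/gamma)}."""
    if gamma == 0.0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    z = 1.0 + gamma * (x - mu) / sigma
    if z <= 0.0:
        # Outside the support: below the lower endpoint (gamma > 0)
        # or above the upper endpoint (gamma < 0).
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))


def gp_cdf(x, sigma=1.0, gamma=0.2):
    """GP d.f. H(x) = 1 - (1 + gamma x / sigma)_+^(-1/gamma) for x > 0."""
    if x <= 0.0:
        return 0.0
    if gamma == 0.0:
        return 1.0 - math.exp(-x / sigma)
    z = 1.0 + gamma * x / sigma
    if z <= 0.0:
        return 1.0  # beyond the upper endpoint (only possible for gamma < 0)
    return 1.0 - z ** (-1.0 / gamma)


def excess_cdf(x, u, sigma=1.0, gamma=0.2):
    """P(X - u <= x | X > u) for X with d.f. gp_cdf(., sigma, gamma)."""
    tail_u = 1.0 - gp_cdf(u, sigma, gamma)
    return (gp_cdf(u + x, sigma, gamma) - gp_cdf(u, sigma, gamma)) / tail_u


# (a) Max-stability: G^t is EV with the same gamma and new location/scale.
t, gamma = 4.0, 0.2
mu_t = (t ** gamma - 1.0) / gamma   # new location when mu = 0, sigma = 1
sigma_t = t ** gamma                # new scale
lhs = ev_cdf(1.3) ** t
rhs = ev_cdf(1.3, mu=mu_t, sigma=sigma_t, gamma=gamma)

# (b) Threshold stability: the excess over u is GP with scale sigma + gamma*u.
excess = excess_cdf(1.5, u=2.0)
rescaled = gp_cdf(1.5, sigma=1.0 + 0.2 * 2.0, gamma=0.2)
```

In both comparisons the two sides agree up to floating-point error, which is the one-dimensional version of the properties that the multivariate definition below is required to satisfy.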
Pickands gives the full statement of this, although we believe there is a small gap in his proof. Balkema and de Haan only consider the infinite endpoint case, but give a complete proof. Some basic papers on the PoT method are Smith (1985, 1987), Smith et al. (1990, 1997), and Davison and Smith (1990). Ledford and Tawn (1996) develop threshold-based models for joint tail regions of multivariate extremes in asymptotically independent cases. Since the PoT method uses more of the data it can sometimes result in better estimation precision than the block maxima method. As a parenthesis, there are several other variants of the (one-dimensional) Pareto distribution, see e.g. Arnold (1983).

Multivariate extreme value distributions arise in connection with extremes of a random sample from a multivariate distribution. They are extensively discussed in the books by Resnick (1987), Kotz and Nadarajah (2000) and Beirlant et al. (2005) and in the review by Fougères in Finkenstädt and Rootzén (2004). Several recent papers (e.g. Joe et al. (1992), Coles and Tawn (1991), Tawn (1988, 1990), Smith et al. (1990)) have explored their statistical application.

There are several possibilities for ordering multivariate data, see e.g. the review by Barnett (1976). For extreme values the most widely used method is the marginal or M-ordering, where the maximum is defined by taking componentwise maxima. Then, for a

series of vectors $\{X_i, i \ge 1\} = \{(X_i^{(1)}, \dots, X_i^{(d)}), i \ge 1\}$, the maximum, $M_n$, is defined by

$$M_n = (M_n^{(1)}, \dots, M_n^{(d)}) = \Big(\bigvee_{i=1}^{n} X_i^{(1)}, \dots, \bigvee_{i=1}^{n} X_i^{(d)}\Big).$$

Here and hereafter $\vee$ denotes maximum. Under rather general conditions, the distribution of the linearly normalized $M_n$ converges to a multivariate extreme value distribution. In applications $M_n$ is often the vector of annual maxima, and block maxima methods can be applied in the same way as when the observations are one-dimensional. However, as in the univariate case it is also of interest to study methods which utilize more of the data and which can contribute to better estimation of parameters. For multivariate observations a further reason to study such methods is that block maxima hide the time structure, since they do not show whether the component maxima have occurred simultaneously or not.

The aim of this paper is to define the multivariate Generalized Pareto distributions and to prove that this definition indeed is the right one. The multivariate Generalized Pareto distribution should (a) be the natural distribution for exceedances of high thresholds by multivariate random vectors, and (b) describe what happens to the other components when one or more of the components exceed their thresholds. In complete analogy with the one-dimensional case we interpret (a) to mean that the multivariate GP distribution should be characterized by each of the following two properties: exceedances (of suitably coordinated levels) asymptotically have a multivariate GP distribution if and only if componentwise maxima asymptotically are EV distributed; and the multivariate GP distribution is the only one which is preserved under a (suitably coordinated) change of exceedance levels. In the next section, Section 2, we prove that this indeed is the case for the definition given in this paper. The section also explains the caveat "suitably coordinated levels". Further, the requirement (b) is taken care of by the choice of support for the GP distribution.
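The componentwise maximum $M_n$ defined above, and the remark that block maxima hide whether the component maxima occurred simultaneously, can be illustrated with a small sketch. The helper names and the three-point sample are hypothetical and serve only as an example.

```python
def componentwise_max(sample):
    """M_n: coordinate-wise maximum of a list of d-dimensional vectors."""
    return tuple(max(column) for column in zip(*sample))


def argmax_indices(sample):
    """For each coordinate, the index i at which the maximum is attained
    (the first such index if there are ties)."""
    return tuple(
        max(range(len(column)), key=column.__getitem__)
        for column in zip(*sample)
    )


# Hypothetical bivariate observations X_1, X_2, X_3.
sample = [(1.0, 7.0), (4.0, 2.0), (3.0, 5.0)]

m_n = componentwise_max(sample)        # maximum taken coordinate by coordinate
where = argmax_indices(sample)         # which observation supplied each maximum
simultaneous = len(set(where)) == 1    # True only if one observation gave all maxima
is_observation = m_n in sample         # M_n need not be one of the observed vectors
```

Here the two component maxima come from different observations, so $M_n$ is not itself an observed vector. This is exactly the information that is lost when only block maxima are recorded.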
There is a close connection between the multivariate GP distribution and the multivariate point process methods used in Coles and Tawn (1991) and Joe et al. (1992), see Section 2. In Section 3 we show how an explicit formula for the multivariate EV distributions directly leads to a corresponding expression for the multivariate GP distributions and also give a few concrete examples. Lower-dimensional marginals of multivariate GP distributions may be thought of in different ways. This is discussed in Section 4. Proofs are given in Section 5.

This paper is a further development of the results in Tajvidi (1996). A set of related work is Falk and Reiss (2001, 2002, 2003 and 2005), which introduced a class of distributions, also named Bivariate Generalized Pareto Distributions, with uniform margins and suggested a canonical parameterization for the distributions. The papers also discussed estimation and asymptotic normality. To the best of our knowledge no multivariate generalization of these distributions has been discussed by the authors.

2 Multivariate Generalized Pareto Distributions

In this section we give the formal definition of the multivariate generalized Pareto distribution and reformulate the motivating characterizations in mathematical terms. Proofs of the characterizations are given in Section 5. However, first some preliminaries.

Suppose $\{X_i, i \ge 1\} = \{(X_i^{(1)}, \dots, X_i^{(d)}), i \ge 1\}$ are independent, identically distributed $d$-dimensional random vectors with distribution function $F$. As before, let $M_n$ be the vector of componentwise maxima,

$$M_n = (M_n^{(1)}, \dots, M_n^{(d)}) = \Big(\bigvee_{i=1}^{n} X_i^{(1)}, \dots, \bigvee_{i=1}^{n} X_i^{(d)}\Big).$$

Assume that there exist normalizing constants $\sigma_n^{(i)} > 0$, $u_n^{(i)} \in \mathbb{R}$, $1 \le i \le d$, $n \ge 1$, such that as $n \to \infty$

$$P\big[(M_n^{(i)} - u_n^{(i)})/\sigma_n^{(i)} \le x^{(i)},\ 1 \le i \le d\big] = F^n\big(\sigma_n^{(1)} x^{(1)} + u_n^{(1)}, \dots, \sigma_n^{(d)} x^{(d)} + u_n^{(d)}\big) \to G(x) \tag{1}$$

with the limit distribution $G$ such that each marginal $G_i$, $i = 1, \dots, d$, is non-degenerate. If (1) holds, $F$ is said to be in the domain of attraction of $G$, and we write $F \in D(G)$, and $G$ is said to be a multivariate extreme value distribution.

By setting all $x$-s except $x^{(i)}$ to $+\infty$ it is seen that each marginal $G_i$ of $G$ must be an EV d.f., so that

$$G_i(x) = \exp\left(-\left(1 + \gamma_i\,\frac{x - \mu_i}{\sigma_i}\right)_+^{-1/\gamma_i}\right).$$

Here $\mu_i$ is a location parameter, $\sigma_i > 0$ is a scale parameter, $\gamma_i$ is a shape parameter, and the $+$ signifies that if the expression in parentheses is negative then it should be replaced by 0. For $\gamma_i = 0$ the expression for the d.f. should be interpreted to mean $\exp(-\exp(-(x - \mu_i)/\sigma_i))$.

As in the univariate case, a multivariate convergence of types argument shows that the class of limit d.f.'s for (1) is the class of max-stable distributions, where a d.f. $G$ in $\mathbb{R}^d$ is max-stable if for $i = 1, \dots, d$ and every $t > 0$ there exist functions $\alpha^{(i)}(t) > 0$, $\beta^{(i)}(t)$ such that

$$G^t(x) = G\big(\alpha^{(1)}(t) x^{(1)} + \beta^{(1)}(t), \dots, \alpha^{(d)}(t) x^{(d)} + \beta^{(d)}(t)\big).$$

It is convenient to have a convention to handle vectors occurring in the same expression but not all of the same length. We use the convention that the value of the expression is a vector with the same length as that of the longest vector occurring in the expression. Shorter vectors are recycled as often as need be, perhaps fractionally, until they match the length of the longest vector. In particular a single number is repeated the appropriate number of times. All operations on vectors are performed element by element. For example if $x$ and $y$ are bivariate vectors and $\alpha$ is a scalar, we have

$$\alpha x = (\alpha x_1, \alpha x_2); \qquad \alpha + x = (\alpha + x_1, \alpha + x_2)$$

and

$$xy = (x_1 y_1, x_2 y_2); \qquad x + y = (x_1 + y_1, x_2 + y_2).$$

This convention applies also when we take supremum or infimum of a set, so that e.g. a coordinate of the supremum of a set is the supremum of all the values this coordinate takes in the set. Thus, e.g., for $u_n = (u_n^{(1)}, \dots, u_n^{(d)})$, $\sigma_n = (\sigma_n^{(1)}, \dots, \sigma_n^{(d)})$, we can write (1) as $P((M_n - u_n)/\sigma_n \le x) \to G(x)$.

For the definition we also use the convention that $0/0 = 1$. The definition has also independently been noticed by Beirlant et al. (2005, Ch. 9).

Definition 2.1. A distribution function $H$ is a multivariate generalized Pareto distribution if

$$H(x) = \frac{1}{-\log G(0)} \log \frac{G(x)}{G(x \wedge 0)} \tag{2}$$

for some extreme value distribution $G$ with nondegenerate margins and with $0 < G(0) < 1$. In particular, $H(x) = 0$ for $x < 0$ and $H(x) = 1 - \frac{\log G(x)}{\log G(0)}$ for $x > 0$.

Perhaps more elegantly, the class of multivariate GP distributions could alternatively be taken to be all distributions of the form

$$H(x) = \log \frac{G(x)}{G(x \wedge 0)},$$

for $G$ an EV distribution with $G(0) = e^{-1}$. This is not less general than (2), since if we let $t = 1/(-\log G(0))$ then the $H$ in (2) is of the form $\log \frac{G(x)^t}{G(x \wedge 0)^t}$, and by max-stability $G^t$ again is an EV distribution, with $G(0)^t = \exp\left(\frac{\log G(0)}{-\log G(0)}\right) = e^{-1}$. However, for statistical application one would want to parameterize $G$, and then the form (2) is more convenient.

As mentioned in the introduction, there is a strong connection between this multivariate GP distribution and the point process approach of Coles and Tawn (1991) and Joe et al. (1992). The major difference is that (2) holds for all values $x \nleq 0$ whereas in the point process approach only values $x > 0$ are modeled parametrically. This might prove to be an improvement for statistical analysis of extremes, since then also negative $x_i$'s contribute to making inference on the distribution.

Our first motivation for this definition is the following theorem. It shows that exceedances (of suitably coordinated levels) asymptotically have a multivariate GP distribution if and only if maxima are asymptotically EV distributed.
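Definition 2.1 can be evaluated directly once an EV distribution $G$ is fixed. The sketch below is only an illustration under stated assumptions: it takes $G$ to be a bivariate symmetric logistic EV distribution with EV margins, and the parameter values in `R` and `MARGINS`, as well as all function names, are hypothetical choices rather than anything prescribed by the text. It computes $H$ from formula (2), working with $-\log G$ to avoid taking logarithms of very small numbers, and should only be called at points inside the support of $G$.

```python
import math

INF = math.inf

# Hypothetical illustrative parameters (assumptions, not from the paper).
R = 2.0                                         # logistic dependence, r >= 1
MARGINS = ((0.5, 1.0, 0.3), (-0.2, 2.0, 0.1))   # (mu, sigma, gamma) per component


def margin_term(x, mu, sigma, gamma):
    """-log of the EV margin: (1 + gamma (x - mu) / sigma)_+^(-1/gamma)."""
    if x == INF:
        return 0.0
    if gamma == 0.0:
        return math.exp(-(x - mu) / sigma)
    z = 1.0 + gamma * (x - mu) / sigma
    if z <= 0.0:
        return INF if gamma > 0.0 else 0.0
    return z ** (-1.0 / gamma)


def neg_log_G(x):
    """-log G(x) for the assumed symmetric logistic EV distribution."""
    terms = [margin_term(xi, *params) for xi, params in zip(x, MARGINS)]
    return sum(term ** R for term in terms) ** (1.0 / R)


def gp_cdf(x):
    """H(x) = log(G(x) / G(x ^ 0)) / (-log G(0)), formula (2)."""
    x_min0 = tuple(min(xi, 0.0) for xi in x)
    return (neg_log_G(x_min0) - neg_log_G(x)) / neg_log_G((0.0, 0.0))


h_negative = gp_cdf((-0.5, -0.3))   # all components negative
h_mixed = gp_cdf((-1.0, 2.0))       # one component negative, one positive
h_positive = gp_cdf((1.0, 2.0))     # all components positive
h_top = gp_cdf((INF, INF))
```

With these values `h_negative` is zero, `h_top` is one, and `h_mixed` is strictly positive. The last point shows that $H$ puts mass on vectors with some negative components, which is the feature distinguishing (2) from approaches that model only $x > 0$.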
To state the theorem, let $X$ be a $d$-dimensional random vector with distribution function $F$ and write $\bar F = 1 - F$ for the tail function of a distribution $F$. Further, let $\{u(t) \mid t \in [1, \infty)\}$ be a $d$-dimensional curve starting at $u(1) = 0$, let $\sigma(u) = \sigma(u(t)) > 0$ be a function with values in $\mathbb{R}^d$, and let

$$X_u = \frac{X - u}{\sigma(u)}$$

be the vector of normalized exceedances of the levels $u$. In the characterizations we consider exceedances of $d$ levels which tend to infinity (interpreted to mean that the levels move further and further out into the tails of $F$). However, asymptotic distributions can differ for different relations between the levels. The components of the curve $\{u(t)\}$ give these levels and the curve specifies how the levels increase in a suitably coordinated way.

Theorem 2.2. i) Suppose $G$ is a $d$-dimensional EV distribution with $0 < G(0) < 1$. If $F \in D(G)$ then there exists an increasing continuous curve $u$ with $F(u(t)) \to 1$ as $t \to \infty$, and a function $\sigma(u) > 0$ such that

$$P(X_u \le x \mid X_u \nleq 0) \to \frac{1}{-\log G(0)} \log \frac{G(x)}{G(x \wedge 0)} \tag{3}$$

as $t \to \infty$, for all $x$.

ii) Suppose there exists an increasing continuous curve $u$ with $F(u(t)) \to 1$ as $t \to \infty$, and a function $\sigma(u) > 0$ such that

$$P(X_u \le x \mid X_u \nleq 0) \to H(x), \tag{4}$$

for some function $H$, as $t \to \infty$, for $x > 0$, where the marginals of $H$ on $\mathbb{R}_+$ are nondegenerate. Then the lefthand side of (4) converges to a limit $H(x)$ for all $x$ and there is a unique multivariate extreme value distribution $G$ with $G(0) = e^{-1}$ such that

$$H(x) = \log \frac{G(x)}{G(x \wedge 0)}. \tag{5}$$

This $G$ satisfies $G(x) = e^{-\bar H(x)}$ for $x > 0$, and $F \in D(G)$.

The next motivation for Definition 2.1 is that distribution (2) is the only one which is preserved under a (suitably coordinated) change of exceedance levels.

Theorem 2.3. i) Suppose $X$ has a multivariate generalized Pareto distribution. Then there exists an increasing continuous curve $u$ with $P(X \le u(t)) \to 1$ as $t \to \infty$, and a function $\sigma(u) > 0$ such that

$$P(X_u \le x \mid X_u \nleq 0) = P(X \le x), \tag{6}$$

for $t \in [1, \infty)$ and all $x$.

ii) If there exists an increasing continuous curve $u$ with $P(X \le u(t)) \to 1$ as $t \to \infty$, and a function $\sigma(u) > 0$ such that (6) holds for $x > 0$, and $X$ has nondegenerate margins, then $X$ has a multivariate generalized Pareto distribution.
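The preservation property (6) of Theorem 2.3 can be checked numerically in a simple special case. The sketch below is an illustration under assumptions that are not taken from the text: $G$ is the bivariate symmetric logistic EV distribution with standard Gumbel margins, for which $G(x + \log t)^t = G(x)$, so that one admissible choice is $u(t) = (\log t, \log t)$ and $\sigma(u) = 1$. The conditional probability on the left of (6) is computed from tail probabilities of $H$ as the difference $P(X_u \nleq x \wedge 0) - P(X_u \nleq x)$ divided by $P(X_u \nleq 0)$. Function names and the value of `R` are hypothetical.

```python
import math

R = 2.0  # assumed logistic dependence parameter, r >= 1


def neg_log_G(x, y):
    """-log G(x, y) for the logistic EV d.f. with standard Gumbel margins."""
    return (math.exp(-R * x) + math.exp(-R * y)) ** (1.0 / R)


def H(x, y):
    """Bivariate GP d.f. built from G via Definition 2.1."""
    return (neg_log_G(min(x, 0.0), min(y, 0.0)) - neg_log_G(x, y)) / neg_log_G(0.0, 0.0)


def tail(x, y):
    """P(X not<= (x, y)) = 1 - H(x, y) for X with d.f. H."""
    return 1.0 - H(x, y)


def conditional_exceedance_cdf(x, y, t):
    """P(X_u <= (x, y) | X_u not<= 0) with u = (log t, log t), sigma = 1."""
    u = math.log(t)
    numerator = tail(u + min(x, 0.0), u + min(y, 0.0)) - tail(u + x, u + y)
    return numerator / tail(u, u)


# Compare both sides of (6) at a few points and levels.
checks = [
    (conditional_exceedance_cdf(x, y, t), H(x, y))
    for (x, y, t) in [(0.7, -0.4, 5.0), (-1.0, 2.0, 50.0),
                      (1.0, 1.0, 2.0), (-1.0, -2.0, 10.0)]
]
```

Each pair in `checks` agrees up to floating-point error, whatever level `t` is used. Choosing levels that are not coordinated in this way (for example raising only one component of `u`) would in general break the equality, which is the reason for the "suitably coordinated" caveat.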

A useful tool in extreme value theory is convergence of the point process of large values, see e.g. Resnick (1987). The close relation between the previous results and point process convergence is the content of the next result. In it we use the rather standard notation of Resnick (1987), and let $X_1, X_2, \dots$ be i.i.d. with distribution function $F$.

Theorem 2.4. i) Suppose one of the conditions of Theorem 2.2 holds. Write $S$ for the support of $G$, so that $S = \{x : G(x) \in (0, 1)\}$, and let $\mu$ be the measure on $S$ which is determined by $\mu(-\infty, x]^c = -\log G(x)$. Then there exist $d$-dimensional vectors of constants $u_n$ and $\sigma_n > 0$ such that

$$\sum_{i=1}^{n} \varepsilon_{\left(\frac{i}{n},\ \frac{X_i - u_n}{\sigma_n}\right)} \Rightarrow \mathrm{PRM}(dt \times d\mu) \quad \text{on } S. \tag{7}$$

ii) Suppose (7) holds on $\{x \nleq 0\}$ for some measure $\mu$ with $\mu(-\infty, 0]^c \in (0, \infty)$ and where the function $\mu(-\infty, x]^c$ has non-degenerate marginals. Then the conditions of Theorem 2.2 hold, and hence also i) above is satisfied.

3 Examples

In this section we exhibit a general expression for the multivariate generalized Pareto distributions and one specific bivariate example: the bivariate logistic distribution.

Several authors, see e.g. Resnick (1987) and Pickands (1981), have given equivalent characterizations of multivariate extreme value distributions, assuming different marginal distributions. E.g. according to Proposition 5.11 in Resnick (1987), all max-stable distributions with the unit Fréchet extreme value distribution $\Phi_1(x) = \exp(-x^{-1})$, $x > 0$, as marginal distributions can be written as

$$G_*(x) = \exp\{-\mu_*[0, x]^c\}, \quad x \ge 0 \tag{8}$$

with

$$\mu_*[0, x]^c = \int_{\aleph} \bigvee_{i=1}^{d} \left(\frac{a^{(i)}}{x^{(i)}}\right) S(da). \tag{9}$$

Here $S$ is a finite measure on $\aleph = \{y \in \mathbb{R}^d_+ : \|y\| = 1\}$, which below is assumed to satisfy

$$\int_{\aleph} a^{(i)} S(da) = 1, \quad 1 \le i \le d,$$

where $\|\cdot\|$ is an arbitrary norm in $\mathbb{R}^d$, and $\mu_*$ is called the exponent measure. This leads to the description of the multivariate extreme value distributions $G$ with arbitrary marginals as all distributions of the form

$$G(x) = G_*\left(\left(1 + \gamma\,\frac{x - \mu}{\sigma}\right)^{1/\gamma}\right).$$
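The representation (8) and (9) becomes very concrete when the measure $S$ has finitely many atoms, since the integral in (9) is then a finite sum. The following sketch assumes a hypothetical discrete $S$ on the unit simplex of the sum norm in dimension two; the atoms, weights, margin parameters and function names are illustrative assumptions only. It checks the normalization of $S$, the unit Fréchet margins of $G_*$, max-stability in the form $G_*(tx)^t = G_*(x)$, and the transformation to arbitrary EV margins.

```python
import math

INF = math.inf

# Hypothetical discrete measure S on {a >= 0 : a1 + a2 = 1} (sum norm).
ATOMS = ((1.0, 0.0), (0.0, 1.0), (0.5, 0.5))
WEIGHTS = (0.5, 0.5, 1.0)


def exponent_measure(x):
    """mu_*([0, x]^c) = sum_k w_k max_i a_k^(i) / x^(i), for x > 0, as in (9)."""
    return sum(
        w * max(a_i / x_i for a_i, x_i in zip(a, x))
        for a, w in zip(ATOMS, WEIGHTS)
    )


def G_star(x):
    """Max-stable d.f. with unit Frechet margins, as in (8)."""
    return math.exp(-exponent_measure(x))


def to_frechet_scale(x, mu, sigma, gamma):
    """Componentwise (1 + gamma (x - mu) / sigma)^(1/gamma).

    Assumes gamma != 0 and a positive base in every component."""
    return tuple(
        (1.0 + g * (xi - m) / s) ** (1.0 / g)
        for xi, m, s, g in zip(x, mu, sigma, gamma)
    )


def G(x, mu, sigma, gamma):
    """EV d.f. with arbitrary margins, G(x) = G_*((1 + gamma (x - mu)/sigma)^(1/gamma))."""
    return G_star(to_frechet_scale(x, mu, sigma, gamma))


# Normalization of S: integral of a^(i) S(da) for i = 1, 2.
moments = tuple(sum(w * a[i] for a, w in zip(ATOMS, WEIGHTS)) for i in range(2))

frechet_margin = G_star((2.0, INF))        # should equal exp(-1/2)
stability_lhs = G_star((3 * 1.5, 3 * 0.8)) ** 3
stability_rhs = G_star((1.5, 0.8))

mu, sigma, gamma = (0.5, -0.2), (1.0, 2.0), (0.3, 0.1)
ev_margin = G((1.0, INF), mu, sigma, gamma)
```

The atoms at $(1,0)$ and $(0,1)$ contribute independence, while the atom at $(0.5, 0.5)$ contributes complete dependence, so this particular `G_star` is a mixture-type model between the two extremes.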

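In this expression for $G$, $\mu$, $\sigma$ and $\gamma$ are $d$-dimensional vectors with potentially different entries. By Definition 2.1 this in turn gives the following expression for the multivariate GP distributions.

Proposition 3.1. $H(x)$ is a multivariate GP distribution if there exists a finite measure $S$, normalized as described above, such that for $x \nleq 0$

$$H(x) = \frac{\displaystyle\int_{\aleph} \bigvee_{i=1}^{d} a^{(i)} \left(1 + \gamma^{(i)}\,\frac{x^{(i)} \wedge 0 - \mu^{(i)}}{\sigma^{(i)}}\right)^{-1/\gamma^{(i)}} S(da) - \int_{\aleph} \bigvee_{i=1}^{d} a^{(i)} \left(1 + \gamma^{(i)}\,\frac{x^{(i)} - \mu^{(i)}}{\sigma^{(i)}}\right)^{-1/\gamma^{(i)}} S(da)}{\displaystyle\int_{\aleph} \bigvee_{i=1}^{d} a^{(i)} \left(1 - \gamma^{(i)}\,\frac{\mu^{(i)}}{\sigma^{(i)}}\right)^{-1/\gamma^{(i)}} S(da)}$$

$$= \frac{\mu_*\left[0, \left(1 + \gamma\,\frac{x \wedge 0 - \mu}{\sigma}\right)^{1/\gamma}\right]^c - \mu_*\left[0, \left(1 + \gamma\,\frac{x - \mu}{\sigma}\right)^{1/\gamma}\right]^c}{\mu_*\left[0, \left(1 - \gamma\,\frac{\mu}{\sigma}\right)^{1/\gamma}\right]^c}.$$

The parameters $\mu$, $\gamma$, and $\sigma > 0$ have to satisfy $\mu^{(i)} < \sigma^{(i)}/\gamma^{(i)}$ if $\gamma^{(i)} > 0$ and $\mu^{(i)} > \sigma^{(i)}/\gamma^{(i)}$ if $\gamma^{(i)} < 0$, for $i = 1, \dots, d$ (to make $0 < G(0) < 1$).

Now, the example:

Example 1. The symmetric logistic model has exponent measure

$$\mu_*([0, (x, y)]^c) = \left(x^{-r} + y^{-r}\right)^{1/r}, \quad r \ge 1.$$

The independent case corresponds to $r = 1$, and for $r = +\infty$ we get complete dependence, which is the only situation without density. By transforming marginals to an arbitrary extreme value distribution we obtain the following bivariate GP distribution:

$$H(x, y) = \frac{\left(\left(1 + \gamma_x\,\frac{x \wedge 0 - \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 + \gamma_y\,\frac{y \wedge 0 - \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r} - \left(\left(1 + \gamma_x\,\frac{x - \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 + \gamma_y\,\frac{y - \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r}}{\left(\left(1 - \frac{\gamma_x \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r}}.$$

As above we assume that the parametrization is such that it ensures $0 < G(0, 0) < 1$. For $(x, y) > (0, 0)$ this corresponds to

$$H(x, y) = 1 - \frac{\left(\left(1 + \gamma_x\,\frac{x - \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 + \gamma_y\,\frac{y - \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r}}{\left(\left(1 - \frac{\gamma_x \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r}},$$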

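while for $x < 0$, $y > 0$ we have

$$H(x, y) = \frac{\left(\left(1 + \gamma_x\,\frac{x - \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r} - \left(\left(1 + \gamma_x\,\frac{x - \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 + \gamma_y\,\frac{y - \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r}}{\left(\left(1 - \frac{\gamma_x \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r}}.$$

For the independent case, $r = 1$, the terms in $x$ cancel and this simplifies to

$$H(x, y) = \frac{\left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-1/\gamma_y} - \left(1 + \gamma_y\,\frac{y - \mu_y}{\sigma_y}\right)^{-1/\gamma_y}}{\left(1 - \frac{\gamma_x \mu_x}{\sigma_x}\right)^{-1/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-1/\gamma_y}}.$$

Now, still considering the independent case, let $X$ and $Y$ be independent random variables where $X$ has the GP d.f. $1 - \left(1 + \frac{\gamma_x x}{\sigma_x - \gamma_x \mu_x}\right)^{-1/\gamma_x}$ for $x > 0$, and $Y$ is defined similarly. Further, let

$$p = \frac{\left(1 - \frac{\gamma_x \mu_x}{\sigma_x}\right)^{-1/\gamma_x}}{\left(1 - \frac{\gamma_x \mu_x}{\sigma_x}\right)^{-1/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-1/\gamma_y}}.$$

Then $H$ is the distribution function of a bivariate random variable which equals $(X, -\infty)$ with probability $p$ and $(-\infty, Y)$ with probability $q = 1 - p$. Thus, in either case one of the components is degenerate at the lower bound of the distribution while the other one is a GP random variable. This completely agrees with intuition: suppose the distribution of the exceedances of a bivariate vector after normalization converges as the levels increase. Then, for independent components, the event that one of the components exceeds its level does not influence the value of the other one. Hence asymptotically, as the levels tend to infinity, the normalization will force this other component down to its lower bound. The roles of the components can of course be interchanged in this argument.

4 Lower-dimensional marginal distributions

Interpreted in the usual way, lower-dimensional marginal distributions of multivariate GP distributions are not GP distributions. E.g., if $H(x, y)$ is the bivariate GP distribution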

from Example 1 above and $H_1(x)$ is the marginal distribution of the first component, then

$$H_1(x) = H(x, \infty) = \frac{\left(\left(1 + \gamma_x\,\frac{x \wedge 0 - \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r} - \left(1 + \gamma_x\,\frac{x - \mu_x}{\sigma_x}\right)^{-1/\gamma_x}}{\left(\left(1 - \frac{\gamma_x \mu_x}{\sigma_x}\right)^{-r/\gamma_x} + \left(1 - \frac{\gamma_y \mu_y}{\sigma_y}\right)^{-r/\gamma_y}\right)^{1/r}}.$$

This is not a one-dimensional generalized Pareto distribution. However, if $X$ has distribution $H_1$ then the conditional distribution of $X$ given $X > 0$ is generalized Pareto. This property holds for all marginal distributions regardless of the dimension of the original problem.

The reason is that $H_1$ is the asymptotic conditional distribution of the first component of the random vector given that either the first or the second component is large. In contrast, a one-dimensional GP distribution is the asymptotic conditional distribution of a random variable, given that it is large. In general, a (standard) lower-dimensional marginal distribution of a multivariate GP distribution is the asymptotic conditional distribution of a subset of random variables given that at least one of a bigger set of variables is large. Sometimes these may be the appropriate lower-dimensional marginals of multivariate GP distributions. However, the following concept may also be useful.

Let $H$ be a $d$-dimensional multivariate GP distribution with representation

$$H(y) = \frac{1}{-\log G(0)} \log \frac{G(y)}{G(y \wedge 0)} \tag{10}$$

in terms of a multivariate EV distribution $G$. For $x = (x_1, \dots, x_{d-1})$ a $(d-1)$-dimensional vector let $G_{(i)}(x) = G(x_1, \dots, x_{i-1}, \infty, x_i, \dots, x_{d-1})$ be the $(d-1)$-dimensional marginal distribution, with the $i$-th component removed, of $G$. The $(d-1)$-dimensional Generalized Pareto marginal distribution $H^{GP}_{(i)}$ of $H$ is defined to be

$$H^{GP}_{(i)}(x) = \frac{1}{-\log G_{(i)}(0)} \log \frac{G_{(i)}(x)}{G_{(i)}(x \wedge 0)}. \tag{11}$$

Since $G_{(i)}$ is an EV distribution it follows directly that $H^{GP}_{(i)}$ is a GP distribution. The interpretation is that if $H$ is the asymptotic conditional distribution of a $d$-dimensional random vector $(X_1, \dots, X_d)$ given that at least one of its components is large, then $H^{GP}_{(i)}$ is the asymptotic conditional distribution of $(X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_d)$ given that at least one of the components of $(X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_d)$ is large.

The expression (11) is implicit since it involves $G$. However, the GP marginal distribution can also be expressed directly in terms of the parent GP distribution $H$. For $x$ a $(d-1)$-dimensional vector write $x_{(i)}$ for the $d$-dimensional vector which is obtained from

$x$ by inserting an $\infty$ at position $i$, i.e. $x_{(i)} = (x_1, \dots, x_{i-1}, \infty, x_i, \dots, x_{d-1})$. Then,

$$H^{GP}_{(i)}(x) = \frac{H(x_{(i)}) - H((x \wedge 0)_{(i)})}{1 - H(0_{(i)})}, \tag{12}$$

which follows from (10) and (11) because $G_{(i)}(x) = G(x_{(i)})$ and $(x \wedge 0)_{(i)} \wedge 0 = x_{(i)} \wedge 0$.

Above we have formally only discussed $(d-1)$-dimensional marginal distributions of $d$-dimensional GP distributions. However, of course, for both definitions of marginal distributions, $(d-k)$-dimensional marginal distributions can be obtained by repeating $k$ times, and it is obvious that the resulting $(d-k)$-dimensional distributions do not depend on the order in which one steps down in dimension.

Acknowledgement: The authors thank the referee for very constructive comments.

5 Proofs

Proof of Theorem 2.2. i) If $F \in D(G)$ then there exist $\sigma_n > 0$, $u_n$ in $\mathbb{R}^d$ such that

$$F^n(\sigma_n x + u_n) \to G(x) \tag{13}$$

for all $x$, since $G$ is continuous. The components of the norming constants $\sigma_n > 0$, $u_n$ may be chosen as in the univariate case, where we may choose each component of $u_n$ to be non-decreasing (cf. Leadbetter et al. (1983, p. 8)). Further, making a suitably small perturbation of the $u_n$ we may, and will in the sequel, assume $u_n$ to be strictly increasing. By (13) also $F^n(\sigma_{n+1} x + u_{n+1}) \to G(x)$, and by the convergence of types theorem (Leadbetter et al. (1983, p. 7)) applied to each marginal, it follows that

$$\sigma_{n+1}/\sigma_n \to 1 \quad \text{and} \quad (u_{n+1} - u_n)/\sigma_n \to 0. \tag{14}$$

Taking logarithms, it is seen that (13) is equivalent to

$$n \bar F(\sigma_n x + u_n) \to -\log G(x). \tag{15}$$

Now, define $u$ by $u(t) = u_n$ for $t = n$ and by linear interpolation for $n < t < n+1$, and set $\sigma(u(t)) = \sigma_n$ for $n \le t < n+1$. It then follows from (14) and (15) that

$$t \bar F(\sigma(u(t)) x + u(t)) \to -\log G(x). \tag{16}$$

By straightforward argument

$$P(X_u \le x \mid X_u \nleq 0) = \frac{P(X_u \nleq x \wedge 0) - P(X_u \nleq x)}{P(X_u \nleq 0)}, \tag{17}$$

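for $x \nleq 0$. Since $P(X_u \nleq x) = \bar F(\sigma(u) x + u)$, (3) now follows from (16).

ii) Suppose that (4) holds. Since $u(t)$ is strictly increasing we may reparametrise so that $t = \inf\{s : \bar F(u(s)) \le 1/t\}$ for large $t$. Then

$$t \bar F(u(t-)) \ge 1 \ge t \bar F(u(t)).$$

Further, for any continuity point $\epsilon > 0$ of $H$, since $\sigma > 0$,

$$\limsup_{t \to \infty} \frac{\bar F(u(t-))}{\bar F(u(t))} \le \limsup_{t \to \infty} \frac{\bar F(u(t))}{\bar F(u(t) + \sigma(u(t)) \epsilon)} = \frac{1}{\bar H(\epsilon)}.$$

Since $\bar H(x)$ is a limit of distribution functions it is right continuous, and $\bar H(0) = 1$, so letting $\epsilon \downarrow 0$ through continuity points of $H$ gives that $\bar F(u(t-))/\bar F(u(t)) \to 1$, and hence

$$t \bar F(u(t)) \to 1.$$

It follows from (4) and (17) that, for $x > 0$,

$$\lim_{t \to \infty} t \bar F(u(t) + \sigma(u(t)) x) = \lim_{t \to \infty} \frac{\bar F(u(t) + \sigma(u(t)) x)}{\bar F(u(t))} = \lim_{t \to \infty} \frac{P(X_u \nleq x)}{P(X_u \nleq 0)} = \bar H(x). \tag{18}$$

We next show that (18) holds also when $x$ is not positive. From (18) it follows that $F(u(t) + \sigma(t) x)^t \to e^{-\bar H(x)}$ for $x > 0$. Further, $t \bar F(u(\lambda t) + \sigma(u(\lambda t)) x) \to \bar H(x)/\lambda$ for $\lambda > 0$, and hence $F(u(\lambda t) + \sigma(\lambda t) x)^t \to e^{-\bar H(x)/\lambda}$. By the extremal types theorem (Leadbetter et al. (1983, p. 7)) there are $c_\lambda > 0$, $x_\lambda$ with

$$\frac{\sigma(\lambda t)}{\sigma(t)} \to c_\lambda, \qquad \frac{u(\lambda t) - u(t)}{\sigma(t)} \to x_\lambda.$$

Thus, for any $x > -x_\lambda/c_\lambda$,

$$t \bar F(u(\lambda t) + \sigma(\lambda t) x) = t \bar F\left(u(t) + \sigma(t)\left(\frac{\sigma(\lambda t)}{\sigma(t)}\,x + \frac{u(\lambda t) - u(t)}{\sigma(t)}\right)\right) \to \bar H(c_\lambda x + x_\lambda).$$

This may be rephrased as

$$t \bar F(u(t) + \sigma(t) x) \to \lambda \bar H(c_\lambda x + x_\lambda).$$

Hence the limit does not depend on the choice of $\lambda$, and we may uniquely define $-\log G(x)$ as $\lambda \bar H(c_\lambda x + x_\lambda)$ for any $x > \inf_\lambda(-x_\lambda/c_\lambda)$ to obtain that for such $x$

$$t \bar F(u(t) + \sigma(t) x) \to -\log G(x). \tag{19}$$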

Suppose that one coordinate in $x$ is less than the corresponding coordinate of $\inf_\lambda(-x_\lambda/c_\lambda)$. Then if we let $x_\infty$ be the vector which has all other components set to $\infty$, and $x_0$ the vector where all other components are set to $\infty$ and this coordinate is set to 0, we have that

$$\liminf_{t \to \infty} t \bar F(u(t) + \sigma(t) x) \ge \liminf_{t \to \infty} t \bar F(u(t) + \sigma(t) x_\infty) \ge \lambda \bar H(x_0),$$

for any $\lambda > 1$. Hence $t \bar F(u(t) + \sigma(t) x) \to \infty$ for such $x$, and thus if we define $-\log G(x) = \infty$ for $x$ which are not greater than $\inf_\lambda(-x_\lambda/c_\lambda)$ then (19) holds for all $x$. Thus, $F(u(t) + \sigma(t) x)^t \to G(x)$ for all $x$, and hence $G(x)$ is a multivariate extreme value distribution, and it follows from the first part of the theorem that (5) holds, since $G(0) = e^{-\bar H(0)} = e^{-1}$.

Proof of Theorem 2.3. i) Let $X$ have distribution $H$. By definition, $H$ is of the form (2) for some extreme value distribution $G$, so that

$$P(X \nleq x) = 1 - \frac{1}{-\log G(0)} \log \frac{G(x)}{G(x \wedge 0)} = \frac{1}{\log G(0)} \log \frac{G(x) G(0)}{G(x \wedge 0)}.$$

Since $G$ is max-stable, there exist continuous curves $\sigma(t) > 0$, $u(t)$ with $\sigma(1) = 1$, $u(1) = 0$ and $u(t)$ strictly increasing, such that $G(u(t) + \sigma(t) x)^t = G(x)$. In particular, $G(u(t)) = G(0)^{1/t}$. Further, by (17) and the expression for $P(X \nleq x)$ above,

$$P(X_u \le x \mid X_u \nleq 0) = \frac{1}{-\log G(u(t))} \log \frac{G(u(t) + \sigma(t) x)\, G((u(t) + \sigma(t)(x \wedge 0)) \wedge 0)}{G(u(t) + \sigma(t)(x \wedge 0))\, G((u(t) + \sigma(t) x) \wedge 0)}.$$

It can be seen that $(u(t) + \sigma(t)(x \wedge 0)) \wedge 0 = (u(t) + \sigma(t) x) \wedge 0$, and hence

$$P(X_u \le x \mid X_u \nleq 0) = \frac{t}{-\log G(0)} \log \frac{G(u(t) + \sigma(t) x)}{G(u(t) + \sigma(t)(x \wedge 0))} = \frac{1}{-\log G(0)} \log \frac{G(u(t) + \sigma(t) x)^t}{G(u(t) + \sigma(t)(x \wedge 0))^t} = \frac{1}{-\log G(0)} \log \frac{G(x)}{G(x \wedge 0)} = P(X \le x). \tag{20}$$

ii) This is an easy consequence of Theorem 2.2 ii).

Proof of Theorem 2.4. i) By Theorem 2.2 we have that $F \in D(G)$, and hence there are constants $u_n$, $\sigma_n$ such that $n P\big((X_i - u_n)/\sigma_n \nleq x\big) \to -\log G(x)$ on $S$. The conclusion of i) then follows from Theorem 3.21 of Resnick (1987).

ii) Again by Theorem 3.21 of Resnick (1987), it follows that $n P\big((X_i - u_n)/\sigma_n \nleq x\big) \to \mu(-\infty, x]^c$ for $x > 0$. However, this is just a different way of writing (4) of Theorem 2.2, and hence condition ii) of that theorem is satisfied, and the result follows.

References

[1] Arnold, B.C. (1983) Pareto Distributions. International Co-operative Publishing House.
[2] Balkema, A.A. and de Haan, L. (1974) Residual life time at high age. Ann. Probab. 2, 792-804.
[3] Barnett, V. (1976) The ordering of multivariate data (with discussion). J. R. Statist. Soc. A 139, 318-354.
[4] Beirlant, J., Segers, J. and Teugels, J. (2005) Statistics of extremes, theory and applications. Wiley, Chichester.
[5] Coles, S.G. (2001) An Introduction to Statistical Modeling of Extreme Values. Springer, London.
[6] Coles, S.G. and Tawn, J.A. (1991) Modelling multivariate extreme events. J. R. Statist. Soc. B 53, 377-392.
[7] Davison, A.C. and Smith, R.L. (1990) Models for exceedances over high thresholds. J. R. Statist. Soc. B 52, 393-442.
[8] Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997) Modelling extremal events. Springer, New York.
[9] Falk, M. and Reiss, R.D. (2001) Estimation of canonical dependence parameters in a class of bivariate peaks-over-threshold models. Statist. Probab. Lett. 52, 233-242.
[10] Falk, M. and Reiss, R.D. (2002) A characterization of the rate of convergence in bivariate extreme value models. Statist. Probab. Lett. 59, 341-351.
[11] Falk, M. and Reiss, R.D. (2003) Efficient estimation of the canonical dependence function. Extremes 6, 61-82.
[12] Falk, M. and Reiss, R.D. (2003) Efficient estimators and LAN in canonical bivariate POT models. Journal of Multivariate Analysis 84, 190-207.
[13] Falk, M. and Reiss, R.D. (2005) On the distribution of Pickands coordinates in bivariate EV and GP models. Journal of Multivariate Analysis 93, 267-295.
[14] Finkenstädt, B. and Rootzén, H., editors (2004) Extreme values in finance, telecommunications and the environment. Chapman & Hall/CRC, Boca Raton.
[15] Joe, H., Smith, R.L. and Weissman, I. (1992) Bivariate Threshold Methods for Extremes. J. R. Statist. Soc. B 54, 171-183.

[16] Kotz, S. and Nadarajah, S. (2000) Extreme value distributions: theory and applications. Imperial College Press, London.
[17] Kowaka, M. (1994) An Introduction to Life Prediction of Plant Materials. Application of Extreme Value Statistical Methods for Corrosion Analysis. Allerton Press, New York.
[18] Leadbetter, M.R., Lindgren, G. and Rootzén, H. (1983) Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag, Berlin.
[19] Ledford, A.W. and Tawn, J.A. (1996) Statistics for near independence in multivariate extreme values. Biometrika 83, 169-187.
[20] Pickands, J. III (1975) Statistical inference using extreme order statistics. Ann. Statist. 3, 119-131.
[21] Pickands, J. (1981) Multivariate extreme value distributions. Proc. 43rd Session I.S.I., 859-878.
[22] Reiss, R. and Thomas, M. (2005) Statistical Analysis of Extreme Values (for Insurance, Finance, Hydrology and Other Fields). 3rd revised edition, Birkhäuser, Basel.
[23] Resnick, S.I. (1987) Extreme values, Regular Variation and Point Processes. Springer-Verlag, Berlin.
[24] Smith, R.L. (1985) Statistics of extreme values. Proc. 45th Session I.S.I., Paper 26.1, Amsterdam.
[25] Smith, R.L. (1987) Estimating tails of probability distributions. Ann. Statist. 15, 1174-1207.
[26] Smith, R.L., Tawn, J.A. and Yuen, H.K. (1990) Statistics of multivariate extremes. Int. Statist. Inst. Rev. 58, 47-58.
[27] Smith, R.L., Tawn, J.A. and Coles, S.G. (1997) Markov chain models for threshold exceedances. Biometrika 84, 249-268.
[28] Tajvidi, N. (1996) Multivariate generalized Pareto distributions. Article in PhD thesis, Department of Mathematics, Chalmers, Göteborg.
[29] Tawn, J.A. (1988) Bivariate extreme value theory: Models and estimation. Biometrika 75, 397-415.
[30] Tawn, J.A. (1990) Modelling multivariate extreme value distributions. Biometrika 77, 245-253.