Inference for clusters of extreme values

Size: px

Start display at page:

Download "Inference for clusters of extreme values"

Stephany Bailey
5 years ago
Views:

1 J. R. Statist. Soc. B (2003) 65, Part 2, pp Inference for clusters of extreme values Christopher A. T. Ferro University of Reading, UK and Johan Segers EURANDOM, Eindhoven, the Netherlands [Received May Final revision October 2002] Summary. Inference for clusters of extreme values of a time series typically requires the identification of independent clusters of exceedances over a high threshold. The choice of declustering scheme often has a significant effect on estimates of cluster characteristics. We propose an automatic declustering scheme that is justified by an asymptotic result for the times between threshold exceedances. The scheme relies on the extremal index, which we show may be estimated before declustering, and supports a bootstrap procedure for assessing the variability of estimates. Keywords: Bootstrap; Declustering; Extremal index; Extreme values; Interexceedance times 1. Introduction Let {ξ n } n 1 be a strictly stationary sequence of random variables with marginal distribution function F, finite or infinite right end point ω = sup{x : F.x/ < 1} and tail function F = 1 F. For integers 0 k<land n 1, put M k,l = max{ξ i : i = k + 1,...,l} and M n = M 0,n.The process {ξ n } n 1 is said to have extremal index θ [0, 1] if for each τ > 0 there is a sequence {u n } n 1 such that, as n, (a) nf.u n / τ and (b) P.M n u n / exp. θτ/; see Leadbetter et al. (1983). If θ = 1 then exceedances of an increasing threshold occur singly in the limit; if θ < 1 then exceedances tend to cluster in the limit. We consider the problem of making inferences about characteristics of such clusters. The extremal index is one such characteristic, which has an interpretation due to Leadbetter (1983) as the reciprocal of the mean cluster size. Theorem 4.5 of Hsing (1987) shows that clusters of exceedances may be considered independent in the limit. Consequently, a common approach to inference is to identify independent clusters of exceedances above a high threshold, to evaluate for each cluster the characteristic of interest and to form estimates from these values. The methods that are used to identify clusters define different estimators, the two most common being blocks and runs declustering (Leadbetter et al., 1989). Runs declustering, for example, assumes that exceedances belong to the same cluster if they are separated by fewer than a certain number, the run length, of values Address for correspondence: Christopher A. T. Ferro, Department of Meteorology, University of Reading, Earley Gate, PO Box 243, Reading, RG6 6BB, UK. c.a.t.ferro@reading.ac.uk 2003 Royal Statistical Society /03/65545

2 546 C. A. T. Ferro and J. Segers below the threshold. Hsing (1991) pointed out that a problem with these estimators is the selection of the declustering parameters, which is largely arbitrary: the choice of run length usually has a significant influence on the estimate of the cluster characteristic. Estimation of the extremal index has been developed by Smith and Weissman (1994) and Weissman and Novak (1998) among others; for an application in finance see Longin (2000). Examples of other cluster characteristics are the cluster maximum, which is the focus of peaks-over-threshold modelling (Davison and Smith, 1990), and the excess height statistic used by Leadbetter (1995) to monitor ozone levels. We investigate the point process of exceedance times (Hsing et al., 1988) and find that the asymptotic distribution of the interexceedance times belongs to a one-dimensional parametric family of distributions indexed by the extremal index. This result provides a limiting argument for one particular declustering scheme, characterized by the extremal index. We can estimate the extremal index without declustering by equating theoretical moments of the limiting distribution to their empirical counterparts. In this way we define an automatic declustering scheme that does not require a subjective choice of auxiliary parameter. Furthermore, the declustering scheme supports a bootstrap procedure for obtaining confidence intervals on estimates of cluster characteristics that accounts for the uncertainty in the scheme s estimation. We derive the asymptotic distribution of times between threshold exceedances in Section 2 and present our estimator for the extremal index in Section 3. In Section 4 we define the declustering scheme and bootstrap procedure. We investigate the performance of our extremal index estimator with a simulation study in Section 5, and we conclude in Section 6 with an application to a series of daily minimum temperatures recorded at Wooster, Ohio. 2. Interexceedance times The statistical development in this paper rests on the limiting distribution of the times between exceedances of a threshold u by the process {ξ n } n 1. Let T.u/ be a random variable equal in distribution to i.e. or, alternatively, min{n 1:ξ n+1 >u} given ξ 1 >u, P{T.u/ = n} = P.M 1,n u, ξ n+1 >u ξ 1 >u/ for n 1, P{T.u/ > n} = P.M 1,n+1 u ξ 1 >u/ for n 1: We compute the asymptotic distribution of T.u/. The case of independent random variables {ξ n } n 1 is straightforward. Clearly so that, for x>0, P{T.u/ > n} = F.u/ n for n 1, P{ F.u/ T.u/ > x} = P{T.u/ > x= F.u/ } = exp[ x= F.u/ log{f.u/}], where x denotes the integer part of x.iff has no atom at its end point ω then, since log.1 + "/ " as " 0, we have lim [P{ F.u/ T.u/ > x}] = exp. x/ for x>0: u ω

3 Clusters of Extreme Values 547 So F.u/ T.u/ is asymptotically standard exponentially distributed. This agrees with the result (Hsing et al., 1988) that the point process of exceedance times has a Poisson process limit. Now consider the general case with extremal index θ [0, 1]. The corresponding point process limit for the exceedance times is compound Poisson (Hsing et al., 1988). This leads us to expect that the limit distribution of the interexceedance times will be a mixture of an exponential distribution and a point mass on zero. This is indeed the case, as described in theorem 1 below. For real u and integers 1 k l, let F k,l.u/ be the σ-field generated by the events {ξ i >u}, k i l. Define the mixing coefficients α n,q.u/ = max sup P.B A/ P.B/, 1 k n q where the supremum is over all A F 1,k.u/ with P.A/ > 0 and all B F k+q,n.u/. Theorem 1. Let the positive integers r n and the thresholds u n,forn 1, be such that and r n, r n F.u n / τ P.M rn u n / exp. θτ/, for some τ.0, / and θ [0, 1]. If there are positive integers q n = o.r n / such that α crn,q n.u n / = o.1/ for all c>0, then P{ F.u n /T.u n />t} θ exp. θt/ for t>0: The proof is contained in Appendix A. Note that the mixing condition that is used here is similar to that of Weissman and Novak (1998). If we denote convergence in distribution by d then theorem 1 says that F.u/ T.u/ d T θ as u ω,.1/ where T θ is a random variable distributed according to the mixture distribution.1 θ/" 0 + θµ θ,.2/ " 0 is the degenerate probability distribution at 0 and µ θ is the exponential distribution with mean θ 1. Note the double role of the extremal index: θ is both the proportion of non-zero interexceedance times and the reciprocal of the mean of the non-zero interexceedance times. These features are evident in Fig. 1, in which normalized interexceedance times are plotted against standard exponential quantiles for two sets of data: one simulated and one observed. The simulated data are the largest 1000 values in a sequence of length generated from a max-autoregressive process with extremal index θ = 0:5 (see expression (6) in Section 5); the observed data are the largest 85 values in the temperature series that is analysed in Section Estimation of the extremal index In this section we describe an estimator for the extremal index that is based on the limit result (1). Suppose that we have a sample ξ 1,...,ξ n and a high threshold u. Let N = N n.u/ = n I.ξ i >u/

4 548 C. A. T. Ferro and J. Segers Interexceedance Times Standard Exponential Quantiles (a) Interexceedance Times Standard Exponential Quantiles Fig. 1. Quantile quantile plots of normalized interexceedance times against standard exponential quantiles for (a) simulated data and (b) observed data (j,.1 θ/-quantile; the sloping line has gradient θ 1 ): the true value, θ D 0:5, is used for the simulated data; an estimate, θ D 0:68, from the estimator developed in Section 3.1 is used for the observed data (b) be the number of observations exceeding u, and let 1 S 1 <... <S N n be the exceedance times. The observed interexceedance times are T i = S i+1 S i for i = 1,...,N Intervals estimator The second moment of T θ is E.Tθ 2 / = 2=θ, which we estimate to obtain a first estimator for θ. Let F n.u/ be an estimator for F.u/. Then For example, if F n.u/ = N=n, then we have 2.N 1/ θ n.u/ = F n.u/ 2 N 1 T 2 i θ n.u/ = 2n2.N 1/ : N 2 N 1 We can improve on this, however. The first moment of T θ is1soθ is related to the coefficient of variation, ν, of the interexceedance times by T 2 i 1 + ν 2 = E.T 2 θ /=E.T θ/ 2 = 2θ 1 : In particular, the interexceedance times are overdispersed, with clustering in the limit, if and only if θ < 1. This relationship motivates another estimator for θ, ˆθ n.u/ = 2 ( N 1 T i ) 2.N 1/ N 1 where we do not need to estimate F.u/. A further improvement is possible if we consider a penultimate approximation to the limit- :, Ti 2

5 Clusters of Extreme Values 549 ing mixture distribution (2). If we set r n = n in the proof of theorem 1, then we see that the distribution of the interexceedance times satisfies P{T.u n />n} = θ F.u n / nθ + o.1/: We can derive an estimator for θ based on this relationship. Let T denote a random variable on the positive integers whose distribution is given by P.T>n/= θp nθ for n 1,.3/ where θ.0, 1] and p.0, 1/ may be thought of as F.u n /. Note that 2 E.T/ 2 E.T 2 / = 2{1.1 θ/p θ } 2 2θp θ + θp θ.1 p θ / +.1 p θ / 2 ( = θ + θ 2 3θ ).1 p/ + O{.1 p/ 2 } as p 1, 2 which follows from E.T 1/ = P.T>n/= θp θ.1 p θ / 1, n=1 { } T.T 1/ E = np.t>n/= θp θ.1 p θ / 2 : 2 n=1 Therefore, the first-order bias of ˆθ n.u/ at a threshold u is θ.2 3θ=2/ F.u/. In contrast, the relationship 2{E.T 1/} 2 E{.T 1/.T 2/} = θ.4/ motivates the estimator { } 2 N 1 2.T i 1/ ˆθ Å n.u/ =,.N 1/ N 1.T i 1/.T i 2/ whose first-order bias is 0. Compared with ˆθ n.u/, the estimators for the first and second moments from which ˆθ Å n.u/ is constructed have been shrunk towards 0. This is intuitively appealing. The smallest observed interexceedance times are positive, whereas the limiting distribution (2) models them as 0. Estimator ˆθ Å n.u/ ensures that the contributions from the smallest interexceedance times are indeed 0; the larger interexceedance times are relatively unaffected. Finally, we note that ˆθ Å n.u/ can take values that are greater than 1 and is undefined if the largest interexceedance time is no greater than 2. To avoid these cases, we define { 1 ˆθ θ n.u/ = n.u/ if max{t i : 1 i N 1} 2, 1 ˆθ Å n.u/ if max{t.5/ i : 1 i N 1} > 2, which we call the intervals estimator for the extremal index. Consistency of the intervals estimator for m-dependent processes is stated in the following theorem, the proof of which may be found in Ferro and Segers (2002). Recall that, for a positive integer m, the sequence {ξ n } n 1 is m dependent if, for all positive integers k, the σ-fields σ.ξ i :1 i k/ and σ.ξ i : i k + m/ are independent.

6 550 C. A. T. Ferro and J. Segers Theorem 2. Let the positive integers r n and the thresholds u n,forn 1, be such that and r n = o.n/, r n, r n F.u n / τ P.M rn u n / exp. θτ/, for some τ.0, / and θ.0, 1]. If {ξ n } n 1 is m dependent, then θ n.u n / p θ: 3.2. Maximum likelihood estimators We have described two models for the interexceedance times: the limiting form (2) and the penultimate approximation (3). In this section we examine the possibility of using these models to construct maximum likelihood estimators for the extremal index. This contrasts with the moment-based intervals estimator (5). We shall construct the likelihoods under the assumption that the interexceedance times are independent since we have models for their marginal distribution only. If we write t i = NT i =n then the log-likelihood from the limiting model (2) is N 1 log[.1 θ/ I.ti=0/ {θ 2 exp. θt i /} I.ti>0/ ] = 2.N 1/ log.θ/ θ N 1 t i, 0 θ 1, since the observed interexceedance times are always strictly positive. The maximum likelihood estimator for θ is min.1, 2= t/, where t is the mean of the t i. This estimator tends in distribution to1asn increases: a failure arising from the model assigning all the interexceedance times to the exponential component of the mixture distribution. It is possible to circumvent this problem by grouping a certain number of the smallest interexceedance times and treating them as though they were equal to 0. This solution requires the choice of an auxiliary parameter, however. The choice is arbitrary and can have a significant effect on estimates so we are no better off than if we had used runs or blocks declustering. The log-likelihood from the penultimate model (3) is m 1 log.1 θp θ / + {log.θ/ + log.1 p θ /}.N 1 m 1 / + θ log.p/ N 1.T i 1/, where m 1 is the number of interexceedance times that are equal to 1. Maximum likelihood estimates may be found by numerical optimization; unfortunately, a combination of two features can cause them to perform poorly. First, the distribution (3), from which the likelihood is constructed, is in general not a good model for the smallest interexceedance times. In particular, P.T > 1/ = θp θ converges to θ as p 1, where θ arises as the limit of P.M 1,m u ξ 1 >u/as u ω and m at an appropriate rate (O Brien, 1987). In contrast, P{T.u/ > 1} = P.ξ 2 u ξ 1 >u/is larger than P.M 1,m u ξ 1 >u/and need not converge to θ. For example, the doubly stochastic process of Smith and Weissman (1994) with parameters 0 < ψ < 1 and 0 < η < 1 has lim u ω [P{T.u/ > 1}] = 1 ηψ.1 ψ/=.1 ψ + ψη/ = θ. The second, troublesome, feature is the sensitivity of the log-likelihood to m 1, which is clearer after not-

7 Clusters of Extreme Values 551 ing that Σ N 1.T i 1/ = S N S 1.N 1/ is asymptotically equivalent to n as N and N=n 0. Our previous comments warn that the observed value of m 1 may be far from the value that is expected under the model, in which case poor estimates result. Grouping small interexceedance times and modelling only the larger times is again a possible but unattractive solution. In contrast with the problems that are encountered with likelihood estimation, the intervals estimator (5) is quite robust to deviations of the smallest interexceedance times from the model postulated. This is due to relationship (4) in which the numerator and denominator both tend to as p 1, so the smallest interexceedance times play no role asymptotically and the estimator depends most heavily on the well-modelled large interexceedance times. 4. Declustering and bootstrapping Although we have shown how to estimate the extremal index without recourse to declustering, estimating other cluster characteristics still may require clusters to be identified. All declustering schemes proposed in the literature require an auxiliary parameter, the choice of which is largely arbitrary. In this section we explain how the limiting distribution (2) may be used to identify clusters without making an arbitrary choice. As we shall see, the limit theory also supports a bootstrap procedure for assessing estimation uncertainty; we describe a procedure to compute confidence limits for estimates of general cluster characteristics. Importantly, this procedure accounts for the uncertainty in the choice of declustering scheme. As the limiting process of exceedance times is a compound Poisson process, we may categorize interexceedance times into two types: independent intercluster times (between clusters) and independent sets of intracluster times (within clusters). As mentioned at the end of Section 2, the extremal index is the proportion of interexceedance times that may be regarded as intercluster times. Suppose that we observe N exceedance times, S 1 <... <S N, with interexceedance times T i = S i+1 S i for i = 1,...,N 1. Then we can assume that the largest C 1 = θn inter- exceedance times are approximately independent intercluster times that divide the remainder into approximately independent sets of intracluster times. To be precise, if T.C/ is the Cth largest interexceedance time and T ij is the jth interexceedance time to exceed T.C/, then {T ij } C 1 j=1 is a set of approximately independent intercluster times. In the case of ties, decrease C until T.C 1/ is strictly greater than T.C/. Let T j = {T ij 1 +1,...,T ij 1}, where i 0 = 0, i C = N and T j = if i j = i j Then {T j } C j=1 is a collection of approximately independent sets of intracluster times. Furthermore, each set T j has associated with it a set of threshold exceedances C j = {ξ k : k S j }, where S j = {S ij 1 +1,...,S ij }. This interpretation justifies a decomposition of the observed process into C clusters, where the jth cluster comprises the exceedances C j. This is equivalent to runs declustering with run length T.C/. In practice, C is defined by replacing θ with an estimate ˆθ. Any estimator for θ may be used, but if we employ the intervals estimator (5) then we have an entirely automatic declustering procedure, justified by asymptotic theory. Suppose now that we are interested in making inferences about a cluster functional H. This could be the excess height statistic of Leadbetter (1995) for example. We can evaluate the functional for each cluster to obtain values {H j } C j=1 that may be used to find estimates H of properties of H. For example, we might estimate the expectation of H by H = C 1.H H C /. We can use the bootstrap to obtain confidence limits on H and ˆθ. Given the decomposition of the process of exceedances described above, we recommend the following procedure. (a) Resample with replacement C 1 intercluster times from {T ij } C 1 j=1.

8 552 C. A. T. Ferro and J. Segers (b) Resample with replacement C sets of intracluster times, some of which may be empty, and associated exceedances from {.T j, C j /} C j=1. (c) Intercalate these interexceedance times and clusters to form a bootstrap replication of the process. (d) Compute N for the bootstrap replication, estimate θ and decluster accordingly. (e) Compute H from the declustered bootstrap replication. Forming B such bootstrap processes yields collections of estimates, { ˆθ.1/,..., ˆθ.B/ } and { H.1/,..., H.B/ }, that may be used to approximate the distributions of the original point estimates. In particular, the empirical α- and.1 α/-quantiles of each sample define.1 2α/- confidence intervals. We demonstrate this procedure for the data set in Section 6 and make use of it in the simulation study of the following section. 5. Simulation study In this section we investigate the performance of the intervals estimator (5) for the extremal index and compare it with the performance of the runs estimator, which may be written in terms of the interexceedance times as { } ˆθ n.u; r/ = N N 1 I.T i >r/ when the run length is r. We simulate data from two stationary processes: a max-autoregressive process and a first-order Markov chain with extreme value transition distribution. Choose θ.0, 1]. Let W n,forn 1, be independent unit Fréchet random variables and put ξ 1 = W 1 =θ,.6/ ξ n = max{.1 θ/ξ n 1, W n } for n 2: Then {ξ n } n 1 is a max-autoregressive process with extremal index θ. For the Markov chain, let β.0, 1] and P.ξ 1 x 1, ξ 2 x 2 / = exp{.x 1=β 1 + x 1=β 2 / β }: This is a bivariate extreme value distribution with symmetric logistic dependence structure; the extremal index of {ξ n } n 1 for specific values of β may be found by simulation (Smith et al., 1997). We simulate 1000 sequences of length 5000 from both of these processes for each of three extremal indices, 0.25, 0.5 and 0.75, corresponding to β = 0:43, 0:64, 0.82 for the Markov chain. For each sequence, we compute the intervals estimator and three runs estimators, with run lengths 1, 5 and 9, at a range of thresholds chosen so that there are N exceedances, with N ranging from 10 to We report the root-mean-square error (RMSE) of the point estimates and the coverage probability of bootstrapped confidence intervals, computed with B = 1000, in each case. Fig. 2 shows the empirical RMSEs and coverage probabilities for the four estimators applied to the three max-autoregressive processes. The sensitivity of the runs estimator to run length, threshold and the true extremal index is clear; the intervals estimator is robust to these. This robustness is particularly evident at lower thresholds, where the coverage probability for the intervals estimator remains close to the nominal value. The cause of the poor coverage for the runs estimator may be traced to step (d) of the bootstrap algorithm: the run length is fixed so that any uncertainty in the declustering scheme is unaccounted for, resulting in confidence intervals

9 Clusters of Extreme Values 553 that are too narrow. The apparently uniform decrease of RMSE with threshold for the intervals estimator is only indicative of the fact that the optimal threshold for the intervals estimator is typically lower than that for the runs estimator. The reason for this is the greater sensitivity of the runs estimator to disproportionately large numbers of small interexceedance times, which enhances the bias at low thresholds. As we commented in Section 3.2, the intervals estimator is robust to such discrepancies. However, the runs estimators appear to outperform the intervals estimator for high thresholds. Similar results, not shown, were obtained for the Markov chains, except that not every choice of run length for the runs estimator led to a performance that was superior at high thresholds to that of the intervals estimator. 6. Data example We conclude with an application of the intervals estimator and intervals declustering to a time series of negated daily minimum temperatures, recorded to the nearest degree Fahrenheit, at Wooster, Ohio. See Smith et al. (1997) for a description of the data and additional analy- Fig. 2. (a) RMSEs and (b) coverage probabilities for the intervals estimator ( ) and runs estimators with run lengths 1 ( ), 5 (... )and9(- - - ) applied to max-autoregressive sequences of length 5000 with extremal indices 0.25 (top), 0.5 (middle) and 0.75 (bottom) (the nominal coverage probability (0.95) is indicated by a horizontal line)

10 554 C. A. T. Ferro and J. Segers sis. We shall estimate the extremal index and mean cluster excess of the series at thresholds u = 0:5, 1:5,...,14:5, where the excess of a cluster {ξ k : k S} of exceedances over a threshold u is defined by Σ k S.ξ k u/ and is an indicator for the severity of a cold period. The intervals estimates of the extremal index are shown in Fig. 3(a) with bootstrap 95% confidence limits. For comparison, runs estimates with run length 5 days are shown in Fig. 3(b). As expected, the variance of the intervals estimator is greater than that of the runs estimator, which does not account for uncertainty in the run length. Both plots support an extremal index of about 0.6, which is similar to the values found by Smith et al. (1997). The estimates of the mean cluster excess are presented in Fig. 3(c). The results are not clear cut although the graph appears to level out at a threshold of 10 F and a mean cluster excess of about 6 F. Finally, returning to the bootstrap procedure of Section 4, notice that each bootstrapped process has an associated run length r.b/ = T.C.b/ /, where C.b/ = ˆθ.b/ N.b/ and N.b/ is the number of exceedances in the process. At each threshold, the collection {r.b/ } B b=1 describes the uncertainty in the declustering scheme and indicates the extent of the uncertainty ignored by using runs declustering with an arbitrary run length. Fig. 3(d) summarizes the run length distribution. A run length of about 1 week is suggested with the interpretation that cold periods separated by more than 1 week can be considered independent. Acknowledgements We thank Jonathan Tawn for helpful comments and acknowledge the financial support of the Fig. 3. Estimates (ı) of cluster characteristics for the Wooster temperature series: (a) intervals estimator and (b) runs estimator with run length 5 for the extremal index and (c) mean cluster excess and (d) run length ( , bootstrapped 95% confidence limits; the number of exceedances at each threshold is indicated on the upper axes)

11 Clusters of Extreme Values 555 Engineering and Physical Sciences Research Council (CATF), the Fund for Scientific Research, Flanders (JS), the Department of Mathematics and Statistics at Lancaster University and the Bernoulli Society for subsidizing our attendance at the Gothenburg meeting (December 10th 16th, 2001) of the Séminaires Européens de Statistiques. Appendix A: Proof of theorem 1 Since both q n = o.r n / and α n = α rn,qn.u n / = o.1/, we can find positive integers p n such that p n = o.r n /, r n α n = o.p n / and q n = o.p n /: (Take for instance p n = {r n max.q n, r n α n /} 1=2.) Since F.u n / rn exp. τ/, all conditions of corollary 2.3 of O Brien (1987) are fulfilled. We derive P.M rn u n / F.u n / rnp.m 1, pn un ξ 1>u n/ 0: But P.M rn u n / exp. θτ/ by assumption, so necessarily P.M 1,pn u n ξ 1 >u n / θ: Now we can proceed with the proof of the stated limit relationship itself. Put k n = t= F.u n /. Wehave P{ F.u n /T.u n />t} = P{T.u n />k n } = P.M 1,1+kn u n ξ 1 >u n /: For n sufficiently large that q n <p n,wehave so P.M pn,pn+q n >u n ξ 1 >u n / P.M pn,pn+q n >u n / + α n q n F.u n / + α n 0, P.M 1,1+kn u n ξ 1 >u n / = P.M 1,pn u n, M pn+q n,k n u n ξ 1 >u n / + o.1/: But k n tτ 1 r n = O.r n /, so by assumption P.M 1,pn u n, M pn+q n,k n u n ξ 1 >u n / = P.M pn+q n,k n u n M 1,pn u n, ξ 1 >u n /P.M 1,pn u n ξ 1 >u n / = {P.M pn+q n,k n u n / + o.1/}{θ + o.1/}: Now also s n = k n.p n + q n / tτ 1 r n = O.r n /. Apply corollary 2.3 of O Brien (1987) again to deduce that P.M pn+q n,k n u n / = P.M sn u n / = F.u n / snp.m 1, pn un ξ 1>u n/ + o.1/ exp. θt/: References Davison, A. C. and Smith, R. L. (1990) Models for exceedances over high thresholds (with discussion). J. R. Statist. Soc. B, 52, Ferro, C. A. T. and Segers, J. (2002) Automatic declustering of extreme values via an estimator for the extremal index. Technical Report EURANDOM, Eindhoven. (Available from Hsing, T. (1987) On the characterization of certain point processes. Stochast. Process. Applic., 26, Hsing, T. (1991) Estimating the parameters of rare events. Stochast. Process. Applic., 37, Hsing, T., Hüsler, J. and Leadbetter, M. R. (1988) On the exceedance point process for a stationary sequence. Probab. Theory Reltd Flds, 78, Leadbetter, M. R. (1983) Extremes and local dependence in stationary sequences. Z. Wahrsch. Ver. Geb., 65, Leadbetter, M. R. (1995) On high-level exceedance modeling and tail-inference. J. Statist. Planng Inf., 45, Leadbetter, M. R., Lindgren, G. and Rootzén, H. (1983) Extremes and Related Properties of Random Sequences and Processes, ch. 3. New York: Springer. Leadbetter, M. R., Weissman, I., de Haan, L. and Rootzén, H. (1989) On clustering of high values in statistically stationary series. Technical Report 253. Center for Stochastic Processes, University of North Carolina, Chapel Hill.

12 556 C. A. T. Ferro and J. Segers Longin, F. M. (2000) From value at risk to stress testing: the extreme value approach. J. Bank. Finan., 24, O Brien, G. L. (1987) Extreme values for stationary and Markov sequences. Ann. Probab., 15, Smith, R. L., Tawn, J. A. and Coles, S. G. (1997) Markov chain models for threshold exceedances. Biometrika, 84, Smith, R. L. and Weissman, I. (1994) Estimating the extremal index. J. R. Statist. Soc. B, 56, Weissman, I. and Novak, S. Yu. (1998) On blocks and runs estimators of the extremal index. J. Statist. Planng Inf., 66,

Bayesian Inference for Clustered Extremes

Bayesian Inference for Clustered Extremes Newcastle University, Newcastle-upon-Tyne, U.K. lee.fawcett@ncl.ac.uk 20th TIES Conference: Bologna, Italy, July 2009 Structure of this talk 1. Motivation and background 2. Review of existing methods Limitations/difficulties