On the Estimation and Application of Max-Stable Processes

Zhengjun Zhang
Department of Mathematics, Washington University, Saint Louis, MO, USA

Richard L. Smith
Department of Statistics, University of North Carolina, Chapel Hill, NC, USA

January 11, 2004

Abstract

Modeling extreme observations in multivariate time series is a difficult exercise. Classical treatments of multivariate extremes use certain multivariate extreme value distributions to model the dependencies between components. An alternative approach based on multivariate max-stable processes enables the simultaneous modeling of dependence within and between time series. We propose a specific class of max-stable processes, known as multivariate maxima of moving maxima (M4 processes for short), and present procedures to estimate their coefficients. To illustrate the methods, some examples are given for modeling jumps in returns in multivariate financial time series. We introduce a new measure to quantify and predict the extreme co-movements in price returns.

Keywords: multivariate extremes, multivariate maxima of moving maxima, extreme value distribution, empirical distribution, estimation, extreme dependence, extreme co-movement

1 Introduction

Why do we study max-stable process modeling? This question can be answered from several angles concerning the theory and modeling of extreme events. In real data analysis, studies have shown that time series data from finance, insurance, the environment, etc. are fat-tailed and clustered when extremal events occur. Financial statisticians widely believe that multivariate normal models are not sufficient for modeling large financial returns and computing value at risk (VaR). Some of the main purposes of studying extreme events are to understand extremal financial risk and extremal co-movements of financial assets. Fitting a single distribution is not enough to characterize those extremal co-movements; stochastic processes that characterize the large jumps and extremal co-movements are needed.

In modeling multivariate extremes, especially extremal observations in multivariate time series, several important issues remain to be solved. These issues include model applicability, tractability of model parameters, etc. Besides these issues, the theory and methodology of extreme value theory itself need improvement.

Among traditional approaches to studying extremes, univariate extreme value theory studies the limiting distribution of the maxima or the minima of a sequence of random variables. There are well-developed approaches to modeling univariate extremal processes. Galambos (1987), Leadbetter, Lindgren and Rootzén (1983), and Resnick (1987) are excellent reference books on the probabilistic side. Coles (2001) and Smith (2003) have reviewed statistical methodology for extremes. Embrechts, Klüppelberg, and Mikosch (1997) give an excellent viewpoint on modeling extremal events. These references and others are important sources for applying extreme value theory, but problems concerning the environment, finance, insurance, etc. are multivariate in character: for example, floods may occur at different sites along a coastline; the failure of a portfolio may be caused by a single extremal price movement or by multiple movements. Here multivariate extreme modeling is essential for risk management and for the precision of modeling.

In the multivariate context, the maximum is taken componentwise and there is no unified parametric type of limiting distribution. However, there have been many characterizations of the possible limits, such as de Haan and Resnick (1977), de Haan (1985), and Resnick's (1987) point process approach, and Pickands's (1981) representation theorem for multivariate extreme value distributions with unit exponential margins. If a limiting multivariate extreme value distribution exists, its marginal distributions must be one of the three types of univariate extreme value distributions if they are non-degenerate. Many different models are reviewed by Coles and Tawn (1991, 1994).

In many circumstances, extremal observations appear to be clustered in time. For example, large price movements in the stock market, large insurance claims after a disaster event, heavy rainfalls, etc. may persist over several observations. Neither univariate nor multivariate extreme value theory is adequate to describe this kind of clustering of extreme events in a time series. Max-stable processes, introduced by de Haan (1984), are an infinite-dimensional generalization of extreme value theory which does have the potential to describe clustering behavior. The limiting distributions of univariate and multivariate extreme value theory are max-stable, as shown by Leadbetter et al. (1983) in the univariate case and Resnick (1987) in the multivariate case. One of the most important features of max-stable processes is that they model not only the cross-sectional dependence but also the dependence across time.

Parametric models for max-stable processes have been considered since the 1980s. Deheuvels

(1983) defines the moving minimum (MM) process. Davis and Resnick (1989) study what they call the max-autoregressive moving average (MARMA) process of a stationary process. For prediction, see also Davis and Resnick (1993). Recently, Hall, Peng, and Yao (2002) discuss moving maximum models. For a finite number of parameters, they propose parameter estimators based on empirical distribution functions. In the study of the characterization and estimation of the multivariate extremal index introduced by Nandagopalan (1991, 1994), Smith and Weissman (1996) extend Deheuvels' definition to the so-called multivariate maxima of moving maxima (henceforth M4) process. Smith and Weissman (1996) argue that, under quite general conditions, the extreme values of a multivariate stationary time series may be characterized in terms of a limiting max-stable process. They also show that a very large class of max-stable processes may be approximated by M4 processes, mainly because those processes have the same multivariate extremal indexes as the M4 processes (Theorem 2.3 in Smith and Weissman 1996).

The use of M4 processes is a new development in modeling extremal observations of multivariate time series. Due to the lack of statistical estimation methods, applications of max-stable processes are hardly found in real data analysis. For a class of M4 processes which contain a finite number of parameters, Zhang and Smith (2003) study the behavior of the specific M4 processes. They also provide guidance on applying M4 processes. However, a practically usable approach to estimating all parameters has yet to be developed. The purpose of this paper is to fill the gap between the theoretical probabilistic results and real data modeling.

This paper is organized as follows. In Section 2, we introduce the model which is used to model financial data in Section 5. The distributional properties of the model which lead to the construction of estimating procedures for all parameters are studied. Model identifiability is addressed. The estimators
and their asymptotic properties are studied in Section 3. In contrast to the bootstrapped processes which Hall et al. (2002) use to construct confidence intervals and prediction intervals for the moving maxima models, we directly construct parameter estimators and prove their asymptotic properties for the M4 processes. In Section 4 we provide simulation examples to show the efficiency of the proposed estimating procedures. In Section 5 we explore modeling financial time series data as M4 processes. Stock price returns of General Electric Co (GE), CITI Group (C) and Pfizer (PFE) are studied. Parameter estimates of M4 models are based on multivariate time series of approximately 5000 days of data. Section 6 contains some discussion. Technical arguments are given in Section 7.

2 The model and its identifiability

Suppose we have a multivariate time series $\{X_{id},\ i = 0, \pm 1, \pm 2, \ldots,\ d = 1, \ldots, D\}$, where $i$ is time and $d$ indexes a component of the process. Initially, we assume that the $D$-dimensional process is stationary in time but make no other assumptions. Well-established methods of univariate extreme value theory (see Coles (2001) or Smith (2003) for reviews) show that, under very mild assumptions, it is possible to model the behavior of a random variable above a high threshold by the generalized Pareto distribution. Assuming this and applying a probability integral transformation, it is possible to transform each marginal distribution of the process, above a high threshold, so that the marginal distribution is unit Fréchet. For the moment we ignore the high threshold part of the modeling, and assume that the univariate Fréchet assumption

applies to the whole distribution. Thus, we transform each $X_{id}$ into a random variable $Y_{id}$ for which $\Pr\{Y_{id} \le y\} = \exp(-1/y)$, $0 < y < \infty$.

The process $\{Y_{id}\}$ is said to be max-stable if for any finite collection of time points $i = i_1, i_1 + 1, \ldots, i_2$ and any positive set of values $\{y_{id},\ i = i_1, \ldots, i_2,\ d = 1, \ldots, D\}$, we have

\[ \Pr\{Y_{id} \le y_{id},\ i = i_1, \ldots, i_2,\ d = 1, \ldots, D\} = \big[\Pr\{Y_{id} \le n y_{id},\ i = i_1, \ldots, i_2,\ d = 1, \ldots, D\}\big]^n. \]

This property directly generalizes the max-stability property of univariate and multivariate extreme value distributions (Leadbetter et al. (1983), Resnick (1987)) and provides a convenient mathematical framework for discussing extremes in infinite-dimensional processes. From now on, we shall assume that our multivariate time series, after marginal transformation to unit Fréchet, is max-stable.

The next task is to characterize max-stable processes. For univariate processes, such a characterization was provided by Deheuvels (1983). This was generalized by Smith and Weissman (1996) to the following: under some mixing assumptions that we shall not detail here, any max-stable process with unit Fréchet margins may be approximated by a multivariate maxima of moving maxima process, or M4 for short, with the representation

\[ Y_{id} = \max_{l = 1, 2, \ldots}\ \max_{-\infty < k < \infty} a_{l,k,d}\, Z_{l,i-k}, \qquad -\infty < i < \infty,\ d = 1, \ldots, D, \]

where $\{Z_{l,i},\ l = 1, 2, \ldots,\ -\infty < i < \infty\}$ are independent unit Fréchet random variables and $a_{l,k,d}$ are non-negative coefficients satisfying $\sum_{l=1}^{\infty} \sum_{k=-\infty}^{\infty} a_{l,k,d} = 1$ for each $d$.

Even this representation is too cumbersome for practical application, since it involves infinitely many parameters $a_{l,k,d}$, so we simplify it by assuming that only a finite number of these coefficients are non-zero. Thus we have the representation

\[ Y_{id} = \max_{1 \le l \le L_d}\ \max_{-K_{1ld} \le k \le K_{2ld}} a_{l,k,d}\, Z_{l,i-k}, \qquad -\infty < i < \infty,\ d = 1, \ldots, D, \tag{2.1} \]

where $L_d$, $K_{1ld}$, $K_{2ld}$ are finite and the coefficients satisfy $\sum_{l=1}^{L_d} \sum_{k=-K_{1ld}}^{K_{2ld}} a_{l,k,d} = 1$ for each $d$.

Under model (2.1), when an extremal event occurs or when a large $Z_{li}$ occurs, $Y_{id} \propto a_{l,i-k,d}$ for $i \approx k$, i.e., if some $Z_{lk}$ is much
larger than all neighboring $Z$ values, we will have $Y_{id} = a_{l,i-k,d} Z_{lk}$ for $i$ near $k$. This indicates a moving pattern of the time series, known as a signature pattern. Hence $L_d$ corresponds to the maximum number of signature patterns. The constants $K_{1ld}$ and $K_{2ld}$ characterize the range of dependence in each sequence, and $(\max_l K_{1ld} + \max_l K_{2ld} + 1)$ is the order of the moving maxima processes.

We illustrate these phenomena in Figure 1. Plots (b) and (c) involve the same values of $(a_{l,-K_{1ld},d}, a_{l,-K_{1ld}+1,d}, \ldots, a_{l,K_{2ld},d})$, i.e., the same single value $l = l_1$. Similarly, Plots (d) and (e) involve the same values of $(a_{l,-K_{1ld},d}, a_{l,-K_{1ld}+1,d}, \ldots, a_{l,K_{2ld},d})$, i.e., the same single value $l = l_2$. One can see immediately that Plot (b) is a blowup of a few observations of the process in (a), and Plot (c) is a similar blowup of a few other observations of the process in (a). The vertical scale of $Y$ in Plot (b) runs from 0 to 20, while the vertical scale of $Y$ in Plot (c) runs from 0 to 100. Similar interpretations apply to Plots (d)-(e). These plots show that there are characteristic shapes around the local maxima that replicate themselves. Those blowups, or replicates, are known as signature patterns.

Under model (2.1), it is easy to obtain the finite-dimensional distribution of $\{Y_{id},\ 1 \le i \le r,\ 1 \le d \le D\}$. The goal is to estimate all parameters $\{a_{l,k,d}\}$ under the constraints that the parameters are positive

Figure 1: A demonstration of an M4 process. Panel (a) shows 365 days of simulated data for one component process. Panels (b)-(e) are partial pictures drawn from the whole simulated data, showing two different moving patterns, called signature patterns, in certain time periods when extremal events occur.
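A series like the one in panel (a) can be sketched numerically. The snippet below is a minimal illustration of model (2.1) for a single component, assuming NumPy; the function name `simulate_m4`, the coefficient matrix (rows index $l$, columns index lags $k = 0, 1$), and the seed are hypothetical choices, not values taken from the paper.

```python
import numpy as np

def simulate_m4(coeffs, n, rng):
    """Simulate n values of Y_i = max_l max_k a[l, k] * Z[l, i - k], k = 0..K-1."""
    a = np.asarray(coeffs, dtype=float)
    n_patterns, n_lags = a.shape
    # Unit Frechet noise by inversion: if U ~ Uniform(0, 1) then -1/log(U) is unit Frechet.
    u = rng.uniform(size=(n_patterns, n + n_lags - 1))
    z = -1.0 / np.log(u)
    # windows[l, i, j] = z[l, i + j]; the last slot of each window is the current time,
    # so reversing the coefficients pairs lag k with Z_{i-k}.
    windows = np.lib.stride_tricks.sliding_window_view(z, n_lags, axis=1)
    return (windows * a[:, None, ::-1]).max(axis=(0, 2))

# Hypothetical coefficients: two signature patterns, two lags each, summing to 1.
coeffs = [[1 / 4, 1 / 6],
          [1 / 4, 1 / 3]]
rng = np.random.default_rng(0)
y = simulate_m4(coeffs, 365, rng)  # one simulated "year" of daily values
```

Because the coefficients sum to 1, each simulated value should again be unit Fréchet, so about $e^{-1} \approx 37\%$ of a long simulated series is expected to fall below 1.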

and the summation is equal to 1 for each $d = 1, \ldots, D$. Due to the degeneracy of the multivariate joint distribution function of the M4 processes, the method of maximum likelihood is in general not applicable in this instance. In some simple cases, for example $L_d = 2$, $K_{1ld} = K_{2ld} = 0$, the method of maximum likelihood can be applied; see Zhang (2003). The estimators developed in this paper are based on the joint empirical distribution functions.

It follows immediately from model (2.1) that

\[ \Pr(Y_{id} \le y) = e^{-1/y}, \tag{2.2} \]

\[ \Pr(Y_{id} \le y_{id},\ Y_{i+1,d} \le y_{i+1,d}) = \exp\Big[-\sum_{l=1}^{L_d} \sum_{m=1-K_{2ld}}^{2+K_{1ld}} \max\Big\{\frac{a_{l,1-m,d}}{y_{id}},\ \frac{a_{l,2-m,d}}{y_{i+1,d}}\Big\}\Big], \tag{2.3} \]

\[ \Pr(Y_{1d} \le y_{1d},\ Y_{1d'} \le y_{1d'}) = \exp\Big[-\sum_{l=1}^{\max(L_d, L_{d'})}\ \sum_{m=1-\max(K_{2ld}, K_{2ld'})}^{1+\max(K_{1ld}, K_{1ld'})} \max\Big\{\frac{a_{l,1-m,d}}{y_{1d}},\ \frac{a_{l,1-m,d'}}{y_{1d'}}\Big\}\Big], \tag{2.4} \]

and a general joint probability formula:

\[ \Pr\{Y_{id} \le y_{id},\ 1 \le i \le r,\ 1 \le d \le D\} = \exp\Big[-\sum_{l=1}^{\max_d L_d}\ \sum_{m=1-\max_d K_{2ld}}^{r+\max_d K_{1ld}}\ \max_{1-m \le k \le r-m}\ \max_{1 \le d \le D} \frac{a_{l,k,d}}{y_{m+k,d}}\Big], \tag{2.5} \]

where $a_{l,k,d} = 0$ when the triple subindex is outside the range defined in model (2.1). This convention is kept throughout the rest of the paper. These formulas are used to compute various probabilities and to construct asymptotic covariance matrices in the following sections.

Remark 1 For each $l$, the value of $\sum_{k=-K_{1ld}}^{K_{2ld}} a_{l,k,d}$ tells approximately what proportion of the total observations is drawn from the $l$th signature process in the $d$th observed process.

It is clear from equations (2.2)-(2.4) that we cannot hope to estimate the M4 parameters based on the marginal distributions alone. However, there is some hope of estimating those parameters based on bivariate distributions. This naturally raises the question of whether bivariate distributions identify all the parameters of the process. In the following discussion we propose sufficient, though not necessary, conditions for that.

The probability evaluated at the points $(y_{id}, y_{i+1,d})$ in (2.3) depends on the comparison of $a_{l,1-m,d}/y_{id}$ and $a_{l,2-m,d}/y_{i+1,d}$, and similarly in (2.4). By fixing one of $y_{id}$ and $y_{i+1,d}$, say $y_{id}$,
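then $\max(a_{l,1-m,d}/y_{id},\ a_{l,2-m,d}/y_{i+1,d})$ changes branch, as $y_{i+1,d}$ varies, at the point where $1/y_{i+1,d} = a_{l,1-m,d}/(a_{l,2-m,d}\, y_{id})$, that is, at $y_{i+1,d} = a_{l,2-m,d}\, y_{id}/a_{l,1-m,d}$. It is easy to see the following identity,

\[ \Pr\{Y_{1d} \le u,\ Y_{2d} \le u + x\} = \big[\Pr\{Y_{1d} \le 1,\ Y_{2d} \le (u + x)/u\}\big]^{1/u}. \]

So without loss of generality, we can fix $y_{id} = 1$ for simplicity of our calculation. In real data applications we may choose a threshold value $u$ and fix $y_{id} = u$ or $y_{i+1,d} = u$. From (2.3) and (2.4), we have

\[ \Pr\{Y_{1d} \le 1,\ Y_{2d} \le x\} = \exp\Big[-\sum_{l=1}^{L_d} \sum_{m=1-K_{2ld}}^{2+K_{1ld}} \max\Big(a_{l,1-m,d},\ \frac{a_{l,2-m,d}}{x}\Big)\Big], \]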

and

\[ \Pr\{Y_{1d} \le 1,\ Y_{1d'} \le x\} = \exp\Big[-\sum_{l=1}^{\max(L_d, L_{d'})}\ \sum_{m=1-\max(K_{2ld}, K_{2ld'})}^{1+\max(K_{1ld}, K_{1ld'})} \max\Big(a_{l,1-m,d},\ \frac{a_{l,1-m,d'}}{x}\Big)\Big]. \]

Let $b_d(x) = -\log\big[\Pr(Y_{1d} \le 1,\ Y_{2d} \le x)\big]$ and $b_{dd'}(x) = -\log\big[\Pr(Y_{1d} \le 1,\ Y_{1d'} \le x)\big]$. We have

\[ b_d(x) = \sum_{l=1}^{L_d} \Big[ a_{l,K_{2ld},d} + \max\Big(a_{l,K_{2ld}-1,d},\ \frac{a_{l,K_{2ld},d}}{x}\Big) + \max\Big(a_{l,K_{2ld}-2,d},\ \frac{a_{l,K_{2ld}-1,d}}{x}\Big) + \max\Big(a_{l,K_{2ld}-3,d},\ \frac{a_{l,K_{2ld}-2,d}}{x}\Big) + \cdots + \max\Big(a_{l,-K_{1ld},d},\ \frac{a_{l,-K_{1ld}+1,d}}{x}\Big) + \frac{a_{l,-K_{1ld},d}}{x} \Big], \quad d = 1, \ldots, D, \tag{2.6} \]

and

\[ b_{dd'}(x) = \sum_{l=1}^{\max(L_d, L_{d'})}\ \sum_{m=1-\max(K_{2ld}, K_{2ld'})}^{1+\max(K_{1ld}, K_{1ld'})} \max\Big(a_{l,1-m,d},\ \frac{a_{l,1-m,d'}}{x}\Big). \tag{2.7} \]

It is clear that for each $d$, $q_d(x) \triangleq x\, b_d(x)$ and $q_{dd'}(x) \triangleq x\, b_{dd'}(x)$ are piecewise linear functions and have slope jump points at $a_{l,j,d}/a_{l,j',d}$ or $a_{l,k,d'}/a_{l,k,d}$, where the notation $A \triangleq B$ means that $A$ is denoted as $B$. This suggests that if we can identify the functions $q_d(x)$ or $q_{dd'}(x)$, we may be able to identify all the parameters $a_{l,k,d}$. A typical $q_d(x)$ picture is shown in Figure 2.

Figure 2: A typical $q(x)$ picture: a demonstration of $q_d(x)$, its slope $q_d'(x)$, and slope change points.

The following propositions establish identifiability. The proofs of Propositions 2.2, 2.3 and 2.4 are deferred to Section 7.

Proposition 2.1 For a fixed $d$ and $L_d = 1$, if all $\binom{K_1 + K_2 + 1}{2}$ ratios $a_{jd}/a_{j'd}$ are distinct, the model is uniquely identified by $q_d(x)$, or $b_d(x)$.

Proposition 2.1 holds because, in this case, any permutation of the $a_{jd}$'s must create a new set of ratios, or jump points, which results in a different function $q_d(x)$.
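The link between coefficient ratios and the slope jumps of $q_d(x)$ can be checked numerically. The sketch below assumes NumPy and a single moving maxima sequence ($L_d = 1$) with lags $k = 0, \ldots, K$; the helper names `b_lag1`, `q` and `slope_change_points` and the coefficients $(0.2, 0.3, 0.5)$ are hypothetical illustrations, not part of the paper.

```python
import numpy as np

def b_lag1(a, x):
    """b(x) = -log Pr(Y_1 <= 1, Y_2 <= x) for Y_i = max_k a[k] * Z_{i-k}, following (2.6)."""
    a = np.asarray(a, dtype=float)
    # a_K + sum_k max(a_{k-1}, a_k / x) + a_0 / x
    return a[-1] + np.maximum(a[:-1], a[1:] / x).sum() + a[0] / x

def q(a, x):
    """Piecewise linear function q(x) = x * b(x)."""
    return x * b_lag1(a, x)

def slope_change_points(a):
    """Ratios a_{k+1} / a_k of neighbouring coefficients, where the slope of q jumps."""
    a = np.asarray(a, dtype=float)
    return np.unique(a[1:] / a[:-1])

a = [0.2, 0.3, 0.5]   # hypothetical coefficients (a_0, a_1, a_2)
a_rev = a[::-1]       # the same values in a different order

print(slope_change_points(a))      # jump points of q for a
print(slope_change_points(a_rev))  # a different set of jump points
```

Reordering the coefficients moves the jump points, which is the mechanism behind Proposition 2.1: with all ratios distinct, a different arrangement cannot reproduce the same $q_d(x)$.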

Remark 2 This justifies statements like: for almost all (w.r.t. Lebesgue measure) choices of coefficients $a_{-K_{1ld},d}, \ldots, a_{K_{2ld},d}$, the model is identifiable from $q_d(x)$.

Remark 3 The uniqueness means that the values of the vector $(a_{-K_{1ld},d}, a_{-K_{1ld}+1,d}, \ldots, a_{K_{2ld},d})$ are uniquely determined, but their locations (determined by $K_{1ld}$ and $K_{2ld}$) are not, as shown by the following two processes:

\[ Y_i = \max(0.2 Z_{i-1},\ 0.3 Z_i,\ 0.5 Z_{i+1}), \qquad Y_i' = \max(0.2 Z_i,\ 0.3 Z_{i+1},\ 0.5 Z_{i+2}). \]

We can shift the whole vector of the second process to get the first process. There should be no ambiguity in treating them as one model, since they have the same joint distribution functions within each sequence, although they generate different sample paths.

Proposition 2.2 For a fixed $d$, if all $\binom{K_{1ld} + K_{2ld} + 1}{2}$ ratios $a_{l,j,d}/a_{l,j',d}$, $l = 1, \ldots, L_d$, $j \ne j'$, are distinct, the model is uniquely identified by $q_d(x)$, or $b_d(x)$.

Remark 4 For a fixed $d$, by uniqueness we mean that if $\{a_{l,k,d},\ l = 1, \ldots, L_d,\ -K_{1ld} \le k \le K_{2ld}\}$ and $\{a'_{l,k,d},\ l = 1, \ldots, L_d,\ -K_{1l'd} \le k \le K_{2l'd}\}$ are two sets of solutions, they must satisfy the condition $a'_{m(l),k,d} = a_{l,k,d}$, $l = 1, \ldots, L_d$, where $\{m(l),\ l = 1, \ldots, L_d\}$ is a permutation of $\{1, \ldots, L_d\}$. For example, we consider the following two structures of parameter values to be equivalent:

\[ (a_{lkd}) = \begin{bmatrix} 1/4 & 1/6 \\ 1/4 & 1/3 \end{bmatrix}, \qquad (a'_{lkd}) = \begin{bmatrix} 1/4 & 1/3 \\ 1/4 & 1/6 \end{bmatrix} \]

for a fixed $d$ (in the above two matrices, $l$ indexes rows and $k$ indexes columns), since they result in the same joint distributions but different sample paths. It is easy to see that the ratios are 3/2, 2/3, 4/3, 3/4 in this example. All the ratios are distinct. For higher dimensional structures, we can have similar equivalences as long as they have the same joint distributions.

Remark 5 By Remark 4, the solutions from $b_d(x)$ can be either $(a_{lkd})$ or $(a_{l'kd})$, and the solutions from $b_{d'}(x)$ can be either $(a_{lkd'})$ or $(a_{l'kd'})$. While parameter values can be determined by $b_d(x)$ for each $d$, the functions $b_d(x)$ are not enough to determine the
parameter structures in M4 processes. For example, one may obtain the following coefficients

\[ (a_{lkd}) = \begin{bmatrix} 1/4 & 1/6 \\ 1/4 & 1/3 \end{bmatrix}, \qquad (a_{lkd'}) = \begin{bmatrix} 1/8 & 1/4 \\ 1/4 & 3/8 \end{bmatrix}, \]

based on observations from a bivariate process and the functions $b_d(x)$, $b_{d'}(x)$, but one may also obtain the following coefficients

\[ (a_{lkd}) = \begin{bmatrix} 1/4 & 1/3 \\ 1/4 & 1/6 \end{bmatrix}, \qquad (a_{lkd'}) = \begin{bmatrix} 1/8 & 1/4 \\ 1/4 & 3/8 \end{bmatrix}, \]

based on the same observations and functions $b_d(x)$, $b_{d'}(x)$. Obviously, the M4 models formed using these two sets of coefficients are two different models, since the joint distributions are different and the bivariate sample paths are different. The purpose of the functions $b_{dd'}(x)$ is to find which structure is the true structure.
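The role of $b_{dd'}(x)$ can be illustrated with the two coefficient structures of Remark 5. The sketch below assumes NumPy and that both components share the same lag range, so the matrices can be compared entry by entry as in (2.7); the function names `b_within` and `b_cross` are hypothetical.

```python
import numpy as np

def b_within(a, x):
    """b_d(x) from (2.6) for a coefficient matrix a[l, k] (rows: patterns, columns: lags)."""
    a = np.asarray(a, dtype=float)
    return a[:, -1].sum() + np.maximum(a[:, :-1], a[:, 1:] / x).sum() + (a[:, 0] / x).sum()

def b_cross(a_d, a_dprime, x):
    """b_{dd'}(x) from (2.7): entrywise max over matching (l, k) positions."""
    return np.maximum(np.asarray(a_d, dtype=float), np.asarray(a_dprime, dtype=float) / x).sum()

# The two candidate structures for component d, and the common structure for component d'.
A_d_first = np.array([[1 / 4, 1 / 6], [1 / 4, 1 / 3]])
A_d_second = np.array([[1 / 4, 1 / 3], [1 / 4, 1 / 6]])
A_dprime = np.array([[1 / 8, 1 / 4], [1 / 4, 3 / 8]])

for x in (0.5, 1.0, 2.0):
    same_within = np.isclose(b_within(A_d_first, x), b_within(A_d_second, x))
    print(x, same_within, b_cross(A_d_first, A_dprime, x), b_cross(A_d_second, A_dprime, x))
```

At $x = 1$ the two candidates share the same $b_d$ value, while $b_{dd'}(1)$ is $9/8$ for the first structure and $29/24$ for the second, so the cross-component function separates them.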

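Proposition 2.3 Suppose all ratios $a_{l,j,d}/a_{l,j',d}$ for all $l$ and $j \ne j'$ are distinct, all ratios $a_{l,j,d'}/a_{l,j',d'}$ for all $l$ and $j \ne j'$ are distinct, and all nonzero and existing ratios $a_{l,k,d'}/a_{l,k,d}$ for all $l$ and $k$ are distinct when $d \ne d'$. Then

\[ \begin{aligned} b_d(x) &= \sum_{l=1}^{L_d} \Big[ a_{l,K_{2ld},d} + \max\Big(a_{l,K_{2ld}-1,d},\ \frac{a_{l,K_{2ld},d}}{x}\Big) + \max\Big(a_{l,K_{2ld}-2,d},\ \frac{a_{l,K_{2ld}-1,d}}{x}\Big) \\ &\qquad + \max\Big(a_{l,K_{2ld}-3,d},\ \frac{a_{l,K_{2ld}-2,d}}{x}\Big) + \cdots + \max\Big(a_{l,-K_{1ld},d},\ \frac{a_{l,-K_{1ld}+1,d}}{x}\Big) + \frac{a_{l,-K_{1ld},d}}{x} \Big], \\ b_{d'}(x) &= \sum_{l=1}^{L_{d'}} \Big[ a_{l,K_{2ld'},d'} + \max\Big(a_{l,K_{2ld'}-1,d'},\ \frac{a_{l,K_{2ld'},d'}}{x}\Big) + \max\Big(a_{l,K_{2ld'}-2,d'},\ \frac{a_{l,K_{2ld'}-1,d'}}{x}\Big) \\ &\qquad + \max\Big(a_{l,K_{2ld'}-3,d'},\ \frac{a_{l,K_{2ld'}-2,d'}}{x}\Big) + \cdots + \max\Big(a_{l,-K_{1ld'},d'},\ \frac{a_{l,-K_{1ld'}+1,d'}}{x}\Big) + \frac{a_{l,-K_{1ld'},d'}}{x} \Big], \\ b_{dd'}(x) &= \sum_{l=1}^{\max(L_d, L_{d'})}\ \sum_{m=1-\max(K_{2ld}, K_{2ld'})}^{1+\max(K_{1ld}, K_{1ld'})} \max\Big(a_{l,1-m,d},\ \frac{a_{l,1-m,d'}}{x}\Big), \end{aligned} \tag{2.8} \]

uniquely determine all values of $a_{l,k,d}$ and $a_{l,k,d'}$ for any two fixed $d$ and $d'$. Furthermore, there exist points $x_1, x_2, \ldots, x_m$, $m \le \sum_{d} \sum_{l=1}^{L_d} (K_{1ld} + K_{2ld} + 2)$, such that $b_d(x_i)$ and $b_{dd'}(x_i)$, $i = 1, \ldots, m$, uniquely determine all values of $a_{l,k,d}$ and $a_{l,k,d'}$ for any two fixed $d$ and $d'$.

Proposition 2.4 Suppose all ratios $a_{l,j,d}/a_{l,j',d}$ for all $l$ and $j \ne j'$ are distinct for each $d = 1, \ldots, D$, and nonzero and existing ratios $a_{l,k,1}/a_{l',k,d'}$ for all $l$, $l'$ and $k$ are distinct for each $d' = 2, \ldots, D$. Then

\[ \begin{aligned} b_d(x) &= \sum_{l=1}^{L_d} \Big[ a_{l,K_{2ld},d} + \max\Big(a_{l,K_{2ld}-1,d},\ \frac{a_{l,K_{2ld},d}}{x}\Big) + \max\Big(a_{l,K_{2ld}-2,d},\ \frac{a_{l,K_{2ld}-1,d}}{x}\Big) \\ &\qquad + \max\Big(a_{l,K_{2ld}-3,d},\ \frac{a_{l,K_{2ld}-2,d}}{x}\Big) + \cdots + \max\Big(a_{l,-K_{1ld},d},\ \frac{a_{l,-K_{1ld}+1,d}}{x}\Big) + \frac{a_{l,-K_{1ld},d}}{x} \Big], \quad d = 1, \ldots, D, \\ b_{1d'}(x) &= \sum_{l=1}^{\max(L_1, L_{d'})}\ \sum_{m=1-\max(K_{2l1}, K_{2ld'})}^{1+\max(K_{1l1}, K_{1ld'})} \max\Big(a_{l,1-m,1},\ \frac{a_{l,1-m,d'}}{x}\Big), \quad d' = 2, \ldots, D, \end{aligned} \]

uniquely determine all values of $a_{l,k,d}$, $d = 1, \ldots, D$, $l = 1, \ldots, L_d$, $-K_{1ld} \le k \le K_{2ld}$. Furthermore, there exist points $x_1, x_2, \ldots, x_m$, $m \le (2D - 1) \sum_{d} \sum_{l=1}^{L_d} (K_{1ld} + K_{2ld} + 2) + 2D$, such that $b_d(x_i)$ and $b_{1d'}(x_i)$, $i = 1, \ldots, m$, $d = 1, \ldots, D$, $d' = 2, \ldots, D$, uniquely determine all values of $a_{l,k,d}$.

Remark 6 Only $b_{1d'}(x)$ is used in the proof of model identifiability. In some situations, other $b_{dd'}(x)$ functions may also be needed in order to prove identifiability or to get estimates of all parameters.

Since we want to construct the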
estimators of parameters in model (2.1) based on the bivariate distribution, we know from the previous arguments that if the conditions fail, we may not be able to identify the model. But we may be able to identify the model via some higher-order joint distribution. We now construct an artificial example to demonstrate this idea.

Example 2.1 This is a counterexample showing a process that is not identifiable via the bivariate joint distribution, but is identifiable from the trivariate joint distribution.

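Let $(a_0, \ldots, a_4) = \frac{1}{6}(1, 1, 2, 1, 1)$ and $(b_0, \ldots, b_4) = \frac{1}{6}(1, 2, 1, 1, 1)$. We consider the two processes generated by the sequences $a_0, \ldots, a_4$ and $b_0, \ldots, b_4$, i.e.,

\[ Y_i = \max_{k=0,1,2,3,4} a_k Z_{i-k}, \quad -\infty < i < \infty, \qquad Y_i' = \max_{k=0,1,2,3,4} b_k Z_{i-k}, \quad -\infty < i < \infty. \]

Then the number of slope change points is $p = 3$, and the slope change points are $r_1 = \frac{1}{2}$, $r_2 = 1$, $r_3 = 2$ for both configurations, so $q(x)$ is the same and is displayed in Figure 2, with slope (computed from $q(x) = x\,b(x)$ and (2.6))

\[ q'(x) = \begin{cases} 1/6, & 0 < x < \tfrac{1}{2}, \\ 1/2, & \tfrac{1}{2} < x < 1, \\ 5/6, & 1 < x < 2, \\ 1, & 2 < x, \end{cases} \]

i.e., we cannot distinguish the $a_i$'s from the $b_i$'s on the basis of $q(x)$. However, consider the formula

\[ \begin{aligned} -\log\big(\Pr(Y_1 \le y_1, Y_2 \le y_2, Y_3 \le y_3)\big) &= \frac{a_4}{y_1} + \max\Big(\frac{a_3}{y_1}, \frac{a_4}{y_2}\Big) + \max\Big(\frac{a_2}{y_1}, \frac{a_3}{y_2}, \frac{a_4}{y_3}\Big) + \max\Big(\frac{a_1}{y_1}, \frac{a_2}{y_2}, \frac{a_3}{y_3}\Big) \\ &\quad + \max\Big(\frac{a_0}{y_1}, \frac{a_1}{y_2}, \frac{a_2}{y_3}\Big) + \max\Big(\frac{a_0}{y_2}, \frac{a_1}{y_3}\Big) + \frac{a_0}{y_3} \end{aligned} \]

and let $y_1 = 1$, $y_2 = y_3 = c$ where $c > 2$. Then

\[ -\log\big(\Pr(Y_1 \le y_1, Y_2 \le y_2, Y_3 \le y_3)\big) = \frac{1}{6}\Big[1 + 1 + 2 + 1 + 1 + \frac{1}{c} + \frac{1}{c}\Big] = 1 + \frac{1}{3c}, \]

and similarly

\[ -\log\big(\Pr(Y_1' \le y_1, Y_2' \le y_2, Y_3' \le y_3)\big) = \frac{1}{6}\Big[1 + 1 + 1 + 2 + 1 + \frac{2}{c} + \frac{1}{c}\Big] = 1 + \frac{1}{2c}. \]

So the two values of $\Pr(Y_1 \le y_1, Y_2 \le y_2, Y_3 \le y_3)$ and $\Pr(Y_1' \le y_1, Y_2' \le y_2, Y_3' \le y_3)$ are distinct in this case. In other words, the two possible models are distinguishable from their trivariate distributions, but not from their bivariate distributions. However, this is a specific example where the trivariate distribution function is needed; in most cases bivariate distributions are enough.

3 The estimators and asymptotics

We now propose estimators for the parameters in general M4 processes. It is known that an empirical process approximates the random process from which the observations are obtained. Our estimators are motivated by such approximations of one process by the other.

Let $x_{1d}, \ldots, x_{Ld}$, $x'_{1d}, \ldots, x'_{Ld}$ be positive constants which, for the moment, are arbitrary. Let $A_{1d} = (0, x_{1d}) \times (0, x'_{1d}), \ldots, A_{L-1,d} = (0, x_{L-1,d}) \times (0, x'_{L-1,d})$ be different sets. Define

\[ \bar{\Upsilon}_{A_{jd}} = \frac{1}{n} \sum_{i=1}^{n} I_{A_{jd}}(Y_{id}, Y_{i+1,d}), \tag{3.1} \]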

where $I_A(\cdot)$ is an indicator function. Then by the strong law of large numbers (SLLN), we have

\[ \bar{\Upsilon}_{A_{jd}} \xrightarrow{\text{a.s.}} \Pr\{A_{jd}\} = \Pr\{Y_{1d} \le x_{jd},\ Y_{2d} \le x'_{jd}\} \triangleq \mu_{jd}. \tag{3.2} \]

Formulas (3.1) and (3.2) are used to construct the estimators for the parameters and to show the central limit theorem (CLT) for the estimators. In order to study asymptotic normality, we apply the following proposition, which is Theorem 27.4 in Billingsley (1995). First we introduce the so-called α-mixing condition. For a sequence $\zeta_1, \zeta_2, \ldots$ of random variables, let $\alpha_n$ be a number such that

\[ |\Pr(A \cap B) - \Pr(A)\Pr(B)| \le \alpha_n \]

for $A \in \sigma(\zeta_1, \ldots, \zeta_k)$, $B \in \sigma(\zeta_{k+n}, \zeta_{k+n+1}, \ldots)$, and $k \ge 1$, $n \ge 1$. When $\alpha_n \to 0$, the sequence $\{\zeta_n\}$ is said to be α-mixing. This means that $\zeta_k$ and $\zeta_{k+n}$ are approximately independent for large $n$.

Proposition 3.1 Suppose that $\Upsilon_1, \Upsilon_2, \ldots$ is stationary and α-mixing with $\alpha_n = O(n^{-5})$ and that $E[\Upsilon_n] = 0$ and $E[\Upsilon_n^{12}] < \infty$. If $S_n = \Upsilon_1 + \cdots + \Upsilon_n$, then

\[ n^{-1} \mathrm{Var}[S_n] \to \sigma^2 = E[\Upsilon_1^2] + 2 \sum_{k=1}^{\infty} E[\Upsilon_1 \Upsilon_{1+k}], \]

where the series converges absolutely. If $\sigma > 0$, then $S_n/(\sigma\sqrt{n}) \xrightarrow{L} N(0, 1)$.

Remark 7 The conditions $\alpha_n = O(n^{-5})$ and $E[\Upsilon_n^{12}] < \infty$ are stronger than necessary, as stated in the remark following Theorem 27.4 in Billingsley (1995), to avoid technical complication in the proof.

Let $\Upsilon_{nd} = I_{A_{jd}}\{Y_{nd}, Y_{n+1,d}\} - \mu_{jd}$. Then $E[\Upsilon_{nd}] = 0$ and $E[\Upsilon_{nd}^{12}] < \infty$ because $\Upsilon_{nd}$ is bounded. The α-mixing condition is satisfied since the $Y_{nd}$'s are M-dependent. So the conditions of Proposition 3.1 are satisfied. Even though the conditions of Proposition 3.1 are stronger than needed for the CLT result, they are easily met in the application in this paper. By using Billingsley's arguments, we have the following two lemmas, whose proofs are deferred to Section 7.

Lemma 3.2 Suppose that $\bar{\Upsilon}_{A_{jd}}$ and $\mu_{jd}$ are defined in (3.1) and (3.2) respectively. If $\sigma_{jd} > 0$, then

\[ \sqrt{n}\,(\bar{\Upsilon}_{A_{jd}} - \mu_{jd}) \xrightarrow{L} N(0, \sigma_{jd}^2), \]

where $\mu_{jd}$ is the mean of the random variable $\Upsilon_{A_{jd}}$, with value defined in (3.2). The value of $\sigma_{jd}^2$ is defined as

\[ \sigma_{jd}^2 = \mu_{jd} - \mu_{jd}^2 + 2 \sum_{k=1}^{\max_l K_{1ld} + \max_l K_{2ld} + 1} \Big[ \Pr\big\{Y_{1d} \le x_{jd},\ Y_{2d} \le x'_{jd},\ Y_{1+k,d} \le x_{jd},\ Y_{2+k,d} \le x'_{jd}\big\} - \mu_{jd}^2 \Big]. \]

A generalization of Lemma 3.2 to multivariate asymptotics is given in the following lemma.

Lemma 3.3 Suppose that $\bar{\Upsilon}_{A_{jd}}$ and $\mu_{jd}$ are defined in (3.1) and (3.2) respectively. Then

\[ \sqrt{n} \begin{pmatrix} \bar{\Upsilon}_{A_{1d}} - \mu_{1d} \\ \vdots \\ \bar{\Upsilon}_{A_{L-1,d}} - \mu_{L-1,d} \end{pmatrix} \xrightarrow{L} N\Big(0,\ \Sigma_d + \sum_{k=1}^{\max_l K_{1ld} + \max_l K_{2ld} + 1} \{W_{kd} + W_{kd}^T\}\Big), \]

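where the entries $\sigma_{i,j,d}$ of the matrix $\Sigma_d$ are defined by

\[ \mu_{i,j,d} = \Pr\{Y_{1d} \le \min(x_{id}, x_{jd}),\ Y_{2d} \le \min(x'_{id}, x'_{jd})\}, \qquad \sigma_{i,j,d} = \mu_{i,j,d} - \mu_{id}\mu_{jd}, \]

the matrix $W_{kd}$ has entries

\[ w_{kd}^{ij} = \Pr(Y_{1d} \le x_{id},\ Y_{2d} \le x'_{id},\ Y_{1+k,d} \le x_{jd},\ Y_{2+k,d} \le x'_{jd}) - \mu_{id}\mu_{jd}, \]

and $\mu_{i,i,d} = \mu_{id}$.

The above constructions are for a component process. We now turn to constructions for multivariate (vector) processes. Lemma 3.3 can be generalized to the empirical counterparts of $b_d(x)$ and $b_{1d'}(x)$, which are defined as

\[ U_d(x) = \frac{1}{n} \sum_{i=1}^{n} I_{\{Y_{id} \le 1,\ Y_{i+1,d} \le x\}}, \qquad \hat{b}_d(x) = -\log\big[U_d(x)\big], \quad d = 1, \ldots, D, \tag{3.3} \]

\[ U_{1d'}(x) = \frac{1}{n} \sum_{i=1}^{n} I_{\{Y_{i1} \le 1,\ Y_{id'} \le x\}}, \qquad \hat{b}_{1d'}(x) = -\log\big[U_{1d'}(x)\big], \quad d' = 2, \ldots, D. \tag{3.4} \]

Let $x_{1d}, x_{2d}, \ldots, x_{md}$, $d = 1, \ldots, D$, where $m \le (2D - 1)(\max_l K_{1ld} + \max_l K_{2ld} + 1) + 2D$ as described in Proposition 2.4, and $x'_{1d'}, x'_{2d'}, \ldots, x'_{m'd'}$, $d' = 2, \ldots, D$, where $m' \le (2D - 1) \sum_{d} \sum_{l=1}^{L_d} (K_{1ld} + K_{2ld} + 2) + 2D$, be a suitable choice of the points used to evaluate all the functions defined. Then (3.3) and (3.4) can be written in the following vector forms:

\[ \begin{aligned} x &= (x_{11}, x_{21}, \ldots, x_{m1}, x_{12}, \ldots, x_{mD}, x'_{12}, x'_{22}, \ldots, x'_{m'2}, x'_{13}, \ldots, x'_{m'D})^T, \\ U &= \big(U_1(x_{11}), \ldots, U_1(x_{m1}), U_2(x_{12}), \ldots, U_D(x_{mD}), U_{12}(x'_{12}), \ldots, U_{12}(x'_{m'2}), \ldots, U_{1D}(x'_{m'D})\big)^T, \\ \hat{b} &= \big(\hat{b}_1(x_{11}), \ldots, \hat{b}_1(x_{m1}), \hat{b}_2(x_{12}), \ldots, \hat{b}_D(x_{mD}), \hat{b}_{12}(x'_{12}), \ldots, \hat{b}_{12}(x'_{m'2}), \ldots, \hat{b}_{1D}(x'_{m'D})\big)^T. \end{aligned} \]

In order to obtain the joint asymptotics of $U$ and of $\hat{b}$, the following notations and results will be used. They play roles similar to those of $\mu_{i,j,d}$ and $\sigma_{i,j,d}$ in Lemma 3.3. These notations and their expressions are:

\[ \mu_{djd} = E\big[U_d(x_{jd})\big] = \Pr(Y_{1d} \le 1,\ Y_{2d} \le x_{jd}), \quad d = 1, \ldots, D,\ j = 1, \ldots, m, \]

\[ \mu_{1d'j'd'} = E\big[U_{1d'}(x'_{j'd'})\big] = \Pr(Y_{11} \le 1,\ Y_{1d'} \le x'_{j'd'}), \quad d' = 2, \ldots, D,\ j' = 1, \ldots, m', \]

\[ \begin{aligned} \mu_{djd,\,d'j'd'} &= E\Big[\big(I_{\{Y_{1d} \le 1,\ Y_{2d} \le x_{jd}\}} - \mu_{djd}\big)\big(I_{\{Y_{1d'} \le 1,\ Y_{2d'} \le x_{j'd'}\}} - \mu_{d'j'd'}\big)\Big] \\ &= \Pr(Y_{1d} \le 1,\ Y_{2d} \le x_{jd},\ Y_{1d'} \le 1,\ Y_{2d'} \le x_{j'd'}) - \mu_{djd}\,\mu_{d'j'd'}, \quad d, d' = 1, \ldots, D,\ j, j' = 1, \ldots, m, \end{aligned} \]

\[ \begin{aligned} \mu_{djd,\,1d'j'd'} &= E\Big[\big(I_{\{Y_{1d} \le 1,\ Y_{2d} \le x_{jd}\}} - \mu_{djd}\big)\big(I_{\{Y_{11} \le 1,\ Y_{1d'} \le x'_{j'd'}\}} - \mu_{1d'j'd'}\big)\Big] \\ &= \Pr(Y_{1d} \le 1,\ Y_{2d} \le x_{jd},\ Y_{11} \le 1,\ Y_{1d'} \le x'_{j'd'}) - \mu_{djd}\,\mu_{1d'j'd'}, \quad d = 1, \ldots, D,\ j = 1, \ldots, m,\ d' = 2, \ldots, D,\ j' = 1, \ldots, m', \end{aligned} \]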

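\[ \begin{aligned} \mu_{1d'j'd',\,djd} &= E\Big[\big(I_{\{Y_{11} \le 1,\ Y_{1d'} \le x'_{j'd'}\}} - \mu_{1d'j'd'}\big)\big(I_{\{Y_{1d} \le 1,\ Y_{2d} \le x_{jd}\}} - \mu_{djd}\big)\Big] \\ &= \Pr(Y_{1d} \le 1,\ Y_{2d} \le x_{jd},\ Y_{11} \le 1,\ Y_{1d'} \le x'_{j'd'}) - \mu_{djd}\,\mu_{1d'j'd'} = \mu_{djd,\,1d'j'd'}, \\ &\qquad d = 1, \ldots, D,\ j = 1, \ldots, m,\ d' = 2, \ldots, D,\ j' = 1, \ldots, m', \end{aligned} \]

\[ \begin{aligned} \mu_{1djd,\,1d'j'd'} &= E\Big[\big(I_{\{Y_{11} \le 1,\ Y_{1d} \le x'_{jd}\}} - \mu_{1djd}\big)\big(I_{\{Y_{11} \le 1,\ Y_{1d'} \le x'_{j'd'}\}} - \mu_{1d'j'd'}\big)\Big] \\ &= \Pr(Y_{11} \le 1,\ Y_{1d} \le x'_{jd},\ Y_{1d'} \le x'_{j'd'}) - \mu_{1djd}\,\mu_{1d'j'd'}, \quad d = 2, \ldots, D,\ j = 1, \ldots, m',\ d' = 2, \ldots, D,\ j' = 1, \ldots, m', \end{aligned} \]

\[ \begin{aligned} w^{(k)}_{djd,\,d'j'd'} &= E\Big[\big(I_{\{Y_{1d} \le 1,\ Y_{2d} \le x_{jd}\}} - \mu_{djd}\big)\big(I_{\{Y_{1+k,d'} \le 1,\ Y_{2+k,d'} \le x_{j'd'}\}} - \mu_{d'j'd'}\big)\Big] \\ &= \Pr(Y_{1d} \le 1,\ Y_{2d} \le x_{jd},\ Y_{1+k,d'} \le 1,\ Y_{2+k,d'} \le x_{j'd'}) - \mu_{djd}\,\mu_{d'j'd'}, \quad d, d' = 1, \ldots, D,\ j, j' = 1, \ldots, m, \end{aligned} \]

\[ \begin{aligned} w^{(k)}_{djd,\,1d'j'd'} &= E\Big[\big(I_{\{Y_{1d} \le 1,\ Y_{2d} \le x_{jd}\}} - \mu_{djd}\big)\big(I_{\{Y_{1+k,1} \le 1,\ Y_{1+k,d'} \le x'_{j'd'}\}} - \mu_{1d'j'd'}\big)\Big] \\ &= \Pr(Y_{1d} \le 1,\ Y_{2d} \le x_{jd},\ Y_{1+k,1} \le 1,\ Y_{1+k,d'} \le x'_{j'd'}) - \mu_{djd}\,\mu_{1d'j'd'}, \\ &\qquad d = 1, \ldots, D,\ j = 1, \ldots, m,\ d' = 2, \ldots, D,\ j' = 1, \ldots, m', \end{aligned} \]

\[ \begin{aligned} w^{(k)}_{1d'j'd',\,djd} &= E\Big[\big(I_{\{Y_{1,1} \le 1,\ Y_{1,d'} \le x'_{j'd'}\}} - \mu_{1d'j'd'}\big)\big(I_{\{Y_{1+k,d} \le 1,\ Y_{2+k,d} \le x_{jd}\}} - \mu_{djd}\big)\Big] \\ &= \Pr(Y_{1,1} \le 1,\ Y_{1,d'} \le x'_{j'd'},\ Y_{1+k,d} \le 1,\ Y_{2+k,d} \le x_{jd}) - \mu_{djd}\,\mu_{1d'j'd'}, \\ &\qquad d = 1, \ldots, D,\ j = 1, \ldots, m,\ d' = 2, \ldots, D,\ j' = 1, \ldots, m', \end{aligned} \]

\[ \begin{aligned} w^{(k)}_{1djd,\,1d'j'd'} &= E\Big[\big(I_{\{Y_{11} \le 1,\ Y_{1d} \le x'_{jd}\}} - \mu_{1djd}\big)\big(I_{\{Y_{1+k,1} \le 1,\ Y_{1+k,d'} \le x'_{j'd'}\}} - \mu_{1d'j'd'}\big)\Big] \\ &= \Pr(Y_{11} \le 1,\ Y_{1d} \le x'_{jd},\ Y_{1+k,1} \le 1,\ Y_{1+k,d'} \le x'_{j'd'}) - \mu_{1djd}\,\mu_{1d'j'd'}, \\ &\qquad d = 2, \ldots, D,\ j = 1, \ldots, m',\ d' = 2, \ldots, D,\ j' = 1, \ldots, m'. \end{aligned} \]

The above quantities are not convenient for expressing a multivariate central limit theorem. We define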

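the following vectors using the notations defined:

\[ \mu = \begin{pmatrix} \mu_{111} \\ \mu_{121} \\ \vdots \\ \mu_{1m1} \\ \mu_{212} \\ \vdots \\ \mu_{DmD} \\ \mu_{1212} \\ \mu_{1222} \\ \vdots \\ \mu_{12m'2} \\ \mu_{1313} \\ \vdots \\ \mu_{1Dm'D} \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_m \\ \mu_{m+1} \\ \vdots \\ \mu_{Dm} \\ \mu_{Dm+1} \\ \mu_{Dm+2} \\ \vdots \\ \mu_{Dm+m'} \\ \mu_{Dm+m'+1} \\ \vdots \\ \mu_{Dm+(D-1)m'} \end{pmatrix}, \qquad b = \begin{pmatrix} b_1(x_{11}) \\ b_1(x_{21}) \\ \vdots \\ b_1(x_{m1}) \\ b_2(x_{12}) \\ \vdots \\ b_D(x_{mD}) \\ b_{12}(x'_{12}) \\ b_{12}(x'_{22}) \\ \vdots \\ b_{12}(x'_{m'2}) \\ b_{13}(x'_{13}) \\ \vdots \\ b_{1D}(x'_{m'D}) \end{pmatrix}. \]

Notice that the subscripts of the elements of the vector $\mu$ are different in its two vector forms, though the same notation $\mu$ is used. To form the above vectors, for example to get the $r$th value in the vector $\mu$, we have used the following index transformation:

\[ \begin{cases} \mu_{djd} \to \mu_r, & \text{where } r = (d - 1)m + j, \\ \mu_{1d'j'd'} \to \mu_r, & \text{where } r = Dm + (d' - 2)m' + j', \end{cases} \]

where $[\cdot]$ takes integer values. We now use the same relations between the indexes of $\mu_{djd}$ and the indexes of $\mu_r$ to define the following variables:

\[ \sigma_{rs} = \begin{cases} \mu_{djd,\,d'j'd'}, & \text{if } r \le Dm,\ s \le Dm, \\ \mu_{djd,\,1d'j'd'}, & \text{if } r \le Dm,\ s > Dm, \\ \mu_{1djd,\,d'j'd'}, & \text{if } r > Dm,\ s \le Dm, \\ \mu_{1djd,\,1d'j'd'}, & \text{if } r > Dm,\ s > Dm, \end{cases} \qquad w_k^{rs} = \begin{cases} w^{(k)}_{djd,\,d'j'd'}, & \text{if } r \le Dm,\ s \le Dm, \\ w^{(k)}_{djd,\,1d'j'd'}, & \text{if } r \le Dm,\ s > Dm, \\ w^{(k)}_{1djd,\,d'j'd'}, & \text{if } r > Dm,\ s \le Dm, \\ w^{(k)}_{1djd,\,1d'j'd'}, & \text{if } r > Dm,\ s > Dm, \end{cases} \]

and the matrices

\[ \Sigma = (\sigma_{rs}), \qquad W_k = (w_k^{rs}), \qquad \Theta = (\mathrm{diag}\{\mu\})^{-1} (\mathrm{diag}\{x\}). \]

Putting everything above together, the following lemma is obtained. Its proof simply follows the arguments used in Lemma 3.3 and the mean value theorem.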

Lemma 3.4 For the choices of $x_{jd}$, $x'_{j'd'}$ and with the notations established, we have

\[ \sqrt{n}\,(U - \mu) \xrightarrow{L} N\Big(0,\ \Sigma + \sum_{k=1}^{\max_l K_{1ld} + \max_l K_{2ld} + 1} \{W_k + W_k^T\}\Big), \]

\[ \sqrt{n}\,(\hat{b} - b) \xrightarrow{L} N\Big(0,\ \Theta \Big[\Sigma + \sum_{k=1}^{\max_l K_{1ld} + \max_l K_{2ld} + 1} \{W_k + W_k^T\}\Big] \Theta^T\Big), \]

which establish the asymptotics for the empirical functions $\hat{b}_d(x)$.

The results in Lemma 3.4 hold for arbitrary choices of $x_{jd}$, $x'_{j'd'}$. In order to construct estimators for the parameters in M4 models, the choices of $x_{jd}$, $x'_{j'd'}$ must satisfy certain conditions. The existence of such choices was shown in Proposition 2.4. Now suppose that points $x_{jd}$, $x'_{j'd'}$ have been chosen (the determination of the points $x_{jd}$, $x'_{j'd'}$ is addressed in Section 4) such that the values of $a_{l,k,d}$ are uniquely determined. Let us consider the system of non-linear equations

\[ \begin{aligned} b_d(x_{jd}) &= \sum_{l=1}^{L_d} \Big[ a_{l,K_{2ld},d} + \max\Big(a_{l,K_{2ld}-1,d},\ \frac{a_{l,K_{2ld},d}}{x_{jd}}\Big) + \max\Big(a_{l,K_{2ld}-2,d},\ \frac{a_{l,K_{2ld}-1,d}}{x_{jd}}\Big) \\ &\qquad + \max\Big(a_{l,K_{2ld}-3,d},\ \frac{a_{l,K_{2ld}-2,d}}{x_{jd}}\Big) + \cdots + \max\Big(a_{l,-K_{1ld},d},\ \frac{a_{l,-K_{1ld}+1,d}}{x_{jd}}\Big) + \frac{a_{l,-K_{1ld},d}}{x_{jd}} \Big], \quad j = 1, \ldots, m,\ d = 1, \ldots, D, \\ b_{1d'}(x'_{j'd'}) &= \sum_{l=1}^{\max(L_1, L_{d'})}\ \sum_{m=1-\max(K_{2l1}, K_{2ld'})}^{1+\max(K_{1l1}, K_{1ld'})} \max\Big(a_{l,1-m,1},\ \frac{a_{l,1-m,d'}}{x'_{j'd'}}\Big), \quad j' = 1, \ldots, m',\ d' = 2, \ldots, D, \end{aligned} \tag{3.5} \]

and denote the left hand side of (3.5) as $b$. Let $a$ be a vector whose elements are all the parameters $a_{l,k,d}$. Since (3.5) uniquely determines the values of all parameters $a_{l,k,d}$, each of the maxima in (3.5) is determined uniquely (no ties). Therefore, the relation between $b$ and $a$ in equation (3.5) has the matrix representation

\[ b = Ca, \tag{3.6} \]

where each element of the matrix $C$ belongs to $\{1,\ 1/x_{jd},\ 1 + 1/x_{jd},\ 1/x'_{j'd'},\ j = 1, \ldots, m,\ d = 1, \ldots, D,\ j' = 1, \ldots, m',\ d' = 2, \ldots, D\}$. Moreover, the matrix $C$ does not change under minor perturbations of the vector $a$. Equation (3.6) is equivalent to

\[ (C^T C)^{-1} C^T b = a. \tag{3.7} \]

Since the estimators obey the SLLN, we can assume that for a large sample $C$ really is uniquely determined, and therefore $a$ can be solved as in (3.7). Our estimators are the solutions of the
system of non-linear equations: bd ( jd ) = [ L d l=1 â l,k2ld,d + ma(â l,k2ld 1,d, âl,k 2ld,d jd ) + ma(â l,k2ld 2,d, âl,k 2ld 1,d jd ) + ma(â l,k2ld 3,d, âl,k 2ld 2,d jd ) + + ma(â l, K1ld,d, âl, K 1ld +1,d jd ) + âl, K 1ld,d jd ], ma(â l,1 m,1, âl,1 m,d j d ), By SLLN, the left hand side of (38) converges to b as n So when n is sufficiently large, (38) can be written as the following matri representation: (38) (C T C) 1 C T b = â (39) 14
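To make the matrix form of the estimating equations concrete, the following is a minimal sketch for a toy single-series model $Y_i = \max(a_0 Z_i, a_1 Z_{i-1})$ (one signature pattern), for which the first equation of the system reduces to $b(x) = a_1 + \max(a_0, a_1/x) + a_0/x$. The function name `design_row`, the coefficient values and the chosen points are illustrative assumptions, not taken from the paper, and exact values of $b$ are used in place of the empirical $\hat{b}$.

```python
import numpy as np

def design_row(x, a0, a1):
    """Row of C for b(x) = a1 + max(a0, a1/x) + a0/x, columns ordered (a0, a1).

    Which term wins the max fixes the row; entries come from {1, 1/x, 1 + 1/x}.
    """
    if a0 >= a1 / x:                      # a0 wins the max
        return [1.0 + 1.0 / x, 1.0]
    return [1.0 / x, 1.0 + 1.0 / x]       # a1/x wins the max

# Hypothetical true coefficients (they sum to one) and evaluation points
# placed on both sides of the slope jumping point a1/a0.
a_true = np.array([0.7, 0.3])
xs = [0.2, 0.3, 0.6, 0.9]

C = np.array([design_row(x, *a_true) for x in xs])
b = np.array([a_true[1] + max(a_true[0], a_true[1] / x) + a_true[0] / x
              for x in xs])

# Least-squares solution; equals (C^T C)^{-1} C^T b when C has full column rank.
a_hat, *_ = np.linalg.lstsq(C, b, rcond=None)
print(a_hat)
```

In this sketch the rows of `C` are built from the true coefficients; with data, they would instead be fixed by the chosen points relative to the estimated slope jumping points, and `b` would be replaced by its empirical counterpart.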

Summarizing the arguments above, we obtain the following theorem, which gives the asymptotic results for the estimators.

Theorem 3.5 If all ratios $a_{l,j,d}/a_{l,j',d}$ for all $l$ and $j \neq j'$ are distinct for each $d = 1, \ldots, D$, and the nonzero existing ratios $a_{l,k,1}/a_{l,k,d}$ for all $l$ and $k$ are distinct for each $d = 2, \ldots, D$, of the multivariate processes $\{Y_{id}\}$, then there exist $\{x_{1d}, x_{2d}, \ldots, x_{md},\ d = 1, \ldots, D\}$ and $\{x'_{1d}, x'_{2d}, \ldots, x'_{m'd},\ d = 2, \ldots, D\}$ such that the estimator $\hat{a}$ of $a$, which is the solution of (3.8), satisfies

$$\sqrt{n}\,(\hat{a} - a) \xrightarrow{L} N\Big(0,\ B\,\Theta\Big[\Sigma + \sum_{k=1}^{\max_{l,d} K_{1ld} + \max_{l,d} K_{2ld} + 1} \{W_k + W_k^T\}\Big]\Theta^T B^T\Big),$$

where $B = (C^T C)^{-1} C^T$.

We have established the asymptotic results for the parameter estimators. In the next section we propose a procedure to determine the values of all knots $x_{jd}$ used in the estimating equations.

4 Determining the $x_{jd}$ values and simulation examples

In this section, we address how to determine the values of the points $\{x_{1d}, x_{2d}, \ldots, x_{md},\ d = 1, \ldots, D\}$ and $\{x'_{1d}, x'_{2d}, \ldots, x'_{m'd},\ d = 2, \ldots, D\}$, which are used in estimating the parameters of an M4 process whose parameters $L_d$, $K_{1ld}$ and $K_{2ld}$ are assumed known. The case of unknown $L_d$, $K_{1ld}$ and $K_{2ld}$ is addressed in Section 5 with financial applications.

In Section 2, we introduced the functions $b_d(x)$, or $q_d(x)$, and showed that the slope jumping points of $q_d(x)$ are the ratios of coefficients, i.e. $a_{l,j+1,d}/a_{l,j,d}$. In real applications, we would not expect the data to follow the model (2.1) exactly, because the degenerate features of repeated signature patterns are not likely to be real phenomena. Nevertheless, our hope is that with large enough $L_d$, $K_{1ld}$ and $K_{2ld}$, the model will provide a good approximation to the finite-dimensional extreme value distributions.

As an example, Figure 3 shows simulated realizations of an M4 process and of a mixture of an M4 process and a white noise process. Plot (a) is based on

$$Y_i = \max(0.1 Z_{1,i-1},\ 0.4 Z_{1,i},\ 0.35 Z_{2,i-1},\ 0.15 Z_{2,i}), \tag{4.1}$$

and Plot (b) is based on

$$Y_i = \max(0.1 Z_{1,i-1},\ 0.4 Z_{1,i},\ 0.35 Z_{2,i-1},\ 0.15 Z_{2,i}) + N_i, \tag{4.2}$$

where the $N_i \sim N(0, 1)$ are iid. Visually, Plots (a) and (b) are very close, but plots of local regions at a magnified scale will show the differences. This phenomenon suggests that (4.1) is a good approximation of (4.2). In Section 2, we illustrated that the blowups were caused by the very large $Z_{li}$ values. Consider now the following model:

$$Y_{id} = \max_{1 \le l \le L_d}\ \max_{-K_{1ld} \le k \le K_{2ld}} a_{l,k,d} Z_{l,i-k} + N_{id}, \quad -\infty < i < \infty,\ d = 1, \ldots, D, \tag{4.3}$$

Figure 3: A demo of realizations of an M4 process and a mixture of M4 and noise process. (a) A realization of an M4 process. (b) A realization of a mixed M4 process and noise.

where $\sum_{l=1}^{L_d} \sum_{k=-K_{1ld}}^{K_{2ld}} a_{l,k,d} = 1$ for $d = 1, \ldots, D$, and the $N_{id}$ are independent white noises with mean 0 and a common variance. For a fixed $d$ and very large clustered observations $Y_{i,d}, Y_{i+1,d}, \ldots, Y_{i+K,d}$, we have

$$\frac{Y_{i,d}}{Y_{i+1,d}} = \frac{a_{l,K_{2ld},d} Z_{l,i-K_{2ld}} + N_{id}}{a_{l,K_{2ld}-1,d} Z_{l,i-K_{2ld}} + N_{i+1,d}} = \frac{\dfrac{a_{l,K_{2ld},d}}{a_{l,K_{2ld}-1,d}} + \dfrac{N_{id}}{a_{l,K_{2ld}-1,d} Z_{l,i-K_{2ld}}}}{1 + \dfrac{N_{i+1,d}}{a_{l,K_{2ld}-1,d} Z_{l,i-K_{2ld}}}},$$

which is approximately equal to

$$\zeta_{id} = \Big(\frac{a_{l,K_{2ld},d}}{a_{l,K_{2ld}-1,d}} + \frac{N_{id}}{a_{l,K_{2ld}-1,d} Z_{l,i-K_{2ld}}}\Big)\Big(1 - \frac{N_{i+1,d}}{a_{l,K_{2ld}-1,d} Z_{l,i-K_{2ld}}}\Big) = \frac{a_{l,K_{2ld},d}}{a_{l,K_{2ld}-1,d}} + \frac{N_{id}}{a_{l,K_{2ld}-1,d} Z_{l,i-K_{2ld}}} - \frac{a_{l,K_{2ld},d}\, N_{i+1,d}}{a^2_{l,K_{2ld}-1,d} Z_{l,i-K_{2ld}}} - \frac{N_{id} N_{i+1,d}}{a^2_{l,K_{2ld}-1,d} Z^2_{l,i-K_{2ld}}}$$

for some $l$, since $Z_{l,i-K_{2ld}}$ is very large. It is easy to see that

$$E(\zeta_{id}) = \frac{a_{l,K_{2ld},d}}{a_{l,K_{2ld}-1,d}}, \qquad \bar{\zeta} = \frac{1}{n} \sum_{i=1}^{n} \zeta_{id} \xrightarrow{a.s.} \frac{a_{l,K_{2ld},d}}{a_{l,K_{2ld}-1,d}} \quad \text{as } n \to \infty.$$

Similarly, the averages of the ratios

$$\frac{Y_{i+j-1,d}}{Y_{i+j,d}} = \frac{a_{l,K_{2ld}-j+1,d} Z_{l,i-K_{2ld}} + N_{i+j-1,d}}{a_{l,K_{2ld}-j,d} Z_{l,i-K_{2ld}} + N_{i+j,d}}$$

tend to $a_{l,K_{2ld}-j+1,d}/a_{l,K_{2ld}-j,d}$, $j = 2, \ldots, K$. These properties suggest that the slope jumping points of $q_d(x)$ are the means of the observed ratios of very large observations. In practice, a cluster analysis method can be used to group the very large clustered observations (above certain thresholds) into $L_d$ groups based on the consecutive ratios $Y_{i+j-1,d}/Y_{i+j,d}$. We propose the following procedure to determine the $x_{jd}$ and $x'_{j'd}$ values.

1. For each $d$, use a cluster analysis method to cluster the consecutive ratios $(Y_{i+j-1,d}/Y_{i+j,d},\ j = 1, \ldots, K)$ into $L_d$ groups, for all very large clustered observations indexed on $i$ and appearing in $K$ consecutive days. The tuning parameters $L_d$ are assumed known in this section; how to determine $L_d$ is dealt with in the next section.

2. For all clustered groups, assign the same group number across processes to the consecutive ratios $(Y_{i+j-1,d}/Y_{i+j,d},\ j = 1, \ldots, K)$ of all very large clustered observations indexed on $i$ at which both processes exhibit very large clustered observations simultaneously.

3. Within each group, take the averages of the ratios as slope jumping points. Between any two adjacent jumping points, arbitrarily choose two points as $x_{jd}$ values. For example, suppose $r_1$, $r_2$ are two adjacent ratios; then a natural choice would be $x_{jd} = r_1 + \tfrac{1}{4}(r_2 - r_1)$, $x_{j+1,d} = r_1 + \tfrac{3}{4}(r_2 - r_1)$.

4. The $x'_{j'd}$ can be chosen by averaging the ratios $Y_{i1}/Y_{id}$ between the two processes within the same group numbers obtained in Step 2. Then $x'_{j'd}$ can take the middle value of two adjacent ratios, or two values between two adjacent ratios as in the previous step.

5. After choosing the $x_{jd}$ and $x'_{j'd}$ values, use them to estimate $a_{l,k,d}$ based on the functions $\hat{b}_d(x)$ and $\hat{b}_{1d}(x)$.

Remark 8 In Step 1, we may need to cluster the very large clustered observations into more than $L_d$ groups, since outliers may exist and cause the clustering method to fail to recognize the true patterns. In our example, we cluster those observations into $L_d + 3$ groups and discard the 3 groups that make up very small proportions of all the very large clustered observations.

Remark 9 In Steps 3 and 4, we can theoretically choose as many points $x_{jd}$ and $x'_{j'd}$ as we like, but this is not realistic because of the intensive computation and the complexity of the inferences. The goal is to choose a moderate number of points such that the estimated parameter values are close to the true parameter values.

Let us consider a bivariate process and implement the procedures discussed so far. Suppose

$$Y_{id} = \max_{1 \le l \le 3}\ \max_{-1 \le k \le 1} a_{l,k,d} Z_{l,i-k} + N_{id}, \quad -\infty < i < \infty,\ d = 1, 2, \tag{4.4}$$

where each M4 process has three signature patterns and a moving range of order 3, and the noises are $N_{id} \sim N(0, 1)$. The coefficients are listed in Table 1. The total number of parameters in the M4 process in (4.4) is 18. There is a nuisance parameter, the variance of $N_{id}$, which we do not need to estimate in order to estimate the M4 model parameters. We first generate data by simulating these bivariate processes; then, based on the simulated data, we re-estimate all coefficients simultaneously and compute their asymptotic covariance matrix.

Table 1 is obtained using simulated data with a sample size of 10,000. The $x_{jd}$ and $x'_{j'd}$ values are determined using the procedure described earlier. We use a Monte Carlo method to find the best estimates of the $a_{l,k,d}$. For all $l$, $k$, $d$, we simulate 5000 vectors of $a_{l,k,d}$ values whose ratios $a_{l,k+1,d}/a_{l,k,d}$ and $a_{l,k,1}/a_{l,k,d}$ ($d > 1$) fall in the regions determined by the $x_{jd}$, $x'_{j'd}$ computed in Steps 3 and 4. We keep the vector whose ratios $a_{l,k+1,d}/a_{l,k,d}$, $a_{l,k,1}/a_{l,k,d}$ have the minimal distance to the estimated ratios, also obtained in Steps 3 and 4. Theoretical values of $b_d(x)$ and $b_{1d}(x)$ are computed using the kept $a_{l,k,d}$ values. We repeat this process 100 times and keep the vector that gives the minimal distance from the simulated ratios $a_{l,k+1,d}/a_{l,k,d}$, $a_{l,k,1}/a_{l,k,d}$ and the theoretical values of $b_d(x)$ and $b_{1d}(x)$ (computed using the kept $a_{l,k,d}$ values) to the estimated ratios obtained in Steps 3 and 4 and the estimated functions $\hat{b}_d(x)$ and $\hat{b}_{1d}(x)$.

In Table 1, the estimated values are in all cases very close to the true parameter values, and the estimated standard deviations are very small. These results show the efficiency of the proposed estimating procedures.
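The simulation step and the ratio-based choice of knots can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the two-pattern example plotted in Figure 3 (coefficients read here as 0.1, 0.4, 0.35, 0.15, which sum to one) with $N(0,1)$ noise rather than model (4.4), the threshold 100 and the sample size are arbitrary, and a simple split of the ratios at 1 stands in for the cluster analysis of Step 1. The function names are hypothetical.

```python
import numpy as np

def simulate_m4_plus_noise(n, coeffs, noise_sd=1.0, seed=0):
    """Simulate Y_i = max_l max(a_{l,0} Z_{l,i}, a_{l,1} Z_{l,i-1}) + N_i.

    coeffs[l] = (a_{l,0}, a_{l,1}); the Z_{l,i} are iid unit Frechet.
    """
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for a0, a1 in coeffs:
        # 1/E with E ~ Exp(1) is unit Frechet: P(1/E <= z) = exp(-1/z).
        z = 1.0 / rng.standard_exponential(n + 1)
        # z[1:] holds Z_{l,i}; z[:-1] holds Z_{l,i-1}.
        y = np.maximum(y, np.maximum(a0 * z[1:], a1 * z[:-1]))
    return y + noise_sd * rng.standard_normal(n)

def large_consecutive_ratios(y, u):
    """Ratios Y_i / Y_{i+1} over pairs where both observations exceed u."""
    both = (y[:-1] > u) & (y[1:] > u)
    return y[:-1][both] / y[1:][both]

coeffs = [(0.4, 0.1), (0.15, 0.35)]        # two signature patterns
y = simulate_m4_plus_noise(200_000, coeffs, seed=1)
ratios = large_consecutive_ratios(y, u=100.0)

# Crude two-group split standing in for cluster analysis; medians are used
# because a few pairs are produced by two different large shocks.
upper = np.median(ratios[ratios > 1.0])    # expected near 0.4 / 0.1 = 4
lower = np.median(ratios[ratios <= 1.0])   # expected near 0.15 / 0.35

# Two knots between the adjacent ratio estimates, at the quarter points.
knots = lower + np.array([0.25, 0.75]) * (upper - lower)
print(upper, lower, knots)
```

In this toy setting a single very large shock $Z_{l,i}$ produces $Y_i \approx a_{l,0} Z_{l,i}$ and $Y_{i+1} \approx a_{l,1} Z_{l,i}$, so the ratios of consecutive large observations concentrate near one value per signature pattern, which is what the group averages in Step 3 rely on.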

Table 1: Simulation results for model (4.4). Columns: Parameter, True value, Estimated value, Standard deviation. Rows: $a_{l,k,d}$ for $l = 1, 2, 3$, $k = -1, 0, 1$ and $d = 1, 2$. The points used are

$x_{j1}$ = (0.0811, 0.1561, 0.2522, 0.4023, 0.6065, 0.7970, 0.9737, 1.1299, 1.2656, 1.4998, 1.8323, 2.1984),

$x_{j2}$ = (0.4276, 0.6243, 0.7325, 0.8630, 1.0157, 1.1438, 1.2474, 1.3013, 1.3057, 1.4595, 1.7630, 2.1062),

$x'_{j'}$ = (0.1323, 0.2571, 0.4187, 0.5051, 0.5163, 0.5627, 0.6442, 0.7058, 0.7475, 0.9301, 1.2537, 1.4254, 1.4452, 1.6269, 1.9706, 2.8595, 4.2938, 5.5120).

Standard deviations are computed by applying Theorem 3.5.

5 Modeling jumps in returns of financial assets as M4 processes

As mentioned earlier, our goal is to model multivariate financial time series, particularly jumps in returns, as M4 processes. Figure 4 shows time series plots of three stock returns, i.e. returns from General Electric Co, CITI Group, and Pfizer Inc. They are selected from the top 10 NYSE volume leaders on March 12, 2001. The starting date is January 4, 1982. One can see that there are extremal observed values in each sequence and that there are jumps in volatilities and returns. Since the marginal distributions of the original data are not unit Fréchet, data transformations are needed in order to apply M4 process modeling. Local standardization of the data may remove the jumps in volatilities, and a distributional transformation can give unit Fréchet margins, so both are applied in this data analysis.

5.1 Data transformation

Research has shown that the GARCH (generalized autoregressive conditional heteroscedasticity) model, proposed by Bollerslev (1986), has been quite successful in fitting financial time series data. Mikosch (2003) gives a very thorough study of GARCH modeling of dependence and tails of financial time series. Here we do not study GARCH modeling itself, but use it as a tool to model volatilities. GARCH(1,1) and GARCH(2,2) models have been used to model the volatility in this analysis. The estimated standard deviations are drawn in Figure 5. The original data sets are then divided by the estimated conditional standard deviations, and three new time series of standardized GARCH residuals are obtained. The study of GARCH residuals has drawn major attention recently; see, for example, McNeil and Frey (2000) and Engle (2002), among others. In this paper, we further study the tail dependencies between and within multivariate financial time series, especially the GARCH residuals. Figure 6 shows the standardized time series. Visually they look stationary.
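The standardization step can be sketched as follows, assuming a GARCH(1,1) recursion $\sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2$ with parameters that have already been estimated. The parameter values, the initial variance and the function name below are hypothetical, and the fitting step itself is not shown.

```python
import numpy as np

def garch11_volatility(r, omega, alpha, beta, sigma2_0):
    """Conditional standard deviations from the GARCH(1,1) recursion."""
    sigma2 = np.empty(len(r))
    sigma2[0] = sigma2_0
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return np.sqrt(sigma2)

# Hypothetical parameters; start from the unconditional variance.
omega, alpha, beta = 1e-5, 0.08, 0.90
sigma2_0 = omega / (1.0 - alpha - beta)

# Simulate returns from the same recursion so the result can be checked.
rng = np.random.default_rng(0)
eps = rng.standard_normal(1000)
r = np.empty(1000)
s2 = sigma2_0
for t in range(1000):
    r[t] = np.sqrt(s2) * eps[t]
    s2 = omega + alpha * r[t] ** 2 + beta * s2

sigma = garch11_volatility(r, omega, alpha, beta, sigma2_0)
resid = r / sigma          # standardized series (GARCH residuals)
```

When the filter uses the same parameters and starting variance as the simulation, `resid` reproduces the innovations `eps`, which is the sense in which dividing by the estimated conditional standard deviation removes the volatility dynamics.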

Figure 4: Negative daily returns (negative log returns, 01/04/82 to 03/05/01). The top plot is for General Electric Co, the middle plot is for CITI Group, and the bottom plot is for Pfizer. These figures show that there are extremal observations and that the highest drops happened on the same day in all three time series, i.e. October 19, 1987, the date of the Wall Street crash.

Figure 5: Estimated volatility (estimated conditional standard deviation using GARCH(1,1)). The top plot is the estimated volatility plot for GE data; the middle plot is the estimated volatility plot for CITI Group data; the bottom plot is the estimated volatility plot for Pfizer data.

Figure 6: Standardized series of the negative returns drawn in Figure 4 (negative daily returns divided by the estimated standard deviation, 01/04/82 to 03/05/01). The top plot is for General Electric Co, the middle plot is for CITI Group, and the bottom plot is for Pfizer.

Methods for exceedances over thresholds are widely used in the applications of extreme value theory. Consider the distribution of $X$ conditionally on exceeding some high threshold $u$; we have

$$F_u(y) = \frac{F(u + y) - F(u)}{1 - F(u)}.$$

As $u \to x_F = \sup\{x : F(x) < 1\}$, Pickands (1975) shows that the generalized Pareto distribution (GPD) is the only non-degenerate distribution that approximates the distribution of exceedances over thresholds. The limit distribution of $F_u(y)$ is given by

$$G(y; \sigma, \xi) = 1 - \Big(1 + \xi \frac{y}{\sigma}\Big)_+^{-1/\xi}. \tag{5.1}$$

In the GPD, $y_+ = \max(y, 0)$, and $\xi$ is the tail index, also called the shape parameter, which gives a precise characterization of the shape of the tail of the GPD. The case $\xi > 0$ is long-tailed: $1 - G(y)$ decays at the same rate as $y^{-1/\xi}$ for large $y$. The case $\xi < 0$ corresponds to a distribution with a finite upper end point at $-\sigma/\xi$. The case $\xi = 0$ corresponds to the following exponential distribution with mean $\sigma$, obtained by taking the limit as $\xi \to 0$:

$$G(y; \sigma, 0) = 1 - \exp\Big(-\frac{y}{\sigma}\Big).$$

The Pareto, or GPD, and other similar distributions have long been used as models for long-tailed processes. In order to apply M4 process modeling, we need to obtain transformed data whose fitted distributions have unit Fréchet margins. The logical way is to fit a generalized extreme value distribution, which may be a combined form of the three types of extreme value distributions. The generalized extreme value (GEV) form can be written as

$$H(x) = \exp\Big[-\Big(1 + \xi \frac{x - \mu}{\psi}\Big)_+^{-1/\xi}\Big], \tag{5.2}$$

where $\mu$ is a location parameter, $\psi > 0$ is a scale parameter, and $\xi$ is a shape parameter. Pickands (1975) establishes the rigorous connection of the GPD with the GEV. He shows that for any given $F$, a GPD approximation arises from (5.1) if and only if the classical extreme value limit (5.2) holds with the same shape parameter $\xi$ as in (5.1). Thus there is a close parallel between limit results for sample maxima and limit results for exceedances over thresholds, which is quite extensively exploited in modern statistical methods for extremes.

Using the connection between the GPD and the GEV, we can fit the GPD to the exceedances over thresholds and then obtain the GEV for the sample maxima, but there are many situations in which a GPD fit may not be appropriate. In practice, the mean excess plot, Z-plot and W-plot can tell whether or not fitting an extreme value distribution to the extreme values of the returns is appropriate. The mean excess plot is a plot of the mean of all excess values over a threshold $u$ against $u$ itself. It usually suggests whether an extreme value distribution fit is appropriate, and it is very useful for initial diagnostics and for selecting the threshold.

The underlying idea behind the analysis of Z-statistics and W-statistics is the point-process approach to univariate extreme value modeling due to Smith (1989). According to this viewpoint, the exceedance times and excess values of a high threshold are viewed as a two-dimensional point process. If the process is stationary and satisfies a condition that there are asymptotically no clusters among the high-level exceedances, then its limiting form is non-homogeneous Poisson. Smith and Shively (1995) introduce a number of diagnostic devices to examine the fit of the generalized extreme value distributions. These techniques have been broadly used in model diagnostics; see, for example, Tsay (1999) and Smith and Goodman (2000). Smith (2003) is an extensive source on model diagnostics.

We have applied these graphical diagnostic tools to both the original data and the standardized data, and concluded that extreme value distribution fits are not appropriate for the original data. The diagnostic results suggest a good fit for the standardized data. We skip the intermediate steps used to diagnose the data and only report the final transformed data, since the main purpose of the paper is to illustrate the estimating procedures for the M4 process. For those steps, we refer to Zhang and Smith (2001), Zhang (2002), and Smith (2003).

Since the GEV is easier to handle for transforming to and from the Fréchet distribution, it is used to fit the data above a certain threshold (0.02 is used for the original data and 1.2 for the standardized data in this study) for each sequence. Since there is some asymmetric behavior between positive returns and negative returns, we fit the standardized positive returns and negative returns to the GEV separately. The estimated parameter values of the GEV distributions are summarized in Table 2 (standardized negative returns) and Table 3 (standardized positive returns). From these two tables, we see that all negative returns are long-tailed, but not all positive returns are long-tailed, since not all estimated shape parameter values are positive.

The data are then transformed to the Fréchet scale using the fitted GEV functions. We combine the transformed data into three new time series, which are standardized and transformed absolute returns. The transformed data are plotted in Figure 7. The final transformed data are the basis of the M4 process modeling.
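The transformation to the Fréchet scale can be sketched as follows: if $H$ is the fitted GEV distribution function, then $z = -1/\log H(x)$ has the unit Fréchet distribution function $\exp(-1/z)$. This is a minimal sketch in which the parameter values and function names are hypothetical, the inputs are assumed to lie strictly inside the support of the fitted GEV, and the treatment of observations below the fitting threshold is not shown.

```python
import numpy as np

def gev_cdf(x, mu, psi, xi):
    """GEV distribution function H(x); xi == 0 is the Gumbel limit."""
    s = (np.asarray(x, dtype=float) - mu) / psi
    if xi == 0.0:
        return np.exp(-np.exp(-s))
    t = np.maximum(1.0 + xi * s, 0.0)             # the (.)_+ truncation
    with np.errstate(divide="ignore"):            # t == 0 with xi > 0 gives H = 0
        return np.exp(-np.power(t, -1.0 / xi))

def to_unit_frechet(x, mu, psi, xi):
    """Map x to z = -1 / log H(x); assumes 0 < H(x) < 1 for every input."""
    return -1.0 / np.log(gev_cdf(x, mu, psi, xi))

# Hypothetical fitted parameters and a few standardized returns above 1.2.
mu, psi, xi = 1.2, 0.5, 0.2
x = np.array([1.3, 2.0, 5.0])
z = to_unit_frechet(x, mu, psi, xi)
print(z)
```

For $\xi \neq 0$ the map simplifies to $z = (1 + \xi (x - \mu)/\psi)^{1/\xi}$, which is monotone increasing in $x$, so the ordering of the extreme observations is preserved on the Fréchet scale.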
