Practical conditions on Markov chains for weak convergence of tail empirical processes
Olivier Wintenberger, University of Copenhagen and Paris VI
Joint work with Rafał Kulik and Philippe Soulier
Toronto, May 4, 2016
1/31
I. Regularly varying time series II. The tail empirical process III. Statistical applications 2/31
I. Regularly varying time series

Let {X_t, t ∈ ℤ} be a real-valued stationary time series. It is regularly varying iff for any k ≥ 0,

    L( (X_0/x, X_1/x, ..., X_k/x) | |X_0| > x ) → (Y_0, ..., Y_k)  as x → ∞.

This defines a process {Y_k, k ≥ 0} called the tail process, Basrak and Segers (2009).
- Y_0 has a two-sided Pareto distribution on (−∞, −1] ∪ [1, ∞).
- If for some j ≥ 1, X_j is extremally independent of X_0, then Y_j ≡ 0.
- |Y_0| is independent of (Y_j/|Y_0|)_{j≥0} = (Θ_j)_{j≥0}, called the spectral tail process.
3/31
Harris recurrent Markov chains

Assume X_t = g(Φ_t), with {Φ_t} an aperiodic, recurrent and irreducible Markov chain with kernel P admitting an invariant distribution π, such that X_0 = g(Φ_0) is regularly varying. Then {X_t} is a stationary regularly varying time series, Resnick and Zeber (2013), Janssen and Segers (2014). Its tail process is a multiplicative random walk:

    Y_t = A_t ⋯ A_1 Y_0,  t ≥ 0,

for some iid sequence (A_t).
4/31
Drift condition

Assume the following drift condition holds:

    PV(x) ≤ λ V(x) + b 1_C(x),

where λ ∈ (0, 1), V : E → [1, ∞) and C is a small set. Under these conditions the chain (Φ_t) is β-mixing with geometric rate, and so is (X_t), Meyn and Tweedie (1997).
5/31
The condition DC_p, p > 0, Mikosch and W. (2014)

Let {u_n} be an increasing sequence such that

    lim_n F̄(u_n) = 0  and  lim_n n F̄(u_n) = ∞.

There exist p ∈ (0, α/2) and a constant c such that

    g^p ≤ c V,                                                              (1)

and, for every compact set [a, b] ⊂ (0, ∞),

    limsup_n  sup_{a ≤ s ≤ b}  E[ V(Φ_0) 1_{s u_n < g(Φ_0)} ] / ( u_n P(X_0^p > u_n) ) < ∞.   (2)

Roughly, (1) and (2) imply the existence of c_1, c_2 and n_0 such that

    c_1 P(V(Φ_0) > u_n) ≤ P(|X_0|^p > u_n) ≤ c_2 P(V(Φ_0) > u_n),  n ≥ n_0.
6/31
Example 1: the GARCH(1,1) process

We consider a GARCH(1,1) process X_t = σ_t Z_t, where (Z_t) is iid N(0, 1) and

    σ_t² = α_0 + σ_{t−1}² (α_1 Z_{t−1}² + β_1) = α_0 + σ_{t−1}² A_t.

This is a special SRE; DC_p then holds under Kesten's condition E[A_0^{α/2}] = 1, for p < α, with V(x) = |x|^p and g(x) = x, Mikosch and W. (2014).
7/31
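For intuition, the GARCH(1,1) recursion above can be simulated directly. A minimal sketch; the parameter values and the function name are illustrative choices, not from the talk:

```python
import numpy as np

def simulate_garch11(n, alpha0=0.1, alpha1=0.3, beta1=0.5, burn=1000, seed=0):
    """Simulate X_t = sigma_t * Z_t with iid N(0,1) innovations, where
    sigma_t^2 = alpha0 + sigma_{t-1}^2 * (alpha1 * Z_{t-1}^2 + beta1),
    i.e. A_t = alpha1 * Z_{t-1}^2 + beta1 in the SRE form of the slide."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n + burn)
    sig2 = np.empty(n + burn)
    sig2[0] = alpha0 / (1.0 - alpha1 - beta1)  # stationary mean of sigma_t^2
    for t in range(1, n + burn):
        sig2[t] = alpha0 + sig2[t - 1] * (alpha1 * z[t - 1] ** 2 + beta1)
    return np.sqrt(sig2[burn:]) * z[burn:]     # discard the burn-in period
```

Since α_1 + β_1 < 1 here, the path is stationary with finite variance; heavier tails arise as α_1 + β_1 approaches 1.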
Example 2: the AR(p) process

Assume that {X_j, j ∈ ℤ} is an AR(p) model

    X_j = φ_1 X_{j−1} + ⋯ + φ_p X_{j−p} + ε_j,  j ≥ 1,

that satisfies the following conditions:
- {ε_j, j ∈ ℤ} are iid, regularly varying with index α;
- the polynomial 1 − φ_1 z − ⋯ − φ_p z^p has no root in the closed unit disk.
As for random-coefficient AR models, DC_p holds for p < α, for some V and with g(x_1, ..., x_p) = x_1, Feigin and Tweedie (1985).
8/31
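Such an AR(p) path with regularly varying innovations is easy to simulate; a sketch, assuming symmetrized Pareto-type noise as an illustrative choice of regularly varying innovation (the function name and defaults are not from the talk):

```python
import numpy as np

def simulate_ar(phi, n, alpha=2.5, burn=500, seed=1):
    """Simulate an AR(p) path driven by symmetrized Pareto-type innovations,
    which are regularly varying with index alpha."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    # symmetric heavy-tailed noise: P(|eps| > t) ~ t^(-alpha)
    eps = rng.pareto(alpha, n + burn) * rng.choice([-1.0, 1.0], size=n + burn)
    x = np.zeros(n + burn)
    for j in range(p, n + burn):
        # x[j-p:j][::-1] lists X_{j-1}, ..., X_{j-p} in that order
        x[j] = np.dot(phi, x[j - p:j][::-1]) + eps[j]
    return x[burn:]
```

With α = 2.5 the innovations have finite variance, so the usual autocorrelation structure of the AR model is visible in simulations.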
Example 3: the Threshold ARCH(1) process

Let ξ ∈ ℝ. Assume that {X_j} is the T-ARCH model

    X_j = (b_10 + b_11 X_{j−1}²)^{1/2} Z_j 1_{X_{j−1} < ξ} + (b_20 + b_21 X_{j−1}²)^{1/2} Z_j 1_{X_{j−1} ≥ ξ},

that satisfies the following conditions:
- b_ij > 0;
- {Z_j, j ∈ ℤ} are iid Gaussian r.v.;
- the Lyapunov exponent γ = p log b_11^{1/2} + (1 − p) log b_21^{1/2} + E[log|Z_1|], where p = P(Z_1 < 0), is strictly negative;
- (b_11 b_21)^{p/2} E[|Z_0|^p] < 1.
Then DC_p holds.
9/31
A counterexample

Let {Z_j, j ∈ ℤ} be iid positive integer-valued r.v., regularly varying with index β > 1. Define the Markov chain {X_j, j ≥ 0} by the recursion

    X_j = X_{j−1} − 1  if X_{j−1} > 1,
    X_j = Z_j          if X_{j−1} = 1.

Since β > 1, the chain admits a stationary distribution π on ℕ given by

    π(n) = P(Z_0 ≥ n) / E[Z_0],  n ≥ 1.

Karamata's Theorem implies that π is regularly varying with index α = β − 1.
10/31
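This countdown chain can be simulated in a few lines; a sketch, where sampling Z by inverse transform so that P(Z ≥ m) decays like m^(−β) is an illustrative choice, not from the talk:

```python
import numpy as np

def simulate_queue(n, beta=1.5, seed=2):
    """Simulate the counterexample chain: X_j = X_{j-1} - 1 while X_{j-1} > 1,
    and X_j = Z_j once the chain hits the atom {1}, with Z regularly varying
    with index beta > 1."""
    rng = np.random.default_rng(seed)
    x = np.empty(n, dtype=np.int64)
    x[0] = 1
    for j in range(1, n):
        if x[j - 1] > 1:
            x[j] = x[j - 1] - 1          # deterministic countdown
        else:
            # inverse transform: ceil(U^(-1/beta)) has P(Z >= m) ~ m^(-beta)
            x[j] = int(np.ceil(rng.uniform() ** (-1.0 / beta)))
    return x
```

The long deterministic descents from each heavy-tailed jump are exactly what destroys geometric ergodicity and makes the extremal index 0.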
- The state {1} is an atom and the return time to the atom is distributed as Z_0.
- The chain is not geometrically ergodic and DC_p cannot hold.
- The time series is regularly varying with Y_j = 1 for all j ≥ 0.
- It admits a phantom distribution, namely the distribution of Z_0, O'Brien (1987), Doukhan et al. (2015):

    P( max_{1 ≤ j ≤ n} X_j ≤ u_n ) − P(Z_0 ≤ u_n)^{n/E[Z_0]} → 0.

[Figure: sample paths of the GARCH (top) and QUEUE (bottom) processes over 10000 time steps.]
11/31
II. The tail empirical process

II.1. Univariate TEP

The univariate tail empirical process is defined by

    e_n(t) = (1 / (n F̄(u_n))) Σ_{i=1}^n { 1_{X_i > u_n t} − F̄(u_n t) },

relevant to infer the marginal tail. In the iid case,

    √(n F̄(u_n)) e_n(t) ⇒ W(t^{−α}),

where W is the standard Brownian motion and ⇒ means weak convergence in the space D(0, ∞] endowed with the J_1 topology, Starica and Resnick (1997).
12/31
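For intuition, the uncentered counterpart T_n(t) = (n F̄(u_n))^{-1} Σ_i 1_{X_i > u_n t}, with the empirical tail in place of the unknown F̄, can be computed directly; for regularly varying data it should be close to t^{−α} for t ≥ 1. A sketch; the function name and the use of an empirical quantile as threshold are illustrative assumptions:

```python
import numpy as np

def tail_empirical_df(x, u, t):
    """Uncentered tail empirical distribution function:
    T_n(t) = (# of X_i > u * t) / (# of X_i > u).
    For regularly varying data with index alpha, T_n(t) ~ t^(-alpha), t >= 1."""
    x = np.asarray(x, dtype=float)
    k = np.count_nonzero(x > u)          # number of exceedances of u
    return np.count_nonzero(x > u * t) / k
```

For instance, with iid Pareto data of index α = 2 and u the empirical 95% quantile, T_n(2) should be close to 2^{−2} = 0.25.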
II.2. Multivariate TEP

For each quantity of interest, an appropriate type of TEP is used.

Joint exceedances, Kulik and Soulier (2015), extremograms, Davis and Mikosch (2009):

    e_n(t_1, ..., t_h) = (1 / (n F̄(u_n))) Σ_{i=1}^n { 1_{X_{i+1} > u_n t_1, ..., X_{i+h} > u_n t_h} − P(X_1 > u_n t_1, ..., X_h > u_n t_h) }.

Conditional limiting distributions:

    e_n(t_1, ..., t_h) = (1 / (n F̄(u_n))) Σ_{i=1}^n 1_{X_{i+1} ≤ u_n t_1, ..., X_{i+h} ≤ u_n t_h} 1_{X_i > u_n} − P(X_1 ≤ u_n t_1, ..., X_h ≤ u_n t_h | X_0 > u_n).
13/31
Spectral tail process, Drees et al. (2015). The relevant TEP is

    e_n(t) = (1 / (n F̄(u_n))) Σ_{i=1}^n 1_{X_{i+1}/X_i ≤ t, X_i > u_n} − P(X_1/X_0 ≤ t | X_0 > u_n).

Many variations exist and, more generally, cluster functionals, Drees and Rootzén (2010). Weighted versions:

    e_n(t_1, ..., t_h) = (1 / (n F̄(u_n))) Σ_{i=1}^n { Ψ(X_{i+1}, ..., X_{i+h}) 1_{X_{i+1} > u_n t_1, ..., X_{i+h} > u_n t_h} − E[Ψ(X_1, ..., X_h) 1_{X_1 > u_n t_1, ..., X_h > u_n t_h}] }.
14/31
Extremal quantities of interest
- Univariate: the tail index; extreme quantiles.
- Multivariate, extremal dependence: various coefficients of extremal dependence; extremograms; distribution of the tail process.
- Multivariate, extremal independence: conditional limiting distributions; conditional scaling exponents.
- Infinite dimensional: extremal index; cluster functionals.
15/31
Conditions for time series

To prove the weak convergence of the tail empirical process for time series, three types of conditions are needed:
- a temporal dependence condition;
- a tightness criterion;
- a finite clustering condition;
plus bias conditions. We will not discuss the bias issues, which are important but of a completely different nature (second-order conditions).
16/31
Temporal dependence

β-mixing allows the blocks method, Rootzén (2009). For a suitable choice of sequences r_n and u_n, the limiting distribution of e_n is the same as that of

    ẽ_n(t) = (1 / (n F̄(u_n))) Σ_{i=1}^{m_n} Σ_{j=1}^{r_n} { 1_{X̃_{(i−1)r_n+j} > u_n t} − F̄(u_n t) },

where m_n = [n/r_n] and the coupling blocks, Yu (1994),

    { X̃_j, j = (i−1)r_n + 1, ..., i r_n },  i = 1, ..., m_n,

are iid, each distributed as one original block of {X_j}. Given this approximation, we need two more conditions: existence of the limiting variance of each block, suitably normalized, and tightness of the TEP.
17/31
The variance of one block

By standard computations, we obtain

    (1 / (r_n F̄(u_n))) var( Σ_{j=1}^{r_n} 1_{X_j > u_n s} )
      = F̄(u_n s)/F̄(u_n) + Σ_{j=1}^{r_n} P(X_0 > u_n s, X_j > u_n s)/F̄(u_n) + O(r_n F̄(u_n))
      ≈ s^{−α} + Σ_{j=1}^{r_n} P(X_0 > u_n s, X_j > u_n s)/F̄(u_n) + O(r_n F̄(u_n)).

If we assume that r_n F̄(u_n) → 0, then the sum above must have a limit.
18/31
Note that joint regular variation of (X_0, X_j) implies that, for fixed L > 0, by definition of the tail process {Y_j, j ≥ 1},

    lim_n Σ_{j=1}^L P(X_0 > u_n s, X_j > u_n s)/F̄(u_n) = s^{−α} Σ_{j=1}^L P(Y_j > 1).

This implies that for each L ≥ 1,

    liminf_n (1 / (r_n F̄(u_n))) var( Σ_{j=1}^{r_n} 1_{X_j > u_n s} ) ≥ s^{−α} Σ_{j=1}^L P(Y_j > 1).

Thus a necessary condition for the quantity on the left-hand side to have a finite limit is

    Σ_{j=1}^∞ P(Y_j > 1) < ∞.
19/31
Finite clustering condition

A sufficient condition is

    lim_{L→∞} limsup_n (1/F̄(u_n)) Σ_{L ≤ j ≤ r_n} P(X_0 > u_n s, X_j > u_n s) = 0          (FC)

(known under the misleading name anti-clustering condition). If Condition FC holds, then for s ≤ t,

    lim_n n F̄(u_n) cov(e_n(t), e_n(s)) = s^{−α} Σ_{j∈ℤ} P(Y_j > t/s),

where {Y_j, j ∈ ℤ} is the tail process (extended to negative integers thanks to the forward–backward formula of Basrak and Segers, 2009).
20/31
- If the time series is extremally independent, i.e. Y_j = 0 for j ≠ 0, then the limit is the same as in the iid case.
- If Condition FC holds, then the extremal index θ of the chain {X_t} is positive and is given by, Basrak and Segers (2009),

    θ = P( max_{k ≥ 1} Y_k ≤ 1 ) > 0.

- In the counterexample θ = 0, Roberts et al. (2006).
21/31
Checking Condition FC for Markov chains

Lemma. Under the DC_p and minorization conditions, Condition FC holds.

Proof: The set C is a regenerative set. Let τ_C be the first return time to C. Fix an integer L > 0 and split the sum at τ_C:

    (1/F̄(u_n)) E[ Σ_{j=L}^{r_n} 1_{s u_n < X_0} 1_{s u_n < X_j} ]
      ≤ (1/F̄(u_n)) E[ Σ_{j=L}^{τ_C} 1_{s u_n < X_0} 1_{s u_n < X_j} ]
        + (1/F̄(u_n)) E[ Σ_{j=τ_C+1}^{r_n} 1_{s u_n < X_0} 1_{s u_n < X_j} 1_{τ_C ≤ r_n} ].

The last term is dealt with by standard regenerative arguments.
22/31
For r > 1, the sum up to the first visit is bounded by

    (1/F̄(u_n)) E[ Σ_{j=L}^{τ_C} 1_{s u_n < X_0} 1_{s u_n < X_j} ]
      ≤ ∫ E[ Σ_{j=L}^{τ_C} 1_{s u_n < X_j} | Φ_0 = x ] ν_n(dx)
      ≤ C u_n^{−p} r^{−L} ∫ E[ Σ_{j=L}^{τ_C} r^j V(Φ_j) | Φ_0 = x ] ν_n(dx).

The geometric drift condition states precisely that there exists r > 1 such that

    E[ Σ_{j=0}^{τ_C} r^j V(Φ_j) | Φ_0 = x ] ≤ C |x|^p.

Plugging this bound into the integral yields

    (1/F̄(u_n)) E[ Σ_{j=L}^{τ_C} 1_{s u_n < X_0} 1_{s u_n < X_j} ] ≤ C r^{−L}.
23/31
Tightness

Tightness of the process e_n under the β-mixing condition holds under the following sufficient condition: for each a > 0 and s, t ≥ a,

    limsup_n (1 / (r_n F̄(u_n))) E[ ( Σ_{j=1}^{r_n} 1_{u_n s < X_j ≤ u_n t} )² ] ≤ C (s^{−α} − t^{−α}).

This condition can be proved for Markov chains which satisfy the DC_p and minorization conditions as above. Moreover, the following restriction on the block size is needed:

    lim_n r_n / √(n F̄(u_n)) = 0.
24/31
Theorem. Let the DC_p and minorization conditions hold, and assume moreover that there exists η > 0 such that

    lim_n log^{1+η}(n) { F̄(u_n) + 1/(n F̄(u_n)) } = 0.

Let s_0 > 0 be fixed. The TEP converges weakly in ℓ^∞([s_0, ∞)) to a centered Gaussian process at the rate √(n F̄(u_n)). If ψ : ℝ^{h+1} → ℝ is such that

    |ψ(x_0, ..., x_h)| ≤ c( (|x_0| ∨ 1)^{q_0} + ⋯ + (|x_h| ∨ 1)^{q_h} ),

with q_i + q_{i′} ≤ p ≤ α/2 for all i, i′ = 0, ..., h, then the weighted TEP converges weakly to a centered Gaussian process at the rate √(n F̄(u_n)).
25/31
Examples and a counterexample

Most usual Markovian time series satisfy the DC_p condition:
- GARCH(1,1), SRE, AR(p),
- functional autoregressive processes X_{t+1} = g(X_t) + Z_t with limsup_{|x|→∞} |g(x)|/|x| = λ < 1,
- threshold models, AR-ARCH models, etc.

The counterexample does not satisfy the DC_p condition and its TEP converges at a rate different from the iid case. The limit is not Gaussian if var(Z) = ∞.
26/31
III. Statistical applications

In the case of extremal independence, Y_j = 0 for all j ≠ 0, under the DC_p and minorization conditions, the univariate TEP has the same limit as in the iid case. Thus most of the statistical inference works similarly to the iid case.
27/31
The Hill estimator

The classical Hill estimator of γ = 1/α is defined as

    γ̂ = (1/k) Σ_{j=1}^k log( X_{n:n−j+1} / X_{n:n−k} ).

Corollary. Under the DC_p and minorization conditions, and if one can neglect the bias, then

    √k (γ̂ − γ) ⇒ N( 0, α^{−2} ( 1 + 2 Σ_{j=1}^∞ P(Y_j > 1 | Y_0 > 1) ) ).

Notice that the limit variance is the one from the iid case multiplied by 1 + 2 Σ_{j≥1} P(Y_j > 1 | Y_0 > 1), a sum of extremogram terms.
28/31
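The estimator itself is a one-liner on sorted data. A minimal sketch, matching the order-statistics formula above (the function name is an illustrative choice):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimator gamma_hat = (1/k) * sum_{j=1}^{k}
    log(X_{n:n-j+1} / X_{n:n-k}), based on the k largest order statistics."""
    xs = np.sort(np.asarray(x, dtype=float))
    # xs[-k:] are the k largest values, xs[-k-1] is X_{n:n-k}
    return float(np.mean(np.log(xs[-k:] / xs[-k - 1])))
```

On exact iid Pareto(α = 2) data the estimator concentrates around γ = 1/α = 0.5.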
Estimation of the cluster index

Under Condition DC_p, the cluster index b_+ exists, Mikosch and Wintenberger (2014):

    b_+(h) = (1/h) lim_n P(X_0 + ⋯ + X_h > u_n) / F̄(u_n),   lim_{h→∞} b_+(h) = b_+.

Define

    b̂_+(h) = (1/(kh)) Σ_{j=1}^{n−h} 1_{X_j + ⋯ + X_{j+h} > X_{n:n−k}}.

Corollary. Under the DC_p and minorization conditions, and if one can neglect the bias, √k (b̂_+(h) − b_+(h)) converges weakly to a Gaussian r.v.
29/31
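The estimator b̂_+(h) above can be computed with rolling sums; a sketch, using a cumulative-sum trick for the windows (the function name is an illustrative choice, not from the talk):

```python
import numpy as np

def cluster_index_estimator(x, h, k):
    """Estimator b_hat_+(h) = (kh)^{-1} * sum_{j=1}^{n-h}
    1{X_j + ... + X_{j+h} > X_{n:n-k}}, where X_{n:n-k} is the
    (k+1)-th largest order statistic, as on the slide."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    thresh = np.sort(x)[n - k - 1]                 # X_{n:n-k}
    csum = np.concatenate(([0.0], np.cumsum(x)))
    window_sums = csum[h + 1:] - csum[:n - h]      # sums of h+1 consecutive terms
    return float(np.count_nonzero(window_sums > thresh)) / (k * h)
```

Note that for iid positive heavy-tailed data the limit b_+(h) equals (h+1)/h, since a window sum exceeds a high threshold essentially when a single term does.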
What about the counterexample?

In most statistical applications we are interested in the behavior of the TEP e_n at X_{n:n−k}/u_n. Both quantities behave in a very unusual way, but the ultimate estimation seems to work normally.

[Figure: estimation results for the counterexample.]
30/31
Thanks! 31/31