Thomas Mikosch and Daniel Straumann: Stable Limits of Martingale Transforms with Application to the Estimation of Garch Parameters

Similar documents
The sample autocorrelations of financial time series models

Nonlinear Time Series Modeling

GARCH processes probabilistic properties (Part 1)

Practical conditions on Markov chains for weak convergence of tail empirical processes

Heavy Tailed Time Series with Extremal Independence

QUASI-MAXIMUM-LIKELIHOOD ESTIMATION IN CONDITIONALLY HETEROSCEDASTIC TIME SERIES: A STOCHASTIC RECURRENCE EQUATIONS APPROACH 1

Stochastic volatility models: tails and memory

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case

Stochastic volatility models with possible extremal clustering

ECON 616: Lecture 1: Time Series Basics

Consistency of Quasi-Maximum Likelihood Estimators for the Regime-Switching GARCH Models

Continuous invertibility and stable QML estimation of the EGARCH(1,1) model

Additive functionals of infinite-variance moving averages. Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535

Probability and Measure

Extremogram and Ex-Periodogram for heavy-tailed time series

Introduction to Real Analysis Alternative Chapter 1

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

Some functional (Hölderian) limit theorems and their applications (II)

Extremogram and ex-periodogram for heavy-tailed time series

M-estimators for augmented GARCH(1,1) processes

Multivariate Markov-switching ARMA processes with regularly varying noise

Metric Spaces and Topology

Regular Variation and Extreme Events for Stochastic Processes

On the estimation of the heavy tail exponent in time series using the max spectrum. Stilian A. Stoev

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Gaussian Processes. 1. Basic Notions

Location Multiplicative Error Model. Asymptotic Inference and Empirical Analysis

On convergent power series

LARGE DEVIATION PROBABILITIES FOR SUMS OF HEAVY-TAILED DEPENDENT RANDOM VECTORS*

Continued fractions for complex numbers and values of binary quadratic forms

Multivariate Markov-switching ARMA processes with regularly varying noise

Asymptotic inference for a nonstationary double ar(1) model

STAT 200C: High-dimensional Statistics

Nonparametric regression with martingale increment errors

Strictly Stationary Solutions of Autoregressive Moving Average Equations

4 Sums of Independent Random Variables

Chapter 4: Asymptotic Properties of the MLE

Math 209B Homework 2

ASYMPTOTIC NORMALITY OF THE QMLE ESTIMATOR OF ARCH IN THE NONSTATIONARY CASE

Limit theorems for dependent regularly varying functions of Markov chains

Consistency of the maximum likelihood estimator for general hidden Markov models

3 Integration and Expectation

Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University

Analysis-3 lecture schemes

Notes on uniform convergence

The Canonical Gaussian Measure on R

A Stochastic Recurrence Equation Approach to Stationarity and phi-mixing of a Class of Nonlinear ARCH Models

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

Spring 2012 Math 541B Exam 1

ON KRONECKER PRODUCTS OF CHARACTERS OF THE SYMMETRIC GROUPS WITH FEW COMPONENTS

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

µ X (A) = P ( X 1 (A) )

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

Extreme Value Analysis and Spatial Extremes

Observer design for a general class of triangular systems

Conditional independence, conditional mixing and conditional association

Locally convex spaces, the hyperplane separation theorem, and the Krein-Milman theorem

Real Analysis Math 131AH Rudin, Chapter #1. Dominique Abdi

On Kesten s counterexample to the Cramér-Wold device for regular variation

Stable Process. 2. Multivariate Stable Distributions. July, 2006

4 Expectation & the Lebesgue Theorems

Introduction and Preliminaries

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES

Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals

5: MULTIVARATE STATIONARY PROCESSES

Submitted to the Brazilian Journal of Probability and Statistics

If we want to analyze experimental or simulated data we might encounter the following tasks:

A NICE PROOF OF FARKAS LEMMA

Estimation of Dynamic Regression Models

Chapter 6. Convergence. Probability Theory. Four different convergence concepts. Four different convergence concepts. Convergence in probability

GARCH Models. Eduardo Rossi University of Pavia. December Rossi GARCH Financial Econometrics / 50

Problem set 1, Real Analysis I, Spring, 2015.

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Statistical inference on Lévy processes

ELEMENTS OF PROBABILITY THEORY

Chapter 2 Metric Spaces

July 31, 2009 / Ben Kedem Symposium

L p Spaces and Convexity

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

Chapter 2: Fundamentals of Statistics Lecture 15: Models and statistics

Module 3. Function of a Random Variable and its distribution

STA205 Probability: Week 8 R. Wolpert

A CLT FOR MULTI-DIMENSIONAL MARTINGALE DIFFERENCES IN A LEXICOGRAPHIC ORDER GUY COHEN. Dedicated to the memory of Mikhail Gordin

X n = c n + c n,k Y k, (4.2)

On rational approximation of algebraic functions. Julius Borcea. Rikard Bøgvad & Boris Shapiro

Lecture 17 Brownian motion as a Markov process

Random Bernstein-Markov factors

University of Pavia. M Estimators. Eduardo Rossi

Topological vectorspaces

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Isomorphisms between pattern classes

Process-Based Risk Measures for Observable and Partially Observable Discrete-Time Controlled Systems

S chauder Theory. x 2. = log( x 1 + x 2 ) + 1 ( x 1 + x 2 ) 2. ( 5) x 1 + x 2 x 1 + x 2. 2 = 2 x 1. x 1 x 2. 1 x 1.

STAT 7032 Probability Spring Wlodek Bryc

Mi-Hwa Ko. t=1 Z t is true. j=0

A Functional Central Limit Theorem for an ARMA(p, q) Process with Markov Switching

CONSTRAINED PERCOLATION ON Z 2

1 The Observability Canonical Form

Weighted Sums of Orthogonal Polynomials Related to Birth-Death Processes with Killing

Elements of Probability Theory

Transcription:

MaPhySto The Danish National Research Foundation: Network in Mathematical Physics and Stochastics Research Report no. 11 March 2004 Thomas Mikosch and Daniel Straumann: Stable Limits of Martingale Transforms with Application to the Estimation of Garch Parameters ISSN 1398-2699

STABLE LIMITS OF MARTINGALE TRANSFORMS WITH APPLICATION TO THE ESTIMATION OF GARCH PARAMETERS THOMAS MIKOSCH AND DANIEL STRAUMANN Abstract. In this paper we study the asymptotic behavior of the Gaussian quasi maximum likelihood estimator of a stationary GARCH process with heavy-tailed innovations. This means that the innovations are regularly varying with index α 2, 4. Then, in particular, the marginal distribution of the GARCH process has infinite fourth moment and standard asymptotic theory with normal limits and n-rates breaks down. This was recently observed by Hall and Yao 2003. It is the aim of this paper to indicate that the limit theory for the parameter estimators in the heavytailed case nevertheless very much parallels the normal asymptotic theory. In the light-tailed case, the limit theory is based on the CLT for stationary ergodic finite variance martingale difference sequences. In the heavy-tailed case such a general result does not exist, but an analogous result with infinite variance stable limits can be shown to hold under certain mixing conditions which are satisfied for GARCH processes. It is the aim of the paper to give a general structural result for infinite variance limits which can also be applied in situations more general than GARCH. 1. Introduction The motivation for writing this paper comes from Gaussian quasi maximum likelihood estimation QMLE for GARCH generalized autoregressive conditionally heteroscedastic processes with regularly varying noise; we refer to Section 4 for a detailed description of the problem. Recall that the process p q 1.1 X t = σ t Z t, with σt 2 = α 0 + α i Xt i 2 + β j σt j 2, t Z, i=1 is said to be a GARCHp,q process GARCH process of order p,q. Here Z t is an iid sequence with EZ1 2 = 1 and EZ 1 = 0, and α i,β j are non-negative constants. GARCH processes and their parameter estimation have been intensively investigated over the last few years; see Mikosch [19] for a general overview and Straumann and Mikosch [28] and the references therein for parameter estimation in GARCH and related models. In the context of QMLE the asymptotic behavior of the parameter estimator is essentially determined by the limiting behavior of the following quantity, see 4.21, L nθ 0 = 1 2 h t θ 0 σ 2 t Z 2 t 1, where L n is the derivative of the underlying log-likelihood, h t is the derivative of σ2 t when considered as a function of the parameter θ, and θ 0 is the true parameter consisting of the α i and β j values Date: November 21, 2003. Key words and phrases. GARCH process, Gaussian quasi-maximum likelihood, regular variation, infinite variance, stable distribution, stochastic recurrence equation, mixing AMS 2000 Subject Classification: Primary: 62F12 Secondary: 62G32 60E07 60F05 60G42 60G70. Thomas Mikosch s research was supported by MaPhySto, the Danish Research Network for Mathematical Physics and Stochastics, DYNSTOCH, a research training network under the Improving Human Potential Programme financed by the Fifth Framework Programme of the European Commission, and by the Danish Natural Science Research Council SNF grant no. 21-01-0546. 1 j=1

2 T. MIKOSCH AND D. STRAUMANN in a certain parameter space. In this context, G t = h tθ 0 σt 2, t Z, is a stationary ergodic sequence of vector-valued random variables which is adapted to the filtration F t = σy t 1,Y t 2,..., t Z, where Y t = Zt 2 1 constitutes an iid sequence. If G t has a finite first moment the sequence G t Y t is a transform of the martingale difference sequence Y t, hence a stationary ergodic martingale difference sequence with respect to F t. If E G 1 2 < and EY1 2 <, an application of the central limit theorem CLT for finite variance stationary ergodic martingale differences see Billingsley [4], Theorem 23.1 yields n 1/2 d G t Y t N0,Σ, where Σ is the covariance matrix of G 1 Y 1. This result does not require any additional information about the dependence structure of G t Y t. It implies the asymptotic normality of the parameter estimator based on QMLE. If EY1 2 = a result as general as the CLT for stationary ergodic martingale differences is not known. However, some limit results for stationary sequences with marginal distribution in the domain of attraction of an infinite variance stable distribution exist. We recall two of them in Section 2. Our interest in infinite variance stable limit distributions for n G t Y t is again closely related to parameter estimation for GARCH processes. Recently, Hall and Yao [15] gave the asymptotic theory for QMLE in GARCH models when EZ1 4 =. To be more specific, they assume regular variation with index α 1,2 for the distribution of Z1 2. It is our aim to show that their results can be obtained by a general limit result for the martingale transforms n G t Y t when the iid noise Y t is regularly varying with index α 1,2. The key notions in this context are regular variation of the finite-dimensional distributions of G t Y t and strong mixing of this sequence, see Section 2 for these notions. Our objective is twofold. First, we want to show that the theories on parameter estimation for GARCH processes with heavy- or light-tailed innovations Z t parallel each other. We use the recent structural approach to GARCH estimation by Berkes et al. [3] in order to show that such a unified approach is possible. Second, our approach to the asymptotic theory for parameter estimators is not restricted to GARCH processes. In the light-tailed case, Straumann and Mikosch [28] extended the approach by Berkes et al. [3], including among others AGARCH and EGARCH processes. The main difficulty of our approach when infinite variance limits occur is the verification of certain mixing conditions. In contrast to the case of asymptotic normality, such conditions cannot be avoided. However, it is difficult to check for a given model that these conditions hold; see Section 4.4 in order to get a flavor of the task to be solved. GARCH processes and their parameter estimation give the motivation for this paper. The corresponding limit theory for the QMLE with heavy-tailed innovations can be found in Section 4. Our main tool for achieving these limit results is based on asymptotic theory for martingale transforms with infinite variance stable limits. This theory is formulated and proved in Section 3. It is based on more general results for sums of stationary mixing vector sequences with regularly varying finite-dimensional distributions. This theory is outlined in Section 2. 2. Preliminaries In this section we collect some basic tools and notions to be used throughout this paper. First we want to formulate a classical result on infinite variance stable limits for iid vector-valued summands due to Rvačeva [26]. Before we formulate this result we recall the notions of stable random vector and multivariate regular variation. The class of stable random vectors coincides with the class of possible limit distributions for sums of iid random vectors, and multivariate regular variation is the

STABLE LIMITS AND GARCH 3 domain of attraction condition for sums of iid random vectors. Then we continue with an analog of Rvačeva s result for stationary ergodic vector sequences. In this context, we also need to recall some mixing conditions. Stable random vectors. Recall that a vector X with values in R d is said to be α-stable for some α 0,2 if its characteristic function is given by { { exp Ee ix,x S x,y α 1 isignx,y tanπα/2 Γdy + ix,µ } α 1, = d 1 exp { S x,y 1 + i 2 d 1 π signx,y log x,y Γdy + ix,µ } α = 1, see Samorodnitsky and Taqqu [27], Theorem 2.3.1. The index of stability α 0,2, the spectral measure Γ on the unit sphere S d 1 and the location parameter µ uniquely determine the distribution of an infinite variance α-stable random vector X. Multivariate regular variation. If X is α-stable for some α 0,2, it is regularly varying with index α. This means the following. The random vector X with values in R d is regularly varying with index α 0 if there exists a random vector Θ with values in the unit sphere S d 1 of R d such that for any t > 0, as x, 2.1 where for any vector x 0, P X > tx, X P X > x x = x/ x, v t α PΘ, and v denotes vague convergence in the Borel σ-field of S d 1 ; see Resnick [23, 24] for its definition and details. The distribution of Θ is called the spectral measure of X. Alternatively, 2.1 is equivalent to PX x P X > x v µ, where v denotes vague convergence in the Borel σ-field of R d \{0} and µ is a measure on the same σ-field satisfying the homogeneity assumption µta = t α µa for t > 0. Stable limits for sums of iid random vectors. Now let Y t be an iid sequence of random vectors with values in R d. According to Rvačeva [26], there exist sequences of constants a n > 0 and b n R d such that n d Y t b n Xα for some α-stable random variable X α with α 0,2 if and only if Y 1 is regularly varying with index α, and the normalizing constants a n can be chosen as 2.2 P Y 1 > a n n 1. For a stationary sequence Y t a similar result can be found in Davis and Mikosch [12] as a multivariate extension of one-dimensional results in Davis and Hsing [11]. For its formulation one needs regular variation of the summands and a particular mixing condition, called Aa n which was introduced in Davis and Hsing [11].

4 T. MIKOSCH AND D. STRAUMANN Mixing conditions. We say that the condition Aa n holds for the stationary sequence Y t of random vectors with values in R d if there exists a sequence of positive integers r n such that r n, k n = [n/r n ] as n and { } { r kn n E exp fy t /a n E exp fy t /a n } 0, 2.3 n, f G s, where G s is the collection of bounded non-negative step functions on R d \{0}. The convergence in 2.3 is not required to be uniform in f. This is indeed a very weak condition and is implied by many known mixing conditions, in particular the strong mixing condition which is relevant in the context of GARCH processes; see Section 4. We refer to Davis and Mikosch [12] for a comparison of Aa n with other mixing conditions. For later use we also recall the definition of a strongly mixing stationary sequence Y t of random vectors with rate function φ k, see Rosenblatt [25], cf. Doukhan [13] or Ibragimov and Linnik [16]: sup PA B PAPB =: φ k 0 as k. A σy s, s 0, B σy s, s>k If φ k decays to zero at an exponential rate then Y t is said to be strongly mixing with geometric rate. Recall that absolute regularity or β mixing is a mixing notion which is slightly more restrictive than strong mixing: 2.4 E sup PB σy s, s 0 PB B σy t, t>k Indeed, β-mixing implies strong mixing with the same rate function. =: b k 0, k. Stable limits for sums of stationary random variables. The following result is a combination of Theorem 2.8 and Proposition 3.3 in [12]. It gives conditions under which an α-stable weak limit occurs for the sum process of a stationary sequence. In what follows, we write and for any Borel set B R, S 0 = 0 and S n = Y 1 + + Y n, n 1, where S h n B = S n B = S h n B h=1,...,d, Y h t I B Y h t /a n, n 1. Theorem 2.1. Let Y t be a strictly stationary sequence of random vectors with values in R d and the real sequence a n be defined by 2.2. Assume that the following conditions are satisfied: a The finite-dimensional distributions of Y k are regularly varying with index α > 0. To be specific, let vecθ k k,...,θk k be the 2k + 1d-dimensional random row vector with values in the unit sphere S 2k+1d 1 that appears in the definition 2.1 of regular variation of vecy k,...,y k, k 0, with respect to the max-norm in R 2k+1d. b The mixing condition Aa n holds for Y t.

2.5 c Then the limit 2.6 lim k lim sup P n STABLE LIMITS AND GARCH 5 Y t > a n y k t r n Y 0 > a n y = 0, y > 0, where r n appears in the formulation of Aa n. γ = lim E θ k 0 k α k j=1 θk j + α exists. If γ > 0, then the following results hold. 2.7 i If α 0,1, then for some α-stable random vector X α. ii If α [1,2 and for all δ > 0, then lim y 0 for some α-stable random vector X α. n S n d Xα, / E θ k 0 α lim sup P S n 0,y] ES n 0,y] > δ a n = 0, n n S n ES n 0,1] d X α, The structure of the limiting vectors X α is given by some functional of the points of a limiting point process. The proof of this result makes heavily use of point process convergence results which are appropriate tools in the context of regularly varying distributions when extremely large values may occur in the sequence Y t ; see Davis and Mikosch [12] for details. 3. Stable limits for martingale transform In this section we want to derive infinite variance stable limits for sums of strictly stationary random vectors which have the particular form Y t = G t Y t, where Y t is an iid sequence and G t is a strictly stationary sequence of random vectors with values in R d such that G t is adapted to the filtration given by the σ-fields F t = σy t 1,Y t 2,..., t Z. If EY 1 = 0 and E G 1 <, EG t Y t F t = 0 a.s., and therefore G t Y t is a martingale difference sequence and S 0 = 0, S n = Y 1 + + Y n, n 1, is the martingale transform of the martingale n Y t n 0 by the sequence G t. We keep this name even if E Y 1 =. 3.1. Basic assumptions. We impose the following assumptions on the sequences Y t and G t. A.1 Y 1 is regularly varying with index α 0,2. A.2 E G 1 α+ɛ < for some ɛ > 0. A.3 G t Y t satisfies condition Aa n, see 2.3, where P Y 1 > a n n 1 and r n, defined in 2.3, is such that 3.1 where ɛ is the same as in A.2. α+ɛ arn n r n 0, a n

6 T. MIKOSCH AND D. STRAUMANN Remark 3.1. Regular variation of Y 1 with index α and the iid property of Y t imply that P n max Y t x Φ α x = e x α, x > 0, 1 t n for the Fréchet distribution Φ α ; see Embrechts et al. [14], Chapter 3. In this setting, the heaviness of the tails of the distribution of G 1 Y 1 is essentially determined by the distribution of Y 1 ; see Remark 3.3 below. 3.2. Main result. We are now ready to formulate our main result on the asymptotic behavior of the sum process S n. Theorem 3.2. Consider the martingale transform n Y t n 0 = n G ty t n 0 defined above. Assume that the conditions A.1-A.3 are satisfied. Moreover, if α 1,2 assume that EY 1 = 0 and, if α = 1, that Y 1 is symmetric. Then the finite-dimensional distributions of Y t are regularly varying with index α and the limit γ in 2.6 exists. If γ > 0, then 3.2 where the sequence a n is given by and X α is an α-stable random vector. n S n d X α, P Y 1 > a n n 1. Remark 3.3. It is not difficult to see that Y t is regularly varying with index α. For the proof we need a result of Breiman [10]. It says that if one has two independent random variables ξ,η > 0 a.s., ξ is regularly varying with index α > 0 and Eη ν < for some ν > α, then Pξ η > x Eη α Pξ > x, i.e., ξη is regularly varying with the same index α. Now observe that for t,x > 0 and a Borel set S S d 1, by multiple application of Breiman s result, G 1 Y 1 P G 1 Y 1 > tx, G 1 Y 1 S = = P G 1 Y 1 > x P G 1 Y 1 > tx,signy 1 G 1 S P G 1 Y 1 > x P G 1 Y 1 > tx, G 1 S P G 1 Y 1 < tx, G 1 S P G 1 Y 1 > x + P G 1 Y 1 > x E G 1 α I S G1 PY 1 > tx E G 1 α I S G 1 PY 1 tx E G 1 α P Y 1 > x + E G 1 α P Y 1 > x. Writing for some p,q 0 with p + q = 1 and a slowly varying function Lx, PY 1 > x = p Lxx α and PY 1 x = q Lx x α, x > 0, we can read off the spectral measure of the vector Y 1 : E G 1 α I S G1 E G 1 α I S G 1 3.3 PΘ S = p E G 1 α + q E G 1 α.

STABLE LIMITS AND GARCH 7 By regular variation, a n = n 1/α ln for some slowly varying function l. By Breiman s result and since E G 1 α+ɛ < for some ɛ > 0, it also follows that P G 1 Y 1 > x E G 1 α P Y 1 > x, and therefore P Y 1 > ca n n 1 for some constant c > 0. Moreover, we have 3.4 n P n Y 1 v µ 1, for some measure µ 1 on R d \{0} which is determined by α and the spectral measure. Remark 3.4. It follows from the proof below that 3.5 n P n Y 1,...,Y h dx 1,...,x h v µ 1 dx 1 ε 0 dx 2,...,x h + + µ 1 dx h ε 0 dx 1,...,x h 1 =: µ h dx 1,...,x h. where µ 1 is defined by 3.4, ε 0 is Dirac measure at 0 and 3.6 Y 1,...,Y h := vecy 1,...,Y h and x 1,...,x h := vecx 1,...,x h. This means in particular that the limiting measure in the definition of regular variation for Y 1,...,Y h is the same as in the definition of regular variation for vecy 1,...,Y h, where Y i are iid copies of Y 1. This part of the theorem is valid for any α > 0. Proof. We verify the conditions of Theorem 2.1. Since A.3 implies Aa n and since we require γ > 0, it remains to check a and c in Theorem 2.1. a Regular variation of the finite-dimensional distributions. We show regular variation of the vector Y 1,...,Y h defined in 3.6, i.e., we show that 3.5 holds. We restrict ourselves to prove regular variation of the pairs Y 1,Y 2 := vecy 1,Y 2 ; the case of general finite-dimensional distributions is completely analogous. The regular variation of Y 1 was explained in Remark 3.3. Let now B 1 and B 2 be two Borel sets in [0, ] d \{0}, bounded away from zero. In particular, there exists M > 0 such that x > M for all x B 1 and x B 2. Then for any ɛ > 0, { a 1 n Y 1 B 1, n Y } 2 B 2 { G 1 Y 1 > M a n, G 2 Y 2 > M a n } {ɛ Y 1 > M a n,ɛ Y 2 > M a n } { G 1 I ɛ, G 1 Y 1 > M a n,ɛ Y 2 > M a n } { G 2 I ɛ, G 2 Y 2 > M a n,ɛ Y 1 > M a n } { } 4 G 1 I ɛ, G 1 Y 1 > M a n, G 2 I ɛ, G 2 Y 2 > M a n =: D i. By independence and an application of Breiman s result, npd 1 0 and npd 2 0. Similarly, n PD 3 n P G 2 I ɛ, G 2 Y 2 > Ma n thus, by Lebesgue s dominated convergence theorem, n P Y 2 > M a n E G 2 α I ɛ, G 2, lim ɛ lim sup n PD 3 = 0, n i=1

8 T. MIKOSCH AND D. STRAUMANN and npd 4 0 can be proved in the same way. We conclude that n P n Y 1,Y 2 dx 1,x 2 v µ1 dx 1 ε 0 dx 1 + µ 1 dx 2 ε 0 dx 2 = µ 2 dx 1,x 2, see Resnick [24]. This proves the regular variation of the 2-dimensional finite-dimensional distributions. The higher-dimensional case is completely analogous. c The condition 2.5. We have for any y > 0, P max G t Y t > ya n k t r n G 0 Y 0 > y a n P max G t > y a n /s k a rn k t r n G 0 Y 0 > y a n + P max Y t > s k a rn k t r n =: I 1 + I 2, where s k is any sequence such that s k. In what follows, all calculations go through for any y > 0; for ease of notation we set y = 1. Then, by Remark 3.1, lim lim I 2 = lim 1 Φ αs k = 0. k n k An application of Markov s inequality yields for some constant c > 0 and ɛ > 0 as in A.2 here and in what follows, c denotes any positive constant whose value is not of interest, I 1 r n t=k P G t > a n /s k a rn G 0 Y 0 > a n sk a α+ɛ rn rn E[ G t α+ɛ I { G0 Y 0 >a n/s k a rn }] P G 0 Y 0 > a n a n t=k sk a α+ɛ rn cnr n E G 0 α+ɛ a n 0 as n. Here we used Breiman s result [10] to show that P G 0 Y 0 > a n E G 0 α P Y 0 > a n, condition 3.1 and the fact that E G 1 α+ɛ < ; see A.2. Now we turn to P max G t Y t > a n r n t k G 0 Y 0 > a n P max G t > a n /s k a rn r n t k G 0 Y 0 > a n +P max Y t > s k a rn r n t k G 0 Y 0 > a n =: I 3 + I 4.

STABLE LIMITS AND GARCH 9 The quantity I 3 can be treated in the same way as I 1 to show that I 3 0 a.s. as n. We turn to I 4. Fix 0 < M <. Then I 4 P max r n t k Y t > s k a rn,m Y 0 > a n P G 0 Y 0 > a n + P G 0 I M, G 0 Y 0 > a n P G 0 Y 0 > a n =: I 41 + I 42. By independence of the Y i s, Breiman s [10] result and since r n, By virtue of Breiman s [10] result, I 41 P max r n t k Y t > s k a rn M α P Y 0 > a n E G 0 α P Y 0 > a n c1 Φ α s k as n 0 as k. I 42 E G 0 α I M, G 0 P Y 0 > a n E G 0 α P Y 0 > a n Since G 0 has finite moments of order greater than α, an application of the Lebesgue dominated convergence theorem yields lim lim I 42 = 0. M n This proves 2.5. Thus the conditions a-c and γ > 0 of Theorem 2.1 are satisfied. In the case α < 1, Theorem 2.1 immediately yields 3.2. In the case α [1,2 we have to check condition 2.7. It suffices to show it for the components S i n 0,y], i = 1,...,d, of S n 0,y]. Since the components can be handled in the same way, we suppress the dependence on i and, for the ease of notation, write G t Y t for the summands of the ith component. We start with the case α 1,2. As before, write F t = σy t 1,Y t 2,... Then, for z > 0, since EY 1 = 0, E[G t Y t I 0,z] G t Y t /a n F t ] = G t E[Y t I 0,z] G t Y t /a n G t ] = G t E[Y t I z, G t Y t /a n G t ]. Consider the decomposition [ Gt Y t I 0,z] G t Y t /a n E[G 1 Y 1 I 0,z] G 1 Y 1 /a n ] ] n = n n [ Gt Y t I 0,z] G t Y t /a n G t E[Y t I 0,z] G t Y t /a n G t ] ] [ Gt E[Y t I z, G t Y t /a n G t ] E[G 1 Y 1 I z, G 1 Y 1 /a n ] ] =: T 1 + T 2. For fixed n, T 1 is a sum of stationary mean zero martingale differences. An application of Karamata s theorem to the regularly varying random variable G 1 Y 1 with index α yields for some constant.

10 T. MIKOSCH AND D. STRAUMANN c > 0, vart 1 = n a 2 n E [ G 1 Y 1 I 0,z] G 1 Y 1 /a n G 1 E[Y 1 I 0,z] G 1 Y 1 /a n G 1 ] ] 2 cna 2 n E [ G 1 Y 1 I 0,z] G 1 Y 1 /a n ] 2 3.7 cz 2 α as n 0 as z 0. Next we treat T 2. Fix 0 < δ < M < to be chosen later. Notice that by Karamata s theorem and the uniform convergence theorem for regularly varying functions uniformly for c [δ,m], E[Y 1 I cx, Y 1 ] C cxp Y 1 > cx for some constant C. Taking this into account, the strong law of large numbers yields, with probability 1, 3.8 G t I [δ,m] G t E[Y t I z, G t Y t /a n G t ] n = n G t I [δ,m] G t [za n /G t P Y t > za n / G t G t C + o1] = C + o1z 1 α n 1 G t α I [δ,m] G t C z 1 α E[ G 1 α I [δ,m] G 1 ]. On the other hand, since G 1 I [δ,m] G 1 Y 1 is regularly varying with index α 1,2, by the same argument and Breiman s result, 3.9 n n E[G 1I [δ,m] G 1 Y 1 I z, G 1 Y 1 /a n ] = n [ n C + o1z an PG 1 I [δ,m] G 1 Y 1 > za n ] = C + o1z 1 α E[ G 1 α I [δ,m] G 1 ]. This shows that 3.8 and 3.9 cancel asymptotically as n for every fixed z. A similar argument shows that, with probability 1, 3.10 n G t I [0,δ] G t E[Y t I z, G t Y t /a n G t ] n G t I [0,δ] G t E[ Y 1 I z, δ Y 1 /a n ] Moreover, 3.11 cz/δ 1 α E[ G 1 I [0,δ] G 1 ]. n n E[G 1 I [0,δ] G 1 Y 1 I z, G 1 Y 1 /a n ] n n E[ G 1 I [0,δ] G 1 Y 1 I z, δ Y 1 /a n ] cz/δ 1 α E[ G 1 I [0,δ] G 1 ].

STABLE LIMITS AND GARCH 11 Now choose δ = z 2. Then, first letting n and then z 0, both 3.10 and 3.11 vanish asymptotically. Finally, we consider n E G t I M, G t E[Y t I z, G t Y t /a n G t ] n n E[ G 1 I M, G 1 Y 1 I z, G 1 Y 1 /a n ]. An application of Breiman s result to the regularly varying random variable G 1 I [M, G 1 Y 1 gives that the right-hand side is asymptotically equivalent as n to cz 1 α E[ G 1 α I [M, G 1 ]. Choosing M large enough, the right-hand side is smaller than z, say. The same argument can be applied to Collecting the bounds above, we see that n n E[G 1I [M, G 1 Y 1 I z, G 1 Y 1 /a n ]. lim lim sup P T 2 > r = 0, r > 0. z 0 n This together with 3.7 concludes the proof of 2.7 for α 1,2. For α = 1 we use the additional condition of symmetry of Y t. Then ES n 0,y] = 0 and the same argument as for vart 1 above shows that 2.7 holds in this case as well. This concludes the proof of 2.7. Since the conditions of Theorem 2.1 are satisfied for α [1,2 we conclude that n S n ES n 0,1] d X α for some α-stable random vector in R d. For α = 1 we can drop ES n 0,y] because of the symmetry of G t Y t. For α 1,2, G t Y t is regularly varying with index α. Since EG t Y t = 0, Karamata s theorem yields n ES n0,1] b for some constant b which can be incorporated in the stable limit, and therefore centering in 3.2 can be avoided. This concludes the proof of Theorem 3.2. Remark 3.5. If the roles of G 1 and Y 1 are interchanged in the sense that G 1 is regularly varying with index α and E Y 1 α+ɛ < for some ɛ > 0, Y 1 is regularly varying with index α, as the following calculations show. Observe that the random variables G 1 I S G 1 and G 1 I S G 1 are regularly varying for any Borel set S S d 1 for which S, S are continuity sets with respect to the spectral measure PΘ G of G 1 and which satisfy PΘ G ±S > 0. For such a set S, any

12 T. MIKOSCH AND D. STRAUMANN t,x > 0, by multiple application of Breiman s result, G 1 Y 1 P G 1 Y 1 > tx, G 1 Y 1 S = = P G 1 Y 1 > x P G 1 Y 1 > tx,signy 1 G 1 S P G 1 Y 1 > x P G 1 Y 1 > tx, G 1 S P + P G 1 Y 1 > x E[Y α 1 I 0, Y 1 ]P x. G 1 > tx, G 1 S E Y 1 α P G 1 > x G 1 Y 1 < tx, G 1 S Letting x, one can read off the spectral measure of Y 1 : 3.12 P G 1 Y 1 > x E[ Y 1 α I,0 Y 1 ]P G 1 > tx, G 1 S + E Y 1 α P G 1 > x PΘ S = E[Y 1 αi 0, Y 1 ] E Y 1 α PΘ G S + E[ Y 1 α I,0 Y 1 ] E Y 1 α PΘ G S. The spectral measure of Y 1 under the A-conditions, see 3.3, is completely different from 3.12. One might hope that a result similar to Theorem 3.2 can be derived simply under the conditions that G 1 is regularly varying with index α < 2 and E Y 1 α+ɛ <. However, it is not clear whether Y t has regularly varying finite-dimensional distributions, and it is not clear how to verify condition 2.5. Therefore one cannot expect to derive a result as general as Theorem 3.2 without additonal conditions on the sequences G t and Y t. Examples of limit results when Y t is light tailed and G t has regularly varying tails are given by the sample autocovariances of GARCH processes; see Basrak et al. [2]. 4. Gaussian quasi maximum likelihood estimation for GARCH processes with heavy tailed innovations In this section we apply Theorem 3.2 to Gaussian quasi maximum likelihood estimation QMLE in GARCH processes. The limit properties of the QMLE were studied by Berkes et al. [3]. They proved strong consistency of the QMLE under the moment condition E Z 1 2+δ < for some δ > 0 and established asymptotic normality under EZ1 4 <. Here Z t is an iid innovation sequence; see Section 4.1 below for the definition of the GARCH model and the QMLE. Hall and Yao [15] refined these results and also allowed for innovations sequences, where Z1 2 is regularly varying with index α 1,2. Then the speed of convergence is slower than the usual n rate and the limiting distribution of the QMLE is multivariate α stable. It is our objective to show that the asymptotic theories for the QMLE under light- and heavytailed innovations parallel each other and that very similar techniques can be applied in both cases. However, in the light-tailed case see [3] an application of the CLT for stationary ergodic martingale differences is the basic tool which establishes the asymptotic normality of the QMLE. In the heavy-tailed situation one depends on an analog of the CLT which is provided by Theorem 3.2. As a matter of fact, the structure of the proofs shows that the asymptotic properties of the QMLE are not dependent on the particular structure of the GARCH process if one can establish the regular variation of the finite-dimensional distributions of the underlying process X t and the mixing condition Aa n. Therefore the results of this section have the potential to be extended to more general models, including, for example, the AGARCH or EGARCH models whose QMLE

STABLE LIMITS AND GARCH 13 properties in the light-tailed case were treated in Straumann and Mikosch [28]. The most intricate step in the proof is, however, the verification of this mixing condition for a given time series model. We establish this condition for a GARCH process by an adaptation of Theorem 4.3 in Mokkadem [22]; this yields strong mixing with geometric rate of the relevant sequence. We devote Section 4.4 to the solution of this problem. Before we start, we introduce some notation. If K R d is a compact set, we write CK, R d for the space of continuous R d valued functions equipped with the sup norm v K = sup s K vs. The space CK, R d 1 d 2 consists of the continuous d 1 d 2 matrix valued functions on K; in R d 1 d 2 we work with the operator norm induced by the Euclidean norm, i.e., A = sup Ax, A R d 1 d 2. x =1 4.1. Definition of the QMLE. Recall the definition of a GARCHp,q process X t from 1.1. As before, Z t is an iid innovation sequence with EZ 2 1 = 1 and EZ 1 = 0, and α i,β j are nonnegative constants. GARCH processes have been intensively investigated over the last few years. Assumptions for strict stationarity are complicated: they are expressed in terms of Lyapunov exponents of certain random matrices; see Bougerol and Picard [5] for details. A necessary condition for stationarity is 4.1 β 1 + + β q < 1. Corollary 2.3. in [5]. We will make use of this condition later. In what follows, we always assume strict stationarity of the GARCH processes. As a matter of fact, the observation X t is always a measurable function of the past and present innovations Z t,z t 1,Z t 2,...; hence X t is automatically ergodic. In what follows, we review how an approximation to the conditional Gaussian likelihood of a stationary GARCHp, q process is constructed, i.e., a conditional likelihood under the synthetic assumption Z t iid N0,1. Given X 0,...,X p+1 and σ0 2,...,σ2 q+1, the random variables X 1,...,X n are conditionally Gaussian with mean zero and variances h t θ, t = 1,...,n, where θ = α 0,α 1,...,α p,β 1,...,β q T denotes the presumed parameter and ȟ t θ = { σ 2 t t 0, α 0 + α 1 X 2 t 1 + + α px 2 t p + β 1 ȟ t 1 θ + + β q ȟ t q θ t > 0. The conditional Gaussian log likelihood has the form 4.2 log f θ X 1,...,X n X 0,...,X p+1,σ 2 0,...,σ 2 q+1 = n 2 log2π 1 2 X 2 t + log ȟtθ. ȟ t θ Since X 0,...,X p+1 are not available and the squared volatilities σ0 2,...,σ2 q+1 unobservable, the conditional Gaussian log likelihood 4.2 cannot be numerically evaluated without a certain initialization for σ0 2,...,σ2 p+1 and X 0,...,X q+1. The initial values being asymptotically irrelevant, we set the X t s equal to zero and ĥtθ = α 0 /1 β 1 β q for t 0. We arrive at α 0 /1 β 1 β q t 0, 4.3 ĥ t θ = α 0 + α 1 Xt 1 2 + + α minp,t 1Xmaxt p,1 2 +β 1 ĥ t 1 θ + + β q ĥ t q θ t > 0. The function ĥtθ 1/2 can be understood as an estimate of the volatility at time t and under parameter hypothesis θ. It can be established that ĥt ȟt a.s. 0 with a geometric rate of convergence

14 T. MIKOSCH AND D. STRAUMANN and uniformly on the compact set K defined in 4.4 below. This suggests that by replacing ȟtθ by ĥtθ in 4.2 we obtain a good approximation to the conditional Gaussian log likelihood. Since the constant n log2π/2 does not matter for the optimization, we define the QMLE ˆθ n as a maximizer of the function ˆL n θ = ˆl t θ = 1 Xt 2 + log ĥtθ 2 ĥ t θ with respect to θ K, and K being the compact set 4.4 K = { θ R p+q+1 m αi,β j M, β 1 + + β q β }, where 0 < m < M < and 0 < β < 1 are such that qm < β. Remark 4.1. From a comparison with [3], one might think at first sight that our definition of the QMLE is different from theirs. To see that ĥt coincides with w t in [3], introduce the polynomials αz = α 1 z + + α p z p and βz = 1 β 1 z β q z q for every θ = α 0,α 1,...,α p,β 1,...,β q T K. Then one can show by induction on t that 4.5 ĥ t θ = α t 1 0 β1 + ψ j θxt j, 2 where the coefficients ψ j θ are defined through 4.6 j=1 αz βz = ψ j θz j, z 1. j=1 Note that the latter Taylor series representation is valid because β i 0 and β 1 + + β q β < 1 imply βz 0 on K for z 1 + ɛ and ɛ > 0 sufficiently small. We choose 4.3 rather than 4.5 as a first definition for the squared volatility estimate under parameter hypothesis θ, because the recursion 4.3 is natural and computationally attractive. In [3], starting point for the definition of the QMLE is Theorem 2.2, which says that for all t Z one has h t θ 0 = σ 2 t, where θ 0 is the true parameter and 4.7 h t θ = α 0 β1 + ψ j θxt j 2. In [3] this leads to the definition of a squared volatiliy estimate at time t under parameter θ based on X 1...,X n, which is given by 4.5. Note also that h t θ obeys 4.8 h t+1 θ = α 0 + α 1 X 2 t + α px 2 t+1 p + β 1h t θ + + β q h t+1 q θ, θ K. 4.2. Limit distribution in the case EZ1 4 <. First we list the conditions employed by [3] for establishing consistency and asymptotic normality of ˆθ n. Write θ 0 = α 0,α 1,...,α p,β1,...,β q T for the true parameter. C.1 There is δ > 0 such that E Z 1 2+δ <. j=1 C.2 The distribution of Z 1 is not concentrated in one point. C.3 There is µ > 0 such that P Z 1 t = ot µ as t 0. C.4 The true parameter θ 0 lies in the interior of K.

STABLE LIMITS AND GARCH 15 C.5 The polynomials α z = α 1 z + + α pz p and β z = 1 β 1 z β qz q do not have any common roots. Now we are ready to quote the main result of [3]. We cite it in order to be able to compare the assumptions and assertions both in the light- and heavy-tailed cases; cf. Theorem 4.4 below. Theorem 4.2 Theorem 4.1 of Berkes et al. [3]. Let X t be a stationary GARCHp,q process with true parameter vector θ 0. Suppose the conditions C.1 C.5 hold. Then the QMLE ˆθ n is strongly consistent, i.e., a.s. ˆθ n θ 0, n. If in addition EZ0 4 <, then ˆθ n is also asymptotically normal, i.e., nˆθn θ 0 d N0,B 1 0 A 0B 1 0, where the p + q + 1 p + q + 1 matrices A 0 and B 0 are given by A 0 = EZ4 0 1 1 E 4 σ1 4 h 1 θ 0 T h 1 θ 0, B 0 = 1 1 4.9 2 E σ1 4 h 1 θ 0 T h 1 θ 0. 4.3. Limit distribution in the case EZ1 4 =. First we identify the limit determining term for the QMLE. To this end, we set analogously to [3], L n θ = l t θ = 1 X 2 t 2 h t θ + log h tθ and define θ n as a maximizer of L n with respect to θ K. It is a slightly simpler problem to analyze θ n because l t is stationary ergodic, in contrast to ˆl t t N. As is shown in Proposition 4.3 below, ˆθ n and θ n are asymptotically equivalent. It turns out that the asymptotic distribution of the QMLE is essentially determined by the limit behavior of L n θ 0/n, up to multiplication with the matrix B0 1. Actually, these results follow by a careful analysis of the proofs in Berkes et al. [3]. To give some guidance to the reader who wants to verify all details, we briefly repeat the necessary steps and arguments. Compare also with the similar reference Straumann and Mikosch [28], where the case of processes with a more general volatility structure than GARCH is treated. Proposition 4.3. Let X t be a stationary GARCHp,q process with true parameter vector θ 0. Suppose the conditions C.1 C.5 apply. If there is a positive sequence x n n 1 with x n = on as n and 4.10 x n L n θ 0 n d D, n, for an R p+q+1 valued random variable D, then the QMLE ˆθ n satisfies the limit relation 4.11 x n ˆθ n θ 0 d B 1 0 D, where B 0 is given by 4.9. Proof. We first demonstrate x n θ n θ 0 d B 1 0 D. In a second step we establish x n θ ˆθ n a.s. 0. Finally, the assertion 4.11 is an immediate consequence of Slutsky s lemma. Step 1. Since for all θ K the polynomials βz have no root in the disc { z 1 + ɛ }, ɛ > 0 sufficiently small, the coefficients ψ j θ, see 4.6, and their first and second derivatives decay exponentially fast, uniformly on K Lemmas 3.2 and 3.3 in [3]. From this together with the fact that E log + X 0 < Lemma 2.3 in [3], one shows by means of Lemma 2.2 in [3] that

16 T. MIKOSCH AND D. STRAUMANN the functions h t are twice continuously differentiable on K a.s. and that differentiation and infinite sum in 4.6 may be interchanged. As a consequence, h t and h t are measurable functions of X t 1,X t 2,... Since X t is stationary ergodic, so are h t and h t see e.g. Proposition 2.3 in [28]. This means that L n is twice continuously differentiable, i.e., L n = n l t, where 4.12 l t θ = 1 1 h t 2 h t θ 2 θt h t θ 2 X2 t h t θ 1 + h t θh tθ Xt 2, θ K. An inspection of the proof of Theorem 4.1 in [3] reveals that θ n a.s. θ 0. Consequently, for large enough n the following Taylor expansion is valid: 4.13 L n θ n = L n θ 0 + L n ζ n θ n θ 0, where ζ n θ 0 < θ n θ 0. Since θ n is the maximizer of L n and θ 0 lies in the interior of K, one has L n θ n = 0. Therefore 4.13 is equivalent to 4.14 n 1 L n ζ n θ n θ 0 = n 1 L n θ 0. Our aim is to apply a uniform strong law of large numbers for proving the uniform convergence of n 1 L n = n 1 n l t see e.g. Theorem 2.5 in [28]. Since l t is a measurable function of X t,x t 1,..., it is stationary ergodic. If we can verify E l 1 K <, then in CK, R d d, 4.15 L n/n a.s. L, n, where L θ = E[l 1 θ], θ K. For showing E l 1 K <, first note that E X1 2/h 1 1+s K < if s < δ/2 Lemma 5.1 of [3] and that h 1 /h 1 K and h 1 /h 1 K have finite moments of any order Lemma 5.2 of [3]. Then the desired relation is obtained from an application of the triangle inequality to the norm of 4.12, followed by the use of Hölder s inequality. Relation 4.15 together a.s. with ζ n θ 0 implies L n ζ n/n a.s. E[l 1 θ 0], n. Take into account that h 1, h 1 and h 1 are independent of Z 1, h 1 θ 0 = σ1 2 a.s. and X 1 = σ 1 Z 1, in order to conclude E[l 1 θ 0] = 2 1 E h 1 θ 0 T h 1 θ 0/σ1 4 = B0. It is shown in Lemma 5.7 of [3] that B 0 is negative definite and hence invertible. Thus the matrix L nζ n /n has inverse B 1 0 1 + o P1, n. Therefore equation 4.14 is equivalent to 4.16 θ n θ 0 = B 1 0 1 + o P1L n θ 0/n, which shows that x n θ n θ 0 d B 1 0 D. Step 2. The relation x n θ n ˆθ 0 a.s. 0 follows along the lines of proof of Theorem 4.4 in [3] or Lemma 7.5 in [28]. With the help of the mean value theorem together with the facts that a.s. ĥt h t K 0 and ĥ t h t K a.s. 0 with exponential rate Lemmas 5.8 and 5.9 in [3], one can bound ˆL n L n K ˆl t l t K to show sup n N ˆL n L n K < a.s. Consequently, since x n = on, 4.17 x n n ˆL n L a.s. n K 0, n. From Taylor s theorem, 4.18 L n θ n L n ˆθ n = L n ξ n θ n ˆθ n, where ξ n lies on the line segment connecting ˆθ n and θ n. This line segment is completely contained in the interior of K provided n is large enough. Since L n θ n = ˆL nˆθ n = 0, equation 4.18 is equivalent to 4.19 x n n ˆL n ˆθ n L n ˆθ n = L n ξ n x n θ n n ˆθ n.

STABLE LIMITS AND GARCH 17 By virtue of relation 4.17, both sides of 4.19 tend to 0 a.s. when n. From 4.15 and a.s. ξ n θ 0, we conclude L nξ n /n a.s. E[l 1 θ 0] = B 0. Since B 0 is invertible, we can deduce x n θ n ˆθ n a.s. 0. This completes the proof of the proposition. Now we can state the main theorem of this section. We remind once again that Hall and Yao [15] derived the identical result by means of different techniques. Theorem 4.4. Let X t be a stationary GARCHp,q process with true parameter vector θ 0. Suppose that Z1 2 is regularly varying with index α 1,2 and that C.3 C.5 hold true. Moreover, assume that Z 1 has a Lebesgue density f, where the closure of the interior of the support {f > 0} contains the origin. Define x n = n n, where Then the QMLE ˆθ n is consistent and PZ 2 1 > a n n 1, n. 4.20 x n ˆθ n θ 0 d D α, n, for some non degenerate α stable vector D α. Before proving the theorem, we discuss its practical consequences for parameter inference: The rate of convergence x n has roughly speaking magnitude n 1 1/α, which is less than n. The heavier the tails of the innovations, i.e., the smaller α, the slower is the convergence of ˆθ n towards the true parameter θ 0. The limit distribution of the standardized differences ˆθ n θ 0 is α stable and hence non Gaussian. The exact parameters of this α stable limit are not explicitly known. Confidence bands based on the normal approximation of Theorem 4.2 are false if EZ 4 1 =. By the definition of a GARCH process, the distribution of the innovations Z t is unknown. Therefore assumptions about the heaviness of the tails of its distribution are purely hypothetical. As a matter of fact, the tails of the distribution of X t can be regularly varying even if Z t has light tails, such as for the normal distribution; see Basrak et al. [2]. Depending on the assumptions on the distribution of Z 1, one can develop different asymptotic theories for QMLE of GARCH processes: asymptotic normality as provided by Theorem 4.2 or infinite variance stable distributions as provided by Theorem 4.4. Proof. The proof follows by combining Theorem 3.2 and Proposition 4.3. Indeed, setting one recognizes that G t = h tθ 0 /σ 2 t, Y t = Z 2 t 1/2 and Y t = G t Y t, 4.21 L nθ 0 = 1 2 h t θ 0 σ 2 t Z 2 t 1 = G t Y t is a martingale transform. Regular variation of Z1 2 with index α 1,2 implies A.1, but also C.1 and C.2. Condition A.2 is fulfilled because h 1 /h 1 K has finite moments of any order Lemma 5.2 of [3], and so has G 1. The condition A.3 holds true if we can show that Y t is strongly mixing with geometric rate, in which case we choose r n = n δ in Aa n for any small δ > 0 so that 3.1 immediately follows. This choice of r n is justified by the arguments given in Basrak et al. [2]. The strong mixing condition with geometric rate of Y t will be verified in Section 4.4. Finally, we have to give an argument for γ > 0. The latter quantity has interpretation as the extremal index of the sequence Y t see Remark 2.3 in [12]; cf. Leadbetter et al. [18] for the definition and properties of the extremal index. According to Theorem 3.7.2 in [18], if γ = 0, then if for some sequence u n the relation lim inf n P M n u n > 0 holds, one neccessarily has lim n PM n u n = 1. Here M n = max Y 1,..., Y n and M n is the corresponding sequence of partial maxima for an iid sequence R i where R 1 has the same distribution as Y 1. Assume γ = 0. The random variable

18 T. MIKOSCH AND D. STRAUMANN Y 1 is regularly varying with index α since Y 1 is regularly varying with index α. Hence M n n has a Fréchet limit distribution Φ α, but PM n xa n 1 does not hold for all positive x. Indeed, straightforward arguments exploiting for all i = 1,...,p, show that 4.22 and 4.23 j=1 ψ j θ α i z j = zi, z 1, βz h t θ α i 0 for all i = 0,... p, p i=0 α i h t θ α i = h t θ. Since the Euclidean norm is equivalent to the 1 norm x = p+q+1 i=1 x i and α i M on K, there is c > 0 such that h tθ h t θ c h t θ p i=0 α i h t θ α i = c h t θ p i=0 α i h t θ α i = c. Note that the last two equalities in the latter display are a consequence of 4.22 and 4.23. Hence we have PM n xa n Pmax t n Y t c 1 xa n, and the right hand side converges to a Fréchet limit and is never equal to 1 for all positive x. From this contradiction we may conclude that γ > 0. All conditions of Theorem 3.2 have been verified so that 2 n L n θ 0 = 2x n L nθ 0 n d D α, where D α is α stable notice that PZ0 2 1/2 > a n/2 PZ0 2 > a n n 1. Since x n /n = 0, Proposition 4.3 implies n x n ˆθ n θ 0 d 2 1 B 1 0 D α = D α. Recalling that a linear transform of an α stable random vector is again α stable concludes the proof of the theorem. 4.4. Verification of strong mixing with geometric rate of Y t. To begin with, we quote a powerful result due to Mokkadem [22], which allows one to establish strong mixing in stationary solutions of so called polynomial linear stochastic recurrence equations SRE s. A sequence Y t of random vectors in R d obeys a linear SRE if 4.24 Y t = P t Y t 1 + Q t where P t,q t constitutes an iid sequence with values in R d d R d. A linear SRE is called polynomial if there exists an iid sequence e t in R d such that P t = Pe t and Q t = Qe t, where Px and Qx have entries and coordinates, respectively, which are polynomial functions of the coordinates of x. The existence and uniqueness of a stationarity solution to 4.24 has been studied by Brandt [9], Bougerol and Picard [6] and Babillot et al. [1] and others. The following set of conditions is sufficient: E log + P 1 <, E log + Q 1 <, and the top Lyapunov coefficient associated with the operator sequence P t is strictly negative, i.e., 4.25 ρ = inf{t 1 E log P t P 1 t 1} < 0. Here is the operator norm corresponding to an arbitrary fixed norm in R d, e.g. the Euclidean norm. The following result is a slight generalization of Theorem 4.3 in [22]; see the beginning of the proof below for a comparison.

STABLE LIMITS AND GARCH 19 Theorem 4.5. Let e t be an iid sequence of random vectors in R d. Then consider the polynomial linear SRE 4.26 Y t = Pe t Y t 1 + Qe t, where Pe t is a random d d matrix and Qe t a random R d valued vector. Suppose: 1. P0 has spectral radius strictly smaller than 1 and the top Lyapunov coefficient ρ corresponding to Pe t is strictly negative. 2. There is s > 0 such that E Pe 1 s < and E Qe 1 s <. 3. There is a smooth algebraic variety V R d such that e 1 has a density f with respect to Lebesgue measure on V. Assume that 0 is contained in the closure of the interior of the support {f > 0}. Then the polynomial linear SRE 4.26 has a unique stationary ergodic solution Y t, which is absolutely regular with geometric rate and consequently strongly mixing with geometric rate. Remark 4.6. As regards the definition of a smooth algebraic variety, we first introduce the notion of an algebraic subset. An algebraic subset of R d is a set of form V = {x R d F 1 x = = F r x = 0}, where F 1,...,F r are real multivariate polynomials. An algebraic variety is an algebraic subset which is not the union of two proper algebraic subsets. An algebraic variety is smooth if the Jacobian of F = F 1,...,F r T has identical rank everywhere on V. Examples of smooth algebraic varieties in R d are the hyperplanes of R d or V = R d. Recall also the definition of absolute regularity 2.4. Proof. There is nothing to prove if E Pe s 1 < 1 for some s > 0 as this special case is the content of Theorem 4.3 in [22]. For the general case it suffices to prove the absolute regularity with geometric rate for some subsequence Y tm t Z, where m 1 is fixed. Indeed, the mixing coefficient b k is nonincreasing and since Y t is a Markov process, the simpler representation b k = E sup PB σy 0 PB B σy k+1 is also valid, see e.g. Bradley [8]. Since ρ < 0, there is m 1 with E log Pe m Pe 1 < 0. From the fact that the map u E Pe m Pe 1 u has first derivative equal to E log Pe m Pe 1 at u = 0, we deduce that there is 0 < s s with E Pe m Pe 1 s < 1. Then note that Ỹt = Y tm obeys a linear SRE: where ẽ t = and e tm. e t 1m+1 Ỹ t = Pẽ t Ỹt 1 + Qẽ t, Pẽ t = Pe tm Pe t 1m+1, Qẽ t = Qe tm + m 1 j=1 j Pe tm+1 i Qe tm j. Since both the matrix Pẽ t and the vector Qẽ t are polynomial functions of the coordinates of ẽ t and the sequence ẽ t is iid, Ỹt obeys a polynomial linear SRE. Observe that P0 = P0 m has spectral radius strictly smaller than 1, that E Pẽ 1 s < 1 and E Qẽ 1 s < and that ẽ 1 i=1

20 T. MIKOSCH AND D. STRAUMANN has a density with respect to Lebesgue measure on V m, where V m is a smooth algebraic variety see A.14 in [22]. Thus an application of Theorem 4.3 in [22] yields that Ỹt is absolutely regular with geometric rate. This proves the assertion. The following two facts will also be needed. Lemma 4.7. Let P t be an iid sequence of k k matrices with E P 1 s < for some s > 0. Then the associated top Lyapunov coefficient ρ < 0 if and only if there exist c > 0, s > 0 and λ < 1 so that 4.27 E P t P 1 s cλ t, t 1. Proof. For the proof of necessity, observe that there exists n 1 such that E log P n P 1 < 0. From the fact that the map u E P n P 1 u has first derivative equal to E log P n P 1 at u = 0, we deduce that there is s > 0 with E P n P s 1 = λ < 1. Since the operator norm is submultiplicative and the factors in P t P 1 are iid, E P t P s 1 λ t/n 1 max E P l P s 1 cλ t, t 1, l=1,...,n 1 for c = λ 1 max l=1,...,n 1 E P l P 1 s and λ = λ 1/n. Regarding the proof of sufficiency, use Jensen s inequality and lim t t 1 E log P t P 1 = ρ to conclude 1 ρ = lim t t s E log P t P s 1 lim sup t This completes the proof of the lemma. B t lim sup t Lemma 4.8. Suppose that At 0 4.28 P t = r k r, t Z, C t 1 t s log E P t P s 1 1 log λ log c + t log λ = < 0. t s s forms an iid sequence of k k matrices with E P 1 s <, s > 0, where A t R r r, B t R k r r and C t R k r k r. Then its associated top Lyapunov coefficient ρ P < 0 if and only if the sequences A t and C t have top Lyapunov coefficients ρ A < 0 and ρ C < 0. Proof. For the proof of sufficiency of ρ A < 0 and ρ C < 0 for ρ P < 0, it is by Lemma 4.7 enough to derive a moment inequality of form 4.27 for P t. By induction we obtain At A P t P 1 = 1 0 r k r, Q t C t C 1 where Observe that Q t = B t A t 1 A 1 + C t B t 1 A t 2 A 1 + C t C t 1 B t 2 A t 3 A 1 + + C t C 3 B 2 A 1 + C t C 2 B 1. 4.29 max A t A 1, C t C 1 P t P 1 A t A 1 + C t C 1 + Q t. It is sufficient to show 4.27 for each block in the matrix P t P 1. Because of ρ A < 0, ρ C < 0 and E A 1 s,e C 1 s E P 1 s <, Lemma 4.7 already implies moment bounds of form 4.27 for A t and C t. Thus we are left to bound Q t. Without loss of generality we may assume that the constants λ < 1 and s,c > 0 in the inequality 4.27 are equal for A t and C t and that

STABLE LIMITS AND GARCH 21 s s 1. From an application of the Minkowski inequality and exploiting the independence of the factors in each summand of Q t, we receive the desired relation E Q t s c 2 t E B 1 s λ t 1 c λ t, some λ λ,1, c > 0. For the proof of necessity, assume ρ P < 0. Then the left-hand estimates in 4.29 and Lemma 4.7 imply that ρ A < 0 and ρ C < 0. We now exploit Theorem 4.5 in order to establish strong mixing with geometric rate of the sequence Y t = G t Y t, where G t = h t θ 0/σ 2 t and Y t = Z 2 t 1/2. Proposition 4.9. Let X t be a stationary GARCHp,q process with true parameter vector θ 0. Moreover, assume that Z 1 has a Lebesgue density f, where the closure of the interior of the support {f > 0} contains the origin. Then Y t is absolutely regular with geometric rate. Proof. For the proof of this result we first embed Y t in a polynomial linear SRE. Without loss of generality assume p, q 3. Write Ỹ t = σt+1 2,...,σ2 t q+2,x2 t,...,x2 t p+2, Since Zt 2 = X2 t /σ2 t, we have h t+1 θ 0 α 0,..., h t q+2θ 0 α 0,..., h t+1θ 0 α p,..., h t q+2θ 0 α p, h t+1 θ 0 β 1,..., h t q+2θ 0 β 1,..., h t+1θ 0 β q,..., h t q+2θ 0 β q T. σy t, t > k σỹt,t > k and σy t, t 0 σỹt, t 0. Consequently, it is enough to demonstrate abolute regularity with geometric rate of the sequence Ỹt. We introduce various matrices. Write 0 d1 d 2 for the d 1 d 2 matrix with all entries equal to zero and let I d denote the identity matrix of dimension d. Then set τ t βq α α p I M 1 Z t = q 1 0 q 1 1 0 q 1 p 2 0 q 1 1 ξ t 0 1 1 0 1 p 2 0 1 1, where 0 p 2 q 1 0 p 2 1 0 p 2 p 2 0 p 2 1 τ t = β 1 + α 1Z 2 t,β 2,...,β q 1 R q 1, ξ t = Z 2 t,0,...,0 R q 1, α = α 2,...,α p 1 Rp 2. Moreover, define M 2 Z t = 0 q p+q 1 U 1. U p V 1 and M 4 =. V q,