Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data

Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.09 Kolmogorov-Smirov type Tests for Local Gaussiaity i High-Frequecy Data George Tauche, Duke Uiversity Viktor Todorov, Northwester Uiversity July 013 Abstract We derive a oparametric test for the class of Itô semimartigales with o-vaishig diffusio compoet usig high-frequecy record of the process o a iterval with fixed spa. The test is based o the fact that the leadig compoet of the high-frequecy icremets of Itô semimartigales with o-vaishig diffusio compoet is a ormally distributed radom variable with ukow stochastic variace that is proportioal to the legth of the high-frequecy iterval. We form a oparametric estimate of the local variace ad scale the high-frequecy icremets by it. To remove the effect of big jumps, we further discard the high-frequecy icremets exceedig a time-varyig threshold determied by our estimate of the local variace. Our test is the based o comparig the distace betwee the empirical cdf of the rescaled high-frequecy icremets ot exceedig the threshold ad the cdf of a stadard ormal radom variable. We show that the test has a good power agaist Itô semimartigales with o diffusio compoet as well as Itô semimartigales cotamiated with oise. Keywords: High-frequecy data, Itô semimartigale, jumps, Kolmogorov-Smirov test, stable process, stochastic volatility 1 Itroductio The stadard jump-diffusio model used for modelig may stochastic processes is a Itô semimartigale give by the followig differetial equatio dx t = α t dt + σ t dw t + dy t, (1) where α t ad σ t are processes with càdlàg paths, W t is a Browia motio ad Y t is a Itô semimartigale process of pure-jump type (i.e., semimartigale with zero secod characteristic, Defiitio II..6 i Jacod ad Shiryaev (003)). At high-frequecies, provided σ t does ot vaish, the domiat compoet of X t is its cotiuous martigale compoet ad at these frequecies the icremets of X t i (1) behave like scaled ad

Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.030 idepedet Gaussia radom variables. That is, for each fixed t, we have the followig covergece 1 h (X t+sh X t ) L σ t (B t+s B s ), as h 0 ad s [0, 1], () where B t is a Browia motio ad the above covergece is for the Skorokhod topology. There are two distictive features of the covergece i (). The first is the scalig factor of the icremets o the left side of () is the square-root of the legth of the high-frequecy iterval, a feature that has bee used i developig tests for presece of diffusio. The secod distictive feature is that the limitig distributio of the (scaled) icremets o the right side of () is mixed Gaussia (the mixig give by σ t ). The result i () implies that the high-frequecy icremets are approximately Gaussia but the key obstacle of testig directly () is that the variace of the icremets, σ t, is ukow ad further is approximately costat oly over a short iterval of time. Therefore, o a first step we split the high-frequecy icremets ito blocks (with legth that shriks asymptotically to zero as we sample more frequetly) ad form local estimators of volatility over the blocks. We the scale the high-frequecy icremets withi each of the blocks by our local estimates of the volatility. This makes the scaled high-frequecy icremets approximately i.i.d. variables with uit variace. cetered ormal radom To purge further the effect of big jumps, we the discard the icremets that exceed a time-varyig threshold (that shriks to zero asymptotically) with timevariatio determied by our estimator of the local volatility. We derive a (fuctioal) Cetral Limit Theorem (CLT) for the covergece of the empirical cdf of the scaled high-frequecy icremets, ot exceedig the threshold, to the cdf of a stadard ormal radom variable. The rate of covergece ca be made arbitrary close to, by appropriately choosig the rate of icrease of the block size, where is the umber of high-frequecy observatios withi the time iterval. This is achieved despite the use of the block estimators of volatility, each of which ca estimate the spot volatility σ t at a rate o faster tha 1/4. The developed limit theory the allows us to perform a test of the Kolmogorov-Smirov type for the local Gaussiaity of the high-frequecy icremets by evaluatig maximal differece betwee two cdf-s. We further derive the behavior of our statistic i two possible alteratives. Setup We start with the formal setup ad assumptios. We will geeralize the setup i (1) to accommodate also the alterative hypothesis i which X ca be of pure-jump type. Thus, the geeralized setup we cosider is the followig. The process X is defied o a filtered space (Ω, F, (F t ) t 0, P) ad has

Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.031 the followig dyamics dx t = α t dt + σ t ds t + dy t, (3) where α t, σ t ad Y t are processes with càdlàg paths adapted to the filtratio ad Y t is of pure-jump type. S t is a stable process with a characteristic fuctio, see e.g., Sato (1999), give by log [ E(e ius ] t = t cu β ta(πβ/) if β 1, (1 iγsig(u)φ), Φ = (4) π log u, if β = 1, where β (0, ] ad γ [ 1, 1]. Whe β = ad c = 1/ i (4), we recover our origial jumpdiffusio specificatio i (1) i the itroductio. Whe β <, X is of pure-jump type. Y t i (3) will play the role of a residual jump compoet at high frequecies (see assumptio A below). We ote that Y t ca have depedece with S t (ad α t ad σ t ), ad thus X t does ot iherit the tail properties of the stable process S t, e.g., X t ca be drive by a tempered stable process whose tail behavior is very differet from that of the stable process. 3 Test for Local Guassiaity of High-Frequecy Data We ow derive a test for the local Gaussiaity at high-frequecies, i.e., the result i (), usig high-frequecy record of X. More specifically, we assume X is observed o the equidistat grid 0, 1,..., 1 with. We start with the costructio of the test statistic followed by limit theory for its behavior uder the ull ad a set of alteratives. 3.1 Empirical CDF of the Devolatilized High Frequecy Icremets I the derivatio of the test statistic we will suppose that S t is a Browia motio ad the i the ext subsectio we will derive the behavior of the statistic uder the more geeral case whe S t is stable. The result i () suggests that the high-frequecy icremets i X = X i X i 1 are approximately Gaussia with variace give by the value of the process σt at the begiig of the icremet. Of course, the stochastic volatility σ t is ot kow ad varies over time. Hece to test for the local Gaussiaity of the high-frequecy icremets we first eed to estimate locally σ t ad the divide the high-frequecy icremets by this estimate. To this ed, we divide the iterval [0, 1] ito blocks each of which cotais k icremets, for some determiistic sequece k with k / 0. O each of the blocks our local estimator of σ t is give by V j = π jk k i=(j 1)k + i 1X i X, j = 1,..., /k. (5) 3

Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.03 V j is the Bipower Variatio proposed by Bardorff-Nielse ad Shephard (004, 006) for measurig the quadratic variatio of the diffusio compoet of X. We ote that a alterative measure of σ t ca be costructed usig the so-called trucated variatio. I turs out, however, that while the behavior of the two volatility measures uder the ull is the same, it differs i the case whe S t is stable with β <. Usig a trucated variatio type estimator of σ t will lead to degeerate limit of our statistic, ulike the case of usig the Bipower Variatio estimator i (5). For this reaso we prefer the latter i our aalysis. We use the first m icremets o each block, with m k, to test for local Gaussiaity. The case m = k amouts to usig all icremets i the block ad we will eed m < k for derivig a feasible CLT later o. Fially, we eed to remove the high-frequecy icremets that cotai big jumps. The total umber of icremets used i our statistic is thus give by N (α, ϖ) = /k j=1 (j 1)k +m i=(j 1)k +1 ( ) 1 i X α V j, (6) where α > 0 a ϖ (0, 1/). We ote that we use a time-varyig threshold i our trucatio to accout for the time-varyig σ t. With this, we defie /k (j 1)k F 1 +m (τ) = N 1 i X 1 (α, ϖ) j=1 i=(j 1)k +1 V { } τ i X α V j ϖ, (7) j which is simply the empirical cdf of the devolatilized icremets that do ot cotai big jumps. Uder the ull of model (1), F (τ) should be approximately the cdf of a stadard ormal radom variable. 3. Covergece i probability of F (τ) We ext derive the limitig behavior of F (τ) both uder the ull of model (1) as well as uder a set of alteratives. We start with the case whe X t is give by (3). Theorem 1 Uder regularity coditios, if the block size grows at the rate ad m as. The if β (1, ], we have k q, for some q (0, 1), (8) F (τ) P F β (τ), as, (9) where the above covergece is uiform i τ over compact subsets of (, 0) (0, + ); F β (τ) is the cdf of π S 1 E S 1 (S 1 is the value of the β-stable process S t at time 1) ad F (τ) equals the cdf of a stadard ormal variable Φ(τ). 4

Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.033 We ext derive the limitig behavior of F (τ) uder a alterative of the local Gaussia settig i (1) which is of particular relevace i fiacial applicatios, i.e., the situatio whe the Itô semimartigale X is cotamiated by oise. 3.3 CLT for F (τ) uder Local Gaussiaity To derive a test based o F (τ) we eed a feasible CLT for its behavior uder the ull of S t beig Browia motio. This is doe i the followig theorem. Theorem Let X t satisfy (3) with S t beig a Browia motio ad assume that assumptio B holds. Further, let the block size grow at the rate m k 0, k q, for some q (0, 1/), whe. (10) We the have locally uiformly i subsets of (, 0) (0, + ) F (τ) Φ(τ) = Ẑ 1 (τ) + Ẑ (τ) + 1 τ Φ (τ) τφ ( (τ) (π ) ) + π 3 k 8 ( ) (11) 1 + o p, k ( /k m Ẑ1 (τ) /k k Ẑ (τ)) L (Z 1 (τ) Z (τ)), (1) where Φ(τ) is the cdf of a stadard ormal variable ad Z 1 (τ) ad Z (τ) are two idepedet Gaussia processes with covariace fuctios Cov (Z 1 (τ 1 ), Z 1 (τ )) = Φ(τ 1 τ ) Φ(τ 1 )Φ(τ ), [ τ1 Φ (τ 1 ) τ Φ ] ( (τ ) (π ) ) Cov (Z (τ 1 ), Z (τ )) = + π 3, τ 1, τ R \ 0. (13) 3.4 Test for Local Gaussiaity We proceed with a feasible test for a jump-diffusio model of the type give i (1) usig Theorem. { } C = N (α, ϖ) F (τ) Φ(τ) > q (α, A) (14) sup τ A where recall Φ(τ) deotes the cdf of a stadard ormal radom variable, α (0, 1), A R \ 0 is a fiite uio of compact sets with positive Lebesgue measure, ad q (α, A) is the (1 α)-quatile of sup Z 1(τ) + τ A m m τ Φ (τ) τφ (τ) Z (τ) + k k k 8 ( (π ) ) + π 3, (15) with Z 1 (τ) ad Z (τ) beig the Gaussia processes defied i Theorem. We ca easily evaluate q (α, A) via simulatio. 5

Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.034 Now, i terms of the size ad power of the test, uder assumptios A ad B, usig Theorem 1 ad Theorem, we have lim P (C ) = α, if β = ad lim if 4 Proofs available olie Refereces P (C ) = 1, if β (1, ), (16) Bardorff-Nielse, O. ad N. Shephard (004). Power ad Bipower Variatio with Stochastic Volatility ad Jumps. Joural of Fiacial Ecoometrics, 1 37. Bardorff-Nielse, O. ad N. Shephard (006). Ecoometrics of Testig for Jumps i Fiacial Ecoomics usig Bipower Variatio. Joural of Fiacial Ecoometrics 4, 1 30. Jacod, J. ad A. Shiryaev (003). Limit Theorems For Stochastic Processes (d ed.). Berli: Spriger- Verlag. Sato, K. (1999). Lévy Processes ad Ifiitely Divisible Distributios. Cambridge, UK: Cambridge Uiversity Press. 6