Location Multiplicative Error Model: Asymptotic Inference and Empirical Analysis
Qian Li
Department of Mathematics and Statistics
University of Missouri-Kansas City
ql35d@mail.umkc.edu
October 29, 2015

Outline of Topics
- Introduction
- GARCH
- Multiplicative Error Model (MEM)
- Model Specification and Assumption
- An Alternative Representation
- Estimator and Asymptotic Results
- Connection Between MEM and Location MEM
- Simulation Study
- Empirical Analysis on IBM Volume per Trade

GARCH and ACD

The ARCH model, proposed by Engle (1982) and later generalized to GARCH by Bollerslev (1986), describes a process $\{y_t\}_{t=1}^{n}$ (such as the price volatility) as

$$y_t = \sigma_t \epsilon_t, \qquad \sigma_t^2 = \omega_0 + \sum_{i=1}^{p} \alpha_i y_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \tag{1}$$

where the $\epsilon_t$ are i.i.d. with $E\epsilon_t = 0$ and $E\epsilon_t^2 = 1$.

The Autoregressive Conditional Duration (ACD) model proposed in Engle and Russell (1998) is one of the extensions of GARCH, specified as

$$d_t = h_t z_t, \qquad h_t = \omega_0 + \sum_{i=1}^{p} \alpha_i d_{t-i} + \sum_{j=1}^{q} \beta_j h_{t-j}, \tag{2}$$

where the $z_t$ are i.i.d. with $z_t > 0$ and $Ez_t = 1$.

Multiplicative Error Model (MEM)

Engle (2002) generalizes ACD to the Multiplicative Error Model (MEM).

Model Specification

$$r_t = h_t z_t, \qquad h_t = \omega + \sum_{i=1}^{p} \alpha_i r_{t-i} + \sum_{j=1}^{q} \beta_j h_{t-j} + \gamma' w_t, \tag{3}$$

where the $z_t$ are i.i.d. with $z_t \ge 0$ and $Ez_t = 1$, and $w_t$ is a vector of weakly exogenous variables.

Remark: The model applies to a wide range of non-negative financial variables:
- Durations (for example, price duration and volume duration)
- Volume of shares per trade
- Daily high-low range of transaction price
- Ask-bid spread

Motivation for the Extended Model I

IBM stock transaction data, April 8th-12th, 2013.

[Figure: two panels, trade duration and trade volume, each plotted against trade index.]

- The intraday trend in trade duration must be removed by a smoothing technique before modeling.
- No typical intraday trend is observed in volume per trade.
- The volume per trade data has a lower bound determined by the minimum order size allowed by the exchange.

Motivation for the Extended Model II

Table: Adjusted Duration and Raw Volume

                    1st Quartile/Min   Median/Min   3rd Quartile/Min   Max/Min
Adjusted Duration   8.410256           64.61538     49.20379           1015.385
Raw Volume          1                  2.867405     3                  81.83

Note: The minimum of volume is not trivial compared to the scale of the whole sample.

Questions:
- Is there a constant lower bound for volume?
- How can this lower bound be incorporated in MEM?
- Can the minimum statistic be used to estimate the lower bound?
- How can the autoregressive equation in MEM be estimated with the lower bound?

Location MEM(p,q) and the Matrix Expression

Location Linear MEM of order (p,q):

$$r_t = \mu_0 + h_t z_t \tag{4}$$

$$h_t = \omega + \sum_{i=1}^{p} \alpha_i (r_{t-i} - \mu_0) + \sum_{j=1}^{q} \beta_j h_{t-j} \tag{5}$$

Following Berkes et al. (2003), Location MEM(p,q) can be rewritten as

$$X_{t+1} = C_t X_t + D, \tag{6}$$

where $X_t = (h_t, \ldots, h_{t-q+1}, x_{t-1}, \ldots, x_{t-p+1})' \in R^{p+q-1}$, $x_t = r_t - \mu_0$, and

$$C_t = \begin{pmatrix} \tau_t & \beta_q & \alpha & \alpha_p \\ I_{q-1} & 0 & 0 & 0 \\ \xi_t & 0 & 0 & 0 \\ 0 & 0 & I_{p-2} & 0 \end{pmatrix},$$

with $\alpha = (\alpha_2, \ldots, \alpha_{p-1}) \in R^{p-2}$, $\tau_t = (\beta_1 + \alpha_1 z_t, \beta_2, \ldots, \beta_{q-1}) \in R^{q-1}$, $\xi_t = (z_t, 0, \ldots, 0) \in R^{q-1}$, and $D = (\omega, 0, \ldots, 0)' \in R^{p+q-1}$.

Assumption Based on the Matrix Expression

Assumption 1:
(1) The $z_t$ are i.i.d. and nondegenerate.
(2) $E(\ln \|C_0\|) < \infty$.
(3) $\gamma_L < 0$.
(4) $E|z_t|^s < \infty$ for some $s \ge 1$.

Here $\|\cdot\|_d$ is the Euclidean norm in $R^d$, $\|M\| = \sup\{\|Mx\|_d / \|x\|_d : x \in R^d, x \ne 0\}$, and $\gamma_L$ is defined as

$$\gamma_L = \inf_{0 \le n < \infty} \frac{1}{n+1} E \ln \|C_0 C_1 \cdots C_n\|.$$

Proposition 1: Under Assumption 1,
(1) $X_{t+1} = D + \sum_{0 \le k < \infty} C_t \cdots C_{t-k} D$,
(2) $r_t$ and $h_t$ are strictly stationary, and $E|h_t|^s < \infty$, $E|x_t|^s < \infty$.
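
For intuition on condition (3), consider the special case p = q = 1. The matrix $C_t$ then has dimension one and reduces to the scalar $\beta_1 + \alpha_1 z_t$, so the product inside the norm is a product of i.i.d. positive numbers and $\gamma_L = E \ln(\beta_1 + \alpha_1 z_0)$. The sketch below estimates this expectation by Monte Carlo. The function name, the unit exponential innovations, and the parameter values (taken from the simulation table, $\alpha = 0.1$, $\beta = 0.85$) are illustrative assumptions, not part of the original slides.

```python
import math
import random


def lyapunov_mem11(alpha, beta, n_draws=100_000, seed=0):
    """Monte Carlo estimate of E[ln(beta + alpha * z)] with z ~ Exp(1).

    Assumption: p = q = 1, so C_t is the scalar beta + alpha * z_t and
    gamma_L equals this expectation.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        z = rng.expovariate(1.0)  # unit-mean exponential innovation
        total += math.log(beta + alpha * z)
    return total / n_draws


gamma_hat = lyapunov_mem11(alpha=0.1, beta=0.85)
print(f"estimated gamma_L = {gamma_hat:.4f}")
```

By Jensen's inequality the true value lies below $\ln(\alpha + \beta) = \ln 0.95 < 0$, so condition (3) holds for these parameter values.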

Alternative Representation of $h_t$

$h_t$ can be expressed through unique coefficients $c_j$ as

$$h_t = c_0 + \sum_{1 \le i < \infty} c_i (r_{t-i} - \mu_0), \tag{7}$$

where $\alpha_1 = c_1$, $\alpha_2 = c_2 - \beta_1 c_1$, $\alpha_3 = c_3 - \beta_1 c_2 - \beta_2 c_1$, $\ldots$

Remark: $c_0, c_1, c_2, \ldots$ are functions of $\theta_0 = (\omega, \alpha_1, \ldots, \alpha_p, \beta_1, \ldots, \beta_q)$.

Definition 5: $u = (x, s_1, \ldots, s_p, t_1, \ldots, t_q)$ is the vector of unknown parameters for $\theta_0$.

Definition 6: The $c_i(u)$ are the unknown parameters for $c_0, c_1, c_2, \ldots$
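
The relations between the $\alpha_i$ and the $c_i$ can be inverted into a recursion: $c_i = \alpha_i + \sum_{j=1}^{\min(q, i-1)} \beta_j c_{i-j}$, with $\alpha_i = 0$ for $i > p$. Matching the constant terms after substituting (7) into (5) gives $c_0 = \omega / (1 - \sum_j \beta_j)$. The following is a minimal sketch of that recursion. The function name and the example parameter values are hypothetical, and the formula for $c_0$ is derived here rather than quoted from the slides.

```python
def arch_inf_coeffs(omega, alphas, betas, n_terms):
    """Return [c_0, c_1, ..., c_{n_terms}] for representation (7).

    Assumes sum(betas) < 1 so that c_0 is well defined.
    """
    p, q = len(alphas), len(betas)
    c = [omega / (1.0 - sum(betas))]  # c_0
    for i in range(1, n_terms + 1):
        a_i = alphas[i - 1] if i <= p else 0.0
        # c_i = alpha_i + sum_{j=1}^{min(q, i-1)} beta_j * c_{i-j};
        # the upper limit keeps c_0 out of the recursion.
        c_i = a_i + sum(betas[j - 1] * c[i - j] for j in range(1, min(q, i - 1) + 1))
        c.append(c_i)
    return c


# Location MEM(1,1): the coefficients decay geometrically, c_i = alpha * beta**(i-1).
coeffs = arch_inf_coeffs(0.03, [0.1], [0.85], n_terms=5)
print(coeffs)
```

For order (1,1) this reproduces $c_i = \alpha_1 \beta_1^{i-1}$, which makes the geometric decay of the weights in (7) explicit.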

Estimator for Location MEM(p,q)

The quasi likelihood function for Location MEM(p,q) takes the form

$$L_n(\mu, u) = \frac{1}{n} \sum_{t=1}^{n} l_t(\mu, u), \qquad l_t(\mu, u) = -\left( \ln h_t(\mu, u) + \frac{r_t - \mu}{h_t(\mu, u)} \right),$$

$$h_t(\mu, u) = x + \sum_{i=1}^{p} s_i (r_{t-i} - \mu) + \sum_{j=1}^{q} t_j h_{t-j}(\mu, u) = c_0(u) + \sum_{i=1}^{t-1} c_i(u) (r_{t-i} - \mu). \tag{8}$$

Definition 1: Let $r_{n(1)}$ be the minimum of $\{r_t\}_{t=1}^{n}$.

Definition 2: The modified QMLE for $\theta_0$ is $\hat\theta = \arg\max_{u \in \Theta} L_n(r_{n(1)}, u)$.
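
To make the objective concrete, the sketch below evaluates $L_n(\mu, u)$ for order (1,1) and plugs in the minimum statistic for $\mu$, as in Definition 2. It is an illustration under stated assumptions: the function names and the toy data are hypothetical, and the recursion is started at $h_1 = x / (1 - t_1) = c_0(u)$, which is what the truncated sum in (8) implies when no pre-sample observations are available.

```python
import math


def quasi_loglik(r, mu, x, s1, t1):
    """Average quasi log-likelihood L_n(mu, u) for Location MEM(1,1).

    u = (x, s1, t1) plays the role of (omega, alpha_1, beta_1).
    Assumption: h_1 = x / (1 - t1), i.e. c_0(u) from the truncated sum in (8).
    """
    h = x / (1.0 - t1)
    total = 0.0
    for value in r:
        excess = value - mu
        total += -(math.log(h) + excess / h)  # l_t(mu, u)
        h = x + s1 * excess + t1 * h          # recursion in (8) for h_{t+1}
    return total / len(r)


def modified_qmle_objective(r, x, s1, t1):
    """L_n evaluated at the minimum statistic r_{n(1)} in place of mu."""
    return quasi_loglik(r, min(r), x, s1, t1)


# Hypothetical toy series, for illustration only.
toy = [3.0, 2.5, 4.0, 2.0, 3.5]
print(modified_qmle_objective(toy, x=0.03, s1=0.1, t1=0.85))
```

The modified QMLE is then obtained by maximizing `modified_qmle_objective` over the parameter space with a numerical optimizer of one's choice.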

Consistency of the Minimum Statistic I

Lemma: Under Assumption 1, $r_{n(1)} \xrightarrow{p} \mu_0$.

Proof: For any $x > 0$,

$$P(r_{n(1)} - \mu_0 > x) = P(h_t z_t > x \text{ for } 1 \le t \le n) = P\left(z_t > \frac{x}{h_t} \text{ for } 1 \le t \le n\right). \tag{9}$$

For a fixed integer $N$, if $z_t \ge C$ for $N < t \le n$, then

$$h_{n+1} \ge \omega_0 (1 - \beta_0) + (\alpha_0 C + \beta_0) h_n \ge (\alpha_0 C + \beta_0)^{n+1-N} h_N \ge \omega_0 (\alpha_0 C + \beta_0)^{n+1-N}. \tag{10}$$

Let $A_n$ be the event $\{$there exists $N$ such that $z_t \ge C$ for $N < t \le n\}$ and let $B = \alpha_0 C + \beta_0$. Then, by Markov's inequality in the last step,

$$P(A_n) \le P\left(h_{n+1} \ge \omega_0 B^{n+1-N}\right) \le P\left(h_{n+1}^{\delta} \ge \omega_0^{\delta} B^{(n+1-N)\delta}\right) \le \frac{E h_{n+1}^{\delta}}{\omega_0^{\delta} B^{(n+1-N)\delta}}. \tag{11}$$

Consistency of the Minimum Statistic II

For some $\delta$ and $C$, we have $\lim_{n \to \infty} P(A_n) = 0$. Since $A_n \downarrow A$, where $A = \{$there exists $N$ such that $z_t \ge C$ for all $t > N\}$,

$$P(A^c) = 1 - \lim_{n \to \infty} P(A_n) = 1.$$

That is, for any integer $N$ there exists $t_N > N$ such that $z_{t_N} < C$. Select a monotone increasing sequence $\{t_N\}_{N=1}^{\infty}$ such that $z_{t_N} < C$; clearly $t_N \to \infty$ as $N \to \infty$. Let $n = t_N$. Then, by the assumption that the $z_t$ are independent,

$$P\left(z_t > \frac{x}{h_t} \text{ for } 1 \le t \le n\right) \le P\left(z_{t_1} > \frac{x}{h_{t_1}}, \ldots, z_{t_N} > \frac{x}{h_{t_N}}\right) \le P(z_{t_1} < C, \ldots, z_{t_N} < C) \le P(z_t < C)^N. \tag{12}$$

The right-hand side of the third inequality in (12) approaches 0 as $N \to \infty$. That is, $P(r_{n(1)} - \mu_0 > x) \to 0$ as $n \to \infty$.
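
The lemma can be illustrated numerically. The sketch below simulates a Location MEM(1,1) path and reports the gap $r_{n(1)} - \mu_0$ for growing prefixes of the same path. The function name, the unit exponential innovations, the starting value $h_1 = \omega / (1 - \alpha - \beta)$, and the parameter values are assumptions made for this illustration only.

```python
import random


def simulate_location_mem11(n, mu0, omega, alpha, beta, seed=0):
    """Simulate r_t = mu0 + h_t * z_t, h_{t+1} = omega + alpha*(r_t - mu0) + beta*h_t.

    Assumptions: z_t ~ Exp(1), and h_1 is set to the unconditional mean
    omega / (1 - alpha - beta).
    """
    rng = random.Random(seed)
    h = omega / (1.0 - alpha - beta)
    path = []
    for _ in range(n):
        z = rng.expovariate(1.0)
        r = mu0 + h * z
        path.append(r)
        h = omega + alpha * (r - mu0) + beta * h
    return path


mu0 = 8.0
path = simulate_location_mem11(20_000, mu0, omega=0.03, alpha=0.1, beta=0.85)
for n in (200, 2_000, 20_000):
    gap = min(path[:n]) - mu0
    print(f"n = {n:6d}   r_n(1) - mu0 = {gap:.6f}")
```

Because each prefix contains the previous one, the printed gap can only shrink as n grows; the lemma states that it shrinks to zero in probability.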

Asymptotic Results for the Estimators

Assumption 2: $E|r_{n(1)} - \mu_0|^s = o\left(\frac{1}{n}\right)$ for some $s < 1$.

Theorem: Under Assumptions 1 and 2, and if $E \ln h_t < \infty$ holds,

(1) $\hat\theta \xrightarrow{p} \theta_0$.

(2) $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{D} N(0, V_0)$, where $V_0 = G_0^{-1} C_0 G_0^{-1}$, $C_0 = E\left(\nabla_u l_t(\theta_0) \nabla_u l_t(\theta_0)'\right)$ and $G_0 = E \nabla_u^2 l_t(\theta_0)$.

(3) $\hat V_n = \hat G_n^{-1}(\hat\theta) \hat C_n(\hat\theta) \hat G_n^{-1}(\hat\theta) \xrightarrow{p} V_0 = G_0^{-1} C_0 G_0^{-1}$, where

$$\hat G_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} \nabla_\theta^2 l_t(r_{n(1)}, \theta), \qquad \hat C_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} \nabla_\theta l_t(r_{n(1)}, \theta) \nabla_\theta l_t(r_{n(1)}, \theta)'.$$

Connection Between MEM and Location MEM

Suppose $r_t$ is a Location MEM(p,q) process. Let $H_t = \mu_0 + h_t$ and $x_t = \dfrac{\mu_0 + h_t z_t}{\mu_0 + h_t}$. Then

$$r_t = H_t x_t, \qquad H_t = \omega^* + \sum_{i=1}^{p} \alpha_i r_{t-i} + \sum_{j=1}^{q} \beta_j H_{t-j}, \tag{13}$$

where $\omega^* = \omega + \left(1 - \sum_{i=1}^{p} \alpha_i - \sum_{j=1}^{q} \beta_j\right) \mu_0$ and $E(x_t) = 1$.

Remark: The $x_t$ are not i.i.d. When $\mu_0$ is negligible compared to $h_t$, $x_t$ can be approximated by $z_t$. When $\mu_0$ is significant, the estimation for process (13) is not valid.
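
The intercept in (13) follows from substituting $h_t = H_t - \mu_0$ into (5). The sketch below computes it and checks numerically, along a simulated Location MEM(1,1) path, that the recursion in (13) reproduces $\mu_0 + h_t$. The function name, the parameter values ($\mu_0 = 2$, $\omega = 0.03$, $\alpha = 0.1$, $\beta = 0.85$), and the exponential innovations are hypothetical choices for illustration.

```python
import random


def omega_star(omega, alphas, betas, mu0):
    """Intercept of representation (13): omega + (1 - sum(alpha) - sum(beta)) * mu0."""
    return omega + (1.0 - sum(alphas) - sum(betas)) * mu0


# Hypothetical Location MEM(1,1) parameters.
mu0, omega, alpha, beta = 2.0, 0.03, 0.1, 0.85
w_star = omega_star(omega, [alpha], [beta], mu0)

rng = random.Random(1)
h = omega / (1.0 - alpha - beta)  # assumed starting value for h_1
max_gap = 0.0
for _ in range(1_000):
    r = mu0 + h * rng.expovariate(1.0)        # r_t = mu0 + h_t z_t
    H = mu0 + h                               # H_t = mu0 + h_t
    h = omega + alpha * (r - mu0) + beta * h  # h_{t+1} from (5)
    H_next = w_star + alpha * r + beta * H    # H_{t+1} from (13)
    max_gap = max(max_gap, abs(H_next - (mu0 + h)))

print(w_star, max_gap)
```

The two recursions agree up to floating-point error, which confirms that only the intercept changes when the location parameter is absorbed into $H_t$; the innovation $x_t$, however, is no longer independent of the past.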

Consistency and Asymptotic Normality for Location MEM(1,1)

Table: Modified QMLE of Location MEM(1,1) for Different Distributions

                      ω          α          β
True Value            0.03       0.1        0.85
Exponential   Mean    0.030890   0.100333   0.84802
              S.D.    0.005573   0.009727   0.016241
Weibull       Mean    0.030661   0.101692   0.846497
              S.D.    0.005223   0.008585   0.014517
Gamma         Mean    0.029305   0.102111   0.845881
              S.D.    0.0052     0.008652   0.015456

QQ Plot of Standardized Estimates for Different Distributions

[Figure: QQ plots of the standardized estimates of ω, α and β against the standard normal, for Exponential, Weibull and Gamma innovations.]

Estimates for DGP of Location MEM(2,2) with $\mu_0 = 8$

Table: Estimates by Two Models

                        ω          α_1       α_2       β_1       β_2
True Value in L-MEM     0.03       0.05      0.1       0.5       0.3
True Value in MEM       0.08       0.05      0.1       0.5       0.3
Est. of L-MEM           0.03       0.05      0.1       0.503     0.296
(S.D.)                  (0.003)    (0.006)   (0.009)   (0.084)   (0.074)
Est. of MEM             0.0055     0.06      0.1       0.408     0.433
(S.D.)                  (0.0001)   (0.006)   (0.008)   (0.072)   (0.066)

QQ Plot of Standardized Estimates for $\mu_0 = 8$ at Order (2,2)

[Figure: QQ plots of the standardized estimates of ω, α_1, α_2, β_1 and β_2 against the standard normal, for Location MEM and for MEM.]

Empirical Improvement at Lag of (1,2): IBM Trade Volume at NYSE, April 8th-12th, 2013

Table: Improvement on Raw Data

                     Location MEM                   MEM
Est. of θ_0          2.055, 0.029, 0.44, 0.51       4.818, 0.033, 0.414, 0.529
S.D.                 0.68, 0.005, 0.125, 0.125      1.675, 0.005, 0.151, 0.153
LLH                  -56772.35                      -62598.45
Ljung-Box Q(15)      23.6975                        32.2416
p-value of Q(15)     0.07041                        0.005968
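
For reference, the Ljung-Box statistic reported in the table is $Q(m) = n(n+2) \sum_{k=1}^{m} \hat\rho_k^2 / (n-k)$, where $\hat\rho_k$ is the lag-k sample autocorrelation. The sketch below computes it for a generic series. The function name is hypothetical, and the slides do not say which series the test was applied to; applying it to the standardized residuals of the fitted model is an assumption here.

```python
def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q(m) = n (n + 2) * sum_{k=1}^{m} rho_k^2 / (n - k)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        # lag-k sample autocorrelation
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho_k * rho_k / (n - k)
    return n * (n + 2) * q


# A strongly (negatively) autocorrelated toy series gives a large Q(1).
alternating = [1.0, -1.0] * 50
print(ljung_box_q(alternating, 1))
```

A smaller Q(15), and hence a larger p-value, indicates less remaining serial correlation, which is the sense in which the Location MEM fit improves on the MEM fit in the table.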

Main Results:
- Asymptotic properties of the modified QMLE for Location MEM.
- Simulation study and real data analysis illustrate the asymptotic results and the improvement in goodness of fit.
- There is no need to remove the intraday trend when modeling volume per trade.

Future Work:
- Zero-Augmented MEM as proposed by Hautsch et al. (2014), and asymptotic analysis of a mixture-density QMLE.
- Asymptotic inference on Location MEM allowing for weakly exogenous variables, as in the GARCH-X model of Han (2013).
- Nonlinear structures in the sense of the regime-switching approach TACD in Zhang et al. (2001) and the asymmetric news impact curve AACD in Fernandes and Grammig (2006).

References

Amemiya, T. (1985). Advanced Econometrics. Cambridge: Harvard University Press.
Berkes, Istvan and Horvath, Lajos (2004). The Efficiency of the Estimators of the Parameters in GARCH Processes. The Annals of Statistics 32, pp. 633-655.
Berkes, Istvan, Horvath, Lajos, and Kokoszka, Piotr (2003). GARCH Processes: Structure and Estimation. Bernoulli 9, pp. 201-227.
Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics 31, pp. 307-327.
Engle, Robert (1982). Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica 50, pp. 987-1007.
Engle, Robert (2002). New Frontiers for ARCH Models. Journal of Applied Econometrics 17.
Engle, Robert and Russell, Jeffrey (1998). Autoregressive Conditional Duration: A New Model for Irregularly Spaced Transaction Data. Econometrica 66, pp. 1127-1162.
Fan, Jianqing, Qi, Lei, and Xiu, Dacheng (2014). Quasi-Maximum Likelihood Estimation of GARCH Models with Heavy-Tailed Likelihoods. Journal of Business & Economic Statistics 32.2, pp. 178-191.
Fernandes, M. and Grammig, J. (2006). A Family of Autoregressive Conditional Duration Models. Journal of Econometrics 130, pp. 1-23.
Han, Heejoon (2013). Asymptotic Properties of GARCH-X Processes. Journal of Financial Econometrics, nbt023.
Hautsch, Nikolaus, Malec, Peter, and Schienle, Melanie (2014). Capturing the Zero: A New Class of Zero-Augmented Distributions and Multiplicative Error Processes. Journal of Financial Econometrics 12, pp. 89-121.
Lee, S.W. and Hansen, B.E. (1994). Asymptotic Theory for the GARCH(1,1) Quasi-Maximum Likelihood Estimator. Econometric Theory 10, pp. 29-52.
Zhang, Michael Yuanjie, Russell, Jeffrey R., and Tsay, Ruey S. (2001). A Nonlinear Autoregressive Conditional Duration Model with Applications to Financial Transaction Data. Journal of Econometrics 104.1, pp. 179-207.

Thank you!