Understanding Regressions with Observations Collected at High Frequency over Long Span

Similar documents
Mean Reversion and Stationarity in Financial Time Series Generated from Diffusion Models

LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH

An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic

Taking a New Contour: A Novel View on Unit Root Test 1

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

A New Approach to Robust Inference in Cointegration

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

This chapter reviews properties of regression estimators and test statistics based on

Nonstationary Panels

Quick Review on Linear Multiple Regression

Testing for Stationarity at High Frequency

Econometrics of Panel Data

Choice of Spectral Density Estimator in Ng-Perron Test: Comparative Analysis

Ch.10 Autocorrelated Disturbances (June 15, 2016)

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Econometrics of Panel Data

Lectures on Structural Change

Asymptotic Least Squares Theory

Lecture 3 Stationary Processes and the Ergodic LLN (Reference Section 2.2, Hayashi)

The outline for Unit 3

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

Title. Description. Quick start. Menu. stata.com. xtcointtest Panel-data cointegration tests

Block Bootstrap HAC Robust Tests: The Sophistication of the Naive Bootstrap

On Perron s Unit Root Tests in the Presence. of an Innovation Variance Break

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Rank tests for short memory stationarity

University of Pavia. M Estimators. Eduardo Rossi

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis

MA Advanced Econometrics: Applying Least Squares to Time Series

Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, / 91. Bruce E.

The Spurious Regression of Fractionally Integrated Processes

CHANGE DETECTION IN TIME SERIES

Stochastic Processes

CHAPTER 21: TIME SERIES ECONOMETRICS: SOME BASIC CONCEPTS

A New Test in Parametric Linear Models with Nonparametric Autoregressive Errors

Cointegrating Regressions with Messy Regressors: J. Isaac Miller

On detection of unit roots generalizing the classic Dickey-Fuller approach

BCT Lecture 3. Lukas Vacha.

Stochastic Volatility and Correction to the Heat Equation

A Fixed-b Perspective on the Phillips-Perron Unit Root Tests

Moreover, the second term is derived from: 1 T ) 2 1

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications

Lecture 5: Unit Roots, Cointegration and Error Correction Models The Spurious Regression Problem

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Robust methods for detecting multiple level breaks in autocorrelated time series

On the Long-Run Variance Ratio Test for a Unit Root

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Outline. Nature of the Problem. Nature of the Problem. Basic Econometrics in Transportation. Autocorrelation

A Bootstrap View on Dickey-Fuller Control Charts for AR(1) Series

Economic modelling and forecasting

Stationarity Revisited, With a Twist. David G. Tucek Value Economics, LLC

6. The econometrics of Financial Markets: Empirical Analysis of Financial Time Series. MA6622, Ernesto Mordecki, CityU, HK, 2006.

1. Stochastic Processes and Stationarity

F9 F10: Autocorrelation

A BOOTSTRAP VIEW ON DICKEY-FULLER CONTROL CHARTS FOR AR(1) SERIES

Tests for a Broken Trend with Stationary or Integrated Shocks

Testing in GMM Models Without Truncation

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

by Ye Cai and Mototsugu Shintani

ECON3327: Financial Econometrics, Spring 2016

Weak Identification in Maximum Likelihood: A Question of Information

Regression with time series

Unit Root and Cointegration

Trend and initial condition in stationarity tests: the asymptotic analysis

On the Error Correction Model for Functional Time Series with Unit Roots

If we want to analyze experimental or simulated data we might encounter the following tasks:

Econometrics II - EXAM Answer each question in separate sheets in three hours

Topic 4 Unit Roots. Gerald P. Dwyer. February Clemson University

Darmstadt Discussion Papers in Economics

Duration-Based Volatility Estimation

Robust Nonnested Testing and the Demand for Money

Time Series Methods. Sanjaya Desilva

Lecture 19 - Decomposing a Time Series into its Trend and Cyclical Components

Volatility. Gerald P. Dwyer. February Clemson University

The Generalized Cochrane-Orcutt Transformation Estimation For Spurious and Fractional Spurious Regressions

Non-Stationary Time Series, Cointegration, and Spurious Regression

Specification Test for Instrumental Variables Regression with Many Instruments

Single Equation Linear GMM with Serially Correlated Moment Conditions

10. Time series regression and forecasting

Modelling of Economic Time Series and the Method of Cointegration

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

Econometrics. Week 11. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Questions and Answers on Unit Roots, Cointegration, VARs and VECMs

Testing for Unit Roots in Small Panels with Short-run and Long-run Cross-sectional Dependencies 1

The Functional Central Limit Theorem and Testing for Time Varying Parameters

Econometrics of Panel Data

Nonstationary Time Series:

EC821: Time Series Econometrics, Spring 2003 Notes Section 9 Panel Unit Root Tests Avariety of procedures for the analysis of unit roots in a panel

Asymptotic Distributions of Instrumental Variables Statistics with Many Instruments

Department of Econometrics and Business Statistics

ONLINE SUPPLEMENT FOR EVALUATING FACTOR PRICING MODELS USING HIGH FREQUENCY PANELS

13. Time Series Analysis: Asymptotics Weakly Dependent and Random Walk Process. Strict Exogeneity

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications

ECON 4160, Lecture 11 and 12

Nonstationary time series models

E 4101/5101 Lecture 9: Non-stationarity

Advanced Econometrics

Single Equation Linear GMM with Serially Correlated Moment Conditions

Transcription:

Understanding Regressions with Observations Collected at High Frequency over Long Span Yoosoon Chang Department of Economics, Indiana University Joon Y. Park Department of Economics, Indiana University & Sungkyunkwan University Abstract In this paper, we consider regressions with observations collected at small time interval over long period of time. For the formal asymptotic analysis, we assume that samples are obtained from continuous time stochastic processes, and let the sampling interval δ shrink down to zero and the sample span T diverge up to infinity. In this set-up, we show that the standard Wald statistic always diverges to infinity as long as δ 0 sufficiently fast relative to T, and regressions become spurious. This is indeed well expected from our asymptotics which shows that, in such a set-up, samples from any continuous time process become strongly dependent with their serial correlation approaching to unity, and regressions become spurious exactly as in the conventional spurious regression. However, as we show in the paper, the spuriousness of Wald test disappears if we account for strong persistency adequately using an appropriate longrun variance estimate. The empirical illustrations in the paper provide a strong and unambiguous support for the practical relevancy of our asymptotic theory. Keywords: high frequency regression, spurious regression, continuous time model, asymptotics 1

Understanding Regressions with Observations Collected at High Frequency over Long Span Joon Y. Park Department of Economics Indiana University Conference on Econometrics for Macroeconomics and Finance Hitotsubashi University 15-16 March 2014 1 / 80 Joon Y. Park Regressions at High Frequency

Objective To provide a formal analysis of the effect of sampling frequency on statistical inference to help with the choice of sampling frequency when observations are available at many different frequencies. It is widely, but rather vaguely, believed that there are both positive and negative aspects of using high frquency observations. In particular, high frequency observations are thought to give us more information on the one hand, but we have to deal with excessive noise and erratic volatility in high frequency observations. 2 / 80 Joon Y. Park Regressions at High Frequency

References Main Contents Chang and Park (2014), Understanding Regressions with Observations Collected at High Frequency over Long Span. Background Material Jeong and Park (2011), Asymptotic Theory of Maximum Likelihood Estimation for Diffusion Model. Jeong and Park (2014), Asymptotic Theory of Maximum Likelihood Estimation for Jump Diffusion Model. Kim and Park (2014), Asymptotics for Recurrent Diffusions with Application to High Frequency Regression. Lu and Park (2014), Estimation of Longrun Variance of Continuous Time Stochastic Process Using Discrete Sample. 3 / 80 Joon Y. Park Regressions at High Frequency

Contents Background and Preliminaries Empirical Illustrations Limit Theories in Continuous Time Continuous Time Regression Technical Assumptions Asymptotics of High Frequency Regression Further Analysis of Spuriousness Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 4 / 80 Joon Y. Park Regressions at High Frequency

Empirical Illustrations Limit Theories in Continuous Time Background and Preliminaries 5 / 80 Joon Y. Park Regressions at High Frequency

Illustrative Regression and Test Empirical Illustrations Limit Theories in Continuous Time We consider the simple regression model y i = α + βx i + u i for i = 1,..., n, and use samples at varying frequencies, collected at sampling interval δ, to test the null hypothesis using the standard t-statistic. α = 0 or β = 1 6 / 80 Joon Y. Park Regressions at High Frequency

Models Background and Preliminaries Empirical Illustrations Limit Theories in Continuous Time We consider the following four models Model (y i ) and (x i ) I 20-year T-bond rates and 3-month T-bill rates II 3-month eurodollar rates and 3-month T-bill rates III log US/UK exchange rates forward and spot IV log SP500 index futures and index as illustrative examples. 7 / 80 Joon Y. Park Regressions at High Frequency

t-test for α in Model I Empirical Illustrations Limit Theories in Continuous Time 8 / 80 Joon Y. Park Regressions at High Frequency

t-test for α in Model II Empirical Illustrations Limit Theories in Continuous Time 9 / 80 Joon Y. Park Regressions at High Frequency

t-test for α in Model III Empirical Illustrations Limit Theories in Continuous Time 10 / 80 Joon Y. Park Regressions at High Frequency

t-test for α in Model IV Empirical Illustrations Limit Theories in Continuous Time 11 / 80 Joon Y. Park Regressions at High Frequency

t-test for β in Model I Empirical Illustrations Limit Theories in Continuous Time 12 / 80 Joon Y. Park Regressions at High Frequency

t-test for β in Model II Empirical Illustrations Limit Theories in Continuous Time 13 / 80 Joon Y. Park Regressions at High Frequency

t-test for β in Model III Empirical Illustrations Limit Theories in Continuous Time 14 / 80 Joon Y. Park Regressions at High Frequency

t-test for β in Model IV Empirical Illustrations Limit Theories in Continuous Time 15 / 80 Joon Y. Park Regressions at High Frequency

Empirical Illustrations Limit Theories in Continuous Time Summary Our test results have some common features, which we may summarize as Test results are heavily dependent upon sampling intervals. Test values fluctuate significantly but without any trend until the sampling interval reaches one month, and start to increase rather sharply as the sampling interval further decreases. At the daily frequency, our tests strongly and unambiguously reject both α = 0 and β = 1 in all models. Clearly, it is not sensible to use daily observations for the t-test in these models. 16 / 80 Joon Y. Park Regressions at High Frequency

Data Plot for Model I Empirical Illustrations Limit Theories in Continuous Time 17 / 80 Joon Y. Park Regressions at High Frequency

Data Plot for Model II Empirical Illustrations Limit Theories in Continuous Time 18 / 80 Joon Y. Park Regressions at High Frequency

Data Plot for Model III Empirical Illustrations Limit Theories in Continuous Time 19 / 80 Joon Y. Park Regressions at High Frequency

Data Plot for Model IV Empirical Illustrations Limit Theories in Continuous Time 20 / 80 Joon Y. Park Regressions at High Frequency

Empirical Illustrations Limit Theories in Continuous Time Summary We may summarize the sample characteristics of our observations as All time series in our models are either nonstationary or near-nonstationary. Stock prices and exchange rates are clearly nonstationary. Interest rates are more stationary, but show some evidence of nonstationarity. Existing nonstationarity in stock prices and exchange rates cannot be fully removed by simple detrending or transformations. However, we may detrend or transform them at least to make them recurrent. In sum, it seems that recurrent nonstationary or near-nonstationary processes are of most relevant characteristics of our samples. 21 / 80 Joon Y. Park Regressions at High Frequency

Stationary Limit Theory Empirical Illustrations Limit Theories in Continuous Time For a wide class of mean zero stationary processes in continuous time, the continuous time version of invariance principle holds. That is, if we define, for a mean zero stationary process V = (V t ), the normalized process V T = (V T t ) on the unit interval [0, 1] by V T t = 1 T tt 0 V s ds, we have V T d V as T, where V is Brownian motion with variance ( 1 T ) ( T Π = lim T T E V t dt V t dt) > 0, 0 0 which is assumed to exist. 22 / 80 Joon Y. Park Regressions at High Frequency

Nonstationary Limit Theory Empirical Illustrations Limit Theories in Continuous Time For a general nonstationary process X = (X t ) in continuous time, the normalized process X T = (Xt T ) defined on the unit interval [0, 1] by Xt T = Λ 1 T X T t with some normalizing sequence (Λ T ) of matrices such that Λ T as T is expected to yield functional limit theory X T d X as T for a well defined limit process X. 23 / 80 Joon Y. Park Regressions at High Frequency

Empirical Illustrations Limit Theories in Continuous Time Limit Theory for Nonstationary Diffusion Let X be a diffusion in natural scale with speed density m, which is regularly varying with index r > 1, and define X T on [0, 1] by Then we have X T t = T 1/(r+2) X T t. X T d X for a generalized diffusion X as T in the space C[0, 1] of continuous functions defined on [0, 1]. 24 / 80 Joon Y. Park Regressions at High Frequency

Limit Process Background and Preliminaries Empirical Illustrations Limit Theories in Continuous Time The limit process X is defined by with { Ā t = inf s X = B Ā D } m(x)l(s, x)dx > t for 0 t 1, where B is Brownian motion with local time L, and m is the( limit homogeneous function ) of m given by m(x) = a1{x 0} + b1{x < 0} x r with a and b are nonnegative constants such that a + b > 0. 25 / 80 Joon Y. Park Regressions at High Frequency

Empirical Illustrations Limit Theories in Continuous Time Special Cases The scaled limit process [ ( ) 1 ( ) 1 ] a r+2 b r+2 + X r + 1 r + 1 is a skew Bessel process in natural scale of dimension 2(r + 1)/(r + 2), which reduces to Bessel process if b = 0 skew Brownian motion if r = 0, and Brownian motion if r = 0 and a = b 26 / 80 Joon Y. Park Regressions at High Frequency

Empirical Illustrations Limit Theories in Continuous Time Summary Asymptotics for diffusions and jump diffusions are now fully available in Jeong and Park (2011): diffusion asymptotics with proper moment conditions. Kim and Park (2014): more general diffusion asymptotics allowing for nonexisting moments. Jeong and Park (2014): standard asymptotics for jump diffusion models. The existing asymptotics for diffusions may well be extended to more general class of models. 27 / 80 Joon Y. Park Regressions at High Frequency

Continuous Time Regression Technical Assumptions 28 / 80 Joon Y. Park Regressions at High Frequency

Model Background and Preliminaries Continuous Time Regression Technical Assumptions We assume that the regression y i = x iβ + u i for i = 1,..., n is generated from the underlying continuous time regression Y t = X tβ + U t over 0 t T, where Y, X and U are continuous time stochastic processes, and y i = Y iδ, x i = X iδ, u i = U iδ, where δ is the sampling interval and T = nδ is the sample span. 29 / 80 Joon Y. Park Regressions at High Frequency

Continuous Time Regression Technical Assumptions Wald Statistic For the general linear hypothesis Rβ = r, we consider the standard Wald statistic F ( ˆβ) that is given by ( n ) 1 F ( ˆβ) = (R ˆβ r) R x i x i R i=1 1 (R ˆβ r)/ˆσ 2, where ˆβ is the OLS estimator for β and ˆσ 2 is the usual estimator for the error variance obtained from the OLS residuals (û i ). 30 / 80 Joon Y. Park Regressions at High Frequency

Modified Statistics Background and Preliminaries Continuous Time Regression Technical Assumptions In the presence of serial correlation in (u i ), the Wald test is generally not applicable. In this case, modified versions of the Wald statistic such as ( n ) 1 1 G( ˆβ) = (R ˆβ r) R x i x i R (R ˆβ r)/ˆω 2, i=1 where ˆω 2 is a consistent estimator for the longrun variance of (u i ) based on (û i ), or ( n 1 ( H( ˆβ) = (R ˆβ n ) 1 1 r) R x i x i) ˆΩ x i x i R (R ˆβ r), i=1 where ˆΩ is a consistent estimator for the longrun variance of (x i u i ) based on (x i û i ). 31 / 80 Joon Y. Park Regressions at High Frequency i=1

Asymptotic Framework Continuous Time Regression Technical Assumptions Our asymptotics are developed as δ 0 and T jointly. In particular, we assume that δ 0 sufficiently fast relative to T, and therefore, our asymptotics yield the same results as the sequential asymptotics relying on δ followed by T. However, our asymptotics are fundamentally different from the sequential asymptotics in that our asymptotics approximate the finite sample distributions for all combinations of δ and T, as long as T is large and δ is small relative to T. 32 / 80 Joon Y. Park Regressions at High Frequency

Assumption A Background and Preliminaries Continuous Time Regression Technical Assumptions Let Z be any element in U, XX or XU. We have E Z t = O(T ). 0 t T Moreover, if we define δ,t (Z) = sup 0 s,t T sup t s δ Zt c Zs, c then ( ) max δ, δ ( ) T sup Z t = O δ,t (Z) 0 t T as δ 0 and T. 33 / 80 Joon Y. Park Regressions at High Frequency

Continuous Time Regression Technical Assumptions Assumption B We assume that 1 T T as T for some σ 2 > 0. 0 U 2 t dt p σ 2 > 0 34 / 80 Joon Y. Park Regressions at High Frequency

Assumption C1 Background and Preliminaries Continuous Time Regression Technical Assumptions We assume that 1 T T 0 X t X tdt p M as T for some nonrandom matrix M > 0, and we have as T, where 1 T T 0 X t U t dt d N(0, Π) ( 1 T ) ( T Π = lim T T E X t U t dt X t U t dt) > 0, 0 0 which is assumed to exist. 35 / 80 Joon Y. Park Regressions at High Frequency

Assumption C2 Background and Preliminaries Continuous Time Regression Technical Assumptions For X T defined on [0, 1] with an appropriately defined nonsingular normalizing sequence (Λ T ) of matrices as X T t = Λ 1 T X T t, we assume that X T d X in the product space of D[0, 1] as T with linearly independent limit process X, and if we define U T on [0, 1] as U T t T t = T 1/2 U s ds, 0 then U T d U in D[0, 1] as T, where U is Brownian ( motion with variance π 2 = lim T T 1 T 2 E 0 tdt) U > 0, which is assumed to exist. 36 / 80 Joon Y. Park Regressions at High Frequency

Assumptions D1 and D2 Continuous Time Regression Technical Assumptions Assumption D1: δ,t (U), δ,t ( XX ) 0 and T δ,t ( XU ) 0 as δ 0 and T. Assumption D2: δ,t (U), Λ T 2 δ,t ( XX ) 0 and T ΛT δ,t ( XU ) 0 as δ 0 and T. 37 / 80 Joon Y. Park Regressions at High Frequency

Assumption E Background and Preliminaries Continuous Time Regression Technical Assumptions We let U c, the continuous part of U, be a semimartingale given by U c = A + B, where A and B are respectively the bounded variation and martingale components of U c satisfying A t A s sup = O p (a T ) and sup 0 s,t T t s 0 s,t T [B] t [B] s t s = O p (b T ), and a T δ,t (U) 0 and (b T / T ) δ,t (U) 0 with δ = 2 δ,t (U) as δ 0 and T. Moreover, we assume that 0 t T E( U t) 4 = O(T ) and T 1 [U] T p τ 2 for some τ 2 > 0 as T. 38 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics of High Frequency Regression Further Analysis of Spuriousness 39 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics of High Frequency Regression Further Analysis of Spuriousness Lemma Let Assumption A hold. If we define Z = U, XX or XU and z i = Z iδ for i = 1,..., n, we have 1 n n z i = 1 T i=1 T 0 ) Z t dt + O p ( δ,t ( Z ) for all small δ and large T. 40 / 80 Joon Y. Park Regressions at High Frequency

Theorem 1 Background and Preliminaries Asymptotics of High Frequency Regression Further Analysis of Spuriousness Assume Rβ = r and let Assumptions A and B hold. Under Assumption C1, we have T ( ˆβ β) d N δf ( ˆβ) d N R ( RM 1 R ) 1 RN/σ 2, where N = d N ( 0, M 1 ΠM 1), as δ 0 and T jointly satisfying Assumption D1. 41 / 80 Joon Y. Park Regressions at High Frequency

Theorem 2 Background and Preliminaries Asymptotics of High Frequency Regression Further Analysis of Spuriousness Assume Rβ = r and let Assumptions A and B hold. Under Assumptions C2, we have T Λ T ( ˆβ β) d P δf ( ˆβ) d P R ( RQ 1 R ) 1 RP/σ 2 ( ) 1 1 where P = 0 X t Xt 1 dt 0 X t dut and Q = 1 0 X t Xt dt, as δ 0 and T jointly satisfying Assumption D2. 42 / 80 Joon Y. Park Regressions at High Frequency

Pitfall Background and Preliminaries Asymptotics of High Frequency Regression Further Analysis of Spuriousness Note that ˆβ p β as δ 0 and T. However, we have F ( ˆβ) p as δ 0 and T. Therefore, we are led to falsely reject the null hypothesis when it is true, if δ is small relative to T. 43 / 80 Joon Y. Park Regressions at High Frequency

LLN at High Frequency Asymptotics of High Frequency Regression Further Analysis of Spuriousness The LLN holds for discrete samples, since 1 n as δ 0 and T. n u i = 1 T i=1 1 T N i=1 T 0 δu iδ U t dt p 0 44 / 80 Joon Y. Park Regressions at High Frequency

CLT at High Frequency Asymptotics of High Frequency Regression Further Analysis of Spuriousness However, the CLT for discrete samples fails to hold, since 1 n n i=1 u i = 1 δ 1 T n i=1 δu iδ 1 δ ( 1 T T as δ 0 and T. This is because 0 ) U t dt p cor (u i, u i 1 ) = cor (U iδ, U (i 1)δ ) 1 as δ 0 and the dependency becomes too strong for the CLT to hold. 45 / 80 Joon Y. Park Regressions at High Frequency

Residual Autoregression Asymptotics of High Frequency Regression Further Analysis of Spuriousness To further analyze the serial dependency in (u i ), we consider AR(1) regression u i = ρu i 1 + ε i, which is called the residual autoregression, and obtain the asymptotics of the OLS estimator ρ of the AR coefficient ρ. Given the consistency of the OLS estimator ˆβ of β, it is straightforward to show that the estimated residual AR coefficient ρ based on (u i ) is asymptotically equivalent to the estimated residual AR coefficient ˆρ, say, based on the OLS residual (û i ). 46 / 80 Joon Y. Park Regressions at High Frequency

Lemma Background and Preliminaries Asymptotics of High Frequency Regression Further Analysis of Spuriousness Under Assumption E, we have as δ 0 and T. ρ = 1 τ 2 δ σ 2 + o p(δ) In particular, we have ρ p 1 as δ 0 and T. Therefore, (u i ) becomes strongly dependent. 47 / 80 Joon Y. Park Regressions at High Frequency

Remarks Background and Preliminaries Asymptotics of High Frequency Regression Further Analysis of Spuriousness The speed at which ρ 1 depends upon the ratio τ 2 /σ 2. Note that 1 T [U] T p τ 2 1 T and Ut 2 dt p σ 2, T which may be interpreted respectively as the mean local variation and the mean global variation. If the local variation is too large relative to the global variation, ρ approaches to unity very slowly, or it even will not converge to unity at all if the local variation is excessively large and τ 2 = while σ 2 <. 0 48 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics of High Frequency Regression Further Analysis of Spuriousness Estimated Residual AR Coefficient for Model I 49 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics of High Frequency Regression Further Analysis of Spuriousness Estimated Residual AR Coefficient for Model II 50 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics of High Frequency Regression Further Analysis of Spuriousness Estimated Residual AR Coefficient for Model III 51 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics of High Frequency Regression Further Analysis of Spuriousness Estimated Residual AR Coefficient for Model IV 52 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics of High Frequency Regression Further Analysis of Spuriousness Summary We may summarize our empirical results as follows. In Models I-III, estimated residual AR(1) coefficient ρ converges to unity, evidently and rapidly, as δ 0. In Model IV, estimated residual AR(1) coefficient ρ converges very slowly to unity, or may even not converge to unity, as δ 0. For this model, it seems that the local variation is relatively too big compared with the magnitude of global variation. Our asymptotic theory very well explains what we observe in our empirical illustrations. 53 / 80 Joon Y. Park Regressions at High Frequency

Conventional Spurious Regression Asymptotics of High Frequency Regression Further Analysis of Spuriousness The conventional spurious regression refers to the regression between two independent random walks, or more generally, any regression with the regression error that has a unit root and is strongly dependent. It is well known that, in the conventional spurious regression, the standard Wald statistic diverges as the sample size increases and yields spurious test results. The high frequency regressions we consider yield spurious results for the same reason as the conventional spurious regression. However, our high frequency regressions are authentic and represent truthful relationships, whereas the conventional spurious regression has no meaningful interpretation. Therefore, the spuriousness in our regressions is rectifiable. 54 / 80 Joon Y. Park Regressions at High Frequency

HAC Estimation of Longrun Variance Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation The commonly used longrun variance estimators ω 2 and Ω of (u i ) and (x i u i ) may be written as ω 2 = ( ) j ( ) j K γ(j), Ω = K Γ(j) m m j m j m where K is the kernel function, γ(j) = T 1 i u iu i j and Γ(j) = T 1 i x iu i x i j u i j are the sample autocovariances, and m is the bandwidth parameter that determines the number of sample autocovariances included in the estimator. In practice, we use ˆω 2 and ˆΩ defined with (û i ) and (x i û i ) from the OLS residual (û i ), which are asymptotically equivalent to ω 2 and Ω. 55 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation HAC Estimators in Discrete and Continuous Times If we let V = U or XU, and v i = V iδ, i = 1,..., n, then δ n n i=1 v i 1 T T 0 V t dt as shown earlier, and therefore, we may expect that δ ω 2 π 2 and δ Ω Π, where π 2 and Π are the longrun variances of U and XU introduced in Assumptions C1 and C2. 56 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Estimation of Continuous Time Longrun Variance We may estimate the continuous time longrun variances π 2 and Π of U and XU, respectively, using the longrun variance estimators of their discrete samples (u i ) and (x i u i ). In fact, Lu and Park (2014) show that Assumption F: If mδ and m/n 0 as δ 0 and T, then we have δ ω 2 p π 2 and δ Ω p Π as δ and T. under very mild conditions. Here we just use their result as an assumption. Note in particular that ω 2 and Ω do not converge. 57 / 80 Joon Y. Park Regressions at High Frequency

Theorem Background and Preliminaries Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Assume Rβ = r and let Assumptions A, B and F hold. (a) Under Assumption C1, we have H( ˆβ) d χ 2 q as δ 0 and T satisfying Assumption D1. (b) Under Assumption C2, we have G( ˆβ) d P R ( RQ 1 R ) 1 RP ( ) 1 1 where P = 0 X t Xt 1 dt 0 X t du t with U = πu and Q = 1 0 X t Xt dt, as δ 0 and T satisfying Assumption D2. 58 / 80 Joon Y. Park Regressions at High Frequency

Possible Failure in Bandwidth Choice Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Assumption F, which is crucial to have our asymptotics for G( ˆβ) and H( ˆβ) tests, may not hold if the bandwidth parameter m is careless chosen in the usual discrete time set-up. If, for instance, we let m = n κ for some 0 < κ < 1, Assumption F does not hold. This is because we have mδ = n κ δ = T κ δ 1 κ 0, if δ 0 fast enough so that δ = o(t κ/(1 κ) ). In this case, we have δˆω 2 p 0 and δω p 0, and consequently, G( ˆβ) p and H( ˆβ) p, exactly as for the standard Wald statistic F ( ˆβ). 59 / 80 Joon Y. Park Regressions at High Frequency

Optimal Bandwidth Background and Preliminaries Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation The optimal bandwidth parameter m, which minimizes the asymptotic mean squared error variance, is given by ( νk m 2 ν = ν C ν 2 ) 1/(2ν+1) K(x) 2 dx n, where ν is the so-called characteristic exponent, K ν = lim x 0 (1 K(x))/ x ν, C ν = j ν γ(j)/ γ(j). We have ν = 1 for the Bartlett kernel, and ν = 2 for all other commonly used kernels such as Parzen, Tukey-Hanning and Quadratic Spectral kernels. Note that ν, K ν, as well as K(x) 2 dx, are determined entirely by the choice of kernel function K, whereas C ν is given by the covariance structure of the underlying random sequence (u i ). 60 / 80 Joon Y. Park Regressions at High Frequency

Andrews Automatic Bandwidth Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation If we assume (u i ) is AR(1) with the autoregressive coefficient ρ following the suggestion of Andrews (1991), then we have C 2 1 = 4ρ 2 (1 ρ) 2 (1 + ρ) 2, C2 2 = 4ρ2 (1 ρ) 4 for ν = 1, 2. Therefore, once we fit AR(1) model to (û i ) and obtain the OLS estimate ˆρ of the residual AR coefficient ρ, we may use it to estimate the constant C ν, and get an estimate for the optimal bandwidth parameter m ν from this estimate of C ν, ν = 1, 2. 61 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Asymptotics of Andrews Automatic Bandwidth From our previous lemma on the residual AR coefficient, we may readily derive that and C 2 1 = τ 4 σ 4 δ 2 + o p(δ 2 ), C2 2 = 4τ 8 σ 8 δ 4 + o p(δ 4 ), ( m τ 4 K 2 ) 1/3 1 1 = σ 4 K(x) 2 dx n δ 2/3( 1 + o p (1)) ( m 8τ 8 K 2 ) 1/5 2 2 = σ 8 K(x) 2 dx n δ 4/5( 1 + o p (1)), as δ and T. 62 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Asymptotics of with Automatic Bandwidth For the optimal bandwidth with the Andrews data dependent procedure, we have ( m τ 4 K 2 ) 1/3 1 1δ = σ 4 K(x) 2 dx T (1 + o p (1)) ( m 8τ 8 K 2 ) 1/5 2 2δ = σ 8 K(x) 2 dx T (1 + o p (1)) as δ 0 and T. Consequently, Assumption F holds and we have proper limit distributions for G( ˆβ) and H( ˆβ) tests. Note that the limit distribution of G( ˆβ) test is nonnormal unless the regressor and regression error are asymptotically independent. 63 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Estimated Bandwidth Parameter for Model I 64 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Estimated Bandwidth Parameter for Model II 65 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Estimated Bandwidth Parameter for Model III 66 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Estimated Bandwidth Parameter for Model IV 67 / 80 Joon Y. Park Regressions at High Frequency

H-Test for α in Model I Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 68 / 80 Joon Y. Park Regressions at High Frequency

G-Test for α in Model I Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 69 / 80 Joon Y. Park Regressions at High Frequency

H-Test for α in Model II Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 70 / 80 Joon Y. Park Regressions at High Frequency

G-Test for α in Model II Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 71 / 80 Joon Y. Park Regressions at High Frequency

G-Test for α in Model III Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 72 / 80 Joon Y. Park Regressions at High Frequency

G-Test for α in Model IV Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 73 / 80 Joon Y. Park Regressions at High Frequency

H-Test for β in Model I Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 74 / 80 Joon Y. Park Regressions at High Frequency

G-Test for β in Model I Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 75 / 80 Joon Y. Park Regressions at High Frequency

H-Test for β in Model II Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 76 / 80 Joon Y. Park Regressions at High Frequency

G-Test for β in Model II Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 77 / 80 Joon Y. Park Regressions at High Frequency

G-Test for β in Model III Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 78 / 80 Joon Y. Park Regressions at High Frequency

G-Test for β in Model IV Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation 79 / 80 Joon Y. Park Regressions at High Frequency

Asymptotics for Tests with HAC Estimators Bandwidth Selection in HAC Estimation Summary The test results from the modified tests can be summarized as The tests do not reject both hypotheses α = 0 and β = 1 in most cases. The H-test reject the hypothesis for α in Model I and for β in Model II. However, it may be due to the nonstationarity of interest rates. If the G-test is used instead, the hypotheses are not rejected. We have a good incentive to use high frequency observations and use modified tests with appropriate choices of bandwidth parameter. Test results become very stable as the sampling interval decreases and become robust once the sampling frequency crosses the threshold level. For fully efficient tests, we should use the optimal bandwidth derived specifically for the underlying process in continuous time. 80 / 80 Joon Y. Park Regressions at High Frequency