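Econometrics of financial markets, solutions to seminar 1.

Problem 1

a) Estimate with OLS. For any regression $y_i = \alpha + \beta x_i + u_i$, OLS is unbiased only if $\operatorname{cov}(u_i, x_j) = 0$ for all $i, j$. For the autoregressive model $y_t = \mu + \alpha y_{t-1} + \varepsilon_t$ we have $\operatorname{cov}(y_{t-1}, \varepsilon_{t+k}) = 0$ for all $k \geq 0$, and the fact that this covariance is zero for $k = 0$ ensures that OLS gives consistent results. However, for $k = -1, -2, -3, \ldots$ we have $\operatorname{cov}(y_{t-1}, \varepsilon_{t+k}) \neq 0$, which results in a bias towards zero in small samples. Note that you do not need an "infinitely large sample" for the asymptotic properties to be good approximations. In practical applications the bias is most often neglected without any discussion unless the sample is very small.

b) Use the lag operator: $y_{t-1} = L y_t$. Note that the lag operator can be subject to simple algebraic manipulations like $L(a x_t + b y_t) = a L x_t + b L y_t$ and $L^2 y_t = L(L y_t) = y_{t-2}$.

Using the lag operator on $y_t = \mu + \alpha y_{t-1} + \varepsilon_t$, $|\alpha| < 1$:

\[
(1 - \alpha L)\, y_t = \mu + \varepsilon_t
\]
\[
y_t = \frac{\mu}{1-\alpha} + \frac{1}{1-\alpha L}\,\varepsilon_t
    = \frac{\mu}{1-\alpha} + \left(1 + \alpha L + \alpha^2 L^2 + \ldots\right)\varepsilon_t
    = \frac{\mu}{1-\alpha} + \varepsilon_t + \alpha\varepsilon_{t-1} + \alpha^2\varepsilon_{t-2} + \ldots
    = \frac{\mu}{1-\alpha} + \sum_{i=0}^{\infty} \alpha^i \varepsilon_{t-i}
\]

c) Assume $k > 0$. Only the terms with $i = k + j$ have a nonzero covariance, so

\[
\operatorname{cov}(y_t, y_{t-k})
 = \operatorname{cov}\!\left(\sum_{i=0}^{\infty} \alpha^i \varepsilon_{t-i},\; \sum_{j=0}^{\infty} \alpha^j \varepsilon_{t-k-j}\right)
 = \sum_{j=0}^{\infty} \sum_{i \geq 0,\, i \neq k+j} \alpha^{i+j} \operatorname{cov}(\varepsilon_{t-i}, \varepsilon_{t-k-j})
   + \sum_{j=0}^{\infty} \alpha^{k+j+j} \operatorname{cov}(\varepsilon_{t-k-j}, \varepsilon_{t-k-j})
 = \sigma^2 \sum_{j=0}^{\infty} \alpha^{k+2j}
 = \frac{\alpha^k}{1-\alpha^2}\,\sigma^2
\]

\[
\rho_k\big|_{k>0}
 = \frac{\operatorname{cov}(y_t, y_{t-k})}{\sqrt{\operatorname{var}(y_t)}\,\sqrt{\operatorname{var}(y_{t-k})}}
 = \frac{\dfrac{\alpha^k}{1-\alpha^2}\,\sigma^2}{\dfrac{1}{1-\alpha^2}\,\sigma^2}
 = \alpha^k
\]

We must have $\rho_k = \rho_{-k}$, so $\rho_k = \alpha^{|k|}$.

d) Assume $k \geq 0$. For the MA(1) model

\[
\operatorname{cov}(y_t, y_{t-k})
 = \operatorname{cov}(\varepsilon_t + \theta\varepsilon_{t-1},\; \varepsilon_{t-k} + \theta\varepsilon_{t-k-1})
 = \begin{cases}
     0 & \text{if } k > 1 \\
     \theta\sigma^2 & \text{if } k = 1 \\
     (1+\theta^2)\,\sigma^2 & \text{if } k = 0
   \end{cases}
\]

\[
\rho_1 = \frac{\theta}{1+\theta^2}, \qquad \rho_2 = \rho_3 = \ldots = 0
\]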
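The closed forms in c) and d) can be checked numerically. The Python sketch below is only an illustration: the function names and the parameter values ($\alpha = 0.8$, $\sigma^2 = 1$, $\theta = 0.5$) are assumptions made for this example and are not part of the problem set. It truncates the infinite sum $\sigma^2 \sum_j \alpha^{k+2j}$ and compares it with $\alpha^k \sigma^2 / (1-\alpha^2)$.

```python
def ar1_autocov_truncated(alpha, sigma2, k, n_terms=500):
    """Approximate cov(y_t, y_{t-k}) = sigma2 * sum_j alpha**(k + 2*j) by a finite sum.

    The truncation is only accurate when |alpha| is not too close to one.
    """
    return sigma2 * sum(alpha ** (k + 2 * j) for j in range(n_terms))


def ar1_autocov_closed(alpha, sigma2, k):
    """Closed form from c): alpha**k / (1 - alpha**2) * sigma2, for |alpha| < 1, k >= 0."""
    return alpha ** k / (1.0 - alpha ** 2) * sigma2


def ma1_autocorr(theta, k):
    """Autocorrelation of the MA(1) process y_t = eps_t + theta * eps_{t-1} from d)."""
    k = abs(k)
    if k == 0:
        return 1.0
    if k == 1:
        return theta / (1.0 + theta ** 2)
    return 0.0


if __name__ == "__main__":
    alpha, sigma2 = 0.8, 1.0  # illustrative values, not from the problem text
    for k in range(4):
        approx = ar1_autocov_truncated(alpha, sigma2, k)
        exact = ar1_autocov_closed(alpha, sigma2, k)
        rho_k = exact / ar1_autocov_closed(alpha, sigma2, 0)
        # rho_k should agree with alpha**k up to rounding
        print(k, approx, exact, rho_k, alpha ** k)
    print(ma1_autocorr(0.5, 1), ma1_autocorr(0.5, 2))
```

The ratio of the closed-form autocovariance at lag $k$ to the one at lag 0 reproduces $\rho_k = \alpha^k$, and the MA(1) function returns zero for every lag beyond the first.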
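e) To solve, assume stationarity; the unconditional expectation and variance are then independent of the time subscript, and the autocovariances depend only on the time difference.

\[
\operatorname{var}(y) = \alpha_1^2 \operatorname{var}(y) + \alpha_2^2 \operatorname{var}(y) + \sigma^2 + 2\alpha_1\alpha_2 \operatorname{cov}(y_{t-1}, y_{t-2})
\]
\[
\operatorname{cov}(y_t, y_{t-1}) = \alpha_1 \operatorname{var}(y) + \alpha_2 \operatorname{cov}(y_{t-1}, y_{t-2})
\]

We must have $\operatorname{cov}(y_t, y_{t-1}) = \operatorname{cov}(y_{t-1}, y_{t-2}) = c_1$, so

\[
c_1 = \frac{\alpha_1}{1-\alpha_2}\operatorname{var}(y)
\]
\[
\operatorname{var}(y) = \frac{1}{1 - \alpha_1^2 - \alpha_2^2}\left(\sigma^2 + 2\alpha_1\alpha_2 c_1\right)
\]
\[
\operatorname{var}(y) = \frac{(1-\alpha_2)\,\sigma^2}{(1+\alpha_2)(1-\alpha_1-\alpha_2)(1+\alpha_1-\alpha_2)}
\]

If any of the conditions listed under exercise e) in the problem text is not satisfied, the unconditional variance is not defined. A necessary condition for a process to be stationary is that the unconditional variance is a (constant) finite number, and so the conditions under e) are needed for the AR(2) process to be stationary.

f) The autocorrelations:

\[
\rho_1 = \frac{c_1}{\operatorname{var}(y)} = \frac{\alpha_1}{1-\alpha_2}
\]

g)

\[
\operatorname{cov}(y_t, y_{t-2}) = \alpha_1 \operatorname{cov}(y_{t-1}, y_{t-2}) + \alpha_2 \operatorname{var}(y_{t-2})
\]
\[
\rho_2 = \alpha_1\rho_1 + \alpha_2 = \frac{\alpha_1^2}{1-\alpha_2} + \alpha_2
\]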
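As a sanity check on e) to g), the sketch below evaluates the closed forms and compares the variance with an independent calculation. Everything named here is an assumption made for illustration: the function names, the parameter values, and the cross-check itself, which uses the standard MA($\infty$) weights of an AR(2) process ($\psi_0 = 1$, $\psi_1 = \alpha_1$, $\psi_j = \alpha_1\psi_{j-1} + \alpha_2\psi_{j-2}$) rather than anything stated in the problem text.

```python
def ar2_moments(a1, a2, sigma2):
    """Return (var_y, rho1, rho2) from the closed forms in e), f) and g)."""
    var_y = (1.0 - a2) * sigma2 / ((1.0 + a2) * (1.0 - a1 - a2) * (1.0 + a1 - a2))
    rho1 = a1 / (1.0 - a2)
    rho2 = a1 * rho1 + a2
    return var_y, rho1, rho2


def ar2_variance_from_ma_weights(a1, a2, sigma2, n_terms=2000):
    """Independent check: var(y) = sigma2 * sum_j psi_j**2 (truncated).

    psi_j are the MA(infinity) weights: psi_0 = 1, psi_1 = a1,
    psi_j = a1 * psi_{j-1} + a2 * psi_{j-2}. Requires a stationary process.
    """
    psi_prev, psi = 1.0, a1
    total = psi_prev ** 2 + psi ** 2
    for _ in range(n_terms - 2):
        psi_prev, psi = psi, a1 * psi + a2 * psi_prev
        total += psi ** 2
    return sigma2 * total


if __name__ == "__main__":
    a1, a2, sigma2 = 0.5, 0.3, 1.0  # illustrative values satisfying the stationarity conditions
    var_y, rho1, rho2 = ar2_moments(a1, a2, sigma2)
    print(var_y, ar2_variance_from_ma_weights(a1, a2, sigma2))
    print(rho1, rho2)
```

With parameters that violate the stationarity conditions, the closed form can return a negative or undefined variance while the truncated sum grows without bound, which is one way to see why those conditions are needed.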
For the AR(1) model there is no correlation between $y_t$ and $y_{t-2}$ once the correlation between $y_t$ and $y_{t-1}$ is controlled for:

\[
\rho_{2\cdot 1} = \frac{\rho_2 - (\rho_1)^2}{1 - (\rho_1)^2} = \frac{\alpha^2 - \alpha^2}{1 - \alpha^2} = 0
\]

For the AR(2) model:

\[
\rho_{2\cdot 1} = \frac{\rho_2 - (\rho_1)^2}{1 - (\rho_1)^2}
 = \frac{\dfrac{\alpha_1^2}{1-\alpha_2} + \alpha_2 - \left(\dfrac{\alpha_1}{1-\alpha_2}\right)^2}{1 - \left(\dfrac{\alpha_1}{1-\alpha_2}\right)^2}
 = \frac{\alpha_2\left(1 - \left(\dfrac{\alpha_1}{1-\alpha_2}\right)^2\right)}{1 - \left(\dfrac{\alpha_1}{1-\alpha_2}\right)^2}
 = \alpha_2
\]

In general, for the AR(p) model, the partial correlation $\rho_{k\cdot 1,2,3,\ldots,k-1}$ will be nonzero for $k \leq p$ and zero for $k > p$.

process   acf                             pacf
AR(p)     infinite, damps out             finite, cuts off after lag p
MA(q)     finite, cuts off after lag q    infinite, damps out
ARMA      infinite, damps out             infinite, damps out

h) Forecasting (abstracting from model uncertainty).

Definition: mean squared error (MSE):

\[
\mathrm{MSE} = E_t\left(y_{t+k} - \hat{y}_{t+k}\right)^2
\]

It can be shown that choosing the estimator $\hat{y}_{t+k}$ to be the conditional expectation $E(y_{t+k} \mid y_t)$ will minimize the MSE.

AR(1) process: $y_t = \mu + \alpha y_{t-1} + \varepsilon_t$.

One period ahead:
\[
E(y_{t+1} \mid y_t) = E(\mu + \alpha y_t + \varepsilon_{t+1} \mid y_t) = \mu + \alpha y_t
\]
Two periods ahead:
\[
E(y_{t+2} \mid y_t) = E(\mu + \alpha\mu + \alpha^2 y_t + \alpha\varepsilon_{t+1} + \varepsilon_{t+2} \mid y_t) = \mu + \alpha\mu + \alpha^2 y_t
\]
$k$ periods ahead:
\[
E(y_{t+k} \mid y_t) = \frac{1-\alpha^k}{1-\alpha}\,\mu + \alpha^k y_t
\]
The long run forecast equals the unconditional mean:
\[
\lim_{k\to\infty} E(y_{t+k} \mid y_t) = \frac{\mu}{1-\alpha}
\]

The MA(1): $y_t = \mu + \varepsilon_t + \theta\varepsilon_{t-1}$.

One period ahead:
\[
E(y_{t+1} \mid y_t) = E(\mu + \varepsilon_{t+1} + \theta\varepsilon_t \mid y_t) = \mu + \theta E(\varepsilon_t \mid y_t)
\]
Note that $E(\varepsilon_t \mid y_t)$ is not necessarily zero. Suppose we knew the start value $\varepsilon_0$ along with the parameters $\mu$ and $\theta$. Then $y_1 = \mu + \varepsilon_1 + \theta\varepsilon_0$ will reveal the exact value of $\varepsilon_1$, which can be used along with the observations of $y$ up to time $t$ to back out recursively all the exact values $\varepsilon_2, \varepsilon_3, \ldots, \varepsilon_t$. In this situation the optimal forecast should use this information, such that $E(y_{t+1} \mid y_t) = \mu + \theta\varepsilon_t$.

More periods ahead:
\[
E(y_{t+k} \mid y_t) = E(y_{t+2} \mid y_t) = E(\mu + \varepsilon_{t+2} + \theta\varepsilon_{t+1} \mid y_t) = \mu
\]
Thus the $k > 1$ period forecast of the MA(1) is simply the unconditional mean.

i) Consider the special case $\alpha = 1$. If we fix the starting point of the process to some date $t = 0$, we can solve the difference equation to obtain

\[
y_t = y_0 + \mu t + \sum_{\tau=1}^{t} \varepsilon_\tau
\]
and
\[
E(y_t) = y_0 + \mu t
\]
\[
\operatorname{var}(y_t) = \sigma^2 t + \operatorname{var}(y_0)
\]
\[
\rho_k(t) = \sqrt{\frac{\sigma^2 (t-k) + \operatorname{var}(y_0)}{\sigma^2 t + \operatorname{var}(y_0)}}
\]

When $\mu \neq 0$ the mean evolves over time. As long as there are innovations to the process, $\sigma^2 \neq 0$, the variance and covariances also depend on $t$. We say the process is non-stationary.

To test the null hypothesis $\alpha = 1$, the model can be estimated as usual by OLS. However, under the null hypothesis the test statistic has a nonstandard distribution, and tabulated critical values have to be used (Dickey and Fuller 1979). If the innovations are serially correlated, an augmented test can be applied. The test has very low power for $\alpha$ close to one. There are also alternative tests.
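The forecasting rules in h) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are made up for this example, the parameters are treated as known (the text abstracts from model uncertainty), and the MA(1) recursion assumes the start value $\varepsilon_0$ is known, as in the argument above.

```python
def ar1_forecast(y_t, mu, alpha, k):
    """Closed-form k-step forecast: (1 - alpha**k) / (1 - alpha) * mu + alpha**k * y_t."""
    return (1.0 - alpha ** k) / (1.0 - alpha) * mu + alpha ** k * y_t


def ar1_forecast_iterated(y_t, mu, alpha, k):
    """Same forecast, obtained by applying E(y_{t+1} | y_t) = mu + alpha * y_t k times."""
    forecast = y_t
    for _ in range(k):
        forecast = mu + alpha * forecast
    return forecast


def ma1_recover_innovations(y, mu, theta, eps0):
    """Back out eps_1..eps_t from y_1..y_t via eps_s = y_s - mu - theta * eps_{s-1}.

    Assumes eps0, mu and theta are known exactly.
    """
    eps = []
    prev = eps0
    for obs in y:
        prev = obs - mu - theta * prev
        eps.append(prev)
    return eps


def ma1_forecast(y, mu, theta, eps0, k):
    """k-step MA(1) forecast: mu + theta * eps_t for k = 1, and mu for k > 1."""
    if k == 1:
        return mu + theta * ma1_recover_innovations(y, mu, theta, eps0)[-1]
    return mu


if __name__ == "__main__":
    # Illustrative values only.
    print(ar1_forecast(3.0, 1.0, 0.5, 5), ar1_forecast_iterated(3.0, 1.0, 0.5, 5))
    print(ar1_forecast(3.0, 1.0, 0.5, 200))  # approaches mu / (1 - alpha)
    y = [1.34, 0.92, 1.42]                   # built from eps = 0.3, -0.2, 0.5 with eps0 = 0.1
    print(ma1_recover_innovations(y, 1.0, 0.4, 0.1))
    print(ma1_forecast(y, 1.0, 0.4, 0.1, 1), ma1_forecast(y, 1.0, 0.4, 0.1, 3))
```

The iterated and closed-form AR(1) forecasts agree, and for a long horizon both approach $\mu/(1-\alpha)$. In practice $\varepsilon_0$ is not observed, so the recovered MA(1) innovations are only as good as the assumed start value.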
Efficient market hypothesis: "The information in past prices/returns is not useful in achieving excess returns."

Testing price processes for $\alpha = 1$ is not equivalent to testing for weak market efficiency. Indeed, if the random walk is the true process then the EMH must apply, but the opposite is not true. For an elaborate discussion of the EMH, the fair game and the random walk hypothesis, see e.g. Copeland and Weston, "Financial Theory and Corporate Policy", 1983, Addison Wesley.

Problem 2

Proposition 2.4. Let $y_k(\delta)$, $k = 0, 1, 2, \ldots, T/\delta$, for an integer $\delta$, represent a sampled series of $y_t$ such that only observations whose time index is a multiple of $\delta$ are selected. If $y_t$ is an autoregressive process of order 1 with autoregressive coefficient $\rho$ and innovation variance $\sigma^2$, then $y_k(\delta)$ is an autoregressive process of order 1, with coefficient $\rho^\delta$ and variance
\[
\sigma^2\,\frac{1-\rho^{2\delta}}{1-\rho^2}.
\]

Empirical assessment of the AR(1) on different frequencies using interest rate data.

Sampling data at different frequencies is not a straightforward task using GiveWin. The following procedure can however be used:

1) Load the data file HF-3MR.xls

2) Open the calculator and generate a new variable, which you can denote "t", using the year() function. The new variable will read 1, 2, 3, ..., P, where P is the total number of periods.
3) Open "algebra editor" from the "tools" menu and type the following code in the algebra code field:

    DUMMY = (fmod(t, 6) == 0) ? 1 : 0;

Press the "run" button. This will generate a new variable called "DUMMY", which equals one if the time subscript is a multiple of 6 and zero otherwise. The syntax of the code should be understood as: "newname, logical condition, value if true, value if false". We specify the number "6" because there are (except for "missing" entries) mostly 6 observations per hour in the data.
4) Use the calculator to generate a new variable, say "h", which equals the product of DUMMY and t.

5) Use the calculator to generate a variable, of any name, using the function _sortallby(h). This will sort your data according to ascending values of the variable h. In the data editor you will now see all the unselected entries first. The selected entries will start at some period, $\tau$, and will all be in the correct time order (you can double-check this using the variables minutes, hour, day, month, year).

6) Choose "save as" from the "file" menu and specify that you only want to save the subsample starting at period $\tau$.
The saved subsample will contain observations at the hourly frequency. Suppose the data are in fact generated by an AR(1) process and the model is fitted to the high frequency and the hourly data respectively. According to Proposition 2.4, if the autoregressive parameter from the former regression is $\rho$, then the autoregressive parameter from the latter regression should equal $\rho^6$. If this is not true, the data suggest that the AR(1) specification is not the appropriate model. You should check Proposition 2.4 at several different frequencies using the same procedure to sample the data. You should also include a constant term in the estimation of the models unless you want to evaluate the joint hypothesis of "no constant term" and "process is AR(1)".

On the interest rate data you will probably find that Proposition 2.4 underestimates the $\rho$ parameter as you turn to less frequent data. As an example, using the high frequency data you can find $\hat{\rho} = 0.952733$, while estimating on the hourly data gives $\hat{\rho} = 0.942891$. Now $0.952733^6 = 0.74787196 < 0.942891$, which is not in line with Proposition 2.4.
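The comparison above can also be sketched outside GiveWin. The following Python snippet is a hypothetical illustration, not the procedure used in the seminar: the function names are assumptions, the selection rule mirrors the DUMMY variable (keep observations whose 1-based index is a multiple of $\delta$), and the demonstration series is a noise-free AR(1) recursion chosen only so that the result is exact.

```python
def ols_ar1_slope(y):
    """OLS slope from regressing y_t on a constant and y_{t-1}."""
    x, z = y[:-1], y[1:]
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    return sxz / sxx


def sample_every(y, delta):
    """Keep the observations whose 1-based time index is a multiple of delta."""
    return y[delta - 1::delta]


def implied_coefficient(rho, delta):
    """Coefficient that Proposition 2.4 implies for the sampled series: rho**delta."""
    return rho ** delta


if __name__ == "__main__":
    # Estimates quoted in the text: high frequency vs. hourly data.
    rho_high, rho_hourly = 0.952733, 0.942891
    print(implied_coefficient(rho_high, 6), rho_hourly)

    # Noise-free illustration (assumed values): y_t = 0.5 + 0.9 * y_{t-1}.
    y = [1.0]
    for _ in range(59):
        y.append(0.5 + 0.9 * y[-1])
    print(ols_ar1_slope(y), ols_ar1_slope(sample_every(y, 6)), 0.9 ** 6)
```

On the noise-free series the sampled slope equals $\rho^6$ up to rounding, as the proposition states. With real data both estimates carry sampling error, so a formal comparison would need standard errors, which this sketch does not compute.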