Lent Term 2001                                                     Richard Weber

Time Series Examples Sheet

This is the examples sheet for the M.Phil. course in Time Series. A copy can be found at:

http://www.statslab.cam.ac.uk/~rrw1/timeseries/

Throughout, unless otherwise stated, the sequence $\{\epsilon_t\}$ is white noise with variance $\sigma^2$.
1. Find the Yule-Walker equations for the AR(2) process
$$X_t = \tfrac{1}{3}X_{t-1} + \tfrac{2}{9}X_{t-2} + \epsilon_t.$$
Hence show that it has autocorrelation function
$$\rho_k = \tfrac{16}{21}\left(\tfrac{2}{3}\right)^{|k|} + \tfrac{5}{21}\left(-\tfrac{1}{3}\right)^{|k|}, \qquad k \in \mathbb{Z}.$$

[ The Yule-Walker equations are
$$\rho_k - \tfrac{1}{3}\rho_{k-1} - \tfrac{2}{9}\rho_{k-2} = 0, \qquad k \geq 2.$$
On trying $\rho_k = A\lambda^k$, we require $\lambda^2 - \tfrac{1}{3}\lambda - \tfrac{2}{9} = 0$. This has roots $\tfrac{2}{3}$ and $-\tfrac{1}{3}$, so
$$\rho_k = A\left(\tfrac{2}{3}\right)^k + B\left(-\tfrac{1}{3}\right)^k,$$
where $\rho_0 = A + B = 1$. We also require $\rho_1 = \tfrac{1}{3} + \tfrac{2}{9}\rho_1$. Hence $\rho_1 = \tfrac{3}{7}$, and thus we require $\tfrac{2}{3}A - \tfrac{1}{3}B = \tfrac{3}{7}$. These give $A = \tfrac{16}{21}$, $B = \tfrac{5}{21}$. ]
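A quick numerical sanity check of the closed form (a sketch, not part of the original sheet; the function names are ours): iterate the Yule-Walker recursion from $\rho_0 = 1$, $\rho_1 = 3/7$ and compare with the claimed formula.

```python
import numpy as np

def acf_recursive(kmax):
    # Yule-Walker recursion: rho_k = (1/3) rho_{k-1} + (2/9) rho_{k-2}
    rho = [1.0, 3.0 / 7.0]
    for k in range(2, kmax + 1):
        rho.append(rho[k - 1] / 3.0 + 2.0 * rho[k - 2] / 9.0)
    return np.array(rho)

def acf_closed_form(kmax):
    # rho_k = (16/21)(2/3)^k + (5/21)(-1/3)^k
    k = np.arange(kmax + 1)
    return (16.0 / 21.0) * (2.0 / 3.0) ** k + (5.0 / 21.0) * (-1.0 / 3.0) ** k

assert np.allclose(acf_recursive(20), acf_closed_form(20))
```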
2. Let $X_t = A\cos(\Omega t + U)$, where $A$ is an arbitrary constant, $\Omega$ and $U$ are independent random variables, $\Omega$ has distribution function $F$ over $[0, \pi]$, and $U$ is uniform over $[0, 2\pi]$. Find the autocorrelation function and spectral density function of $\{X_t\}$. Hence show that, for any positive definite set of covariances $\{\gamma_k\}$, there exists a process with autocovariances $\{\gamma_k\}$ such that every realization is a sine wave. [Use the following definition: $\{\gamma_k\}$ are positive definite if there exists a nondecreasing function $F$ such that $\gamma_k = \int_{-\pi}^{\pi} e^{ik\omega}\,dF(\omega)$.]

[ First,
$$E[X_t \mid \Omega] = \frac{1}{2\pi}\int_0^{2\pi} A\cos(\Omega t + u)\,du = \frac{1}{2\pi}\Big[A\sin(\Omega t + u)\Big]_0^{2\pi} = 0.$$
Then
$$\begin{aligned}
E[X_{t+s}X_t] &= \int_0^{\pi}\frac{1}{2\pi}\int_0^{2\pi} A\cos(\omega(t+s)+u)\,A\cos(\omega t+u)\,du\,dF(\omega) \\
&= \frac{1}{4\pi}\int_0^{\pi}\int_0^{2\pi} A^2\big[\cos(\omega(2t+s)+2u) + \cos(\omega s)\big]\,du\,dF(\omega) \\
&= \frac{1}{2}\int_0^{\pi} A^2\cos(\omega s)\,dF(\omega) \\
&= \frac{1}{4}\int_0^{\pi} A^2\big[e^{i\omega s} + e^{-i\omega s}\big]\,dF(\omega) \\
&= \frac{1}{4}A^2\int_{-\pi}^{\pi} e^{i\omega s}\,d\tilde{F}(\omega),
\end{aligned}$$
where we define over the range $[-\pi, \pi]$ the nondecreasing function $\tilde{F}$ by $\tilde{F}(-\omega) = F(\pi) - F(\omega)$ and $\tilde{F}(\omega) = F(\omega) + F(\pi) - 2F(0)$, $\omega \in [0, \pi]$. ]
3. Find the spectral density function of the AR(2) process $X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \epsilon_t$. What conditions on $(\phi_1, \phi_2)$ are required for this process to be an indeterministic second-order stationary process? Sketch the stationary region in the $(\phi_1, \phi_2)$ plane.

[ We have
$$f_X(\omega)\,\big|1 - \phi_1 e^{i\omega} - \phi_2 e^{2i\omega}\big|^2 = \sigma^2/\pi.$$
Hence
$$f_X(\omega) = \frac{\sigma^2/\pi}{1 + \phi_1^2 + \phi_2^2 + 2(\phi_1\phi_2 - \phi_1)\cos\omega - 2\phi_2\cos 2\omega}.$$
The Yule-Walker equations have solution of the form $\rho_k = A\lambda_1^k + B\lambda_2^k$, where $\lambda_1, \lambda_2$ are roots of $g(\lambda) = \lambda^2 - \phi_1\lambda - \phi_2 = 0$. The roots are
$$\lambda = \Big[\phi_1 \pm \sqrt{\phi_1^2 + 4\phi_2}\,\Big]\Big/2.$$
To be indeterministic second-order stationary these roots must have modulus less than 1. If $\phi_1^2 + 4\phi_2 > 0$ then the roots are real and lie in $(-1, 1)$ if and only if $g(-1) > 0$ and $g(1) > 0$, i.e., $\phi_1 + \phi_2 < 1$ and $\phi_1 - \phi_2 > -1$. If $\phi_1^2 + 4\phi_2 < 0$ then the roots are complex and the square of their common modulus, which equals their product $-\phi_2$, must be less than 1, i.e., $\phi_2 > -1$. The union of these two regions, corresponding to possible $(\phi_1, \phi_2)$ for real and complex roots, is simply the triangular region
$$\phi_1 + \phi_2 < 1, \qquad \phi_1 - \phi_2 > -1, \qquad \phi_2 > -1. \;]$$
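The triangular region can be checked numerically (a sketch under our own naming, not part of the sheet): sample $(\phi_1, \phi_2)$ at random and compare the root-modulus condition with the three linear inequalities.

```python
import numpy as np

def roots_inside(phi1, phi2):
    # roots of g(lambda) = lambda^2 - phi1*lambda - phi2
    r = np.roots([1.0, -phi1, -phi2])
    return bool(np.all(np.abs(r) < 1.0))

def in_triangle(phi1, phi2):
    # the stationarity triangle derived above
    return (phi1 + phi2 < 1.0) and (phi1 - phi2 > -1.0) and (phi2 > -1.0)

rng = np.random.default_rng(0)
for phi1, phi2 in rng.uniform(-2.0, 2.0, size=(2000, 2)):
    assert roots_inside(phi1, phi2) == in_triangle(phi1, phi2)
```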
4. For a stationary process define the covariance generating function
$$g(z) = \sum_{k=-\infty}^{\infty} \gamma_kz^k, \qquad |z| < 1.$$
Suppose $\{X_t\}$ satisfies $X = C(B)\epsilon$, that is, it has the Wold representation
$$X_t = \sum_{r=0}^{\infty} c_r\epsilon_{t-r},$$
where $\{c_r\}$ are constants satisfying $\sum_{r=0}^{\infty} c_r^2 < \infty$ and $C(z) = \sum_{r=0}^{\infty} c_rz^r$. Show that $g(z) = C(z)C(z^{-1})\sigma^2$. Explain how this can be used to derive autocovariances for the ARMA($p, q$) model. Hence show that for ARMA(1, 1), $\rho_2^2 = \rho_1\rho_3$. How might this fact be useful?

[ We have
$$\gamma_k = EX_tX_{t+k} = E\left[\sum_{r=0}^{\infty} c_r\epsilon_{t-r}\sum_{s=0}^{\infty} c_s\epsilon_{t+k-s}\right] = \sigma^2\sum_{r=0}^{\infty} c_rc_{k+r}.$$
Now
$$C(z)C(z^{-1}) = \sum_{r=0}^{\infty} c_rz^r\sum_{s=0}^{\infty} c_sz^{-s}.$$
The coefficients of $z^k$ and $z^{-k}$ are clearly
$$c_kc_0 + c_{k+1}c_1 + c_{k+2}c_2 + \cdots,$$
from which the result follows.

For the ARMA($p, q$) model $\phi(B)X = \theta(B)\epsilon$, or
$$X = \frac{\theta(B)}{\phi(B)}\epsilon,$$
where $\phi$ and $\theta$ are polynomials of degrees $p$ and $q$ in $z$. Hence
$$C(z) = \frac{\theta(z)}{\phi(z)}$$
and $\gamma_k$ can be found as the coefficient of $z^k$ in the power series expansion of $\sigma^2\theta(z)\theta(1/z)/\phi(z)\phi(1/z)$. For ARMA(1, 1) this is
$$\sigma^2(1 + \theta z)(1 + \theta z^{-1})(1 + \phi z + \phi^2z^2 + \cdots)(1 + \phi z^{-1} + \phi^2z^{-2} + \cdots),$$
from which we have
$$\begin{aligned}
\gamma_1 &= \Big(\theta(1 + \phi^2 + \phi^4 + \cdots) + (\phi + \phi^3 + \phi^5 + \cdots)(1 + \theta^2) + \theta(\phi^2 + \phi^4 + \phi^6 + \cdots)\Big)\sigma^2 \\
&= \frac{\theta + \phi(1 + \theta^2) + \phi^2\theta}{1 - \phi^2}\,\sigma^2,
\end{aligned}$$
and similarly
$$\gamma_2 = \phi\,\frac{\theta + \phi(1 + \theta^2) + \phi^2\theta}{1 - \phi^2}\,\sigma^2, \qquad \gamma_3 = \phi^2\,\frac{\theta + \phi(1 + \theta^2) + \phi^2\theta}{1 - \phi^2}\,\sigma^2.$$
Hence $\rho_2^2 = \rho_1\rho_3$. This might be used as a diagnostic to test the appropriateness of an ARMA(1, 1) model, by reference to the correlogram, where we would expect to see $r_2^2 \approx r_1r_3$. ]
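The identity $\rho_2^2 = \rho_1\rho_3$ can be checked numerically (a sketch; $\phi = 0.4$, $\theta = 0.6$ are arbitrary illustrative values): compute $\gamma_k$ from a truncated Wold expansion $C(z) = (1 + \theta z)/(1 - \phi z)$ and compare.

```python
import numpy as np

def arma11_acov(phi, theta, nlags, nterms=200):
    # Wold coefficients of ARMA(1,1): c_0 = 1, c_r = (theta + phi) phi^{r-1}, r >= 1
    c = np.array([1.0] + [(theta + phi) * phi ** (r - 1) for r in range(1, nterms)])
    # gamma_k = sigma^2 * sum_r c_r c_{k+r}, with sigma^2 = 1 (truncated)
    return np.array([np.dot(c[: nterms - k], c[k:]) for k in range(nlags + 1)])

g = arma11_acov(0.4, 0.6, 3)
rho = g / g[0]
assert abs(rho[2] ** 2 - rho[1] * rho[3]) < 1e-12
```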
5. Consider the ARMA(2, 1) process defined as
$$X_t = \phi_1X_{t-1} + \phi_2X_{t-2} + \epsilon_t + \theta_1\epsilon_{t-1}.$$
Show that the coefficients of the Wold representation satisfy the difference equation
$$c_k = \phi_1c_{k-1} + \phi_2c_{k-2}, \qquad k \geq 2,$$
and hence that
$$c_k = Az_1^{-k} + Bz_2^{-k},$$
where $z_1$ and $z_2$ are zeros of $\phi(z) = 1 - \phi_1z - \phi_2z^2$, and $A$ and $B$ are constants. Explain how in principle one could find $A$ and $B$.

[ The recurrence is produced by substituting $X_t = \sum_{r=0}^{\infty} c_r\epsilon_{t-r}$ into the defining equation, and similarly for $X_{t-1}$ and $X_{t-2}$, multiplying by $\epsilon_{t-k}$, $k \geq 2$, and taking the expected value. The general solution to such a second order linear recurrence relation is of the form given, since $1/z_1$ and $1/z_2$ are the roots of $\lambda^2 - \phi_1\lambda - \phi_2 = 0$. We find $A$ and $B$ by noting that
$$X_t = \phi_1(\phi_1X_{t-2} + \phi_2X_{t-3} + \epsilon_{t-1} + \theta_1\epsilon_{t-2}) + \phi_2X_{t-2} + \epsilon_t + \theta_1\epsilon_{t-1},$$
so that $c_0 = 1$ and $c_1 = \theta_1 + \phi_1$. Hence $A + B = 1$ and $Az_1^{-1} + Bz_2^{-1} = \theta_1 + \phi_1$. These can be solved for $A$ and $B$. ]
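This can be verified numerically (a sketch; $\phi_1 = 0.5$, $\phi_2 = 0.3$, $\theta_1 = 0.4$ are arbitrary illustrative values): run the recursion for $c_k$, solve the two linear equations for $A$ and $B$, and compare.

```python
import numpy as np

phi1, phi2, theta1 = 0.5, 0.3, 0.4

# zeros of phi(z) = 1 - phi1 z - phi2 z^2
z1, z2 = np.roots([-phi2, -phi1, 1.0])

# solve A + B = c_0 = 1 and A/z1 + B/z2 = c_1 = theta1 + phi1
A, B = np.linalg.solve([[1.0, 1.0], [1.0 / z1, 1.0 / z2]],
                       [1.0, theta1 + phi1])

# Wold coefficients by the recursion c_k = phi1 c_{k-1} + phi2 c_{k-2}
c = [1.0, theta1 + phi1]
for k in range(2, 15):
    c.append(phi1 * c[-1] + phi2 * c[-2])

for k in range(15):
    assert abs(c[k] - (A * z1 ** (-k) + B * z2 ** (-k))) < 1e-9
```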
6. Suppose $Y_t = X_t + \epsilon_t$, $X_t = \alpha X_{t-1} + \eta_t$, where $\{\epsilon_t\}$ and $\{\eta_t\}$ are independent white noise sequences with common variance $\sigma^2$. Show that the spectral density function of $\{Y_t\}$ is
$$f_Y(\omega) = \frac{\sigma^2}{\pi}\left\{\frac{2 - 2\alpha\cos\omega + \alpha^2}{1 - 2\alpha\cos\omega + \alpha^2}\right\}.$$
For what values of $p, d, q$ is the autocovariance function of $\{Y_t\}$ identical to that of an ARIMA($p, d, q$) process?

[ We have
$$f_Y(\omega) = f_X(\omega) + f_\epsilon(\omega) = \frac{1}{|1 - \alpha e^{i\omega}|^2}f_\eta(\omega) + f_\epsilon(\omega) = \frac{\sigma^2}{\pi}\left\{\frac{1}{1 - 2\alpha\cos\omega + \alpha^2} + 1\right\} = \frac{\sigma^2}{\pi}\left\{\frac{2 - 2\alpha\cos\omega + \alpha^2}{1 - 2\alpha\cos\omega + \alpha^2}\right\}.$$
We recognise this as the spectral density of an ARMA(1, 1) model, i.e., $p = 1$, $d = 0$, $q = 1$. E.g., $Z_t - \alpha Z_{t-1} = \xi_t - \theta\xi_{t-1}$, choosing $\theta$ and $\sigma_\xi^2$ such that
$$(1 - 2\theta\cos\omega + \theta^2)\sigma_\xi^2 = \sigma^2(2 - 2\alpha\cos\omega + \alpha^2),$$
i.e., choosing $\theta$ such that $(1 + \theta^2)/\theta = (2 + \alpha^2)/\alpha$. ]
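The matching of spectra can be confirmed numerically (a sketch; $\alpha = 0.5$, $\sigma^2 = 1$ are illustrative): solve $(1+\theta^2)/\theta = (2+\alpha^2)/\alpha$ for the root with $|\theta| < 1$, choose $\sigma_\xi^2$ to match the $\cos\omega$ coefficients, and compare the two spectral densities on a grid.

```python
import numpy as np

alpha, sigma2 = 0.5, 1.0
c = (2.0 + alpha ** 2) / alpha              # target value of (1 + theta^2)/theta
theta = (c - np.sqrt(c ** 2 - 4.0)) / 2.0   # root of theta^2 - c*theta + 1 with |theta| < 1
sigma2_xi = sigma2 * alpha / theta          # matches the cos(omega) coefficients

w = np.linspace(0.0, np.pi, 200)
f_Y = (sigma2 / np.pi) * (2 - 2 * alpha * np.cos(w) + alpha ** 2) \
      / (1 - 2 * alpha * np.cos(w) + alpha ** 2)
f_arma = (sigma2_xi / np.pi) * (1 - 2 * theta * np.cos(w) + theta ** 2) \
         / (1 - 2 * alpha * np.cos(w) + alpha ** 2)
assert np.allclose(f_Y, f_arma)
```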
7. Suppose $X_1, \ldots, X_T$ are values of a time series. Prove that
$$\hat{\gamma}_0 + 2\sum_{k=1}^{T-1}\hat{\gamma}_k = 0,$$
where $\hat{\gamma}_k$ is the usual estimator of the $k$th order autocovariance,
$$\hat{\gamma}_k = \frac{1}{T}\sum_{t=k+1}^{T}(X_t - \bar{X})(X_{t-k} - \bar{X}).$$
[Hint: Consider $0 = \sum_{t=1}^{T}(X_t - \bar{X})$.] Hence deduce that not all ordinates of the correlogram can have the same sign.

Suppose $f(\cdot)$ is the spectral density and $I(\cdot)$ the periodogram. Suppose $f$ is continuous and $f(0) \neq 0$. Does $EI(2\pi/T) \to f(0)$ as $T \to \infty$?

[ The first results follow directly from
$$\frac{1}{T}\left[\sum_{t=1}^{T}(X_t - \bar{X})\right]^2 = 0.$$
Note that formally,
$$I(0) = \frac{1}{\pi}\left\{\hat{\gamma}_0 + 2\sum_{k=1}^{T-1}\hat{\gamma}_k\right\} = 0,$$
so it might appear that $EI(2\pi/T) \to I(0) = 0 \neq f(0)$ as $T \to \infty$. However, this would be mistaken. It is a theorem that as $T \to \infty$, $I(\omega_j) \sim f(\omega_j)\chi_2^2/2$. Since $E[\chi_2^2/2] = 1$, for large $T$ we have $EI(2\pi/T) \approx f(0)$. ]
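The identity holds for any data set, which is easy to confirm numerically (a sketch with our own helper name):

```python
import numpy as np

def acov_hat(x, k):
    # gamma_hat_k = (1/T) sum_{t=k+1}^{T} (X_t - Xbar)(X_{t-k} - Xbar)
    T, xbar = len(x), np.mean(x)
    return np.sum((x[k:] - xbar) * (x[: T - k] - xbar)) / T

rng = np.random.default_rng(1)
x = rng.normal(size=50)
total = acov_hat(x, 0) + 2.0 * sum(acov_hat(x, k) for k in range(1, 50))
assert abs(total) < 1e-10   # gamma_hat_0 + 2 * sum_k gamma_hat_k = 0
```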
8. Suppose $I(\cdot)$ is the periodogram of $\epsilon_1, \ldots, \epsilon_T$, where these are i.i.d. $N(0, 1)$ and $T = 2m + 1$. Let $\omega_j, \omega_k$ be two distinct Fourier frequencies. Show that $I(\omega_j)$ and $I(\omega_k)$ are independent random variables. What are their distributions?

If it is suspected that $\{\epsilon_t\}$ departs from white noise because of the presence of a single harmonic component at some unknown frequency $\omega$, a natural test statistic is the maximum periodogram ordinate
$$T^* = \max_{j=1,\ldots,m} I(\omega_j).$$
Show that under the hypothesis that $\{\epsilon_t\}$ is white noise,
$$P(T^* > t) = 1 - \left\{1 - \exp\left(-\pi t/\sigma^2\right)\right\}^m.$$

[ The independence of $I(\omega_j)$ and $I(\omega_k)$ was proved in lectures. Their distributions are $(\sigma^2/2\pi)\chi_2^2$, which is equivalent to the exponential distribution with mean $\sigma^2/\pi$. Hence the probability that the maximum is less than $t$ is the probability that all the ordinates are, i.e.,
$$P(T^* < t) = \left\{1 - \exp\left(-\pi t/\sigma^2\right)\right\}^m. \;]$$
9. Complete this sketch of the fast Fourier transform. From data $X_0, \ldots, X_T$, with $T = 2^M - 1$, we want to compute the $2^{M-1}$ ordinates of the periodogram
$$I(\omega_j) = \frac{1}{\pi T}\left|\sum_{t=0}^{T} X_te^{it2\pi j/2^M}\right|^2, \qquad j = 1, \ldots, 2^{M-1}.$$
This requires order $T$ multiplications for each $j$ and so order $T^2$ multiplications in all. However,
$$\begin{aligned}
\sum_{t=0}^{2^M-1} X_te^{it2\pi j/2^M}
&= \sum_{t=0}^{2^{M-1}-1} X_{2t}e^{i2t\cdot2\pi j/2^M} + \sum_{t=0}^{2^{M-1}-1} X_{2t+1}e^{i(2t+1)2\pi j/2^M} \\
&= \sum_{t=0}^{2^{M-1}-1} X_{2t}e^{it2\pi j/2^{M-1}} + e^{i2\pi j/2^M}\sum_{t=0}^{2^{M-1}-1} X_{2t+1}e^{it2\pi j/2^{M-1}}.
\end{aligned}$$
Note that the value of either sum on the right hand side at $j = k$ is the complex conjugate of its value at $j = 2^{M-1} - k$; so these sums need only be computed for $j = 1, \ldots, 2^{M-2}$. Thus we have two sums, each of which is similar to the sum on the left hand side, but for a problem half as large. Suppose the computational effort required to work out each right hand side sum (for all relevant values of $j$) is $\Theta(M - 1)$. The sum on the left hand side is obtained (for all $2^{M-1}$ values of $j$) by combining the right hand sums, with further computational effort of order $2^{M-1}$. Explain
$$\Theta(M) = a2^{M-1} + 2\Theta(M - 1).$$
Hence deduce that $I(\cdot)$ can be computed (by the FFT) in time $O(T\log_2 T)$.

[ The derivation of the recurrence for $\Theta(M)$ should be obvious: the full-size sum costs the two half-size sums plus order $2^{M-1}$ combining operations. Solving the recurrence, with $\Theta(1)$ constant, gives $\Theta(M) = O(M2^M) = O(T\log_2 T)$. ]
10. Suppose we have the ARMA(1, 1) process $X_t = \phi X_{t-1} + \epsilon_t + \theta\epsilon_{t-1}$, with $|\phi| < 1$, $|\theta| < 1$, $\phi + \theta \neq 0$, observed up to time $T$, and we want to calculate the $k$-step ahead forecasts $\hat{X}_{T,k}$, $k \geq 1$. Derive a recursive formula to calculate $\hat{X}_{T,k}$ for $k = 1$ and $k = 2$.

[ Replacing future innovations by zero and $\epsilon_T$ by its estimate $\hat{\epsilon}_T = X_T - \hat{X}_{T-1,1}$, we have
$$\hat{X}_{T,1} = \phi X_T + \theta\hat{\epsilon}_T = \phi X_T + \theta(X_T - \hat{X}_{T-1,1}),$$
$$\hat{X}_{T,2} = \phi\hat{X}_{T,1}. \;]$$
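A minimal worked example of the recursion (a sketch; $\phi = 0.7$, $\theta = 0.3$, the data, and the zero initialisation of the first predictor are all illustrative assumptions):

```python
phi, theta = 0.7, 0.3
x = [1.0, 0.5, -0.2, 0.8]   # observations X_1, ..., X_T

f1 = 0.0                    # initialise the previous one-step forecast at zero
for xt in x:
    # Xhat_{t,1} = phi X_t + theta (X_t - Xhat_{t-1,1})
    f1 = phi * xt + theta * (xt - f1)

f2 = phi * f1               # Xhat_{T,2} = phi Xhat_{T,1}
```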
11. Consider the stationary scalar-valued process $\{X_t\}$ generated by the moving average $X_t = \epsilon_t - \theta\epsilon_{t-1}$. Determine the linear least-squares predictor of $X_t$ in terms of $X_{t-1}, X_{t-2}, \ldots$.

[ We can directly apply our results to give
$$\begin{aligned}
\hat{X}_{t-1,1} &= -\theta\hat{\epsilon}_{t-1} = -\theta[X_{t-1} - \hat{X}_{t-2,1}] = -\theta X_{t-1} + \theta\hat{X}_{t-2,1} \\
&= -\theta X_{t-1} + \theta[-\theta X_{t-2} + \theta\hat{X}_{t-3,1}] \\
&= -\theta X_{t-1} - \theta^2X_{t-2} - \theta^3X_{t-3} - \cdots
\end{aligned}$$
Alternatively, take the linear predictor as $\hat{X}_{t-1,1} = \sum_{r=1}^{\infty} a_rX_{t-r}$ and seek to minimize $E[X_t - \hat{X}_{t-1,1}]^2$. We have
$$E[X_t - \hat{X}_{t-1,1}]^2 = E\left[\epsilon_t - \theta\epsilon_{t-1} - \sum_{r=1}^{\infty} a_r(\epsilon_{t-r} - \theta\epsilon_{t-r-1})\right]^2 = \sigma^2\left[1 + (\theta + a_1)^2 + (\theta a_1 - a_2)^2 + (\theta a_2 - a_3)^2 + \cdots\right].$$
All terms but the first are minimized to 0 by taking $a_r = -\theta^r$. ]
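The two forms of the predictor agree, which can be checked on simulated data (a sketch; $\theta = 0.6$, the sample size, and the zero initialisation are illustrative assumptions):

```python
import numpy as np

theta = 0.6
rng = np.random.default_rng(3)
eps = rng.normal(size=400)
x = eps.copy()
x[1:] -= theta * eps[:-1]           # X_t = eps_t - theta eps_{t-1}

xhat = 0.0                          # initial predictor, before any data
for t in range(1, len(x)):
    # recursive form: Xhat_{t-1,1} = -theta (X_{t-1} - Xhat_{t-2,1})
    xhat = -theta * (x[t - 1] - xhat)

# series form: Xhat = -sum_r theta^r X_{t-r} (finite, matching the zero start)
series = -sum(theta ** r * x[len(x) - 1 - r] for r in range(1, len(x)))
assert abs(xhat - series) < 1e-9
```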
12. Consider the ARIMA(0, 2, 2) model
$$(I - B)^2X = (I - 0.81B + 0.38B^2)\epsilon,$$
where $\{\epsilon_t\}$ is white noise with variance 1.

(a) With data up to time $T$, calculate the $k$-step ahead optimal forecast $\hat{X}_{T,k}$ for all $k \geq 1$. By giving a general formula relating $\hat{X}_{T,k}$, $k \geq 3$, to $\hat{X}_{T,1}$ and $\hat{X}_{T,2}$, determine the curve on which all these forecasts lie.

[ The model is
$$X_t = 2X_{t-1} - X_{t-2} + \epsilon_t - 0.81\epsilon_{t-1} + 0.38\epsilon_{t-2}.$$
Hence
$$\hat{X}_{T,1} = 2X_T - X_{T-1} - 0.81\hat{\epsilon}_T + 0.38\hat{\epsilon}_{T-1} = 2X_T - X_{T-1} - 0.81[X_T - \hat{X}_{T-1,1}] + 0.38[X_{T-1} - \hat{X}_{T-2,1}]$$
and similarly
$$\hat{X}_{T,2} = 2\hat{X}_{T,1} - X_T + 0.38[X_T - \hat{X}_{T-1,1}],$$
$$\hat{X}_{T,k} = 2\hat{X}_{T,k-1} - \hat{X}_{T,k-2}, \qquad k \geq 3.$$
This implies that the forecasts lie on a straight line. ]

(b) Suppose now that $T = 95$. Calculate numerically the forecasts $\hat{X}_{95,k}$, $k = 1, 2, 3$, and their mean squared prediction errors when the last five observations are $X_{91} = 15.1$, $X_{92} = 15.8$, $X_{93} = 15.9$, $X_{94} = 15.2$, $X_{95} = 15.9$. [You will need estimates for $\epsilon_{94}$ and $\epsilon_{95}$. Start by assuming $\epsilon_{91} = \epsilon_{92} = 0$, then calculate $\hat{\epsilon}_{93} = \epsilon_{93} = X_{93} - \hat{X}_{92,1}$, and so on, until $\epsilon_{94}$ and $\epsilon_{95}$ are obtained.]

[ Using the above formulae we obtain

   t     X_t      Xhat_{t,1}   Xhat_{t,2}   Xhat_{t,3}   eps_t
   91    15.1                                              0.000
   92    15.8     16.500       17.200       17.900         0.000
   93    15.9     16.486       16.844       17.202        -0.600
   94    15.2     15.314       14.939       14.564        -1.286
   95    15.9     15.636       15.596       15.555         0.586

Now
$$X_{T+k} = \sum_{r=0}^{\infty} c_r\epsilon_{T+k-r} \qquad\text{and}\qquad \hat{X}_{T,k} = \sum_{r=k}^{\infty} c_r\epsilon_{T+k-r}.$$
Thus
$$E\big[X_{T+k} - \hat{X}_{T,k}\big]^2 = \sigma_\epsilon^2\sum_{r=0}^{k-1} c_r^2,$$
where $\sigma_\epsilon^2 = 1$. Now
$$X_T = \epsilon_T + (2 - 0.81)\epsilon_{T-1} + \big(2(2 - 0.81) - 1 + 0.38\big)\epsilon_{T-2} + \cdots,$$
so $c_0 = 1$, $c_1 = 1.19$, $c_2 = 1.76$. Hence the mean squared errors of $\hat{X}_{95,1}, \hat{X}_{95,2}, \hat{X}_{95,3}$ are respectively $1$, $1 + 1.19^2 = 2.416$, and $2.416 + 1.76^2 = 5.514$. ]
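The calculation above can be reproduced in a few lines (a sketch of the bookkeeping, using the stated starting values $\epsilon_{91} = \epsilon_{92} = 0$):

```python
# observations and the assumed starting innovations
x = {91: 15.1, 92: 15.8, 93: 15.9, 94: 15.2, 95: 15.9}
eps = {91: 0.0, 92: 0.0}
f1 = {}                                   # one-step forecasts Xhat_{t,1}

for t in range(92, 96):
    if t > 92:
        eps[t] = x[t] - f1[t - 1]         # eps_t = X_t - Xhat_{t-1,1}
    f1[t] = (2 * x[t] - x[t - 1]
             - 0.81 * eps[t] + 0.38 * eps[t - 1])

f2 = 2 * f1[95] - x[95] + 0.38 * eps[95]  # Xhat_{95,2}
f3 = 2 * f2 - f1[95]                      # Xhat_{95,3}, on the same straight line

# prediction-error variances from the Wold coefficients c_0, c_1, c_2
c = [1.0, 2 - 0.81, 2 * (2 - 0.81) - 1 + 0.38]
mse = [sum(ci ** 2 for ci in c[:k]) for k in (1, 2, 3)]
```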
13. Consider the state space model
$$X_t = S_t + v_t, \qquad S_t = S_{t-1} + w_t,$$
where $X_t$ and $S_t$ are both scalars, $X_t$ is observed, $S_t$ is unobserved, and $\{v_t\}$, $\{w_t\}$ are Gaussian white noise sequences with variances $V$ and $W$ respectively. Write down the Kalman filtering equations for $\hat{S}_t$ and $P_t$. Show that $P_t \to P$ (independently of $t$) if and only if $P^2 + PW = WV$, and show that in this case the Kalman filter for $\hat{S}_t$ is equivalent to exponential smoothing.

[ This is the same as Section 8.4 of the notes. Here $F_t = 1$, $G_t = 1$, $V_t = V$, $W_t = W$, and $R_t = P_{t-1} + W$. So if $(S_{t-1} \mid X_1, \ldots, X_{t-1}) \sim N(\hat{S}_{t-1}, P_{t-1})$ then $(S_t \mid X_1, \ldots, X_t) \sim N(\hat{S}_t, P_t)$, where
$$\hat{S}_t = \hat{S}_{t-1} + R_t(V + R_t)^{-1}(X_t - \hat{S}_{t-1})$$
and
$$P_t = R_t - \frac{R_t^2}{V + R_t} = \frac{VR_t}{V + R_t} = \frac{V(P_{t-1} + W)}{V + P_{t-1} + W}.$$
$P_t$ is constant if $P_t = P$, where $P$ is the positive root of $P^2 + WP - WV = 0$. In this case $\hat{S}_t$ behaves like
$$\hat{S}_t = (1 - \alpha)\sum_{r=0}^{\infty} \alpha^rX_{t-r}, \qquad\text{where } \alpha = \frac{V}{V + W + P}.$$
This is simple exponential smoothing. ]
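Both claims can be confirmed numerically (a sketch; $V = 2$, $W = 0.5$, the zero initial state, and the sample size are illustrative assumptions): iterate the variance recursion to its fixed point, check the quadratic, then compare the steady-state filter with exponential smoothing.

```python
import numpy as np

V, W = 2.0, 0.5

# iterate P_t = V(P_{t-1} + W)/(V + P_{t-1} + W) to convergence
P = 1.0
for _ in range(200):
    P = V * (P + W) / (V + P + W)
assert abs(P ** 2 + W * P - W * V) < 1e-12   # positive root of P^2 + WP - WV = 0

alpha = V / (V + W + P)
rng = np.random.default_rng(4)
x = rng.normal(size=300)

s = 0.0
for xt in x:                                 # steady-state Kalman update
    R = P + W
    s = s + (R / (V + R)) * (xt - s)

# exponential smoothing with the same zero start: (1-alpha) sum_r alpha^r X_{t-r}
smooth = (1 - alpha) * sum(alpha ** r * x[len(x) - 1 - r] for r in range(len(x)))
assert abs(s - smooth) < 1e-9
```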