Parameter estimation: ACVF of AR processes

Yule-Walker's method for AR processes is a method of moments: set $\hat\mu = \bar x$ and choose the parameters so that $\gamma(h) = \hat\gamma(h)$ (for $h$ small).

Multiplying $X_t = \varphi_1 X_{t-1} + \dots + \varphi_p X_{t-p} + Z_t$ by $X_{t-h}$ ($h = 1, \dots, p$) and taking expectations, one has $\Gamma_p \phi = \gamma_p$ with $(\Gamma_p)_{i,j} = \gamma(i-j)$, i.e.
\[
\Gamma_p = \begin{pmatrix}
\gamma(0) & \gamma(1) & \gamma(2) & \cdots & \gamma(p-1)\\
\gamma(1) & \gamma(0) & \gamma(1) & \ddots & \vdots\\
\gamma(2) & \gamma(1) & \gamma(0) & \ddots & \gamma(2)\\
\vdots & \ddots & \ddots & \ddots & \gamma(1)\\
\gamma(p-1) & \cdots & \gamma(2) & \gamma(1) & \gamma(0)
\end{pmatrix},
\qquad
\gamma_p = \begin{pmatrix} \gamma(1)\\ \gamma(2)\\ \vdots\\ \gamma(p) \end{pmatrix},
\]
and multiplying by $X_t$ one obtains $\sigma^2 = \gamma(0) - \phi^t \gamma_p$.

12 November 2013
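As a quick numerical check of the relations $\Gamma_p \phi = \gamma_p$ and $\sigma^2 = \gamma(0) - \phi^t \gamma_p$, the following sketch (numpy; the numbers are our own illustration, not from the slides) uses the known ACVF of an AR(1) with $\varphi = 0.6$, $\sigma^2 = 1$, viewed as an AR(2) with coefficient vector $(0.6, 0)$:

```python
import numpy as np

# Theoretical ACVF of AR(1) with phi = 0.6, sigma^2 = 1: gamma(h) = phi^|h| / (1 - phi^2)
phi1, sigma2 = 0.6, 1.0
gamma = lambda h: sigma2 * phi1 ** abs(h) / (1 - phi1 ** 2)

p = 2
Gamma_p = np.array([[gamma(i - j) for j in range(p)] for i in range(p)])  # Toeplitz
gamma_p = np.array([gamma(h) for h in range(1, p + 1)])

# AR(1) as a degenerate AR(2): phi = (0.6, 0)
phi = np.array([phi1, 0.0])
lhs = Gamma_p @ phi                    # should equal gamma_p
sig2_check = gamma(0) - phi @ gamma_p  # should recover sigma^2 = 1
```

Both identities hold exactly here because the theoretical ACVF is used; with data one replaces $\gamma$ by $\hat\gamma$, which is the next slide.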
Parameter estimation: Yule-Walker's method

Hence estimate
\[
\hat\phi = \begin{pmatrix} \hat\varphi_1\\ \hat\varphi_2\\ \vdots\\ \hat\varphi_p \end{pmatrix}
\quad\text{by solving}\quad \hat\Gamma_p \hat\phi = \hat\gamma_p. \tag{1}
\]
Moreover
\[
\hat\sigma^2 = \hat\gamma(0) - \hat\phi^t \hat\gamma_p. \tag{2}
\]

Remarks
1. We proved that $\hat\Gamma_p$ is non-negative definite. If $\hat\gamma(0) > 0$ (i.e. the data are not all identical), it is positive definite, hence invertible, so $\hat\phi$ is well defined.
2. One can divide (1) by $\hat\gamma(0)$ to obtain $\hat R_p \hat\phi = \hat\rho_p$ with $(R_p)_{i,j} = \rho(i-j)$, $(\rho_p)_i = \rho(i)$.
3. One can compute $\hat\phi$ iteratively, applying the Durbin-Levinson algorithm to $\hat\gamma$.
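Remark 3 can be sketched in a few lines of numpy (function name and test values are ours): the Durbin-Levinson recursion solves $\Gamma_p \phi = \gamma_p$ without inverting a matrix, updating the coefficients order by order.

```python
import numpy as np

def durbin_levinson(gamma, p):
    """Solve Gamma_p phi = gamma_p recursively; gamma = [gamma(0), ..., gamma(p)].
    Returns phi_p and the prediction error v_p = gamma(0) - phi^t gamma_p."""
    gamma = np.asarray(gamma, float)
    phi = np.array([gamma[1] / gamma[0]])
    v = gamma[0] * (1 - phi[0] ** 2)
    for m in range(2, p + 1):
        # phi_mm = (gamma(m) - sum_j phi_{m-1,j} gamma(m-j)) / v_{m-1}
        k = (gamma[m] - phi @ gamma[m - 1:0:-1]) / v
        # phi_{m,j} = phi_{m-1,j} - phi_mm * phi_{m-1,m-j}
        phi = np.concatenate([phi - k * phi[::-1], [k]])
        v *= 1 - k ** 2
    return phi, v

# check against a direct solve, using the theoretical ACVF of an AR(1) with phi = 0.6
g = np.array([0.6 ** h / (1 - 0.36) for h in range(4)])
phi_dl, v = durbin_levinson(g, 3)
Gamma = np.array([[g[abs(i - j)] for j in range(3)] for i in range(3)])
phi_direct = np.linalg.solve(Gamma, g[1:4])
```

In practice one applies the same recursion to $\hat\gamma$; for this AR(1) input the recursion returns $(0.6, 0, 0)$ and prediction error $1$, as it should.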
Asymptotic properties of Yule-Walker

Theorem. If $\{X_t\}$ is the causal AR($p$) process satisfying
\[
X_t = \varphi_1 X_{t-1} + \dots + \varphi_p X_{t-p} + Z_t, \qquad \{Z_t\} \sim \mathrm{IID}(0, \sigma^2),
\]
then $n^{1/2}(\hat\phi - \phi) \Rightarrow N(0, \sigma^2 \Gamma_p^{-1})$ and $\hat\sigma^2 \xrightarrow{P} \sigma^2$.

An approximate $\gamma$-confidence interval for $\varphi_i$, $i = 1, \dots, p$, is then
\[
\left( \hat\varphi_i - \hat\sigma \big[(\hat\Gamma_p^{-1})_{ii}\big]^{1/2} \frac{z_\gamma}{\sqrt n},\;
\hat\varphi_i + \hat\sigma \big[(\hat\Gamma_p^{-1})_{ii}\big]^{1/2} \frac{z_\gamma}{\sqrt n} \right)
\]
where $P(|N(0,1)| \le z_\gamma) = \gamma$.
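The interval above can be computed directly from the sample quantities; here is a minimal numpy sketch (helper names and the simulated AR(1) series are our own illustration, not from the slides):

```python
import numpy as np

def yw_fit(x, p):
    """Yule-Walker: returns phi_hat, sigma2_hat and the sample matrix Gamma_hat_p."""
    n, xbar = len(x), x.mean()
    g = np.array([((x[:n - h] - xbar) * (x[h:] - xbar)).sum() / n
                  for h in range(p + 1)])
    Gamma = np.array([[g[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(Gamma, g[1:])
    return phi, g[0] - phi @ g[1:], Gamma

def yw_confint(x, p, z=1.96):
    """Approximate CI: phi_i -+ sigma_hat [ (Gamma_hat_p^{-1})_ii ]^{1/2} z / sqrt(n)."""
    phi, s2, Gamma = yw_fit(x, p)
    half = np.sqrt(s2 * np.diag(np.linalg.inv(Gamma))) * z / np.sqrt(len(x))
    return phi - half, phi + half

# demo on a simulated AR(1) with phi = 0.6, sigma^2 = 1 (illustrative values)
rng = np.random.default_rng(1)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.6 * x[t - 1] + rng.standard_normal()
lo, hi = yw_confint(x, 1)
```

With $z = 1.96$ this gives an approximate 95% interval; for $n = 2000$ its half-width is roughly $1.96\sqrt{(1-\varphi^2)/n} \approx 0.035$.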
Comparison with linear regression

One could write the problem as $Y = X\phi + Z$ with
\[
Y = \begin{pmatrix} X_n\\ X_{n-1}\\ \vdots\\ X_{p+1} \end{pmatrix},
\qquad
X = \begin{pmatrix}
X_{n-1} & \cdots & X_{n-p}\\
X_{n-2} & \cdots & X_{n-p-1}\\
\vdots & & \vdots\\
X_p & \cdots & X_1
\end{pmatrix},
\]
and $Z \sim \mathrm{IID}(0, \sigma^2)$. Least-squares estimation minimizes $\|Y - X\phi\|^2$ and results in $\hat\phi = (X^t X)^{-1} X^t Y$. Note that
\[
X^t Y = \begin{pmatrix} \sum_{t=p+1}^n X_t X_{t-1}\\ \vdots\\ \sum_{t=p+1}^n X_t X_{t-p} \end{pmatrix} \approx n\hat\gamma_p,
\qquad
(X^t X)_{ij} = \sum_{t=p-i+1}^{n-i} X_t X_{t+i-j} \approx n(\hat\Gamma_p)_{ij},
\]
so the least-squares estimator is close to the Yule-Walker one.
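This closeness is easy to see numerically. A minimal sketch (numpy; function names and the simulated AR(2) are our own illustration) compares the least-squares and Yule-Walker estimates on the same series:

```python
import numpy as np

def ar_ols(x, p):
    """Least-squares fit: phi_hat = (X^t X)^{-1} X^t Y, regressing X_t on its p lags."""
    Y = x[p:]
    # column j holds the lag-j values X_{t-j}, t = p+1, ..., n
    X = np.column_stack([x[p - j:len(x) - j] for j in range(1, p + 1)])
    return np.linalg.solve(X.T @ X, X.T @ Y)

# simulated AR(2) with phi = (0.5, -0.3), sigma^2 = 1 (illustrative values)
rng = np.random.default_rng(2)
x = np.zeros(2000)
for t in range(2, 2000):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()

phi_ols = ar_ols(x, 2)

# Yule-Walker on the same data, for comparison
n, xbar = len(x), x.mean()
g = np.array([((x[:n - h] - xbar) * (x[h:] - xbar)).sum() / n for h in range(3)])
phi_yw = np.linalg.solve([[g[0], g[1]], [g[1], g[0]]], g[1:])
```

The two estimators differ only through edge terms and centering, so for large $n$ they are nearly identical.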
Variations on Yule-Walker: Burg's algorithm

Yule-Walker's method finds $\hat\phi$ as the coefficients of the best linear predictor of $X_{p+1}$ in terms of $X_1, \dots, X_p$, assuming that the ACF coincides with the sample ACF for $i = 1, \dots, p$.

Burg's algorithm considers actual forward and backward linear predictors. Define $u_p(t)$ ($p \le t < n$) as the difference between $X_{n-t+p}$ and its best linear prediction in terms of the previous $p$ values, i.e.
\[
u_p(t) = X_{n-t+p} - P_{L(X_{n-t}, \dots, X_{n-t+p-1})} X_{n-t+p},
\]
and let $v_p(t)$ ($1 \le t < n$) be the difference between $X_{n-t}$ and its best linear prediction in terms of the following $p$ values, i.e.
\[
v_p(t) = X_{n-t} - P_{L(X_{n-t+1}, \dots, X_{n-t+p})} X_{n-t}.
\]
The coefficients are found by minimizing $\sum_{t=p}^{n-1} \big( u_p^2(t) + v_p^2(t) \big)$. Details of the algorithm are left to an exercise.
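Since the slides defer the details to an exercise, here is one standard formulation of the recursion as a sketch (numpy; our own implementation, with the convention $X_t = \sum_j \varphi_j X_{t-j} + Z_t$): at each stage $m$ the reflection coefficient $k$ is chosen to minimize the summed squared forward and backward errors, and the lower-order coefficients are updated Levinson-style.

```python
import numpy as np

def burg(x, p):
    """Burg estimate of AR(p) coefficients phi_1, ..., phi_p."""
    x = np.asarray(x, float)
    n = len(x)
    f = x.copy()  # forward errors  f_m(t), valid for t >= m
    b = x.copy()  # backward errors b_m(t), valid for t <= n-1-m
    phi = np.zeros(0)
    for m in range(1, p + 1):
        ff, bb = f[m:].copy(), b[:n - m].copy()
        # k minimizes sum((ff - k*bb)^2 + (bb - k*ff)^2)
        k = 2 * (ff @ bb) / (ff @ ff + bb @ bb)
        # Levinson-type update: phi_{m,j} = phi_{m-1,j} - k * phi_{m-1,m-j}, phi_{m,m} = k
        phi = np.concatenate([phi - k * phi[::-1], [k]])
        f[m:] = ff - k * bb       # f_m(t) = f_{m-1}(t) - k b_{m-1}(t-m)
        b[:n - m] = bb - k * ff   # b_m(t) = b_{m-1}(t) - k f_{m-1}(t+m)
    return phi

# demo on a simulated AR(2) with phi = (0.5, -0.3) (illustrative values)
rng = np.random.default_rng(3)
x = np.zeros(2000)
for t in range(2, 2000):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()
phi_burg = burg(x, 2)
```

By construction $|k| \le 1$ at every stage, so Burg estimates always correspond to a causal model.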
Order of the auto-regressive process

Assume the true model is AR($p$) and we fit a model of order $m$.

If $m < p$ (i.e. the model is wrong), we should see this from the residuals:
\[
X_t - \hat X_t = X_t - \sum_{j=1}^p \varphi_j X_{t-j} = Z_t \sim \mathrm{WN}(0, \sigma^2) \quad \text{for } t > p.
\]
The residuals $\hat W_t = X_t - \sum_{j=1}^p \hat\varphi_j X_{t-j}$ are not exactly WN, but close for $n$ large. If $m < p$ and we compute $(\hat W_m)_t = X_t - \sum_{j=1}^m (\hat\varphi_m)_j X_{t-j}$, we expect to find them correlated. A significant ACF of $\hat W_m$ suggests $m < p$.

If $m \ge p$, then $n^{1/2}(\hat\phi_m - \phi) \Rightarrow N(0, \sigma^2 \Gamma_m^{-1})$ (Theorem). In particular, for $m > p$, $(\hat\phi_m)_m \approx N(0, 1/n)$. Finding $(\hat\phi_m)_m$ within $\pm 1.96\, n^{-1/2}$ suggests that the true value of $\varphi_m$ may be $0$, and that a model of lower order should be fitted.
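The second criterion can be sketched as follows (numpy; function name and the simulated AR(1) series are our own illustration): fit AR($m$) by Yule-Walker for increasing $m$ and compare the last coefficient with the $\pm 1.96/\sqrt n$ band.

```python
import numpy as np

def last_coef_significance(x, max_p):
    """For m = 1..max_p, fit AR(m) by Yule-Walker and flag whether the last
    coefficient (phi_m)_m falls outside the +-1.96/sqrt(n) band.
    Returns {m: (coefficient, significant)}."""
    n, xbar = len(x), x.mean()
    g = np.array([((x[:n - h] - xbar) * (x[h:] - xbar)).sum() / n
                  for h in range(max_p + 1)])
    band = 1.96 / np.sqrt(n)
    out = {}
    for m in range(1, max_p + 1):
        Gamma = np.array([[g[abs(i - j)] for j in range(m)] for i in range(m)])
        phi = np.linalg.solve(Gamma, g[1:m + 1])
        out[m] = (phi[-1], abs(phi[-1]) > band)
    return out

# demo: data actually from AR(1) with phi = 0.6 (illustrative values)
rng = np.random.default_rng(4)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.6 * x[t - 1] + rng.standard_normal()
res = last_coef_significance(x, 5)
```

For data from an AR(1), the $m = 1$ coefficient is far outside the band, while for $m > 1$ the last coefficient is typically inside it, pointing at order 1. (This is exactly the sample PACF criterion.)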
Maximum likelihood estimation

A standard method of parameter estimation is maximum likelihood estimation (MLE). The likelihood of ARMA processes is computed assuming $\{Z_t\} \sim \mathrm{IID}\, N(0, \sigma^2)$; then $\{X_t\}$ is also normally distributed.

Example, AR(1): $X_t = \mu + \varphi(X_{t-1} - \mu) + Z_t$, $|\varphi| < 1$. Then $X_t \sim N\!\left(\mu, \frac{\sigma^2}{1 - \varphi^2}\right)$.

Joint distributions can be obtained from conditional distributions: conditional on the value of $X_{t-1}$, $X_t \sim N(\mu + \varphi(X_{t-1} - \mu), \sigma^2)$. Hence the likelihood of the data $(x_1, x_2, \dots, x_n)$ can be obtained as
\[
f(x_1) f(x_2 \mid x_1) \cdots f(x_n \mid x_{n-1}),
\]
where $f(\cdot)$ is the (unconditional) density of $X_t$ and $f(\cdot \mid x)$ is the density of $X_t$ conditional on $X_{t-1} = x$.
Computing the likelihood for AR(1)

\[
f(x) = \left( 2\pi \frac{\sigma^2}{1 - \varphi^2} \right)^{-1/2} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2 / (1 - \varphi^2)} \right\}
= \left( \frac{1 - \varphi^2}{2\pi\sigma^2} \right)^{1/2} \exp\left\{ -\frac{(1 - \varphi^2)(x - \mu)^2}{2\sigma^2} \right\},
\]
\[
f(y \mid x) = (2\pi\sigma^2)^{-1/2} \exp\left\{ -\frac{(y - \mu - \varphi(x - \mu))^2}{2\sigma^2} \right\}.
\]
The parameters of the model are $\mu, \varphi, \sigma^2$:
\[
L(\mu, \varphi, \sigma^2) = \frac{(1 - \varphi^2)^{1/2}}{(2\pi\sigma^2)^{n/2}} \exp\left\{ -\frac{(1 - \varphi^2)(x_1 - \mu)^2}{2\sigma^2} \right\} \prod_{j=2}^n \exp\left\{ -\frac{(x_j - \mu - \varphi(x_{j-1} - \mu))^2}{2\sigma^2} \right\}
\]
\[
= \frac{(1 - \varphi^2)^{1/2}}{(2\pi\sigma^2)^{n/2}} \exp\left\{ -\frac{1}{2\sigma^2} \left[ (1 - \varphi^2)(x_1 - \mu)^2 + \sum_{t=2}^n \big( x_t - \mu - \varphi(x_{t-1} - \mu) \big)^2 \right] \right\}.
\]
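The likelihood above translates directly into code. A minimal sketch (numpy; function names, the simulated series, and the crude grid search are our own illustration, not a prescribed optimizer): we evaluate $\log L$, fix $\mu = \bar x$, profile out $\sigma^2$ (for given $\mu, \varphi$ the maximizer is $\hat\sigma^2 = S(\mu, \varphi)/n$ with $S$ the bracketed sum), and maximize over $\varphi$ on a grid.

```python
import numpy as np

def ar1_loglik(x, mu, phi, sigma2):
    """log L(mu, phi, sigma2): exact Gaussian AR(1) log-likelihood from the slide."""
    n = len(x)
    r = x[1:] - mu - phi * (x[:-1] - mu)
    S = (1 - phi ** 2) * (x[0] - mu) ** 2 + (r ** 2).sum()
    return (0.5 * np.log(1 - phi ** 2)
            - 0.5 * n * np.log(2 * np.pi * sigma2)
            - S / (2 * sigma2))

# demo: simulated AR(1) with mu = 0, phi = 0.6, sigma^2 = 1 (illustrative values)
rng = np.random.default_rng(5)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.6 * x[t - 1] + rng.standard_normal()

mu = x.mean()
grid = np.linspace(-0.95, 0.95, 191)  # step 0.01

def prof(phi):
    # concentrated log-likelihood: plug in sigma2_hat = S(mu, phi) / n
    r = x[1:] - mu - phi * (x[:-1] - mu)
    s2 = ((1 - phi ** 2) * (x[0] - mu) ** 2 + (r ** 2).sum()) / len(x)
    return ar1_loglik(x, mu, phi, s2)

phi_mle = grid[np.argmax([prof(p) for p in grid])]
```

In practice one would use a proper numerical optimizer instead of a grid, but the structure (exact likelihood, variance profiled out) is the same.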