Introduction to Time Series Analysis. Lecture 12.


Introduction to Time Series Analysis. Lecture 12.
Peter Bartlett

Last lecture:
1. Parameter estimation
2. Maximum likelihood estimator
3. Yule-Walker estimation

Introduction to Time Series Analysis. Lecture 12.
1. Review: Yule-Walker estimators
2. Yule-Walker example
3. Efficiency
4. Maximum likelihood estimation
5. Large-sample distribution of MLE

Review: Yule-Walker estimation
Method of moments: we choose parameters for which the moments are equal to the empirical moments. In this case, we choose $\phi$ so that $\gamma(h) = \hat\gamma(h)$ for $h = 0, \ldots, p$.
Yule-Walker equations for $\hat\phi$:
$$\hat\Gamma_p \hat\phi = \hat\gamma_p, \qquad \hat\sigma^2 = \hat\gamma(0) - \hat\phi' \hat\gamma_p.$$
These are the forecasting equations. We can use the Durbin-Levinson algorithm.
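
As an illustration of how this system can be set up and solved in practice, here is a minimal Python sketch (not part of the lecture), assuming numpy and scipy are available; `sample_acvf` and `yule_walker` are hypothetical helper names.

```python
import numpy as np
from scipy.linalg import toeplitz

def sample_acvf(x, max_lag):
    """Sample autocovariances gamma_hat(0..max_lag), using the 1/n convention."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    return np.array([np.dot(xc[:n - h], xc[h:]) / n for h in range(max_lag + 1)])

def yule_walker(x, p):
    """Solve Gamma_p phi = gamma_p and return (phi_hat, sigma2_hat)."""
    g = sample_acvf(x, p)
    Gamma_p = toeplitz(g[:p])      # p x p matrix with entries gamma_hat(|i-j|)
    gamma_p = g[1:p + 1]           # (gamma_hat(1), ..., gamma_hat(p))'
    phi_hat = np.linalg.solve(Gamma_p, gamma_p)
    sigma2_hat = g[0] - phi_hat @ gamma_p
    return phi_hat, sigma2_hat
```

The Durbin-Levinson recursion solves the same system in O(p^2) operations and yields the sample PACF as a by-product; the direct solve above is just the most transparent version.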

Review: Confidence intervals for Yule-Walker estimation
If $\{X_t\}$ is an AR(p) process,
$$\hat\phi \sim AN\left(\phi, \frac{\sigma^2}{n}\Gamma_p^{-1}\right), \qquad \hat\sigma^2 \xrightarrow{P} \sigma^2, \qquad \hat\phi_{hh} \sim AN\left(0, \frac{1}{n}\right) \text{ for } h > p.$$
Thus, we can use the sample PACF to test for AR order, and we can calculate approximate confidence intervals for the parameters $\phi$.

Review: Confidence intervals for Yule-Walker estimation
If $\{X_t\}$ is an AR(p) process and $n$ is large, $\sqrt{n}(\hat\phi_p - \phi_p)$ is approximately $N(0, \hat\sigma^2 \hat\Gamma_p^{-1})$, so with probability $\approx 1-\alpha$, $\phi_{pj}$ is in the interval
$$\hat\phi_{pj} \pm \Phi_{1-\alpha/2}\, \frac{\hat\sigma}{\sqrt{n}} \left(\hat\Gamma_p^{-1}\right)_{jj}^{1/2},$$
where $\Phi_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal.
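
This interval translates directly into code. A minimal sketch, reusing the hypothetical `sample_acvf` and `yule_walker` helpers from the earlier sketch and the normal quantile from scipy:

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.stats import norm

def yw_confidence_intervals(x, p, alpha=0.05):
    """Approximate 1 - alpha intervals for each AR coefficient from the Yule-Walker fit."""
    n = len(x)
    phi_hat, sigma2_hat = yule_walker(x, p)
    g = sample_acvf(x, p)
    Gamma_inv = np.linalg.inv(toeplitz(g[:p]))
    z = norm.ppf(1 - alpha / 2)                           # Phi^{-1}(1 - alpha/2)
    half_width = z * np.sqrt(sigma2_hat * np.diag(Gamma_inv) / n)
    return phi_hat - half_width, phi_hat + half_width
```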

Review: Confidence intervals for Yule-Walker estimation
With probability $\approx 1-\alpha$, $\phi_p$ is in the ellipsoid
$$\left\{ \phi \in \mathbb{R}^p : \left(\hat\phi_p - \phi\right)' \hat\Gamma_p \left(\hat\phi_p - \phi\right) \le \frac{\hat\sigma^2}{n} \chi^2_{1-\alpha}(p) \right\},$$
where $\chi^2_{1-\alpha}(p)$ is the $(1-\alpha)$ quantile of the chi-squared distribution with $p$ degrees of freedom.

Yule-Walker estimation: Example
[Figure: Sample ACF, depth of Lake Huron, 1875-1972.]

Yule-Walker estimation: Example
[Figure: Sample PACF, depth of Lake Huron, 1875-1972.]

Yule-Walker estimation: Example
$$\hat\Gamma_2 = \begin{pmatrix} 1.7379 & 1.4458 \\ 1.4458 & 1.7379 \end{pmatrix}, \qquad \hat\gamma_2 = \begin{pmatrix} 1.4458 \\ 1.0600 \end{pmatrix},$$
$$\hat\phi_2 = \hat\Gamma_2^{-1}\hat\gamma_2 = \begin{pmatrix} 1.0538 \\ -0.2668 \end{pmatrix}, \qquad \hat\sigma^2_w = \hat\gamma(0) - \hat\phi_2' \hat\gamma_2 = 0.4971.$$
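
The same arithmetic can be checked in a few lines; this sketch simply reuses the three sample autocovariances quoted above.

```python
import numpy as np

g0, g1, g2 = 1.7379, 1.4458, 1.0600           # gamma_hat(0), gamma_hat(1), gamma_hat(2)
Gamma2 = np.array([[g0, g1], [g1, g0]])
gamma2 = np.array([g1, g2])

phi_hat = np.linalg.solve(Gamma2, gamma2)      # approximately [ 1.0538, -0.2668]
sigma2_hat = g0 - phi_hat @ gamma2             # approximately 0.4971
```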

Yule-Walker estimation: Example
Confidence intervals:
$$\hat\phi_1 \pm \Phi_{1-\alpha/2} \left( \hat\sigma^2_w \left[\hat\Gamma_2^{-1}\right]_{11} / n \right)^{1/2} = 1.0538 \pm 0.1908,$$
$$\hat\phi_2 \pm \Phi_{1-\alpha/2} \left( \hat\sigma^2_w \left[\hat\Gamma_2^{-1}\right]_{22} / n \right)^{1/2} = -0.2668 \pm 0.1908.$$

Yule-Walker estimation
It is also possible to define analogous estimators for ARMA(p,q) models with $q > 0$:
$$\hat\gamma(j) - \phi_1\hat\gamma(j-1) - \cdots - \phi_p\hat\gamma(j-p) = \sigma^2 \sum_{i=j}^{q} \theta_i \psi_{i-j},$$
where $\psi(B) = \theta(B)/\phi(B)$. Because of the dependence on the $\psi_i$, these equations are nonlinear in $\phi_i, \theta_i$. There might be no solution, or nonunique solutions. Also, the asymptotic efficiency of this estimator is poor: it has unnecessarily high variance.
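
To make the nonlinearity concrete, here is a hedged sketch for the ARMA(1,1) case, where $\psi_1 = \phi + \theta$ and the moment equations for $j = 0, 1, 2$ must be solved numerically. The autocovariance values below are purely illustrative (chosen to be consistent with $\phi = 0.5$, $\theta = 0.4$, $\sigma^2 = 1$); for other inputs the root finder may fail or return a non-causal solution, which is exactly the drawback noted above.

```python
import numpy as np
from scipy.optimize import fsolve

# Illustrative sample autocovariances gamma_hat(0), gamma_hat(1), gamma_hat(2).
g0, g1, g2 = 2.08, 1.44, 0.72

def moment_equations(params):
    phi, theta, sigma2 = params
    psi1 = phi + theta                                    # psi_1 for a causal ARMA(1,1)
    return [
        g0 - phi * g1 - sigma2 * (1.0 + theta * psi1),    # j = 0
        g1 - phi * g0 - sigma2 * theta,                   # j = 1
        g2 - phi * g1,                                    # j = 2
    ]

phi_hat, theta_hat, sigma2_hat = fsolve(moment_equations, x0=[0.3, 0.1, 1.0])
# For these inputs the solution is approximately (0.5, 0.4, 1.0).
```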

Efficiency of estimators
Let $\hat\phi^{(1)}$ and $\hat\phi^{(2)}$ be two estimators. Suppose that
$$\hat\phi^{(1)} \sim AN(\phi, \sigma_1^2), \qquad \hat\phi^{(2)} \sim AN(\phi, \sigma_2^2).$$
The asymptotic efficiency of $\hat\phi^{(1)}$ relative to $\hat\phi^{(2)}$ is
$$e\left(\phi, \hat\phi^{(1)}, \hat\phi^{(2)}\right) = \frac{\sigma_2^2}{\sigma_1^2}.$$
If $e\left(\phi, \hat\phi^{(1)}, \hat\phi^{(2)}\right) \le 1$ for all $\phi$, we say that $\hat\phi^{(2)}$ is a more efficient estimator of $\phi$ than $\hat\phi^{(1)}$.
For example, for an AR(p) process, the moment estimator and the maximum likelihood estimator are as efficient as each other. For an MA(q) process, the moment estimator is less efficient than the innovations estimator, which is less efficient than the MLE.

Yule-Walker estimation: Example
AR(1):
$$\hat\phi_1 \sim AN\left(\phi_1, \frac{\sigma^2}{n}\Gamma_1^{-1}\right) = AN\left(\phi_1, \frac{1-\phi_1^2}{n}\right), \quad \text{since } \gamma(0) = \frac{\sigma^2}{1-\phi_1^2}.$$
AR(2):
$$\begin{pmatrix} \hat\phi_1 \\ \hat\phi_2 \end{pmatrix} \sim AN\left(\phi, \frac{\sigma^2}{n}\Gamma_2^{-1}\right), \quad \text{and} \quad \frac{\sigma^2}{n}\Gamma_2^{-1} = \frac{1}{n} \begin{pmatrix} 1-\phi_2^2 & -\phi_1(1+\phi_2) \\ -\phi_1(1+\phi_2) & 1-\phi_2^2 \end{pmatrix}.$$

Yule-Walker estimation: Example
Suppose $\{X_t\}$ is an AR(1) process and the sample size $n$ is large. If we fit an AR(1) and estimate $\phi_1$, we have
$$\mathrm{Var}(\hat\phi_1) \approx \frac{1-\phi_1^2}{n}.$$
If instead we fit a larger model, say an AR(2), to this AR(1) process (so that the true $\phi_2 = 0$), we have
$$\mathrm{Var}(\hat\phi_1) \approx \frac{1-\phi_2^2}{n} = \frac{1}{n} > \frac{1-\phi_1^2}{n}.$$
We have lost efficiency.
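
A small Monte Carlo sketch (assuming the hypothetical `yule_walker` helper from the earlier sketch) illustrates this efficiency loss: simulate an AR(1), fit both an AR(1) and an AR(2) by Yule-Walker, and compare the empirical variance of $\hat\phi_1$ with the two theoretical values.

```python
import numpy as np

rng = np.random.default_rng(0)
phi1, n, reps = 0.6, 500, 2000

est_ar1, est_ar2 = [], []
for _ in range(reps):
    w = rng.standard_normal(n + 100)
    x = np.zeros(n + 100)
    for t in range(1, n + 100):
        x[t] = phi1 * x[t - 1] + w[t]          # simulate AR(1); first 100 values are burn-in
    x = x[100:]
    est_ar1.append(yule_walker(x, 1)[0][0])    # phi_1 estimate from an AR(1) fit
    est_ar2.append(yule_walker(x, 2)[0][0])    # phi_1 estimate from an AR(2) fit

print(np.var(est_ar1), (1 - phi1**2) / n)      # empirical variance vs. theory ~ 0.00128
print(np.var(est_ar2), 1 / n)                  # empirical variance vs. theory ~ 0.002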

Introduction to Time Series Analysis. Lecture 12.
1. Review: Yule-Walker estimators
2. Yule-Walker example
3. Efficiency
4. Maximum likelihood estimation
5. Large-sample distribution of MLE

Parameter estimation: Maximum likelihood estimator
One approach: assume that $\{X_t\}$ is Gaussian, that is, $\phi(B)X_t = \theta(B)W_t$, where $W_t$ is i.i.d. Gaussian. Choose $\phi_i, \theta_j$ to maximize the likelihood:
$$L(\phi, \theta, \sigma^2) = f(X_1, \ldots, X_n),$$
where $f$ is the joint (Gaussian) density for the given ARMA model. (c.f. choosing the parameters that maximize the probability of the data.)

Maximum likelihood estimation
Suppose that $X_1, X_2, \ldots, X_n$ is drawn from a zero-mean Gaussian ARMA(p,q) process. The likelihood of parameters $\phi \in \mathbb{R}^p$, $\theta \in \mathbb{R}^q$, $\sigma_w^2 \in \mathbb{R}_+$ is defined as the density of $X = (X_1, X_2, \ldots, X_n)'$ under the Gaussian model with those parameters:
$$L(\phi, \theta, \sigma_w^2) = \frac{1}{(2\pi)^{n/2} |\Gamma_n|^{1/2}} \exp\left( -\frac{1}{2} X' \Gamma_n^{-1} X \right),$$
where $|A|$ denotes the determinant of a matrix $A$, and $\Gamma_n$ is the variance/covariance matrix of $X$ with the given parameter values. The maximum likelihood estimator (MLE) of $\phi, \theta, \sigma_w^2$ maximizes this quantity.
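
For intuition, this likelihood can be evaluated directly whenever $\Gamma_n$ is easy to write down. Below is a sketch for a zero-mean AR(1), where $\gamma(h) = \sigma^2\phi^{|h|}/(1-\phi^2)$; building the full $n \times n$ matrix costs $O(n^3)$ and is only for illustration, since the innovations form derived later avoids this cost.

```python
import numpy as np
from scipy.linalg import toeplitz

def ar1_gaussian_loglik(x, phi, sigma2):
    """Exact Gaussian log-likelihood of a zero-mean AR(1) (assumes |phi| < 1),
    computed by building Gamma_n explicitly."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    acvf = sigma2 * phi ** np.arange(n) / (1.0 - phi**2)   # gamma(h), h = 0..n-1
    Gamma_n = toeplitz(acvf)
    sign, logdet = np.linalg.slogdet(Gamma_n)
    quad = x @ np.linalg.solve(Gamma_n, x)                  # x' Gamma_n^{-1} x
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)
```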

Parameter estimation: Maximum likelihood estimator
Advantages of MLE:
Efficient (low variance estimates).
Often the Gaussian assumption is reasonable.
Even if $\{X_t\}$ is not Gaussian, the asymptotic distribution of the estimates $(\hat\phi, \hat\theta, \hat\sigma^2)$ is the same as in the Gaussian case.
Disadvantages of MLE:
Difficult optimization problem.
Need to choose a good starting point (often use other estimators for this).

Preliminary parameter estimates
Yule-Walker for AR(p): Regress $X_t$ onto $X_{t-1}, \ldots, X_{t-p}$. Durbin-Levinson algorithm with $\gamma$ replaced by $\hat\gamma$.
Yule-Walker for ARMA(p,q): Method of moments. Not efficient.
Innovations algorithm for MA(q): with $\gamma$ replaced by $\hat\gamma$.
Hannan-Rissanen algorithm for ARMA(p,q):
1. Estimate a high-order AR.
2. Use it to estimate the (unobserved) noise $W_t$.
3. Regress $X_t$ onto $X_{t-1}, \ldots, X_{t-p}, \hat W_{t-1}, \ldots, \hat W_{t-q}$.
4. Regress again with improved estimates of $W_t$.
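
Here is a hedged sketch of Hannan-Rissanen steps 1-3 using ordinary least squares (the refinement in step 4 is omitted); the high AR order m = 20 and the index bookkeeping are illustrative choices, not prescribed by the lecture.

```python
import numpy as np

def hannan_rissanen(x, p, q, m=20):
    """Steps 1-3 of Hannan-Rissanen: high-order AR fit, residuals as noise
    estimates, then regression of X_t on lagged X's and lagged residuals."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Step 1: fit a high-order AR(m) by least squares.
    Z = np.column_stack([x[m - j - 1: n - j - 1] for j in range(m)])
    y = x[m:]
    a_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)

    # Step 2: residuals serve as estimates of the unobserved noise W_t.
    w_hat = np.concatenate([np.zeros(m), y - Z @ a_hat])

    # Step 3: regress X_t on X_{t-1..t-p} and W_hat_{t-1..t-q}.
    start = m + q
    cols = [x[start - j - 1: n - j - 1] for j in range(p)]
    cols += [w_hat[start - j - 1: n - j - 1] for j in range(q)]
    D = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(D, x[start:], rcond=None)
    return beta[:p], beta[p:]          # (phi_hat, theta_hat)
```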

Recall: Maximum likelihood estimation
Suppose that $X_1, X_2, \ldots, X_n$ is drawn from a zero-mean Gaussian ARMA(p,q) process. The likelihood of parameters $\phi \in \mathbb{R}^p$, $\theta \in \mathbb{R}^q$, $\sigma_w^2 \in \mathbb{R}_+$ is defined as the density of $X = (X_1, X_2, \ldots, X_n)'$ under the Gaussian model with those parameters:
$$L(\phi, \theta, \sigma_w^2) = \frac{1}{(2\pi)^{n/2} |\Gamma_n|^{1/2}} \exp\left( -\frac{1}{2} X' \Gamma_n^{-1} X \right),$$
where $|A|$ denotes the determinant of a matrix $A$, and $\Gamma_n$ is the variance/covariance matrix of $X$ with the given parameter values. The maximum likelihood estimator (MLE) of $\phi, \theta, \sigma_w^2$ maximizes this quantity.

Maximum likelihood estimation
We can simplify the likelihood by expressing it in terms of the innovations. Since the innovations are linear in previous and current values, we can write
$$\underbrace{\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}}_{X} = C \underbrace{\begin{pmatrix} X_1 - X_1^0 \\ \vdots \\ X_n - X_n^{n-1} \end{pmatrix}}_{U},$$
where $C$ is a lower triangular matrix with ones on the diagonal. Take the variance/covariance of both sides to see that
$$\Gamma_n = C D C', \quad \text{where } D = \mathrm{diag}\left(P_1^0, \ldots, P_n^{n-1}\right).$$

Maximum likelihood estimation
Thus,
$$|\Gamma_n| = |C|^2\, P_1^0 \cdots P_n^{n-1} = P_1^0 \cdots P_n^{n-1}$$
and
$$X' \Gamma_n^{-1} X = U' C' \Gamma_n^{-1} C U = U' C' (C')^{-1} D^{-1} C^{-1} C U = U' D^{-1} U.$$
So we can rewrite the likelihood as
$$L(\phi, \theta, \sigma_w^2) = \frac{1}{\left( (2\pi)^n P_1^0 \cdots P_n^{n-1} \right)^{1/2}} \exp\left( -\frac{1}{2} \sum_{i=1}^{n} \frac{\left(X_i - X_i^{i-1}\right)^2}{P_i^{i-1}} \right) = \frac{1}{\left( (2\pi\sigma_w^2)^n\, r_1^0 \cdots r_n^{n-1} \right)^{1/2}} \exp\left( -\frac{S(\phi,\theta)}{2\sigma_w^2} \right),$$
where $r_i^{i-1} = P_i^{i-1}/\sigma_w^2$ and
$$S(\phi, \theta) = \sum_{i=1}^{n} \frac{\left( X_i - X_i^{i-1} \right)^2}{r_i^{i-1}}.$$

Maximum likelihood estimation
The log likelihood of $\phi, \theta, \sigma_w^2$ is
$$l(\phi, \theta, \sigma_w^2) = \log L(\phi, \theta, \sigma_w^2) = -\frac{n}{2}\log(2\pi\sigma_w^2) - \frac{1}{2}\sum_{i=1}^{n} \log r_i^{i-1} - \frac{S(\phi,\theta)}{2\sigma_w^2}.$$
Differentiating with respect to $\sigma_w^2$ shows that the MLE $(\hat\phi, \hat\theta, \hat\sigma_w^2)$ satisfies
$$\frac{n}{2\hat\sigma_w^2} = \frac{S(\hat\phi, \hat\theta)}{2\hat\sigma_w^4}, \quad \text{that is,} \quad \hat\sigma_w^2 = \frac{S(\hat\phi, \hat\theta)}{n},$$
and $\hat\phi, \hat\theta$ minimize
$$\log\left( \frac{S(\hat\phi, \hat\theta)}{n} \right) + \frac{1}{n}\sum_{i=1}^{n} \log r_i^{i-1}.$$

Summary: Maximum likelihood estimation
The MLE $(\hat\phi, \hat\theta, \hat\sigma_w^2)$ satisfies
$$\hat\sigma_w^2 = \frac{S(\hat\phi, \hat\theta)}{n},$$
and $\hat\phi, \hat\theta$ minimize
$$\log\left( \frac{S(\hat\phi, \hat\theta)}{n} \right) + \frac{1}{n}\sum_{i=1}^{n} \log r_i^{i-1},$$
where $r_i^{i-1} = P_i^{i-1}/\sigma_w^2$ and
$$S(\phi, \theta) = \sum_{i=1}^{n} \frac{\left( X_i - X_i^{i-1} \right)^2}{r_i^{i-1}}.$$
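
For a zero-mean AR(1) the quantities in this objective have closed forms: $X_1^0 = 0$, $r_1^0 = 1/(1-\phi^2)$, and for $i \ge 2$, $X_i^{i-1} = \phi X_{i-1}$ with $r_i^{i-1} = 1$. Below is a minimal Python sketch of the reduced objective under these assumptions.

```python
import numpy as np

def ar1_reduced_objective(phi, x):
    """Profile objective log(S/n) + (1/n) * sum(log r_i) for a zero-mean AR(1),
    using X_1^0 = 0, r_1^0 = 1/(1 - phi^2), and X_i^{i-1} = phi * X_{i-1}, r_i^{i-1} = 1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.ones(n)
    r[0] = 1.0 / (1.0 - phi**2)
    pred = np.concatenate([[0.0], phi * x[:-1]])      # one-step predictors X_i^{i-1}
    S = np.sum((x - pred) ** 2 / r)
    return np.log(S / n) + np.mean(np.log(r))
```

Minimizing this one-dimensional function over $\phi \in (-1, 1)$ (for example with scipy.optimize.minimize_scalar) gives $\hat\phi$, and then $\hat\sigma_w^2 = S(\hat\phi)/n$.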

Maximum likelihood estimation
Minimization is done numerically (e.g., Newton-Raphson).
Computational simplifications:
Unconditional least squares: drop the $\log r_i^{i-1}$ terms.
Conditional least squares: also approximate the computation of $X_i^{i-1}$ by dropping initial terms in $S$. For example, for an AR(2), all but the first two terms in $S$ depend linearly on $\phi_1, \phi_2$, so we have a least squares problem.
The differences diminish as the sample size increases. For example, $P_t^{t-1} \to \sigma_w^2$, so $r_t^{t-1} \to 1$, and thus $n^{-1}\sum_i \log r_i^{i-1} \to 0$.
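
A sketch of the conditional least squares simplification for the AR(2) case mentioned above: dropping the initial terms of $S$ leaves an ordinary regression of $X_t$ on $(X_{t-1}, X_{t-2})$.

```python
import numpy as np

def ar2_conditional_ls(x):
    """Conditional least squares for a zero-mean AR(2): drop the first two terms of S
    and treat the rest as a regression of X_t on (X_{t-1}, X_{t-2}), t = 3..n."""
    x = np.asarray(x, dtype=float)
    D = np.column_stack([x[1:-1], x[:-2]])        # rows (X_{t-1}, X_{t-2})
    y = x[2:]                                     # responses X_t
    phi_hat, *_ = np.linalg.lstsq(D, y, rcond=None)
    sigma2_hat = np.mean((y - D @ phi_hat) ** 2)  # residual variance estimate
    return phi_hat, sigma2_hat
```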

Maximum likelihood estimation: Confidence intervals
For an ARMA(p,q) process, the MLE and the un/conditional least squares estimators satisfy
$$\begin{pmatrix} \hat\phi - \phi \\ \hat\theta - \theta \end{pmatrix} \sim AN\left( 0, \frac{\sigma_w^2}{n} \begin{pmatrix} \Gamma_{\phi\phi} & \Gamma_{\phi\theta} \\ \Gamma_{\theta\phi} & \Gamma_{\theta\theta} \end{pmatrix}^{-1} \right),$$
where
$$\begin{pmatrix} \Gamma_{\phi\phi} & \Gamma_{\phi\theta} \\ \Gamma_{\theta\phi} & \Gamma_{\theta\theta} \end{pmatrix} = \mathrm{Cov}\left( (X, Y), (X, Y) \right),$$
with $X = (X_1, \ldots, X_p)'$ from the AR process $\phi(B)X_t = W_t$, and $Y = (Y_1, \ldots, Y_q)'$ from the AR process $\theta(B)Y_t = W_t$.

Introduction to Time Series Analysis. Lecture 12.
1. Review: Yule-Walker estimators
2. Yule-Walker example
3. Efficiency
4. Maximum likelihood estimation
5. Large-sample distribution of MLE