Module 3. Descriptive Time Series Statistics and Introduction to Time Series Models


Module 3: Descriptive Time Series Statistics and Introduction to Time Series Models

Class notes for Statistics 451: Applied Time Series, Iowa State University.
Copyright 2015 W. Q. Meeker. November 11, 2015.

Module 3, Segment 1: Populations, Processes, and Descriptive Time Series Statistics

Populations

- A population is a collection of identifiable units (or a specified characteristic of those units).
- A frame is a listing of the units in the population.
- We take a sample from the frame and use the resulting data to make inferences about the population.
- Simple random sampling with replacement from a population implies independent and identically distributed (iid) observations.
- Standard statistical methods use the iid assumption for observations or for residuals (in regression).

Processes

- A process is a system that transforms inputs into outputs, operating over time.
- A process generates a sequence of observations (data) over time.
- We can use a realization from the process to make inferences about the process.
- The iid model is rarely appropriate for processes: observations close together in time are typically correlated, and the process often changes with time.

Process, Realization, Model, Inference, and Applications

Process (generates data)
  -> Realization z_1, z_2, ..., z_n
  -> Statistical model for the process (stochastic process model)
  -> Estimates for model parameters
  -> Inferences, forecasts, control

Stationary Stochastic Processes

- Z_t is a stochastic process. Some properties of Z_t include the mean μ_t, the variance σ²_t, and the autocorrelation ρ_{Z_t, Z_{t+k}}. In general, these can change over time.
- Strongly stationary (also strictly or completely stationary): F(z_{t_1}, ..., z_{t_n}) = F(z_{t_1 + k}, ..., z_{t_n + k}). Difficult to check.
- Weakly stationary (or second-order, or covariance stationary) requires only that μ = μ_t and σ² = σ²_t be constant and that ρ_k = ρ_{Z_t, Z_{t+k}} depend only on k. Easy to check with sample statistics.
- Generally, "stationary" is understood to mean weakly stationary.

Change in Business Inventories, 1955-1969: Stationary?

[Time series plot: Billions of Dollars (-5 to 15) versus Year (1955-1969)]

Estimation of Stationary Process Parameters: Some Notation

Process parameter        Notation                Estimate
Mean of Z                μ_Z = E(Z)              μ̂_Z = z̄ = (1/n) Σ_{t=1}^{n} z_t
Variance of Z            γ_0 = σ²_Z = Var(Z)     γ̂_0 = (1/n) Σ_{t=1}^{n} (z_t - z̄)²
Standard deviation of Z  σ_Z                     σ̂_Z = sqrt(γ̂_0)
Autocovariance           γ_k                     γ̂_k = (1/n) Σ_{t=1}^{n-k} (z_t - z̄)(z_{t+k} - z̄)
Autocorrelation          ρ_k = γ_k / γ_0         ρ̂_k = γ̂_k / γ̂_0
Variance of z̄            σ²_z̄ = Var(z̄)           S²_z̄ = (γ̂_0/n) [Σ_{k=-(n-1)}^{n-1} (1 - |k|/n) ρ̂_k]

Change in Business Inventories, 1955-1969, with z̄ and z̄ ± 3σ̂_Z

[Time series plot: Billions of Dollars (-5 to 15) versus Year (1955-1969), with horizontal reference lines at z̄ and z̄ ± 3σ̂_Z]

Variance of z̄ with Correlated Data

The variance of z̄ with correlated data,

    σ²_z̄ = Var(z̄) = (γ_0/n) Σ_{k=-(n-1)}^{n-1} (1 - |k|/n) ρ_k,

is more complicated than with iid data, where

    σ²_z̄ = Var(z̄) = γ_0/n = σ²/n.

Thus if one incorrectly uses the iid formula when the autocorrelations are positive, the variance is underestimated, potentially leading to false conclusions.
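As a numerical illustration of this formula, the sketch below compares the exact Var(z̄) with the iid formula. The assumption ρ_k = φ^|k| (an AR(1)-type autocorrelation) and the values φ = 0.8, γ_0 = 1, n = 100 are purely illustrative, not taken from the slides:

```python
# A sketch (not from the slides): exact Var(z-bar) under autocorrelation
# versus the iid formula. Illustrative assumption: rho_k = phi**|k|, as in
# an AR(1)-type process.

def var_zbar(gamma0, rho, n):
    """Var(zbar) = (gamma0/n) * sum_{k=-(n-1)}^{n-1} (1 - |k|/n) * rho(|k|)."""
    return (gamma0 / n) * sum((1 - abs(k) / n) * rho(abs(k))
                              for k in range(-(n - 1), n))

phi, gamma0, n = 0.8, 1.0, 100          # illustrative values
correlated = var_zbar(gamma0, lambda k: phi**k, n)
iid = gamma0 / n                        # the iid formula drops the rho_k terms

print(f"correlated Var(zbar) = {correlated:.5f}")   # about 0.086
print(f"iid formula          = {iid:.5f}")          # 0.01000
```

With this much positive autocorrelation the iid formula understates Var(z̄) by nearly an order of magnitude, which is the point made above.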

Module 3, Segment 2: Correlation and Autocorrelation

Correlation [from Statistics 101]

Consider paired data on y and x (e.g., sales and advertising): (x_1, y_1), (x_2, y_2), ..., (x_n, y_n).

ρ_{x,y} denotes the population correlation between x and y. To estimate ρ_{x,y}, we use the sample correlation

    ρ̂_{x,y} = Σ_{i=1}^{n} (x_i - x̄)(y_i - ȳ) / sqrt[ Σ_{i=1}^{n} (x_i - x̄)² Σ_{i=1}^{n} (y_i - ȳ)² ],

where -1 ≤ ρ̂_{x,y} ≤ 1.

Sample Autocorrelation

Consider the random time series realization z_1, z_2, ..., z_n and its lagged versions:

t      z_t      z_{t+1}   z_{t+2}   z_{t+3}   z_{t+4}
1      z_1      z_2       z_3       z_4       z_5
2      z_2      z_3       z_4       z_5       z_6
3      z_3      z_4       z_5       z_6       z_7
...
n-2    z_{n-2}  z_{n-1}   z_n
n-1    z_{n-1}  z_n
n      z_n

Assuming weak (covariance) stationarity, let ρ_k denote the process correlation between observations separated by k time periods. The order-k sample autocorrelation (i.e., the sample correlation between z_t and z_{t+k}) is

    ρ̂_k = γ̂_k / γ̂_0 = Σ_{t=1}^{n-k} (z_t - z̄)(z_{t+k} - z̄) / Σ_{t=1}^{n} (z_t - z̄)²,   k = 0, 1, 2, ...

Note: -1 ≤ ρ̂_k ≤ 1.

Sample Autocorrelation (alternative formula)

Consider the random time series realization z_1, z_2, ..., z_n and its lagged versions:

t    z_t    z_{t-1}   z_{t-2}   z_{t-3}   z_{t-4}
1    z_1
2    z_2    z_1
3    z_3    z_2       z_1
4    z_4    z_3       z_2       z_1
...
n    z_n    z_{n-1}   z_{n-2}   z_{n-3}   z_{n-4}

Assuming weak (covariance) stationarity, let ρ_k denote the process correlation between observations separated by k time periods. The order-k sample autocorrelation (i.e., the sample correlation between z_t and z_{t-k}) is

    ρ̂_k = γ̂_k / γ̂_0 = Σ_{t=k+1}^{n} (z_t - z̄)(z_{t-k} - z̄) / Σ_{t=1}^{n} (z_t - z̄)²,   k = 0, 1, 2, ...

Note: -1 ≤ ρ̂_k ≤ 1.
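A minimal sketch of the two sample-ACF computations in Python confirms that they are algebraically identical. The data values are a short illustrative series (the opening sunspot numbers as commonly reproduced; any numeric series would do):

```python
# Both sample-autocorrelation formulas, side by side; they differ only in
# how the sum is indexed and give identical values.

def acf(z, k):
    """rho_hat_k via sum_{t=1}^{n-k} (z_t - zbar)(z_{t+k} - zbar)."""
    n = len(z)
    zbar = sum(z) / n
    c0 = sum((x - zbar) ** 2 for x in z)
    return sum((z[t] - zbar) * (z[t + k] - zbar) for t in range(n - k)) / c0

def acf_alt(z, k):
    """rho_hat_k via sum_{t=k+1}^{n} (z_t - zbar)(z_{t-k} - zbar)."""
    n = len(z)
    zbar = sum(z) / n
    c0 = sum((x - zbar) ** 2 for x in z)
    return sum((z[t] - zbar) * (z[t - k] - zbar) for t in range(k, n)) / c0

# A short illustrative series.
z = [101, 82, 66, 35, 31, 7, 20, 92, 154, 125]
for k in range(4):
    assert abs(acf(z, k) - acf_alt(z, k)) < 1e-12
print([round(acf(z, k), 3) for k in range(1, 4)])
```

Note that both versions divide by the full-sample sum of squares Σ(z_t - z̄)², so ρ̂_0 = 1 exactly and |ρ̂_k| ≤ 1.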

Wolfer Sunspot Numbers, 1770-1869

[Time series plot of the annual sunspot numbers, vertical scale 0-100]

Lagged Sunspot Data

t      Spot   Spot1  Spot2  Spot3  Spot4
1      101
2      82     101
3      66     82     101
4      35     66     82     101
5      31     35     66     82     101
...
99     37     7      16     30     47
100    74     37     7      16     30
101           74     37     7      16
102                  74     37     7
103                         74     37
104                                74

Wolfer Sunspot Numbers: Correlation Between Observations Separated by One Time Period

[Scatterplot of spotts[-1] versus spotts[-100]]

Correlation = 0.81708
Autocorrelation = 0.806244

Wolfer Sunspot Numbers: Correlation Between Observations Separated by k Time Periods [showacf(spotts)]

[Matrix of scatterplots of the series versus its lag-1 through lag-13 values, plus the sample ACF of the series for lags 0-20]

Wolfer Sunspot Numbers: Sample ACF

[Plot of the sample ACF of spotts for lags 0-20]

Module 3, Segment 3: Autoregression, Autoregressive Modeling, and Introduction to Partial Autocorrelation

Wolfer Sunspot Numbers, 1770-1869

[Time series plot of the annual sunspot numbers, vertical scale 0-100]

Autoregressive (AR) Models

AR(0): z_t = μ + a_t   (white noise, or "trivial" model)
AR(1): z_t = θ_0 + φ_1 z_{t-1} + a_t
AR(2): z_t = θ_0 + φ_1 z_{t-1} + φ_2 z_{t-2} + a_t
AR(3): z_t = θ_0 + φ_1 z_{t-1} + φ_2 z_{t-2} + φ_3 z_{t-3} + a_t
AR(p): z_t = θ_0 + φ_1 z_{t-1} + φ_2 z_{t-2} + ... + φ_p z_{t-p} + a_t

where a_t ~ nid(0, σ²_a).
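To make the AR(p) recursion concrete, here is a short simulation sketch of a hypothetical AR(2) process. The parameter values (θ_0 = 15, φ_1 = 1.3, φ_2 = -0.7, σ_a = 10) are illustrative choices inside the stationarity region, not estimates from the slides:

```python
# Simulation sketch of a hypothetical AR(2) process
#   z_t = theta0 + phi1*z_{t-1} + phi2*z_{t-2} + a_t,  a_t ~ nid(0, sigma_a^2).
# All parameter values below are illustrative, not estimates from the slides.
import random

def simulate_ar2(theta0, phi1, phi2, sigma_a, n, seed=451, burn=200):
    rng = random.Random(seed)
    z = [0.0, 0.0]                       # arbitrary start-up values
    for _ in range(n + burn):
        a = rng.gauss(0.0, sigma_a)      # a_t ~ N(0, sigma_a^2), independent
        z.append(theta0 + phi1 * z[-1] + phi2 * z[-2] + a)
    return z[-n:]                        # drop burn-in so start-up washes out

# phi1 = 1.3, phi2 = -0.7 lies inside the AR(2) stationarity region and
# produces pseudo-cyclic behavior of the kind seen in the sunspot series.
z = simulate_ar2(theta0=15.0, phi1=1.3, phi2=-0.7, sigma_a=10.0, n=100)
print(len(z), round(sum(z) / len(z), 1))
```

The burn-in period is a standard device: the arbitrary start-up values are forgotten geometrically fast for a stationary AR model, so discarding the first couple hundred values leaves a realization close to the stationary process.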

Data Analysis Strategy

1. Tentative identification: time series plot, range-mean plot, ACF and PACF.
2. Estimation: least squares or maximum likelihood.
3. Diagnostic checking: residual analysis and forecasts. Model OK? If no, return to tentative identification.
4. Use the model: forecasting, explanation, control.

AR(1) Model for the Wolfer Sunspot Data

[Panels: data and 1-step-ahead predictions versus year number; residuals versus year number; sample ACF of residuals(spotar1fit); normal Q-Q plot of the residuals]

AR(2) Model for the Wolfer Sunspot Data

[Panels: data and 1-step-ahead predictions versus year number; residuals versus year number; sample ACF of residuals(spotar2fit); normal Q-Q plot of the residuals]

AR(3) Model for the Wolfer Sunspot Data

[Panels: data and 1-step-ahead predictions versus year number; residuals versus year number; sample ACF of residuals(spotar3fit); normal Q-Q plot of the residuals]

AR(4) Model for the Wolfer Sunspot Data

[Panels: data and 1-step-ahead predictions versus year number; residuals versus year number; sample ACF of residuals(spotar4fit); normal Q-Q plot of the residuals]

Sample Partial Autocorrelation Function

The sample PACF φ̂_kk can be computed from the sample ACF ρ̂_1, ρ̂_2, ... using the recursive formula from Durbin (1960):

    φ̂_{1,1} = ρ̂_1

    φ̂_{k,k} = [ρ̂_k - Σ_{j=1}^{k-1} φ̂_{k-1,j} ρ̂_{k-j}] / [1 - Σ_{j=1}^{k-1} φ̂_{k-1,j} ρ̂_j],   k = 2, 3, ...

where

    φ̂_{k,j} = φ̂_{k-1,j} - φ̂_{k,k} φ̂_{k-1,k-j}   (k = 3, 4, ...; j = 1, 2, ..., k-1).

Thus the PACF contains no new information; it is just a different way of looking at the information in the ACF. This sample PACF formula is based on the solution of the Yule-Walker equations (covered in Module 5).
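The recursion can be sketched directly in Python. As a sanity check, for an AR(1)-type ACF ρ_k = φ^k the implied PACF should equal φ at lag 1 and vanish (numerically) at higher lags; the value φ = 0.8 below is illustrative:

```python
# The Durbin (1960) recursion: sample PACF phi_hat_kk from the sample ACF.

def pacf_durbin(rho, kmax):
    """rho[k] is the (sample) ACF at lag k, with rho[0] = 1.
    Returns [None, phi_11, phi_22, ..., phi_{kmax,kmax}]."""
    phi = {1: {1: rho[1]}}
    pacf = [None, rho[1]]
    for k in range(2, kmax + 1):
        num = rho[k] - sum(phi[k - 1][j] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1][j] * rho[j] for j in range(1, k))
        phi_kk = num / den
        phi[k] = {j: phi[k - 1][j] - phi_kk * phi[k - 1][k - j]
                  for j in range(1, k)}
        phi[k][k] = phi_kk
        pacf.append(phi_kk)
    return pacf

# Sanity check with an AR(1)-type ACF rho_k = phi**k (phi = 0.8 illustrative):
# the PACF should be phi at lag 1 and numerically zero at higher lags.
rho = [0.8**k for k in range(6)]
pacf = pacf_durbin(rho, 5)
print([round(v, 6) for v in pacf[1:]])
```

Feeding in the lag-1 and lag-2 sunspot ACF values tabulated later in this module (0.8062 and 0.4281) reproduces the tabulated lag-2 partial autocorrelation of about -0.6341.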

Summary of Sunspot Autoregressions

             Regression output                      PACF
Model    p   R²       S       φ̂_p      t_{φ̂_p}     φ̂_pp
AR(1)    1   0.6676   21.53   0.810    13.96       0.8062
AR(2)    2   0.8336   15.32   -0.711   -9.81       -0.6341
AR(3)    3   0.8407   15.12   0.208    2.04        0.0805
AR(4)    4   0.8463   15.01   0.147    1.41        0.0611

φ̂_p is from ordinary least squares (OLS); φ̂_pp is from the Durbin formula.

Sample Partial Autocorrelation

The true partial autocorrelation function, denoted by φ_kk for k = 1, 2, ..., is the process correlation between observations separated by k time periods (i.e., between z_t and z_{t+k}) with the effect of the intermediate z_{t+1}, ..., z_{t+k-1} removed.

We can estimate φ_kk, k = 1, 2, ..., with φ̂_kk:

- φ̂_{1,1} = φ̂_1 from the AR(1) model
- φ̂_{2,2} = φ̂_2 from the AR(2) model
- φ̂_{k,k} = φ̂_k from the AR(k) model

The Durbin formula gives somewhat different answers because of the basis of the estimator (ordinary least squares versus the solution of the Yule-Walker equations).
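The AR(k)-fitting route can be sketched as follows, using a small pure-Python OLS solver on simulated AR(1) data. Everything here (the φ = 0.7 data-generating value, the seed, the series length) is an illustrative assumption, not material from the slides:

```python
# Estimating phi_kk as the last coefficient of an OLS-fitted AR(k) model,
# the second route described above. Pure-Python OLS on illustrative
# simulated AR(1) data (phi = 0.7).
import random

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    p = len(X[0])
    A = [[sum(row[r] * row[c] for row in X) for c in range(p)] for r in range(p)]
    b = [sum(row[r] * yi for row, yi in zip(X, y)) for r in range(p)]
    for col in range(p):                          # forward elimination
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * p
    for r in range(p - 1, -1, -1):                # back substitution
        s = sum(A[r][c] * coef[c] for c in range(r + 1, p))
        coef[r] = (b[r] - s) / A[r][r]
    return coef

def pacf_ols(z, k):
    """phi_hat_kk = last AR coefficient of an AR(k) fit with intercept."""
    X = [[1.0] + [z[t - j] for j in range(1, k + 1)] for t in range(k, len(z))]
    y = [z[t] for t in range(k, len(z))]
    return ols(X, y)[-1]

rng = random.Random(1)
z = [0.0]
for _ in range(600):                              # AR(1) with phi = 0.7
    z.append(0.7 * z[-1] + rng.gauss(0.0, 1.0))
z = z[101:]                                       # 500 values after burn-in
print(round(pacf_ols(z, 1), 3), round(pacf_ols(z, 2), 3))
```

For AR(1) data the lag-1 estimate should be near 0.7 and the higher-lag estimates near zero, mirroring the behavior the Durbin recursion exhibits for an AR(1)-type ACF.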

Module 3, Segment 4: Introduction to ARMA Models and Time Series Modeling

Autoregressive-Moving Average (ARMA) Models (aka Box-Jenkins Models)

AR(0): z_t = μ + a_t   (white noise, or "trivial" model)
AR(1): z_t = θ_0 + φ_1 z_{t-1} + a_t
AR(2): z_t = θ_0 + φ_1 z_{t-1} + φ_2 z_{t-2} + a_t
AR(p): z_t = θ_0 + φ_1 z_{t-1} + ... + φ_p z_{t-p} + a_t

MA(1): z_t = θ_0 - θ_1 a_{t-1} + a_t
MA(2): z_t = θ_0 - θ_1 a_{t-1} - θ_2 a_{t-2} + a_t
MA(q): z_t = θ_0 - θ_1 a_{t-1} - ... - θ_q a_{t-q} + a_t

ARMA(1,1): z_t = θ_0 + φ_1 z_{t-1} - θ_1 a_{t-1} + a_t
ARMA(p,q): z_t = θ_0 + φ_1 z_{t-1} + ... + φ_p z_{t-p} - θ_1 a_{t-1} - ... - θ_q a_{t-q} + a_t

where a_t ~ nid(0, σ²_a).
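As a quick numerical illustration of the MA(1) line above, the sketch below simulates z_t = θ_0 - θ_1 a_{t-1} + a_t and checks the standard MA(1) property ρ_1 = -θ_1/(1 + θ_1²), with ρ_k = 0 for k ≥ 2 (the ACF "cuts off"). The choice θ_1 = 0.6 and the other settings are illustrative:

```python
# MA(1) sketch: z_t = theta0 - theta1*a_{t-1} + a_t. A standard MA(1)
# property is rho_1 = -theta1/(1 + theta1**2) and rho_k = 0 for k >= 2
# (the ACF "cuts off"). theta1 = 0.6 and the other settings are illustrative.
import random

def simulate_ma1(theta0, theta1, sigma_a, n, seed=7):
    rng = random.Random(seed)
    a = [rng.gauss(0.0, sigma_a) for _ in range(n + 1)]
    return [theta0 - theta1 * a[t - 1] + a[t] for t in range(1, n + 1)]

def acf(z, k):
    """Sample autocorrelation at lag k (full-sample divisor)."""
    zbar = sum(z) / len(z)
    c0 = sum((x - zbar) ** 2 for x in z)
    return sum((z[t] - zbar) * (z[t + k] - zbar) for t in range(len(z) - k)) / c0

theta1 = 0.6
z = simulate_ma1(theta0=0.0, theta1=theta1, sigma_a=1.0, n=2000)
print("rho_1 sample:", round(acf(z, 1), 3),
      " theory:", round(-theta1 / (1 + theta1**2), 3))
print("rho_5 sample:", round(acf(z, 5), 3), " theory: 0")
```

This cutoff behavior is exactly what the identification table later in this module exploits to distinguish MA(q) models from AR and ARMA models.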

Graphical Output from RTSERIES Function iden(spottsd)

Wolfer Sunspots, w = Number of Spots

[Panels: time series plot of w versus time (1780-1860); range-mean plot (range versus mean); sample ACF and sample PACF for lags 0-30]

Tabular Output from RTSERIES Function iden(spottsd)

Identification output for Wolfer Sunspots, w = Number of Spots

[1] "Standard deviation of the working series = 37.36504"

ACF
  Lag   ACF           se          t-ratio
  1     0.806243956   0.1000000   8.06243956
  2     0.428105325   0.1516594   2.82280693
  3     0.069611110   0.1632975   0.42628402

Partial ACF
  Lag   Partial ACF    se    t-ratio
  1     0.806243896    0.1   8.06243896
  2     -0.634121358   0.1   -6.34121358

Standard Errors for Sample Autocorrelations and Sample Partial Autocorrelations

Sample ACF standard error:

    S_{ρ̂_k} = sqrt[(1 + 2ρ̂²_1 + ... + 2ρ̂²_{k-1}) / n]

One can also compute the t-like statistics t = ρ̂_k / S_{ρ̂_k}.

Sample PACF standard error:

    S_{φ̂_kk} = 1 / sqrt(n)

Use this to compute t-like statistics t = φ̂_kk / S_{φ̂_kk} and to set decision bounds.

In long realizations from a stationary process, if the true parameter is really 0, then the distribution of a t-like statistic can be approximated by N(0, 1). Values of ρ̂_k and φ̂_kk may be judged to be different from zero if the t-like statistics are outside specified limits (±2 is often suggested; one might use ±1.5 as a "warning" limit).
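These formulas can be checked against the iden() table shown earlier. The sketch below (plain Python; the ρ̂ values and n = 100 are copied from that tabular output) reproduces the tabulated standard errors and t-ratios:

```python
# Reproducing the Bartlett-style ACF standard errors and t-like statistics
# from the tabular iden() output above (n = 100 sunspot observations; the
# rho_hat values are copied from that table).
import math

def se_acf(rho_hat, k, n):
    """S_{rho_k} = sqrt((1 + 2*rho_1^2 + ... + 2*rho_{k-1}^2) / n)."""
    return math.sqrt((1 + 2 * sum(r * r for r in rho_hat[1:k])) / n)

n = 100
rho_hat = [1.0, 0.806243956, 0.428105325, 0.069611110]
for k in (1, 2, 3):
    se = se_acf(rho_hat, k, n)
    print(f"k={k}: se={se:.7f}  t={rho_hat[k] / se:.4f}")

se_pacf = 1 / math.sqrt(n)        # PACF standard error, 1/sqrt(n) = 0.1
print("PACF se:", se_pacf)
```

Running this recovers the se column of the table (0.1000000, 0.1516594, 0.1632975), which is a useful consistency check on both the formula and the restored table values.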

Drawing Conclusions from Sample ACFs and PACFs

                            PACF dies down    PACF cuts off after p ≥ 1
ACF dies down               ARMA              AR(p)
ACF cuts off after q ≥ 1    MA(q)             Impossible

Technical details, background, and numerous examples will be presented in Modules 4 and 5.

Comments on Drawing Conclusions from Sample ACF s and PACF s The t-like statistics should only be used as guidelines in model building because: An ARMA model is only an approximation to some true process Sampling distributions are complicated; the t-like statistics do not really follow a standard normal distribution (only approximately, in large samples, with a stationary process) Correlations among the ρ k values for different k (eg, ρ 1 and ρ 2 may be correlated) Problems relating to simultaneous inference (looking at many different statistics simultaneously, confidence/significance levels have little meaning) 3-37