Homework 4

1 Data analysis problems

This week we will be analyzing a number of data sets. We are going to build ARIMA models using the steps outlined in class. It is also a good idea to read Section 3.8 of our textbook. Make sure to outline the steps used in analyzing the data. If there are two (or more) competing models, make sure you discuss each of them. Decide which model you think is best and support this decision with plots and other information. I suggest using the sarima() function to do these fits. Here are short summaries of the data sets to analyze.

1. The data in cow.dat are the daily morning temperature readings for a cow.

[Sol] It is better to take the logarithm of the data before performing the analysis, but it is also acceptable if students do not take the logarithm. In either case, students need to difference the data.

[ If students take the logarithm of the data ] The ACF cuts off after lag 1 (q = 1) and the PACF tails off, which implies MA(1). It is also possible that both the ACF and PACF tail off exponentially, so that (p, q) = (1, 1). So we have two possible models, ARIMA(0,1,1) and ARIMA(1,1,1). After fitting these two models, we find that the first model can be rejected by looking at the p-value plot in the diagnostic output. The second model, ARIMA(1,1,1), seems to fit the data fairly well: it passes all the diagnostic tests and has the smaller AIC value (-2.88). The AR coefficient is not significant at the 5% level (0.1815 with s.e. 0.1161, about 1.6 standard errors from zero), but it is borderline, the diagnostics of ARIMA(1,1,1) are better, and its AIC is smaller than that of ARIMA(0,1,1). Thus, we conclude that ARIMA(1,1,1) is the best model. Here is the output:

          ar1     ma1  constant
       0.1815  -1.000   -0.0043
s.e.   0.1161   0.076    0.0009

sigma^2 estimated as 0.01916: log likelihood = 39.36, aic = -70.71
[1] -2.875046
[1] -2.840761
[1] -3.782347

[ If students do not take the logarithm ] Obviously ACF cuts off after the first lag and PACF tails off. We can try other models but I find ARIMA(0,1,1) is the best model. Here is the output:

          ma1  constant
      -0.8724   -0.2380
s.e.   0.0983    0.1339

sigma^2 estimated as 65.06: log likelihood = -260.21, aic = 526.41
[1] 5.22869

[1] 5.259864
[1] 4.29049

2. The data in sheep.dat are the sheep population (in millions) for England and Wales from 1867-1939.

[Sol] The time series plot indicates non-stationarity, so differencing is needed. The patterns in the ACF and PACF plots are not clear. The ACF and PACF plots indicate q = 4 and p = 3, respectively. It is also possible that both the ACF and PACF tail off exponentially, and hence (p, q) = (1, 1). So there are three possible models: ARIMA(3,1,0), ARIMA(0,1,4) and ARIMA(1,1,1). After fitting these models, ARIMA(3,1,0) and ARIMA(0,1,4) seem to be reasonable because both models pass all the diagnostic tests provided by the sarima function. If one is concerned with the normality assumption, ARIMA(3,1,0) appears to be better, but ARIMA(0,1,4) has the smallest AIC value. Here is the output:

## ARIMA(3,1,0)
          ar1      ar2      ar3  constant
       0.4134  -0.2044  -0.3115   -5.8649
s.e.   0.1192   0.1357   0.1241    7.4555

sigma^2 estimated as 4742: log likelihood = -407.26, aic = 824.51
[1] 9.573821
[1] 9.613486

[1] 8.699326

## ARIMA(0,1,4)
          ma1      ma2      ma3      ma4
       0.3109  -0.2357  -0.5094  -0.5657
s.e.   0.1190   0.1085   0.1229   0.1169
       constant
        -7.5410
s.e.     1.3763

sigma^2 estimated as 4348: log likelihood = -405.72, aic = 823.45
[1] 9.514487
[1] 9.559319
[1] 8.671367

3. The data in bicoal.dat are the annual bituminous coal production levels in the US from 1930-1968 in millions of net tons per year.

[Sol] The time series plot indicates non-stationarity, so differencing is needed. The patterns in the ACF and PACF plots are not clear, and both seem to cut off. The ACF and PACF plots indicate q = 2 and p = 2, respectively. It is also possible that both the ACF and PACF tail off exponentially, and hence (p, q) = (1, 1). So there are four possible models: ARIMA(2,1,0), ARIMA(0,1,2), ARIMA(1,1,1) and ARIMA(2,1,2). After fitting these models, ARIMA(0,1,2) and ARIMA(2,1,1) seem to be reasonable: both have smaller AIC values than the other models and pass all the diagnostic tests. Note that ARIMA(2,1,1) is obtained from ARIMA(2,1,2)

because the second coefficient of MA part was not significant. Here is the output:

## ARIMA(0,1,2)
          ma1      ma2  constant
       0.0985  -0.4666    0.0641
s.e.   0.1312   0.1355    5.1830

sigma^2 estimated as 3051: log likelihood = -260.93, aic = 529.86
[1] 9.145703
[1] 9.205072
[1] 8.261529

## ARIMA(2,1,1)
          ar1      ar2     ma1  constant
      -0.6653  -0.4345  0.6713    0.5941
s.e.   0.1946   0.1442  0.1729    6.1946

sigma^2 estimated as 2859: log likelihood = -259.45, aic = 528.9
[1] 9.121623

[1] 9.190916
[1] 8.276058

Other answers are acceptable if a reasonable explanation is given.

2 Theoretical problems

1. Identify the following as a specific ARIMA model. That is, what are $p$, $d$ ($\ge 0$) and $q$, and what are the values of the parameters?

$$x_t = x_{t-1} - 0.25 x_{t-2} + w_t - 0.1 w_{t-1},$$

where $w_t \sim \mathrm{wn}(0, \sigma_w^2)$. We need to check the causal and invertible conditions (for the causality condition, use the results of Example 3.8 in the textbook).

[Sol] This looks like an ARMA(2,1) model with $\phi_1 = 1$, $\phi_2 = -0.25$ and $\theta_1 = -0.1$. We need to check the stationarity conditions: $\phi_1 + \phi_2 = 0.75 < 1$, $\phi_2 - \phi_1 = -1.25 < 1$ and $|\phi_2| = 0.25 < 1$. The process is also invertible because $|\theta_1| = 0.1 < 1$. So the process is a stationary and invertible ARMA(2,1), that is, $p = 2$, $d = 0$ and $q = 1$.

2. Compute the values of $E(\nabla x_t)$ and $\mathrm{Var}(\nabla x_t)$ in the following ARIMA model:

$$x_t = 10 + 1.25 x_{t-1} - 0.25 x_{t-2} + w_t - 0.1 w_{t-1},$$

where $w_t \sim \mathrm{wn}(0, \sigma_w^2)$ (Hint: for $\mathrm{Var}(\nabla x_t)$, use the formula in Example 3.13 of the textbook).

[Sol] Subtracting $x_{t-1}$ from both sides gives

$$\nabla x_t = x_t - x_{t-1} = 10 + 0.25 (x_{t-1} - x_{t-2}) + w_t - 0.1 w_{t-1}.$$

So the model is a causal and invertible ARIMA(1,1,1) with $\phi_1 = 0.25$ and $\theta_1 = -0.1$. Hence

$$E(\nabla x_t) = \frac{10}{1 - 0.25} = \frac{40}{3}$$

and

$$\mathrm{Var}(\nabla x_t) = \frac{1 + 2\phi_1\theta_1 + \theta_1^2}{1 - \phi_1^2}\,\sigma_w^2 = 1.024\,\sigma_w^2.$$

3. Suppose that $x_t$ is generated according to

$$x_t = w_t + c w_{t-1} + c w_{t-2} + c w_{t-3} + \cdots + c w_0 \quad \text{for } t > 0,$$

where $w_t \sim \mathrm{wn}(0, \sigma_w^2)$.

1) Find the mean and autocovariance functions for $x_t$, that is, $E(x_t)$ and $\mathrm{cov}(x_t, x_s)$ for $t < s$. Is $\{x_t\}$ stationary?

[Sol] $E(x_t) = 0$ and $\mathrm{Var}(x_t) = (1 + t c^2)\sigma_w^2$, which, in general, varies with $t$. Assume that $t < s$. Then

$$\mathrm{cov}(x_t, x_s) = \mathrm{cov}(w_t + c w_{t-1} + c w_{t-2} + \cdots + c w_0,\; w_s + c w_{s-1} + c w_{s-2} + \cdots + c w_0) = (c + c^2 t)\sigma_w^2 = c(1 + ct)\sigma_w^2.$$

Because the variance and the autocovariance depend on $t$, $\{x_t\}$ is not stationary unless $c = 0$.

2) Is $\nabla x_t$ stationary? Why or why not?

[Sol]

$$\nabla x_t = (w_t + c w_{t-1} + c w_{t-2} + \cdots + c w_0) - (w_{t-1} + c w_{t-2} + c w_{t-3} + \cdots + c w_0) = w_t - (1 - c) w_{t-1}.$$

So this is stationary for any value of $c$.

3) Identify $x_t$ as a specific ARIMA process.

[Sol] The process $\{\nabla x_t\}$ is an MA(1) process, so $\{x_t\}$ is an ARIMA(0,1,1) process with $\theta_1 = c - 1$. The $\{\nabla x_t\}$ process is invertible if $|\theta_1| = |c - 1| < 1$, that is, if $0 < c < 2$.
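The numerical claims in the theoretical problems above can be checked with a few lines of code. What follows is only a minimal sketch in plain Python (the course itself works in R); the helper names quadratic_roots, weights, cov_over_sigma2 and diff_weights are made up for this illustration and come from neither the textbook nor any package. It checks that the AR and MA polynomial roots in Problem 1 lie outside the unit circle, recomputes the mean and variance in Problem 2, and verifies the covariance and differencing results in Problem 3 directly from the noise coefficients, with the example values c = 0.5 and c = 0.4 chosen arbitrarily.

```python
import cmath


def quadratic_roots(a, b, c):
    """Roots of a*z^2 + b*z + c = 0, assuming a != 0."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))


# Problem 1: phi(z) = 1 - z + 0.25 z^2 and theta(z) = 1 - 0.1 z.
ar_roots = quadratic_roots(0.25, -1.0, 1.0)
ma_root = 1 / 0.1  # the single root of 1 - 0.1 z
causal = all(abs(r) > 1 for r in ar_roots)
invertible = abs(ma_root) > 1

# Problem 2: the differenced series is ARMA(1,1) with a constant term.
phi, theta, delta = 0.25, -0.1, 10.0
mean_diff = delta / (1 - phi)                                  # E(grad x_t)
var_ratio = (1 + 2 * phi * theta + theta ** 2) / (1 - phi ** 2)  # Var / sigma_w^2


# Problem 3: represent x_t by its coefficients on w_0, ..., w_t.
def weights(t, c):
    """Coefficients of w_0..w_t in x_t: c on w_0..w_{t-1}, 1 on w_t."""
    return [c] * t + [1.0]


def cov_over_sigma2(t, s, c):
    """cov(x_t, x_s) / sigma_w^2: sum of coefficient products on shared noise terms."""
    return sum(a * b for a, b in zip(weights(t, c), weights(s, c)))


def diff_weights(t, c):
    """Coefficients of w_0..w_t in x_t - x_{t-1}, for t >= 2."""
    prev = weights(t - 1, c) + [0.0]  # x_{t-1} has no w_t term
    return [a - b for a, b in zip(weights(t, c), prev)]


if __name__ == "__main__":
    print("AR roots:", ar_roots, "causal:", causal)
    print("MA root:", ma_root, "invertible:", invertible)
    print("E(grad x_t):", mean_diff, " Var/sigma^2:", var_ratio)
    print("cov(x_3, x_7)/sigma^2 with c = 0.5:", cov_over_sigma2(3, 7, 0.5))
    print("weights of grad x_5 with c = 0.4:", diff_weights(5, 0.4))
```

In the last printed list, only the final two coefficients are nonzero, namely c - 1 on w_{t-1} and 1 on w_t, which is the MA(1) form found in part 2) of Problem 3.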