Homework 5, Problem 1 Andrii Baryshpolets 6 April 2017


Total Private Residential Construction Spending

    library(Quandl)

    ## Warning: package 'Quandl' was built under R version 3.3.3
    ## Loading required package: xts
    ## Loading required package: zoo
    ## Attaching package: 'zoo'
    ## The following objects are masked from 'package:base':
    ##     as.Date, as.Date.numeric

    library(zoo)
    library(xts)
    library(dygraphs)
    library(knitr)
    library(forecast)

    ## Loading required package: timeDate
    ## This is forecast 7.3

    library(urca)

    # Download the FRED/PRRESCON series as a ts object
    y <- Quandl("FRED/PRRESCON", type="ts")

We plot the original and the log-transformed Total Private Residential Construction Spending.

    ly <- log(y)
    plot(y, xlab="Years 1993-2016", ylab="", main="Total Private Residential Construction Spending")

[Figure: Total Private Residential Construction Spending, plotted against years 1993-2016]

    plot(ly, xlab="Years 1993-2016", ylab="", main="Log Total Private Residential Construction Spending")

[Figure: Log Total Private Residential Construction Spending, plotted against years 1993-2016]

The time series shows an exponential trend (with a possible structural break) as well as a seasonal pattern. For further analysis we use the log-transformed data up to the end of 2013 as the estimation sample and keep the remaining observations for forecast evaluation.

    endof2013 <- 2013 + (11/12)                        # December 2013 in ts time units
    ly.est  <- window(ly, end = endof2013)             # estimation sample
    ly.pred <- window(ly, start = endof2013 + (1/12))  # prediction sample

    dl.y     <- diff(ly.est, lag=1)        # first difference
    dl.y12   <- diff(ly.est, 12)           # seasonal (lag-12) difference
    dl.y12_1 <- diff(diff(ly.est, 12), 1)  # first difference of the seasonal difference

    par(mfrow=c(2,2))
    plot(ly.est, xlab="", ylab="", main=expression(log(y)))
    plot(dl.y, xlab="", ylab="", main=expression(paste(Delta, "log(y)")))
    plot(dl.y12, xlab="", ylab="", main=expression(paste(Delta[12], "log(y)")))
    plot(dl.y12_1, xlab="", ylab="", main=expression(paste(Delta, Delta[12], "log(y)")))
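The three differenced series computed above correspond to the operators Δ (first difference), Δ12 (seasonal difference) and their product ΔΔ12. As a minimal sketch, assuming a synthetic monthly series `x` (made up for illustration, not the construction spending data), the following checks that the combined operator expands to x_t - x_{t-1} - x_{t-12} + x_{t-13} and that it costs 13 observations:

```r
# Synthetic monthly series: linear trend + seasonal cycle + noise (illustrative only)
set.seed(1)
n  <- 120
tt <- 1:n
x  <- ts(0.01 * tt + 0.1 * sin(2 * pi * tt / 12) + rnorm(n, sd = 0.02),
         start = c(1993, 1), frequency = 12)

d1    <- diff(x, lag = 1)            # Delta x_t          = x_t - x_{t-1}
d12   <- diff(x, lag = 12)           # Delta_12 x_t       = x_t - x_{t-12}
d12_1 <- diff(diff(x, lag = 12), 1)  # Delta Delta_12 x_t

# Expand the combined operator by hand for t = 14, ..., n
v      <- as.numeric(x)
idx    <- 14:n
manual <- v[idx] - v[idx - 1] - v[idx - 12] + v[idx - 13]

# Both constructions agree, and the order of differencing does not matter
same_as_manual <- isTRUE(all.equal(as.numeric(d12_1), manual))
commutes       <- isTRUE(all.equal(as.numeric(d12_1),
                                   as.numeric(diff(diff(x, lag = 1), 12))))
print(c(same_as_manual = same_as_manual, commutes = commutes))
print(length(d12_1))  # n - 13 observations remain
```

Because the doubly differenced series starts 13 months after the original, any ACF or unit root diagnostics on it use a slightly shorter sample than those on the level series.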

[Figure: four panels showing log(y), Δlog(y), Δ12 log(y) and ΔΔ12 log(y), 1995-2010]

The series with both a first difference and a seasonal difference applied is the only one that looks weakly stationary. To check this, we look at the ACFs and PACFs.

    library(forecast)
    maxlag <- 24
    par(mfrow=c(2,4))
    # Top row: autocorrelation functions
    Acf(ly.est, type='correlation', lag.max=maxlag, ylab="", main=expression(paste("ACF for log(y)")))
    Acf(dl.y, type='correlation', lag.max=maxlag, ylab="", main=expression(paste("ACF for ", Delta, "log(y)")))
    Acf(dl.y12, type='correlation', lag.max=maxlag, ylab="", main=expression(paste("ACF for ", Delta[12], "log(y)")))
    Acf(dl.y12_1, type='correlation', lag.max=maxlag, ylab="", main=expression(paste("ACF for ", Delta, Delta[12], "log(y)")))
    # Bottom row: partial autocorrelation functions
    Acf(ly, type='partial', lag.max=maxlag, ylab="", main=expression(paste("PACF for log(y)")))
    Acf(dl.y, type='partial', lag.max=maxlag, ylab="", main=expression(paste("PACF for ", Delta, "log(y)")))
    Acf(dl.y12, type='partial', lag.max=maxlag, ylab="", main=expression(paste("PACF for ", Delta[12], "log(y)")))
    Acf(dl.y12_1, type='partial', lag.max=maxlag, ylab="", main=expression(paste("PACF for ", Delta, Delta[12], "log(y)")))

[Figure: ACF (top row) and PACF (bottom row) for log(y), Δlog(y), Δ12 log(y) and ΔΔ12 log(y)]

So, according to the ACFs and PACFs, the last time series is the only one that is weakly stationary.

Tests for presence of a unit root

ADF test

    library(tseries)
    adf.test(ly)

    ##  Augmented Dickey-Fuller Test
    ##
    ## data:  ly
    ## Dickey-Fuller = -1.709, Lag order = 6, p-value = 0.6988
    ## alternative hypothesis: stationary

With a p-value of 0.6988, the ADF test does not reject the null hypothesis H0 that the time series has a unit root.

KPSS test

    library(tseries)
    kpss.test(ly, null="Trend")

    ## Warning in kpss.test(ly, null = "Trend"): p-value smaller than printed p-value
    ##
    ##  KPSS Test for Trend Stationarity

    ##
    ## data:  ly
    ## KPSS Trend = 0.8985, Truncation lag parameter = 3, p-value = 0.01

    library(urca)
    ly.urkpss <- ur.kpss(ly, type="tau", lags="short")
    summary(ly.urkpss)

    ## #######################
    ## # KPSS Unit Root Test #
    ## #######################
    ##
    ## Test is of type: tau with 5 lags.
    ##
    ## Value of test-statistic is: 0.6327
    ##
    ## Critical value for a significance level of:
    ##                 10pct  5pct 2.5pct  1pct
    ## critical values 0.119 0.146  0.176 0.216

The KPSS test rejects the null hypothesis H0 that the time series is trend-stationary: the p-value is below 0.01, and the test statistic 0.6327 exceeds the 1% critical value of 0.216.

Building the model

We use the auto.arima function to find the best specification for our model.

    M12.bic <- auto.arima(ly.est, ic="bic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
    M12.bic

    ## Series: ly.est
    ## ARIMA(1,1,2)(0,0,2)[12]
    ##
    ## Coefficients:
    ##          ar1     ma1     ma2    sma1    sma2
    ##       0.0462  0.4704  0.5356  1.1878  0.7469
    ## s.e.  0.1233  0.1064  0.0736  0.0660  0.0490
    ##
    ## sigma^2 estimated as 0.0009834:  log likelihood=501.49
    ## AIC=-990.98   AICc=-990.64   BIC=-969.83

    M12.aic <- auto.arima(ly.est, ic="aic", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
    M12.aic

    ## Series: ly.est
    ## ARIMA(1,1,2)(0,0,2)[12] with drift
    ##
    ## Coefficients:
    ##          ar1     ma1     ma2    sma1    sma2   drift
    ##       0.0452  0.4709  0.5358  1.1873  0.7468  0.0032
    ## s.e.  0.1232  0.1062  0.0735  0.0660  0.0489  0.0116
    ##
    ## sigma^2 estimated as 0.0009872:  log likelihood=501.53
    ## AIC=-989.06   AICc=-988.59   BIC=-964.38

    M12.aicc <- auto.arima(ly.est, ic="aicc", seasonal=TRUE, stationary=FALSE, stepwise=FALSE)
    M12.aicc

    ## Series: ly.est
    ## ARIMA(1,1,2)(0,0,2)[12] with drift
    ##
    ## Coefficients:
    ##          ar1     ma1     ma2    sma1    sma2   drift
    ##       0.0452  0.4709  0.5358  1.1873  0.7468  0.0032
    ## s.e.  0.1232  0.1062  0.0735  0.0660  0.0489  0.0116
    ##
    ## sigma^2 estimated as 0.0009872:  log likelihood=501.53
    ## AIC=-989.06   AICc=-988.59   BIC=-964.38

ARIMA(1,1,2)(0,0,2)[12] is the specification suggested by auto.arima under all three criteria (BIC, AIC and AICc); the AIC and AICc fits additionally include a drift term.

Forecast

First, we construct the multistep forecast.

    library(forecast)
    M12.bic.f.h <- forecast(M12.bic, length(ly.pred))
    plot(M12.bic.f.h, type="o", pch=16, xlim=c(2012,2017), ylim=c(9.5,11), main="ARIMA Model Multistep Forecast - TPRCS")
    lines(M12.bic.f.h$mean, type="p", pch=16, lty="dashed", col="blue")
    lines(ly, type="o", pch=16, lty="dashed")

[Figure: ARIMA Model Multistep Forecast - TPRCS, 2012-2017]

Rolling scheme forecast

    library(forecast)
    M12.bic.f.rol <- zoo()
    for (i in 1:length(ly.pred)) {
      # Rolling window: shift both ends of the estimation sample forward by one month per step
      M12.bic.rsc <- window(ly, start=1993+(i-1)/12, end=endof2013+(i-1)/12)
      # Re-estimate the selected specification on the current window
      M12.bic.updt <- arima(M12.bic.rsc, order=c(1,1,2), seasonal=list(order=c(0,0,2), period=NA))
      # Append the one-step-ahead forecast
      M12.bic.f.rol <- c(M12.bic.f.rol, forecast(M12.bic.updt, 1)$mean)
    }
    M12.bic.f.rol <- as.ts(M12.bic.f.rol)
    accuracy(M12.bic.f.rol, ly.pred)

    ##                     ME       RMSE        MAE           MPE      MAPE
    ## Test set -3.099263e-05 0.03353639 0.02724083 -0.0001054584 0.2608801
    ##                 ACF1 Theil's U
    ## Test set -0.04410045  0.442365

    plot(M12.bic.f.h, type="o", pch=16, xlim=c(2012,2017), ylim=c(9.5,11), main="Multistep vs. Rolling Scheme forecasts - TPRCS")
    lines(M12.bic.f.h$mean, type="p", pch=16, lty="dashed", col="blue")
    lines(M12.bic.f.rol, type="p", pch=16, lty="dashed", col="red")
    lines(ly, type="o", pch=16, lty="dashed")

[Figure: Multistep vs. Rolling Scheme forecasts - TPRCS, 2012-2017]

Now we compare our model to the restricted seasonal AR model ARIMA(9,1,0)(1,0,0)[12] that was offered in class.

ARIMA(9,1,0)(1,0,0)[12]

    class.arima <- arima(ly.est, order=c(9,1,0), seasonal=list(order=c(1,0,0), period=NA))
    class.arima

    ## Call:
    ## arima(x = ly.est, order = c(9, 1, 0), seasonal = list(order = c(1, 0, 0), period = NA))
    ##
    ## Coefficients:
    ##          ar1    ar2      ar3      ar4     ar5      ar6      ar7     ar8
    ##       0.4704  0.282  -0.1059  -0.0256  0.0536  -0.0065  -0.1653  0.1691
    ## s.e.  0.0624  0.068   0.0705   0.0701  0.0702   0.0693   0.0702  0.0680
    ##           ar9    sar1
    ##       -0.1576  0.9751
    ## s.e.   0.0623  0.0087
    ##
    ## sigma^2 estimated as 0.0003045:  log likelihood = 641.66,  aic = -1261.32

Restricted ARIMA(9,1,0)(1,0,0)[12]

    # ar3 to ar6 are fixed at zero; NA entries are estimated freely
    class.restrarima <- arima(ly.est, order=c(9,1,0), seasonal=list(order=c(1,0,0), period=NA),
                              transform.pars=FALSE, fixed=c(NA, NA, 0, 0, 0, 0, NA, NA, NA, NA))
    class.restrarima

    ## Call:
    ## arima(x = ly.est, order = c(9, 1, 0), seasonal = list(order = c(1, 0, 0), period = NA),
    ##     transform.pars = FALSE, fixed = c(NA, NA, 0, 0, 0, 0, NA, NA, NA, NA))
    ##
    ## Coefficients:
    ##          ar1     ar2  ar3  ar4  ar5  ar6      ar7     ar8      ar9    sar1
    ##       0.4446  0.2377    0    0    0    0  -0.1620  0.1649  -0.1615  0.9773
    ## s.e.  0.0599  0.0601    0    0    0    0   0.0617  0.0654   0.0623  0.0080
    ##
    ## sigma^2 estimated as 0.0003067:  log likelihood = 640.23,  aic = -1266.46

We compare the forecast accuracy of our model with that of the restricted ARIMA. Note that the first call reports out-of-sample (test set) errors of the rolling forecasts, while the second reports in-sample (training set) errors, so the two are not directly comparable; the rolling scheme below puts both models on the same footing.

    library(forecast)
    par(mfrow=c(2,1))
    accuracy(M12.bic.f.rol, ly.pred)

    ##                     ME       RMSE        MAE           MPE      MAPE
    ## Test set -3.099263e-05 0.03353639 0.02724083 -0.0001054584 0.2608801
    ##                 ACF1 Theil's U
    ## Test set -0.04410045  0.442365

    accuracy(class.restrarima)

    ##                        ME      RMSE        MAE         MPE      MAPE
    ## Training set 0.0002291197 0.0174892 0.01367225 0.003050552 0.1341786
    ##                   MASE       ACF1
    ## Training set 0.1995947 0.01831891

Restricted ARIMA rolling scheme

    library(forecast)
    class.restrarima.f.rol <- zoo()
    for (i in 1:length(ly.pred)) {
      class.restrarima.rsc <- window(ly, start=1993+(i-1)/12, end=endof2013+(i-1)/12)
      # The source cuts this call off after "period"; the trailing arguments are
      # restored on the assumption that they match the restricted fit above
      class.restrarima.updt <- arima(class.restrarima.rsc, order=c(9,1,0), seasonal=list(order=c(1,0,0), period=NA),
                                     transform.pars=FALSE, fixed=c(NA, NA, 0, 0, 0, 0, NA, NA, NA, NA))
      class.restrarima.f.rol <- c(class.restrarima.f.rol, forecast(class.restrarima.updt, 1)$mean)

    }
    class.restrarima.f.rol <- as.ts(class.restrarima.f.rol)
    accuracy(class.restrarima.f.rol, ly.pred)

    ##                    ME       RMSE       MAE         MPE      MAPE
    ## Test set -0.001900123 0.02202464 0.0163457 -0.01841198 0.1562592
    ##                ACF1 Theil's U
    ## Test set -0.1134615 0.2864665

We compare our model and the restricted ARIMA.

    plot(M12.bic.f.rol, type="o", pch=16, xlim=c(2013,2017), ylim=c(9.8,10.8), main="ARIMA(9,1,0)(1,0,0) vs. ARIMA(1,1,2)(0,0,2) Rolling Scheme forecasts - TPRCS")
    lines(class.restrarima.f.rol, type="p", pch=16, lty="dashed", col="blue")
    lines(M12.bic.f.rol, type="p", pch=16, lty="dashed", col="red")
    lines(ly, type="o", pch=16, lty="dashed")

[Figure: ARIMA(9,1,0)(1,0,0) vs. ARIMA(1,1,2)(0,0,2) Rolling Scheme forecasts - TPRCS; y-axis M12.bic.f.rol, x-axis Time, 2013-2017]

Conclusion

If we visually compare the two rolling scheme forecasts for ARIMA(9,1,0)(1,0,0) and ARIMA(1,1,2)(0,0,2), we can barely tell which one is better. However, the model discussed in class seems preferable: it has a lower RMSE (0.0220 vs. 0.0335), MAE (0.0163 vs. 0.0272) and MAPE (0.156 vs. 0.261), as well as a lower Theil's U (0.286 vs. 0.442). Only the bias measures, ME and MPE, are closer to zero for our suggested model.
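The error measures compared in the conclusion can be computed directly from the forecast errors. Here is a minimal sketch, assuming the usual definitions with error = actual - forecast and percentage errors taken relative to the actual values. The helper `fc_errors` and the toy numbers are hypothetical and are not the homework's data. It shows why a mean error near zero does not by itself indicate the more accurate forecast:

```r
# Hypothetical helper: basic point-forecast error measures
fc_errors <- function(forecast, actual) {
  e  <- actual - forecast   # forecast errors
  pe <- 100 * e / actual    # percentage errors
  c(ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MPE  = mean(pe),
    MAPE = mean(abs(pe)))
}

# Toy data (illustrative values only)
actual <- c(10.0, 10.2, 10.4, 10.5)
fc_a   <- c(10.1, 10.1, 10.5, 10.4)      # errors alternate in sign: unbiased but imprecise
fc_b   <- c(10.02, 10.22, 10.42, 10.52)  # errors all equal -0.02: biased but precise

res <- rbind(A = fc_errors(fc_a, actual),
             B = fc_errors(fc_b, actual))
print(round(res, 4))
```

Forecast A has a mean error of essentially zero because its errors cancel, yet its RMSE and MAE are five times those of forecast B. This is the same pattern as in the comparison above, where ME and MPE favour one model while RMSE, MAE and MAPE favour the other.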