
Time Series Econometrics Lecture Notes
Burak Saltoğlu
May 2017


Contents

1 Introduction
  1.1 Linear Time Series
    1.1.1 Why do we study time series in econometrics
  1.2 Objectives of time series
    1.2.1 Description
    1.2.2 Explanation
    1.2.3 Prediction
    1.2.4 Policy and control
  1.3 Distributed Lag Models
    1.3.1 Autoregressive Models
    1.3.2 ARDL Models
    1.3.3 Granger Causality Test
  1.4 Linear Time Series Models: y(t)
    1.4.1 Stochastic Process
    1.4.2 Time Series and White Noise
    1.4.3 White Noise
    1.4.4 Strict Sense Stationarity
    1.4.5 Wide Sense Stationarity
    1.4.6 Basic ARMA Models
    1.4.7 Lag Operators
    1.4.8 AR vs MA Representation: Wold Decomposition
    1.4.9 Autocorrelation and Autocovariance Functions
    1.4.10 Sample Counterpart of the Autocovariance Function
    1.4.11 Partial Autocorrelation Function
    1.4.12 Linear Time Series: AR
    1.4.13 AR(1) Application on the Turkish Growth Rate
    1.4.14 General interpretation of time series and autocorrelation graphs
    1.4.15 Stationarity in the graphs
    1.4.16 Linear Time Series Models: MA(k)
    1.4.17 MA(1) Correlogram
    1.4.18 An MA(1) Example
    1.4.19 Variance-Covariance of MA(2)
    1.4.20 Moving Average MA(k) Process
    1.4.21 ARMA Models: ARMA(1,1)
    1.4.22 Maximum Likelihood Estimation
    1.4.23 Likelihood function for the ARMA(1,1) process
  1.5 Model Selection
    1.5.1 Two Model Selection Criteria
    1.5.2 Characterization of Time Series
    1.5.3 Correlogram
    1.5.4 Sample Autocorrelation
    1.5.5 Correlogram
    1.5.6 Test for Autocorrelation
    1.5.7 Box-Pierce Q Statistics
    1.5.8 Ljung-Box Statistics
    1.5.9 Are Residuals Clean?
    1.5.10 Are Residuals Gaussian?
    1.5.11 Optimal ARMA order choice: Turkish GDP Growth
    1.5.12 Box-Jenkins Approach to Time Series
  1.6 Forecasting
    1.6.1 Introduction to Forecasting
    1.6.2 In Practice
    1.6.3 Mean Square Prediction Error (MSPE) Method
    1.6.4 A Forecasting Example for AR(1)
    1.6.5 Forecasting Performance
    1.6.6 Forecast Performance Evaluation
    1.6.7 Forecast Combinations
    1.6.8 Forecasting Combination Example
    1.6.9 Using Regression for Forecast Combinations
  1.7 Summary

2 Testing for Stationarity and Unit Roots
    2.0.1 Spurious Regression
    2.0.2 Example: Spurious Regression
    2.0.3 Examples: Gozalo
    2.0.4 Unit Roots: Stationarity
    2.0.5 Some Time Series Models: Random Walk Model
    2.0.6 Why a Formal Test is Necessary
    2.0.7 How Instructive is the ACF?
    2.0.8 Testing for Unit Roots: Dickey-Fuller
    2.0.9 Dickey-Fuller Test
    2.0.10 Dickey-Fuller F-test
    2.0.11 ADF: Augmented Dickey-Fuller Test
  2.1 Questions

3 Cointegration
    3.0.1 Money Demand Stability
    3.0.2 Testing for Cointegration: Engle-Granger Residual-Based Tests (Econometrica, 1987)
    3.0.3 Residual-Based Cointegration Test: Dickey-Fuller Test
    3.0.4 Examples of Cointegration: Brent-WTI Regression
    3.0.5 Example of ECM
    3.0.6 Error Correction Term
    3.0.7 Use of Cointegration in Economics and Finance
    3.0.8 Conclusion

4 Multiple Equation Systems
  4.1 Seemingly Unrelated Regression Model
    4.1.1 Recursive Systems
    4.1.2 Structural and Reduced Forms
    4.1.3 Some Simultaneous Equation Models
    4.1.4 Keynesian Income Function
    4.1.5 Inconsistency of OLS
    4.1.6 Underidentification
    4.1.7 Test for the Simultaneity Problem
    4.1.8 Simultaneous Equations in Matrix Form

5 Vector Autoregression (VAR)
    5.0.1 Why VAR
    5.0.2 VAR Models
    5.0.3 An Example of VAR Models: 1-month and 12-month TRY interest rates (monthly)
    5.0.4 Hypothesis Testing
  5.1 Questions

6 Asymptotic Distribution Theory
  6.1 Why large sample?
    6.1.1 Asymptotics
  6.2 Convergence in Probability
    6.2.1 Mean Square Convergence
    6.2.2 Convergence in Probability
    6.2.3 Almost sure convergence
  6.3 Law of large numbers
    6.3.1 Convergence in Functions
    6.3.2 Rules for Probability Limits
  6.4 Convergence in Distribution: Limiting Distributions
    6.4.1 Example: Limiting Probability Distributions
    6.4.2 Rules for Limiting Distributions
  6.5 Central Limit Theorems
    6.5.1 Example: distribution of the sample mean
    6.5.2 A simple Monte Carlo for the CLT

7 References

Chapter 1: Introduction

In this course we will discuss various topics in time series, mainly linear time series methods. We will cover five critical topics: first, a general introduction to linear time series; then non-stationary time series methods, together with the related topics of unit root testing and cointegration; and finally the Vector Autoregression (VAR) methodology. Topics such as time-varying volatility modelling and nonlinear time series methods are left for more advanced courses. There is no single textbook that we will rely on. We will use Vance Martin, Stan Hurn and David Harris (2013), Econometric Modelling with Time Series, but there are many other relevant textbooks, such as Hamilton (1994), Enders (2014), Chatfield (2003) and Diebold (2016), among others. We will use various software packages, mainly R or Matlab; the ability to code is very useful for understanding time series methods. The recent industrial popularity of big data, machine learning and deep learning is closely related to time series econometrics, so this course may also be helpful for a student who wants to learn those topics. In this section we begin with linear time series methods.

1.1 Linear Time Series

Linear time series models are among the most popular analytical tools in econometrics and statistics. We can summarize what we will be doing as follows. First, what is a time series? A time series is a set of values of a variable y observed at equally spaced points in time; the subscript t denotes the observation at a given sample frequency. Initially, low-frequency time series were studied: yearly data, and then quarterly data, were useful in macroeconomics. But as computer technology and financial transactions became more and more complex, financial econometricians now work even with frequencies of millisecond intervals (1 millisecond is 0.001 seconds). We will mainly focus on the relatively lower-frequency aspects of time series.

1.1.1 Why do we study time series in econometrics

The main aim of time series analysis is to describe the time series characteristics of various macroeconomic or financial variables. Historically, Yule (1927), Slutsky (1937) and Wold (1938) were the pioneers of time series analysis in econometrics. Econometricians mainly analyzed the long- and short-term cyclical decomposition of macroeconomic time series; later, more was done on multiple regression and simultaneous equations. Research by the operations research community focused more on smoothing and forecasting time series. Box and Jenkins (1976) provided the methodology that was adopted by economists and engineers alike, and there have been important developments in the field ever since. In a univariate context, an economist tries to understand whether there is any seasonal pattern, or whether a macro variable shows a trend. More importantly, if we observe a predictable pattern in a macro variable, we try to forecast its future realizations. Forecasting is a very important tool for policy makers and for business.

Why do we expect time series to be predictable at all? There are various reasons why we see predictable patterns in many series. One is psychological: people do not change their habits immediately, and consumer choices show a type of momentum. For instance, once people start to demand housing they tend to follow each other, and house prices may rise monotonically for a long period of time. Businessmen and other agents may also behave in a manner that makes production follow a certain pattern. Time series analysis identifies and analyzes these patterns.

1.2 Objectives of time series

Time series analysis has four main objectives.

1.2.1 Description

The first step is to plot the data and obtain descriptive measures of the time series. Decomposing the cyclical and seasonal components of a given series is also done in this first round.

1.2.2 Explanation

Fitting the most appropriate specification, including the choice of the lag order and of the linear model, is done at this stage.

1.2.3 Prediction

Given the observed macroeconomic or financial data, one usually wants to predict future values of the series. This is the predictive exercise, in which the unknown future values are estimated.

1.2.4 Policy and control

Once the mathematical relationship between the input and the output is found, the level of the input can be set so as to achieve a targeted output. This is most common in engineering, but it is also relevant for policy analysis in macroeconomics.

1.3 Distributed Lag Models

Distributed lag models rely on the idea that the effect of an economic change is distributed over a number of time periods; a series of lagged explanatory variables accounts for the adjustment process over time. In the distributed lag (DL) model we include not only the current value of the explanatory variable but also its past value(s), so the effect of a shock in the explanatory variable lasts longer. We can estimate DL models (in principle) with OLS, because if X is non-stochastic its lags are non-stochastic as well. The error terms are assumed to satisfy all the necessary OLS assumptions, such as normality, homoscedasticity and no serial correlation. Since the model is linear and satisfies the relevant assumptions, it can be estimated by OLS. Usually there should be an economic relationship between x and y.

1.3.1 Autoregressive Models

DL(0): y_t = \beta_0 + \beta_1 x_t + u_t   (1.1)
DL(1): y_t = \beta_0 + \beta_1 x_t + \beta_2 x_{t-1} + u_t   (1.2)
DL(q): y_t = \beta_0 + \sum_{i=0}^{q} \beta_i x_{t-i} + u_t   (1.3)
AR(0): y_t = \beta_0 + u_t   (1.4)
AR(1): y_t = \beta_0 + \alpha_1 y_{t-1} + u_t   (1.5)
AR(p): y_t = \beta_0 + \sum_{i=1}^{p} \alpha_i y_{t-i} + u_t   (1.6)

In autoregressive (AR) models, the past value(s) of the dependent variable become explanatory variables. We cannot estimate an autoregressive model with classical OLS because of:
1. the presence of stochastic explanatory variables, and
2. the possibility of serial correlation.

1.3.2 ARDL Models

In ARDL models we have both an AR and a DL part in one regression:

ARDL(p, q): y_t = \beta_0 + \sum_{i=1}^{p} \alpha_i y_{t-i} + \sum_{j=0}^{q} \beta_j x_{t-j} + u_t   (1.7)
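The sense in which a shock is "distributed over time" can be read directly off the DL(1) equation (1.2): a one-unit shock in x at time s raises y by \beta_1 at time s and by \beta_2 at time s+1, with no effect afterwards. A minimal sketch in Python (the coefficient values are illustrative assumptions, not estimates from the notes):

```python
# Impulse response of a DL(1) model: y_t = b0 + b1*x_t + b2*x_{t-1}
# (coefficient values below are illustrative only)
b0, b1, b2 = 0.0, 1.5, 0.6

def dl1_path(x):
    """Deterministic part of y_t implied by a path of x (error term set to 0)."""
    y = []
    for t in range(len(x)):
        x_lag = x[t - 1] if t > 0 else 0.0
        y.append(b0 + b1 * x[t] + b2 * x_lag)
    return y

baseline = dl1_path([0.0] * 5)
shocked = dl1_path([0.0, 1.0, 0.0, 0.0, 0.0])  # unit shock in x at t = 1

# Effect of the shock on y, period by period
irf = [s - b for s, b in zip(shocked, baseline)]
print(irf)  # [0.0, 1.5, 0.6, 0.0, 0.0] -> b1 on impact, b2 one period later
```

The shock dies out after exactly two periods because a DL(1) carries only one lag of x; an AR or ARDL model, by contrast, propagates shocks indefinitely through the lagged dependent variable.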

1.3.3 Granger Causality Test

A common problem in economics is to determine whether changes in one variable cause changes in another. For instance, do changes in the money supply cause changes in GNP, or are GNP and the money supply determined endogenously? Granger suggested the following methodology to address this question. Consider the relation between GNP and the money supply M. A regression analysis can show us the association between the two, but it cannot tell us the direction of the relation. The Granger causality test examines the direction of causality between the series: under conditions defined by Granger, we can test whether GNP causes the money supply to increase or whether a monetary expansion leads GNP to rise.

First we write the unrestricted regression of GNP on lagged M and lagged GNP:

GNP_t = \sum_{i=1}^{m} \alpha_i M_{t-i} + \sum_{j=1}^{m} \beta_j GNP_{t-j} + u_t   (1.8)

Then we run the restricted regression:

GNP_t = \sum_{j=1}^{m} \beta_j GNP_{t-j} + u_t   (1.9)

If the money supply has no contribution in determining GNP, then the null hypothesis

\alpha_1 = \alpha_2 = ... = \alpha_m = 0

holds. If the \alpha's are jointly significant, we say that the money supply Granger-causes GNP. Similarly, to test whether GNP causes the money supply, we run the analogous pair of regressions. First, the unrestricted regression:

M_t = \sum_{i=1}^{m} \lambda_i GNP_{t-i} + \sum_{j=1}^{m} \delta_j M_{t-j} + u_t   (1.10)

Then the restricted one:

M_t = \sum_{j=1}^{m} \delta_j M_{t-j} + u_t   (1.11)

If GNP has no contribution in determining the money supply, one should expect \lambda_1 = \lambda_2 = ... = \lambda_m = 0. If we reject the null that all these coefficients are zero, we say that GNP Granger-causes the money supply. If, in addition, we conclude that M causes GNP at the same time, then we speak of bivariate (two-way) causation: the money supply and GNP cause each other, which is an indication of endogeneity.

Steps in Granger causality. To formally test whether M (Granger) causes GNP:

1. Regress GNP on all lagged GNP; obtain RSS_1.
2. Regress GNP on all lagged GNP and all lagged M; obtain RSS_2.

3. The null hypothesis is that all the \alpha's are zero.
4. The test statistic is

F = \frac{(RSS_1 - RSS_2)/m}{RSS_2/(n-k)}

where m is the number of lags and k is the number of parameters in step 2; under the null, F has an F(m, n-k) distribution.

Figure 1.1: Granger Causality Test

As can be seen from the EVIEWS output above, there is not enough evidence to claim Granger causality in either direction.

1.4 Linear Time Series Models: y(t)

Time series analysis is useful when the economic relationship is difficult to specify. Even when explanatory variables for y exist, forecasting y_t from them requires forecasts of those variables, which may themselves be unavailable; univariate methods instead predict y_t from its own history.

1.4.1 Stochastic Process

Any time series can be thought of as being generated by a stochastic process. A stochastic process is said to be stationary if its mean and variance are constant over time and the covariance between two time periods depends only on the distance (lag) between them, not on the actual time at which the covariance is computed.
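The Granger test above (steps 1-4) boils down to comparing restricted and unrestricted residual sums of squares. A minimal Python sketch on simulated data; the data-generating process, the coefficient values and the lag length m = 2 are illustrative assumptions, not part of the notes:

```python
import random

def ols_rss(y, X):
    """OLS via the normal equations (X'X)b = X'y; returns the residual sum of squares."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], solved by Gaussian elimination with partial pivoting
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k + 1):
                A[r][j] -= f * A[c][j]
    b = [0.0] * k
    for r in range(k - 1, -1, -1):
        b[r] = (A[r][k] - sum(A[r][j] * b[j] for j in range(r + 1, k))) / A[r][r]
    return sum((yi - sum(bi * xi for bi, xi in zip(b, row))) ** 2
               for yi, row in zip(y, X))

# Simulate a system in which M Granger-causes GNP (illustrative DGP)
random.seed(42)
n, m = 200, 2
M, gnp = [0.0], [0.0]
for t in range(1, n):
    M.append(0.5 * M[-1] + random.gauss(0, 1))
    gnp.append(0.3 * gnp[-1] + 0.8 * M[-2] + random.gauss(0, 0.3))

y = [gnp[t] for t in range(m, n)]
X_r = [[1.0, gnp[t - 1], gnp[t - 2]] for t in range(m, n)]        # restricted
X_u = [row + [M[t - 1], M[t - 2]] for row, t in zip(X_r, range(m, n))]  # unrestricted
rss_r, rss_u = ols_rss(y, X_r), ols_rss(y, X_u)

k = 5  # parameters in the unrestricted regression
F = ((rss_r - rss_u) / m) / (rss_u / (len(y) - k))
print(F > 4.0)  # True: a large F rejects "M does not Granger-cause GNP"
```

In practice one would of course use a packaged routine; the point of the sketch is only that the F statistic is a direct function of the two residual sums of squares.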

1.4.2 Time Series and White Noise

Suppose we have observed a sample of size T of some random variable Y_t: {y_1, y_2, y_3, ..., y_T}. This set of T numbers is only one possible outcome of the stochastic process. Even the doubly infinite sequence

{y_t}_{t=-\infty}^{\infty} = {..., y_{-1}, y_0, y_1, ..., y_T, y_{T+1}, ...}

would still be viewed as a single realization from the time series process.

1.4.3 White Noise

A process {\varepsilon_t}_{t=-\infty}^{\infty} is said to be white noise if it has the following properties:

1. E[\varepsilon_t] = 0
2. E[\varepsilon_t^2] = \sigma^2
3. E[\varepsilon_t \varepsilon_\tau] = 0 for t \neq \tau

If a time series is invariant with respect to changes in time, the process can be estimated with fixed coefficients.

1.4.4 Strict Sense Stationarity

Let f(y_1, ..., y_T) denote the joint probability density of y_t. The process is said to be strictly stationary if

f(y_t, ..., y_{t+k}) = f(y_{t+m}, ..., y_{t+k+m}) for all t, k and m.

1.4.5 Wide Sense Stationarity

y_t is said to be wide-sense (weakly) stationary if its mean, variance and covariances do not depend on t:

\mu_y = E[y_t] = E[y_{t+m}]
\sigma_y^2 = E[(y_t - \mu_y)^2] = E[(y_{t+m} - \mu_y)^2]
\gamma_k = cov(y_t, y_{t+k}) = E[(y_t - \mu_y)(y_{t+k} - \mu_y)], with cov(y_t, y_{t+k}) = cov(y_{t+m}, y_{t+k+m}).

Strict-sense stationarity implies wide-sense stationarity, but the reverse is not true. The practical implication is that inference obtained from a non-stationary series, treated as if it were stationary, is misleading and wrong.
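The white noise properties are easy to check on simulated data: a Gaussian white noise sample should have sample mean near 0, sample variance near \sigma^2, and sample autocorrelations near 0 at every lag. A small Python sketch (the sample size and \sigma = 1 are arbitrary choices):

```python
import random

random.seed(0)
T = 5000
eps = [random.gauss(0, 1) for _ in range(T)]  # white noise with sigma^2 = 1

mean = sum(eps) / T
var = sum((e - mean) ** 2 for e in eps) / T

def autocorr(x, k):
    """Sample autocorrelation at lag k (mean-centered, 1/T normalization)."""
    m = sum(x) / len(x)
    g0 = sum((v - m) ** 2 for v in x) / len(x)
    gk = sum((x[t] - m) * (x[t - k] - m) for t in range(k, len(x))) / len(x)
    return gk / g0

# Sample moments should be close to the population values 0, 1 and 0
print(round(mean, 2), round(var, 2), round(autocorr(eps, 1), 2))
```

With T = 5000 the sampling error of each of these statistics is of order 1/sqrt(T) ≈ 0.014, so all three printed values sit close to their theoretical counterparts.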

1.4.6 Basic ARMA Models

AR(1): y_t = \phi_1 y_{t-1} + \delta + \varepsilon_t   (1.12)
MA(1): y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}   (1.13)
AR(p): y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + ... + \phi_p y_{t-p} + \delta + \varepsilon_t   (1.14)
MA(q): y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + ... + \theta_q \varepsilon_{t-q}   (1.15)

1.4.7 Lag Operators

The lag operator L is defined by L y_t = y_{t-1}, or in general

L^2 y_t = L(L y_t) = y_{t-2},   L^j y_t = y_{t-j},   L^{-j} y_t = y_{t+j}.

We can also use lag polynomials, e.g. a(L) = a_0 L^0 + a_1 L^1 + a_2 L^2. In lag-operator notation:

AR(1): (1 - \phi_1 L) y_t = \delta + \varepsilon_t   (1.16)
MA(1): y_t = (1 + \theta_1 L) \varepsilon_t   (1.17)
AR(p): (1 - \phi_1 L - \phi_2 L^2 - ... - \phi_p L^p) y_t = \delta + \varepsilon_t   (1.18)
MA(q): y_t = (1 + \theta_1 L + \theta_2 L^2 + ... + \theta_q L^q) \varepsilon_t   (1.19)

1.4.8 AR vs MA Representation: Wold Decomposition

Suppose |\phi_1| < 1 and y_t = \phi_1 y_{t-1} + \varepsilon_t. Substituting recursively,

y_t = \phi_1 (\phi_1 y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t = \phi_1^2 y_{t-2} + \phi_1 \varepsilon_{t-1} + \varepsilon_t
y_t = \phi_1 (\phi_1^2 y_{t-3} + \phi_1 \varepsilon_{t-2} + \varepsilon_{t-1}) + \varepsilon_t = \phi_1^3 y_{t-3} + \phi_1^2 \varepsilon_{t-2} + \phi_1 \varepsilon_{t-1} + \varepsilon_t
...
y_t = \phi_1^k y_{t-k} + \sum_{j=0}^{k-1} \phi_1^j \varepsilon_{t-j}

Since \lim_{k \to \infty} \phi_1^k y_{t-k} = 0, the AR(1) can be represented as an MA(\infty):

y_t = \sum_{j=0}^{\infty} \phi_1^j \varepsilon_{t-j}
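The back-substitution above is an exact identity, which we can verify numerically: starting an AR(1) from a zero initial condition, y_T equals the finite sum \sum_{j=0}^{T} \phi_1^j \varepsilon_{T-j}. A small Python sketch (\phi_1 = 0.8 and the sample size are arbitrary):

```python
import random

random.seed(1)
phi = 0.8  # |phi| < 1, so the MA(infinity) representation is valid
T = 200
eps = [random.gauss(0, 1) for _ in range(T)]

# AR(1) recursion with zero initial condition: y_t = phi*y_{t-1} + eps_t
y = [eps[0]]
for t in range(1, T):
    y.append(phi * y[-1] + eps[t])

# Wold / MA form: y_t = sum_j phi^j * eps_{t-j}
t = T - 1
ma_form = sum(phi ** j * eps[t - j] for j in range(t + 1))

print(abs(y[-1] - ma_form) < 1e-9)  # True: the two representations coincide
```

The two computations differ only in the order of the floating-point operations, so they agree to machine precision; with a nonzero starting value the discarded term \phi_1^T y_0 would still be negligible for T this large.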

1.4.9 Autocorrelation and Autocovariance Functions

For the MA(\infty) representation y_t = \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j},

var(y_t) = E[(\sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j})^2] = \sum_{j=0}^{\infty} \phi^{2j} E[\varepsilon_{t-j}^2],

since the cross terms vanish by the white noise property. Noting that \sum_{j=0}^{\infty} \phi^{2j} = 1 + \phi^2 + ... = 1/(1-\phi^2), we get

var(y_t) = \frac{\sigma^2}{1-\phi^2}.

The autocovariance at lag j is

\gamma_j = cov(y_t, y_{t-j}) = E[(y_t - E[y_t])(y_{t-j} - E[y_{t-j}])],

with \gamma_0 = cov(y_t, y_t) = var(y_t). The autocorrelation of y_t and y_{t-j} is given by

\rho_j = \frac{cov(y_t, y_{t-j})}{var(y_t)} = \frac{\gamma_j}{\gamma_0}.

1.4.10 Sample Counterpart of the Autocovariance Function

Because of stationarity, the autocovariances can be estimated from a single realization:

\hat{\gamma}_0 = \frac{1}{T} \sum_{t=1}^{T} (y_t - \bar{y})^2 = \hat{\sigma}_y^2
\hat{\gamma}_k = \frac{1}{T} \sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y}),   k = 1, 2, ...

with \gamma_{-k} = \gamma_k, and the sample autocorrelation is \hat{\rho}_k = \hat{\gamma}_k / \hat{\gamma}_0.

1.4.11 Partial Autocorrelation Function

The PACF of a time series is a function of its ACF and is a useful tool for determining the order p of an AR model. A simple, yet effective, way to introduce the PACF is to consider the following AR models in consecutive orders:

r_t = \phi_{0,1} + \phi_{1,1} r_{t-1} + e_{1t}
r_t = \phi_{0,2} + \phi_{1,2} r_{t-1} + \phi_{2,2} r_{t-2} + e_{2t}
r_t = \phi_{0,3} + \phi_{1,3} r_{t-1} + \phi_{2,3} r_{t-2} + \phi_{3,3} r_{t-3} + e_{3t}
r_t = \phi_{0,4} + \phi_{1,4} r_{t-1} + \phi_{2,4} r_{t-2} + \phi_{3,4} r_{t-3} + \phi_{4,4} r_{t-4} + e_{4t}
...

where \phi_{0,j}, \phi_{i,j} and e_{jt} are, respectively, the constant term, the coefficient of r_{t-i}, and the error term of an AR(j) model. These models are in the form of a multiple linear regression and can be estimated by the least squares method. In fact, they are arranged in a sequential order that enables us to apply the idea of the partial F test in multiple linear regression analysis. The estimate \hat{\phi}_{1,1} from the first equation is called the lag-1 sample PACF of r_t, the estimate \hat{\phi}_{2,2} from the second equation the lag-2 sample PACF, the estimate \hat{\phi}_{3,3} from the third equation the lag-3 sample PACF, and so on. By definition, the lag-2 sample PACF \hat{\phi}_{2,2} shows the added contribution of r_{t-2} to r_t over the AR(1) model r_t = \phi_0 + \phi_1 r_{t-1} + e_{1t}; the lag-3 PACF shows the added contribution of r_{t-3} over an AR(2) model, and so on. Therefore, for an AR(p) model the lag-p sample PACF should not be zero, but \hat{\phi}_{j,j} should be close to zero for all j > p. We make use of this property to determine the order p. In short, the PACF measures the correlation between the current observation and the observation k periods ago, after controlling for the intermediate lags. At the first lag the PACF and the ACF are equal.

1.4.12 Linear Time Series: AR

Consider y_t = \phi_1 y_{t-1} + \delta + \varepsilon_t. If |\phi_1| \geq 1 the process is non-stationary; |\phi_1| < 1 is the stationarity condition, under which

E[y_t] = \mu = \frac{\delta}{1 - \phi_1}.

Let \delta = 0. Using stationarity, note that E[y_{t-1}^2] = \gamma_0, so

\gamma_0 = E[(y_t - \mu)^2] = E[(\phi_1 y_{t-1} + \varepsilon_t)^2] = E[\phi_1^2 y_{t-1}^2 + 2\phi_1 y_{t-1}\varepsilon_t + \varepsilon_t^2] = \phi_1^2 \gamma_0 + \sigma_\varepsilon^2,

which gives

\gamma_0 = \frac{\sigma_\varepsilon^2}{1 - \phi_1^2}.

You can also obtain this result directly by taking the variance of both sides of y_t = \phi_1 y_{t-1} + \delta + \varepsilon_t.

The first autocovariance is

\gamma_1 = E[(y_t - \mu_y)(y_{t-1} - \mu_y)] = E[(\phi_1 y_{t-1} + \varepsilon_t) y_{t-1}] = \phi_1 \gamma_0 = \frac{\phi_1 \sigma_\varepsilon^2}{1 - \phi_1^2},

so

\rho_1 = \frac{\gamma_1}{\gamma_0} = \phi_1.

For j = 2:

\gamma_2 = E[y_t y_{t-2}] = E[(\phi_1 y_{t-1} + \varepsilon_t) y_{t-2}] = E[(\phi_1^2 y_{t-2} + \phi_1 \varepsilon_{t-1} + \varepsilon_t) y_{t-2}] = \phi_1^2 \gamma_0,

\rho_2 = \frac{\phi_1^2 \gamma_0}{\gamma_0} = \phi_1^2.

In general,

\rho_1 = \phi_1, \rho_2 = \phi_1^2, ..., \rho_k = \phi_1^k.

So if your data are generated by a stationary AR(1) process, the correlogram will diminish slowly (geometrically).
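The geometric decay \rho_k = \phi_1^k can be checked on simulated data using the sample formulas of Section 1.4.10. A small Python sketch (\phi_1 = 0.7 and the sample size are arbitrary choices):

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat_1..rho_hat_K with 1/T normalization."""
    T = len(x)
    xbar = sum(x) / T
    g0 = sum((v - xbar) ** 2 for v in x) / T
    acf = []
    for k in range(1, max_lag + 1):
        gk = sum((x[t] - xbar) * (x[t - k] - xbar) for t in range(k, T)) / T
        acf.append(gk / g0)
    return acf

# Simulate a stationary AR(1): y_t = 0.7*y_{t-1} + eps_t
random.seed(3)
phi, T = 0.7, 20000
y = [0.0]
for _ in range(T):
    y.append(phi * y[-1] + random.gauss(0, 1))

rho = sample_acf(y[1:], 3)
print([round(r, 2) for r in rho])  # close to [0.7, 0.49, 0.34], i.e. phi^k
```

With T = 20000 the sample autocorrelations match the theoretical values \phi^1, \phi^2, \phi^3 to within roughly \pm 0.01.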

Figure 1.2: Random Walk (No Drift)
Figure 1.3: AR(1) Process with \phi_1 = 0.95
Figure 1.4: AR(1) Process with \phi_1 = 0.99
Figure 1.5: AR(1) Process with \phi_1 = 0.90
Figure 1.6: AR(1) Process with \phi_1 = 0.5
Figure 1.7: AR(1) Process with \phi_1 = 0.05 (Weak Predictable Part)
Figure 1.8: Turkish GDP Growth
Figure 1.9: Turkish GDP
Figure 1.10: US GDP, 1947-2017 (Quarterly)
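Before turning to the applications, note that the PACF of Section 1.4.11 can also be computed directly from the autocorrelations via the Durbin-Levinson recursion, which is numerically equivalent to fitting the sequence of least-squares regressions described there. A small Python sketch, checked on the AR(1) case, for which every partial autocorrelation beyond lag 1 is zero (the use of this particular recursion, rather than the regressions, is my own substitution):

```python
def pacf_from_acf(rho):
    """Partial autocorrelations phi_{k,k} from autocorrelations via the
    Durbin-Levinson recursion; rho[j] must hold rho_j, with rho[0] = 1."""
    pacf = [rho[1]]
    phi = [rho[1]]  # coefficients phi_{k,j} of the current AR(k) fit
    for k in range(2, len(rho)):
        num = rho[k] - sum(phi[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi = [phi[j] - phi_kk * phi[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# AR(1) with phi_1 = 0.8: rho_k = 0.8**k, so the PACF is 0.8 at lag 1, 0 afterwards
rho = [0.8 ** k for k in range(6)]  # rho[0] = 1
p = pacf_from_acf(rho)
print([round(v, 6) for v in p])  # lag-1 PACF is 0.8; higher lags are numerically zero
```

This is exactly the cut-off property used to identify the order p of an AR model from its correlogram.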

Figure 1.11: Turkish GDP quarterly: Autocorrelations

1.4.13 AR(1) Application on the Turkish Growth Rate

AR(1) estimates for Turkish GDP growth:

              Estimate    SE      t-stat
  Constant    0.93        0.46    2.05
  AR(1)       0.80        0.07    11.99
  Variance    11.86       1.87    6.33

Descriptive statistics: mean 4.6635, standard deviation 5.6062, skewness -1.3031, kurtosis 4.3208.

The implied unconditional mean and variance are

E[y_t] = \frac{\delta}{1 - \phi_1} = \frac{0.93}{1 - 0.80} \approx 4.65,   var(y_t) = \frac{\sigma^2}{1 - \phi_1^2}.

Figure 1.12: Turkish Inflation
Figure 1.13: White Noise

Figure 1.14: Correlograms with \phi = 0.9 and \phi = 0.8

Linear Time Series Models: AR(p)

The autoregressive model of order k is

AR(k): y_t = \sum_{i=1}^{k} \phi_i y_{t-i} + \delta + \varepsilon_t.

Taking expectations, E[y_t] = \mu satisfies \mu = \sum_{i=1}^{k} \phi_i \mu + \delta, so

\mu = \frac{\delta}{1 - \sum_{i=1}^{k} \phi_i}.

1.4.14 General interpretation of time series and autocorrelation graphs

As can be seen in the simulated graphs of the previous sections, the correlogram of a stationary AR model tells us a lot about the structure of the series. First of all, the white noise model produces realizations that are very difficult to predict: the pattern is erratic and unpredictable (see Figure 1.13). The AR process with \phi_1 = 0.05 shows a similar pattern to white noise (see Figure 1.7). Processes with higher AR coefficients exhibit a totally different pattern: in the AR model, the higher the coefficient, the higher the persistence. For instance, when \phi_1 = 0.9 (see Figure 1.14) we see much more predictable patterns in the original series, and the correlogram of this process decays slowly, unlike the case \phi_1 = 0.5.

1.4.15 Stationarity in the graphs

As can be seen in Figure 1.2, a random walk depicts a very different picture from the stationary AR processes. In a typical random walk realization the series has no stable mean or variance. In general, random walk models are known for their unpredictable patterns, and we notice random-walk-type features in some macro variables.

1.4.16 Linear Time Series Models: MA(k)

The moving average model of order one is

MA(1): y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1}.

Note that in some texts y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} is also written as an MA process. The term "moving average" comes from the fact that y is constructed as a weighted sum of the most recent error terms. Using the white noise properties, the variance is

\gamma_0 = var(y_t) = E[(y_t - \mu)^2] = E[(\varepsilon_t + \theta_1 \varepsilon_{t-1})^2] = \sigma_\varepsilon^2 + \theta_1^2 \sigma_\varepsilon^2 = \sigma_\varepsilon^2 (1 + \theta_1^2).

The first autocovariance is

\gamma_1 = E[(y_t - \mu)(y_{t-1} - \mu)] = E[(\varepsilon_t + \theta_1 \varepsilon_{t-1})(\varepsilon_{t-1} + \theta_1 \varepsilon_{t-2})] = \theta_1 \sigma_\varepsilon^2.

For higher lags, e.g. j = 2,

\gamma_2 = E[(y_t - \mu)(y_{t-2} - \mu)] = E[(\varepsilon_t + \theta_1 \varepsilon_{t-1})(\varepsilon_{t-2} + \theta_1 \varepsilon_{t-3})] = 0,

so \gamma_j = 0 for j = 2, 3, .... Hence

\rho_1 = \frac{\theta_1 \sigma_\varepsilon^2}{(1 + \theta_1^2)\sigma_\varepsilon^2} = \frac{\theta_1}{1 + \theta_1^2},   \rho_2 = ... = \rho_k = 0.

1.4.17 MA(1) Correlogram

ρ_1 = γ_1 / γ_0,    ρ_2 = 0, ..., ρ_k = 0

since

ρ_k = γ_k / γ_0 = θ_1 / (1 + θ_1^2) for k = 1, and 0 for k > 1

because

γ_k = E[(ε_t + θ_1 ε_{t-1})(ε_{t-k} + θ_1 ε_{t-k-1})] = 0 for k > 1.

So if your data are generated by an MA(1), the correlogram will decline to zero quickly (after one lag).

1.4.18 An MA(1) Example

y_t = μ + ε_t + 0.5 ε_{t-1}    MA(1)

ρ_1 = θ_1 / (1 + θ_1^2) = 0.5 / (1 + 0.5^2) = 0.4,    ρ_k = 0 for k ≥ 2

One major implication is that the MA(1) process has a memory of only one lag: it forgets immediately after one term, i.e. it only remembers the one previous realization.

1.4.19 Variance and Covariance of MA(2)

γ_0 = E[(ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2})^2] = E[ε_t^2 + θ_1^2 ε_{t-1}^2 + θ_2^2 ε_{t-2}^2]

γ_0 = σ_ε^2 + θ_1^2 σ_ε^2 + θ_2^2 σ_ε^2 = σ_ε^2 (1 + θ_1^2 + θ_2^2)

For j = 1, since E[y_t] = μ:

γ_1 = cov(y_t, y_{t-1}) = E[(ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2})(ε_{t-1} + θ_1 ε_{t-2} + θ_2 ε_{t-3})]

γ_1 = θ_1 E[ε_{t-1}^2] + θ_1 θ_2 E[ε_{t-2}^2] = θ_1 σ_ε^2 + θ_1 θ_2 σ_ε^2 = σ_ε^2 (θ_1 + θ_1 θ_2)
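The one-lag memory of the MA(1) example above can be checked by simulation. A minimal sketch, with an assumed sample size and seed; the theoretical lag-1 autocorrelation for θ_1 = 0.5 is 0.5/1.25 = 0.4 as derived in Section 1.4.18.

```python
import random

def ma1_acf1(theta):
    """Theoretical lag-1 autocorrelation of an MA(1): theta / (1 + theta^2)."""
    return theta / (1.0 + theta ** 2)

def simulate_ma1(theta, n, seed=1):
    """Simulate y_t = eps_t + theta * eps_{t-1} (mu = 0 for simplicity)."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eps[t + 1] + theta * eps[t] for t in range(n)]

def sample_acf(y, k):
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k))
    return num / sum((v - ybar) ** 2 for v in y)

print(ma1_acf1(0.5))              # 0.4 exactly
y = simulate_ma1(0.5, 5000)
print(round(sample_acf(y, 1), 2))  # near 0.4
print(round(sample_acf(y, 2), 2))  # near 0: memory of one lag only
```

The sample correlogram drops to (approximately) zero after lag one, which is the fingerprint used later in model identification.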

For j = 2:

γ_2 = cov(y_t, y_{t-2}) = E[(ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2})(ε_{t-2} + θ_1 ε_{t-3} + θ_2 ε_{t-4})] = θ_2 σ_ε^2

For j = 3: γ_3 = 0.

Summary:

γ_0 = σ_ε^2 (1 + θ_1^2 + θ_2^2)
γ_1 = σ_ε^2 (θ_1 + θ_1 θ_2)
γ_2 = θ_2 σ_ε^2
γ_j = 0 for j ≥ 3

Correlations:

ρ_1 = γ_1 / γ_0 = (θ_1 + θ_1 θ_2) / (1 + θ_1^2 + θ_2^2)
ρ_2 = γ_2 / γ_0 = θ_2 / (1 + θ_1^2 + θ_2^2)
ρ_k = 0,  k > 2

1.4.20 Moving Average MA(k) Process

y_t = μ + ε_t + Σ_{i=1}^{k} θ_i ε_{t-i}    MA(k)

The error term is white noise. MA(k) has k + 2 parameters (μ, θ_1, ..., θ_k, σ_ε^2).

Variance of y:

var(y_t) = γ_0 = var(μ + ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2} + ... + θ_k ε_{t-k})
var(y_t) = σ_ε^2 + θ_1^2 σ_ε^2 + θ_2^2 σ_ε^2 + ... + θ_k^2 σ_ε^2 = σ_ε^2 (1 + θ_1^2 + θ_2^2 + ... + θ_k^2)

Homework: derive the autocorrelation function for MA(3), ..., MA(k).

1.4.21 ARMA Models: ARMA(1,1)

y_t = φ_1 y_{t-1} + ε_t + θ_1 ε_{t-1}

γ_0 = E[y_t^2] = E[φ_1^2 y_{t-1}^2 + 2φ_1 y_{t-1} ε_t + 2φ_1 θ_1 y_{t-1} ε_{t-1} + ε_t^2 + 2θ_1 ε_t ε_{t-1} + θ_1^2 ε_{t-1}^2]

γ_0 = φ_1^2 γ_0 + σ_ε^2 + θ_1^2 σ_ε^2 + 2φ_1 θ_1 σ_ε^2

γ_0 = σ_ε^2 (1 + θ_1^2 + 2φ_1 θ_1) / (1 - φ_1^2)

For j = 1:

γ_1 = E[y_{t-1}(φ_1 y_{t-1} + ε_t + θ_1 ε_{t-1})]
γ_1 = E[φ_1 y_{t-1}^2 + y_{t-1}(ε_t + θ_1 ε_{t-1})]
γ_1 = E[φ_1 y_{t-1}^2 + (φ_1 y_{t-2} + ε_{t-1} + θ_1 ε_{t-2})(ε_t + θ_1 ε_{t-1})]

Using the white noise properties,

γ_1 = φ_1 γ_0 + θ_1 σ_ε^2

For j = 2:

γ_2 = E[y_{t-2}(φ_1 y_{t-1} + ε_t + θ_1 ε_{t-1})] = φ_1 E[y_{t-2} y_{t-1}]

since y_{t-2} = φ_1 y_{t-3} + ε_{t-2} + θ_1 ε_{t-3} is uncorrelated with ε_t and ε_{t-1}. Hence

γ_2 = φ_1 γ_1, ..., γ_k = φ_1 γ_{k-1},    ρ_1 = γ_1 / γ_0

So the ARMA correlogram shows either oscillating or exponential decay, depending on φ_1 and θ_1, but the process is stationary and has short memory (we will not cover long-memory models in this course).
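The autocovariance recursion above (γ_0 and γ_1 from the closed forms, then γ_k = φ_1 γ_{k-1} for k ≥ 2) is easy to compute numerically. A minimal sketch; the parameter values φ_1 = 0.5, θ_1 = 0.3, σ² = 1 are arbitrary illustrations.

```python
def arma11_autocov(phi, theta, sigma2, kmax):
    """Autocovariances gamma_0..gamma_kmax of y_t = phi*y_{t-1} + eps_t + theta*eps_{t-1}."""
    g0 = sigma2 * (1 + theta ** 2 + 2 * phi * theta) / (1 - phi ** 2)
    g = [g0, phi * g0 + theta * sigma2]
    while len(g) <= kmax:
        g.append(phi * g[-1])  # gamma_k = phi * gamma_{k-1} for k >= 2
    return g

g = arma11_autocov(0.5, 0.3, 1.0, 4)
print([round(v, 4) for v in g])
print(g[2] / g[1])  # the ratio of successive autocovariances equals phi from lag 2 on
```

From lag 2 onward the autocovariances decay geometrically at rate φ_1, which is why the ARMA(1,1) correlogram eventually looks like that of an AR(1).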

1.4.22 Maximum Likelihood Estimation

Assume ε_1, ε_2, ..., ε_T are i.i.d. N(0, σ_ε^2), with density

f(ε_t) = (1/√(2πσ^2)) exp(-ε_t^2 / (2σ^2))

Then, using the independent and identical distribution assumptions,

f(ε_T, ε_{T-1}, ..., ε_0) = f(ε_T) × f(ε_{T-1}) × ... × f(ε_0)

so the likelihood function is

L(θ; y) = Π_{t=1}^{T} (1/√(2πσ^2)) exp(-ε_t^2 / (2σ^2)) = (2πσ_ε^2)^{-T/2} exp( -(1/(2σ_ε^2)) Σ_{t=1}^{T} ε_t^2 )

Taking logs, and ignoring the constant term (it does not affect the optimization),

ln L(θ) = l(θ) = -(T/2) ln(σ^2) - (1/2) Σ_{t=1}^{T} ε_t^2 / σ^2

Estimation of AR(1): ε_t = y_t - θ_1 y_{t-1}, with parameter vector θ = (θ_1, σ^2).

1.4.23 Likelihood function for an ARMA(1,1) process

y_t = φ_1 y_{t-1} + ε_t + θ_1 ε_{t-1}    ⟹    ε_t = y_t - φ_1 y_{t-1} - θ_1 ε_{t-1}

l(θ) = -(T/2) ln(σ^2) - (1/2) Σ_{t=1}^{T} [ (y_t - φ_1 y_{t-1} - θ_1 ε_{t-1}) / σ ]^2

θ = (φ_1, θ_1, σ^2)
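For the AR(1) case, the conditional log-likelihood above can be maximized by brute force over a grid of φ values, after concentrating σ² out (for a given φ, the maximizing σ² is the mean squared residual). This is an illustrative sketch, not the notes' estimation routine; the simulated data, grid, and seed are assumptions.

```python
import math
import random

def simulate_ar1(phi, n, seed=3):
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y

def cond_loglik(phi, y):
    """Conditional AR(1) log-likelihood with sigma^2 concentrated out."""
    eps = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
    t = len(eps)
    s2 = sum(e * e for e in eps) / t        # MLE of sigma^2 given phi
    return -0.5 * t * math.log(s2) - 0.5 * t  # since sum(eps^2)/s2 = t

y = simulate_ar1(0.8, 1000)
grid = [i / 100 for i in range(-99, 100)]
phi_hat = max(grid, key=lambda p: cond_loglik(p, y))
print(phi_hat)  # close to the true value 0.8
```

Because the errors are Gaussian, maximizing this conditional likelihood over φ is equivalent to minimizing the sum of squared residuals, so the grid maximizer lands near the OLS estimate.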

1.5 Model Selection

There are various steps to follow in time series modelling. The first step is a graphical inspection of the data. Graphical representation helps in several ways:

1. To summarize and reveal patterns in the data: linear versus nonlinear features can often be seen directly.
2. To identify anomalies in the data.
3. To present a large amount of data in a small space.
4. Multiple comparisons: to compare different pieces of the data, both within and between series.

To summarize: how well does a model fit the data? Adding additional lags for p and q will reduce the SSR, but adding new parameters reduces the degrees of freedom and tends to worsen the out-of-sample forecasting performance of the fitted model. A parsimonious model optimizes this trade-off.

1.5.1 Two Model Selection Criteria

Akaike Information Criterion (AIC) and Schwartz Bayesian Criterion (SBC). Let k be the number of parameters estimated (k = p + q + 1 if an intercept is included, else k = p + q) and T the number of observations:

AIC = T ln(SSR) + 2k
SBC = T ln(SSR) + k ln(T)

Choose the lag order that minimizes the AIC or SBC. The AIC may be biased towards selecting an overparametrized model, whereas the SBC is asymptotically consistent.
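The two criteria can be compared on a toy example. A minimal sketch using the formulas above; the SSR values for the candidate AR orders are hypothetical numbers made up for illustration, chosen so the two criteria disagree.

```python
import math

def aic(ssr, t, k):
    """Akaike criterion in the form used in these notes: T ln(SSR) + 2k."""
    return t * math.log(ssr) + 2 * k

def sbc(ssr, t, k):
    """Schwartz Bayesian criterion: T ln(SSR) + k ln(T)."""
    return t * math.log(ssr) + k * math.log(t)

# Hypothetical SSRs from fitting AR(1)..AR(4) on T = 100 observations:
ssrs = {1: 52.0, 2: 48.5, 3: 47.0, 4: 46.9}
T = 100
best_aic = min(ssrs, key=lambda k: aic(ssrs[k], T, k))
best_sbc = min(ssrs, key=lambda k: sbc(ssrs[k], T, k))
print(best_aic, best_sbc)  # AIC picks order 3; SBC picks the more parsimonious 2
```

Because ln(T) > 2 whenever T > 7, the SBC penalizes each extra lag more heavily than the AIC, which is why it tends to choose the smaller model, consistent with the overparametrization remark above.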

1.5.2 Characterization of Time Series

- Visual inspection
- Autocorrelation order selection
- Tests for significance: Bartlett (individual), Box-Pierce / Ljung-Box (joint)

1.5.3 Correlogram

One simple check of stationarity is based on the autocorrelation function (ACF). The ACF at lag k is

ρ_k = E[(y_t - μ_y)(y_{t+k} - μ_y)] / √( E[(y_t - μ_y)^2] E[(y_{t+k} - μ_y)^2] ) = cov(y_t, y_{t+k}) / (σ_{y_t} σ_{y_{t+k}})

Under stationarity,

ρ_k = cov(y_t, y_{t+k}) / σ_y^2 = γ_k / γ_0

1.5.4 Sample Autocorrelation

ρ̂_k = Σ_{t=1}^{T-k} (y_t - ȳ)(y_{t+k} - ȳ) / Σ_{t=1}^{T} (y_t - ȳ)^2

1.5.5 Correlogram

-1 < ρ_k < 1. The plot of ρ̂_k against k is called the correlogram. As an example, let us look at the correlogram of Turkey's GDP.

Figure 1.15: Correlogram of Turkey's GDP

1.5.6 Test for Autocorrelation

Bartlett's test for ρ_k = 0:

H_0: ρ_k = 0,    H_1: ρ_k ≠ 0,    with ρ̂_k ≈ N(0, 1/T) under the null.

Figure 1.16: Turkish Monthly Interest Rates

Figure 1.17: ISE30 Return Correlation

1.5.7 Box-Pierce Q Statistic

To test the joint hypothesis that all the autocorrelation coefficients up to lag m are simultaneously zero, one can use the Q statistic:

Q = T Σ_{k=1}^{m} ρ̂_k^2

where m is the lag length and T the sample size; asymptotically Q ~ χ²_m.

1.5.8 Ljung-Box Statistic

A small-sample variant of the Q statistic:

LB = T(T + 2) Σ_{k=1}^{m} ρ̂_k^2 / (T - k)

Asymptotically LB ~ χ²_m.
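The Ljung-Box statistic is straightforward to compute from the sample autocorrelations. A minimal sketch, with an assumed white-noise sample; under H_0 the statistic should be an unexceptional draw from χ²_m (mean m).

```python
import random

def ljung_box(y, m):
    """Ljung-Box statistic over lags 1..m; approx. chi-squared with m df under H0."""
    t = len(y)
    ybar = sum(y) / t
    den = sum((v - ybar) ** 2 for v in y)
    total = 0.0
    for k in range(1, m + 1):
        rho_k = sum((y[i] - ybar) * (y[i + k] - ybar) for i in range(t - k)) / den
        total += rho_k ** 2 / (t - k)
    return t * (t + 2) * total

rng = random.Random(7)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]
lb = ljung_box(white, 10)
print(round(lb, 2))  # for clean residuals, well below the chi2(10) 5% cutoff of about 18.3
```

In the Box-Jenkins workflow this is the "are the residuals clean?" check: a large LB value relative to the χ²_m critical value signals remaining serial correlation, i.e. an underspecified model.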

Figure 1.18: Box-Pierce Q Statistics

1.5.9 Are the Residuals Clean?

Figure 1.19: Graph of residuals from an AR(1)

1.5.10 Are the Residuals Gaussian?

Figure 1.20

1.5.11 Optimal ARMA order choice: Turkish GDP Growth

Choose the ARMA order to minimize the BIC, over ARMA(1,1), ARMA(1,2), ..., ARMA(4,4):

p\q        1          2          3          4
1       388.3247   387.5649   381.3381   383.4551
2       386.1168   381.3223   385.5452   384.7664
3       390.0506   385.5218   385.4699   377.4532
4       388.1361   386.6513   389.7934   381.5608

ARMA(3,4) minimizes the BIC. What is the maximum lag order to start with? There is no clear rule, but not more than 10% of the sample should be left out; i.e., with 100 observations a maximum lag order of about 10 is more or less the maximum AR level.

1.5.12 Box-Jenkins Approach to Time Series

Figure 1.21: Box-Jenkins Approach

1.6 Forecasting

Figure 1.22: Forecasting

1.6.1 Introduction to Forecasting

y_t = φ_0 + φ_1 y_{t-1} + ε_t

If we want to project future realizations of y we may use

y_{T+1} = φ_0 + φ_1 y_T + ε_{T+1}

Formally, E_T[y_{T+h}] = E(y_{T+h} | y_T, ..., ε_T, ..., ε_1), and

E_T[y_{T+1}] = φ_0 + φ_1 y_T

Two-step-ahead forecast:

E_T[y_{T+2}] = φ_0 + φ_1 E_T[y_{T+1}] = φ_0 + φ_1 (φ_0 + φ_1 y_T) = φ_0 + φ_1 φ_0 + φ_1^2 y_T

Three-step-ahead forecast:

E_T[y_{T+3}] = φ_0 + φ_1 (φ_0 + φ_1 φ_0 + φ_1^2 y_T) = φ_0 + φ_1 φ_0 + φ_1^2 φ_0 + φ_1^3 y_T

h-step-ahead forecast:

E_T[y_{T+h}] = φ_0 (1 + φ_1 + φ_1^2 + ... + φ_1^{h-1}) + φ_1^h y_T
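The h-step formula above is just the one-step rule iterated h times, which a few lines of code make concrete. A minimal sketch; the parameter values φ_0 = 1, φ_1 = 0.5, y_T = 10 are arbitrary illustrations.

```python
def ar1_forecast(phi0, phi1, y_last, h):
    """h-step-ahead AR(1) forecast: iterate E_T[y_{T+j}] = phi0 + phi1 * E_T[y_{T+j-1}]."""
    f = y_last
    for _ in range(h):
        f = phi0 + phi1 * f
    return f

# With phi0 = 1, phi1 = 0.5, y_T = 10:
print(ar1_forecast(1.0, 0.5, 10.0, 1))  # 6.0
print(ar1_forecast(1.0, 0.5, 10.0, 2))  # 4.0
print(ar1_forecast(1.0, 0.5, 10.0, 3))  # 3.0
```

Note how the forecasts head towards the unconditional mean φ_0/(1 - φ_1) = 2 as h grows: for a stationary AR(1), the information in y_T decays at rate φ_1^h, matching the closed-form expression.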

1.6.2 In Practice

y_t = φ_0 + φ_1 y_{t-1} + ε_t

To project the future realizations of y we may use E_T[y_{T+1}] = φ_0 + φ_1 y_T. If we can consistently estimate the order via the AIC, then one can forecast the future values of y. There are alternative measures of forecast accuracy.

1.6.3 Mean Square Prediction Error (MSPE)

Choose the model with the lowest MSPE. If there are R observations in the holdback period, the MSPE for Model 1 is defined as

MSPE = (1/R) Σ_{t=T+1}^{T+R} e_t^2

where e is the prediction error (e.g. e_{T+1} = y_{T+1} - ŷ_{T+1}), and

RMSPE = √( (1/R) Σ_{t=T+1}^{T+R} e_t^2 )

1.6.4 A Forecasting Example for AR(1)

Suppose we are given the estimation sample t = 1, 2, ..., 150 and the holdback period T + 1 = 151, ..., 160, so R = 10, with

y_t = 0.9 y_{t-1} + ε_t

Figure 1.23: AR(1) series with φ_1 = 0.9

Figure 1.24

y_t = φ_1 y_{t-1} + ε_t (for convenience we dropped the intercept). After estimating we found φ̂_1 = 0.9. Suppose we want to forecast t = T + 1, T + 2, ..., T + R with T = 150 and y_150 = -7.16:

E_T[y_{T+1}] = φ_1 y_T,    ŷ_151 = 0.9 × (-7.16) = -6.45

y_151 = -6.26 (actual), so the forecast error is e_151 = y_151 - ŷ_151 = 0.18.

Two-step-ahead forecast: E_T[y_152] = φ_1 (φ_1 y_T) = φ_1^2 y_T = 0.81 × (-7.16), and so forth.

Figure 1.25: Forecast of AR(1) Model

1.6.5 Forecasting Performance

Figure 1.26

MSPE = (1/R) Σ_{t=T+1}^{T+R} e_t^2, where e is the prediction error (e.g. e_{T+1} = y_{T+1} - ŷ_{T+1}), and RMSPE = √MSPE.

Figure 1.27: AR(1) Forecast

Figure 1.28: Forecast Error and Squared Error

1.6.6 Forecast Performance Evaluation

If Model A has a lower RMSE than Model B, then Model A is said to have better forecasting power. Recently, many papers have appeared that formally test whether A is better than B in terms of prediction power, but comparing RMSEs is a good starting point. Seminal papers: Diebold and Mariano (1995); White (2000), the "Reality Check" Econometrica paper; and many more recently.

1.6.7 Forecast Combinations

Assume that there are two competing forecast models, a and b:

y_{T+1} = w y^a_{T+1} + (1 - w) y^b_{T+1}

The forecast error then has the same linear combination:

ε_{T+1} = w ε^a_{T+1} + (1 - w) ε^b_{T+1}

Assuming no correlation between the errors of models a and b,

σ²_{T+1} = w_a² σ²_{a,T+1} + (1 - w_a)² σ²_{b,T+1}

Minimizing over w_a:

∂σ²_{T+1}/∂w_a = 2 w_a σ²_{a,T+1} - 2(1 - w_a) σ²_{b,T+1} = 0
2 w_a σ²_{a,T+1} + 2 w_a σ²_{b,T+1} - 2 σ²_{b,T+1} = 0
w_a (σ²_{a,T+1} + σ²_{b,T+1}) = σ²_{b,T+1}
w_a = σ²_{b,T+1} / (σ²_{a,T+1} + σ²_{b,T+1})

So, if model a has a greater prediction error variance than model b, we give more weight to model b:

y_{T+1} = [σ²_{b,T+1} / (σ²_{a,T+1} + σ²_{b,T+1})] y^a_{T+1} + [σ²_{a,T+1} / (σ²_{a,T+1} + σ²_{b,T+1})] y^b_{T+1}

1.6.8 Forecast Combination Example

Voting behavior: suppose survey company A forecasts the vote share of party X as 40%, and company B forecasts 50%. Past survey performances: σ²_a = 0.3, σ²_b = 0.2. Then

y_{T+1} = [0.2/(0.3 + 0.2)] × 40 + [0.3/(0.3 + 0.2)] × 50 = 0.4 × 40 + 0.6 × 50 = 46%

1.6.9 Using Regression for Forecast Combinations

Run the following regression and then do the forecasts on the basis of the estimated coefficients:

y_{T+1} = β_0 + β_a y^a_{T+1} + β_b y^b_{T+1} + ε_{T+1}

1.7 Summary

- Find the AR, MA orders via autocovariances and correlogram plots.
- Use AIC and SBC to choose the orders.
- Check the LB statistics.
- Run the regression.
- Do forecasting (use RMSE or MSE) to choose the best out-of-sample forecasting model.
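The optimal-weight formula and the voting example can be verified in a couple of lines. A minimal sketch using the numbers from the example above (σ²_a = 0.3, σ²_b = 0.2, forecasts 40 and 50).

```python
def combo_weight_a(var_a, var_b):
    """Variance-minimizing weight on forecast a, assuming uncorrelated errors."""
    return var_b / (var_a + var_b)

w_a = combo_weight_a(0.3, 0.2)
combined = w_a * 40 + (1 - w_a) * 50
print(round(w_a, 3), round(combined, 1))  # 0.4 46.0
```

The less accurate forecaster (A, with the larger error variance) gets the smaller weight, and the combined forecast of 46% lies between the two individual forecasts, closer to the more reliable one.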

Chapter 2

Testing for Stationarity and Unit Roots

Outline:

- What are unit roots? Why are they important?
- Spurious regression
- Tests for unit roots: Dickey-Fuller and Augmented Dickey-Fuller tests
- Stationarity and the random walk
- Can we test via the ACF or Box-Ljung? Why is a formal test necessary?

Source: W. Enders, Chapters 4 and 6.

2.0.1 Spurious Regression

Regressions involving time series data carry the possibility of spurious or dubious results. Two variables carrying the same trend will move together, but this does not mean that there is a genuine or natural relationship between them. If both y_t and x_t are non-stationary, the regression y_t = β_1 x_t + ε_t might display a rather high R² and high t-statistics. Since one of the OLS assumptions was the stationarity of these series, we call such a regression a spurious regression (Granger and Newbold, 1974).

Figure 2.1: Clive Granger
Figure 2.2: Robert Engle

The least squares estimates are not consistent, and the regular tests and inference do not hold. As a rule of thumb (Granger and Newbold, 1974), suspect a spurious regression when R² > d_W (the Durbin-Watson statistic).

2.0.2 Example: Spurious Regression

Two simulated random walks (Arl.xls):

X_t = X_{t-1} + u_t,    u_t ~ N(0, 1)
Y_t = Y_{t-1} + ε_t,    ε_t ~ N(0, 1)

where u_t and ε_t are independent.
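The spurious regression phenomenon is easy to reproduce with simulation, in the spirit of Granger and Newbold's experiment. A minimal sketch; the sample size, seeds, and number of replications are arbitrary choices, and the two walks in each replication are independent by construction.

```python
import random

def random_walk(n, seed):
    rng = random.Random(seed)
    path, s = [], 0.0
    for _ in range(n):
        s += rng.gauss(0.0, 1.0)
        path.append(s)
    return path

def ols_r2(y, x):
    """R^2 from a simple regression of y on x with an intercept."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return (sxy * sxy) / (sxx * syy)

# Regress one independent walk on another, many times over.
r2s = [ols_r2(random_walk(200, 2 * s), random_walk(200, 2 * s + 1)) for s in range(100)]
print(round(sum(r2s) / len(r2s), 2))  # sizeable on average, despite zero true relationship
```

Even though the true relationship is nil, the average R² across replications is far from zero, whereas regressing independent stationary series on each other would give R² near zero. This is exactly why trending levels regressions need the unit root diagnostics developed below.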

Figure 2.3: Random Walks
Figure 2.4: Y_t = β X_t + u_t

2.0.3 Examples: Gozalo

1. Egyptian infant mortality rate (Y), 1971-1990, annual data, on gross aggregate income of American farmers (I) and total Honduran money supply (M):

Ŷ_t = 179.9 + 2.952 I - 4.26 M
      (16.63)  (2.32)  (0.0439)

R² = 0.918, D/W = 0.4752, F = 95.17, CORR = 0.8858, 0.9113, 0.9445

2. Total crime rate in the US (Y), 1971-1991, annual data, on life expectancy of South Africa (X):

Ŷ_t = -24569 + 628.9 X
      (-6.03)  (9.04)

R² = 0.811, D/W = 0.5061, F = 81.72, CORR = 0.9008

2.0.4 Unit Roots: Stationarity

y_t = β_1 y_{t-1} + ε_t:    |β_1| < 1 → stationary AR(1);    β_1 = 1 → unit root!

2.0.5 Some Time Series Models: Random Walk Model

y_t = y_{t-1} + ε_t,    ε_t i.i.d. N(0, σ_ε^2)

where the error term ε_t is white noise with the following properties:

1. E[ε_t] = E[ε_t | ε_{t-1}, ε_{t-2}, ...] = E[ε_t | all information at t-1] = 0
2. E[ε_t ε_{t-j}] = cov(ε_t, ε_{t-j}) = 0
3. var(ε_t) = var(ε_t | ε_{t-1}, ε_{t-2}, ...) = var(ε_t | all information at t-1) = σ_ε^2

Now let us look at the dynamics of such a model. If y_0 = 0:

y_1 = ε_1
y_2 = y_1 + ε_2 = ε_1 + ε_2
y_3 = y_2 + ε_3 = ε_1 + ε_2 + ε_3
...
y_N = ε_1 + ε_2 + ε_3 + ... + ε_N = Σ_{t=1}^{N} ε_t

σ²(y_N) = E[(ε_1 + ... + ε_N)²] = σ² + ... + σ² = Nσ²

so σ²(y_N) → ∞ as N → ∞.

Implications of the random walk:

- The variance of y_t diverges to infinity as N tends to infinity.
- The usefulness of the point forecast of y_{t+1} diminishes as the horizon increases.
- The unconditional variance of y_t is unbounded.
- Shocks in a random walk model do not decay over time, so shocks have a permanent effect on the y series.

Figure 2.5: Random Walk with No Drift
Figure 2.6: Random Walk: BIST 30 Index

Figure 2.7: Random Walk: ISE Percentage Returns
Figure 2.8: Turkish Exports and Imports (in USD mio)

2.0.6 Why is a Formal Test Necessary?

y_t = β_1 y_{t-1} + ε_t

Testing β_1 = 1 with a t-test is not feasible: under the null, var(y_t) → ∞, so the standard t-test is not applicable. For instance, the daily Brent oil series in the graph below displays non-stationary behaviour.

Figure 2.9: Brent Oil Historical Data

2.0.7 How Instructive is the ACF?

Figure 2.10: Correlogram of daily Brent oil

Does the crude oil series follow a random walk (or, does it contain a unit root)? Neither the graph nor the autocovariance function is formal proof of a random walk. How about the standard t-test?

2.0.8 Testing for Unit Roots: Dickey-Fuller

We estimated the daily crude oil data with the following specification:

y_t = β_1 y_{t-1} + ε_t

The estimated regression is

ŷ_t = 1.000335 y_{t-1},    SE = 0.000305

The t-statistic (against β_1 = 0) is about 3277, but it would not be appropriate to use this information to reject the null of a unit root: the t-test is not valid under that null. Dickey and Fuller (1979, 1981) developed a formal test for unit roots. Hypothesis tests based on non-stationary variables cannot be evaluated analytically, but the non-standard test statistics can be obtained via Monte Carlo.

2.0.9 Dickey-Fuller Test

y_t = β_1 y_{t-1} + u_t    (pure random walk)
y_t = β_0 + β_1 y_{t-1} + u_t    (random walk with drift)
y_t = β_0 + β_1 y_{t-1} + β_2 t + u_t    (random walk with drift and time trend)

These are the three versions of the Dickey-Fuller (DF) unit root test. The null hypothesis is the same in all versions: β_1 = 1. Now, if we subtract y_{t-1} from each side, the test involves estimating one of the following specifications:

Δy_t = γ y_{t-1} + u_t
Δy_t = β_0 + γ y_{t-1} + u_t
Δy_t = β_0 + β_2 t + γ y_{t-1} + u_t

where γ = β_1 - 1. So we run the regression and test whether the slope γ is significant; the test statistic is computed like a conventional t-test but compared against Dickey-Fuller critical values. Hence

H_0: γ = 0
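The DF regression with a constant can be run by hand to see why the statistic separates the two cases. A minimal sketch, not a substitute for a proper implementation: the simulated series and seed are assumptions, and the -2.86 benchmark quoted in the comments is the approximate large-sample 5% Dickey-Fuller critical value for the constant-only case, taken from the DF tables rather than computed here.

```python
import math
import random

def df_stat(y):
    """t-ratio on gamma in: Delta y_t = b0 + gamma * y_{t-1} + u_t."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    x = y[:-1]
    n = len(dy)
    xbar, dbar = sum(x) / n, sum(dy) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    gamma = sum((xi - xbar) * (di - dbar) for xi, di in zip(x, dy)) / sxx
    b0 = dbar - gamma * xbar
    ssr = sum((di - b0 - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)  # conventional OLS standard error
    return gamma / se

rng = random.Random(11)
walk, s = [], 0.0
for _ in range(500):
    s += rng.gauss(0.0, 1.0)
    walk.append(s)                      # true unit root: gamma = 0
stationary, prev = [], 0.0
for _ in range(500):
    prev = 0.5 * prev + rng.gauss(0.0, 1.0)
    stationary.append(prev)             # true gamma = -0.5

print(round(df_stat(walk), 2))        # typically above -2.86: cannot reject a unit root
print(round(df_stat(stationary), 2))  # far below -2.86: reject the unit root
```

The statistic is computed exactly like a t-ratio; what is non-standard is its distribution under the null, which is why it must be compared to the DF tables rather than to ±1.96.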

Figure 2.11: Running DF regression

Figure 2.12: Testing DF in E-views

Figure 2.13: DF E-views
Figure 2.14: Testing DF for other specifications: RW with trend

2.0.10 Dickey-Fuller F-test

Δy_t = β_0 + γ y_{t-1} + u_t:    H_0: β_0 = γ = 0
Δy_t = β_0 + β_2 t + γ y_{t-1} + u_t:    H_0: β_0 = β_2 = γ = 0

These joint hypotheses are tested with F-type statistics, calculated in the conventional way but compared against the critical values found in the Dickey-Fuller tables.

2.0.11 ADF: Augmented Dickey-Fuller Test

Δy_t = β_0 + γ y_{t-1} + u_t

Granger points out that if the above equation has serially correlated errors, the test is invalid. He suggested that lags of Δy_t be included to remove the serial correlation.

This augmented test is known as the Augmented Dickey-Fuller (ADF) test:

Δy_t = β_0 + γ y_{t-1} + Σ_{i=1}^{p} α_i Δy_{t-i} + u_t
Δy_t = β_1 + β_2 t + γ y_{t-1} + Σ_{i=1}^{m} α_i Δy_{t-i} + u_t

With the ADF test we can handle the autocorrelation problem. The number of lags included, m, should be big enough that the error term is not serially correlated. The null hypothesis is again the same. Let us consider the GDP example again.

Figure 2.15: Augmented Dickey Fuller Test

Figure 2.16: Augmented Dickey Fuller Test

For the figure above, at the 99% confidence level, we cannot reject the null.

Figure 2.17: Augmented Dickey Fuller Test

For the figure above, at the 99% confidence level, we reject the null. This time we augmented the regression to handle serial correlation. Note that because GDP is not stationary in levels but is stationary in first differences, it is called integrated of order one, I(1). A stationary series is then I(0).

Δy_t = β_1 + β_2 t + γ y_{t-1} + Σ_{i=1}^{p} α_i Δy_{t-i} + u_t

The Augmented Dickey-Fuller (ADF) test was proposed in order to handle the autocorrelation problem. The number of lags included, p, should be big enough that the error term is not serially correlated; in practice we use either the SBC or the AIC to clean the residuals. The null hypothesis is again the same.

Δy_t = γ y_{t-1} + Σ_{i=1}^{p} α_i Δy_{t-i} + ε_t,    H_0: γ = 0
Δy_t = δ + γ y_{t-1} + Σ_{i=1}^{p} α_i Δy_{t-i} + ε_t,    H_0: δ = γ = 0
Δy_t = δ + φt + γ y_{t-1} + Σ_{i=1}^{p} α_i Δy_{t-i} + ε_t,    H_0: φ = δ = γ = 0

Example: daily Brent oil.

Figure 2.18: Augmented Dickey Fuller Test, Daily Brent Oil

We cannot reject the null of a unit root, so crude oil levels may behave like a random walk.

Figure 2.19: Correlogram of Interest Rates
Figure 2.20: Short and Long rates: Trl30 and 360

I(1) and I(0) Series

If a series is stationary, it is said to be I(0). If a series is not stationary but its first difference is stationary, it is said to be difference stationary, or I(1). The next section will investigate the joint stationarity behaviour of more than one time series, known as co-integration.

2.1 Questions

1. Descriptive analysis of time series (stylized facts)

a. In the following data set you are given US and Turkish GDP growth rates and inflation data.
b. In addition, download both GDP and inflation for:
   i. the EU as one country
   ii. Germany
   iii. South Korea, South Africa, Brazil, India and China
c. GDP growth data are quarterly (like the Turkish and US GDP growth series used in my data set).
d. Inflation data are monthly (so use monthly US and EU inflation rates; source: ECB, Eurostat, OECD or any other source; year on year).
e. Countries with similar credit ratings (Brazil, South Africa, India) can constitute one group that is comparable with Turkey.
f. Find the sample mean, sample variance, skewness, and excess kurtosis estimates for these data; compare your findings across different years and between the US and Turkey.
g. For Turkey, divide the sample into two (before and after April 2001).
h. The Sharpe ratio, given here as E(X)/Var(X), is a critical performance measure. Look at Turkish and US GDP and briefly comment on it.
i. Calculate autocovariances and autocorrelations (between y_t and y_{t-i}, i = 1, ..., 50) for the quarterly and monthly series separately (use the attached Matlab code, translate it into R, or use Stata or EVIEWS).
j. Fit an AR(1) to all four series (Brazil, South Africa, India and Turkey) and compare your results.
k. Compare the unconditional means of the Turkish GDP and inflation series.
l. Compare the Turkish inflation series with the others (i.e. try to identify similarities and differences between emerging market countries and developed markets).

2. AR(1) simulation, given the following model:

y_t = φ_0 + φ_1 y_{t-1} + ε_t

Simulate the AR(1) process for φ_1 = 1, 0.95, 0.85, 0.5, 0.25, 0.10:

a. for two cases, with and without a drift term (i.e. φ_0 = 0, 1);
b. What are the differences and similarities in the time series behaviour of these AR series?
c. What can be said about the unconditional mean of each series? Compare your simulated AR(1) sample averages with the theoretical unconditional means of the AR(1) model (as we did in our lab session).
d. Draw the autocovariances for each of the series you generate.
e. Estimate Turkish GDP as an AR(1) model and use the coefficients for simulation. Compare the actual GDP and the simulated AR(1).

3. Using the above series, estimate AR(p) and MA(q) models and find the most suitable ARMA(p,q) combination using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion.

4. On the basis of your optimal lag order choice above, do a forecasting exercise for GDP and inflation for Turkey and the US.

5. Testing with unit roots:

a. Conduct unit root tests (ADF) for five of the GDP or inflation series you used in PS1.
b. State in a table which series are I(0) and which are I(1), or I(2) if any.
c. If you find any of your series to be I(1), conduct an ARIMA forecast.

6. AR(1) simulation, given the model y_t = φ_0 + φ_1 y_{t-1} + ε_t. Simulate the AR(1) process for φ_1 = 1, 0.99, 0.95, 0.90, 0.5. Note: in this question you need to simulate a total of 3 × 5 = 15 time series.

a. without drift (φ_0 = 0);
b. with a drift (φ_0 = 1);
c. with drift and time trend: y_t = φ_0 + φ_1 y_{t-1} + φ_2 t + ε_t;
d. then conduct ADF tests for each of these three specifications;
e. summarize your findings in a table.

7. Using TUIK (or any other data source such as TCMB), download total money supply, GDP (levels), exports and imports. (Use TL as the currency of GDP.)

a. Test for the existence of cointegration between money supply and GDP.
b. Test for the existence of cointegration between exports and imports.
c. If there is a co-integration relationship in the above cases, test for the existence of an error correction mechanism.
d. Comment clearly on the speed-of-adjustment coefficient.

Chapter 3

COINTEGRATION

Economic theory implies equilibrium relationships between the levels of time series variables that are best described as I(1). Similarly, arbitrage arguments imply that the I(1) prices of certain financial time series are linked (two stocks, two emerging market bonds, etc.). If two (or more) series are themselves non-stationary (I(1)), but a linear combination of them is stationary (I(0)), then these series are said to be co-integrated.

Examples: inflation and interest rates; exchange rates and inflation rates; money demand (inflation, interest rates, income).

Figure 3.1: Consumption and Income (logs)

Figure 3.2: Brent vs WTI

Figure 3.3: Crude oil Futures

Figure 3.4: US Treasury 2-year vs 30-year

3.0.1 Money Demand Stability

m^d_t = β_0 + β_1 r_t + β_2 y_t + β_3 inf_t + ε_t

where r is the interest rate, y is income, and inf is inflation. Each series in the above equation may be non-stationary (i.e. I(1)), but the money demand relationship may be stationary: all of the series may wander around individually, yet money demand is stable as an equilibrium relationship. In other words, even though the series themselves may be non-stationary, they move closely together over time and their difference is stationary.

Consider the m time series variables y_{1,t}, y_{2,t}, ..., y_{m,t} known to be non-stationary, i.e. suppose y_{i,t} ~ I(1), i = 1, 2, ..., m. Then y_t = (y_{1,t}, y_{2,t}, ..., y_{m,t}) is said to form one or more cointegrating relations if there are linear combinations of the y_{i,t} that are I(0). Here r denotes the number of cointegrating vectors.

3.0.2 Testing for Cointegration

Engle-Granger residual-based tests (Econometrica, 1987).

Step 1: run an OLS regression of y_{1,t} (say) on the rest of the variables, namely y_{2,t}, y_{3,t}, ..., y_{m,t}:

y_{1,t} = Σ_{i=2}^{m} β_i y_{i,t} + u_t

and save the residuals from this regression.

Step 2: apply a Dickey-Fuller unit root test to the residuals:

û_t = β_1 û_{t-1} + ε_t

3.0.3 Residual-Based Cointegration Test: Dickey-Fuller Test

Δû_t = δ + γ û_{t-1} + ε_t,    where γ = β_1 - 1.    Hence H_0: γ = 0.

Therefore, testing for co-integration amounts to testing whether the residuals from a combination of I(1) series are I(0). If u is I(0), we conclude that even though the individual series are I(1), their linear combination is I(0). This means there is an equilibrium vector, and if the variables divert from equilibrium they will converge back to it at a later date. If the residuals appear to be I(1), then there does not exist any co-integration relationship, implying that inference obtained from these variables is not reliable.

Higher-order integration: if two series are I(2), they might still have an I(1) cointegrating relationship.

3.0.4 Example of Cointegration: Brent-WTI Regression

Null hypothesis: RESID01 has a unit root. Exogenous: constant. Lag length: 0 (automatic, based on SIC, maxlag = 12).

Augmented Dickey-Fuller test statistic: t = -4.226414, Prob.* = 0.0009
Test critical values: 1% level -3.487550; 5% level -2.886509; 10% level -2.580163

So we reject the null.

3.0.5 Example of ECM

The ECM that can be formed is

Δy_t = α + β Δx_t + λ û_{t-1}

where λ is the speed of adjustment towards equilibrium. λ < 0 is expected; since the error is being corrected, its absolute value is expected to lie between 0 and 1. û_{t-1} is the equilibrium error.
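The Engle-Granger two-step procedure can be sketched on simulated data: build a cointegrated pair, recover the cointegrating coefficient by OLS, and check that the residuals mean-revert. The data-generating parameters (β = 1.1, residual AR coefficient 0.3, T = 1000, seed) are illustrative assumptions, and note that formal inference on the residuals requires Engle-Granger critical values, not the standard DF tables.

```python
import random

rng = random.Random(5)
x, y = [], []
s, u_prev = 0.0, 0.0
for _ in range(1000):
    s += rng.gauss(0.0, 1.0)                    # x is a random walk, I(1)
    u_prev = 0.3 * u_prev + rng.gauss(0.0, 1.0)  # stationary equilibrium error
    x.append(s)
    y.append(1.1 * s + u_prev)                   # y cointegrated with x, beta = 1.1

# Step 1: OLS of y on x (no intercept, matching the DGP), save residuals.
beta = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
resid = [yi - beta * xi for xi, yi in zip(x, y)]

# Step 2: AR(1) on the residuals; a root well below 1 signals I(0) residuals.
rho = sum(resid[t - 1] * resid[t] for t in range(1, len(resid))) / \
      sum(r * r for r in resid[:-1])
print(round(beta, 2))  # close to 1.1 (OLS is superconsistent here)
print(round(rho, 2))   # well below 1: residuals mean-revert
```

The OLS estimate of the cointegrating coefficient converges quickly (superconsistency), and the residual autoregression having a root far from 1 is precisely the pattern that leads to rejecting the no-cointegration null in step 2.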

Estimation of the ECM. λ is the speed-of-adjustment coefficient, and û_{t-1} = y_{t-1} - β̂ x_{t-1} is the equilibrium error. If the system is hit by a random shock, λ pushes the system back to equilibrium. The sign and magnitude of λ are the main determinants of the ECM: it is negative, and its size shows the speed with which the error is corrected.

3.0.6 Error Correction Term

The error correction term tells us the speed with which our model returns to equilibrium after an exogenous shock. It should have a negative sign, indicating a move back towards equilibrium; a positive sign indicates movement away from equilibrium. The coefficient should lie between 0 and 1 in absolute value: 0 suggests no adjustment one time period later, 1 indicates full adjustment within one period.

An Example

Are Turkish interest rates with different maturities (1 month versus 12 months) co-integrated?

Step 1: test each series for I(1).
Step 2: test whether the two series move together in the long run. If yes, set up an error correction mechanism.

Figure 3.5: TRLGOV30 TRLGOV360

Figure 3.6

Figure 3.7

So both of these series are non-stationary, i.e. I(1). Now we test whether there exists a linear combination of these two series that is stationary.

Test for cointegration: run

r^{360}_t = β r^{30}_t + u_t

and then another ADF on the residuals:

Δû_t = δ + γ û_{t-1} + ε_t

Figure 3.8: Test for co-integration

Estimate the ECM: both r^{360}_t and r^{30}_t are I(1) and û_t is I(0), so we have an equilibrium relationship, and the ECM can be written as

Δr^{360}_t = α + λ (r^{360}_{t-1} - β̂ r^{30}_{t-1}) + ε_t

Δr^{360}_t = 0.0032 - 0.099874 (r^{360}_{t-1} - 1.1 r^{30}_{t-1})

Figure 3.9: Residual Actual Fitted

Figure 3.10: ECM Regression

3.0.7 Uses of Cointegration in Economics and Finance

- Purchasing power parity: the FX rate difference between two countries equals the inflation difference (Big Mac index, etc.).
- Uncovered interest rate parity: the exchange rate can be determined by interest rate differentials.
- Interest rate expectations: long and short rates of interest should move together.
- Consumption and income.
- HEDGE FUNDS! (The ECM can be used to make money!)

3.0.8 Conclusion

Testing for co-integration via the ADF is easy but can run into problems when the relationship is more than 2-dimensional (Johansen's method is more suitable). Nonlinear co-integration, near unit roots, and structural breaks are also important. In any case, the stationarity and long-run relationships of macro time series should be investigated in detail.


Chapter 4

Multiple Equation Systems

Outline:

- Simultaneous equations
- Structural versus reduced form models
- Inconsistency of OLS
- Simultaneous equations in matrix form
- VAR models

4.1 Seemingly Unrelated Regression Model

y_{1,t} = β_{1,0} + β_{1,1} x_t + u_{1,t}
y_{2,t} = β_{2,0} + β_{2,1} x_t + u_{2,t}

u_t = (u_{1,t}, u_{2,t})' ~ i.i.d. N( (0, 0)', Σ ),    Σ = [ σ_1²  σ_{1,2} ; σ_{2,1}  σ_2² ]

The x's are exogenous and the disturbances are contemporaneously correlated.

y_{1,t} = β_{1,0} + β_{1,1} x_{1,t} + β_{1,2} y_{2,t} + u_{1,t}
y_{2,t} = β_{2,0} + β_{2,1} x_{2,t} + β_{2,2} y_{1,t} + u_{2,t}

with the same distributional assumption on u_t. Here both endogenous variables (y_1 and y_2) are incorporated into the system.

4.1.1 Recursive Systems

y_{1,t} = β_{1,3} x_{1,t} + u_{1,t}
y_{2,t} = β_{2,2} y_{1,t} + β_{2,3} x_{2,t} + u_{2,t}
y_{3,t} = β_{3,1} y_{1,t} + β_{3,2} y_{2,t} + β_{3,3} x_{3,t} + u_{3,t}

u_t ~ i.i.d. N( 0, diag(σ_1², σ_2², σ_3²) )

4.1.2 Structural and Reduced Forms

y_{1,t} - β_1 y_{2,t} = u_{1,t}
-β_2 y_{1,t} + y_{2,t} - α x_t = u_{2,t}

This model is known as the structural model, which can be represented as

B y_t + A x_t = u_t

y_t = (y_{1,t}, y_{2,t})',    B = [ 1  -β_1 ; -β_2  1 ],    A = [ 0 ; -α ],    u_t = (u_{1,t}, u_{2,t})'

Reduced Form

In order to express everything in terms of y:

y_t = -B^{-1} A x_t + B^{-1} u_t = Π x_t + v_t,    Π = -B^{-1} A,    v_t = B^{-1} u_t

Why do simultaneous equations matter? They matter because the system treatment avoids endogeneity, so we can avoid inconsistent estimators. In addition, in the past such systems were very useful for macroeconomic forecasting, though after the 1980s their forecasting power turned out to be rather weak.

4.1.3 Some Simultaneous Equation Models

Demand and supply model:

Q^d_t = α_0 + α_1 P_t + u_{1,t},    α_1 < 0
Q^s_t = β_0 + β_1 P_t + u_{2,t},    β_1 > 0
Q^d_t = Q^s_t

Figure 4.1: Some Simultaneous Equation Models

Figure 4.2: Some Simultaneous Equation Models

Figure 4.3: Some Simultaneous Equation Models

4.1.4 Keynesian Income Function

Y_t = C_t + I_t
C_t = β_0 + β_1 Y_t + u_t

Figure 4.4: Some Simultaneous Equation Models

4.1.5 Inconsistency of OLS

OLS may not be applied to estimate a single equation embedded in a system of simultaneous equations if one or more of the explanatory variables are correlated with the disturbance term in that equation. Result: the estimators thus obtained are inconsistent. Let us consider the previous model.

Substituting C:

Y_t = β_0 + β_1 Y_t + I_t + u_t
Y_t = β_0/(1 − β_1) + I_t/(1 − β_1) + u_t/(1 − β_1)
E[Y_t] = (β_0 + Ī_t)/(1 − β_1)

cov(Y_t, u_t) = E[(Y_t − E[Y_t])(u_t − E[u_t])]

Since Y_t − E[Y_t] = u_t/(1 − β_1) and u_t − E[u_t] = u_t,

cov(Y_t, u_t) = E[u_t²/(1 − β_1)] = σ²/(1 − β_1)

Now consider the OLS estimator (lower-case y_t denotes the deviation Y_t − Ȳ):

β̂_1 = Σ(C_t − C̄)(Y_t − Ȳ) / Σ(Y_t − Ȳ)²
β̂_1 = Σ C_t y_t / Σ y_t²
β̂_1 = Σ(β_0 + β_1 Y_t + u_t) y_t / Σ y_t²
β̂_1 = β_0 Σ y_t / Σ y_t² + β_1 Σ Y_t y_t / Σ y_t² + Σ u_t y_t / Σ y_t²
β̂_1 = β_1 + Σ u_t y_t / Σ y_t²
E[β̂_1] = β_1 + E[Σ u_t y_t / Σ y_t²]

We cannot evaluate this last term via the E(·) operator: the expectation operator is linear, so it cannot be pushed inside the ratio, and we know that u and Y are not independent. Taking probability limits instead:

plim β̂_1 = β_1 + plim( Σ u_t y_t / Σ y_t² )
plim β̂_1 = β_1 + plim( Σ u_t y_t / n ) / plim( Σ y_t² / n )

plim β̂_1 = β_1 + (σ²/(1 − β_1)) / σ²_Y = β_1 + (1/(1 − β_1)) · (σ²/σ²_Y)

So β̂_1 is inconsistent. Recall that by substituting C we obtained

cov(Y_t, u_t) = σ²/(1 − β_1)
Y_t = β_0/(1 − β_1) + I_t/(1 − β_1) + u_t/(1 − β_1)

A reduced form equation is one that expresses an endogenous variable solely in terms of predetermined variables and a stochastic disturbance. So if we rewrite this as

Y_t = Π_0 + Π_1 I_t + w_t,  where Π_0 = β_0/(1 − β_1), Π_1 = 1/(1 − β_1), w_t = u_t/(1 − β_1),

this is the reduced form equation for Y. In the same way we can derive the reduced form for C.

4.1.6 Underidentification

Consider the Demand & Supply model:

Q^s_t = β_0 + β_1 P_t + u_{2,t},  β_1 > 0
Q^d_t = α_0 + α_1 P_t + u_{1,t},  α_1 < 0
Q^d_t = Q^s_t

Reduced forms:

α_0 + α_1 P_t + u_{1,t} = β_0 + β_1 P_t + u_{2,t}
(α_1 − β_1) P_t = β_0 − α_0 + u_{2,t} − u_{1,t}
P_t = (β_0 − α_0)/(α_1 − β_1) + (u_{2,t} − u_{1,t})/(α_1 − β_1)
P_t = Π_0 + v_t

Substituting P_t into the demand equation:

Q_t = α_0 + α_1 [ (β_0 − α_0)/(α_1 − β_1) + (u_{2,t} − u_{1,t})/(α_1 − β_1) ] + u_{1,t}
Q_t = (α_1 β_0 − α_0 β_1)/(α_1 − β_1) + (α_1 u_{2,t} − β_1 u_{1,t})/(α_1 − β_1)
Q_t = Π_1 + w_t

Now we have two reduced form parameters (Π_0, Π_1) which involve all four structural parameters. So we have two equations and four unknowns, and there is no unique solution. If we estimate the reduced forms, all we obtain are the mean values of price and quantity, nothing more! We cannot identify the demand or the supply function.

4.1.7 Test for the Simultaneity Problem

Hausman Specification Test

Demand: Q = α_0 + α_1 P + α_2 I + α_3 R + u_1
Supply: Q = β_0 + β_1 P + u_2

Figure 4.5: Estimation of Simultaneous Equations

4.1.8 Simultaneous Equations in Matrix Form

Full Information Maximum Likelihood (FIML) Estimation. We have:

Γ y_t + C x_t = u_t

or, in reduced form,

y_t = Π x_t + v_t,  v_t ~ iid N(0, Ω),  where Ω = Γ⁻¹ Σ (Γ⁻¹)'

The likelihood function is given by

L = (2π)^{−nT/2} |Ω|^{−T/2} exp[ −(1/2) Σ_{t=1}^T (y_t − Π x_t)' Ω⁻¹ (y_t − Π x_t) ]

Consistent estimates are available with FIML; however, FIML is very sensitive to correct specification of the system.
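The inconsistency of OLS derived in Section 4.1.5 can be checked by Monte Carlo. This Python sketch, with illustrative parameter values, simulates the Keynesian model C_t = β_0 + β_1 Y_t + u_t, Y_t = C_t + I_t, runs OLS of C on Y, and compares the estimate with the theoretical plim β_1 + (σ²/(1 − β_1))/σ²_Y:

```python
import random

random.seed(1)
b0, b1, sigma = 10.0, 0.6, 2.0   # illustrative structural parameters
T = 50000

Y, C = [], []
for _ in range(T):
    I = random.gauss(20, 5)          # exogenous investment, mean 20, sd 5
    u = random.gauss(0, sigma)
    y = (b0 + I + u) / (1 - b1)      # reduced form for income
    Y.append(y)
    C.append(b0 + b1 * y + u)

my, mc = sum(Y) / T, sum(C) / T
b1_hat = sum((y - my) * (c - mc) for y, c in zip(Y, C)) / \
         sum((y - my) ** 2 for y in Y)

# Theoretical plim: b1 + (sigma^2 / (1 - b1)) / var(Y),
# with var(Y) = (var(I) + sigma^2) / (1 - b1)^2 from the reduced form
varY = (5 ** 2 + sigma ** 2) / (1 - b1) ** 2
plim = b1 + (sigma ** 2 / (1 - b1)) / varY
print(b1_hat, plim)
```

Even with 50,000 observations the OLS estimate does not approach the true β_1 = 0.6; it settles on the biased plim instead, which is the point of the derivation.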

Chapter 5 Vector Autoregression (VAR)

Proposed in the 1980s by Christopher Sims, the VAR is an econometric model used to capture the evolution of, and the interdependencies among, multiple economic time series. It generalizes the univariate AR models. All the variables in a VAR system are treated symmetrically: each is explained by its own lags and the lags of all the other variables in the model. VAR models serve as a theory-free method of estimating economic relationships, and they constitute an alternative to the identification restrictions of structural models.

5.0.1 Why VAR

Figure 5.1: Christopher Sims, Princeton (Nobel prize winner, 2011). First VAR paper in 1980.

5.0.2 VAR Models

In a vector autoregression, all variables are regressed on their own and the others' lagged values. For example, a simple VAR model is

y_{1t} = m_1 + a_{11} y_{1,t−1} + a_{12} y_{2,t−1} + ε_{1t}
y_{2t} = m_2 + a_{21} y_{1,t−1} + a_{22} y_{2,t−1} + ε_{2t}

or, in matrix form,

(y_{1t}, y_{2t})' = (m_1, m_2)' + [[a_{11}, a_{12}], [a_{21}, a_{22}]] (y_{1,t−1}, y_{2,t−1})' + (ε_{1t}, ε_{2t})'

which is called a VAR(1) model of dimension 2:

y_t = m + A y_{t−1} + ε_t

Generally, a VAR(p) model of dimension k is

y_t = m + A_1 y_{t−1} + A_2 y_{t−2} + ... + A_p y_{t−p} + ε_t

where each A_i is a k × k matrix of coefficients, and m and ε_t are k × 1 vectors. Furthermore,

E[ε_t] = 0 for all t,  E[ε_t ε_s'] = Ω for t = s,  E[ε_t ε_s'] = 0 for t ≠ s.

There is no serial correlation, but there can be contemporaneous correlation.

5.0.3 An Example

VAR model: 1-month and 12-month TRY interest rates, monthly data, estimated with two lags:

y_t = m + A_1 y_{t−1} + A_2 y_{t−2} + ε_t

Figure 5.2: Regression Output

Â_1 = [[0.78, 0.58], [1.25, 0.06]]
Â_2 = [[0.06, 0.50], [0.28, 0.03]]

Akaike Information Criterion: 4.089038
Schwarz Criterion: 3.914965
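Simulating a small VAR(1) makes the definitions above concrete. This Python sketch uses illustrative (stable) coefficients, not the estimated Â matrices from the example, and checks the sample mean against the unconditional mean, which solves (I − A)µ = m:

```python
import random

random.seed(3)
m = [0.5, 0.2]                     # intercept vector (illustrative)
A = [[0.5, 0.1], [0.2, 0.4]]       # stable coefficient matrix (illustrative)
T = 10000

y = [0.0, 0.0]
ys = []
for _ in range(T):
    e = [random.gauss(0, 1), random.gauss(0, 1)]
    y = [m[0] + A[0][0] * y[0] + A[0][1] * y[1] + e[0],
         m[1] + A[1][0] * y[0] + A[1][1] * y[1] + e[1]]
    ys.append(y)

# Unconditional mean: mu = (I - A)^{-1} m, solved for the 2x2 case
det = (1 - A[0][0]) * (1 - A[1][1]) - A[0][1] * A[1][0]
mu = [((1 - A[1][1]) * m[0] + A[0][1] * m[1]) / det,
      (A[1][0] * m[0] + (1 - A[0][0]) * m[1]) / det]

mean1 = sum(v[0] for v in ys) / T   # sample mean of the first variable
print(mu, mean1)
```

Because the eigenvalues of A lie inside the unit circle, the simulated series is stationary and its sample mean settles near the unconditional mean.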

5.0.4 Hypothesis Testing

To test whether a VAR with lag order 8 is preferred to lag order 10:

(T − c)(log|Σ_r| − log|Σ_u|) ~ χ²,  df = number of restrictions

where T is the number of observations, c is the number of parameters estimated in each equation of the unrestricted system, and log|Σ_u| is the log of the determinant of the unrestricted residual covariance matrix (log|Σ_r| the restricted one).

Impulse Response Functions

Suppose we want to see the reaction of our simple VAR(1) model to a one-time shock ε_1 = (1, 0)', with all later shocks zero, where

A = [[0.4, 0.1], [0.2, 0.5]],  y_0 = 0.

y_1 = A y_0 + ε_1 = (1, 0)'
y_2 = A y_1 = [[0.4, 0.1], [0.2, 0.5]] (1, 0)' = (0.4, 0.2)'
y_3 = A y_2 = [[0.4, 0.1], [0.2, 0.5]] (0.4, 0.2)' = (0.18, 0.18)'

Figure 5.3: Response to Cholesky One S.D. Innovations ± 2 S.E.
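The impulse response iteration above is just repeated multiplication by A. A short Python check that reproduces the numbers in the example:

```python
# Impulse response of the VAR(1) example: y_0 = 0, one-time shock e_1 = (1, 0)',
# then iterate y_{t+1} = A y_t with no further shocks.
A = [[0.4, 0.1], [0.2, 0.5]]

def step(A, y):
    """One VAR(1) propagation step: returns A @ y for the 2x2 case."""
    return [A[0][0] * y[0] + A[0][1] * y[1],
            A[1][0] * y[0] + A[1][1] * y[1]]

y1 = [1.0, 0.0]   # y_1 equals the shock itself
y2 = step(A, y1)  # (0.4, 0.2)'
y3 = step(A, y2)  # (0.18, 0.18)'
print(y1, y2, y3)
```

Because A is stable, continuing the iteration drives the response back to zero, which is the decaying pattern the impulse response plots display.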

5.1 Questions

1. Descriptive analysis of time series (stylized facts)
a) In the following data set you are given US and Turkish GDP growth rates and inflation data.
b) In addition, download both GDP and inflation for:
i. the EU as one country
ii. Germany
iii. South Korea, South Africa, Brazil, India and China
c) GDP growth data are quarterly (like the Turkish and US GDP growth series used in my data set).
d) Inflation data are monthly (so use monthly US and EU inflation rates, year on year; source: ECB, Eurostat, OECD or any other source).
e) Countries with similar credit ratings (Brazil, South Africa, India) can constitute one group comparable with Turkey.
f) Find the sample mean, sample variance, skewness and excess kurtosis estimates for these data; compare your findings across years and between the US and Turkey.
g) For Turkey, divide the sample into two (before and after April 2001).
h) The Sharpe ratio, E(X)/StdDev(X), is a critical performance measure. Look at Turkish and US GDP and briefly comment on it.
i) Calculate autocovariances and autocorrelations (between y(t) and y(t−i), i = 1,...,50) for the quarterly and monthly series separately. (Use the Matlab code attached, or translate it into R, or use Stata or EViews.)
j) Fit an AR(1) to all four series (Brazil, South Africa, India and Turkey) and compare your results.
k) Compare the unconditional means of the Turkish GDP and inflation series.
l) Compare the Turkish inflation series with the others (i.e. try to identify similarities and differences between emerging-market and developed-market countries).

2. AR(1) simulation. Given the model

y_t = φ_0 + φ_1 y_{t−1} + ε_t

simulate the AR(1) process for φ_1 = 1, 0.95, 0.85, 0.5, 0.25, 0.10, for two cases, with and without a drift term (i.e. φ_0 = 0 and φ_0 = 1).

b) What are the differences and similarities in the time series behaviour of these AR series?
c) What can be said about the unconditional mean of each series? Compare your simulated AR(1) sample averages with the theoretical unconditional means of the AR(1) model (as we did in our lab session).
d) Draw the autocovariances for each of the series you generate.
e) Estimate Turkish GDP as an AR(1) model and use the coefficients for simulation. Compare the actual GDP and the simulated AR(1).

3. Using the above series, estimate AR(p) and MA(q) models and find the most suitable ARMA(p,q) combination, using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion.

4. On the basis of your optimal lag order choice above, do a forecasting exercise for GDP and inflation for Turkey and the US.

5. Testing for unit roots:
a) Conduct unit root tests (ADF) for five of the GDP or inflation series you used in PS1.
b) State in a table which series are I(0), I(1) or I(2), if any.
c) If you find any of your series to be I(1), then produce an ARIMA forecast.

6. AR(1) simulation. Given the model

y_t = φ_0 + φ_1 y_{t−1} + ε_t

simulate the AR(1) process for φ_1 = 1, 0.99, 0.95, 0.90, 0.5. Note: in this question you need to simulate a total of 3 × 5 = 15 time series.
a) Without drift: φ_0 = 0.
b) With a drift: φ_0 = 1.
c) With drift and time trend: y_t = φ_0 + φ_1 y_{t−1} + φ_2 t + ε_t.
d) Then conduct ADF tests for each of these three specifications.
e) Summarize your findings in a table.

7. Using TUIK (or any other data source, such as TCMB), download the total money supply, GDP (levels), exports and imports (use TL as the currency of GDP):
a) Test for the existence of cointegration between money supply and GDP.
b) Test for the existence of cointegration between exports and imports.
c) If there is a cointegration relationship in the above cases, test for the existence of an error correction mechanism.
d) Comment clearly on the speed-of-adjustment coefficient.
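The questions reference attached MATLAB code; for the AR(1) simulation exercises, a Python starter along the following lines may be used instead (parameter values follow the question; the innovation variance of 1 is an assumption):

```python
import random

random.seed(4)

def simulate_ar1(phi0, phi1, T=500):
    """Simulate y_t = phi0 + phi1 * y_{t-1} + e_t, with y_0 = 0 and e_t ~ N(0,1)."""
    y = [0.0]
    for _ in range(T - 1):
        y.append(phi0 + phi1 * y[-1] + random.gauss(0, 1))
    return y

# One series per persistence value, without drift (phi0 = 0); phi1 = 1 is the unit root case
series = {phi1: simulate_ar1(0.0, phi1) for phi1 in (1.0, 0.95, 0.5, 0.1)}

# For the stationary cases the sample mean should be near the
# theoretical unconditional mean phi0 / (1 - phi1) = 0
mean_half = sum(series[0.5]) / len(series[0.5])
print(mean_half)
```

Plotting the stored series side by side makes the contrast between the near-unit-root cases and the quickly mean-reverting cases immediate, which is what parts (b) and (c) ask about.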


Chapter 6 Asymptotic Distribution Theory

Econometric theory is very much concerned with what happens to parameter uncertainty when we can observe a very large data set. We will discuss the following concepts:

Convergence in Probability
Laws of Large Numbers
Convergence of Functions
Convergence in Distribution: Limiting Distributions
Central Limit Theorems
Asymptotic Distributions

6.1 Why large samples?

So far we have studied the finite (exact) distribution of the OLS estimator and its associated tests. If the regressors are endogenous (i.e. X and u are correlated), we cannot handle the estimation in that framework. Rather than making assumptions on a sample of a given size, large sample theory makes assumptions on the stochastic processes that generate the sample.

6.1.1 Asymptotics

What happens to a random variable or a distribution as n tends to infinity? What is the approximate distribution under these limiting conditions? What is the rate of convergence to the limit? Two critical laws of statistics are studied under large sample theory:

Law of Large Numbers
Central Limit Theorem

6.2 Convergence in Probability

Definition: Let x_n be a sequence of random variables, where n is the sample size. The random variable x_n converges in probability to a constant c if, for any positive ε,

lim_{n→∞} Prob(|x_n − c| > ε) = 0

That is, values of x_n that are not close to c become increasingly unlikely as n increases. If x_n converges in probability to c, we write plim x_n = c: all the mass of the probability distribution concentrates around c.

6.2.1 Mean Square Convergence

Definition (mean square convergence): A stronger condition than convergence in probability is mean square convergence. A random sequence x_n is said to converge in mean square to c if

lim_{n→∞} E[(x_n − c)²] = 0,  written x_n →ms c.

6.2.2 Consistency

Definition: An estimator θ̂_n of a parameter θ is a consistent estimator iff plim θ̂_n = θ.

6.2.3 Almost Sure Convergence

The random variable x_n converges almost surely to the constant c if and only if

lim_{n→∞} P(|x_i − c| > ε for some i ≥ n) = 0 for all ε > 0,  written x_n →AS c.

Intuitively: once the sequence x_n gets close to c, it stays that way.

Almost sure convergence, alternative definition: the random variable x_n is said to converge almost surely to c if and only if

P( lim_{n→∞} x_n = c ) = 1

6.3 Law of Large Numbers

The law of large numbers is one of the key concepts in probability theory and statistics. There are two main versions:

Weak law of large numbers: based on convergence in probability.
Strong law of large numbers: based on almost sure convergence.

1. Weak Law of Large Numbers (Khinchine): if x_i, i = 1, 2, ..., n, is a random i.i.d. sample from a distribution with finite mean E(x_i) = µ, then

plim x̄_n = µ

Remarks:
a) No finite variance assumption is needed.
b) It requires i.i.d. sampling.

2. Strong Law of Large Numbers (Kolmogorov): if x_i, i = 1, 2, ..., n, are independent random variables such that E(x_i) = µ_i < ∞ and var(x_i) = σ_i² < ∞ with Σ_i σ_i²/i² < ∞, then

x̄_n − µ̄_n →AS 0

Remark: this version is stronger because i.i.d.-ness is not required, only independence; with i.i.d. sampling we obtain almost sure convergence to the constant mean µ.

6.3.1 Convergence of Functions

Theorem (Slutsky): for a continuous function g(x_n),

plim g(x_n) = g(plim x_n)

Using the Slutsky theorem, we can write some rules for plims.

6.3.2 Rules for Probability Limits

1. For plim x = c and plim y = d:
plim(x + y) = c + d
plim(xy) = cd
plim(x/y) = c/d (provided d ≠ 0)

2. For matrices X and Y with plim X = A and plim Y = B:
plim X⁻¹ = A⁻¹
plim XY = AB

6.4 Convergence in Distribution: Limiting Distributions

Definition: x_n with cdf F_n(x) converges in distribution to a random variable with cdf F(x) if

lim_{n→∞} |F_n(x) − F(x)| = 0 at all continuity points of F(x).

Then F(x) is the limiting distribution of x_n, written x_n →d x.

6.4.1 Example of a Limiting Distribution

Student's t distribution: given independent Z ~ N(0, 1) and X ~ χ²_n,

t = Z / √(X/n)

has Student's t distribution with n degrees of freedom.

Properties: its density is symmetric, positive everywhere and leptokurtic (it has fatter tails than the normal distribution). The only parameter is n, the degrees of freedom. E(t) = 0 and Var(t) = n/(n − 2); as n → ∞, t →d N(0, 1).
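The t example can be simulated directly from its definition by drawing Z and a chi-square built from squared normals. This Python sketch checks that the sample variance of the draws is close to n/(n − 2), which tends to 1 as the limiting N(0, 1) suggests:

```python
import random

random.seed(6)

def t_draw(n):
    """One draw from Student's t with n df, via its definition Z / sqrt(X/n)."""
    z = random.gauss(0, 1)
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(n))  # chi-square, n df
    return z / (chi2 / n) ** 0.5

n, reps = 30, 20000
draws = [t_draw(n) for _ in range(reps)]
var = sum(d * d for d in draws) / reps   # E(t) = 0, so this estimates Var(t)
print(var)   # theoretical value: n/(n-2) = 30/28
```

Rerunning with larger n pushes the sample variance toward 1, illustrating the convergence to the standard normal.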

6.4.2 Rules for Limiting Distributions

1. If x_n →d x and plim y_n = c, then:
x_n y_n →d cx
x_n + y_n →d x + c
x_n / y_n →d x/c (c ≠ 0)

2. As a corollary to the Slutsky theorem, if x_n →d x and g(·) is a continuous function, then g(x_n) →d g(x).

3. If y_n has a limiting distribution and plim(x_n − y_n) = 0, then x_n has the same limiting distribution as y_n.

6.5 Central Limit Theorems

Lindeberg–Lévy Central Limit Theorem: the CLT states that the standardized sample mean of an i.i.d. sample from essentially any distribution converges in distribution to the normal. The mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.

The sample mean x̄_n = (1/n) Σ x_i of an i.i.d. sequence has a degenerate limiting distribution: it collapses to the constant µ as n → ∞. For this reason we instead investigate √n(x̄_n − µ):

Var[√n(x̄_n − µ)] = n · (1/n²) Σ var(x_i) = (1/n) Σ var(x_i) = σ²

Therefore, in contrast to x̄_n, √n(x̄_n − µ) may converge to a nondegenerate distribution:

√n(x̄_n − µ) →d N(0, σ²)

6.5.1 Example: distribution of the sample mean

X_1, X_2, ..., X_N can be viewed as N variables, each having the same distribution, mean and variance. X_1 stands for all possible values that can be obtained by the first draw from X, X_2 for the second draw, and so on.

X̄ = (1/N) Σ_{i=1}^N X_i = (1/N)(X_1 + X_2 + ... + X_N)

The X's are said to be identically and independently distributed (i.i.d.) random variables. Derive the mean of X̄:

E[X̄] = (1/N) E[Σ X_i] = (1/N)[E(X_1) + E(X_2) + ... + E(X_N)]

Since X_1, X_2, ..., X_N are all i.i.d. (same mean and variance),

E[X̄] = (1/N)[µ + µ + ... + µ] = µ

This means the expected value of the sample mean equals the population mean µ.

Central Limit Idea

σ²_X̄ = var(X̄) = var[(1/N) Σ X_i] = (1/N²) var(Σ X_i)

Since X_1, X_2, ..., X_N are all i.i.d. (same variance, no covariances),

σ²_X̄ = (1/N²)[var(X_1) + ... + var(X_N)] = (1/N²)[σ² + ... + σ²] = σ²/N

So if X ~ N(µ, σ²), the sampling distribution of the mean is X̄ ~ N(µ, σ²/N).

6.5.2 A simple Monte Carlo for the CLT

(see clt.m MATLAB code) Let X_1, X_2, ..., X_n be N(0, 1). Then E(X̄) = µ = 0 and, as n → ∞, Stdev(X̄) → 0.
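The clt.m Monte Carlo can be sketched in Python along the following lines (a translation of the idea, not of the original MATLAB file): for each sample size n, compute many sample means of n standard normals and measure their spread, which theory says is 1/√n.

```python
import random

random.seed(7)

def std_of_sample_means(n, reps=1000):
    """Stdev of the sample mean of n standard normals, over many replications."""
    means = [sum(random.gauss(0, 1) for _ in range(n)) / n for _ in range(reps)]
    m = sum(means) / reps
    return (sum((x - m) ** 2 for x in means) / reps) ** 0.5

# Theory: Stdev(Xbar) = 1/sqrt(n), i.e. roughly 1, 0.45 and 0.03 for these n
results = {n: std_of_sample_means(n) for n in (1, 5, 1000)}
print(results)
```

The shrinking spreads match the pattern in the figures below (roughly 0.98 at n = 1, 0.47 at n = 5, 0.031 at n = 1000), and the histograms of the stored means become increasingly bell-shaped, which is the CLT at work.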

Figure 6.1: Logistic Function
Figure 6.2: X̄ = x_1, x_1 ~ N(0, 1); E(X̄) = 0.0182 and σ = 0.98
Figure 6.3: CLT when n = 5; E(X̄) = 0.01 and σ = 0.47

Figure 6.4: CLT when n = 1000; E(X̄) = 0.002 and σ = 0.031
Figure 6.5: Stdev of the sample average disappears