The regression model with one stochastic regressor (part II)

ECON 3150/4150 Lecture 7, Ragnar Nymoen, 6 Feb 2012

We will finish Lecture topic 4: the regression model with a stochastic regressor. We first look at an application, the Norwegian Phillips curve over a long historical period (separate note): a reminder about the importance of variable transformation for obtaining a conditional expectations function that is linear in parameters. Then we look at a special case of the theory: regression with variables that are jointly normally distributed. It is useful as a reference and for introducing two issues that did not arise in RM1: Why regress y on x and not x on y? And what is the relationship between regression and correlation, and between regression and causality? Finally, we define exogeneity as an econometric concept, and extend the regression model to time series data.

References: see Lecture 6 and the more detailed references that we give below.

Binormal variables I

Before we begin: remember that regression does not require normally distributed variables; in fact, RM1 already showed that! Assume that we have stochastic variables (y_i, x_i), i = 1, 2, ..., n, that are generated by the following system of linear equations:

y_i = µ_y + ε_{y,i}   (1)
x_i = µ_x + ε_{x,i}   (2)

where µ_y and µ_x are parameters and ε_{y,i} and ε_{x,i} have a joint normal probability distribution:

(ε_{x,i}, ε_{y,i})' ~ N( 0, [ σ_x^2  ω_xy ; ω_xy  σ_y^2 ] )   (3)

Binormal variables II

ε_{x,i} and ε_{y,i} are therefore bivariate normal with expectation zero and covariance matrix

[ σ_x^2   ω_xy ]
[ ω_xy    σ_y^2 ]

The correlation coefficient between ε_{x,i} and ε_{y,i} is ρ_xy = ω_xy / (σ_x σ_y). It is the population correlation coefficient. Since linear combinations of normally distributed variables are also normally distributed, it follows that y_i and x_i given by (1) and (2) are also normally distributed.

Binormal variables III

From the properties of the normal distribution: the distribution of y_i conditional on x_i is also normal, with expectation

E[y_i | x_i] = µ_y − ρ_xy (σ_y/σ_x) µ_x + ρ_xy (σ_y/σ_x) x_i = β_1 + β_2 x_i   (4)

where β_1 = µ_y − ρ_xy (σ_y/σ_x) µ_x and β_2 = ρ_xy (σ_y/σ_x). We will not derive this, but if you are interested, see e.g. BN ch. 4.5.6 and 5.7.

Binormal variables IV

If we define the stochastic variables

e_i = y_i − E(y_i | x_i), i = 1, 2, ..., n   (5)

we see that the regression model

y_i = β_1 + β_2 x_i + e_i   (6)

gives y_i as the sum of the conditional expectations function (4) and the disturbance e_i. This is of course a general characterization of RM2; what we have gained by assuming a bivariate normal is that β_1 and β_2 have been expressed as functions of the underlying population parameters µ_x, µ_y, σ_x^2, σ_y^2 and ρ_xy.

Binormal variables V

Note that e_i can be written as

e_i = µ_y + ε_{y,i} − β_1 − β_2 (µ_x + ε_{x,i}) = ε_{y,i} − (ω_xy/σ_x^2) ε_{x,i}

which can be used to show:

E(e_i) = 0, E(e_i ε_{x,i}) = 0
Var(e_i) ≡ σ^2 = σ_y^2 (1 − ρ_xy^2)   (7)
E(x_i e_i) = 0 for all i

In particular, (7) shows that the reduction in the unexplained variance of y_i relative to the total variance of y_i is due to correlation.
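The population formulas can be checked by simulation. Below is a minimal sketch (the parameter values µ_x = 2, µ_y = 5, σ_x = 1.5, σ_y = 2, ρ_xy = 0.6 are illustrative choices, not from the lecture): it draws a large sample from the system (1)-(3) and compares the OLS estimates and the residual variance with (4) and (7).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative population parameters (not from the lecture)
mu_x, mu_y = 2.0, 5.0
sigma_x, sigma_y = 1.5, 2.0
rho = 0.6
omega_xy = rho * sigma_x * sigma_y

# Draw a large sample from the system (1)-(3)
n = 1_000_000
cov = [[sigma_x**2, omega_xy], [omega_xy, sigma_y**2]]
eps_x, eps_y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
x = mu_x + eps_x
y = mu_y + eps_y

# Population coefficients from (4): beta2 = 0.8, beta1 = 3.4
beta2 = rho * sigma_y / sigma_x
beta1 = mu_y - beta2 * mu_x

# OLS estimates should be close to the population values
b2_hat = np.cov(x, y)[0, 1] / x.var()
b1_hat = y.mean() - b2_hat * x.mean()

# Residual variance should be close to sigma_y^2 * (1 - rho^2) = 2.56, cf. (7)
e = y - b1_hat - b2_hat * x
print(b1_hat, b2_hat, e.var())
```

With a sample this large, the estimates typically agree with the population values to two or three decimals.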

Regression, correlation and causality I

The statistical system given by (1), (2) and (3) is mapped into model form:

y_i = β_1 + β_2 x_i + e_i   (8)
x_i = µ_x + ε_{x,i}   (9)

where (8) is the conditional model of y_i given x_i (what we wish to explain) and (9) is the marginal model of x_i (what we do not try to explain). Note: this does not mean that (8) and (9) prove that x_i is causing y_i!

Regression, correlation and causality II

An equally valid model of the statistical system is

x_i = γ_1 + γ_2 y_i + ε_i   (10)
y_i = µ_y + ε_{y,i}   (11)

where ε_i has similar properties as e_i, but for the case where we model x_i conditionally on y_i. γ_2 can be shown to be

γ_2 = ρ_xy (σ_x/σ_y)

which is not β_2, and not 1/β_2 either.

Regression, correlation and causality III

Note that you have shown in Seminar exercise 1 that the same results hold in the data, i.e., when the population parameters are replaced by empirical moments! So we have two conditional models, one conditioning y on x and the other x on y. How can we tell which of them represents causation? The general answer is that we cannot assert causality from regression alone; that can only be done with reference to (subject matter) theory!
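The asymmetry between the two conditional models is easy to verify numerically. The sketch below (with illustrative parameters σ_x = 1.5, σ_y = 2, ρ_xy = 0.6) estimates both regressions on simulated binormal data and confirms that the slope of x on y is ρ_xy σ_x/σ_y, not the reciprocal of β_2.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_x, sigma_y, rho = 1.5, 2.0, 0.6     # illustrative parameters
omega_xy = rho * sigma_x * sigma_y
cov = [[sigma_x**2, omega_xy], [omega_xy, sigma_y**2]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=500_000).T

b2_hat = np.cov(x, y)[0, 1] / x.var()   # OLS slope of y on x: near 0.8
g2_hat = np.cov(x, y)[0, 1] / y.var()   # OLS slope of x on y: near 0.45

# g2_hat estimates rho*sigma_x/sigma_y, which is neither b2_hat nor 1/b2_hat
print(b2_hat, g2_hat, 1.0 / b2_hat)
```

Only when |ρ_xy| = 1 would the two regression lines coincide, with γ_2 = 1/β_2.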

Regression, correlation and causality IV

Recall the picture of econometrics as a combined discipline that we began with! Looking ahead to intermediate and advanced courses: with both cross section data and time series data we often have access to natural experiments, which can make it possible to substantiate a causal interpretation.

Exogeneity defined I

Part of the specification of RM2 was that

E(e_i | x_h) = 0, for all i and h   (12)

which implies that the disturbance e_i is uncorrelated with all x_h variables:

cov(e_i, x_h) = 0, for all i and h   (13)

We showed that, because of conditioning, we had for h = i:

cov(e_i, x_i) = 0   (14)

(14) is an inherent property of the model; it always holds. However, (13) is a more general statement than (14).

Exogeneity defined II

It is therefore customary to include (13) as an assumption in the model specification. This assumption is called the assumption of an exogenous explanatory variable, cf. HGL p. 402 and BN. We will now look at two examples where exogeneity fails, but with different consequences for the OLS estimators.

The measurement error model: Measurement error in the regressor I

HGL ch. 10.2, BN ch. 6.3. Assume that the relation of interest is between an observable variable y_i and an unobservable variable x_i* (permanent income is the example in HGL):

y_i = β_1 + β_2 x_i* + v_i

With the same assumptions as for RM2, but using the symbols x_i* and v_i in place of x_i and e_i, this can be formulated as a regression model. However, that model would be irrelevant in practice since x_i* is unobservable.

The measurement error model: Measurement error in the regressor II

To formulate a model in observables, we extend the list of assumptions with

x_i = x_i* + u_i

where u_i is a random measurement error that is uncorrelated with both v_i and x_i*. It is tempting to say that

y_i = β_1 + β_2 x_i + e_i   (15)

is a valid regression model.

The measurement error model: Measurement error in the regressor III

However, since e_i in this case must be e_i = v_i − β_2 u_i, then

cov(e_i, x_i) = −β_2 var(u_i) ≠ 0   (16)

showing that x_i cannot be regarded as exogenous in (15). If we estimate (15) by OLS, what do we get in terms of properties? We will only motivate an answer, since a precise answer uses probability limits, which will be explained under Topic 6.

The measurement error model: Measurement error in the regressor IV

As always, the OLS estimator for β_2 can be written as

β̂_2 = Σ_{i=1}^n (x_i − x̄) y_i / Σ_{i=1}^n (x_i − x̄)^2 = β_2 + Σ_{i=1}^n (x_i − x̄) e_i / Σ_{i=1}^n (x_i − x̄)^2

Unlike in RM2, we cannot show

E[ Σ_{i=1}^n (x_i − x̄) e_i / Σ_{i=1}^n (x_i − x̄)^2 ] = 0

with the use of conditional expectations, because x_i and e_i contain common stochastic variables.

The measurement error model: Measurement error in the regressor V

Intuitively, however, we can guess that there is going to be a bias, since Σ_{i=1}^n (x_i − x̄) e_i is an empirical counterpart to cov(e_i, x_i), which is non-zero from the specification of the model. This turns out to be true: in fact, failure of exogeneity of x implies that β̂_2 becomes inconsistent: we do not get exactly the true β_2 even in infinitely large samples. Looking ahead: the method of moments (Topic 10) can be used instead of OLS to obtain a consistent estimator.
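A simulation makes the inconsistency concrete. Under the classical errors-in-variables assumptions above, the well-known attenuation result says that the probability limit of β̂_2 is β_2 σ_x*^2 / (σ_x*^2 + σ_u^2); the sketch below (all parameter values illustrative) shows the OLS estimate settling near that value rather than near β_2.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
beta1, beta2 = 1.0, 2.0                    # illustrative true parameters
sigma_xstar, sigma_u, sigma_v = 1.0, 0.5, 1.0

x_star = rng.normal(0.0, sigma_xstar, n)   # unobservable regressor x*
u = rng.normal(0.0, sigma_u, n)            # measurement error, indep. of x*, v
v = rng.normal(0.0, sigma_v, n)
x = x_star + u                             # observed regressor
y = beta1 + beta2 * x_star + v

b2_hat = np.cov(x, y)[0, 1] / x.var()

# Well-known attenuation result: plim of b2_hat is
# beta2 * sigma_xstar^2 / (sigma_xstar^2 + sigma_u^2) = 1.6, not beta2 = 2
plim = beta2 * sigma_xstar**2 / (sigma_xstar**2 + sigma_u**2)
print(b2_hat, plim)
```

Note that the bias pulls β̂_2 towards zero, and increasing n does not remove it; only reducing the measurement error variance σ_u^2 does.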

The measurement error model: Measurement error in y

If the only departure from RM2 is that we have

y_i* = β_1 + β_2 x_i + v_i

where y_i* is unobservable, the consequences are different. As long as the measurement error in y is uncorrelated with x, the model in terms of the observables has the same properties as before. In particular: no bias of the OLS estimator for β_2! Show as a DIY!
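As a complement to the DIY (a simulation is suggestive, not a proof), the sketch below (illustrative parameters) shows what to expect: with measurement error only in y, OLS remains unbiased for β_2.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500_000
beta1, beta2 = 1.0, 2.0            # illustrative true parameters

x = rng.normal(0.0, 1.0, n)
v = rng.normal(0.0, 1.0, n)
u = rng.normal(0.0, 0.8, n)        # measurement error in y, independent of x
y_star = beta1 + beta2 * x + v     # unobservable dependent variable y*
y = y_star + u                     # observed dependent variable

# The error u is absorbed into the disturbance: e = v + u, still uncorrelated
# with x, so OLS on the observables remains unbiased for beta2
b2_hat = np.cov(x, y)[0, 1] / x.var()
print(b2_hat)
```

The only cost of measurement error in y is a larger disturbance variance, and hence less precise (but still unbiased) estimates.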

The Lucas critique: Rational expectations and the Lucas critique I

The measurement error model can be used to explain the famous Lucas critique in macroeconomics. Let x_t* represent the expected value of x_t. Under the hypothesis of adaptive expectations, the OLS estimator of β_2 remains consistent. But under the assumption of rational expectations, we have that u_t in x_t = x_t* + u_t represents a random expectations error. The result is that OLS gives an inconsistent estimator of the structural parameter β_2.

The Lucas critique: Rational expectations and the Lucas critique II

Inconsistent, because the OLS estimator is contaminated by parameters of the expectations formation process. Moreover: since expectations change when policy changes, the OLS estimator β̂_2 is subject to structural breaks: it will change when policy changes and will be an unreliable guide for judging the effects of economic policies. Looking ahead: later courses discuss both the theory and the relevance of the Lucas critique (it can in fact be tested!). If interested: BN 5.12 is relatively detailed compared to other introductory books.

Models for time series data I

For time series data we use t as a subscript for the stochastic variables/observations. It is also customary to replace n by T. If we formulate a static model

y_t = β_1 + β_2 x_t + e_t   (17)

for time series data, the specification of RM2 will in essence be unchanged, with e.g. assumption d. written as

cov(e_t, e_{t±s} | x_t) = 0, s ≠ 0

which is called the assumption of no autocorrelation in the disturbances.

Models for time series data II

For the static model (17), the hypothesis of no autocorrelation often fails. This regularly shows up in the OLS residuals ê_t from (17), which are usually highly correlated with ê_{t−1} (and often with older residuals as well). The explanation is that time series variables are typically serially correlated: y_t is usually highly correlated with y_{t−1}, and x_t is correlated with x_{t−1}. Therefore the independent sampling assumption of RM2 is irrelevant for the case of time series data.
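The residual autocorrelation described above is easy to reproduce. In the sketch below (an illustrative setup, not from the lecture), both the regressor and the disturbance of a static regression follow AR(1) processes with coefficient 0.9, and the OLS residuals inherit strong first-order autocorrelation.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 2000

def ar1(phi, T, rng):
    """Generate T observations of an AR(1) process with N(0,1) innovations."""
    e = rng.normal(0.0, 1.0, T)
    y = np.empty(T)
    y[0] = e[0] / np.sqrt(1 - phi**2)   # start in the stationary distribution
    for t in range(1, T):
        y[t] = phi * y[t - 1] + e[t]
    return y

x = ar1(0.9, T, rng)        # serially correlated regressor
u = ar1(0.9, T, rng)        # serially correlated disturbance, independent of x
y = 1.0 + 0.5 * x + u       # static relation of the form (17)

# OLS of y on x, then first-order autocorrelation of the residuals
b2 = np.cov(x, y)[0, 1] / x.var()
b1 = y.mean() - b2 * x.mean()
resid = y - b1 - b2 * x
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(r1)   # strongly positive: the no-autocorrelation assumption fails
```

In applied work this is exactly the pattern that residual autocorrelation tests on static time series regressions pick up.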

A simple dynamic model I

The solution to the problem of autocorrelation is either to correct the OLS estimators, or to represent the serial correlation of y_t and x_t in the conditional expectation (dynamic econometric models). The simplest example of a dynamic model is

y_t = β_1 + β_2 y_{t−1} + e_t, with −1 < β_2 < 1   (18)

where the explanatory variable replacing x_t is the history of the y variable. This type of equation is called an autoregressive model of order one (AR(1)). It is a linear stochastic difference equation.

A simple dynamic model II

In terms of properties of estimators: how close does this model come to RM2? The answer is: so close that it can be seen as a variant of RM2. To complete the specification of the dynamic regression model, we can define the conditional expectation function and the disturbance properties:

E(y_t | y_{t−1}) = β_1 + β_2 y_{t−1}
E(e_t | y_{t−1}) = 0
var(e_t | y_{t−1}) = σ^2
cov(e_t, e_{t±s} | y_{t−1}) = 0

A simple dynamic model III

What can we say about cov(e_{t±s}, y_{t−1}) in this model? For s = 0, we have from E(e_t | y_{t−1}) = 0 that cov(e_t, y_{t−1}) = 0, but as we know, exogeneity requires that y_{t−1} is uncorrelated with all disturbances, both past and future. The mathematical solution for y_t in (18) is found by repeated substitution of y_{t−1}, y_{t−2} and so on back to infinity:

y_t = β_1 Σ_{i=0}^∞ β_2^i + Σ_{i=0}^∞ β_2^i e_{t−i}   (19)

A simple dynamic model IV

Equation (19) shows that y_{t−1} is uncorrelated with e_t and all future disturbances, but y_{t−1} is correlated with e_{t−1} and all other past disturbances. Hence y_{t−1} is not exogenous in (18), but y_{t−1} is not completely endogenous either. We have an intermediate case between exogeneity and endogeneity, and we say that y_{t−1} is a pre-determined variable in (18). In the case of a pre-determined explanatory variable, the OLS estimators β̂_2 and β̂_1 are consistent but have finite sample biases, which are due to the correlation between y_{t−1} and past disturbances.

A simple dynamic model V

As an example, we have that

E(β̂_2 − β_2) ≈ −2β_2/T

for the simplest case with β_1 = 0 (no drift).

[Figure: plot of the bias formula for β_2 = 0.5 and T = 1, 2, ..., 100; the bias shrinks towards zero as T grows]
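The bias formula can also be checked by Monte Carlo. The sketch below (illustrative settings: β_2 = 0.5, T = 50, 20 000 replications) estimates the AR(1) model without intercept by OLS in each replication and compares the average bias with −2β_2/T.

```python
import numpy as np

rng = np.random.default_rng(6)
beta2, T, reps = 0.5, 50, 20_000     # illustrative Monte Carlo settings

biases = np.empty(reps)
for r in range(reps):
    e = rng.normal(0.0, 1.0, T + 1)
    y = np.empty(T + 1)
    y[0] = e[0] / np.sqrt(1 - beta2**2)   # start in the stationary distribution
    for t in range(1, T + 1):
        y[t] = beta2 * y[t - 1] + e[t]
    y_lag, y_cur = y[:-1], y[1:]
    # OLS without intercept, matching the beta1 = 0 (no drift) case
    b2_hat = (y_lag * y_cur).sum() / (y_lag**2).sum()
    biases[r] = b2_hat - beta2

print(biases.mean(), -2 * beta2 / T)   # average bias close to -2*beta2/T = -0.02
```

The approximation is of order 1/T, so the simulated bias matches −2β_2/T well at T = 50 and the agreement improves as T grows.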

Summary of the regression model I

As long as the regressor is deterministic or exogenous, and the classical assumptions about the disturbance properties hold, the regression model gives OLS estimators that are BLUE. In the case of a stochastic x, the proof is in terms of conditional and iterated expectations. With normally distributed disturbances, hypothesis tests and confidence intervals can be based on percentiles from the t-distribution. Consistency of the estimators also holds. We have only proved that for the case of a deterministic regressor: the theory of probability limits is needed for the case of a stochastic x.

Summary of the regression model II

Without normally distributed disturbances, the t-test is approximately valid, and the degree of approximation becomes better with larger n. If x is a pre-determined stochastic regressor, there is a (small) bias in the OLS estimator. That bias is decreasing in the sample size. Hence, for typical sample sizes (more than 30 observations) the case of pre-determinedness can be regarded as a variant of RM2: the properties are very similar.