The multiple regression model; Indicator variables as regressors


Ragnar Nymoen, University of Oslo, 28 February 2013

This lecture (#12) is based on the econometric model specification from Lectures 9 and 10:

- Inference about the natural rate of unemployment; application of the delta method. Lectures 3 and 11; HGL 5.5.3; BN 4.4.5
- Estimation of partial effects; the Frisch-Waugh theorem. Separate note (translated from BN 7.2)
- Omitted variables bias. HGL 6.3.2; BN 7.7.2
- Indicator variables. HGL 7.1-7.2.3, 7.5-7.5.2, 7.5.5; BN 7.8

PCM natural rate of unemployment I

The natural rate of unemployment is a central concept in macroeconomics. In order to estimate the natural rate, we must specify a model where it is a parameter (explicitly or implicitly). There are many such models. In Lecture 2 we estimated a linear Phillips curve model (PCM)

π_t = β_0 + β_1 U_t + ε_t   (1)

where π_t is Norwegian inflation in year t and U_t is the unemployment percentage. The natural rate can be defined in such a way that it becomes a parameter in (1). Re-write the PCM as follows.

PCM natural rate of unemployment II

π_t = β_1 (U_t − (−β_0/β_1)) + ε_t = β_1 (U_t − U^nat) + ε_t   (2)

and define

U^nat := −β_0/β_1

as the natural rate of unemployment. Since E(ε_t | U_t = U^nat) = 0, we have that

E(π_t | U_t = U^nat) = 0

U^nat is a parameter in both (1) and (2).

PCM natural rate of unemployment III

(2) is however NOT linear in the parameters. To estimate U^nat from (2) requires Non-linear Least Squares (NLS). However, with the use of the delta method we can make inference about U^nat by estimating the linear-in-parameters model (1).

In Lecture 2 we also showed results for PCMs that are non-linear in the variables (but linear in parameters). For example

π_t = β_0 + β_1 (1/U_t) + ε_t = β_1 (1/U_t − 1/U^nat) + ε_t   (3)

where

1/U^nat := −β_0/β_1, i.e. U^nat = −β_1/β_0

With data from 1975–2005 (T = 25) we obtained (standard errors in parentheses, p-values in brackets):

Linear PCM, π_t = β_0 + β_1 U_t + ε_t:
  ˆπ_t = 10.5 (1.453) − 1.83 (0.423) U_t
  J-B test: χ²(2) = 1.0925 [0.5791], Nat-rate (Û^nat) = 5.73 %, IT-rate (Û^it) = 4.36 %

Reciprocal PCM, π_t = β_0 + β_1 (1/U_t) + ε_t:
  ˆπ_t = −1.033 (1.198) + 15.39 (2.998) (1/U_t)
  J-B test: χ²(2) = 1.8211 [0.4023], Nat-rate (Û^nat) = 14.9 %, IT-rate (Û^it) = 4.36 %

Since we use time series data, the IID assumption may only be a crude approximation, but the Jarque-Bera test does not reject. Û^it is the "inflation target" rate of unemployment: we defined it for π_t = 2.5 instead of 0.

Use the delta-method formula from Lectures 3 and 11:

Var(Û^nat) = Var(−ˆβ_0/ˆβ_1) ≈ (1/ˆβ_1)² [ Var(ˆβ_0) + (Û^nat)² Var(ˆβ_1) + 2 Û^nat Cov(ˆβ_0, ˆβ_1) ]

From the estimation: Cov(ˆβ_0, ˆβ_1) = −0.57401, so

Var(Û^nat) ≈ (1/1.83)² [ (1.453)² + (5.73)² (0.423)² + 2 (5.73) (−0.57401) ] = 0.42038

The approximate 95 % confidence interval for U^nat is therefore

5.73 ± 2 √0.42038 = 5.73 ± 2 × 0.648, or [4.43 % ; 7.03 %]

Memo: Direct estimation using Non-linear Least Squares (NLS) gives Var(Û^nat) = 0.6479².
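As a concrete illustration, here is a minimal Python sketch of the delta-method step. The data set is not reproduced on the slide, so the coefficient estimates, standard errors and covariance are hard-coded from the linear PCM above; the variable names are my own.

```python
import numpy as np

# Estimates from the linear PCM reported above (hard-coded, not re-estimated here)
b0, b1 = 10.5, -1.83            # intercept and slope
se_b0, se_b1 = 1.453, 0.423     # standard errors
cov_b0_b1 = -0.57401            # estimated Cov(b0_hat, b1_hat)

u_nat = -b0 / b1                # natural rate implied by the linear PCM, approx 5.73

# Delta method: gradient of u_nat = -b0/b1 with respect to (b0, b1),
# sandwiched around the estimated coefficient covariance matrix
grad = np.array([-1.0 / b1, b0 / b1**2])
V = np.array([[se_b0**2, cov_b0_b1],
              [cov_b0_b1, se_b1**2]])
var_u_nat = grad @ V @ grad     # approx 0.42

# Approximate 95 % confidence interval, using the rule-of-thumb factor 2 as on the slide
ci = (u_nat - 2 * np.sqrt(var_u_nat), u_nat + 2 * np.sqrt(var_u_nat))
print(round(u_nat, 2), round(var_u_nat, 3), ci)   # roughly 5.74, 0.42, (4.4, 7.0)
```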

Frisch-Waugh theorem I

We have several times stated that the β_j (j = 1, 2, ..., k) in the classical multiple regression model

Y_i = β_0 + Σ_{j=1}^{k} β_j X_ji + ε_i

shall be interpreted as partial effects, since

∂E(Y_i | X_1i, ..., X_ki)/∂X_ji = β_j for all j

Two caveats:
- The partial derivative is not relevant if the model is a polynomial in X_i
- None of the X_j variables can be indicator variables (see below)

Frisch-Waugh theorem II

We can give an alternative derivation (to Lecture 10) of the OLS estimators ˆβ_1 and ˆβ_2 that shows they are indeed (BLUE) estimators of the partial derivatives.

Assume first that we have observations of Y_i and X_1i that have been cleaned of the influence of X_2i. Call these observations {e_{Y.X2,i}, e_{X1.X2,i}}, i = 1, 2, ..., n. Based on this data set we could estimate the partial effect of X_1 on Y from the simple regression model

e_{Y.X2,i} = β_0 + β_1 e_{X1.X2,i} + ε_i   (4)

The question is: how do we obtain the data set {e_{Y.X2,i}, e_{X1.X2,i}}, i = 1, 2, ..., n?

Rest, in class and in a note.

Frisch-Waugh theorem III

Conclusion: We obtain the same estimate ˆβ_1 in two ways:

1. Estimate the k = 2 regression model by OLS.
2. Regress out the effect that X_2 has on Y and on X_1, and use these residuals to estimate the partial effect of X_1 on Y.

This result is a special case of the general Frisch-Waugh theorem. This theorem, dating back to a 1933 journal article, is central in modern econometrics, and you will encounter it in more advanced textbooks from the last decade and in later courses.
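A minimal numerical check of the two routes, on simulated data (the sample size, coefficients and variable names are invented for illustration; statsmodels is used for the OLS fits):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)               # X1 is correlated with X2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Route 1: multiple regression of Y on a constant, X1 and X2
X = sm.add_constant(np.column_stack([x1, x2]))
beta1_multiple = sm.OLS(y, X).fit().params[1]

# Route 2: partial out X2 from both Y and X1, then regress residual on residual
Z = sm.add_constant(x2)
e_y = sm.OLS(y, Z).fit().resid                   # Y cleaned of the influence of X2
e_x1 = sm.OLS(x1, Z).fit().resid                 # X1 cleaned of the influence of X2
beta1_fw = sm.OLS(e_y, sm.add_constant(e_x1)).fit().params[1]

print(beta1_multiple, beta1_fw)                  # equal up to floating-point error
```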

Omitted variable bias I

In the k = 2 variable model we have, from the specification of the model, that

E(ε_i | X_1i, X_2i) = 0

This does not imply that, in the simple regression model

Y_i = β_0 + β_1 X_1i + e_i   (5)

we also have E(e_i | X_1i, X_2i) = 0, even though E(e_i | X_1i) = 0 and all the other classical properties hold for the disturbance in (5).

Omitted variable bias II

This leads to an omitted variables bias, which we analyse in class.
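The analysis itself is done in class; as a teaser, the following sketch only shows the symptom on simulated data (all numbers are made up): the short regression that omits X_2 does not recover the partial effect of X_1.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)                  # X1 and X2 are correlated
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)  # true partial effect of X1 is 2.0

full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
short = sm.OLS(y, sm.add_constant(x1)).fit()        # X2 omitted

print(full.params[1])    # close to 2.0
print(short.params[1])   # systematically off 2.0: omitted variable bias
```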

Representing qualitative explanatory factors I

Qualitative explanatory variables are important in econometric models: discrete levels of qualifications, policy on/off, seasonal effects on consumption, temporary or permanent structural breaks, etc.

We represent qualitative factors by one or more indicator variables, or dummies. We treat them as ordinary regressors; they represent no new problems for estimation and inference. The difference from continuous regressors lies in the interpretation of the coefficients of the dummies.

Indicator variables as the only explanatory variable I

In the simplest case we have

Y_i = β_0 + β_1 D_1i + ε_i   (6)

where D_1i is an indicator variable:

D_1i = 1 if individual i belongs to category 1, and D_1i = 0 else.

This is the same model as in Seminar exercise 2. You showed that the OLS estimators are

ˆβ_0 = Ȳ_0 and ˆβ_1 = Ȳ_1 − Ȳ_0   (7)

where Ȳ_0 and Ȳ_1 are the sample means of Y in the two categories.
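A quick numerical check of (7), with made-up data (group sizes and values are illustrative only):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
d1 = rng.integers(0, 2, size=100)                 # 0/1 category indicator
y = 5.0 + 2.0 * d1 + rng.normal(size=100)         # category 1 has a higher mean

res = sm.OLS(y, sm.add_constant(d1)).fit()
print(res.params[0], y[d1 == 0].mean())                       # beta0_hat = mean of Y in category 0
print(res.params[1], y[d1 == 1].mean() - y[d1 == 0].mean())   # beta1_hat = difference in means
```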

Indicator variables as the only explanatory variable II

In modern terminology, (7) is called the difference estimator (HGL 7.5.1). D_1i = 1 then typically represents an individual in the treatment group, and D_1i = 0 no treatment (the control group).

It is an OLS estimator, and given classical properties of ε_i the estimator is BLUE, and we use exactly the same inference procedures as before.

The difference estimator can be extended to data sets where we observe the individual Y's before and after a treatment period, and where we can define a second qualitative variable:

D_2t = 1 if the period is after treatment, and D_2t = 0 else.

Indicator variables as the only explanatory variable III

This leads to the difference-in-differences estimator in HGL 7.5.5, which is the OLS estimator of β_3 in the multivariate regression model

Y_it = β_0 + β_1 D_1i + β_2 D_2t + β_3 D_1i D_2t + ε_it   (8)

Graph in class.
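A minimal sketch of estimating (8) on simulated two-period treatment/control data (group assignment, effect sizes and noise level are invented for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 400
d1 = rng.integers(0, 2, size=n)           # 1 = treatment group
d2 = rng.integers(0, 2, size=n)           # 1 = post-treatment period
# Simulate (8) with beta0 = 3, beta1 = 1, beta2 = 0.5 and treatment effect beta3 = 2
y = 3.0 + 1.0 * d1 + 0.5 * d2 + 2.0 * d1 * d2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([d1, d2, d1 * d2]))
res = sm.OLS(y, X).fit()
print(res.params[3])   # estimate of beta3, the difference-in-differences effect
print(res.bse[3])      # its standard error, for the usual t-type inference
```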

The combination rule for dummy variables I

We can use several dummy variables for several qualitative factors in the same model, provided that we observe the following rule:

If the intercept is included in the equation, then no sub-group of additive dummy variables should sum to a constant value.

The purpose of this rule is to avoid perfect multicollinearity.

Operationalization: Assume that the qualitative factor is made up of m categories; it is then represented in the model by m − 1 dummy variables. The left-out category is called the reference value.

In the simple model from Seminar 2 we had categories 0 and 1. That factor is represented by the single variable D_1i.
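The sketch below illustrates the rule with invented quarterly data: an intercept plus all four quarterly dummies sums to the constant column, so the regressor matrix loses full rank, while dropping one dummy (the reference category) restores it.

```python
import numpy as np

rng = np.random.default_rng(3)
quarter = rng.integers(1, 5, size=40)            # quarter labels 1..4
dummies = np.column_stack([(quarter == q).astype(float) for q in (1, 2, 3, 4)])
const = np.ones((40, 1))

X_trap = np.hstack([const, dummies])             # intercept + all 4 dummies: dummy trap
X_ok = np.hstack([const, dummies[:, :3]])        # intercept + 3 dummies, quarter 4 as reference

print(np.linalg.matrix_rank(X_trap), X_trap.shape[1])   # rank 4 < 5 columns: perfect multicollinearity
print(np.linalg.matrix_rank(X_ok), X_ok.shape[1])       # rank 4 = 4 columns: fine
```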

Dummies together with continuous variables I

A common case is that the k-variable regression model contains both continuous variables and dummies as regressors. Example: a log-linear consumption function for quarterly data,

ln(C_t) = β_0 + β_1 ln(INC_t) + β_2 D_1t + β_3 D_2t + β_4 D_3t + ε_t   (9)

where C is private consumption (in real terms), INC is household disposable income, and

D_jt = 1 if t falls in quarter j, and D_jt = 0 else, for j = 1, 2, 3.

The 4th quarter is the reference value of the qualitative variable "seasonality".

Dummies together with continuous variables II

β_1 is the marginal propensity to consume (in elasticity form!). β_2, β_3 and β_4 represent quarterly shifts in the intercept relative to the reference quarter: they are NOT derivatives!

Example in class.
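A minimal sketch of setting up and estimating (9), with simulated quarterly series (series names, sample length and parameter values are invented; quarter 4 is the reference category as above):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 120                                                      # 30 years of quarterly data
quarter = np.tile([1, 2, 3, 4], T // 4)
log_inc = 10 + np.cumsum(rng.normal(0.005, 0.01, size=T))    # slowly growing log income

D = np.column_stack([(quarter == j).astype(float) for j in (1, 2, 3)])  # Q1-Q3 dummies
log_c = 0.2 + 0.95 * log_inc + D @ np.array([-0.05, 0.02, 0.01]) + rng.normal(0, 0.01, size=T)

X = sm.add_constant(np.column_stack([log_inc, D]))
res = sm.OLS(log_c, X).fit()
print(res.params)   # [beta0, beta1 (elasticity), beta2, beta3, beta4 (seasonal intercept shifts)]
```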

Interaction variables I

Dummies can be used to model changes in the slope coefficients. An alternative model to (9) might be

ln(C_t) = β_0 + β_1 ln(INC_t) + β_2 ln(INC_t)·D_4t + β_3 D_1t + β_4 D_2t + β_5 D_3t + ε_t

where

D_4t = 1 if t is after financial deregulation, and D_4t = 0 else.

The hypothesis is that the elasticity ∂ln(C_t)/∂ln(INC_t) was permanently affected by easier access to credit, etc.

Interaction variables II

If H_0: β_2 = 0 is rejected, we have evidence of a structural break: one single regression function is not representative of the whole sample.

Extension: a 2-sample Chow test of the equality of two regressions. In class.
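A sketch of testing H_0: β_2 = 0 on simulated data (the break date and the size of the elasticity shift are invented, and the seasonal dummies are left out for brevity; only the mechanics of the interaction regressor are the point):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
T = 120
log_inc = 10 + np.cumsum(rng.normal(0.005, 0.01, size=T))
d4 = (np.arange(T) >= 60).astype(float)            # = 1 after the (hypothetical) deregulation date

# Simulate an elasticity that rises from 0.90 to 0.95 after the break
log_c = 0.3 + 0.90 * log_inc + 0.05 * log_inc * d4 + rng.normal(0, 0.01, size=T)

X = sm.add_constant(np.column_stack([log_inc, log_inc * d4]))
res = sm.OLS(log_c, X).fit()
print(res.params[2], res.tvalues[2])   # beta2_hat and its t-value; a large |t| rejects H0: beta2 = 0
```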