The multiple regression model; Indicator variables as regressors

The multiple regression model; Indicator variables as regressors Ragnar Nymoen University of Oslo 28 February 2013 1 / 21

This lecture (#12): Based on the econometric model specification from Lecture 9 and 10: Inference about the natural rate of unemployment. Application of delta method. Lecture 3 and 11, HGL 5.5.3; BN 4.4.5 Estimation of partial effects. Frisch-Waugh theorem. Separate note (translated from BN 7.2) Omitted variables bias. HGL 6.3.2; BN 7.7.2 Indicator variables. HGL 7.1-7.2.3,7.5-7.5.2,7.5.5; BN 7.8 2 / 21

PCM natural rate of unemployment I The natural rate of unemployment is a central concept in macro economics In order to estimate the natural rate, we must specify a model where it is a parameter (explicitly or implicitly) There are many such models. In Lecture 2 we estimated a linear Phillips curve model (PCM) π t = β 0 + β 1 U t + ε t (1) where π t is Norwegian inflation in year t and U t is the unemployment percentage U t. The natural rate can defined in such a way that it becomes parameter in (1). Re-write the PCM as 3 / 21

PCM natural rate of unemployment II and define π t = β 1 (U t β 0 ) + ε t β 1 = β 1 (U t U nat ) + ε t (2) U nat := β 0 β 1 as the natural rate of unemployment. Since E (ε t U nat ) = 0 we have that E (π t U t = U nat ) = 0 U nat is at parameter in both (1) and (2). 4 / 21

PCM natural rate of unemployment III (2) is however NOT linear in parameters. To estimate U nat from (2) requires Non-linear Least Squares, (NLS). However, with the use of the delta method we can make inference about U nat by estimating the linear-in parameter model (1) In Lecture 2 We also showed results for non-linear PCMs, (but linear in parameters). For example π t = β 0 β 1 U 1 t + ε t (3) = β 1 (U t (U nat) ) 1 ) + ε t (U nat) ) 1 := β 0 β 1 U nat = β 1 β 0 5 / 21

With data from 1975 2005 (T = 25) we obtained: π t = β 0 + β 1 U t + ε t π t = β 0 + β 1 (1/U t ) + ε t ˆπ t = 10.5 (1.453) 1.83 (0.423) U t ˆπ t = 1.033 (1.198) J-B test:χ 2 (2) 1.0925[0.5791] 1.8211[0.4023] Nat-rate (Û nat ) 5.73 % 14.9 % IT-rate (Û it ) 4.36 % 4.36 % Note: + 15.39 (2.998) (1/U t) Since we use time series data, the IID assumption may only be a crude approximation. But the Jarque-Bera test is not rejecting U it is the inflation target rate of unemployment : we defined it for π t = 2.5, instead of 0 6 / 21

Use the delta-method formula from Lecture 3 and 11: Var(Û nat ) = Var( ( ) ˆβ 2 [ 0 ) Var( ˆβ 0 ) + ˆβ 1 1ˆβ ( Û nat) 2 Var( ˆβ 1 ) 2 ( Û nat) Cov( ˆβ 0, ˆβ 1 )] 1 From the estimation: Cov( ˆβ 0, ˆβ 1 ) = 0.57401 Var(Û nat ) = Var( ( ) ˆβ 0 1 2 [ ] ) 1.453 ˆβ 2 + (5.73) 2 0.423 2 2 (5.73) 0.57401 1 1.83 = 0.42038 Approximate 95 % confidence interval for U nat is therefore or 5.73 ± 2 0.42038 = 5.73 ± 2 0.64 37 [4.43% ; 7.03%] Memo: Direct estimation using Non Linear Least Squares (NLS) gives: Var(Û nat ) = 0.6479 2 7 / 21

Frisch-Waugh theorem I We have several times stated that the β j (j = 1, 2,... k) in the classical multiple regression model Y i = β 0 + k j=1 β jx ji + ε i shall be interpreted as partial effects, since E (Y X 1i,... X ki ) X ji = β j j Two caveats: The partial derivative is not relevant if the model is a polynomial in X i None of the X j variables are indicator variables (see below) 8 / 21

Frisch-Waugh theorem II We can give an alternative derivation (to Lecture 10) of the OLS estimators ˆβ 1 and ˆβ 2 that shows they are indeed (BLUE) estimators of the partial derivatives. Assume first that we have observations of Y i and X 1i that have been cleaned of the influence of X 2i. Call these observations { e Y X2 i, e X1 X 2 i } i = 1, 2,..., n. Based on this data set we could estimate the partial effect of X 1 on Y from the simple regression model e Y X2 i = β 0 + β 1e X1 X 2 i + ε i (4) The { question is: } how to obtain the data set ey X2 i, e X1 X 2 i i = 1, 2,..., n? Rest, in class and in a note 9 / 21

Frisch-Waugh theorem III Conclusion: We obtain the same estimate of ˆβ 1 in two ways: 1. Estimate the k = 2 regression model by OLS 2. Regress out the effect that X 2 has on Y and X 1, and use these residuals to estimate the partial effect of X 1 on Y. This result is a special case of the general Frisch-Waugh theorem. This theorem, dating back to an 1933 journal article is central in modern econometrics, and you will encounter it in more advanced textbook from the last decade and in later courses. 10 / 21

Omitted variable bias I In the k = 2 variable model we have from the specification of the model that: E (ε i X 1i,X 2i ) = 0 This does not imply that in the simple regression model E (e i X 1i,X 2i ) = 0 Y i = β 0 + β 1 X 1i + e i (5) even though E (e i X 1i ) = 0 and all the other classical properties hold for the disturbance in (5). 11 / 21

Omitted variable bias II This leads to a omitted variables bias, that we analyse IN CLASS 12 / 21

Representing qualitative explanatory factors I Qualitative explanatory variables are important in econometric models: Discrete levels of qualifications; policy on/off; seasonal effects on consumption, temporary or permanent structural breaks etc We represent qualitative factors by one or more indicator variables or dummies. We treat them as ordinary regressors, they represent no new problems for estimation and inference The difference from continuous regressors lie in the interpretation of the coefficients of the dummies 13 / 21

Indicator variables as the only explanatory variable I In the simplest case we have Y i = β 0 + β 1 D 1i + ε i (6) where D i is an indicator variable: { 1 if individual i belongs to category 1 D 1i = 0 else This is the same model as in Seminar exercise 2. You showed that the OLS estimators are ˆβ 0 = Ȳ 0 ˆβ 1 = Ȳ 1 Ȳ 0 (7) 14 / 21

Indicator variables as the only explanatory variable II In modern terminology (7) is called the difference estimator (HGL 7.5.1). D 1i = 1 is then typically representing individual in treatment group and D 1i = 0 no treatment (control group) It is an OLS estimator, and given classical properties of ε i the estimator is BLUE, and we use exactly the same inference procedure as before. The difference estimator can be extended to data sets where we observe the individual Y s before and after a treatment period, and where we can define a second qualitative variable { 1 if the period is after treatment D 2t = 0 else 15 / 21

Indicator variables as the only explanatory variable III This leads to the difference-in-difference estimator in HGL 7.5.5. which is the OLS estimator of β 3 in the multivariate regression model: Graph in Class Y it = β 0 + β 1 D 1i + β 2 D 2t + β 3 D 1i D 2t + ε it (8) 16 / 21

The combination rule for dummy variables I We can use several dummy variables for several qualitative factors in the same model providing we observe the following rule: If the intercept is included in the equation, then no-sub group of additive dummy variable should sum to a constant values. The purpose of this rule is to avoid perfect multicollinearity Operationalization: Assume that the qualitative factor is made up of m categories: it is represented in the model by m 1 dummy variables. The left-out category is called the reference value In the simple model from Seminar 2 we had category 0 and 1. That factor is represented by the single variable D 1i. 17 / 21

Dummys together with continuous variables I A common case is that k variable regression model contains both continuous variables and dummies as regressors Example: log-linear consumption function for quarterly data: ln(c t ) = β 0 + β 1 ln(inc t ) + β 2 D 1t + β 3 D 2t + β 4 D 3t + ε t (9) where C is private consumption (in real terms), INC: household disposable income and { 1, if j quarter D ji = 0, else, j = 1, 2, 3. 4th quarter is the reference value of the qualitative variable seasonality. 18 / 21

Dummys together with continuous variables II β 1 is the marginal propensity to consume (in elasticity form!) β 2, β 3 and β 4 represent quarterly shifts in the intercept relative to the reference quarter: They are NOT derivatives! Example in class 19 / 21

Interaction variables I Dummies can be uses to model changes in the slope coefficients. An alternative model to (9) might be where ln(c t ) = β 0 + β 1 ln(inc t ) + β 2 ln(inc t ) D 4t + β 3 D 1t + β 4 D 2t + β 5 D 3t + ε t { 1 if t after financial deregulation D 4t = 0 else The hypothesis is that the elasticity ln(c t )/ ln(inc t ) was permanently affected by easier access to credit etc. 20 / 21

Interaction variables II If H 0 : β 2 = 0 is rejected, we have evidence of a structural break: One single regression function is not representative of the whole sample. Extension: 2-sample Chow-test of equality of two regressions. In class 21 / 21