Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies)

1 Chapter 9: The Regression Model with Qualitative Information: Binary Variables (Dummies) Statistics and Introduction to Econometrics M. Angeles Carnero Departamento de Fundamentos del Análisis Económico Year M. Angeles Carnero (UA) Chapter 9: Dummies Year / 53

2 Introduction
The variables we have considered so far have been quantitative (wages, income, labour experience, years of education, etc.). The gender or race of an individual, the region where they live, and the industrial sector to which a firm belongs are examples of qualitative factors that often appear in empirical models. In this chapter, we analyse the regression model with qualitative variables.

3 Qualitative factors often convey binary information:
- an individual is either male or female;
- an individual participates or not in a professional training programme;
- a firm offers a retirement plan to its workers or not, etc.
These qualitative variables can be represented by binary variables taking the values 0 or 1, which are called dummy variables.

4 A single binary variable
We start with the following example. Consider the dummy variable
$$female = \begin{cases} 1 & \text{if the individual is female} \\ 0 & \text{if the individual is male} \end{cases}$$
We include this variable in a model for the hourly wage as a function of education, as seen in Chapter 6:
$$wage = \beta_0 + \delta_0\,female + \beta_1\,educ + u \qquad (1)$$
Computing the mean wage in this model for males and females with the same years of education gives
$$\begin{aligned} \text{Females:}\quad & E(wage \mid female = 1, educ) = \beta_0 + \delta_0 + \beta_1\,educ \\ \text{Males:}\quad & E(wage \mid female = 0, educ) = \beta_0 + \beta_1\,educ \end{aligned} \qquad (2)$$

5 Therefore,
$$\delta_0 = E(wage \mid female = 1, educ) - E(wage \mid female = 0, educ)$$
so $\delta_0$ captures the difference in the average wage between females and males with the same years of education. In this model, the value of $\delta_0$ determines whether there is a wage difference between males and females:
- If $\delta_0 < 0$, then for the same level of education the average wage of females is lower than the average wage of males.
- If $\delta_0 = 0$, then for the same level of education the average wage of females is the same as that of males.
Therefore, in this model, the hypothesis that there are no wage differences between males and females is $\delta_0 = 0$. Graphically, if $\delta_0 < 0$ the regression lines of the two groups are parallel, with the line for females lying $|\delta_0|$ below the line for males.


7 We could also have used a dummy variable for male,
$$male = \begin{cases} 1 & \text{if the individual is male} \\ 0 & \text{if the individual is female} \end{cases}$$
in place of the dummy for female. In this case the model is
$$wage = \alpha_0 + \gamma_0\,male + \alpha_1\,educ + u \qquad (3)$$
Computing the mean wage for males and females with the same years of education now gives
$$\begin{aligned} \text{Females:}\quad & E(wage \mid male = 0, educ) = \alpha_0 + \alpha_1\,educ \\ \text{Males:}\quad & E(wage \mid male = 1, educ) = \alpha_0 + \gamma_0 + \alpha_1\,educ \end{aligned} \qquad (4)$$
Therefore
$$\gamma_0 = E(wage \mid male = 1, educ) - E(wage \mid male = 0, educ)$$
so $\gamma_0$ captures the difference in the average wage between males and females with the same years of education.

8 In both (2) and (4) the mean wage is a linear function of the years of education, and the only difference allowed between males and females is in the constant term. Models (1) and (3) are therefore equivalent. Comparing (2) and (4) we have
$$\begin{aligned} \beta_0 + \delta_0 + \beta_1\,educ &= \alpha_0 + \alpha_1\,educ \\ \beta_0 + \beta_1\,educ &= \alpha_0 + \gamma_0 + \alpha_1\,educ \end{aligned} \quad \Rightarrow \quad \begin{cases} \beta_1 = \alpha_1 \\ \beta_0 + \delta_0 = \alpha_0 \\ \beta_0 = \alpha_0 + \gamma_0 \end{cases}$$
and therefore $\gamma_0 = -\delta_0$. This relationship between the parameters of models (1) and (3) also holds for the OLS estimators of both models, as the following example illustrates.
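The equivalence of the two parameterisations can be checked numerically. The following is a minimal sketch on simulated data (not the WAGE1 sample used in the examples); the helper `ols`, the sample size and the coefficient values used to generate the data are all assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y = X b + u (X must already contain the constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)                 # simulated data, NOT the WAGE1 sample
n = 500
female = rng.integers(0, 2, size=n)            # 1 = female, 0 = male
male = 1 - female
educ = rng.integers(8, 19, size=n).astype(float)
wage = 1.0 - 2.0 * female + 0.5 * educ + rng.normal(0.0, 1.0, size=n)

const = np.ones(n)
b0, d0, b1 = ols(np.column_stack([const, female, educ]), wage)   # model (1)
a0, g0, a1 = ols(np.column_stack([const, male, educ]), wage)     # model (3)

print(f"delta0_hat = {d0:.4f}   gamma0_hat = {g0:.4f}")
print(f"beta0_hat + delta0_hat = {b0 + d0:.4f}   alpha0_hat = {a0:.4f}")
print(f"beta1_hat = {b1:.4f}   alpha1_hat = {a1:.4f}")
```

Up to floating-point error, the printed pairs coincide (with opposite signs for the two dummy coefficients), because both design matrices span the same column space.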

9 Example 1
Using the data from file WAGE1 in Wooldridge, a sample of n = 526 individuals, models (1) and (3) have been estimated with the following results (standard errors in parentheses):
$$\widehat{wage} = \underset{(0.6725)}{[\ldots]} - \underset{(0.2790)}{2.273}\,female + \underset{(0.0504)}{[\ldots]}\,educ \qquad (5)$$
$$\widehat{wage} = \underset{(0.6523)}{[\ldots]} + \underset{(0.2790)}{2.273}\,male + \underset{(0.0504)}{[\ldots]}\,educ$$
where wage is the hourly wage in dollars and educ is the years of education. We can check that the estimates satisfy
$$\hat\beta_1 = \hat\alpha_1, \qquad \hat\beta_0 + \hat\delta_0 = \hat\alpha_0, \qquad \hat\alpha_0 + \hat\gamma_0 = \hat\beta_0$$
According to this model, the hourly wage of males is on average 2.273 dollars higher than the wage of females with the same years of education.

10 Example 1 (cont.)
If we want to test, against a one-sided alternative, whether the average wage of females is equal to the average wage of males with the same years of education, we have to test in model (1)
$$H_0: \delta_0 = 0 \qquad H_1: \delta_0 < 0$$
The test statistic is
$$t = \frac{\hat\delta_0}{se(\hat\delta_0)} \sim t_{523} \text{ under } H_0$$
In this sample $t = -2.273/se(\hat\delta_0) = -8.16$ and the p-value is essentially zero. Therefore, we can reject $H_0$ at any reasonable significance level and conclude that the average wage of females is lower than the average wage of males with the same years of education. Note that we could alternatively have used model (3) and tested $H_0: \gamma_0 = 0$ vs $H_1: \gamma_0 > 0$. We would have drawn the same conclusion, since the p-value is the same in both tests.
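The mechanics of this t test can be sketched as follows. The data are simulated (not WAGE1), the function name `ols_with_se` is hypothetical, and the standard errors are the conventional ones that assume homoskedastic errors. The critical value of about -1.65 is the usual one-sided 5% value for a t distribution with several hundred degrees of freedom.

```python
import numpy as np

def ols_with_se(X, y):
    """OLS coefficients, conventional standard errors and residual degrees of freedom."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)           # error-variance estimate, SSR / (n - k - 1)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, se, n - p

rng = np.random.default_rng(1)                 # simulated data, NOT the WAGE1 sample
n = 526
female = rng.integers(0, 2, size=n).astype(float)
educ = rng.integers(8, 19, size=n).astype(float)
wage = 1.0 - 2.0 * female + 0.5 * educ + rng.normal(0.0, 3.0, size=n)

X = np.column_stack([np.ones(n), female, educ])
beta, se, df = ols_with_se(X, wage)
t_female = beta[1] / se[1]                     # t statistic for H0: delta0 = 0

print(f"delta0_hat = {beta[1]:.3f}, se = {se[1]:.3f}, t = {t_female:.2f}, df = {df}")
# One-sided test at the 5% level: reject H0 in favour of delta0 < 0 when t < -1.65 (approx.)
print("reject H0:", t_female < -1.65)
```

With three estimated parameters and n = 526, the degrees of freedom are 523, matching the $t_{523}$ distribution used in Example 1.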

11 Lastly, we can also write the model with both gender dummies (male and female) and no constant term:
$$wage = \beta_0\,male + \alpha_0\,female + \beta_1\,educ + u \qquad (6)$$
In this model $\beta_0$ is the constant term for males and $\alpha_0$ is the constant term for females. Model (6) is equivalent to models (1) and (3). However, it has a disadvantage with respect to the other two: to test whether there are wage differences against females, we must test
$$H_0: \beta_0 = \alpha_0 \qquad H_1: \beta_0 > \alpha_0$$
which involves two parameters and is therefore less convenient to test than $H_0: \delta_0 = 0$ vs $H_1: \delta_0 < 0$ in model (1), or $H_0: \gamma_0 = 0$ vs $H_1: \gamma_0 > 0$ in model (3).

12 Comments:
- We cannot specify a model that includes both gender dummies and a constant term. Since male + female = 1 for every observation in the sample, including both dummies together with the constant produces perfect multicollinearity, and the OLS estimates cannot be computed.
- We can include a binary variable in a model with more explanatory variables, and the interpretation of its coefficient is analogous. For example, in the model
$$wage = \beta_0 + \delta_0\,female + \beta_1\,educ + \beta_2\,exper + u \qquad (7)$$
$\delta_0$ captures the difference in the average hourly wage between females and males with the same years of education and the same years of labour experience.
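The perfect multicollinearity can be seen directly in the rank of the design matrix. This is a small sketch on made-up data for six individuals; the numbers are purely illustrative.

```python
import numpy as np

# Toy data for six individuals (illustrative values only)
female = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
male = 1.0 - female
educ = np.array([12.0, 16.0, 10.0, 14.0, 12.0, 18.0])
const = np.ones_like(educ)

X_trap = np.column_stack([const, male, female, educ])   # constant + both dummies
X_ok = np.column_stack([const, female, educ])           # as in model (1)
X_noconst = np.column_stack([male, female, educ])       # as in model (6)

rank_trap = np.linalg.matrix_rank(X_trap)
rank_ok = np.linalg.matrix_rank(X_ok)
rank_noconst = np.linalg.matrix_rank(X_noconst)

print("columns / rank with constant and both dummies:", X_trap.shape[1], rank_trap)
print("columns / rank of model (1):", X_ok.shape[1], rank_ok)
print("columns / rank of model (6):", X_noconst.shape[1], rank_noconst)
```

The first matrix has four columns but rank three, so X'X is singular and the normal equations have no unique solution. The other two specifications have full column rank.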

13 Comments (cont.):
- The interpretation of the coefficients of the remaining variables is the same as in a model without dummy variables. For example, in model (7), $\beta_1$ captures the change in the average wage from one additional year of education, holding labour experience and gender fixed; that is, the effect is the same for males and for females.
- If the dependent variable is in logs, the coefficient of the dummy variable multiplied by 100 captures approximate percentage differences. The following example shows how the estimated coefficient of a dummy variable should be interpreted when the dependent variable is in logs.

14 Example 2
Consider the model
$$\log(wage) = \beta_0 + \delta_0\,female + \beta_1\,educ + u \qquad (8)$$
Using the same sample as in Example 1, this model has been estimated with the following results:
$$\widehat{\log(wage)} = \underset{(0.094)}{[\ldots]} - \underset{(0.039)}{0.361}\,female + \underset{(0.007)}{[\ldots]}\,educ \qquad (9)$$
In this model, the estimated coefficient of female is interpreted as follows: -0.361 is the estimated average difference between the log wage of females and the log wage of males with the same years of education. Since the percentage difference is approximately equal to the difference in logs multiplied by 100, this model estimates that females earn on average 36.1% less than males with the same years of education.
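The "multiply by 100" rule is an approximation that becomes less accurate as the coefficient grows in absolute value. As a short sketch, the exact percentage difference implied by a log-difference $\delta$ is $100\,(e^{\delta} - 1)$; the function names below are hypothetical, and the value -0.361 is the coefficient implied by the 36.1% figure in Example 2.

```python
import math

def approx_pct(coef):
    """Approximate percentage difference: 100 * coefficient."""
    return 100.0 * coef

def exact_pct(coef):
    """Exact percentage difference implied by a difference in logs."""
    return 100.0 * (math.exp(coef) - 1.0)

delta_hat = -0.361                      # coefficient of female implied by Example 2
print(f"approximate: {approx_pct(delta_hat):.1f}%")
print(f"exact:       {exact_pct(delta_hat):.1f}%")
```

For this coefficient the exact figure is about 30% rather than 36%, so the approximation overstates the gap; for coefficients close to zero the two numbers are nearly identical.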

15 Dummy variables for multiple categories
Assume now that we want to analyse whether there are wage differences between regions. The region of residence is also a qualitative factor but, unlike gender, which can take only 2 values, countries are generally divided into more than 2 regions. Suppose a country has J regions. We choose one of them as the reference region, for example region 1, and define J - 1 dummy variables as follows:
$$R_j = \begin{cases} 1 & \text{if the individual lives in region } j \\ 0 & \text{if the individual does not live in region } j \end{cases} \qquad j = 2, \ldots, J$$

16 Once these dummy variables have been defined, we can consider the model
$$\log(wage) = \beta_0 + \delta_2 R_2 + \cdots + \delta_J R_J + \beta_1\,educ + u$$
In this model:
- $100\delta_2$ captures the percentage difference in wages between individuals living in region 2 and those living in region 1 with the same years of education;
- $100\delta_J$ captures the percentage difference in wages between individuals living in region J and those living in region 1 with the same years of education.
The hypothesis that there are no wage differences between regions is $\delta_2 = \cdots = \delta_J = 0$.

17 We could have used any of the other regions as the reference category. For example, if we use region 2 as the reference category, we define the dummy corresponding to region 1,
$$R_1 = \begin{cases} 1 & \text{if the individual lives in region 1} \\ 0 & \text{if the individual does not live in region 1} \end{cases}$$
and consider a model including the dummy variables $R_1, R_3, \ldots, R_J$. When region 2 is the reference category, the coefficients of the remaining dummies reflect the percentage differences in wages between each region and region 2. The models obtained by excluding any one of the regional dummies are equivalent, and the relationship between their parameters is obtained analogously to the case of the gender dummies.
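Constructing the J - 1 dummies from a categorical variable, with a chosen reference category, can be sketched as below. The function `make_dummies` and the toy region labels are hypothetical and only illustrate the construction.

```python
import numpy as np

def make_dummies(labels, reference):
    """Return the kept category names and an (n, J-1) 0/1 matrix, omitting `reference`."""
    categories = sorted(set(labels))
    if reference not in categories:
        raise ValueError(f"reference category {reference!r} not found in the data")
    kept = [c for c in categories if c != reference]
    columns = [[1 if lab == c else 0 for lab in labels] for c in kept]
    return kept, np.column_stack(columns)

# Toy sample of six individuals (illustrative labels only)
region = ["east", "west", "south", "east", "northcen", "west"]

kept, D = make_dummies(region, reference="east")
print(kept)            # names of the dummy columns, in order
print(D)               # rows of zeros correspond to the reference region
```

Changing the `reference` argument yields an equivalent set of regressors; only the interpretation of the coefficients changes, since each one measures a difference with respect to the omitted region.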

18 Example 3
Consider the model
$$\log(wage) = \beta_0 + \delta_1\,northcen + \delta_2\,south + \delta_3\,west + \beta_1\,educ + u$$
where the reference category is the east region. Using the same sample as in Example 1, this model has been estimated with the following results:
$$\widehat{\log(wage)} = \underset{(0.107)}{[\ldots]} - \underset{(0.061)}{0.07}\,northcen - \underset{(0.057)}{0.060}\,south + \underset{(0.067)}{0.046}\,west + \underset{(0.0076)}{[\ldots]}\,educ \qquad (10)$$
n = 526, R^2 = 0.193, SSR = [...]
This model estimates that the hourly wage for individuals with the same years of education is on average 7% lower in the north-central region than in the east region, 6% lower in the south region than in the east region, and 4.6% higher in the west region than in the east region. To test, for example, whether there are wage differences between the west region and the east region, we must test $H_0: \delta_3 = 0$ versus $H_1: \delta_3 \neq 0$.

19 Example 3 (cont.)
The test statistic is
$$t = \frac{\hat\delta_3}{se(\hat\delta_3)} \sim t_{521} \text{ under } H_0$$
In this sample t = 0.046/0.067 = 0.68 and the p-value = $Prob(|t_{521}| > 0.68)$ = [...]. Therefore, we cannot reject $H_0$ at any reasonable significance level: there is no evidence that, controlling for the years of education, wages differ between the east and west regions.
Note that we can also compute the estimated difference in wages for individuals with the same years of education living in any two regions. For example, comparing the west and north-central regions, we have
$$E(\log(wage) \mid west = 1, educ) = \beta_0 + \delta_3 + \beta_1\,educ$$
$$E(\log(wage) \mid northcen = 1, educ) = \beta_0 + \delta_1 + \beta_1\,educ$$
Since $\hat\delta_3 - \hat\delta_1 = 0.046 - (-0.07) = 0.116$, the hourly wage for individuals with the same years of education is on average 11.6% higher in the west region than in the north-central region.

20 Example 3 (cont.)
To test whether the wage difference between the west region and the north-central region is statistically significant, we must test $H_0: \delta_3 - \delta_1 = 0$ versus $H_1: \delta_3 - \delta_1 \neq 0$. One way to carry out this test is to use the north-central region as the reference category, that is,
$$\log(wage) = \gamma_0 + \gamma_1\,east + \gamma_2\,south + \gamma_3\,west + \beta_1\,educ + u$$
In this model, testing whether the wage difference between the west region and the north-central region is statistically significant amounts to testing $H_0: \gamma_3 = 0$ versus $H_1: \gamma_3 \neq 0$. The estimated model is
$$\widehat{\log(wage)} = \underset{(0.106)}{[\ldots]} + \underset{(0.061)}{[\ldots]}\,east + \underset{(0.055)}{[\ldots]}\,south + \underset{(0.066)}{[\ldots]}\,west + \underset{(0.0076)}{[\ldots]}\,educ$$
In this sample t = [...] and the p-value is [...]. Therefore, we can only reject $H_0$ at significance levels above 8%. There is some evidence that, controlling for the years of education, wages differ between the west and north-central regions, but this evidence is not very strong.

21 Example 3 (cont.)
If we want to test whether there are wage differences between regions, we must test the null hypothesis
$$H_0: \delta_1 = \delta_2 = \delta_3 = 0$$
To perform this test we must estimate the restricted model
$$\log(wage) = \beta_0 + \beta_1\,educ + u$$
and compute its sum of squared residuals, $SSR_r$ = [...]. Writing $SSR_{nr}$ for the sum of squared residuals of the unrestricted model (10), the test statistic is
$$F = \frac{(SSR_r - SSR_{nr})/3}{SSR_{nr}/521} \sim F_{3,521} \text{ under } H_0$$
With the values of $SSR_{nr}$ and $SSR_r$ obtained in this sample, F = 1.44. The p-value of this test is 0.23 and therefore there is no evidence against the null hypothesis. We conclude that, controlling for the years of education, there is not enough evidence to state that wages differ between regions.
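The F test for the joint significance of a set of dummies can be sketched as follows. The data are simulated with no regional effect by construction (this is not the WAGE1 sample), and the helper names `ssr` and `f_statistic` are hypothetical; `ssr_ur` plays the role of the unrestricted sum of squared residuals.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)

def f_statistic(ssr_r, ssr_ur, q, df_ur):
    """F statistic for q linear restrictions; df_ur = n - k - 1 in the unrestricted model."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

rng = np.random.default_rng(2)                 # simulated data, NOT the WAGE1 sample
n = 526
region = rng.integers(0, 4, size=n)            # 0 = reference region, 1..3 = other regions
educ = rng.integers(8, 19, size=n).astype(float)
logwage = 0.6 + 0.08 * educ + rng.normal(0.0, 0.5, size=n)   # no regional effect

const = np.ones(n)
dummies = np.column_stack([(region == j).astype(float) for j in (1, 2, 3)])
X_ur = np.column_stack([const, dummies, educ])  # unrestricted: 3 regional dummies
X_r = np.column_stack([const, educ])            # restricted: all three deltas equal to 0

df_ur = n - X_ur.shape[1]
F = f_statistic(ssr(X_r, logwage), ssr(X_ur, logwage), q=3, df_ur=df_ur)
print(f"F = {F:.3f} with (3, {df_ur}) degrees of freedom")
```

The statistic would then be compared with a critical value of the F distribution with 3 and 521 degrees of freedom, or converted into a p-value with a statistics library.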

22 The effect of education certificates on wages
Consider the model
$$\log(wage) = \beta_0 + \beta_1\,educ + \beta_2\,exper + u$$
We know that, in this model, $100\beta_1$ captures the return to education, i.e. the percentage change in wages due to one additional year of education, holding fixed labour experience and the other factors affecting wages. This model implies that one additional year of education has the same percentage effect whether education increases from 1 to 2 years, from 3 to 4, etc., regardless of whether the additional year completes a particular certificate such as Bachiller or a university degree. Additionally, many data sets contain no information on the years of education but do report the highest certificate achieved by each individual in the sample.

23 To take this "certificate" effect on wages into account, we define a series of dummy variables that capture the highest level achieved by each individual:
$$educ0 = \begin{cases} 1 & \text{if the individual did not finish educational level 1} \\ 0 & \text{otherwise} \end{cases}$$
$$educ1 = \begin{cases} 1 & \text{if the individual finished educational level 1 but did not finish level 2} \\ 0 & \text{otherwise} \end{cases}$$
$$\vdots$$
$$educJ = \begin{cases} 1 & \text{if the individual finished educational level } J \\ 0 & \text{otherwise} \end{cases}$$

24 Once these variables are defined, we consider the following model:
$$\log(wage) = \beta_0 + \delta_1\,educ1 + \cdots + \delta_J\,educJ + \beta_1\,exper + u \qquad (11)$$
where the excluded category is not having finished the first educational level. In this model:
- $100\delta_1$ captures the percentage difference in wages between individuals with educational level 1 and those who have not finished level 1, with the same labour experience;
- $100\delta_J$ captures the percentage difference in wages between individuals with educational level J and those who have not finished level 1, with the same labour experience.
The hypothesis that there are no wage differences among educational levels is $\delta_1 = \delta_2 = \cdots = \delta_J = 0$. We could also write the model including the dummy educ0 and excluding any one of the other dummies. The resulting model would be equivalent to model (11), and the relationship between the parameters could be obtained analogously to Example 1.

25 Example 4
Using a subsample for the year 2001 of 4450 Spanish workers from the data set "European Community Household Panel", the following model has been estimated:
$$\log(salary) = \beta_0 + \delta_1\,bach + \delta_2\,univ + \beta_1\,age + \beta_2\,agesq + u$$
where
- salary is the monthly wage in euros;
- bach takes the value 1 if the individual has finished bachillerato but does not have a university degree, and univ takes the value 1 if the individual has a university degree; the omitted category is not having finished high school;
- age is the age of the individual and $agesq = age^2$.
The data for this sample (SalaryECHP2001) are available in the folder "datos" on the virtual campus.

26 Example 4
The results of this estimation are:
$$\widehat{\log(salary)} = \underset{(0.088)}{[\ldots]} + \underset{(0.021)}{0.159}\,bach + \underset{(0.018)}{0.435}\,univ + \underset{(0.0046)}{[\ldots]}\,age + \underset{([\ldots])}{[\ldots]}\,agesq$$
n = 4450, R^2 = [...]
According to these results, the monthly wage for individuals of the same age is on average 15.9% higher for those who finished bachillerato than for those who did not finish it, and 43.5% higher for those who have a university degree than for those who did not even finish bachillerato.

27 Models with many qualitative factors
We can also construct models that include several qualitative factors at the same time. For example, we can consider marital status, for which we define two categories, married and not married, together with the dummy variable
$$married = \begin{cases} 1 & \text{if the individual is married} \\ 0 & \text{if the individual is not married} \end{cases}$$
and the model
$$\log(wage) = \beta_0 + \delta_0\,female + \gamma_0\,married + \beta_1\,educ + u \qquad (12)$$
where wage is the hourly wage. Note that in this model the reference category for gender is male and the reference category for marital status is not being married.

28 Computing in this model the mean of the log wage for males and females, married and not married, with the same years of education, we have:
$$\begin{aligned} E(\log(wage) \mid female = 0, married = 0) &= \beta_0 + \beta_1\,educ \\ E(\log(wage) \mid female = 1, married = 0) &= \beta_0 + \delta_0 + \beta_1\,educ \\ E(\log(wage) \mid female = 0, married = 1) &= \beta_0 + \gamma_0 + \beta_1\,educ \\ E(\log(wage) \mid female = 1, married = 1) &= \beta_0 + \delta_0 + \gamma_0 + \beta_1\,educ \end{aligned}$$
where these expectations are also conditional on educ.

29 and therefore
$$\begin{aligned} \delta_0 &= E(\log(wage) \mid female = 1, married = 0) - E(\log(wage) \mid female = 0, married = 0) \\ &= E(\log(wage) \mid female = 1, married = 1) - E(\log(wage) \mid female = 0, married = 1) \end{aligned}$$
$$\begin{aligned} \gamma_0 &= E(\log(wage) \mid female = 0, married = 1) - E(\log(wage) \mid female = 0, married = 0) \\ &= E(\log(wage) \mid female = 1, married = 1) - E(\log(wage) \mid female = 1, married = 0) \end{aligned}$$
- $100\delta_0$ captures the percentage difference in wages between females and males with the same years of education and the same marital status.
- $100\gamma_0$ captures the percentage difference in wages between married and unmarried individuals with the same years of education and the same gender.

30 We could also write the model using the dummy variable for male instead of female, and the dummy for not being married instead of the dummy for married. These models are equivalent, and the relationship between their parameters can be obtained analogously to Example 1.
Example 5
With the same sample as in Example 1, the following model for the hourly wage has been estimated:
$$\widehat{\log(wage)} = [\ldots] - \underset{(0.0386)}{0.328}\,female + \underset{(0.0394)}{0.209}\,married + \underset{(0.0069)}{[\ldots]}\,educ$$
These results show that the wage of females is on average 32.8% lower than the wage of males with the same years of education and the same marital status, while the wage of married individuals is on average 20.9% higher than the wage of unmarried individuals with the same years of education and the same gender.

31 Interactions with binary variables
Interactions between binary variables
The model analysed in the previous section does not allow the wage difference between males and females with a given number of years of education to depend on the marital status of the individual. Thus, in the previous example, the percentage difference in wages between married females and married males is $100\delta_0$, and the difference between unmarried females and unmarried males is also $100\delta_0$. To allow the wage difference between males and females to vary with marital status, and the difference between married and unmarried individuals to vary with gender, we add to the model a variable that is the product of the dummy variable female and the dummy variable married.

32 We define the variable $female \cdot married = female \times married$ (note that this variable is a dummy that takes the value 1 for married females and 0 otherwise) and we consider the model:
$$\log(wage) = \beta_0 + \delta_0\,female + \gamma_0\,married + \varphi_0\,female \cdot married + \beta_1\,educ + u \qquad (13)$$
In this model the mean of the log wage for males and females, married and not married, with the same years of education is:
$$\begin{aligned} E(\log(wage) \mid female = 0, married = 0) &= \beta_0 + \beta_1\,educ \\ E(\log(wage) \mid female = 1, married = 0) &= \beta_0 + \delta_0 + \beta_1\,educ \\ E(\log(wage) \mid female = 0, married = 1) &= \beta_0 + \gamma_0 + \beta_1\,educ \\ E(\log(wage) \mid female = 1, married = 1) &= \beta_0 + \delta_0 + \gamma_0 + \varphi_0 + \beta_1\,educ \end{aligned}$$
where the expectations are also conditional on educ.

33 and therefore
$$\begin{aligned} \delta_0 &= E(\log(wage) \mid female = 1, married = 0) - E(\log(wage) \mid female = 0, married = 0) \\ \delta_0 + \varphi_0 &= E(\log(wage) \mid female = 1, married = 1) - E(\log(wage) \mid female = 0, married = 1) \\ \gamma_0 &= E(\log(wage) \mid female = 0, married = 1) - E(\log(wage) \mid female = 0, married = 0) \\ \gamma_0 + \varphi_0 &= E(\log(wage) \mid female = 1, married = 1) - E(\log(wage) \mid female = 1, married = 0) \end{aligned}$$

34 Introducing the interaction between the two dummy variables allows the wage difference between males and females with the same years of education to differ between married and unmarried individuals. In this model the percentage difference in wages between females and males with the same years of education is
- $100\delta_0$ for unmarried individuals;
- $100(\delta_0 + \varphi_0)$ for married individuals.
Likewise, the wage difference between married and unmarried individuals with the same years of education can differ between males and females. In this model the percentage difference in wages between married and unmarried individuals with the same years of education is
- $100\gamma_0$ for males;
- $100(\gamma_0 + \varphi_0)$ for females.

35 Example 5 (cont.)
With the same sample as in Example 1, the following results have been obtained:
$$\widehat{\log(wage)} = [\ldots] - \underset{(0.0602)}{0.103}\,female + \underset{(0.0554)}{0.40}\,married - \underset{(0.0771)}{0.37}\,female \cdot married + \underset{(0.0067)}{[\ldots]}\,educ$$
According to these results:
- The wage of unmarried females is on average 10.3% lower than the wage of unmarried males with the same years of education.
- The wage of married females is on average 47.3% lower than the wage of married males with the same years of education, since $100(-0.103 - 0.37) = -47.3$.
- The wage of married males is on average 40% higher than the wage of unmarried males with the same years of education.
- The wage of married females is on average 3% higher than the wage of unmarried females with the same years of education, since $100(0.40 - 0.37) = 3$.
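The four comparisons follow mechanically from the three dummy coefficients of model (13). As a small sketch, using the rounded coefficient values implied by the percentages reported in Example 5 (the function name and dictionary keys are hypothetical):

```python
def group_gaps(delta0, gamma0, phi0):
    """Approximate percentage wage gaps implied by model (13), holding educ fixed."""
    return {
        "female_vs_male_unmarried": 100 * delta0,
        "female_vs_male_married": 100 * (delta0 + phi0),
        "married_vs_unmarried_male": 100 * gamma0,
        "married_vs_unmarried_female": 100 * (gamma0 + phi0),
    }

# Rounded coefficients of female, married and female*married implied by Example 5
gaps = group_gaps(delta0=-0.103, gamma0=0.40, phi0=-0.37)
for name, value in gaps.items():
    print(f"{name}: {value:.1f}%")
```

The interaction coefficient enters both the gender gap among married individuals and the marriage premium among females, which is why a single additional parameter changes two of the four comparisons.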

36 Example 5 (cont.)
To test in this model the null hypothesis that the wage difference between males and females is the same for married and unmarried individuals, we have to test:
$$H_0: \varphi_0 = 0 \qquad H_1: \varphi_0 \neq 0$$
Under the assumption of normal errors, the test statistic is
$$t = \frac{\hat\varphi_0}{se(\hat\varphi_0)} \sim t_{521} \text{ under } H_0$$
The value of the statistic in the sample is t = -0.37/0.0771 = -4.8 and the p-value is essentially zero. Therefore, we can reject $H_0$ at any reasonable significance level and conclude that the wage difference between males and females with the same years of education is different for married and unmarried individuals.

37 How to allow different slopes
So far we have considered examples of how to use dummy variables to allow the constant terms of different groups to differ in the regression model. In all these examples the slopes were the same for all groups and, therefore, the partial effects of the explanatory variables were the same for all groups. In practice, there are situations in which the slopes of the model may differ between groups. For example, there can be differences between males and females in their returns to education.

38 To take into account possible differences in the returns to education between males and females, we can consider the following model:
$$\log(wage) = \beta_0 + \delta_0\,female + \beta_1\,educ + \delta_1\,female \cdot educ + u \qquad (14)$$
where $female \cdot educ = female \times educ$. In this model, for males (female = 0) the constant term is $\beta_0$ and the slope is $\beta_1$, while for females (female = 1) the constant term is $\beta_0 + \delta_0$ and the slope is $\beta_1 + \delta_1$. Therefore, $\delta_0$ captures the difference in the intercept between females and males, and $\delta_1$ captures the difference in the slope between females and males, that is, the difference in the returns to education between females and males. Graphically, if $\delta_0 < 0$ and $\delta_1 < 0$, the regression line for females has a lower intercept and a smaller slope than the line for males.
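One way to see what model (14) does is to compare it with two separate regressions, one per gender. The sketch below uses simulated data (not WAGE1) with assumed coefficient values; the intercept and slope recovered from the pooled interaction model coincide with those of the group-by-group regressions.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients for y = X b + u (X must already contain the constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)                 # simulated data, NOT the WAGE1 sample
n = 526
female = rng.integers(0, 2, size=n).astype(float)
educ = rng.integers(8, 19, size=n).astype(float)
logwage = (0.8 - 0.3 * female + 0.08 * educ - 0.01 * female * educ
           + rng.normal(0.0, 0.4, size=n))

const = np.ones(n)
# Model (14): constant, female, educ and the interaction female * educ
b0, d0, b1, d1 = ols(np.column_stack([const, female, educ, female * educ]), logwage)

is_f = female == 1
coef_m = ols(np.column_stack([const[~is_f], educ[~is_f]]), logwage[~is_f])  # males only
coef_f = ols(np.column_stack([const[is_f], educ[is_f]]), logwage[is_f])     # females only

print("males:   pooled", (b0, b1), " separate", tuple(coef_m))
print("females: pooled", (b0 + d0, b1 + d1), " separate", tuple(coef_f))
```

The pooled specification is nevertheless more convenient for testing, because the differences between groups appear as individual coefficients with their own standard errors.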


40 To test whether there are differences in the returns to education between males and females, we must test the null hypothesis $H_0: \delta_1 = 0$ in model (14). Note that the null hypothesis $\delta_1 = 0$ does not impose any restriction on $\delta_0$, so under this hypothesis the constant term of the wage equation can differ between males and females. Therefore, this hypothesis allows for wage differences between females and males, but these differences must be constant (in percentage terms) across all educational levels. We could also test whether there are any wage differences between males and females with the same years of education. To do so, we must test the null hypothesis $H_0: \delta_0 = \delta_1 = 0$ in model (14).

41 Example 6
Using again the same sample as in Example 1, model (14) has been estimated with the following results:
$$\widehat{\log(wage)} = \underset{(0.119)}{[\ldots]} - \underset{(0.185)}{0.360}\,female + \underset{(0.009)}{0.077}\,educ + \underset{(0.0145)}{[\ldots]}\,female \cdot educ$$
n = 526, R^2 = 0.300, SSR = [...]
According to these results, the return to education is on average 7.7% for males and essentially the same for females. The estimated difference in the returns to education between females and males is on average [...]%. This difference is negligible from the economic point of view and, additionally, is not statistically significant, since the t-ratio is very small, t = [...].

42 Example 6 (cont.)
The coefficient of female is very large from the economic point of view, since it implies a wage difference of 36% between males and females with the same years of education. However, the associated t-ratio is relatively small in absolute value, t = -0.36/0.185 = -1.946, compared with the result obtained when the interaction term is not included (see equation (9)). The p-value of the one-sided test is $Prob(t_{522} < -1.946)$ = [...], so we could reject at the 5% level (but not at the 1% level) the hypothesis $H_0: \delta_0 = 0$ in favour of $H_1: \delta_0 < 0$. Note, however, that the estimated coefficient of female is quite similar to the one obtained when the interaction is not included. The difference is that the standard error is now about 5 times larger than in equation (9), which implies that the statistical evidence against the null hypothesis is weaker now than when the interaction was not included.

43 Example 6 (cont.)
The standard error is so large because of multicollinearity: the correlation between female and female·educ is very high. Note that interpreting the estimated coefficient of female (with the opposite sign) as the constant percentage difference between males and females with the same years of education is correct in this example only because the coefficient of female·educ is essentially zero. In general, if the model includes an interaction between the dummy variable and one of the explanatory variables, the coefficient of the dummy variable alone does not capture the difference in the dependent variable between the two groups for given values of the explanatory variables.
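The size of this correlation is easy to illustrate. The sketch below uses simulated data in which gender and education are drawn independently (an assumption made only for illustration, not a feature of WAGE1); even then, the dummy and its interaction with education are almost collinear, because the interaction equals zero for every male and is a positive number for every female.

```python
import numpy as np

rng = np.random.default_rng(4)                 # simulated data, NOT the WAGE1 sample
n = 526
female = rng.integers(0, 2, size=n).astype(float)
educ = rng.integers(8, 19, size=n).astype(float)   # years of education between 8 and 18

# Sample correlation between the dummy and its interaction with education
corr = np.corrcoef(female, female * educ)[0, 1]
print(f"corr(female, female*educ) = {corr:.3f}")
```

A correlation this close to one inflates the standard error of the coefficient of female, which is consistent with the loss of precision observed in Example 6.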

44 Example 6 (cont.)
We now test whether there are wage differences between males and females with the same level of education, that is, we test the null hypothesis $H_0: \delta_0 = \delta_1 = 0$. To perform this test we estimate the restricted model
$$\log(wage) = \beta_0 + \beta_1\,educ + u$$
and compute its sum of squared residuals, $SSR_r$ = [...]. The test statistic is
$$F = \frac{(SSR_r - SSR_{nr})/2}{SSR_{nr}/522} \sim F_{2,522} \text{ under } H_0$$
where $SSR_{nr}$ is the sum of squared residuals of the unrestricted model (14). With the values obtained in this sample, F = [...]. The p-value of the test is 0 and therefore there is strong evidence against the null hypothesis.

45 Testing for differences in the regression functions between different groups
We now see how to test whether two populations or groups follow the same regression function, against the alternative hypothesis that the constant term or at least one of the slopes differs between groups.
Example 7
Suppose we want to test whether the same model describes the college grade point averages (GPA) of male and female athletes:
$$cumgpa = \beta_0 + \beta_1\,sat + \beta_2\,hsperc + \beta_3\,tothrs + u \qquad (15)$$
where cumgpa is the college grade point average on a four-point scale, hsperc is the high school rank percentile (so that hsperc = 5 means the individual is among the best 5% of graduating students), sat is the SAT score and tothrs is the total hours of college courses.

46 Example 7 (cont.)
If we are interested in testing whether there are differences between males and females in this model, we must add one of the gender dummies, for example female, and the interactions of this dummy with all the variables in the model, so that the constant term and all the slopes are allowed to differ between groups:
$$cumgpa = \beta_0 + \delta_0\,female + \beta_1\,sat + \delta_1\,female \cdot sat + \beta_2\,hsperc + \delta_2\,female \cdot hsperc + \beta_3\,tothrs + \delta_3\,female \cdot tothrs + u \qquad (16)$$
The parameter $\delta_0$ is the difference in the constant term between females and males, $\delta_1$ is the difference between females and males in the effect of sat on cumgpa, and so on. The null hypothesis that cumgpa follows the same model for males and females is
$$H_0: \delta_0 = \delta_1 = \delta_2 = \delta_3 = 0$$

47 Example 7 (cont.)
Using the data for the spring term in file GPA3 of Wooldridge, the following results have been obtained:
$$\widehat{cumgpa} = \underset{(0.21)}{1.48} + \underset{(0.411)}{[\ldots]}\,female + \underset{([\ldots])}{[\ldots]}\,sat + \underset{([\ldots])}{[\ldots]}\,female \cdot sat + \underset{(0.0014)}{[\ldots]}\,hsperc$$
$$\qquad + \underset{(0.0032)}{[\ldots]}\,female \cdot hsperc + \underset{([\ldots])}{[\ldots]}\,tothrs + \underset{(0.0016)}{[\ldots]}\,female \cdot tothrs$$
n = 366, R^2 = 0.406, SSR = [...]
Neither female nor any of its interactions is individually significant at the 5% level. In fact, at the 10% level only the interaction with sat is significant. However, we know that this does not mean that there is no evidence against the null hypothesis $\delta_0 = \delta_1 = \delta_2 = \delta_3 = 0$.

48 Example 7 (cont.)
To test this hypothesis we have to estimate the restricted model
$$cumgpa = \beta_0 + \beta_1\,sat + \beta_2\,hsperc + \beta_3\,tothrs + u$$
and compute its sum of squared residuals, $SSR_r$ = [...]. The test statistic is
$$F = \frac{(SSR_r - SSR_{nr})/4}{SSR_{nr}/358} \sim F_{4,358} \text{ under } H_0$$
where $SSR_{nr}$ is the sum of squared residuals of the unrestricted model (16). With the values obtained in this sample, F = 8.18. The p-value is essentially 0 and therefore there is strong evidence against the null hypothesis. The model for GPA is different for males and females, despite the fact that neither female nor any of its interactions is individually significant at the 5% level.

49 In a model like the one above, with only three explanatory variables, it is quite easy to add all the interaction terms for the differences between groups. However, in other examples with many explanatory variables, it is convenient to compute the F statistic in a different way: the statistic can be obtained without estimating the model with all the interaction terms. Consider a general model with k explanatory variables and a constant, and assume that we have two groups, denoted g = 1 and g = 2, and that we want to test whether the constant term and all the slopes are the same for both groups. We write the model as
$$y = \beta_{g,0} + \beta_{g,1} x_1 + \dots + \beta_{g,k} x_k + u \qquad (17)$$
for g = 1 and g = 2. The hypothesis that the parameters of the model are the same for the two groups is equivalent to k + 1 linear restrictions.

50 The unrestricted model includes 2(k + 1) regressors: the constant term, one of the group dummies, the k explanatory variables and the k interactions of the group dummy with the explanatory variables. It can be shown that the SSR of the unrestricted model can be computed as $SSR_{nr} = SSR_1 + SSR_2$, where $SSR_1$ is the SSR of model (17) estimated with the observations of group 1 and $SSR_2$ is the SSR of model (17) estimated with the observations of group 2. Therefore the test statistic is
$$F = \frac{\left(SSR_r - (SSR_1 + SSR_2)\right)/(k + 1)}{(SSR_1 + SSR_2)/\left(n - 2(k + 1)\right)} \sim F_{k+1,\,n-2(k+1)} \quad \text{under } H_0 \qquad (18)$$
This statistic is known as the Chow statistic.
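The identity $SSR_{nr} = SSR_1 + SSR_2$ and the statistic in (18) can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are simulated (k = 2 regressors, n = 200 observations, a made-up data-generating process), and `ols_ssr` is a hypothetical helper that solves the normal equations in plain Python rather than calling a statistics library.

```python
import random


def ols_ssr(X, y):
    """Sum of squared residuals from an OLS regression of y on the columns of X."""
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


# Simulated data (assumption): intercept and slope on x1 differ between groups.
random.seed(0)
k, n = 2, 200
rows = []
for i in range(n):
    g = i % 2
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    y = 1.0 + 0.5 * g + 2.0 * x1 - 1.0 * x2 + 0.8 * g * x1 + random.gauss(0, 1)
    rows.append((g, x1, x2, y))


def design(sample, interactions=False):
    """Rows [1, x1, x2], optionally followed by [g, g*x1, g*x2]."""
    return [[1.0, x1, x2] + ([g, g * x1, g * x2] if interactions else [])
            for g, x1, x2, _ in sample]


ys = [r[3] for r in rows]
ssr_r = ols_ssr(design(rows), ys)                      # pooled, no group terms
ssr_nr = ols_ssr(design(rows, interactions=True), ys)  # dummy plus all interactions
group1 = [r for r in rows if r[0] == 0]
group2 = [r for r in rows if r[0] == 1]
ssr_1 = ols_ssr(design(group1), [r[3] for r in group1])
ssr_2 = ols_ssr(design(group2), [r[3] for r in group2])

# Chow statistic as in (18).
chow = ((ssr_r - (ssr_1 + ssr_2)) / (k + 1)) / ((ssr_1 + ssr_2) / (n - 2 * (k + 1)))
print(abs(ssr_nr - (ssr_1 + ssr_2)) < 1e-6)  # the two routes should agree up to rounding
print(chow)
```

In this sketch the two group regressions replace the single regression with all interaction terms, which is the practical advantage of the Chow form when k is large.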

51 Example 7 (cont.) If in Example 7 we estimate model (17) separately, using the data for the 90 women in the sample to obtain $SSR_1$ and the data for the 276 men to obtain $SSR_2$, the sum $SSR_1 + SSR_2$ coincides with the SSR of the unrestricted model computed above. An important limitation of the Chow test is that the null hypothesis does not allow for any differences between groups. In many cases, it is more interesting to allow for differences in the constant term and test only whether there are differences in the slopes. In this case, the restricted model includes a dummy for the group, and the unrestricted model includes the group dummy and all the possible interactions. As in the previous test, $SSR_{nr}$ can be computed by estimating the model separately with the two subsamples and adding up the two SSRs.

52 Example 7 (cont.) In Example 7, in order to test whether there are differences in the slopes while allowing for differences in the constant term, we should test $H_0: \delta_1 = \delta_2 = \delta_3 = 0$ in model (16). The restricted model is
$$\text{cumgpa} = \beta_0 + \delta_0\,\text{female} + \beta_1\,\text{sat} + \beta_2\,\text{hsperc} + \beta_3\,\text{tothrs} + u \qquad (19)$$
and its sum of squared residuals, $SSR_r$, is obtained by estimating this model with the 366 observations. The test statistic is
$$F = \frac{(SSR_r - SSR_{nr})/3}{SSR_{nr}/(366 - 8)} \sim F_{3,358} \quad \text{under } H_0,$$
and substituting the two sums of squared residuals gives F = 1.53. The p-value of the test is above the conventional significance levels, and therefore there is no evidence against the null hypothesis.

53 Example 7 (cont.) Not rejecting the null hypothesis that there are no differences in the slopes suggests that the best model is the one with differences only in the constant term, that is, model (19). The results of the estimation of this model are (standard errors in parentheses; only the estimates that survive in this transcription are shown as numbers, the others are left in symbols):
$$\widehat{\text{cumgpa}} = \underset{(0.180)}{\hat\beta_0} + \underset{(0.059)}{0.31}\,\text{female} + \hat\beta_1\,\text{sat} + \underset{(0.0012)}{\hat\beta_2}\,\text{hsperc} + \hat\beta_3\,\text{tothrs}, \qquad n = 366$$
The dummy female is now highly significant, and its estimated coefficient implies that, for given values of sat, hsperc and tothrs, the GPA of females is on average 0.31 points higher than the GPA of males, which is a substantial difference.


WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

In Class Review Exercises Vartanian: SW 540

In Class Review Exercises Vartanian: SW 540 In Class Review Exercises Vartanian: SW 540 1. Given the following output from an OLS model looking at income, what is the slope and intercept for those who are black and those who are not black? b SE

More information

Problem C7.10. points = exper.072 exper guard forward (1.18) (.33) (.024) (1.00) (1.00)

Problem C7.10. points = exper.072 exper guard forward (1.18) (.33) (.024) (1.00) (1.00) BOSTON COLLEGE Department of Economics EC 228 02 Econometric Methods Fall 2009, Prof. Baum, Ms. Phillips (TA), Ms. Pumphrey (grader) Problem Set 5 Due Tuesday 10 November 2009 Total Points Possible: 160

More information

GROWING APART: THE CHANGING FIRM-SIZE WAGE PREMIUM AND ITS INEQUALITY CONSEQUENCES ONLINE APPENDIX

GROWING APART: THE CHANGING FIRM-SIZE WAGE PREMIUM AND ITS INEQUALITY CONSEQUENCES ONLINE APPENDIX GROWING APART: THE CHANGING FIRM-SIZE WAGE PREMIUM AND ITS INEQUALITY CONSEQUENCES ONLINE APPENDIX The following document is the online appendix for the paper, Growing Apart: The Changing Firm-Size Wage

More information

Sociology 593 Exam 2 Answer Key March 28, 2002

Sociology 593 Exam 2 Answer Key March 28, 2002 Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Explained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

Econometrics - 30C00200

Econometrics - 30C00200 Econometrics - 30C00200 Lecture 11: Heteroskedasticity Antti Saastamoinen VATT Institute for Economic Research Fall 2015 30C00200 Lecture 11: Heteroskedasticity 12.10.2015 Aalto University School of Business

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 1: Introduction

Economics 326 Methods of Empirical Research in Economics. Lecture 1: Introduction Economics 326 Methods of Empirical Research in Economics Lecture 1: Introduction Hiro Kasahara University of British Columbia December 24, 2014 What is Econometrics? Econometrics is concerned with the

More information

Irish Industrial Wages: An Econometric Analysis Edward J. O Brien - Junior Sophister

Irish Industrial Wages: An Econometric Analysis Edward J. O Brien - Junior Sophister Irish Industrial Wages: An Econometric Analysis Edward J. O Brien - Junior Sophister With pay agreements firmly back on the national agenda, Edward O Brien s topical econometric analysis aims to identify

More information

Lecture 28 Chi-Square Analysis

Lecture 28 Chi-Square Analysis Lecture 28 STAT 225 Introduction to Probability Models April 23, 2014 Whitney Huang Purdue University 28.1 χ 2 test for For a given contingency table, we want to test if two have a relationship or not

More information

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

More information

The general linear regression with k explanatory variables is just an extension of the simple regression as follows

The general linear regression with k explanatory variables is just an extension of the simple regression as follows 3. Multiple Regression Analysis The general linear regression with k explanatory variables is just an extension of the simple regression as follows (1) y i = β 0 + β 1 x i1 + + β k x ik + u i. Because

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 4: OLS and Statistics revision Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 4 VŠE, SS 2016/17 1 / 68 Outline 1 Econometric analysis Properties of an estimator

More information