Properties of estimator Functional Form. Econometrics. Lecture 8. Nathaniel Higgins JHU.

1 Econometrics Lecture 8 Nathaniel Higgins JHU

2 Homework Due next class (Nov. 9): have the GDP, population, temperature, energy, and mortality data together. If you have questions or need help, let Rob or me know! Rob will use part of his time this week specifically to help iron out any difficulties with collecting and curating this data. Put time into getting all of this data on hand as soon as possible. If you have time to collect all the data, do it!

3 Hypotheses Wrap-up We have completed the material on basic hypothesis testing. You should be able to: (1) test single hypotheses involving a single coefficient; (2) test single hypotheses involving a combination of coefficients; (3) test multiple hypotheses jointly. Single hypotheses can be evaluated against (1) a one-sided alternative or (2) a two-sided alternative. We can evaluate all our hypotheses using test statistics or p-values. We can evaluate our t-tests using a confidence interval as well.

4 Next topic Properties of the OLS estimator. This material maps over to Wooldridge Ch. 5 and finishes up our work with basic OLS. The next set of topics will be on functional form: how to write models that deal with different shapes in the data (more fun).

5 Properties of the OLS estimator Everything we've done up to this point has used the OLS estimator. Why? Because the OLS estimator has nice properties under a reasonable set of assumptions. Recall the assumptions we have made: MLR.1 - MLR.6.

6 Gauss-Markov assumptions MLR.1: $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$. MLR.2: Our data (x, y) represent a random sample from the relevant population. MLR.3: $x_i \neq z \, x_j$ for any constant $z$ (no perfect collinearity). MLR.4: $E(u \mid x) = 0$.

7 Gauss-Markov assumptions MLR.5: $\mathrm{Var}(u \mid x) = \sigma^2$. MLR.6: u is independent of the x's and is normally distributed with mean zero and variance $\sigma^2$.

9 Properties of the OLS estimator If we make assumptions MLR.1 - MLR.4, we get: OLS is unbiased (see section 3.3). Know this! If we add assumption MLR.5, we get: OLS is BLUE (see section 3.5). Know this! If we add assumption MLR.6, not only do we know that OLS is best, in the sense that it has lower variance than all the other linear unbiased estimators, we can actually compute the variance. If we assume only MLR.1 - MLR.4, we get that OLS is consistent (new jargon).

10 Properties of the OLS estimator Consistency Consistency is a large-sample property. What does this mean, practically? It means you appeal to large-sample properties whenever you can. Typically, large-sample properties can be appealed to with data sets that most of you would consider small. Consistency means: as the sample size gets bigger and bigger (formally: approaches infinity), the distribution of the OLS estimator $\hat\beta_j$ collapses around the true value $\beta_j$. That's really all I expect you to know: as you get more and more data, the OLS estimator gets better and better (in the sense that our estimates become tighter around the true value).
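
A small simulation can make the "collapsing" picture concrete. This is only a sketch under assumed values: the true coefficients, the error standard deviation, the sample sizes, and the helper names `ols_slope` and `simulate_slopes` are all hypothetical and do not come from the lecture.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope for a one-regressor model with an intercept."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(n, reps, beta0=1.0, beta1=1.5, sigma=2.0, seed=0):
    """Draw `reps` samples of size n and return the OLS slope from each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


small = simulate_slopes(n=25, reps=200)     # hypothetical "small" sample size
large = simulate_slopes(n=2500, reps=200)   # hypothetical "large" sample size

spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print("spread of slope estimates, n=25:  ", round(spread_small, 4))
print("spread of slope estimates, n=2500:", round(spread_large, 4))
```

With 100 times as much data, the spread of the estimates around the true slope should come out roughly ten times smaller.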

11 Properties of the OLS estimator Normality So if we assume MLR.1 - MLR.4, we get that OLS is asymptotically awesome. But recall: we needed MLR.5 and MLR.6 to perform inference. Why? Because we needed to know the exact distribution of $\hat\beta_j$ to form our t- and F-statistics. Problem: in small samples, we may find that u doesn't look very normal. To the rescue: the central limit theorem (appendix C). If we assume MLR.5 (homoskedasticity) but NOT MLR.6 (normality), we get: as n approaches infinity, $(\hat\beta_j - \beta_j)/\mathrm{se}(\hat\beta_j)$ is distributed normally.

15 Properties of the OLS estimator Normality Hold the phone! I thought that $(\hat\beta_j - \beta_j)/\mathrm{se}(\hat\beta_j)$ was distributed as a t! Fear not. It is distributed as a t. But as sample sizes get really big, the t distribution starts to look exactly like N(0,1) (the standard normal distribution). So... the central limit theorem helps rescue us, because we can essentially use the same testing procedures that we have always used even if u is not normally distributed, as long as the sample is big enough. What's big enough?

17 Properties of the OLS estimator Normality What's big enough? Depends. If u is not exactly normal, but it's pretty close, n = 30 is plenty. If u is really, really not normal, then we need more n. How to tell? Graph the distribution of $\hat{u}$! If it looks kinda symmetrical and nice, you're in good shape with a relatively small sample. If it looks way ugly, buckle up and get a bigger sample.
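
The claim that the usual testing procedure still works with non-normal errors can be checked by simulation. The sketch below assumes hypothetical values throughout (skewed exponential errors, n = 200, 1,000 replications, made-up coefficients) and counts how often a 5% two-sided test of the true slope rejects, using the normal critical value 1.96.

```python
import math
import random


def t_stat_for_slope(x, y, beta1_true):
    """(b1 - beta1) / se(b1) for a one-regressor OLS fit with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)
    return (b1 - beta1_true) / se_b1


def rejection_rate(n, reps, seed=1):
    """Share of samples in which |t| > 1.96 although the null is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        # Skewed, mean-zero, homoskedastic errors: exponential(1) minus 1.
        u = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        y = [2.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]
        if abs(t_stat_for_slope(x, y, 0.5)) > 1.96:
            rejections += 1
    return rejections / reps


rate = rejection_rate(n=200, reps=1000)
print("rejection rate with skewed errors:", rate)
```

If the large-sample approximation is doing its job, the printed rate should land near 0.05 even though u is far from normal.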

18 Properties of the OLS estimator Efficiency With small samples, we knew that OLS was BLUE (using MLR.1 - MLR.5). Saying that OLS is BLUE means: OLS is the Best Linear Unbiased Estimator. "Bestness" refers to the fact that the OLS estimator has minimum variance among all linear unbiased estimators. A synonym for "best" in this context is "efficient". When we say OLS is efficient, we mean that it has minimum variance (ask me why). With large samples, we can say that OLS is asymptotically efficient.

19 Plan Discuss functional form more directly than we have so far: logarithmic transformations, polynomial forms, units.

20 Functional form Transformation of variables Everything we have learned so far has been about linear models: $y = \beta_0 + \beta_1 x + u$. We say that the model above is linear. There is a small set of nonlinear functional forms that you can work with... because you can transform them into linear functional forms.

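24 Functional form Transformation of variables

model        dep var   ind var   coeff
level-level  y         x         $\Delta y$ due to $\Delta x$
log-log      log(y)    log(x)    $\%\Delta$ in y due to 1% $\Delta$ in x
level-log    y         log(x)    $100 \cdot \Delta y$ due to 1% $\Delta x$
log-level    log(y)    x         $(1/100) \cdot (\%\Delta y)$ due to $\Delta x$

model        dep var   ind var   relationship
level-level  y         x         $\Delta y = \beta \, \Delta x$
log-log      log(y)    log(x)    $\%\Delta y = \beta \, \%\Delta x$
level-log    y         log(x)    $\Delta y = (1/100) \, \beta \, \%\Delta x$
log-level    log(y)    x         $\%\Delta y = 100 \, \beta \, \Delta x$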
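
As a quick check of the log-log row, the sketch below builds a made-up exact relationship y = 3·x^0.5 (so the elasticity is 0.5 by construction), regresses log(y) on log(x), and compares the slope with the percentage change in y from a 1% change in x. The data and the helper `ols_fit` are hypothetical illustrations, not part of the lecture.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical noiseless data: y = 3 * x^0.5, so log(y) = log(3) + 0.5 log(x).
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * xi ** 0.5 for xi in x]

intercept, slope = ols_fit([math.log(v) for v in x], [math.log(v) for v in y])

# Percentage change in y when x rises by 1% (the starting x does not matter).
pct_change_y = 100 * ((1.01 * 4.0) ** 0.5 / 4.0 ** 0.5 - 1)

print("log-log slope (elasticity):", round(slope, 6))
print("% change in y for a 1% change in x:", round(pct_change_y, 4))
```

The slope and the percentage change agree up to the small approximation error inherent in reading log differences as percentage changes.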

27 Functional form Transformation of variables Interpretation of the coefficient values when using a logarithmic transformation (the slope is 5 in each example): $\hat{y} = \hat\beta_0 + 5x$: when x increases by one unit, y increases by 5 units. $\hat{y} = \hat\beta_0 + 5\log(x)$: when x increases by one pct., y increases by 0.05 units. $\widehat{\log(y)} = \hat\beta_0 + 5x$: when x increases by one unit, y increases by approximately 500 pct.
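
The log-level reading rests on an approximation that is worth writing out. The derivation below is a standard sketch, with $y_0$ and $y_1$ introduced here as notation for y before and after the change in x:

```latex
\begin{align*}
\log(y_1) - \log(y_0) &= \beta \, \Delta x \\
\frac{y_1}{y_0} &= e^{\beta \Delta x} \\
\%\Delta y &= 100 \left( e^{\beta \Delta x} - 1 \right)
  \approx 100 \, \beta \, \Delta x
  \quad \text{when } \beta \, \Delta x \text{ is small}
\end{align*}
```

For a small coefficient such as $\beta = 0.05$ and $\Delta x = 1$, the exact change is $100(e^{0.05} - 1) \approx 5.13$ pct. against an approximate 5 pct. For a coefficient as large as 5, the exact change is $100(e^{5} - 1) \approx 14{,}741$ pct., so the "500 pct." reading should be taken as the rule of thumb only.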

32 Functional form Transformation of variables Interpretation of the coefficient values when using a logarithmic transformation: $\widehat{\log(y)} = \hat\beta_0 + 5\log(x)$: when x increases by one pct., y increases by 5 pct. A logarithmic transformation should be done for the right reasons. Even though the interpretation of the coefficients in a log-transformed model is sometimes convenient, you should never transform data because of the interpretation of coefficients. OK, who's going to ask it? Q: So why do you do the transformation? Because it fits the data better.

33 Functional form Transformation of variables Because it fits the data better? Because theory or practice (previous experience) suggests that the relationship is best described by logarithmic transforms. Example: economic growth is often thought to be best described by a proportional relationship (Horowitz transforms GDP using logarithms).

34 Functional form Polynomials The use of polynomials gives us a lot of flexibility. You can approximate any crazy functional form with polynomials. Typically, we use only quadratic (squared) or cubic (cubed) terms to approximate; higher-order terms are rare. We do this because we believe the relationship between y and x is not constant. Example: the environmental Kuznets curve (which we talked about last time). Using logarithms to describe a nonlinear relationship implies a very specific nonlinear function; using polynomials is more flexible.

36 Functional form Polynomials Polynomial transforms and their interpretation. We usually interpret coefficients $\beta$ as the marginal impact of changes in x on y. In the relationship $y = \beta_0 + \beta_1 x + u$, $\beta_1$ is the increase in y due to a one-unit increase in x. Suppose we specify a quadratic relationship: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$. If x increases by one unit, by how much does y increase?

39 Functional form Polynomials $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$ If x increases by a little bit, by how much does y increase? Calculus! $\frac{dy}{dx} = \beta_1 + 2\beta_2 x$. When x increases by a little bit, y increases by $\beta_1 + 2\beta_2 x$ per unit of x. The coefficients no longer have much interpretation by themselves; rather, we interpret a function of the coefficients (above) to describe the relationship between y and x (see Horowitz for an example of this).

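41 Functional form Polynomials $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$ If x increases by one unit, by how much does y increase? No calculus! Evaluate the model at x + 1: $y_{\text{new}} = \beta_0 + \beta_1 (x + 1) + \beta_2 (x + 1)^2 + u = \beta_0 + \beta_1 x + \beta_1 + \beta_2 x^2 + 2\beta_2 x + \beta_2 + u$. Subtract the original equation, $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$, from the above, and what do we get? $\Delta y = \beta_1 + 2\beta_2 x + \beta_2$.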
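
The two answers for the quadratic model (the derivative for a small change, and the exact difference for a one-unit change) can be compared numerically. The coefficient values and function names below are hypothetical, chosen only to illustrate the formulas.

```python
def quadratic(beta0, beta1, beta2, x):
    """Systematic part of y = beta0 + beta1*x + beta2*x^2 (u left out)."""
    return beta0 + beta1 * x + beta2 * x ** 2


def marginal_effect(beta1, beta2, x):
    """dy/dx = beta1 + 2*beta2*x: effect of a very small change in x."""
    return beta1 + 2 * beta2 * x


def unit_change_effect(beta1, beta2, x):
    """Exact change in y when x rises from x to x + 1."""
    return beta1 + 2 * beta2 * x + beta2


# Hypothetical coefficients for illustration only.
beta0, beta1, beta2 = 1.0, 4.0, -0.5
x = 3.0

direct = quadratic(beta0, beta1, beta2, x + 1) - quadratic(beta0, beta1, beta2, x)
print("marginal effect at x=3:     ", marginal_effect(beta1, beta2, x))
print("unit-change effect at x=3:  ", unit_change_effect(beta1, beta2, x))
print("direct difference y(4)-y(3):", direct)
```

The unit-change formula matches the direct difference exactly, while the derivative differs from it by $\beta_2$; both depend on the starting value of x.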

42 Functional form Units That's enough about specific functional form for now. Using logarithms and polynomials, you will be able to understand 99% of the functional forms used in applied econometrics. (Other transformations like sin, cos, etc. are used, but very rarely and for specific reasons, such as describing cyclic behavior.) For exam-type purposes, make sure you understand how to work with coefficients to get to an interpretation of the relationship between y and x (see these slides, and page 46 of Wooldridge). There is one more thing we should touch on briefly: units.

46 Functional form Units We are interested in the relationship between GDP and energy reserves (oil, e.g.). We specify the following model: $\text{gdp} = \beta_0 + \beta_1 \, \text{oil} + u$. Nobody balks at this. Seems reasonable. But how is oil measured? In what units? We could legitimately measure oil in barrels, in thousands of barrels, or in BTUs. Any way we do it, we are still representing the fundamental relationship between gdp and oil.

50 Functional form Units Suppose we measure oil in BTUs and run our regression. We get: $\widehat{\text{gdp}} = \hat\beta_0 + \hat\beta_1 \, \text{BTU}$. What increase in GDP do we expect (based on our model) due to an increase in known oil reserves, measured in barrels? That is, we want to know what $\hat\alpha_1$ is from the regression: $\widehat{\text{gdp}} = \hat\alpha_0 + \hat\alpha_1 \, \text{barrels}$. We need to know the conversion between BTUs and barrels: 5.6 million BTU/barrel.

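51 Functional form Units Since BTU = 5.6 million $\times$ barrels, substituting into the first regression gives $\hat\beta_1 \, \text{BTU} = (5.6 \text{ million} \times \hat\beta_1) \, \text{barrels}$. Then we can say $\hat\alpha_1 = 5.6 \text{ million} \times \hat\beta_1$.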
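
The rescaling result is easy to confirm on toy data. In this sketch the reserves and GDP figures are invented, `ols_fit` is a hypothetical helper, and only the 5.6 million BTU/barrel factor comes from the slides.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


BTU_PER_BARREL = 5.6e6  # conversion factor quoted on the slide

barrels = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical oil reserves
gdp = [10.0, 12.5, 13.0, 17.0, 18.5]       # hypothetical GDP values
btu = [BTU_PER_BARREL * b for b in barrels]

beta0, beta1 = ols_fit(btu, gdp)           # regression on BTUs
alpha0, alpha1 = ols_fit(barrels, gdp)     # regression on barrels

print("alpha1 equals 5.6 million * beta1:",
      math.isclose(alpha1, BTU_PER_BARREL * beta1, rel_tol=1e-9))
print("intercept unchanged:", math.isclose(alpha0, beta0, rel_tol=1e-9))
```

Changing the units of the regressor rescales its coefficient by the conversion factor and leaves the intercept and the fitted values alone.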

54 Functional form Dummies Use of dummy variables on the RHS. Wait. What are dummy variables? Simple: dummy variables are binary (0/1) variables that indicate something. (That's why they are also known as indicator variables.) You probably already know what they are.

56 Dummy variables A dummy variable is an indicator variable: the variable indicates that some particular thing is true about an observation. The easiest way to start thinking about it is with a toy example (next slide). Keep in mind as we go through the examples: Big idea: means. Remember when I told you that econometrics was nothing but fancy averaging?

58 Toy example We have a sample of data that includes wage data for men and women. We suspect that men and women might earn different wages. We can run a model to test this presumption: $\text{wage} = \beta_0 + \beta_1 \, \text{male} + u$, where male is a dummy variable. What is the interpretation of $\beta_0$ in the model above?

65 Toy example Approach to interpretation: zero-out male. $\text{wage} = \beta_0 + \beta_1 \cdot 0 + u$. What does this leave us with? Women. Another way to say the same thing: what does the model look like for women? $\text{wage} = \beta_0 + u$. This is the wage equation conditional on being a woman (male = 0). The expected wage of women is given by $\beta_0$ (the model without the u term).

69 Toy example What does the model look like for men? $\text{wage} = \beta_0 + \beta_1 + u$. It's the wage equation, conditional on being a man (male = 1). What is the expected wage of men? $E(\text{wage} \mid \text{male} = 1) = \beta_0 + \beta_1$.
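
The "fancy averaging" idea shows up directly when the toy model is estimated: the intercept is the average wage of women and the slope is the gap between the two group averages. The wages below are invented for illustration, and `ols_fit` is a hypothetical helper.

```python
import math
import statistics


def ols_fit(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical sample: three women followed by three men.
male = [0, 0, 0, 1, 1, 1]
wage = [10.0, 12.0, 14.0, 15.0, 17.0, 19.0]

beta0_hat, beta1_hat = ols_fit(male, wage)

mean_women = statistics.fmean(w for w, m in zip(wage, male) if m == 0)
mean_men = statistics.fmean(w for w, m in zip(wage, male) if m == 1)

print("beta0_hat:", beta0_hat, " mean wage of women:", mean_women)
print("beta0_hat + beta1_hat:", beta0_hat + beta1_hat, " mean wage of men:", mean_men)
```

In this one-dummy model the regression reproduces the two sample means, which is the sample counterpart of the expected wages $\beta_0$ and $\beta_0 + \beta_1$.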

71 Dummy variables Dummy variables are just like other variables, except they only take on two values: 0 and 1. Using dummy variables effectively creates a bunch of category-specific intercepts (in our first example the two categories were men and women). Exercise (for studying): make a model of wages. Individuals are categorized two ways: by sex, and by whether or not they have a college degree. Build a model and determine the predicted wage (in terms of the coefficients of the model) for every type of person in the sample.

77 Dummy variables You might build a model like this: $\text{wage} = \beta_0 + \beta_1 \, \text{male} + \beta_2 \, \text{college} + u$. What is the expected wage of: (1) A woman who did not go to college? $\beta_0$. (2) A woman who did? $\beta_0 + \beta_2$. (3) A man who did not go to college? $\beta_0 + \beta_1$. (4) A man who did? $\beta_0 + \beta_1 + \beta_2$.

82 Dummy variables This is pretty easy to see in a table. $\text{wage} = \beta_0 + \beta_1 \, \text{male} + \beta_2 \, \text{college} + u$

         no college            college
women    $\beta_0$             $\beta_0 + \beta_2$
men      $\beta_0 + \beta_1$   $\beta_0 + \beta_1 + \beta_2$

There is one other way to build a model like this (entirely out of dummy variables).
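
The four cells of the table can be generated mechanically by plugging 0/1 values into the model. The coefficient values below are hypothetical and `expected_wage` is an illustrative helper, not something from the lecture.

```python
def expected_wage(beta0, beta1, beta2, male, college):
    """E(wage | male, college) for wage = beta0 + beta1*male + beta2*college + u."""
    return beta0 + beta1 * male + beta2 * college


# Hypothetical coefficients for illustration only.
beta0, beta1, beta2 = 12.0, 5.0, 8.0

table = {
    (sex, schooling): expected_wage(beta0, beta1, beta2, male, college)
    for sex, male in (("women", 0), ("men", 1))
    for schooling, college in (("no college", 0), ("college", 1))
}

for (sex, schooling), value in table.items():
    print(f"{sex:5s} {schooling:10s} {value}")

# The model is additive: the male premium is beta1 in both columns.
gap_no_college = table[("men", "no college")] - table[("women", "no college")]
gap_college = table[("men", "college")] - table[("women", "college")]
print("male premium without / with college:", gap_no_college, gap_college)
```

One design consequence worth noticing: because the model has no interaction term, the gap between men and women is forced to be the same at both education levels.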

88 Dummy variables We can build a model like this without a regular intercept. So instead of everybody sharing an intercept, each group can have its own: $\text{wage} = \beta_1 \, \text{female} + \beta_2 \, \text{male} + \beta_3 \, \text{college} + u$. Notice what is not in the model: nocollege. Suppose we add it: $\text{wage} = \beta_1 \, \text{female} + \beta_2 \, \text{male} + \beta_3 \, \text{college} + \beta_4 \, \text{nocollege} + u$. The second model cannot work! Why?

92 Dummy variable facts To understand why it won't work, it's easiest to reveal the same fact in a simpler setting. We can't run the following model for the same reason: $\text{wage} = \beta_0 + \beta_1 \, \text{female} + \beta_2 \, \text{male} + u$. We can't run this model because the constant that everybody shares (the data that multiplies $\beta_0$) is exactly equal to the sum of the variables female and male.

96 Dummy variable facts Let's re-write the equation with constant = 1 explicitly in the model so you can see what I mean: $\text{wage} = \beta_0 + \beta_1 \, \text{female} + \beta_2 \, \text{male} + u = \beta_0 \, \text{constant} + \beta_1 \, \text{female} + \beta_2 \, \text{male} + u$. We can't run this regression because constant = female + male, i.e. female + male = 1.

104 Dummy variable facts The data looks like this: in every row, the female and male columns sum to 1, which is exactly the constant column. For the same reason, we can't include income in tens of thousands of dollars (income10k) and income in dollars (income) in the same regression, because income10k = income / 10,000. No linear combinations in the data! male + female = constant = bad.
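
One way to see why the regression "can't run" is that the normal equations have no unique solution: the matrix X'X has determinant zero whenever one column is a linear combination of the others. The sketch below uses a made-up four-person sample and two small hypothetical helpers, `gram` and `det`.

```python
def gram(columns):
    """X'X, where each element of `columns` is one column of X."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]


def det(m):
    """Determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, pivot in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * pivot * det(minor)
    return total


# Hypothetical sample of four people.
constant = [1, 1, 1, 1]
female = [1, 0, 1, 0]
male = [0, 1, 0, 1]

# Row by row, the constant is the sum of the two dummies.
print("constant == female + male:",
      all(c == f + m for c, f, m in zip(constant, female, male)))

det_trap = det(gram([constant, female, male]))   # constant + both dummies
det_ok = det(gram([constant, male]))             # drop one dummy

print("det(X'X) with constant, female, male:", det_trap)
print("det(X'X) with constant, male:", det_ok)
```

A zero determinant means X'X cannot be inverted, so the OLS coefficients are not uniquely determined; dropping one of the dummies restores a nonzero determinant.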

106 Dummy variables So back to the original problem: $\text{wage} = \beta_1 \, \text{female} + \beta_2 \, \text{male} + \beta_3 \, \text{college} + \beta_4 \, \text{nocollege} + u$. Why can't this work? Let's look at the data (columns female, male, nocollege, college) to see if we can detect the same type of problem. Does anybody see it? Exercise: find why we can't do this (find the linear combination in the data).

107 Dummy variables When we look at the data (columns female, male, nocollege, college) we can see that female + male = nocollege + college. This violates the rule that we cannot have linear combinations in the data. Intuitively, you don't add a column of data (a variable) when that column doesn't add any new information. If we already have female, male, and nocollege, then we know everything about each observation without ever looking at the college variable.

111 Dummy variables We can't estimate $\text{wage} = \beta_1 \, \text{female} + \beta_2 \, \text{male} + \beta_3 \, \text{college} + \beta_4 \, \text{nocollege} + u$ because female + male = college + nocollege. Can we run this model? $\text{wage} = \beta_0 + \beta_1 \, \text{female} + \beta_2 \, \text{male} + \beta_3 \, \text{college} + u$. Nope. female + male = constant (this hasn't changed). Takeaway: no linear combinations.

115 Dummy variables If we want to run a model that differentiates women and men, and college graduates and non-college-graduates, what can we do? Two ways to go: (1) Include indicators of all but one type of individual in each category and include a constant (the usual thing that multiplies $\beta_0$): $\text{wage} = \beta_0 + \beta_1 \, \text{male} + \beta_2 \, \text{college} + u$. (2) Drop the constant and include indicators of all but one type of individual in each category, except that for one category we include all types of individuals (the gender category in the example below): $\text{wage} = \alpha_1 \, \text{female} + \alpha_2 \, \text{male} + \alpha_3 \, \text{college} + u$.

122 Dummy variables First way: $\text{wage} = \beta_0 + \beta_1 \, \text{male} + \beta_2 \, \text{college} + u$. Data columns: constant, male, college, and their sum. Test: passed.

129 Dummy variables Second way wage = α 1 female + α 2 male + α 3 college + u female male college sum Since the data does not sum to a constant, we can estimate the model

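130 Dummy variables
Takeaway: we can build a model out of dummy variables.
We can build the model any way we want, as long as the dummy variables don't induce an exact linear combination in the data (that is, no regressor, the constant included, can be written as a linear combination of the others).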
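One way to check whether a set of dummy variables induces a linear combination is to compare the rank of the regressor matrix with its number of columns. The sketch below assumes a small made-up sample (six hypothetical individuals, not the data from the slides) and uses NumPy. It also includes the case the takeaway warns against, where a constant sits alongside both female and male.

```python
import numpy as np

# Hypothetical sample of six individuals (illustrative only).
male = np.array([1, 0, 1, 0, 1, 0])
college = np.array([1, 1, 0, 0, 1, 0])
female = 1 - male
constant = np.ones_like(male)

# First way: constant + male + college.
X_first = np.column_stack([constant, male, college])
# Second way: female + male + college, no constant.
X_second = np.column_stack([female, male, college])
# Not estimable: constant equals female + male exactly.
X_trap = np.column_stack([constant, female, male, college])


def has_full_column_rank(X):
    """True when no column is an exact linear combination of the others."""
    return bool(np.linalg.matrix_rank(X) == X.shape[1])


first_ok = has_full_column_rank(X_first)
second_ok = has_full_column_rank(X_second)
trap_ok = has_full_column_rank(X_trap)
print(first_ok, second_ok, trap_ok)
```

Both formulations from the slides have full column rank, while the matrix with a constant plus both gender dummies does not, so OLS has no unique solution for it.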

131 Dummy variables
OK. We can estimate the model a couple of ways.
Interpretation of the variables is slightly different depending on which formulation you use.
The formulations are totally equivalent as models, so choose the one that gives you the interpretation you like best.
Interpretation can be confusing; it takes practice.

139 Interpretation
Interpret $\alpha_1$ and $\beta_1$:
$\text{wage} = \beta_0 + \beta_1\,\text{male} + \beta_2\,\text{college} + u$
$\text{wage} = \alpha_1\,\text{male} + \alpha_2\,\text{female} + \alpha_3\,\text{college} + v$
$\beta_1$ is the effect of being male on wage, regardless of education; in a manner of speaking, so is $\alpha_1$.
So what's the difference?
$\beta_1$ is the difference between the expected wage of males and females with no college degree: $(\beta_0 + \beta_1) - \beta_0$ (it's also the difference between the expected wage of males and females with a college degree).
$\alpha_1$ by itself is the expected wage of a male with no college degree (so $\alpha_1 = \beta_0 + \beta_1$).

146 Interpretation
Compare the two models:
$\text{wage} = \beta_0 + \beta_1\,\text{male} + \beta_2\,\text{college} + u$   (1)
$\text{wage} = \alpha_1\,\text{male} + \alpha_2\,\text{female} + \alpha_3\,\text{college} + v$   (2)
The difference between a non-graduate male and a non-graduate female in (1): $(\beta_0 + \beta_1) - \beta_0 = \beta_1$
The difference between a graduate male and a graduate female in (1): $(\beta_0 + \beta_1 + \beta_2) - (\beta_0 + \beta_2) = \beta_1$
The difference between a non-graduate male and a non-graduate female in (2): $\alpha_1 - \alpha_2$
The difference between a graduate male and a graduate female in (2): $(\alpha_1 + \alpha_3) - (\alpha_2 + \alpha_3) = \alpha_1 - \alpha_2$
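The comparison can be checked numerically. The sketch below fits models (1) and (2) by least squares on simulated data and confirms that $\alpha_1 - \alpha_2$ reproduces $\beta_1$. The sample size, seed, and wage equation are assumptions made up for illustration, and NumPy's `lstsq` stands in for whatever regression software the course uses.

```python
import numpy as np

# Simulated data; the coefficients 10, 2, 5 are arbitrary illustrative choices.
rng = np.random.default_rng(0)
n = 200
male = rng.integers(0, 2, size=n)
college = rng.integers(0, 2, size=n)
female = 1 - male
wage = 10 + 2 * male + 5 * college + rng.normal(0, 1, size=n)

# Model (1): constant, male, college.
X1 = np.column_stack([np.ones(n), male, college])
# Model (2): male, female, college, no constant.
X2 = np.column_stack([male, female, college])

beta, *_ = np.linalg.lstsq(X1, wage, rcond=None)
alpha, *_ = np.linalg.lstsq(X2, wage, rcond=None)

print("beta_1            :", beta[1])
print("alpha_1 - alpha_2 :", alpha[0] - alpha[1])
print("alpha_1 vs beta_0 + beta_1:", alpha[0], beta[0] + beta[1])
```

The two design matrices span the same column space, so the fitted values are identical and only the labeling of the coefficients changes.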

147 Interpretation
Takeaway: when we do not include an intercept in the model (what you are used to thinking of as $\beta_0$), the interpretation of the coefficients changes.
When working with a standard model (i.e. a model with the regular old $\beta_0$ constant), the omitted category becomes a benchmark (baseline) to which others are compared.
Omitting the constant enables us to use a single coefficient to estimate the expected wage (or whatever y happens to be) of a particular group of individuals. Sometimes this is desirable.

150 Dummy variables
Consider a new model.
We are still looking at wages, but now we have a sample of male and female professionals: specifically, doctors, lawyers, and professors.
We want to build a wage model using one of the formulations we have covered.

153 Professional wages
The two formulations differ only by whether or not they include a constant:
$\text{wage} = \beta_0 + \beta_1\,\text{male} + \beta_2\,\text{doctor} + \beta_3\,\text{lawyer} + u$
$\text{wage} = \alpha_1\,\text{male} + \alpha_2\,\text{female} + \alpha_3\,\text{doctor} + \alpha_4\,\text{lawyer} + v$
If we want to know whether or not being a doctor influences wages, we can conduct a t-test of the null hypothesis that $\beta_2 = 0$ (or, in the second formulation, $\alpha_3 = 0$):
$H_0: \beta_2 = 0$
$H_A: \beta_2 \neq 0$

155 Professional wages
If we want to know whether or not profession matters (in general), then we could conduct an F-test:
$H_0: \beta_2 = 0,\ \beta_3 = 0$
$H_A$: at least one of the above is not true

157 Professional wages
If either $\beta_2 \neq 0$ or $\beta_3 \neq 0$, then profession matters.
Note that in both formulations of the model, the omitted category is professors. So when we test hypotheses about $\beta_2$ or $\alpha_3$, we are really testing hypotheses about how different doctors and professors are.
Takeaway 1: We can use categorical variables (dummy variables) to test individual or compound hypotheses.
Takeaway 2: The omitted category matters. It influences the coefficients themselves, and we must take this into account when interpreting the coefficients.
Takeaway 3: We can use categorical variables and continuous variables in the same regression (even though we didn't look at that specifically in the notes above).
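The joint test of $\beta_2 = 0$ and $\beta_3 = 0$ can be sketched with the usual restricted-versus-unrestricted sum-of-squared-residuals form of the F statistic. Everything about the data below (sample size, seed, wage equation) is an assumption chosen for illustration, and the statistic is computed by hand with NumPy rather than with any particular regression package.

```python
import numpy as np

# Simulated professionals; 0 = professor (omitted category), 1 = doctor, 2 = lawyer.
rng = np.random.default_rng(1)
n = 300
male = rng.integers(0, 2, size=n).astype(float)
profession = rng.integers(0, 3, size=n)
doctor = (profession == 1).astype(float)
lawyer = (profession == 2).astype(float)
# Assumed wage equation, for illustration only.
wage = 80 + 5 * male + 40 * doctor + 30 * lawyer + rng.normal(0, 10, size=n)


def ssr(X, y):
    """Sum of squared residuals from a least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


# Unrestricted model: constant, male, doctor, lawyer.
X_ur = np.column_stack([np.ones(n), male, doctor, lawyer])
# Restricted model under H0 (beta_2 = beta_3 = 0): constant, male.
X_r = np.column_stack([np.ones(n), male])

q = 2                          # number of restrictions
df_resid = n - X_ur.shape[1]   # n - k - 1
ssr_ur = ssr(X_ur, wage)
ssr_r = ssr(X_r, wage)
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_resid)
print("F statistic:", F)
```

A large F relative to the critical value of the F(q, n - k - 1) distribution leads us to reject the null that profession does not matter.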

160 Dummy variables
Interactions
Now THIS is different from the formulations we have already looked at.
This is where we start using dummy variables to manipulate functional form.
So far we have looked at categorical variables separately. This makes sense in some circumstances but not in others.

161 Example
Suppose the data look like this

162 Example
Our current model works OK: the effect of profession appears to be the same for men and women

163 Example
Even easier to see this way: the gap between men and women is constant over professions

164 Example
What if the data look like this instead?

165 Example
Let's put some lines in to make it easier to see

168 Example
If the effect of being male is different depending on an individual's profession, then our current model is inappropriate.
$\text{wage} = \beta_0 + \beta_1\,\text{male} + \beta_2\,\text{doctor} + \beta_3\,\text{lawyer} + u$
Why?
Our current model includes the effect of being male and the effect of being a lawyer.
But the effect of being a lawyer is the effect of being a lawyer holding all else constant, and the same is true for the effect of gender.
$\beta_3$ is the effect of being a lawyer, holding gender constant.
What's the problem with this?
The effect of being a lawyer IS NOT constant over gender (and the effect of being male is not constant over profession).

170 Interaction terms
What's the solution? Interaction terms.
Whenever an effect differs over a category, interact the categories:
$\text{wage} = \beta_0 + \beta_1\,\text{male} + \beta_2\,\text{doctor} + \beta_3\,\text{lawyer} + \beta_4\,(\text{male} \times \text{lawyer}) + u$
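To see what the interaction term buys us, the sketch below simulates wages in which the male premium is larger for lawyers, builds the male × lawyer column by multiplying the two dummies, and reads the gender gap off the fitted coefficients. The data-generating numbers (a gap of 5 in general, plus 20 more for lawyers) are assumptions for illustration only.

```python
import numpy as np

# Simulated professionals; 0 = professor, 1 = doctor, 2 = lawyer.
rng = np.random.default_rng(2)
n = 600
male = rng.integers(0, 2, size=n).astype(float)
profession = rng.integers(0, 3, size=n)
doctor = (profession == 1).astype(float)
lawyer = (profession == 2).astype(float)

# Assumed wage equation: the male premium is 5, plus an extra 20 for lawyers.
wage = (80 + 5 * male + 40 * doctor + 30 * lawyer
        + 20 * male * lawyer + rng.normal(0, 5, size=n))

# The interaction regressor is simply the product of the two dummies.
X = np.column_stack([np.ones(n), male, doctor, lawyer, male * lawyer])
b, *_ = np.linalg.lstsq(X, wage, rcond=None)

gap_non_lawyers = b[1]       # beta_1
gap_lawyers = b[1] + b[4]    # beta_1 + beta_4
print("gender gap, non-lawyers:", gap_non_lawyers)
print("gender gap, lawyers    :", gap_lawyers)
```

Without the fifth column the model would be forced to report one gender gap for every profession, which is exactly the problem described above.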

172 Interaction terms
We can interact any number of categorical variables we want.
We want to interact any time we think there is reason to believe that the interaction is meaningful; "meaningful" could be motivated by theory or by differences in the data.
A fully interacted model would look like this:
$\text{wage} = \beta_0 + \beta_1 D_M + \beta_2 D_D + \beta_3 D_L + \beta_4 D_M D_D + \beta_5 D_M D_L + u$
Or like this (where subscripts indicate male doctor, etc.):
$\text{wage} = \alpha_0 + \alpha_1 D_{MD} + \alpha_2 D_{FD} + \alpha_3 D_{ML} + \alpha_4 D_{FL} + \alpha_5 D_{MP} + v$

174 Brain exercise
Are the intercepts the same in the following two models?
$w = \beta_0 + \beta_1 D_M + \beta_2 D_D + \beta_3 D_L + \beta_4 D_M D_D + \beta_5 D_M D_L + u$
$w = \alpha_0 + \alpha_1 D_{MD} + \alpha_2 D_{FD} + \alpha_3 D_{ML} + \alpha_4 D_{FL} + \alpha_5 D_{MP} + v$
Hint: make a table

          Doc    Law    Prof
female
male

176 Answer
Here is mine for the first model (the $\beta$ model):

          Doc                                       Law                                       Prof
female    $\beta_0 + \beta_2$                       $\beta_0 + \beta_3$                       $\beta_0$
male      $\beta_0 + \beta_1 + \beta_2 + \beta_4$   $\beta_0 + \beta_1 + \beta_3 + \beta_5$   $\beta_0 + \beta_1$

and the $\alpha$ model:

          Doc                     Law                     Prof
female    $\alpha_0 + \alpha_2$   $\alpha_0 + \alpha_4$   $\alpha_0$
male      $\alpha_0 + \alpha_1$   $\alpha_0 + \alpha_3$   $\alpha_0 + \alpha_5$

In both models the female-professor cell is the intercept alone, so the intercepts are the same: $\beta_0 = \alpha_0$.
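The two tables can be verified numerically. The sketch below fits both fully interacted formulations on simulated data and compares the coefficients cell by cell. The data-generating process, seed, and variable names are assumptions for illustration. Because both models have one parameter per gender-profession cell, they fit the six cell means exactly, so the comparisons hold up to floating-point error.

```python
import numpy as np

# Simulated professionals; 0 = professor, 1 = doctor, 2 = lawyer.
rng = np.random.default_rng(3)
n = 600
D_M = rng.integers(0, 2, size=n).astype(float)
profession = rng.integers(0, 3, size=n)
D_D = (profession == 1).astype(float)
D_L = (profession == 2).astype(float)
D_P = (profession == 0).astype(float)
D_F = 1.0 - D_M

# Assumed wage equation, for illustration only.
w = (80 + 5 * D_M + 40 * D_D + 30 * D_L
     + 10 * D_M * D_D + 20 * D_M * D_L + rng.normal(0, 5, size=n))

ones = np.ones(n)
# Beta model: main effects plus male x profession interactions.
X_beta = np.column_stack([ones, D_M, D_D, D_L, D_M * D_D, D_M * D_L])
# Alpha model: one dummy per cell, female professors omitted.
X_alpha = np.column_stack([ones, D_M * D_D, D_F * D_D,
                           D_M * D_L, D_F * D_L, D_M * D_P])

beta, *_ = np.linalg.lstsq(X_beta, w, rcond=None)
alpha, *_ = np.linalg.lstsq(X_alpha, w, rcond=None)

female_prof_mean = w[(D_F == 1) & (D_P == 1)].mean()
print("beta_0, alpha_0, mean wage of female professors:")
print(beta[0], alpha[0], female_prof_mean)
```

Reading across the tables gives the rest of the mapping: for example, $\alpha_5 = \beta_1$ (male professors) and $\alpha_1 = \beta_1 + \beta_2 + \beta_4$ (male doctors).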

177 Which formulation to use?
Sometimes we want to control for categories by introducing category-specific intercepts.
Sometimes we want to control for categories by introducing interaction terms.
Sometimes we actually want to interpret the coefficient values on the categorical variables.
Choose whichever formulation fits your needs.
Choose categories to omit for the same reason.

181 Real example
Card, David, and Alan B. Krueger (1994). Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania, The American Economic Review, 84(4).
Economic theory: if the minimum wage goes up, unemployment will go up. (Some of the people who are making less than the new minimum wage will lose their jobs.)
What happened:
In April 1992, New Jersey raised its minimum wage from $4.25 to $5.05.
Card and Krueger decided to take advantage of this to find out whether employment decreased when the minimum wage increased.
How to do this?

184 Real example
What did they do?
CK knew the minimum wage increase was coming, so before the wage hike they sent out a survey to each major fast-food restaurant they could find in the phone book: KFC, Roy Rogers, Wendy's, Burger King (410 total).
CK then waited until after the wage hike and sent out another survey to the same restaurants.

188 Real example
Problem: What if other things are changing around the same time? Maybe other things are causing employment, wages, prices, etc. to change over the period straddling April 1992.
Solution? A second dimension of comparison: spatial.

194 Real example
Dimension 1: before and after treatment
Dimension 2: in and out of treatment group
Translation of dimension 1: before and after April 1992
Translation of dimension 2: in New Jersey vs. in Pennsylvania
What does the regression look like that does this comparison for us?
$\text{avgemployment} = \beta_0 + \beta_1\,NJ + \beta_2\,D + \beta_3\,(D \times NJ) + u$
(with $NJ = 1$ for New Jersey restaurants and $D = 1$ for the period after April 1992)
What is the effect of the minimum wage increase?
$\beta_3 = 2.75$: average employment of full-time equivalent employees went up as a result.
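The coefficient on the interaction $D \times NJ$ is a difference in differences: the before-to-after change in New Jersey minus the before-to-after change in Pennsylvania. The sketch below illustrates this on simulated data. It does not use the Card and Krueger data, and the sample size, seed, and data-generating numbers are assumptions for illustration.

```python
import numpy as np

# Simulated restaurant-period observations (not the Card-Krueger data).
rng = np.random.default_rng(4)
n = 400
nj = rng.integers(0, 2, size=n).astype(float)     # 1 = New Jersey, 0 = Pennsylvania
after = rng.integers(0, 2, size=n).astype(float)  # D: 1 = after the increase
# Assumed employment equation, for illustration only.
emp = 20 - 3 * nj - 2 * after + 2.5 * nj * after + rng.normal(0, 4, size=n)

# Regression with both dummies and their interaction.
X = np.column_stack([np.ones(n), nj, after, after * nj])
b, *_ = np.linalg.lstsq(X, emp, rcond=None)


def cell_mean(state, period):
    """Average employment for one state-period cell."""
    return emp[(nj == state) & (after == period)].mean()


# Difference in differences computed directly from the four cell means.
did = ((cell_mean(1, 1) - cell_mean(1, 0))
       - (cell_mean(0, 1) - cell_mean(0, 0)))
print("beta_3 from regression :", b[3])
print("difference of differences:", did)
```

The Pennsylvania change acts as the estimate of what would have happened in New Jersey without the wage increase, which is how the second dimension handles the "other things changing at the same time" problem.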

195 Review
Dummy variables: indicator variables (syn)
Use to represent categorical information
Can use to discretize continuous variables (to better represent actual relationships)
Can interact
  Interact multiple dummy variables to create cross-category indicators
  Interact dummy variables with continuous variables to create category-specific slopes
Use t-test to test for effects of categories
Use F-test to test for effects of a whole categorization

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables: Examples. Dummy variables are used to indicate the presence or absence of a characteristic. For example, define $\text{female}_i = 1$ if observation $i$ is female and $0$ otherwise, or male