Lecture 14. More on using dummy variables (dealing with seasonality)


1 Lecture 14. More on using dummy variables (dealing with seasonality). More things to worry about: measurement error in variables, which can lead to bias in OLS (endogeneity).

2 We have seen that dummy variables are useful when we are interested in measuring average differences between discrete groups, and for policy evaluation (interaction terms lead to the difference-in-differences estimator). Now we see how dummy variables can be used to deal with seasonality in data.


4 Using Dummy Variables to Capture Seasonality in Data

Can also use dummy variables to pick out and control for seasonal variation in data. The idea is to include a set of dummy variables for each quarter (or month, or day), which will then net out the average change in a variable resulting from any seasonal fluctuations:

Y_t = b0 + b1Q1 + b2Q2 + b3Q3 + b4X + u_t

where the quarterly dummy Q1 = 1 if the observation belongs to the 1st quarter of the year (Jan-Mar), = 0 otherwise. Hence the coefficient on Q1 gives the level of Y in the 1st quarter of the year relative to the constant (the Q4 level of Y), averaged over all Q1 observations in the data set.

Series net of seasonal effects are said to be seasonally adjusted.

10 It may also be useful to model an economic series as a combination of a seasonal and a trend component:

Y_t = b0 + b1Q1 + b2Q2 + b3Q3 + b4Trend + u_t

where Trend = 1 in year 1, = 2 in year 2, ..., = T in year T.

Since dY_t/dTrend = b4, and the coefficient measures the unit change in Y for a unit change in the trend variable (the units of measurement in this case being years), the trend term in the model above measures the annual change in the Y variable net of any seasonal influences.
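As a sketch of how the quarterly dummies and trend can be constructed in Stata (using the variable names quart, year and time that appear in the accidents data below; Y is a placeholder for the series being modelled):

. gen Q1 = (quart == 1) /* = 1 for Jan-Mar observations, 0 otherwise */
. gen Q2 = (quart == 2)
. gen Q3 = (quart == 3) /* Q4 is left out as the default category */
. sort time
. gen Trend = year - year[1] + 1 /* = 1 in the first year, 2 in the second, ..., T in year T */
. reg Y Q1 Q2 Q3 Trend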


18 In 2000 the UK Department of Transport announced that: "By 2010 we want to achieve, compared with the average for the baseline years: a 40% reduction in the number of people killed or seriously injured in road accidents; a 50% reduction in the number of children killed or seriously injured; and a 10% reduction in the slight casualty rate, expressed as the number of people slightly injured per 100 million vehicle kilometres." Did they reach this target?

19 The data set accidents.dta (on the course web site) contains quarterly information on the number of road accidents in the UK from 1983 to 2006.

. twoway (line acc time, xline(2000))

[Graph: total road accidents, DoT quarterly data, plotted against time]

The graph shows that road accidents vary more within than between years. We can see the seasonal influence from a regression of the number of accidents on 3 dummy variables (one for each quarter, minus the default category, which is the 4th quarter).

20 . list acc year quart time Q1 Q2 Q3 Q4, clean

[Listing omitted]

A regression of road accident numbers on the quarterly dummies (Q4 = winter is the default, given by the constant term, which is the average number of accidents in the 4th quarter) shows accidents are significantly less likely to happen outside the fourth quarter (October-December). On average there are 14,539 fewer accidents in the first quarter of the year than in the last.

. reg acc Q1 Q2 Q3

[Regression output omitted]

Saving the residual values after netting out the influence of the seasons is the basis for the production of seasonally adjusted data (a better guide to the underlying trend), used in many official government statistics. Can get a sense of how this works with the following commands after a regression:

. predict rhat, resid /* saves the residuals in a new variable with the name rhat */
. twoway (line rhat time, xline(2000))

21 [Graph: residuals plotted against time]

Can see the seasonality is reduced and the trend is much clearer. The graph of the residuals is much smoother than the original series, as it should be, since much of the seasonality has been taken out by the dummy variables. The graph also shows that, once seasonality is accounted for, there is little evidence of a change in the number of road accidents over time until the year 2000.

To model both seasonal and trend components of an economic series, simply include both seasonal dummies and a time trend in the regression model:

Y_t = b0 + b1Q1 + b2Q2 + b3Q3 + b4TREND + u_t

. reg logacc Q1 Q2 Q3 year

[Regression output omitted]

22 [Coefficient table omitted]

Can see that there is a downward trend in road accidents (of around 400 a year over the whole sample period) net of any seasonality. Could also use dummy variable interactions to test whether this trend is stronger after 2000. How?

Can also use seasonal dummy variables to check whether an apparent association between variables is in fact caused by seasonality in the data.

. reg acc du

[Regression output omitted]

The regression suggests a negative association between the change in the unemployment rate and the level of accidents (a 1 percentage point rise in the unemployment rate leads to a fall in the number of accidents of 4,104, if this regression is to be believed). Might this be in part because seasonal movements in both data series are influencing the results? (The unemployment rate also varies seasonally, and is typically higher in q1 of each year.)

. reg acc du q2-q4

[Regression output omitted]

23 [Coefficient table omitted]

Can see that if we add quarterly seasonal dummy variables, the apparent effect of unemployment disappears.

24 Measurement Error

Often a data set will contain imperfect measures of the data we would ideally like.

Aggregate data (GDP, Consumption, Investment): only best guesses of their theoretical counterparts, and frequently revised by government statisticians (so earlier estimates must have been subject to error).

Survey data (income, health, age): individuals often lie, forget or round to the nearest large number (£102 a week, or £100?). For example, in payperiod.dta:

. hist grossam if pyperiod==1 & grossam<1000 & grossam>0 & grsp==1, bin(100) xline( )

Proxy data (ability, intelligence, permanent income): difficult to agree on a definition, let alone measure.

29 Measurement Error in the Dependent Variable

True: y* = b0 + b1x + u (1)
Observe: y = y* + e (2)

ie the dependent variable is measured with error e, and e is a random residual term just like u, so E(e) = 0.

Sub. (2) into (1), using y* = y - e:

y - e = b0 + b1x + u

Take the error term on the left to the other side:

y = b0 + b1x + u + e
y = b0 + b1x + v where v = u + e (3)

37 It is OK to estimate

y = b0 + b1x + v where v = u + e (3)

by OLS, since:

E(u) = E(e) = 0 (just random residuals, so the mean is zero)

Cov(X,u) = 0 (no correlation between the original X variable and the original error term)

and also Cov(X,e) = 0 (nothing to suggest the X variable is correlated with the measurement error in the dependent variable).

So OLS estimates are unbiased in this case, but standard errors are larger than they would be in the absence of measurement error, with the associated problems of inference (Type II error).

44 True model: y = b0 + b1x + u, for which

Var(β̂1) = σ²u / (N·Var(X)) (A)

Estimated model: y = b0 + b1x + v, for which

Var(β̃1) = σ²v / (N·Var(X))

But v = u + e, and so σ²v = σ²u + σ²e (using the rules on covariances).

[Since var(v) = var(u+e) = var(u) + var(e) + 2cov(e,u), and if we assume the things that cause measurement error in y are unrelated to the residual u, then cov(e,u) = 0, so var(v) = var(u) + var(e).]

Hence

Var(β̃1) = (σ²u + σ²e) / (N·Var(X)) (B) > Var(β̂1) = σ²u / (N·Var(X))

So the residual variance in the presence of measurement error in the dependent variable now also contains an additional contribution from the error in the y variable, σ²e. Standard errors are therefore larger in models where there is measurement error in the Y variable, and the bigger the measurement error, the larger the standard errors, the lower the t (and F) values, and the greater the risk of Type II error (failing to reject a false null).
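A small simulation sketch in Stata illustrates the point (all names and parameter values here are invented for illustration): the slope estimate stays centred on the true value when noise is added to y, but its standard error grows.

. clear
. set obs 500
. set seed 12345
. gen x = rnormal()
. gen ystar = 1 + 2*x + rnormal() /* true model: b0 = 1, b1 = 2 */
. gen yobs = ystar + 3*rnormal() /* observed y = true y + measurement error e */
. reg ystar x /* unbiased, smaller standard errors */
. reg yobs x /* still centred on 2, but larger standard errors */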

54 Measurement Error in an Explanatory Variable

True: y = b0 + b1X* + u (1)
Observe: X = X* + w (2)

ie the right hand side variable is measured with error (w).

Sub. (2) into (1), ie use the fact that X* = X - w:

y = b0 + b1(X - w) + u
y = b0 + b1X - b1w + u
y = b0 + b1X + v (3)

where now v = u - b1w (so the residual term again consists of 2 components).

Hence (3) is the basis for OLS estimation. Does this matter?

64 In the (2 variable) model, we know that OLS implies

β̂1 = Cov(X,y) / Var(X) = Cov(X, b0 + b1X + u) / Var(X) = b1 + Cov(X,u) / Var(X) (4)

(just sub. in for y and cancel terms)

Since by assumption Cov(X,u) = 0, it follows from (4) that E(β̂1) = b1 and OLS is unbiased.

But in the presence of measurement error we estimate

y = b0 + b1X + v, not y = b0 + b1X* + u

and now

Cov(X,v) = Cov(X* + w, -b1w + u) (sub. in for X and v using (2) & (3))

74 Expanding terms using the rules on covariances:

Cov(X* + w, -b1w + u) = Cov(X*,u) + Cov(X*,-b1w) + Cov(w,u) + Cov(w,-b1w)

u and w are independent errors (caused by different factors), so there is no reason to expect them to be correlated with each other or with the value of X* (this means any error in X should not depend on the level of X), so

Cov(w,u) = Cov(X*,u) = Cov(X*,-b1w) = 0

This leaves

Cov(X,v) = Cov(w,-b1w) = -b1·Cov(w,w) = -b1·Var(w)

Hence Cov(X,v) ≠ 0

85 In other words, there is now a correlation between the X variable and the error term in (3). So if we estimate (3) by OLS, this violates the main assumption needed to get an unbiased estimate: we need the covariance between the explanatory variable and the residual to be zero (see earlier notes), and this is not the case when an explanatory variable is measured with error.

(4) becomes

β̂1 = Cov(X,y) / Var(X) = b1 + Cov(X,v) / Var(X) = b1 - b1·Var(w) / Var(X)

so E(β̂1) ≠ b1, and OLS gives biased estimates in the presence of measurement error in the explanatory variable.

88 Not only that: one can show that OLS estimates are always biased toward zero (attenuation bias):

if b1 > 0 then β̂1_OLS < b1
if b1 < 0 then β̂1_OLS > b1

ie closer to zero in both cases.
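A matching simulation sketch for error in X (again with invented names and values) shows the attenuation: with Var(w) = Var(X*), the OLS slope tends to b1·Var(X*)/(Var(X*)+Var(w)) = 1, half the true value of 2.

. clear
. set obs 500
. set seed 54321
. gen xstar = rnormal()
. gen xobs = xstar + rnormal() /* observed x = true x + measurement error w */
. gen y = 1 + 2*xstar + rnormal() /* true slope b1 = 2 */
. reg y xstar /* slope close to 2 */
. reg y xobs /* slope biased toward zero, roughly 1 here */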

91 The problem is that Cov(X,v) ≠ 0, ie the residual and the right hand side variable are correlated. In such situations the X variable is said to be endogenous.

92 Example of the Consequences of Measurement Error in Data

This example uses artificially generated measurement error to illustrate the basic issues.

. list /* list observations in data set */

[Listing omitted: variables y_, y_observ, x_, x_observ]

The data set measerr.dta gives the (unobserved) true values of y and x and their observed counterparts (the 1st 5 observations on x underestimate the true value by 20 and the last 5 overestimate it by 20).

First look at the regression of true y on true x:

. reg y_ x_

[Regression output omitted]

Now look at the consequence of measurement error in the dependent variable:

. reg y_obs x_

[Regression output omitted]

Consequence: coefficients virtually identical (unbiased), but standard errors larger, and hence t values smaller and confidence intervals wider.

Measurement error in the explanatory variable:

. reg y_ x_obs

[Regression output omitted]

Consequence: both coefficients biased, and the slope coefficient is biased toward zero (0.45 compared with 0.60, ie we underestimate the effect by 25%). The intercept is biased upward (compare 50.1 with 25.0).

The problem is that Cov(X,u) ≠ 0, ie the residual and the right hand side variable are correlated. In such situations the X variable is said to be endogenous.

94 Solution? Get better data.

If that is not possible, do something to get round the problem: replace the variable causing the correlation with the residual with one that is not so correlated, but that at the same time is still related to the original variable.

Any variable that has these 2 properties is called an Instrumental Variable.

98 Philip Wright (credited with the first use of instrumental variable estimation, in a 1928 study of supply and demand).

99 More formally, an instrument Z for the variable of concern X satisfies:

1) Cov(X,Z) ≠ 0 (correlated with the problem variable)

2) Cov(Z,u) = 0 (but uncorrelated with the residual, so it does not suffer from measurement error and is also not correlated with any unobservable factors influencing the dependent variable)

103 Instrumental variable (IV) estimation proceeds as follows.

Given a model

y = b0 + b1X + u (1)

multiply (1) by the instrument Z:

Zy = Zb0 + b1ZX + Zu

It follows that

Cov(Z,y) = Cov(Z, b0 + b1X + u) = Cov(Z,b0) + b1·Cov(Z,X) + Cov(Z,u)

Since Cov(Z,b0) = 0 (using the rules on the covariance of a constant) and Cov(Z,u) = 0 (if the assumption above about the properties of instruments is correct), then

Cov(Z,y) = 0 + b1·Cov(Z,X) + 0

112 Solving Cov(Z,y) = 0 + b1·Cov(Z,X) + 0 for b1 gives the formula used to calculate the instrumental variable estimator:

b1_IV = Cov(Z,y) / Cov(Z,X)

(compare with b1_OLS = Cov(X,y) / Var(X))

In the presence of measurement error (or endogeneity in general) the IV estimate is unbiased in large samples (but may be biased in small samples); technically, the IV estimator is said to be consistent, while the OLS estimator is inconsistent IN THE PRESENCE OF ENDOGENEITY, which makes IV a useful estimation technique to employ.
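As a sketch of how this formula can be checked in Stata with one endogenous variable and one instrument (y, x and z are placeholder names; ivregress is the modern command, while older Stata versions, as used later in these notes, have ivreg):

. quietly corr z y, cov
. scalar czy = r(cov_12) /* Cov(Z,y) */
. quietly corr z x, cov
. scalar czx = r(cov_12) /* Cov(Z,X) */
. display "b1_IV by hand = " czy/czx
. ivregress 2sls y (x = z) /* should reproduce the same slope estimate */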

117 However, one can show that (in the 2 variable case) the variance of the IV estimator is given by

Var(β̂1_IV) = [s² / (N·Var(X))] · (1/r²XZ)

where r²XZ is the square of the correlation coefficient between the endogenous variable and the instrument

(compared with OLS: Var(β̂1_OLS) = s² / (N·Var(X)))

Since 0 < r²XZ < 1, IV estimation is less precise (less efficient) than OLS estimation. May sometimes want to trade off bias against efficiency.

120 Where to find an instrument?

121 Lecture 15. Potential solution to endogeneity: instrumental variable estimation. Tests for endogeneity. Other sources of endogeneity. Problems with weak instruments.

122 The problem with measurement error in X variables is that it makes Cov(X,u) ≠ 0, ie the residual and the right hand side variable are correlated. In such situations the X variable is said to be endogenous, and OLS will be biased toward zero (and inconsistent) in this case.

123 Solution? Get better data. If that is not possible, do something to get round the problem: replace the variable causing the correlation with the residual with one that is not so correlated, but that at the same time is still related to the original variable. Any variable that has these 2 properties is called an Instrumental Variable. More formally, an instrument Z for the variable of concern X satisfies: 1) Cov(X,Z) ≠ 0 (correlated with the problem variable); 2) Cov(Z,u) = 0 (but uncorrelated with the residual, so it does not suffer from measurement error and is also not correlated with any unobservable factors influencing the dependent variable).

124 So b1_IV = Cov(Z,y) / Cov(Z,X) (compare with b1_OLS = Cov(X,y) / Var(X)). In the presence of measurement error (or endogeneity in general) the IV estimate is unbiased in large samples (but may be biased in small samples); technically, the IV estimator is consistent while the OLS estimator is inconsistent, which makes IV a useful estimation technique to employ.

125 However, one can show that (in the 2 variable case) the variance of the IV estimator is given by

Var(β̂1_IV) = [s² / (N·Var(X))] · (1/r²XZ)

where r²XZ is the square of the correlation coefficient between the endogenous variable and the instrument

(compared with OLS: Var(β̂1_OLS) = s² / (N·Var(X)))

So IV estimation is less precise (less efficient) than OLS estimation, since r²XZ < 1; but the greater the correlation between X and Z (r²XZ must be above zero to satisfy the first requirement of an instrument), the smaller is Var(β̂1_IV) (and hence the lower the standard errors and the higher the t values).

126 So why not ensure that the correlation between X and the instrument Z is as high as possible?

If X and Z are perfectly correlated, then Z must also be correlated with u and so suffers the same problems as X: the initial problem is not solved.

Conversely, if the correlation between the endogenous variable and the instrument is small, there are also problems. Since we can always write the IV estimator as

b1_IV = Cov(Z,y) / Cov(Z,X)

sub. in for y = b0 + b1X + u:

b1_IV = Cov(Z, b0 + b1X + u) / Cov(Z,X) = [Cov(Z,b0) + b1·Cov(Z,X) + Cov(Z,u)] / Cov(Z,X)

133 so

b1_IV = b1 + Cov(Z,u) / Cov(Z,X)

So if Cov(Z,X) is small, then the IV estimate can be a long way from the true value b1 (any small correlation between Z and u is divided by a small number).

So: always check the extent of the correlation between X and Z before any IV estimation (see later, and the sketch below).

In large samples you can have as many instruments as you like, though finding good ones is a different matter. In large samples more is better; in small samples a minimum number of instruments is better (bias in small samples increases with the number of instruments).

Where to find good instruments? Difficult: the appropriate instrument will vary depending on the issue under study.
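One quick diagnostic is the first-stage regression of the endogenous variable on the instrument. A sketch in Stata, with x and z as placeholder names (the F > 10 threshold is a common rule of thumb, not something derived in these notes):

. corr x z /* simple correlation between endogenous variable and instrument */
. reg x z /* first-stage regression */
. test z /* first-stage F statistic: a common rule of thumb treats F < 10 as a weak instrument */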

141 In the case of measurement error, we could use the rank of X as an instrument (ie order the variable X by size and use the number of the order rather than the actual value). The rank is clearly correlated with the original value, but because it is a rank it should not be affected by the measurement error. (Though this assumes that the measurement error is not so large as to affect the ordering of the X variable.)

. egen rankx=rank(x_obs) /* stata command to create the ranking of x_observ */
. list x_obs rankx

[Listing omitted: rankx ranks from the smallest observed x to the largest]

Now do instrumental variable estimates using rankx as the instrument for x_obs:

. ivreg y_t (x_ob=rankx)

Instrumental variables (2SLS) regression

[Regression output omitted]

Can see both estimated coefficients are a little closer to their true values than the estimates from the regression with measurement error (but not by much): in this case the rank of X is not a very good instrument. Note that the standard error in the instrumented regression is larger than the standard error in the regression of y_ on x_observ, as expected with IV estimation.

145 Testing for Endogeneity

It is good practice to compare OLS and IV estimates. If the estimates are very different, this may be a sign that things are amiss.

Using the idea that IV estimation will always be (asymptotically) unbiased, whereas OLS will only be unbiased if Cov(X,u) = 0, we can do the following:

Wu-Hausman Test for Endogeneity

1. Given y = b0 + b1X + u (A), regress the endogenous variable X on the instrument(s) Z:

X = d0 + d1Z + v (B)

and save the residuals v̂.

149 2. Include this residual as an extra term in the original model: ie given y = b0 + b1X + u, estimate

y = b0 + b1X + b2·v̂ + e

and test whether b2 = 0 (using a t test). If b2 = 0, conclude there is no correlation between X and u, ie X can be treated as exogenous.
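A sketch of the two steps in Stata (y, x and z are placeholder names):

. reg x z /* step 1: regress the endogenous variable on the instrument(s) */
. predict vhat, resid /* save the residuals v-hat */
. reg y x vhat /* step 2: add the residual to the original model */
. test vhat /* test of b2 = 0: rejection is evidence that X is endogenous */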


More information

Fixed and Random Effects Models: Vartanian, SW 683

Fixed and Random Effects Models: Vartanian, SW 683 : Vartanian, SW 683 Fixed and random effects models See: http://teaching.sociology.ul.ie/dcw/confront/node45.html When you have repeated observations per individual this is a problem and an advantage:

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Heteroskedasticity. (In practice this means the spread of observations around any given value of X will not now be constant)

Heteroskedasticity. (In practice this means the spread of observations around any given value of X will not now be constant) Heteroskedasticity Occurs when the Gauss Markov assumption that the residual variance is constant across all observations in the data set so that E(u 2 i /X i ) σ 2 i (In practice this means the spread

More information

Testing methodology. It often the case that we try to determine the form of the model on the basis of data

Testing methodology. It often the case that we try to determine the form of the model on the basis of data Testing methodology It often the case that we try to determine the form of the model on the basis of data The simplest case: we try to determine the set of explanatory variables in the model Testing for

More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Panel Data (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Regression with Panel Data A panel dataset contains observations on multiple entities

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Outline. Nature of the Problem. Nature of the Problem. Basic Econometrics in Transportation. Autocorrelation

Outline. Nature of the Problem. Nature of the Problem. Basic Econometrics in Transportation. Autocorrelation 1/30 Outline Basic Econometrics in Transportation Autocorrelation Amir Samimi What is the nature of autocorrelation? What are the theoretical and practical consequences of autocorrelation? Since the assumption

More information

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Econometrics - Exam May 11, 2011 1 Exam Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Problem 1: (15 points) A researcher has data for the year 2000 from

More information

Heteroskedasticity. Occurs when the Gauss Markov assumption that the residual variance is constant across all observations in the data set

Heteroskedasticity. Occurs when the Gauss Markov assumption that the residual variance is constant across all observations in the data set Heteroskedasticity Occurs when the Gauss Markov assumption that the residual variance is constant across all observations in the data set Heteroskedasticity Occurs when the Gauss Markov assumption that

More information

Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015

Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 This lecture borrows heavily from Duncan s Introduction to Structural

More information

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2. Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round

More information

Introduction to Econometrics. Regression with Panel Data

Introduction to Econometrics. Regression with Panel Data Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Regression with Panel Data Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1 Regression with Panel

More information

Econometrics Homework 1

Econometrics Homework 1 Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z

More information

The multiple regression model; Indicator variables as regressors

The multiple regression model; Indicator variables as regressors The multiple regression model; Indicator variables as regressors Ragnar Nymoen University of Oslo 28 February 2013 1 / 21 This lecture (#12): Based on the econometric model specification from Lecture 9

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

Lecture 24: Partial correlation, multiple regression, and correlation

Lecture 24: Partial correlation, multiple regression, and correlation Lecture 24: Partial correlation, multiple regression, and correlation Ernesto F. L. Amaral November 21, 2017 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015. Statistics: A

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Econometrics II Censoring & Truncation. May 5, 2011

Econometrics II Censoring & Truncation. May 5, 2011 Econometrics II Censoring & Truncation Måns Söderbom May 5, 2011 1 Censored and Truncated Models Recall that a corner solution is an actual economic outcome, e.g. zero expenditure on health by a household

More information

Empirical Application of Panel Data Regression

Empirical Application of Panel Data Regression Empirical Application of Panel Data Regression 1. We use Fatality data, and we are interested in whether rising beer tax rate can help lower traffic death. So the dependent variable is traffic death, while

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Lab 11 - Heteroskedasticity

Lab 11 - Heteroskedasticity Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Lecture 8: Instrumental Variables Estimation

Lecture 8: Instrumental Variables Estimation Lecture Notes on Advanced Econometrics Lecture 8: Instrumental Variables Estimation Endogenous Variables Consider a population model: y α y + β + β x + β x +... + β x + u i i i i k ik i Takashi Yamano

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

Week 3: Simple Linear Regression

Week 3: Simple Linear Regression Week 3: Simple Linear Regression Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Introduction to Time Series Regression and Forecasting (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Introduction to Time Series Regression

More information

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012 Problem Set #6: OLS Economics 835: Econometrics Fall 202 A preliminary result Suppose we have a random sample of size n on the scalar random variables (x, y) with finite means, variances, and covariance.

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

Lecture 19. Common problem in cross section estimation heteroskedasticity

Lecture 19. Common problem in cross section estimation heteroskedasticity Lecture 19 Learning to worry about and deal with stationarity Common problem in cross section estimation heteroskedasticity What is it Why does it matter What to do about it Stationarity Ultimately whether

More information

. regress lchnimp lchempi lgas lrtwex befile6 affile6 afdec6 t

. regress lchnimp lchempi lgas lrtwex befile6 affile6 afdec6 t BOSTON COLLEGE Department of Economics EC 228 Econometrics, Prof. Baum, Ms. Yu, Fall 2003 Problem Set 7 Solutions Problem sets should be your own work. You may work together with classmates, but if you

More information

Outline. 11. Time Series Analysis. Basic Regression. Differences between Time Series and Cross Section

Outline. 11. Time Series Analysis. Basic Regression. Differences between Time Series and Cross Section Outline I. The Nature of Time Series Data 11. Time Series Analysis II. Examples of Time Series Models IV. Functional Form, Dummy Variables, and Index Basic Regression Numbers Read Wooldridge (2013), Chapter

More information

Lecture (chapter 13): Association between variables measured at the interval-ratio level

Lecture (chapter 13): Association between variables measured at the interval-ratio level Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.

More information

Instrumental Variables, Simultaneous and Systems of Equations

Instrumental Variables, Simultaneous and Systems of Equations Chapter 6 Instrumental Variables, Simultaneous and Systems of Equations 61 Instrumental variables In the linear regression model y i = x iβ + ε i (61) we have been assuming that bf x i and ε i are uncorrelated

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Eplained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

Lecture 8: Functional Form

Lecture 8: Functional Form Lecture 8: Functional Form What we know now OLS - fitting a straight line y = b 0 + b 1 X through the data using the principle of choosing the straight line that minimises the sum of squared residuals

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

5.2. a. Unobserved factors that tend to make an individual healthier also tend

5.2. a. Unobserved factors that tend to make an individual healthier also tend SOLUTIONS TO CHAPTER 5 PROBLEMS ^ ^ ^ ^ 5.1. Define x _ (z,y ) and x _ v, and let B _ (B,r ) be OLS estimator 1 1 1 1 ^ ^ ^ ^ from (5.5), where B = (D,a ). Using the hint, B can also be obtained by 1 1

More information

Lecture 3: Multivariate Regression

Lecture 3: Multivariate Regression Lecture 3: Multivariate Regression Rates, cont. Two weeks ago, we modeled state homicide rates as being dependent on one variable: poverty. In reality, we know that state homicide rates depend on numerous

More information

Econometrics. 7) Endogeneity

Econometrics. 7) Endogeneity 30C00200 Econometrics 7) Endogeneity Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Common types of endogeneity Simultaneity Omitted variables Measurement errors

More information

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10) Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the

More information

REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK

REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK 1 ECONOMETRICS STUDY PACK MAY/JUNE 2016 Question 1 (a) (i) Describing economic reality (ii) Testing hypothesis about economic theory (iii) Forecasting future

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. The Sharp RD Design 3.

More information

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval] Problem Set #3-Key Sonoma State University Economics 317- Introduction to Econometrics Dr. Cuellar 1. Use the data set Wage1.dta to answer the following questions. a. For the regression model Wage i =

More information

Functional Form. So far considered models written in linear form. Y = b 0 + b 1 X + u (1) Implies a straight line relationship between y and X

Functional Form. So far considered models written in linear form. Y = b 0 + b 1 X + u (1) Implies a straight line relationship between y and X Functional Form So far considered models written in linear form Y = b 0 + b 1 X + u (1) Implies a straight line relationship between y and X Functional Form So far considered models written in linear form

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Graduate Econometrics Lecture 4: Heteroskedasticity

Graduate Econometrics Lecture 4: Heteroskedasticity Graduate Econometrics Lecture 4: Heteroskedasticity Department of Economics University of Gothenburg November 30, 2014 1/43 and Autocorrelation Consequences for OLS Estimator Begin from the linear model

More information

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler Basic econometrics Tutorial 3 Dipl.Kfm. Introduction Some of you were asking about material to revise/prepare econometrics fundamentals. First of all, be aware that I will not be too technical, only as

More information

Chapter 6: Linear Regression With Multiple Regressors

Chapter 6: Linear Regression With Multiple Regressors Chapter 6: Linear Regression With Multiple Regressors 1-1 Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections Answer Key Fixed Effect and First Difference Models 1. See discussion in class.. David Neumark and William Wascher published a study in 199 of the effect of minimum wages on teenage employment using a

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information