Econometrics Homework 4 Solutions

Question 1

(a) General sources of the problem are measurement error in the regressors, omitted variables that are correlated with the regressors, and simultaneity (reverse causation). In this case, there may be a simultaneity problem if the hours one is willing to work affect the wage that employers are willing to pay. Measurement error is also a common problem when the wage is calculated as earnings divided by hours.

(b) (i) It is not in the equation as a regressor. (ii) It is correlated with the wage. (iii) It is uncorrelated with the error term (the determinants of hours of work that are not included as regressors).

(c) We should run the first-stage regression: regress lw on ax, ax^2, hedu, kl6, k618, edu, loinc and age (and a constant, of course; it can be omitted from the list when it is understood). We then look at the F statistic for testing whether ax, ax^2 and hedu all have zero coefficients. The rule of thumb is that the instruments are weak if this F statistic is below 10. So, here the instruments are weak. Note that the p-value is irrelevant here.

(d) We can do the overidentifying restrictions test. Obtain the residuals from the IV regression, then regress these residuals on all exogenous variables. Here, they are ax, ax^2, hedu, kl6, k618, edu, loinc and age. Then obtain the statistic $nR^2$. Under the null hypothesis of valid overidentifying restrictions, this is distributed $\chi^2$ with 2 degrees of freedom (three instruments minus one endogenous regressor). (It is not robust to heteroscedasticity, but I don't want to complicate matters here.) The p-value is a lot higher than 0.05, so we cannot reject the null of valid overidentifying restrictions.

(e) Run the first-stage regression described in part (c), obtain the residual (lw_res), and run OLS on the original equation with this residual added (i.e. regress lhr on lw, kl6, k618, edu, loinc, age, lw_res and 1 using OLS). The t-statistic on the coefficient of lw_res is our test statistic, which is asymptotically normally distributed under the null.
The p-value is so low that we reject the null that the regressor lw is exogenous.

(f) Using White's robust standard errors affects only the variance estimates, not the coefficient estimates. The t-values are affected because they involve the standard errors.

(g) The wage elasticity is 1.761, which is positive. Since t = 1.761/0.599 = 2.94, it is statistically significant. This is reasonable: the higher the wage, the more women tend to work, as leisure becomes more expensive and the income effect is positive.

(h) Having small kids has a negative impact on hours of work, with a smaller impact if the kids are older. More educated women also tend to work fewer hours. Older women work less. Those with higher income from other sources also work less (income effect on leisure). The coefficients on kids below 6 and education are statistically significant at the 5% level. (I am brief here.)

(i) We now test $H_0: \beta_{kl6} - \beta_{k618} = 0$ against the alternative that the two coefficients are not equal. Given F = 3.945, the degrees of freedom here are (1, 420) and the 5% critical value is about 3.84, so we reject the null that the effects are the same at the 5% level.

(j) One should add an interaction term kl6 × lw. Its coefficient would show the difference in the wage elasticity of labor supply between those with a small kid and those without.
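To illustrate the endogeneity problem in part (a) and the first-stage check in part (c), here is a minimal simulation sketch. It does not use the homework data: the data-generating process, the single instrument `z`, and all names and parameter values are assumptions made only for illustration. With one instrument, the first-stage F statistic is simply the squared t-statistic on that instrument.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

random.seed(1)
n, beta = 5000, 1.0                       # hypothetical sample size and true slope

z = [random.gauss(0, 1) for _ in range(n)]            # instrument
v = [random.gauss(0, 1) for _ in range(n)]            # first-stage error
u = [0.8 * vi + random.gauss(0, 1) for vi in v]       # structural error, correlated with v
x = [zi + vi for zi, vi in zip(z, v)]                 # endogenous regressor
y = [beta * xi + ui for xi, ui in zip(x, u)]

ols_slope = cov(x, y) / cov(x, x)         # inconsistent because cov(x, u) != 0
iv_slope = cov(z, y) / cov(z, x)          # simple IV estimator with one instrument

# First stage: regress x on z (with a constant) and form F = t^2.
pi_hat = cov(z, x) / cov(z, z)
mx, mz = mean(x), mean(z)
resid = [(xi - mx) - pi_hat * (zi - mz) for xi, zi in zip(x, z)]
s2 = sum(r * r for r in resid) / (n - 2)
se_pi = (s2 / sum((zi - mz) ** 2 for zi in z)) ** 0.5
first_stage_F = (pi_hat / se_pi) ** 2

print(ols_slope, iv_slope, first_stage_F)
```

In this design the OLS slope is pulled away from the true value by the correlation between `x` and `u`, while the IV slope is not, and the first-stage F is far above the rule-of-thumb threshold of 10 because the simulated instrument is strong by construction.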
Question 2

(a) Given education level and age (experience), more able people are more likely to join the party, so the $b_2$ obtained this way is likely to capture the effect of ability as well, not just the effect of being in the party.

(b) Some proxy for ability or for human capital investment other than the level of education may help (e.g. parental income, past test scores, or university GPA).

(c) Parents' party membership can be a valid instrument if it affects the child's probability of becoming a member and has no direct effect on earnings after controlling for the other regressors (i.e. it is correlated with the child's party membership status but uncorrelated with the error term of the earnings equation).

(d) Maybe not, if party membership tends to be determined early on in life. It is useful only if some people change status between the surveys.

Question 3

(a) FE: with $\bar{y}_i = (y_{i1} + y_{i2})/2$ and $\bar{x}_i = (x_{i1} + x_{i2})/2$, and writing $\beta$ for the slope coefficient,

$$y_{i1} - \bar{y}_i = (x_{i1} - \bar{x}_i)\beta + (\varepsilon_{i1} - \bar{\varepsilon}_i)$$

$$y_{i2} - \bar{y}_i = (x_{i2} - \bar{x}_i)\beta + (\varepsilon_{i2} - \bar{\varepsilon}_i)$$

(Notice that the constant term $\beta_1$ has also been differenced away. I have omitted the details here.) Now, $y_{i1} - (y_{i1} + y_{i2})/2 = (y_{i1} - y_{i2})/2$ and $y_{i2} - (y_{i1} + y_{i2})/2 = (y_{i2} - y_{i1})/2$. Similarly for $x_{i1}$ and $x_{i2}$. So, for OLS, the objective function becomes

$$\sum_{i=1}^{n} \left[ \left( \tfrac{1}{2}(y_{i1} - y_{i2}) - \tfrac{1}{2}(x_{i1} - x_{i2})\beta \right)^2 + \left( \tfrac{1}{2}(y_{i2} - y_{i1}) - \tfrac{1}{2}(x_{i2} - x_{i1})\beta \right)^2 \right]$$

Note that the terms inside the first and second squares are just the negatives of each other, and thus have the same value after squaring, so this is the same as

$$\frac{1}{2} \sum_{i=1}^{n} \left( (y_{i2} - y_{i1}) - (x_{i2} - x_{i1})\beta \right)^2$$

Then for FD:

$$y_{i2} - y_{i1} = (x_{i2} - x_{i1})\beta + (\varepsilon_{i2} - \varepsilon_{i1})$$

The objective function of OLS becomes

$$\sum_{i=1}^{n} \left( (y_{i2} - y_{i1}) - (x_{i2} - x_{i1})\beta \right)^2$$

Since the objective functions of FE and FD differ only by the factor 1/2, the minimizers are the same. Thus, the FD and FE estimators are the same for $T = 2$.

(b) The first-differenced estimator is consistent when $(x_{ki1} - x_{ki2})$ is uncorrelated with $(\varepsilon_{i1} - \varepsilon_{i2})$. So if $\varepsilon$ is uncorrelated with $x$ both for oneself and across the pair of twins, the condition is satisfied.
(This allows correlation between $c_i$ and $x_{ij}$, since $c_i$ has already been removed from the estimating equation.)
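The equivalence of FE and FD for two periods in part (a) can be checked numerically. The sketch below uses simulated pairs with a single regressor; the data-generating process and all names are assumptions for illustration only, not the twins data. Both estimators are computed without a constant, since the constant is differenced away.

```python
import random

random.seed(2)
n, beta = 200, 0.5                         # hypothetical number of pairs and true slope

pairs = []
for _ in range(n):
    c = random.gauss(0, 1)                 # unobserved pair effect
    xs = [c + random.gauss(0, 1) for _ in range(2)]   # regressor correlated with c
    ys = [1.0 + beta * xv + c + random.gauss(0, 1) for xv in xs]
    pairs.append((xs, ys))

# FE: pooled OLS on deviations from pair means.
num = den = 0.0
for xs, ys in pairs:
    xbar, ybar = sum(xs) / 2, sum(ys) / 2
    for xv, yv in zip(xs, ys):
        num += (xv - xbar) * (yv - ybar)
        den += (xv - xbar) ** 2
fe = num / den

# FD: OLS of (y_i2 - y_i1) on (x_i2 - x_i1).
fd = (sum((xs[1] - xs[0]) * (ys[1] - ys[0]) for xs, ys in pairs)
      / sum((xs[1] - xs[0]) ** 2 for xs, _ in pairs))

print(fe, fd)
```

The two numbers agree up to floating-point rounding, as the factor 1/2 in the objective function does not change the minimizer.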
(c) Since twins are born on the same day, their ages are the same, so differencing results in zeros. Thus, the coefficient on age cannot be obtained.

(d) The coefficient implies that a one-year increase in education raises the wage by about 9.2%. This is statistically significant, as t = 0.092/0.024 ≈ 3.83 > 1.96. (I forgot to put in the number of observations, which is 149 pairs of twins.)

(e) Measurement error in the regressors would result in attenuation bias (bias towards zero). (Measurement error is a problem here because we rely only on differences within pairs of twins, without using the differences across different pairs of twins.)

(f) Though the instrument (the alternative measure of $x_{i1} - x_{i2}$) can still be subject to measurement error, if this error is not correlated with the error term $\varepsilon_{i2} - \varepsilon_{i1}$ or with the original measurement errors, then it is a valid instrument, as it is correlated with $x_{i1} - x_{i2}$.

(g) The FDIV estimate is much higher than the FD estimate, which implies that the measurement error leads to a serious downward bias. The standard error is higher than in the FD case. The estimate means that the effect of one more year of education, controlling for family and genetic effects, is 16.7%, and it is statistically significant. (From Ashenfelter, Orley and Krueger, Alan (1994) "Estimates of the Economic Return to Schooling from a New Sample of Twins" American Economic Review 84(5), 1157-73.)

Question 4

(a) The state effects are not controlled for in specification 1, so the unobserved characteristics of the states may be correlated with the beer tax level. For example, states with a more serious drunk-driving fatality problem may have a higher beer tax. Specification 2 controls for the state effects (by using the FE estimator or state dummies), so such effects are removed, which possibly brings the estimated coefficient closer to the true effect of the beer tax.
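The point in part (a) can be illustrated with a small simulation. This is only a sketch under assumed values: a stylized panel of hypothetical states in which the unobserved state effect raises both the tax and the fatality rate, with a true tax effect of -0.5. None of the numbers come from the homework table.

```python
import random

random.seed(3)
n_states, n_years, beta = 200, 6, -0.5     # hypothetical panel; beta is the true tax effect

panel = []                                 # rows of (state, tax, fatality)
for s in range(n_states):
    c = random.gauss(0, 1)                 # unobserved state effect
    for _ in range(n_years):
        tax = 0.8 * c + random.gauss(0, 1)             # tax is higher where c is higher
        fatality = beta * tax + c + random.gauss(0, 0.5)
        panel.append((s, tax, fatality))

def slope(xs, ys):
    """OLS slope of ys on xs with a constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Specification 1: pooled OLS, ignoring state effects.
pooled = slope([r[1] for r in panel], [r[2] for r in panel])

# Specification 2: state fixed effects via within-state demeaning.
dx, dy = [], []
for s in range(n_states):
    rows = [r for r in panel if r[0] == s]
    mx = sum(r[1] for r in rows) / len(rows)
    my = sum(r[2] for r in rows) / len(rows)
    dx += [r[1] - mx for r in rows]
    dy += [r[2] - my for r in rows]
fe = slope(dx, dy)

print(pooled, fe)
```

In this design the pooled slope is biased upward (towards zero or beyond) because the tax picks up the state effect, while the within-state slope stays close to the assumed true effect.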
(b) If all states have the same laws, all units have the same values of the driving law variables in a given year, so these variables are perfectly correlated with the time dummies for the relevant years. There is then perfect multicollinearity, and the effect cannot be estimated while controlling for time fixed effects.

(c) (I have given the units of the variables, so you may just interpret the sign and statistical significance.) The beer tax has a statistically significant negative effect on traffic fatality, while the drinking age laws have very small and insignificant effects, some of which are even positive. Harsher punishments such as mandatory jail and community service also have small and statistically insignificant effects on traffic fatality. Driving experience per driver also has a small and insignificant effect. Among the economic variables, the unemployment rate has a negative effect on fatality while real income per capita increases fatality, and both are statistically significant. Only the beer tax and the economic variables are statistically significant in the regression.

(d) From the lower part of the table, the F statistic for the joint significance of the drinking age law coefficients is 0.48 with p-value 0.696, which means we cannot reject the null hypothesis that all drinking law coefficients are zero. So such laws have little effect, if any, on traffic fatality.

(e) The restrictions are that each one-year reduction in the drinking age has the same effect, and that each type of mandatory punishment has the same effect. (This is meant to increase the power of the test because it uses fewer parameters, even though such equality may not be true.)

(f) If the idiosyncratic error term $\varepsilon_{it}$ is autocorrelated and/or heteroscedastic, we should use clustered robust standard errors.

Question 5

(a)

$$E(e_{it}) = E(u_{it}) - \theta E(\bar{u}_i) = 0 - \theta E\left( \sum_{s=1}^{T} u_{is}/T \right) = -\theta \sum_{s=1}^{T} E(u_{is})/T = 0$$

since $E(u_{is}) = E(c_i + \varepsilon_{is}) = 0$ for all $i$, $s$.
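(b) Note that

$$e_{it} = u_{it} - \theta \bar{u}_i = c_i + \varepsilon_{it} - \theta c_i - \frac{\theta}{T} \sum_{s=1}^{T} \varepsilon_{is} = (1 - \theta) c_i + \left( 1 - \frac{\theta}{T} \right) \varepsilon_{it} - \sum_{j \neq t} \frac{\theta}{T} \varepsilon_{ij}$$

Thus, for $t \neq s$,

$$cov(e_{it}, e_{is}) = (1 - \theta)^2 \sigma_c^2 - 2 \left( 1 - \frac{\theta}{T} \right) \frac{\theta}{T} \sigma_\varepsilon^2 + \frac{\theta^2}{T^2} (T - 2) \sigma_\varepsilon^2$$

$$= (1 - \theta)^2 \sigma_c^2 + \left( \frac{\theta^2}{T} - \frac{2\theta}{T} \right) \sigma_\varepsilon^2$$

$$= \sigma_c^2 - 2\theta \left( \sigma_c^2 + \frac{\sigma_\varepsilon^2}{T} \right) + \theta^2 \left( \sigma_c^2 + \frac{\sigma_\varepsilon^2}{T} \right)$$

For the first equality, the second term comes from matching $(1 - \theta/T)\varepsilon_{it}$ in $e_{it}$ with $-(\theta/T)\varepsilon_{it}$ in $e_{is}$ (and symmetrically for $\varepsilon_{is}$, hence the factor 2), and the third term comes from matching $(\theta/T)\varepsilon_{ij}$ for $j \neq s, t$.

(c) To solve for $\theta$ when $cov(e_{it}, e_{is}) = 0$, we apply the quadratic formula and take the root that lies between 0 and 1:

$$\theta = \frac{2 \left( \sigma_c^2 + \frac{\sigma_\varepsilon^2}{T} \right) - \sqrt{4 \left( \sigma_c^2 + \frac{\sigma_\varepsilon^2}{T} \right)^2 - 4 \left( \sigma_c^2 + \frac{\sigma_\varepsilon^2}{T} \right) \sigma_c^2}}{2 \left( \sigma_c^2 + \frac{\sigma_\varepsilon^2}{T} \right)}$$

$$= 1 - \sqrt{1 - \frac{\sigma_c^2}{\sigma_c^2 + \sigma_\varepsilon^2 / T}} = 1 - \sqrt{\frac{\sigma_\varepsilon^2 / T}{\sigma_c^2 + \sigma_\varepsilon^2 / T}} = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{T \sigma_c^2 + \sigma_\varepsilon^2}}$$

which is the same as in the lecture notes.

(d)

$$Var(e_{it}) = Var(u_{it} - \theta \bar{u}_i) = Var\left( c_i + \varepsilon_{it} - \frac{\theta}{T} \sum_{j=1}^{T} (c_i + \varepsilon_{ij}) \right)$$

$$= Var\left( (1 - \theta) c_i + \left( 1 - \frac{\theta}{T} \right) \varepsilon_{it} - \frac{\theta}{T} \sum_{j \neq t} \varepsilon_{ij} \right)$$

$$= (1 - \theta)^2 Var(c_i) + \left( 1 - \frac{\theta}{T} \right)^2 Var(\varepsilon_{it}) + \frac{\theta^2}{T^2} (T - 1) Var(\varepsilon_{ij})$$

$$= (1 - \theta)^2 \sigma_c^2 + \left[ \left( 1 - \frac{\theta}{T} \right)^2 + \frac{\theta^2}{T^2} (T - 1) \right] \sigma_\varepsilon^2$$

As this is true for all $i$ and for all $t$, $e_{it}$ is homoscedastic as long as $\sigma_\varepsilon^2$ is constant across $i$ and $t$. (It is OK if you do not calculate the variance, but you should get the idea.) I ask this question because the GLS transformation aims at removing both autocorrelation and heteroscedasticity, if any.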
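The results in parts (b) to (d) can be checked numerically. The sketch below builds the covariance matrix of the quasi-demeaned errors directly from $Var(u_i) = \sigma_c^2 \mathbf{1}\mathbf{1}' + \sigma_\varepsilon^2 I$ and the transformation $I - (\theta/T)\mathbf{1}\mathbf{1}'$. The variance values, the choice $T = 4$, and the function names are arbitrary assumptions for illustration.

```python
def re_theta(sigma2_c, sigma2_e, T):
    """theta from part (c): 1 - sqrt(sigma_e^2 / (T*sigma_c^2 + sigma_e^2))."""
    return 1 - (sigma2_e / (T * sigma2_c + sigma2_e)) ** 0.5

def cov_quasi_demeaned(sigma2_c, sigma2_e, T, theta):
    """Covariance matrix of e_i = (I - (theta/T) 11') u_i."""
    # Omega = Var(u_i): sigma_c^2 everywhere, plus sigma_e^2 on the diagonal.
    omega = [[sigma2_c + (sigma2_e if t == s else 0.0) for s in range(T)]
             for t in range(T)]
    # M = I - (theta/T) 11' (symmetric).
    M = [[(1.0 if t == s else 0.0) - theta / T for s in range(T)]
         for t in range(T)]
    MO = [[sum(M[t][k] * omega[k][s] for k in range(T)) for s in range(T)]
          for t in range(T)]
    # M * Omega * M'
    return [[sum(MO[t][k] * M[s][k] for k in range(T)) for s in range(T)]
            for t in range(T)]

sigma2_c, sigma2_e, T = 2.0, 1.5, 4        # hypothetical values
theta = re_theta(sigma2_c, sigma2_e, T)
V = cov_quasi_demeaned(sigma2_c, sigma2_e, T, theta)

# Variance formula from part (d).
var_formula = ((1 - theta) ** 2 * sigma2_c
               + ((1 - theta / T) ** 2 + theta ** 2 * (T - 1) / T ** 2) * sigma2_e)

print(V[0][1], V[0][0], var_formula)
```

With this choice of $\theta$ the off-diagonal elements vanish up to rounding, and every diagonal element matches the formula in part (d); in this check the common variance equals $\sigma_\varepsilon^2$.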
(e) When $\sigma_c^2 \to 0$, $\theta \to 0$, so we go back to simple OLS, which satisfies the assumptions of the Gauss-Markov Theorem when there is no unobserved individual effect $c_i$. So this choice is efficient.

(f) When $\sigma_\varepsilon^2 \to 0$, $\theta \to 1$, which gives the same estimator as fixed effects, but the justification is different. Note that if the variance of $\varepsilon_{it}$ is very small (relative to $\sigma_c^2$), differencing with $\theta = 1$ removes $c_i$, and the remaining error is $\varepsilon_{it} - \bar{\varepsilon}_i$, whose variance is close to 0 if $\sigma_\varepsilon^2$ is close to zero. A low error variance reduces the variance of the OLS estimator of the coefficients, as we have shown before.

Question 6

(a) We should use the Breusch-Godfrey test. We can do it by regressing the residuals on their lags and all regressors, and obtaining the statistic $(T - p)R^2$ for this regression, where $p$ is the number of lags. Under the null of no serial correlation of the errors, the statistic is distributed $\chi^2$ with degrees of freedom equal to the number of lags. Here the p-value is so small that the null of no serial correlation can be strongly rejected.

(b) Rough weather would increase the price, as it is harder to go out to sea to catch fish. Both weather variables are statistically significant. Recent rough weather has a stronger effect.

(c) It looks like Thursday has higher prices than other days. But the joint test cannot reject that all day-of-the-week effects are zero at the 10% significance level.

(d) The P-W (Prais-Winsten) procedure is an FGLS estimator adjusting for first-order autocorrelation of the errors. (Here I do not need the details, but you should write a little more if I ask for them.) The change in the day-of-the-week effects is small, and the pattern is the same. The effects of wave2 and wave3 are smaller than under OLS.
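The Breusch-Godfrey procedure described in Question 6(a) can be sketched as follows. This is a minimal illustration on simulated data with AR(1) errors, not the fish market data; the helper functions `ols`, `residuals` and `breusch_godfrey`, and all parameter values, are assumptions written for this example using only the standard library.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def residuals(X, y, b):
    return [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]

def breusch_godfrey(X, y, p=1):
    """Return (T - p) * R^2 from regressing residuals on X and p residual lags."""
    u = residuals(X, y, ols(X, y))
    Z = [X[t] + [u[t - lag] for lag in range(1, p + 1)] for t in range(p, len(y))]
    v = u[p:]
    e = residuals(Z, v, ols(Z, v))
    mean_v = sum(v) / len(v)
    r2 = 1 - sum(ei ** 2 for ei in e) / sum((vi - mean_v) ** 2 for vi in v)
    return len(v) * r2

random.seed(0)
T, rho = 500, 0.6                          # hypothetical sample size and AR(1) parameter
x = [random.gauss(0, 1) for _ in range(T)]
u, prev = [], 0.0
for _ in range(T):
    prev = rho * prev + random.gauss(0, 1) # serially correlated error
    u.append(prev)
y = [1.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]
X = [[1.0, xi] for xi in x]                # constant and one regressor

stat = breusch_godfrey(X, y, p=1)
print(stat)                                # compare with the chi-square(1) 5% critical value, about 3.84
```

Because the simulated errors are strongly autocorrelated, the statistic is far above the critical value, so the null of no serial correlation is rejected in this example.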