Econometrics Problem Set 10
WISE, Xiamen University
Spring 2017

Conceptual Questions

Dependent variable: Pass

                      (1)      (2)      (3)      (4)      (5)      (6)      (7)
                    Probit   Logit    LPM      Probit   Logit    LPM      Probit
Experience           0.031    0.040    0.006                               0.041
                    (0.009)  (0.016)  (0.002)                             (0.156)
Male                                           -0.333   -0.622   -0.071   -0.174
                                               (0.161)  (0.303)  (0.034)  (0.259)
Male × Experience                                                         -0.015
                                                                          (0.019)
Constant             0.712    1.059    0.774    1.282    2.197    0.900    0.806
                    (0.126)  (0.221)  (0.034)  (0.124)  (0.242)  (0.022)  (0.200)

Table 1: Four hundred driver's license applicants were randomly selected and asked whether they passed their driving test (Pass_i = 1) or failed it (Pass_i = 0); data were also collected on their gender (Male_i = 1 if male and 0 if female) and their years of driving experience (Experience_i, in years). The table above summarizes several estimated models.

1. (SW 11.4) Using the results in columns (4)-(6) of Table 1:

(a) Compute the estimated probability of passing the test for men and for women.

Group   Probit                       Logit                                 LPM
Men     Φ(1.282 − 0.333) = 0.829     1/(1 + e^−(2.197 − 0.622)) = 0.829    0.900 − 0.071 = 0.829
Women   Φ(1.282) = 0.900             1/(1 + e^−2.197) = 0.900              0.900

(b) Are the models in (4)-(6) different? Why or why not?

No. Because there is only one regressor and it is binary (Male), each model simply reproduces the fraction of men and the fraction of women who pass the test. The fitted probabilities are therefore identical across the three models.

2. (SW 11.5) Using the results in column (7) of Table 1:

(a) Akira is a man with 10 years of driving experience. What is the probability that he will pass the test?
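The probit and logit probabilities in question 1(a) can be cross-checked numerically. A minimal sketch using only the Python standard library (Φ computed from math.erf; any third-decimal discrepancies come from the rounded coefficients in Table 1):

```python
from math import erf, exp, sqrt

def Phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def Lam(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + exp(-z))

# probit, column (4): men Phi(1.282 - 0.333), women Phi(1.282)
print(round(Phi(1.282 - 0.333), 3), round(Phi(1.282), 3))
# logit, column (5): men Lam(2.197 - 0.622), women Lam(2.197)
print(round(Lam(2.197 - 0.622), 3), round(Lam(2.197), 3))
```

Both rows come out at approximately 0.829 for men and 0.900 for women, matching the LPM values 0.900 − 0.071 = 0.829 and 0.900.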
Pr(Pass = 1) = Φ(0.806 + 0.041 × 10 − 0.174 − 0.015 × 10) = Φ(0.806 + 0.410 − 0.174 − 0.150) = Φ(0.892) = 0.814.

The probability that he will pass the test is 0.814.

(b) Jane is a woman with 2 years of driving experience. What is the probability that she will pass the test?

Pr(Pass = 1) = Φ(0.806 + 0.041 × 2) = Φ(0.888) = 0.813.

The probability that she will pass the test is 0.813.

(c) Does the effect of experience on test performance depend on gender? Explain.

No. The coefficient on the interaction of gender and experience (−0.015, standard error 0.019) is not statistically significant, so there is no evidence that the effect of experience on test performance depends on gender.

3. (SW 11.8) Consider the linear probability model Y_i = β_0 + β_1 X_i + u_i, where Pr(Y_i = 1 | X_i) = β_0 + β_1 X_i.

(a) Show that E(u_i | X_i) = 0.

Since Y_i is a binary variable, we know

E(Y_i | X_i) = 1 × Pr(Y_i = 1 | X_i) + 0 × Pr(Y_i = 0 | X_i) = Pr(Y_i = 1 | X_i) = β_0 + β_1 X_i.

Thus,

E(u_i | X_i) = E(Y_i − (β_0 + β_1 X_i) | X_i) = E(Y_i | X_i) − (β_0 + β_1 X_i) = 0.
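The result in part (a) — that the LPM error has conditional mean zero — can be illustrated by simulation. A quick sketch with arbitrary parameter values (β_0 = 0.3, β_1 = 0.4, chosen so both conditional probabilities lie in (0, 1)):

```python
import random

random.seed(0)
b0, b1 = 0.3, 0.4          # arbitrary illustrative values
draws = 200_000

cond_means = {}
for x in (0, 1):
    p = b0 + b1 * x                                   # Pr(Y = 1 | X = x)
    y = [1 if random.random() < p else 0 for _ in range(draws)]
    u = [yi - p for yi in y]                          # u = Y - (b0 + b1*X)
    cond_means[x] = sum(u) / draws

print(cond_means)  # both conditional means are close to zero
```

With 200,000 draws per value of X, each simulated conditional mean of u_i is within simulation noise of zero, as part (a) requires.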
(b) Show that var(u_i | X_i) = (β_0 + β_1 X_i)[1 − (β_0 + β_1 X_i)].

var(u_i | X_i) = var(Y_i − (β_0 + β_1 X_i) | X_i)
             = var(Y_i | X_i)
             = Pr(Y_i = 1 | X_i)[1 − Pr(Y_i = 1 | X_i)]
             = (β_0 + β_1 X_i)[1 − (β_0 + β_1 X_i)].

(c) Is u_i heteroskedastic? Explain.

Yes, u_i is heteroskedastic: its conditional variance depends on X_i.

(d) Let {(Y_i, X_i)}_{i=1}^n be i.i.d. observations. Derive the likelihood function of the model.

The probability that Y_i = 1 conditional on X_i is p_i = β_0 + β_1 X_i. The conditional probability distribution for the ith observation is Pr(Y_i = y_i | X_i) = p_i^{y_i} (1 − p_i)^{1−y_i}. Since the (X_i, Y_i), i = 1, ..., n, are i.i.d., the joint probability distribution of Y_1, ..., Y_n conditional on the X's is

Pr(Y_1 = y_1, ..., Y_n = y_n | X_1, ..., X_n) = ∏_{i=1}^n Pr(Y_i = y_i | X_i)
                                              = ∏_{i=1}^n p_i^{y_i} (1 − p_i)^{1−y_i}
                                              = ∏_{i=1}^n (β_0 + β_1 X_i)^{y_i} [1 − (β_0 + β_1 X_i)]^{1−y_i}.

4. (SW 11.10) Suppose that a random variable Y has the following probability distribution: Pr(Y = 1) = p, Pr(Y = 2) = q and Pr(Y = 3) = 1 − p − q. A random sample of size n is drawn from this distribution and the random variables are denoted Y_1, Y_2, ..., Y_n.

(a) Derive the likelihood function for the parameters p and q.

Let n_1 equal the number of observations of the random variable Y which equal 1, and n_2 equal the number of observations which equal 2. The joint probability distribution of Y_1, ..., Y_n is

Pr(Y_1 = y_1, ..., Y_n = y_n) = ∏_{i=1}^n Pr(Y_i = y_i) = p^{n_1} q^{n_2} (1 − p − q)^{n − n_1 − n_2}.

The likelihood function is the above joint probability distribution treated as a function of the unknown parameters (p and q), i.e.,

f(p, q; Y_1, ..., Y_n) = p^{n_1} q^{n_2} (1 − p − q)^{n − n_1 − n_2}.

(b) Derive formulas for the MLEs of p and q.

The log likelihood function is

ln f(p, q; y) = n_1 ln(p) + n_2 ln(q) + (n − n_1 − n_2) ln(1 − p − q).

Differentiating the above equation with respect to p and q gives the following first-order conditions:

n_1/p − (n − n_1 − n_2)/(1 − p − q) = 0,
n_2/q − (n − n_1 − n_2)/(1 − p − q) = 0.

Solving these implies p̂ = n_1/n and q̂ = n_2/n.

5. Show that the logistic model quantifies the effect of the regressors in terms of the log-odds ratio of the probability of success relative to the probability of failure, i.e., show that

log( Pr(Y_i = 1 | X_1i, ..., X_ki) / Pr(Y_i = 0 | X_1i, ..., X_ki) ) = β_0 + β_1 X_1i + ... + β_k X_ki.

Write z_i = β_0 + β_1 X_1i + ... + β_k X_ki, so that Pr(Y_i = 1) = 1/(1 + e^{−z_i}). Then

Pr(Y_i = 1)/Pr(Y_i = 0) = [1/(1 + e^{−z_i})] / [1 − 1/(1 + e^{−z_i})]
                        = [1/(1 + e^{−z_i})] × [(1 + e^{−z_i})/e^{−z_i}]
                        = 1/e^{−z_i}
                        = e^{z_i}.

Thus,

log( Pr(Y_i = 1)/Pr(Y_i = 0) ) = z_i = β_0 + β_1 X_1i + ... + β_k X_ki.

6. Consider the following sample of i.i.d. observations (X_i, Y_i), i = 1, ..., 4:

{(0, 0), (0, 1), (1, 0), (1, 1)}.
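The closed-form MLEs derived in question 4 — p̂ = n_1/n and q̂ = n_2/n — can be confirmed with a brute-force search over the log likelihood before moving on; the sample below is made up purely for illustration:

```python
import math

# hypothetical sample from the three-point distribution of question 4
y = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
n = len(y)
n1, n2 = y.count(1), y.count(2)

def loglik(p, q):
    """Log likelihood n1*ln(p) + n2*ln(q) + (n - n1 - n2)*ln(1 - p - q)."""
    return n1 * math.log(p) + n2 * math.log(q) + (n - n1 - n2) * math.log(1 - p - q)

p_hat, q_hat = n1 / n, n2 / n   # closed-form MLEs: 0.3 and 0.2
# grid search over admissible (p, q): the grid maximum is the closed-form solution
grid = [(i / 100, j / 100) for i in range(1, 99) for j in range(1, 99) if i + j < 100]
best = max(grid, key=lambda pq: loglik(*pq))
print(p_hat, q_hat, best)  # 0.3 0.2 (0.3, 0.2)
```

Because the log likelihood is strictly concave on the admissible region and (0.3, 0.2) is a grid point, the grid maximizer coincides exactly with the analytic answer.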
(a) Assume that Pr(Y_i = 1 | X_i) = β_0 + β_1 X_i.

i. Derive the likelihood function for this sample.

The probability that Y_i = 1 conditional on X_i is p_i = β_0 + β_1 X_i. The conditional probability distribution for the ith observation is Pr(Y_i = y_i | X_i) = p_i^{y_i} (1 − p_i)^{1−y_i}. Since the observations are i.i.d., the joint probability distribution of Y_1, ..., Y_4 conditional on the X's is

Pr(Y_1 = y_1, ..., Y_4 = y_4 | X_1, ..., X_4) = ∏_{i=1}^4 p_i^{y_i} (1 − p_i)^{1−y_i}
                                              = ∏_{i=1}^4 (β_0 + β_1 X_i)^{y_i} [1 − (β_0 + β_1 X_i)]^{1−y_i}.

Evaluating at the sample,

f(β_0, β_1) = (1 − β_0) β_0 (1 − β_0 − β_1)(β_0 + β_1).

ii. What are the values of β_0 and β_1 that maximize the likelihood function?

∂f/∂β_1 = (1 − β_0) β_0 (1 − β_0 − β_1 − β_0 − β_1) = (1 − β_0) β_0 (1 − 2β_0 − 2β_1).

Thus, at the optimum, the function β_1(β_0) = 1/2 − β_0 defines the optimal value of β_1 for any value of β_0. Substituting,

f(β_0, β_1(β_0)) = (1 − β_0) β_0 / 4,  and  d f(β_0, β_1(β_0))/dβ_0 = (1 − 2β_0)/4 = 0.

Thus, the maximum likelihood estimates are β̂_0 = 0.5 and β̂_1 = 0.

(b) Assume that Pr(Y_i = 1 | X_i) = Φ(β_0 + β_1 X_i).

i. Derive the likelihood function for this sample.
By the same argument as in (a),

f(β_0, β_1) = ∏_{i=1}^4 Φ(β_0 + β_1 X_i)^{y_i} [1 − Φ(β_0 + β_1 X_i)]^{1−y_i}
            = (1 − Φ(β_0)) Φ(β_0) (1 − Φ(β_0 + β_1)) Φ(β_0 + β_1).

ii. What are the values of β_0 and β_1 that maximize the likelihood function?

∂f/∂β_1 = (1 − Φ(β_0)) Φ(β_0) [−φ(β_0 + β_1) Φ(β_0 + β_1) + φ(β_0 + β_1)(1 − Φ(β_0 + β_1))]
        = (1 − Φ(β_0)) Φ(β_0) φ(β_0 + β_1) (1 − 2Φ(β_0 + β_1)).

At the optimum, Φ(β_0 + β_1) = 1/2. Thus, the function β_1(β_0) = −β_0 defines the optimal value of β_1 for any value of β_0. Substituting,

f(β_0, β_1(β_0)) = (1 − Φ(β_0)) Φ(β_0) / 4,  and  d f(β_0, β_1(β_0))/dβ_0 = φ(β_0)(1 − 2Φ(β_0))/4 = 0.

Thus, the maximum likelihood estimates are β̂_0 = 0 and β̂_1 = 0.

(c) Assume that Pr(Y_i = 1 | X_i) = 1/(1 + e^{−(β_0 + β_1 X_i)}).

i. Derive the likelihood function for this sample.

f(β_0, β_1) = ∏_{i=1}^4 [1/(1 + e^{−(β_0 + β_1 X_i)})]^{y_i} [1 − 1/(1 + e^{−(β_0 + β_1 X_i)})]^{1−y_i}
            = (1 − Λ(β_0)) Λ(β_0) (1 − Λ(β_0 + β_1)) Λ(β_0 + β_1),

where Λ(z) = 1/(1 + e^{−z}) denotes the logistic CDF.

ii. What are the values of β_0 and β_1 that maximize the likelihood function?

∂f/∂β_1 = (1 − Λ(β_0)) Λ(β_0) × e^{−(β_0 + β_1)} (e^{−(β_0 + β_1)} − 1) / (1 + e^{−(β_0 + β_1)})^3.

At the optimum, e^{−(β_0 + β_1)} = 1. Thus, the function β_1(β_0) = −β_0 defines the optimal value of β_1 for any value of β_0. Substituting,

f(β_0, β_1(β_0)) = (1 − Λ(β_0)) Λ(β_0) / 4 = e^{−β_0} / [4(1 + e^{−β_0})^2],

and

d f(β_0, β_1(β_0))/dβ_0 = e^{−β_0} (e^{−β_0} − 1) / [4(1 + e^{−β_0})^3] = 0.

Thus, the maximum likelihood estimates are β̂_0 = 0 and β̂_1 = 0.

Empirical Questions

For these empirical exercises, the required datasets and a detailed description of them can be found at www.wise.xmu.edu.cn/course/gecon/written.html.

7. (SW E11.2) It has been conjectured that workplace smoking bans induce smokers to quit by reducing their opportunities to smoke. In this assignment you will estimate the effect of workplace smoking bans on smoking using data on a sample of 10,000 U.S. indoor workers from 1991-1993. The dataset Smoking contains information on whether individuals were or were not subject to a workplace smoking ban, whether the individuals smoked, and other individual characteristics. Use the dataset to carry out the following exercises. The R code required for each question is listed within its respective solution. The code listed here initializes the software.

# load relevant libraries
library("AER")
# read data
Smoke <- read.csv("D:/R/Smoking.csv")

The following table summarizes the regressions needed to answer the questions in parts (a)-(h).
Dependent variable: smoker

              Probit       LPM
smkban       -0.159      -0.047
             (0.029)     (0.009)
age           0.035       0.010
             (0.007)     (0.002)
age^2        -0.00047    -0.00013
             (0.00008)   (0.00002)
female       -0.112      -0.033
             (0.029)     (0.009)
hsdrop        1.142       0.323
             (0.073)     (0.019)
hsgrad        0.883       0.233
             (0.061)     (0.013)
colsome       0.677       0.164
             (0.062)     (0.013)
colgrad       0.235       0.045
             (0.066)     (0.012)
black        -0.084      -0.028
             (0.054)     (0.016)
hispanic     -0.338      -0.105
             (0.050)     (0.014)
Intercept    -1.735      -0.014
             (0.152)     (0.041)

Robust standard errors in parentheses. 10,000 observations.

(a) Estimate a probit model with smoker as the dependent variable and the following regressors: smkban, female, age, age^2, hsdrop, hsgrad, colsome, colgrad, black and hispanic.

# probit regression
m1 <- glm(smoker ~ smkban + age + I(age^2) + female + hsdrop + hsgrad + colsome
          + colgrad + black + hispanic, family = binomial(link = "probit"), data = Smoke)
coeftest(m1, vcov = vcovHC(m1, "HC1"))

See the Probit column in the above table.

(b) Test the hypothesis that the coefficient on smkban is zero in the population version of this probit regression against the alternative that it is nonzero, at the 5% significance level.

The t-statistic equals −5.437, which exceeds the 5% critical value of 1.96 in absolute value, so we reject the null. The coefficient on smkban is not zero in the population version of this probit regression.

(c) Test the hypothesis that the probability of smoking does not depend on the level of education in this probit model.

# F-test for joint significance of education
linearHypothesis(m1, c("hsdrop=0", "hsgrad=0", "colsome=0", "colgrad=0"),
                 vcov = vcovHC(m1, "HC1"), test = "F")

The t-statistics for hsdrop, hsgrad, colsome and colgrad are 15.543, 14.550, 10.978 and 3.579 respectively, implying that all are individually significant. The F-statistic (109.58) is also significant, so the probability of smoking does depend on the level of education in this probit model.

(d) Mr. A is white, non-Hispanic, 20 years old and a high school dropout. Using the probit regression from (a), and assuming that Mr. A is not subject to a workplace smoking ban, calculate the probability that Mr. A smokes. Carry out the calculation again assuming that he is subject to a workplace smoking ban. What is the effect of a smoking ban on the probability of smoking?

# Mr. A data
MrA <- data.frame(smkban = c(0, 1), age = 20, hsdrop = 1, hsgrad = 0, colsome = 0,
                  colgrad = 0, black = 0, hispanic = 0, female = 0)
# predicted probability for Mr. A to smoke
predictMrA <- predict(m1, newdata = MrA, type = "response")
# effect of smoking ban
predictMrA[1] - predictMrA[2]

When Mr. A is not subject to a workplace smoking ban, the probability that he smokes is 0.464; when he is subject to a workplace smoking ban, the probability that he smokes is 0.402. The effect of a smoking ban is therefore a reduction in the probability of smoking of 0.464 − 0.402 = 0.062.

(e) Repeat (d) for Ms. B, a female, black, 40-year-old college graduate.

# Ms. B data
MsB <- data.frame(smkban = c(0, 1), age = 40, hsdrop = 0, hsgrad = 0, colsome = 0,
                  colgrad = 1, black = 1, hispanic = 0, female = 1)
# predicted probability for Ms. B to smoke
predictmsb<-predict(m, newdatamsb, type"response") # effect of smoking ban predictmsb[]-predictmsb[2] ## When assuming Ms. B is not subject to a workplace smoking ban, the probability that she smokes is 0.44, when she is subject to a workplace smoking ban, the probability that she smokes is 0.. The effect of a smoking ban on the probability of smoking is 0.44 0. 0.033. (f) Estimate a linear probability model with smoker as the dependent variable and the following regressors; smkban, female, age, age 2, hsdrop, hsgrad, colsome, solgrad, black and hispanic. # linear probability model m2<-lm(smoker~smkban+age+i(age^2)+female+hsdrop+hsgrad+colsome +colgrad+black+hispanic,datasmoke) coeftest(m2,vcovvcovhc(m2,"hc")) ## See the LPM column of the above table. (g) Repeat (d) and (e) using the linear probability model estimated in (f). # predicted probability for Mr. A to smoke predictmra<-predict(m2, newdatamra, type"response") # effect of smoking ban predictmra[]-predictmra[2] # predicted probability for Ms. B to smoke predictmsb<-predict(m2, newdatamsb, type"response") # effect of smoking ban predictmsb[]-predictmsb[2] In the linear model, when assuming Mr. A is not subject to a workplace smoking ban, the probability that he smokes is 0.449; when he is subject to a workplace smoking ban, the probability that he smokes is 0.402. The effect of a smoking ban on the probability of smoking is 0.449 0.402 0.047. When assuming Ms. B is not subject to a workplace smoking ban, the probability that she smokes is 0.46; when she is subject to a workplace smoking ban, the probability that she smokes is 0.099. The effect of a smoking ban on the probability of smoking is 0.46 0.099 0.047. (Notice that this is given by the coefficient on smkban, 0.047, in the linear probability model.) (h) Based on the answers to (d)-(g), do the probit and linear probability model results differ? If they do, which results make more sense? Are the estimated effects large in a Page 0
real-world sense. From (d)-(g),the results for the probit and linear probability model are obviously different. The linear probability model assumes that the marginal impact of workplace smoking bans on the probability of an individual smoking is not dependent on the other characteristics of the individual. In contrast, the predicted marginal impact of workplace smoking bans on the probability of smoking using the probit model depends on individual characteristics. Therefore, in the linear probability model, the marginal impact of workplace smoking bans is the same for Mr. A and Ms. B, although their profiles would suggest that Mr. A has a higher probability of smoking based on his characteristics, regardless of any workplace smoking ban. Looking at the probit model results, the marginal impact of workplace smoking bans on the odds of smoking are different for Mr. A and Ms. B, because their different characteristics are incorporated into the probability of smoking. In this sense the probit model is likely more appropriate. Are the impacts of workplace smoking bans large in a real-world sense? Most people might believe the impacts are large. For example, in (d) the reduction on the probability is 6.3%. Applied to a large number of people, this translates into a 6.3% reduction in the number of people smoking. (i) Are there important remaining threats to internal validity? There may be simultaneous causality bias, the probability of smoking and smoking bans have a mutual effect on each other. If there are more smokers, it is hard to issue smoking bans. Companies that impose a smoking ban may have fewer smokers to begin with. Smokers may seek employment with employers that do not have a smoking ban. States with smoking bans already may have fewer smokers than states without smoking bans. Page
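As a cross-check on parts (d), (e) and (h), the probit predictions can be reproduced by hand from the rounded coefficients in the table above. A sketch in Python (stdlib only; because the coefficients are rounded, the results differ from the in-text numbers only in the third decimal):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def smoke_index(smkban, age, female=0, hsdrop=0, hsgrad=0,
                colsome=0, colgrad=0, black=0, hispanic=0):
    """Probit index using the rounded coefficients from the table."""
    return (-1.735 - 0.159 * smkban + 0.035 * age - 0.00047 * age ** 2
            - 0.112 * female + 1.142 * hsdrop + 0.883 * hsgrad
            + 0.677 * colsome + 0.235 * colgrad
            - 0.084 * black - 0.338 * hispanic)

# Mr. A: white, non-Hispanic, 20-year-old male high school dropout
eff_A = Phi(smoke_index(0, 20, hsdrop=1)) - Phi(smoke_index(1, 20, hsdrop=1))
# Ms. B: black, 40-year-old female college graduate
eff_B = (Phi(smoke_index(0, 40, female=1, colgrad=1, black=1))
         - Phi(smoke_index(1, 40, female=1, colgrad=1, black=1)))
print(round(eff_A, 3), round(eff_B, 3))  # close to 0.062 and 0.033
```

Unlike the LPM, the two effects differ: each individual's characteristics shift the point on the normal CDF at which the ban coefficient is evaluated, which is exactly the contrast discussed in part (h).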