

ECO 6375
Prof. Millimet
Problem Set #2: Answer Key

1. Stata problem

(a) The sample average of the individual-specific marginal effects is 0.039 for educw and negative for white. Thus, on average, an extra year of education raises the probability of a woman working by 3.9 percentage points, and being white reduces the probability of working by 5.4 percentage points.

(b) These are identical to those reported by -margins-, but differ from those reported by -dprobit-. The marginal effects reported by -dprobit- are computed at the sample mean of the covariates; this is verified in the attached do file. While the two sets of marginal effects are asymptotically equivalent, they differ slightly in this sample. -margins- can also be used to obtain marginal effects at the mean (or at any other data point).

2. Q

(a) We know the OLS estimator is

$$
\begin{aligned}
\hat\beta
&= \frac{\sum_i (x_i-\bar x)(y_i-\bar y)}{\sum_i (x_i-\bar x)^2}
 = \frac{\frac{1}{N}\sum_i (x_i-\bar x)(y_i-\bar y)}{\frac{1}{N}\sum_i (x_i-\bar x)^2}
 = \frac{\frac{1}{N}\sum_i x_i (y_i-\bar y)}{\mathrm{Var}(x)} \\
&= \frac{\frac{1}{N}\sum_i x_i y_i - \bar x\,\bar y}{\mathrm{Var}(x)}
 = \frac{\frac{1}{N}\sum_{i:\,y_i=1} x_i - \bar y\,\bar x}{\mathrm{Var}(x)}
 = \frac{P(\bar x_1 - \bar x)}{\mathrm{Var}(x)} \\
&= \frac{P\left(\bar x_1 - P\bar x_1 - (1-P)\bar x_0\right)}{\mathrm{Var}(x)}
 = \frac{P(1-P)}{\mathrm{Var}(x)}\,(\bar x_1 - \bar x_0)
\end{aligned}
$$

where $P = N_1/N$ is the proportion of the sample with $y = 1$ (also equal to $\bar y$), $\bar x_1$ and $\bar x_0$ are the means of $x$ in the sub-samples with $y = 1$ and $y = 0$, and $\mathrm{Var}(x) = \frac{1}{N}\sum_i (x_i-\bar x)^2$. Thus, the slope estimator is a scaled version of the mean difference in $x$ across the sub-samples with $y = 1$ and $y = 0$.

(b) In the LPM, we have $\Pr(y_i = 1 \mid x_i) = x_i\beta$. Thus, the log-likelihood is given by

$$
\ln[L(\theta)] = \sum_i \ln[\Pr(y_i \mid x_i, \theta)] = \sum_i \left\{ y_i \ln[x_i\beta] + (1-y_i)\ln[1 - x_i\beta] \right\}
$$

which becomes undefined if $x_i\beta \notin (0, 1)$.

3. Q

(a) In a probit model, the probability that $y_i = 1$ is given by

$$
\Pr(y_i = 1 \mid x_i) = \Phi(x_i\beta)
$$

where $\Phi(\cdot)$ is the standard normal CDF. So, given the estimation results,

$$
\Pr(y_i = 1 \mid male = 1, exp = 10) = \Phi[0.031(10) + 0.174(1) + 0.632] = \Phi[1.116] \approx 86.8\%
$$
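As a numerical check on 3(a), the fitted probability can be evaluated directly from the standard normal CDF. The sketch below is Python (standard library only) rather than Stata, so it runs without the course data. The coefficient values 0.031 (exp), 0.174 (male) and 0.632 (constant) are assumptions reconstructed from the printed indices $\Phi[1.116]$ and $\Phi[0.787]$, and the function names are illustrative.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_prob(exp_years, male, b_exp=0.031, b_male=0.174, b_cons=0.632):
    """Pr(y = 1 | x) = Phi(x'beta) for the assumed coefficient values."""
    index = b_exp * exp_years + b_male * male + b_cons
    return std_normal_cdf(index)


# Male with 10 years of experience: index = 0.31 + 0.174 + 0.632 = 1.116
p_male_10 = probit_prob(exp_years=10, male=1)
# Female with 5 years of experience: index = 0.155 + 0.632 = 0.787
p_female_5 = probit_prob(exp_years=5, male=0)

print(f"{p_male_10:.3f} {p_female_5:.3f}")
```

The same function can be evaluated at any other combination of gender and experience, which also shows why the implied marginal effects are not constant.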

(b) Similarly,

$$
\Pr(y_i = 1 \mid male = 0, exp = 5) = \Phi[0.031(5) + 0.174(0) + 0.632] = \Phi[0.787] \approx 78.4\%
$$

(c) Marginal effects in the probit model give the change in the probability that $y_i = 1$ given a change in $x$. For experience, this is given by

$$
\frac{\partial \Pr(y_i = 1 \mid x_i)}{\partial exp} = \frac{\partial \Phi(x_i\beta)}{\partial exp} = \phi(x_i\beta)\,\beta_{exp}
$$

which is equal to $\phi(0.031\,exp_i + 0.174 + 0.632)\,\beta_{exp}$ for males. Thus, the marginal effect is not constant for males, but varies with experience.

(d) For females, the marginal effect is $\phi(0.031\,exp_i + 0.632)\,\beta_{exp}$.

(e) For a male and a female with the same years of experience, the marginal effects are different since the 0.174 is missing in (d). In other words, even holding experience constant, the marginal effect of an additional year of experience will vary by gender as long as the coefficient on gender is not zero.

4. Q

(a) The disturbance term can take only two discrete values, so the standard errors are invalid. The LPM can give nonsensical predicted probabilities greater than 1 or less than 0. It also imposes the unrealistic assumption that the marginal effects are constant.

(b) The standard errors are (asymptotically) valid because the model is fitted using MLE. The predicted probabilities are constrained to lie in the range 0 to 1. The assumption of constant marginal effects is relaxed.

(c) Since $p$ is a function of $Z$, and $Z$ is a linear function of the $X$ variables, the marginal effect of $X_j$ is

$$
\frac{\partial p}{\partial X_j} = \frac{dp}{dZ}\,\frac{\partial Z}{\partial X_j} = \frac{dp}{dZ}\,\beta_j
$$

where $\beta_j$ is the coefficient of $X_j$ in the expression for $Z$. In the case of probit analysis, $p = F(Z)$ is the standard normal CDF. Hence, $dp/dZ$ is just the standard normal PDF. For males, this is 0.368 when evaluated at the means. Hence, the marginal effect of AGE is -0.050 and that of S is 0.049. For females, the corresponding figures are -0.042 and 0.026, respectively. So for every extra year of age, the probability is reduced by 5.0 percentage points for males and 4.2 percentage points for females. For every extra year of schooling, the probability increases by 4.9 percentage points for males and 2.6 percentage points for females.

(d) Two possibilities:

i. Derive the covariance matrix of the estimated marginal effects for males and females using the delta method. Then perform a test of equality of the marginal effects.

ii. Fit a probit regression for the combined sample, adding a male intercept dummy and male slope dummies for AGE and S. Test the joint hypothesis that all coefficients involving the male dummy are equal to zero. Note that if any of the coefficients involving the male dummy are non-zero, then the marginal effects will vary by gender, since the marginal effects depend on all parameters in the model.

5. Q

(a) The marginal effect is given by

$$
\frac{\partial \Pr(y = 1 \mid z_1, z_2)}{\partial z_2} = (\gamma_1 + 2\gamma_2 z_2)\,\phi(z_1\beta + \gamma_1 z_2 + \gamma_2 z_2^2)
$$

with the corresponding estimator given by replacing the parameters with the probit estimates.

(b) The marginal effect of $z_2$ is given by

$$
\frac{\partial \Pr(y = 1 \mid z_1, z_2, d)}{\partial z_2} = (\gamma_1 + \gamma_3 d)\,\phi(z_1\beta + \gamma_1 z_2 + \gamma_2 d + \gamma_3 z_2 d)
$$

with the corresponding estimator given by replacing the parameters with the probit estimates. The partial effect of $d$ is given by

$$
\Pr(y = 1 \mid z_1, z_2, d = 1) - \Pr(y = 1 \mid z_1, z_2, d = 0) = \Phi(z_1\beta + (\gamma_1 + \gamma_3) z_2 + \gamma_2) - \Phi(z_1\beta + \gamma_1 z_2)
$$

with the corresponding estimator given by replacing the parameters with the probit estimates.

6. The Poisson density must be adjusted to represent the density of $y$ conditional on $y > 0$. This is accomplished by dividing the density by $\Pr(y > 0)$. Formally,

$$
f(y \mid y > 0) = \frac{f(y, y > 0)}{\Pr(y > 0)} = \frac{f(y)}{\Pr(y > 0)} = \frac{\exp\{-\lambda\}\lambda^{y}/y!}{1 - \Pr(y = 0)} = \frac{\exp\{-\lambda\}\lambda^{y}/y!}{1 - \exp\{-\lambda\}}
$$

The log-likelihood is given by

$$
\ln[L(\theta)] = \sum_i \left\{ -\lambda + y_i \ln\lambda - \ln(y_i!) - \ln[1 - \exp\{-\lambda\}] \right\}
= -N\left\{\lambda + \ln[1 - \exp\{-\lambda\}]\right\} + \ln\lambda \sum_i y_i - \sum_i \ln(y_i!)
$$

7. Q

(a) The log-likelihood function is

$$
\ln[L(\theta)] = \sum_t \ln[\Pr(P_t \mid x_t, \theta)] = \sum_t \left[ -\lambda_t + P_t \ln\lambda_t - \ln(P_t!) \right]
$$
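The Poisson log-likelihood in 7(a) and the zero-truncated version in Question 6 differ only by the $-\ln[1 - \exp\{-\lambda\}]$ term per observation. The Python sketch below (standard library only) illustrates this for a common mean $\lambda$; the data, the value of $\lambda$, and the function names are hypothetical and chosen only for illustration.

```python
import math


def poisson_loglik(y, lam):
    """Sum over i of ln f(y_i) = -lam + y_i*ln(lam) - ln(y_i!)."""
    return sum(-lam + yi * math.log(lam) - math.lgamma(yi + 1) for yi in y)


def truncated_poisson_loglik(y, lam):
    """Log-likelihood conditional on y > 0.

    Each observation's Poisson density is divided by
    Pr(y > 0) = 1 - exp(-lam), so ln(1 - exp(-lam)) is subtracted N times.
    """
    if any(yi <= 0 for yi in y):
        raise ValueError("truncated sample must contain only positive counts")
    return poisson_loglik(y, lam) - len(y) * math.log(1.0 - math.exp(-lam))


def truncated_closed_form(y, lam):
    """Collected form: -N{lam + ln[1 - exp(-lam)]} + ln(lam)*sum(y) - sum ln(y!)."""
    n = len(y)
    return (-n * (lam + math.log(1.0 - math.exp(-lam)))
            + math.log(lam) * sum(y)
            - sum(math.lgamma(yi + 1) for yi in y))


y_sample = [1, 3, 2, 1, 4]   # hypothetical positive counts
lam_value = 2.0              # hypothetical Poisson mean

ll_plain = poisson_loglik(y_sample, lam_value)
ll_trunc = truncated_poisson_loglik(y_sample, lam_value)
print(ll_plain, ll_trunc)
```

Because $\Pr(y > 0) < 1$, the truncated log-likelihood is always larger than the untruncated one on the same positive sample; in an actual application $\lambda$ would be replaced by $\exp\{x_i\beta\}$ and maximized over $\beta$.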

(b) Two concerns are

i. over- or under-dispersion (i.e., $\mathrm{Var}(y \mid x) \neq E[y \mid x]$), and

ii. that zeros may be prevalent in the data, such that we would want to model the effect of going from zero to positive values as different from changes among positive values.

(i) may be tested using the regression-based test from class. Specifically,

$$
H_0: \mathrm{Var}(y_i) = E[y_i] \qquad H_1: \mathrm{Var}(y_i) = E[y_i] + \alpha\, g(E[y_i])
$$

which can be tested by regressing

$$
z_t = \frac{(P_t - \hat\lambda_t)^2 - P_t}{\hat\lambda_t \sqrt{2}}
$$

on either (i) a constant or (ii) $\hat\lambda_t$ and no constant. The t-statistic is a test of the null hypothesis.

(ii) may be assessed simply by looking at the frequency of zeros in the data. If the proportion of zeros is large, then one may wish to use a zero-inflated Poisson model.

(c) Each coefficient gives the proportional change in the expected number of patents from a unit change in the corresponding variable (approximately a $100\beta_k$ percent change).

(d) Here, we need to compute the marginal effect of an additional research employee, and then multiply this by ten. Formally,

$$
10\,\frac{\partial E[P_t \mid x_t]}{\partial R_t} = 10\,\hat\lambda_t \hat\beta_2 = \exp\{\hat\beta_1 RD_t + \hat\beta_2 R_t + \hat\beta_3 N_t\}\,(10\hat\beta_2)
$$

Thus, the expected effect depends on the values of RD, R, and N. There are three common ways to quantify the expected effect for the firm:

i. calculate the value at the sample mean, $\overline{RD}$, $\bar R$, and $\bar N$;

ii. calculate the value for each year in the sample, and then report the average

$$
\frac{1}{T}\sum_{t=1}^{T} \exp\{\hat\beta_1 RD_t + \hat\beta_2 R_t + \hat\beta_3 N_t\}\,(10\hat\beta_2)
$$

iii. calculate the value at some particular values of the variables that seem reasonable, say $RD^{0}$, $R^{0}$, and $N^{0}$:

$$
\exp\{\hat\beta_1 RD^{0} + \hat\beta_2 R^{0} + \hat\beta_3 N^{0}\}\,(10\hat\beta_2)
$$
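The three ways of quantifying the effect in 7(d) can be compared side by side. This is a minimal Python sketch (standard library only): the coefficient values in `beta_hat`, the three yearly observations in `years`, and the chosen evaluation point are all hypothetical, not estimates from the problem set.

```python
import math


def effect_of_ten_researchers(rd, r, n, beta):
    """10 * dE[P|x]/dR = exp(b1*RD + b2*R + b3*N) * (10 * b2)."""
    b1, b2, b3 = beta
    return math.exp(b1 * rd + b2 * r + b3 * n) * (10.0 * b2)


# Hypothetical Poisson coefficients on (RD, R, N)
beta_hat = (0.02, 0.01, 0.005)
# Hypothetical yearly observations (RD_t, R_t, N_t)
years = [(10.0, 20.0, 50.0), (12.0, 25.0, 55.0), (15.0, 30.0, 60.0)]
T = len(years)

# (i) effect evaluated at the sample means of RD, R, N
means = tuple(sum(col) / T for col in zip(*years))
at_mean = effect_of_ten_researchers(*means, beta_hat)

# (ii) effect evaluated year by year, then averaged
average = sum(effect_of_ten_researchers(rd, r, n, beta_hat)
              for rd, r, n in years) / T

# (iii) effect at one chosen point that seems reasonable
at_chosen = effect_of_ten_researchers(12.0, 25.0, 55.0, beta_hat)

print(at_mean, average, at_chosen)
```

Since the exponential function is convex, the average of the yearly effects in (ii) is never smaller than the effect at the sample mean in (i) when $\hat\beta_2 > 0$, which is one reason the two summaries differ in finite samples.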

ps2 - Printed on 9/20/20 8:4:06 AM

cap log cl
clear all
set mem 50m
set more off
cd c:\eco6375\ps

log using ps2log, replace

use "C:\eco7377\ps\ejdta", clear
* binary outcome: wife participates in the labor force
g lfpw=laborw>0
probit lfpw white educw agew

**obtain obs-specific marginal effects
predict xb, xb
g me_educw=normalden(xb)*_b[educw]
* xb1 = index with white set to 1, xb0 = index with white set to 0
g xb1=xb if white==1
replace xb1=xb+_b[white] if white==0
g xb0=xb if white==0
replace xb0=xb-_b[white] if white==1
g me_white=normal(xb1)-normal(xb0)
**get sample average of obs-specific marginal effects
su me_*

* i.white declares white as a factor variable so that -margins white- is allowed
probit lfpw i.white educw agew
margins white
margins, dydx(educw agew)
dprobit lfpw white educw agew

**are these commands getting marginal effects at sample average? compute and compare
foreach v of varlist white educw agew {
    su `v'
    loc m`v'=r(mean)
}
qui probit lfpw white educw agew
* index evaluated at the sample means of the covariates
g xbm=_b[_cons]+`mwhite'*_b[white]+`meducw'*_b[educw]+`magew'*_b[agew]
di "Marginal Effect for educw computed at sample mean = " %12.7f normalden(xbm)*_b[educw]
di "Marginal Effect for white computed at sample mean = " %12.7f normal(xbm+(1-`mwhite')*_b[white])-normal(xbm-`mwhite'*_b[white])

log cl
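The do file's comparison between the sample average of observation-specific marginal effects and the marginal effects at the sample mean can be mirrored outside Stata. The following Python sketch (standard library only) reproduces the same arithmetic; the probit coefficients in `b` and the five observations in `data` are made up for illustration and are not the course data or estimates.

```python
import math


def phi(z):
    """Standard normal PDF (Stata's normalden)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF (Stata's normal)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical probit coefficients and observations (white, educw, agew)
b = {"cons": -1.0, "white": -0.2, "educw": 0.1, "agew": -0.01}
data = [(1, 12, 35), (0, 16, 42), (1, 10, 50), (0, 14, 28), (1, 18, 45)]
n_obs = len(data)


def index(white, educw, agew):
    """Linear index x'b."""
    return b["cons"] + b["white"] * white + b["educw"] * educw + b["agew"] * agew


# Sample averages of the observation-specific effects (what -margins- reports)
ame_educw = sum(phi(index(w, e, a)) * b["educw"] for w, e, a in data) / n_obs
ame_white = sum(Phi(index(1, e, a)) - Phi(index(0, e, a)) for w, e, a in data) / n_obs

# Effects at the sample mean of the covariates (what -dprobit- reports)
mean_white, mean_educw, mean_agew = (sum(col) / n_obs for col in zip(*data))
xbm = index(mean_white, mean_educw, mean_agew)
mem_educw = phi(xbm) * b["educw"]
mem_white = Phi(xbm + (1 - mean_white) * b["white"]) - Phi(xbm - mean_white * b["white"])

print(ame_educw, mem_educw)
print(ame_white, mem_white)
```

In this toy example the two versions of each effect are close but not identical, which is the pattern the answer to 1(b) describes.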

      name:  <unnamed>
       log:  c:\eco6375\ps\ps2log
  log type:  text
 opened on:  20 Sep 20, 08:39:43

. use "C:\eco7377\ps\ejdta", clear
. g lfpw=laborw>0
. probit lfpw white educw agew

Iteration 0:   log likelihood =
Iteration 1:   log likelihood =
Iteration 2:   log likelihood =
Iteration 3:   log likelihood =

Probit regression                 Number of obs   = 1485
                                  LR chi2(3)      = 6730
                                  Prob > chi2     =
Log likelihood =                  Pseudo R2       =

        lfpw |   Coef.   Std. Err.   z   P>|z|   [95% Conf. Interval]
       white |
       educw |
        agew |
       _cons |

. **obtain obs-specific marginal effects
. predict xb, xb
. g me_educw=normalden(xb)*_b[educw]
. g xb1=xb if white==1
(496 missing values generated)
. replace xb1=xb+_b[white] if white==0
(496 real changes made)
. g xb0=xb if white==0
(989 missing values generated)
. replace xb0=xb-_b[white] if white==1
(989 real changes made)
. g me_white=normal(xb1)-normal(xb0)
. **get sample average of obs-specific marginal effects
. su me_*

    Variable |   Obs   Mean   Std. Dev.   Min   Max
    me_educw |
    me_white |

. probit lfpw i.white educw agew

Iteration 0:   log likelihood =
Iteration 1:   log likelihood =
Iteration 2:   log likelihood =
Iteration 3:   log likelihood =

Probit regression                 Number of obs   = 1485
                                  LR chi2(3)      = 6730
                                  Prob > chi2     =
Log likelihood =                  Pseudo R2       =

        lfpw |   Coef.   Std. Err.   z   P>|z|   [95% Conf. Interval]
       white |
       educw |
        agew |
       _cons |

. margins white

Predictive margins                Number of obs   = 1485
Model VCE    : OIM
Expression   : Pr(lfpw), predict()

             |   Delta-method
             |   Margin   Std. Err.   z   P>|z|   [95% Conf. Interval]
       white |

. margins, dydx(educw agew)

Average marginal effects          Number of obs   = 1485
Model VCE    : OIM
Expression   : Pr(lfpw), predict()
dy/dx w.r.t. : educw agew

             |   Delta-method
             |   dy/dx   Std. Err.   z   P>|z|   [95% Conf. Interval]
       educw |
        agew |

. dprobit lfpw white educw agew

Iteration 0:   log likelihood =
Iteration 1:   log likelihood =
Iteration 2:   log likelihood =

Probit regression, reporting marginal effects
                                  Number of obs   = 1485
                                  LR chi2(3)      = 6730
                                  Prob > chi2     =
Log likelihood =                  Pseudo R2       =

        lfpw |   dF/dx   Std. Err.   z   P>|z|   x-bar   [ 95% C.I. ]
      white* |
       educw |
        agew |
      obs. P |
     pred. P |   (at x-bar)

(*) dF/dx is for discrete change of dummy variable from 0 to 1
    z and P>|z| correspond to the test of the underlying coefficient being 0

. **are these commands getting marginal effects at sample average? compute and compare
. foreach v of varlist white educw agew {
  2.     su `v'
  3.     loc m`v'=r(mean)
  4. }

    Variable |   Obs   Mean   Std. Dev.   Min   Max
       white |

    Variable |   Obs   Mean   Std. Dev.   Min   Max
       educw |

    Variable |   Obs   Mean   Std. Dev.   Min   Max
        agew |

. qui probit lfpw white educw agew
. g xbm=_b[_cons]+`mwhite'*_b[white]+`meducw'*_b[educw]+`magew'*_b[agew]
. di "Marginal Effect for educw computed at sample mean = " %12.7f normalden(xbm)*_b[educw]
Marginal Effect for educw computed at sample mean =
. di "Marginal Effect for white computed at sample mean = " %12.7f normal(xbm+(1-`mwhite')*_b[white])-normal(xbm-`mwhite'*_b[white])
Marginal Effect for white computed at sample mean =

. log cl
      name:  <unnamed>
       log:  c:\eco6375\ps\ps2log
  log type:  text
 closed on:  20 Sep 20, 08:39:
