Instrumental Variable Regression

1 Topic 6 Instrumental Variable Regression ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà

2 Outline of this topic
- Randomized experiments, natural experiments and causation
- Instrumental variables: causation versus correlation
- Endogeneity bias
- Derivation of IV with GMM
- Properties of IV
- Tests

3 Experiments and quasi-experiments The ideal environment is one in which we can control an experiment and randomly assign a treatment across individual units. Think of a pharmaceutical trial where you could control every single variable in your patients' lives and randomly assign a treatment and a placebo. Experiments are expensive and, in economics, usually rare, so we often have to rely on quasi-experiments or natural experiments: situations where something similar to a random treatment assignment took place.

4 Puerto Rico and Hurricane Betsy Puerto Rico and the effects of class size on earnings: in 1956 a hurricane (Betsy) traversed the island of Puerto Rico. There were not many casualties but much infrastructure damage. As a result, children along the hurricane's path had to attend neighboring schools, effectively doubling class sizes in some districts but not others. The hurricane effectively assigned at random which school districts had their class sizes doubled.

5 Program evaluation
In statistics/econometrics this refers to studies designed to assess the effects of public policy: e.g. class size reduction, anti-smoking campaigns, etc. Three types of experiments:
1. Clinical drug trial: does a proposed drug lower cholesterol?
   Y = cholesterol level
   X = treatment or control group (or dose of drug)
2. Job training program (Job Training Partnership Act)
   Y = has a job, or not (or Y = wage income)
   X = went through experimental program, or not
3. Class size effect (Tennessee class size experiment)
   Y = test score (Stanford Achievement Test)
   X = class size treatment group (regular, regular + aide, small)

6 Causality
An ideal randomized controlled experiment assigns subjects to treatment and control groups. More generally, the treatment level X is randomly assigned:
$$y_i = \beta x_i + \epsilon_i$$
Random assignment ensures that $x_i$ and $\epsilon_i$ are independent and hence $E(\epsilon_i \mid x_i) = 0$, which is what we need to ensure $\hat\beta$ is unbiased. Suppose $x_i$ is a binary variable (i.e. 1 for treatment, 0 for control); then

7 Treatment Effect
the average treatment effect is simply:
$$ATE = E(y_i \mid x_i = 1) - E(y_i \mid x_i = 0); \qquad \widehat{ATE} = \bar y_{x=1} - \bar y_{x=0}$$
This is sometimes called the differences estimator. In practice, things can fail for several reasons:
- Omitted variable bias: we omit important explanatory variables correlated with x.
- Errors-in-variables bias.
- Simultaneous causality bias.
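The differences estimator can be illustrated with a small simulation. This is only a sketch under assumed values: the true effect `beta = 2.0`, the sample size and the standard normal errors are hypothetical choices, not taken from the slides.

```python
import random
from statistics import mean

def differences_estimator(y, x):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated = [yi for yi, xi in zip(y, x) if xi == 1]
    control = [yi for yi, xi in zip(y, x) if xi == 0]
    return mean(treated) - mean(control)

rng = random.Random(0)
beta = 2.0                                       # hypothetical true treatment effect
x = [rng.randint(0, 1) for _ in range(20000)]    # random assignment to treatment (1) or control (0)
y = [beta * xi + rng.gauss(0, 1) for xi in x]    # y_i = beta * x_i + eps_i
ate_hat = differences_estimator(y, x)
print(round(ate_hat, 3))
```

Because assignment is random, the error is independent of `x` and the estimate should land close to the assumed `beta`.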

8 Omitted Variables Bias
Suppose the data are generated by the model:
$$y = X\beta + W\delta + \epsilon$$
But you estimate the model:
$$y = X\beta + u$$
Then:
$$\hat\beta = (X'X)^{-1}(X'y) = (X'X)^{-1}X'(X\beta + W\delta + \epsilon) = \beta + (X'X)^{-1}(X'W)\delta + (X'X)^{-1}(X'\epsilon)$$
So the bias depends on the correlation between X and W and on $\delta$. Notice that $u = W\delta + \epsilon$, so in general $E(u \mid X) = E(W\delta \mid X) \neq 0$.
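A scalar version of the bias formula can be checked numerically. The sketch below assumes one regressor `x`, one omitted variable `w`, and hypothetical parameter values; in that case the bias term reduces to `delta * cov(x, w) / var(x)`.

```python
import random

def cov(a, b):
    """Sample covariance (divides by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(1)
n = 20000
beta, delta = 1.0, 2.0                               # hypothetical true coefficients
w = [rng.gauss(0, 1) for _ in range(n)]              # omitted variable
x = [0.5 * wi + rng.gauss(0, 1) for wi in w]         # regressor correlated with w
y = [beta * xi + delta * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]

short_slope = cov(x, y) / cov(x, x)                  # OLS of y on x alone
predicted_bias = delta * cov(x, w) / cov(x, x)       # scalar analog of (X'X)^{-1} X'W delta
print(round(short_slope, 3), round(beta + predicted_bias, 3))
```

With these assumed values the population bias is 2 × 0.5 / 1.25 = 0.8, so the short regression slope should be near 1.8 rather than 1.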

9 Errors-in-variables
Suppose the data are generated by:
$$y_i^0 = \beta_1 + \beta_2 x_i^0 + \epsilon_i^0$$
However, we observe our variables with error, e.g.:
$$x_i = x_i^0 + v_{xi} \quad \text{and} \quad y_i = y_i^0 + v_{yi}$$
For simplicity (although perhaps not realistically) assume the errors are i.i.d. $D(0, \sigma_j^2)$, $j = x, y$. Substituting the observed y, x into the original regression, we have

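10 Errors-in-Variables
$$y_i = \beta_1 + \beta_2 (x_i - v_{xi}) + \epsilon_i^0 + v_{yi} = \beta_1 + \beta_2 x_i + \underbrace{\{\epsilon_i^0 + v_{yi} - \beta_2 v_{xi}\}}_{u_i}$$
Measurement error clearly increases the error variance, since:
$$V(u_i) = \sigma_\epsilon^2 + \sigma_y^2 + \beta_2^2 \sigma_x^2$$
But this is a minor problem compared to:
$$E(u_i \mid x_i) = E(\epsilon_i^0 + v_{yi} - \beta_2 v_{xi} \mid x_i^0 + v_{xi}) = -\beta_2 E(v_{xi} \mid x_i) \neq 0$$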
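The consequence of the nonzero conditional mean is the familiar attenuation of the OLS slope. A minimal simulation sketch, assuming normal errors and hypothetical values ($\beta_2 = 2$, unit variances): with classical measurement error the slope converges to $\beta_2 \sigma_{x^0}^2 / (\sigma_{x^0}^2 + \sigma_x^2)$, a standard result that the slides do not state explicitly.

```python
import random

def cov(a, b):
    """Sample covariance (divides by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(5)
n = 20000
beta1, beta2 = 1.0, 2.0                                   # hypothetical true parameters
x0 = [rng.gauss(0, 1) for _ in range(n)]                  # true regressor
y0 = [beta1 + beta2 * xi + rng.gauss(0, 1) for xi in x0]  # true outcome
x = [xi + rng.gauss(0, 1) for xi in x0]                   # observed x = x0 + v_x
y = [yi + rng.gauss(0, 1) for yi in y0]                   # observed y = y0 + v_y

slope_true = cov(x0, y0) / cov(x0, x0)   # OLS on error-free data
slope_obs = cov(x, y) / cov(x, x)        # OLS on mismeasured data
print(round(slope_true, 3), round(slope_obs, 3))
```

Under these assumptions the observed-data slope should be near 2 × 1/(1 + 1) = 1, half the true value, while the error in y only adds noise.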

11 Simultaneous Causality
Classic example: a regression of quantity of wheat (Q) on the price of wheat (P). Suppose you want to estimate the elasticity of demand for wheat:
$$\ln(q_i) = \beta_0 + \beta_1 \ln(p_i) + u_i$$
In principle $\hat\beta_1$ would be an estimate of this elasticity (why?). In practice, it is not. To see this, suppose we have data on quantities and prices for different years.

12 Simultaneous Causality Observations of quantities and prices over time:

13 Simultaneous Causality The interaction of supply and demand generates the following scatter of points

14 Simultaneous Causality Now suppose that I use the variable rainfall to isolate shifts in the supply curve out of this cloud of points (assuming that rainfall does nothing to change the demand for wheat):

15 What is the common thread?
In all of these situations, the regression parameters are estimated with bias. The source of the bias is that the errors of the regression that we actually estimate are correlated with the regressors: $E(X_i \epsilon_i) \neq 0$.
Omitted variable bias: the regressors are correlated with the omitted variables. In the regression of test scores on class size, omitting parental income will bias the effect of class size (our policy variable of interest), since richer districts tend to have smaller classes and richer parents have more resources (unmeasured) to help their children do well in school.

16 The common thread (cont.)
Errors-in-variables: we saw that even in the benign case that the measurement error is random and well-behaved we had bias. Of course, this only gets worse if the measurement error is not random (e.g. the size of the measurement error is related to the value the regressor takes).
Simultaneous causality: perhaps the more fundamental source of bias (often called endogeneity bias). Many variables are jointly determined, so some work needs to be done to filter out the direction of causality that we want to measure.

17 The Solution: Instrumental Variables
Instrumental variables: $Z_i$ is an $l \times 1$ vector such that $E(Z_i \epsilon_i) = 0$ and $E(Z_i X_i') \neq 0$, with $l \geq k$.
Remarks:
- Not all elements of $X_i$ need be endogenous. Those that are exogenous can be used as elements of $Z_i$. In fact, if all the $X_i$ were exogenous, we would recover the usual moment condition we used for the method of moments derivation of OLS.
- If $l = k$ we say the model is just identified; otherwise we say the model is overidentified.

18 Two Stage Least Squares
So the instruments are correlated with that part of the regressors that is uncorrelated with the error term. We could get at that part by regressing the regressors on the instruments:
$$\underset{n \times k}{X} = \underset{n \times l}{Z}\,\underset{l \times k}{\Gamma} + \underset{n \times k}{U}$$
$$\hat X = Z\hat\Gamma = Z(Z'Z)^{-1}Z'X = P_z X, \quad \text{with } P_z \text{ idempotent}$$
Next, plug this expression for X into the linear regression.

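19 TSLS (continued)
$$Y = X\beta + \epsilon = (Z\Gamma + U)\beta + \epsilon = Z\Gamma\beta + (U\beta + \epsilon) = Z\lambda + V$$
Notice that $E(Z'V) = E(Z'U)\beta + E(Z'\epsilon) = 0$, so
$$\hat\lambda = (Z'Z)^{-1}Z'Y \quad \text{and, from before,} \quad \hat\Gamma = (Z'Z)^{-1}Z'X$$
Let's think of the just-identified case, $l = k$, i.e.
$$\underset{k \times 1}{\lambda} = \underset{k \times k}{\Gamma}\,\underset{k \times 1}{\beta} \;\Rightarrow\; \beta = \Gamma^{-1}\lambda$$
$$\hat\beta_{IV} = \left[(Z'Z)^{-1}Z'X\right]^{-1}(Z'Z)^{-1}Z'Y = (Z'X)^{-1}(Z'Y)$$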

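20 TSLS
Now using method of moments:
$$E(Z'\epsilon) = 0 \;\Rightarrow\; E(Z'(Y - X\beta)) = 0 \;\Rightarrow\; E(Z'Y) = E(Z'X)\beta$$
And the analogy principle:
$$\hat\beta_{IV} = \left(\frac{Z'X}{n}\right)^{-1}\left(\frac{Z'Y}{n}\right) = (Z'X)^{-1}(Z'Y)$$
However, what happens when we have overidentification, i.e. $l > k$? Then $\hat\Gamma$ is $l \times k$ and is not directly invertible.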
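The just-identified IV formula can be tried on the wheat example, with rainfall as the instrument. The sketch below is a simulation under assumed values (demand slope -1, supply slope +1, unit-variance shocks, all hypothetical); with an intercept and one instrument, $(Z'X)^{-1}(Z'Y)$ reduces to the ratio of sample covariances used here.

```python
import random

def cov(a, b):
    """Sample covariance (divides by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(42)
n = 20000
beta_d, gamma_s, theta = -1.0, 1.0, 1.0      # hypothetical demand slope, supply slope, rainfall effect
rain = [rng.gauss(0, 1) for _ in range(n)]   # instrument: shifts supply only
u = [rng.gauss(0, 1) for _ in range(n)]      # demand shock
v = [rng.gauss(0, 1) for _ in range(n)]      # supply shock

# Market equilibrium: beta_d * ln_p + u = gamma_s * ln_p + theta * rain + v
ln_p = [(theta * r + vi - ui) / (beta_d - gamma_s) for r, ui, vi in zip(rain, u, v)]
ln_q = [beta_d * p + ui for p, ui in zip(ln_p, u)]

ols = cov(ln_p, ln_q) / cov(ln_p, ln_p)      # mixes supply and demand
iv = cov(rain, ln_q) / cov(rain, ln_p)       # uses only rainfall-driven price variation
print(round(ols, 3), round(iv, 3))
```

In this design OLS converges to about -1/3, while the IV ratio recovers the assumed demand elasticity of -1.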

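21 TSLS (cont.)
However, notice
$$\hat\lambda = \hat\Gamma\hat\beta \;\Rightarrow\; (Z'Z)^{-1}Z'Y = (Z'Z)^{-1}(Z'X)\hat\beta$$
Pre-multiplying both sides by $X'Z$ (which is $k \times l$):
$$\underbrace{X'Z(Z'Z)^{-1}Z'Y}_{k \times 1} = \underbrace{X'Z(Z'Z)^{-1}(Z'X)}_{k \times k}\,\underbrace{\hat\beta}_{k \times 1}$$
hence
$$\hat\beta = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1} X'Z(Z'Z)^{-1}(Z'Y)$$
$$\hat\beta_{IV} = (X'P_z X)^{-1}(X'P_z Y)$$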

22 TSLS: The General Case
Let's go back. First stage:
$$X = Z\Gamma + U, \qquad \hat X = Z\hat\Gamma = Z(Z'Z)^{-1}Z'X = P_z X$$
Second stage: do OLS on the auxiliary regression $Y = \hat X\beta + V$:
$$\hat\beta_{IV} = (\hat X'\hat X)^{-1}\hat X'Y = (X'P_z X)^{-1}(X'P_z Y)$$
since $P_z P_z = P_z$.

23 Remarks When doing TSLS, note that the standard errors of the second-stage regression (the one where you substitute X with $\hat X$) are incorrect. The reason is that the usual OLS formulas in the second stage do not take into account the estimation error in $\hat X$. As we will see in more detail, the same issues with heteroscedasticity can arise in instrumental variable regression as well.
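The remark can be made concrete in the scalar, no-intercept case. This is a sketch with a hypothetical data-generating process; it assumes homoscedastic errors, so the variance is $\sigma^2 (X'P_z X)^{-1}$, and compares the residual variance from the literal second stage (built from $\hat X$) with the one built from the actual X.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

rng = random.Random(7)
n = 20000
beta = 2.0                                               # hypothetical true coefficient
z = [rng.gauss(0, 1) for _ in range(n)]                  # instrument
e = [rng.gauss(0, 1) for _ in range(n)]                  # shared shock that makes x endogenous
x = [0.5 * zi + ei + rng.gauss(0, 1) for zi, ei in zip(z, e)]
y = [beta * xi + ei + rng.gauss(0, 1) for xi, ei in zip(x, e)]

# First stage: regress x on z and form fitted values
gamma_hat = dot(z, x) / dot(z, z)
x_hat = [gamma_hat * zi for zi in z]
# Second stage: regress y on the fitted values
beta_tsls = dot(x_hat, y) / dot(x_hat, x_hat)
beta_iv = dot(z, y) / dot(z, x)                          # (Z'X)^{-1} Z'Y, identical when l = k

# Residuals: the second-stage OLS output uses x_hat, the correct formula uses x
resid_naive = [yi - beta_tsls * xh for yi, xh in zip(y, x_hat)]
resid_correct = [yi - beta_tsls * xi for yi, xi in zip(y, x)]
se_naive = (dot(resid_naive, resid_naive) / n / dot(x_hat, x_hat)) ** 0.5
se_correct = (dot(resid_correct, resid_correct) / n / dot(x_hat, x_hat)) ** 0.5
print(round(beta_tsls, 3), round(se_naive, 4), round(se_correct, 4))
```

The two point estimates coincide, but in this design the naive standard error is far too large; in other designs it could be too small, which is why a single IV command is preferable to running the two stages by hand.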

24 Example: Demand for Cigarettes
We want to estimate:
$$\ln(q_i^d) = \beta_0 + \beta_1 \ln(p_i) + \epsilon_i$$
Data: a panel of observations across states and time
- Annual cigarette consumption and average prices paid (including tax)
- 48 continental US states
Proposed instrumental variable: $Z_i$ = general sales tax per pack in the state = SalesTax$_i$
Is this a valid instrument? (1) Relevant? (2) Exogenous?

25 STATA example: First Stage
Instrument = Z = rtaxso = general sales tax (real $/pack)
Regress X on Z:
. reg lravgprs rtaxso if year==1995, r;
Regression with robust standard errors; Number of obs = 48, F(1, 46)
[Output table: rows rtaxso, _cons; numeric values not preserved]
Compute X-hat:
. predict lravphat;
Now we have the predicted values from the first stage.

26 Second Stage
Regress Y on X-hat:
. reg lpackpc lravphat if year==1995, r;
Regression with robust standard errors; Number of obs = 48, F(1, 46)
[Output table: rows lravphat, _cons; numeric values not preserved]
These coefficients are the TSLS estimates. The standard errors are wrong because they ignore the fact that the first stage was estimated.

27 Using STATA's built-in ivreg command
Y, X and Z in one command:
. ivreg lpackpc (lravgprs = rtaxso) if year==1995, r;
IV (2SLS) regression with robust standard errors; Number of obs = 48, F(1, 46)
[Output table: rows lravgprs, _cons; numeric values not preserved]
Instrumented: lravgprs (this is the endogenous regressor)
Instruments: rtaxso (this is the instrumental variable)
OK, the change in the SEs was small this time... but not always!
$$\ln(\hat q_i^d) = \underset{(1.53)}{9.72} - \underset{(0.32)}{1.08}\,\ln(\hat p_i), \quad n = 48$$

28 Elaborating on the Example
$$\ln(Q_i^{cigarettes}) = \beta_0 + \beta_1 \ln(P_i^{cigarettes}) + \beta_2 \ln(Income_i) + u_i$$
$Z_{1i}$ = general sales tax$_i$
$Z_{2i}$ = cigarette-specific tax$_i$
- Endogenous variable: $\ln(P_i^{cigarettes})$ ("one X")
- Included exogenous variable: $\ln(Income_i)$ ("one W")
- Instruments (excluded exogenous variables): general sales tax, cigarette-specific tax ("two Zs")
Is the demand elasticity $\beta_1$ overidentified, exactly identified, or underidentified?

29 Cigarette Demand: One Instrument
Y, W, X and Z:
. ivreg lpackpc lperinc (lravgprs = rtaxso) if year==1995, r;
IV (2SLS) regression with robust standard errors; Number of obs = 48, F(2, 45) = 8.19
[Output table: rows lravgprs, lperinc, _cons; numeric values not preserved]
Instrumented: lravgprs
Instruments: lperinc rtaxso
- STATA lists ALL the exogenous regressors as instruments: slightly different terminology than we have been using
- Running IV as a single command yields correct SEs
- Use ", r" for heteroskedasticity-robust SEs

30 Cigarette Demand: Two Instruments
Y, W, X, Z1 and Z2:
. ivreg lpackpc lperinc (lravgprs = rtaxso rtax) if year==1995, r;
IV (2SLS) regression with robust standard errors; Number of obs = 48, F(2, 45)
[Output table: rows lravgprs, lperinc, _cons; numeric values not preserved]
Instrumented: lravgprs
Instruments: lperinc rtaxso rtax
STATA lists ALL the exogenous regressors as instruments: slightly different terminology than we have been using

31 IV as Generalized Method of Moments
Rather than examining the properties of the just-identified or the overidentified TSLS-IV estimator, it is easier to examine IV using GMM.
Let $Z_i$ and $X_i$ be $1 \times l$ and $1 \times k$ vectors respectively, and suppose $E(Z_i'\epsilon_i) = 0$. Hence define:
$$g_i(y_i, X_i, Z_i, \beta) = Z_i'(y_i - X_i\beta)$$
From the population moment condition, the sample analog is
$$\frac{1}{n}\sum_{i=1}^{n} g_i(y_i, X_i, Z_i, \beta) = g_n(\beta) = \frac{Z'(Y - X\beta)}{n} = 0$$

32 GMM
The generic GMM objective function is:
$$\hat Q_n(\beta) = g_n(\beta)'\,\hat W\,g_n(\beta)$$
for some $l \times l$ weighting matrix $\hat W \overset{p}{\to} W$. Notice that $\hat G_n = Z'X$. So if I apply directly (a big if, since I have not checked the assumptions) the results from extremum estimators,
$$\sqrt{n}(\hat\beta - \beta_0) \overset{d}{\to} N\left(0,\; (G'WG)^{-1}\,G'W\Omega WG\,(G'WG)^{-1}\right)$$
where $\Omega = E(Z_i'Z_i\epsilon_i^2)$. If homoscedastic, $\Omega = \sigma^2 (Z'Z)$.

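33 GMM in detail
$$\min_\beta \left(\frac{Z'(Y - X\beta)}{n}\right)' \hat W \left(\frac{Z'(Y - X\beta)}{n}\right)$$
First order conditions:
$$-2\left(\frac{Z'X}{n}\right)' \hat W \left(\frac{Z'(Y - X\hat\beta)}{n}\right) = 0$$
$$X'Z\hat W Z'Y = X'Z\hat W Z'X\,\hat\beta$$
$$\hat\beta_{GMM} = (X'Z\hat W Z'X)^{-1} X'Z\hat W Z'Y$$
Recall:
$$\hat\beta_{IV} = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1} X'Z(Z'Z)^{-1}(Z'Y)$$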

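34 GMM (cont.)
Intriguing, so if I choose $\hat W = (Z'Z)^{-1}$ then $\hat\beta_{GMM} = \hat\beta_{IV}$. In fact, since $\hat G = G = Z'X$, under homoscedasticity $\Omega = \sigma^2 (Z'Z)$ and hence
$$\sqrt{n}(\hat\beta - \beta_0) \overset{d}{\to} N\left(0,\; (G'WG)^{-1}\,G'W\Omega WG\,(G'WG)^{-1}\right)$$
$$\sqrt{n}(\hat\beta - \beta_0) \overset{d}{\to} N\left(0,\; \sigma^2 \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1} X'Z(Z'Z)^{-1}(Z'Z)(Z'Z)^{-1}Z'X \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1}\right)$$
$$\sqrt{n}(\hat\beta - \beta_0) \overset{d}{\to} N\left(0,\; \sigma^2 \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1}\right)$$
$$\sqrt{n}(\hat\beta - \beta_0) \overset{d}{\to} N\left(0,\; \sigma^2 (X'P_z X)^{-1}\right)$$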

35 GMM (cont.)
Ask yourself, why (and when) would choosing $\hat W = (Z'Z)^{-1}$ be optimal? In general, we know that the optimal choice of weighting matrix is
$$\hat W = \left(\frac{1}{n}\sum_{i=1}^{n} g_i(y_i, X_i, Z_i, \hat\beta)\, g_i(y_i, X_i, Z_i, \hat\beta)'\right)^{-1}$$
$$\hat W \overset{p}{\to} W = \left[E(g_i(\beta) g_i(\beta)')\right]^{-1} = \left[E(Z_i'Z_i\epsilon_i^2)\right]^{-1}$$

36 A Recap: A Lot to Digest
Take GMM as the primitive way of thinking about the instrumental variable problem and TSLS as a special case. Under just-identification (and linearity in the regression), things are simple: the choice of weighting matrix is irrelevant because we have the same number of moments and parameters (unique solution). Hence
$$\hat\beta_{IV} = \hat\beta_{GMM} = (Z'X)^{-1}(Z'Y)$$

37 Recap Continued
In the over-identified case we have more moment conditions than parameters. Hence the weighting matrix matters. Under certain conditions,
$$\hat\beta_{TSLS} = \hat\beta_{GMM} = (X'P_z X)^{-1}(X'P_z Y)$$
and from the usual GMM results
$$\sqrt{n}(\hat\beta - \beta_0) \overset{d}{\to} N\left(0,\; \sigma^2 (X'P_z X)^{-1}\right)$$
But there are very specific assumptions for this to work, including two critical assumptions about the instruments.

38 Assumptions on the Instruments
1. Instrument relevance: it is easiest to think about why this matters in the context of TSLS. If the Z do not explain the X, then in the first-stage regression $X = Z\Gamma + U$ we get a lousy $\hat X$ for the second stage.
2. Instrument exogeneity: it had better be the case that $E(Z'\epsilon) = 0$; otherwise there is no advantage with respect to using the X themselves.

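39 Distribution of GMM Estimator
Assume that $\hat W \overset{p}{\to} W$ and let $E(Z_i'X_i) = Q$. Further assume that
$$\Omega = E(Z_i'Z_i\epsilon_i^2) = E(g_i g_i') \quad \text{where } g_i = Z_i'\epsilon_i$$
Then:
$$\left(\frac{1}{n}X'Z\right)\hat W\left(\frac{1}{n}Z'X\right) \overset{p}{\to} Q'WQ$$
$$\left(\frac{1}{n}X'Z\right)\hat W\left(\frac{1}{\sqrt{n}}Z'\epsilon\right) \overset{d}{\to} Q'W\,N(0, \Omega)$$
Hence:
$$\sqrt{n}(\hat\beta - \beta) \overset{d}{\to} N(0, V), \qquad V = (Q'WQ)^{-1}(Q'W\Omega WQ)(Q'WQ)^{-1}$$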

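40 Efficient GMM
If $W = \Omega^{-1}$ then:
$$\sqrt{n}(\hat\beta - \beta) \overset{d}{\to} N\left(0,\; (Q'\Omega^{-1}Q)^{-1}\right)$$
In small samples:
$$\hat W = \left(\frac{1}{n}\sum_{i=1}^{n} (\hat g_i - \bar g)(\hat g_i - \bar g)'\right)^{-1} \quad \text{with } \hat g_i = Z_i'\hat\epsilon_i$$
And in the linear model, under some assumptions, one could use $\hat W = (Z'Z)^{-1}$ as a first-stage estimator for $\hat\beta$ and hence construct $\hat\epsilon_i$ and thus $\hat g_i$.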

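41 Assessing IV: Instrument Relevance
Let's think back to TSLS. The just-identified case:
$$\hat\beta_{IV} = (Z'X)^{-1}(Z'Y)$$
In the special case
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \quad \text{with } E(z_i\epsilon_i) = 0 \text{ and } E(z_i^2\epsilon_i^2) = \sigma_\epsilon^2\sigma_z^2$$
$$\hat\beta_{IV} = \frac{S_{ZY}}{S_{ZX}} \overset{p}{\to} \frac{\sigma_{zy}}{\sigma_{zx}} \quad \text{with } V(\hat\beta_{IV}) = \frac{\sigma_\epsilon^2\sigma_z^2}{(\sigma_{zx})^2}$$
So, if $\sigma_{zx} \to 0$ then $\hat\beta_{IV} \to \infty$ but $V(\hat\beta_{IV}) \to \infty$ as well.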

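42 What Happens Asymptotically
When $E(z_i x_i) = 0$, then by the CLT
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i x_i \overset{d}{\to} N_1 \sim N(0, E(z_i^2 x_i^2))$$
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i \epsilon_i \overset{d}{\to} N_2 \sim N(0, E(z_i^2 \epsilon_i^2))$$
Therefore:
$$\hat\beta - \beta = \frac{\frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i\epsilon_i}{\frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i x_i} \overset{d}{\to} \frac{N_2}{N_1} \sim \text{Cauchy}$$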
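To get a feel for what a Cauchy limit means, one can simulate the ratio of two normals directly. The sketch assumes independent standard normal draws for $N_1$ and $N_2$ (in the regression setting they may be correlated, which changes location and scale but not the heavy tails).

```python
import random

rng = random.Random(3)
# Ratio of two independent standard normals: a standard Cauchy draw
draws = sorted(rng.gauss(0, 1) / rng.gauss(0, 1) for _ in range(20000))

median = draws[len(draws) // 2]                   # well defined and stable
largest = max(abs(draws[0]), abs(draws[-1]))      # extreme draws are enormous
print(round(median, 3), round(largest, 1))
```

The sample median settles near zero, while the most extreme draw is orders of magnitude larger than anything a normal sample of this size would produce.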

43 Weak Identification
The Cauchy distribution has a mean, variance, and in fact higher moments, that are not defined (although its mode and its median are). Let's think of a less extreme case, where there is correlation between the instrument and the endogenous variable, but it is weak. One way to do this asymptotically, in a one-variable, one-instrument case:
$$y_i = \beta x_i + \epsilon_i$$
$$x_i = \gamma z_i + u_i, \qquad \gamma = n^{-1/2} c$$

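44 Weak Identification (cont.)
Clearly, this device assures that asymptotically the instrument and the endogenous variable are uncorrelated. Specifically,
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i x_i = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i^2\gamma + \frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i u_i = \frac{1}{n}\sum_{i=1}^{n} z_i^2 c + \frac{1}{\sqrt{n}}\sum_{i=1}^{n} z_i u_i \overset{d}{\to} Qc + N_1$$
$$\hat\beta - \beta \overset{d}{\to} \frac{N_2}{Qc + N_1}$$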

45 Remarks
- $\hat\beta$ is inconsistent for $\beta$
- The asymptotic distribution of $\hat\beta$ is non-normal
- Standard t-tests have non-standard distributions
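These remarks can be seen in a small Monte Carlo. The sketch below is illustrative only: sample size, number of replications, the error correlation of 0.8 and $c = 1$ are hypothetical choices. It compares the spread of the IV estimator under a strong first stage with the local-to-zero design $\gamma = n^{-1/2} c$.

```python
import math
import random

def iv_estimates(gamma, n=200, reps=500, beta=1.0, seed=0):
    """Sorted IV estimates sum(z*y)/sum(z*x) across simulated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        szy = szx = 0.0
        for _ in range(n):
            z = rng.gauss(0, 1)
            eps = rng.gauss(0, 1)
            u = 0.8 * eps + 0.6 * rng.gauss(0, 1)   # corr(u, eps) = 0.8, so x is endogenous
            x = gamma * z + u                        # first stage
            y = beta * x + eps                       # structural equation
            szy += z * y
            szx += z * x
        estimates.append(szy / szx)
    return sorted(estimates)

def iqr(sorted_values):
    """Interquartile range of an already sorted list."""
    k = len(sorted_values)
    return sorted_values[3 * k // 4] - sorted_values[k // 4]

strong = iv_estimates(gamma=1.0)
weak = iv_estimates(gamma=1.0 / math.sqrt(200))      # gamma = n^(-1/2) * c with c = 1
median_strong = strong[len(strong) // 2]
median_weak = weak[len(weak) // 2]
print(round(median_strong, 3), round(iqr(strong), 3))
print(round(median_weak, 3), round(iqr(weak), 3))
```

The interquartile range is used instead of the variance because, with weak instruments, the estimator has very heavy tails and sample variances are dominated by a few extreme replications.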

46 Example: Sampling Distribution of TSLS with weak instruments
[Figure: light line = strong instruments; dark line = weak instruments]

47 Some things to check in TSLS
One way to check the strength of the instruments is with the F-statistic of the first-stage regressions
$$\underset{n \times k}{X} = \underset{n \times l}{Z}\,\underset{l \times k}{\Gamma} + \underset{n \times k}{U}$$
At a minimum, you should reject the null that the coefficients on the instruments (excluding any included exogenous variables) are jointly zero. Staiger and Stock (1997) and Stock and Wright (1998) are two standard references.
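With a single instrument the first-stage F is simply the squared t-statistic on that instrument. A minimal sketch follows, using the homoscedastic OLS formula rather than the robust one used in the STATA examples; the first-stage coefficients 0.5 and 0.02 are hypothetical.

```python
import random

def first_stage_F(x, z):
    """F-statistic for H0: g = 0 in x = a + g*z + u (one instrument, so F = t^2)."""
    n = len(x)
    mz, mx = sum(z) / n, sum(x) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    szx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    g = szx / szz                                   # first-stage slope
    a = mx - g * mz                                 # first-stage intercept
    ssr = sum((xi - a - g * zi) ** 2 for xi, zi in zip(x, z))
    se = (ssr / (n - 2) / szz) ** 0.5               # homoscedastic standard error of g
    t = g / se
    return t * t

rng = random.Random(2)
n = 500
z = [rng.gauss(0, 1) for _ in range(n)]
x_strong = [0.5 * zi + rng.gauss(0, 1) for zi in z]   # relevant instrument
x_weak = [0.02 * zi + rng.gauss(0, 1) for zi in z]    # nearly irrelevant instrument
F_strong = first_stage_F(x_strong, z)
F_weak = first_stage_F(x_weak, z)
print(round(F_strong, 1), round(F_weak, 1))
```

A common rule of thumb, used later in these slides, treats a first-stage F above 10 as evidence that the instruments are not weak.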

48 Overidentification Test (a.k.a J-Test) We have checked one of the conditions for IV to work: instrument relevance To check the other, exogeneity, we rely on having more moment conditions than parameters. The reason is that then we have a situation where, if the model is correctly specified, then all moment conditions should be satisfied. If it is not, then some will be violated.

49 Overidentification Test (cont.)
Recall the GMM objective function:
$$\hat Q_n(\beta) = g_n(\beta)'\,\hat W\,g_n(\beta)$$
where
$$\frac{1}{n}\sum_{i=1}^{n} g_i(y_i, X_i, Z_i, \beta) = g_n(\beta) = \frac{Z'(Y - X\beta)}{n}$$
When $l > k$, $g_n(\hat\beta) = 0$ holds only as $n \to \infty$ and if the model is correctly specified, because the moment conditions hold only in the population, not in the sample.

50 Overidentification Test (cont.)
The specification test on the model is therefore based on the distance between the sample moment conditions and zero. It is important to note that to apply the test one must use the optimal weighting matrix. This is because we rely on the asymptotic result that
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} g_i(\beta) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} Z_i'\epsilon_i \overset{d}{\to} N(0, \Omega)$$
$$\Omega = E(g_i(\beta) g_i(\beta)') = E(Z_i'Z_i\epsilon_i^2)$$

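51 Overidentification Test (cont.)
Hence:
$$n\hat Q_n(\hat\beta) = n\,g_n(\hat\beta)'\,\hat W\,g_n(\hat\beta) \overset{d}{\to} \chi^2_{l-k}$$
as long as
$$\hat W = \left(\frac{1}{n}\sum_{i=1}^{n} g_i(\hat\beta) g_i(\hat\beta)'\right)^{-1} = \left(\frac{1}{n}\sum_{i=1}^{n} Z_i'Z_i\hat\epsilon_i^2\right)^{-1}$$
Remarks: Careful, this test tends to have low power.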
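A two-step version of the J-test can be sketched for the smallest overidentified case, one regressor and two instruments (so $l - k = 1$). Everything in the simulation is hypothetical: the data-generating process, the helper names, and the choice of TSLS as the first step. The chi-squared(1) p-value uses the identity $P(\chi^2_1 > j) = \text{erfc}(\sqrt{j/2})$.

```python
import math
import random

def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def quad(u, m, v):
    """u' m v for 2-vectors u, v and a 2x2 matrix m."""
    return sum(u[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

def gmm_beta(zx, zy, w):
    """(X'Z W Z'X)^{-1} X'Z W Z'Y for one regressor and two instruments."""
    return quad(zx, w, zy) / quad(zx, w, zx)

def j_test(y, x, z1, z2):
    n = len(y)
    z = list(zip(z1, z2))
    zx = [sum(zi[k] * xi for zi, xi in zip(z, x)) for k in range(2)]   # Z'X
    zy = [sum(zi[k] * yi for zi, yi in zip(z, y)) for k in range(2)]   # Z'Y
    zz = tuple(tuple(sum(zi[j] * zi[k] for zi in z) for k in range(2)) for j in range(2))
    b1 = gmm_beta(zx, zy, inv2(zz))                  # step 1: TSLS, W = (Z'Z)^{-1}
    e = [yi - b1 * xi for yi, xi in zip(y, x)]       # step-1 residuals
    omega = tuple(tuple(sum(ei * ei * zi[j] * zi[k] for ei, zi in zip(e, z)) / n
                        for k in range(2)) for j in range(2))
    w = inv2(omega)                                  # step 2: optimal weighting matrix
    b2 = gmm_beta(zx, zy, w)
    g = [(zy[k] - b2 * zx[k]) / n for k in range(2)] # sample moments g_n(beta_hat)
    j = max(n * quad(g, w, g), 0.0)
    return b2, j, math.erfc(math.sqrt(j / 2.0))      # estimate, J, chi2(1) p-value

def simulate(n, invalid, seed):
    """Hypothetical DGP; if invalid, the second instrument enters the error term."""
    rng = random.Random(seed)
    y, x, z1, z2 = [], [], [], []
    for _ in range(n):
        a, b, e = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xi = a + b + e + rng.gauss(0, 1)
        eps = e + rng.gauss(0, 1) + (0.5 * b if invalid else 0.0)
        y.append(2.0 * xi + eps); x.append(xi); z1.append(a); z2.append(b)
    return y, x, z1, z2

b_ok, j_ok, p_ok = j_test(*simulate(5000, False, 1))
b_bad, j_bad, p_bad = j_test(*simulate(5000, True, 1))
print(round(j_ok, 2), round(p_ok, 3), round(j_bad, 2), p_bad)
```

When one instrument is invalid the statistic becomes very large, but note that it cannot say which instrument is at fault.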

52 The Distance Statistic
Hypothesis tests could be constructed with the asymptotic covariance of the GMM estimate and the Wald principle, as we have often seen. However, if the hypotheses are nonlinear, it is often better to use the GMM criterion function directly. Suppose we want to test
$$H_0: h(\beta) = 0 \quad \text{for } h: \mathbb{R}^k \to \mathbb{R}^r$$

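53 The Distance Statistic
Let the estimates under the alternative be
$$\hat\beta = \arg\min_\beta \hat Q_n(\beta)$$
Let the estimates under the null be
$$\tilde\beta = \arg\min_{h(\beta) = 0} \hat Q_n(\beta), \qquad \tilde Q_n(\tilde\beta) = \hat Q_n(\tilde\beta)$$
Let
$$D = n\left(\tilde Q_n(\tilde\beta) - \hat Q_n(\hat\beta)\right)$$
Then $D \geq 0$ and $D \overset{d}{\to} \chi^2_r$. Further, if h is linear then D equals the Wald statistic.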

54 Fixed-effects model of cigarette demand
$$\ln(Q_{it}^{cigarettes}) = \alpha_i + \beta_1 \ln(P_{it}^{cigarettes}) + \beta_2 \ln(Income_{it}) + u_{it}$$
$i = 1, \ldots, 48$; $t = 1985, 1986, \ldots, 1995$
- $\alpha_i$ reflects unobserved omitted factors that vary across states but not over time, e.g. attitude towards smoking
- Still, $corr(\ln(P_{it}^{cigarettes}), u_{it})$ is plausibly nonzero because of supply/demand interactions
- Estimation strategy:
  - Use panel data regression methods to eliminate $\alpha_i$
  - Use TSLS to handle simultaneous causality bias
  - Use T = 2 with changes ("changes" method): look at the long-term response, not short-term dynamics (short- v. long-run elasticities)

55 The "changes" method (when T=2)
One way to model long-term effects is to consider 10-year changes, between 1985 and 1995. Rewrite the regression in "changes" form:
$$\ln(Q_{i1995}^{cigarettes}) - \ln(Q_{i1985}^{cigarettes}) = \beta_1\left[\ln(P_{i1995}^{cigarettes}) - \ln(P_{i1985}^{cigarettes})\right] + \beta_2\left[\ln(Income_{i1995}) - \ln(Income_{i1985})\right] + (u_{i1995} - u_{i1985})$$
Create "10-year change" variables, for example: 10-year change in log price = $\ln(P_{i1995}) - \ln(P_{i1985})$. Then estimate the demand elasticity by TSLS using 10-year changes in the instrumental variables.
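Why differencing removes $\alpha_i$ can be shown with a two-period simulation. This sketch isolates the fixed-effect part only: it assumes the regressor is correlated with $\alpha_i$ but otherwise exogenous (so no instrument is needed), and it uses a hypothetical 5000 units rather than the 48 states of the example.

```python
import random

def cov(a, b):
    """Sample covariance (divides by n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(11)
n = 5000                                             # hypothetical number of units
beta = -1.0                                          # hypothetical true slope
alpha = [rng.gauss(0, 1) for _ in range(n)]          # unit fixed effects
x1 = [a + rng.gauss(0, 1) for a in alpha]            # period-1 regressor, correlated with alpha
x2 = [a + rng.gauss(0, 1) for a in alpha]            # period-2 regressor
y1 = [a + beta * x + rng.gauss(0, 1) for a, x in zip(alpha, x1)]
y2 = [a + beta * x + rng.gauss(0, 1) for a, x in zip(alpha, x2)]

# Pooled OLS in levels ignores alpha_i and is biased
pooled = cov(x1 + x2, y1 + y2) / cov(x1 + x2, x1 + x2)
# OLS on changes: alpha_i drops out of y2 - y1
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
changes = cov(dx, dy) / cov(dx, dx)
print(round(pooled, 3), round(changes, 3))
```

In the cigarette application the differenced price is still endogenous, which is why the slides apply TSLS to the differenced data rather than OLS.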

56 STATA: Cigarette demand
First create "10-year change" variables: 10-year change in log price = ln(P_it) - ln(P_it-10) = ln(P_it / P_it-10)
. gen dlpackpc = log(packpc/packpc[_n-10]);
(_n-10 is the 10-yr lagged value)
. gen dlavgprs = log(avgprs/avgprs[_n-10]);
. gen dlperinc = log(perinc/perinc[_n-10]);
. gen drtaxs = rtaxs-rtaxs[_n-10];
. gen drtax = rtax-rtax[_n-10];
. gen drtaxso = rtaxso-rtaxso[_n-10];

57 Use TSLS to estimate the demand elasticity by using the "10-year changes" specification
Y, W, X and Z:
. ivreg dlpackpc dlperinc (dlavgprs = drtaxso), r;
IV (2SLS) regression with robust standard errors; Number of obs = 48, F(2, 45)
[Output table: rows dlavgprs, dlperinc, _cons; numeric values not preserved]
Instrumented: dlavgprs
Instruments: dlperinc drtaxso
NOTE:
- All the variables (Y, X, W, and the Z's) are in 10-year changes
- Estimated elasticity = -.94 (SE = .21): surprisingly elastic!
- Income elasticity small, not statistically different from zero
- Must check whether the instrument is relevant

58 Check instrument relevance: compute first-stage F
. reg dlavgprs drtaxso dlperinc, r;
Regression with robust standard errors; Number of obs = 48, F(2, 45)
[Output table: rows drtaxso, dlperinc, _cons; numeric values not preserved]
. test drtaxso;
( 1) drtaxso = 0
We didn't need to run "test" here because, with m = 1 instrument, the F-statistic is the square of the t-statistic (t = 5.80).
First-stage F = 33.7 > 10, so the instrument is not weak.
Can we check instrument exogeneity? No: l = k.

59 Check instrument relevance: compute first-stage F
X on Z1, Z2 and W:
. reg dlavgprs drtaxso drtax dlperinc, r;
Regression with robust standard errors; Number of obs = 48, F(3, 44)
[Output table: rows drtaxso, drtax, dlperinc, _cons; numeric values not preserved]
. test drtaxso drtax;
( 1) drtaxso = 0
( 2) drtax = 0
F(2, 44) > 10 (value not preserved), so the instruments aren't weak.

60 What about two instruments (cig-only tax, sales tax)?
. ivreg dlpackpc dlperinc (dlavgprs = drtaxso drtax), r;
IV (2SLS) regression with robust standard errors; Number of obs = 48, F(2, 45)
[Output table: rows dlavgprs, dlperinc, _cons; numeric values not preserved]
Instrumented: dlavgprs
Instruments: dlperinc drtaxso drtax
drtaxso = general sales tax only
drtax = cigarette-specific tax only
- Estimated elasticity is -1.2, even more elastic than using the general sales tax only
- With l > k, we can test the overidentifying restrictions

61 Test the overidentifying restrictions
. predict e, resid;
(Computes the residuals from the most recently estimated regression, the previous TSLS regression)
. reg e drtaxso drtax dlperinc;
(Regress e on the Z's and W's)
[Output table: rows drtaxso, drtax, dlperinc, _cons; F(3, 44) = 1.64; other numeric values not preserved]
. test drtaxso drtax;
( 1) drtaxso = 0
( 2) drtax = 0
F(2, 44) = 2.47
Compute the J-statistic, which is m*F, where F tests whether the coefficients on the instruments are zero, so J = 2 × 2.47 ≈ 4.93.
Prob > F: ** WARNING: this uses the wrong d.f. **

62 The correct degrees of freedom for the J-statistic is l - k:
- J = lF, where F = the F-statistic testing the coefficients on $Z_{1i}, \ldots, Z_{li}$ in a regression of the TSLS residuals against $Z_{1i}, \ldots, Z_{li}$ and the included exogenous variables W.
- Under the null hypothesis that all the instruments are exogenous, J has a chi-squared distribution with l - k degrees of freedom.
- Here, J = 4.93, distributed chi-squared with d.f. = 1; the 5% critical value is 3.84, so reject at the 5% significance level.
In STATA:
. dis "J-stat = " r(df)*r(F) " p-value = " chiprob(r(df)-1,r(df)*r(F));
J-stat = 4.93, with the p-value taken from the chi-squared(1) distribution.
Now what???
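The arithmetic on this slide can be reproduced with the standard library alone. For one degree of freedom, $P(\chi^2_1 > j) = \text{erfc}(\sqrt{j/2})$; the function name below is a hypothetical helper, and the inputs are the rounded values shown in the slides.

```python
import math

def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-squared(1) variable: P(chi2_1 > stat)."""
    return math.erfc(math.sqrt(stat / 2.0))

m, F = 2, 2.47                      # number of instruments and F-statistic from the slides
J = m * F                           # 4.94 with the rounded F; the slides report 4.93
p_value = chi2_1_pvalue(4.93)       # p-value with the correct d.f. = l - k = 1
p_at_critical = chi2_1_pvalue(3.84) # sanity check: should be about 0.05
reject_at_5pct = 4.93 > 3.84
print(round(J, 2), round(p_value, 4), round(p_at_critical, 4), reject_at_5pct)
```

The p-value is a little under 0.03, consistent with rejecting the null of joint exogeneity at the 5% level.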

63 Tabular summary of these results:

64 How should we interpret the J-test rejection?
- The J-test rejects the null hypothesis that both instruments are exogenous
- This means that either rtaxso is endogenous, or rtax is endogenous, or both
- The J-test doesn't tell us which!! You must exercise judgment
- Why might rtax (cig-only tax) be endogenous?
  - Political forces: a history of smoking or lots of smokers leads to political pressure for low cigarette taxes
  - If so, the cig-only tax is endogenous
- This reasoning doesn't apply to the general sales tax, so use just one instrument, the general sales tax

65 The Demand for Cigarettes: Summary of Empirical Results
- Use the estimated elasticity based on TSLS with the general sales tax as the only instrument: Elasticity = -.94, SE = .21
- This elasticity is surprisingly large (not inelastic): a 1% increase in prices reduces cigarette sales by nearly 1%. This is much more elastic than conventional wisdom in the health economics literature.
- This is a long-run (ten-year change) elasticity. What would you expect a short-run (one-year change) elasticity to be: more or less elastic?
