ECONOMETRICS II (ECO 2401) Victor Aguirregabiria. Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS

1. Introduction and Notation
2. Randomized treatment
3. Conditional independence
4. Difference-in-differences (a variant of the CI assumption)
5. Randomized eligibility: LATE
6. Regression Discontinuity Design
7. Roy's Model

1. INTRODUCTION

We are interested in estimating the causal effect of an explanatory variable D on an outcome variable Y. This setting is very general: the effect of a drug on cholesterol level; the effect of education on labor earnings; the effect of price on demand; the effect of a wage tax on employment; the effect of a competition policy on firms' profits; etc.

We consider a stylized but very general model where D is binary, $D \in \{0, 1\}$, and Y can be continuous or discrete, e.g., D = University degree, Y = Earnings.

Following the language in this literature, we call D the treatment variable and Y the outcome variable.
- D = 1 indicates that the subject "receives treatment", "is in the treatment group", or "is in the experimental group";
- D = 0 indicates that the subject "does not receive treatment" or "is in the control group".

Example: A retail chain is interested in estimating the effect on demand of a 20% discount in the price of its key product. The firm decides to implement this price discount in some of its stores (experimental group) and keep the regular price in other stores (control group).

Let $Y_0$ and $Y_1$ be latent variables that represent the outcome variable for an individual without and with treatment, respectively. We have one observation of the outcome variable Y per individual. Therefore, we observe:

$$Y = \begin{cases} Y_0 & \text{if } D = 0 \\ Y_1 & \text{if } D = 1 \end{cases} \;=\; (1 - D)\, Y_0 + D\, Y_1$$

The Treatment Effect for an individual is:

$$TE \equiv Y_1 - Y_0$$

Note: Even if we could observe the same individual with and without treatment, it would be at different moments [more on this below].

Subjects are heterogeneous in multiple dimensions, and Treatment Effects can be very heterogeneous across individuals.
- The effect of a drug varies substantially across patients;
- The effect of a university degree on earnings can be very different across individuals;
- The effect of a price reduction on demand can be substantially different across stores of the same chain.

Ideally, we would like to estimate the TE of each individual. However, this is not feasible because we observe an individual either with or without treatment, but not both.

Under some conditions / restrictions, we will be able to estimate some features of the distribution of the TEs in the population of interest. A commonly used parameter that measures the aggregate effect of a treatment is the Average Treatment Effect, ATE:

$$ATE = E(TE) = E(Y_1 - Y_0)$$

And the Conditional Average Treatment Effect:

$$ATE(x) = E(Y_1 - Y_0 \mid X = x)$$

where X is a vector of predetermined attributes of the subject. ATE(x) is the ATE for the subpopulation of individuals with X = x.

Regression-like representation of the model

Define $\mu_0 \equiv E(Y_0)$ and $\mu_1 \equiv E(Y_1)$ such that we can write:

$$Y_0 = \mu_0 + U_0, \qquad Y_1 = \mu_1 + U_1$$

where, by construction, $E(U_0) = E(U_1) = 0$. Note that, by definition, $ATE = \mu_1 - \mu_0$.

Regression-like representation of the model [2]

Using these definitions, we have that:

$$Y = (1 - D)(\mu_0 + U_0) + D(\mu_1 + U_1) = \alpha + \beta D + e$$

where $\alpha \equiv \mu_0$, $\beta \equiv \mu_1 - \mu_0 = ATE$, and $e \equiv U_0 + (U_1 - U_0) D$.

We will show below the regression-like representation of the model that includes the X variables.

Estimation of ATE and Endogeneity Problem

Let $\{y_i, d_i, x_i : i = 1, 2, \ldots, N\}$ be a random sample of N individuals, some with treatment ($d_i = 1$) and others without treatment ($d_i = 0$). The researcher is interested in using these data to estimate ATE and/or ATE(x).

We now present two simple and intuitive estimators of the ATE:
- Difference-in-means estimator;
- OLS estimator of Y on D.

We show that they are equivalent and that, without further restrictions, they are inconsistent estimators of the ATE.

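Estimation of ATE and Endogeneity Problem [2]

Difference-in-means estimator:

$$\widehat{ATE}_{DM} = \bar{y}_{D=1} - \bar{y}_{D=0} \quad \text{with} \quad \bar{y}_{D=1} = \frac{\sum_{i=1}^{N} y_i d_i}{\sum_{i=1}^{N} d_i} \quad \text{and} \quad \bar{y}_{D=0} = \frac{\sum_{i=1}^{N} y_i (1 - d_i)}{\sum_{i=1}^{N} (1 - d_i)}$$

OLS estimator:

$$\widehat{ATE}_{OLS} = \hat{\beta}_{OLS} = \frac{\sum_{i=1}^{N} (y_i - \bar{y})(d_i - \bar{d})}{\sum_{i=1}^{N} (d_i - \bar{d})^2}$$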

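Equivalence of Difference-in-means and OLS estimators of ATE

$$\widehat{ATE}_{OLS} = \frac{\sum_{i=1}^{N} (y_i - \bar{y})(d_i - \bar{d})}{\sum_{i=1}^{N} (d_i - \bar{d})^2}$$

Note that, because $d_i^2 = d_i$:

$$\sum_{i=1}^{N} (d_i - \bar{d})^2 = \sum_{i=1}^{N} d_i^2 - 2 \bar{d} \sum_{i=1}^{N} d_i + N \bar{d}^2 = N \bar{d} - 2 N \bar{d}^2 + N \bar{d}^2 = N \bar{d}(1 - \bar{d})$$

And:

$$\sum_{i=1}^{N} (d_i - \bar{d})(y_i - \bar{y}) = \sum_{i=1}^{N} d_i y_i - N \bar{d}\, \bar{y}$$

Equivalence of Difference-in-means and OLS estimators of ATE [2]

Therefore:

$$\widehat{ATE}_{OLS} = \frac{\sum_{i=1}^{N} d_i y_i - N \bar{d}\, \bar{y}}{N \bar{d}(1 - \bar{d})} = \frac{1}{1 - \bar{d}} \left( \frac{\sum_{i=1}^{N} d_i y_i}{\sum_{i=1}^{N} d_i} - \bar{y} \right) = \frac{1}{1 - \bar{d}} \left( \bar{y}_{D=1} - \bar{y} \right)$$

Note that:

$$\bar{y} = N^{-1} \sum_{i=1}^{N} \left[ d_i y_i + (1 - d_i) y_i \right] = \bar{d}\, \bar{y}_{D=1} + (1 - \bar{d})\, \bar{y}_{D=0}$$

Equivalence of Difference-in-means and OLS estimators of ATE [3]

Thus,

$$\widehat{ATE}_{OLS} = \frac{1}{1 - \bar{d}} \left[ \bar{y}_{D=1} - \bar{d}\, \bar{y}_{D=1} - (1 - \bar{d})\, \bar{y}_{D=0} \right] = \bar{y}_{D=1} - \bar{y}_{D=0}$$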
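The algebraic equivalence above can be checked numerically. The following is a minimal sketch on simulated data; the data-generating process, the seed, and the variable names are illustrative assumptions, not part of the original notes.

```python
import numpy as np

# Hypothetical simulated sample (the identity is algebraic, so any data would do).
rng = np.random.default_rng(0)
N = 1_000
d = rng.integers(0, 2, size=N)           # treatment dummy d_i
y = 1.0 + 2.0 * d + rng.normal(size=N)   # observed outcome y_i

# Difference-in-means estimator.
ate_dm = y[d == 1].mean() - y[d == 0].mean()

# OLS slope of y on d (regression with a constant).
ate_ols = np.sum((y - y.mean()) * (d - d.mean())) / np.sum((d - d.mean()) ** 2)

print(ate_dm, ate_ols)
```

Up to floating-point error, the two printed numbers coincide for any sample that contains both treated and untreated observations.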

Inconsistency of DM / OLS Estimators

Is this estimator consistent? Not without further assumptions. It is clear that

$$\widehat{ATE}_{DM} \to_p E(Y \mid D = 1) - E(Y \mid D = 0),$$

and if D is NOT independent of $Y_0$ and $Y_1$:

$$E(Y \mid D = 1) - E(Y \mid D = 0) = E(Y_1 \mid D = 1) - E(Y_0 \mid D = 0) \neq E(Y_1) - E(Y_0) = ATE$$

In Economics or the social sciences, we expect the "choice of treatment" D to be correlated with the "effect of treatment" $Y_1 - Y_0$. Examples.

Inconsistency of DM / OLS Estimators [2]

In the regression-like representation of the model:

$$Y = \alpha + \beta D + e \quad \text{where} \quad e = U_0 + (U_1 - U_0) D$$

Such that (using $D^2 = D$):

$$E(D\, e) = E\left(D\, U_0 + D (U_1 - U_0)\right) = E(D\, U_1) \neq 0$$

2. RANDOMIZED TREATMENT

Suppose that the treatment dummy D is independent of the latent outcome variables $Y_0$ and $Y_1$:

$$D \perp (Y_0, Y_1)$$

where $\perp$ represents "statistical independence". Given that $Y = (1 - D) Y_0 + D Y_1$ and $D \perp (Y_0, Y_1)$:

$$E(Y \mid D = 0) = E(Y_0 \mid D = 0) = E(Y_0)$$
$$E(Y \mid D = 1) = E(Y_1 \mid D = 1) = E(Y_1)$$

such that:

$$ATE \equiv E(Y_1 - Y_0) = E(Y \mid D = 1) - E(Y \mid D = 0)$$

and ATE is identified from data on $\{y, d\}$.

RANDOMIZED TREATMENT [2]

We can construct root-N consistent estimators of $E(Y \mid D = 1)$ and $E(Y \mid D = 0)$ using:

$$\bar{y}_{D=1} = \frac{\sum_{i=1}^{N} y_i d_i}{\sum_{i=1}^{N} d_i} \quad \text{and} \quad \bar{y}_{D=0} = \frac{\sum_{i=1}^{N} y_i (1 - d_i)}{\sum_{i=1}^{N} (1 - d_i)}$$

Then, a root-N consistent estimator of ATE is:

$$\widehat{ATE} = \bar{y}_{D=1} - \bar{y}_{D=0}$$
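To illustrate why randomization matters, here is a small simulation sketch comparing the difference-in-means estimator under randomized treatment and under self-selection on the individual treatment effect. The data-generating process (normal latent outcomes, true ATE equal to 1, selection rule "take treatment if the individual effect is above average") is an assumption made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Hypothetical latent outcomes: TE = 1 + noise, so the true ATE is 1.
y0 = rng.normal(0.0, 1.0, size=N)
y1 = y0 + 1.0 + rng.normal(0.0, 1.0, size=N)
ate_true = 1.0

def diff_in_means(y1, y0, d):
    """Difference in means of the observed outcome Y = (1-D) Y0 + D Y1."""
    y = (1 - d) * y0 + d * y1
    return y[d == 1].mean() - y[d == 0].mean()

# Case 1: randomized treatment, D independent of (Y0, Y1).
d_random = rng.integers(0, 2, size=N)

# Case 2: self-selection, treated only if the individual TE exceeds its mean.
d_selected = (y1 - y0 > 1.0).astype(int)

ate_random = diff_in_means(y1, y0, d_random)
ate_selected = diff_in_means(y1, y0, d_selected)
print(ate_true, ate_random, ate_selected)
```

In this design the randomized estimate should be close to 1, while the self-selected estimate should be well above it, because treated individuals are those with large $Y_1 - Y_0$.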

ENDOGENOUS TREATMENT

The main concern in this literature is the endogeneity of treatment: D is not independent of $TE = Y_1 - Y_0$.

The assumption $D \perp (Y_0, Y_1)$ is equivalent to assuming that treatment is perfectly randomized. This assumption is not plausible in most applications unless there is a randomized experiment and all individuals comply with their treatment assignment.

This may be a realistic condition in some randomized experiments in the medical or natural sciences, or even in lab experiments in experimental economics. However, it is quite unrealistic in the social sciences, even in randomized field experiments.

ENDOGENOUS TREATMENT [2]

In general, treatment D is not independent of the potential outcomes $Y_0$ and $Y_1$. Individuals tend to self-select into treatment or no treatment according to their individual-specific benefits of treatment, i.e., according to $Y_0$ and $Y_1$.

In randomized field experiments in the social sciences, we can typically randomize eligibility for treatment but not treatment itself.

                 Treatment        No Treatment
Eligible         Compliers        Non-compliers
Not eligible     Non-compliers    Compliers

In general:
- Some subjects eligible for treatment choose not to take the treatment;
- Some subjects not eligible decide to take an alternative but similar treatment.

Regression-like representation of the model [2]

The OLS estimator of Y on D in the linear regression $Y = \alpha + \beta D + e$ is:

$$\hat{\beta}_{OLS} = \frac{\sum_{i=1}^{N} (y_i - \bar{y})(d_i - \bar{d})}{\sum_{i=1}^{N} (d_i - \bar{d})^2}$$

Regression-like representation of the model [2]

Is the OLS estimator of $\beta$ (i.e., the ATE) consistent? Consistency of OLS requires $E(D\, e) = 0$. Let us see that this condition holds under randomized treatment.

Randomized treatment implies $D \perp (U_0, U_1)$ and therefore $E(U_0 \mid D) = E(U_1 \mid D) = 0$. Then:

$$E(D\, e) = E\left(D \left[ U_0 + D (U_1 - U_0) \right]\right) = E(D\, U_1) = \Pr(D = 1)\, E(U_1 \mid D = 1) = 0$$

Regression-like representation of the model [3]

Without a randomized experiment, the unobservable components of the potential outcomes, $U_0$ and $U_1$, can be correlated with the treatment dummy D, and this implies correlation between the error term e and the regressor D.

The OLS estimator $\hat{\beta}_{OLS} = \bar{y}_{D=1} - \bar{y}_{D=0}$ will be inconsistent.

3. CONDITIONAL INDEPENDENCE

A weaker version of the assumption of independence between treatment and potential outcomes is that this independence holds only conditional on a vector of observable individual characteristics (control variables) X:

$$D \perp (Y_0, Y_1) \mid X$$

Given that $Y = (1 - D) Y_0 + D Y_1$ and $D \perp (Y_0, Y_1) \mid X$:

$$E(Y \mid D = 0, X = x) = E(Y_0 \mid D = 0, X = x) = E(Y_0 \mid X = x)$$
$$E(Y \mid D = 1, X = x) = E(Y_1 \mid D = 1, X = x) = E(Y_1 \mid X = x)$$

such that:

$$ATE(x) \equiv E(Y_1 - Y_0 \mid X = x) = E(Y \mid D = 1, X = x) - E(Y \mid D = 0, X = x)$$

and the conditional ATE(x) is identified. Then, we can also identify:

$$ATE = E_X\left( ATE(X) \right)$$

CONDITIONAL INDEPENDENCE [2]

Estimation: With conditional independence, $D \perp (Y_0, Y_1) \mid X$, but without unconditional independence, $D \perp (Y_0, Y_1)$, the estimator $\widehat{ATE} = \bar{y}_{D=1} - \bar{y}_{D=0}$ of the ATE is inconsistent. To estimate the ATE consistently we need to condition on X and first estimate the conditional ATE(x).

If X is a vector of discrete random variables (and our sample is relatively large), we can estimate ATE(x) using frequency estimators of $E(Y \mid D = 1, X = x)$ and $E(Y \mid D = 0, X = x)$:

$$\widehat{ATE}(x) = \bar{y}_{D=1}(x) - \bar{y}_{D=0}(x)$$

with

$$\bar{y}_{D=1}(x) = \frac{\sum_{i=1}^{N} y_i\, d_i\, 1\{x_i = x\}}{\sum_{i=1}^{N} d_i\, 1\{x_i = x\}}, \qquad \bar{y}_{D=0}(x) = \frac{\sum_{i=1}^{N} y_i\, (1 - d_i)\, 1\{x_i = x\}}{\sum_{i=1}^{N} (1 - d_i)\, 1\{x_i = x\}}$$
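The frequency estimator can be sketched as follows for a single discrete covariate. The simulated design (three values of x, treatment probability increasing in x, $ATE(x) = 1 + x$) and the helper name `ate_cell` are assumptions chosen only to make the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 300_000

# Hypothetical design: x in {0, 1, 2}; treatment is more likely for larger x.
x = rng.integers(0, 3, size=N)
p = np.array([0.2, 0.5, 0.8])[x]        # Pr(D = 1 | X = x)
d = (rng.random(N) < p).astype(int)     # D independent of (Y0, Y1) given X
y0 = x + rng.normal(size=N)
y1 = y0 + 1.0 + x                       # so ATE(x) = 1 + x and ATE = 2
y = np.where(d == 1, y1, y0)

def ate_cell(xv):
    """Frequency estimator of ATE(x): difference in means within the cell X = xv."""
    cell = x == xv
    return y[cell & (d == 1)].mean() - y[cell & (d == 0)].mean()

values = np.unique(x)
ate_x = np.array([ate_cell(v) for v in values])
weights = np.array([(x == v).mean() for v in values])

# Averaging ATE_hat(x) over the sample distribution of X gives the unconditional ATE.
ate_hat = np.sum(weights * ate_x)

# Naive difference in means that ignores X.
ate_naive = y[d == 1].mean() - y[d == 0].mean()
print(ate_x, ate_hat, ate_naive)
```

In this design the cell-by-cell estimates should be close to 1, 2 and 3, their weighted average close to 2, and the naive estimate clearly larger because treated units have systematically higher x.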

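CONDITIONAL INDEPENDENCE [3]

If X contains continuous variables (or if our sample is not so large), we can estimate ATE(x) using kernel estimators of $E(Y \mid D = 1, X = x)$ and $E(Y \mid D = 0, X = x)$:

$$\widehat{ATE}(x) = \bar{y}_{D=1}(x) - \bar{y}_{D=0}(x)$$

with

$$\bar{y}_{D=1}(x) = \frac{\sum_{i=1}^{N} y_i\, d_i\, K\!\left(\frac{x_i - x}{b_N}\right)}{\sum_{i=1}^{N} d_i\, K\!\left(\frac{x_i - x}{b_N}\right)}, \qquad \bar{y}_{D=0}(x) = \frac{\sum_{i=1}^{N} y_i\, (1 - d_i)\, K\!\left(\frac{x_i - x}{b_N}\right)}{\sum_{i=1}^{N} (1 - d_i)\, K\!\left(\frac{x_i - x}{b_N}\right)}$$

where $K(\cdot)$ is a kernel function and $b_N$ is a bandwidth.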

Regression-like representation under CI

Define $\mu_0(x) \equiv E(Y_0 \mid X = x)$ and $\mu_1(x) \equiv E(Y_1 \mid X = x)$ such that we can write:

$$Y_0 = \mu_0(X) + U_0, \qquad Y_1 = \mu_1(X) + U_1$$

where, by construction, $E(U_0 \mid X = x) = E(U_1 \mid X = x) = 0$. Note that, by definition, $ATE(x) = \mu_1(x) - \mu_0(x)$.

Regression-like representation under CI [2]

Taking into account that $Y = (1 - D) Y_0 + D Y_1$:

$$Y = (1 - D)\left( \mu_0(X) + U_0 \right) + D \left( \mu_1(X) + U_1 \right) = \alpha(X) + \beta(X)\, D + e$$

where:

$$\alpha(X) \equiv \mu_0(X), \qquad \beta(X) \equiv \mu_1(X) - \mu_0(X) = ATE(X), \qquad e \equiv U_0 + D (U_1 - U_0)$$

Regression-like representation under CI [3]

Under the CI assumption, OLS estimation of $\beta(x)$ in the regression model

$$Y = \alpha(X) + \beta(X)\, D + e$$

provides a consistent estimator of ATE(x). This is because, under the CI assumption, we have that:

$$E(e \mid X, D = 0) = E(U_0 \mid X, D = 0) = E(U_0 \mid X) = 0$$
$$E(e \mid X, D = 1) = E(U_1 \mid X, D = 1) = E(U_1 \mid X) = 0$$

We can apply (nonparametric) Least Squares to estimate ATE(X) consistently.

Regression-like representation under CI [4]

Suppose that $\alpha(x)$ and $\beta(x)$ are well approximated by a polynomial of order q in x. When x is a scalar:

$$y_i = \left[ \alpha_0 + \alpha_1 x_i + \ldots + \alpha_q x_i^q \right] + \left[ \beta_0 + \beta_1 x_i + \ldots + \beta_q x_i^q \right] d_i + e_i$$

We can estimate the parameters $\alpha$'s and $\beta$'s by OLS and then construct the estimate of ATE(x):

$$\widehat{ATE}(x) = \hat{\beta}(x) = \hat{\beta}_0 + \hat{\beta}_1 x + \ldots + \hat{\beta}_q x^q$$
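A polynomial series estimator of this kind can be sketched with ordinary least squares on powers of x interacted with the treatment dummy. The simulated design ($\mu_0(x) = x^2$, $ATE(x) = 1 + x$, order $q = 2$) and the helper name `ate_poly` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000

# Hypothetical design with a scalar continuous x on [0, 1].
x = rng.random(N)
d = (rng.random(N) < 0.2 + 0.6 * x).astype(float)   # Pr(D=1 | x) = 0.2 + 0.6 x
y0 = x**2 + 0.5 * rng.normal(size=N)
y1 = y0 + 1.0 + x                                   # ATE(x) = 1 + x
y = np.where(d == 1, y1, y0)

q = 2
powers = np.column_stack([x**k for k in range(q + 1)])    # 1, x, ..., x^q
design = np.column_stack([powers, powers * d[:, None]])   # alpha terms, then beta terms

coef, *_ = np.linalg.lstsq(design, y, rcond=None)
beta_hat = coef[q + 1:]                                   # beta_0, ..., beta_q

def ate_poly(x_eval):
    """Estimated ATE(x) = beta_0 + beta_1 x + ... + beta_q x^q."""
    return sum(b * x_eval**k for k, b in enumerate(beta_hat))

print(beta_hat, ate_poly(0.5))
```

With this design the estimated coefficients should be close to (1, 1, 0), and `ate_poly(0.5)` close to 1.5.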

Curse of dimensionality in NP estimation of ATE(x)

The kernel and polynomial series estimators of ATE(x) suffer from the well-known curse of dimensionality in nonparametric (NP) estimation. The speed of convergence of $\widehat{ATE}(x)$ to the true ATE(x) declines with the number of continuous explanatory variables in the vector X. The estimator can be very imprecise unless we have very large samples.

When X is discrete, these estimators have good asymptotic properties, but we still need sufficient observations for each discrete value of x.

A possible approach is to construct an estimate of the unconditional ATE given the estimates of ATE(x):

$$\widehat{ATE} = \frac{1}{N} \sum_{i=1}^{N} \widehat{ATE}(x_i)$$

Curse of dimensionality in NP estimation of ATE(x) [2]

This estimator $\widehat{ATE}$ is root-N consistent and asymptotically normal (Newey, ET 1994), even though $\widehat{ATE}(x)$ converges at a slower rate because of the continuous regressors.

However, in some applications we are interested in the conditional ATE(x). Furthermore, even if we are interested only in the unconditional ATE, the finite-sample properties of the previous estimator $\widehat{ATE}$ are affected by the poor and imprecise estimates $\widehat{ATE}(x_i)$.

Rosenbaum and Rubin (Biometrika, 1983) provide an interesting and useful approach to deal with this curse of dimensionality in the NP estimation of ATE.

Rosenbaum and Rubin (1983): Matching estimator using the Propensity Score

Since D is a binary variable, its distribution conditional on X = x is Bernoulli with probability P(x), where:

$$P(x) \equiv \Pr(D = 1 \mid X = x)$$

In the TE literature, P(x) is called the Propensity Score.

Note that P(x) contains all the information in the distribution of D conditional on X = x. Therefore, if D is independent of $(Y_0, Y_1)$ conditional on X, then it should also be true that D is independent of $(Y_0, Y_1)$ conditional on P(X):

$$D \perp (Y_0, Y_1) \mid P(X)$$

Matching estimator using the Propensity Score [2]

Define $\tilde{\mu}_0(p) \equiv E(Y_0 \mid P(X) = p)$ and $\tilde{\mu}_1(p) \equiv E(Y_1 \mid P(X) = p)$ such that we can write:

$$Y_0 = \tilde{\mu}_0(P(X)) + U_0, \qquad Y_1 = \tilde{\mu}_1(P(X)) + U_1$$

where, by construction, $E(U_0 \mid P(X)) = E(U_1 \mid P(X)) = 0$. Note that, by definition, $ATE(p) = \tilde{\mu}_1(p) - \tilde{\mu}_0(p)$.

Matching estimator using the Propensity Score [3]

The CI assumption implies that ATE(p) is identified as:

$$ATE(p) = E(Y_1 \mid P(X) = p) - E(Y_0 \mid P(X) = p) = E(Y \mid D = 1, P(X) = p) - E(Y \mid D = 0, P(X) = p)$$

Based on this insight, Rosenbaum and Rubin proposed the following estimator of the ATE. Let $\hat{p}_i \equiv \hat{P}(x_i)$ be a consistent estimator of the propensity score for individual i. Then:

$$\widehat{ATE} = \frac{1}{N} \sum_{i=1}^{N} \widehat{ATE}(\hat{p}_i)$$

where

$$\widehat{ATE}(p) = \bar{y}_{D=1}(p) - \bar{y}_{D=0}(p)$$

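Matching estimator using the Propensity Score [4]

with:

$$\bar{y}_{D=1}(p) = \frac{\sum_{i=1}^{N} y_i\, d_i\, K\!\left(\frac{\hat{p}_i - p}{b_N}\right)}{\sum_{i=1}^{N} d_i\, K\!\left(\frac{\hat{p}_i - p}{b_N}\right)}, \qquad \bar{y}_{D=0}(p) = \frac{\sum_{i=1}^{N} y_i\, (1 - d_i)\, K\!\left(\frac{\hat{p}_i - p}{b_N}\right)}{\sum_{i=1}^{N} (1 - d_i)\, K\!\left(\frac{\hat{p}_i - p}{b_N}\right)}$$

and

$$\hat{p}_i = \hat{p}(x_i) = \frac{\sum_{j=1}^{N} d_j\, K\!\left(\frac{x_j - x_i}{b_N}\right)}{\sum_{j=1}^{N} K\!\left(\frac{x_j - x_i}{b_N}\right)}$$

Now, the dimension of the conditioning variables in the estimation of the conditional expectations $\bar{y}_{D=1}(p)$ and $\bar{y}_{D=0}(p)$ is 1 (the propensity score) instead of dim(x). This improves the asymptotic and finite-sample properties of the estimators of ATE.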
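The two-step procedure (kernel estimate of the propensity score, then kernel regression of the outcome on the estimated score) can be sketched as below. This is a rough illustration under assumed choices: a scalar x, a Gaussian kernel, fixed bandwidths of 0.1 and 0.05 picked by hand, and a simulated design with true ATE equal to 1.5. The helper name `kernel_weights` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 2_000

# Hypothetical design: scalar x, Pr(D=1 | x) = 0.2 + 0.6 x, ATE(x) = 1 + x, ATE = 1.5.
x = rng.random(N)
d = (rng.random(N) < 0.2 + 0.6 * x).astype(float)
y0 = 2.0 * x + 0.5 * rng.normal(size=N)
y1 = 1.0 + 3.0 * x + 0.5 * rng.normal(size=N)
y = np.where(d == 1, y1, y0)

def kernel_weights(points, centers, b):
    """Gaussian kernel K((points_j - centers_i) / b); rows index centers, columns index points."""
    u = (points[None, :] - centers[:, None]) / b
    return np.exp(-0.5 * u**2)

# Step 1: kernel estimate of the propensity score at each x_i.
Kx = kernel_weights(x, x, b=0.1)
p_hat = (Kx @ d) / Kx.sum(axis=1)

# Step 2: kernel regressions of y on p_hat, separately for treated and controls,
# evaluated at every p_hat_i.
Kp = kernel_weights(p_hat, p_hat, b=0.05)
m1 = (Kp @ (y * d)) / (Kp @ d)
m0 = (Kp @ (y * (1 - d))) / (Kp @ (1 - d))

ate_psm = np.mean(m1 - m0)                          # (1/N) sum_i ATE_hat(p_hat_i)
ate_naive = y[d == 1].mean() - y[d == 0].mean()     # ignores selection on x
print(ate_psm, ate_naive)
```

In this design the naive difference in means is biased upward (around 2), while the propensity-score estimate should be much closer to 1.5. Bandwidth choice matters in practice and is not addressed here.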

4. DIFFERENCES-IN-DIFFERENCES (DiD)

DiD is a particular case of Conditional Independence when we have:
(1) Panel data;
(2) A particular structure of the treatment variable D;
(3) An assumption about the component structure of the unobservables $U_0$, $U_1$.

Suppose that we have panel data $\{y_{it}, d_{it}, x_{it}\}$ for t = 1, ..., T, with $T \geq 2$. The treatment dummy $D_{it} \in \{0, 1\}$ has the following structure:

$$D_{it} = T_i \cdot 1\{t \geq t^*\}$$

- $T_i \in \{0, 1\}$ is the dummy that indicates that individual i belongs to the experimental group.
- $1\{t \geq t^*\}$ is the dummy that indicates that period t is a period of treatment.

DIFFERENCES-IN-DIFFERENCES [2]

The model is the same as before but for panel data: $Y_{0,it}$ and $Y_{1,it}$ are the latent variables that represent the outcome variable for an individual without and with treatment, respectively. We have that:

$$Y_{0,it} = \mu_0 + U_{0,it}, \qquad Y_{1,it} = \mu_1 + U_{1,it}$$

with $\mu_0 \equiv E(Y_{0,it})$ and $\mu_1 \equiv E(Y_{1,it})$.

The model is completed with an assumption about the component structure of $U_{0,it}$ and $U_{1,it}$:

$$U_{0,it} = \eta_i + \delta_{0t} + u_{0it}$$
$$U_{1,it} = \eta_i + \delta_{1t} + u_{1it}$$

Note that $\eta_{0i} = \eta_{1i} = \eta_i$. This is the key restriction.

DIFFERENCES-IN-DIFFERENCES [3]

Model:

$$Y_{it} = (1 - D_{it})\, Y_{0,it} + D_{it}\, Y_{1,it}$$

that we can represent as:

$$Y_{it} = \alpha + \beta D_{it} + \left[ U_{0,it} + D_{it} \left( U_{1,it} - U_{0,it} \right) \right]$$

where $\alpha \equiv \mu_0$, $\beta \equiv \mu_1 - \mu_0 = ATE$, and

$$U_{0,it} + D_{it} \left( U_{1,it} - U_{0,it} \right) = \eta_i + \delta_{0t} + u_{0it} + D_{it} \left( \delta_{1t} - \delta_{0t} + u_{1it} - u_{0it} \right)$$

DIFFERENCES-IN-DIFFERENCES [4]

The DiD estimator is simply the OLS estimator in the equation in first differences when we include time dummies $TD_t$:

$$\Delta Y_{it} = \beta\, \Delta D_{it} + \tilde{\delta}_t\, TD_t + \tilde{e}_{it}$$

Note that:

$$\Delta D_{it} = \begin{cases} 0 & \text{for } t < t^* \text{ or } t > t^* \\ T_i & \text{for } t = t^* \end{cases}$$

Therefore, the model has information about $\beta$ only at $t = t^*$:

$$\Delta Y_{it^*} = \beta\, T_i + \tilde{\delta}_{t^*} + \tilde{e}_{it^*}$$

DIFFERENCES-IN-DIFFERENCES [5]

And according to the model, for individuals in the experimental group at $t = t^*$:

$$\tilde{e}_{it} = u_{0it} - u_{0it-1} + u_{1it} - u_{0it} = u_{1it} - u_{0it-1}$$

Consistency of the DiD estimator requires:
- $T_i$ is independent of the transitory shocks $u_{1it} - u_{0it-1}$;
- More importantly, the error component restriction: $\eta_{0i} = \eta_{1i}$.
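A two-period version of the DiD estimator can be sketched as follows. The simulated design is an assumption for illustration: one pre-treatment period and one treatment period, a common time effect, a true effect of 1, and an individual effect that is correlated with membership in the experimental group.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 100_000

# Hypothetical design: T_i indicates the experimental group.
T = rng.integers(0, 2, size=N)
eta = 2.0 * T + rng.normal(size=N)     # individual effect, correlated with T_i
delta = np.array([0.0, 0.5])           # common time effects for t = 1 and t = 2
beta = 1.0                             # true treatment effect

# Period 1: nobody is treated. Period 2 (= t*): D_it = T_i.
y_pre = eta + delta[0] + rng.normal(size=N)
y_post = eta + delta[1] + beta * T + rng.normal(size=N)

# DiD: OLS of the first difference on T_i and a constant (the period dummy),
# which equals the difference in mean changes between the two groups.
dy = y_post - y_pre
did = dy[T == 1].mean() - dy[T == 0].mean()

# Cross-sectional comparison in the treatment period, contaminated by eta_i.
cross_section = y_post[T == 1].mean() - y_post[T == 0].mean()
print(did, cross_section)
```

First-differencing removes the individual effect, so the DiD estimate should be close to 1, whereas the cross-sectional comparison in the treatment period should be close to 3 in this design.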

5. RANDOMIZED ELIGIBILITY TO TREATMENT

Let $Z \in \{0, 1\}$ be a random variable that represents whether the individual is eligible for treatment (Z = 1) or not (Z = 0). This Z variable comes from a randomized experiment.

In general, in field experiments in the social sciences, $Z \neq D$.
- We observe subjects with $\{z_i = 1 \text{ and } d_i = 0\}$: eligible but not taking the treatment;
- We observe (or suspect) subjects with $\{z_i = 0 \text{ and } d_i = 1\}$: non-eligible but taking a similar / alternative treatment.

Using Z as a proxy for D generates an inconsistent estimator [see below].

However, we show below that, under some additional assumptions, Z can be used as an instrument for D.

This IV estimator is not a consistent estimator of the ATE for the whole population. However, this IV estimator is a consistent estimator of the ATE for a particular subpopulation of subjects: the compliers.

IV Estimator

For z = 0, 1, let P(z) be the propensity score $P(z) \equiv \Pr(D = 1 \mid Z = z)$. Consider the following assumptions on the instrument Z.

[Independence] Z is independent of the potential outcomes $(Y_0, Y_1)$;
[Relevance] Z is correlated with treatment, i.e., $P(1) > P(0)$.

Consider the regression-like representation of the model:

$$Y = \alpha + \beta D + e$$

The IV estimator of the ATE is:

$$\hat{\beta}_{IV} = \left[ \sum_{i=1}^{N} (z_i - \bar{z})(d_i - \bar{d}) \right]^{-1} \left[ \sum_{i=1}^{N} (z_i - \bar{z})(y_i - \bar{y}) \right]$$

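Wald Estimator

The Wald estimator is defined as:

$$\hat{\beta}_{Wald} = \frac{\bar{y}_{Z=1} - \bar{y}_{Z=0}}{\bar{d}_{Z=1} - \bar{d}_{Z=0}}$$

where $\bar{y}_{Z=1}$ and $\bar{d}_{Z=1}$ are the sample means of Y and D, respectively, for the subsample of observations with Z = 1, and similarly, $\bar{y}_{Z=0}$ and $\bar{d}_{Z=0}$ are the sample means of Y and D for the subsample of observations with Z = 0.

We can show that, for this model, the IV and the Wald estimators are the same. Let $N_1 = \sum_{i=1}^{N} z_i$ and $N_0 = N - N_1$. Then:

$$\hat{\beta}_{IV} = \frac{\sum_{i=1}^{N} z_i (y_i - \bar{y})}{\sum_{i=1}^{N} z_i (d_i - \bar{d})} = \frac{\sum_{i=1}^{N} z_i y_i - N_1 \bar{y}}{\sum_{i=1}^{N} z_i d_i - N_1 \bar{d}} = \frac{N_1 (\bar{y}_{Z=1} - \bar{y})}{N_1 (\bar{d}_{Z=1} - \bar{d})} = \frac{\frac{N_1 N_0}{N} (\bar{y}_{Z=1} - \bar{y}_{Z=0})}{\frac{N_1 N_0}{N} (\bar{d}_{Z=1} - \bar{d}_{Z=0})} = \frac{\bar{y}_{Z=1} - \bar{y}_{Z=0}}{\bar{d}_{Z=1} - \bar{d}_{Z=0}} = \hat{\beta}_{Wald}$$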
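The equality between the IV and Wald estimators is an algebraic identity for a binary instrument, which the following sketch checks on simulated data. The data-generating process and variable names are assumptions used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 5_000

# Hypothetical sample with a binary instrument z and a binary treatment d.
z = rng.integers(0, 2, size=N)
d = (rng.random(N) < 0.2 + 0.5 * z).astype(int)
y = 1.0 + 2.0 * d + rng.normal(size=N)

# IV estimator using z as an instrument for d.
beta_iv = np.sum((z - z.mean()) * (y - y.mean())) / np.sum((z - z.mean()) * (d - d.mean()))

# Wald estimator: ratio of differences in subsample means.
beta_wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())

print(beta_iv, beta_wald)
```

The two numbers agree up to floating-point error, regardless of how y and d were generated, as long as the first-stage difference in means is not zero.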

Inconsistency of IV (Wald) Estimator for ATE

In general, this IV is NOT a consistent estimator of the ATE. Though the instrument Z is independent of $U_0$ and $U_1$, it is correlated with the error term $e = U_0 + D(U_1 - U_0)$.

$$E(e \mid Z = 0) = E\left(U_0 + D (U_1 - U_0) \mid Z = 0\right) = P(0)\, E(U_1 - U_0 \mid D = 1)$$

And,

$$E(e \mid Z = 1) = E\left(U_0 + D (U_1 - U_0) \mid Z = 1\right) = P(1)\, E(U_1 - U_0 \mid D = 1)$$

Such that:

$$E(e \mid Z = 1) - E(e \mid Z = 0) = \left[ P(1) - P(0) \right] E(U_1 - U_0 \mid D = 1) \neq 0$$

For the IV estimator to be consistent, we need Z to be uncorrelated with the error term, i.e., $E(Z e) - E(Z) E(e) = 0$. Note that

$$E(Z e) - E(Z) E(e) = \Pr(Z = 1) \Pr(Z = 0) \left[ E(e \mid Z = 1) - E(e \mid Z = 0) \right].$$

Given that $E(e \mid Z = 1) - E(e \mid Z = 0) = [P(1) - P(0)]\, E(U_1 - U_0 \mid D = 1)$, we have

$$E(Z e) - E(Z) E(e) = \Pr(Z = 0) \Pr(Z = 1) \left[ P(1) - P(0) \right] E(U_1 - U_0 \mid D = 1)$$

which in general is different from zero.

Inconsistency of IV in Random Coefficients Models with Endogeneity

More generally, in models with random coefficients and endogenous variables, the error term of the regression model includes interactions between the random coefficient and the endogenous variable. In these models, IV estimation does not provide a consistent estimator of the average coefficient.

Consider the model:

$$Y_i = X_i \beta_i + \varepsilon_i \quad \text{with} \quad \beta_i = \beta + v_i$$

such that

$$Y_i = X_i \beta + e_i \quad \text{with} \quad e_i = \varepsilon_i + X_i v_i$$

where $X_i$ is correlated with $v_i$, but there is a vector of instruments $Z_i$ that is independent of $\varepsilon_i$ and $v_i$.

The IV estimator

$$\hat{\beta}_{IV} = \left( \sum_{i=1}^{N} z_i' x_i \right)^{-1} \left( \sum_{i=1}^{N} z_i' y_i \right)$$

is an asymptotically biased estimator of $\beta$. The reason is simple: although $Z_i$ is independent of $\varepsilon_i$ and $v_i$, it is not independent of $e_i = \varepsilon_i + X_i v_i$.

LOCAL AVERAGE TREATMENT EFFECT (LATE)

Though the IV estimator is an inconsistent estimator of the ATE (when we have heterogeneous treatment effects), under some conditions (Monotonicity) the IV is a consistent estimator of the ATE for a subpopulation of individuals: the Compliers.

To understand some assumptions of the model and some properties of the estimators, it is useful to define the following latent variables:

$D_0$ = Treatment indicator under the hypothetical case that the individual were not eligible, i.e., when Z = 0;
$D_1$ = Treatment indicator under the hypothetical case that the individual were eligible, i.e., when Z = 1.

$D_0$ and $D_1$ are unobservable. All we observe is the treatment D:

$$D = (1 - Z)\, D_0 + Z\, D_1$$

LATE [2]

According to these latent variables, we can define:

              D_0 = 0          D_0 = 1
D_1 = 0       Never Takers     Defiers
D_1 = 1       Compliers        Always Takers

[Assumption: Monotonicity] For every individual, $D_1 \geq D_0$, i.e., there are no defiers.

LATE [3]

Using the definitions of "individual types" above, the assumption of Monotonicity establishes that there are no "Defiers" in the population.

Under the assumptions of Independence, Relevance, and Monotonicity, the IV estimator converges in probability to the Local Average Treatment Effect parameter, defined as:

$$LATE \equiv E(Y_1 - Y_0 \mid D_1 > D_0)$$

LATE is the ATE for the subpopulation of Compliers.

Proof: IV is a consistent estimator of LATE

As we have shown before, the IV and the Wald estimators are the same:

$$\hat{\beta}_{IV} = \frac{\bar{y}_{Z=1} - \bar{y}_{Z=0}}{\bar{d}_{Z=1} - \bar{d}_{Z=0}}$$

By the LLN, $\hat{\beta}_{IV}$ converges in probability to

$$\frac{E(Y \mid Z = 1) - E(Y \mid Z = 0)}{E(D \mid Z = 1) - E(D \mid Z = 0)}.$$

Now, we show that, under the Monotonicity assumption,

$$\frac{E(Y \mid Z = 1) - E(Y \mid Z = 0)}{E(D \mid Z = 1) - E(D \mid Z = 0)} = E(Y_1 - Y_0 \mid D_1 > D_0) = LATE$$

Proof: IV is a consistent estimator of LATE [2]

Note that $Y = Y_0 + D (Y_1 - Y_0)$ and $D = D_0 + Z (D_1 - D_0)$. Therefore (by independence of Z with $(Y_0, Y_1, D_0, D_1)$):

$$E(Y \mid Z = 1) = E\left(Y_0 + D_1 (Y_1 - Y_0) \mid Z = 1\right) = E\left(Y_0 + D_1 (Y_1 - Y_0)\right)$$

And

$$E(Y \mid Z = 0) = E\left(Y_0 + D_0 (Y_1 - Y_0) \mid Z = 0\right) = E\left(Y_0 + D_0 (Y_1 - Y_0)\right)$$

Proof: IV is a consistent estimator of LATE [3]

Therefore, the numerator of the PLIM of IV is:

$$\text{Numerator of PLIM of IV} = E\left(Y_0 + D_1 (Y_1 - Y_0)\right) - E\left(Y_0 + D_0 (Y_1 - Y_0)\right) = E\left((D_1 - D_0)(Y_1 - Y_0)\right)$$

By the monotonicity assumption, $(D_1 - D_0)$ can be only 0 or 1. Therefore,

$$\text{Numerator of PLIM of IV} = \Pr(D_1 - D_0 > 0)\, E(Y_1 - Y_0 \mid D_1 - D_0 > 0)$$

Proof: IV is a consistent estimator of LATE [4]

Similarly, for the denominator of the PLIM of IV we have that (by independence of Z with $(D_0, D_1)$):

$$E(D \mid Z = 1) = E\left(D_0 + Z (D_1 - D_0) \mid Z = 1\right) = E(D_1)$$

And (by independence of Z with $(D_0, D_1)$):

$$E(D \mid Z = 0) = E\left(D_0 + Z (D_1 - D_0) \mid Z = 0\right) = E(D_0)$$

Proof: IV is a consistent estimator of LATE [5]

The denominator of the PLIM of IV is:

$$\text{Denominator of PLIM of IV} = E(D_1 - D_0)$$

Again, by the monotonicity assumption, $(D_1 - D_0)$ can be only 0 or 1, such that $E(D_1 - D_0) = \Pr(D_1 - D_0 > 0)$. Therefore,

$$\text{PLIM of IV} = \frac{\Pr(D_1 - D_0 > 0)\, E(Y_1 - Y_0 \mid D_1 - D_0 > 0)}{\Pr(D_1 - D_0 > 0)} = E(Y_1 - Y_0 \mid D_1 - D_0 > 0) = LATE$$
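The result can be illustrated with a simulation in which the population consists of compliers, always-takers and never-takers with different treatment effects. The type shares (50%, 20%, 30%) and type-specific effects (1, 3, 0) are assumptions chosen for this sketch, so that the LATE is 1 and the ATE is 1.1.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 400_000

# Hypothetical types: 0 = complier, 1 = always-taker, 2 = never-taker (no defiers).
types = rng.choice(3, size=N, p=[0.5, 0.2, 0.3])
d0 = (types == 1).astype(int)          # treatment if not eligible (Z = 0)
d1 = (types != 2).astype(int)          # treatment if eligible (Z = 1)

# Heterogeneous treatment effects by type.
te = np.array([1.0, 3.0, 0.0])[types]
y0 = rng.normal(size=N)
y1 = y0 + te

# Randomized eligibility and observed variables.
z = rng.integers(0, 2, size=N)
d = d0 + z * (d1 - d0)
y = y0 + d * (y1 - y0)

first_stage = d[z == 1].mean() - d[z == 0].mean()        # estimates Pr(D1 > D0)
wald = (y[z == 1].mean() - y[z == 0].mean()) / first_stage

ate_sample = te.mean()                                   # about 1.1
late_sample = te[types == 0].mean()                      # exactly 1 by construction
print(wald, late_sample, ate_sample, first_stage)
```

In this design the Wald (IV) estimate should be close to the complier effect of 1 rather than to the population ATE of 1.1, and the first-stage difference should be close to the complier share of 0.5.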

What if Monotonicity does not hold? What is the plim of the IV?
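As a sketch of one way to approach this question: the steps of the proof above that use only Independence still give the numerator $E((D_1 - D_0)(Y_1 - Y_0))$ and the denominator $E(D_1 - D_0)$. Without Monotonicity, $D_1 - D_0$ can also take the value $-1$. Writing $\pi_C \equiv \Pr(D_1 > D_0)$ for the share of compliers and $\pi_F \equiv \Pr(D_1 < D_0)$ for the share of defiers (notation introduced here only for this illustration), and assuming $\pi_C \neq \pi_F$:

```latex
\operatorname{plim} \hat{\beta}_{IV}
  = \frac{E\left[(D_1 - D_0)(Y_1 - Y_0)\right]}{E(D_1 - D_0)}
  = \frac{\pi_C \, E(Y_1 - Y_0 \mid D_1 > D_0) \;-\; \pi_F \, E(Y_1 - Y_0 \mid D_1 < D_0)}
         {\pi_C - \pi_F}
```

The weights $\pi_C / (\pi_C - \pi_F)$ and $-\pi_F / (\pi_C - \pi_F)$ sum to one but the second is negative, so the limit is in general not an average treatment effect for any subpopulation.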

External Validity of LATE

How different is the LATE from the ATE? Can we apply the LATE (ATE of compliers) to the rest of the population (Always Takers and Never Takers)? In general, we cannot.

However, if the proportion of compliers in the population is large (e.g., > 80%), we can be more confident about the external validity of the LATE. If this proportion is small (e.g., < 20%), we should be very cautious.

Under the Monotonicity assumption, we can identify the proportion of compliers in the population.

Identifying the Proportion of Compliers

Let $\pi_C$, $\pi_A$, $\pi_N$, and $\pi_D$ be the proportions of compliers, always-takers, never-takers, and defiers in the population. Under Monotonicity, we have that $\pi_D = 0$, such that $\pi_C + \pi_A + \pi_N = 1$. We have that:

$$\Pr(D = 1 \mid Z = 0) = \pi_C \Pr(D = 1 \mid Z = 0, C) + \pi_A \Pr(D = 1 \mid Z = 0, A) + \pi_N \Pr(D = 1 \mid Z = 0, N) = \pi_A$$

Similarly,

$$\Pr(D = 1 \mid Z = 1) = \pi_C \Pr(D = 1 \mid Z = 1, C) + \pi_A \Pr(D = 1 \mid Z = 1, A) + \pi_N \Pr(D = 1 \mid Z = 1, N) = \pi_C + \pi_A$$

Therefore,

$$\pi_C = \Pr(D = 1 \mid Z = 1) - \Pr(D = 1 \mid Z = 0)$$

What if Monotonicity does not hold? What is the interpretation of $\Pr(D = 1 \mid Z = 1) - \Pr(D = 1 \mid Z = 0)$?

6. REGRESSION DISCONTINUITY (RD)

Van der Klaauw (2002) uses an RD approach to estimate the effect of financial aid on students' decisions to accept admission to a given college. He exploits discontinuities in an administrative formula that determines aid based on SAT score, GPA, and other components.

Angrist and Lavy (1999) estimate the effect of class size on student test scores, with identification coming from a rule requiring that one classroom be added in a school whenever average class size exceeds a predetermined threshold. Here class size is a discontinuous (and, note, non-monotonic) function of enrollment in the student's school.

Black (1999) uses an RD approach to estimate parents' willingness to pay for school quality by comparing housing prices near school district boundaries.

Suppose that the probability of treatment (of D = 1) depends on some observable variable X, which is continuous. The variable X need not be independent of $Y_0$ and $Y_1$ (of TE). Define:

$$P(x) \equiv \Pr(D = 1 \mid X = x)$$

The key feature of the RD approach is that there is a point $x_0$ at which $P(\cdot)$ is discontinuous.

Note that, though this is a necessary condition to apply an RD approach, it is not really an assumption because P(x) is identified at every point in the support of X, so we can check whether this discontinuity exists or not.

The key identification assumption is that the functions $\mu_0(x) \equiv E(Y_0 \mid X = x)$ and $\mu_1(x) \equiv E(Y_1 \mid X = x)$ are continuous functions of x.

ASSUMPTION RD: The functions $\mu_0(x) \equiv E(Y_0 \mid X = x)$ and $\mu_1(x) \equiv E(Y_1 \mid X = x)$ are continuous at $X = x_0$.

Under this assumption, any observed discontinuity in $E(Y \mid X = x)$ should be associated with the policy effect.

Under Assumption RD it is possible to show that:

$$ATE(x_0) = \frac{E(Y \mid x_0^+) - E(Y \mid x_0^-)}{P(x_0^+) - P(x_0^-)}$$

where:

$$E(Y \mid x_0^+) \equiv \lim_{x \to x_0^+} E(Y \mid X = x), \qquad E(Y \mid x_0^-) \equiv \lim_{x \to x_0^-} E(Y \mid X = x)$$
$$P(x_0^+) \equiv \lim_{x \to x_0^+} P(x), \qquad P(x_0^-) \equiv \lim_{x \to x_0^-} P(x)$$

Note that ATE(x) is identified only at $x_0$.
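One possible way to estimate the one-sided limits in this ratio is a local linear fit on each side of the cutoff. The following sketch assumes this estimation method, a hand-picked window of 0.2, and a simulated design with a jump in the treatment probability from 0.2 to 0.8 at $x_0 = 0$ and a constant treatment effect of 2. None of these choices come from the notes, and the helper name `limit_at_cutoff` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 200_000
x0 = 0.0

# Hypothetical design: P(x) jumps at x0; E(Y0 | x) is continuous (linear) in x.
x = rng.uniform(-1.0, 1.0, size=N)
p = np.where(x >= x0, 0.8, 0.2)
d = (rng.random(N) < p).astype(int)
y0 = 1.0 + x + rng.normal(scale=0.5, size=N)
y = y0 + 2.0 * d                                  # constant TE, so ATE(x0) = 2

def limit_at_cutoff(v, side, h=0.2):
    """Local linear estimate of the one-sided limit of E(v | X = x) at x0."""
    if side == "right":
        window = (x >= x0) & (x < x0 + h)
    else:
        window = (x < x0) & (x > x0 - h)
    # np.polyfit returns [slope, intercept]; with x centered at x0 the
    # intercept is the fitted value at the cutoff.
    slope, intercept = np.polyfit(x[window] - x0, v[window], 1)
    return intercept

jump_y = limit_at_cutoff(y, "right") - limit_at_cutoff(y, "left")
jump_p = limit_at_cutoff(d, "right") - limit_at_cutoff(d, "left")
ate_rd = jump_y / jump_p
print(jump_y, jump_p, ate_rd)
```

In this design the jump in the outcome should be close to 1.2, the jump in the treatment probability close to 0.6, and their ratio close to 2.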

7. ROY'S MODEL

Rational agents self-select into the markets, occupations, education levels, etc., that maximize their payoff.

Roy's (1951) "Thoughts on the Distribution of Earnings" is a seminal paper on this topic. He discusses the optimizing choices of workers selecting between fishing and hunting. Workers have skills in each occupation/sector, and they select the sector that gives them the highest expected earnings.

Roy's model is a model of comparative advantage.

Since that seminal paper, there has been a very substantial amount of methodological and empirical work in Econometrics on the identification and estimation of Roy's model.

ROY'S MODEL

7.1. The Model
7.2. Identification with (log)normal distributions of skills
7.3. Nonparametric identification
7.4. Generalized Roy's model

7.1. THE MODEL

Two occupations [or industries, or countries, etc.] indexed by $d \in \{0, 1\}$. A worker is endowed with skills for each occupation ($S_0$ and $S_1$).

Let $\pi_0$ and $\pi_1$ be the market prices of skills [the same for all workers in the market] in occupations 0 and 1, respectively, such that the earnings of a worker in occupation $d \in \{0, 1\}$ are:

$$W_d = \pi_d\, S_d$$

A worker selects the occupation that maximizes her earnings:

$$W_1 \geq W_0 \iff \text{Worker selects occupation 1}$$
$$W_1 < W_0 \iff \text{Worker selects occupation 0}$$

7.1. THE MODEL [2]

Define the variables:

$$Y_d \equiv \ln W_d = \ln \pi_d + \ln S_d \quad \text{(i.e., log-earnings in occupation d)}$$
$$D \equiv 1\{\text{worker selects occupation 1}\}$$

For d = 0, 1, define the parameters $\mu_d \equiv E(Y_d) = \ln \pi_d + E(\ln S_d)$ and the random variables $U_d \equiv Y_d - \mu_d$. The model can be described in terms of the following equations:

$$Y = (1 - D)\, Y_0 + D\, Y_1$$
$$Y_d = \mu_d + U_d \quad \text{for } d = 0, 1$$
$$D = 1\{Y_1 \geq Y_0\}$$

This is the TE model but with the assumption that individuals choose "treatment" to maximize earnings.

7.1. THE MODEL [3]

Roy's main purpose was to understand the implications of self-selection for the distribution of earnings in different occupations.

For d = 0, 1, define: $\lambda_d \equiv E(Y_d \mid D = d) - E(Y_d)$.

If $\lambda_d > 0$ we say that there is positive selection into occupation d; i.e., workers selecting occupation d have, on average, more skills in this occupation than the average worker in the population.

If $\lambda_d < 0$ we say that there is negative selection into occupation d; i.e., workers selecting occupation d have, on average, fewer skills in this occupation than the average worker in the population.

What are the predictions of the model about $\lambda_0$ and $\lambda_1$?

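7.1. THE MODEL [4]

Note that:

$$D = 1\{Y_1 \geq Y_0\} = 1\{U_0 - U_1 \leq \mu_1 - \mu_0\} = 1\left\{ \frac{V}{\sigma_V} \leq c \right\}$$

where $V \equiv U_0 - U_1$, $\sigma_V$ is the standard deviation of V, and $c \equiv (\mu_1 - \mu_0) / \sigma_V$.

7.1. THE MODEL [5]

Under normality of $U_0$ and $U_1$, with $\sigma_{dV} \equiv Cov(U_d, V)$:

$$E(Y_1 \mid D = 1) = \mu_1 + E(U_1 \mid V \leq \mu_1 - \mu_0) = \mu_1 + E\left( \frac{\sigma_{1V}}{\sigma_V^2} V \,\Big|\, V \leq \mu_1 - \mu_0 \right)$$
$$= \mu_1 + \frac{\sigma_{1V}}{\sigma_V}\, E\left( \frac{V}{\sigma_V} \,\Big|\, \frac{V}{\sigma_V} \leq c \right) = \mu_1 - \frac{\sigma_{1V}}{\sigma_V}\, \frac{\phi(c)}{\Phi(c)}$$

7.1. THE MODEL [6]

Similarly,

$$E(Y_0 \mid D = 0) = \mu_0 + E(U_0 \mid V > \mu_1 - \mu_0) = \mu_0 + E\left( \frac{\sigma_{0V}}{\sigma_V^2} V \,\Big|\, V > \mu_1 - \mu_0 \right)$$
$$= \mu_0 + \frac{\sigma_{0V}}{\sigma_V}\, E\left( \frac{V}{\sigma_V} \,\Big|\, \frac{V}{\sigma_V} > c \right) = \mu_0 + \frac{\sigma_{0V}}{\sigma_V}\, \frac{\phi(c)}{1 - \Phi(c)}$$

7.1. THE MODEL [7]

Taking into account that $\sigma_{0V} = \sigma_0^2 - \sigma_{01}$ and $\sigma_{1V} = \sigma_{01} - \sigma_1^2$, we have:

$$\lambda_0 = \frac{\sigma_0^2 - \sigma_{01}}{\sigma_V}\, \frac{\phi(c)}{1 - \Phi(c)}, \qquad \lambda_1 = \frac{\sigma_1^2 - \sigma_{01}}{\sigma_V}\, \frac{\phi(c)}{\Phi(c)}$$

The signs of $\lambda_0$ and $\lambda_1$ depend on the signs of $[\sigma_0^2 - \sigma_{01}]$ and $[\sigma_1^2 - \sigma_{01}]$, respectively. Note that:

$$\sigma_V^2 = \left[ \sigma_0^2 - \sigma_{01} \right] + \left[ \sigma_1^2 - \sigma_{01} \right] > 0,$$

so at least one of the two terms is positive, and both can be.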

7.1. THE MODEL [8]

                            sigma_1^2 - sigma_01 < 0       sigma_1^2 - sigma_01 > 0
sigma_0^2 - sigma_01 < 0    Impossible                     Positive selection in 1
                                                           Negative selection in 0
sigma_0^2 - sigma_01 > 0    Negative selection in 1        Positive selection in 1
                            Positive selection in 0        Positive selection in 0

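7.1. THE MODEL [9]

Which type of occupation has positive selection? The occupation where the distribution of skills is more heterogeneous, more dispersed. To see this, note that, with $\rho_{01}$ the correlation between $U_0$ and $U_1$:

$$\left[ \sigma_0^2 - \sigma_{01} \right] = \sigma_0 \sigma_1 \left[ \frac{\sigma_0}{\sigma_1} - \rho_{01} \right]$$
$$\left[ \sigma_1^2 - \sigma_{01} \right] = \sigma_0 \sigma_1 \left[ \frac{\sigma_1}{\sigma_0} - \rho_{01} \right]$$

such that the sign of $\lambda_0$ is determined by the sign of $\left[ \frac{\sigma_0}{\sigma_1} - \rho_{01} \right]$, and the sign of $\lambda_1$ is determined by the sign of $\left[ \frac{\sigma_1}{\sigma_0} - \rho_{01} \right]$.

If $\frac{\sigma_1}{\sigma_0} > 1$, then $\lambda_1 > 0$, and if $\frac{\sigma_0}{\sigma_1} > 1$, then $\lambda_0 > 0$.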
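These selection patterns can be illustrated by simulating the model with bivariate normal unobservables. The parameter values below ($\sigma_0 = 1$, $\sigma_1 = 2$, correlation 0.8, equal means) are assumptions chosen so that $\sigma_1^2 - \sigma_{01} > 0$ and $\sigma_0^2 - \sigma_{01} < 0$. The theoretical values use the normal-selection expressions for the selection terms as reconstructed in this section, evaluated at $c = 0$, where $\phi(0) = 1/\sqrt{2\pi}$ and $\Phi(0) = 1/2$.

```python
import numpy as np

rng = np.random.default_rng(9)
N = 200_000

# Hypothetical parameters: occupation 1 has the more dispersed skill distribution.
mu0, mu1 = 0.0, 0.0
s0, s1, rho = 1.0, 2.0, 0.8
s01 = rho * s0 * s1
cov = [[s0**2, s01], [s01, s1**2]]

u = rng.multivariate_normal([0.0, 0.0], cov, size=N)
y0 = mu0 + u[:, 0]
y1 = mu1 + u[:, 1]
d = y1 >= y0                           # worker selects occupation 1

# Simulated selection terms E(Y_d | D = d) - E(Y_d).
lam1 = y1[d].mean() - mu1
lam0 = y0[~d].mean() - mu0

# Theoretical counterparts at c = (mu1 - mu0) / sigma_V = 0.
sigma_v = np.sqrt(s0**2 + s1**2 - 2.0 * s01)
phi0, Phi0 = 1.0 / np.sqrt(2.0 * np.pi), 0.5
lam1_theory = (s1**2 - s01) / sigma_v * phi0 / Phi0
lam0_theory = (s0**2 - s01) / sigma_v * phi0 / (1.0 - Phi0)
print(lam1, lam1_theory, lam0, lam0_theory)
```

In this parameterization the simulation should show positive selection into occupation 1 and negative selection into occupation 0, with simulated values close to the theoretical ones.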

7.2. Identification: Normal distributions

Suppose that we have cross-sectional data $\{y_i, d_i : i = 1, 2, \ldots, N\}$. Can we identify the parameters of Roy's model, $\theta = (\mu_0, \mu_1, \sigma_0, \sigma_1, \sigma_{01})$?

Heckman and Honore (ECMA, 1990) show that, with normal distributions, the parameters are uniquely identified from the following moments in the data:

$$\Pr(D = 1); \quad E(Y \mid D = 0); \quad E(Y \mid D = 1); \quad V(Y \mid D = 0); \quad \text{and} \quad V(Y \mid D = 1)$$

They also show that, without regressors, the model is not identified if we consider a nonparametric specification of the unobservables.

Then, they present nonparametric identification results when the model includes regressors X.

7.3. Nonparametric Identification: Exclusion restrictions

Consider the model with regressors, such that $\mu_d(X) \equiv E(Y_d \mid X)$, and assume that $U_0$ and $U_1$ are independent of X.

Suppose that X includes three groups of variables, $X = (Z_0, Z_1, X_c)$, such that:

$$\mu_0(X) = \mu_0(Z_0, X_c)$$
$$\mu_1(X) = \mu_1(Z_1, X_c)$$

Furthermore, $Z_0$ and $Z_1$ have continuous support, $\mu_d(Z_d, X_c)$ is strictly monotonic in $Z_d$, and

$$\lim_{Z_d \to -\infty} \mu_d(Z_d, X_c) = -\infty$$

Nonparametric Identification: Exclusion restrictions [2]

We have that:

$$E(Y \mid X, D = 0) = \mu_0(Z_0, X_c) + E\left(U_0 \mid V > \mu_1(Z_1, X_c) - \mu_0(Z_0, X_c)\right)$$

Therefore,

$$\lim_{Z_1 \to -\infty} E(Y \mid X, D = 0) = \mu_0(Z_0, X_c) + E\left(U_0 \,\Big|\, V > \lim_{Z_1 \to -\infty} \mu_1(Z_1, X_c) - \mu_0(Z_0, X_c)\right)$$
$$= \mu_0(Z_0, X_c) + E(U_0 \mid V > -\infty) = \mu_0(Z_0, X_c)$$

and $\mu_0(Z_0, X_c)$ is identified everywhere.

Nonparametric Identification: Exclusion restrictions [3]

Similarly, we have that:

$$E(Y \mid X, D = 1) = \mu_1(Z_1, X_c) + E\left(U_1 \mid V \leq \mu_1(Z_1, X_c) - \mu_0(Z_0, X_c)\right)$$

Therefore,

$$\lim_{Z_0 \to -\infty} E(Y \mid X, D = 1) = \mu_1(Z_1, X_c) + E\left(U_1 \,\Big|\, V \leq \mu_1(Z_1, X_c) - \lim_{Z_0 \to -\infty} \mu_0(Z_0, X_c)\right)$$
$$= \mu_1(Z_1, X_c) + E(U_1 \mid V \leq +\infty) = \mu_1(Z_1, X_c)$$

and $\mu_1(Z_1, X_c)$ is identified everywhere.

Nonparametric Identification: Exclusion restrictions [4]

For estimation, we can use nonparametric methods. Define the choice probability $P_D(X) = \Pr(D = 1 \mid X)$. The model implies that $P_D(X) = F_V\left(\mu_1(Z_1, X_c) - \mu_0(Z_0, X_c)\right)$, and if $F_V(\cdot)$ is strictly increasing:

$$\mu_1(Z_1, X_c) - \mu_0(Z_0, X_c) = F_V^{-1}\left[P_D(X)\right]$$

Note that $E\left(U_1 \mid V \leq \mu_1(Z_1, X_c) - \mu_0(Z_0, X_c)\right)$ is a function of $\mu_1(Z_1, X_c) - \mu_0(Z_0, X_c)$ only, and therefore it can be represented as a function of $P_D(X)$:

$$E\left(U_1 \mid V \leq \mu_1(Z_1, X_c) - \mu_0(Z_0, X_c)\right) = s_1\left(P_D(Z_0, Z_1, X_c)\right)$$

Nonparametric Identification: Exclusion restrictions [5]

Therefore, we can write

$$E(Y \mid X, D = 1) = \mu_1(Z_1, X_c) + s_1\left(P_D(Z_0, Z_1, X_c)\right)$$

For the subsample of observations with $d_i = 1$, consider the regression model:

$$y_i = \mu_1(z_{1i}, x_{ci}) + s_1(p_i) + e_i$$

where $p_i \equiv P_D(x_i)$. We can use the methods of Robinson (1988) or Yatchew (2003) to estimate $\mu_1$ in this model.
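
A minimal sketch of a Robinson-type double-residual estimator for this regression follows. It is an illustration under loud assumptions, not the procedure of the slides: $\mu_1$ is taken to be linear, $\mu_1(z_1) = \beta_1 z_1$, with no $x_c$; $\mu_0(z_0) = z_0$; $U_0$ and $U_1$ are independent standard normal with $V = U_0 - U_1$; and $p_i$ is treated as known, whereas in practice it would be estimated in a first stage. The Gaussian kernel, the bandwidth `h`, and all function names are hypothetical choices.

```python
import math
import random
from statistics import NormalDist

def simulate_selected_sample(n, beta1=1.0, seed=0):
    """Return (y, z1, p) for the d = 1 subsample of a simulated Roy model."""
    rng = random.Random(seed)
    dist_v = NormalDist(0.0, math.sqrt(2.0))     # V = U0 - U1 has variance 2
    y, z, p = [], [], []
    for _ in range(n):
        z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u0, u1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        index = beta1 * z1 - z0                  # mu_1 - mu_0
        if u0 - u1 <= index:                     # d = 1
            y.append(beta1 * z1 + u1)
            z.append(z1)
            p.append(dist_v.cdf(index))          # true choice probability
    return y, z, p

def robinson_beta(y, z, p, h=0.05):
    """Double-residual estimate of beta1 in y = beta1 * z + s1(p) + e."""
    num = den = 0.0
    for yi, zi, pi in zip(y, z, p):
        # Nadaraya-Watson fits of E(y | p) and E(z | p) at p_i.
        sw = swy = swz = 0.0
        for yj, zj, pj in zip(y, z, p):
            w = math.exp(-0.5 * ((pi - pj) / h) ** 2)
            sw += w
            swy += w * yj
            swz += w * zj
        y_res = yi - swy / sw
        z_res = zi - swz / sw
        num += z_res * y_res
        den += z_res * z_res
    return num / den                             # OLS of y residuals on z residuals

def ols_slope(y, z):
    """Naive slope of y on z that ignores the selection term s1(p)."""
    my, mz = sum(y) / len(y), sum(z) / len(z)
    cov = sum((a - my) * (b - mz) for a, b in zip(y, z))
    return cov / sum((b - mz) ** 2 for b in z)

if __name__ == "__main__":
    y, z, p = simulate_selected_sample(2000)
    print("naive OLS slope:", round(ols_slope(y, z), 3))
    print("double-residual slope:", round(robinson_beta(y, z, p), 3))
```

The exclusion restriction is what makes the second step work in this sketch: because $z_0$ shifts $p_i$ but not $\mu_1$, $z_{1i}$ still varies after conditioning on $p_i$. The intercept of $\mu_1$ is not separately recovered by this step, since it is absorbed by $s_1$.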

Empirical Methods in Applied Microeconomics

Empirical Methods in Applied Microeconomics Empirical Methods in Applied Microeconomics Jörn-Ste en Pischke LSE November 2007 1 Nonlinearity and Heterogeneity We have so far concentrated on the estimation of treatment e ects when the treatment e

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous Econometrics of causal inference Throughout, we consider the simplest case of a linear outcome equation, and homogeneous effects: y = βx + ɛ (1) where y is some outcome, x is an explanatory variable, and

More information

Instrumental Variables. Ethan Kaplan

Instrumental Variables. Ethan Kaplan Instrumental Variables Ethan Kaplan 1 Instrumental Variables: Intro. Bias in OLS: Consider a linear model: Y = X + Suppose that then OLS yields: cov (X; ) = ^ OLS = X 0 X 1 X 0 Y = X 0 X 1 X 0 (X + ) =)

More information

Introduction: structural econometrics. Jean-Marc Robin

Introduction: structural econometrics. Jean-Marc Robin Introduction: structural econometrics Jean-Marc Robin Abstract 1. Descriptive vs structural models 2. Correlation is not causality a. Simultaneity b. Heterogeneity c. Selectivity Descriptive models Consider

More information

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Lecture 8. Roy Model, IV with essential heterogeneity, MTE Lecture 8. Roy Model, IV with essential heterogeneity, MTE Economics 2123 George Washington University Instructor: Prof. Ben Williams Heterogeneity When we talk about heterogeneity, usually we mean heterogeneity

More information

Regression discontinuity design with covariates

Regression discontinuity design with covariates Regression discontinuity design with covariates Markus Frölich The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP27/07 Regression discontinuity design with covariates

More information

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations.

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations. Exercises for the course of Econometrics Introduction 1. () A researcher is using data for a sample of 30 observations to investigate the relationship between some dependent variable y i and independent

More information

i=1 y i 1fd i = dg= P N i=1 1fd i = dg.

i=1 y i 1fd i = dg= P N i=1 1fd i = dg. ECOOMETRICS II (ECO 240S) University of Toronto. Department of Economics. Winter 208 Instrctor: Victor Agirregabiria SOLUTIO TO FIAL EXAM Tesday, April 0, 208. From 9:00am-2:00pm (3 hors) ISTRUCTIOS: -

More information

ECONOMET RICS P RELIM EXAM August 24, 2010 Department of Economics, Michigan State University

ECONOMET RICS P RELIM EXAM August 24, 2010 Department of Economics, Michigan State University ECONOMET RICS P RELIM EXAM August 24, 2010 Department of Economics, Michigan State University Instructions: Answer all four (4) questions. Be sure to show your work or provide su cient justi cation for

More information

Prediction and causal inference, in a nutshell

Prediction and causal inference, in a nutshell Prediction and causal inference, in a nutshell 1 Prediction (Source: Amemiya, ch. 4) Best Linear Predictor: a motivation for linear univariate regression Consider two random variables X and Y. What is

More information

Economics 241B Estimation with Instruments

Economics 241B Estimation with Instruments Economics 241B Estimation with Instruments Measurement Error Measurement error is de ned as the error resulting from the measurement of a variable. At some level, every variable is measured with error.

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Identi cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance

Identi cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance Identi cation of Positive Treatment E ects in Randomized Experiments with Non-Compliance Aleksey Tetenov y February 18, 2012 Abstract I derive sharp nonparametric lower bounds on some parameters of the

More information

Non-parametric Identi cation and Testable Implications of the Roy Model

Non-parametric Identi cation and Testable Implications of the Roy Model Non-parametric Identi cation and Testable Implications of the Roy Model Francisco J. Buera Northwestern University January 26 Abstract This paper studies non-parametric identi cation and the testable implications

More information

Estimation with Aggregate Shocks

Estimation with Aggregate Shocks Estimation with Aggregate Shocks Jinyong Hahn UCLA Guido Kuersteiner y University of Maryland October 5, 06 Maurizio Mazzocco z UCLA Abstract Aggregate shocks a ect most households and rms decisions. Using

More information

More on Roy Model of Self-Selection

More on Roy Model of Self-Selection V. J. Hotz Rev. May 26, 2007 More on Roy Model of Self-Selection Results drawn on Heckman and Sedlacek JPE, 1985 and Heckman and Honoré, Econometrica, 1986. Two-sector model in which: Agents are income

More information

Lecture Notes on Measurement Error

Lecture Notes on Measurement Error Steve Pischke Spring 2000 Lecture Notes on Measurement Error These notes summarize a variety of simple results on measurement error which I nd useful. They also provide some references where more complete

More information

Regression Discontinuity Designs.

Regression Discontinuity Designs. Regression Discontinuity Designs. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 31/10/2017 I. Brunetti Labour Economics in an European Perspective 31/10/2017 1 / 36 Introduction

More information

Lecture Notes Part 7: Systems of Equations

Lecture Notes Part 7: Systems of Equations 17.874 Lecture Notes Part 7: Systems of Equations 7. Systems of Equations Many important social science problems are more structured than a single relationship or function. Markets, game theoretic models,

More information

Lecture 11 Roy model, MTE, PRTE

Lecture 11 Roy model, MTE, PRTE Lecture 11 Roy model, MTE, PRTE Economics 2123 George Washington University Instructor: Prof. Ben Williams Roy Model Motivation The standard textbook example of simultaneity is a supply and demand system

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market

Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market Heckman and Sedlacek, JPE 1985, 93(6), 1077-1125 James Heckman University of Chicago

More information

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects.

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects. A Course in Applied Econometrics Lecture 5 Outline. Introduction 2. Basics Instrumental Variables with Treatment Effect Heterogeneity: Local Average Treatment Effects 3. Local Average Treatment Effects

More information

1 Static (one period) model

1 Static (one period) model 1 Static (one period) model The problem: max U(C; L; X); s.t. C = Y + w(t L) and L T: The Lagrangian: L = U(C; L; X) (C + wl M) (L T ); where M = Y + wt The FOCs: U C (C; L; X) = and U L (C; L; X) w +

More information

Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models

Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models James J. Heckman and Salvador Navarro The University of Chicago Review of Economics and Statistics 86(1)

More information

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z

More information

Lecture 1- The constrained optimization problem

Lecture 1- The constrained optimization problem Lecture 1- The constrained optimization problem The role of optimization in economic theory is important because we assume that individuals are rational. Why constrained optimization? the problem of scarcity.

More information

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation SS 2010 WS 2014/15 Alexander Spermann Evaluation With Non-Experimental Approaches Selection on Unobservables Natural Experiment (exogenous variation in a variable) DiD Example: Card/Krueger (1994) Minimum

More information

Simple Estimators for Semiparametric Multinomial Choice Models

Simple Estimators for Semiparametric Multinomial Choice Models Simple Estimators for Semiparametric Multinomial Choice Models James L. Powell and Paul A. Ruud University of California, Berkeley March 2008 Preliminary and Incomplete Comments Welcome Abstract This paper

More information

ECON0702: Mathematical Methods in Economics

ECON0702: Mathematical Methods in Economics ECON0702: Mathematical Methods in Economics Yulei Luo SEF of HKU January 14, 2009 Luo, Y. (SEF of HKU) MME January 14, 2009 1 / 44 Comparative Statics and The Concept of Derivative Comparative Statics

More information

Macroeconomics IV Problem Set I

Macroeconomics IV Problem Set I 14.454 - Macroeconomics IV Problem Set I 04/02/2011 Due: Monday 4/11/2011 1 Question 1 - Kocherlakota (2000) Take an economy with a representative, in nitely-lived consumer. The consumer owns a technology

More information

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments

Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Nonparametric Identi cation of Regression Models Containing a Misclassi ed Dichotomous Regressor Without Instruments Xiaohong Chen Yale University Yingyao Hu y Johns Hopkins University Arthur Lewbel z

More information

Contents. University of York Department of Economics PhD Course 2006 VAR ANALYSIS IN MACROECONOMICS. Lecturer: Professor Mike Wickens.

Contents. University of York Department of Economics PhD Course 2006 VAR ANALYSIS IN MACROECONOMICS. Lecturer: Professor Mike Wickens. University of York Department of Economics PhD Course 00 VAR ANALYSIS IN MACROECONOMICS Lecturer: Professor Mike Wickens Lecture VAR Models Contents 1. Statistical v. econometric models. Statistical models

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs An Alternative Assumption to Identify LATE in Regression Discontinuity Designs Yingying Dong University of California Irvine September 2014 Abstract One key assumption Imbens and Angrist (1994) use to

More information

Advanced Economic Growth: Lecture 8, Technology Di usion, Trade and Interdependencies: Di usion of Technology

Advanced Economic Growth: Lecture 8, Technology Di usion, Trade and Interdependencies: Di usion of Technology Advanced Economic Growth: Lecture 8, Technology Di usion, Trade and Interdependencies: Di usion of Technology Daron Acemoglu MIT October 3, 2007 Daron Acemoglu (MIT) Advanced Growth Lecture 8 October 3,

More information

Exam ECON5106/9106 Fall 2018

Exam ECON5106/9106 Fall 2018 Exam ECO506/906 Fall 208. Suppose you observe (y i,x i ) for i,2,, and you assume f (y i x i ;α,β) γ i exp( γ i y i ) where γ i exp(α + βx i ). ote that in this case, the conditional mean of E(y i X x

More information

Notes on Generalized Method of Moments Estimation

Notes on Generalized Method of Moments Estimation Notes on Generalized Method of Moments Estimation c Bronwyn H. Hall March 1996 (revised February 1999) 1. Introduction These notes are a non-technical introduction to the method of estimation popularized

More information

Speci cation of Conditional Expectation Functions

Speci cation of Conditional Expectation Functions Speci cation of Conditional Expectation Functions Econometrics Douglas G. Steigerwald UC Santa Barbara D. Steigerwald (UCSB) Specifying Expectation Functions 1 / 24 Overview Reference: B. Hansen Econometrics

More information

Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity.

Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity. Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity. Clément de Chaisemartin September 1, 2016 Abstract This paper gathers the supplementary material to de

More information

Trimming for Bounds on Treatment Effects with Missing Outcomes *

Trimming for Bounds on Treatment Effects with Missing Outcomes * CENTER FOR LABOR ECONOMICS UNIVERSITY OF CALIFORNIA, BERKELEY WORKING PAPER NO. 5 Trimming for Bounds on Treatment Effects with Missing Outcomes * David S. Lee UC Berkeley and NBER March 2002 Abstract

More information

An Alternative Assumption to Identify LATE in Regression Discontinuity Design

An Alternative Assumption to Identify LATE in Regression Discontinuity Design An Alternative Assumption to Identify LATE in Regression Discontinuity Design Yingying Dong University of California Irvine May 2014 Abstract One key assumption Imbens and Angrist (1994) use to identify

More information

Lecture 3, November 30: The Basic New Keynesian Model (Galí, Chapter 3)

Lecture 3, November 30: The Basic New Keynesian Model (Galí, Chapter 3) MakØk3, Fall 2 (blok 2) Business cycles and monetary stabilization policies Henrik Jensen Department of Economics University of Copenhagen Lecture 3, November 3: The Basic New Keynesian Model (Galí, Chapter

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator Chapter 6: Endogeneity and Instrumental Variables (IV) estimator Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans December 15, 2013 Christophe Hurlin (University of Orléans)

More information

Addendum to: International Trade, Technology, and the Skill Premium

Addendum to: International Trade, Technology, and the Skill Premium Addendum to: International Trade, Technology, and the Skill remium Ariel Burstein UCLA and NBER Jonathan Vogel Columbia and NBER April 22 Abstract In this Addendum we set up a perfectly competitive version

More information

Microeconomics, Block I Part 1

Microeconomics, Block I Part 1 Microeconomics, Block I Part 1 Piero Gottardi EUI Sept. 26, 2016 Piero Gottardi (EUI) Microeconomics, Block I Part 1 Sept. 26, 2016 1 / 53 Choice Theory Set of alternatives: X, with generic elements x,

More information

ECON 594: Lecture #6

ECON 594: Lecture #6 ECON 594: Lecture #6 Thomas Lemieux Vancouver School of Economics, UBC May 2018 1 Limited dependent variables: introduction Up to now, we have been implicitly assuming that the dependent variable, y, was

More information

The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures

The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures Andrea Ichino (European University Institute and CEPR) February 28, 2006 Abstract This course

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria ECOOMETRICS II (ECO 24S) University of Toronto. Department of Economics. Winter 26 Instructor: Victor Aguirregabiria FIAL EAM. Thursday, April 4, 26. From 9:am-2:pm (3 hours) ISTRUCTIOS: - This is a closed-book

More information

Nonparametric Identification and Estimation of Nonadditive Hedonic Models

Nonparametric Identification and Estimation of Nonadditive Hedonic Models DISCUSSION PAPER SERIES IZA DP No. 4329 Nonparametric Identification and Estimation of Nonadditive Hedonic Models James J. Heckman Rosa L. Matzkin Lars Nesheim July 2009 Forschungsinstitut zur Zukunft

More information

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp page 1 Lecture 7 The Regression Discontinuity Design fuzzy and sharp page 2 Regression Discontinuity Design () Introduction (1) The design is a quasi-experimental design with the defining characteristic

More information

Control Functions in Nonseparable Simultaneous Equations Models 1

Control Functions in Nonseparable Simultaneous Equations Models 1 Control Functions in Nonseparable Simultaneous Equations Models 1 Richard Blundell 2 UCL & IFS and Rosa L. Matzkin 3 UCLA June 2013 Abstract The control function approach (Heckman and Robb (1985)) in a

More information

Labor Economics, Lecture 11: Partial Equilibrium Sequential Search

Labor Economics, Lecture 11: Partial Equilibrium Sequential Search Labor Economics, 14.661. Lecture 11: Partial Equilibrium Sequential Search Daron Acemoglu MIT December 6, 2011. Daron Acemoglu (MIT) Sequential Search December 6, 2011. 1 / 43 Introduction Introduction

More information

ECON2285: Mathematical Economics

ECON2285: Mathematical Economics ECON2285: Mathematical Economics Yulei Luo Economics, HKU September 17, 2018 Luo, Y. (Economics, HKU) ME September 17, 2018 1 / 46 Static Optimization and Extreme Values In this topic, we will study goal

More information

Empirical Methods in Applied Economics Lecture Notes

Empirical Methods in Applied Economics Lecture Notes Empirical Methods in Applied Economics Lecture Notes Jörn-Ste en Pischke LSE October 2005 1 Regression Discontinuity Design 1.1 Basics and the Sharp Design The basic idea of the regression discontinuity

More information

Pseudo panels and repeated cross-sections

Pseudo panels and repeated cross-sections Pseudo panels and repeated cross-sections Marno Verbeek November 12, 2007 Abstract In many countries there is a lack of genuine panel data where speci c individuals or rms are followed over time. However,

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

Principles Underlying Evaluation Estimators

Principles Underlying Evaluation Estimators The Principles Underlying Evaluation Estimators James J. University of Chicago Econ 350, Winter 2019 The Basic Principles Underlying the Identification of the Main Econometric Evaluation Estimators Two

More information

Testing for Regime Switching: A Comment

Testing for Regime Switching: A Comment Testing for Regime Switching: A Comment Andrew V. Carter Department of Statistics University of California, Santa Barbara Douglas G. Steigerwald Department of Economics University of California Santa Barbara

More information

Four Parameters of Interest in the Evaluation. of Social Programs. James J. Heckman Justin L. Tobias Edward Vytlacil

Four Parameters of Interest in the Evaluation. of Social Programs. James J. Heckman Justin L. Tobias Edward Vytlacil Four Parameters of Interest in the Evaluation of Social Programs James J. Heckman Justin L. Tobias Edward Vytlacil Nueld College, Oxford, August, 2005 1 1 Introduction This paper uses a latent variable

More information

Recitation Notes 5. Konrad Menzel. October 13, 2006

Recitation Notes 5. Konrad Menzel. October 13, 2006 ecitation otes 5 Konrad Menzel October 13, 2006 1 Instrumental Variables (continued) 11 Omitted Variables and the Wald Estimator Consider a Wald estimator for the Angrist (1991) approach to estimating

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels.

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Pedro Albarran y Raquel Carrasco z Jesus M. Carro x June 2014 Preliminary and Incomplete Abstract This paper presents and evaluates

More information

4.8 Instrumental Variables

4.8 Instrumental Variables 4.8. INSTRUMENTAL VARIABLES 35 4.8 Instrumental Variables A major complication that is emphasized in microeconometrics is the possibility of inconsistent parameter estimation due to endogenous regressors.

More information

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015 Introduction to causal identification Nidhiya Menon IGC Summer School, New Delhi, July 2015 Outline 1. Micro-empirical methods 2. Rubin causal model 3. More on Instrumental Variables (IV) Estimating causal

More information

Advanced Economic Growth: Lecture 3, Review of Endogenous Growth: Schumpeterian Models

Advanced Economic Growth: Lecture 3, Review of Endogenous Growth: Schumpeterian Models Advanced Economic Growth: Lecture 3, Review of Endogenous Growth: Schumpeterian Models Daron Acemoglu MIT September 12, 2007 Daron Acemoglu (MIT) Advanced Growth Lecture 3 September 12, 2007 1 / 40 Introduction

More information

Parametric Inference on Strong Dependence

Parametric Inference on Strong Dependence Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation

More information

The Generalized Roy Model and Treatment Effects

The Generalized Roy Model and Treatment Effects The Generalized Roy Model and Treatment Effects Christopher Taber University of Wisconsin November 10, 2016 Introduction From Imbens and Angrist we showed that if one runs IV, we get estimates of the Local

More information

Single-Equation GMM: Endogeneity Bias

Single-Equation GMM: Endogeneity Bias Single-Equation GMM: Lecture for Economics 241B Douglas G. Steigerwald UC Santa Barbara January 2012 Initial Question Initial Question How valuable is investment in college education? economics - measure

More information

Identification of Regression Models with Misclassified and Endogenous Binary Regressor

Identification of Regression Models with Misclassified and Endogenous Binary Regressor Identification of Regression Models with Misclassified and Endogenous Binary Regressor A Celebration of Peter Phillips Fourty Years at Yale Conference Hiroyuki Kasahara 1 Katsumi Shimotsu 2 1 Department

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

Exploring Marginal Treatment Effects

Exploring Marginal Treatment Effects Exploring Marginal Treatment Effects Flexible estimation using Stata Martin Eckhoff Andresen Statistics Norway Oslo, September 12th 2018 Martin Andresen (SSB) Exploring MTEs Oslo, 2018 1 / 25 Introduction

More information

Internationa1 l Trade

Internationa1 l Trade 14.581 Internationa1 l Trade Class notes on /19/013 1 Overview Assignment Models in the Trade Literature Small but rapidly growing literature using assignment models in an international context: Trade:

More information

Lecture 11/12. Roy Model, MTE, Structural Estimation

Lecture 11/12. Roy Model, MTE, Structural Estimation Lecture 11/12. Roy Model, MTE, Structural Estimation Economics 2123 George Washington University Instructor: Prof. Ben Williams Roy model The Roy model is a model of comparative advantage: Potential earnings

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

ow variables (sections A1. A3.); 2) state-level average earnings (section A4.) and rents (section

ow variables (sections A1. A3.); 2) state-level average earnings (section A4.) and rents (section A Data Appendix This data appendix contains detailed information about: ) the construction of the worker ow variables (sections A. A3.); 2) state-level average earnings (section A4.) and rents (section

More information

Empirical Methods in Applied Economics

Empirical Methods in Applied Economics Empirical Methods in Applied Economics Jörn-Ste en Pischke LSE October 2007 1 Instrumental Variables 1.1 Basics A good baseline for thinking about the estimation of causal e ects is often the randomized

More information

Simple Estimators for Monotone Index Models

Simple Estimators for Monotone Index Models Simple Estimators for Monotone Index Models Hyungtaik Ahn Dongguk University, Hidehiko Ichimura University College London, James L. Powell University of California, Berkeley (powell@econ.berkeley.edu)

More information

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Songnian Chen a, Xun Lu a, Xianbo Zhou b and Yahong Zhou c a Department of Economics, Hong Kong University

More information

Generalized Roy Model and Cost-Benefit Analysis of Social Programs 1

Generalized Roy Model and Cost-Benefit Analysis of Social Programs 1 Generalized Roy Model and Cost-Benefit Analysis of Social Programs 1 James J. Heckman The University of Chicago University College Dublin Philipp Eisenhauer University of Mannheim Edward Vytlacil Columbia

More information

Sensitivity checks for the local average treatment effect

Sensitivity checks for the local average treatment effect Sensitivity checks for the local average treatment effect Martin Huber March 13, 2014 University of St. Gallen, Dept. of Economics Abstract: The nonparametric identification of the local average treatment

More information

Estimating Marginal and Average Returns to Education

Estimating Marginal and Average Returns to Education Estimating Marginal and Average Returns to Education Pedro Carneiro, James Heckman and Edward Vytlacil Econ 345 This draft, February 11, 2007 1 / 167 Abstract This paper estimates marginal and average

More information

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE Introduction Chapter 6. Panel Data 2 Panel data The term panel data refers to data sets with repeated observations over

More information

Recitation Notes 6. Konrad Menzel. October 22, 2006

Recitation Notes 6. Konrad Menzel. October 22, 2006 Recitation Notes 6 Konrad Menzel October, 006 Random Coefficient Models. Motivation In the empirical literature on education and earnings, the main object of interest is the human capital earnings function

More information

A time series plot: a variable Y t on the vertical axis is plotted against time on the horizontal axis

A time series plot: a variable Y t on the vertical axis is plotted against time on the horizontal axis TIME AS A REGRESSOR A time series plot: a variable Y t on the vertical axis is plotted against time on the horizontal axis Many economic variables increase or decrease with time A linear trend relationship

More information
