Instrumental Variables

1 Instrumental Variables. Department of Economics, University of Wisconsin-Madison. January 22, 2018

2 Model

Let's start with the constant treatment effect model. For now take that to be

$$Y_i = \alpha T_i + X_i'\beta + u_i$$

where $\alpha$ measures the causal effect of $T_i$ on $Y_i$. We don't want to just run OLS because we are worried that $T_i$ is not randomly assigned, that is, that $T_i$ and $u_i$ are correlated. There are a number of different reasons that might be true. I think the main thing that we are worried about in the treatment effects literature is omitted variables. In this course we are going to think about a lot of different ways of dealing with this potential problem for estimating $\alpha$.

3 Instrumental Variables

Let's start with instrumental variables. I want to think about three completely different approaches for estimating $\alpha$. The first is the GMM approach, and the second two come from a simultaneous equations framework.

4 To justify OLS we would need

$$E(T_i u_i) = 0, \qquad E(X_i u_i) = 0.$$

The focus of IV is to try to relax the first assumption. (There is much less concern about the second.)

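5 Let's suppose that we have an instrument $Z_i$ for which

$$E(Z_i u_i) = 0$$

and we continue to assume that $E(X_i u_i) = 0$. We will also stick with the exactly identified case (1-dimensional $Z_i$). Define

$$Z_i^* = \begin{bmatrix} Z_i \\ X_i \end{bmatrix}, \qquad X_i^* = \begin{bmatrix} T_i \\ X_i \end{bmatrix}, \qquad B = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}.$$

Then we know

$$E\left[ Z_i^* \left( Y_i - X_i^{*\prime} B \right) \right] = 0.$$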

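6 Turning this into a GMM estimator we get

$$0 = \frac{1}{N}\sum_{i=1}^{N} Z_i^*\left(Y_i - X_i^{*\prime}\hat{B}_{IV}\right) = \frac{1}{N}\sum_{i=1}^{N} Z_i^* Y_i - \left(\frac{1}{N}\sum_{i=1}^{N} Z_i^* X_i^{*\prime}\right)\hat{B}_{IV}$$

which we can solve as

$$\hat{B}_{IV} = \left(\frac{1}{N}\sum_{i=1}^{N} Z_i^* X_i^{*\prime}\right)^{-1}\frac{1}{N}\sum_{i=1}^{N} Z_i^* Y_i.$$

And that is IV.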
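As a rough illustration of the estimator above, the following Python sketch stacks the instrument with the controls and the treatment with the controls, then solves the sample moment equations directly. The simulated data, the coefficient values, and the function name `iv_estimate` are all assumptions made for this example; they are not part of the original slides.

```python
import numpy as np

def iv_estimate(Y, T, X, Z):
    """Exactly identified IV: solve (Z*' X*) B = Z*' Y for B = (alpha, beta)."""
    X_star = np.column_stack([T, X])   # regressors: treatment first, then controls
    Z_star = np.column_stack([Z, X])   # instruments: excluded instrument, then controls
    return np.linalg.solve(Z_star.T @ X_star, Z_star.T @ Y)

# Hypothetical simulated data: T is endogenous because it loads on u.
rng = np.random.default_rng(0)
N = 50_000
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # intercept plus one control
Z = rng.normal(size=N)
u = rng.normal(size=N)
T = 0.8 * Z + X @ np.array([0.5, -0.3]) + 0.7 * u + rng.normal(size=N)
alpha, beta = 2.0, np.array([1.0, 0.5])
Y = alpha * T + X @ beta + u

B_iv = iv_estimate(Y, T, X, Z)
B_ols = np.linalg.lstsq(np.column_stack([T, X]), Y, rcond=None)[0]
print("IV :", B_iv)    # first element estimates alpha
print("OLS:", B_ols)   # first element is pushed away from alpha by cov(T, u)
```

In this design the OLS coefficient on the treatment should sit noticeably above the true value of 2, while the IV coefficient should be close to it.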

7 Consistency Notation

I will use $\approx$ to mean asymptotically equivalent. Typically the phrase $A_n \approx B_n$ means that

$$(A_n - B_n) \to_p 0.$$

So I want to be formal in this sense. However, I will not be completely formal, as this could also mean almost sure convergence, convergence in distribution, or some sort of uniform convergence. The differences between these are not relevant for anything we will do in class.

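8 Consistency of IV

$$\begin{aligned}
\hat{B}_{IV} &= \left(\frac{1}{N}\sum_{i=1}^{N} Z_i^* X_i^{*\prime}\right)^{-1}\frac{1}{N}\sum_{i=1}^{N} Z_i^* Y_i \\
&= \left(\frac{1}{N}\sum_{i=1}^{N} Z_i^* X_i^{*\prime}\right)^{-1}\frac{1}{N}\sum_{i=1}^{N} Z_i^*\left(X_i^{*\prime} B + u_i\right) \\
&= B + \left(\frac{1}{N}\sum_{i=1}^{N} Z_i^* X_i^{*\prime}\right)^{-1}\frac{1}{N}\sum_{i=1}^{N} Z_i^* u_i \\
&\approx B
\end{aligned}$$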

9 since

$$\frac{1}{N}\sum_{i=1}^{N} Z_i^* u_i \approx 0.$$

So (assuming iid sampling) this only took two assumptions: the moment conditions, and the fact that you can invert $E\left(Z_i^* X_i^{*\prime}\right)$. As we will discuss, this assumption is typically a bigger deal than in OLS.

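10 Furthermore we get (Huber-White) standard errors from

$$\sqrt{N}\left(\hat{B}_{IV} - B\right) = \left(\frac{1}{N}\sum_{i=1}^{N} Z_i^* X_i^{*\prime}\right)^{-1}\left[\frac{1}{\sqrt{N}}\sum_{i=1}^{N} Z_i^* u_i\right].$$

Under standard conditions:

$$\frac{1}{\sqrt{N}}\sum_{i=1}^{N} Z_i^* u_i \approx N\left(0, E\left(u_i^2 Z_i^* Z_i^{*\prime}\right)\right)$$

so

$$\sqrt{N}\left(\hat{B}_{IV} - B\right) \approx N\left(0,\ E\left(Z_i^* X_i^{*\prime}\right)^{-1} E\left(u_i^2 Z_i^* Z_i^{*\prime}\right) E\left(X_i^* Z_i^{*\prime}\right)^{-1}\right).$$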

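11 We approximate this by

$$E\left(Z_i^* X_i^{*\prime}\right) \approx \frac{1}{N}\sum_{i=1}^{N} Z_i^* X_i^{*\prime}, \qquad E\left(u_i^2 Z_i^* Z_i^{*\prime}\right) \approx \frac{1}{N}\sum_{i=1}^{N} \hat{u}_i^2 Z_i^* Z_i^{*\prime}.$$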
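The sandwich formula and its sample approximation can be sketched in a few lines of Python. This is only an illustration under assumed simulated data with a heteroskedastic error; the function name `iv_robust` and all numerical values are hypothetical.

```python
import numpy as np

def iv_robust(Y, X_star, Z_star):
    """Exactly identified IV with Huber-White (sandwich) standard errors."""
    N = len(Y)
    ZX = Z_star.T @ X_star / N                            # sample analogue of E(Z* X*')
    B = np.linalg.solve(ZX, Z_star.T @ Y / N)
    u_hat = Y - X_star @ B                                # IV residuals
    meat = (Z_star * u_hat[:, None] ** 2).T @ Z_star / N  # analogue of E(u^2 Z* Z*')
    ZX_inv = np.linalg.inv(ZX)
    avar = ZX_inv @ meat @ ZX_inv.T                       # variance of sqrt(N)(B_hat - B)
    return B, np.sqrt(np.diag(avar) / N)

# Hypothetical data with an error whose variance depends on the instrument.
rng = np.random.default_rng(8)
N = 20_000
X = np.column_stack([np.ones(N), rng.normal(size=N)])
Z = rng.normal(size=N)
u = rng.normal(size=N) * (1 + 0.5 * np.abs(Z))
T = 0.8 * Z + 0.7 * u + rng.normal(size=N)
Y = 2.0 * T + X @ np.array([1.0, 0.5]) + u

B, se = iv_robust(Y, np.column_stack([T, X]), np.column_stack([Z, X]))
print("estimates:", B)
print("robust standard errors:", se)
```

The transpose on the right-hand inverse reflects that $E(X_i^* Z_i^{*\prime})$ is the transpose of $E(Z_i^* X_i^{*\prime})$.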

12 It actually turns out that IV is not unbiased even when it is consistent, so let's derive the asymptotic bias more generally. Before that I want to take a detour and discuss partitioned regression, which will turn out to be really useful for this, and which we will use many times in this course.

13 Partitioned Regression

Think about the standard regression model (in large matrix notation)

$$Y = X_1\beta_1 + X_2\beta_2 + u.$$

We will define

$$M_2 \equiv I - X_2\left(X_2'X_2\right)^{-1}X_2'.$$

Two facts about $M_2$:

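14 Fact 1: $M_2$ is idempotent

$$\begin{aligned}
M_2 M_2 &= \left(I - X_2\left(X_2'X_2\right)^{-1}X_2'\right)\left(I - X_2\left(X_2'X_2\right)^{-1}X_2'\right) \\
&= I - 2X_2\left(X_2'X_2\right)^{-1}X_2' + X_2\left(X_2'X_2\right)^{-1}X_2'X_2\left(X_2'X_2\right)^{-1}X_2' \\
&= I - X_2\left(X_2'X_2\right)^{-1}X_2' \\
&= M_2
\end{aligned}$$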

15 Fact 2: $M_2 Y$ is the Residuals from Regression

For any potential dependent variable (say $Y$), $M_2 Y$ is the residuals I would get if I regressed $Y$ on $X_2$. To see that, let the regression coefficients be $\hat{g}$ and generically let $\tilde{Y}$ be residuals from a regression of $Y$ on $X_2$, so that

$$\begin{aligned}
\tilde{Y} &\equiv Y - X_2\hat{g} \\
&= Y - X_2\left(X_2'X_2\right)^{-1}X_2'Y \\
&= \left[I - X_2\left(X_2'X_2\right)^{-1}X_2'\right]Y \\
&= M_2 Y.
\end{aligned}$$

16 An important special case of this is that if I regress something on itself, the residuals are all zero. That is,

$$M_2 X_2 = X_2 - X_2\left(X_2'X_2\right)^{-1}X_2'X_2 = 0.$$

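17 If I think of the GMM moment equations for least squares I get the two equations

$$0 = X_1'\left(Y - X_1\hat\beta_1 - X_2\hat\beta_2\right), \qquad 0 = X_2'\left(Y - X_1\hat\beta_1 - X_2\hat\beta_2\right).$$

We can solve for $\hat\beta_2$ in the second as

$$\hat\beta_2 = \left(X_2'X_2\right)^{-1}X_2'\left(Y - X_1\hat\beta_1\right).$$

Now plug this into the first:

$$\begin{aligned}
0 &= X_1'\left(Y - X_1\hat\beta_1 - X_2\hat\beta_2\right) \\
&= X_1'\left(Y - X_1\hat\beta_1 - X_2\left(X_2'X_2\right)^{-1}X_2'\left(Y - X_1\hat\beta_1\right)\right) \\
&= X_1'M_2Y - X_1'M_2X_1\hat\beta_1
\end{aligned}$$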

18 Or

$$\hat\beta_1 = \left(X_1'M_2X_1\right)^{-1}X_1'M_2Y = \left(\tilde{X}_1'\tilde{X}_1\right)^{-1}\tilde{X}_1'\tilde{Y}.$$

That is, suppose I:

1. Run a regression of $X_1$ on $X_2$ and form its residuals $\tilde{X}_1$
2. Run a regression of $Y$ on $X_2$ and form its residuals $\tilde{Y}$
3. Run a regression of $\tilde{Y}$ on $\tilde{X}_1$

Since I derived this from the sample analogue of the GMM moment equations (or the normal equations), this gives me exactly the same result as if I had run the full regression of $Y$ on $X_1$ and $X_2$.
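The three steps above, along with the two facts about the residual-maker matrix, can be checked numerically. The sketch below uses made-up simulated data, and `residualize` is a hypothetical helper name introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500
X2 = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
X1 = 0.5 * X2[:, 1] + rng.normal(size=N)          # correlated with X2
Y = 1.5 * X1 + X2 @ np.array([1.0, -2.0, 0.3]) + rng.normal(size=N)

def residualize(A, X2):
    """Residuals from regressing A on X2 (equal to M_2 A)."""
    coef = np.linalg.lstsq(X2, A, rcond=None)[0]
    return A - X2 @ coef

# Explicit residual-maker matrix, only sensible for small N
M2 = np.eye(N) - X2 @ np.linalg.inv(X2.T @ X2) @ X2.T

# Full regression of Y on (X1, X2) versus the three-step procedure
beta_full = np.linalg.lstsq(np.column_stack([X1, X2]), Y, rcond=None)[0]
X1_t, Y_t = residualize(X1, X2), residualize(Y, X2)
beta1_partitioned = (X1_t @ Y_t) / (X1_t @ X1_t)

print("idempotent:", np.allclose(M2 @ M2, M2))
print("M2 Y equals residuals:", np.allclose(M2 @ Y, Y_t))
print(beta_full[0], beta1_partitioned)
```

The last line should print the same number twice, up to floating point error, since the equivalence is an algebraic identity rather than an approximation.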

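19 It turns out the same idea works for IV. Put everything we had before into large matrix notation and we can write GMM as:

$$0 = Z'\left(Y - T\hat\alpha_{IV} - X\hat\beta_{IV}\right), \qquad 0 = X'\left(Y - T\hat\alpha_{IV} - X\hat\beta_{IV}\right).$$

The second can be solved as

$$\hat\beta_{IV} = \left(X'X\right)^{-1}X'\left(Y - T\hat\alpha_{IV}\right).$$

Now plug this into the first:

$$\begin{aligned}
0 &= Z'\left(Y - T\hat\alpha_{IV} - X\hat\beta_{IV}\right) \\
&= Z'\left(Y - T\hat\alpha_{IV} - X\left(X'X\right)^{-1}X'\left(Y - T\hat\alpha_{IV}\right)\right) \\
&= Z'M_XY - Z'M_XT\hat\alpha_{IV}
\end{aligned}$$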

20 so

$$\hat\alpha_{IV} = \left(Z'M_XT\right)^{-1}Z'M_XY = \frac{\tilde{Z}'\tilde{Y}}{\tilde{Z}'\tilde{T}} \approx \frac{\mathrm{cov}(\tilde{Z}_i, \tilde{Y}_i)}{\mathrm{cov}(\tilde{Z}_i, \tilde{T}_i)}.$$

(This last expression assumes that there is an intercept in the model. If not, it would be expected values rather than covariances, but covariances make things easier to interpret, at least to me.)

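21 To see consistency from this perspective note that

$$\tilde{Y} = M_XY = \alpha M_XT + M_XX\beta + M_Xu = \alpha\tilde{T} + \tilde{u}$$

so

$$\hat\alpha_{IV} \approx \frac{\mathrm{cov}(\tilde{Z}_i, \tilde{Y}_i)}{\mathrm{cov}(\tilde{Z}_i, \tilde{T}_i)} = \frac{\mathrm{cov}(\tilde{Z}_i, \alpha\tilde{T}_i + \tilde{u}_i)}{\mathrm{cov}(\tilde{Z}_i, \tilde{T}_i)} = \alpha + \frac{\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i)}{\mathrm{cov}(\tilde{Z}_i, \tilde{T}_i)}.$$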

22 This formula is helpful. In order for the estimator to be consistent you need

$$\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i) = 0, \qquad \mathrm{cov}(\tilde{Z}_i, \tilde{T}_i) \neq 0.$$

But more generally, for the asymptotic bias to be small you want $\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i)$ to be small and $\mathrm{cov}(\tilde{Z}_i, \tilde{T}_i)$ to be large. This means that in practice there is some tradeoff between them. If your instrument is not very powerful, a little bit of correlation between $\tilde{Z}_i$ and $\tilde{u}_i$ could lead to a large asymptotic bias.

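23 As a concrete example let's compare IV to OLS. OLS is really just a special case of IV with $Z_i = T_i$. Then we get

$$\hat\alpha_{IV} \approx \alpha + \frac{\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i)}{\mathrm{cov}(\tilde{Z}_i, \tilde{T}_i)}, \qquad \hat\alpha_{OLS} \approx \alpha + \frac{\mathrm{cov}(\tilde{T}_i, \tilde{u}_i)}{\mathrm{cov}(\tilde{T}_i, \tilde{T}_i)}.$$

If $\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i) = 0$ and $\mathrm{cov}(\tilde{T}_i, \tilde{u}_i) \neq 0$ then IV is consistent and OLS is not. However, $|\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i)| < |\mathrm{cov}(\tilde{T}_i, \tilde{u}_i)|$ does not guarantee less bias, because the bias also depends on $\mathrm{cov}(\tilde{Z}_i, \tilde{T}_i)$ and $\mathrm{cov}(\tilde{T}_i, \tilde{T}_i)$.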
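A small numeric illustration of this comparison, using the bias formula as the ratio of two covariances. The covariance values below are invented purely for illustration, and `asymptotic_bias` is a hypothetical helper name.

```python
def asymptotic_bias(cov_with_error, cov_with_treatment):
    """Asymptotic bias of alpha_hat: cov(Z~, u~) / cov(Z~, T~)."""
    return cov_with_error / cov_with_treatment

# Hypothetical numbers, chosen only for illustration.
cov_Tu, var_T = 0.10, 1.00     # OLS uses T as its own instrument
cov_Zu, cov_ZT = 0.02, 0.05    # instrument is "almost" valid but weak

bias_ols = asymptotic_bias(cov_Tu, var_T)
bias_iv = asymptotic_bias(cov_Zu, cov_ZT)
print(f"OLS bias: {bias_ols:.2f}, IV bias: {bias_iv:.2f}")
```

Here the instrument is five times less correlated with the error than the treatment is, yet the IV bias is four times the OLS bias because the instrument is only weakly related to the treatment.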

24 Assumptions

Let's think about the assumptions. Let's ignore the first stage thing and assume that isn't a problem. We looked at two different things:

GMM: $E(Z_i u_i) = 0$, $E(X_i u_i) = 0$
My derivation: $\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i) = 0$

Are these the same?

25 Well... We know that GMM gives consistent estimates, and if $\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i) \neq 0$ then $\hat\alpha$ is not consistent. So the first assumptions had better imply the second. However, it doesn't have to be the other way around. We need $\mathrm{cov}(\tilde{Z}_i, \tilde{u}_i) = 0$ to get consistent estimates of $\alpha$, but we haven't required consistent estimates of $\beta$.

26 To see it doesn't go the other way, suppose that I can write

$$u_i = X_i'\delta + \varepsilon_i$$

with $E(X_i \varepsilon_i) = 0$ and $E(Z_i \varepsilon_i) = 0$. We can write

$$Y_i = \alpha T_i + X_i'\beta + u_i = \alpha T_i + X_i'(\beta + \delta) + \varepsilon_i.$$

So IV gives a consistent estimate of $\alpha$ but not $\beta$.

27 Simultaneous equations

The second and third way to see IV comes from the simultaneous equations framework

$$\begin{aligned}
Y_i &= \alpha T_i + X_i'\beta + u_i \\
T_i &= \rho Y_i + X_i'\gamma + Z_i'\delta + \nu_i
\end{aligned}$$

These are called the structural equations. Note the difference between $X_i$ and $Z_i$, in that we restrict what can affect what. We could also have stuff that affects $Y_i$ but not $T_i$, but let's not worry about that (we are still allowing this as a possibility, as some of the $\gamma$ coefficients could be zero). The model with $\rho = 0$ simplifies things, but let's focus on what happens when it isn't.

28 We assume that

$$E(u_i \mid X_i, Z_i) = 0, \qquad E(\nu_i \mid X_i, Z_i) = 0$$

but notice that if $\rho \neq 0$, then almost for sure $T_i$ is correlated with $u_i$, because $u_i$ influences $T_i$ through $Y_i$.

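29 It is useful to calculate the reduced form for $T_i$, namely

$$\begin{aligned}
T_i &= \rho Y_i + X_i'\gamma + Z_i'\delta + \nu_i \\
&= \rho\left[\alpha T_i + X_i'\beta + u_i\right] + X_i'\gamma + Z_i'\delta + \nu_i \\
&= \rho\alpha T_i + X_i'\left[\rho\beta + \gamma\right] + Z_i'\delta + (\rho u_i + \nu_i) \\
&= X_i'\frac{\rho\beta + \gamma}{1 - \rho\alpha} + Z_i'\frac{\delta}{1 - \rho\alpha} + \frac{\rho u_i + \nu_i}{1 - \rho\alpha} \\
&= X_i'\beta_2^* + Z_i'\delta_2^* + \nu_i^*
\end{aligned}$$

where

$$\beta_2^* \equiv \frac{\rho\beta + \gamma}{1 - \rho\alpha}, \qquad \delta_2^* \equiv \frac{\delta}{1 - \rho\alpha}, \qquad \nu_i^* \equiv \frac{\rho u_i + \nu_i}{1 - \rho\alpha}.$$

Note that $E(\nu_i^* \mid X_i, Z_i) = 0$, so one can obtain a consistent estimate of $\beta_2^*$ and $\delta_2^*$ by regressing $T_i$ on $X_i$ and $Z_i$.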

30 This is called the reduced form equation for $T_i$. Note that the parameters here are not the fundamental structural parameters themselves, but they are a known function of these parameters. To me this is the classic definition of reduced form (you need to have a structural model).

31 We can obtain a consistent estimate of $\alpha$ as long as we have an exclusion restriction. That is, we need some $Z_i$ that affects $T_i$ but not $Y_i$ directly. I want to show this in two different ways.

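32 Method 1

We can also solve for the reduced form for $Y_i$:

$$\begin{aligned}
Y_i &= \alpha T_i + X_i'\beta + u_i \\
&= X_i'\frac{\alpha\gamma + \beta}{1 - \alpha\rho} + Z_i'\frac{\alpha\delta}{1 - \alpha\rho} + \frac{\alpha\nu_i + u_i}{1 - \alpha\rho} \\
&= X_i'\beta_1^* + Z_i'\delta_1^* + u_i^*
\end{aligned}$$

with

$$\beta_1^* \equiv \frac{\alpha\gamma + \beta}{1 - \alpha\rho}, \qquad \delta_1^* \equiv \frac{\alpha\delta}{1 - \alpha\rho}, \qquad u_i^* \equiv \frac{\alpha\nu_i + u_i}{1 - \alpha\rho}.$$

Like the other reduced form, we can get a consistent estimate of $\beta_1^*$ and $\delta_1^*$ by regressing $Y_i$ on $X_i$ and $Z_i$.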

33 Notice then that

$$\frac{\delta_1^*}{\delta_2^*} = \alpha.$$

So we can get a consistent estimate of $\alpha$ simply by taking the ratio of the reduced form coefficients. It also gives another interpretation of IV: $\delta_2^*$ is the causal effect of $Z_i$ on $T_i$, and $\delta_1^*$ is the causal effect of $Z_i$ on $Y_i$, which only operates through $T_i$.

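34 $$Z_i \xrightarrow{\ \delta_2^*\ } T_i \xrightarrow{\ \alpha\ } Y_i$$

If we increase $Z_i$ by one unit, this leads $T_i$ to increase by $\delta_2^*$ units, which causes $Y_i$ to increase by $\delta_2^*\alpha$ units. Thus the causal effect of $Z_i$ on $Y_i$ is $\delta_2^*\alpha$ units.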

35 This illustrates another important way to think about an instrument: the key assumption is that $T_i$ is the only channel through which $Z_i$ influences $Y_i$.

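36 Suppose there was another channel: in addition to $Z_i \xrightarrow{\ \delta_2^*\ } T_i \xrightarrow{\ \alpha\ } Y_i$, a second path runs from $Z_i$ to $Y_i$ with coefficients $d$ and $a$. Then the causal effect of $Z_i$ on $Y_i$ would be $\delta_2^*\alpha + da$, and IV would be

$$\frac{\delta_2^*\alpha + da}{\delta_2^*}.$$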

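37 In the exactly identified case (i.e. one $Z_i$), this is numerically identical to IV. To see why, note that a regression of $Y_i$ and of $T_i$ on $(X_i, Z_i)$ yields

$$\hat\delta_1 = \frac{\tilde{Z}'\tilde{Y}}{\tilde{Z}'\tilde{Z}}, \qquad \hat\delta_2 = \frac{\tilde{Z}'\tilde{T}}{\tilde{Z}'\tilde{Z}}$$

so

$$\frac{\hat\delta_1}{\hat\delta_2} = \frac{\tilde{Z}'\tilde{Y}}{\tilde{Z}'\tilde{T}} = \hat\alpha_{IV}.$$

This is just math. It does not require that the "structural equation" determining $T_i$ be correct.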
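The numerical equivalence between the ratio of reduced form coefficients and IV can be checked directly. The sketch below uses invented simulated data; none of the coefficient values come from the slides.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1_000
X = np.column_stack([np.ones(N), rng.normal(size=N)])
Z = rng.normal(size=N)
u = rng.normal(size=N)
T = 0.6 * Z + X @ np.array([0.2, 0.4]) + 0.5 * u + rng.normal(size=N)
Y = 1.5 * T + X @ np.array([1.0, -1.0]) + u

W = np.column_stack([Z, X])                        # reduced-form regressors, Z first
delta1 = np.linalg.lstsq(W, Y, rcond=None)[0][0]   # coefficient on Z in the Y regression
delta2 = np.linalg.lstsq(W, T, rcond=None)[0][0]   # coefficient on Z in the T regression

# Exactly identified IV estimate of (alpha, beta), with (Z, X) as instruments
B_iv = np.linalg.solve(W.T @ np.column_stack([T, X]), W.T @ Y)
print(delta1 / delta2, B_iv[0])
```

Both printed numbers should agree up to floating point error in any sample, not just in large ones.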

38 Method 2

Define

$$T_i^f \equiv X_i'\beta_2^* + Z_i'\delta_2^*$$

and suppose that $T_i^f$ were known to the econometrician. Now notice that

$$\begin{aligned}
Y_i &= \alpha T_i + X_i'\beta + u_i \\
&= \alpha\left[T_i^f + \nu_i^*\right] + X_i'\beta + u_i \\
&= \alpha T_i^f + X_i'\beta + (\alpha\nu_i^* + u_i)
\end{aligned}$$

One could get a consistent estimate of $\alpha$ by regressing $Y_i$ on $X_i$ and $T_i^f$.

39 Two Stage Least Squares

In practice we don't know $T_i^f$, but we can get a consistent estimate of it from the fitted values of a reduced form regression; call this $\hat{T}_i$. (It is crucial that the reduced form gives us consistent estimates of $\beta_2^*$ and $\delta_2^*$.) That is:

1. Regress $T_i$ on $X_i$ and $Z_i$, and form the predicted value $\hat{T}_i$
2. Regress $Y_i$ on $X_i$ and $\hat{T}_i$

To run the second regression one needs to be able to vary $\hat{T}_i$ separately from $X_i$, which can only be done if there is a $Z_i$.

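40 It turns out that 2SLS is also numerically identical to IV (with 1 instrument). Note that

$$\hat{T} = Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}T$$

so

$$\hat{B}_{2SLS} = \left(\left[\hat{T}\ X\right]'\left[\hat{T}\ X\right]\right)^{-1}\left[\hat{T}\ X\right]'Y.$$

However, note that we can write

$$X = Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}X.$$

That is, projecting $X$ on $(X, Z)$ and using it to predict $X$ will be a perfect fit.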

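41 That means (using notation from earlier) that

$$\left[\hat{T}\ X\right] = Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}X^*.$$

Then (in the exactly identified case) we can write

$$\begin{aligned}
\hat{B}_{2SLS} &= \left(X^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}X^*\right)^{-1}X^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}Y \\
&= \left(X^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}X^*\right)^{-1}X^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}Y \\
&= \left(Z^{*\prime}X^*\right)^{-1}\left(Z^{*\prime}Z^*\right)\left(X^{*\prime}Z^*\right)^{-1}X^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}Y \\
&= \left(Z^{*\prime}X^*\right)^{-1}\left(Z^{*\prime}Z^*\right)\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}Y \\
&= \left(Z^{*\prime}X^*\right)^{-1}Z^{*\prime}Y \\
&= \hat{B}_{IV}
\end{aligned}$$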
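The same equivalence can be verified numerically by running the two stages explicitly and comparing with the direct IV formula. The data-generating values below are assumptions for this sketch only.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1_000
X = np.column_stack([np.ones(N), rng.normal(size=N)])
Z = rng.normal(size=N)
u = rng.normal(size=N)
T = 0.6 * Z + X @ np.array([0.2, 0.4]) + 0.5 * u + rng.normal(size=N)
Y = 1.5 * T + X @ np.array([1.0, -1.0]) + u

Z_star = np.column_stack([Z, X])   # instrument plus controls
X_star = np.column_stack([T, X])   # treatment plus controls

# Stage 1: fitted values of T from regressing T on (Z, X)
T_hat = Z_star @ np.linalg.lstsq(Z_star, T, rcond=None)[0]
# Stage 2: regress Y on (T_hat, X)
B_2sls = np.linalg.lstsq(np.column_stack([T_hat, X]), Y, rcond=None)[0]

# Direct IV formula (Z*' X*)^{-1} Z*' Y
B_iv = np.linalg.solve(Z_star.T @ X_star, Z_star.T @ Y)
print(B_2sls)
print(B_iv)
```

Note that the second-stage OLS standard errors would not be correct as reported by a regression routine, since they use residuals built from the fitted treatment rather than the actual treatment; this sketch compares point estimates only.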

42 3 Interpretations

Thus with 1 instrument we have 3 equivalent ways to derive IV:

1. GMM estimator, or $(\tilde{Z}'\tilde{T})^{-1}\tilde{Z}'\tilde{Y}$
2. Ratio of reduced form estimates: rescaling the reduced form
3. 2SLS: direct effect of fitted model

With more than one instrument only one of these procedures works. We'll worry about that later.

43 Examples

There are three main reasons people use IV:

1. Simultaneity bias: $\rho \neq 0$
2. Omitted variable bias: there are unobservables that contribute to $u_i$ that are correlated with $T_i$
3. Measurement error: we do not observe $T_i$ perfectly

While the first is the original reason for IV, in practice omitted variable bias is typically the biggest concern. A classic (perhaps the classic) example is the returns to schooling.

44 Returns to Schooling

This comes from Card's chapter in the 1999 Handbook of Labor Economics. Let's assume that

$$\log(W_i) = \alpha S_i + X_i'\beta + \varepsilon_i$$

where $W_i$ is wages, $S_i$ is schooling, and $X_i$ is other stuff. The biggest problem is unobserved ability.

45 We are worried about ability bias, so we want to use instrumental variables. A good instrument should have two qualities:

1. It should be correlated with schooling
2. It should be uncorrelated with unobserved ability (and other unobservables)

Many different things have been tried. Let's go through some of them.

46 Family Background

If my parents earn quite a bit of money, it should be easier for me to borrow for college. They might also put more value on education. This should make me more likely to go. It has no direct effect on my income: Wisconsin did not ask how much education my father had when they made my offer. But is family background likely to be uncorrelated with unobserved ability?

47 Closeness of College

If I have a college in my town, it should be much easier to attend college. I can live at home. If I live on campus, I can travel to college easily, I can come home for meals and to get my clothes washed, and I can hang out with my friends from high school. But is this uncorrelated with unobserved ability?

48 Quarter of Birth

This is the most creative. Consider the following two aspects of the U.S. education system (this actually varies from state to state and across time, but ignore that for now):

1. People begin kindergarten in the calendar year in which they turn 5
2. You must stay in school until you are 16

Now consider kids who can't stand school and will leave as soon as possible, who obey the truancy law and the school starting-age law, and who are born on either December 31, 1972 or January 1, 1973.

49 Those born on December 31 will turn 5 in the calendar year 1977 and will start school then (at age 4). They will stop school on their 16th birthday, which will be on Dec. 31, 1988. Thus they will stop school during the winter break of 11th grade.

Those born on January 1 will turn 5 in the calendar year 1978 and will start school then (at age 5). They will stop school on their 16th birthday, which will be on Jan. 1, 1989. Thus they will stop school during the winter break of 10th grade.

50 The instrument is a dummy variable for whether you are born on Dec. 31 or Jan. 1. This is pretty cool: for the reasons above it will be correlated with education, and there is no reason at all to believe that it is correlated with unobserved ability. The fact that not everyone obeys perfectly is not problematic: an instrument just needs to be correlated with schooling, it does not have to be perfectly correlated. In practice we can't just use the day as an instrument, so we use quarter of birth instead.

51 Policy Changes

Another possibility is to use institutional features that affect schooling. Often these institutional features affect one group or one cohort rather than others.

55 Consistently, IV estimates are higher than OLS. Why? Candidate explanations: bad instruments, ability bias, measurement error, publication bias, discount rate bias.

56 Heterogeneous Treatment Effects

OK, let's go back to the simple model but think about heterogeneous treatment effects, following Imbens and Angrist (1994). We can write

$$\begin{aligned}
Y_i &= T_i Y_{1i} + (1 - T_i) Y_{0i} \\
&= T_i (Y_{1i} - Y_{0i}) + Y_{0i} \\
&= \Delta_i T_i + Y_{0i}
\end{aligned}$$

Assume our binary instrument $Z_i$ is correlated with $T_i$ but not with $\Delta_i$ or $Y_{0i}$ (or equivalently $Y_{0i}$ or $Y_{1i}$). Does IV estimate the ATE?

57 There are 4 different types of people, distinguished by when $T_i = 1$:

1. Both when $Z_i = 1$ and when $Z_i = 0$
2. When $Z_i = 1$ only
3. Never
4. When $Z_i = 0$ only

Imbens and Angrist's monotonicity assumption rules out type 4 as a possibility. We also assume that $Z_i$ is independent of type. Let $\mu_1$, $\mu_2$, and $\mu_3$ represent the sample proportions of the three remaining groups, and let $S_i$ be an indicator of the type.

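58 We know

$$\hat\alpha_{2SLS} = \frac{\bar{Y}_1 - \bar{Y}_0}{\bar{T}_1 - \bar{T}_0} \approx \frac{E(Y_i \mid Z_i = 1) - E(Y_i \mid Z_i = 0)}{E(T_i \mid Z_i = 1) - E(T_i \mid Z_i = 0)}.$$

Let's figure out the numerator and denominator. For the numerator:

$$\begin{aligned}
& E(Y_i \mid Z_i = 1) - E(Y_i \mid Z_i = 0) \\
&= E(\Delta_i T_i + Y_{0i} \mid Z_i = 1) - E(\Delta_i T_i + Y_{0i} \mid Z_i = 0) \\
&= E(\Delta_i T_i \mid Z_i = 1) - E(\Delta_i T_i \mid Z_i = 0) \\
&= \left[\mu_1 E(\Delta_i T_i \mid S_i = 1, Z_i = 1) + \mu_2 E(\Delta_i T_i \mid S_i = 2, Z_i = 1) + \mu_3 E(\Delta_i T_i \mid S_i = 3, Z_i = 1)\right] \\
&\quad - \left[\mu_1 E(\Delta_i T_i \mid S_i = 1, Z_i = 0) + \mu_2 E(\Delta_i T_i \mid S_i = 2, Z_i = 0) + \mu_3 E(\Delta_i T_i \mid S_i = 3, Z_i = 0)\right] \\
&= \left[\mu_1 E(\Delta_i \mid S_i = 1) + \mu_2 E(\Delta_i \mid S_i = 2) + \mu_3 \cdot 0\right] - \left[\mu_1 E(\Delta_i \mid S_i = 1) + \mu_2 \cdot 0 + \mu_3 \cdot 0\right] \\
&= \mu_2 E(\Delta_i \mid S_i = 2)
\end{aligned}$$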

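59 For the denominator:

$$\begin{aligned}
& E(T_i \mid Z_i = 1) - E(T_i \mid Z_i = 0) \\
&= \left[\mu_1 E(T_i \mid S_i = 1, Z_i = 1) + \mu_2 E(T_i \mid S_i = 2, Z_i = 1) + \mu_3 E(T_i \mid S_i = 3, Z_i = 1)\right] \\
&\quad - \left[\mu_1 E(T_i \mid S_i = 1, Z_i = 0) + \mu_2 E(T_i \mid S_i = 2, Z_i = 0) + \mu_3 E(T_i \mid S_i = 3, Z_i = 0)\right] \\
&= \left[\mu_1 + \mu_2\right] - \left[\mu_1\right] \\
&= \mu_2
\end{aligned}$$

Thus

$$\frac{E(Y_i \mid Z_i = 1) - E(Y_i \mid Z_i = 0)}{E(T_i \mid Z_i = 1) - E(T_i \mid Z_i = 0)} = E(\Delta_i \mid S_i = 2).$$

They call this the local average treatment effect (LATE).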
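A small simulation can illustrate that the ratio of mean differences recovers the average effect for the group whose treatment responds to the instrument, rather than the overall average effect. The type shares, the effect sizes by type, and the numbering of types (1 = always treated, 2 = treated only when the instrument is on, 3 = never treated) are assumptions chosen for this sketch.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 200_000

# Hypothetical type shares: 1 = always treated, 2 = treated only if Z = 1, 3 = never treated
S = rng.choice([1, 2, 3], size=N, p=[0.3, 0.5, 0.2])
Z = rng.integers(0, 2, size=N)                    # binary instrument, independent of type
T = np.where(S == 1, 1, np.where(S == 2, Z, 0))   # treatment implied by type and instrument

Delta = np.array([3.0, 1.0, -1.0])[S - 1]         # treatment effect differs by type
Y0 = rng.normal(size=N)
Y = Delta * T + Y0

wald = (Y[Z == 1].mean() - Y[Z == 0].mean()) / (T[Z == 1].mean() - T[Z == 0].mean())
late = Delta[S == 2].mean()
ate = Delta.mean()
print("Wald / IV estimate:", wald)
print("mean effect for type 2:", late)
print("average treatment effect:", ate)
```

With these made-up numbers the estimate should land near the type 2 effect of 1 rather than the population average of about 1.2.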

60 Measurement Error

Another way people use instruments is for measurement error. In the classic model, let's get rid of the $X$'s, so we want to measure the effect of $T$ on $Y$:

$$Y_i = \beta_0 + \alpha T_i + u_i$$

and let's not worry about other issues, so assume that $\mathrm{cov}(T_i, u_i) = 0$. The complication is that I don't get to observe $T_i$; I only get to observe a noisy version of it:

$$\tau_{1i} = T_i + \xi_i$$

where $\xi_i$ is i.i.d. measurement error with variance $\sigma_\xi^2$.

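61 What happens if I run the regression on $\tau_{1i}$ instead of $T_i$?

$$\begin{aligned}
\hat\alpha &\approx \frac{\mathrm{Cov}(\tau_{1i}, Y_i)}{\mathrm{Var}(\tau_{1i})} \\
&= \frac{\mathrm{Cov}(T_i + \xi_i,\ \beta_0 + \alpha T_i + u_i)}{\mathrm{Var}(T_i + \xi_i)} \\
&= \alpha\,\frac{\mathrm{Var}(T_i)}{\mathrm{Var}(T_i) + \sigma_\xi^2}
\end{aligned}$$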

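62 Why is OLS biased? Let's rewrite the model as

$$\begin{aligned}
Y_i &= \beta_0 + \alpha T_i + u_i \\
&= \beta_0 + \alpha T_i + \alpha\xi_i + u_i - \alpha\xi_i \\
&= \beta_0 + \alpha\tau_{1i} + (u_i - \alpha\xi_i).
\end{aligned}$$

You can see the problem with OLS: $\tau_{1i} = T_i + \xi_i$ is correlated with $(u_i - \alpha\xi_i)$.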

63 Now suppose we have another measure of $T_i$,

$$\tau_{2i} = T_i + \eta_i$$

where $\eta_i$ is uncorrelated with everything else in the model. Using this as an instrument gives us a solution. $\tau_{2i}$ is correlated with $\tau_{1i}$ (through $T_i$), but uncorrelated with $(u_i - \alpha\xi_i)$, so we can use $\tau_{2i}$ as an instrument for $\tau_{1i}$.
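The attenuation from regressing on a noisy measure, and the fix from instrumenting with a second noisy measure, can be seen in a short simulation. The variances and coefficient values here are assumptions for illustration: with unit variance for both the true regressor and the measurement error, OLS should converge to half of the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 100_000
alpha = 2.0

T = rng.normal(size=N)                       # true regressor, variance 1
Y = 0.5 + alpha * T + rng.normal(size=N)
tau1 = T + rng.normal(size=N)                # first noisy measure, error variance 1
tau2 = T + rng.normal(size=N)                # second noisy measure, independent error

def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]

alpha_ols = cov(tau1, Y) / cov(tau1, tau1)   # attenuated toward alpha * 1 / (1 + 1)
alpha_iv = cov(tau2, Y) / cov(tau2, tau1)    # tau2 instruments for tau1
print("OLS on noisy measure:", alpha_ols)
print("IV with second measure:", alpha_iv)
```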

64 Twins

(Here we will think about both measurement error and fixed effect approaches.)

$$\log(W_{if}) = \theta_f + \alpha S_{if} + u_{if}$$

The problem is that $\theta_f$ is correlated with $S_{if}$. We can solve this by differencing:

$$\Delta\log(W_f) = \alpha\,\Delta S_f + \Delta u_f$$

If $\Delta S_f$ is uncorrelated with $\Delta u_f$, then we can use this to get consistent estimates of $\alpha$.

65 The problem here is that a little measurement error can screw things up quite a bit, because the variance of $\Delta S_f$ is small. A solution to this is to get two measures of schooling: ask me about my schooling, and also ask my brother about my schooling; then do the same thing for my brother's schooling. This gives us two different measures of $S_{if}$. Use one as an instrument for the other.

67 The Colonial Origins of Comparative Development: An Empirical Investigation, by Acemoglu, Johnson, and Robinson

Time for a non-education example: a very important paper on an extremely important issue. Their goal is to estimate the effects of institutions on economic development. The idea is that countries with better institutions should develop more.

68 Specifically they want to estimate the equation

$$\log(y_i) = \mu + \alpha R_i + X_i'\gamma + \varepsilon_i$$

where $y_i$ is GDP per capita, $R_i$ is "protection against expropriation," and $X_i$ is other stuff. Let's take a preliminary look.

69 Table 2: OLS Regressions (Acemoglu et al., "The Colonial Origins of Development," vol. 91, no. 5, p. 1379)

Columns: (1) whole world, (2) base sample, (3) whole world, (4) whole world, (5) base sample, (6) base sample, (7) whole world, (8) base sample. Dependent variable: log GDP per capita in 1995 in columns (1)-(6); log output per worker in 1988 in columns (7)-(8).

Only the standard errors (in parentheses) survive in this transcription; they are listed in the order they appear.

Average protection against expropriation risk: (0.04) (0.06) (0.06) (0.05) (0.06) (0.06) (0.04) (0.06)
Latitude: (0.49) (0.51) (0.70) (0.63)
Asia dummy: (0.19) (0.23)
Africa dummy: (0.15) (0.17)
"Other" continent dummy: (0.20) (0.32)

Notes: Dependent variable: columns (1)-(6), log GDP per capita (PPP basis) in 1995, current prices (from the World Bank's World Development Indicators 1999); columns (7)-(8), log output per worker in 1988 from Hall and Jones (1999). Average protection against expropriation risk is measured on a scale from 0 to 10, where a higher score means more protection against expropriation, averaged over 1985 to 1995, from Political Risk Services. Standard errors are in parentheses. In regressions with continent dummies, the dummy for America is omitted. See Appendix Table A1 for more detailed variable definitions.

70 [Figure 2. OLS relationship between expropriation risk and income (The American Economic Review, December 2001). Horizontal axis: average expropriation risk; points are labeled with country codes.]

71 The problem is that there is no reason to believe institutions are set at random. They suggest using mortalities of settlers as an instrument, with the following argument, which the paper summarizes schematically as:

(potential) settler mortality ⇒ settlements ⇒ early institutions ⇒ current institutions ⇒ current performance

They use data on the mortality rates of soldiers, bishops, and sailors stationed in the colonies between the seventeenth and nineteenth centuries, largely based on the work of the historian Philip D. Curtin. Let's see what they find.

72 Table 1: Descriptive Statistics (Acemoglu et al., "The Colonial Origins of Development," vol. 91, no. 5, p. 1377)

Columns: whole world, base sample, and by quartiles of mortality (1)-(4). Only the standard deviations (in parentheses) for the whole world and base sample columns survive in this transcription; the years in several row labels were also lost.

Log GDP per capita (PPP): (1.1) (1.1)
Log output per worker (with level of United States normalized to 1): (1.1) (1.0)
Average protection against expropriation risk: (1.8) (1.5)
Constraint on executive [year lost]: (2.3) (2.3)
Constraint on executive [year lost]: (1.8) (2.1)
Constraint on executive in first year of independence: (2.4) (2.4)
Democracy [year lost]: (2.6) (3.0)
European settlements [year lost]: (0.4) (0.3)
Log European settler mortality: n.a. (1.1)

Notes: Standard deviations are in parentheses. Mortality is potential settler mortality, measured in terms of deaths per annum per 1,000 "mean strength" (raw mortality numbers are adjusted to what they would be if a force of 1,000 living people were kept in place for a whole year, e.g., it is possible for this number to exceed 1,000 in episodes of extreme mortality as those

73 [Figure from The American Economic Review: scatter plot of countries, labeled with country codes. Horizontal axis: log of settler mortality.]

74 Table 3: Determinants of Institutions

Columns (1)-(10). Only the standard errors (in parentheses) survive in this transcription; they are listed in the order they appear, and the years in the row labels were lost.

Panel A. Dependent variable is average protection against expropriation risk.
Constraint on executive [year lost]: (0.08) (0.09)
Democracy [year lost]: (0.06) (0.07)
Constraint on executive in first year of independence: (0.08) (0.08)
European settlements [year lost]: (0.61) (0.78)
Log European settler mortality: (0.13) (0.14)
Latitude: (1.40) (1.50) (1.40) (1.51) (1.34)

Panel B. Dependent variables are constraint on executive in 1900, democracy, and European settlements.
European settlements [year lost]: (0.73) (0.93) (0.90) (1.20)
Log European settler mortality: (0.17) (0.18) (0.24) (0.25) (0.02) (0.02)
Latitude: (1.80) (1.70) (2.30) (2.40) (0.19)

75 [Figure 1. Reduced-form relationship between income and settler mortality (Acemoglu et al., "The Colonial Origins of Development"). Horizontal axis: log of settler mortality; points are labeled with country codes.]

76 Table 4: IV Regressions of Log GDP per Capita

Columns: (1) base sample, (2) base sample, (3) base sample without Neo-Europes, (4) base sample without Neo-Europes, (5) base sample without Africa, (6) base sample without Africa, (7) base sample with continent dummies, (8) base sample with continent dummies, (9) base sample, dependent variable is log output per worker.

Only the standard errors (in parentheses) survive in this transcription; they are listed in the order they appear.

Panel A: Two-Stage Least Squares
Average protection against expropriation risk: (0.16) (0.22) (0.36) (0.35) (0.10) (0.12) (0.30) (0.46) (0.17)
Latitude: (1.34) (1.46) (0.84) (1.8)
Asia dummy: (0.40) (0.52)
Africa dummy: (0.36) (0.42)
"Other" continent dummy: (0.85) (1.0)

Panel B: First Stage for Average Protection Against Expropriation Risk
Log European settler mortality: (0.13) (0.14) (0.13) (0.14) (0.22) (0.24) (0.17) (0.18) (0.13)
Latitude: (1.34) (1.50) (1.43) (1.40)
Asia dummy: (0.49) (0.50)
Africa dummy: (0.41) (0.41)
"Other" continent dummy: (0.84) (0.84)

77 Overidentification

What happens when we have more than one instrument? Let's think about a general case in which $Z_i$ is multidimensional. Let $K_Z$ be the dimension of $Z_i^*$ and let $K_X$ denote the dimension of $X_i^*$. Now we have more equations than parameters, so we can no longer solve for $\hat{B}$ using

$$0 = Z^{*\prime}\left(Y - X^*\hat{B}\right)$$

because this gives us $K_Z$ equations in $K_X$ unknowns.

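78 A simple solution is to follow GMM and weight the moments by some $K_Z \times K_Z$ weighting matrix $\Omega$ and then minimize

$$\left[Z^{*\prime}\left(Y - X^*\hat{B}\right)\right]'\,\Omega\,\left[Z^{*\prime}\left(Y - X^*\hat{B}\right)\right]$$

which gives

$$-2X^{*\prime}Z^*\,\Omega\,Z^{*\prime}\left(Y - X^*\hat{B}\right) = 0$$

(notice that in the exactly identified case $X^{*\prime}Z^*\Omega$ drops out). We can solve directly for our estimator:

$$\hat{B}_{GMM} = \left(X^{*\prime}Z^*\,\Omega\,Z^{*\prime}X^*\right)^{-1}X^{*\prime}Z^*\,\Omega\,Z^{*\prime}Y$$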

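79 Two stage least squares is a special case of this:

$$\hat{B}_{2SLS} = \left(X^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}X^*\right)^{-1}X^{*\prime}Z^*\left(Z^{*\prime}Z^*\right)^{-1}Z^{*\prime}Y$$

Notice that this is the same as $\hat{B}_{GMM}$ when $\Omega = \left(Z^{*\prime}Z^*\right)^{-1}$.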
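A minimal sketch of the weighted estimator with two excluded instruments, checking that the inverse cross-product of the instruments as the weighting matrix reproduces the usual two-step procedure. The function name `linear_gmm` and the simulated data are assumptions made for this example.

```python
import numpy as np

def linear_gmm(Y, X_star, Z_star, Omega):
    """B = (X'Z Omega Z'X)^{-1} X'Z Omega Z'Y for a given weighting matrix Omega."""
    XZ = X_star.T @ Z_star
    return np.linalg.solve(XZ @ Omega @ XZ.T, XZ @ Omega @ (Z_star.T @ Y))

rng = np.random.default_rng(4)
N = 2_000
X = np.ones((N, 1))                        # intercept only
Z = rng.normal(size=(N, 2))                # two excluded instruments
u = rng.normal(size=N)
T = Z @ np.array([0.5, 0.3]) + 0.6 * u + rng.normal(size=N)
Y = 1.0 + 2.0 * T + u

Z_star = np.column_stack([Z, X])           # K_Z = 3 instruments
X_star = np.column_stack([T, X])           # K_X = 2 regressors

B_gmm = linear_gmm(Y, X_star, Z_star, np.linalg.inv(Z_star.T @ Z_star))

# Two-step 2SLS for comparison
T_hat = Z_star @ np.linalg.lstsq(Z_star, T, rcond=None)[0]
B_2sls = np.linalg.lstsq(np.column_stack([T_hat, X]), Y, rcond=None)[0]

print(B_gmm)
print(B_2sls)
```

Passing a different weighting matrix, such as the identity, would generally give a different estimate in the overidentified case, which is why the choice of weights matters here and not with a single instrument.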

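80 Consistency

$$\begin{aligned}
\hat{B}_{GMM} &= \left(\frac{1}{N}X^{*\prime}Z^*\,\Omega\,\frac{1}{N}Z^{*\prime}X^*\right)^{-1}\frac{1}{N}X^{*\prime}Z^*\,\Omega\,\frac{1}{N}Z^{*\prime}\left(X^*B + U\right) \\
&= B + \left(\frac{1}{N}X^{*\prime}Z^*\,\Omega\,\frac{1}{N}Z^{*\prime}X^*\right)^{-1}\frac{1}{N}X^{*\prime}Z^*\,\Omega\,\frac{1}{N}Z^{*\prime}U \\
&\approx B + \left(E\left(X_i^*Z_i^{*\prime}\right)\Omega\,E\left(Z_i^*X_i^{*\prime}\right)\right)^{-1}E\left(X_i^*Z_i^{*\prime}\right)\Omega\,E\left(Z_i^*u_i\right) \\
&= B
\end{aligned}$$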

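81 Inference

$$\sqrt{N}\left(\hat{B} - B\right) = \left(\frac{1}{N}X^{*\prime}Z^*\,\Omega\,\frac{1}{N}Z^{*\prime}X^*\right)^{-1}\frac{1}{N}X^{*\prime}Z^*\,\Omega\,\frac{1}{\sqrt{N}}Z^{*\prime}U$$

Using a standard central limit theorem with i.i.d. data,

$$\frac{1}{\sqrt{N}}Z^{*\prime}U = \frac{1}{\sqrt{N}}\sum_{i=1}^{N} Z_i^*u_i \approx N\left(0, E\left(u_i^2 Z_i^*Z_i^{*\prime}\right)\right).$$

Thus

$$\sqrt{N}\left(\hat{B} - B\right) \approx N\left(0, A'VA\right)$$

with

$$V = E\left(X_i^*Z_i^{*\prime}\right)\Omega\,E\left(u_i^2 Z_i^*Z_i^{*\prime}\right)\Omega\,E\left(Z_i^*X_i^{*\prime}\right), \qquad A = \left(E\left(X_i^*Z_i^{*\prime}\right)\Omega\,E\left(Z_i^*X_i^{*\prime}\right)\right)^{-1}.$$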

82 From GMM results we know that the efficient weighting matrix is

$$\Omega = E\left(u_i^2 Z_i^*Z_i^{*\prime}\right)^{-1}$$

in which case the covariance matrix simplifies to

$$\left(E\left(X_i^*Z_i^{*\prime}\right)E\left(u_i^2 Z_i^*Z_i^{*\prime}\right)^{-1}E\left(Z_i^*X_i^{*\prime}\right)\right)^{-1}.$$

This also means that under homoskedasticity two stage least squares is efficient.

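83 Overidentification Tests

Let's think about testing in the following way. Suppose we have two instruments, so that we have three sets of moment conditions:

$$\begin{aligned}
0 &= Z_1'\left(Y - T\hat\alpha - X\hat\beta\right) \\
0 &= Z_2'\left(Y - T\hat\alpha - X\hat\beta\right) \\
0 &= X'\left(Y - T\hat\alpha - X\hat\beta\right)
\end{aligned}$$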

84 As before we can use partitioned regression to deal with the $X$'s and then write the first two moment equations as

$$0 = \tilde{Z}_1'\left(\tilde{Y} - \tilde{T}\hat\alpha\right), \qquad 0 = \tilde{Z}_2'\left(\tilde{Y} - \tilde{T}\hat\alpha\right).$$

The way I see the overidentification test is whether we can find an $\hat\alpha$ that solves both equations.

85 That is, let

$$\hat\alpha_1 = \frac{\tilde{Z}_1'\tilde{Y}}{\tilde{Z}_1'\tilde{T}}, \qquad \hat\alpha_2 = \frac{\tilde{Z}_2'\tilde{Y}}{\tilde{Z}_2'\tilde{T}}.$$

If $\hat\alpha_1 \approx \hat\alpha_2$ then the test will not reject the model; otherwise it will. For this reason I am not a big fan of overidentification tests: if you have two crappy instruments with roughly the same bias, you will fail to reject. Why not just estimate $\hat\alpha_1$ and $\hat\alpha_2$ and look at them? It seems to me that you learn much more from that than from a simple F-statistic.
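The suggestion to estimate the coefficient separately with each instrument and compare can be sketched as follows. In this made-up design the second instrument enters the outcome error directly, so it violates the exclusion restriction by construction; `residualize` is a hypothetical helper name.

```python
import numpy as np

def residualize(A, X):
    """Residuals from regressing A on X."""
    return A - X @ np.linalg.lstsq(X, A, rcond=None)[0]

rng = np.random.default_rng(5)
N = 50_000
X = np.ones((N, 1))                       # intercept only
Z1, Z2 = rng.normal(size=N), rng.normal(size=N)
e = rng.normal(size=N)
u = 0.5 * Z2 + e                          # Z2 is an invalid instrument by construction
T = 0.8 * Z1 + 0.8 * Z2 + 0.7 * e + rng.normal(size=N)
alpha = 2.0
Y = alpha * T + 1.0 + u

Y_t, T_t = residualize(Y, X), residualize(T, X)
Z1_t, Z2_t = residualize(Z1, X), residualize(Z2, X)

alpha_1 = (Z1_t @ Y_t) / (Z1_t @ T_t)     # uses only the first instrument
alpha_2 = (Z2_t @ Y_t) / (Z2_t @ T_t)     # uses only the second instrument
print("alpha from Z1:", alpha_1)
print("alpha from Z2:", alpha_2)
```

Here the two estimates should differ visibly (roughly 2 versus roughly 2.6). If both instruments were invalid with similar bias, the two numbers would agree with each other and still both be wrong, which is the concern raised above.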


More information

The College Premium in the Eighties: Returns to College or Returns to Ability

The College Premium in the Eighties: Returns to College or Returns to Ability The College Premium in the Eighties: Returns to College or Returns to Ability Christopher Taber March 4, 2008 The problem of Ability Bias were A i is ability Y i = X i β + αs i + A i + u i If A i is correlated

More information

Instrumental Variables and the Problem of Endogeneity

Instrumental Variables and the Problem of Endogeneity Instrumental Variables and the Problem of Endogeneity September 15, 2015 1 / 38 Exogeneity: Important Assumption of OLS In a standard OLS framework, y = xβ + ɛ (1) and for unbiasedness we need E[x ɛ] =

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Lecture 8. Roy Model, IV with essential heterogeneity, MTE Lecture 8. Roy Model, IV with essential heterogeneity, MTE Economics 2123 George Washington University Instructor: Prof. Ben Williams Heterogeneity When we talk about heterogeneity, usually we mean heterogeneity

More information

Simple Regression Model. January 24, 2011

Simple Regression Model. January 24, 2011 Simple Regression Model January 24, 2011 Outline Descriptive Analysis Causal Estimation Forecasting Regression Model We are actually going to derive the linear regression model in 3 very different ways

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Quantitative Economics for the Evaluation of the European Policy

Quantitative Economics for the Evaluation of the European Policy Quantitative Economics for the Evaluation of the European Policy Dipartimento di Economia e Management Irene Brunetti Davide Fiaschi Angela Parenti 1 25th of September, 2017 1 ireneb@ec.unipi.it, davide.fiaschi@unipi.it,

More information

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor 1. The regression equation 2. Estimating the equation 3. Assumptions required for

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Topic 10: Panel Data Analysis

Topic 10: Panel Data Analysis Topic 10: Panel Data Analysis Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Introduction Panel data combine the features of cross section data time series. Usually a panel

More information

Gov 2000: 9. Regression with Two Independent Variables

Gov 2000: 9. Regression with Two Independent Variables Gov 2000: 9. Regression with Two Independent Variables Matthew Blackwell Fall 2016 1 / 62 1. Why Add Variables to a Regression? 2. Adding a Binary Covariate 3. Adding a Continuous Covariate 4. OLS Mechanics

More information

Assessing Studies Based on Multiple Regression

Assessing Studies Based on Multiple Regression Assessing Studies Based on Multiple Regression Outline 1. Internal and External Validity 2. Threats to Internal Validity a. Omitted variable bias b. Functional form misspecification c. Errors-in-variables

More information

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data Panel data Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data - possible to control for some unobserved heterogeneity - possible

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

II. MATCHMAKER, MATCHMAKER

II. MATCHMAKER, MATCHMAKER II. MATCHMAKER, MATCHMAKER Josh Angrist MIT 14.387 Fall 2014 Agenda Matching. What could be simpler? We look for causal effects by comparing treatment and control within subgroups where everything... or

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

Descriptive Statistics (And a little bit on rounding and significant digits)

Descriptive Statistics (And a little bit on rounding and significant digits) Descriptive Statistics (And a little bit on rounding and significant digits) Now that we know what our data look like, we d like to be able to describe it numerically. In other words, how can we represent

More information

Identification of Models of the Labor Market

Identification of Models of the Labor Market Identification of Models of the Labor Market Eric French and Christopher Taber, Federal Reserve Bank of Chicago and Wisconsin November 6, 2009 French,Taber (FRBC and UW) Identification November 6, 2009

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the

More information

Dealing With Endogeneity

Dealing With Endogeneity Dealing With Endogeneity Junhui Qian December 22, 2014 Outline Introduction Instrumental Variable Instrumental Variable Estimation Two-Stage Least Square Estimation Panel Data Endogeneity in Econometrics

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

An example to start off with

An example to start off with Impact Evaluation Technical Track Session IV Instrumental Variables Christel Vermeersch Human Development Human Network Development Network Middle East and North Africa Region World Bank Institute Spanish

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94 Freeing up the Classical Assumptions () Introductory Econometrics: Topic 5 1 / 94 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions needed for derivations

More information

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects

Economics 113. Simple Regression Assumptions. Simple Regression Derivation. Changing Units of Measurement. Nonlinear effects Economics 113 Simple Regression Models Simple Regression Assumptions Simple Regression Derivation Changing Units of Measurement Nonlinear effects OLS and unbiased estimates Variance of the OLS estimates

More information

Notes 11: OLS Theorems ECO 231W - Undergraduate Econometrics

Notes 11: OLS Theorems ECO 231W - Undergraduate Econometrics Notes 11: OLS Theorems ECO 231W - Undergraduate Econometrics Prof. Carolina Caetano For a while we talked about the regression method. Then we talked about the linear model. There were many details, but

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

Eco517 Fall 2014 C. Sims FINAL EXAM

Eco517 Fall 2014 C. Sims FINAL EXAM Eco517 Fall 2014 C. Sims FINAL EXAM This is a three hour exam. You may refer to books, notes, or computer equipment during the exam. You may not communicate, either electronically or in any other way,

More information

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison

Treatment Effects. Christopher Taber. September 6, Department of Economics University of Wisconsin-Madison Treatment Effects Christopher Taber Department of Economics University of Wisconsin-Madison September 6, 2017 Notation First a word on notation I like to use i subscripts on random variables to be clear

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation SS 2010 WS 2014/15 Alexander Spermann Evaluation With Non-Experimental Approaches Selection on Unobservables Natural Experiment (exogenous variation in a variable) DiD Example: Card/Krueger (1994) Minimum

More information

6.080 / Great Ideas in Theoretical Computer Science Spring 2008

6.080 / Great Ideas in Theoretical Computer Science Spring 2008 MIT OpenCourseWare http://ocw.mit.edu 6.080 / 6.089 Great Ideas in Theoretical Computer Science Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Empirical approaches in public economics

Empirical approaches in public economics Empirical approaches in public economics ECON4624 Empirical Public Economics Fall 2016 Gaute Torsvik Outline for today The canonical problem Basic concepts of causal inference Randomized experiments Non-experimental

More information

review session gov 2000 gov 2000 () review session 1 / 38

review session gov 2000 gov 2000 () review session 1 / 38 review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

Problem Set - Instrumental Variables

Problem Set - Instrumental Variables Problem Set - Instrumental Variables 1. Consider a simple model to estimate the effect of personal computer (PC) ownership on college grade point average for graduating seniors at a large public university:

More information

Regression Discontinuity

Regression Discontinuity Regression Discontinuity Christopher Taber Department of Economics University of Wisconsin-Madison October 9, 2016 I will describe the basic ideas of RD, but ignore many of the details Good references

More information

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

1 The Multiple Regression Model: Freeing Up the Classical Assumptions 1 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions were crucial for many of the derivations of the previous chapters. Derivation of the OLS estimator

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Princeton University Asian Political Methodology Conference University of Sydney Joint

More information

Exam D0M61A Advanced econometrics

Exam D0M61A Advanced econometrics Exam D0M61A Advanced econometrics 19 January 2009, 9 12am Question 1 (5 pts.) Consider the wage function w i = β 0 + β 1 S i + β 2 E i + β 0 3h i + ε i, where w i is the log-wage of individual i, S i is

More information

An analogy from Calculus: limits

An analogy from Calculus: limits COMP 250 Fall 2018 35 - big O Nov. 30, 2018 We have seen several algorithms in the course, and we have loosely characterized their runtimes in terms of the size n of the input. We say that the algorithm

More information

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning Økonomisk Kandidateksamen 2004 (I) Econometrics 2 Rettevejledning This is a closed-book exam (uden hjælpemidler). Answer all questions! The group of questions 1 to 4 have equal weight. Within each group,

More information

Part VII. Accounting for the Endogeneity of Schooling. Endogeneity of schooling Mean growth rate of earnings Mean growth rate Selection bias Summary

Part VII. Accounting for the Endogeneity of Schooling. Endogeneity of schooling Mean growth rate of earnings Mean growth rate Selection bias Summary Part VII Accounting for the Endogeneity of Schooling 327 / 785 Much of the CPS-Census literature on the returns to schooling ignores the choice of schooling and its consequences for estimating the rate

More information

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests: One sided tests So far all of our tests have been two sided. While this may be a bit easier to understand, this is often not the best way to do a hypothesis test. One simple thing that we can do to get

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Lecture 10 Optimal Growth Endogenous Growth. Noah Williams

Lecture 10 Optimal Growth Endogenous Growth. Noah Williams Lecture 10 Optimal Growth Endogenous Growth Noah Williams University of Wisconsin - Madison Economics 702 Spring 2018 Optimal Growth Path Recall we assume exogenous growth in population and productivity:

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 10: Panel Data Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 10 VŠE, SS 2016/17 1 / 38 Outline 1 Introduction 2 Pooled OLS 3 First differences 4 Fixed effects

More information

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Maximilian Kasy Department of Economics, Harvard University 1 / 40 Agenda instrumental variables part I Origins of instrumental

More information

Chapter 9: Assessing Studies Based on Multiple Regression. Copyright 2011 Pearson Addison-Wesley. All rights reserved.

Chapter 9: Assessing Studies Based on Multiple Regression. Copyright 2011 Pearson Addison-Wesley. All rights reserved. Chapter 9: Assessing Studies Based on Multiple Regression 1-1 9-1 Outline 1. Internal and External Validity 2. Threats to Internal Validity a) Omitted variable bias b) Functional form misspecification

More information

Next, we discuss econometric methods that can be used to estimate panel data models.

Next, we discuss econometric methods that can be used to estimate panel data models. 1 Motivation Next, we discuss econometric methods that can be used to estimate panel data models. Panel data is a repeated observation of the same cross section Panel data is highly desirable when it is

More information

Short Questions (Do two out of three) 15 points each

Short Questions (Do two out of three) 15 points each Econometrics Short Questions Do two out of three) 5 points each ) Let y = Xβ + u and Z be a set of instruments for X When we estimate β with OLS we project y onto the space spanned by X along a path orthogonal

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

ECO375 Tutorial 8 Instrumental Variables

ECO375 Tutorial 8 Instrumental Variables ECO375 Tutorial 8 Instrumental Variables Matt Tudball University of Toronto Mississauga November 16, 2017 Matt Tudball (University of Toronto) ECO375H5 November 16, 2017 1 / 22 Review: Endogeneity Instrumental

More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

Econometrics Homework 4 Solutions

Econometrics Homework 4 Solutions Econometrics Homework 4 Solutions Question 1 (a) General sources of problem: measurement error in regressors, omitted variables that are correlated to the regressors, and simultaneous equation (reverse

More information

Properties of estimator Functional Form. Econometrics. Lecture 8. Nathaniel Higgins JHU. Nathaniel Higgins Lecture 8

Properties of estimator Functional Form. Econometrics. Lecture 8. Nathaniel Higgins JHU. Nathaniel Higgins Lecture 8 Econometrics Lecture 8 Nathaniel Higgins JHU Homework Next class: GDP, population, temperature, energy, and mortality data together by Nov. 9 (next class) If you have questions / need help, let Rob or

More information

Principles Underlying Evaluation Estimators

Principles Underlying Evaluation Estimators The Principles Underlying Evaluation Estimators James J. University of Chicago Econ 350, Winter 2019 The Basic Principles Underlying the Identification of the Main Econometric Evaluation Estimators Two

More information

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9. Section 7 Model Assessment This section is based on Stock and Watson s Chapter 9. Internal vs. external validity Internal validity refers to whether the analysis is valid for the population and sample

More information

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Matthew Harding and Carlos Lamarche January 12, 2011 Abstract We propose a method for estimating

More information

Other Models of Labor Dynamics

Other Models of Labor Dynamics Other Models of Labor Dynamics Christopher Taber Department of Economics University of Wisconsin-Madison March 2, 2014 Outline 1 Kambourov and Manovskii 2 Neal 3 Pavan Occupational Specificity of Human

More information

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE Introduction Chapter 6. Panel Data 2 Panel data The term panel data refers to data sets with repeated observations over

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

Linear Models in Econometrics

Linear Models in Econometrics Linear Models in Econometrics Nicky Grant At the most fundamental level econometrics is the development of statistical techniques suited primarily to answering economic questions and testing economic theories.

More information

Introduction to Econometrics. Review of Probability & Statistics

Introduction to Econometrics. Review of Probability & Statistics 1 Introduction to Econometrics Review of Probability & Statistics Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com Introduction 2 What is Econometrics? Econometrics consists of the application of mathematical

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Endogeneity b) Instrumental

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

PSC 504: Differences-in-differeces estimators

PSC 504: Differences-in-differeces estimators PSC 504: Differences-in-differeces estimators Matthew Blackwell 3/22/2013 Basic differences-in-differences model Setup e basic idea behind a differences-in-differences model (shorthand: diff-in-diff, DID,

More information

Ordinary Least Squares Linear Regression

Ordinary Least Squares Linear Regression Ordinary Least Squares Linear Regression Ryan P. Adams COS 324 Elements of Machine Learning Princeton University Linear regression is one of the simplest and most fundamental modeling ideas in statistics

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Systematic Uncertainty Max Bean John Jay College of Criminal Justice, Physics Program

Systematic Uncertainty Max Bean John Jay College of Criminal Justice, Physics Program Systematic Uncertainty Max Bean John Jay College of Criminal Justice, Physics Program When we perform an experiment, there are several reasons why the data we collect will tend to differ from the actual

More information