L8. LINEAR PANEL DATA MODELS UNDER STRICT AND WEAK EXOGENEITY. 1. A Structural Linear Panel Model

Size: px
Start display at page:

Download "L8. LINEAR PANEL DATA MODELS UNDER STRICT AND WEAK EXOGENEITY. 1. A Structural Linear Panel Model"

Transcription

1 Victor Chernozhukov and Ivan Fernandez-Val Econometrics. Spring Massachusetts Institute o Technology: MIT OpenCourseWare, License: Creative Commons BY-NC-SA L8. LINEAR PANEL DATA MODELS UNDER STRICT AND WEAK EXOGENEITY VICTOR CHERNOZHUKOV AND IV AN FERN ANDEZ-VAL Abstract. We discuss basic examples o linear panel data models and their estimation via the ixed eects, dierencing, and correlated random eects approaches. 1. A Structural Linear Panel Model 1.1. The Setting. Here we consider the linear structural equations model (SEM) Y it = a itα + W i + D itβ + E it =: a i + Xitγ + E it, E it (X it, a i ), (1.1) where i = 1,..., n and t = 1,..., T. Here Y it is the outcome or an observational unit i at time t, D it is a vector o variables o interest or treatments, whose predictive eect α we would like to estimate, W it is a vector o covariates or controls including a constant, X it simply stacks together D it and W it, and E it is an error term normalized to have zero mean or each unit. We shall assume that the vectors Z i := {(Y it, Xit) } T t=1, that collect all the variables or the observational unit i, are i.i.d. across i. We note that this assumption does allow or arbitrary dependence o data or unit i across t, subject to other conditions speciied below. In our analysis the temporal dimension T will be small and the cross-sectional dimension n will be large. Accordingly, we shall derive ormal asymptotic results under the large n, ixed T asymptotics, were n and T is ixed. This type o scenario is oten called the short panel. The orthogonality condition stated will be strengthened below to various assumptions, which permit application o common estimation methods or perorming inerence on the target parameter α. The random variable a i is the unobserved individual eect. It can be correlated to X it, and so we can not omit it without introducing omitted variable bias that leads to 1

2 2 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL inconsistent estimates o the parameter o interest α. We can give context to this point by thinking o the case where a i is the unobserved individual s innate ability, Y it is wage, D it is education, and W it are other characteristics o a person i at time t. Clearly omission o a i rom the model would lead to an omitted variable bias and inconsistent estimation o the target parameter α or the usual reasons that we discussed in L2. Figure 1 illustrates the omitted variable bias problem in the linear panel model. An important point to make here is the ollowing. Suppose D it is randomly assigned conditional on a i and W it, then α estimates a causal parameter the average treatment eect. This is merely one o many suicient conditions or causal interpretability o α. An example o another condition is the assumption o parallel trends underlying the dierence-in-dierence approach, as described below. The individual eect a i is not observed and can not be identiied or consistently estimated rom short panels. This act is sometimes reerred to as the incidental parameter problem, a term that was coined by Neyman and Scott [4]. However, the act that we have panel data allows or the remarkable possibility to accommodate unobserved individual eects a i in the analysis explicitly. In act, below we design identiication and estimation strategies or the target parameter α that bypass the identiication and consistent estimation o a i. Beore we continue, it is worth pointing out that W it could contain a T -dimensional vector o indicators or time periods as a subvector, namely the vector Q t = (0, 0,..., 1,..., 0) with 1 in the t-th position. In this case the model is said to include time eects The Dierence-in-Dierence Method. A very important special case o the problem that we consider is the dierence-in-dierence approach to identiication. Here where Y it = a i + αd it + W it β + E it, (1.2) D it is the indicator o unit i receiving the treatment at time t; W it are various other controls, or example, time dummies in the simplest case;

3 L8 3 Y_it within pooled X_it Figure 1. The igure illustrated the inconsistency o the pooled least square estimator, which omits the individual eects. The igure plots the data rom the linear panel data model in which a i is related to X it. The pooled estimator, which treats a i as part o the error, estimates the projection o Y it on X it, but this is dierent then estimating target structural unction which characterizes the projection o Y it on a i and X it. As a result the pooled least squares is inconsistent or the target unction, as illustrated in the igure. The igure also demonstrates the within or ixed eect estimator, which does consistently estimate the target unction. and α has the interpretation o the treatment eect. Suppose there are two groups: the treated and control. The units in the treated group receive the treatment at time t 0, and the units in the control group do not receive the treatment at any time. Then, or the units

4 4 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL in the treated group E[Y is Treated] = E[a i + W is β + E is Treated], s < t 0, E[Y it Treated] = α + E[a i + W it β + E it Treated], t t 0. It is tempting to take the dierence: E[Y it Treated] E[Y is Treated] = α + E[(W it W is ) β + E it E is Treated], Change(s,t) in an eort to identiy the treatment eect α, but the trend term Change(s, t) does not necessarily vanish. For the units in the control group: I we take the dierence, E[Y is Control] = E[a i + W is β + E is Control], s < t 0, E[Y it Control] = E[a i + W it β + E it Control], t t 0. E[Y it Control] E[Y is Control] = E[(W it W is ) β + E it E is Control]. Under the assumption o the parallel trends: Change(s, t) = Change (s, t), Change ' (s,t) which says that treatment and control groups experienced parallel trends apart rom the treatment eect, we can take dierence-o-the-dierence to identiy the treatment eect: E[Y it Y is Treated] E[Y it Y is Control] = α. Thus under the assumption o parallel trends, we can identiy the treatment eect α. The structural linear panel data ormulation allows us to incorporate the dierence-indierence approach as a special case. Under the assumption o parallel trends and other conditions speciied below, we shall be able to identiy and estimate α. Note that the assumption o parallel trends is more plausible when the treatment D it is randomly assigned, but it also allows or non-random assignment even conditional on covariates. 1 For example, i Y it is income and D it is participation in a training program, it may be that D it = 1 is assigned to those who had low income Y st in the pre-treatment period s < t 0. In this case, we are still able to identiy α under the assumption o the parallel trends or high and low-income groups in the absence o treatment. The plausibility o the assumption really depends on the context. It could be tested by seeing i group-speciic trends enter the panel regression model as statistically insigniicant. 1 O course, this is not the irst time this happens in this course. Recall Wright s instrumental variables model.

5 L Basic Identiication and Estimation Strategies Under Strict Exogeneity There are our methods or identiication and estimation that we shall explore, which vary in terms o strength o assumptions needed or identiication o α: 1. within-estimation or ixed eect estimation, 2. irst-dierencing, 3. correlated random eects, 4. pooled estimation or random eects. The names listed above are conventional names given to the procedures, though some names are potentially misleading, as is usually the case Key approach I: within-groups or ixed eects approach. This approach eliminates the individual eects a i by individual demeaning. Deine the operator D acting on doubly indexed random variables V it as: 1 T DV it = V it V it. T t=1 Apply this operator to the linear SEM (1.1) to obtain: DY it = (DX it ) γ + DE it. This has eliminated the individual eects. It is now very tempting to identiy the parameter γ as a projection coeicient and estimate it by least squares, but our assumptions are not suicient or this purpose. In order to proceed in this way we need to have the orthogonality condition: T 1 T E[DX it DE it ] = 0. (2.1) T t=1 This condition states that the demeaned error terms are uncorrelated with the demeaned covariates, ater averaging over t. With this assumption γ is equal to the projection coeicient o DY it on DX it, where we aggregate across t = 1,..., T. The condition (2.1) is implied by the so-called strict exogeneity condition: T s=1, E it (X, a i ), X i := (X is ) (2.2) i which states that the structural errors E it are orthogonal to all lags and leads o X it conditional on a i. This is potentially a restrictive condition, and does not hold with lagged dependent variables appearing as X it. Note that it is hard to think o situations where (2.2) does not hold but (2.1) does or the causal parameter γ, so eectively we are imposing (2.2) when interpreting results as having causal meaning.

6 6 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL The least squares estimation using the demeaned equation is called within-groups or ixed eects estimator. It can be seen as an exactly identiied GMM estimator with the score unction T g(z i, γ) = 1 (DY it DX it γ)dx it. (2.3) T t=1 Because o the exact identiication a simpliied variance ormula applies. It turns out that this approach is numerically equivalent to the the so-called least squares dummy variable (LSDV) estimator that applies OLS to the model: Y it = Q iπ + Ditα + W it β + u it, where Q i is a n-dimensional vector o indicators or observational units with a 1 in the i-th position and 0 s otherwise, namely Q i = (0, 0,..., 1,..., 0). The elements o this vector are called ixed eects. Note that here the GMM ormulation takes care o the clustering problem temporal dependence o data on observational unit i across time by aggregating the data on unit i into one score. We can also use the panel bootstrap which would be to simply bootstrap the independent observational units i or the scores in (2.3). The strict exogeneity condition (2.2) contains over identiying restrictions, and we can set up a GMM estimator with the score unction: g (Z i, γ) = {(DY it DX it γ)x i } T t=1, X i = (X it ) T t=1. (2.4) Using this ormulation we can obtain an eicient estimator or the causal parameter γ. Moreover, we can use the J-test to check the statistical validity o (2.2). Typically the approach outlined in this paragraph is ignored in empirical work, but this is not a good practice. Note that here too we are taking care o the clustering problem by aggregating the data on unit i into one score. We can construct an alternative GMM estimator based on a subset o linear combinations o the moment conditions implied by strict exogeneity using the score unction: ğ(z i, γ) = {(DY it DX it γ)dx it } T t=1. (2.5) This estimator might have better inite sample properties than the estimator based on (2.4) as it is based on a smaller number o inormative moment conditions, T dim X it instead o T 2 dim X it. The J-test in this case can be interpreted as a test o time homogeneity o the parameter γ in the FE approach Key approach II: irst dierencing. This approach eliminates the individual eects a i by taking dierences across time. Speciically, deine the dierencing operator Δ acting

7 L8 7 on doubly indexed random variables V it by creating the dierence ΔV it = V it V it 1. Apply this operator to both sides o (1.1) to obtain: ΔY it = ΔX it γ + ΔE it, (2.6) since Δa i = 0. Thus dierencing eliminates the individual eect. It is tempting to apply least squares to this equation to estimate γ, but the assumptions made so ar do not guarantee that 1 T EΔE it ΔX it = 0, (2.7) T 1 t=2 which is necessary or γ to be a projection coeicient. So the researchers impose this condition implicitly when perorming the irst dierent estimation. This condition states that the innovation in the error terms is uncorrelated with the innovation in the covariates, ater averaging over t. With this assumption γ is equal to the projection coeicient o ΔY it on ΔX it, where we aggregate across t = 2,..., T. The assumption (2.7) is implied the strict exogeneity assumption (2.2) and it is hard to think o situations where strict exogeneity does not hold but (2.7) does or the causal parameter γ. The assumption can serve as a basis o completely standard least squares estimation method with the i.i.d. data {Z i } i=1 n. We can ormulate that as an exactly identiied GMM problem with the score unction: g(z i, γ) = 1 T T 1 t=2 (Δy it ΔX γ)δx it, (2.8) it so that GMM theory applies here, since {Z i } n i=1 are i.i.d. Note that the GMM ormulation automatically takes care o the clustering problem namely the act that data or unit i could be dependent across t by simply aggregating the scores within the cluster i. Note that we also can apply bootstrap or inerence by simply bootstrapping observation units, or, equivalently bootstrapping the scores in (2.8). Note however that the strict exogeneity (2.2) contains a lot o over-identiying inormation and could serve as a basis or a GMM method with the score unction g (Z i, γ) = {(Δy it ΔX it γ)x i } T t=2. (2.9) This estimator will be more eicient than the previous one, and we can also use the J- statistic to test the validity o (2.7) (as well as validity o (2.2)). Ordinarily J-testing is not done in empirical work (which is bad practice), and one wonders i the results o many well-known studies will pass such a test. Again, the GMM theory applies here. Note that this ormulation also automatically takes care o the clustering problem.

8 8 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL As in the FE approach, we can construct an alternative GMM estimator based on a subset o linear combinations o the moment conditions implied by strict exogeneity using the score unction: T ğ(z i, γ) = {(ΔY it ΔX it γ)δx it } t=2. (2.10) This estimator might have better inite sample properties than the estimator based on (2.9) as it is based on a smaller number o inormative moment conditions, (T 1) dim X it instead o (T 1)T dim X it. The J-test in this case can be interpreted as a test o time homogeneity o all the model parameters in the FD approach Other Approaches: Correlated Random Eects. In this approach, instead o trying to eliminate the individual eect a i, we model its dependence on the covariates: Then 1 T a i = X i λ + v i, v i X i, X i = X it. T t=1 Y = X i λ + X it itγ + u it, u it = v i + E it. Under the strict exogeneity assumption (2.2), the parameters γ and λ can be identiied as projection coeicients, and we can estimate the model by least squares. This corresponds to the GMM approach with the score unction: T g(z i, λ, γ) = 1 (Y it X i λ X it γ)x it, X it = (X i, X it ). T t=1 It turns out that this approach is numerically equivalent to the within-groups estimator in the linear model. Still it is conceptually useul to have this approach discussed here. As in the previous cases, there is a lot o over-identiying inormation and we can use the score unction: g (Z i, λ, γ) = {(Y it X i λ X it γ)x i } T t=1 to set up the eicient GMM estimator and use the J-test to examine the validity o the model Other Approaches: Pooled or Random Eects Approach. This approach is by ar the most restrictive. You still need to know about it, since people use this terminology quite oten and you need to understand what they mean by it. In this approach we simply think that a i is uncorrelated with X it and treat it as a part o the error term: Y = X it itγ + v it, v it X it, v it = a i + E it. This means we can identiy α as a projection coeicient and estimate it by least squares o Y it on X it. The error terms v it are correlated across t, which is a orm o clustering. We

9 L8 9 can easily take care o this dependence by using the score unction: T g(z i, γ) = 1 (Y it X it γ)x it. (2.11) T t= Which approach to use? The pooled estimator is the worst in terms o strength o assumptions. The minimal assumption (2.1) o the ixed eects approach and is neither stronger nor weaker than the minimal assumption (2.7) o the irst dierencing approach. It is always a good idea to test the underlying assumptions. None o the assumptions made are suitable to the case where the strict exogeneity does not hold. Strict exogeneity requires the leads and lags o X it to be uncorrelated with E it. This assumption is violated or example when X it contains lagged dependent variables. The irst dierence approach makes the slightly milder assumption o uncorrelated dierences, but it is still unsuitable, or example, when lags o Y it appear as X it. Example 1 (Lagged Dependent Variable). The last point requires elaboration. I the model is Y it = a i + β Y i(t 1) +E it, where E it has zero mean conditional on a i and is independent across t. Then, strict exogeneity clearly ails because E it is not orthogonal to X i(t+1) = Y it conditional on a i : X it 2 E[Y it E it a i ] = E[E it a i ] = 0. Also the uncorrelated dierences assumption ails because ΔE it is not orthogonal to ΔX it : EΔX it ΔE it = E(Y i(t 1) Y i(t 2) )(E it E i(t 1) ) = EE 2 = 0. i(t 1) 3. Getting More Sophisticated: Identiication and Estimation under Weak Exogeneity Once you understand how things work or the key approaches and how they easily it in the GMM ramework, you can easily modiy them to best suit your empirical situations. Here is one example o how to do this, and it comes as a pleasant bonus or mastering the preceding section as well as the GMM material. In this example, we will ocus on the dierencing approach, but we could also consider de-meaning where one uses uture values o the dependent variable to de-mean it and uses the past values o the regressors to demean them.

10 10 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL 3.1. A Model with Pre-Determined Regressors. Consider the linear SEM (1.1) with predetermined regressors, namely E it (X i t, a i ), X t := (X it, X i(t 1),..., X i1 ). i This is a weak exogeneity assumption. Note that this ormulation can accommodate lagged dependent variables as part o X it i E it is not serially correlated. We can adapt the irst dierences approach to this situation. Note that ΔE it X t 1. i This means we can identiy γ rom the moment equations: t 1 E(ΔY it ΔX it γ)x i = 0, t = 2,..., T. (3.1) The estimation and inerence can be done using GMM with the score unction g(z i, γ) = {(ΔY it ΔX it γ)x t 1 i } T t=2. (3.2) This estimation method subsumes as a special case the Arellano-Bond [3] s estimation approach or models with a lagged dependent variable. As in the strictly exogenous case, we can construct an alternative GMM estimator based on a subset o linear combinations o the moment conditions implied by weak exogeneity using the score unction: ğ(z i, γ) = {(ΔY it ΔX it γ)x i(t 1) } T t=2. (3.3) This estimator might have better inite sample properties than the estimator based on (3.2) as it is based on a smaller number o inormative moment conditions. The J-test in this case can be interpreted as a test o time homogeneity o all the model parameters. We can urther reduce the number conditions by averaging over t resulting on the score unction 1 T g (Z i, γ) = (ΔY it ΔX it γ)x i(t 1), T 1 t=2 which just-identities the parameter γ. This method subsumes as a special case the Anderson- Hsiao [2] estimator or models with a lagged dependent variable A Model with Pre-Determined Instruments. As a urther generalization we consider the linear structural equations model (SEM) Y it = a i + D it α + W it β + E it =: a i + X it γ + E it, where i = 1,..., n and t = 1,..., T, and where we have pre-determined instruments: t E it (M i, a i ), Mi t := (M it, M i(t 1),...). (3.4) where M it are technical instruments that contain W it, but don t contain D it. For example, we can think about modeling aggregate demand equations across dierent markets i and

11 L8 11 across dierent time periods, where D it is the price. The instruments M it could consist o various supply shiters as well as controls W it. Then (3.4) implies ΔE it Mi t 1. (3.5) This means we can identiy γ rom the moment equations: t 1 E(ΔY it ΔX it γ)m i = 0. (3.6) The estimation and inerence can be done using GMM with the score unction g(z i, γ) = {(ΔY it ΔX it γ)m t 1 i } T t=2. (3.7) As beore the GMM ormulation takes care o clustering by simply deining the scores appropriately. 4. The Eects o Expending on Academic Perormance We illustrate the methods o Section 2 with an empirical application on the eect o school resources per student on test pass rates ollowing Papke (2005) [5]. We use data rom annual Michigan School Reports (MSR) on 550 elementary schools in Michigan or the period rom 1992 through This period covers 1994, when Michigan carried out a school inance reorm to equalize spending across K-12 schools. The purpose o this reorm is to oer equal educational opportunities and to improve student perormance. The outcome variable Y it is the percent o students passing a ourth-grade math test rom the Michigan Educational Assessment Program in school i at year t. The explanatory variables X it include the logarithm o per student spending, logarithm o per student spending in the previous year, the raction o the students receiving ree lunch, and the logarithm o school enrollment. Table 1 gives descriptive statistics or the variables used in the analysis. Table 1. Descriptive Statistics Mean SD Math Pass Rate Expenditure 5,496 1,151 Lunch Enrollment 3,044 8,153 We estimate the model (1.1) using several approaches: (1) Pooled: pooled approach that assumes that X it is orthogonal to the school unobserved eect a i. (2) FD: irst dierencing approach based on the moment conditions (2.7).

12 12 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL (3) GMM-FD1: irst dierencing approach based on two-step GMM with the score unction (2.10). The irst-step uses the FD estimator to construct the optimal weighting matrix. (4) GMM-FD2: irst dierencing approach based on two-step GMM with the score unction (2.9), which exploits all the restrictions implied by strict exogeneity. The irst-step uses the FD estimator to construct the optimal weighting matrix. (5) FE: ixed eects approach based on the moment conditions (2.1). (6) GMM-FE1: ixed eects approach based on two-step GMM with score unction (2.5). The irst-step uses the FE estimator to construct the optimal weighting matrix. (7) GMM-FE2: ixed eects approach based on two-step GMM with score unction (2.4), which exploits all the restrictions implied by strict exogeneity. The irst-step uses the FE estimator to construct the optimal weighting matrix. For each approach we compute estimates, analytical standard errors clustered at the school level, and bootstrap standard errors based on resampling schools with replacement. We also report the results o the J-test or the overidentiying restrictions o the GMM-FD1, GMM-FD2, GMM-FE1 and GMM-FE2 approaches. Table 2 presents the results. All the approaches yield positive and signiicant eects o increasing expenditure per student the previous year on current math score passing rates. The delay on the eect is due to the timing o the tests and spending variables: the test are administered early in the second semester, whereas the spending is the allocation or the entire school year. According to the analytical standard errors, using all the strict exogeneity restrictions substantially increases the precision o the estimates. However, we should be cautious with this result as analytical standard errors might provide a poor approximation to the variability o two-step GMM estimators in the presence o many (potentially weak) moment conditions. 2 Bootstrap provides standard errors that are more stable across the dierent approaches. Time homogeneity o all the model parameters cannot be rejected at the 5% level, although only marginally or the GMM-FD1. The strict exogeneity overidentiying restrictions are rejected at the 5% level with both GMM-FD2 and GMM-FE2. 5. Causal Eect o Democracy on Economic Growth We illustrate the methods o Section 3 with an application to the causal eect o democracy on economic growth based on Acemoglu, Naidu, Restrepo and Robinson (2014) [1]. We use a balanced panel o 147 countries over the period rom 1987 through 2009 extracted rom the data set used in [1]. The outcome variable Y it is the logarithm o GDP per capita in 2000 USD as measured by the World Bank or country i at year t. The treatment variable o interest D it is a democracy indicator constructed in [1], which combines inormation 2 [6] derived a correction or analytical standard errors o two-step GMM estimators.

13 log(rexpp) 0.53 (2.51) [2.49] L1.log(rexpp) 9.05 (2.79) [2.81] log(enrol) 0.59 (0.41) [0.40] L8 13 Table 2. Eect o Expenditure per Student on Math Scores Pooled FD GMM-FD1 GMM-FD2 FE GMM-FE1 GMM-FE (4.93) (2.99) (1.30) (2.79) (2.09) (1.31) [4.65] [3.43] [3.39] [2.74] [2.51] [2.61] (5.12) [5.10] 2.14 (1.64) [1.59] 7.94 (2.77) [3.69] 1.84 (1.02) [1.34] 9.87 (1.12) [4.26] 1.42 (0.42) [1.32] 7.00 (4.24) [4.20] 0.25 (0.95) [0.95] 9.44 (2.47) [3.42] 0.31 (0.75) [0.96] 7.63 (1.03) [3.87] 0.05 (0.41) [0.93] lunch (0.03) [0.03] (0.17) [0.15] (0.12) [0.16] (0.04) [0.12] (0.13) [0.12] (0.10) [0.11] (0.04) [0.11] J-test p-val d.o Note 1: All the speciications include time eects. Note 2: Clustered standard errors at the school level in parentheses. Note 3: Bootstrap standard errors in brackets based on 500 replication. rom several sources including Freedom House and Polity IV. This indicator captures a bundle o institutions that characterize electoral democracies such as ree and competitive elections, checks on executive power, an inclusive political process that permits various groups o society to be represented politically, and expansion o civil rights. Table 3 reports some descriptive statistics o the variables used in the analysis. The unconditional eect o democracy on GDP is 134% in this period. Table 3. Descriptive Statistics Mean SD Dem = 1 Dem = 0 Democracy Log(GDP) Number Obs. 3,381 3,381 2,099 1,282

14 14 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL We control or unobserved country eects and rich dynamics o GDP using the linear panel model pt Y it = a i + αd it + β j Y i(t j) + E it, i = 1,..., n, t = p + 1,..., T, where we assume that j=1 E it (X i t, a i ), X i t = (D it,..., D i1, Y i(t 1),..., Y i1 ). (5.1) This assumption implies that democracy and past GDP are orthogonal to contemporaneous and uture GDP shocks, and that the error E it is serially uncorrelated once we include suiciently many lags o GDP. 3 Following the preerred speciication in [1], we include our lags (p = 4). Table 4. Eect o Democracy on Economic Growth Democracy ( 100) L1.log(gdp) 1.33 (0.05) [0.05] L2.log(gdp) (0.06) [0.07] L3.log(gdp) (0.05) [0.05] L4.log(gdp) (0.03) [0.03] Predictive Causal Pooled FD FE GMM-FD1 GMM-FD (0.26) (1.20) (0.65) (2.77) (1.70) [0.26] [1.20] [0.66] [2.75] [1.61] 0.33 (0.05) [0.06] 0.17 (0.04) [0.04] 0.06 (0.02) [0.02] (0.04) [0.04] 1.15 (0.05) [0.05] (0.06) [0.06] (0.04) [0.04] (0.02) [0.03] 1.30 (0.14) [0.15] (0.28) [0.22] (0.28) [0.23] 0.18 (0.14) [0.13] 1.00 (0.07) [0.07] (0.06) [0.07] (0.04) [0.04] (0.03) [0.03] J-test p-value d.o Note 1: All the speciications include time eects. Note 2: Clustered standard errors at the country level in parentheses. Note 3: Bootstrap standard errors in brackets based on 500 replications. 3 All the variables are net o time eects.

15 L8 15 Table 4 presents the results. The estimates based on the weak exogeneity condition (5.1) are reported in the columns labelled as GMM-FD1 and GMM-FD2. 4 GMM-FD1 applies two-step GMM with the score unction (3.3) with X it = (D it, Y i(t 1),..., Y i(t 4) ), γ = (α, β 1,..., β 4 ), (5.2) which uses only a subset o the moment conditions in (5.1). GMM-FD2 applies two-step GMM with the score unction (3.2), which uses all the moment conditions in (5.1), with the same X it and γ as in (5.2). In both cases we test the overidentiying restrictions using the J-test. The rest o the columns report estimates o predictive eects based on the pooled, irst dierencing and ixed eects approaches o Section 2 with the score unctions (2.11), (2.8) and (2.3), respectively. None o these approaches is consistent or the causal parameters identiied by (5.1) under large n ixed T asymptotics. 5 For each method, we report analytical standard errors clustered at the country level and bootstrap standard errors based on resampling countries without replacement. The GMM-FD2 approach inds that a transition to democracy increases economic growth by almost 4% in the irst year, and about 20% in the long run. 6 The J-test does not reject the weak exogeneity overidentiying restrictions in (5.1). The GMM-FD1 approach yields short run eects similar to FD-GMM2, but more than double run eects o 46.7% and clearly reject the overidentiying restrictions in (3.3). This does raise the concern about the statistical validity o the model and warrants urther investigation. The discrepancy in the conclusion o the J-test might be due to lack o power in the GMM-FD2 due to simultaneous testing o many restrictions, most o them (possibly) weak moment conditions. As expected, the FE predictive estimates are closer to their causal counterparts than the pooled and FD estimates because T is large, T = 19 ater using the irst 4 periods as initial conditions. Appendix A. Problems (1) Implement standard panel data estimators (ixed eects, irst dierences, or Arellano- Bond approach) or the irst or second empirical example. These are implemented in standard sotware such as Stata or R. Very briely explain the assumptions you need to make or these estimators to be consistent or causal eects, and why these assumptions may or may not hold in these examples. Very briely explain how you are taking care o clustering. (2) (Optional Bonus Problem.) Implement the GMM-FE1 and GMM-FE2 estimators or the irst empirical example and GMM-FD1 or the second empirical example. Report results o J-tests. Explain very briely what you are doing. Extra plus or 4 We obtain the estimates with the command pgmm o the package plm in R. 5 The ixed eects estimator is consistent under asymptotic sequences where n, T because it has asymptotic bias o order O(T 1 ). 6 The long-run eect is calculated as α/(1 to democracy. p j=1 βj ), and corresponds to the eect o a permanent transition

16 16 VICTOR CHERNOZHUKOV AND IVAN FERNANDEZ-VAL inding problems with the posted R-code or the empirical results reported in the lecture note. Reerences [1] Daron Acemoglu, Suresh Naidu, Pascual Restrepo, and James A. Robinson. Democracy Does Cause Growth. NBER Working Papers 20004, National Bureau o Economic Research, Inc, March [2] T. W. Anderson and Cheng Hsiao. Formulation and estimation o dynamic models using panel data. J. Econometrics, 18(1):47 82, [3] Manuel Arellano and Stephen Bond. Some tests o speciication or panel data: Monte carlo evidence and an application to employment equations. The Review o Economic Studies, 58(2): , [4] J. Neyman and E. L. Scott. Consistent estimates based on partially consistent observations. Econometrica, 16(1):1 32, [5] Leslie E. Papke. The eects o spending on test pass rates: evidence rom Michigan. Journal o Public Economics, 89(5-6): , June [6] Frank Windmeijer. A inite sample correction or the variance o linear eicient twostep gmm estimators. Journal o Econometrics, 126(1):25 51, 2005.

17 MIT OpenCourseWare Econometrics Spring 2017 For inormation about citing these materials or our Terms o Use, visit:

arxiv: v1 [econ.em] 12 Jan 2019

arxiv: v1 [econ.em] 12 Jan 2019 Mastering Panel Metrics: Causal Impact of Democracy on Growth By Shuowen Chen and Victor Chernozhukov and Iván Fernández-Val arxiv:1901.03821v1 [econ.em] 12 Jan 2019 The relationship between democracy

More information

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE Introduction Chapter 6. Panel Data 2 Panel data The term panel data refers to data sets with repeated observations over

More information

Applied Microeconometrics (L5): Panel Data-Basics

Applied Microeconometrics (L5): Panel Data-Basics Applied Microeconometrics (L5): Panel Data-Basics Nicholas Giannakopoulos University of Patras Department of Economics ngias@upatras.gr November 10, 2015 Nicholas Giannakopoulos (UPatras) MSc Applied Economics

More information

Dealing With Endogeneity

Dealing With Endogeneity Dealing With Endogeneity Junhui Qian December 22, 2014 Outline Introduction Instrumental Variable Instrumental Variable Estimation Two-Stage Least Square Estimation Panel Data Endogeneity in Econometrics

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63 1 / 63 Panel Data Models Chapter 5 Financial Econometrics Michael Hauser WS17/18 2 / 63 Content Data structures: Times series, cross sectional, panel data, pooled data Static linear panel data models:

More information

Assessing Studies Based on Multiple Regression

Assessing Studies Based on Multiple Regression Assessing Studies Based on Multiple Regression Outline 1. Internal and External Validity 2. Threats to Internal Validity a. Omitted variable bias b. Functional form misspecification c. Errors-in-variables

More information

Dynamic Panel Data Models

Dynamic Panel Data Models Models Amjad Naveed, Nora Prean, Alexander Rabas 15th June 2011 Motivation Many economic issues are dynamic by nature. These dynamic relationships are characterized by the presence of a lagged dependent

More information

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data Panel data Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data - possible to control for some unobserved heterogeneity - possible

More information

Common Errors: How to (and Not to) Control for Unobserved Heterogeneity *

Common Errors: How to (and Not to) Control for Unobserved Heterogeneity * Common Errors: How to (and ot to) Control or Unobserved Heterogeneity * Todd A. Gormley and David A. Matsa June 6 0 Abstract Controlling or unobserved heterogeneity (or common errors ) such as industry-speciic

More information

EC327: Advanced Econometrics, Spring 2007

EC327: Advanced Econometrics, Spring 2007 EC327: Advanced Econometrics, Spring 2007 Wooldridge, Introductory Econometrics (3rd ed, 2006) Chapter 14: Advanced panel data methods Fixed effects estimators We discussed the first difference (FD) model

More information

1 Estimation of Persistent Dynamic Panel Data. Motivation

1 Estimation of Persistent Dynamic Panel Data. Motivation 1 Estimation of Persistent Dynamic Panel Data. Motivation Consider the following Dynamic Panel Data (DPD) model y it = y it 1 ρ + x it β + µ i + v it (1.1) with i = {1, 2,..., N} denoting the individual

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Beyond the Target Customer: Social Effects of CRM Campaigns

Beyond the Target Customer: Social Effects of CRM Campaigns Beyond the Target Customer: Social Effects of CRM Campaigns Eva Ascarza, Peter Ebbes, Oded Netzer, Matthew Danielson Link to article: http://journals.ama.org/doi/abs/10.1509/jmr.15.0442 WEB APPENDICES

More information

Non-linear panel data modeling

Non-linear panel data modeling Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1

More information

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II Jeff Wooldridge IRP Lectures, UW Madison, August 2008 5. Estimating Production Functions Using Proxy Variables 6. Pseudo Panels

More information

Dynamic Panel Data Models

Dynamic Panel Data Models June 23, 2010 Contents Motivation 1 Motivation 2 Basic set-up Problem Solution 3 4 5 Literature Motivation Many economic issues are dynamic by nature and use the panel data structure to understand adjustment.

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

10. Joint Moments and Joint Characteristic Functions

10. Joint Moments and Joint Characteristic Functions 10. Joint Moments and Joint Characteristic Functions Following section 6, in this section we shall introduce various parameters to compactly represent the inormation contained in the joint p.d. o two r.vs.

More information

Econometrics in a nutshell: Variation and Identification Linear Regression Model in STATA. Research Methods. Carlos Noton.

Econometrics in a nutshell: Variation and Identification Linear Regression Model in STATA. Research Methods. Carlos Noton. 1/17 Research Methods Carlos Noton Term 2-2012 Outline 2/17 1 Econometrics in a nutshell: Variation and Identification 2 Main Assumptions 3/17 Dependent variable or outcome Y is the result of two forces:

More information

System GMM estimation of Empirical Growth Models

System GMM estimation of Empirical Growth Models System GMM estimation of Empirical Growth Models ELISABETH DORNETSHUMER June 29, 2007 1 Introduction This study based on the paper "GMM Estimation of Empirical Growth Models" by Stephan Bond, Anke Hoeffler

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

The achievable limits of operational modal analysis. * Siu-Kui Au 1)

The achievable limits of operational modal analysis. * Siu-Kui Au 1) The achievable limits o operational modal analysis * Siu-Kui Au 1) 1) Center or Engineering Dynamics and Institute or Risk and Uncertainty, University o Liverpool, Liverpool L69 3GH, United Kingdom 1)

More information

Dynamic Panels. Chapter Introduction Autoregressive Model

Dynamic Panels. Chapter Introduction Autoregressive Model Chapter 11 Dynamic Panels This chapter covers the econometrics methods to estimate dynamic panel data models, and presents examples in Stata to illustrate the use of these procedures. The topics in this

More information

Objectives. By the time the student is finished with this section of the workbook, he/she should be able

Objectives. By the time the student is finished with this section of the workbook, he/she should be able FUNCTIONS Quadratic Functions......8 Absolute Value Functions.....48 Translations o Functions..57 Radical Functions...61 Eponential Functions...7 Logarithmic Functions......8 Cubic Functions......91 Piece-Wise

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

Linear dynamic panel data models

Linear dynamic panel data models Linear dynamic panel data models Laura Magazzini University of Verona L. Magazzini (UniVR) Dynamic PD 1 / 67 Linear dynamic panel data models Dynamic panel data models Notation & Assumptions One of the

More information

Least-Squares Spectral Analysis Theory Summary

Least-Squares Spectral Analysis Theory Summary Least-Squares Spectral Analysis Theory Summary Reerence: Mtamakaya, J. D. (2012). Assessment o Atmospheric Pressure Loading on the International GNSS REPRO1 Solutions Periodic Signatures. Ph.D. dissertation,

More information

The concept of limit

The concept of limit Roberto s Notes on Dierential Calculus Chapter 1: Limits and continuity Section 1 The concept o limit What you need to know already: All basic concepts about unctions. What you can learn here: What limits

More information

Topic 10: Panel Data Analysis

Topic 10: Panel Data Analysis Topic 10: Panel Data Analysis Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Introduction Panel data combine the features of cross section data time series. Usually a panel

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

Improving GMM efficiency in dynamic models for panel data with mean stationarity

Improving GMM efficiency in dynamic models for panel data with mean stationarity Working Paper Series Department of Economics University of Verona Improving GMM efficiency in dynamic models for panel data with mean stationarity Giorgio Calzolari, Laura Magazzini WP Number: 12 July

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Princeton University Asian Political Methodology Conference University of Sydney Joint

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 10: Panel Data Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 10 VŠE, SS 2016/17 1 / 38 Outline 1 Introduction 2 Pooled OLS 3 First differences 4 Fixed effects

More information

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. The Basic Methodology 2. How Should We View Uncertainty in DD Settings?

More information

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9. Section 7 Model Assessment This section is based on Stock and Watson s Chapter 9. Internal vs. external validity Internal validity refers to whether the analysis is valid for the population and sample

More information

Ability Bias, Errors in Variables and Sibling Methods. James J. Heckman University of Chicago Econ 312 This draft, May 26, 2006

Ability Bias, Errors in Variables and Sibling Methods. James J. Heckman University of Chicago Econ 312 This draft, May 26, 2006 Ability Bias, Errors in Variables and Sibling Methods James J. Heckman University of Chicago Econ 312 This draft, May 26, 2006 1 1 Ability Bias Consider the model: log = 0 + 1 + where =income, = schooling,

More information

Advanced Econometrics

Advanced Econometrics Based on the textbook by Verbeek: A Guide to Modern Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna May 16, 2013 Outline Univariate

More information

dqd: A command for treatment effect estimation under alternative assumptions

dqd: A command for treatment effect estimation under alternative assumptions UC3M Working Papers Economics 14-07 April 2014 ISSN 2340-5031 Departamento de Economía Universidad Carlos III de Madrid Calle Madrid, 126 28903 Getafe (Spain) Fax (34) 916249875 dqd: A command for treatment

More information

Dynamic panel data methods

Dynamic panel data methods Dynamic panel data methods for cross-section panels Franz Eigner University Vienna Prepared for UK Econometric Methods of Panel Data with Prof. Robert Kunst 27th May 2009 Structure 1 Preliminary considerations

More information

Lecture 9: Panel Data Model (Chapter 14, Wooldridge Textbook)

Lecture 9: Panel Data Model (Chapter 14, Wooldridge Textbook) Lecture 9: Panel Data Model (Chapter 14, Wooldridge Textbook) 1 2 Panel Data Panel data is obtained by observing the same person, firm, county, etc over several periods. Unlike the pooled cross sections,

More information

Roberto s Notes on Differential Calculus Chapter 8: Graphical analysis Section 1. Extreme points

Roberto s Notes on Differential Calculus Chapter 8: Graphical analysis Section 1. Extreme points Roberto s Notes on Dierential Calculus Chapter 8: Graphical analysis Section 1 Extreme points What you need to know already: How to solve basic algebraic and trigonometric equations. All basic techniques

More information

Probabilistic Model of Error in Fixed-Point Arithmetic Gaussian Pyramid

Probabilistic Model of Error in Fixed-Point Arithmetic Gaussian Pyramid Probabilistic Model o Error in Fixed-Point Arithmetic Gaussian Pyramid Antoine Méler John A. Ruiz-Hernandez James L. Crowley INRIA Grenoble - Rhône-Alpes 655 avenue de l Europe 38 334 Saint Ismier Cedex

More information

NONPARAMETRIC PREDICTIVE INFERENCE FOR REPRODUCIBILITY OF TWO BASIC TESTS BASED ON ORDER STATISTICS

NONPARAMETRIC PREDICTIVE INFERENCE FOR REPRODUCIBILITY OF TWO BASIC TESTS BASED ON ORDER STATISTICS REVSTAT Statistical Journal Volume 16, Number 2, April 2018, 167 185 NONPARAMETRIC PREDICTIVE INFERENCE FOR REPRODUCIBILITY OF TWO BASIC TESTS BASED ON ORDER STATISTICS Authors: Frank P.A. Coolen Department

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

Gaussian Process Regression Models for Predicting Stock Trends

Gaussian Process Regression Models for Predicting Stock Trends Gaussian Process Regression Models or Predicting Stock Trends M. Todd Farrell Andrew Correa December 5, 7 Introduction Historical stock price data is a massive amount o time-series data with little-to-no

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

ON THE PRACTICE OF LAGGING VARIABLES TO AVOID SIMULTANEITY

ON THE PRACTICE OF LAGGING VARIABLES TO AVOID SIMULTANEITY ON THE PRACTICE OF LAGGING VARIABLES TO AVOID SIMULTANEITY by W. Robert Reed Department of Economics and Finance University of Canterbury Christchurch, New Zealand 28 August 2013 Contact Information: W.

More information

Applied Economics. Panel Data. Department of Economics Universidad Carlos III de Madrid

Applied Economics. Panel Data. Department of Economics Universidad Carlos III de Madrid Applied Economics Panel Data Department of Economics Universidad Carlos III de Madrid See also Wooldridge (chapter 13), and Stock and Watson (chapter 10) 1 / 38 Panel Data vs Repeated Cross-sections In

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

Short T Panels - Review

Short T Panels - Review Short T Panels - Review We have looked at methods for estimating parameters on time-varying explanatory variables consistently in panels with many cross-section observation units but a small number of

More information

PS 271B: Quantitative Methods II Lecture Notes

PS 271B: Quantitative Methods II Lecture Notes PS 271B: Quantitative Methods II Lecture Notes (Part 6: Panel/Longitudinal Data; Multilevel/Mixed Effects models) Langche Zeng zeng@ucsd.edu Panel/Longitudinal Data; Multilevel Modeling; Mixed effects

More information

Instrumental Variables and the Problem of Endogeneity

Instrumental Variables and the Problem of Endogeneity Instrumental Variables and the Problem of Endogeneity September 15, 2015 1 / 38 Exogeneity: Important Assumption of OLS In a standard OLS framework, y = xβ + ɛ (1) and for unbiasedness we need E[x ɛ] =

More information

The Forward Premium is Still a Puzzle Appendix

The Forward Premium is Still a Puzzle Appendix The Forward Premium is Still a Puzzle Appendix Craig Burnside June 211 Duke University and NBER. 1 Results Alluded to in the Main Text 1.1 First-Pass Estimates o Betas Table 1 provides rst-pass estimates

More information

Introduction to Econometrics. Assessing Studies Based on Multiple Regression

Introduction to Econometrics. Assessing Studies Based on Multiple Regression Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Assessing Studies Based on Multiple Regression Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1

More information

Mostly Dangerous Econometrics: How to do Model Selection with Inference in Mind

Mostly Dangerous Econometrics: How to do Model Selection with Inference in Mind Outline Introduction Analysis in Low Dimensional Settings Analysis in High-Dimensional Settings Bonus Track: Genaralizations Econometrics: How to do Model Selection with Inference in Mind June 25, 2015,

More information

Chapter 2: simple regression model

Chapter 2: simple regression model Chapter 2: simple regression model Goal: understand how to estimate and more importantly interpret the simple regression Reading: chapter 2 of the textbook Advice: this chapter is foundation of econometrics.

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models. An obvious reason for the endogeneity of explanatory

Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models. An obvious reason for the endogeneity of explanatory Wooldridge, Introductory Econometrics, 3d ed. Chapter 16: Simultaneous equations models An obvious reason for the endogeneity of explanatory variables in a regression model is simultaneity: that is, one

More information

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects MPRA Munich Personal RePEc Archive Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects Mohamed R. Abonazel Department of Applied Statistics and Econometrics, Institute of Statistical

More information

Fixed Effects Models for Panel Data. December 1, 2014

Fixed Effects Models for Panel Data. December 1, 2014 Fixed Effects Models for Panel Data December 1, 2014 Notation Use the same setup as before, with the linear model Y it = X it β + c i + ɛ it (1) where X it is a 1 K + 1 vector of independent variables.

More information

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Matthew Harding and Carlos Lamarche January 12, 2011 Abstract We propose a method for estimating

More information

ECON3327: Financial Econometrics, Spring 2016

ECON3327: Financial Econometrics, Spring 2016 ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

STAT 801: Mathematical Statistics. Hypothesis Testing

STAT 801: Mathematical Statistics. Hypothesis Testing STAT 801: Mathematical Statistics Hypothesis Testing Hypothesis testing: a statistical problem where you must choose, on the basis o data X, between two alternatives. We ormalize this as the problem o

More information

CHAPTER (i) No. For each coefficient, the usual standard errors and the heteroskedasticity-robust ones are practically very similar.

CHAPTER (i) No. For each coefficient, the usual standard errors and the heteroskedasticity-robust ones are practically very similar. SOLUTIONS TO PROBLEMS CHAPTER 8 8.1 Parts (ii) and (iii). The homoskedasticity assumption played no role in Chapter 5 in showing that OLS is consistent. But we know that heteroskedasticity causes statistical

More information

Applied Econometrics Lecture 1

Applied Econometrics Lecture 1 Lecture 1 1 1 Università di Urbino Università di Urbino PhD Programme in Global Studies Spring 2018 Outline of this module Beyond OLS (very brief sketch) Regression and causality: sources of endogeneity

More information

Applied Health Economics (for B.Sc.)

Applied Health Economics (for B.Sc.) Applied Health Economics (for B.Sc.) Helmut Farbmacher Department of Economics University of Mannheim Autumn Semester 2017 Outlook 1 Linear models (OLS, Omitted variables, 2SLS) 2 Limited and qualitative

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Area disparities in Britain: understanding the contribution of people versus place through variance decompositions 1

Area disparities in Britain: understanding the contribution of people versus place through variance decompositions 1 Area disparities in Britain: understanding the contribution o people versus place through variance decompositions 1 Stephen Gibbons (LSE and SERC) Henry G. Overman (LSE and SERC) Panu Pelkonen (University

More information

Specification Tests in Unbalanced Panels with Endogeneity.

Specification Tests in Unbalanced Panels with Endogeneity. Specification Tests in Unbalanced Panels with Endogeneity. Riju Joshi Jeffrey M. Wooldridge June 22, 2017 Abstract This paper develops specification tests for unbalanced panels with endogenous explanatory

More information

Mgmt 469. Causality and Identification

Mgmt 469. Causality and Identification Mgmt 469 Causality and Identification As you have learned by now, a key issue in empirical research is identifying the direction of causality in the relationship between two variables. This problem often

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

( x) f = where P and Q are polynomials.

( x) f = where P and Q are polynomials. 9.8 Graphing Rational Functions Lets begin with a deinition. Deinition: Rational Function A rational unction is a unction o the orm ( ) ( ) ( ) P where P and Q are polynomials. Q An eample o a simple rational

More information

Empirical approaches in public economics

Empirical approaches in public economics Empirical approaches in public economics ECON4624 Empirical Public Economics Fall 2016 Gaute Torsvik Outline for today The canonical problem Basic concepts of causal inference Randomized experiments Non-experimental

More information

ECO375 Tutorial 8 Instrumental Variables

ECO375 Tutorial 8 Instrumental Variables ECO375 Tutorial 8 Instrumental Variables Matt Tudball University of Toronto Mississauga November 16, 2017 Matt Tudball (University of Toronto) ECO375H5 November 16, 2017 1 / 22 Review: Endogeneity Instrumental

More information

CEPA Working Paper No

CEPA Working Paper No CEPA Working Paper No. 15-06 Identification based on Difference-in-Differences Approaches with Multiple Treatments AUTHORS Hans Fricke Stanford University ABSTRACT This paper discusses identification based

More information

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 7: Cluster Sampling Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of roups and

More information

Supplementary material for Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values

Supplementary material for Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values Supplementary material or Continuous-action planning or discounted ininite-horizon nonlinear optimal control with Lipschitz values List o main notations x, X, u, U state, state space, action, action space,

More information

xtdpdml for Estimating Dynamic Panel Models

xtdpdml for Estimating Dynamic Panel Models xtdpdml for Estimating Dynamic Panel Models Enrique Moral-Benito Paul Allison Richard Williams Banco de España University of Pennsylvania University of Notre Dame Reunión Española de Usuarios de Stata

More information

Fluctuationlessness Theorem and its Application to Boundary Value Problems of ODEs

Fluctuationlessness Theorem and its Application to Boundary Value Problems of ODEs Fluctuationlessness Theorem and its Application to Boundary Value Problems o ODEs NEJLA ALTAY İstanbul Technical University Inormatics Institute Maslak, 34469, İstanbul TÜRKİYE TURKEY) nejla@be.itu.edu.tr

More information

xtdpdqml: Quasi-maximum likelihood estimation of linear dynamic short-t panel data models

xtdpdqml: Quasi-maximum likelihood estimation of linear dynamic short-t panel data models xtdpdqml: Quasi-maximum likelihood estimation of linear dynamic short-t panel data models Sebastian Kripfganz University of Exeter Business School, Department of Economics, Exeter, UK UK Stata Users Group

More information

xtseqreg: Sequential (two-stage) estimation of linear panel data models

xtseqreg: Sequential (two-stage) estimation of linear panel data models xtseqreg: Sequential (two-stage) estimation of linear panel data models and some pitfalls in the estimation of dynamic panel models Sebastian Kripfganz University of Exeter Business School, Department

More information

Lecture 8 Panel Data

Lecture 8 Panel Data Lecture 8 Panel Data Economics 8379 George Washington University Instructor: Prof. Ben Williams Introduction This lecture will discuss some common panel data methods and problems. Random effects vs. fixed

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna October 16, 2013 Outline Introduction Simple

More information

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C = Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =

More information

Introduction to Simulation - Lecture 2. Equation Formulation Methods. Jacob White. Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy

Introduction to Simulation - Lecture 2. Equation Formulation Methods. Jacob White. Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy Introduction to Simulation - Lecture Equation Formulation Methods Jacob White Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy Outline Formulating Equations rom Schematics Struts and Joints

More information

Instrumental variables estimation using heteroskedasticity-based instruments

Instrumental variables estimation using heteroskedasticity-based instruments Instrumental variables estimation using heteroskedasticity-based instruments Christopher F Baum, Arthur Lewbel, Mark E Schaffer, Oleksandr Talavera Boston College/DIW Berlin, Boston College, Heriot Watt

More information

On IV estimation of the dynamic binary panel data model with fixed effects

On IV estimation of the dynamic binary panel data model with fixed effects On IV estimation of the dynamic binary panel data model with fixed effects Andrew Adrian Yu Pua March 30, 2015 Abstract A big part of applied research still uses IV to estimate a dynamic linear probability

More information

Published in the American Economic Review Volume 102, Issue 1, February 2012, pages doi: /aer

Published in the American Economic Review Volume 102, Issue 1, February 2012, pages doi: /aer Published in the American Economic Review Volume 102, Issue 1, February 2012, pages 594-601. doi:10.1257/aer.102.1.594 CONTRACTS VS. SALARIES IN MATCHING FEDERICO ECHENIQUE Abstract. Firms and workers

More information

Handout 12. Endogeneity & Simultaneous Equation Models

Handout 12. Endogeneity & Simultaneous Equation Models Handout 12. Endogeneity & Simultaneous Equation Models In which you learn about another potential source of endogeneity caused by the simultaneous determination of economic variables, and learn how to

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

Applied Statistics and Econometrics. Giuseppe Ragusa Lecture 15: Instrumental Variables

Applied Statistics and Econometrics. Giuseppe Ragusa Lecture 15: Instrumental Variables Applied Statistics and Econometrics Giuseppe Ragusa Lecture 15: Instrumental Variables Outline Introduction Endogeneity and Exogeneity Valid Instruments TSLS Testing Validity 2 Instrumental Variables Regression

More information

14.74 Lecture 10: The returns to human capital: education

14.74 Lecture 10: The returns to human capital: education 14.74 Lecture 10: The returns to human capital: education Esther Duflo March 7, 2011 Education is a form of human capital. You invest in it, and you get returns, in the form of higher earnings, etc...

More information

Jeffrey M. Wooldridge Michigan State University

Jeffrey M. Wooldridge Michigan State University Fractional Response Models with Endogenous Explanatory Variables and Heterogeneity Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Fractional Probit with Heteroskedasticity 3. Fractional

More information