Impact Evaluation Workshop 2014: Asian Development Bank, Sept 1-3, 2014, Manila, Philippines
1 Session 15: Regression Estimators, Differences-in-Differences, and Panel Data Methods
2 I. Introduction
Most randomized evaluations in developing countries were conducted on new programs that did not exist before the randomized trial was conducted. In contrast, impact evaluations that are not based on randomized trials are almost always conducted on programs that existed before the evaluation was planned. There are two ways in which participation is not random:
1. The communities in which the programs exist are not randomly chosen.
2. The participants in the program are not randomly assigned.
3 This session has three objectives:
1. Explain how very simple ordinary least squares (OLS) estimates can lead to biased estimates of program impacts.
2. Present four commonly used regression methods for estimating program impacts, including the assumptions each method needs to produce unbiased and consistent estimates:
- The cross-section estimator
- The before-after estimator
- The difference-in-differences estimator
- The within estimator
3. Present a case study exploiting panel data methods that allow for correlation between time and the treatment.
Let's start with some very simple examples (well, let's skip them here).
4 Example: A Before-After Estimator
The before-after estimator obtains a program's impact by comparing outcomes measured after the program started with outcomes measured before it started.
Consider a program that provides loans to poor farmers so that they can buy fertilizer to increase their maize production. In the year before the program started, farmers who later enrolled in the program harvested an average of 1,000 kg of maize per hectare (ha). One year after the program started, maize yields had increased to 1,200 kg/ha. The before-after estimator finds a program impact of 200 (= 1,200 - 1,000) kg/ha.
Question: Is 200 kg/ha a plausible estimate of the program's impact? Consider two cases:
5 A: Rainfall was normal during the year before the program started, but a drought occurred in the year the program was launched.
B: A drought occurred in the year before the program started, but rainfall returned to normal during the year the program was launched.
6 Note: The before-after estimator assumes Counterfactual C. Counterfactuals A and B pick up the impacts of factors other than the program (e.g. weather changes).
7 Example: A Cross-Section (Enrolled and Nonenrolled) Estimator
The cross-section estimator obtains a program's impact by comparing outcomes of participants and non-participants after the program started.
Consider the microfinance program again. Now suppose that the only data we have were collected one year after the program started (i.e. no "before" data). One year after the program began, the farmers who enrolled in the program harvested an average of 1,100 kg of maize per ha, while those who did not enroll harvested an average of 1,000 kg/ha. The cross-section estimator calculates a program impact of a 100 (= 1,100 - 1,000) kg/ha increase in maize yields.
Question: Is 100 kg/ha a plausible estimate of the program impact?
8 Consider the following scenarios:
A: More productive farmers were more likely to obtain the loan because they were more likely to be able to repay it.
B: Farmers in the program reside in areas where land quality is lower (e.g. they needed more fertilizer to compensate for low land quality).

Example: A Difference-in-Differences Estimator
Consider the microfinance example again. A drought occurred the year before the program started, but rainfall was normal the year the program was launched. Assume that all farmers were affected by the drought, and that they were affected in a similar way. Assume also that we have data on maize yields both before the program was launched and one year after, for both enrolled and nonenrolled farmers.
Before the program, the farmers who later enrolled harvested 1,000 kg of maize per ha, and they harvested 1,150 kg/ha one year after the program started. Farmers who did not enroll harvested 900 kg/ha before the program began and 1,000 kg/ha one year after the program started.

9 A DID estimator combines the before-after and cross-sectional (enrolled vs. nonenrolled) estimators:
$\Delta_1 - \Delta_2$ = [(enrolled, after) - (enrolled, before)] - [(nonenrolled, after) - (nonenrolled, before)]
This yields an estimated impact of (1,150 - 1,000) - (1,000 - 900) = 50 kg/ha.
Intuition: $\Delta_1$ and $\Delta_2$ remove the influence of time-invariant factors, e.g. land quality; $\Delta_1 - \Delta_2$ removes the influence of the common time trend due to, say, the drought.
The following figure illustrates these three simple estimators:
Cross-section: Estimated effect = C - D (ignores fixed differences between groups, e.g. land quality);
Before-after: Estimated effect = C - A (ignores the time trend);
DID: Estimated effect = (C - A) - (D - B) = C - E (accounts for both).
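The arithmetic behind the three simple estimators can be checked with a short sketch. The function names and the `yields` dictionary are illustrative choices, not part of the workshop material; the yield figures are the ones from the difference-in-differences example.

```python
def before_after(enrolled_before, enrolled_after):
    """Change over time among enrolled farmers only."""
    return enrolled_after - enrolled_before


def cross_section(enrolled_after, nonenrolled_after):
    """Post-program gap between enrolled and nonenrolled farmers."""
    return enrolled_after - nonenrolled_after


def diff_in_diff(enrolled_before, enrolled_after,
                 nonenrolled_before, nonenrolled_after):
    """Change among the enrolled minus change among the nonenrolled."""
    return ((enrolled_after - enrolled_before)
            - (nonenrolled_after - nonenrolled_before))


# Average maize yields (kg/ha) from the difference-in-differences example.
yields = {
    "enrolled": {"before": 1000, "after": 1150},
    "nonenrolled": {"before": 900, "after": 1000},
}

ba = before_after(yields["enrolled"]["before"], yields["enrolled"]["after"])
cs = cross_section(yields["enrolled"]["after"], yields["nonenrolled"]["after"])
did = diff_in_diff(yields["enrolled"]["before"], yields["enrolled"]["after"],
                   yields["nonenrolled"]["before"], yields["nonenrolled"]["after"])

print(ba, cs, did)  # 150 150 50
```

With these numbers the before-after and cross-section estimators both report 150 kg/ha, while DID reports 50 kg/ha once the common recovery from the drought and the pre-existing gap between groups are netted out.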
10 II. Parameters of Interest and Sources of Bias
Recall the two most common parameters of interest for impact evaluation:
1. ATE: the average effect of the program for all persons in the population: $\mathrm{ATE} \equiv E[Y_1 - Y_0] = E[\Delta]$
2. ATT: the average effect of the program for program participants: $\mathrm{ATT} \equiv E[Y_1 - Y_0 \mid P = 1] = E[\Delta \mid P = 1]$
Recall also that it is sometimes possible to go further by estimating ATE and ATT for a person with characteristics $X$ (a vector of observable variables):
$\mathrm{ATE}(X) \equiv E[Y_1 - Y_0 \mid X] = E[\Delta \mid X]$
$\mathrm{ATT}(X) \equiv E[Y_1 - Y_0 \mid P = 1, X] = E[\Delta \mid P = 1, X]$
If the individuals who take up the program tend to be the ones who benefit most from it, then we would expect $\mathrm{ATT}(X) > \mathrm{ATE}(X)$.
11 In general, the difference between the mean of observed $Y$ for program participants (the $P = 1$ group) and the mean of observed $Y$ for non-participants (the $P = 0$ group) will not give a consistent (or unbiased) estimate of either $\mathrm{ATE}(X)$ or $\mathrm{ATT}(X)$.
To see how bias comes about, assume that for any person in the population the values of $Y_1$ (the value of $Y$ if that person participates in the program) and $Y_0$ (the value of $Y$ if that person does not participate) can be expressed as simple linear functions of that person's $X$ variables, plus an error term:
$Y_1 = X\beta_1 + U_1$
$Y_0 = X\beta_0 + U_0$
where we assume that $E[U_1 \mid X] = E[U_0 \mid X] = 0$. The observed value of $Y$ can be written as $Y = PY_1 + (1 - P)Y_0$, where $P$ equals 1 if the person participates in the program and 0 if he/she does not.
12 Note that this setup is quite general, and it allows the program impact to work through both $X$ and $U$. Note that:
$Y = PY_1 + (1 - P)Y_0$
Substituting the expressions for $Y_0$ and $Y_1$ above, we have:
$Y = P(X\beta_1 + U_1) + (1 - P)(X\beta_0 + U_0)$
Regrouping terms, we have:
$Y = X\beta_0 + P(X\beta_1 - X\beta_0) + \{U_0 + P(U_1 - U_0)\}$
What does this expression for $Y$ have to do with $\mathrm{ATE}(X)$ and $\mathrm{ATT}(X)$? It is easy to manipulate this expression to show the relationships. To begin, recall that $\mathrm{ATE}(X) = E[Y_1 - Y_0 \mid X]$. Substituting the above expressions for $Y_1$ and $Y_0$ and rearranging terms, we have:
13 $\mathrm{ATE}(X) = E[Y_1 - Y_0 \mid X]$
$= E[(X\beta_1 + U_1) - (X\beta_0 + U_0) \mid X]$
$= E[(X\beta_1 - X\beta_0) \mid X] + E[U_1 \mid X] - E[U_0 \mid X]$
Recalling that $E[U_1 \mid X] = E[U_0 \mid X] = 0$ (the assumption made above), we have
$= X\beta_1 - X\beta_0$
This implies that the above expression for $Y$ can be written as:
$Y = X\beta_0 + P \cdot \mathrm{ATE}(X) + \{U_0 + P(U_1 - U_0)\}$
You can also show (by adding and subtracting $P \cdot E[U_1 - U_0 \mid X, P = 1]$) that:
$Y = X\beta_0 + P \cdot \mathrm{ATT}(X) + \{U_0 + P(U_1 - U_0 - E[U_1 - U_0 \mid X, P = 1])\}$
14 These two expressions show how bias can arise when trying to estimate $\mathrm{ATE}(X)$ and $\mathrm{ATT}(X)$. To estimate $\mathrm{ATE}(X)$, the expression
$Y = X\beta_0 + P \cdot \mathrm{ATE}(X) + \{U_0 + P(U_1 - U_0)\}$
suggests that we regress $Y$ on $X$ and $P$, and the coefficient on $P$ will be $\mathrm{ATE}(X)$. However, this will yield unbiased and consistent estimates of $\mathrm{ATE}(X)$ only if the error term $\{U_0 + P(U_1 - U_0)\}$ is uncorrelated with $X$ and $P$. In other words, we need to assume that:
$E[U_0 + P(U_1 - U_0) \mid X, P] = 0$
Similarly, to estimate $\mathrm{ATT}(X)$ the expression
$Y = X\beta_0 + P \cdot \mathrm{ATT}(X) + \{U_0 + P(U_1 - U_0 - E[U_1 - U_0 \mid X, P = 1])\}$
suggests that the same regression yields an estimate of $\mathrm{ATT}(X)$ if the following holds:
$E[U_0 + P(U_1 - U_0 - E[U_1 - U_0 \mid X, P = 1]) \mid X, P] = 0$
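A small simulation can illustrate why the error term must be unrelated to $P$. This is a minimal sketch under assumptions that are not in the slides: the treatment effect is constant ($U_1 = U_0$), $X$ is omitted so that the regression of $Y$ on $P$ reduces to a difference in means, and the participation rule, the function names, and the true effect of 2.0 are hypothetical.

```python
import random
from statistics import mean


def simulate(n, select_on_u0, effect=2.0, seed=0):
    """Draw (Y, P) pairs with a constant treatment effect (U1 = U0)."""
    rng = random.Random(seed)
    ys, ps = [], []
    for _ in range(n):
        u0 = rng.gauss(0.0, 1.0)
        if select_on_u0:
            # People with high untreated outcomes are more likely to enroll,
            # so E[U0 | P] is not zero.
            p = 1 if u0 + rng.gauss(0.0, 1.0) > 0 else 0
        else:
            # Participation is unrelated to U0, as in a randomized trial.
            p = 1 if rng.random() < 0.5 else 0
        ys.append(1.0 + effect * p + u0)
        ps.append(p)
    return ys, ps


def ols_coefficient_on_p(ys, ps):
    """With an intercept and a binary P, the OLS slope is a difference in means."""
    treated = [y for y, p in zip(ys, ps) if p == 1]
    untreated = [y for y, p in zip(ys, ps) if p == 0]
    return mean(treated) - mean(untreated)


TRUE_EFFECT = 2.0
random_estimate = ols_coefficient_on_p(*simulate(20000, select_on_u0=False))
selected_estimate = ols_coefficient_on_p(*simulate(20000, select_on_u0=True))
print(round(random_estimate, 2), round(selected_estimate, 2))
```

When participation is unrelated to $U_0$ the estimate sits close to the true effect; when enrollment depends on $U_0$ the same regression overstates it, because the gap in $U_0$ between the two groups is attributed to the program.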
15 Note that $E[P(U_1 - U_0 - E[U_1 - U_0 \mid X, P = 1]) \mid X, P] = 0$. This can be seen by considering the two possible values of $P$. If $P = 0$ then the expression equals 0. If $P = 1$ the expression becomes $E[(U_1 - U_0 - E[U_1 - U_0 \mid X, P = 1]) \mid X, P = 1]$, which equals $E[U_1 - U_0 \mid X, P = 1] - E[U_1 - U_0 \mid X, P = 1] = 0$. So the only concern in estimating $\mathrm{ATT}(X)$ is whether $E[U_0 \mid X, P] = 0$.
Therefore, to estimate $\mathrm{ATT}(X)$, we need assumptions that imply $E[U_0 \mid X, P] = 0$. Similarly, estimating $\mathrm{ATE}(X)$ requires assumptions that imply $E[U_0 + P(U_1 - U_0) \mid X, P] = 0$. Consider the following assumptions:
(A.1) Conditional on $X$, the program effect is the same for everyone ($U_1 = U_0$).
(A.2) Conditional on $X$, the program effect varies across individuals ($U_1 \neq U_0$), but $U_1 - U_0$ does not predict program participation.
(A.3) Conditional on $X$, the program effect varies across individuals and $U_1 - U_0$ does predict who participates in the program.
16 Note that Assumptions (A.1) and (A.2) each imply that $\mathrm{ATE}(X) = \mathrm{ATT}(X)$:
$\mathrm{ATT}(X) \equiv E[Y_1 - Y_0 \mid X, P = 1]$
$= E[X\beta_1 - X\beta_0 + U_1 - U_0 \mid X, P = 1]$
$= X\beta_1 - X\beta_0 + E[U_1 - U_0 \mid X, P = 1]$
Under Assumption (A.1), the last term equals 0. It also equals 0 under Assumption (A.2), because that assumption implies that $E[U_1 - U_0 \mid X, P = 1] = E[U_1 - U_0 \mid X]$, which also equals zero. Thus $\mathrm{ATT}(X) = X\beta_1 - X\beta_0$ under either Assumption (A.1) or Assumption (A.2), and $\mathrm{ATE}(X) = X\beta_1 - X\beta_0$ as well because $E[U_1 - U_0 \mid X] = 0$.
Under Assumptions (A.1) and (A.2), ATE = ATT and potential bias arises only if $E[U_0 \mid X, P] \neq 0$.
Under Assumption (A.3), bias in estimating ATE can arise if either $E[U_0 \mid X, P] \neq 0$ or $E[U_1 - U_0 \mid X, P] \neq 0$ (see p. 13).
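The difference between Assumptions (A.2) and (A.3) can be made concrete with a simulation in which the individual gain $Y_1 - Y_0$ is observed directly, which is never possible with real data. This is a sketch under hypothetical choices: $X$ is held fixed, the average gain is 1.0, and under (A.3) a person enrolls exactly when the idiosyncratic gain $U_1 - U_0$ is positive.

```python
import random
from statistics import mean


def ate_and_att(n, select_on_gain, seed=1):
    """Return (ATE, ATT) computed from simulated individual gains."""
    rng = random.Random(seed)
    gains, participates = [], []
    for _ in range(n):
        idiosyncratic = rng.gauss(0.0, 1.0)   # plays the role of U1 - U0
        gain = 1.0 + idiosyncratic            # individual Y1 - Y0
        if select_on_gain:
            # (A.3): the idiosyncratic gain predicts participation.
            p = 1 if idiosyncratic > 0 else 0
        else:
            # (A.2): gains vary but do not predict participation.
            p = 1 if rng.random() < 0.5 else 0
        gains.append(gain)
        participates.append(p)
    ate = mean(gains)
    att = mean(g for g, p in zip(gains, participates) if p == 1)
    return ate, att


ate_a2, att_a2 = ate_and_att(20000, select_on_gain=False)
ate_a3, att_a3 = ate_and_att(20000, select_on_gain=True)
print(round(ate_a2, 2), round(att_a2, 2))  # close to each other under (A.2)
print(round(ate_a3, 2), round(att_a3, 2))  # ATT exceeds ATE under (A.3)
```

Under (A.2) the two parameters coincide up to sampling noise; under (A.3) the participants are the people with above-average gains, so ATT is larger than ATE.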
17 III. Cross-Section Estimator
The cross-section estimator uses data on a group of nonparticipants to impute counterfactual outcomes for program participants. The data for both groups are collected during the same time period, after the program has started. We now modify the notation to allow for a time subscript:
$Y_{1it}$ = value of $Y$ for person $i$ at time $t$ if he/she participates in the program at time $t$
$Y_{0it}$ = value of $Y$ for person $i$ at time $t$ if he/she does not participate in the program at time $t$.
The data requirements of this estimator are minimal: it requires data only on participants ($P_{it} = 1$) and non-participants ($P_{it} = 0$) for some time period $t$ after the participants started their involvement in the program.
18 The cross-section estimator can be defined as the OLS estimate of:
$Y_{it} = X_{it}\beta_0 + P_{it} \cdot \mathrm{ATT}(X_{it}) + \varepsilon_{it}$
where $\varepsilon_{it} = U_{0it} + P_{it}(U_{1it} - U_{0it} - E[U_{1it} - U_{0it} \mid X_{it}, P_{it} = 1])$.
That is, $Y_{it}$ is regressed on $X_{it}$ and on $P_{it}$ interacted with $X_{it}$, and the coefficients on the interaction terms provide estimates of the average treatment effect on the treated (ATT) for people with characteristics $X_{it}$.
In practice, it is often assumed that treatment effects are the same across different $X_{it}$, so that $Y_{it}$ is regressed on $X_{it}$ and the indicator $P_{it}$, and the single coefficient on $P_{it}$ is interpreted as the treatment effect. Recall that under Assumption (A.1) or (A.2), $\mathrm{ATE}(X)$ and $\mathrm{ATT}(X)$ are the same parameter.
Consistency of the cross-section regression estimator requires that the error term $\varepsilon_{it}$ not be correlated with either $X_{it}$ or $P_{it}$, i.e. that $E[\varepsilon_{it} \mid P_{it}, X_{it}] = 0$.
This restriction is violated, and thus the cross-section regression estimator is biased and inconsistent, if people select into the program based on expectations about their own gain from the program (the situation described by Assumption (A.3)).
19 To see this, consider that unobservable characteristics such as motivation, intellectual ability, or other advantages are likely to be present and correlated both with participation in (or access to) the treatment and with the outcome variable, introducing bias into the estimates of the treatment effects.
Even though this strong assumption is likely to be violated, the cross-section estimator is commonly used because of its minimal data requirements.
The other three regression estimators (before-after, difference-in-differences, and within) are therefore generally preferred, although each of them imposes additional data requirements.
20 IV. The Before-After Estimator
Suppose that we have panel data, that is, data collected from the same people for two or more time periods, and that we observe only program participants. For both of the potential outcomes ($Y_1$ and $Y_0$), assume the same linear model used above:
$Y_{1it} = X_{it}\beta_1 + U_{1it}$
$Y_{0it} = X_{it}\beta_0 + U_{0it}$
The $X_{it}$ variables may be either fixed (e.g. gender) or time varying (e.g. age), but they are assumed to be unaffected by an individual's participation in the program. The error terms $U_{1it}$ and $U_{0it}$ are assumed to satisfy $E[U_{1it} \mid X_{it}] = E[U_{0it} \mid X_{it}] = 0$.
Suppose the intervention took place in period $t^*$. For $t < t^*$, none of the individuals had yet participated in the program, so we observe $Y_{0it}$ and $P_{it} = 0$. For $t > t^*$, we observe $Y_{1it}$ and $P_{it} = 1$.
21 Thus, the observed outcome at time $t$ can be written as:
$Y_{it} = X_{it}\beta_0 + P_{it}\Delta(X_{it}) + U_{0it}$
where $P_{it}$ denotes having participated in the program and $\Delta(X_{it}) = X_{it}\beta_1 - X_{it}\beta_0 + U_{1it} - U_{0it}$ is the treatment impact for individual $i$ (note that it is not an average treatment effect because it is for a single person).
The evaluation problem can be viewed as a missing data problem, because each person is observed in only one of two potential states (treated or untreated) at any point in time and the missing state needs to be imputed. The before-after estimator addresses the missing data problem by using pre-program data to impute the missing counterfactual outcome.
Let $t$ and $t'$ denote two time periods, one before and one after the program intervention. Suppose that we want to estimate the impact of the program on a person who participates between those two time periods.
22 In the notation of the panel data model, we can define the $\mathrm{ATT}(X_{it'})$ parameter as:
$\mathrm{ATT}(X_{it'}) = E[\Delta(X_{it'}) \mid P_{it'} = 1, P_{it} = 0, X_{it'}]$
$= E[X_{it'}\beta_1 + U_{1it'} - X_{it'}\beta_0 - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}]$ (all evaluated at $t'$)
where the conditioning on $P_{it'} = 1$ and $P_{it} = 0$ indicates that the person was not in the program at time $t$ but did participate in the program by time $t'$. The before-after estimator for $\mathrm{ATT}(X_{it'})$ can be written as follows:
$Y_{it'} - Y_{it} = X_{it'}\beta_1 - X_{it}\beta_0 + U_{1it'} - U_{0it}$ (for participants only; the first equation evaluated at $t'$, the second at $t$)
We can derive how this may be estimated using OLS, as follows:
$= X_{it'}\beta_1 - X_{it}\beta_0 + E[\Delta(X_{it'}) \mid P_{it'} = 1, P_{it} = 0, X_{it'}] - E[\Delta(X_{it'}) \mid P_{it'} = 1, P_{it} = 0, X_{it'}] + U_{1it'} - U_{0it}$
$= X_{it'}\beta_1 - X_{it}\beta_0 + \mathrm{ATT}(X_{it'}) - E[X_{it'}\beta_1 - X_{it'}\beta_0 + U_{1it'} - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}] + U_{1it'} - U_{0it}$
$= X_{it'}\beta_1 - X_{it}\beta_0 - X_{it'}\beta_1 + X_{it'}\beta_0 + \mathrm{ATT}(X_{it'}) - E[U_{1it'} - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}] + U_{1it'} - U_{0it}$
$= (X_{it'} - X_{it})\beta_0 + \mathrm{ATT}(X_{it'}) - E[U_{1it'} - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}] + U_{1it'} - U_{0it'} + U_{0it'} - U_{0it}$
23 The last expression implies that one can use OLS to estimate the following:
$Y_{it'} - Y_{it} = (X_{it'} - X_{it})\beta_0 + \mathrm{ATT}(X_{it'}) + \varepsilon_{it'}$
where $\varepsilon_{it'} = (U_{1it'} - U_{0it'} - E[U_{1it'} - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}]) + U_{0it'} - U_{0it}$.
Thus, the treatment impact can be obtained from a regression of the difference $Y_{it'} - Y_{it}$ on $(X_{it'} - X_{it})$ and also on $X_{it'}$ in levels, because $\mathrm{ATT}(X_{it'}) = E[X_{it'}\beta_1 + U_{1it'} - X_{it'}\beta_0 - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}]$ depends on $X_{it'}$. The coefficients on $X_{it'}$, along with the constant term, provide estimates of $\mathrm{ATT}(X_{it'})$, controlling for any time-varying $X$ variables. If the regressors $X$ are not time varying, then the regression simplifies to regressing $Y_{it'} - Y_{it}$ on $X_{it'}$.
Note, however, that this estimation strategy does not allow for estimation of time-specific intercepts that are unrelated to program participation. The $\beta_0$ have to be assumed to be non-time-varying, or else they cannot be separately identified from the treatment effect.
24 Consistent estimation of the $\mathrm{ATT}(X_{it'})$ term requires $E[\varepsilon_{it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}] = 0$.
In fact, the term in parentheses in the expression for $\varepsilon_{it'}$ has a conditional mean of 0 by construction:
$E[U_{1it'} - U_{0it'} - E[U_{1it'} - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}] \mid P_{it'} = 1, P_{it} = 0, X_{it'}] = 0$
so the key assumption needed for the before-after estimator to be unbiased and consistent is the following:
$E[U_{0it'} - U_{0it} \mid P_{it'} = 1, P_{it} = 0, X_{it'}] = 0$.
A special case where this assumption is satisfied is when $U_{0it}$ can be decomposed into a fixed effect error structure:
$U_{0it} = f_i + v_{it}$
where $f_i$ is fixed over time and $v_{it}$ satisfies $E[v_{it'} - v_{it} \mid P_{it'} = 1, P_{it} = 0, X_{it'}] = 0$.
25 Intuitively, this assumption allows selection into the program to be based on unobservable characteristics that are time invariant (called $f_i$ here), which could be correlated with $P_{it}$ but are then differenced out of the expression $U_{0it'} - U_{0it}$. Thus a before-after estimation strategy allows for person-specific permanent unobservables that affect the program participation decision.
The regression as described above has one pre-program and one post-program observation for each person, and the model is estimated only for people who eventually participate in the program. If more than two periods of data are available, the model can also be estimated as a standard fixed effects regression (taking deviations from means), making use of all the data available.
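The "deviations from means" version of the before-after estimator can be sketched as follows. The setup is an illustration under assumptions not stated in the slides: four periods, participants only, no $X$ variables and no common time trend, and a hypothetical rule under which people with a higher fixed effect $f_i$ enroll earlier (so that a pooled regression is contaminated by $f_i$).

```python
import random
from statistics import mean


def slope(xs, ys):
    """OLS slope of y on x (regression with an intercept)."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def simulate_participants(n, periods=4, effect=2.0, seed=2):
    """Return one (P list, Y list) pair per eventual participant."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # time-invariant unobservable f_i
        # Hypothetical selection rule: higher f_i means earlier enrollment.
        start = 1 if f > 0.5 else (2 if f > -0.5 else 3)
        p = [1 if t >= start else 0 for t in range(periods)]
        y = [f + effect * p_t + rng.gauss(0.0, 1.0) for p_t in p]
        panel.append((p, y))
    return panel


def pooled_ols(panel):
    """Ignore the panel structure and regress Y on P."""
    ps = [p_t for p, _ in panel for p_t in p]
    ys = [y_t for _, y in panel for y_t in y]
    return slope(ps, ys)


def fixed_effects(panel):
    """Regress person-demeaned Y on person-demeaned P."""
    ps, ys = [], []
    for p, y in panel:
        p_bar, y_bar = mean(p), mean(y)
        ps.extend(p_t - p_bar for p_t in p)
        ys.extend(y_t - y_bar for y_t in y)
    return slope(ps, ys)


panel = simulate_participants(3000)
pooled_estimate = pooled_ols(panel)
fe_estimate = fixed_effects(panel)
print(round(pooled_estimate, 2), round(fe_estimate, 2))
```

Demeaning removes $f_i$ from every observation, so the fixed effects estimate stays near the true effect of 2.0 while the pooled estimate is pulled upward. The sketch would not protect against a common time trend, which is the limitation discussed above.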
26 V. Difference-in-Differences (DID) Estimators
The difference-in-differences (DID) estimator measures the impact of the program intervention by the difference between participants and nonparticipants in the before-after change in outcomes.
To see how it works, recall that $t$ is a time period before the program started and $t'$ is some time period after it started. Define a (time-invariant) indicator variable, denoted by $I_i$, that equals 1 for participants (those for whom $P_{it} = 0$ and $P_{it'} = 1$) and 0 for non-participants (for whom $P_{it} = P_{it'} = 0$).
The DID estimator is the OLS estimate of $\mathrm{ATT}(X_{it'})$ in the following regression equation:
$Y_{it'} - Y_{it} = X_{it'}\beta_0 - X_{it}\beta_0 + I_i \cdot \mathrm{ATT}(X_{it'}) + \varepsilon_{it'}$
where $\varepsilon_{it'} = P_{it'}(U_{1it'} - U_{0it'} - E[U_{1it'} - U_{0it'} \mid P_{it'} = 1, P_{it} = 0, X_{it'}]) + U_{0it'} - U_{0it}$.
27 Note that this regression equation is identical to that for the before-after estimator, except that now it is estimated using both participant and nonparticipant observations.
The DID estimator addresses an important shortcoming of the before-after estimator in that it allows for time-specific intercepts that are common across groups (which can be included in $X_{it}\beta_0$). These time effects are identified separately from the treatment effects because of the inclusion of the nonparticipant observations (recall that with the before-after estimator, the constant term was attributed to the treatment effect, which is not the case here).
The DID estimator is unbiased and consistent if $E[\varepsilon_{it'} \mid P_{it'}, X_{it'}] = 0$, which would be satisfied under a fixed effect error structure. With more than two time periods, the DID estimator can be implemented using a panel data fixed effects regression.
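To see how the nonparticipant observations separate the common time effect from the treatment effect, the sketch below simulates two periods of data and computes all three estimators. The data-generating process is hypothetical: no $X$ variables, a fixed effect $f_i$ that raises the probability of enrolling, a common time shift of 0.5, and a true effect of 2.0.

```python
import random
from statistics import mean


def simulate_two_periods(n, effect=2.0, trend=0.5, seed=3):
    """Return (enrolled, y_before, y_after) for each person."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # fixed effect f_i
        # Enrollment (the indicator I_i) depends on the fixed effect.
        enrolled = 1 if f + rng.gauss(0.0, 1.0) > 0 else 0
        y_before = f + rng.gauss(0.0, 1.0)
        y_after = f + trend + effect * enrolled + rng.gauss(0.0, 1.0)
        rows.append((enrolled, y_before, y_after))
    return rows


def estimators(rows):
    """Cross-section, before-after and DID estimates from the same data."""
    part = [(b, a) for e, b, a in rows if e == 1]
    nonpart = [(b, a) for e, b, a in rows if e == 0]
    change_part = mean(a - b for b, a in part)
    change_nonpart = mean(a - b for b, a in nonpart)
    return {
        "cross_section": mean(a for _, a in part) - mean(a for _, a in nonpart),
        "before_after": change_part,
        "did": change_part - change_nonpart,
    }


results = estimators(simulate_two_periods(20000))
print({name: round(value, 2) for name, value in results.items()})
```

In this setup the cross-section estimate absorbs the gap in $f_i$ between groups, the before-after estimate absorbs the common time shift, and only the DID estimate is close to 2.0.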
28 The data required to implement the DID estimator can be either panel data or repeated cross-section data on both participants and nonparticipants. If it is implemented using repeated cross-section data, stronger assumptions are needed on the error term.
There are also ways of specifying the DID estimator as a levels equation rather than a differenced equation. For example, it can be estimated using the regression:
$Y_{it} = X_{it}\beta_0 + \lambda_t + f_i + P_{it} \cdot \mathrm{ATT}(X_{it}) + \varepsilon_{it}$ for every observed period from the pre-program period through the post-program period,
where $\varepsilon_{it} = U_{0it} + P_{it}(U_{1it} - U_{0it} - E[U_{1it} - U_{0it} \mid P_{it} = 1, X_{it}])$.
In this equation, $\lambda_t$ indicates a time-specific intercept and $f_i$ is an individual-level fixed effect (an indicator variable for each individual). Alternatively, the model could be estimated in deviation-from-mean form, in which case the fixed effect term would not need to be included since it would be differenced out.
29 If repeated cross-section data are available rather than longitudinal data, then it is not possible to estimate fixed effects. In that case, we need to impose a stronger assumption on the error term, namely that $E[\varepsilon_{it} \mid P_{it}, X_{it}] = 0$, which requires that $E[U_{0it} \mid P_{it}, X_{it}] = 0$. This means that people cannot select into the program based on their $U_{0it}$ values. This is in contrast to panel data, in which time-invariant unobservables are differenced out.
The main advantage of longitudinal (before-after or difference-in-differences) estimators over cross-sectional methods is that they allow for unobservable determinants of program participation decisions that are correlated with outcomes.
However, the fixed effects error structure that is imposed to justify these estimators requires that the unobservables correlated with participation be time invariant; it does not allow for unobservables that both vary over time and are correlated with the observed variables.
For example, we might expect correlated unobserved earnings shocks to make people more likely to participate in a social program (such as a public works program), and such shocks would not be captured by a fixed effects error structure.
30 VI. Extension: Within Estimators (One-Way Fixed Effects)
Within estimators identify program impacts from changes in outcomes within some unit, such as a family, a school or a village. The before-after and difference-in-differences estimators can also be viewed as within estimators, where the variation exploited is the change over time within a given individual. This section describes other kinds of within estimators.
Let $Y_{0ijt}$ and $Y_{1ijt}$ denote the outcomes for individual $i$, who is a member of unit $j$ and is observed at time $t$. For simplicity, at first assume that $U_{1ijt} = U_{0ijt}$. Assume a linear model for these two outcomes:
$Y_{ijt} = X_{ijt}\beta_0 + P_{ijt} \cdot \mathrm{ATT}(X_{ijt}) + \varepsilon_{ijt}$
Assume that the error term $\varepsilon_{ijt}$ ($= U_{0ijt} = U_{1ijt}$) can be decomposed as:
$\varepsilon_{ijt} = \theta_j + v_{ijt}$
31 where $\theta_j$ represents the unobservables that are assumed to be fixed for individuals within the same unit, and the $v_{ijt}$ are independent and identically distributed (i.i.d.).
Taking differences between two individuals, denoted by $i$ and $i'$, from the same unit $j$ observed in the same time period $t$ gives:
$Y_{ijt} - Y_{i'jt} = (X_{ijt} - X_{i'jt})\beta_0 + (P_{ijt} - P_{i'jt}) \cdot \mathrm{ATT}(X_{ijt}) + (v_{ijt} - v_{i'jt})$.
To estimate $\mathrm{ATT}(X_{ijt})$, regress $Y_{ijt} - Y_{i'jt}$ on $X_{ijt} - X_{i'jt}$ and on interaction terms between $P_{ijt} - P_{i'jt}$ and $X_{ijt}$.
Consistency and unbiasedness of the OLS estimator of $\mathrm{ATT}(X_{ijt})$ require that:
$E[v_{ijt} - v_{i'jt} \mid X_{ijt}, X_{i'jt}, P_{ijt}, P_{i'jt}] = 0$
This assumption implies that, within a particular unit, the individual who gets the treatment is selected without any influence of the error term $v_{ijt}$.
32 Comments on the Within Estimator
1. Because it relies on comparing the outcomes of treated and untreated persons, the approach implicitly assumes that there are no spillover effects from treating one individual onto other individuals within the same unit.
2. In the more general version of the model, where $U_{1it} \neq U_{0it}$, one must also assume that the individual in the unit who receives the treatment is selected without any influence of that individual's idiosyncratic gain from the program. That is, the program may be targeted at specific units (e.g. families or villages), but within those units the selection of participants into the program should be unrelated to their idiosyncratic gains from the program (unrelated to $U_{1it} - U_{0it}$).
3. As with the before-after and difference-in-differences estimation approaches, the within estimator just described allows treatment to be selective across units. That is, it allows $E[\varepsilon_{ijt} \mid P_{ijt}, X_{ijt}] \neq 0$, because treatment selection can be based on the unobserved heterogeneity term $\theta_j$ (heterogeneity shared among individuals within a unit).
4. When the variation being exploited for identification of the treatment effect is variation within a family, village, or school at a single point in time, the within estimator can be implemented with a single cross-section of data.
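The pairwise-difference form of the within estimator can be sketched with simulated two-person families. Everything about the setup is a hypothetical illustration: families with a higher shared unobservable $\theta_j$ are more likely to be targeted, exactly one randomly chosen member of a targeted family is treated, there are no $X$ variables or spillovers, and the true effect is 2.0.

```python
import random
from statistics import mean


def simulate_families(n_families, effect=2.0, seed=4):
    """Return one (P pair, Y pair) per two-person family."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        theta = rng.gauss(0.0, 1.0)  # unobservable shared within the family
        # Targeting across families depends on theta (selection across units).
        targeted = theta + rng.gauss(0.0, 1.0) > 0
        if targeted:
            # Within a targeted family the treated member is chosen at random.
            p = [1, 0] if rng.random() < 0.5 else [0, 1]
        else:
            p = [0, 0]
        y = [theta + effect * p_k + rng.gauss(0.0, 1.0) for p_k in p]
        families.append((p, y))
    return families


def pooled_difference(families):
    """Treated mean minus untreated mean, ignoring family membership."""
    treated = [y_k for p, y in families for p_k, y_k in zip(p, y) if p_k == 1]
    untreated = [y_k for p, y in families for p_k, y_k in zip(p, y) if p_k == 0]
    return mean(treated) - mean(untreated)


def within_estimator(families):
    """OLS without intercept of (Y_i - Y_i') on (P_i - P_i') across families."""
    num = sum((p[0] - p[1]) * (y[0] - y[1]) for p, y in families)
    den = sum((p[0] - p[1]) ** 2 for p, y in families)
    return num / den


families = simulate_families(5000)
pooled_estimate = pooled_difference(families)
within_estimate = within_estimator(families)
print(round(pooled_estimate, 2), round(within_estimate, 2))
```

Differencing within the family removes $\theta_j$, so the within estimate is close to 2.0 even though targeting across families is selective; the pooled comparison is biased upward because treated individuals all come from high-$\theta_j$ families.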
33 Discuss:
What are some sources of heterogeneity that might be shared by all individuals in a community?
What are some that might vary within a community?
What are some advantages of using the within estimator?
34 VI. Extension: Two-Way Fixed Effects and More
$Y_{it} = X_{it}\beta_0 + \lambda_t + f_i + P_{it} \cdot \mathrm{ATT}(X_{it}) + \varepsilon_{it}$ for every observed period, where $\lambda_t$ is a time-specific intercept and $f_i$ is an individual fixed effect.
What if $P$ is correlated with time-varying unobservables?
What if program enrollment expands over time?
What if the treatment effect varies over time?
The following case study illustrates some possible extensions when long panel datasets are available.
35 Case Study: Does Microfinance Reduce Rural Poverty? (Berhane and Gardebroek 2011, AJAE)
Background: The Dedebit Credit and Saving Institution (DECSI) in northern Ethiopia provides financial services for production purposes. It officially launched credit and saving programs in 1997 and expanded quickly into almost all villages in Tigray. By 2000, it was providing loans to 210,000 borrowers, with 1.4 million credit transactions amounting to 447 million Ethiopian birr (ETB) in total outstanding loans and ETB 74 million in total savings. As of 2002, its network of 9 branches and 96 sub-branches, with headquarters in the capital city of the regional state, covered more than 91% of the villages in the region and extended loans to about half a million borrowers.
To study the impact of microfinance on poverty reduction, a four-round survey with three-year intervals (1997 to 2006) was administered to 400 randomly selected rural borrowers and nonborrowers. The dataset covers household- and village-level information ranging from household characteristics, consumption, assets, credit, and savings to village infrastructure, markets, and credit contracts.
36 This analysis is based on a balanced panel of 351 households, of which 211 borrowed and 140 did not borrow in the 1997 survey.
38 Empirical Method: Consider first the following model for impact evaluation:
(1) $C_{it} = X_{it}\beta + \mathit{prog}_{it}\gamma + M_i\alpha + u_{it}$, $t = 1, 2, \ldots, T$; $i = 1, 2, \ldots, N$
where the outcome variable $C_{it}$, per capita consumption for household $i$ at time $t$, is determined by a vector of observable household-, village-, and MFI-level characteristics $X_{it}$, a program participation variable $\mathit{prog}_{it}$, and a vector $M_i$ of time-invariant unobservable variables.
The program participation variable is usually defined as a dummy variable. However, given the nature of the data, the authors define $\mathit{prog}_{it}$ as the number of years the household has been in a borrowing relationship, in order to account for the degree or intensity of participation.
Because participation is not randomly assigned, estimates of $\gamma$ may suffer from selection bias. Panel data models that allow program participation decisions to be correlated with unobservables affecting outcome variables reduce this problem. Three such models were used in the study: the standard fixed effects model, the random trend model, and a flexible random trend model.
39 The standard fixed effects estimator of equation (1) provides a consistent estimate of the borrowing impact, $\gamma$, under the assumption that all unobservables that influence the outcome of interest are time invariant, so that they can be removed by a within or first-difference transformation. However, if such individual-specific unobservables change over time, the estimate of $\gamma$ is still biased. There are two potential reasons for such effects.
First, unobserved negative economic shocks affecting households' input endowments may push households into borrowing to bridge input shortfalls, or into repeat borrowing to settle earlier debts.
Second, credit may have lasting effects on the unobservables on which selection is based. For example, unobserved household characteristics such as entrepreneurial ability, which may condition credit demand, may change over time depending on previous exposure to microfinance credit.
The individual-specific linear trend model allows both household-specific time-invariant unobservables and individual trends in time-varying unobservables to be correlated with program participation. This model remedies bias from time-invariant factors and from linear trends in time-varying factors, but not from any remaining nonlinear factors.
40 (2) $C_{it} = X_{it}\beta + \mathit{prog}_{it}\gamma + M_i\alpha + g_i t + u_{it}$
where $g_i$ is an individual trend parameter which, in addition to the level effects $M_i$, captures individual-specific growth rates over time.
A consistent estimate of $\gamma$, the treatment effect of an additional year of borrowing, can be obtained by eliminating both the time-invariant unobservables and the linear trend in time-varying unobservables that can potentially bias $\gamma$.
Equation (2) is first differenced to eliminate $M_i$, which gives a standard fixed effects model:
(3) $C^*_{it} = X^*_{it}\beta + \mathit{prog}^*_{it}\gamma + g^*_i + u^*_{it}$; $t = 1, 2, \ldots, T$
where $C^*_{it} = C_{it} - C_{it-1}$, $X^*_{it} = X_{it} - X_{it-1}$, $u^*_{it} = u_{it} - u_{it-1}$ and $g^*_i = g_i t - g_i(t - 1) = g_i$.
Equation (3) is then consistently estimated using a standard fixed effects approach. Alternatively, one can second-difference equation (3) to eliminate $g^*_i$ and estimate by pooled OLS.
Note that $\gamma$ can be estimated consistently from this specification only if $T > 3$.
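The two-step logic (first-difference to remove $M_i$, then a within transformation to remove $g_i$) can be sketched on simulated data. This is not the authors' code or data: the sketch omits $X_{it}$, uses four periods, measures $\mathit{prog}_{it}$ in periods rather than years, sets the true $\gamma$ to 1.0, and assumes a hypothetical selection rule in which faster-growing households start borrowing earlier.

```python
import random
from statistics import mean


def slope(xs, ys):
    """OLS slope of y on x (regression with an intercept)."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)


def demean_by_household(series):
    """Subtract each household's own mean (the within transformation)."""
    out = []
    for h in series:
        h_bar = mean(h)
        out.extend(v - h_bar for v in h)
    return out


def first_difference(series):
    """Period-to-period changes for each household."""
    return [[h[t] - h[t - 1] for t in range(1, len(h))] for h in series]


def simulate_households(n, gamma=1.0, periods=4, seed=5):
    rng = random.Random(seed)
    prog_all, c_all = [], []
    for _ in range(n):
        m = rng.gauss(0.0, 1.0)  # level effect M_i (alpha set to 1)
        g = rng.gauss(0.0, 0.5)  # individual trend g_i
        # Hypothetical rule: faster-growing households borrow earlier.
        if g > 0.3:
            start = 1
        elif g > 0.0:
            start = 3
        elif g > -0.3:
            start = 4
        else:
            start = periods + 1  # never borrows
        prog = [max(0, t - start + 1) for t in range(1, periods + 1)]
        c = [m + g * t + gamma * prog[t - 1] + rng.gauss(0.0, 0.5)
             for t in range(1, periods + 1)]
        prog_all.append(prog)
        c_all.append(c)
    return prog_all, c_all


def standard_fe(prog_all, c_all):
    """Within estimator on levels, as for equation (1)."""
    return slope(demean_by_household(prog_all), demean_by_household(c_all))


def random_trend(prog_all, c_all):
    """First-difference (removes M_i), then demean (removes g_i)."""
    d_prog, d_c = first_difference(prog_all), first_difference(c_all)
    return slope(demean_by_household(d_prog), demean_by_household(d_c))


prog_all, c_all = simulate_households(4000)
fe_estimate = standard_fe(prog_all, c_all)
trend_estimate = random_trend(prog_all, c_all)
print(round(fe_estimate, 2), round(trend_estimate, 2))
```

In this setup the standard fixed effects estimate overstates $\gamma$ because the household trend $g_i t$ is correlated with accumulated borrowing, while the random trend estimate is close to 1.0. Only households whose borrowing status changes across the differenced periods contribute to the second step.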
41 An advantage of long panel data sets is that they enable one to estimate the impact of long-term rather than one-shot program participation. In addition to shifting the levels in each borrowing year, repeated participation may affect the rate of change of the outcome variables relative to nonparticipation. This can be accounted for by including the interaction $\mathit{dumprog}_{it} \times t$ in equation (2):
(4) $C_{it} = X_{it}\beta + \gamma_1 \mathit{prog}_{it} + \gamma_2 (\mathit{dumprog}_{it} \times t) + M_i\alpha + g_i t + u_{it}$
where $\mathit{dumprog}_{it}$ is a dummy equal to 1 if individual $i$ participated in credit at time $t$.
This specification provides impact estimates robust to random periodical changes by allowing the individual-specific trend to vary with participation over time. Estimation follows the same procedures as for equation (2).
42 A more flexible specification allows program indicators to reflect the frequency of participation in each year. This is done by replacing $\mathit{prog}_{it}$ and $\mathit{dumprog}_{it} \times t$ in equation (4) with a series of program indicators for each loan cycle for which the participant has been in the program:
(5) $C_{it} = X_{it}\beta + \gamma_1 \mathit{prog1}_{it} + \cdots + \gamma_k \mathit{progk}_{it} + g_i t + M_i\alpha + u_{it}$
where $\mathit{progj}_{it} = 1$ if household $i$ has been in the program for exactly $j$ years in year $t$ and zero otherwise; $k$ is the maximum number of (observed) years a household can be in the program.
Program indicators attach more weight to differences between households' degree of participation regardless of the year of participation. More weight is also given to the timing of participation within each indicator.
Estimation follows the same procedures as for equations (2) and (4).
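Constructing the indicators in equation (5) from a household's borrowing history is mechanical. The helper below is a hypothetical illustration: it assumes the history is a list of 0/1 flags, one per observed year, and that "years in the program" means the cumulative number of years in which the household borrowed.

```python
def program_indicators(participation, k):
    """Build prog_it and the indicators prog1_it ... progk_it for one household.

    participation: list of 0/1 flags, one per observed year.
    k: maximum number of years a household can be in the program.
    """
    prog, indicators = [], []
    years = 0
    for borrowed in participation:
        years += borrowed                # cumulative years in the program
        prog.append(years)
        # progj_it = 1 only for the j that equals the years accumulated so far.
        indicators.append([1 if years == j else 0 for j in range(1, k + 1)])
    return prog, indicators


prog, indicators = program_indicators([0, 1, 1, 0], k=4)
print(prog)        # [0, 1, 2, 2]
print(indicators)  # [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
```

A household that has not yet borrowed has all indicators equal to zero, so the coefficients $\gamma_1, \ldots, \gamma_k$ are each measured relative to non-participation.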
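43 Table: Regression results by model (coefficient values did not survive transcription; only the significance markers remain).
Columns: Standard FE (Eqn. 1) | Individual Trend (Eqn. 2) | Indiv. Trend + Trend Based on Participation (Eqn. 4) | Flexible Random Trend Model (Eqn. 5)
Rows, with surviving significance markers in source order:
No. of years borrowed: ***, **, **
One year borrowing: **
Two years borrowing: **
Three years borrowing
Four years borrowing: **
Random trend*borrowing: **
Year 2006 dummy: ***, ***, ***, ***
Age of HH head
Age
Cultivated land size (in Tsimad = 0.25 hectare)
Land size
Intercept
Within R-squared
N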
44 VII. Extension: Difference-in-Differences Matching
Matching estimators assume that outcomes are independent of program participation after conditioning on observables. However, for a variety of reasons, there may be systematic differences between participant and nonparticipant outcomes even after conditioning on observables. Such differences may arise, for example, because of program selectivity on unmeasured characteristics (such as motivation) or because of systematic differences in the level of outcomes across the different communities in which the participants and nonparticipants reside.
A difference-in-differences (DID) matching strategy, as defined in Heckman, Ichimura and Todd (1997, 1998), allows program participation to be based on unobservables as long as the unobservables do not vary over time. This approach is analogous to the standard difference-in-differences regression estimator, but it reweights the participant and nonparticipant observations according to the weighting functions implied by matching estimators.
45 To see how this works, we need to start with the following independence assumption:

$(\Delta Y_0, \Delta Y_1) \perp P \mid Z$

where $\Delta Y_0 = Y_{0t} - Y_{0t'}$, $\Delta Y_1 = Y_{1t} - Y_{1t'}$, $t'$ and $t$ are time periods before and after the program enrollment date, respectively, and $\perp$ indicates statistical independence. This is a key assumption of the DID matching approach. Intuitively, it means that $P$ does not help predict changes in the value of $\Delta Y_0$ (i.e. $Y_{0t} - Y_{0t'}$) conditional on $\Pr(Z)$. Thus, individuals cannot select into the program based on anticipated changes in $Y_0$ (i.e. $Y_{0t} - Y_{0t'}$).

This estimator also requires the support condition:

$0 < \text{Prob}[P = 1 \mid Z] < 1$

If interest centers on the ATT(X) parameter, then the matching independence assumption needs to be made only for $\Delta Y_0$.
46 As with cross sectional matching, nonparametric weighting can be used to construct matches. The local linear DID estimator is given by:

$$ATT_{KDM} = \frac{1}{n_1} \sum_{i \in I_1 \cap S_P} \Big\{ (Y_{1ti} - Y_{1t'i}) - \sum_{j \in I_0 \cap S_P} W_{ij}\,(Y_{0tj} - Y_{0t'j}) \Big\}$$

where the weights correspond to the local linear weights defined in Session T9. If repeated cross section data are available, instead of longitudinal data, the estimator can be implemented as:

$$ATT_{KDM} = \frac{1}{n_{1t}} \sum_{i \in I_{1t} \cap S_P} \Big\{ Y_{1ti} - \sum_{j \in I_{0t} \cap S_P} W_{ij}\, Y_{0tj} \Big\} - \frac{1}{n_{1t'}} \sum_{i \in I_{1t'} \cap S_P} \Big\{ Y_{1t'i} - \sum_{j \in I_{0t'} \cap S_P} W_{ij}\, Y_{0t'j} \Big\}$$

where $I_{1t}$, $I_{1t'}$, $I_{0t}$, $I_{0t'}$ denote the treatment and comparison group datasets in each time period.
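The longitudinal version of the estimator can be sketched in a few lines. The version below is a simplified illustration, not the estimator from the slides: it uses normalized Gaussian kernel weights on an estimated propensity score instead of the local linear weights from Session T9, and it approximates the common support region by the range of comparison-group scores. The function name, the bandwidth, and the example numbers are all hypothetical.

```python
import numpy as np

def did_matching_att(p_treat, dy_treat, p_comp, dy_comp, bandwidth=0.1):
    """Kernel-weighted DID matching estimate of the ATT on longitudinal data.

    p_*  : propensity scores; dy_* : each unit's outcome change (after minus before).
    Assumption: Gaussian kernel weights stand in for the local linear weights.
    """
    p_treat = np.asarray(p_treat, dtype=float)
    dy_treat = np.asarray(dy_treat, dtype=float)
    p_comp = np.asarray(p_comp, dtype=float)
    dy_comp = np.asarray(dy_comp, dtype=float)

    # Crude common-support rule: keep participants inside the comparison score range.
    in_support = (p_treat >= p_comp.min()) & (p_treat <= p_comp.max())
    p_s, dy_s = p_treat[in_support], dy_treat[in_support]

    # W_ij: kernel weight of comparison unit j for participant i; each row sums to 1.
    u = (p_s[:, None] - p_comp[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u ** 2)
    weights = kernel / kernel.sum(axis=1, keepdims=True)

    matched_change = weights @ dy_comp   # weighted comparison-group change for each i
    return float(np.mean(dy_s - matched_change))

# Hypothetical data: comparison outcomes rise by 1, participant outcomes by 3.
p_comp = [0.2, 0.4, 0.6, 0.8]
dy_comp = [1.0, 1.0, 1.0, 1.0]
att = did_matching_att([0.3, 0.5, 0.7], [3.0, 3.0, 3.0], p_comp, dy_comp)
```

Because every comparison unit changes by the same amount in this toy example, the weights do not matter and the estimate is just the gap between the two groups' changes. With real data the choice of bandwidth and of the support rule can affect the estimate, so both would need to be examined.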