Lecture 4: Linear panel models


Luc Behaghel (PSE), February 2009

Introduction
Panel = repeated observations of the same individuals (e.g., firms, workers, countries) over several time periods (e.g., years, weeks).
Two dimensions (cross-section and time-series), two indices: $i$ for individuals and $t$ for time periods.
Examples:
- Household panels (German Socio-Economic Panel, Panel Study of Income Dynamics, Enquête Emploi, ...);
- Panels of firms or workers based on administrative data;
- Panels of countries in the growth regressions literature.



1. Balanced vs. unbalanced panel. If unbalanced, the issue is sample selection. Here: balanced.
2. Short $T$ and large $N$ $\Rightarrow$ asymptotic properties similar to the cross-section case.
3. Panel $\neq$ repeated cross-sections.


Attraction of panel data
1. More observations, more precision.
2. New (more credible?) ways to identify causal effects, e.g. difference-in-differences.
3. Clear comparative advantage for understanding the dynamics of individual behavior and economic effects.

Approach followed in this lecture
Main themes only:
1. understand the two dimensions of panel data (longitudinal and cross-section);
2. assess how panel data can help us identify causal effects.

Outline
1. Identification
2. Inference
3. Difference-in-differences methods
4. Extensions

Identification
Example: impact of wages on the number of hours of labor supplied.
Panel data on 532 males for each of the 10 years from 1979 to 1988.
Two variables: the log of hours worked ($h$) and the log of wages ($w$).
The data set has 5320 pairs $(w_{it}, h_{it})$, $i = 1, \dots, 532$; $t = 1, \dots, 10$.

Total dimension
Higher wages are associated with longer hours.
[Figure: All dimensions]

Between dimension
People who have higher wages on average over the 10 years of their career tend to work longer hours.
[Figure: Between dimension]

Within dimension
When a given worker has a higher wage than her usual (average) wage, she works longer hours.
[Figure: Within dimension]


Causal interpretation?
$\Rightarrow$ Is there a risk of omitted variable bias?
1. Between dimension $\rightarrow$ permanent unobserved heterogeneity: heterogeneity bias.
2. Within dimension $\rightarrow$ shock simultaneously driving wages and hours: simultaneity bias.

Decomposition of total variation
Notation: $s$ is the total standard deviation, $s_w$ the within standard deviation, and $s_b$ the between standard deviation, such that
$$s^2 = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}(z_{it}-\bar{z})^2, \qquad s_w^2 = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}(z_{it}-\bar{z}_i)^2, \qquad s_b^2 = \frac{1}{N}\sum_{i=1}^{N}(\bar{z}_i-\bar{z})^2,$$
and then $s^2 = s_w^2 + s_b^2$.
Proof: show that
$$\sum_{i=1}^{N}\sum_{t=1}^{T}(z_{it}-\bar{z})^2 = \sum_{i=1}^{N}\sum_{t=1}^{T}(z_{it}-\bar{z}_i)^2 + \sum_{i=1}^{N}\sum_{t=1}^{T}(\bar{z}_i-\bar{z})^2.$$
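The decomposition of total variation into within and between parts can be checked numerically. The following is a minimal sketch for a balanced panel; the function name `variance_decomposition` and the toy data are illustrative assumptions, not part of the lecture.

```python
def variance_decomposition(panel):
    """Return (total, within, between) variances for a balanced panel.

    `panel` is a list of N lists, one per individual, each holding T values.
    """
    n = len(panel)
    t = len(panel[0])
    grand_mean = sum(sum(row) for row in panel) / (n * t)
    indiv_means = [sum(row) / t for row in panel]
    # Total: deviations of every observation from the grand mean.
    total = sum((z - grand_mean) ** 2 for row in panel for z in row) / (n * t)
    # Within: deviations of every observation from its individual mean.
    within = sum(
        (z - m) ** 2 for row, m in zip(panel, indiv_means) for z in row
    ) / (n * t)
    # Between: deviations of individual means from the grand mean.
    between = sum((m - grand_mean) ** 2 for m in indiv_means) / n
    return total, within, between


# Hypothetical toy panel: N = 2 individuals, T = 3 periods.
panel = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
total, within, between = variance_decomposition(panel)
```

On this toy panel the between part dominates, because the two individual means (2 and 6) are far apart relative to the variation around each mean.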



Identification (continued)
Formalisation: correlated and uncorrelated individual effects
Linear model:
$$y_{it} = x_{it}\beta + u_{it}.$$
Specific panel ingredients:
1. Time-varying and time-invariant variables: $y_{it} = \beta_0 + \beta_1 x_{it} + \beta_2 z_i + u_{it}$.
2. Error term with two components: a permanent component $c_i$ (the individual effect) and a transitory component $\varepsilon_{it}$: $u_{it} = c_i + \varepsilon_{it}$.
3. Period effects (trends, macro shocks, etc.) $\Rightarrow$ time dummies.
Therefore,
$$y_{it} = \beta_0 + \beta_1 x_{it} + \beta_2 z_i + \lambda_t + c_i + \varepsilon_{it} \qquad (1)$$
$$\phantom{y_{it}} = \beta_0 + \beta_1 x_{it} + \beta_2 z_i + \sum_{\tau=2}^{T}\lambda_\tau \mathbf{1}[t=\tau] + c_i + \varepsilon_{it}.$$
Identification question: can we identify $\beta_0$, $\beta_1$, $\beta_2$ and the $\lambda_\tau$'s?

Remark
Why do we treat the $\lambda_\tau$'s differently from the $c_i$'s? $T$ is small whereas $N$ is large $\Rightarrow$ we can hope to estimate the (small number of) $\lambda_\tau$'s consistently; this is not true for the $c_i$'s.

Assumptions on the individual effect
$c_i$ is not directly observable $\Rightarrow$ include it in the error term, just like the error term in a cross-sectional regression?
Condition for consistency:
$$\mathrm{Cov}(u_{it}, x_{it}) = \mathrm{Cov}(u_{it}, z_i) = 0$$
or equivalently
$$\mathrm{Cov}(\varepsilon_{it}, x_{it}) = \mathrm{Cov}(\varepsilon_{it}, z_i) = 0 \quad\text{and}\quad \mathrm{Cov}(c_i, x_{it}) = \mathrm{Cov}(c_i, z_i) = 0.$$
$\mathrm{Cov}(c_i, x_{it}) = \mathrm{Cov}(c_i, z_i) = 0$ is the uncorrelated individual effects assumption.
Interpretation: what are the unobserved permanent determinants of $y$?

Identification in the presence of correlated individual effects
Good news of panel data: even without instruments, we can still hope to get consistent estimates of the coefficients of the time-varying variables (i.e. estimate $\beta_1$, but not $\beta_2$).
Approach: transform the variables to get rid of $c_i$.
First approach: first-difference model
$$y_{it} - y_{it-1} = \beta_1 (x_{it} - x_{it-1}) + \varepsilon_{it} - \varepsilon_{it-1},$$
often denoted
$$\Delta y_{it} = \beta_1 \Delta x_{it} + \Delta\varepsilon_{it}. \qquad (2)$$
Second approach: within model
$$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + \varepsilon_{it} - \bar{\varepsilon}_i. \qquad (3)$$
- $c_i$ disappears. Therefore, if we consider $\Delta\varepsilon_{it}$ as our error term, and if we assume $\mathrm{Cov}(\Delta x_{it}, \Delta\varepsilon_{it}) = 0$, OLS provides a consistent estimate of $\beta_1$.
- $\beta_2$ and $z_i$ disappear as well. This means that our differencing strategy does not yield an estimate of $\beta_2$.

Not surprisingly, we have returned to the within and between dimensions previously described:
1. When estimating the between model, we need to ask whether $c_i$ is correlated with $\bar{x}_i$ and/or $z_i$. If yes, OLS leads to an omitted variable bias called the heterogeneity bias.
2. When estimating the within model, we need to ask whether $\varepsilon_{it} - \bar{\varepsilon}_i$ is correlated with $x_{it} - \bar{x}_i$. If yes, there is an omitted variable bias called the simultaneity bias.

Remark 1: Asymmetry.
Condition for consistency of within: $\mathrm{Cov}(x_{it} - \bar{x}_i, \varepsilon_{it} - \bar{\varepsilon}_i) = 0$.
Condition for consistency of between: $\mathrm{Cov}(\bar{x}_i, c_i + \bar{\varepsilon}_i) = 0$ $\Rightarrow$ absence of correlation for the permanent and for the transitory unobservable components of the error term:
$$\mathrm{Cov}(c_i, \bar{x}_i) = 0, \qquad \mathrm{Cov}(\bar{\varepsilon}_i, \bar{x}_i) = \mathrm{Cov}\Big(\frac{1}{T}\sum_{t=1}^{T}\varepsilon_{it}, \frac{1}{T}\sum_{t=1}^{T}x_{it}\Big) = 0.$$
If the $(\varepsilon_{it}, x_{it})_{t=1,\dots,T}$ are i.i.d., with $\mathrm{Cov}(\varepsilon_{it}, x_{it'}) = 0$ for $t \neq t'$, the second condition means $\mathrm{Cov}(\varepsilon_{it}, x_{it}) = 0$ (for all $t$). This means that $\varepsilon_{it}$ and $x_{it}$ must not be simultaneously driven by an unobserved shock; that is, the between model can also suffer from a simultaneity bias.

Remark 2
The first-difference and the within models have similar conditions for consistency. They are in general very close. In the case where $T = 2$, they are even identical. Check it by noting that, for any variable $w$, $w_{i2} - \bar{w}_i = \frac{1}{2}\Delta w_{i2}$ (and $w_{i1} - \bar{w}_i = -\frac{1}{2}\Delta w_{i2}$).
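The claim that the within and first-difference estimators coincide when $T = 2$ can be verified on a small example. This is a sketch for a single regressor; the function names and the toy numbers are illustrative assumptions.

```python
def slope_through_origin(x, y):
    """OLS slope of y on x without intercept (data already transformed)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def within_slope(x_panel, y_panel):
    """Within estimator: regress individual-demeaned y on demeaned x."""
    xd, yd = [], []
    for xi, yi in zip(x_panel, y_panel):
        x_bar = sum(xi) / len(xi)
        y_bar = sum(yi) / len(yi)
        xd.extend(v - x_bar for v in xi)
        yd.extend(v - y_bar for v in yi)
    return slope_through_origin(xd, yd)


def first_difference_slope(x_panel, y_panel):
    """First-difference estimator: regress changes in y on changes in x."""
    dx = [xi[t] - xi[t - 1] for xi in x_panel for t in range(1, len(xi))]
    dy = [yi[t] - yi[t - 1] for yi in y_panel for t in range(1, len(yi))]
    return slope_through_origin(dx, dy)


# Hypothetical panel with N = 3 individuals and T = 2 periods.
x_panel = [[1.0, 2.0], [2.0, 5.0], [0.0, 1.0]]
y_panel = [[3.0, 4.0], [1.0, 8.0], [5.0, 5.0]]
beta_within = within_slope(x_panel, y_panel)
beta_fd = first_difference_slope(x_panel, y_panel)
```

Each demeaned value is plus or minus half of the first difference, so numerator and denominator are both scaled by the same factor and the two slopes agree. With $T > 2$ the two estimators generally differ.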

Remark 3
Cases where the structure of the model itself allows you to reject the assumption that $\mathrm{Cov}(\Delta x_{it}, \Delta\varepsilon_{it}) = 0$ (or that $\mathrm{Cov}(x_{it} - \bar{x}_i, \varepsilon_{it} - \bar{\varepsilon}_i) = 0$). For instance, if $x$ is the lagged dependent variable: $x_{it} = y_{it-1}$. Then we have
$$\Delta y_{it} = \beta_1 \Delta y_{it-1} + \Delta\varepsilon_{it}$$
and the question is whether $\mathrm{Cov}(\Delta y_{it-1}, \Delta\varepsilon_{it}) = 0$. Now,
$$\mathrm{Cov}(\Delta y_{it-1}, \Delta\varepsilon_{it}) = \mathrm{Cov}(\beta_1 \Delta y_{it-2} + \Delta\varepsilon_{it-1}, \Delta\varepsilon_{it}) = \beta_1 \mathrm{Cov}(\Delta y_{it-2}, \Delta\varepsilon_{it}) + \mathrm{Cov}(\Delta\varepsilon_{it-1}, \Delta\varepsilon_{it}).$$
Now, $\mathrm{Cov}(\Delta\varepsilon_{it-1}, \Delta\varepsilon_{it})$ is unlikely to be 0. Even in the case where there is no serial correlation (i.e. $\mathrm{Cov}(\varepsilon_{it}, \varepsilon_{it'}) = 0$ for $t \neq t'$), $\mathrm{Cov}(\Delta\varepsilon_{it-1}, \Delta\varepsilon_{it}) = -\mathrm{Var}(\varepsilon_{it-1}) \neq 0$.
Therefore, when the individual effects are correlated and there is a lagged dependent variable as a regressor, we are in trouble: first-differencing the data will not be sufficient to get a consistent estimator. Other methods will be required; they will actually use instrumental variables: we postpone this to section 5.
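The covariance between two consecutive differenced errors can be expanded term by term. The short derivation below is a sketch that assumes no serial correlation in $\varepsilon_{it}$, so every cross-period covariance vanishes and only the shared term $\varepsilon_{it-1}$ survives:

```latex
\begin{aligned}
\mathrm{Cov}(\Delta\varepsilon_{it-1}, \Delta\varepsilon_{it})
  &= \mathrm{Cov}(\varepsilon_{it-1} - \varepsilon_{it-2},\; \varepsilon_{it} - \varepsilon_{it-1}) \\
  &= \mathrm{Cov}(\varepsilon_{it-1}, \varepsilon_{it})
     - \mathrm{Var}(\varepsilon_{it-1})
     - \mathrm{Cov}(\varepsilon_{it-2}, \varepsilon_{it})
     + \mathrm{Cov}(\varepsilon_{it-2}, \varepsilon_{it-1}) \\
  &= -\mathrm{Var}(\varepsilon_{it-1}).
\end{aligned}
```

The negative sign reflects that $\varepsilon_{it-1}$ enters $\Delta\varepsilon_{it-1}$ positively and $\Delta\varepsilon_{it}$ negatively.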

Estimation
Questions to consider:
1. Can we assume that the individual effects are uncorrelated? This splits the available estimators into two groups.
2. Within a given group of estimators, the next question is efficiency: are there gains to be made from taking into account the specific error structure of panel data?
3. For a given estimator, how should we derive consistent standard errors (again, taking into account the fact that the error structure is more complex than in the cross-section case)?
Notation: $y_{it} = x_{it}\beta + c_i + \varepsilon_{it}$.

Estimation: estimators when individual effects are uncorrelated
Pooled OLS
The key assumption here is that for each period $t$ (and for each individual $i$), regressors and errors are uncorrelated:
$$\mathrm{Cov}(c_i, x_{it}) = \mathrm{Cov}(\varepsilon_{it}, x_{it}) = \mathrm{Cov}(u_{it}, x_{it}) = 0. \qquad (4)$$
If the model has an intercept, we also have $E(u_{it}) = 0$. Pooling all the $NT$ observations ($j = 1, \dots, NT$):
$$y_j = x_j\beta + u_j, \qquad E(x_j' u_j) = E(u_j) = 0$$
$\Rightarrow$ Same as the standard cross-section model $\Rightarrow$ estimate by OLS: the pooled OLS estimator $\hat{\beta}_{POLS}$ is the OLS estimator obtained by pooling all the observations and regressing $y$ on $x$.

Rem. 1: Comparison to cross-section estimators
On period 1 only: $i = 1, \dots, N$ and
$$y_{i1} = x_{i1}\beta + u_{i1}, \qquad E(x_{i1}' u_{i1}) = E(u_{i1}) = 0$$
$\Rightarrow$ consistent estimator by OLS, $\hat{\beta}_{1,OLS}$. Similarly for the other $T - 1$ periods. Gain from pooling: precision.

Rem. 2: Need for panel-robust standard errors
Cross-sections: the (asymptotic) standard error of an estimate is inversely proportional to the square root of the sample size.
Panel data: things are made a bit more complex by the fact that the error term has two parts.
$\Rightarrow$ intuitively, adding $N$ new individuals to a cross-section with $N$ observations adds more information than adding a second period with the same individuals. Part of the information that was in the first period is repeated by the information in the second period. We need to account for that. This is what panel-robust standard errors do.
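The intuition that a second period adds less information than a second batch of individuals can be illustrated in the simplest possible setting: estimating a mean rather than a regression coefficient. The sketch below assumes that the observations of a given individual share a common correlation `rho` and that individuals are independent; the function name and the numbers are hypothetical.

```python
def variance_of_mean(n_individuals, n_periods, sigma2, rho):
    """Variance of the overall sample mean with equicorrelated errors.

    Each individual contributes n_periods observations with variance sigma2
    and pairwise correlation rho; different individuals are independent.
    """
    # Repeated observations on the same individual inflate the variance
    # by the factor 1 + (T - 1) * rho relative to independent draws.
    design_effect = 1.0 + (n_periods - 1) * rho
    return sigma2 * design_effect / (n_individuals * n_periods)


# Same total number of observations (200), two ways of getting there.
more_people = variance_of_mean(200, 1, sigma2=1.0, rho=0.5)   # 2N individuals, 1 period
more_periods = variance_of_mean(100, 2, sigma2=1.0, rho=0.5)  # N individuals, 2 periods
no_correlation = variance_of_mean(100, 2, sigma2=1.0, rho=0.0)
```

With `rho = 0.5` the two-period design has a variance 1.5 times larger than the design with twice as many individuals; with `rho = 0` the two designs are equally informative. Default standard errors implicitly assume the latter case.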

Parenthesis: default and robust standard errors
Cross-section case: default standard errors are computed under the assumption that all observations are independent, i.e. that the error terms are i.i.d.:
$$\mathrm{Var}(u_i \mid x_i) = \sigma^2 \text{ for all } i, \qquad \mathrm{Cov}(u_i, u_{i'}) = 0 \text{ for all } i \neq i'.$$
Sometimes, the assumption does not hold. Example: linear probability model (cf. lecture 1).
$\Rightarrow$ robust standard errors: s.e. that are valid even when the error terms are not i.i.d. (often larger than default s.e.)

Error structure in panel data
The i.i.d. assumption would be
$$\begin{cases} \mathrm{Var}(u_{it} \mid x_{it}) = \sigma^2 & \text{for all } i \text{ and } t \\ \mathrm{Cov}(u_{it}, u_{it'}) = 0 & \text{for all } i \text{ and } t \neq t' \\ \mathrm{Cov}(u_{i't}, u_{it}) = 0 & \text{for all } t \text{ and } i \neq i' \end{cases}$$
$\mathrm{Cov}(u_{it}, u_{it'}) = 0$ for all $i$ and $t \neq t'$ is unlikely. At the minimum, we have
$$\mathrm{Cov}(u_{it}, u_{it'}) = \mathrm{Cov}(c_i + \varepsilon_{it}, c_i + \varepsilon_{it'}) = \mathrm{Var}(c_i) \neq 0.$$
$\Rightarrow$ always a need for robust s.e. with POLS
$\Rightarrow$ the robustness is with respect to correlations across the different observations of the same individual, i.e. correlations within the clusters constituted by the different individuals. In Stata, the command reads
    regress y x, robust cluster(id)
where id is the individual identifier.
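To make the idea of clustering by individual concrete, here is a minimal sketch of a cluster-robust "sandwich" variance for a one-regressor model without intercept. It is the basic textbook version without the finite-sample corrections that statistical packages usually apply, so it will not reproduce Stata's numbers exactly; the function name and data are hypothetical.

```python
from collections import defaultdict


def slope_and_variances(x, y, cluster_ids):
    """OLS slope through the origin with two variance estimates.

    Returns (beta, var_robust, var_cluster):
    - var_robust treats all observations as independent
      (heteroskedasticity-robust only);
    - var_cluster allows arbitrary correlation within each cluster.
    """
    sxx = sum(v * v for v in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    resid = [b - beta * a for a, b in zip(x, y)]
    # Observation-level scores x * e, squared one by one.
    meat_robust = sum((a * e) ** 2 for a, e in zip(x, resid))
    # Sum the scores within each cluster first, then square the sums.
    scores = defaultdict(float)
    for a, e, g in zip(x, resid, cluster_ids):
        scores[g] += a * e
    meat_cluster = sum(s * s for s in scores.values())
    return beta, meat_robust / sxx ** 2, meat_cluster / sxx ** 2


# Two individuals observed twice; residuals share a sign within individual.
x = [1.0, 2.0, 1.0, 2.0]
y = [3.0, 5.0, 1.0, 3.0]
ids = [1, 1, 2, 2]
beta, var_robust, var_cluster = slope_and_variances(x, y, ids)
```

Because the residuals of each individual have the same sign, summing the scores within clusters before squaring gives a larger variance than treating the four observations as independent.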

Hours and wages example

Between estimator
Condition for consistency: $\mathrm{Cov}(c_i + \bar{\varepsilon}_i, \bar{x}_i) = 0$.
Sufficient condition:
$$\mathrm{Cov}(c_i, x_{it}) = \mathrm{Cov}(\varepsilon_{it'}, x_{it}) = 0 \text{ for all } i, t \text{ and } t'. \qquad (5)$$
$\Rightarrow$ define the between estimator, $\hat{\beta}_B$, as the OLS estimator from regressing $\bar{y}_i$ on $\bar{x}_i$.
One observation per individual: correlation of error terms is less of an issue. Robust standard errors may correct other problems (heteroskedasticity).

Random effects (RE) estimator
Parenthesis: Generalized least squares
If the error terms are not i.i.d., GLS is an efficient alternative to OLS.
Idea: weight the observations according to the information they provide. The additional information (and the noise) provided by an observation depends on the variance-covariance matrix of the error term, which is estimated in a first step.

RE model = one application of GLS to panel data
Specific model for the error term:
$$\begin{cases} \mathrm{Var}(u_{it} \mid x_{it}) = \mathrm{Var}(c_i) + \mathrm{Var}(\varepsilon_{it}) = \sigma_c^2 + \sigma_\varepsilon^2 & \text{for all } i \text{ and } t \\ \mathrm{Cov}(u_{it}, u_{it'}) = \sigma_c^2 & \text{for all } i \text{ and } t \neq t' \\ \mathrm{Cov}(u_{i't'}, u_{it}) = 0 & \text{for all } t, t' \text{ and } i \neq i' \end{cases}$$
Given this structure of disturbances, it can be shown that the efficient GLS estimator can be calculated from the OLS regression of $y_{it} - \lambda\bar{y}_i$ on $x_{it} - \lambda\bar{x}_i$, with
$$\lambda = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_c^2}}.$$

Comparison to POLS and WITHIN
- If $\sigma_c^2$ goes to 0, $\lambda$ goes to 0. We come back to the POLS estimator. Intuition: when $\sigma_c^2$ is close to 0, all the observations have the same amount of noise, and they are not correlated with each other. It is therefore optimal to weight them equally.
- If $T$ goes to infinity, then $\lambda$ goes to 1, and we have the within estimator. No loss of efficiency in discarding the between information. Intuition: in a very long panel, the between information becomes negligible.
- In practice of course, $\lambda$ is strictly between 0 and 1. The RE estimator can be seen as an intermediate estimator between the POLS and the within estimators. It can also be shown that it is a weighted average of the between and the within estimators.
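The quasi-demeaning weight $\lambda = 1 - \sigma_\varepsilon / \sqrt{\sigma_\varepsilon^2 + T\sigma_c^2}$ and its two limiting cases can be checked numerically. The sketch below takes the variance components as known inputs (in practice they are estimated in a first step); the function names are illustrative assumptions.

```python
import math


def re_lambda(sigma2_eps, sigma2_c, n_periods):
    """Random effects weight: 1 - sigma_eps / sqrt(sigma_eps^2 + T * sigma_c^2)."""
    return 1.0 - math.sqrt(sigma2_eps / (sigma2_eps + n_periods * sigma2_c))


def quasi_demean(series, lam):
    """Subtract lam times the individual mean from each observation."""
    mean = sum(series) / len(series)
    return [v - lam * mean for v in series]


lam_pooled = re_lambda(1.0, 0.0, 10)       # no individual effect: pooled OLS
lam_middle = re_lambda(1.0, 1.0, 3)        # intermediate case
lam_long = re_lambda(1.0, 1.0, 10 ** 8)    # very long panel: close to within
transformed = quasi_demean([1.0, 2.0, 3.0], lam_middle)
```

With `lam = 0` the data are left untouched (pooled OLS), with `lam = 1` the full individual mean is removed (within), and intermediate values remove only part of it.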

Remark 1
Implementing the RE estimator requires estimating $\lambda$ in a first step. This is done by first computing the residuals from a POLS regression, and then looking at the empirical correlations between residuals. Stata's xtreg command with the re option does the job.
Remark 2
The RE specification does better than POLS if the model of the variance-covariance matrix of errors is correct. If not, it might do better or worse, although it will still be consistent. Moreover, the standard errors then need to be corrected: as in POLS, this is done by using panel-robust standard errors.

Estimation: estimators when individual effects are correlated
We cannot include $c_i$ in the disturbance anymore: omitted variable bias.
Within or Fixed Effects (FE) estimator
The within estimator ($\hat{\beta}_W$) is defined as the OLS estimator obtained by regressing $y_{it} - \bar{y}_i$ on $x_{it} - \bar{x}_i$. The key assumption for consistency is
$$\mathrm{Cov}(\varepsilon_{it} - \bar{\varepsilon}_i, x_{it} - \bar{x}_i) = 0,$$
for which a sufficient condition is
$$\mathrm{Cov}(\varepsilon_{it}, x_{it'}) = 0 \text{ for each } i, t \text{ and } t'.$$
Within = Fixed effects
$\hat{\beta}_W$ is equal to the fixed effects estimator obtained from the OLS regression of $y_{it}$ on $x_{it}$ and $N$ individual dummies (one for each individual). In other words, controlling for individual unobserved characteristics by estimating $c_i$ as the coefficient on a dummy for individual $i$ amounts to doing a within estimation.
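A small noise-free example shows how the within transformation removes a correlated individual effect that biases pooled OLS. This is a sketch for one regressor; the function names and data are hypothetical, and the individual intercepts are recovered as $\bar{y}_i - \hat{\beta}_W \bar{x}_i$, which is what the dummy-variable regression would report.

```python
def fixed_effects_fit(x_panel, y_panel):
    """Within slope and implied individual intercepts for one regressor."""
    num = den = 0.0
    means = []
    for xi, yi in zip(x_panel, y_panel):
        x_bar = sum(xi) / len(xi)
        y_bar = sum(yi) / len(yi)
        means.append((x_bar, y_bar))
        num += sum((a - x_bar) * (b - y_bar) for a, b in zip(xi, yi))
        den += sum((a - x_bar) ** 2 for a in xi)
    beta = num / den
    # Individual effects implied by the slope: y_bar_i - beta * x_bar_i.
    effects = [y_bar - beta * x_bar for x_bar, y_bar in means]
    return beta, effects


def pooled_ols_slope(x_panel, y_panel):
    """Slope from pooling all observations, with a common intercept."""
    xs = [v for xi in x_panel for v in xi]
    ys = [v for yi in y_panel for v in yi]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return num / sum((a - x_bar) ** 2 for a in xs)


# y = 2 * x + c_i with c = (0, 10); the individual with larger x has larger c.
x_panel = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
y_panel = [[2.0, 4.0, 6.0], [18.0, 20.0, 22.0]]
beta_fe, effects = fixed_effects_fit(x_panel, y_panel)
beta_pooled = pooled_ols_slope(x_panel, y_panel)
```

Here the within estimator recovers the true slope of 2 and the two individual effects, while the pooled slope is pushed upward because the individual effect is positively correlated with the regressor.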

Remark 1
The $c_i$'s are not consistently estimated. They are still interesting for descriptive purposes.
Remark 2
Panel-robust standard errors are also needed: the within transformation implies that there is serial correlation.

First-difference (FD) estimator
The first-difference estimator ($\hat{\beta}_{FD}$) is defined as the OLS estimator obtained by regressing $\Delta y_{it}$ on $\Delta x_{it}$. The key assumption for consistency is
$$\mathrm{Cov}(\Delta\varepsilon_{it}, \Delta x_{it}) = 0,$$
for which a sufficient condition is
$$\mathrm{Cov}(\varepsilon_{it}, x_{it-1}) = \mathrm{Cov}(\varepsilon_{it}, x_{it+1}) = \mathrm{Cov}(\varepsilon_{it}, x_{it}) = 0 \text{ for each } i \text{ and } t.$$
In the same way as for the FE estimator, panel-robust standard errors are needed.

Estimation: choosing an estimator
Hours and wages example

Testing the uncorrelated effects hypothesis
If $H_0$: the individual effects are uncorrelated, then $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ converge to the same limit, $\beta$.
$\Rightarrow$ Hausman test: compare two estimators that should be consistent for the same parameter (under $H_0$).
Hausman statistic: depends on the distance between the two estimates as well as on the variance of this difference:
$$H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})' \, \big[\hat{V}(\hat{\beta}_{FE} - \hat{\beta}_{RE})\big]^{-1} (\hat{\beta}_{FE} - \hat{\beta}_{RE}).$$
Reject if $H$ is above a critical value.
Example: $H = 1.65 < 3.84$: we cannot statistically reject $H_0$. However, this might be due to a lack of power: $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are not precisely estimated. If we had a larger sample, maybe we would have rejected $H_0$...
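For a single coefficient the Hausman statistic reduces to a ratio that is easy to compute by hand. The sketch below assumes the classical setting in which RE is efficient under $H_0$, so that the variance of the difference equals the difference of the variances; the input numbers are hypothetical and are not the estimates behind the lecture's example.

```python
CHI2_1_CRITICAL_5PCT = 3.84  # 5% critical value of a chi-square with 1 d.o.f.


def hausman_scalar(beta_fe, beta_re, var_fe, var_re):
    """Hausman statistic for one coefficient.

    Uses V(beta_fe - beta_re) = V(beta_fe) - V(beta_re), which holds when
    the RE estimator is efficient under the null hypothesis.
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    return (beta_fe - beta_re) ** 2 / var_diff


# Hypothetical estimates and variances.
h = hausman_scalar(beta_fe=0.2, beta_re=0.1, var_fe=0.009, var_re=0.004)
reject = h > CHI2_1_CRITICAL_5PCT
```

With these inputs the statistic is about 2, below the critical value, so the null of uncorrelated effects is not rejected. The same caveat applies as in the text: imprecise estimates inflate the denominator and make rejection unlikely.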




Conclusion:
1. Key decision: uncorrelated individual effects or not. A statistical test is not necessarily a convincing basis for the choice $\Rightarrow$ more conservative models (FE and FD) are often preferred. However, FE and FD come at a cost:
   1. as they discard the between information, they are less precise;
   2. they do not enable us to estimate the effects of time-invariant variables.
2. All estimators are obtained by OLS regression on some transformation of the initial data. Panel-robust standard errors are needed.




Differences in differences
Diff-in-diffs are very often used with panel data, in particular in the evaluation of public policies.
Combination of before/after and treatment/control comparisons: does the evolution of the outcome in the treated group differ from the evolution in the control group?
Identifying assumption: differences in these evolutions are due to the policy (and to some random noise).
Numerous examples using "natural experiments":
- Impact of immigration on wages and employment of local workers: Mariel Boatlift (Card, 1990)
- Impact of the minimum wage on employment: different changes across US states in minimum wage (Card and Krueger, 1995)
- Labor supply elasticity: change in the Allocation parentale d'éducation (Piketty, 1998)
- ...

Differences in differences: example
1994: APE extended to parents with 2 kids.
Situation unchanged for mothers with 1 or 3 kids
$\Rightarrow$ ideal natural experiment

Hope: parallel trends before the policy change; change in the treatment group after the policy change.


Figure: Employment rates

| | Before (1994) | After (1997) | Evolution (1st difference) |
|---|---|---|---|
| Controls (mothers with 1 kid) | 62% | 64.5% | +2.5% |
| Treatment (mothers with 2 kids) | 58.6% | 47.4% | -11.2% |

Relative evolution (2nd difference): Treatment - Controls = -13.7%
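The double difference in the employment-rate table is simple arithmetic on the four cell values. The sketch below reproduces it; the function name is an illustrative assumption, and the inputs are the employment rates (in percent) reported in the table.

```python
def diff_in_diff(control_before, control_after, treated_before, treated_after):
    """Return (first difference controls, first difference treated, DiD)."""
    first_diff_control = control_after - control_before
    first_diff_treated = treated_after - treated_before
    # Second difference: evolution of the treated minus evolution of controls.
    return (
        first_diff_control,
        first_diff_treated,
        first_diff_treated - first_diff_control,
    )


# Employment rates in percent: mothers with 1 kid (controls), 2 kids (treated).
d_control, d_treated, did = diff_in_diff(62.0, 64.5, 58.6, 47.4)
```

The result is expressed in percentage points: the employment rate of treated mothers fell by 13.7 points relative to the controls.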

Formalization as a panel data estimator

| | Before (1994) | After (1997) | Evolution (1st difference) |
|---|---|---|---|
| Controls (mothers with 1 kid) | $a$ | $a + c_{post}$ | $c_{post}$ |
| Treatment (mothers with 2 kids) | $a + b_{2kids}$ | $a + b_{2kids} + c_{post} + d_{APE}$ | $c_{post} + d_{APE}$ |

Relative evolution (2nd difference): Treatment - Controls = $d_{APE}$
$\Rightarrow$ can be viewed as a linear probability model over two periods:
$$emp_{it} = a + b_{2kids}\,\mathbf{1}[kids_{it} = 2] + c_{post}\,\mathbf{1}[year_{it} > 1994] + d_{APE}\,\mathbf{1}[kids_{it} = 2,\ year_{it} > 1994] + u_{it}$$

Extension to a panel model
- over more than two periods
- with several control groups
- including covariates
- accounting for binary outcome
$$\Pr(emp_{it} \mid \dots) = \Phi\big(a + b_{2kids}\,\mathbf{1}[kids_{it} = 2] + b_{3+kids}\,\mathbf{1}[kids_{it} > 2] + \lambda_t + d_{APE}\,\mathbf{1}[kids_{it} = 2,\ year_{it} > 1994] + x_{it}\beta\big)$$
$\Rightarrow$ Slightly more complex than the difference in differences. But same idea.


Differences in differences: prototypical diff-in-diff model
A subset of States that passed a law: $s \in S_{Treat}$
$t = 1, \dots, T$ periods of observation; $t_s^0$ = date of passing the law (if $s \in S_{Treat}$)
$\Rightarrow$ diff-in-diff models:
1. with outcomes and covariates measured at the State level:
$$y_{st} = \alpha_s + \lambda_t + \beta\,\mathbf{1}[t \geq t_s^0,\ s \in S_{Treat}] + x_{st}\gamma + u_{st}$$
2. with outcomes and covariates measured at the individual level $i$ (repeated cross-sections of individuals, but panel of States):
$$y_{ist} = \alpha_s + \lambda_t + \beta\,\mathbf{1}[t \geq t_s^0,\ s \in S_{Treat}] + x_{ist}\gamma + u_{ist}$$

Important remark: standard errors
$$y_{ist} = \alpha_s + \lambda_t + \beta\,\mathbf{1}[t \geq t_s^0,\ s \in S_{Treat}] + x_{ist}\gamma + u_{ist}$$
Two reasons for dependence between observations:
1. Common shocks for individuals in the same State in the same year (clustered sample);
2. Serial correlation between shocks in a given State over time.
$$u_{ist} = \eta_{st} + \varepsilon_{ist} \text{ with } \eta_{st} \text{ serially correlated}$$
$\Rightarrow$ within a given State, there are correlations within periods and between periods
$\Rightarrow$ need to have standard errors robust to these correlations: ", robust cluster(id_state)" in Stata's parlance
$\Rightarrow$ in practice, many empirical papers have missed this point and underestimated their standard errors (Bertrand, Duflo and Mullainathan, QJE 2004)


Differences in differences: testing the key identifying assumptions
$$y_{st} = \alpha_s + \lambda_t + \beta\,\mathbf{1}[t \geq t_s^0,\ s \in S_{Treat}] + x_{st}\gamma + u_{st}$$
Identifying assumption: $\mathrm{Cov}(u_{st}, \mathbf{1}[t \geq t_s^0,\ s \in S_{Treat}]) = 0$ after controlling for covariates $\Leftrightarrow$ State shocks are not correlated with the change in policy.
Intuitive phrasing: any systematic difference in the evolution of the two groups can be attributed to the policy.
This cannot be directly tested. But indirect tests:
1. No systematic difference in the evolutions of the two groups before the policy occurred $\Rightarrow$ APE: parallel trends before 1994
2. No systematic difference in the evolutions of groups that are not affected by the policy after the policy occurred $\Rightarrow$ APE: no differentiated evolution after 1994 for mothers with 2 vs. 3 kids when the kids are all older than 3