Applied Quantitative Methods II

Lecture 10: Panel Data
Klára Kalíšková
VŠE, SS 2016/17

Outline
1. Introduction
2. Pooled OLS
3. First differences
4. Fixed effects
5. Random effects
6. FE vs RE
7. Conclusion

Introduction
Topic of today: panel data
Intuition: observing the SAME units over time / in the same environment
- Units: individuals, households, factories, firms, municipalities, states / countries, indexed $i = 1, \dots, N$
- Repeated over:
  - usually time periods $t$ (years, quarters, weeks, days)
  - units within clusters (siblings within a family, firms within an industry, workers within a firm)

Introduction
- Repeated cross-section: survey at several points in time ("rounds"), using a different sample each round
- Panel: survey at several points in time ("waves"), using the same individuals
- Rotational panel: survey at several points in time, where part of the sample consists of the same individuals and part is new
Examples:
- Panel Study of Income Dynamics (PSID, USA)
- German Socioeconomic Panel (GSEP, Germany)
- Linked employer-employee data from TREXIMA (CR)
Time series vs. panel, length and sample size:
- Time series: $N$ small (mostly $= 1$), $T$ large ($T \to \infty$)
- Panel surveys: $N$ large, $T$ small ($N \to \infty$)

Individual effects in panel data
Positive relationship? No! Different intercepts for different people and a negative relationship.

Individual dynamics
Panel data allow for the study of individual dynamics.

Introduction
Main advantage: we observe the SAME individual => we can control for unobserved characteristics that do not change over time
Estimation issues:
- We cannot assume that the observations are independently distributed across time
  - this undermines the validity of the standard OLS approach
  - e.g. a person's wages in 1990 and 1991 are very likely to be correlated
- This leads to more complicated methods

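Structure of panel data
Panel data have at least two dimensions, $i = 1, \dots, N$ and $t = 1, \dots, T$:
$$y_{it} = x_{it}\beta_i + \varepsilon_{it}$$
Stacked over all units:
$$\underset{NT \times 1}{y} = \underset{NT \times kN}{\begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_N \end{pmatrix}} \underset{kN \times 1}{\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{pmatrix}} + \varepsilon$$
- $T$ is not necessarily a time dimension: e.g. families and family members; schools, classrooms, and students
- We cannot estimate $\beta_{it}$ with OLS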

Basic setup
The basic framework for panel data analysis is the following linear regression model:
$$y_{i,t} = x_{i,t}\beta + z_i\alpha + v_t\gamma + \varepsilon_{i,t}$$
- $x_{i,t}$ is a vector of individual characteristics that change over time (e.g. employment status, income, etc.)
- $z_i$ contains the fixed or individual effects; it consists of time-constant characteristics:
  1. observed (sex, ethnicity, etc.)
  2. unobserved (family-specific characteristics, individual heterogeneity in skills or preferences, etc.)
- $v_t$ is a time-varying factor common to all units (trend)

Types of models
We will consider 4 basic cases:
1. Pooled regression
2. Random effects
3. Random coefficients/parameters
4. Fixed effects

Pooled OLS regression (POLS)
We can simply run an OLS regression on the whole dataset (pretending it is just $N \times T$ cross-sectional observations):
$$y_{i,t} = \alpha + x_{i,t}\beta + \underbrace{u_i + \varepsilon_{i,t}}_{=\,\omega_{i,t}}$$
where the intercept is $\alpha = E[z_i\alpha]$ and $u_i = z_i\alpha - E[z_i\alpha]$.
- In this case, we can put the observed time-invariant characteristics and the observed time trend into $x_{i,t}$, so that $u_i$ only includes unobserved time-invariant characteristics
- Is this consistent and efficient? If so, when?
Assumptions:
- no correlation between $x_{i,t}$ and $\omega_{i,t}$, i.e. $E(x_{i,t} u_i) = 0$ and $E(x_{i,t}\varepsilon_{i,t}) = 0$
- no heteroskedasticity or serial correlation

Example
Goal: estimate the effect of crime on house prices
$$\text{HousePrice}_{i,t} = \beta_0 + \beta_1\,\text{crime}_{i,t} + X_{i,t}\delta + v_t\gamma + \eta_{i,t}$$
- $v_t$: time dummies (time trend)
- $X_{i,t}$: observed characteristics of cities, both time-variant and time-invariant (geography, demography, average education, age, ...)
OLS on the pooled sample:
- the error term is $\eta_{i,t} = z_i\alpha + \varepsilon_{i,t}$
  - $z_i$: city-specific unobserved characteristics, time-constant
  - $\varepsilon_{i,t}$: idiosyncratic error, uncorrelated with the crime rate
- the error $\eta_{i,t}$ is likely to be correlated with crime => endogeneity => OLS is biased and inconsistent

Pooled regression bias
If we ignore existing fixed effects (correlation of individual heterogeneity with other included variables), the pooled OLS estimate is biased.
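The bias can be illustrated with a minimal sketch on a made-up toy panel. The numbers, the `panel` dictionary and the helper `ols_slope` are all hypothetical and only serve the illustration: within every unit the slope is negative, yet units with higher x also have a higher intercept, so pooled OLS reports a positive slope.

```python
# Toy panel (hypothetical numbers): 3 units observed in 3 periods.
# Within every unit, y falls by 2 when x rises by 1, but units with
# higher x also have a higher individual intercept (10, 30, 50).
panel = {
    "unit_1": {"x": [1, 2, 3], "y": [8, 6, 4]},
    "unit_2": {"x": [4, 5, 6], "y": [22, 20, 18]},
    "unit_3": {"x": [7, 8, 9], "y": [36, 34, 32]},
}


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x with an intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Pooled OLS: stack all N*T observations and ignore the unit structure.
pooled_x = [v for unit in panel.values() for v in unit["x"]]
pooled_y = [v for unit in panel.values() for v in unit["y"]]
pooled_slope = ols_slope(pooled_x, pooled_y)

# Slope estimated separately inside each unit.
within_slopes = [ols_slope(u["x"], u["y"]) for u in panel.values()]

print("pooled slope:", pooled_slope)
print("slopes within each unit:", within_slopes)
```

In this constructed example the pooled slope has the opposite sign of every within-unit slope, which is the situation described on the "Individual effects in panel data" slide.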

Between estimator
Another option: use unit averages over time
$$\bar{y}_i = \alpha + \bar{x}_i\beta + \bar{\varepsilon}_i$$
- This uses only the variation between cross-sectional units (no within variation)
- Not efficient
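A minimal sketch of the between estimator, assuming a made-up balanced toy panel with a single regressor (the data and the names `panel`, `mean` and `between_slope` are hypothetical): each unit is collapsed to its time averages and OLS is run on the N averaged observations.

```python
# Toy panel (hypothetical numbers): 3 units, 3 periods each.
panel = {
    "unit_1": {"x": [1, 2, 3], "y": [8, 6, 4]},
    "unit_2": {"x": [4, 5, 6], "y": [22, 20, 18]},
    "unit_3": {"x": [7, 8, 9], "y": [36, 34, 32]},
}


def mean(values):
    return sum(values) / len(values)


# Step 1: collapse every unit to its averages over time.
x_bar = [mean(u["x"]) for u in panel.values()]
y_bar = [mean(u["y"]) for u in panel.values()]

# Step 2: simple OLS (with intercept) on the N unit averages.
mx, my = mean(x_bar), mean(y_bar)
sxy = sum((a - mx) * (b - my) for a, b in zip(x_bar, y_bar))
sxx = sum((a - mx) ** 2 for a in x_bar)
between_slope = sxy / sxx

print("unit means of x:", x_bar)
print("unit means of y:", y_bar)
print("between slope:", between_slope)
```

Because only the unit averages enter, all within-unit variation is discarded; in this toy data the between slope is driven entirely by the differences in unit intercepts.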

First difference estimator
Another way to get rid of endogenous fixed effects (unobserved time-invariant characteristics)
Run a regression on differences (changes over time):
$$y_{i,t} - y_{i,t-1} = (x_{i,t} - x_{i,t-1})\beta + \varepsilon_{i,t} - \varepsilon_{i,t-1} \;\Longrightarrow\; \Delta y_{i,t} = \Delta x_{i,t}\beta + \Delta\varepsilon_{i,t}$$
- Consistent estimator of $\beta$: the unobserved time-invariant characteristics $u_i$ are differenced out!
- However: less efficient than other methods

Example
Goal: estimate the effect of crime on house prices, in first differences:
$$\text{HP}_{i,t} - \text{HP}_{i,t-1} = \beta_0 + \beta_1(\text{cri}_{i,t} - \text{cri}_{i,t-1}) + \beta_2(v_t - v_{t-1}) + z_i - z_i + \varepsilon_{i,t} - \varepsilon_{i,t-1}$$
$$\Delta\text{HP}_{i,t} = \beta_0 + \beta_1\Delta\text{cri}_{i,t} + \beta_2\Delta v_t + \Delta\varepsilon_{i,t}$$
- the unobserved heterogeneity $z_i$ disappears!
- if $\text{cov}(\Delta\text{cri}_{i,t}, \Delta\varepsilon_{i,t}) = 0$ and there is no heteroskedasticity and no serial correlation, the estimator is consistent
Cons:
1. we need some variance in crime both across time and across cities
2. there may be large variation in the levels of crime but low variation in first differences => larger error => lower efficiency
3. we cannot estimate the impact of time-invariant factors (e.g. geography); this is a problem only if we are interested in them
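The differencing step can be sketched on a made-up toy panel (hypothetical data and names). As a simplifying assumption, the sketch has a single regressor, no noise and no time trend, so the differenced regression is run through the origin, i.e. without the intercept that appears in the example above.

```python
# Toy panel (hypothetical numbers): y_it = a_i - 2 * x_it with
# unit intercepts a_i = 10, 30, 50 that are correlated with x.
panel = {
    "unit_1": {"x": [1, 2, 3], "y": [8, 6, 4]},
    "unit_2": {"x": [4, 5, 6], "y": [22, 20, 18]},
    "unit_3": {"x": [7, 8, 9], "y": [36, 34, 32]},
}


def first_differences(series):
    """Changes between consecutive periods: v_t - v_{t-1}."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


dx, dy = [], []
for unit in panel.values():
    # Differencing is done inside each unit, never across units.
    dx += first_differences(unit["x"])
    dy += first_differences(unit["y"])

# Regression through the origin on the differenced data
# (assumption of this sketch: no intercept / trend term).
fd_slope = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

print("differenced x:", dx)
print("differenced y:", dy)
print("first-difference slope:", fd_slope)
```

Each unit loses one observation to differencing, so the N x T = 9 observations shrink to N x (T - 1) = 6; the unit intercepts never enter the differenced data.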

Fixed effects model (FE)
$$y_{i,t} = \alpha + x_{i,t}\beta + \underbrace{u_i + \varepsilon_{i,t}}_{=\,\omega_{i,t}}$$
In this case we have $\text{Corr}[x_{i,t}, \omega_{i,t}] \neq 0$.
1. We can use individual fixed effects as dummies (individual-specific intercepts):
   $$y_{i,t} = x_{i,t}\beta + z_i\alpha + \varepsilon_{i,t}$$
   - sometimes too many dummies
2. It can be proven that this is the same as using deviations from unit means:
   $$y_{i,t} - \bar{y}_i = (x_{i,t} - \bar{x}_i)\beta + \varepsilon_{i,t} - \bar{\varepsilon}_i$$
(We can also use time fixed effects, but be careful about degrees of freedom.)

Example - de-meaning
Goal: estimate the effect of crime on house prices
Estimate using fixed effects: we subtract unit means
- Mean of HP: $\overline{\text{HP}}_i = \frac{1}{T}\sum_{t}\text{HP}_{i,t}$
$$\text{HP}_{i,t} - \overline{\text{HP}}_i = \beta_0 + \beta_1(\text{cri}_{i,t} - \overline{\text{cri}}_i) + \beta_2(\text{une}_{i,t} - \overline{\text{une}}_i) + z_i - \bar{z}_i + \text{trend} + \varepsilon_{i,t} - \bar{\varepsilon}_i$$
- the mean of $z_i$ is $z_i$ (no time variation) => we get rid of the unobserved heterogeneity
- if $\text{cov}(x_{i,t}, \varepsilon_{i,t}) = 0$, we have a consistent estimator
Con: de-meaning removes anything time-constant
- well, that was the goal?
- but now we cannot evaluate the effect of any time-constant variable
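A minimal sketch of the de-meaning (within) transformation on a made-up toy panel. The data and names are hypothetical, and the sketch assumes a single regressor and no trend term: unit means are subtracted from x and y, and OLS is run on the transformed data.

```python
# Toy panel (hypothetical numbers): y_it = a_i - 2 * x_it with
# unit intercepts a_i = 10, 30, 50 that are correlated with x.
panel = {
    "unit_1": {"x": [1, 2, 3], "y": [8, 6, 4]},
    "unit_2": {"x": [4, 5, 6], "y": [22, 20, 18]},
    "unit_3": {"x": [7, 8, 9], "y": [36, 34, 32]},
}


def demean(series):
    """Subtract the unit's own time average from every observation."""
    unit_mean = sum(series) / len(series)
    return [value - unit_mean for value in series]


x_within, y_within = [], []
for unit in panel.values():
    x_within += demean(unit["x"])
    y_within += demean(unit["y"])

# The de-meaned variables have mean zero, so OLS through the origin
# gives the within (fixed effects) slope.
fe_slope = (
    sum(a * b for a, b in zip(x_within, y_within))
    / sum(a * a for a in x_within)
)

print("de-meaned x:", x_within)
print("de-meaned y:", y_within)
print("fixed effects slope:", fe_slope)
```

Any variable that is constant within a unit would be turned into a column of zeros by `demean`, which is why the effect of time-constant variables cannot be estimated this way.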

Example - set of dummies
Goal: estimate the effect of crime on house prices
$$\text{HP}_{i,t} = \beta_0 + \beta_1\text{cri}_{i,t} + \beta_2\text{une}_{i,t} + z_i + \text{trend} + \varepsilon_{i,t}$$
- add a dummy for each grouping (cities here, possibly also years):
$$\text{HP}_{i,t} = \beta_0 + \beta_1\text{cri}_{i,t} + \beta_2\text{une}_{i,t} + \text{trend} + \mu_1\text{city}_1 + \dots + \mu_{N-1}\text{city}_{N-1} + \varepsilon_{i,t}$$
- if $\text{cov}(x_{i,t}, \varepsilon_{i,t}) = 0$, with no serial correlation and homoskedasticity, this is equivalent to FE
- Why do this? We can capture the time-constant influences $z_i$
- Why not? A large number of dummies makes estimation tedious
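The dummy-variable version can be sketched in plain Python on a made-up toy panel (hypothetical data; `solve` and `ols` are small illustrative helpers, not a library API). The regression includes an intercept, the regressor x and N - 1 unit dummies, with the first unit as the base category.

```python
# Toy panel (hypothetical numbers): y_it = a_i - 2 * x_it with
# unit intercepts a_i = 10, 30, 50.
panel = [
    # (unit index, x, y)
    (0, 1, 8), (0, 2, 6), (0, 3, 4),
    (1, 4, 22), (1, 5, 20), (1, 6, 18),
    (2, 7, 36), (2, 8, 34), (2, 9, 32),
]


def solve(a, b):
    """Solve a linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Design matrix: intercept, x, dummy for unit 2, dummy for unit 3.
design = [[1.0, float(x), float(i == 1), float(i == 2)] for i, x, _ in panel]
outcome = [float(y) for _, _, y in panel]
intercept, slope, mu_2, mu_3 = ols(design, outcome)

print("slope on x:", round(slope, 6))
print("base intercept:", round(intercept, 6))
print("dummy coefficients:", round(mu_2, 6), round(mu_3, 6))
```

In this noise-free toy data the slope on x coincides with the within-unit slope obtained by de-meaning, and the dummy coefficients recover the differences in unit intercepts relative to the base unit.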

What about R-squared?
$$\text{HP}_{i,t} - \overline{\text{HP}}_i = \beta_0 + \beta_1(\text{crime}_{i,t} - \overline{\text{crime}}_i) + \beta_2(\text{une}_{i,t} - \overline{\text{une}}_i) + \varepsilon_{i,t} - \bar{\varepsilon}_i$$
- if we have an $R^2$ of 0.65, what does that mean?
- how well can our model explain the variation in house prices
  - across time?
  - across cities?
  - across time is more important
- we have two: within-$R^2$ and between-$R^2$
- when using only dummies, the dummies may inflate $R^2$! (use adjusted $R^2$)

Motivation
$$\text{HP}_{i,t} = \beta_0 + \beta_1\text{crime}_{i,t} + \beta_2\text{unemp}_{i,t} + \underbrace{z_i + \varepsilon_{i,t}}_{=\,\omega_{i,t}}$$
- We assumed that the unobserved heterogeneity may be correlated with the independent variables, $\text{cov}(z_i, x_{i,t}) \neq 0$
  - then there is endogeneity in OLS => use FE or FD
- But what if the correlation is 0? Then we may have an easier way of estimation:
  - when we think we control for all factors that are important in the determination of $y$
  - or, if the effect of unobserved heterogeneity is very small
- Can we then use pooled OLS?
  - with OLS the errors are serially correlated
  - we need to estimate the structure of the correlation -> random effects

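Random effects
$$\text{HP}_{i,t} = \beta_0 + \beta_1\text{crime}_{i,t} + \beta_2\text{unemp}_{i,t} + \underbrace{z_i + \varepsilon_{i,t}}_{=\,\omega_{i,t}}$$
- So, we assume that $\text{Cov}(x_{i,t}, z_i) = 0$
- Random effects (RE) corrects for the presence of serial correlation
- We subtract $\lambda$ times the unit means (quasi-de-meaning):
$$\text{HP}_{i,t} - \lambda\overline{\text{HP}}_i = \beta_0(1 - \lambda) + \beta_1(\text{cri}_{i,t} - \lambda\overline{\text{cri}}_i) + \beta_2(\text{une}_{i,t} - \lambda\overline{\text{une}}_i) + \eta_{i,t} - \lambda\bar{\eta}_i$$
  where $\eta_{i,t}$ denotes the composite error $z_i + \varepsilon_{i,t}$
- if $\lambda = 0$, then RE = pooled OLS
- if $\lambda = 1$, then RE = FE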
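The two limiting cases can be checked with a minimal sketch on a made-up balanced toy panel (hypothetical data; `quasi_demeaned_slope` is an illustrative helper). Here $\lambda$ is simply passed in as a number rather than estimated.

```python
# Toy panel (hypothetical numbers): 3 units, 3 periods each.
panel = {
    "unit_1": {"x": [1, 2, 3], "y": [8, 6, 4]},
    "unit_2": {"x": [4, 5, 6], "y": [22, 20, 18]},
    "unit_3": {"x": [7, 8, 9], "y": [36, 34, 32]},
}


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x with an intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def quasi_demeaned_slope(data, lam):
    """Subtract lam * unit mean from x and y, then run OLS."""
    xs, ys = [], []
    for unit in data.values():
        x_bar = sum(unit["x"]) / len(unit["x"])
        y_bar = sum(unit["y"]) / len(unit["y"])
        xs += [x - lam * x_bar for x in unit["x"]]
        ys += [y - lam * y_bar for y in unit["y"]]
    return ols_slope(xs, ys)


slope_lambda_0 = quasi_demeaned_slope(panel, 0.0)  # nothing subtracted
slope_lambda_1 = quasi_demeaned_slope(panel, 1.0)  # full de-meaning

print("lambda = 0:", slope_lambda_0)
print("lambda = 1:", slope_lambda_1)
```

With `lam = 0.0` the data are untouched and the result is the pooled OLS slope; with `lam = 1.0` the full unit means are removed and the result is the fixed effects slope.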

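Random effects
$$\text{HP}_{i,t} - \lambda\overline{\text{HP}}_i = \beta_0(1 - \lambda) + \beta_1(\text{cri}_{i,t} - \lambda\overline{\text{cri}}_i) + \beta_2(\text{une}_{i,t} - \lambda\overline{\text{une}}_i) + \eta_{i,t} - \lambda\bar{\eta}_i$$
What is $\lambda$?
$$\lambda = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_z^2}}$$
where $\sigma_u^2$ is the variance of the idiosyncratic error and $\sigma_z^2$ is the variance of the unobserved heterogeneity.
- When does it go to 0? When $T\sigma_z^2 = 0$, that is, the variance of the unobserved heterogeneity is 0: we use OLS
- When is it 1? When $T\sigma_z^2 \to \infty$, that is, very large: then we have FE
RE procedure (Stata does this automatically):
1. estimate $\lambda$
2. transform the system and estimate using OLS
It is a feasible generalized least squares technique.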
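The behaviour of $\lambda$ in the two limits can be sketched numerically. The function below follows the textbook random effects formula $\lambda = 1 - \sqrt{\sigma_u^2 / (\sigma_u^2 + T\sigma_z^2)}$; the function name, argument names and variance values are hypothetical and chosen only for illustration.

```python
import math


def re_lambda(sigma2_u, sigma2_z, n_periods):
    """Quasi-de-meaning weight of the random effects transformation.

    sigma2_u  -- variance of the idiosyncratic error
    sigma2_z  -- variance of the unobserved heterogeneity
    n_periods -- number of time periods T
    """
    return 1.0 - math.sqrt(sigma2_u / (sigma2_u + n_periods * sigma2_z))


# No unobserved heterogeneity: lambda is 0, so RE collapses to pooled OLS.
lambda_no_heterogeneity = re_lambda(1.0, 0.0, 5)

# Very large heterogeneity variance: lambda approaches 1, so RE approaches FE.
lambda_large_heterogeneity = re_lambda(1.0, 1e12, 5)

# An intermediate case with equal variances and T = 3.
lambda_intermediate = re_lambda(1.0, 1.0, 3)

print(lambda_no_heterogeneity)
print(lambda_large_heterogeneity)
print(lambda_intermediate)
```

In practice the two variances are unknown and have to be estimated first, which is step 1 of the RE procedure above; this sketch takes them as given.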

Example
$$\text{HP}_{i,t} - \lambda\overline{\text{HP}}_i = \beta_0(1 - \lambda) + \beta_1(\text{cri}_{i,t} - \lambda\overline{\text{cri}}_i) + \beta_2(\text{une}_{i,t} - \lambda\overline{\text{une}}_i) + \eta_{i,t} - \lambda\bar{\eta}_i$$
Assumptions of RE:
- $\text{Cov}(x_{i,t}, z_i) = 0$
- random sample in the cross-section
- strictly exogenous errors: $E(u_i \mid x_i) = 0$ and $E(\varepsilon_{i,t} \mid x_i) = 0$
If the assumptions hold, then our RE estimator converges to the true population value.

Using RE & FE
Using RE rather than FE:
Pros:
- smaller standard errors than FE (more efficient)
- the effects of time-constant variables can be estimated!
Cons:
- $\text{Cov}(x_{i,t}, z_i) = 0$ almost never holds
- we need many control variables, which are often hard to get
- if $\text{Cov}(x_{i,t}, z_i) = 0$ does not hold, then RE is inconsistent
- also, we do not estimate the unobserved heterogeneity
How to find out which one to use? A test between FE and RE: the Hausman test

Comparison of FE and RE
Hausman test to compare:

        FE            RE
H_0     consistent    consistent, efficient
H_A     consistent    inconsistent

- $H_0$: $\text{Cov}(x_{i,t}, z_i) = 0$ <=> can we use RE?
  - do not reject => use RE
  - reject => use FE
- basically, it compares the estimated $\beta$s from FE and RE
  - if they are the same, use RE (it is more efficient)
  - if they are different, use FE
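For a single coefficient, the comparison can be sketched as follows. This is a simplified one-regressor version of the Hausman statistic, $H = (\hat\beta_{FE} - \hat\beta_{RE})^2 / (\widehat{\text{Var}}(\hat\beta_{FE}) - \widehat{\text{Var}}(\hat\beta_{RE}))$, compared with a chi-squared distribution with 1 degree of freedom. The function name and all input numbers are hypothetical and not taken from any estimation in the lecture.

```python
import math


def hausman_one_coef(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for a single coefficient.

    Under H0 the statistic is approximately chi-squared with 1 degree
    of freedom. Returns (statistic, p_value).
    """
    var_diff = var_fe - var_re
    if var_diff <= 0:
        # RE should be the more efficient estimator under H0; if the
        # estimated variances say otherwise, this simple form breaks down.
        raise ValueError("var_fe must exceed var_re for this simple form")
    stat = (b_fe - b_re) ** 2 / var_diff
    # Survival function of chi-squared(1): P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical estimates and variances, purely for illustration.
stat, p_value = hausman_one_coef(b_fe=-2.0, b_re=-1.0, var_fe=0.25, var_re=0.09)
use_fe = p_value < 0.05  # reject H0 at the 5% level => prefer FE

print("Hausman statistic:", stat)
print("p-value:", p_value)
print("use FE:", use_fe)
```

With several coefficients the statistic becomes a quadratic form in the vector of differences and the difference of the covariance matrices, which is the version statistical packages typically report.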

Example: Impact of enterprise zones on employment
- Source: data file EZUNEM (Wooldridge)
- 22 cities in Indiana, from 1980 to 1988
- Six enterprise zones (ez) were created in 1984, and 4 more in 1985
- uclms is the number of unemployment claims filed during the year

Example: Impact of enterprise zones on employment
Table

Example: Impact of enterprise zones on employment
Fixed effects using dummies

Example: Impact of enterprise zones on employment
Fixed effects using demeaning

Conclusion
1. Repeated observations => isolation of time-constant unobserved differences
2. We can study the dynamics of economic processes
3. Some economic phenomena / outcomes of treatment are inherently longitudinal (e.g. unemployment levels)
To summarize, with panel data:
1. We can obtain more precise estimates of the effect of interest
2. If we suspect that endogeneity comes from some unobserved characteristic that does not change over time, we have an additional way to solve it
Panel data can be used with models you have already covered: FE probit/logit, difference-in-differences, 2SLS, etc.

Types of panels
Panels can be balanced or unbalanced:
1. Balanced: we observe each unit in every time period
2. Unbalanced: different units appear in different years
   - the disappearance of units is called attrition
Always determine the reason for attrition:
- Is there any underlying process? (e.g. we do not observe a wage because that person is no longer employed: not random)
Solutions:
- Should I limit the sample to a balanced panel? NO, lower efficiency
- Imputation, e.g. if only some variables are missing
- In case of attrition bias, use a sample selection bias correction
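Whether a panel is balanced can be checked before estimation. The sketch below assumes the data are available as (unit, period) pairs; the helper names and the example city/year pairs are hypothetical.

```python
def is_balanced(observations):
    """True if every unit is observed in every period.

    observations -- iterable of (unit, period) pairs
    """
    pairs = set(observations)
    units = {unit for unit, _ in pairs}
    periods = {period for _, period in pairs}
    return len(pairs) == len(units) * len(periods)


def missing_pairs(observations):
    """(unit, period) combinations that are absent from the panel."""
    pairs = set(observations)
    units = sorted({unit for unit, _ in pairs})
    periods = sorted({period for _, period in pairs})
    return [(u, t) for u in units for t in periods if (u, t) not in pairs]


# Hypothetical example: city B drops out of the sample in 1982.
balanced = [(c, y) for c in ("A", "B", "C") for y in (1980, 1981, 1982)]
unbalanced = [pair for pair in balanced if pair != ("B", 1982)]

print(is_balanced(balanced))
print(is_balanced(unbalanced))
print(missing_pairs(unbalanced))
```

Listing the missing pairs is only a first step; whether the gaps are random or reflect an underlying process still has to be judged from the context, as the slide stresses.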