Econometrics. Week 6. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Similar documents
Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Applied Economics. Panel Data. Department of Economics Universidad Carlos III de Madrid

Econometrics. Week 11. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Linear Panel Data Models

α version (only brief introduction so far)

Lecture 9: Panel Data Model (Chapter 14, Wooldridge Textbook)

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Applied Quantitative Methods II

Applied Microeconometrics (L5): Panel Data-Basics

Week 2: Pooling Cross Section across Time (Wooldridge Chapter 13)

10 Panel Data. Andrius Buteikis,

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63

Final Exam. Economics 835: Econometrics. Fall 2010

Econometrics of Panel Data

Econ 582 Fixed Effects Estimation of Panel Data

EC327: Advanced Econometrics, Spring 2007

Advanced Econometrics

Introduction to Panel Data Analysis

Dealing With Endogeneity

Econometrics of Panel Data

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Panel data methods for policy analysis

Gov 2000: 13. Panel Data and Clustering

Fixed Effects Models for Panel Data. December 1, 2014

Topic 10: Panel Data Analysis

Panel Data Model (January 9, 2018)

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Econometrics I Lecture 3: The Simple Linear Regression Model

Specification testing in panel data models estimated by fixed effects with instrumental variables

Ordinary Least Squares Regression

Econometrics of Panel Data

Dynamic Panel Data Models

y it = α i + β 0 ix it + ε it (0.1) The panel data estimators for the linear model are all standard, either the application of OLS or GLS.

Multiple Regression Analysis

Introductory Econometrics

Econometrics Multiple Regression Analysis: Heteroskedasticity

INTRODUCTION TO BASIC LINEAR REGRESSION MODEL

Empirical Application of Panel Data Regression

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

WISE International Masters

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

EMERGING MARKETS - Lecture 2: Methodology refresher

Non-linear panel data modeling

Multiple Equation GMM with Common Coefficients: Panel Data

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

What s New in Econometrics? Lecture 14 Quantile Methods

Econometric Analysis of Cross Section and Panel Data

Econometrics of Panel Data

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

Panel Data: Fixed and Random Effects

1 Motivation for Instrumental Variable (IV) Regression

Review of Econometrics

Lecture 8: Instrumental Variables Estimation

ECONOMETFUCS FIELD EXAM Michigan State University May 11, 2007

Introductory Econometrics

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

New Developments in Econometrics Lecture 16: Quantile Estimation

Econometrics Homework 4 Solutions

Lecture 4: Linear panel models

Econometrics Summary Algebraic and Statistical Preliminaries

Lecture 6: Dynamic panel models 1

Panel Data: Linear Models

An overview of applied econometrics

Intermediate Econometrics

ECON The Simple Regression Model

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Lecture 10: Panel Data

Heteroskedasticity. Part VII. Heteroskedasticity

LECTURE 11. Introduction to Econometrics. Autocorrelation

Missing dependent variables in panel data models

Short T Panels - Review

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Econometrics - 30C00200

Econometrics of Panel Data

Fortin Econ Econometric Review 1. 1 Panel Data Methods Fixed Effects Dummy Variables Regression... 7

Chapter 2: simple regression model

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

1. Overview of the Basic Model

1 Introduction to Generalized Least Squares

Cluster-Robust Inference

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance.

Sensitivity of GLS estimators in random effects models

Multivariate Regression Analysis

Heteroskedasticity. We now consider the implications of relaxing the assumption that the conditional

Estimating Panel Data Models in the Presence of Endogeneity and Selection

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

Multiple Regression Analysis: Heteroskedasticity

Linear dynamic panel data models

Difference-in-Differences Estimation

Econometrics of Panel Data

A Practitioner s Guide to Cluster-Robust Inference

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Transcription:

Econometrics Week 6 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 21

Recommended Reading For the today Advanced Panel Data Methods. Chapter 14 (pp. 441 460). For the next week MIDTERM In the 2 weeks Instrumental Variables Estimation and Two Stage Least Squares Chapter 15 (pp. 461 491). 2 / 21

Today s Talk Last week we have learned about the independently pooled cross sections and panel data. We have learned a common method of estimation first differencing. Today, we will continue and introduce two methods for estimation of the Panel data. Fixed Effects Estimator. Random Effects Estimator. 3 / 21

Fixed Effects Estimation First differencing is just one of the ways how to eliminate the fixed effect a i. An alternative, which works better under certain assumptions, is called fixed effects transformation. Consider a model: y it = β 1 x it + a i + u it, t = 1,..., T. For each i, we will average this equation over time: y i = β 1 + x i + a i + u i, where y i = T 1 T t=1 y it and similarly for x i and u i Subtracting the averages from original equation, we will get time-demeaned data: ÿ it = β 1 ẍ it + ü it, where ÿ it = y it y i and similarly for ẍ it and ü it 4 / 21

Fixed Effects Estimation cont. The fixed effects transformation is also called within transformation. Unobserved effects a i disappeared we can use pooled OLS. Pooled OLS estimator on the time-demeaned variables is called fixed effects estimator, or within estimator. name within comes from the fact that we use time variation within each cross-sectional observation. We also know a between estimator, which is obtained as the OLS estimator of y i = β 0 + β 1 x i + a i + u i. We use time-averages and then run a cross-sectional regression. Between estimator is biased when a i is correlated with x i 5 / 21

Fixed Effects Estimation cont. A general time-demeaned equation for each i will be: ÿ it = β 1 ẍ it1 + β 2 ẍ it2 +... + β k ẍ itk + ü it, for t = 1, 2,..., T, estimated by pooled OLS. Notice that intercept is eliminated by the FE transformation. The FE estimator can also be obtained by adding individual dummies for each cross-section (to estimate unobserved effect for each i individually). This dummy variable regression results in too many explanatory variables dummy variables are not practical in panel data. Let s have a look at properties of the fixed effects estimator, ˆβ F E. 6 / 21

Assumptions for Fixed Effects Estimator Assumption FE1 For each i, the model is y it = β 1 x it1 +... + β k x itk + a i + u it, t = 1,..., T, where parameters β j are to be estimated and a i is the unobserved, or fixed effect. Assumption FE2 We have a random sample in the cross section dimension. Assumption FE3 For each t, the expected value of the idiosyncratic error given the explanatory variables in all time periods and the unobserved effect is zero: E(u it X i, a i ) = 0., 7 / 21

Assumptions for Fixed Effects Estimator cont. Assumption FE4 Each explanatory variable changes over time (for at least some i), and there are no perfect linear relationships among the explanatory variables. Under the Assumptions FE1 FE4 (which are identical to the first-differencing estimator), ˆβ F E is unbiased. The key assumption is strict exogeneity (FE3) Under the Assumptions FE1 FE4, plim( ˆβ F E ) = β as N ( ˆβ F E is consistent). Assumption FE5 V ar(u it X i, a i ) = V ar(u it ) = σ 2 u, for all t = 1,..., T. 8 / 21

Assumptions for Fixed Effects Estimator cont. Assumption FE6 For all t s, the idiosyncratic errors are uncorrelated (conditional on all explanatory variables and a i ): Cov(u it, u is X i, a i ) = 0. Under the Ass. FE1 FE6, the fixed effects estimator is BLUE. Assumption FE7: Normality Conditional on X i and a i, the u it are independent and identically distributed normal random variables. Last assumptions assures us that FE estimator is normally distributed, its t and F statistics have exact t and F distributions. Without FE7, we can rely on asymptotic approximations (although they require large N and small T without further assumptions). 9 / 21

Fixed Effects (FE) vs. First Differencing (FD) FD involves differencing the data, FE involves time-demeaning. Which one to use? FD and FE estimates and statistics are identical when T = 2. For T 2, the methods are different. If u it are uncorrelated, FE is more efficient than FD. If u it follow i.e. Random Walk, then u it are uncorrelated and FD is better test whether u it are serially correlated first. But in most of the data serial correlation is not that strong as in Random Walk. Thus it is suggested to obtain both estimates. If the results are not sensitive, then it is fine. But if they vary, we have to find out why! 10 / 21

Fixed Effects with Unbalanced Panels Panel data have often missing years for some cross-sections. These data are called unbalanced panel. We simply use the time-demeaning of T i observations in time for each cross-section i and FE is equivalent to an FE on balanced panel. But we should know the reason, why we have unbalanced panel. Provided the reason we have missing data for some i is not correlated with u it, unbalanced panel cause no problem. But if we have for example missing data of some firms because of merger of 2 firms, or because they have gone out of the business, we have nonrandom sample. The reason of firm leaving the sample is correlated with unobserved factors in time and thus affects u it bias. The problem is more complicated Master course. 11 / 21

Random Effects Models In FE or FD estimation, we would like to eliminate a i because we think it is correlated with x itj, Now, suppose that a i is uncorrelated with each explanatory variable at all periods, x itj. FE and FD are inefficient as we eliminate a i in this case. Solution is to use Random Effect Model Random Effects Model (RE) y it = β 0 + β 1 x it1 +... + β k x itk + a i + u it, where Cov(x itj, a i ) = 0 for all t = 1, 2,..., T and j = 1, 2,..., k. How do we estimate RE model? 12 / 21

Random Effects Models cont. It is important to see that if we believe that a i is uncorrelated with explanatory variables, OLS of simple cross-sections is constistent. Thus we may not need panel data at all. But, in this case, we throw away useful information in the other time periods. If we want to use this information, we can use pooled OLS estimation. But pooled OLS ignores the key feature of the model - serial correlation in the error term. 13 / 21

Random Effects Models cont. Let s consider following regression equation: y it = β 0 + β 1 x it1 +... + β k x itk + ν it, where ν it = a i + u it is a composite error term. The ν it are serially correlated across time as a i is present in each time period. Random Effects Assumption Corr(ν it, ν is ) = σ 2 a/(σ 2 a + σ 2 u), for all t s, where σ 2 a = V ar(a i ) and σ 2 u = V ar(u it ) Because of this positive serial correlation, pooled OLS estimator will be incorrect. 14 / 21

Random Effects Models cont. Solution to this problem is GLS transformation that eliminates serial correlation in the errors. Transformation subtracts a fraction of that time average, where the fraction depends on σ 2 u, σ 2 a and the number of time periods T : y it λy i = β 0 (1 λ) + β 1 (x it1 λx i1 ) +... +β k (x itk λx ik ) + (ν it λν i ) where λ = 1 [σ 2 u/(σ 2 u + T σ 2 a)] 1/2 and y i is time average. This equation contains quasi-demeaned data. Errors are now uncorrelated, and the GLS estimator is simply the pooled OLS of this transformation. 15 / 21

Note Random Effects Models cont. Advantage of RE is that it allows for explanatory variables which are constant over time. Thus if we have a variable which do not change over time in our data and we want to use it, it is no problem. In practice, λ is never known, as it is composed of theoretical variances. Thus we need to estimate it, usually by Pooled OLS: ˆλ = 1 [ˆσ 2 u/(ˆσ 2 u + T ˆσ 2 a)] 1/2, where ˆσ 2 u and ˆσ 2 a are consistent estimators of σ 2 u and σ 2 a under Pooled OLS. Thus random effects model is estimated by feasible GLS FGLS, where λ is replaced by ˆλ. For λ = 0, we have Pooled OLS (a i is unimportant as it has small variance relative to u it ) For λ = 1, we have FE (σ 2 a is large relatively to σ 2 u) 16 / 21

Assumptions for Random Effects In practice, λ is never 0 or 1. But when it is close to 0 for example, RE will be close to Pooled OLS. This is the case, when unobserved effect is relatively unimportant. Assumptions FE1, FE2 and FE6 are unchanged for RE model. RE3 In addition to FE3, the expected value of a i given all explanatory variables is zero: E(a i X i ) = 0. rule out correlation between unobserved effect and explanatory variables. RE4 There are no perfect linear relationships among the explanatory variables. allow explanatory variables to be constant in time for all i. 17 / 21

Assumptions for Random Effects cont. RE5 In addition to FE5, the variance of a i given all explanatory variables is constant: V ar(a i X i ) = σ 2 a. Under the Assumptions FE1, FE2, RE3 and RE4, the random effects estimator ˆβ RE is consistent as N gets large for fixed T. RE estimator is not unbiased unless we know λ. RE estimator is also approximately normally distributed with large N and usual standard errors, t statistics and F statistics are valid. 18 / 21

Fixed Effects vs. Random Effects We decide whether to use RE or FE based on a i. If unobserved effect is something we want to estimate, use FE. (i.e. Panel Data for EU-27). If unobserved effect is supposed to be random, use RE (i.e. Panel Data for schools in large country). But, to treat a i as random, we have to make sure that it is not correlated with explanatory variable. If unobserved effect a i is correlated with explanatory variables, FE is consistent, while RE is inconsistent. Otherwise, RE is more efficient that FE. 19 / 21

Fixed Effects vs. Random Effects cont. Thus we can test whether to use FE or RE statistically: Haussman test H 0 : cov(a i, x it ) = 0 Under the null, both FE and RE are consistent, but RE is asymptotically more efficient. Under the alternative, FE is still consistent (RE is not). We can test and correct for serial correlation and heteroskedasticity in the errors. We can estimate standard errors robust to both. 20 / 21

Thank you Thank you very much for your attention! 21 / 21