Topic 10: Panel Data Analysis
Advanced Econometrics (I)
Dong Chen
School of Economics, Peking University

1 Introduction

Panel data combine the features of cross section data and time series. Usually a panel data set has the following format, with one row per unit and time period:

id    time    y    x

The rich information contained in a panel data set allows us to answer some questions that cannot be well answered using a cross section or a time series data set alone. For example, in a study of FDI's impact on domestic firms' productivity using industry-level cross section data, one faces the problem of separating out the potential effect of self-selection. Suppose one estimates a cross section model that regresses domestic firms' average productivity in an industry on the level of FDI presence, along with a set of industry-specific controls, using industry-level data.

$$PROD_i = \beta_1 + \beta_2 FDI_i + \beta_3 CONC_i + \beta_4 KLRATIO_i + \beta_5 SOE_i + \varepsilon_i, \tag{1}$$

where $PROD$ is industry average productivity, $FDI$ is the level of FDI presence (measured by capital or employment), $CONC$ is a measure of market concentration, $KLRATIO$ is the capital-labor ratio, and $SOE$ is a measure of state-owned enterprises' weight in an industry.

A positive coefficient estimate for FDI presence in this case can bear two interpretations. One is that foreign firms' presence raises domestic firms' productivity, which suggests a positive spillover effect of FDI. However, it may also reflect the fact that multinational firms are more likely to enter industries in which domestic firms have higher productivity in the first place. Without information on the time dimension, it is hard to distinguish between these two effects. Panel data, however, allow one to make inferences in cases like this.

An important advantage of panel data is that they allow us to control for unobserved cross-unit heterogeneity that does not vary over time. For
example, demographics and civic culture are thought to change very slowly, especially compared to short-term fluctuations in the economy or changes in the political climate. Consider the problem of estimating the effect of imprisonment on crime. The simple correlation between the per capita prison population and property crimes per capita is positive. Controlling for demographic and short-run economic factors, the relationship between imprisonment and property crime remains positive. Does this mean imprisonment has no effect on curbing crime, or is it just because we have omitted some unmeasured factors, such as culture? Panel data are helpful in answering questions like this.

In this chapter, we discuss some widely used models of panel data. Before discussing those models, we first examine a different, though related, type of data, namely, independently pooled cross sections.

2 Independently Pooled Cross Sections

Independently pooled cross section data are composed of observations that are sampled randomly from a large population at different points in time. They differ from panel data in that they contain different samples of units in each period instead of following the same units across time. As a result, independently pooled cross section data sets are sometimes also called pseudo-panels.

Models using independently pooled cross section data can be estimated by OLS. Due to the increased sample size, one can obtain more precise parameter estimates and increased power in hypothesis testing. More importantly, adding a time dimension to the data set can sometimes allow for correct inferences that would otherwise be impossible.

Example 1: Immigration is an important part of Canada's public policy. The Canadian government is concerned about how immigrants are assimilated into Canadian society. One important measure of assimilation is to compare immigrants' employment income with that of native-born Canadians.
A possible way of measuring the assimilation process is to regress people's job earnings on an immigrant dummy variable, a variable indicating the number of years since an immigrant landed in Canada, and a set of individual-specific characteristics such as level of education, age, work experience, etc.

$$EARN_i = \beta_1 + \beta_2 IMM_i + \beta_3 YEARIMM_i + \beta_4 YEARIMM_i^2 + \beta_5 AGE_i + \beta_6 SCHOOL_i + \varepsilon_i, \tag{2}$$

where $EARN$ is an individual's job earnings, $IMM$ is a dummy variable indicating whether an individual is an immigrant, $YEARIMM$ is the number of years since immigration, $AGE$ measures an individual's age, and $SCHOOL$ is years of schooling.

Suppose this model is estimated using cross section data, for example, a sample from the census of a certain year. To make correct inferences, an important assumption is that some unobserved qualities of immigrants who landed in Canada in different years are constant. However, this assumption may well be violated due to the dramatic shift in immigrants' source countries during the past several decades. Therefore, the observed positive correlation between immigrants' earnings and the length of time that they have spent in Canada may not reflect real assimilation, but is instead, at least partly, due to
the changes in the unobserved qualities among different immigrant cohorts. If, however, the estimation is based on pooled data from multiple censuses, then this cohort effect can be controlled for.

Several issues need to be considered when using pooled cross section data. First, time period dummy variables are usually included in the model to allow the intercept to differ across periods. If the effect of an explanatory variable changes over time, then we should interact the time period dummies with that explanatory variable. Note that if we interact all explanatory variables with the time period dummies, then this is equivalent to running separate regressions for each time period. This calls for testing for structural change in the model across time, for example, using the Chow test as we have discussed earlier.

3 Panel Data Models

The basic framework of panel data analysis is summarized by the following model.

$$y_{it} = x_{it}'\beta + z_i'\alpha + \varepsilon_{it}, \qquad i = 1, \ldots, n; \; t = 1, \ldots, T. \tag{3}$$

There are $K$ regressors in $x_{it}$, not including a constant term. The term $z_i'\alpha$ measures the individual or heterogeneous effect, where $z_i$ contains a constant term and a set of individual- or group-specific variables that are constant over time $t$. Note that if $z_i$ is observed for all individuals, then this model is simply a regular linear regression model that can be estimated by OLS. Depending on the assumptions on $z_i'\alpha$, we can have different models.

Pooled Regression: If $z_i$ contains only a constant term (i.e., there is no unobserved individual- or group-specific heterogeneity), then OLS gives consistent and efficient estimates of the common intercept $\alpha$ and the slope vector $\beta$.

Fixed Effects: If $z_i$ is unobserved, at least partially, but correlated with $x_{it}$, then the OLS estimator of $\beta$ is biased and inconsistent due to the omitted variable problem. In this case, the model becomes

$$y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}, \tag{4}$$

where $\alpha_i = z_i'\alpha$ embodies all the unobservable individual- or group-specific effects.
Random Effects: If the unobserved individual heterogeneity can be assumed to be uncorrelated with $x_{it}$, then the model can be rewritten as

$$y_{it} = x_{it}'\beta + E(z_i'\alpha) + \{z_i'\alpha - E(z_i'\alpha)\} + \varepsilon_{it} = x_{it}'\beta + \alpha + u_i + \varepsilon_{it}, \tag{5}$$

where $u_i$ is a group-specific random term.

4 Fixed Effects

4.1 Estimation

Note that in equation (4), the $\alpha_i$ can be treated as individual-specific intercept terms, and they can be estimated by including a set of dummy variables in the model. Thus, the fixed effects model can be written in compact form as

$$y = D\alpha + X\beta + \varepsilon, \tag{6}$$
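Here

$$D = \begin{bmatrix} i & 0 & \cdots & 0 \\ 0 & i & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & i \end{bmatrix}, \qquad \alpha = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix},$$

$D$ is $nT \times n$, $\alpha$ is $n \times 1$, and $i$ is a $T \times 1$ vector of ones. This model is also called the least squares dummy variable (LSDV) model.

Estimating (6) by OLS involves inverting an $(n + K) \times (n + K)$ matrix. If $n$ is large, which is usually the case, this can be computationally costly and may exceed available memory. There is an alternative way to proceed that only requires inverting a $K \times K$ matrix. This is achieved by using the results of partitioned regression to first estimate $\beta$ alone from model (6). The normal equations are

$$\begin{matrix} \text{(i)} \\ \text{(ii)} \end{matrix} \quad \begin{bmatrix} D'D & D'X \\ X'D & X'X \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} D'y \\ X'y \end{bmatrix}. \tag{7}$$

We can first solve for $a$ from (i) in (7):

$$a = (D'D)^{-1}D'y - (D'D)^{-1}D'Xb \tag{8}$$
$$= (D'D)^{-1}D'(y - Xb). \tag{9}$$

Note that solution (8) implies an important result: if $D'X = 0$, then $a = (D'D)^{-1}D'y$, which is just the OLS estimator from regressing $y$ on $D$.

Substituting (9) into (ii) in (7), we have

$$X'D(D'D)^{-1}D'y - X'D(D'D)^{-1}D'Xb + X'Xb = X'y.$$

Solving for $b$ yields

$$b = \left[X'\left(I - D(D'D)^{-1}D'\right)X\right]^{-1}\left[X'\left(I - D(D'D)^{-1}D'\right)y\right] = [X'M_D X]^{-1}[X'M_D y], \tag{10}$$

where $M_D = I - D(D'D)^{-1}D'$. Since $M_D$ is a symmetric idempotent matrix, we can rewrite (10) as

$$b = [X'M_D'M_D X]^{-1}[X'M_D'M_D y] = (X_*'X_*)^{-1}X_*'y_*, \tag{11}$$

where $X_* = M_D X$ and $y_* = M_D y$. Note that the columns of the matrix $D$ are orthogonal, so

$$M_D = \begin{bmatrix} M^0 & 0 & \cdots & 0 \\ 0 & M^0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M^0 \end{bmatrix}.$$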
Here $M^0 = I_T - \frac{1}{T} i i'$. Recall that the matrix $M^0$ creates deviations from the mean when postmultiplied by any $T \times 1$ vector $z_i$ (see the lecture notes on the derivation of $R^2$). That is, $M^0 z_i = z_i - \bar{z}_i i$. Therefore, the least squares regression of $M_D y$ on $M_D X$ is equivalent to a regression of $(y_{it} - \bar{y}_{i.})$ on $(x_{it} - \bar{x}_{i.})$, where $\bar{y}_{i.}$ and $\bar{x}_{i.}$ are the scalar and the $K \times 1$ vector of means of $y_{it}$ and $x_{it}$ over the $T$ observations for group $i$. That is,

$$y_{it} - \bar{y}_{i.} = (x_{it} - \bar{x}_{i.})'\beta + (\varepsilon_{it} - \bar{\varepsilon}_{i.}). \tag{12}$$

Hence, the fixed effects estimator $b$ can also be written as

$$b = \left[\sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it} - \bar{x}_{i.})(x_{it} - \bar{x}_{i.})'\right]^{-1} \left[\sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it} - \bar{x}_{i.})(y_{it} - \bar{y}_{i.})\right].$$

With the estimate for $\beta$, the estimate for $\alpha$ can be obtained from the other normal equation in the partitioned regression,

$$a = (D'D)^{-1}D'(y - Xb), \tag{13}$$

or

$$a_i = \bar{y}_{i.} - \bar{x}_{i.}'b. \tag{14}$$

Remark 1: In fixed effects models, explanatory variables that are constant over time cannot be included, because in this case $x_{it} - \bar{x}_{i.} = 0$ for all $i$. Also, when a full set of time period dummy variables (or a linear time trend) is included, explanatory variables whose change across time is constant, e.g. age, cannot be included. Although time-invariant variables cannot be included by themselves in the fixed effects model, one can interact them with variables that change over time, for example, with time period dummy variables. Doing so yields estimates of how the partial effect of that variable changes over time.

Remark 2: A panel data set with missing values for some time periods is called an unbalanced panel. Generally, we can proceed as usual using the available data. Note that in this case units with only one period of data play no role in the estimation and are dropped, because for them $y_{it} - \bar{y}_{i.} = 0$ and $x_{it} - \bar{x}_{i.} = 0$.

4.2 Properties of the Fixed Effects Estimator

If we assume random sampling on the cross section dimension and strict exogeneity on the time series dimension (conditional on the unobserved effects),

$$E(\varepsilon_{it} \mid X_i, \alpha_i) = 0, \tag{15}$$
then the fixed effects estimators of $\alpha$ and $\beta$ are unbiased. To see the unbiasedness of $b$, note that $M_D D = 0$, so

$$E(b) = E\left\{[X'M_D X]^{-1}[X'M_D y]\right\} = E\left\{[X'M_D X]^{-1}[X'M_D(D\alpha + X\beta + \varepsilon)]\right\} = E\left\{0 + \beta + [X'M_D X]^{-1}X'M_D\varepsilon\right\} = \beta.$$

The estimator of the covariance matrix for $b$ is

$$\text{Est.Var}(b) = s^2 (X'M_D X)^{-1} \tag{16}$$
$$= s^2 \left[\sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it} - \bar{x}_{i.})(x_{it} - \bar{x}_{i.})'\right]^{-1}, \tag{17}$$

where

$$s^2 = \frac{\sum_{i=1}^{n}\sum_{t=1}^{T} e_{it}^2}{nT - n - K}. \tag{18}$$

The $it$th residual, $e_{it}$, is defined as

$$e_{it} = y_{it} - x_{it}'b - a_i = y_{it} - x_{it}'b - \bar{y}_{i.} + \bar{x}_{i.}'b = (y_{it} - \bar{y}_{i.}) - (x_{it} - \bar{x}_{i.})'b. \tag{19}$$

For the fixed effects estimator to be BLUE, we need to further assume homoskedasticity and no autocorrelation. That is, for each $t$,

$$\text{Var}(\varepsilon_{it} \mid X_i, \alpha_i) = \text{Var}(\varepsilon_{it}) = \sigma_\varepsilon^2, \tag{20}$$

and for all $t \neq s$,

$$\text{Cov}(\varepsilon_{it}, \varepsilon_{is} \mid X_i, \alpha_i) = 0. \tag{21}$$

The fixed effects estimator of $\beta$ is consistent when either $n$ or $T$ or both tend to infinity. However, the estimator of $\alpha$ is consistent only if $T \to \infty$.

STATA Tips

To estimate fixed effects models in STATA, we first need to declare the data set as panel data by using the tsset command.

    tsset id_var date_var, option

The usual time series operators (lag, difference, etc.) can then be applied to the panel data. For example, to create the first difference of the variable profit across years for each firm, you can type

    tsset firm year, yearly
    gen dprofit = d.profit

A fixed effects model can then be estimated by using the xtreg command with the fe option.

    xtreg dep_var var_list, fe

Note that these STATA panel commands expect data arranged in long form rather than wide form. To transform a wide-form data set to long form, use the reshape command. Check the help file for more details.
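To make the within transformation behind (12) and (14) concrete, here is a minimal sketch in Python (standard library only) for the special case of a single regressor ($K = 1$). The function name `within_estimator` and the toy data are illustrative assumptions and are not part of the original notes; the data are generated without noise so that the estimator recovers the slope and intercepts exactly.

```python
from collections import defaultdict


def within_estimator(ids, x, y):
    """Fixed effects (within) estimator for one regressor.

    Returns (b, a): b is the slope from regressing group-demeaned y on
    group-demeaned x, and a maps each id to a_i = ybar_i - xbar_i * b.
    """
    groups = defaultdict(list)
    for i, xi, yi in zip(ids, x, y):
        groups[i].append((xi, yi))

    means = {}
    sxx = 0.0  # sum of squared within-group deviations of x
    sxy = 0.0  # sum of within-group cross products of x and y
    for i, obs in groups.items():
        xbar = sum(o[0] for o in obs) / len(obs)
        ybar = sum(o[1] for o in obs) / len(obs)
        means[i] = (xbar, ybar)
        for xi, yi in obs:
            sxx += (xi - xbar) ** 2
            sxy += (xi - xbar) * (yi - ybar)

    if sxx == 0.0:
        # Happens when x is constant within every group (see Remark 1).
        raise ValueError("x has no within-group variation")
    b = sxy / sxx
    a = {i: ybar - xbar * b for i, (xbar, ybar) in means.items()}
    return b, a


# Hypothetical panel generated by y_it = alpha_i + 2 * x_it,
# with alpha_1 = 1 and alpha_2 = 10 and no error term.
ids = [1, 1, 1, 2, 2, 2]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]
y = [3.0, 5.0, 7.0, 14.0, 18.0, 22.0]
b, a = within_estimator(ids, x, y)
print(b, a)
```

A unit observed in only one period contributes zero to both sums, which is the sense in which such units drop out of the estimation in an unbalanced panel.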
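5 Random Effects

5.1 Assumptions

If we assume $\alpha_i$ is random and uncorrelated with $X$, then we can make more efficient use of the data by using the random effects model. Consider a reformulation of the model

$$y_{it} = x_{it}'\beta + (\alpha + u_i) + \varepsilon_{it}. \tag{22}$$

In this case we have $K$ regressors and a constant term $\alpha$, which is the mean of the unobserved heterogeneity, $E(z_i'\alpha)$. The component $u_i$ is the random heterogeneity specific to the $i$th unit and is constant over time. It is further assumed that

$$E(\varepsilon_{it} \mid X) = E(u_i \mid X) = 0,$$
$$E(\varepsilon_{it}^2 \mid X) = \sigma_\varepsilon^2, \qquad E(u_i^2 \mid X) = \sigma_u^2,$$
$$E(\varepsilon_{it} u_j \mid X) = 0 \text{ for all } i, t, j,$$
$$E(\varepsilon_{it}\varepsilon_{js} \mid X) = 0 \text{ if } t \neq s \text{ or } i \neq j,$$
$$E(u_i u_j \mid X) = 0 \text{ if } i \neq j.$$

Let $\eta_{it} = u_i + \varepsilon_{it}$, which is the composite error term. It follows that

$$E(\eta_{it} \mid X) = 0,$$
$$E(\eta_{it}^2 \mid X) = \sigma_u^2 + \sigma_\varepsilon^2,$$
$$E(\eta_{it}\eta_{is} \mid X) = \sigma_u^2 \text{ if } t \neq s,$$
$$E(\eta_{it}\eta_{js} \mid X) = 0 \text{ for all } t \text{ and } s \text{ if } i \neq j.$$

Denote $\eta_i = [\eta_{i1}, \eta_{i2}, \ldots, \eta_{iT}]'$. Let $E(\eta_i \eta_i' \mid X) = \Sigma$. Then

$$\Sigma = \begin{bmatrix} \sigma_u^2 + \sigma_\varepsilon^2 & \sigma_u^2 & \cdots & \sigma_u^2 \\ \sigma_u^2 & \sigma_u^2 + \sigma_\varepsilon^2 & \cdots & \sigma_u^2 \\ \vdots & & \ddots & \vdots \\ \sigma_u^2 & \sigma_u^2 & \cdots & \sigma_u^2 + \sigma_\varepsilon^2 \end{bmatrix} \tag{23}$$
$$= \sigma_\varepsilon^2 I_T + \sigma_u^2 i_T i_T', \tag{24}$$

where $i_T$ is a $T \times 1$ vector of 1s. Since observations $i$ and $j$ are independent, the disturbance covariance matrix for the full $nT$ observations is

$$\Omega = \begin{bmatrix} \Sigma & 0 & \cdots & 0 \\ 0 & \Sigma & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma \end{bmatrix} = I_n \otimes \Sigma. \tag{25}$$

5.2 GLS Estimator

Given the error structure of the random effects model, OLS applied to model (22) will yield a consistent estimator of $\beta$, but it will not be efficient. To obtain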
an efficient estimator, we shall use the GLS method. The GLS estimator of the slope parameters is

$$\hat{\beta} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y = \left(\sum_{i=1}^{n} X_i'\Sigma^{-1}X_i\right)^{-1}\left(\sum_{i=1}^{n} X_i'\Sigma^{-1}y_i\right). \tag{26}$$

The GLS method is equivalent to OLS on the transformed model

$$y_{it} - \theta\bar{y}_{i.} = (1 - \theta)\alpha + (x_{it} - \theta\bar{x}_{i.})'\beta + (\eta_{it} - \theta\bar{\eta}_{i.}), \tag{27}$$

where

$$\theta = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}}. \tag{28}$$

The transformed variables are known as quasi-demeaned data because they are formed by subtracting only a fraction of the group averages. Note the similarity of this procedure to the computation in the fixed effects model, which uses $\theta = 1$. Unlike in fixed effects models, it is possible to include explanatory variables that are constant over time in the random effects model.

5.3 Feasible GLS

Since the variance components, $\sigma_u^2$ and $\sigma_\varepsilon^2$, are usually unknown, we can use a two-step method to obtain the FGLS estimator. In the first step, we estimate the variance components using some consistent estimators, and in the second step, we substitute those values into the GLS estimator. Specifically, consider

$$y_{it} = x_{it}'\beta + \alpha + \varepsilon_{it} + u_i, \tag{29}$$
$$\bar{y}_{i.} = \bar{x}_{i.}'\beta + \alpha + \bar{\varepsilon}_{i.} + u_i. \tag{30}$$

Taking the difference yields

$$y_{it} - \bar{y}_{i.} = (x_{it} - \bar{x}_{i.})'\beta + (\varepsilon_{it} - \bar{\varepsilon}_{i.}). \tag{31}$$

The OLS estimator of $\beta$ from model (31) is just the LSDV estimator, which is unbiased and consistent. For each group $i$, we can estimate $\sigma_\varepsilon^2$ by

$$s_e^2(i) = \frac{\sum_{t=1}^{T} (e_{it} - \bar{e}_{i.})^2}{T - K - 1}, \tag{32}$$

where $e_{it}$ is given in (19). There are $n$ such estimators, so we can take the average and obtain

$$\bar{s}_e^2 = \frac{1}{n}\sum_{i=1}^{n} s_e^2(i) = \frac{\sum_{i=1}^{n}\sum_{t=1}^{T} (e_{it} - \bar{e}_{i.})^2}{nT - nK - n}.$$

However, since $\alpha$ and $\beta$ are not estimated $n$ times, the above expression overcorrects for the degrees of freedom. It can be shown that an unbiased estimator for $\sigma_\varepsilon^2$ is

$$s_{LSDV}^2 = \frac{\sum_{i=1}^{n}\sum_{t=1}^{T} (e_{it} - \bar{e}_{i.})^2}{nT - n - K}. \tag{33}$$
Note that estimating (29) by pooled OLS gives consistent estimators of $\alpha$ and $\beta$. Hence, a consistent estimator of $E(\eta_{it}^2)$ is

$$s_{Pooled}^2 = \frac{e'e}{nT - K - 1}, \tag{34}$$

where $e$ here denotes the vector of pooled OLS residuals. That is,

$$\text{plim}\, s_{Pooled}^2 = \sigma_u^2 + \sigma_\varepsilon^2.$$

Therefore, we can estimate $\sigma_u^2$ by

$$\hat{\sigma}_u^2 = s_{Pooled}^2 - s_{LSDV}^2. \tag{35}$$

Plugging $\hat{\sigma}_\varepsilon^2 = s_{LSDV}^2$ and $\hat{\sigma}_u^2$ into (28), we obtain an estimator of $\theta$. When the sample size is large (in the sense that either $n$ or $T$ or both tend to infinity), the FGLS estimator is asymptotically as efficient as the true GLS estimator. Even for moderate sample sizes, FGLS is still more efficient than the fixed effects estimator, provided the random effects assumptions hold.

6 Testing for Random Effects

If the regressors are correlated with the random effects $\alpha_i$, then the GLS estimator of $\beta$ is inconsistent. If that is the case, then we should use the fixed effects model, which yields consistent estimators whether or not such correlation is present. Otherwise, random effects models are more efficient. It is possible to test for such orthogonality by using Hausman's specification test.

$$H_0: \text{Cov}(x_{itj}, \alpha_i) = 0 \quad \text{vs.} \quad H_1: \text{Cov}(x_{itj}, \alpha_i) \neq 0.$$

The test statistic is

$$W = (b - \hat{\beta})'\Psi^{-1}(b - \hat{\beta}) \overset{a}{\sim} \chi_K^2, \tag{36}$$

where $b$ is the fixed effects estimator, $\hat{\beta}$ is the random effects estimator, and $\Psi = \text{Est.Var}(b) - \text{Est.Var}(\hat{\beta})$.

STATA Tips

To estimate the random effects model in STATA, use the xtreg command with the re option.

    xtreg dep_var var_list, re

Hausman's test can be carried out using the following commands.

    quiet xtreg dep_var var_list, fe
    estimates store fixed
    quiet xtreg dep_var var_list, re
    hausman fixed
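With a single regressor ($K = 1$), the Hausman statistic in (36) reduces to a ratio of scalars, which makes the mechanics easy to see. The following Python sketch is only an illustration under that assumption: the function name `hausman_scalar` and all input numbers are made up and do not come from any data set. The constant 3.841 is the 5% critical value of the $\chi^2$ distribution with one degree of freedom.

```python
def hausman_scalar(b_fe, var_fe, b_re, var_re):
    """Hausman statistic for K = 1: (b - beta_hat)^2 / (Var(b) - Var(beta_hat))."""
    psi = var_fe - var_re
    if psi <= 0:
        # In finite samples the estimated difference need not be positive;
        # the scalar statistic is not usable in that case.
        raise ValueError("Est.Var(FE) - Est.Var(RE) is not positive")
    return (b_fe - b_re) ** 2 / psi


CHI2_1_CRIT_5PCT = 3.841  # 5% critical value, chi-square with 1 d.o.f.

# Hypothetical estimates and estimated variances (illustrative only).
w = hausman_scalar(b_fe=0.50, var_fe=0.010, b_re=0.40, var_re=0.006)
reject = w > CHI2_1_CRIT_5PCT
print(round(w, 3), reject)
```

With these made-up inputs the statistic is about 2.5, below the critical value, so the null of no correlation would not be rejected and the random effects estimator would be retained.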
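7 Comparison of OLS, Fixed Effects, and Random Effects

We can formulate the fixed effects panel data regression model in three ways. First, the original model is

$$y_{it} = x_{it}'\beta + \alpha + \varepsilon_{it}. \tag{37}$$

In terms of deviations from the group means,

$$y_{it} - \bar{y}_{i.} = (x_{it} - \bar{x}_{i.})'\beta + (\varepsilon_{it} - \bar{\varepsilon}_{i.}), \tag{38}$$

and in terms of the group means,

$$\bar{y}_{i.} = \bar{x}_{i.}'\beta + \alpha + \bar{\varepsilon}_{i.}. \tag{39}$$

In (37), the matrices of the sums of squares and cross products around the overall means $\bar{\bar{x}}$ and $\bar{\bar{y}}$ are

$$S_{xx}^{total} = \sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it} - \bar{\bar{x}})(x_{it} - \bar{\bar{x}})', \tag{40}$$
$$S_{xy}^{total} = \sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it} - \bar{\bar{x}})(y_{it} - \bar{\bar{y}}).$$

In (38), the matrices around the group means are given by

$$S_{xx}^{within} = \sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it} - \bar{x}_{i.})(x_{it} - \bar{x}_{i.})', \tag{41}$$
$$S_{xy}^{within} = \sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it} - \bar{x}_{i.})(y_{it} - \bar{y}_{i.}). \tag{42}$$

These measure the variation within groups. Finally, for (39), the moment matrices are the between-groups sums of squares and cross products.

$$S_{xx}^{between} = \sum_{i=1}^{n} T(\bar{x}_{i.} - \bar{\bar{x}})(\bar{x}_{i.} - \bar{\bar{x}})', \tag{43}$$
$$S_{xy}^{between} = \sum_{i=1}^{n} T(\bar{x}_{i.} - \bar{\bar{x}})(\bar{y}_{i.} - \bar{\bar{y}}). \tag{44}$$

It can be shown that

$$S_{xx}^{total} = S_{xx}^{within} + S_{xx}^{between}, \tag{45}$$
$$S_{xy}^{total} = S_{xy}^{within} + S_{xy}^{between}. \tag{46}$$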
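The decomposition in (45) and (46) can be checked numerically. The sketch below, in Python with the standard library only, computes the total, within, and between sums for a scalar regressor on a small made-up panel; the function name `moment_sums` and the data are illustrative assumptions rather than anything from the original notes.

```python
def moment_sums(panel):
    """Return {'total' | 'within' | 'between': (Sxx, Sxy)} for scalar x.

    panel maps a unit id to its list of (x, y) observations.
    """
    rows = [obs for unit in panel.values() for obs in unit]
    x_bar = sum(x for x, _ in rows) / len(rows)  # overall mean of x
    y_bar = sum(y for _, y in rows) / len(rows)  # overall mean of y

    total = [0.0, 0.0]
    within = [0.0, 0.0]
    between = [0.0, 0.0]
    for x, y in rows:
        total[0] += (x - x_bar) ** 2
        total[1] += (x - x_bar) * (y - y_bar)
    for unit in panel.values():
        t_i = len(unit)  # number of periods observed for this unit
        xi_bar = sum(x for x, _ in unit) / t_i
        yi_bar = sum(y for _, y in unit) / t_i
        between[0] += t_i * (xi_bar - x_bar) ** 2
        between[1] += t_i * (xi_bar - x_bar) * (yi_bar - y_bar)
        for x, y in unit:
            within[0] += (x - xi_bar) ** 2
            within[1] += (x - xi_bar) * (y - yi_bar)
    return {"total": tuple(total), "within": tuple(within), "between": tuple(between)}


# Hypothetical balanced panel: two units, three periods each.
panel = {
    "A": [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)],
    "B": [(4.0, 4.0), (5.0, 7.0), (6.0, 7.0)],
}
s = moment_sums(panel)
for name in ("total", "within", "between"):
    print(name, s[name])
```

In this toy panel most of the variation in x is between the two units rather than within them, which is the situation in which pooled OLS and the within estimator can differ noticeably.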
As a result, the OLS estimator with pooled data (without exploiting the panel feature of the data set) is

$$b^{total} = (S_{xx}^{total})^{-1}S_{xy}^{total} = (S_{xx}^{within} + S_{xx}^{between})^{-1}(S_{xy}^{within} + S_{xy}^{between}). \tag{47}$$

The within-group estimator, which is also the LSDV or the fixed effects estimator, is given by

$$b^{within} = (S_{xx}^{within})^{-1}S_{xy}^{within}. \tag{48}$$

And finally the between-groups estimator is

$$b^{between} = (S_{xx}^{between})^{-1}S_{xy}^{between}. \tag{49}$$

From (48) and (49), we have

$$S_{xy}^{within} = S_{xx}^{within}b^{within}, \tag{50}$$
$$S_{xy}^{between} = S_{xx}^{between}b^{between}. \tag{51}$$

Substituting (50) and (51) into (47), we have

$$b^{total} = F^{within}b^{within} + F^{between}b^{between}, \tag{52}$$

where $F^{within} = (S_{xx}^{within} + S_{xx}^{between})^{-1}S_{xx}^{within} = I - F^{between}$. This result implies that the slope in the pooled data (total) is a matrix-weighted average of the slope within groups and the slope of the means between groups.

The random effects model can also be compared within this framework. It can be shown that the GLS estimator is likewise a matrix-weighted average of the within and between estimators, $\hat{\beta} = \hat{F}^{within}b^{within} + (I - \hat{F}^{within})b^{between}$, with

$$\hat{F}^{within} = (S_{xx}^{within} + \lambda S_{xx}^{between})^{-1}S_{xx}^{within}, \qquad \lambda = \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_u^2} = (1 - \theta)^2.$$

If $\lambda = 1$ (i.e., $\sigma_u^2 = 0$), then OLS is efficient. However, to the extent that $\lambda$ is less than 1, OLS is inefficient because it gives too much weight to the between-group variation.
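The weighting argument can be illustrated with scalars ($K = 1$). The Python sketch below uses hypothetical moment sums and variance components, chosen only for illustration, and assumes the standard result that the GLS estimator is the weighted average of the within and between estimators with weight $\hat{F}^{within}$. It checks that the pooled slope equals the weighted average in (52), that $\lambda = (1 - \theta)^2$, and that GLS puts less weight on the between variation than pooled OLS does.

```python
import math

# Hypothetical scalar moment sums (not taken from any real data set).
sxx_w, sxy_w = 4.0, 6.0     # within-group sums
sxx_b, sxy_b = 13.5, 12.0   # between-group sums

b_within = sxy_w / sxx_w
b_between = sxy_b / sxx_b
b_total = (sxy_w + sxy_b) / (sxx_w + sxx_b)           # pooled OLS, as in (47)

# Pooled OLS as a weighted average of the two slopes, as in (52).
f_within = sxx_w / (sxx_w + sxx_b)
b_total_weighted = f_within * b_within + (1.0 - f_within) * b_between

# Hypothetical variance components and panel length.
sigma2_eps, sigma2_u, T = 1.0, 1.0, 3
theta = 1.0 - math.sqrt(sigma2_eps) / math.sqrt(sigma2_eps + T * sigma2_u)
lam = sigma2_eps / (sigma2_eps + T * sigma2_u)

# GLS weight on the within estimator: between variation is scaled by lambda.
f_within_re = sxx_w / (sxx_w + lam * sxx_b)
beta_gls = f_within_re * b_within + (1.0 - f_within_re) * b_between

print(b_within, b_between, b_total, beta_gls, theta, lam)
```

Because $\lambda < 1$ here, the GLS weight on the within estimator is larger than the pooled OLS weight, so the GLS slope lies between the pooled and the within estimates.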
More informationLecture 6: Dynamic panel models 1
Lecture 6: Dynamic panel models 1 Ragnar Nymoen Department of Economics, UiO 16 February 2010 Main issues and references Pre-determinedness and endogeneity of lagged regressors in FE model, and RE model
More informationHOW IS GENERALIZED LEAST SQUARES RELATED TO WITHIN AND BETWEEN ESTIMATORS IN UNBALANCED PANEL DATA?
HOW IS GENERALIZED LEAST SQUARES RELATED TO WITHIN AND BETWEEN ESTIMATORS IN UNBALANCED PANEL DATA? ERIK BIØRN Department of Economics University of Oslo P.O. Box 1095 Blindern 0317 Oslo Norway E-mail:
More informationEconometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague
Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage
More information1 The Multiple Regression Model: Freeing Up the Classical Assumptions
1 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions were crucial for many of the derivations of the previous chapters. Derivation of the OLS estimator
More informationPanel data methods for policy analysis
IAPRI Quantitative Analysis Capacity Building Series Panel data methods for policy analysis Part I: Linear panel data models Outline 1. Independently pooled cross sectional data vs. panel/longitudinal
More informationControlling for Time Invariant Heterogeneity
Controlling for Time Invariant Heterogeneity Yona Rubinstein July 2016 Yona Rubinstein (LSE) Controlling for Time Invariant Heterogeneity 07/16 1 / 19 Observables and Unobservables Confounding Factors
More informationDynamic Panel Data Models
June 23, 2010 Contents Motivation 1 Motivation 2 Basic set-up Problem Solution 3 4 5 Literature Motivation Many economic issues are dynamic by nature and use the panel data structure to understand adjustment.
More informationMULTILEVEL MODELS WHERE THE RANDOM EFFECTS ARE CORRELATED WITH THE FIXED PREDICTORS
MULTILEVEL MODELS WHERE THE RANDOM EFFECTS ARE CORRELATED WITH THE FIXED PREDICTORS Nigel Rice Centre for Health Economics University of York Heslington York Y01 5DD England and Institute of Education
More informationApplied Econometrics (MSc.) Lecture 3 Instrumental Variables
Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.
More informationLecture 4: Heteroskedasticity
Lecture 4: Heteroskedasticity Econometric Methods Warsaw School of Economics (4) Heteroskedasticity 1 / 24 Outline 1 What is heteroskedasticity? 2 Testing for heteroskedasticity White Goldfeld-Quandt Breusch-Pagan
More information1 Motivation for Instrumental Variable (IV) Regression
ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data
More informationEcon 836 Final Exam. 2 w N 2 u N 2. 2 v N
1) [4 points] Let Econ 836 Final Exam Y Xβ+ ε, X w+ u, w N w~ N(, σi ), u N u~ N(, σi ), ε N ε~ Nu ( γσ, I ), where X is a just one column. Let denote the OLS estimator, and define residuals e as e Y X.
More informationMultiple Equation GMM with Common Coefficients: Panel Data
Multiple Equation GMM with Common Coefficients: Panel Data Eric Zivot Winter 2013 Multi-equation GMM with common coefficients Example (panel wage equation) 69 = + 69 + + 69 + 1 80 = + 80 + + 80 + 2 Note:
More informationIntroduction to Estimation Methods for Time Series models. Lecture 1
Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation
More informationJeffrey M. Wooldridge Michigan State University
Fractional Response Models with Endogenous Explanatory Variables and Heterogeneity Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Fractional Probit with Heteroskedasticity 3. Fractional
More informationApplied Econometrics. Lecture 3: Introduction to Linear Panel Data Models
Applied Econometrics Lecture 3: Introduction to Linear Panel Data Models Måns Söderbom 4 September 2009 Department of Economics, Universy of Gothenburg. Email: mans.soderbom@economics.gu.se. Web: www.economics.gu.se/soderbom,
More informationWeek 2: Pooling Cross Section across Time (Wooldridge Chapter 13)
Week 2: Pooling Cross Section across Time (Wooldridge Chapter 13) Tsun-Feng Chiang* *School of Economics, Henan University, Kaifeng, China March 3, 2014 1 / 30 Pooling Cross Sections across Time Pooled
More informationADVANCED ECONOMETRICS I. Course Description. Contents - Theory 18/10/2017. Theory (1/3)
ADVANCED ECONOMETRICS I Theory (1/3) Instructor: Joaquim J. S. Ramalho E.mail: jjsro@iscte-iul.pt Personal Website: http://home.iscte-iul.pt/~jjsro Office: D5.10 Course Website: http://home.iscte-iul.pt/~jjsro/advancedeconometricsi.htm
More informationEconometric Analysis of Cross Section and Panel Data
Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND
More informationTitle. Description. Quick start. Menu. stata.com. xtcointtest Panel-data cointegration tests
Title stata.com xtcointtest Panel-data cointegration tests Description Quick start Menu Syntax Options Remarks and examples Stored results Methods and formulas References Also see Description xtcointtest
More informationTest of hypotheses with panel data
Stochastic modeling in economics and finance November 4, 2015 Contents 1 Test for poolability of the data 2 Test for individual and time effects 3 Hausman s specification test 4 Case study Contents Test
More informationEconometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague
Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in
More informationECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors
ECON4150 - Introductory Econometrics Lecture 6: OLS with Multiple Regressors Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 6 Lecture outline 2 Violation of first Least Squares assumption
More informationIntroduction to Econometrics. Heteroskedasticity
Introduction to Econometrics Introduction Heteroskedasticity When the variance of the errors changes across segments of the population, where the segments are determined by different values for the explanatory
More informationSensitivity of GLS estimators in random effects models
of GLS estimators in random effects models Andrey L. Vasnev (University of Sydney) Tokyo, August 4, 2009 1 / 19 Plan Plan Simulation studies and estimators 2 / 19 Simulation studies Plan Simulation studies
More informationReview of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley
Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate
More informationPanel Data: Fixed and Random Effects
Short Guides to Microeconometrics Fall 2016 Kurt Schmidheiny Unversität Basel Panel Data: Fixed and Random Effects 1 Introduction In panel data, individuals (persons, firms, cities, ) are observed at several
More informationInstrumental Variables and the Problem of Endogeneity
Instrumental Variables and the Problem of Endogeneity September 15, 2015 1 / 38 Exogeneity: Important Assumption of OLS In a standard OLS framework, y = xβ + ɛ (1) and for unbiasedness we need E[x ɛ] =
More informationLinear Regression with Time Series Data
Econometrics 2 Linear Regression with Time Series Data Heino Bohn Nielsen 1of21 Outline (1) The linear regression model, identification and estimation. (2) Assumptions and results: (a) Consistency. (b)
More informationNew Developments in Econometrics Lecture 11: Difference-in-Differences Estimation
New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. The Basic Methodology 2. How Should We View Uncertainty in DD Settings?
More informationEmpirical Application of Panel Data Regression
Empirical Application of Panel Data Regression 1. We use Fatality data, and we are interested in whether rising beer tax rate can help lower traffic death. So the dependent variable is traffic death, while
More informationIntroduction to Linear Regression Analysis
Introduction to Linear Regression Analysis Samuel Nocito Lecture 1 March 2nd, 2018 Econometrics: What is it? Interaction of economic theory, observed data and statistical methods. The science of testing
More informationTime-Series Cross-Section Analysis
Time-Series Cross-Section Analysis Models for Long Panels Jamie Monogan University of Georgia February 17, 2016 Jamie Monogan (UGA) Time-Series Cross-Section Analysis February 17, 2016 1 / 20 Objectives
More informationECON Introductory Econometrics. Lecture 13: Internal and external validity
ECON4150 - Introductory Econometrics Lecture 13: Internal and external validity Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 9 Lecture outline 2 Definitions of internal and external
More informationA Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008
A Course in Applied Econometrics Lecture 7: Cluster Sampling Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of roups and
More informationGraduate Econometrics Lecture 4: Heteroskedasticity
Graduate Econometrics Lecture 4: Heteroskedasticity Department of Economics University of Gothenburg November 30, 2014 1/43 and Autocorrelation Consequences for OLS Estimator Begin from the linear model
More informationChapter 2. Dynamic panel data models
Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)
More informationEconomics 582 Random Effects Estimation
Economics 582 Random Effects Estimation Eric Zivot May 29, 2013 Random Effects Model Hence, the model can be re-written as = x 0 β + + [x ] = 0 (no endogeneity) [ x ] = = + x 0 β + + [x ] = 0 [ x ] = 0
More informationIn the bivariate regression model, the original parameterization is. Y i = β 1 + β 2 X2 + β 2 X2. + β 2 (X 2i X 2 ) + ε i (2)
RNy, econ460 autumn 04 Lecture note Orthogonalization and re-parameterization 5..3 and 7.. in HN Orthogonalization of variables, for example X i and X means that variables that are correlated are made
More informationIntroduction to Econometrics
Introduction to Econometrics STAT-S-301 Panel Data (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Regression with Panel Data A panel dataset contains observations on multiple entities
More informationNon-Spherical Errors
Non-Spherical Errors Krishna Pendakur February 15, 2016 1 Efficient OLS 1. Consider the model Y = Xβ + ε E [X ε = 0 K E [εε = Ω = σ 2 I N. 2. Consider the estimated OLS parameter vector ˆβ OLS = (X X)
More informationPanel data can be defined as data that are collected as a cross section but then they are observed periodically.
Panel Data Model Panel data can be defined as data that are collected as a cross section but then they are observed periodically. For example, the economic growths of each province in Indonesia from 1971-2009;
More informationOutline. Overview of Issues. Spatial Regression. Luc Anselin
Spatial Regression Luc Anselin University of Illinois, Urbana-Champaign http://www.spacestat.com Outline Overview of Issues Spatial Regression Specifications Space-Time Models Spatial Latent Variable Models
More informationPS 271B: Quantitative Methods II Lecture Notes
PS 271B: Quantitative Methods II Lecture Notes (Part 6: Panel/Longitudinal Data; Multilevel/Mixed Effects models) Langche Zeng zeng@ucsd.edu Panel/Longitudinal Data; Multilevel Modeling; Mixed effects
More informationsplm: econometric analysis of spatial panel data
splm: econometric analysis of spatial panel data Giovanni Millo 1 Gianfranco Piras 2 1 Research Dept., Generali S.p.A. and DiSES, Univ. of Trieste 2 REAL, UIUC user! Conference Rennes, July 8th 2009 Introduction
More informationPlease discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!
Econometrics - Exam May 11, 2011 1 Exam Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page! Problem 1: (15 points) A researcher has data for the year 2000 from
More informationLecture 8 Panel Data
Lecture 8 Panel Data Economics 8379 George Washington University Instructor: Prof. Ben Williams Introduction This lecture will discuss some common panel data methods and problems. Random effects vs. fixed
More information