Panel Data Econometrics


RDB, September 2012

Contents

1 Advantages of Panel (Longitudinal) Data

Part I Preliminaries
2 Some Common Estimators
  2.1 Ordinary Least Squares (OLS)
    2.1.1 Small Sample Properties
    2.1.2 Large Sample Properties
  2.2 Instrumental Variables (IV)
  2.3 Generalized Least Squares (GLS)
  2.4 (Generalized) Method of Moments (GMM)
  2.5 Maximum Likelihood (ML)
    2.5.1 Estimator
    2.5.2 Regularity conditions
    2.5.3 Properties
  2.6 Maximum Entropy (ME)
    2.6.1 Entropy
    2.6.2 Maximum entropy principle
    2.6.3 Use
3 Some Common Tests
  3.1 Likelihood ratio test
  3.2 Lagrange multiplier (LM) test
  3.3 Wald test
  3.4 Hausman Test
  3.5 Sargan-Hansen J-Test
4 Models, parameters of interest and the incidental parameters problem
  4.1 Review of some nonlinear models
    4.1.1 Discrete choice
    4.1.2 Censoring
    4.1.3 Selection
    4.1.4 Count data
  4.2 Policy parameters
  4.3 The incidental parameter problem
  4.4 Terminology

Part II Linear Models
5 Static
  5.1 Uncorrelated individual effects
    5.1.1 OLS
    5.1.2 GLS
    5.1.3 FGLS
    5.1.4 Between Groups Estimator
  5.2 Correlated individual effects
    5.2.1 Within estimator or Least Squares Dummy Variables (LSDV) estimation
    5.2.2 First differenced OLS
    5.2.3 Hausman-Taylor IV
    5.2.4 Chamberlain device
  5.3 Hausman test
6 Dynamic
  6.1 No exogenous regressors
    6.1.1 Previous Estimators
    IV
    GMM
  Exogenous regressors
    Moment conditions for the equation in differences
    Moment conditions for the equation in levels
  Serially correlated errors
    MA errors
    AR errors
    Testing for serial correlation
7 Random Coefficient Models
  General
  Correlated Random Coefficient Models
  Random Coefficient Dynamic Models*

Part III Non-Linear Models
8 Static models
  Random effects
    Butler-Moffit
    Simulation based estimation*
  Correlated RE
    Uncorrelated RE Probit
    Correlated RE Probit
  Fixed effects
    Conditional MLE
    Maximum score*
    Censored regression
    8.2.4 Fixed T identifiability
    Bias reduction
    Orthogonal Parameters*
9 Dynamic models
  The initial conditions problem
  Dynamic discrete choice and duration
  Random Effects Tobit
  Fixed effects

Part IV Additional Issues
  Cross-sectional dependence*
  Variance estimation
    One-way clustering
    More-way clustering
  QML
  Attrition - Sample selection
    Simple panel data model of non-response
    Identification
    Two-step estimation - RE
    ML estimation of a RE model
    Estimation of a FE model

1 Advantages of Panel (Longitudinal) Data

Panel data:

1. overcome the shortcomings of both cross-sectional and time-series data:
   (a) Cross-sectional data do not provide dynamic information.
   (b) Time-series variables tend to move collinearly. It is difficult to separate micro- from macro-dynamic effects. Estimation of distributed lag models relies on strong assumptions without firm empirical justification.
2. allow for more complicated behavioral hypotheses;
3. make it possible to control for the effects of missing or unobserved variables;
4. provide micro foundations for aggregate data analysis;
5. can simplify inference if observations are IID across cross-sectional units.

Part I Preliminaries

2 Some Common Estimators

2.1 Ordinary Least Squares (OLS)

Consider the general linear model

$$Y = X\beta + V, \qquad (1)$$

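where

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}, \quad X = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_N' \end{pmatrix}, \quad V = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{pmatrix}.$$

The OLS estimators of the unknown parameters $\beta$ and $\sigma^2 = \mathrm{Var}[v]$ are defined as

$$\hat\beta = \arg\min_{\beta} \, (Y - X\beta)'(Y - X\beta),$$

which can (in the case of a linear model) be rewritten as

$$\hat\beta = (X'X)^{-1} X'Y,$$

and

$$s^2 = (N - K)^{-1} \hat V'\hat V,$$

where $\hat V = Y - X\hat\beta$ denotes the vector of OLS residuals.

2.1.1 Small Sample Properties

Consider now the following assumptions about (1).

Assumption LS1 (FR): The $N \times K$ matrix $X$ is of rank $K$.

Assumption LS2 (MI): The disturbances $V$ are mean-independent of $X$: $E[v_i \mid x_i] = 0$.

Assumption LS3 (SD): The disturbances are homoskedastic and not autocorrelated: $E[VV' \mid X] = \sigma^2 I_N$.

Assumption LS4 (NS): The matrix $X$ is non-stochastic.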
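As a minimal sketch of the formulas above, the following NumPy snippet computes $\hat\beta = (X'X)^{-1}X'Y$ and $s^2 = (N-K)^{-1}\hat V'\hat V$ on simulated data. The sample size, the true coefficients and the error scale are arbitrary illustrative choices, not values taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: N observations, K regressors (including a constant).
N, K = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([1.0, 2.0, -0.5])            # illustrative "true" parameters
Y = X @ beta + rng.normal(scale=0.7, size=N)

# OLS estimator: solve the normal equations X'X b = X'Y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Residuals and the degrees-of-freedom corrected variance estimator s^2.
V_hat = Y - X @ beta_hat
s2 = V_hat @ V_hat / (N - K)

# Estimated covariance matrix s^2 (X'X)^{-1} of beta_hat.
var_beta_hat = s2 * np.linalg.inv(X.T @ X)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly for the point estimate; the inverse is only needed for the covariance matrix.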

Under Assumptions (LS1)-(LS3), the OLS estimators have the following exact finite sample properties:

1. Unbiasedness:
$$E[\hat\beta \mid X] = E[\hat\beta] = \beta, \qquad E[s^2 \mid X] = E[s^2] = \sigma^2.$$

2. Gauss-Markov Theorem: $\hat\beta$ is the best linear unbiased estimator of $\beta$, with
$$\mathrm{Var}[\hat\beta \mid X] = \sigma^2 (X'X)^{-1}, \qquad \mathrm{Var}[\hat\beta] = \sigma^2 E[(X'X)^{-1}].$$

The following assumption allows inference in small samples.

Assumption LS5 (SD): The disturbances are normally distributed: $V \mid X \sim N(0; \sigma^2 I_N)$.

Under Assumptions (LS1)-(LS5) it holds that

$$\hat\beta \mid X \sim N\left(\beta; \sigma^2 (X'X)^{-1}\right).$$

2.1.2 Large Sample Properties

To obtain asymptotic properties, the matrix of explanatory variables needs to be well behaved, and the condition

$$\operatorname*{plim}_{N \to \infty} N^{-1} X'X = Q_{xx}, \qquad (2)$$

with $Q_{xx}$ a finite, positive definite matrix, is usually imposed. This assumption, however, is stronger than necessary and can be replaced by the Grenander conditions in case $X$ includes polynomial time series or trending variables. We also replace Assumption (LS4) about non-stochastic $X$ with the following one.

Assumption LS6 (IS): The sequence $\{(x_i, v_i)\}_{i=1}^{N}$ is IID.

Under the regularity condition (2), and under Assumptions (LS2)-(LS3) and (LS6), it holds that

$$\operatorname*{plim}_{N \to \infty} \hat\beta = \beta, \qquad \sqrt{N}\,(\hat\beta - \beta) \xrightarrow{D} N\left(0; \sigma^2 Q_{xx}^{-1}\right).$$

The most important conclusion is that, given good behavior of the regressors, asymptotic normality of the estimator does not depend on the normality of the disturbances.

2.2 Instrumental Variables (IV)

Zero correlation between $x_i$ and $v_i$ was the key assumption to guarantee the desirable properties of the OLS estimator. Here this assumption is relaxed, but the existence of an $L \times 1$ vector $z_i$ of variables that are correlated with $x_i$, but not with $v_i$, is assumed. In particular, the following assumptions are made.

Assumption IV1: The sequence $\{(x_i, z_i, v_i)\}_{i=1}^{N}$ is IID.

Assumption IV2: The disturbances $V$ are mean-independent of $Z$: $E[v_i \mid z_i] = 0$.

Under the additional regularity conditions

$$\operatorname*{plim}_{N \to \infty} N^{-1} Z'Z = Q_{zz}, \qquad \operatorname*{plim}_{N \to \infty} N^{-1} Z'X = Q_{zx},$$

with $Q_{zz}$ a finite, positive definite matrix and $Q_{zx}$ a finite matrix of full column rank $K$, the instrumental variable estimator

$$\hat\beta_{IV} = \left(X'Z (Z'Z)^{-1} Z'X\right)^{-1} X'Z (Z'Z)^{-1} Z'Y,$$

has the following asymptotic properties:

$$\operatorname*{plim}_{N \to \infty} \hat\beta_{IV} = \beta, \qquad \sqrt{N}\,(\hat\beta_{IV} - \beta) \xrightarrow{D} N\left(0; \sigma^2 \left(Q_{xz} Q_{zz}^{-1} Q_{zx}\right)^{-1}\right),$$

with $Q_{xz} = Q_{zx}'$.

2.3 Generalized Least Squares (GLS)

Consider again the general linear model

$$Y = X\beta + V,$$

but assume now that the disturbances violate the homoskedasticity and uncorrelatedness assumptions, i.e.

$$E[VV' \mid X] = \sigma^2 \Omega,$$

where $\Omega$ is a positive definite matrix. If both $\operatorname{plim}_{N \to \infty} N^{-1} X'X$ and $\operatorname{plim}_{N \to \infty} N^{-1} X'\Omega X$ are positive definite matrices, then

$$\operatorname*{plim}_{N \to \infty} \hat\beta_{OLS} = \beta.$$

If $\Omega$ were known, the best linear unbiased estimator (BLUE) of $\beta$ would be given by

$$\hat\beta_{GLS} = \{X'\Omega^{-1}X\}^{-1} \{X'\Omega^{-1}Y\}.$$

We can first obtain $\hat\beta_{OLS}$ and then, using the OLS residuals, estimate the unknown parameters of $\Omega$. We can then decompose $\Omega$ into its eigenvectors and eigenvalues,

$$\Omega = E \Lambda E',$$

where the columns of $E$ are the characteristic vectors of $\Omega$, and $\Lambda$ is a diagonal matrix containing the characteristic roots $\lambda_i$ of $\Omega$. Let $\Lambda^{-1/2}$ be the diagonal matrix containing $\lambda_i^{-1/2}$ and define $P = \Lambda^{-1/2} E'$. Pre-multiplication of the model by $P$,

$$PY = PX\beta + PV,$$

results in a transformed model for which the disturbances are again homoskedastic and not autocorrelated,

$$E[PVV'P' \mid X] = \sigma^2 P \Omega P' = \sigma^2 \Lambda^{-1/2} E' E \Lambda E' E \Lambda^{-1/2} = \sigma^2 I,$$

so the classical regression model applies to the transformed model and thus

$$\hat\beta_{GLS} = \{X'P'PX\}^{-1} \{X'P'PY\} = \{X'\Omega^{-1}X\}^{-1} \{X'\Omega^{-1}Y\} \overset{a}{\sim} N\left(\beta; \sigma^2 (X'\Omega^{-1}X)^{-1}\right).$$

The feasible GLS estimator substitutes $\hat\Omega$ for the unknown $\Omega$. Note that we only need a consistent estimator of $\Omega$ for the efficiency properties of $\hat\beta_{GLS}$ to carry over to $\hat\beta_{FGLS}$.

2.4 (Generalized) Method of Moments (GMM)

Using GMM, a number of orthogonality restrictions are formulated. The parameter estimates satisfy these restrictions as closely as possible. Consider the model

$$y_i = f(x_i, \beta) + u_i, \qquad z_i = g(x_i), \qquad E[z_i u_i] = 0,$$

where $\beta$ is a $K \times 1$ vector of parameters and $z_i$ is an $L \times 1$ vector of instruments. The sample counterparts of the population moment restrictions $E[z_i u_i] = 0$ are given by

$$c_N(b) = N^{-1} \sum_{i=1}^{N} z_i \{y_i - f(x_i, b)\} = 0.$$

In more general terms, the moment equations have the general form

$$E[\psi(w, \beta)] = 0,$$

and their sample counterparts are given by

$$c_N(b) = N^{-1} \sum_{i=1}^{N} \psi(w_i, b),$$

with $w_i = (y_i, x_i', z_i')'$ with duplicate entries removed. The GMM estimator of $\beta$ is defined by

$$\hat\beta_{GMM} = \arg\min_{b} J_N(b) = \arg\min_{b} \, c_N(b)' W_N c_N(b),$$

where $W_N$ is some weight matrix. In fact, a set of moment conditions defines a whole family of estimators $\hat\beta_{GMM}(W_N)$, depending on the actual choice of the weight matrix $W_N$. When $K = L$, the system is just identified and $\hat\beta_{GMM}$ is the unique solution to $c_N(b) = 0$.

Properties

Under certain conditions (Hansen, 1982), we have:

1. Strong consistency: $\hat\beta_{GMM} \xrightarrow{as} \beta$, for $N \to \infty$;

2. Asymptotic normality: $\sqrt{N}\,(\hat\beta_{GMM} - \beta) \xrightarrow{D} N\left(0; \Sigma_{\sqrt{N}\hat\beta_{GMM}}\right)$, where

$$\Sigma_{\sqrt{N}\hat\beta_{GMM}} = (D'WD)^{-1} D'WSWD \, (D'WD)^{-1},$$

$$D(\beta) = \operatorname*{plim}_{N \to \infty} \left.\frac{\partial c_N(b)}{\partial b'}\right|_{b=\beta}, \qquad W = \operatorname*{plim}_{N \to \infty} W_N, \qquad S = \lim_{N \to \infty} N^{-1} \sum_{i=1}^{N} E[z_i u_i^2 z_i'].$$
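The definitions above can be illustrated for the linear case $f(x_i, \beta) = x_i'\beta$, where minimizing $c_N(b)' W_N c_N(b)$ gives $b = (X'Z W_N Z'X)^{-1} X'Z W_N Z'Y$. The sketch below is only an illustration under assumed simulated data: the data generating process, the helper name `linear_gmm` and all numerical values are hypothetical. It runs a first step with $W_N = (Z'Z/N)^{-1}$ and a second step with the inverse of the estimated $S$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: one endogenous regressor, L = 3 instruments.
N = 2000
Z = rng.normal(size=(N, 3))
e = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5, -0.5]) + 0.8 * e + rng.normal(size=N)
u = e * (1.0 + 0.5 * np.abs(Z[:, 0]))   # heteroskedastic, but E[z_i u_i] = 0
beta = 1.5                               # illustrative "true" parameter
X = x[:, None]
Y = X[:, 0] * beta + u

def linear_gmm(X, Z, Y, W):
    """Minimizer of c_N(b)' W c_N(b) when the moment function is linear in b."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ Y))

# Step 1: consistent first-step estimate with W_N = (Z'Z/N)^{-1}.
W1 = np.linalg.inv(Z.T @ Z / N)
b1 = linear_gmm(X, Z, Y, W1)

# Step 2: estimate S from first-step residuals and use W_N = S_hat^{-1}.
u1 = Y - X @ b1
S_hat = (Z * u1[:, None] ** 2).T @ Z / N
W2 = np.linalg.inv(S_hat)
b2 = linear_gmm(X, Z, Y, W2)

# Estimated variance of b2 based on (D' S^{-1} D)^{-1} / N, with D_hat = Z'X/N.
D_hat = Z.T @ X / N
avar = np.linalg.inv(D_hat.T @ W2 @ D_hat) / N
```

The sign of $D$ does not matter here because it enters the variance formula only through quadratic forms.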

In the special case that $f(x_i, \beta)$ is linear in $\beta$, there is a closed form expression for $\hat\beta_{GMM}$,

$$\hat\beta_{GMM} = (X'Z W_N Z'X)^{-1} X'Z W_N Z'Y,$$

and, up to sign and the $N^{-1}$ scaling, $D$ corresponds to $Z'X$.

Choice of $W_N$

1. The asymptotically optimal GMM estimator minimizes $\Sigma_{\sqrt{N}\hat\beta_{GMM}}$ as a function of $W_N$. It can be shown that this optimum is reached for any choice of $W_N$ for which $W = S^{-1}$, resulting in an asymptotic covariance for $\hat\beta_{OGMM}$ of
$$\Sigma_{\sqrt{N}\hat\beta_{OGMM}} = (D'S^{-1}D)^{-1}.$$
This optimal estimator is typically computed in two steps. In a first step, a consistent estimate of $\beta$ is obtained (by OLS, IV, GMM, ...), which allows estimation of $S^{-1}$ by
$$\hat S^{-1} = \left(N^{-1} \sum_{i=1}^{N} z_i \hat u_i^2 z_i'\right)^{-1}.$$
In the second step the asymptotically optimal GMM estimator is computed. While these asymptotic results only require the first step estimator to be consistent, finite sample properties improve when the first step weight matrix is closer to the optimal one.

2. OLS is identical to exactly identified GMM with $W_N = I$ and $z_i = x_i$.

3. IV is a GMM estimator with $W_N = (Z'Z)^{-1}$, a weight matrix that is optimal in case $u_i \sim IID(0; \sigma^2)$.

2.5 Maximum Likelihood (ML)

2.5.1 Estimator

Suppose the RVs $x_1, \ldots, x_N$ have a joint density function $f(x_1, \ldots, x_N \mid \theta)$, which can also be considered a function of the parameter vector $\theta$. In this capacity it is called the likelihood

$$L(\theta) = f(x_1, \ldots, x_N \mid \theta).$$

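The maximum likelihood estimator (MLE) is that value of $\theta$ that makes the observed data most probable,

$$\hat\theta = \arg\max_{\theta} L(\theta).$$

If the $x_i$ are assumed to be IID, their joint density is the product of the marginal densities and the likelihood simplifies to

$$L(\theta) = \prod_{i=1}^{N} f(x_i \mid \theta),$$

maximization of which is usually simplified by taking the natural logarithm. Since $\ln(\cdot)$ is a monotonically increasing function, the maximum likelihood estimator also maximizes the log-likelihood

$$l(\theta) = \sum_{i=1}^{N} \ln f(x_i \mid \theta).$$

The information matrix of $\theta$ is defined as

$$I(\theta) = -E\left[\frac{\partial^2 l(\theta)}{\partial\theta \, \partial\theta'}\right].$$

Under correct specification of the model it is equal to

$$E\left[\left(\frac{\partial l(\theta)}{\partial\theta}\right) \left(\frac{\partial l(\theta)}{\partial\theta}\right)'\right].$$

2.5.2 Regularity conditions

This informal treatment of conditions is from Greene (2000, p. 127), where $l(x_i, \theta) = \ln f(x_i \mid \theta)$ denotes the contribution of observation $i$ to the log-likelihood.

1. The first three derivatives of $l(x_i, \theta)$ with respect to $\theta$ are finite for all values of $\theta$ and for almost all $x_i$, which ensures the finite variance [1] of $\partial l(\theta) / \partial\theta$.
2. The conditions for the existence of $E[\partial l(\theta) / \partial\theta]$ and of the information matrix $I(\theta)$ are met.
3. For all $\theta$, $|\partial^3 l(x_i, \theta) / \partial\theta_k \partial\theta_l \partial\theta_m|$ is less than some function $g(x_i)$ with finite expectation.

[1] In addition to the existence of a second degree Taylor approximation of $l(x_i, \theta)$.

By means of the above conditions, the following properties are obtained: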

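1. $l(x_i, \theta)$, $g_i = \partial l(x_i, \theta) / \partial\theta$ and $H_i = \partial^2 l(x_i, \theta) / \partial\theta \, \partial\theta'$, for $i = 1, \ldots, N$, are all random samples of random variables.
2. $E[g_i] = 0$.
3. $\mathrm{Var}[g_i] = -E[H_i]$.

These should facilitate the intuition behind the following properties.

2.5.3 Properties

Under the regularity conditions stated in sub-subsection 2.5.2, the MLE has the following asymptotic properties:

1. Consistency: $\operatorname{plim}_{N \to \infty} \hat\theta_{ML} = \theta$.
2. Normality: $\hat\theta_{ML} \overset{a}{\sim} N\left(\theta; I(\theta)^{-1}\right)$, with
$$I(\theta) = -E\left[\frac{\partial^2 \ln L(\theta)}{\partial\theta \, \partial\theta'}\right] = -E\left[\frac{\partial^2 l(\theta)}{\partial\theta \, \partial\theta'}\right].$$
3. Efficiency: $\hat\theta_{ML}$ reaches the Cramér-Rao lower bound for consistent estimators.
4. Invariance: the MLE of $\gamma = g(\theta)$ is given by $g(\hat\theta_{ML})$.

2.6 Maximum Entropy (ME)

2.6.1 Entropy

Setting

Consider an experiment with the possible outcomes $y_1, \ldots, y_N$, from which we want to retrieve the unknown and unobservable probabilities

$$p_1, \ldots, p_K, \qquad \sum_{k=1}^{K} p_k = 1, \qquad p_k \geq 0,$$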

which are assumed to represent the data generating process (DGP). We assume, in addition, that

$$y = Xp,$$

where $X$ is an $(N \times K)$ non-invertible matrix with $K > N$ (Golan, Judge, and Miller, 1996). Prior to execution, there is uncertainty about the outcome. Entropy is a measure of this uncertainty. It should satisfy the following requirements (Kapur, 1989, p. 2):

1. It should be a continuous function $E_K(p_1, \ldots, p_K)$ of $p_1, \ldots, p_K$, invariant to permutation of its arguments.
2. Addition of an impossible outcome: $E_{K+1}(p_1, \ldots, p_K, 0) = E_K(p_1, \ldots, p_K)$.
3. Certain outcome: $E_K(1, 0_{K-1}) = 0$.
4. Maximum uncertainty: $\arg\max_{p_1, \ldots, p_K} E_K(p_1, \ldots, p_K) = K^{-1} 1_K$.
5. Behavior for increasing $K$: $\max E_L(\cdot) > \max E_K(\cdot)$, $L > K$.
6. For independent probability distributions: $E_{KL}(p \otimes q) = E_K(p_1, \ldots, p_K) + E_L(q_1, \ldots, q_L)$.

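Shannon's (1948) measure of uncertainty is

$$E_K = -\sum_{k=1}^{K} p_k \ln p_k,$$

where we set $p \ln p = 0$ if $p = 0$.

2.6.2 Maximum entropy principle

The maximum entropy principle aims to give as broad a distribution as possible, subject to satisfaction of the constraints, which can be formalized as

$$\max_{p_1, \ldots, p_K} E_K(p_1, \ldots, p_K) \quad \text{s.t.} \quad y = Xp, \quad 1_K'p = 1, \quad p \geq 0.$$

This will result in the vector $p$ that can generate the greatest number of outcomes consistent with the data. The analytical solution of this problem can be obtained by maximizing the Lagrangian function

$$\mathcal{L} = -p'\ln p + \lambda'(y - Xp) + \mu\,(1 - 1_K'p),$$

with optimality conditions

$$\frac{\partial \mathcal{L}}{\partial p} = -\ln p - 1_K - X'\lambda - \mu 1_K = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda} = y - Xp = 0, \qquad \frac{\partial \mathcal{L}}{\partial \mu} = 1 - 1_K'p = 0.$$

We can solve for $\hat p$ as a function of $\hat\lambda$ as

$$\hat p_k = \frac{\exp(-x_k'\hat\lambda)}{\sum_{j=1}^{K} \exp(-x_j'\hat\lambda)}, \qquad k = 1, \ldots, K,$$

where $x_k$ denotes the $k$-th column of $X$. Because $\hat p$ is a function of $\hat\lambda$, the maximum entropy distribution does not have a closed form solution. We must use numerical optimization techniques to compute $\hat p$.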
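One way to carry out this numerical step, sketched here under assumptions, is to apply Newton's method to the constraint $Xp(\lambda) = y$, using the exponential form of $p(\lambda)$ implied by the first-order conditions. The function name `max_entropy` and the dice example (six outcomes with a prescribed mean of 4.5) are illustrative and not part of the notes. Plain Newton steps without line search are assumed to be adequate for such a small, well-behaved problem.

```python
import numpy as np

def max_entropy(X, y, tol=1e-12, max_iter=100):
    """Maximize -p'ln(p) subject to X p = y and sum(p) = 1.

    Uses p_k proportional to exp(-x_k' lam) and Newton steps on lam.
    X has shape (n_constraints, K); y has shape (n_constraints,).
    """
    lam = np.zeros(X.shape[0])
    for _ in range(max_iter):
        a = -X.T @ lam
        w = np.exp(a - a.max())          # shift by the max for stability
        p = w / w.sum()                  # normalization handles sum(p) = 1
        g = X @ p - y                    # violation of the data constraints
        if np.max(np.abs(g)) < tol:
            break
        # d(X p)/d lam' = -Cov_p(x), so the Newton update is lam + Cov^{-1} g.
        cov = (X * p) @ X.T - np.outer(X @ p, X @ p)
        lam = lam + np.linalg.solve(cov, g)
    return p, lam

# Hypothetical example: a die with faces 1..6 and a required mean of 4.5.
X = np.arange(1.0, 7.0)[None, :]
y = np.array([4.5])
p_hat, lam_hat = max_entropy(X, y)
```

With a single moment constraint above the uniform mean of 3.5, the resulting probabilities increase with the face value, and the multiplier is negative under the sign convention of the Lagrangian above.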

For continuous probability distributions we maximize

$$-\int \ln f(x) \, f(x) \, dx,$$

subject to

$$\int f(x) \, dx = 1, \qquad \int g_r(x) f(x) \, dx = \bar g_r,$$

where the functions $g_r(\cdot)$ are for example $x$, $(x - \mu_x)^2$, and the $\bar g_r$ are their observed sample counterparts.

2.6.3 Use

Maximum entropy is useful in the following situations:
- more unknowns than observations;
- non-invertibility of $X$, due to linearly dependent columns;
- inconsistent data.

In these cases, traditional estimation procedures can lead to:
- arbitrarily fixed parameters;
- undefined solutions;
- unstable estimates.

3 Some Common Tests

The most commonly used test procedures are reviewed in the next subsections.

3.1 Likelihood ratio test

Let $\theta$ be a vector of parameters to be estimated (by ML) and let $H_0$ be a restriction of some kind on $\theta$, restricting the parameter vector by $c$ degrees of freedom [3].

[3] For example, $\theta_1 = 0$ is a restriction of degree 1. $\theta_1 = \theta_2 = 0$, on the other hand, is a restriction of degree 2, while $\theta_1 = \theta_2$ is a restriction of degree only 1.

Let $\hat\theta_U$ and $\hat\theta_R$ be

the ML estimates of the unrestricted and the restricted model respectively, and let $\hat L_U$ and $\hat L_R$ be the corresponding likelihood functions, evaluated at their corresponding estimates. The Likelihood Ratio (LR) is defined by

$$\lambda = \frac{\hat L_R}{\hat L_U}.$$

Under the regularity conditions given in sub-subsection 2.5.2 and under $H_0: \hat\theta_R = \hat\theta_U$,

$$-2 \ln \lambda = -2 \{\ln \hat L_R - \ln \hat L_U\} \xrightarrow{D} \chi^2_c.$$

Note that we need to estimate both the restricted and the unrestricted model in order to be able to perform this test.

3.2 Lagrange multiplier (LM) test

The LM (or efficient score) test is purely based on the restricted model, i.e. under $H_0$. It is based on the maximization of $l(\theta)$, subject to the set of $c$ constraints $c(\theta) = q$. The Lagrangian function for this problem is

$$l^*(\theta) = l(\theta) + \lambda'[c(\theta) - q],$$

with $\lambda$ a $c \times 1$ vector of Lagrangian multipliers. The constrained solution $\hat\theta_R$ is the root of

$$\frac{\partial l^*(\theta)}{\partial\theta} = \frac{\partial l(\theta)}{\partial\theta} + \left(\frac{\partial c(\theta)}{\partial\theta'}\right)' \lambda = 0, \qquad \frac{\partial l^*(\theta)}{\partial\lambda} = c(\theta) - q = 0.$$

If the restrictions are valid, $\hat\theta_R \approx \hat\theta_U$, or $\lambda \approx 0_c$, and we can thus base a test on the vector

$$\left.\frac{\partial l(\theta)}{\partial\theta}\right|_{\theta = \hat\theta_R} \approx 0.$$

To that end, define the LM statistic as

$$LM = \left(\left.\frac{\partial l(\theta)}{\partial\theta}\right|_{\theta = \hat\theta_R}\right)' I(\hat\theta_R)^{-1} \left(\left.\frac{\partial l(\theta)}{\partial\theta}\right|_{\theta = \hat\theta_R}\right).$$

Under $H_0: \hat\theta_R = \hat\theta_U$,

$$LM \xrightarrow{D} \chi^2_c.$$

Note that we only need to estimate the restricted model in order to be able to perform this test.

3.3 Wald test

As before, let $\hat\theta_U$ be the ML estimate of the unrestricted model. We can now test the set of restrictions $c(\theta) = q$ by means of the Wald statistic

$$W = \left(c(\hat\theta_U) - q\right)' \left[ C(\hat\theta_U) \, I(\hat\theta_U)^{-1} C(\hat\theta_U)' \right]^{-1} \left(c(\hat\theta_U) - q\right), \qquad C(\hat\theta_U) = \left.\frac{\partial c(\theta)}{\partial\theta'}\right|_{\theta = \hat\theta_U}.$$

Under $H_0: \hat\theta_R = \hat\theta_U$,

$$W \xrightarrow{D} \chi^2_c.$$

Note that we only need to estimate the unrestricted model in order to be able to perform this test.

3.4 Hausman Test

Hausman (1978) suggests [4] testing the difference between two estimators, $\hat\beta_E$ being consistent and efficient under $H_0$, but inconsistent under $H_A$, and $\hat\beta_C$ being consistent under both $H_0$ and $H_A$.

[4] Although this principle was used previously by Durbin (1954) and Wu (1973), Hausman (1978) was the first to formalize it in such a general way (MacKinnon, 1992).

Because an inefficient estimator like $\hat\beta_C$ can always be written as the sum of an efficient estimator like $\hat\beta_E$ and a random vector that is uncorrelated with $\hat\beta_E$, the covariance matrix of

$\hat\beta_C - \hat\beta_E$ is simply the difference of their covariance matrices. The proposed test is thus

$$H = (\hat\beta_C - \hat\beta_E)' \left\{\mathrm{Var}[\hat\beta_C] - \mathrm{Var}[\hat\beta_E]\right\}^{-1} (\hat\beta_C - \hat\beta_E).$$

Under $H_0$, $H \xrightarrow{D} \chi^2_c$, with $c$ the degrees of freedom of the restriction.

3.5 Sargan-Hansen J-Test

GMM estimators are motivated by setting sample analogues of population orthogonality restrictions as close to zero as possible. If the model is just identified, we achieve this exactly, and nothing is left to test. If the model is over-identified, we can test whether the over-identifying restrictions are set close enough to zero to be consistent with their validity, when evaluated at the optimal GMM parameter estimates. If they are small enough, we do not reject the validity of the moment conditions used. Otherwise we reject. Very loosely, this is like testing for correlation between the model residuals and (a subset of) the instruments used.

In the model above,

$$y_i = f(x_i, \beta) + u_i, \qquad z_i = g(x_i), \qquad E[z_i u_i] = 0,$$

with $\beta$ a $K \times 1$ vector of parameters and $z_i$ an $L \times 1$ vector of instruments, the GMM estimator of $\beta$ was defined as

$$\hat\beta_{GMM} = \arg\min_{b} J_N(b).$$

The main result of interest here is that, under $H_0: E[z_i u_i] = 0$, the minimized optimal GMM

criterion $N J_N(\hat\beta_{OGMM})$ is asymptotically distributed as a central chi-square distribution [5] with $L - K$ degrees of freedom,

$$N J_N(\hat\beta_{OGMM}) = N c_N(\hat\beta_{OGMM})' \hat S^{-1} c_N(\hat\beta_{OGMM}) \xrightarrow{D} \chi^2_{L-K},$$

with $\hat\beta_{OGMM}$ an optimal estimator and $\hat S$ a consistent estimate of $S$.

[5] Under the null that all instruments are valid.

The intuition of the argument is as follows. Under the conditions for $\hat\beta_{OGMM}$ to be asymptotically normally distributed, it holds that

$$\sqrt{N} \, \hat C' c_N(\beta) \xrightarrow{D} N(0; I_L),$$

where $\hat S^{-1} = \hat C \hat C'$. Defining $\hat G = \hat C' \hat D$, we have that

$$\sqrt{N}\,(\hat\beta_{OGMM} - \beta) = -(\hat G'\hat G)^{-1} \hat G' \sqrt{N} \, \hat C' c_N(\beta) + o_p(1),$$

and, using a first-order expansion for $c_N(\hat\beta_{OGMM})$ around $\beta$,

$$h \equiv \sqrt{N} \, \hat C' c_N(\hat\beta_{OGMM}) = \sqrt{N} \, \hat C' c_N(\beta) + \hat C' \hat D \sqrt{N}\,(\hat\beta_{OGMM} - \beta) + o_p(1) = \left\{I_L - \hat G (\hat G'\hat G)^{-1} \hat G'\right\} \sqrt{N} \, \hat C' c_N(\beta) + o_p(1).$$

Since the limit of $\{I_L - \hat G (\hat G'\hat G)^{-1} \hat G'\}$ is idempotent and has rank $L - K$, $h'h \xrightarrow{D} \chi^2_{L-K}$.

When the value of the Sargan/Hansen test statistic is too large, relative to critical values of the appropriate $\chi^2_{L-K}$ distribution, reject the validity of the set of moment conditions used.

Consider now a partition of the $R = R_A + R_B$ moment conditions as

$$E[\psi_A(w, \beta)] = 0, \qquad E[\psi_B(w, \beta)] = 0,$$

in $R_A$ and $R_B$ moment conditions, where $\beta$ is a $K \times 1$ vector of parameters, with $R_A > K$. We wish to test the restrictions $E[\psi_B(w, \beta)] = 0$, taking $E[\psi_A(w, \beta)] = 0$ as given. Optimal GMM using the restricted set of moment conditions results in $\hat\beta_A$, while $\hat\beta$ is the estimator when all moment conditions are used. Similarly, $N J_{NA}(\hat\beta_A)$ is the minimized optimal GMM criterion using only the restricted moment set, while $N J_N(\hat\beta)$ is the minimized optimal GMM criterion using all moment conditions. From the previous result we know that

$$N J_{NA}(\hat\beta_A) \xrightarrow{D} \chi^2_{R_A - K}, \qquad N J_N(\hat\beta) \xrightarrow{D} \chi^2_{R - K}.$$

It can now also be shown (Blundell and Bond, 1998) that

$$J_d = N J_N(\hat\beta) - N J_{NA}(\hat\beta_A) \xrightarrow{D} \chi^2_{R_B}.$$

Furthermore, $J_d$ is asymptotically independent of $N J_{NA}(\hat\beta_A)$.

4 Models, parameters of interest and the incidental parameters problem

4.1 Review of some nonlinear models

4.1.1 Discrete choice

Consider a binary outcome dependent variable $y_i \in \{0, 1\}$, which is thought to depend on some covariates $x_i$, such that

$$\Pr[y_i = 1 \mid x_i] = F(x_i'\beta).$$

The log-likelihood for such a model is given by

$$l(\beta) = \sum_{i=1}^{N} \left\{y_i \ln F(x_i'\beta) + (1 - y_i) \ln[1 - F(x_i'\beta)]\right\},$$

with gradient with respect to $\beta$ given by

$$\frac{\partial l(\beta)}{\partial\beta} = \sum_{i=1}^{N} \left[\frac{y_i}{F(x_i'\beta)} f(x_i'\beta) - \frac{1 - y_i}{1 - F(x_i'\beta)} f(x_i'\beta)\right] x_i,$$

where $f(\cdot)$ denotes the derivative of $F(\cdot)$. To model the above relationship sensibly, the function $F(\cdot)$ should be monotonically increasing from 0 to 1. Common choices are $F(x_i'\beta) = \Phi(x_i'\beta)$ in the probit model and $F(x_i'\beta) = \exp(x_i'\beta) / [1 + \exp(x_i'\beta)]$ in the logit model.

4.1.2 Censoring

A general formulation of the censored regression or tobit model in terms of an index function is

$$y_i^* = \alpha'x_i + \varepsilon_i, \qquad y_i = y_i^* \, 1(y_i^* > 0).$$

Assuming a normal distribution for $\varepsilon$, the log-likelihood of the above model is given by

$$l(\alpha) = \sum_{y_i > 0} -\frac{1}{2} \left[\ln(2\pi) + \ln \sigma^2 + \left(\frac{y_i - \alpha'x_i}{\sigma}\right)^2\right] + \sum_{y_i = 0} \ln\left(1 - \Phi\left(\frac{\alpha'x_i}{\sigma}\right)\right),$$

where the first term is identical to the log-likelihood of a linear model, and the second term to the log-likelihood of a probit model. It is a mixture of discrete and continuous distributions.

4.1.3 Selection

Consider the following first-stage model, where the selection is determined by the variable

$$z_i = 1(z_i^* > 0), \qquad z_i^* = \alpha'w_i + u_i,$$

and the equation of interest is given by

$$y_i = \beta'x_i + \varepsilon_i.$$

Assuming

$$\begin{pmatrix} u \\ \varepsilon \end{pmatrix} \sim N\left(0; \begin{pmatrix} \sigma_u^2 & \rho\sigma_u\sigma_\varepsilon \\ \rho\sigma_u\sigma_\varepsilon & \sigma_\varepsilon^2 \end{pmatrix}\right),$$

we have for the observed outcomes

$$E[y_i \mid z_i = 1] = \beta'x_i + \rho\sigma_\varepsilon \lambda,$$

where $\lambda = \phi(q) / \Phi(q)$, with $q = \alpha'w_i / \sigma_u$.

Consider for a moment the nonlinear specification

$$y_i = f_1(x_i) + \varepsilon_i, \qquad E[\varepsilon_i \mid x_i] = 0, \qquad z_i = 1\left(z_i^* = f_2(x_i) + u_i > 0\right),$$

which implies that

$$E[y_i \mid x_i] = f_1(x_i), \qquad E[y_i \mid x_i, z_i = 1] = f_1(x_i) + E[\varepsilon_i \mid x_i, f_2(x_i) + u_i > 0].$$

The crucial point for identification of $f_1(x_i)$ is that

$$E[\varepsilon_i \mid x_i, f_2(x_i) + u_i > 0] = E[\varepsilon_i \mid f_2(x_i), f_2(x_i) + u_i > 0],$$

i.e. the conditional expectation of $\varepsilon_i$ depends on $x_i$ through $f_2(x_i)$ only.

4.1.4 Count data

A count data model captures the number of occurrences of an event per period. A popular model is the Poisson regression model, in which the probability of each outcome is given by

$$\Pr[y_i = k] = \frac{\exp(-\lambda_i) \lambda_i^k}{k!}, \qquad k \in \mathbb{N},$$

and the expected number of events per period is

$$E[y_i] = \lambda_i,$$

which coincides with the variance of the number of events per period. The $\lambda_i$ can be written in terms of covariates, which commonly takes the form

$$\lambda_i = \exp(\gamma'x_i).$$

The log-likelihood function is given by

$$l(\gamma; y) = \sum_{i=1}^{N} \left\{-\exp(\gamma'x_i) + y_i \gamma'x_i - \ln(y_i!)\right\},$$

with gradient

$$\frac{\partial l(\gamma)}{\partial\gamma} = \sum_{i=1}^{N} (y_i - \lambda_i) x_i$$

and Hessian

$$\frac{\partial^2 l(\gamma)}{\partial\gamma \, \partial\gamma'} = -\sum_{i=1}^{N} \lambda_i x_i x_i'.$$

The Poisson distribution can be generalized by introducing an unobserved effect into the conditional mean,

$$E[y_i \mid x_i, \varepsilon_i] = \exp(\gamma'x_i + \varepsilon_i) = \lambda_i u_i = \mu_i,$$

with $u_i = \exp(\varepsilon_i)$. Conditional on $x_i$ and $u_i$, $y_i$ is Poisson distributed,

$$\Pr[y_i = k \mid x_i, \varepsilon_i] = \frac{\exp(-\mu_i) \mu_i^k}{k!}.$$

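The distribution conditional only on $x_i$ is obtained by integrating out [6] $u_i$,

$$\Pr[y_i = k \mid x_i] = \int_0^\infty \frac{\exp(-\lambda_i u_i) (\lambda_i u_i)^k}{k!} \, g(u_i) \, du_i.$$

[6] $\Pr[y_i \mid x_i] = E_{u_i}\left[\Pr[y_i \mid x_i, u_i]\right]$.

Assuming now that $u_i$ follows a $\Gamma$-distribution, i.e.

$$g(u_i) = \frac{\beta^\alpha}{\Gamma(\alpha)} \exp(-\beta u_i) \, u_i^{\alpha - 1},$$

and normalizing $E[u_i] = \alpha / \beta = 1$, we obtain $\Pr[y_i \mid x_i]$ as

$$\begin{aligned}
\Pr[y_i \mid x_i] &= \frac{\lambda_i^{y_i} \alpha^\alpha}{\Gamma(y_i + 1) \Gamma(\alpha)} \int_0^\infty \exp(-[\lambda_i + \alpha] u_i) \, u_i^{y_i + \alpha - 1} \, du_i \\
&= \frac{\lambda_i^{y_i} \alpha^\alpha}{\Gamma(y_i + 1) \Gamma(\alpha)} \frac{1}{(\lambda_i + \alpha)^{\alpha + y_i}} \int_0^\infty \exp(-[\lambda_i + \alpha] u_i) \, [\lambda_i + \alpha]^{y_i + \alpha - 1} u_i^{y_i + \alpha - 1} \, [\lambda_i + \alpha] \, du_i \\
&= \frac{\lambda_i^{y_i} \alpha^\alpha}{\Gamma(y_i + 1) \Gamma(\alpha)} \frac{1}{(\lambda_i + \alpha)^{\alpha + y_i}} \int_0^\infty \exp(-t) \, t^{(y_i + \alpha) - 1} \, dt \\
&= \frac{\lambda_i^{y_i} \alpha^\alpha \, \Gamma(y_i + \alpha)}{\Gamma(y_i + 1) \Gamma(\alpha) (\lambda_i + \alpha)^{\alpha + y_i}} \\
&= \frac{\Gamma(y_i + \alpha)}{\Gamma(y_i + 1) \Gamma(\alpha)} \left(\frac{\lambda_i}{\lambda_i + \alpha}\right)^{y_i} \left(\frac{\alpha}{\lambda_i + \alpha}\right)^{\alpha},
\end{aligned}$$

where the third line uses the substitution $t = (\lambda_i + \alpha) u_i$. This is one possible formulation of the negative binomial distribution, which has $E[y_i \mid x_i] = \lambda_i$ and $\mathrm{Var}[y_i \mid x_i] = (\lambda_i / \alpha)(\alpha + \lambda_i)$.

4.2 Policy parameters

The quantity we are usually interested in is the effect of $x$ on $y$. In the linear model

$$y = x'\beta + \eta + \varepsilon,$$

this quantity is equal to $\beta$, identically for all individuals. In binary choice models, $\beta$ conveys information on the relative impact of $x$ on $\Pr[y = 1 \mid x, \eta]$. The magnitude of the effect of a change in one variable $x_1$ on $\Pr[y = 1 \mid x_1, x_2, \eta]$ now depends on the whole vector $(x_1, x_2, \eta)$. For a continuous $x_1$ this impact is given by

$$\frac{\partial}{\partial x_1} \Pr[y = 1 \mid x, \eta] = \frac{\partial}{\partial x_1} F(x'\beta + \eta) = \beta_1 f(x'\beta + \eta),$$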

and for a discrete variable by

$$\Pr[y = 1 \mid x_1 = b, x_2, \eta] - \Pr[y = 1 \mid x_1 = a, x_2, \eta] = F(b\beta_1 + x_2'\beta_2 + \eta) - F(a\beta_1 + x_2'\beta_2 + \eta).$$

The parameter we are interested in is the population average of the above quantities, i.e.

$$\int \left\{F(b\beta_1 + x_2'\beta_2 + \eta) - F(a\beta_1 + x_2'\beta_2 + \eta)\right\} dG(\eta, x_2 \mid x_1 = a). \qquad (3)$$

Chamberlain (1984) proposed the mean effect for a randomly drawn individual,

$$\int \left\{F(b\beta_1 + x_2'\beta_2 + \eta) - F(a\beta_1 + x_2'\beta_2 + \eta)\right\} dG(\eta, x_2). \qquad (4)$$

Expressions (3) and (4) measure different quantities. Expression (3) is appropriate if we want to measure, say, the effect on female labor participation of having a third child ($x_1 = 3$), for those women who already have two children ($x_1 = 2$). In this case the unobserved individual effects $\eta$ will be differently distributed, compared to the total population. Expression (4) would be the average effect of going from two to three children. In case $x_1$ is a treatment, (3) is the average treatment effect on the untreated, and (4) is the average treatment effect.

The overall effect of having one extra child is given by

$$\int \left\{F((x_1 + 1)\beta_1 + x_2'\beta_2 + \eta) - F(x_1\beta_1 + x_2'\beta_2 + \eta)\right\} dG(\eta, x_1, x_2).$$

In case of a dynamic model,

$$y_{it} = 1(y_{i,t-1}\alpha + x_{it}'\beta + \eta_i + v_{it} \geq 0),$$

the long run [7] effect of a change in $x$ on $\Pr[y = 1 \mid x, \eta]$ is given by

$$\frac{\partial}{\partial x} \left\{\frac{F(x_{it}'\beta + \eta_i)}{1 - F(\alpha + x_{it}'\beta + \eta_i) + F(x_{it}'\beta + \eta_i)}\right\}.$$

[7] We have that

$$\begin{aligned}
\Pr[y_{it} = 1 \mid x, \eta] &= \Pr[y_{i,t-1}\alpha + x_{it}'\beta + \eta_i + v_{it} \geq 0 \mid x, \eta] \\
&= \Pr[\alpha + x_{it}'\beta + \eta_i + v_{it} \geq 0 \mid x, \eta] \, \Pr[y_{i,t-1} = 1 \mid x, \eta] \\
&\quad + \Pr[x_{it}'\beta + \eta_i + v_{it} \geq 0 \mid x, \eta] \, \left(1 - \Pr[y_{i,t-1} = 1 \mid x, \eta]\right).
\end{aligned}$$

In the steady state, it holds that $\Pr[y_{it} = 1 \mid x, \eta] = \Pr[y_{i,t-1} = 1 \mid x, \eta]$, and thus that

$$\Pr[y_{it} = 1 \mid x, \eta] = \frac{\Pr[x_{it}'\beta + \eta_i + v_{it} \geq 0 \mid x, \eta]}{1 - \Pr[\alpha + x_{it}'\beta + \eta_i + v_{it} \geq 0 \mid x, \eta] + \Pr[x_{it}'\beta + \eta_i + v_{it} \geq 0 \mid x, \eta]} = \frac{F(x_{it}'\beta + \eta_i)}{1 - F(\alpha + x_{it}'\beta + \eta_i) + F(x_{it}'\beta + \eta_i)}.$$

4.3 The incidental parameter problem

Consider a panel model with $N$ agents observed for $T$ periods. The model contains agent-specific individual effects $\eta_i$, together with the $K \times 1$ vector $\theta$ of parameters of interest. Application of ML to estimate all parameters usually yields inconsistent (as $N \to \infty$) estimates for the common parameters $\theta$. Since the complete parameter vector $(\eta_1, \eta_2, \ldots, \eta_N, \theta)$ has $N + K$ elements and thus grows without bound as $N \to \infty$, standard consistency theorems fail.

Example 1. Consider the model with individual-specific means $y_{it} \sim N(\eta_i, \sigma^2)$, for which the $i$-th term of the log-likelihood can be written (up to a constant) as

$$l_i = -\frac{T}{2} \ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{t=1}^{T} (y_{it} - \eta_i)^2.$$

Maximizing the likelihood results in

$$\hat\eta_i = \bar y_i, \qquad \hat\sigma^2 = \frac{1}{NT} \sum_{i,t} (y_{it} - \bar y_i)^2,$$

from which it is clear that

$$\operatorname*{plim}_{N \to \infty} \hat\sigma^2 = \sigma^2 - \frac{\sigma^2}{T}.$$

Example 2. Chamberlain (1992) studied identification of the following fixed effects binary choice model with $T = 2$,

$$y_{it} = 1(\alpha'x_{it} + \eta_i + v_{it} \geq 0),$$

where the $v_{it}$ are IID over time, independent of $x_{it}$ and $\eta_i$, with known CDF $F$ for which the usual restrictions apply. Furthermore, the vectors of explanatory variables

and of parameters are partitioned as $x_{it} = (d_t, z_{it}')'$ and $\alpha = (\beta, \gamma')'$, where $d_t$ is a time dummy such that $d_1 = 0$ and $d_2 = 1$, and $z_{it}$ is a continuous RV with bounded support. Chamberlain showed that under these conditions there exists a value for $\beta$ such that identification fails for all $\alpha$ in a neighborhood of $(\beta, 0)$.

4.4 Terminology

In the present context, fixed effects refers to a model for the effect of $x_{it}$ on $y_{it}$, given $x_{it}$ (observed) and $\eta_i$ (unobserved), whereby the distribution of $\eta_i \mid x_{it}$ is left unspecified, while random effects denotes a model in which some knowledge about the form of $\eta_i \mid x_{it}$ is assumed (Arellano, 2003).

Part II Linear Models

5 Static

$$y_{it} = x_{it}'\beta + z_i'\gamma + \eta_i + v_{it},$$

with $x_{it}$ and $z_i$ a $K_x \times 1$ and a $K_z \times 1$ vector of explanatory variables respectively, for $i = 1, \ldots, N$ and $t = 1, \ldots, T$. Stacked, the above equation reads

$$Y = X\beta + Z\gamma + H + V,$$

where

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix}, \qquad Y_j = \begin{pmatrix} y_{j1} \\ y_{j2} \\ \vdots \\ y_{jT} \end{pmatrix}.$$

5.1 Uncorrelated individual effects

$$E[x_{it} \eta_i] = 0 \qquad (5)$$

5.1.1 OLS

Assumption A: $x_{it}$ is predetermined or contemporaneously uncorrelated with the idiosyncratic error term:

$$E[x_{it} v_{it}] = 0. \qquad (6)$$

Expressions (5) and (6) imply

$$E[x_{it} u_{it}] = 0,$$

where $u_{it} = \eta_i + v_{it}$. Thus

$$\hat\beta_{OLS} \xrightarrow{as} \beta, \quad \text{for } NT \to \infty.$$

In general, under both (5) and (6), pooled estimators are consistent. Note that this rules out serial correlation in the $v_{it}$.

5.1.2 GLS

$$E[UU'] = \Omega = \begin{pmatrix} \Omega_i & & \\ & \ddots & \\ & & \Omega_i \end{pmatrix},$$

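with

$$\Omega_i = \sigma_v^2 I_T + \sigma_\eta^2 J_T,$$

where $J_T$ is a $T \times T$ matrix filled with 1s. Under (5) and (6) it holds that

$$\hat\beta_{GLS} \xrightarrow{as} \beta, \quad \text{for } NT \to \infty,$$

where

$$\hat\beta_{GLS} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}Y.$$

This estimator can be obtained by $\theta$-differencing the data, i.e.

$$y_{it}^d = y_{it} - (1 - \theta) \bar y_i,$$

where

$$\bar y_i = T^{-1} \sum_{s=1}^{T} y_{is} \qquad \text{and} \qquad \theta^2 = \frac{\sigma_v^2}{\sigma_v^2 + T\sigma_\eta^2},$$

and subsequently by estimating this transformed model by OLS.

5.1.3 FGLS

Feasible GLS uses consistent estimates of $\sigma_v^2$ and $\sigma_\eta^2$ to estimate $\theta$ consistently. It holds that

$$\hat\beta_{FGLS} \xrightarrow{as} \hat\beta_{GLS}, \quad \text{for } NT \to \infty.$$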
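A quick numerical check can make the $\theta$-differencing step concrete. The sketch below assumes the error-components covariance of one individual is the $T \times T$ matrix $\sigma_v^2 I + \sigma_\eta^2 J$ and that $\theta$ is the positive square root of $\sigma_v^2 / (\sigma_v^2 + T\sigma_\eta^2)$; the numerical values of $T$ and the variances are arbitrary. It verifies that the transformation $y_{it} - (1 - \theta)\bar y_i$ produces homoskedastic, uncorrelated errors with variance $\sigma_v^2$.

```python
import numpy as np

# Illustrative values (assumptions, not taken from the notes).
T = 5
sigma2_v, sigma2_eta = 1.3, 0.7

I_T = np.eye(T)
J_T = np.ones((T, T))

# Covariance matrix of u_it = eta_i + v_it for a single individual.
Omega_i = sigma2_v * I_T + sigma2_eta * J_T

# theta such that theta^2 = sigma_v^2 / (sigma_v^2 + T sigma_eta^2).
theta = np.sqrt(sigma2_v / (sigma2_v + T * sigma2_eta))

# Matrix form of theta-differencing: y_i -> y_i - (1 - theta) * ybar_i.
A = I_T - (1.0 - theta) * J_T / T

# Covariance of the transformed errors; this should equal sigma_v^2 * I_T.
transformed_cov = A @ Omega_i @ A.T
```

Because the transformed errors are spherical, OLS on the $\theta$-differenced data reproduces GLS, which is the claim made in the text.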

5.1.4 Between Groups Estimator

Averaging all cross-sections over time results in

$$\bar y_i = \bar x_i'\beta + z_i'\gamma + \eta_i + \bar v_i,$$

for $i = 1, \ldots, N$. OLS on this (constructed) cross-section is called the between groups estimator. Under both (5) and (6), it is consistent, but inefficient.

5.2 Correlated individual effects

$$E[x_{it} \eta_i] \neq 0 \qquad (7)$$

Under (7), OLS suffers from omitted variable bias, i.e.

$$E[\hat\beta_{OLS}] = \beta + E\left[(X'X)^{-1} X'H\right].$$

5.2.1 Within estimator or Least Squares Dummy Variables (LSDV) estimation

Assumption B: $x_{it}$ is strictly exogenous:

$$E[x_{it} v_{is}] = 0, \quad \forall s, t. \qquad (8)$$

Assumption B is crucial for fixed $T$ asymptotics. It does not allow for lagged dependent variables. The within transformation

$$\tilde y_{it} = y_{it} - \bar y_i,$$

has the effect that time-invariant variables, and thus also the individual effects, disappear:

$$\tilde z_i = z_i - \bar z_i = z_i - z_i = 0.$$

Endogeneity that is only caused by (7) is removed by within transforming the data, and, thus, OLS on within-transformed data is consistent for $N \to \infty$. This is true since

$$E[\tilde x_{it} \tilde v_{it}] = 0$$

is implied by (8).

5.2.2 First differenced OLS

First differencing:

$$\Delta y_{it} = y_{it} - y_{i,t-1}.$$

OLS on first differenced data is consistent for $N \to \infty$ if

$$E[\Delta x_{it} \Delta v_{it}] = 0,$$

which is implied by (8), but not vice versa.

5.2.3 Hausman-Taylor IV

Hausman and Taylor (1981) consider the model

$$y_{it} = x_{it}'\beta + z_i'\gamma + \mu_i + \varepsilon_{it} \qquad (i = 1, \ldots, N; \; t = 1, \ldots, T),$$

where $\beta$ and $\gamma$ are $k \times 1$ and $g \times 1$ vectors of coefficients associated with the time-varying observable variables $x_{it}$ and the time-invariant observable variables $z_i$ respectively. The disturbance $\varepsilon_{it}$ is assumed uncorrelated with the vector $(x_{it}, z_i, \mu_i)$ and has zero mean and constant variance $\sigma_\varepsilon^2$ conditional on $x_{it}$ and $z_i$.

The latent individual effect $\mu_i$ is assumed to be a time-invariant random variable, distributed independently across individuals, with variance $\sigma_\mu^2$.

The standard strategy of transforming the data, either by taking deviations from individual means or first differences, has two drawbacks:

1. time-invariant observable variables $z_i$ are eliminated;
2. under certain circumstances, the within-groups estimator is not fully efficient.

Prior information lets us distinguish those elements of $x_{it}$ and $z_i$ which are asymptotically uncorrelated with $\mu_i$ from those which are not. Partition $x_{it} = (x_{1it}', x_{2it}')'$ and $z_i = (z_{1i}', z_{2i}')'$, with $k_1$, $k_2$, $g_1$ and $g_2$ elements respectively. For fixed $T$, let

$$\operatorname*{plim}_{N \to \infty} N^{-1} \sum_{i=1}^{N} x_{1it} \mu_i = 0_{k_1}, \qquad \operatorname*{plim}_{N \to \infty} N^{-1} \sum_{i=1}^{N} x_{2it} \mu_i = c_x,$$

$$\operatorname*{plim}_{N \to \infty} N^{-1} \sum_{i=1}^{N} z_{1i} \mu_i = 0_{g_1}, \qquad \operatorname*{plim}_{N \to \infty} N^{-1} \sum_{i=1}^{N} z_{2i} \mu_i = c_z,$$

with the $k_2 \times 1$ and $g_2 \times 1$ vectors $c_x$ and $c_z$ assumed unequal to zero. Define

$$P_V = I_N \otimes T^{-1} \iota_T \iota_T', \qquad Q_V = I_{NT} - P_V;$$

then pre-multiplication with $P_V$ and $Q_V$ transforms a variable into group means and deviations therefrom respectively.

Hausman and Taylor (1981) now consider the set of instruments

$$A = (Q_V, X_1, Z_1).$$

It can be shown that $P_A = P_B$, where $B = (P_V X_1, Q_V X_1, Z_1)$. A necessary condition for the identification of $(\beta, \gamma)$ is that $k_1 \geq g_2$. A necessary and sufficient condition for the identification of $(\beta, \gamma)$ is that

$$\det\left[(X, Z)' P_A (X, Z)\right] \neq 0.$$

In addition, they propose to estimate

$$P_A \Omega^{-1/2} Y = P_A \Omega^{-1/2} X\beta + P_A \Omega^{-1/2} Z\gamma + P_A \Omega^{-1/2} (\mu_i + \varepsilon_{it}) \qquad (i = 1, \ldots, N; \; t = 1, \ldots, T),$$

where

$$\Omega = \mathrm{Var}[\mu_i + \varepsilon_{it}] = \sigma_\varepsilon^2 I_{NT} + T\sigma_\mu^2 P_V,$$

by OLS.

5.2.4 Chamberlain device

Model

$$y_{it} = x_{it}'\beta + z_i'\gamma + \eta_i + v_{it},$$

under the FE condition (7), $E[x_{it} \eta_i] \neq 0$.

Parsimonious version

Assume that

$$E[\eta_i \mid x_{i1}, \ldots, x_{iT}] = \bar x_i'\theta,$$

and estimate

$$y_{it} = x_{it}'\beta + z_i'\gamma + \bar x_i'\theta + \zeta_i + v_{it},$$

where, by construction (under the assumptions made), $E[x_{it} \zeta_i] = 0$. On top of the $K_x + K_z$ explanatory variables, $K_x$ degrees of freedom are needed for this type of correction. Note that the LSDV estimator needs $N - 1$ additional degrees of freedom.
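The parsimonious version can be illustrated on simulated data. The sketch below makes several simplifying assumptions that are not in the notes: a balanced panel, a single time-varying regressor, no time-invariant regressors $z_i$, and arbitrary parameter values. It adds the individual mean $\bar x_i$ to a pooled regression and compares the coefficient on $x_{it}$ with the within estimator and with naive pooled OLS. In a balanced panel the first two coincide numerically, since projecting $x_{it}$ off a constant and $\bar x_i$ leaves exactly the within-transformed regressor.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical balanced panel with effects correlated with the regressor.
N, T = 300, 4
beta = 1.0
eta = rng.normal(size=N)
x = 0.8 * eta[:, None] + rng.normal(size=(N, T))     # E[x_it eta_i] != 0
y = beta * x + eta[:, None] + rng.normal(size=(N, T))

xbar = x.mean(axis=1, keepdims=True)
ybar = y.mean(axis=1, keepdims=True)

# Within estimator: OLS on deviations from individual means.
x_w = (x - xbar).ravel()
y_w = (y - ybar).ravel()
beta_within = (x_w @ y_w) / (x_w @ x_w)

# Parsimonious device: pooled OLS of y on a constant, x_it and xbar_i.
R = np.column_stack([np.ones(N * T), x.ravel(), np.repeat(xbar.ravel(), T)])
coef = np.linalg.lstsq(R, y.ravel(), rcond=None)[0]
beta_device, theta_hat = coef[1], coef[2]

# Naive pooled OLS ignoring the correlation with eta_i, for comparison.
R0 = np.column_stack([np.ones(N * T), x.ravel()])
beta_pooled = np.linalg.lstsq(R0, y.ravel(), rcond=None)[0][1]
```

In this design the pooled OLS coefficient is pushed upwards by the positive correlation between $x_{it}$ and $\eta_i$, while the device uses only one extra parameter instead of $N - 1$ dummies.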

Full version

Without any further assumption, we have that

$$E[\eta_i \mid x_{i1},\dots,x_{iT}] = \sum_{s=1}^{T} x_{is}'\theta_s,$$

which leads to the model

$$y_{it} = x_{it}'\beta + z_i'\gamma + \sum_{s=1}^{T} x_{is}'\theta_s + \psi_i + v_{it},$$

where, by construction, $E[x_{it}\psi_i] = 0$. This type of correction uses $T K_x$ degrees of freedom, which for $T$ small compared to $N$ might still be an improvement over LSDV.

5.3 Hausman test

Testing for correlation between the individual effects and the variables is especially useful for fixed $T$. It rests on the observation that $\hat\beta_W$ is consistent irrespective of this correlation, whereas $\hat\beta_B$, $\hat\beta_{GLS}$ and $\hat\beta_{OLS}$ are inconsistent in the presence of such correlation. Define the distance between $\hat\beta_W$ and $\hat\beta_{GLS}$ as

$$q = \hat\beta_W - \hat\beta_{GLS}.$$

Under $H_0: E[x_{it}\eta_i] = 0$, it holds that

$$q \xrightarrow{D} N(0; \Sigma_q), \qquad \text{with } \Sigma_q = \Sigma_{\hat\beta_W} - \Sigma_{\hat\beta_{GLS}}$$

and

$$q'\Sigma_q^{-1}q \xrightarrow{D} \chi^2_k.$$

6 Dynamic

6.1 No exogenous regressors

Consider

$$y_{it} = \alpha y_{i,t-1} + x_{it}'\beta + z_i'\gamma + \eta_i + v_{it}, \qquad |\alpha| < 1, \tag{9}$$

for $i = 1,\dots,N$ and $t = 2,\dots,T$, and assume for now $\beta, \gamma = 0$, i.e.

$$y_{it} = \alpha y_{i,t-1} + \eta_i + v_{it}, \qquad |\alpha| < 1,$$

which implies

$$y_{it} = \frac{\eta_i}{1-\alpha} + \sum_{s=0}^{\infty}\alpha^s v_{i,t-s}. \tag{10}$$

Consequences of the lagged dependence:

$E[y_{i,t-1}\eta_i] = \sigma_\eta^2/(1-\alpha) > 0$ (correlation with the individual effects)
$E[y_{i,t-1}v_{i,t-1}] = \sigma_v^2 > 0$ (no strict exogeneity)

6.1.1 Previous Estimators

As before, the OLS estimator for $\alpha$ suffers from omitted variable bias, which does not vanish with increasing sample size. The inconsistency is given by

$$\operatorname{plim}_{N\to\infty}\hat\alpha_{OLS} = \alpha + E\left[y_{i,t-1}^2\right]^{-1}E[y_{i,t-1}\eta_i] \neq \alpha,$$

since $E[y_{i,t-1}\eta_i] \neq 0$.

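In this dynamic setting, however, the within estimator also suffers from bias, termed the Nickell (1981) bias, since

$$\operatorname{plim}_{N\to\infty}(\hat\alpha_W - \alpha) = \frac{\sum_{t=2}^{T} A_t}{\sum_{t=2}^{T} B_t},$$

with $A_t = E[\tilde y_{i,t-1}\tilde v_{it}]$ and $B_t = E[\tilde y_{i,t-1}^2]$, where the within transformation subtracts individual means over the $T-1$ available observations. Of these quantities $A_t$ can be simplified to

$$A_t = -(T-1)^{-1}E\left[y_{i,t-1}\sum_{s=2}^{T}v_{is}\right] - (T-1)^{-1}E\left[\sum_{s=1}^{T-1}y_{is}\,v_{it}\right] + (T-1)^{-2}E\left[\sum_{s=1}^{T-1}y_{is}\sum_{r=2}^{T}v_{ir}\right],$$

which, combined with (10), gives

$$A_t = -(T-1)^{-1}E\left[\sum_{r=0}^{\infty}\alpha^r v_{i,t-1-r}\sum_{s=2}^{T}v_{is}\right] - (T-1)^{-1}E\left[\sum_{s=1}^{T-1}\sum_{r=0}^{\infty}\alpha^r v_{i,s-r}\,v_{it}\right] + (T-1)^{-2}E\left[\sum_{s=1}^{T-1}\sum_{r=0}^{\infty}\alpha^r v_{i,s-r}\sum_{q=2}^{T}v_{iq}\right].$$

After some manipulation we finally get

$$A_t = -\frac{\sigma_v^2}{(T-1)(1-\alpha)}\left\{1 - \alpha^{t-2} - \alpha^{T-t} + \frac{1-\alpha^{T-1}}{(T-1)(1-\alpha)}\right\}.$$

Similarly it holds that

$$B_t = \frac{\sigma_v^2}{1-\alpha^2}\,\frac{T-2}{T-1} + \frac{2\alpha}{1-\alpha^2}\,A_t.$$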

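Combining the expressions for both $A_t$ and $B_t$ results in

$$\operatorname{plim}_{N\to\infty}(\hat\alpha_W - \alpha) = -\frac{1+\alpha}{T-2}\left\{1 - \frac{1-\alpha^{T-1}}{(T-1)(1-\alpha)}\right\}\left\{1 - \frac{2\alpha}{(T-2)(1-\alpha)}\left[1 - \frac{1-\alpha^{T-1}}{(T-1)(1-\alpha)}\right]\right\}^{-1},$$

from which it can be clearly seen that

$$\operatorname{plim}_{N\to\infty}\hat\alpha_W = \alpha + O(T^{-1}).$$

The within estimator is inconsistent for $N \to \infty$ with $T$ fixed, but consistent for $T \to \infty$ and for $N, T \to \infty$.

Remark 1 Do not test $H_0: \alpha = 0$ using $\hat\alpha_W$, unless $T$ is large, since

$$\lim_{\alpha\to 0}\ \operatorname{plim}_{N\to\infty}\,[\hat\alpha_W - \alpha] \neq 0.$$

The first-differenced OLS estimator is inconsistent under any type of asymptotics, since

$$E[\Delta y_{i,t-1}\Delta v_{it}] = -E[y_{i,t-1}v_{i,t-1}] \neq 0.$$

Remark 2 Use these estimators as benchmarks. For any consistent estimator $\hat\alpha$ it holds (in large samples) that

$$\hat\alpha_W,\ \hat\alpha_{DOLS} \le \hat\alpha \le \hat\alpha_{OLS}.$$

Remark 3 Letting $A = \sum_t \tilde y_{i,t-1}\tilde v_{it}$ and $B = \sum_t \tilde y_{i,t-1}^2$, the Nickell bias can be written as $E[A]/E[B]$. It is equal to the approximation to the order $T^{-1}$ of the standard Hurwicz (1950) bias $E[A/B]$:

$$E\left[\frac{A}{B}\right] = \frac{E[A]}{E[B]}\left\{1 - \frac{\operatorname{Cov}[A, B]}{E[A]\,E[B]} + \frac{\operatorname{Var}[B]}{(E[B])^2}\right\} + o(T^{-1}).$$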
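To get a feel for the size of the Nickell bias, the expressions can be evaluated numerically. The sketch below is a minimal Python illustration, assuming the indexing used here ($y_{i1},\dots,y_{iT}$ observed, so $T-1$ usable equations), stationarity as in (10) and $T \ge 3$; the function names `nickell_terms` and `nickell_bias` are hypothetical. It computes $A_t$ and $B_t$ period by period and the closed-form limit, so that the two can be compared.

```python
import numpy as np

def nickell_terms(alpha, T, sigma2_v=1.0):
    """Per-period terms A_t and B_t for t = 2..T (T - 1 usable equations)."""
    n = T - 1
    t = np.arange(2, T + 1)
    c = (1 - alpha ** n) / (n * (1 - alpha))
    brace = 1 - alpha ** (t - 2) - alpha ** (T - t) + c
    a_t = -sigma2_v / (n * (1 - alpha)) * brace
    b_t = sigma2_v / (1 - alpha ** 2) * (T - 2) / n + 2 * alpha / (1 - alpha ** 2) * a_t
    return a_t, b_t

def nickell_bias(alpha, T):
    """Closed-form plim of (alpha_hat_W - alpha) as N grows with T fixed."""
    n = T - 1
    c = (1 - alpha ** n) / (n * (1 - alpha))
    lead = -(1 + alpha) / (T - 2) * (1 - c)
    correction = 1 - 2 * alpha / ((T - 2) * (1 - alpha)) * (1 - c)
    return lead / correction

a_t, b_t = nickell_terms(0.5, 6)
ratio = a_t.sum() / b_t.sum()      # sum of A_t over sum of B_t
closed_form = nickell_bias(0.5, 6)
bias_at_zero = nickell_bias(0.0, 5)
```

At $\alpha = 0$ the limit reduces to $-1/(T-1)$, which is why a test of $H_0: \alpha = 0$ based on $\hat\alpha_W$ is unreliable for small $T$.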

6.1.2 IV

Consider the first differenced equation

$$\Delta y_{it} = \alpha\,\Delta y_{i,t-1} + \Delta v_{it},$$

and assume that (6) applies to $y_{i,t-1}$. The lagged dependent variable is thus contemporaneously uncorrelated with the idiosyncratic error term:

$$E[y_{i,t-1}v_{it}] = 0.$$

This assumption follows when the $v_{it}$ are serially uncorrelated and are uncorrelated with $y_{i0}$. As seen above, OLS is inconsistent under any type of asymptotics, but we have that

$$E[y_{i,t-2}\,\Delta v_{it}] = 0, \qquad E[\Delta y_{i,t-2}\,\Delta v_{it}] = 0,$$

and thus both $y_{i,t-2}$ and $\Delta y_{i,t-2}$ are valid instruments for $\Delta y_{i,t-1}$. Anderson and Hsiao (1981) suggested the first type of instrument, resulting in

$$\hat\alpha_{AH} = \left(\Delta Y_{-1}'Y_{-2}\,(Y_{-2}'Y_{-2})^{-1}\,Y_{-2}'\Delta Y_{-1}\right)^{-1}\Delta Y_{-1}'Y_{-2}\,(Y_{-2}'Y_{-2})^{-1}\,Y_{-2}'\Delta Y. \tag{11}$$

This estimator is consistent as $N \to \infty$ for fixed $T$.

6.1.3 GMM

Consider again the AR(1) panel data model (9) without exogenous regressors

$$y_{it} = \alpha y_{i,t-1} + \eta_i + v_{it}, \qquad |\alpha| < 1,$$

and make the following assumptions:

(A1) Error components: $E[\eta_i] = E[v_{it}] = E[\eta_i v_{it}] = 0$

(A2) Serially uncorrelated shocks: $E[v_{is}v_{it}] = 0$ for $s \neq t$

(A3) Predetermined initial conditions: $E[y_{i1}v_{it}] = 0$ for $t = 2,\dots,T$

Assumptions (A1)-(A3) specify a finite number of linear moment conditions. These can be exploited to construct a GMM estimator.

First-differenced equations                                                      Valid instruments
y_{i3} - y_{i2} = α (y_{i2} - y_{i1}) + (v_{i3} - v_{i2})                        y_{i1}
y_{i4} - y_{i3} = α (y_{i3} - y_{i2}) + (v_{i4} - v_{i3})                        y_{i1}, y_{i2}
...                                                                              ...
y_{iT} - y_{i,T-1} = α (y_{i,T-1} - y_{i,T-2}) + (v_{iT} - v_{i,T-1})            y_{i1}, y_{i2}, ..., y_{i,T-2}

Table 1: Valid instruments for the dynamic model

The moment conditions $E[y_{i1}\Delta v_{it}] = 0$, for $t = 3,\dots,T$, follow from (A3), while $E[y_{i,t-s}\Delta v_{it}] = 0$, for $t = 3,\dots,T$ and $s \ge 2$, follow from (A2) $E[v_{is}v_{it}] = 0$, from (A1) $E[\eta_i v_{it}] = 0$ and from $E[y_{i,s-1}v_{it}] = 0$. These $M = (T-1)(T-2)/2$ moments can also be written as

$$E[Z_i'\Delta V_i] = 0,$$

and have as sample analogue

$$c_N(\alpha) = N^{-1}\sum_{i=1}^{N}Z_i'\Delta V_i(\alpha),$$

where $Z_i$ is the $(T-2) \times M$ matrix

$$Z_i = \begin{pmatrix} y_{i1} & 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & y_{i1} & y_{i2} & \cdots & 0 & \cdots & 0 \\ \vdots & & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & y_{i1} & \cdots & y_{i,T-2} \end{pmatrix}$$

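and $\Delta V_i$ is the $(T-2) \times 1$ vector

$$\Delta V_i = (\Delta v_{i3}, \Delta v_{i4}, \dots, \Delta v_{iT})'.$$

The GMM estimator minimizes the weighted quadratic distance between these moment conditions and zero:

$$\hat\alpha_{GMM} = \arg\min_\alpha J_N(\alpha) = \arg\min_\alpha \left(N^{-1}\sum_{i=1}^{N}\Delta V_i(\alpha)'Z_i\right)W_N\left(N^{-1}\sum_{i=1}^{N}Z_i'\Delta V_i(\alpha)\right)$$

$$\phantom{\hat\alpha_{GMM}} = \left(\Delta Y_{-1}'ZW_NZ'\Delta Y_{-1}\right)^{-1}\Delta Y_{-1}'ZW_NZ'\Delta Y, \tag{12}$$

where $\Delta Y$ and $\Delta Y_{-1}$ are the stacked $N(T-2) \times 1$ vectors of observations on $\Delta y_{it}$ respectively $\Delta y_{i,t-1}$, $Z = (Z_1', \dots, Z_N')'$ is the stacked $N(T-2) \times M$ matrix of observations on the instruments and $W_N$ is an $M \times M$ weight matrix. The simple expression on the last line arises from the linearity of the moment conditions.

For arbitrary $W_N$, the asymptotic covariance matrix of $\sqrt{N}\hat\alpha_{GMM}$ can be estimated by

$$\hat\Sigma_{\sqrt{N}\hat\alpha_{GMM}} = N^2\left(\Delta Y_{-1}'ZW_NZ'\Delta Y_{-1}\right)^{-1}\Delta Y_{-1}'ZW_N\hat\Sigma_VW_NZ'\Delta Y_{-1}\left(\Delta Y_{-1}'ZW_NZ'\Delta Y_{-1}\right)^{-1},$$

with $\hat\Sigma_V = N^{-1}\sum_{i=1}^{N}Z_i'\Delta\hat V_i\Delta\hat V_i'Z_i$ and $\Delta\hat V_i = \Delta Y_i - \hat\alpha_{GMM}\Delta Y_{i,-1}$. The factor $N^2$ arises because $\hat\Sigma_V$ is normalized by $N^{-1}$ while the cross-products $Z'\Delta Y_{-1}$ are not.

The optimal GMM estimator chooses $W_N = \hat\Sigma_V^{-1}$. Its asymptotic variance can thus be estimated by

$$\hat\Sigma_{\sqrt{N}\hat\alpha_{GMM}} = N^2\left(\Delta Y_{-1}'Z\hat\Sigma_V^{-1}Z'\Delta Y_{-1}\right)^{-1}.$$

Remark 1 For $T = 3$, we have only 1 moment condition, $E[y_{i1}\Delta v_{i3}] = 0$. Consequently, our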

only parameter is exactly identified. Since weighting is irrelevant in this case, we obtain the optimal GMM estimator. It can easily be shown that $\hat\alpha_{GMM} = \hat\alpha_{AH}$ for $T = 3$.

Remark 2 For $T > 3$, $\hat\alpha_{GMM}$ is more efficient than $\hat\alpha_{AH}$: firstly because $\hat\alpha_{GMM}$ exploits more moment conditions, and secondly because $W_N = (Y_{-2}'Y_{-2})^{-1}$ is not optimal for first-differenced equations.

Optimal One Step

In the special case that $v_{it} \sim IID(0; \sigma_v^2)$, the optimal GMM estimator can be obtained in one step. This optimal choice of weight matrix does not coincide with 2SLS, since first-differencing introduces serial correlation in the error terms $\Delta v_{it}$. It can easily be shown that

$$E[\Delta V_i\Delta V_i'] = \sigma_v^2 H, \qquad \text{with } H = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \cdots & 0 \\ 0 & -1 & 2 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & -1 \\ 0 & 0 & \cdots & -1 & 2 \end{pmatrix}.$$

Consequently, choosing

$$W_N = \left(N^{-1}\sum_{i=1}^{N}Z_i'HZ_i\right)^{-1}$$

results in a one step GMM estimator that is asymptotically equivalent to the optimal two-step estimator.

Extra Moment conditions

Without making extra assumptions beyond (A1)-(A3), an extra set of $T - 3$ moment conditions, quadratic in $\alpha$, can be exploited (Ahn and Schmidt, 1995):

$$E[u_{it}\,\Delta v_{i,t-1}] = 0 \quad \text{for } t = 4,\dots,T, \tag{13}$$

where $u_{it} = \eta_i + v_{it}$. Moment conditions that are non-linear in the parameters require numerical optimization and are less common in practice. Under the additional homoskedasticity assumption

$$E[v_{it}^2] = \sigma_i^2,$$

there are a further $T - 3$ linear moment conditions (Ahn and Schmidt, 1995):

$$E[y_{i,t-2}\,\Delta v_{i,t-1} - y_{i,t-1}\,\Delta v_{it}] = 0 \quad \text{for } t = 4,\dots,T. \tag{14}$$

Further Issues

Over-fitting

Too many instruments (relative to the sample size $N$) is a source of finite sample bias. For small $N$ and large $T$ over-fitting is a distinct possibility, resulting in a (downward) finite sample bias towards the within estimator. Always check the sensitivity of empirical results with respect to the number of lags present in the set of instruments. For $T \to \infty$, the within estimator is consistent, so first-differenced GMM is also consistent (but not very useful); see Alvarez and Arellano (2003) for a formal proof. In the presence of endogenous explanatory variables, however, the within estimator is no longer consistent as $T \to \infty$.

Weak Instruments and Unit Roots

IV and GMM have poor small sample properties when instruments are only weakly correlated with the endogenous regressors. For AR(1) models with $\alpha \to 1$, the correlation between $\Delta y_{i,t-1}$ and the lagged levels $y_{i,t-s}$ for $s \ge 2$ becomes weaker. Formally, $\alpha$ remains identified as $\alpha \to 1$ and the first-differenced GMM estimator remains consistent as $N \to \infty$, provided $\sigma_\eta^2 > 0$. But Monte Carlo evidence suggests first-differenced GMM estimators to be very imprecise for $\alpha \ge 0.8$. For persistent time series, Blundell and Bond (1998) developed an extended GMM estimator (see below).

Consider now the alternative specification

$$y_{it} = \eta_i + \varepsilon_{it}, \qquad \varepsilon_{it} = \alpha\varepsilon_{i,t-1} + v_{it},$$

which can be reformulated as

$$y_{it} = \alpha y_{i,t-1} + (1-\alpha)\eta_i + v_{it}.$$
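The weakening of the instruments in the first specification (the AR(1) model with individual effect $\eta_i$, not the alternative specification just introduced) can be made concrete with a small calculation. The sketch below is a minimal Python illustration, assuming the stationary representation (10) with $v_{it}$ homoskedastic and independent of $\eta_i$; the function name `instrument_corr` is hypothetical. Under these assumptions $\operatorname{Cov}[\Delta y_{i,t-1}, y_{i,t-2}] = -\sigma_v^2/(1+\alpha)$, $\operatorname{Var}[\Delta y_{i,t-1}] = 2\sigma_v^2/(1+\alpha)$ and $\operatorname{Var}[y_{i,t-2}] = \sigma_\eta^2/(1-\alpha)^2 + \sigma_v^2/(1-\alpha^2)$.

```python
import math

def instrument_corr(alpha, sigma2_eta=1.0, sigma2_v=1.0):
    """Correlation between dy_{i,t-1} and the instrument y_{i,t-2} under (10)."""
    cov = -sigma2_v / (1 + alpha)
    var_dy = 2 * sigma2_v / (1 + alpha)
    var_y = sigma2_eta / (1 - alpha) ** 2 + sigma2_v / (1 - alpha ** 2)
    return cov / math.sqrt(var_dy * var_y)

# The absolute correlation shrinks towards zero as alpha approaches one.
corrs = [instrument_corr(a) for a in (0.0, 0.5, 0.8, 0.95, 0.99)]
```

The covariance itself stays bounded away from zero; it is the variance of the lagged level that explodes as $\alpha \to 1$, which drives the correlation to zero.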

Under this specification, the process for $y_{it}$ approaches a pure random walk as $\alpha \to 1$ (instead of a random walk with drift, as above). Lagged levels are completely uninformative instruments for $\Delta y_{i,t-1}$ when $\alpha = 1$, and $\alpha$ is not identified using the moment conditions $E[y_{i,t-s}\Delta v_{it}] = 0$, for $t = 3,\dots,T$ and $s \ge 2$. Remark that for this model OLS in levels is consistent when $\alpha = 1$.

System GMM

It is possible that $\Delta y_{it}$ is uncorrelated with $\eta_i$, while $y_{it}$ is obviously correlated with $\eta_i$. This requires a further restriction on the process generating the initial conditions $y_{i1}$:

$$E[\Delta y_{i2}\,\eta_i] = 0. \tag{15}$$

The validity of (15) has two implications.

1. The AR(1) specification implies

$$\Delta y_{it} = \alpha^{t-2}\Delta y_{i2} + \sum_{s=0}^{t-3}\alpha^s\Delta v_{i,t-s} \quad \text{for } t = 3,\dots,T,$$

which, together with (15), implies the $T - 2$ non-redundant linear moment conditions

$$E[\Delta y_{is}\,\eta_i] = 0 \quad \text{for } s = 2,\dots,T-1$$

for the equation in levels, which can, for example, be rewritten as

$$E[\Delta y_{is}\,u_{iT}] = 0 \quad \text{for } s = 2, 3,\dots,T-1.$$

2. Validity of these additional linear moment restrictions renders the quadratic moment restrictions (13) redundant. For example, it holds that

$$u_{it}\,\Delta u_{i,t-1} = u_{it}\,(\Delta y_{i,t-1} - \alpha\,\Delta y_{i,t-2}) = u_{it}\,\Delta y_{i,t-1} - \alpha\,u_{it}\,\Delta y_{i,t-2},$$

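and the assumptions $E[u_{it}\Delta y_{i,t-1}] = 0$ and $E[u_{it}\Delta y_{i,t-2}] = 0$ imply $E[u_{it}\Delta u_{i,t-1}] = 0$.

Conveniently, the complete set of moment conditions (all of which are linear) implied by the standard assumptions ($(T-1)(T-2)/2$ conditions) and the initial conditions restriction ($T-2$ conditions) can be written as

$$E[y_{i,t-s}\,\Delta v_{it}] = 0 \quad \text{for } t = 3,\dots,T;\ s \ge 2,$$

$$E[\Delta y_{is}\,u_{iT}] = 0 \quad \text{for } s = 2, 3,\dots,T-1,$$

which can be restated as

$$E\left[Z_i^{S\prime}U_i^S\right] = 0,$$

and has as sample analogue

$$c_N(\alpha) = N^{-1}\sum_{i=1}^{N}Z_i^{S\prime}U_i^S(\alpha),$$

where $Z_i^S$ is the $(T-1) \times M_S$ matrix, with $M_S = (T+1)(T-2)/2$,

$$Z_i^S = \begin{pmatrix} y_{i1} & 0 & 0 & \cdots & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & y_{i1} & y_{i2} & \cdots & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & & \ddots & & & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & y_{i1} & \cdots & y_{i,T-2} & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & \Delta y_{i2} & \cdots & \Delta y_{i,T-1} \end{pmatrix} = \begin{pmatrix} Z_i & 0 \\ 0 & (\Delta y_{i2}, \dots, \Delta y_{i,T-1}) \end{pmatrix},$$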

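and $U_i^S$ is the $(T-1) \times 1$ vector

$$U_i^S = (\Delta v_{i3}, \Delta v_{i4}, \dots, \Delta v_{iT}, \eta_i + v_{iT})'.$$

Finally, define also the $(T-1) \times 1$ vector $Y_i^S$,

$$Y_i^S = (\Delta y_{i3}, \Delta y_{i4}, \dots, \Delta y_{iT}, y_{iT})',$$

and the vector equation

$$Y_i^S = \alpha Y_{i,-1}^S + U_i^S,$$

where, as before, $Y_{i,-1}^S = L(Y_i^S)$, with $L(\cdot)$ the lag operator. This vector equation represents $T - 2$ scalar equations in first differences, with one equation in levels added.

The GMM estimator minimizes the weighted quadratic distance between the sample moment conditions and zero:

$$\hat\alpha_{SGMM} = \arg\min_\alpha J_N(\alpha) = \arg\min_\alpha \left(N^{-1}\sum_{i=1}^{N}U_i^S(\alpha)'Z_i^S\right)W_N^S\left(N^{-1}\sum_{i=1}^{N}Z_i^{S\prime}U_i^S(\alpha)\right)$$

$$\phantom{\hat\alpha_{SGMM}} = \left(Y_{-1}^{S\prime}Z^SW_N^SZ^{S\prime}Y_{-1}^S\right)^{-1}Y_{-1}^{S\prime}Z^SW_N^SZ^{S\prime}Y^S, \tag{16}$$

where $Y^S = (Y_1^{S\prime}, \dots, Y_N^{S\prime})'$, $Y_{-1}^S = (Y_{1,-1}^{S\prime}, \dots, Y_{N,-1}^{S\prime})'$, $Z^S = (Z_1^{S\prime}, \dots, Z_N^{S\prime})'$ and $W_N^S$ is an $M_S \times M_S$ weight matrix.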
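The two linear GMM estimators of this subsection can be sketched in a few lines. The following is a minimal Python illustration, not a reference implementation: it assumes a balanced panel stored as an $N \times T$ array, uses the one-step weight built from the tridiagonal matrix $H$ (2 on the diagonal, $-1$ next to it) for the differenced equations, and, as a labeled assumption, a 2SLS-type weight $(\sum_i Z_i^{S\prime}Z_i^S)^{-1}$ for the system estimator. All function names and the simulated data are hypothetical.

```python
import numpy as np

def simulate_panel(n, t, alpha, rng, burn=50):
    """Draw y_i1..y_iT from y_it = alpha*y_i,t-1 + eta_i + v_it after a burn-in."""
    eta = rng.normal(size=n)
    y = eta / (1 - alpha)
    out = np.empty((n, t))
    for s in range(burn + t):
        y = alpha * y + eta + rng.normal(size=n)
        if s >= burn:
            out[:, s - burn] = y
    return out

def diff_instruments(y_i):
    """Z_i: the row for equation t = 3..T holds y_i1, ..., y_i,t-2."""
    t = y_i.shape[0]
    z = np.zeros((t - 2, (t - 1) * (t - 2) // 2))
    col = 0
    for row in range(t - 2):
        z[row, col:col + row + 1] = y_i[:row + 1]
        col += row + 1
    return z

def system_instruments(y_i):
    """Z_i^S: Z_i plus one levels row instrumented by dy_i2, ..., dy_i,T-1."""
    t = y_i.shape[0]
    z_d = diff_instruments(y_i)
    z = np.zeros((t - 1, z_d.shape[1] + t - 2))
    z[:t - 2, :z_d.shape[1]] = z_d
    z[t - 2, z_d.shape[1]:] = np.diff(y_i)[:-1]
    return z

def linear_gmm(z_list, lhs_list, rhs_list, weight):
    """alpha_hat = (x'Z W Z'x)^{-1} x'Z W Z'y for a scalar parameter."""
    zy = sum(z.T @ a for z, a in zip(z_list, lhs_list))
    zx = sum(z.T @ b for z, b in zip(z_list, rhs_list))
    return float(zx @ weight @ zy) / float(zx @ weight @ zx)

def diff_gmm(y):
    """One-step first-differenced GMM with W = (sum Z_i' H Z_i)^{-1}."""
    t = y.shape[1]
    h = 2 * np.eye(t - 2) - np.eye(t - 2, k=1) - np.eye(t - 2, k=-1)
    dy = np.diff(y, axis=1)                 # columns: dy_i2, ..., dy_iT
    zs = [diff_instruments(row) for row in y]
    w = np.linalg.inv(sum(z.T @ h @ z for z in zs))
    return linear_gmm(zs, list(dy[:, 1:]), list(dy[:, :-1]), w)

def system_gmm(y):
    """System GMM with a 2SLS-type weight (an assumption, not an optimal choice)."""
    dy = np.diff(y, axis=1)
    zs = [system_instruments(row) for row in y]
    lhs = [np.append(d[1:], r[-1]) for d, r in zip(dy, y)]    # (dy_3..dy_T, y_T)
    rhs = [np.append(d[:-1], r[-2]) for d, r in zip(dy, y)]   # (dy_2..dy_T-1, y_T-1)
    w = np.linalg.inv(sum(z.T @ z for z in zs))
    return linear_gmm(zs, lhs, rhs, w)

rng = np.random.default_rng(1)
y = simulate_panel(n=3000, t=5, alpha=0.5, rng=rng)
alpha_diff, alpha_sys = diff_gmm(y), system_gmm(y)
```

Because the scale of the weight matrix cancels in the estimator, the factor $N^{-1}$ is omitted from both weights. A two-step version would replace `w` by the inverse of the estimated moment covariance built from first-step residuals.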

In some cases these additional moment conditions for the equation in levels can provide huge efficiency improvements and a reduction in finite sample bias. A persistent $y_{it}$ series is such an example. In this case, the quadratic moment conditions give a substantial improvement, but the additional linear restrictions stemming from the initial conditions restriction provide an even bigger gain in efficiency (Blundell and Bond, 1998).

In order to guarantee that $\Delta y_{i2}$ is uncorrelated with $\eta_i$, a restriction on the behavior of $y_{i1}$ is needed. This takes the form of a stationarity restriction on the $y_{it}$ series. Indeed, the representation

$$\Delta y_{it} = \alpha^{t-2}\Delta y_{i2} + \sum_{s=0}^{t-3}\alpha^s\Delta v_{i,t-s} \quad \text{for } t = 3,\dots,T,$$

indicates that $\Delta y_{it}$ becomes uncorrelated with $\eta_i$ for $t \to \infty$. If the same process has been generating the $y_{it}$ series long enough before the observation period (long enough for the influence of the true start-up conditions to have become negligible), the observations on $\Delta y_{it}$ are uncorrelated with $\eta_i$. More formally, define

$$e_{i1} = y_{i1} - \frac{\eta_i}{1-\alpha},$$

then we have that

$$\Delta y_{i2} = (\alpha - 1)\,y_{i1} + \eta_i + v_{i2} = (\alpha - 1)\,e_{i1} + v_{i2}.$$

Since the error components assumption (A1) states that $E[v_{i2}\eta_i] = 0$, a sufficient condition for $E[\Delta y_{i2}\eta_i]$ to be equal to zero is given by the restriction

$$E[e_{i1}\eta_i] = 0.$$

$\eta_i/(1-\alpha)$ is the steady state level of the $y_{it}$ series, i.e. the level the series will converge to for $T \to \infty$. Consequently $e_{i1}$ represents the deviation from the steady state at the start of the sample period. The additional initial conditions assumption (15) is thus equivalent to the requirement that the initial deviations from the steady state are uncorrelated with the steady state level.

This imposes a restriction (sometimes called mean stationarity) on the mean of the $y_{it}$ series, but not on its variance. When the process we observe has started long ago, this assumption is quite reasonable. On the other hand, if the observation period coincides with the true start-up of a process (for instance the start of the Euro) this assumption may be unreasonable.

The system GMM estimator is similar to the case where levels or first-differences of some $x_{it}$ variable can be used to obtain instruments for the equation in levels (see subsection 6.2.2 below).

Serially correlated errors

6.2 Exogenous regressors

Consider again model (9)

$$y_{it} = \alpha y_{i,t-1} + x_{it}'\beta + z_i'\gamma + \eta_i + v_{it}, \qquad |\alpha| < 1,$$

$$\phantom{y_{it}} = s_{it}'\delta + u_{it},$$

for $i = 1,\dots,N$ and $t = 2,\dots,T$, where $s_{it} = (y_{i,t-1}, x_{it}')'$ and $\delta = (\alpha, \beta')'$. We assume that $x_{it}$ is a $K \times 1$ vector of variables and we maintain assumptions (A1)-(A3). Consequently, the moment conditions

$$E[y_{i,t-s}\,\Delta v_{it}] = 0, \quad \text{for } t = 3,\dots,T \text{ and } s \ge 2,$$

remain valid. Different assumptions about $x_{it}$ will imply different extra sets of moment conditions: $x_{it}$ can either be correlated or uncorrelated with $\eta_i$, and $x_{it}$ may be endogenous, predetermined or strictly exogenous with respect to $v_{it}$. The usual tradeoff between efficiency and robustness applies here. More restrictive assumptions imply the validity of additional moment restrictions, which will increase efficiency when they are valid, but imply inconsistency when violated.

6.2.1 Moment conditions for the equation in differences

Example 1 We assume that $x_{it}$ is predetermined with respect to the serially uncorrelated shocks $v_{it}$:

$$E[x_{is}v_{it}] = 0, \quad \text{for } s \le t,$$

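but correlated with the individual effects

$$E[x_{it}\eta_i] \neq 0.$$

These assumptions imply that present shocks $v_{it}$ possibly affect future values of $x_{is}$ (where $s > t$). First-differencing eliminates $\eta_i$:

$$\Delta y_{it} = \alpha\,\Delta y_{i,t-1} + \beta'\Delta x_{it} + \Delta v_{it},$$

and the assumption of predeterminedness implies the moment conditions

$$E[x_{i,t-s}\,\Delta v_{it}] = 0, \quad \text{for } t = 3,\dots,T;\ s \ge 1,$$

i.e. lagged values of $x_{it}$ are uncorrelated with $\Delta v_{it} = v_{it} - v_{i,t-1}$. The complete set of linear moment conditions can again be written as

$$E\left[Z_i^{+\prime}\Delta V_i\right] = 0,$$

but with the matrix $Z_i$ now equal to

$$Z_i^+ = \begin{pmatrix} (y_{i1}, x_{i1}', x_{i2}') & 0 & \cdots & 0 \\ 0 & (y_{i1}, y_{i2}, x_{i1}', x_{i2}', x_{i3}') & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & (y_{i1}, \dots, y_{i,T-2}, x_{i1}', \dots, x_{i,T-1}') \end{pmatrix},$$

and GMM proceeds as before. Denoting by $\Delta X$ the stacked $N(T-2) \times (K+1)$ matrix of observations on $\Delta s_{it} = (\Delta y_{i,t-1}, \Delta x_{it}')'$, the GMM estimator of $\delta$ is given by

$$\hat\delta_{GMM} = \left(\Delta X'Z^+W_NZ^{+\prime}\Delta X\right)^{-1}\Delta X'Z^+W_NZ^{+\prime}\Delta Y.$$

Example 2 If we assume that $x_{it}$ is endogenous with respect to the serially uncorrelated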

shocks $v_{it}$:

$$E[x_{is}v_{it}] = 0, \quad \text{for } s < t,$$

then only the subset

$$E[x_{i,t-s}\,\Delta v_{it}] = 0, \quad \text{for } t = 3,\dots,T;\ s \ge 2,$$

remains valid. In this case the treatment of $x_{is}$ and $y_{is}$ in the instrument matrix is symmetric.

Example 3 If $x_{it}$ is strictly exogenous with respect to the shocks $v_{it}$:

$$E[x_{is}v_{it}] = 0, \quad \forall\, s, t,$$

then the larger set of moment conditions

$$E[x_{is}\,\Delta v_{it}] = 0, \quad \text{for } t = 3,\dots,T;\ s = 1, 2,\dots,T,$$

would be valid, which would imply the instrument matrix

$$Z_i = \begin{pmatrix} (y_{i1}, x_{i1}', \dots, x_{iT}') & 0 & \cdots & 0 \\ 0 & (y_{i1}, y_{i2}, x_{i1}', \dots, x_{iT}') & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & (y_{i1}, \dots, y_{i,T-2}, x_{i1}', \dots, x_{iT}') \end{pmatrix}.$$

Implementation of the different alternatives consists simply of adding or deleting columns from the instrument matrix $Z_i$.

6.2.2 Moment conditions for the equation in levels

Assume that $x_{it}$ is strictly exogenous with respect to the shocks $v_{it}$ and uncorrelated with the individual effects. We have $T$ observations on $x_{it}$ which are all uncorrelated with $\eta_i$, which

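provides us with $T$ valid moment conditions for the equation in levels. One possibility (elegant, but useless for unbalanced panels) is to write these as

$$E[x_{is}u_{iT}] = 0, \quad \text{for } s = 1, 2,\dots,T,$$

with $u_{iT} = \eta_i + v_{iT}$. The full set of (linear) moment conditions can now be written as

$$E\left[Z_i^{*\prime}U_i^+\right] = 0,$$

where the augmented instrument matrix, denoted $Z_i^*$ here, is

$$Z_i^* = \begin{pmatrix} (y_{i1}, x_{i1}', \dots, x_{iT}') & 0 & \cdots & 0 & 0 \\ 0 & (y_{i1}, y_{i2}, x_{i1}', \dots, x_{iT}') & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & (y_{i1}, \dots, y_{i,T-2}, x_{i1}', \dots, x_{iT}') & 0 \\ 0 & 0 & \cdots & 0 & (x_{i1}', \dots, x_{iT}') \end{pmatrix}$$

or

$$Z_i^* = \begin{pmatrix} Z_i & 0 \\ 0 & (x_{i1}', \dots, x_{iT}') \end{pmatrix},$$

with $Z_i$ the instrument matrix for the differenced equations under strict exogeneity, and

$$U_i^+ = (\Delta v_{i3}, \dots, \Delta v_{iT}, \eta_i + v_{iT})'.$$

We thus augment the system of first-differenced equations by adding the levels equation for the final period, and augment the instrument matrix by adding the $T$ valid instruments for this equation. As before the GMM estimators are based on the sample analogues of these moment conditions:

$$\hat\delta_{GMM} = \left(X^{*\prime}Z^*W_NZ^{*\prime}X^*\right)^{-1}X^{*\prime}Z^*W_NZ^{*\prime}Y^*,$$

where $X^*$, $Z^*$ and $Y^*$ stack the first-differenced equations and the final-period levels equation.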

When $x_{it}$ is only predetermined with respect to $v_{it}$, the augmented instrument matrix changes to

$$\begin{pmatrix} Z_i^+ & 0 \\ 0 & (x_{i1}', \dots, x_{iT}') \end{pmatrix}.$$

On the other hand, when $x_{it}$ is endogenous with respect to $v_{it}$ (but still uncorrelated with $\eta_i$), there are only $T - 1$ extra moment conditions

$$E[x_{is}u_{iT}] = 0, \quad \text{for } s = 1, 2,\dots,T-1,$$

and the instrument matrix loses one column.

Remark There is no obvious choice for the one step weight matrix when one or more levels equations are added to the system. The assumption that $v_{it} \sim IID(0; \sigma_v^2)$ does not imply a form for $E[U_i^+U_i^{+\prime}]$ that is proportional to a known matrix. As a consequence, the efficiency improvement between one step and optimal GMM is expected to be more substantial, compared to the situation in which there are only moment restrictions in first differences.

Another possibility is that, although $x_{it}$ is correlated with $\eta_i$, some known function of $x_{it}$ is uncorrelated with $\eta_i$ and can thus be used to form valid instruments for the equations in levels. For instance, when the covariance between $x_{it}$ and $\eta_i$ is constant over time, the first differences of $x_{it}$ are uncorrelated with $\eta_i$. Arellano and Bover suggest using suitably dated first differences $\Delta x_{is}$ as instruments for the levels equation, where the suitability depends on the correlation between $\Delta x_{is}$ and $\eta_i + v_{it}$. This correlation in turn depends on whether $x_{it}$ is assumed to be endogenous, predetermined or strictly exogenous with respect to $v_{it}$.

Presence of variables that are uncorrelated with the individual effects thus allows the use of the levels equation. Use of the levels equation in turn opens up the possibility to identify coefficients on time-invariant explanatory variables $z_i$. In contrast, if all valid moment conditions require transformation of the model in order to eliminate the individual effects $\eta_i$, time-invariant variables $z_i$ are eliminated as well, and their coefficients $\gamma$ are not identified.
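The remark in subsection 6.2.1 that the alternatives differ only in which columns enter the instrument matrix can be sketched as follows. This is a minimal Python illustration for the differenced equations only, assuming a scalar $x_{it}$ and a balanced panel; the function name `diff_instruments_with_x` and the `kind` labels are hypothetical.

```python
import numpy as np

def diff_instruments_with_x(y_i, x_i, kind):
    """Instrument rows for the differenced equations t = 3..T (scalar x_it).

    kind = "endogenous":    x_i1, ..., x_i,t-2 enter the row for equation t
    kind = "predetermined": x_i1, ..., x_i,t-1 enter the row for equation t
    kind = "exogenous":     x_i1, ..., x_iT enter every row
    """
    T = len(y_i)
    last_x = {
        "endogenous": lambda t: t - 2,
        "predetermined": lambda t: t - 1,
        "exogenous": lambda t: T,
    }[kind]
    # Each block is (y_i1, ..., y_i,t-2, x_i1, ..., x_i,last_x(t)).
    blocks = [np.concatenate([y_i[:t - 2], x_i[:last_x(t)]]) for t in range(3, T + 1)]
    z = np.zeros((T - 2, sum(len(b) for b in blocks)))
    col = 0
    for row, block in enumerate(blocks):
        z[row, col:col + len(block)] = block
        col += len(block)
    return z

y_i = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x_i = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
shapes = {k: diff_instruments_with_x(y_i, x_i, k).shape
          for k in ("endogenous", "predetermined", "exogenous")}
```

For $T = 5$ the instrument count grows from 12 (endogenous) to 15 (predetermined) to 21 (strictly exogenous), which makes the efficiency versus robustness tradeoff, and the risk of over-fitting, easy to see.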


More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

What s New in Econometrics. Lecture 15

What s New in Econometrics. Lecture 15 What s New in Econometrics Lecture 15 Generalized Method of Moments and Empirical Likelihood Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Generalized Method of Moments Estimation

More information

Linear Panel Data Models

Linear Panel Data Models Linear Panel Data Models Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania October 5, 2009 Michael R. Roberts Linear Panel Data Models 1/56 Example First Difference

More information

Short Questions (Do two out of three) 15 points each

Short Questions (Do two out of three) 15 points each Econometrics Short Questions Do two out of three) 5 points each ) Let y = Xβ + u and Z be a set of instruments for X When we estimate β with OLS we project y onto the space spanned by X along a path orthogonal

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information

Instrumental Variables and GMM: Estimation and Testing. Steven Stillman, New Zealand Department of Labour

Instrumental Variables and GMM: Estimation and Testing. Steven Stillman, New Zealand Department of Labour Instrumental Variables and GMM: Estimation and Testing Christopher F Baum, Boston College Mark E. Schaffer, Heriot Watt University Steven Stillman, New Zealand Department of Labour March 2003 Stata Journal,

More information

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects MPRA Munich Personal RePEc Archive Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects Mohamed R. Abonazel Department of Applied Statistics and Econometrics, Institute of Statistical

More information

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation Seung C. Ahn Arizona State University, Tempe, AZ 85187, USA Peter Schmidt * Michigan State University,

More information

Regression with time series

Regression with time series Regression with time series Class Notes Manuel Arellano February 22, 2018 1 Classical regression model with time series Model and assumptions The basic assumption is E y t x 1,, x T = E y t x t = x tβ

More information

Greene, Econometric Analysis (6th ed, 2008)

Greene, Econometric Analysis (6th ed, 2008) EC771: Econometrics, Spring 2010 Greene, Econometric Analysis (6th ed, 2008) Chapter 17: Maximum Likelihood Estimation The preferred estimator in a wide variety of econometric settings is that derived

More information

Panel Data Exercises Manuel Arellano. Using panel data, a researcher considers the estimation of the following system:

Panel Data Exercises Manuel Arellano. Using panel data, a researcher considers the estimation of the following system: Panel Data Exercises Manuel Arellano Exercise 1 Using panel data, a researcher considers the estimation of the following system: y 1t = α 1 + βx 1t + v 1t. (t =1,..., T ) y Nt = α N + βx Nt + v Nt where

More information

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Linear

More information

Limited Dependent Variables and Panel Data

Limited Dependent Variables and Panel Data and Panel Data June 24 th, 2009 Structure 1 2 Many economic questions involve the explanation of binary variables, e.g.: explaining the participation of women in the labor market explaining retirement

More information

Introduction to Eco n o m et rics

Introduction to Eco n o m et rics 2008 AGI-Information Management Consultants May be used for personal purporses only or by libraries associated to dandelon.com network. Introduction to Eco n o m et rics Third Edition G.S. Maddala Formerly

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

Lecture 7: Dynamic panel models 2

Lecture 7: Dynamic panel models 2 Lecture 7: Dynamic panel models 2 Ragnar Nymoen Department of Economics, UiO 25 February 2010 Main issues and references The Arellano and Bond method for GMM estimation of dynamic panel data models A stepwise

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

10 Panel Data. Andrius Buteikis,

10 Panel Data. Andrius Buteikis, 10 Panel Data Andrius Buteikis, andrius.buteikis@mif.vu.lt http://web.vu.lt/mif/a.buteikis/ Introduction Panel data combines cross-sectional and time series data: the same individuals (persons, firms,

More information

Linear Regression with Time Series Data

Linear Regression with Time Series Data Econometrics 2 Linear Regression with Time Series Data Heino Bohn Nielsen 1of21 Outline (1) The linear regression model, identification and estimation. (2) Assumptions and results: (a) Consistency. (b)

More information

Statistics and econometrics

Statistics and econometrics 1 / 36 Slides for the course Statistics and econometrics Part 10: Asymptotic hypothesis testing European University Institute Andrea Ichino September 8, 2014 2 / 36 Outline Why do we need large sample

More information

Short T Panels - Review

Short T Panels - Review Short T Panels - Review We have looked at methods for estimating parameters on time-varying explanatory variables consistently in panels with many cross-section observation units but a small number of

More information

Econometrics Master in Business and Quantitative Methods

Econometrics Master in Business and Quantitative Methods Econometrics Master in Business and Quantitative Methods Helena Veiga Universidad Carlos III de Madrid Models with discrete dependent variables and applications of panel data methods in all fields of economics

More information

Testing Random Effects in Two-Way Spatial Panel Data Models

Testing Random Effects in Two-Way Spatial Panel Data Models Testing Random Effects in Two-Way Spatial Panel Data Models Nicolas Debarsy May 27, 2010 Abstract This paper proposes an alternative testing procedure to the Hausman test statistic to help the applied

More information

Maximum Likelihood (ML) Estimation

Maximum Likelihood (ML) Estimation Econometrics 2 Fall 2004 Maximum Likelihood (ML) Estimation Heino Bohn Nielsen 1of32 Outline of the Lecture (1) Introduction. (2) ML estimation defined. (3) ExampleI:Binomialtrials. (4) Example II: Linear

More information

Lecture 10: Panel Data

Lecture 10: Panel Data Lecture 10: Instructor: Department of Economics Stanford University 2011 Random Effect Estimator: β R y it = x itβ + u it u it = α i + ɛ it i = 1,..., N, t = 1,..., T E (α i x i ) = E (ɛ it x i ) = 0.

More information

The Linear Regression Model

The Linear Regression Model The Linear Regression Model Carlo Favero Favero () The Linear Regression Model 1 / 67 OLS To illustrate how estimation can be performed to derive conditional expectations, consider the following general

More information

Panel Threshold Regression Models with Endogenous Threshold Variables

Panel Threshold Regression Models with Endogenous Threshold Variables Panel Threshold Regression Models with Endogenous Threshold Variables Chien-Ho Wang National Taipei University Eric S. Lin National Tsing Hua University This Version: June 29, 2010 Abstract This paper

More information

Notes on Panel Data and Fixed Effects models

Notes on Panel Data and Fixed Effects models Notes on Panel Data and Fixed Effects models Michele Pellizzari IGIER-Bocconi, IZA and frdb These notes are based on a combination of the treatment of panel data in three books: (i) Arellano M 2003 Panel

More information

Improving GMM efficiency in dynamic models for panel data with mean stationarity

Improving GMM efficiency in dynamic models for panel data with mean stationarity Working Paper Series Department of Economics University of Verona Improving GMM efficiency in dynamic models for panel data with mean stationarity Giorgio Calzolari, Laura Magazzini WP Number: 12 July

More information

1 Appendix A: Matrix Algebra

1 Appendix A: Matrix Algebra Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix

More information

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 7 Introduction to Specification Testing in Dynamic Econometric Models In this lecture I want to briefly describe

More information

System GMM estimation of Empirical Growth Models

System GMM estimation of Empirical Growth Models System GMM estimation of Empirical Growth Models ELISABETH DORNETSHUMER June 29, 2007 1 Introduction This study based on the paper "GMM Estimation of Empirical Growth Models" by Stephan Bond, Anke Hoeffler

More information

1. You have data on years of work experience, EXPER, its square, EXPER2, years of education, EDUC, and the log of hourly wages, LWAGE

1. You have data on years of work experience, EXPER, its square, EXPER2, years of education, EDUC, and the log of hourly wages, LWAGE 1. You have data on years of work experience, EXPER, its square, EXPER, years of education, EDUC, and the log of hourly wages, LWAGE You estimate the following regressions: (1) LWAGE =.00 + 0.05*EDUC +

More information

Appendix A: The time series behavior of employment growth

Appendix A: The time series behavior of employment growth Unpublished appendices from The Relationship between Firm Size and Firm Growth in the U.S. Manufacturing Sector Bronwyn H. Hall Journal of Industrial Economics 35 (June 987): 583-606. Appendix A: The time

More information

Increasing the Power of Specification Tests. November 18, 2018

Increasing the Power of Specification Tests. November 18, 2018 Increasing the Power of Specification Tests T W J A. H U A MIT November 18, 2018 A. This paper shows how to increase the power of Hausman s (1978) specification test as well as the difference test in a

More information

Econometrics II - EXAM Outline Solutions All questions have 25pts Answer each question in separate sheets

Econometrics II - EXAM Outline Solutions All questions have 25pts Answer each question in separate sheets Econometrics II - EXAM Outline Solutions All questions hae 5pts Answer each question in separate sheets. Consider the two linear simultaneous equations G with two exogeneous ariables K, y γ + y γ + x δ

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

More information

Generalized Method of Moment

Generalized Method of Moment Generalized Method of Moment CHUNG-MING KUAN Department of Finance & CRETA National Taiwan University June 16, 2010 C.-M. Kuan (Finance & CRETA, NTU Generalized Method of Moment June 16, 2010 1 / 32 Lecture

More information

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning Økonomisk Kandidateksamen 2004 (I) Econometrics 2 Rettevejledning This is a closed-book exam (uden hjælpemidler). Answer all questions! The group of questions 1 to 4 have equal weight. Within each group,

More information

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35 Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

A Guide to Modern Econometric:

A Guide to Modern Econometric: A Guide to Modern Econometric: 4th edition Marno Verbeek Rotterdam School of Management, Erasmus University, Rotterdam B 379887 )WILEY A John Wiley & Sons, Ltd., Publication Contents Preface xiii 1 Introduction

More information

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in

More information

Practical Econometrics. for. Finance and Economics. (Econometrics 2)

Practical Econometrics. for. Finance and Economics. (Econometrics 2) Practical Econometrics for Finance and Economics (Econometrics 2) Seppo Pynnönen and Bernd Pape Department of Mathematics and Statistics, University of Vaasa 1. Introduction 1.1 Econometrics Econometrics

More information

Efficiency of repeated-cross-section estimators in fixed-effects models

Efficiency of repeated-cross-section estimators in fixed-effects models Efficiency of repeated-cross-section estimators in fixed-effects models Montezuma Dumangane and Nicoletta Rosati CEMAPRE and ISEG-UTL January 2009 Abstract PRELIMINARY AND INCOMPLETE Exploiting across

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Econometrics. Week 6. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 6. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 6 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 21 Recommended Reading For the today Advanced Panel Data Methods. Chapter 14 (pp.

More information

GARCH Models Estimation and Inference. Eduardo Rossi University of Pavia

GARCH Models Estimation and Inference. Eduardo Rossi University of Pavia GARCH Models Estimation and Inference Eduardo Rossi University of Pavia Likelihood function The procedure most often used in estimating θ 0 in ARCH models involves the maximization of a likelihood function

More information

Sensitivity of GLS estimators in random effects models

Sensitivity of GLS estimators in random effects models of GLS estimators in random effects models Andrey L. Vasnev (University of Sydney) Tokyo, August 4, 2009 1 / 19 Plan Plan Simulation studies and estimators 2 / 19 Simulation studies Plan Simulation studies

More information

Estimation of Dynamic Regression Models

Estimation of Dynamic Regression Models University of Pavia 2007 Estimation of Dynamic Regression Models Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy.

More information

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 7: Cluster Sampling Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of roups and

More information