Discrete Choices with Panel Data

Size: px
Start display at page:

Download "Discrete Choices with Panel Data"

Transcription

1 Discrete Choices with Panel Data Manuel Arellano CEMFI, Madrid This version: July 2003 Abstract This paper reviews the existing approaches to deal with panel data binary choice models with individual effects. Their relative strengths and weaknesses are discussed. Much theoretical and empirical research is needed in this area, and the paper points to several aspects that deserve further investigation. In particular, I illustrate the usefulness of asymptotic arguments in providing both approximately unbiased moment conditions, and approximations to sampling distributions for panels of different sample sizes. JEL classification:c23. Keywords: Binary choice, panel data, fixed effects, modified likelihood, asymptotic corrections. Address for correspondence: CEMFI, Casado del Alisal 5, 2804 Madrid, Spain. Tel , Fax , arellano@cemfi.es This paper was written for presentation as the Investigaciones Económicas Lecture at the XXV Simposio de Análisis Económico, Universitat Autònoma de Barcelona, Bellaterra, 9-2 December I wish to thank Pedro Albarrán, Samuel Bentolila, Olympia Bover, Raquel Carrasco, Jesús Carro, Enrique Sentana, and two anonymous referees for helpful comments and discussions. I am also grateful to Pedro Albarrán and Jesús Carro for very able research assistance.

2 Introduction The use of fixed effects is a simple and well understood way of dealing with endogenous explanatory variables in linear panel data models. In such a context, least squares or instrumental variable methods for errors in differences provide consistent estimates that control for unobserved heterogeneity in short panels of large cross-sections (small T,largeN). However, the situation is fundamentally different in models with nonlinear errors; for example, when one intends to use fixed effects to deal with an endogenous explanatory variable in a probit model. In those cases, estimates of the parameters of interest, jointly estimated with the effects, are typically inconsistent if T is fixed (incidental parameter problem). Moreover, fixed effects estimates in a spirit similar to differencing in the linear case are not available for many models of practical importance. There are also random effectmethodsthatachievefixed T consistency subject to a particular specification of the form of the dependence between the explanatory variables and the effects, but they rely on strong and untestable auxiliary assumptions, and even these methods are often out of reach. Without auxiliary assumptions, the common parameters of certain nonlinear fixed effects models are simply unidentifiable in a fixed T setting, so that fixed-t consistent estimation is not possible at all. In other cases, although identifiable, fixed-t consistent estimation at the standard root-n rate is impossible. Analternativereactiontothefactthatmicropanelsareshortistoask for estimators with small biases as opposed to no bias at all; specifically, estimators with biases of order /T 2 instead of the standard magnitude of /T. This alternative approach has the potential of overcoming some of the fixed-t identification difficulties and the advantage of generality. The purpose of this lecture is twofold. First I review the incidental parameter problem (Sections 2 and 3), fixed-t solutions (Section 4), and identification problems (Section 5), all in the context of the static binary choice

3 model with explanatory variables that are correlated with an individual effect. Second, I discuss the modified concentrated likelihood of Cox and Reid (987), its role in achieving consistency up to a certain order of magnitude in T (Section 6), and a double asymptotic formulation which provides an effective discrimination between estimators with and without bias reduction (Section 7). I focus on the static binary case for simplicity and because many results are only available for this case. Thus, dynamic models, multinomial choice, andmodelswithrandomeffects that are uncorrelated with the explanatory variables are all left out (see Arellano and Honoré (200) for a fuller survey of the fixed-t panel data discrete choice literature). My intention is to exhibit the strengths and weaknesses of fixed-t approaches, and to illustrate the usefulness of double asymptotic arguments in providing both approximately unbiased moment conditions, and approximations to sampling distributions even for fairly short panels, which is the main theme of the paper. 2 Models and Parameters of Interest I begin by considering the following static binary choice model y it ={x 0 itβ 0 + η i + v it 0} (t =,..., T ; i =,..., N) () where the errors v it are independently distributed with cdf F conditional on η i and x i =(x 0 i,..., x 0 it ) 0,sothat Pr(y it = x i, η i )=F (x 0 itβ 0 + η i ). (2) The Linear Model as a Benchmark In a linear model of the form E(y it x i, η i )=x 0 itβ 0 + η i, (3) β 0 is identifiable from the regression in first differences or deviations from means in a cross-sectional population for fixed T, regardless of the form of 2

4 the distribution of η i x i.thatis,wehave plim N TN NX TX (x it x i ) (y it y i ) (x it x i ) 0 β 0 =0, (4) t= which is uniquely satisfied by the true value β 0 provided plim N TN NX TX (x it x i )(x it x i ) 0 (5) t= is non-singular. So, the value b β that solves TN NX TX t= i (x it x i ) h(y it y i ) (x it x i ) 0 β b =0 (6) (the within-group estimator) is a consistent estimator of β 0 for large N, no matter how small is T as long as T 2 (see, for example, Arellano, 2003, or Hsiao, 2003). This is of economic interest if one hopes that by conditioning on η i, β 0 measures a more relevant (causal or structural) effect of x on y. The consistency result matters because one wants to make sure that gets the right answer when calculating β b from a large cross-sectional panel with a small time series dimension, which is a typical situation in microeconometrics. The motivation and aim in a binary choice fixed effects model is to get similar results as in the linear case when the form of the model is given by (). In our context, the term fixed effects has nothing to do with the nature of sampling. It just refers to a model for the effect of x on y given x and η, in which we observe y and x but not η, and the distribution of η x is left unrestricted. Following the usage in the econometric literature, the term random effects will be reserved for models in which some knowledge about the form of the distribution of η x is assumed. Parameters of Interest The micropanel data literature has emphasized the large-n-short-t identification of β 0 with an unspecified distribution 3

5 of η i x i. However, a natural parameter of interest is the mean effect on the probability of y it =of changing x it from z a to z b, say. A consistent estimator of this is: NX Z [F (z a β N 0 + x 0 2itβ 02 + η) F (z b β 0 + x 0 2itβ 02 + η)] dg(η x 2it ) (7) where G(. x 2it ) is the cdf of η i conditional on x 2it,andx it denotes the first component of x it. Thus, measuring this effectwouldrequireustospecifyg, whichisnotinthenatureofthefixed effects approach. The direct information we can get from the β coefficients only concerns the relative impacts of explanatory variables on the probabilities. If x it and x 2it are continuous variables we have: β 02 = Pr ob(y it = x i, η i ) / Pr ob(y it = x i, η i ). (8) β 0 x 2it x it 3 The Problem The log-likelihood function from () assuming that the y it are independent conditional on x i and η i is given by NX `i (β, η i ) (9) where `i (β, η i )= TX {y it log F it +( y it )log( F it )} (0) t= and F it = F (x 0 itβ + η i ). Moreover, the scores are d ηi (β, η i ) `i (β, η i ) TX f it = η i F it ( F it ) (y it F it ) () d βi (β, η i ) `i (β, η i ) β = t= TX t= f it F it ( F it ) x it (y it F it ) (2) An alternative is to obtain the difference in probabilities for specific valuesofη and x 2t (e.g. their means), but this may only be relevant for a small part of the population (see Chamberlain, 984). 4

6 where f it denotes the pdf corresponding to F it. For the logit model F is the logistic cdf Λ(r) =e r / ( + e r ) and we have f it F it ( F it ) = so that in this case the scores are simply d ηi (β, η i )= P T t= (y it F it ) and d βi (β, η i )= P T t= x it (y it F it ). Let the MLE of η i for given β be bη i (β) =argmax`i (β, η i ) (3) η so that bη i (β) solves d ηi (β, bη i (β)) = 0. (4) Therefore, the MLE of β is given by the maximizer of the concentrated (or profile) log-likelihood bβ =argmax β which solves the first order conditions b TN (β) = TN = TN NX `i (β, bη i (β)) (5) NX ½ d βi (β, bη i (β)) + d ηi (β, bη i (β)) bη ¾ i (β) β NX d βi (β, bη i (β)) (6) The problem is that b TN (β) evaluated at β = β 0 does not converge to zero in probability when N for T fixed (although it does converge to zero when T ). This situation is known as the incidental parameters problem since Neyman and Scott (948). A discussion of this problem for discretechoicemodelsisinheckman(98). 5

7 An Example As a classic illustration let us consider a model in which T = 2, f is symmetric, β is scalar, and x it is a time dummy such that x i =0and x i2 =(Andersen, 973; Heckman, 98). For observations with (y i,y i2 )=(0, 0) we have bη i (β) and `i (β, bη i (β)) = log F ( bη i (β)) + log F ( bη i (β) β) 0. For observations with (y i,y i2 )=(, ) we have bη i (β) and `i (β, bη i (β)) = log F (bη i (β)) + log F (bη i (β)+β) 0. Finally, for (0, ) or (, 0) observations we have, respectively, `i (β, η) =logf ( η)+logf (η + β) or `i (β, η) =logf (η)+logf ( η β), which in both cases are maximized at bη i (β) = β 2. (7) The implication is that the contributions of observations (0, 0) and (, ) to the concentrated log-likelihood are equal to zero, a (0, ) observation contributes a term of the form 2logF (β/2), anda(, 0) observation contributes with 2log[ F (β/2)]. So the concentrated log-likelihood is given by NX 2 {d 0i log [ F (β/2)] + d 0i log F (β/2)} (8) where d 0i =(y i =,y i2 =0)and d 0i =(y i =0,y i2 =). Moreover, the MLE of p = F (β/2) is P N bp = d 0i P N (y i + y i2 =), (9) so that bβ =2F (bp). (20) Note that bp isthesamplecounterpartofp 0 =Pr(y i =0,y i2 = y i + y i2 =). Thus the MLE β b satisfies plim bβ =2F (p 0 ). (2) N 6

8 Moreover, we have Z p 0 = Pr (y i =0,y i2 = y i + y i2 =, η) dg (η y i + y i2 =). For the logit model Pr (y i =0,y i2 = y i + y i2 =, η) does not depend on η and it turns out that p 0 = Λ (β 0 ) where β 0 isthetruevalue. Therefore, in such a case plim N β b = 2Λ [Λ (β 0 )] = 2β 0,sothatMLwouldbe estimating a relative log odds ratio that is twice as large as its true value. More generally, using Bayes formula Z Pr (yi =0,y i2 = η) Pr (y i + y i2 = η) p 0 = dg (η) Pr (y i + y i2 = η) Pr (y i + y i2 =) or so that plim N p 0 = E η [Pr (y i =0,y i2 = η)], (22) E η [Pr (y i + y i2 = η)] ½ bβ =2F E η [F ( η) F (β + η)] E η [F ( η) F (β + η)] + E η [F (η) F ( β η)] ¾. (23) Thus, in general the form of the asymptotic bias of β b depends on the distribution of the individual effects. For probit, under η N 0, ση 2 an explicit expression is available. Letting β = β/ +ση 2 /2 and ρ = σ 2 η/ +ση 2 we have Eη [Φ (η)] = Φ (0) = 0.5, E η [Φ (β + η)] = Φ (β ),ande η [Φ (η) Φ (β + η)] = Φ 2 (0, β ; ρ), sothat plim bβ =2Φ Φ (β ) Φ 2 (0, β ; ρ) N Φ (β )+0.5 2Φ 2 (0, β (24) ; ρ) where Φ 2 (.,.; ρ)is the cdf of the standardized bivariate normal density with correlation coefficient ρ. As noted by Heckman (98), we get p 0 F (β) as σ η 0, in which case we get a similar bias result as for the logistic in a limiting situation. Numerical calculations of (24) are reported below in Section 6. 7

9 4 Fixed T Solutions 4. Conditional MLE Asufficient statistic for η i, S i say, is a function of the data such that the distribution of the data given S i does not depend on η i. The idea is to use the likelihood conditioned on S i to make inference about β 0 (Andersen, 970). This works as long as β 0 is identified from the conditional likelihood of the data, which obviously requires that the conditional likelihood depends on β 0.Unfortunately,thisisnotthecaseexceptforthelogitmodel. In the logit model P T t= y it is a sufficient statistic for η i. Indeed, we have Ã! PT TX exp t= y itx 0 itβ 0 Pr y i,..., y it y it,x i = P PT (25) t= (d,...,d T ) B i exp t= d tx 0 it β 0 where B i is the set of all 0 sequences such that P T t= d t = P T t= y it. This result was first obtained by Rasch (960, 96) (for surveys see Chamberlain, 984, or Arellano and Honoré, 200). For example, with T =2we have Pr (y i,y i2 y i + y i2,x i )= Λ ( x 0 i2β 0 ) Λ ( x 0 i2β 0 ) if (y i,y i2 )=(0, 0) or (, ) if (y i,y i2 )=(, 0) if (y i,y i2 )=(0, ). (26) Therefore, the log-likelihood conditioned on y i + y i2 is given by 2 L c (β) = NX {d 0i log [ Λ ( x 0 i2β)] + d 0i log Λ ( x 0 i2β)} (27) and the score takes the form L c (β) β = NX x i2 {d 0i Λ ( x 0 i2β)(y i + y i2 =)}. (28) 2 The contributions of (0, 0) or (, ) observations is zero. 8

10 4.2 Maximum Score Estimation The previous technique crucially relied on the logit assumption. Manski (987) considered a more general model of the form () in which the cdf of v it x i, η i was non-parametric and could depend on x i and η i in a timeinvariant way. Namely, for all t and s Pr( v it r x i, η i )=Pr( v is r x i, η i )=F (r x i, η i ), (29) so that F (r x i, η i ) does not change with t but is otherwise unrestricted. This assumption imposes stationarity and strict exogeneity, but allows for serial dependence in the errors v it. It also allows for a certain kind of conditional heteroskedasticity, though not a very plausible one, since Var(v it x i, η i ) may depend on x i but v it is not allowed to be more sensitive to x it than to other x s. Similarly if the expectations E (v it x i, η i ) exist, they may depend on x i but not their first-differences E ( v it x i, η i )=0. The time-invariance of F implies that for T =2: 3 med (y i2 y i x i,y i + y i2 =)=sgn( x 0 i2β 0 ). (30) To see this note that, given y i + y i2 =,thedifference y i2 y i can only equal or. So the median will be one or the other depending on whether Pr (y i2 =,y i =0 x i ) Q Pr (y i2 =0,y i = x i ).Thus 4 med (y i2 y i x i,y i + y i2 =) = sgn[pr(y i2 =,y i =0 x i ) Pr (y i2 =0,y i = x i )] = sgn [Pr (y i2 = x i ) Pr (y i = x i )]. 3 The sign function is defined as sgn (u) =(u>0) (u<0), i.e. sgn (u) = if u<0, sgn (u) =0if u =0and sgn (u) =if u>0. 4 The second equality follows from Pr (y i2 = x i ) = Pr(y i2 =,y i =0 x i )+Pr(y i2 =,y i = x i ) Pr (y i = x i ) = Pr(y i2 =0,y i = x i )+Pr(y i2 =,y i = x i ). 9

11 Moreover, from the model s specification, i.e. Pr (y i = x i, η i ) = F (x 0 iβ 0 + η i x i, η i ) Pr (y i2 = x i, η i ) = F (x 0 i2β 0 + η i x i, η i ), and the monotonicity of F,wehavethatforanyη i (the constancy of F over time becomes crucial at this point): Pr (y i2 = x i, η i ) Q Pr (y i = x i, η i ) x 0 i2β 0 Q x 0 iβ 0. Therefore, the implication also holds unconditionally relative to η i : Pr (y i2 = x i ) Q Pr (y i = x i ) x 0 i2β 0 Q x 0 iβ 0. or sgn [Pr (y i2 = x i ) Pr (y i = x i )] = sgn ( x 0 i2β 0 ). Manski showed that the true value of β 0 uniquely maximizes (up to scale) the expected agreement between the sign of x 0 i2β and that of y i2 conditioned on y i + y i2 =. This identification result required an unbounded support for at least one of the explanatory variables with a non-zero coefficient. That is, letting x 0 it =(z it,wit) 0 and β 0 0 =(γ 0, α 0 0), theminimal requirement for identification is that z it has unbounded support and γ 0 6=0. Identification fails at γ 0 =0,sothatγ 0 =0isnot a testable hypothesis. Manski s identification result implies that we can learn about the relative effects of the variables w it under the maintained assumption that γ 0 6=0. Manski then proposed to estimate β 0 by selecting the value that matches the sign of x 0 i2β with that of y i2 forasmanyobservationsaspossiblein the subsample with y i + y i2 =. The suggested estimator is bβ =argmax β NX sgn ( x 0 i2β)(y i2 y i ) (3) 0

12 subject to the normalization k β k=. 5 This is the maximum score estimator applied to the observations with y i + y i2 =(notice that the estimation criterion is unaffected by removing observations having y i = y i2 ). It is consistent under the assumption that there is at least one unbounded continuous regressor, but it is not root-n consistent, and not asymptotically normal. An alternative form of the score objective function is NX S N (β) = {d 0i ( x 0 i2β < 0) + d 0i ( x 0 i2β 0)}. (32) The score S N (β) gives the number of correct predictions we would make if we predicted (y i,y i2 ) to be (0, ) whenever x 0 i2β 0. In contrast, P N sgn ( x0 i2β) y i2 gives the number of successes minus the number of failures. Yet another form of the estimator suggested by the median regression interpretation is as the minimizer of the number of failures, which is given by NX (y i 6= y i2 ) y i2 sgn ( x 0 2 i2β). (33) Smoothed Maximum Score It is possible to consider a smoothed version of the maximum score estimator along the lines of Horowitz (992), which does have an asymptotic normal distribution, although the rate of convergence remains slower than root-n (Charlier, Melenberg and van Soest, 995, and Kyriazidou, 997). 6 The idea is to replace S N (β) with a smooth function SN(β) whose limit a.s. as N isthesameass N (β). Thisisof the form NX SN(β) = {d 0i [ K ( x 0 i2β/γ N )] + d 0i K ( x 0 i2β/γ N )} (34) 5 In the logit case the scale normalization is imposed through the variance of the logistic distribution. More generally, if F is a known distribution a priori, the scale normalization isdeterminedbytheformoff. Comparisons can be made by considering ratios of coefficients. 6 Chamberlain (986) showed that there is no root-n consistent estimator of β under the assumptions of Manski for his maximum score method.

13 where K(.) is analogous to a cdf and γ N is a sequence of positive numbers such that lim N γ N = Random Effects In general Z Pr (y i,..., y it x i )= Pr (y i,...,y it x i, η i ) dg (η i x i ) (35) where G (η i x i ) is the cdf of η i x i. The substantive model specifies Pr (y i,..., y it x i, η i ),butonlypr (y i,..., y it x i ) has an empirical counterpart. For example, we may have specified Pr (y i,...,y it x i, η i )= TY Pr (y it x i, η i )= t= TY t= F y it it ( F it ) ( y it). In a fixed effects model we seek to make inferences about parameters in Pr (y i,..., y it x i, η i ) without restricting the form of G. In a random effects model G is typically parametric or semiparametric, and the parameters of interest may or may not be identified with G unrestricted. Thus a fixed effects model can be regarded as a random effects model that leaves the distribution of the effects unrestricted. Thechoicebetweenfixed and random effects models often involves a trade-off between robustness in the specification of Pr (y i,..., y it x i, η i ) and robustness in G, in the sense that achieving fixed-t identification with unrestricted G usually requires a more restrictive specification of Pr (y i,..., y it x i, η i ). Chamberlain (980, 984) considered a random effects model in which the effects are of the form η i = µ (x i )+ε i (36) and ε i is independent of x i. He also made the normality assumptions v it x i, η i N (0, ω tt ) (37) ε i x i N 0, σ 2 η, (38) 2

14 which imply that Pr (y it = x i )=Φ σ t (x 0 itβ 0 + µ (x i )). (39) where σ 2 t = σ 2 η + ω tt and Φ (.) is the standard normal cdf. In this model the v it may be serially dependent and heteroskedastic over time. Chamberlain assumed a linear specification µ (x i )=λ 0 + x 0 iλ, andnewey (994) generalized the model to a non-parametric µ (x i ). In the linear case, β 0, λ 0, λ, andtheσ 2 t can be estimated subject to the normalization σ 2 = by combining the period-by-period probit likelihood functions (see Bover and Arellano, 997, for a discussion of alternative estimators). In the semiparametric case, Newey used the fact that σ t Φ [Pr (y it = x i )] σ t Φ Pr y i(t ) = x i = x 0 it β 0 (40) together with non-parametric estimates of the probabilities Pr (y it = x i ) to obtain an estimator of β 0 and the relative scales. A further generalization of the model is to drop the normality assumptions and allow the distribution of the errors ε i + v it x i to be unknown. This case has been considered by Chen (998). Another semi-parametric approach has been followed by Lee (999). Under certain assumptions on the joint distribution of x i and η i, Lee proposed a maximum rank correlation-type estimator which is N-consistent and asymptotically normal. 5 Identification Problems with Fixed T It would be useful to know which models for Pr (y i,..., y it x i, η i ) are identified without placing restrictions in the form of G (η i x i ) (i.e. fixed-effects identification with fixed T ) and which are not. Amodelisgivenbya2 T vector p (x i, η i, β 0 ) with elements that specify the probabilities Pr ((y i,..., y it )=d j x i, η i )(j =,..., 2 T ) (4) 3

15 where d j is a 0 sequence of order T. Let the true cdf of η i x i be G 0 (η x). Identification will fail at β 0 if for all x inthesupportofx i there is a cdf G (η x) and β 6= β 0 in the parameter space, such that Z Z p (x, η, β 0 ) dg 0 (η x) = p (x, η, β ) dg (η x). (42) If this is so, (β 0,G 0 ) and (β,g ) give the same conditional distribution for (y i,..., y it ) given x i. Therefore, they are observationally equivalent relative to such distribution. Chamberlain (992) studied the identification of a fixed effects binary choice model with T =2. He considered the model y it =(x 0 itβ 0 + η i + v it 0) (t =, 2) together with the assumption that the v it are independent of x i, η i and are i.i.d. over time with a known cdf F. The distribution F is strictly increasing on the whole line, with a bounded, continuous derivative. Moreover, we have the partitions x 0 it =(d t,zit) 0 and β 0 0 =(α 0, γ 0 0), whered t is a time dummy such that d =0and d 2 =,andz i is a continuous random vector with bounded support. With these assumptions Chamberlain showed that if F is not logistic, then there is a value of α such that identification fails for all β 0 in a neighborhood of (α, 0). This seems puzzling since Manski (987) proved identification under less restrictive assumptions. He required, however, the presence of an explanatory variable with unbounded support. Indeed, the difference between the identification result of Manski and the underidentification result of Chamberlain is due to the bounded support for the explanatory variables. The line between identification and underidentification in this context is very subtle. Under Manski s assumptions identification will fail at β 0 0 = (α 0, 0) even if z it has unbounded support, but there will be identification as long as a component of γ 0 is different from zero. Chamberlain shows that if z it is bounded β 0 is underidentified not only when β 0 0 =(α 0, 0), butalsofor 4

16 all β 0 in a neighborhood of (α 0, 0) for a certain value of α 0. So it seems to be a case of local underidentification at zero versus local underidentification in a neighborhood around zero. The lesson from these findings is the fragility of fixed-t identification results and the special role of the logistic assumption. Chamberlain (992) also showed that when the support of z it is unbounded (so that identification holds to the exclusion of γ 0 =0from the parameter space) the information bound for β 0 is zero unless F is logistic. Thus, root-n consistent estimation is possible only for the logit model. Chamberlain sproofcanbesketchedasfollows.inhiscasep (x, η, β 0 ) is ( F )( F 2 ) p (x, η, β 0 )= ( F ) F 2 F ( F 2 ) F F 2 where F = F (zγ η) and F 2 = F (α 0 + z2γ η). Let β =(α, 0) and define the 4 4 matrix H (x, η,..., η 4, β )=[p (x, η, β ),..., p (x, η 4, β )]. which does not vary with x when evaluated at β. The proof proceeds by showing that unless H (x, η,..., η 4, β ) is singular for every α and η,..., η 4, there will be lack of identification for all β 0 in a neighborhood of some β.nextitisshownthath (x, η,..., η 4, β ) can only be singular if F is logistic. Letuschooseapmf π =(π,..., π 4) 0, π j > 0, P 4 j= π j =.Ifforsome other pmf π 0 (x) we have H (x, η,..., η 4, β 0 ) π 0 (x) =H (x, η,..., η 4, β ) π, then the models characterized by (β 0, π 0 (x)) and (β, π ) give the same unconditional choice probabilities, hence creating an identification problem. To rulethisoutwehavetoruleoutthath is invertible. To see this, suppose 5

17 that H (x, η,..., η 4, β ) is nonsingular for some α and η,..., η 4. Since x is bounded, for β 0 6= β in a neighborhood of β, H (x, η,..., η 4, β 0 ) will also be nonsingular for all admissible values of x. Wecannowdefine π 0 (x) =H (x, η,..., η 4, β 0 ) H (x, η,..., η 4, β ) π, such that π 0j (x) > 0 for all admissible x. Moreover, since ι 0 H = ι 0 where ι is a 4 vector of ones, we also have ι 0 H = ι 0 and ι 0 π 0 (x) =.Therefore, 4X p x, η j, β 0 π0j (x) = j= 4X p x, η j, β π j j= which implies that β 0 cannot be distinguished from β. The singularity of H (x, η,..., η 4, β ) requires that ψ [ F (η)] [ F (α + η)] + ψ 2 [ F (η)] F (α + η) +ψ 3 F (η)[ F (α + η)] + ψ 4 F (η) F (α + η) =0 for all η and some scalars ψ,..., ψ 4 that are not all zero. Taking limits as η tends to ± gives ψ = ψ 4 =0.Thusweareleftwith ψ 2 Q (α + η)+ψ 3 Q (η) =0 where Q F/( F ). Forη =0we obtain ψ 3 /ψ 2 = Q (α) /Q(0). Therefore the singularity of H requires that for all α and η we have q (α + η) =q (α)+q(η) q (0). Thiscanonlyhappenifthelogoddratiosq log Q are linear or equivalently if F is logistic. 6 Adjusting the Concentrated Likelihood Cox and Reid (987) considered the general problem of doing inference for a parameter of interest in the absence of knowledge about nuisance parameters. They proposed a first-order adjustment to the concentrated likelihood 6

18 to take account of the estimation of the nuisance parameters (the modified profile likelihood). Their formulation required information orthogonality between the two types of parameters. That is, that the expected information matrix be block diagonal between the parameters of interest and the nuisance parameters; something that may be achieved by transformation of the latter (Cox and Reid explained how to construct orthogonal parameters). A discussion of orthogonality in the context of panel data models and a Bayesian perspective have been given by Lancaster (2000, 2002). The nature of the adjustment in a fixed effects model and some examples are also discussed in Cox and Reid (992). 6. Orthogonalization Let `i (β, η i ) be the log-likelihood for unit i (conditional on x i and η i ). A strong form of orthogonality arises when for some parameterization of η i we have `i (β, η i )=`i (β)+`2i (η i ), (43) for in this case the MLE of η i for given β does not depend on β, bη i (β) =bη i. The implication is that the MLE of β is unaffected by lack of knowledge of η i. In this case 2`i (β, η i ) / β η i = 0 for all i. Unfortunately, such factorization does not hold for binary choice models. In contrast, information orthogonality just requires the cross derivatives to be zero on average. Suppose that a reparameterization is made from (β, η i ) to (β, λ i ) chosen so that β and λ i are information orthogonal. Thus η i = η (β, λ i ) is chosen such that the reparameterized log likelihood ` i (β, λ i )=`i (β, η (β, λ i )) (44) satisfies (at true values): µ 2` i (β E 0, λ i ) β λ i x i, η i =0. (45) 7

19 Since we have ` i β = `i β + η i `i (46) β η i and 7 µ 2` i E x i, η β λ i = η µ i 2`i E x i, η i λ i β η i + η µ i η i i λ i β E 2`i x η 2 i, η i, i (47) following Cox and Reid (987) and Lancaster (2002), the function η(β, λ i ) must satisfy the partial differential equations µ µ η i β = E 2`i 2`i x i, η β η i /E x i, η i. (48) i Orthogonal Effects in Binary Choice Let us now consider the form of information orthogonal fixed effects for model ()-(2). These have been obtained by Lancaster (998, 2000). For binary choice we have µ 2`i (β E 0, η i ) x i, η β η i i µ 2`i (β E 0, η i ) x i, η i η 2 i = = η 2 i TX h (x 0 itβ 0 + η i ) x it (49) t= TX h (x 0 itβ 0 + η i ) (50) where f (r) 2 h (r) = F (r)[ F (r)]. (5) Since in general (49) is different from zero, β and η i are not information orthogonal. In view of (48), an orthogonal transformation of the effects will satisfy η i β = TX P T t= h h it x it (52) it t= where h it = h (x 0 itβ + η i ). 7 Note that there is a term that vanishes: ( 2 η i / β λ i )E ( `i/ η i x i, η i )=0. t= 8

20 and Moreover, letting φ (r) =h 0 (r) and φ it = φ (x 0 itβ + η i ),since " 2 η i = η i X µ T P β λ i λ T i t= h φ it x it + η # i t= it β it turns out that 2 η i / η i = β λ i λ i β log η i λ i = Hence, Lancaster s orthogonal reparameterization is η i λ i, (53) P T t= h. (54) it λ i = TX Z x 0 it β+η i t= h (r) dr. (55) When F (r) is the logistic distribution h(r) coincides with the logistic density, so that an orthogonal effect for the logit model is λ i = TX Λ (x 0 itβ + η i ). (56) t= 6.2 Modified Profile Likelihood The modified profile log likelihood function of Cox and Reid (987) can be written as L M (β) = X i `Mi (β) and `Mi (β) =` i β, λ b i (β) h 2 log d λλi β, λ b i i (β), (57) where λ b i (β) is the MLE of λ i for given β, andd λλi (β, λ i)= 2` i / λ 2 i. Intuitively, the role of the second term is to penalize values of β for which the information about the effects is relatively large. 9

21 An individual s modified score is of the form d λλβi β, b λ i (β) + d λλλi β, b h λ i (β) b i λ i (β) / β d Mi (β) =d Ci (β) β, λ b (58) i (β) 2d λλi where d Ci (β) is the standard score from the concentrated likelihood function, d λλβi (β, λ i)= 3` i / λ 2 i β and d λλλi (β, λ i)= 3` i / λ 3 i. Thefunction(57)wasderivedbyCoxandReidasanapproximationto the conditional likelihood given λ b i (β). Their approach was motivated by the fact that in an exponential family model, it is optimal to condition on sufficient statistics for the nuisance parameters, and these can be regarded as the MLE of nuisance parameters choseninaformtobeorthogonaltothe parameters of interest. For more general problems the idea was to derive a concentrated likelihood for β conditioned on the MLE λ b i (β), having ensured via orthogonality that λ b i (β) changes slowly with β. Another motivation for using (57) is that the corresponding expected score has a bias of a smaller order of magnitude than the standard ML score (cf. Liang, 987, McCullagh and Tibshirani, 990, and Ferguson, Reid, and Cox, 99). Seen in this way, the objective of the adjustment is to center the concentrated score function to achieve consistency up to a certain order of magnitude in T. Specifically, while the difference between the score with known λ i and the concentrated score is in general of order O p (), the corresponding difference with the modified concentrated score is of order O p T /2 (see Appendix). This leads to a bias of order O (T ) in the expected modified score, as opposed to O () in the concentrated score without modification. The Adjustment in Terms of the Original Parameterization Cox and Reid s motivation for modifying the concentrated likelihood relied on the orthogonality between common and nuisance parameters. Nevertheless, the mpl function (57) can be expressed in terms of the original parameterization. 20

22 Firstly, note that because of the invariance of MLE bη i (β) =η(β, λ b i (β)) and ` i β, λ b i (β) = `i (β, bη i (β)). (59) Next, the term d λλi β, λ b i (β) can be calculated as the product of the Fisher information in the (β, η i ) parameterization and the square of the Jacobian of the transformation from (β, η i ) to (β, λ i ) (Cox and Reid, 987, p. 0). That is, since the second derivatives of ` i and `i are related by the expression µ 2` i 2 µ λ 2 = 2`i ηi + `i 2 η i i η 2 i λ i η i λ 2, i and `i/ η i vanishes at bη i (β), lettingd ηηi (β, η i )= 2`i/ η2 i we have d λλi β, λ b i (β) = d ηηi (β, bη i (β)) µ ηi λ i λi = b λ i (β) 2. (60) Thus, the mpl can be written as `Mi (β) =`i (β, bη i (β)) µ 2 log [ d λi ηηi (β, bη i (β))] + log ηi =bη η i (β). (6) i Finally, in view of (48) and (53), the derivative with respect to β of the Jacobian term (the required term for the modified score) can be expressed as β log λ i η = q i (β, η i η i ), (62) i where q i (β, η i )= κ βηi (β, η i ) /κ ηηi (β, η i ) and κ βηi (β 0, η i ) = E T d βηi (β 0, η i ) x i, η i (63) κ ηηi (β 0, η i ) = E T d ηηi (β 0, η i ) x i, η i. (64) Modified Profile Likelihood for Binary Choice Replacing (54) in (6) we have à TX! `Mi (β) =`i (β, bη i (β)) 2 log [ d ηηi (β, bη i (β))] + log b hit (β) (65) 2 t=

23 where b h it (β) =h (x 0 itβ + bη i (β)), and `i (β, η i )= TX {y it log F it +( y it )log( F it )} t= d ηηi (β, η i )= where ρ it = ρ (x 0 itβ + η i ) and TX [h it ρ it (y it F it )] (66) t= ρ (r) = f 0 (r) h (r)[ 2F (r)]. (67) F (r)[ F (r)] For logit, the MLE b λ i (β) for given β solves bλ i (β) = TX Λ (x 0 itβ + bη i (β)) = t= TX y it (68) so that it does not vary with β. Therefore, the likelihood conditioned on bλ i (β) coincides with the conditional logit likelihood given a sufficient statistic for the fixed effect discussed in Section 4.. For the logistic distribution ρ (r) =0. Themodified profile likelihood (mpl) for logit is therefore à TX! `Mi (β) =`i (β, bη i (β)) + 2 log f Λ (x 0 itβ + bη i (β)) (69) t= t= where f Λ (r) =Λ (r)[ Λ (r)] is the logistic density and `Mi (β) is defined for observations such that P T t= y it is not zero or T Numerical Comparisons for Logit and Probit Comparisons for the Two-Period Logit Model The mpl for logit (69) differs from Andersen s conditional likelihood, and the estimator b β MML PT 8 If bη i (β) ±, thenlog t= f Λ (x 0 it β + bη i (β)) tends to for any β. Soobservations for individuals that never change state are uninformative about β. 22

24 that maximizes the mpl is inconsistent for fixed T. Pursuing the example in Section 3, we compare the large-n biases of ML and MML for T =2and x i2 =. Thus we are assessing the value of the large-t adjustment in (69) when T =2. When T =2, for individuals who change state bη i (β) = β/2 so that the second term in (69) becomes 2 log [f Λ ( β/2) + f Λ (β/2)]. (70) Collecting terms and ignoring constants, the modified profile log-likelihood takes the form N NX `Mi (β) = NX {2d 0i log [ Λ(β/2)] + 2d 0i log Λ(β/2) N +(d 0i + d 0i ) (log Λ(β/2) + log [ Λ(β/2)])} 2 NX {(5d 0i + d 0i )log[ Λ(β/2)] + (5d 0i + d 0i )logλ(β/2)} N (5 4bp)log[ Λ(β/2)] + (4bp +)logλ(β/2) (7) where d 0i =(y i =,y i2 =0), d 0i =(y i =0,y i2 =)and bp is as defined in (9). This is maximized at µ µ 4bp + 4bp + bβ MML =2Λ =2log. (72) 6 5 4bp Therefore, µ µ 4p0 + 4Λ (β0 )+ plim bβ MML =2log =2log. (73) N 5 4p 0 5 4Λ (β 0 ) Figure shows the probability limits of MML for positive values of β 0, together with those of ML (the 2β 0 line) and conditional ML (the 45 line) for comparisons. 9 In this example the adjustment produces a surprisingly 9 See McCullagh and Tibshirani (990, pp ) for a similar exercise using different adjusted likelihood functions. 23

25 Probability Limit 8 7 MLE Modified MLE Conditional MLE True Value Figure : Probability limits for a logit model with T =2 good improvement given that we are relying on a large T argument with T =2. For example, for p 0 =0.65, wehaveβ 0 =0.62, β ML =.24 and β MML = 0.8. Since the MML biases are of order O (/T 2 ), the result suggests that, although the biases are not negligible for T =2,theymaybe so for values of T as small as 5 or 6. Comparisons for the Two-Period Probit Model If f (r) is the standard normal pdf we have h ( r) =h (r) and ρ ( r) = ρ (r). Thus, in the two-period case, b h i = h [bη i (β)] = h ( β/2) = h (β/2) and b h i2 = h [β + bη i (β)] = h (β/2). Also F b i = F ( β/2) and F b i2 = F (β/2). Finally, bρ i = ρ ( β/2) = ρ (β/2) and bρ i2 = ρ (β/2). Therefore, for observations with y i + y i2 =we have: `Mi (β) = `i (β, bη i (β)) h 2 log bhi bρ i y i bf i + b h i2 bρ i2 y i2 bf i2 i 24

26 +log bhi (β)+ b h i2 (β) (74) and `Mi (β) 2[d 0i log F ( β/2) + d 0i log F (β/2)] 2 d 0i log [2h (β/2) + 2ρ (β/2) F (β/2)] (75) 2 d 0i log [2h (β/2) + 2ρ ( β/2) F ( β/2)] + log h (β/2). Next, collecting terms, ignoring constants, averaging over observations with y i + y i2 =, and using the notation q (r) =+ ρ (r) F (r) h (r) = Φ (r) rφ (r) Φ (r) φ (r), we have N XN 2 2 `Mi (β) = ( bp) 2logF( β/2) + log h (β/2) log q (β/2) 2 2 +bp 2logF(β/2) + log h (β/2) log q ( β/2).(76) Thus, the probability limit of β b MML for probit maximizes the limiting modified log likelihood as follows 2 2 plim bβ MML = argmax ½p 0 2logF(β/2) + log h (β/2) log q ( β/2) N β 2 2 +( p 0 ) 2logF( β/2) + log h (β/2) log q (β/2) ¾. Figure 2 shows the probability limits of probit ML and MML for normally distributed individual effects with variances 0.,, and0, aswellasfor Cauchy distributed effects. The range of values of β has been chosen for comparability with Figure, in the sense that both figures cover similar intervals of p 0 values. The impact of changing the distribution of the effects 25 (77)

27 Probability Limit MLE, N(0,.) MMLE, N(0,.) MLE, N(0,) MMLE, N(0,) MLE, N(0,0) MMLE, N(0,0) MLE, cauchy MMLE, cauchy True Value True Value Figure 2: Probability limits for a probit model with T =2 is noticeably small for both ML and MML. The adjustment for probit also produces a good improvement given that T is only two, although less so than in the logit case. For example, for β 0 =0.60, the relevant ranges of values are [.2,.30] for β ML, [0.90, 0.96] for β MML,and[0.73, 0.74] for p 0. 7 N and T Asymptotics The panel data literature has probably overemphasized the quest for fixed-t large-n consistent estimation of non-linear models with fixed effects. We have already seen the difficulties that arise in trying to obtain a root-n consistent estimator for a simple static fixed effects probit model. Not surprisingly, the difficulties become even more serious for dynamic binary choice models. In a sense, insisting on fixed T consistency has similarities with (and may be as restrictive as) requiring exactly unbiased estimation in non-linear 26

28 models. Panels with T =2are more common in theoretical discussions than in econometric practice. For a micro panel with 7 or 8 time series observations, whether estimation biases are of order O (/T ) or O (/T 2 ) may make all the difference. So it seems useful to consider a wider class of estimation methods than those providing fixed-t consistency, and assess their merits with regard to alternative N and T asymptotic plans. There are multiple possible asymptotic formulations, and it is a matter of judgement to decide which one provides the best approximation for the sample sizes involved in a given application. Here we consider the asymptotic properties of the estimators that maximize the concentrated likelihood (ML) and the modified concentrated likelihood (MML) when T/N tends to a constant (related results for autoregressive models are in Alvarez and Arellano, 2003, and Hahn and Kuersteiner, 2002). 0 Consistency The ML estimator of β can be shown to be consistent as T regardless of N using the arguments and the consistency theorem in Amemiya (985, pp ). The consistency of MML follows from noting that the concentrated likelihood and the mpl converge to the same objective function uniformly in probability as T. Letting bρ it (β) =ρ (x 0 itβ + bη i (β)) and bf it (β) =F (x 0 itβ + bη i (β)), from (65) we have Ã! p lim T T `Mi (β) = p lim T T `i (β, bη i (β)) + p lim T T log TX b hit (β) (78) T t= Ã p lim T T log TX h i! /2 bhit (β) bρ T it (β) y it bf it (β), 0 Since this lecture was first written I have become aware of recent work on double asymptotic formulations for nonlinear fixed effect models by Woutersen (200) and Li, Lindsay, and Waterman (2002). Moreover, a modified ML estimator for dynamic binary choice models has been developed in Carro (2003) and its properties investigated in simulations and empirical calculations. 27 t=

29 where the convergence is uniform in β in a neighborhood of β 0,andthelast two terms vanish. Asymptotic Normality When T/N c, 0 <c<, bothmland MML are asymptotically normal but, unlike MML, the ML estimator has a bias in the asymptotic distribution. An informal calculation of the terms arising in the asymptotic distributions is given in the Appendix. The results areasfollows: H 0 V H /2 µb βml β 0 + T H b d N N (0,I) (79) H 0 V H /2 bβmml β 0 d N (0,I). (80) where κ βλλi = E T d βλλi (β 0, λ i0 ) x i, λ i, κ λλi = E [T d λλi (β 0, λ i0 ) x i, λ i ], and V = b N = N H = NX µ κ βλλi 2κ λλi, (8) NX d βi (β 0, λ i0 ) d βi (β 0, λ i0 ) 0, (82) NX H = β 0 d βi β 0, λ b i (β 0 ), (83) NX β 0 d Mi (β 0 ). (84) Thus, the asymptotic distribution of the ML estimator will contain a bias term unless κ βλλi =0. 8 Concluding Remarks In this paper we have considered ML and modified ML estimators, but the estimation problem can be put more generally in terms of moment conditions 28

30 in a GMM framework. Fixed-T consistent estimators rely on exactly unbiased moment conditions. When T/N tends to a constant, a GMM estimator from moment conditions with a O (/T ) bias will typically exhibit a bias in the asymptotic distribution, but not if the estimator is based on moment conditions with a O (/T 2 ) bias. Thus, in the context of binary choice and other non-linear microeconometric models, a search for optimal orthogonality conditions that are unbiased to order O (/T 2 ) or greater seems a useful research agenda. But do these biases really matter? Heckman (98) reported a Monte Carlo experiment for ML estimation of a probit model with strictly exogenous variables and fixed effects, T =8and N =00. Usingarandomeffects estimator as a benchmark, he concluded that the MLE of the common parameters (jointly estimated with the effects) performed well. According to this, it would seem that even for fairly small panels there is not much to be gained from the use of fixed-t unbiased or approximately unbiased orthogonality conditions. For models with only strictly exogenous explanatory variablesthismaywellbethecase. Butthesearemodelsthatarefoundto be too restrictive in many applications. When modelling panel data, state dependence, predetermined regressors, and serial correlation often matter. Heckman (98) found that when a lagged dependent variable was included the ML probit estimator performed badly. This is not surprising since similar problems occur with linear autoregressive models. The difference is that while standard tools are available in the literature that ensure fixed T consistency for linear dynamic models, very little is known for dynamic binary choice. This is therefore a promising area of application of asymptotic arguments to both the construction of estimating equations and useful approximations to sampling distributions. See Keane (994), Hyslop (999), Honoré and Kyriazidou (2000), Magnac (2000), Honoré and Lewbel (2002), Arellano and Carrasco (2003), and Arellano and Honoré (200) for a survey and more references. 29

31 References [] Alvarez, J. and M. Arellano (2003): The Time Series and Cross-Section Asymptotics of Dynamic Panel Data Estimators, Econometrica, 7, July. [2] Amemiya, T.(985): Advanced Econometrics, Basil Blackwell. [3] Andersen, E. B. (970): Asymptotic Properties of Conditional Maximum Likelihood Estimators, Journal of the Royal Statistical Society, Series B, 32, [4] Andersen,E.B.(973):Conditional Inference and Models for Measuring, Mentalhygiejnisk Forsknings Institut, Copenhagen. [5] Arellano, M. (2003): Panel Data Econometrics, Oxford University Press. [6] Arellano, M. and R. Carrasco (2003): Binary Choice Panel Data Models with Predetermined Variables, Journal of Econometrics, 5, [7] Arellano, M. and B. Honoré (200): Panel Data Models: Some Recent Developments, in J. Heckman and E. Leamer (eds.), Handbook of Econometrics, vol.5,northholland,amsterdam. [8] Bover, O. and M. Arellano (997): Estimating Dynamic Limited Dependent Variable Models from Panel Data, Investigaciones Económicas, 2, [9] Carro, J. M. (2003): Estimating Dynamic Panel Data Discrete Choice Models with Fixed Effects, CEMFI Working Paper No [0] Chamberlain, G. (980): Analysis of Covariance with Qualitative Data, Review of Economic Studies, 47,

32 [] Chamberlain, G. (984): Panel Data, in Z. Griliches and M.D. Intriligator (eds.), Handbook of Econometrics, vol. 2, Elsevier Science, Amsterdam. [2] Chamberlain, G. (986): Asymptotic Efficiency in Semiparametric Models with Censoring, Journal of Econometrics, 32, [3] Chamberlain (992): Binary Response Models for Panel Data: Identification and Information, unpublished manuscript, Department of Economics, Harvard University. [4] Charlier, E., B. Melenberg, and A. van Soest (995): A Smoothed Maximum Score Estimator for the Binary Choice Panel Data Model and an Application to Labour Force Participation, Statistica Neerlandica, 49, [5] Chen, S. (998): Root-N Consistent Estimation of a Panel Data Sample Selection Model, unpublished manuscript, The Hong Kong University of Science and Technology. [6] Cox, D. R. and N. Reid (987): Parameter Orthogonality and Approximate Conditional Inference (with discussion), Journal of the Royal Statistical Society, SeriesB,49,-39. [7] Cox, D. R. and N. Reid (992): A Note on the Difference between Profile and Modified Profile Likelihood, Biometrika, 79, [8] Ferguson, H., N. Reid, and D. R. Cox (99): Estimating Equations from Modified Profile Likelihood, in Godambe, V. P. (ed.), Estimating Functions, Oxford University Press. [9] Hahn, J. and G. Kuersteiner (2002): Asymptotically Unbiased Inference for a Dynamic Panel Model with Fixed Effects When Both n and T are Large, Econometrica, 70,

33 [20] Heckman, J. J. (98): The Incidental Parameters Problem and the Problem of Initial Conditions in Estimating a Discrete Time Discrete Data Stochastic Process in C. F. Manski and D. McFadden (eds.), Structural Analysis of Discrete Data with Econometric Applications, MIT Press. [2] Honoré, B. and E. Kyriazidou (2000): Panel Data Discrete Choice Models with Lagged Dependent Variables, Econometrica, 68, [22] Honoré, B. and A. Lewbel (2002): Semiparametric Binary Choice Panel Data Models without Strictly Exogeneous Regressors, Econometrica, 70, [23] Horowitz, J. L. (992): A Smoothed Maximum Score Estimator for the Binary Response Model, Econometrica, 60, [24] Hsiao, C. (2003): Analysis of Panel Data, SecondEdition, Cambridge University Press. [25] Hyslop, D. R. (999): State Dependence, Serial Correlation and Heterogeneity in Intertemporal Labor Force Participation of Married Women, Econometrica, 67, [26] Keane, M. (994): A Computationally Practical Simulation Estimator for Panel Data, Econometrica, 62, [27] Kyriazidou, E. (997): Estimation of a Panel Data Sample Selection Model, Econometrica, 65, [28] Lancaster, T. (998): Panel Binary Choice with Fixed Effects, unpublished manuscript, Department of Economics, Brown University. [29] Lancaster, T. (2000): The Incidental Parameter Problem Since 948, Journal of Econometrics, 95,

34 [30] Lancaster, T. (2002): Orthogonal Parameters and Panel Data, Review of Economic Studies, 69, [3] Lee, M. J. (999): A Root N Consistent Semiparametric Estimator for Related-Effect Binary Response Panel Data, Econometrica, 67, [32] Li, H., B. G. Lindsay, and R. P. Waterman (2002): Efficiency of Projected Score Methods in Rectangular Array Asymptotics, unpublished manuscript, Department of Statistics, Pennsylvania State University. [33] Liang, K.Y. (987): Estimating Functions and Approximate Conditional Likelihood, Biometrika, 74, [34] Magnac, T. (2000): State Dependence and Unobserved Heterogeneity in Youth Employment Histories, Economic Journal 0, [35] McCullagh, P. and R. Tibshirani (990): A Simple Method for the Adjustment of Profile Likelihoods, Journal of the Royal Statistical Society, Series B, 52, [36] Manski, C. (987): Semiparametric Analysis of Random Effects Linear Models from Binary Panel Data, Econometrica, 55, [37] Newey, W. (994): The Asymptotic Variance of Semiparametric Estimators, Econometrica, 62, [38] Neyman, J. and E. L. Scott (948): Consistent Estimates Based on Partially Consistent Observations, Econometrica, 6, 32. [39] Rasch, G. (960): Probabilistic Models for Some Intelligence and Attainment Tests, Denmarks Pædagogiske Institut, Copenhagen. [40] Rasch, G. (96): On the General Laws and the Meaning of Measurement in Psychology, Proceedings of the Fourth Berkeley Symposium on 33

35 Mathematical Statistics and Probability, Vol.4,UniversityofCalifornia Press, Berkeley and Los Angeles. [4] Woutersen, T. (200): Robustness Against Incidental Parameters and Mixing Distributions, unpublished manuscript, Department of Economics, University of Western Ontario. 34

36 Appendix Expansion for the Score of the Concentrated Likelihood Let us consider a second order expansion of the score of the concentrated likelihood around the true value of the orthogonal effect. The log likelihood is ` i (β, λ i ); its vector of partial derivatives with respect to β is d βi (β, λ i) = ` i (β, λ i ) / β; the concentrated likelihood is ` i β, λ b i (β) and its score is given by d βi β, λ b i (β). An approximation at β 0 around the true value λ i0 is d βi β 0, λ b i (β 0 ) = d βi (β 0, λ i0 )+d βλi (β 0, λ i0 ) bλi (β 0 ) λ i0 (A) + 2 d βλλi (β 0, λ i0 ) bλi (β 0 ) λ i0 2 + Op T /2 where d βλi (β, λ i)= 2` i (β, λ i ) / β λ i and d βλλi (β 0, λ i0 )= 3` i (β, λ i ) / β λ 2 i. In general, the first three terms are O p T /2, O p T /2,andO p (), butbecause of orthogonality d βλi (β 0, λ i0 ) is O p as opposed to O p (T ). T 2 Expansion for λ b i (β 0 ) λ i0 Letting d λi (β, λ i)= ` i (β, λ i ) / λ i,the estimator λ b i (β 0 ) solves d λi β 0, λ b i (β 0 ) =0. Let us also introduce notation for the terms: κ λλi κ λλ (β 0, λ i0 )=E T d λλi (β 0, λ i0 ) x i, λ i κ βλλi κ βλλ (β 0, λ i0 )=E T d βλλi (β 0, λ i0 ) x i, λ i Note that κ λλi and κ βλλi are individual specific because they depend on λ i0, but they do not depend on the y s. 3 Moreover, from the information matrix identity E T d λi (β 0, λ i0 ) d λi (β 0, λ i0 ) x i, λ i = κ λλi. 2 Since h i T T T d βλi (β 0, λ i0 ) 0 = O p (), wehaved βλi (β 0, λ i0 )=O p. 3 Also T d λλi (β 0, λ i0 ) = κ λλ (β 0, λ i0 ) + O p T, which holds as T T d λλi (β 0, λ i0 ) κ λλ (β 0, λ i0 ) = O p (). 35

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006 Comments on: Panel Data Analysis Advantages and Challenges Manuel Arellano CEMFI, Madrid November 2006 This paper provides an impressive, yet compact and easily accessible review of the econometric literature

More information

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools Syllabus By Joan Llull Microeconometrics. IDEA PhD Program. Fall 2017 Chapter 1: Introduction and a Brief Review of Relevant Tools I. Overview II. Maximum Likelihood A. The Likelihood Principle B. The

More information

A dynamic model for binary panel data with unobserved heterogeneity admitting a n-consistent conditional estimator

A dynamic model for binary panel data with unobserved heterogeneity admitting a n-consistent conditional estimator A dynamic model for binary panel data with unobserved heterogeneity admitting a n-consistent conditional estimator Francesco Bartolucci and Valentina Nigro Abstract A model for binary panel data is introduced

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

Informational Content in Static and Dynamic Discrete Response Panel Data Models

Informational Content in Static and Dynamic Discrete Response Panel Data Models Informational Content in Static and Dynamic Discrete Response Panel Data Models S. Chen HKUST S. Khan Duke University X. Tang Rice University May 13, 2016 Preliminary and Incomplete Version Abstract We

More information

Non-linear panel data modeling

Non-linear panel data modeling Non-linear panel data modeling Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini May 2010 Laura Magazzini (@univr.it) Non-linear panel data modeling May 2010 1

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Estimation of Structural Parameters and Marginal Effects in Binary Choice Panel Data Models with Fixed Effects

Estimation of Structural Parameters and Marginal Effects in Binary Choice Panel Data Models with Fixed Effects Estimation of Structural Parameters and Marginal Effects in Binary Choice Panel Data Models with Fixed Effects Iván Fernández-Val Department of Economics MIT JOB MARKET PAPER February 3, 2005 Abstract

More information

Panel Data Exercises Manuel Arellano. Using panel data, a researcher considers the estimation of the following system:

Panel Data Exercises Manuel Arellano. Using panel data, a researcher considers the estimation of the following system: Panel Data Exercises Manuel Arellano Exercise 1 Using panel data, a researcher considers the estimation of the following system: y 1t = α 1 + βx 1t + v 1t. (t =1,..., T ) y Nt = α N + βx Nt + v Nt where

More information

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels.

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Pedro Albarran y Raquel Carrasco z Jesus M. Carro x June 2014 Preliminary and Incomplete Abstract This paper presents and evaluates

More information

On IV estimation of the dynamic binary panel data model with fixed effects

On IV estimation of the dynamic binary panel data model with fixed effects On IV estimation of the dynamic binary panel data model with fixed effects Andrew Adrian Yu Pua March 30, 2015 Abstract A big part of applied research still uses IV to estimate a dynamic linear probability

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

Panel Data Seminar. Discrete Response Models. Crest-Insee. 11 April 2008

Panel Data Seminar. Discrete Response Models. Crest-Insee. 11 April 2008 Panel Data Seminar Discrete Response Models Romain Aeberhardt Laurent Davezies Crest-Insee 11 April 2008 Aeberhardt and Davezies (Crest-Insee) Panel Data Seminar 11 April 2008 1 / 29 Contents Overview

More information

Efficient Estimation of Non-Linear Dynamic Panel Data Models with Application to Smooth Transition Models

Efficient Estimation of Non-Linear Dynamic Panel Data Models with Application to Smooth Transition Models Efficient Estimation of Non-Linear Dynamic Panel Data Models with Application to Smooth Transition Models Tue Gørgens Christopher L. Skeels Allan H. Würtz February 16, 2008 Preliminary and incomplete.

More information

Limited Dependent Variables and Panel Data

Limited Dependent Variables and Panel Data and Panel Data June 24 th, 2009 Structure 1 2 Many economic questions involve the explanation of binary variables, e.g.: explaining the participation of women in the labor market explaining retirement

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Identification in Discrete Choice Models with Fixed Effects

Identification in Discrete Choice Models with Fixed Effects Identification in Discrete Choice Models with Fixed Effects Edward G. Johnson August 2004 Abstract This paper deals with issues of identification in parametric discrete choice panel data models with fixed

More information

Identification and Estimation of Marginal Effects in Nonlinear Panel Models 1

Identification and Estimation of Marginal Effects in Nonlinear Panel Models 1 Identification and Estimation of Marginal Effects in Nonlinear Panel Models 1 Victor Chernozhukov Iván Fernández-Val Jinyong Hahn Whitney Newey MIT BU UCLA MIT February 4, 2009 1 First version of May 2007.

More information

Microeconometrics. C. Hsiao (2014), Analysis of Panel Data, 3rd edition. Cambridge, University Press.

Microeconometrics. C. Hsiao (2014), Analysis of Panel Data, 3rd edition. Cambridge, University Press. Cheng Hsiao Microeconometrics Required Text: C. Hsiao (2014), Analysis of Panel Data, 3rd edition. Cambridge, University Press. A.C. Cameron and P.K. Trivedi (2005), Microeconometrics, Cambridge University

More information

Binary Models with Endogenous Explanatory Variables

Binary Models with Endogenous Explanatory Variables Binary Models with Endogenous Explanatory Variables Class otes Manuel Arellano ovember 7, 2007 Revised: January 21, 2008 1 Introduction In Part I we considered linear and non-linear models with additive

More information

1 Estimation of Persistent Dynamic Panel Data. Motivation

1 Estimation of Persistent Dynamic Panel Data. Motivation 1 Estimation of Persistent Dynamic Panel Data. Motivation Consider the following Dynamic Panel Data (DPD) model y it = y it 1 ρ + x it β + µ i + v it (1.1) with i = {1, 2,..., N} denoting the individual

More information

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates Identification and Estimation of Nonlinear Dynamic Panel Data Models with Unobserved Covariates Ji-Liang Shiu and Yingyao Hu July 8, 2010 Abstract This paper considers nonparametric identification of nonlinear

More information

Econ 273B Advanced Econometrics Spring

Econ 273B Advanced Econometrics Spring Econ 273B Advanced Econometrics Spring 2005-6 Aprajit Mahajan email: amahajan@stanford.edu Landau 233 OH: Th 3-5 or by appt. This is a graduate level course in econometrics. The rst part of the course

More information

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i, A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE Introduction Chapter 6. Panel Data 2 Panel data The term panel data refers to data sets with repeated observations over

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

Semiparametric Identification in Panel Data Discrete Response Models

Semiparametric Identification in Panel Data Discrete Response Models Semiparametric Identification in Panel Data Discrete Response Models Eleni Aristodemou UCL March 8, 2016 Please click here for the latest version. Abstract This paper studies partial identification in

More information

Iterative Bias Correction Procedures Revisited: A Small Scale Monte Carlo Study

Iterative Bias Correction Procedures Revisited: A Small Scale Monte Carlo Study Discussion Paper: 2015/02 Iterative Bias Correction Procedures Revisited: A Small Scale Monte Carlo Study Artūras Juodis www.ase.uva.nl/uva-econometrics Amsterdam School of Economics Roetersstraat 11 1018

More information

IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION

IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION IDENTIFICATION OF THE BINARY CHOICE MODEL WITH MISCLASSIFICATION Arthur Lewbel Boston College December 19, 2000 Abstract MisclassiÞcation in binary choice (binomial response) models occurs when the dependent

More information

UNIVERSITY OF CALIFORNIA Spring Economics 241A Econometrics

UNIVERSITY OF CALIFORNIA Spring Economics 241A Econometrics DEPARTMENT OF ECONOMICS R. Smith, J. Powell UNIVERSITY OF CALIFORNIA Spring 2006 Economics 241A Econometrics This course will cover nonlinear statistical models for the analysis of cross-sectional and

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Censoring and truncation b)

More information

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates Identification and Estimation of Nonlinear Dynamic Panel Data Models with Unobserved Covariates Ji-Liang Shiu and Yingyao Hu April 3, 2010 Abstract This paper considers nonparametric identification of

More information

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON

Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Parametric identification of multiplicative exponential heteroskedasticity ALYSSA CARLSON Department of Economics, Michigan State University East Lansing, MI 48824-1038, United States (email: carls405@msu.edu)

More information

PROFILE-SCORE ADJUSTMENTS FOR INCIDENTAL-PARAMETER PROBLEMS. Geert Dhaene KU Leuven, Naamsestraat 69, B-3000 Leuven, Belgium

PROFILE-SCORE ADJUSTMENTS FOR INCIDENTAL-PARAMETER PROBLEMS. Geert Dhaene KU Leuven, Naamsestraat 69, B-3000 Leuven, Belgium PROFILE-SCORE ADJUSTMENTS FOR INCIDENTAL-PARAMETER PROBLEMS Geert Dhaene KU Leuven Naamsestraat 69 B-3000 Leuven Belgium geert.dhaene@kuleuven.be Koen Jochmans Sciences Po 28 rue des Saints-Pères 75007

More information

Short T Panels - Review

Short T Panels - Review Short T Panels - Review We have looked at methods for estimating parameters on time-varying explanatory variables consistently in panels with many cross-section observation units but a small number of

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation Seung C. Ahn Arizona State University, Tempe, AZ 85187, USA Peter Schmidt * Michigan State University,

More information

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley Panel Data Models James L. Powell Department of Economics University of California, Berkeley Overview Like Zellner s seemingly unrelated regression models, the dependent and explanatory variables for panel

More information

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II Jeff Wooldridge IRP Lectures, UW Madison, August 2008 5. Estimating Production Functions Using Proxy Variables 6. Pseudo Panels

More information

A Note on Demand Estimation with Supply Information. in Non-Linear Models

A Note on Demand Estimation with Supply Information. in Non-Linear Models A Note on Demand Estimation with Supply Information in Non-Linear Models Tongil TI Kim Emory University J. Miguel Villas-Boas University of California, Berkeley May, 2018 Keywords: demand estimation, limited

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

Exclusion Restrictions in Dynamic Binary Choice Panel Data Models

Exclusion Restrictions in Dynamic Binary Choice Panel Data Models Exclusion Restrictions in Dynamic Binary Choice Panel Data Models Songnian Chen HKUST Shakeeb Khan Boston College Xun Tang Rice University February 2, 208 Abstract In this note we revisit the use of exclusion

More information

Bias Corrections for Two-Step Fixed Effects Panel Data Estimators

Bias Corrections for Two-Step Fixed Effects Panel Data Estimators Bias Corrections for Two-Step Fixed Effects Panel Data Estimators Iván Fernández-Val Boston University Francis Vella Georgetown University February 26, 2007 Abstract This paper introduces bias-corrected

More information

xtunbalmd: Dynamic Binary Random E ects Models Estimation with Unbalanced Panels

xtunbalmd: Dynamic Binary Random E ects Models Estimation with Unbalanced Panels : Dynamic Binary Random E ects Models Estimation with Unbalanced Panels Pedro Albarran* Raquel Carrasco** Jesus M. Carro** *Universidad de Alicante **Universidad Carlos III de Madrid 2017 Spanish Stata

More information

Estimation of Dynamic Panel Data Models with Sample Selection

Estimation of Dynamic Panel Data Models with Sample Selection === Estimation of Dynamic Panel Data Models with Sample Selection Anastasia Semykina* Department of Economics Florida State University Tallahassee, FL 32306-2180 asemykina@fsu.edu Jeffrey M. Wooldridge

More information

Department of Economics Seminar Series. Yoonseok Lee University of Michigan. Model Selection in the Presence of Incidental Parameters

Department of Economics Seminar Series. Yoonseok Lee University of Michigan. Model Selection in the Presence of Incidental Parameters Department of Economics Seminar Series Yoonseok Lee University of Michigan Model Selection in the Presence of Incidental Parameters Friday, January 25, 2013 9:30 a.m. S110 Memorial Union Model Selection

More information

Lecture 8 Panel Data

Lecture 8 Panel Data Lecture 8 Panel Data Economics 8379 George Washington University Instructor: Prof. Ben Williams Introduction This lecture will discuss some common panel data methods and problems. Random effects vs. fixed

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Dynamic logit model: pseudo conditional likelihood estimation and latent Markov extension

Dynamic logit model: pseudo conditional likelihood estimation and latent Markov extension Dynamic logit model: pseudo conditional likelihood estimation and latent Markov extension Francesco Bartolucci 1 Department of Economics, Finance and Statistics University of Perugia, IT bart@stat.unipg.it

More information

Course Description. Course Requirements

Course Description. Course Requirements University of Pennsylvania Spring 2007 Econ 721: Advanced Microeconometrics Petra Todd Course Description Lecture: 9:00-10:20 Tuesdays and Thursdays Office Hours: 10am-12 Fridays or by appointment. To

More information

Estimating Semi-parametric Panel Multinomial Choice Models

Estimating Semi-parametric Panel Multinomial Choice Models Estimating Semi-parametric Panel Multinomial Choice Models Xiaoxia Shi, Matthew Shum, Wei Song UW-Madison, Caltech, UW-Madison September 15, 2016 1 / 31 Introduction We consider the panel multinomial choice

More information

Estimation of Dynamic Regression Models

Estimation of Dynamic Regression Models University of Pavia 2007 Estimation of Dynamic Regression Models Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy.

More information

Increasing the Power of Specification Tests. November 18, 2018

Increasing the Power of Specification Tests. November 18, 2018 Increasing the Power of Specification Tests T W J A. H U A MIT November 18, 2018 A. This paper shows how to increase the power of Hausman s (1978) specification test as well as the difference test in a

More information

1. GENERAL DESCRIPTION

1. GENERAL DESCRIPTION Econometrics II SYLLABUS Dr. Seung Chan Ahn Sogang University Spring 2003 1. GENERAL DESCRIPTION This course presumes that students have completed Econometrics I or equivalent. This course is designed

More information

THE BEHAVIOR OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF DYNAMIC PANEL DATA SAMPLE SELECTION MODELS

THE BEHAVIOR OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF DYNAMIC PANEL DATA SAMPLE SELECTION MODELS THE BEHAVIOR OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF DYNAMIC PANEL DATA SAMPLE SELECTION MODELS WLADIMIR RAYMOND PIERRE MOHNEN FRANZ PALM SYBRAND SCHIM VAN DER LOEFF CESIFO WORKING PAPER NO. 1992 CATEGORY

More information

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects MPRA Munich Personal RePEc Archive Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects Mohamed R. Abonazel Department of Applied Statistics and Econometrics, Institute of Statistical

More information

A nonparametric test for path dependence in discrete panel data

A nonparametric test for path dependence in discrete panel data A nonparametric test for path dependence in discrete panel data Maximilian Kasy Department of Economics, University of California - Los Angeles, 8283 Bunche Hall, Mail Stop: 147703, Los Angeles, CA 90095,

More information

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Matthew Harding and Carlos Lamarche January 12, 2011 Abstract We propose a method for estimating

More information

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit R. G. Pierse 1 Introduction In lecture 5 of last semester s course, we looked at the reasons for including dichotomous variables

More information

Nonlinear Panel Data Analysis

Nonlinear Panel Data Analysis Nonlinear Panel Data Analysis Manuel Arellano CEMFI, Madrid Stéphane Bonhomme CEMFI, Madrid Revised version: January 2011 Abstract Nonlinear panel data models arise naturally in economic applications,

More information

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 7 Introduction to Specification Testing in Dynamic Econometric Models In this lecture I want to briefly describe

More information

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity Jörg Schwiebert Abstract In this paper, we derive a semiparametric estimation procedure for the sample selection model

More information

APEC 8212: Econometric Analysis II

APEC 8212: Econometric Analysis II APEC 8212: Econometric Analysis II Instructor: Paul Glewwe Spring, 2014 Office: 337a Ruttan Hall (formerly Classroom Office Building) Phone: 612-625-0225 E-Mail: pglewwe@umn.edu Class Website: http://faculty.apec.umn.edu/pglewwe/apec8212.html

More information

Model Selection in the Presence of Incidental Parameters

Model Selection in the Presence of Incidental Parameters Model Selection in the Presence of Incidental Parameters Yoonseok Lee University of Michigan February 2012 Abstract This paper considers model selection of nonlinear panel data models in the presence of

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:

More information

CALCULATION METHOD FOR NONLINEAR DYNAMIC LEAST-ABSOLUTE DEVIATIONS ESTIMATOR

CALCULATION METHOD FOR NONLINEAR DYNAMIC LEAST-ABSOLUTE DEVIATIONS ESTIMATOR J. Japan Statist. Soc. Vol. 3 No. 200 39 5 CALCULAION MEHOD FOR NONLINEAR DYNAMIC LEAS-ABSOLUE DEVIAIONS ESIMAOR Kohtaro Hitomi * and Masato Kagihara ** In a nonlinear dynamic model, the consistency and

More information

Lecture 6: Discrete Choice: Qualitative Response

Lecture 6: Discrete Choice: Qualitative Response Lecture 6: Instructor: Department of Economics Stanford University 2011 Types of Discrete Choice Models Univariate Models Binary: Linear; Probit; Logit; Arctan, etc. Multinomial: Logit; Nested Logit; GEV;

More information

Lecture 6: Dynamic panel models 1

Lecture 6: Dynamic panel models 1 Lecture 6: Dynamic panel models 1 Ragnar Nymoen Department of Economics, UiO 16 February 2010 Main issues and references Pre-determinedness and endogeneity of lagged regressors in FE model, and RE model

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data Panel data Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data - possible to control for some unobserved heterogeneity - possible

More information

Multivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal?

Multivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal? Multivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal? Dale J. Poirier and Deven Kapadia University of California, Irvine March 10, 2012 Abstract We provide

More information

Model Selection in the Presence of Incidental Parameters

Model Selection in the Presence of Incidental Parameters Model Selection in the Presence of Incidental Parameters Yoonseok Lee University of Michigan July 2012 Abstract This paper considers model selection of nonlinear panel data models in the presence of incidental

More information

Course Description. Course Requirements

Course Description. Course Requirements University of Pennsylvania Fall, 2016 Econ 721: Advanced Micro Econometrics Petra Todd Course Description Lecture: 10:30-11:50 Mondays and Wednesdays Office Hours: 10-11am Fridays or by appointment. To

More information

New Developments in Econometrics Lecture 16: Quantile Estimation

New Developments in Econometrics Lecture 16: Quantile Estimation New Developments in Econometrics Lecture 16: Quantile Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. Review of Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile

More information

An Exponential Class of Dynamic Binary Choice Panel Data Models with Fixed Effects

An Exponential Class of Dynamic Binary Choice Panel Data Models with Fixed Effects DISCUSSION PAPER SERIES IZA DP No. 7054 An Exponential Class of Dynamic Binary Choice Panel Data Models with Fixed Effects Majid M. Al-Sadoon Tong Li M. Hashem Pesaran November 2012 Forschungsinstitut

More information

Estimation of Treatment Effects under Essential Heterogeneity

Estimation of Treatment Effects under Essential Heterogeneity Estimation of Treatment Effects under Essential Heterogeneity James Heckman University of Chicago and American Bar Foundation Sergio Urzua University of Chicago Edward Vytlacil Columbia University March

More information

Christopher Dougherty London School of Economics and Political Science

Christopher Dougherty London School of Economics and Political Science Introduction to Econometrics FIFTH EDITION Christopher Dougherty London School of Economics and Political Science OXFORD UNIVERSITY PRESS Contents INTRODU CTION 1 Why study econometrics? 1 Aim of this

More information

The Estimation of Simultaneous Equation Models under Conditional Heteroscedasticity

The Estimation of Simultaneous Equation Models under Conditional Heteroscedasticity The Estimation of Simultaneous Equation Models under Conditional Heteroscedasticity E M. I and G D.A. P University of Alicante Cardiff University This version: April 2004. Preliminary and to be completed

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

Improving GMM efficiency in dynamic models for panel data with mean stationarity

Improving GMM efficiency in dynamic models for panel data with mean stationarity Working Paper Series Department of Economics University of Verona Improving GMM efficiency in dynamic models for panel data with mean stationarity Giorgio Calzolari, Laura Magazzini WP Number: 12 July

More information

1 Introduction The time series properties of economic series (orders of integration and cointegration) are often of considerable interest. In micro pa

1 Introduction The time series properties of economic series (orders of integration and cointegration) are often of considerable interest. In micro pa Unit Roots and Identification in Autoregressive Panel Data Models: A Comparison of Alternative Tests Stephen Bond Institute for Fiscal Studies and Nuffield College, Oxford Céline Nauges LEERNA-INRA Toulouse

More information

Working Paper No Maximum score type estimators

Working Paper No Maximum score type estimators Warsaw School of Economics Institute of Econometrics Department of Applied Econometrics Department of Applied Econometrics Working Papers Warsaw School of Economics Al. iepodleglosci 64 02-554 Warszawa,

More information

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels Pedro Albarran y Raquel Carrasco z Jesus M. Carro x February 2015 Abstract This paper presents and evaluates estimation methods

More information

A PENALTY FUNCTION APPROACH TO BIAS REDUCTION IN NON-LINEAR PANEL MODELS WITH FIXED EFFECTS

A PENALTY FUNCTION APPROACH TO BIAS REDUCTION IN NON-LINEAR PANEL MODELS WITH FIXED EFFECTS A PENALTY FUNCTION APPROACH TO BIAS REDUCTION IN NON-LINEAR PANEL MODELS WITH FIXED EFFECTS C. ALAN BESTER AND CHRISTIAN HANSEN Abstract. In this paper, we consider estimation of nonlinear panel data models

More information

Parametric Identification of Multiplicative Exponential Heteroskedasticity

Parametric Identification of Multiplicative Exponential Heteroskedasticity Parametric Identification of Multiplicative Exponential Heteroskedasticity Alyssa Carlson Department of Economics, Michigan State University East Lansing, MI 48824-1038, United States Dated: October 5,

More information

Course Description. Course Requirements

Course Description. Course Requirements University of Pennsylvania Fall, 2015 Econ 721: Econometrics III Advanced Petra Todd Course Description Lecture: 10:30-11:50 Mondays and Wednesdays Office Hours: 10-11am Tuesdays or by appointment. To

More information

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Tak Wai Chau February 20, 2014 Abstract This paper investigates the nite sample performance of a minimum distance estimator

More information

The regression model with one stochastic regressor (part II)

The regression model with one stochastic regressor (part II) The regression model with one stochastic regressor (part II) 3150/4150 Lecture 7 Ragnar Nymoen 6 Feb 2012 We will finish Lecture topic 4: The regression model with stochastic regressor We will first look

More information

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity John C. Chao, Department of Economics, University of Maryland, chao@econ.umd.edu. Jerry A. Hausman, Department of Economics,

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Linear Regression with Time Series Data

Linear Regression with Time Series Data Econometrics 2 Linear Regression with Time Series Data Heino Bohn Nielsen 1of21 Outline (1) The linear regression model, identification and estimation. (2) Assumptions and results: (a) Consistency. (b)

More information

Discrete Dependent Variable Models

Discrete Dependent Variable Models Discrete Dependent Variable Models James J. Heckman University of Chicago This draft, April 10, 2006 Here s the general approach of this lecture: Economic model Decision rule (e.g. utility maximization)

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

ECON 4160, Autumn term Lecture 1

ECON 4160, Autumn term Lecture 1 ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least

More information

Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects

Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects Brigham R. Frandsen Lars J. Lefgren December 16, 2015 Abstract This article develops

More information

Discrete Time Duration Models with Group level Heterogeneity

Discrete Time Duration Models with Group level Heterogeneity This work is distributed as a Discussion Paper by the STANFORD INSTITUTE FOR ECONOMIC POLICY RESEARCH SIEPR Discussion Paper No. 05-08 Discrete Time Duration Models with Group level Heterogeneity By Anders

More information

Efficiency of repeated-cross-section estimators in fixed-effects models

Efficiency of repeated-cross-section estimators in fixed-effects models Efficiency of repeated-cross-section estimators in fixed-effects models Montezuma Dumangane and Nicoletta Rosati CEMAPRE and ISEG-UTL January 2009 Abstract PRELIMINARY AND INCOMPLETE Exploiting across

More information