Testing for time-invariant unobserved heterogeneity in nonlinear panel-data models


Francesco Bartolucci (University of Perugia), Federico Belotti (University of Rome Tor Vergata), Franco Peracchi (University of Rome Tor Vergata)

August 29, 2012

Abstract

Recent literature on nonlinear panel data has emphasized the importance of accounting for time-varying unobserved heterogeneity, which may stem either from time-varying unit-specific omitted variables or from macro-level shocks that affect each individual unit differently. In this paper, we propose a Hausman-like procedure to test the null hypothesis of time-invariant individual effects. The test can be used when the dependent variable is discrete and is based on a comparison between standard and pairwise conditional likelihood estimators. It requires no assumptions on the distribution of the time-varying individual effects; in particular, it does not require them to be independent of the covariates in the model. We investigate the finite-sample properties of the test by a simulation study. The results show good size and power properties of the proposed test, especially with ordinal outcomes. A health economics example based on a sample from the Health and Retirement Study is used to illustrate the test.

Keywords: Categorical response models; conditional likelihood; Hausman-like test; self-reported health; Health and Retirement Study.

JEL: C12, C33, C35.

Preliminary draft. We thank seminar participants at the University of Padua, Department of Statistical Sciences, for useful comments. We also thank Florian Heiss for allowing us to use his arldv Stata package. Corresponding author: Federico Belotti, CEIS and Department of Economics and Finance, University of Rome Tor Vergata, via Columbia 2, Rome, Italy. E-mail: federico.belotti@uniroma2.it.

1 Introduction

A distinctive feature of panel data modeling is the treatment of unobserved heterogeneity, which is typically interpreted as the effect of unobservable factors on the outcome of interest. The simplest way of dealing with this form of heterogeneity is to include time-invariant individual effects in the model. For a detailed treatment see Hsiao (2005), Wooldridge (2010), and Arellano & Bonhomme (2011). However, assuming that these effects are constant over time may be difficult to justify in certain applications. For example, Stowasser et al. (2011) argue that the dynamic pattern of self-reported health status (SRHS) can be better modeled by introducing a latent time-varying individual-specific true health component. Biased parameter estimates may result if the individual effects are time-varying but are assumed to be time-invariant. This is especially true in the case of long panels.

A few studies try to relax the assumption of time-invariant individual effects by using a dynamic latent process for the unobserved heterogeneity. In the case of nonlinear panel data models, one strategy is to include time-varying random effects, which are assumed to be independent of the covariates and are treated as either continuous or discrete. In particular, Heiss (2008) proposes a limited dependent variable model which, for every sample unit, relies on a sequence of time-varying effects assumed to follow an AR(1) process with common parameters. Bartolucci & Farcomeni (2009) propose a multivariate extension of the dynamic logit model based on time-varying individual effects which are assumed to follow a time-homogeneous Markov chain for every sample unit. The approaches mentioned above to account for time-varying unobserved heterogeneity have pros and cons.
Although Heiss's (2008) formulation is parsimonious (it uses only one more parameter than a standard random-effects model) and more easily justifiable (continuous random effects are more naturally conceivable in most applications), the discrete approach, formulated as in Bartolucci & Farcomeni (2009), usually results in a more flexible model which may reach a better fit; see Bartolucci et al. (2012) for more detailed comments. On the other hand, both approaches require assumptions on the distribution of the random effects and are computationally demanding. Therefore, practitioners may find it useful to test for the presence of time-invariant unobserved heterogeneity before estimating these types of models.

In this paper, we propose a Hausman-like specification test for the null hypothesis of time-invariant individual effects in nonlinear panel data models against the alternative that these effects are time-varying. Our test is a pure specification test because it leaves the alternative deliberately vague. The test is based on a comparison between standard and pairwise conditional likelihood estimators. The two estimators converge to the same point in the parameter space under the null, but they diverge under the alternative. The proposed test is attractive because: (i) it does not require assumptions on the distribution of the unit-specific effects; (ii) it allows them to depend on observed explanatory variables; (iii) it can be used when the dependent variable is binary or ordinal; and (iv) it can be easily implemented using existing software for conditional likelihood inference in panel data models. [1] The test can also be viewed as a specification test against time-varying omitted variables that are possibly correlated with the covariates included in the model.

In order to investigate the finite-sample properties of the proposed test, we performed a simulation study. The results of this study show that the test performs quite well, with small size distortion and good power properties, especially when $n \geq 2000$ and $T \geq 7$ (common sample sizes in economics) and when the dependent variable is ordinal.

The remainder of this article is organized as follows. Section 2 presents the statistical framework and the proposed test. Section 3 investigates the small-sample properties of the test by simulation, while Section 4 provides an illustration based on a health economics application. Finally, Section 5 offers some conclusions.

2 The proposed test

We consider a general model based on a latent continuous random variable $y^*_{it}$ for the $i$th unit at time $t$. In particular, we assume that

$$y_{it} = G(y^*_{it}), \qquad (1)$$

where $G(\cdot)$
is a parametric function which may depend on specific parameters according to the nature of $y_{it}$, and

$$y^*_{it} = \alpha_{it} + x_{it}'\beta + \varepsilon_{it}, \quad i = 1, \ldots, n, \quad t = 1, \ldots, T, \qquad (2)$$

where $x_{it}$ is a vector of $c$ covariates, $\beta$ is a parameter vector, $\alpha_{it}$ represents unobservable individual time-varying characteristics, and $\varepsilon_{it}$ is a random error.

[1] The specification test described in this article has been implemented in a series of R and Stata functions which are available from the authors upon request.

Without loss of generality, we will focus

our attention on the case of binary and ordinal outcomes, assuming that $y_{it}$ is tied to the latent variable $y^*_{it}$ through the following $G(\cdot)$ function

$$y_{it} = G(y^*_{it}) = \sum_{j=0}^{J-1} j \, 1\{\tau_j \leq y^*_{it} < \tau_{j+1}\}, \qquad (3)$$

where $J$ is the number of response categories and $-\infty = \tau_0 < \tau_1 < \cdots < \tau_{J-1} < \tau_J = \infty$ is a set of thresholds. When $J > 2$, $y_{it}$ is an ordinal outcome variable while, when $J = 2$, $y_{it}$ becomes a binary 0-1 indicator and model (1) collapses to a simple binary outcome model.

The null hypothesis of interest is that the effect of unobserved heterogeneity, accounted for by the parameters $\alpha_{it}$, is in fact constant over time, that is,

$$H_0: \alpha_{it} = \alpha_i, \quad i = 1, \ldots, n, \quad t = 1, \ldots, T, \qquad (4)$$

whereas the alternative hypothesis ($H_1$) is that unobserved heterogeneity is time-varying, without a priori assumptions on how it evolves over time. In order to test $H_0$, we propose a procedure based on the comparison of two asymptotically normal estimators which are both consistent under the null, but diverge under the alternative. In the following, we provide some details on these estimators, introduce the test statistic based on them and study its null asymptotic distribution.

When $J = 2$ (case of binary outcomes), a consistent estimator of $\beta$ under the null hypothesis of time-invariant heterogeneity may be obtained from the Conditional Maximum Likelihood (CML) approach proposed by Andersen (1970) and Chamberlain (1980). This approach assumes that, under the null and conditionally on $\alpha_i$ and $x_{it}$, the errors $\varepsilon_{it}$ are independently and identically distributed (IID) with a standard logistic distribution.
The resulting estimator is based on the maximization of the following conditional log-likelihood function

$$\ell_1(\beta) = \sum_{i=1}^{n} \ell_{1i}(\beta), \qquad \ell_{1i}(\beta) = \log p(y_i \mid y_{i+}),$$

where $y_i = (y_{i1}, \ldots, y_{iT})'$ is the sequence of responses for unit $i$, $y_{i+} = \sum_t y_{it}$ is a sufficient statistic for the time-invariant individual effect $\alpha_i$, and $p(y_i \mid y_{i+})$ is the conditional probability of the observed sequence of responses given the sufficient statistic. More explicitly, we have

$$p(y_i \mid y_{i+}) = \frac{\exp\left(\sum_t y_{it} x_{it}'\beta\right)}{\sum_{z: z_+ = y_{i+}} \exp\left(\sum_t z_t x_{it}'\beta\right)}, \qquad (5)$$

where $z = (z_1, \ldots, z_T)'$ denotes a vector of zeros and ones of the same dimension as $y_i$, and the sum in the denominator is over all the vectors $z$ whose elements sum up to $y_{i+}$.
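To make equation (5) concrete, the following is a minimal Python sketch that evaluates the conditional probability by brute-force enumeration of all binary sequences with the same total. The function name `conditional_logit_prob` and the list-based data layout are assumptions made for illustration; this is not the authors' implementation, and the enumeration is only practical for short panels.

```python
import itertools
import math


def conditional_logit_prob(y, X, beta):
    """Evaluate p(y_i | y_i+) of equation (5) for one unit.

    y    : list of T binary responses (0 or 1)
    X    : list of T covariate rows, each a list of c numbers
    beta : list of c coefficients
    """
    T = len(y)
    # Linear index x_it' beta for each period t
    xb = [sum(x_k * b_k for x_k, b_k in zip(X[t], beta)) for t in range(T)]
    total = sum(y)  # sufficient statistic y_{i+}
    numerator = math.exp(sum(y_t * xb_t for y_t, xb_t in zip(y, xb)))
    # Sum over every 0-1 vector z whose elements add up to y_{i+};
    # each such z is identified by the positions of its ones.
    denominator = sum(
        math.exp(sum(xb[t] for t in ones))
        for ones in itertools.combinations(range(T), total)
    )
    return numerator / denominator
```

Because the individual effect cancels between numerator and denominator, it does not appear among the arguments. Units whose responses are all zeros or all ones get probability one and therefore contribute nothing to the log-likelihood.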

When $J > 2$ (case of ordinal outcomes), a number of estimators have been proposed in the literature. Chamberlain (1980) proposes to reduce the ordered model to a binary one by dichotomizing the ordinal outcome at a specific cut-off point $j$. However, the resulting estimator, say $\hat\beta^{j}$, does not exploit all the variation in the response, as units for which either $y_{it} < j$ or $y_{it} \geq j$ for every $t$ do not contribute to the log-likelihood. Following Chamberlain's (1980) idea, Das & van Soest (1999), Ferrer-i-Carbonell & Frijters (2004) and Baetschmann et al. (2011) propose different strategies to exploit all the information available in the data. In this paper, we follow Baetschmann et al. (2011) by considering all possible dichotomizations $y^{(j)}_{it}$ of the ordered outcome $y_{it}$ for each unit in the sample, where

$$y^{(j)}_{it} = 1\{y_{it} > j - 1\}, \quad j = 1, \ldots, J-1.$$

Under the assumption that the unknown parameter vector is the same for all $y^{(j)}_{it}$, the quasi-log-likelihood of this restricted CML estimator is

$$\ell_1(\beta) = \sum_{i=1}^{n} \sum_{j=1}^{J-1} \log p\left(y^{(j)}_i \mid y^{(j)}_{i+}\right). \qquad (6)$$

At this stage, in order to construct our test we need an alternative estimator of the parameter vector $\beta$, which is also consistent under the null but has different convergence properties under the alternative. When $J = 2$, one such estimator may be obtained by maximizing the pairwise version of $\ell_1(\beta)$, that is

$$\ell_2(\beta) = \sum_{i=1}^{n} \ell_{2i}(\beta), \qquad \ell_{2i}(\beta) = \sum_{t=2}^{T} \log p(y_{i,t-1}, y_{it} \mid y_{i,t-1} + y_{it}),$$

where the conditional probability is defined as in (5) for each single pair of consecutive observations. This estimator is not guaranteed to be consistent under the alternative, but it has a different asymptotic bias than the estimator based on $\ell_1(\beta)$. When $J > 2$, a natural estimator to consider is the maximizer of

$$\ell_2(\beta) = \sum_{i=1}^{n} \sum_{j=1}^{J-1} \sum_{t=2}^{T} \log p\left(y^{(j)}_{i,t-1}, y^{(j)}_{it} \mid y^{(j)}_{i,t-1} + y^{(j)}_{it}\right). \qquad (7)$$

Regardless of the specific nature of the discrete outcome, let $\hat\beta_1$ denote the estimator of $\beta$ obtained by maximizing $\ell_1(\beta)$ and let $\hat\beta_2$ denote the estimator obtained by maximizing $\ell_2(\beta)$. Under $H_0$, it can be shown that $\hat\beta_1 \xrightarrow{p} \beta_0$ and $\hat\beta_2 \xrightarrow{p} \beta_0$, where $\beta_0$ denotes the true value of $\beta$, and that

$$\sqrt{n} \begin{pmatrix} \hat\beta_1 - \beta_0 \\ \hat\beta_2 - \beta_0 \end{pmatrix} \xrightarrow{d} N(0, W_0),$$

where the joint variance-covariance matrix $W_0$ may be consistently estimated by a sandwich formula. More specifically, when $J = 2$ (binary case) the consistent estimator of $W_0$ is

$$\hat W_0 = \begin{pmatrix} H_1 & O \\ O & H_2 \end{pmatrix}^{-1} \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} \begin{pmatrix} H_1 & O \\ O & H_2 \end{pmatrix}^{-1},$$

with

$$H_a = \frac{1}{n} \sum_{i=1}^{n} \left. \frac{\partial^2 \ell_{ai}(\beta)}{\partial\beta \, \partial\beta'} \right|_{\beta = \hat\beta_a}, \quad a = 1, 2,$$

$$S_{ab} = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial \ell_{ai}(\hat\beta_a)}{\partial\beta} \frac{\partial \ell_{bi}(\hat\beta_b)}{\partial\beta'}, \quad a, b = 1, 2.$$

A similar expression must be used when $J > 2$ (ordinal outcomes), based on the log-likelihoods defined in (6) and (7). On the other hand, if $H_0$ does not hold, then $\hat\beta_1 \xrightarrow{p} \beta_1$ and $\hat\beta_2 \xrightarrow{p} \beta_2$, with $\beta_1 \neq \beta_2$. This second result follows from the fact that, since $\ell_2(\beta)$ is maximized using only consecutive pairs of observations, $\hat\beta_1$ and $\hat\beta_2$ will have different probability limits in general.

The asymptotic results above suggest the following Hausman-like test statistic for $H_0$:

$$\hat\delta = n (\hat\beta_1 - \hat\beta_2)' \hat V_0^{-1} (\hat\beta_1 - \hat\beta_2), \qquad (8)$$

where $\hat V_0$ is a consistent estimator of the variance-covariance matrix of $\sqrt{n}(\hat\beta_1 - \hat\beta_2)$. This estimator is here computed as $\hat V_0 = D \hat W_0 D'$, where $D = (I_c, -I_c)$ and $I_c$ denotes the identity matrix of size $c$. The null asymptotic distribution of this test statistic is chi-squared with a number of degrees of freedom equal to the number of covariates, that is,

$$\hat\delta \xrightarrow{d} \chi^2_c.$$

On the basis of this result we can test $H_0$ in the usual way and we can compute an asymptotic p-value measuring the evidence provided by the sample against this hypothesis. If the variance-covariance matrix $\hat V_0$ is singular, its inverse may be replaced in (8) by a generalized inverse $\hat V_0^{-}$, and in this case the distribution to use in testing $H_0$ is $\chi^2_{c^*}$, where $c^* < c$ is the rank of $\hat V_0$.

It is worth emphasizing that the proposed testing procedure should have good properties even for other choices of the $G(\cdot)$ function in model (1). Our conjecture is motivated by the fact that, for instance in the case of count and polychotomous unordered outcome variables, the CML approach

may be used to construct two asymptotically normal estimators which are both consistent under the null and are likely to diverge under the alternative.

As far as the drawbacks of the test are concerned, we are forced to assume a logistic distribution in order to exploit the CML approach, implying that no time-invariant regressors can be included in the model (the inclusion of time-invariant covariates causes the variance-covariance matrix $\hat V_0$ to be singular) and that within-unit variation in the dependent variable is required for estimation. [2] Moreover, being a pure specification test, our procedure may lack power in some cases (Holly, 1982) such as, for instance, in the presence of heteroskedastic errors.

[2] Although this is standard practice, we also note that the unit-specific effects enter the model additively.

3 Simulation study

To assess the size and power of the test described in Section 2, we perform a Monte Carlo simulation study, which is described below.

3.1 Simulation design

We consider the following data-generating process (DGP) for the latent variable

$$y^*_{it} = \alpha_{it} + x_{it}\beta + \varepsilon_{it}, \qquad (9)$$

where $x_{it} \sim N(0, 1)$ and every error term $\varepsilon_{it}$ is homoskedastic with a standard logistic distribution. As far as the unit-specific parameters $\alpha_{it}$ are concerned, we consider the following two cases:

1. a Gaussian stationary AR(1) process with unit variance. More precisely, we assume

$$\alpha_{it} = \begin{cases} v_{it}, & t = 1, \\ \rho \alpha_{i,t-1} + (1 - \rho^2)^{1/2} v_{it}, & t = 2, \ldots, T, \end{cases} \qquad (10)$$

with $v_{it} \sim N(0, 1)$. We consider a set of plausible values for the autocorrelation coefficient, $\rho \in \{1, 0.9, 0.8, 0.7, 0.5, 0.25\}$ (for an AR(1) process, the value of $\rho$ equals the first-order autocorrelation). Notice that $\rho = 1$ does not represent the random walk case since, given the above formulation of the autoregressive process, if $\rho = 1$ then $\alpha_{i1} = \alpha_{i2} = \cdots = \alpha_{iT}$ and the unobserved heterogeneity becomes time-invariant ($H_0$);

2. a three-state first-order Homogeneous Markov Chain (HMC) with mean equal to 1.5, unit variance, initial probabilities equal to 1/3 and transition probability matrix

$$\Pi = \begin{pmatrix} 1 - \pi & \pi/2 & \pi/2 \\ \pi/2 & 1 - \pi & \pi/2 \\ \pi/2 & \pi/2 & 1 - \pi \end{pmatrix}, \qquad (11)$$

where the parameter π = (0, […], 0.135, […], […]) is chosen so that the empirical first-order autocorrelation of the discrete process reflects that of the continuous AR(1) process at point 1. As above, if $\pi = 0$, then $\alpha_{i1} = \alpha_{i2} = \cdots = \alpha_{iT}$ and the unit-specific parameters become time-invariant ($H_0$).

Furthermore, when the unit-specific parameters follow a Gaussian stationary AR(1) process, we study the properties of the test when the error terms in equation (12) are heteroskedastic. More precisely, we consider the following case

$$y^*_{it} = \alpha_{it} + x_{it}\beta + \sigma_{it}\varepsilon_{it}, \qquad (12)$$

where $\sigma_{it} = \exp(z_{it}\delta)$ is an observation-specific scale parameter for the distribution of the error term $\varepsilon_{it}$. In particular, we set $z_{it} \sim N(0, 0.25)$ and $\delta = 0.5$.

Given the aforementioned design, we consider both binary and ordinal outcomes using specific thresholds in equation (3). The former is obtained by setting $\tau_1 = 0$, while an ordinal outcome with $J = 5$ categories is generated by setting $\tau_1 = -2$, $\tau_2 = -0.75$, $\tau_3 = 0.75$, $\tau_4 = 2$. We investigate the effect of different sample sizes $n$ (1000, 2000, 4000) and panel lengths $T$ (5, 7, 10). In each experiment we fix $\beta = 1$ in the linear model for the latent variable. Size at the 5% level and power of the test were computed using 1000 replications per experiment.

3.2 Results

Tables 1 to 4 summarize the simulation results by reporting the average and the standard deviation of the test statistic (over replications) as well as the power of the test. The first panel of each table reports the size of the test when the unobserved heterogeneity is time-invariant, i.e. $\rho = 1$ or $\pi = 0$.
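As a rough illustration of the design in Section 3.1, the following Python sketch simulates one unit from equations (9) and (10) and maps the latent variable to a discrete outcome through equation (3). The function name `simulate_unit`, the seed handling and the use of the standard library are assumptions made for illustration only. The ordinal thresholds in the usage lines assume the increasing values (-2, -0.75, 0.75, 2), and the HMC and heteroskedastic variants are not covered.

```python
import bisect
import math
import random


def simulate_unit(T, rho, beta=1.0, thresholds=(0.0,), rng=None):
    """Simulate (y, x, alpha) sequences of length T for one unit.

    thresholds holds the interior cut points tau_1 < ... < tau_{J-1};
    the default (0.0,) gives a binary outcome.
    """
    rng = rng or random.Random()
    y, x, alpha = [], [], []
    for t in range(T):
        v = rng.gauss(0.0, 1.0)
        # Equation (10): stationary Gaussian AR(1) with unit variance
        a = v if t == 0 else rho * alpha[-1] + math.sqrt(1.0 - rho ** 2) * v
        x_t = rng.gauss(0.0, 1.0)
        # Standard logistic error drawn by inverting the logistic CDF
        u = rng.random()
        while u == 0.0:
            u = rng.random()
        eps = math.log(u / (1.0 - u))
        y_star = a + beta * x_t + eps  # equation (9)
        # Equation (3): y = j when tau_j <= y* < tau_{j+1}
        y.append(bisect.bisect_right(thresholds, y_star))
        x.append(x_t)
        alpha.append(a)
    return y, x, alpha


# Binary outcome under H0 (rho = 1 makes alpha constant over time)
y_bin, x_bin, alpha_bin = simulate_unit(T=5, rho=1.0, rng=random.Random(1))
# Ordinal outcome with J = 5 categories under the alternative
y_ord, x_ord, alpha_ord = simulate_unit(
    T=7, rho=0.5, thresholds=(-2.0, -0.75, 0.75, 2.0), rng=random.Random(2)
)
```

Calling the function once per unit with a shared `random.Random` instance gives a full panel. With `rho=1.0` the innovation weight is exactly zero, so the simulated effect stays at its first-period value, which is the null hypothesis of the test.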
Overall, the test shows very small size distortions, regardless of sample size, panel length, nature of the outcome and type of unobserved heterogeneity, although the test tends to be slightly oversized in the presence of heteroskedastic errors, especially in the ordinal case. The remaining panels of each table report the power of the test for different values of $\rho$ and $\pi$.

We start our discussion from Table 1, where the outcome is binary, the unit-specific effects follow

a Gaussian AR(1) process and the errors are homoskedastic. The results for the power of the test can be summarized as follows. First, the power and the autocorrelation coefficient seem to be linked by an inverse U-shaped relationship. Indeed, for given values of $n$ and $T$, the power improves when $\rho$ departs from one but starts to decrease when $\rho$ drops below 0.5. This behavior is coherent with the fact that, as $\rho$ approaches one, both the standard and the pairwise estimators are consistent, so their difference is small and the test may have poor power. On the other hand, when $\rho$ approaches zero, $\alpha_{it} + \varepsilon_{it}$ degenerates towards an IID sequence (which is obtained if $\rho = 0$). In this case, the standard estimator is inconsistent, but the robustness of the pairwise estimator also begins to falter and, once again, their difference gets smaller. Second, the power rapidly increases as $n$ and $T$ increase, exceeding 80% when $n = 4000$ and $T = 10$, regardless of the value of $\rho$.

The same qualitative behavior, but with a higher power, is observed when the dependent variable is ordinal (Table 2). This result follows from the fact that in the ordinal case more information is exploited in the estimation process.

The presence of heteroskedasticity does not seem to significantly reduce the power of the test. As shown in Tables 3 and 4, the test still shows good power properties in both binary and ordinal cases. This result might be related to the fact that heteroskedasticity adversely affects both the standard and the pairwise estimators in a similar way, leaving their difference almost unchanged.

Finally, as shown in Tables 5 and 6, the performance of the test is very similar when the process representing the evolution of the unobserved heterogeneity is discrete rather than continuous. The same inverse U-shaped relationship is also observed between the power of the test and the parameter that indexes the first-order HMC transition probability matrix, i.e. $\pi$.
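The size and power figures discussed above are rejection shares of the statistic in equation (8) across replications. The sketch below shows one way this could be computed with NumPy, assuming the two estimates and an estimate of their joint variance $\hat W_0$ are already available. The function names are hypothetical, $D = (I_c, -I_c)$ is taken so that $D\hat W_0 D'$ is the variance of the difference of the two estimators, and the Moore-Penrose pseudo-inverse stands in for the generalized inverse as one possible choice.

```python
import numpy as np


def hausman_like_statistic(beta1, beta2, W0, n):
    """Return (statistic, degrees of freedom) for equation (8).

    beta1, beta2 : standard and pairwise estimates (length c)
    W0           : (2c x 2c) estimate of the joint asymptotic variance
    n            : number of units
    """
    beta1 = np.atleast_1d(np.asarray(beta1, dtype=float))
    beta2 = np.atleast_1d(np.asarray(beta2, dtype=float))
    c = beta1.size
    # D = (I_c, -I_c) picks out the difference of the two estimators
    D = np.hstack([np.eye(c), -np.eye(c)])
    V0 = D @ np.asarray(W0, dtype=float) @ D.T
    diff = beta1 - beta2
    # The rank of V0 gives the degrees of freedom; pinv equals the
    # ordinary inverse whenever V0 is nonsingular.
    df = int(np.linalg.matrix_rank(V0))
    stat = float(n * diff @ np.linalg.pinv(V0) @ diff)
    return stat, df


def rejection_rate(statistics, critical_value):
    """Share of replications whose statistic exceeds the critical value."""
    return float(np.mean(np.asarray(statistics, dtype=float) > critical_value))


# One covariate, illustrative numbers only (not taken from the paper)
stat, df = hausman_like_statistic([1.0], [0.9], [[1.0, 0.5], [0.5, 1.0]], n=1000)
# 3.841 is the 5% critical value of a chi-squared with 1 degree of freedom
share = rejection_rate([10.0, 1.0, 5.0, 0.2], critical_value=3.841)
```

Under the null the rejection share estimates the size of the test, and under the alternative it estimates the power.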
4 Empirical example

In this section, we illustrate our procedure through an empirical application on SRHS of the elderly American population.

4.1 Data

Our data are from the Health and Retirement Study (HRS), a longitudinal panel study that surveys a representative sample of more than 26,000 Americans over the age of 50 every two years. We

employ the RAND HRS Data File (Version L), a user-friendly version of the data produced by the RAND Center for the Study of Aging, which contains all waves from 1992 to […]. After selecting respondents aged 50 and older in the first wave, our sample consists of a balanced panel of 4,094 individuals (a total of 40,940 observations). The outcome variable is SRHS which, as in other surveys, is measured on a five-point ordered scale (poor, fair, good, very good, excellent). As covariates, we consider a typical set of socio-demographic characteristics (gender, age, education and ethnicity), the number of doctor visits and the body mass index (BMI). Definitions and summary statistics for these variables are presented in Table 7. [4]

Following Heiss (2008), we estimate an ordered logit model for SRHS in wave 10 where the covariates consist of a typical set of socio-demographics plus lagged values of SRHS (Table 8). This simple exercise gives an idea of the SRHS correlation pattern over a longer period of time, highlighting two interesting results: (i) the coefficients of most lags are highly significantly different from zero, and (ii) they get smaller the further away the respective observation is from wave 10. These stylized facts suggest that the most plausible model for this application is one in which SRHS depends on the true health status and this unobserved variable follows some random process over time with decreasing correlation. [5]

4.2 Results

We consider two model specifications: M1) age splines, BMI and number of GP visits, and M2) M1 plus wave dummies. Table 9 reports the two sets of estimates used to construct the test statistic: the top panel shows the standard CML fixed-effects ordered logit estimates for the two model specifications, while the bottom panel reports the corresponding pairwise estimates. The key point here is that the two sets of estimates diverge, producing a test statistic of […] and […], respectively.
Given the corresponding degrees of freedom (6 and 15), we strongly reject the null hypothesis of time-invariant unobserved heterogeneity, confirming Heiss's (2008) results even in a longer panel.

[4] Notice that, due to a failure in the convergence of the AR(1) ordered logit model, we are forced to drop outliers in the distribution of BMI and doctor visits. We use the method of percentiles. Since there do not appear to be outliers in the left tail, we drop only values above the 99.9th percentile, losing 37 individuals (370 observations).

[5] See Heiss (2008) for a detailed discussion. An alternative to Heiss's (2008) approach would be a random-effects model with state dependence. However, state dependence of SRHS is not very convincing from a theoretical point of view. While, for example, in a model of female labor force participation lagged outcomes can causally affect the current outcome (Hyslop, 1999), this causality is not so clear in a SRHS model: why should a mere perception of health affect future true health status?

Since $H_0$ has been rejected, we estimate the latent AR(1) ordered logit model proposed by Heiss (2008) in order to confirm the presence of a statistically significant autoregressive coefficient lower than one. [6] Table 10 shows the estimates for the two model specifications. Because this is a random-effects model, we also include the same time-invariant socio-demographic covariates as in Heiss's (2008) application. It is worth emphasizing that the lack of these covariates in performing our test (Table 9) does not affect its power since, being time-invariant, they are conditioned out of the likelihood function as fixed effects. The estimated $\rho$ is 0.95 and is highly statistically significant (basically the same value found by Heiss (2008)). Hence, a more plausible model for the data is one where SRHS depends on true unobservable health, and this latent variable follows a time-series process with decaying autocorrelation.

5 Concluding remarks

We propose a specification test for the null hypothesis of time-invariant unobserved heterogeneity in nonlinear panel data models against the alternative of time-varying heterogeneity. Our test is based on a comparison between standard and pairwise conditional likelihood estimators and can be considered a pure specification test because it leaves the alternative deliberately vague. The finite-sample properties of the test are investigated via a set of Monte Carlo experiments. The results suggest that the test generally performs well and shows small size distortions and good power properties regardless of how the unit-specific effects evolve over time, especially for ordinal outcomes and when $n \geq 2000$ and $T \geq 7$. Moreover, our simulation results suggest that the adverse effect of heteroskedasticity on both size and power of the test is negligible.
It is worth emphasizing that our test does not require assumptions on the distribution of unobserved heterogeneity, and it can be easily implemented using standard software for CML estimation. On the other hand, the use of the CML approach implies that no time-invariant regressors can be included in the model and that within-unit variation in the dependent variable is required for estimation.

We provide an empirical illustration using data from the HRS. We estimate the same model as Heiss (2008) using a longer balanced panel. The hypothesis of time-invariant unobserved heterogeneity is rejected, thus confirming Heiss's (2008) results: a more plausible model for the selected sample is one in which SRHS depends on true unobservable health, and this latent variable follows an autoregressive process with decreasing autocorrelation.

In future work, we plan to extend our approach to the cases of count and polychotomous unordered outcome variables, for which we conjecture that the proposed test should provide equally good results. Our conjecture is driven by the fact that, even in these cases, the CML approach should yield two asymptotically normal estimators which are both consistent under the null and are likely to diverge under the alternative.

[6] The estimation is performed using the arldv Stata package produced by Florian Heiss. The likelihood function of this model does not have a closed-form solution, so estimation involves numerical integration. We use 50 integration points.

References

Andersen, E. (1970). Asymptotic properties of conditional maximum likelihood estimators. Journal of the Royal Statistical Society Series B, 32.

Arellano, M., & Bonhomme, S. (2011). Nonlinear panel data analysis. Annual Review of Economics, 3.

Baetschmann, G., Staub, K., & Winkelmann, R. (2011). Consistent estimation of the fixed effects ordered logit model. IZA Discussion Papers 5443, Institute for the Study of Labor (IZA).

Bartolucci, F., & Farcomeni, A. (2009). A multivariate extension of the dynamic logit model for longitudinal data based on a latent Markov heterogeneity structure. Journal of the American Statistical Association, 104.

Bartolucci, F., Bacci, S., & Pennoni, F. (2012). Mixture latent autoregressive models for longitudinal data. arXiv preprint (v1).

Chamberlain, G. (1980). Analysis of covariance with qualitative data. Review of Economic Studies, 47.

Das, M., & van Soest, A. (1999). A panel data model for subjective information on household income growth. Journal of Economic Behavior and Organization, 40.

Ferrer-i-Carbonell, A., & Frijters, P. (2004). How important is methodology for the estimates of the determinants of happiness. Economic Journal, 114.

Heiss, F. (2008). Sequential numerical integration in nonlinear state space models for microeconometric panel data. Journal of Applied Econometrics, 23.

Holly, A. (1982). A remark on Hausman's specification test. Econometrica, 50.

Hsiao, C. (2005). Analysis of Panel Data. New York: Cambridge University Press.

Hyslop, D. R. (1999). State dependence, serial correlation and heterogeneity in intertemporal labor force participation of married women. Econometrica, 67.

Stowasser, T., Heiss, F., McFadden, D., & Winter, J. (2011). Healthy, Wealthy and Wise? Revisited: An Analysis of the Causal Pathways from Socio-economic Status to Health. NBER Working Papers, National Bureau of Economic Research, Inc.

Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. MIT Press.

Table 1: Binary outcome, AR(1) unit-specific effects and homoskedastic errors. Columns: N, then mean, sd and power of the test statistic for each of T = 5, T = 7 and T = 10; one panel for each of six values of ρ.

Table 2: Ordinal outcome, AR(1) unit-specific effects and homoskedastic errors. Columns: N, then mean, sd and power of the test statistic for each of T = 5, T = 7 and T = 10; one panel for each of six values of ρ.

Table 3: Binary outcome, AR(1) unit-specific effects and heteroskedastic errors. Columns: N, then mean, sd and power of the test statistic for each of T = 5, T = 7 and T = 10; one panel for each of five values of ρ.

Table 4: Ordinal outcome, AR(1) unit-specific effects and heteroskedastic errors. Columns: N, then mean, sd and power of the test statistic for each of T = 5, T = 7 and T = 10; one panel for each of five values of ρ.

Table 5: Binary outcome, first-order HMC unit-specific effects and homoskedastic errors. Columns: N, then mean, sd and power of the test statistic for each of T = 5, T = 7 and T = 10; one panel for each of five values of π.

Table 6: Ordinal outcome, first-order HMC unit-specific effects and homoskedastic errors. Columns: N, then mean, sd and power of the test statistic for each of T = 5, T = 7 and T = 10; one panel for each of five values of π.

Table 7: Summary statistics (n = 4094; T = 10). Columns: Variable, Description, Mean, Std. Dev., Min., Max.

Self-rated Health: 1 = poor, 2 = fair, 3 = good, 4 = very good, 5 = excellent
Age: Age of the respondent (years)
Female: Dummy for female
High school: Dummy for high school (raeduc = 3)
Some college: Dummy for college degree (raeduc = 4)
College degree+: Dummy for higher education (raeduc = 5)
Non white: Dummy for Hispanic and Black
BMI: Body mass index
GP visits: Number of GP visits (previous two years)

Table 8: Ordered logit of SRHS in wave 10 on past SRHS. Dependent variable: SRHS in wave 10. Rows, with significance stars as printed:

Age ***
Female
High school
Some college
College degree
Non white
SRHS in each of the nine earlier waves, in printed order: ***, ***, ***, ***, **, no star, no star, **, no star
Four cut-offs, in printed order: **, ***, ***, ***
Obs: 4,094
Log-lik: -4,[…]

Significance levels: * p < 10%; ** p < 5%; *** p < 1%

Table 9: Test implementation for both model specifications. Columns: S1, S2. Rows, with significance stars as printed:

Standard
Age splines: ***
Age splines: *** ***
Age splines: *** ***
Age splines:
BMI: *** ***
GP visits: *** ***

Pairwise
Age splines: *** *
Age splines: ***
Age splines: * ***
Age splines:
BMI: ** **
GP visits: *** ***

Wave dummies: No (S1), Yes (S2)
H0 = time-invariant individual effects: Test statistic and P-value rows

Significance levels: * p < 10%; ** p < 5%; *** p < 1%

Table 10: AR(1) random effects ordered logit model (Heiss, 2008). Columns: S1, S2. Rows, with significance stars as printed:

Age splines: *** ***
Age splines: *** ***
Age splines: *** ***
Age splines:
BMI: *** ***
GP visits: *** ***
Female: **
High school: *** ***
Some college: *** ***
College degree: *** ***
Non white: *** ***
Wave dummies: No (S1), Yes (S2)
σ: *** ***
ρ: *** ***
Log-lik

Significance levels: * p < 10%; ** p < 5%; *** p < 1%


More information

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE Introduction Chapter 6. Panel Data 2 Panel data The term panel data refers to data sets with repeated observations over

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

Partial effects in fixed effects models

Partial effects in fixed effects models 1 Partial effects in fixed effects models J.M.C. Santos Silva School of Economics, University of Surrey Gordon C.R. Kemp Department of Economics, University of Essex 22 nd London Stata Users Group Meeting

More information

Adaptive quadrature for likelihood inference on dynamic latent variable models for time-series and panel data

Adaptive quadrature for likelihood inference on dynamic latent variable models for time-series and panel data MPRA Munich Personal RePEc Archive Adaptive quadrature for likelihood inference on dynamic latent variable models for time-series and panel data Silvia Cagnone and Francesco Bartolucci Department of Statistical

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

Applied Microeconometrics (L5): Panel Data-Basics

Applied Microeconometrics (L5): Panel Data-Basics Applied Microeconometrics (L5): Panel Data-Basics Nicholas Giannakopoulos University of Patras Department of Economics ngias@upatras.gr November 10, 2015 Nicholas Giannakopoulos (UPatras) MSc Applied Economics

More information

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data Panel data Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data - possible to control for some unobserved heterogeneity - possible

More information

Econometrics I Lecture 7: Dummy Variables

Econometrics I Lecture 7: Dummy Variables Econometrics I Lecture 7: Dummy Variables Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 27 Introduction Dummy variable: d i is a dummy variable

More information

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005 DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005 The lectures will survey the topic of count regression with emphasis on the role on unobserved heterogeneity.

More information

On IV estimation of the dynamic binary panel data model with fixed effects

On IV estimation of the dynamic binary panel data model with fixed effects On IV estimation of the dynamic binary panel data model with fixed effects Andrew Adrian Yu Pua March 30, 2015 Abstract A big part of applied research still uses IV to estimate a dynamic linear probability

More information

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools Syllabus By Joan Llull Microeconometrics. IDEA PhD Program. Fall 2017 Chapter 1: Introduction and a Brief Review of Relevant Tools I. Overview II. Maximum Likelihood A. The Likelihood Principle B. The

More information

xtseqreg: Sequential (two-stage) estimation of linear panel data models

xtseqreg: Sequential (two-stage) estimation of linear panel data models xtseqreg: Sequential (two-stage) estimation of linear panel data models and some pitfalls in the estimation of dynamic panel models Sebastian Kripfganz University of Exeter Business School, Department

More information

Christopher Dougherty London School of Economics and Political Science

Christopher Dougherty London School of Economics and Political Science Introduction to Econometrics FIFTH EDITION Christopher Dougherty London School of Economics and Political Science OXFORD UNIVERSITY PRESS Contents INTRODU CTION 1 Why study econometrics? 1 Aim of this

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

A nonparametric test for path dependence in discrete panel data

A nonparametric test for path dependence in discrete panel data A nonparametric test for path dependence in discrete panel data Maximilian Kasy Department of Economics, University of California - Los Angeles, 8283 Bunche Hall, Mail Stop: 147703, Los Angeles, CA 90095,

More information

Advanced Econometrics

Advanced Econometrics Based on the textbook by Verbeek: A Guide to Modern Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna May 16, 2013 Outline Univariate

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 1 Jakub Mućk Econometrics of Panel Data Meeting # 1 1 / 31 Outline 1 Course outline 2 Panel data Advantages of Panel Data Limitations of Panel Data 3 Pooled

More information

Linear dynamic panel data models

Linear dynamic panel data models Linear dynamic panel data models Laura Magazzini University of Verona L. Magazzini (UniVR) Dynamic PD 1 / 67 Linear dynamic panel data models Dynamic panel data models Notation & Assumptions One of the

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Dynamic Panels. Chapter Introduction Autoregressive Model

Dynamic Panels. Chapter Introduction Autoregressive Model Chapter 11 Dynamic Panels This chapter covers the econometrics methods to estimate dynamic panel data models, and presents examples in Stata to illustrate the use of these procedures. The topics in this

More information

Panel Data Seminar. Discrete Response Models. Crest-Insee. 11 April 2008

Panel Data Seminar. Discrete Response Models. Crest-Insee. 11 April 2008 Panel Data Seminar Discrete Response Models Romain Aeberhardt Laurent Davezies Crest-Insee 11 April 2008 Aeberhardt and Davezies (Crest-Insee) Panel Data Seminar 11 April 2008 1 / 29 Contents Overview

More information

Casuality and Programme Evaluation

Casuality and Programme Evaluation Casuality and Programme Evaluation Lecture V: Difference-in-Differences II Dr Martin Karlsson University of Duisburg-Essen Summer Semester 2017 M Karlsson (University of Duisburg-Essen) Casuality and Programme

More information

Marginal and Interaction Effects in Ordered Response Models

Marginal and Interaction Effects in Ordered Response Models MPRA Munich Personal RePEc Archive Marginal and Interaction Effects in Ordered Response Models Debdulal Mallick School of Accounting, Economics and Finance, Deakin University, Burwood, Victoria, Australia

More information

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Massimiliano Bratti & Alfonso Miranda In many fields of applied work researchers need to model an

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels.

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Pedro Albarran y Raquel Carrasco z Jesus M. Carro x June 2014 Preliminary and Incomplete Abstract This paper presents and evaluates

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006 Comments on: Panel Data Analysis Advantages and Challenges Manuel Arellano CEMFI, Madrid November 2006 This paper provides an impressive, yet compact and easily accessible review of the econometric literature

More information

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

More information

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Ninth ARTNeT Capacity Building Workshop for Trade Research Trade Flows and Trade Policy Analysis Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

More information

Economics 671: Applied Econometrics Department of Economics, Finance and Legal Studies University of Alabama

Economics 671: Applied Econometrics Department of Economics, Finance and Legal Studies University of Alabama Problem Set #1 (Random Data Generation) 1. Generate =500random numbers from both the uniform 1 ( [0 1], uniformbetween zero and one) and exponential exp ( ) (set =2and let [0 1]) distributions. Plot the

More information

Modified Variance Ratio Test for Autocorrelation in the Presence of Heteroskedasticity

Modified Variance Ratio Test for Autocorrelation in the Presence of Heteroskedasticity The Lahore Journal of Economics 23 : 1 (Summer 2018): pp. 1 19 Modified Variance Ratio Test for Autocorrelation in the Presence of Heteroskedasticity Sohail Chand * and Nuzhat Aftab ** Abstract Given that

More information

Econometric Methods for Panel Data

Econometric Methods for Panel Data Based on the books by Baltagi: Econometric Analysis of Panel Data and by Hsiao: Analysis of Panel Data Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies

More information

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances

Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances Discussion Paper: 2006/07 Robustness of Logit Analysis: Unobserved Heterogeneity and Misspecified Disturbances J.S. Cramer www.fee.uva.nl/ke/uva-econometrics Amsterdam School of Economics Department of

More information

1 Estimation of Persistent Dynamic Panel Data. Motivation

1 Estimation of Persistent Dynamic Panel Data. Motivation 1 Estimation of Persistent Dynamic Panel Data. Motivation Consider the following Dynamic Panel Data (DPD) model y it = y it 1 ρ + x it β + µ i + v it (1.1) with i = {1, 2,..., N} denoting the individual

More information

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in

More information

E 4160 Autumn term Lecture 9: Deterministic trends vs integrated series; Spurious regression; Dickey-Fuller distribution and test

E 4160 Autumn term Lecture 9: Deterministic trends vs integrated series; Spurious regression; Dickey-Fuller distribution and test E 4160 Autumn term 2016. Lecture 9: Deterministic trends vs integrated series; Spurious regression; Dickey-Fuller distribution and test Ragnar Nymoen Department of Economics, University of Oslo 24 October

More information

A Test of Cointegration Rank Based Title Component Analysis.

A Test of Cointegration Rank Based Title Component Analysis. A Test of Cointegration Rank Based Title Component Analysis Author(s) Chigira, Hiroaki Citation Issue 2006-01 Date Type Technical Report Text Version publisher URL http://hdl.handle.net/10086/13683 Right

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Princeton University Asian Political Methodology Conference University of Sydney Joint

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics T H I R D E D I T I O N Global Edition James H. Stock Harvard University Mark W. Watson Princeton University Boston Columbus Indianapolis New York San Francisco Upper Saddle

More information

Simplified Implementation of the Heckman Estimator of the Dynamic Probit Model and a Comparison with Alternative Estimators

Simplified Implementation of the Heckman Estimator of the Dynamic Probit Model and a Comparison with Alternative Estimators DISCUSSION PAPER SERIES IZA DP No. 3039 Simplified Implementation of the Heckman Estimator of the Dynamic Probit Model and a Comparison with Alternative Estimators Wiji Arulampalam Mark B. Stewart September

More information

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates Identification and Estimation of Nonlinear Dynamic Panel Data Models with Unobserved Covariates Ji-Liang Shiu and Yingyao Hu April 3, 2010 Abstract This paper considers nonparametric identification of

More information

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università

More information

A Practitioner s Guide to Cluster-Robust Inference

A Practitioner s Guide to Cluster-Robust Inference A Practitioner s Guide to Cluster-Robust Inference A. C. Cameron and D. L. Miller presented by Federico Curci March 4, 2015 Cameron Miller Cluster Clinic II March 4, 2015 1 / 20 In the previous episode

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Dynamic Panel of Count Data with Initial Conditions and Correlated Random Effects. : Application for Health Data. Sungjoo Yoon

Dynamic Panel of Count Data with Initial Conditions and Correlated Random Effects. : Application for Health Data. Sungjoo Yoon Dynamic Panel of Count Data with Initial Conditions and Correlated Random Effects : Application for Health Data Sungjoo Yoon Department of Economics, Indiana University April, 2009 Key words: dynamic panel

More information

A Guide to Modern Econometric:

A Guide to Modern Econometric: A Guide to Modern Econometric: 4th edition Marno Verbeek Rotterdam School of Management, Erasmus University, Rotterdam B 379887 )WILEY A John Wiley & Sons, Ltd., Publication Contents Preface xiii 1 Introduction

More information

Reliability of inference (1 of 2 lectures)

Reliability of inference (1 of 2 lectures) Reliability of inference (1 of 2 lectures) Ragnar Nymoen University of Oslo 5 March 2013 1 / 19 This lecture (#13 and 14): I The optimality of the OLS estimators and tests depend on the assumptions of

More information

Multiple Equation GMM with Common Coefficients: Panel Data

Multiple Equation GMM with Common Coefficients: Panel Data Multiple Equation GMM with Common Coefficients: Panel Data Eric Zivot Winter 2013 Multi-equation GMM with common coefficients Example (panel wage equation) 69 = + 69 + + 69 + 1 80 = + 80 + + 80 + 2 Note:

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

ECON 5350 Class Notes Functional Form and Structural Change

ECON 5350 Class Notes Functional Form and Structural Change ECON 5350 Class Notes Functional Form and Structural Change 1 Introduction Although OLS is considered a linear estimator, it does not mean that the relationship between Y and X needs to be linear. In this

More information

Munich Lecture Series 2 Non-linear panel data models: Binary response and ordered choice models and bias-corrected fixed effects models

Munich Lecture Series 2 Non-linear panel data models: Binary response and ordered choice models and bias-corrected fixed effects models Munich Lecture Series 2 Non-linear panel data models: Binary response and ordered choice models and bias-corrected fixed effects models Stefanie Schurer stefanie.schurer@rmit.edu.au RMIT University School

More information

Economics 582 Random Effects Estimation

Economics 582 Random Effects Estimation Economics 582 Random Effects Estimation Eric Zivot May 29, 2013 Random Effects Model Hence, the model can be re-written as = x 0 β + + [x ] = 0 (no endogeneity) [ x ] = = + x 0 β + + [x ] = 0 [ x ] = 0

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

Women. Sheng-Kai Chang. Abstract. In this paper a computationally practical simulation estimator is proposed for the twotiered

Women. Sheng-Kai Chang. Abstract. In this paper a computationally practical simulation estimator is proposed for the twotiered Simulation Estimation of Two-Tiered Dynamic Panel Tobit Models with an Application to the Labor Supply of Married Women Sheng-Kai Chang Abstract In this paper a computationally practical simulation estimator

More information

Applied Econometrics Lecture 1

Applied Econometrics Lecture 1 Lecture 1 1 1 Università di Urbino Università di Urbino PhD Programme in Global Studies Spring 2018 Outline of this module Beyond OLS (very brief sketch) Regression and causality: sources of endogeneity

More information

E 4101/5101 Lecture 9: Non-stationarity

E 4101/5101 Lecture 9: Non-stationarity E 4101/5101 Lecture 9: Non-stationarity Ragnar Nymoen 30 March 2011 Introduction I Main references: Hamilton Ch 15,16 and 17. Davidson and MacKinnon Ch 14.3 and 14.4 Also read Ch 2.4 and Ch 2.5 in Davidson

More information

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication G. S. Maddala Kajal Lahiri WILEY A John Wiley and Sons, Ltd., Publication TEMT Foreword Preface to the Fourth Edition xvii xix Part I Introduction and the Linear Regression Model 1 CHAPTER 1 What is Econometrics?

More information

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 7: Cluster Sampling Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of roups and

More information

Heteroskedasticity-Robust Inference in Finite Samples

Heteroskedasticity-Robust Inference in Finite Samples Heteroskedasticity-Robust Inference in Finite Samples Jerry Hausman and Christopher Palmer Massachusetts Institute of Technology December 011 Abstract Since the advent of heteroskedasticity-robust standard

More information

Gravity Models, PPML Estimation and the Bias of the Robust Standard Errors

Gravity Models, PPML Estimation and the Bias of the Robust Standard Errors Gravity Models, PPML Estimation and the Bias of the Robust Standard Errors Michael Pfaffermayr August 23, 2018 Abstract In gravity models with exporter and importer dummies the robust standard errors of

More information

Obtaining Critical Values for Test of Markov Regime Switching

Obtaining Critical Values for Test of Markov Regime Switching University of California, Santa Barbara From the SelectedWorks of Douglas G. Steigerwald November 1, 01 Obtaining Critical Values for Test of Markov Regime Switching Douglas G Steigerwald, University of

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the

More information

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates

Identification and Estimation of Nonlinear Dynamic Panel Data. Models with Unobserved Covariates Identification and Estimation of Nonlinear Dynamic Panel Data Models with Unobserved Covariates Ji-Liang Shiu and Yingyao Hu July 8, 2010 Abstract This paper considers nonparametric identification of nonlinear

More information

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning Økonomisk Kandidateksamen 2004 (I) Econometrics 2 Rettevejledning This is a closed-book exam (uden hjælpemidler). Answer all questions! The group of questions 1 to 4 have equal weight. Within each group,

More information

Dynamic panel data methods

Dynamic panel data methods Dynamic panel data methods for cross-section panels Franz Eigner University Vienna Prepared for UK Econometric Methods of Panel Data with Prof. Robert Kunst 27th May 2009 Structure 1 Preliminary considerations

More information

Appendix A: The time series behavior of employment growth

Appendix A: The time series behavior of employment growth Unpublished appendices from The Relationship between Firm Size and Firm Growth in the U.S. Manufacturing Sector Bronwyn H. Hall Journal of Industrial Economics 35 (June 987): 583-606. Appendix A: The time

More information

LECTURE 10: MORE ON RANDOM PROCESSES

LECTURE 10: MORE ON RANDOM PROCESSES LECTURE 10: MORE ON RANDOM PROCESSES AND SERIAL CORRELATION 2 Classification of random processes (cont d) stationary vs. non-stationary processes stationary = distribution does not change over time more

More information

Bootstrapping a conditional moments test for normality after tobit estimation

Bootstrapping a conditional moments test for normality after tobit estimation The Stata Journal (2002) 2, Number 2, pp. 125 139 Bootstrapping a conditional moments test for normality after tobit estimation David M. Drukker Stata Corporation ddrukker@stata.com Abstract. Categorical

More information

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

Econometrics Problem Set 10

Econometrics Problem Set 10 Econometrics Problem Set 0 WISE, Xiamen University Spring 207 Conceptual Questions Dependent variable: P ass Probit Logit LPM Probit Logit LPM Probit () (2) (3) (4) (5) (6) (7) Experience 0.03 0.040 0.006

More information

Lab 07 Introduction to Econometrics

Lab 07 Introduction to Econometrics Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. Linear

More information

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Matthew Harding and Carlos Lamarche January 12, 2011 Abstract We propose a method for estimating

More information

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

NELS 88 Table 2.3 Adjusted odds ratios of eighth-grade students in 1988 performing below basic levels of reading and mathematics in 1988 and dropping out of school, 1988 to 1990, by basic demographics Variable

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19th UK Stata Users Group meeting, 10 Sept. 2015

1 Introduction - In 2010 Carina Mood published an overview article

Consistent OLS Estimation of AR(1) Dynamic Panel Data Models with Short Time Series

Kazuhiko Hayakawa Department of Economics Hitotsubashi University January 19, 2006 Abstract In this paper, we examine

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 Why IV estimation? So far, in OLS, we assumed independence.

Non-Stationary Time Series and Unit Root Testing

Econometrics II Morten Nyboe Tabor Course Outline: Non-Stationary Time Series and Unit Root Testing 1 Stationarity and Deviation from Stationarity Trend-Stationarity

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Goals for Course: To enable researchers to conduct careful quantitative analyses with existing VA (and non-VA)

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Arthur Lewbel Boston College Original December 2016, revised July 2017 Abstract Lewbel (2012)

Missing dependent variables in panel data models

Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

An Exponential Class of Dynamic Binary Choice Panel Data Models with Fixed Effects

DISCUSSION PAPER SERIES IZA DP No. 7054 Majid M. Al-Sadoon, Tong Li, M. Hashem Pesaran November 2012 Forschungsinstitut

FinQuiz Notes

Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

Econometrics -- Final Exam (Sample)

1) The sample regression line estimated by OLS A) has an intercept that is equal to zero. B) is the same as the population regression line. C) cannot have negative and

Panel Data. Applied Econometrics: Topic 6, March 2, 2012

Overview: Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression