EM Algorithms for Ordered Probit Models with Endogenous Regressors

Hiroyuki Kawakatsu, Business School, Dublin City University, Dublin 9, Ireland
Ann G. Largey, Business School, Dublin City University, Dublin 9, Ireland

June 27, 2007

Abstract

We propose an EM algorithm to estimate ordered probit models with endogenous regressors. The proposed algorithm has a number of computational advantages in comparison to direct numerical maximization of the (limited information) log-likelihood function. First, the sequence of conditional M-steps can all be computed analytically, mostly via least squares. Second, the EM algorithm is parameterized so that each updating step naturally satisfies certain restrictions such as positive definiteness of the covariance matrix and monotonicity of cutpoints. Third, to address the potentially slow convergence of the EM algorithm, we propose the parameter expansion (PX-EM) algorithm that can accelerate convergence. Given a numerically stable algorithm to obtain maximum likelihood estimates, a number of classical tests become available for specification testing. We discuss tests for the presence of endogeneity and for instrument exogeneity (overidentification test). We conduct Monte Carlo simulations to examine the finite sample performance of the proposed estimator and provide an empirical application using survey data on social interactions.

JEL classification: C13, C35.
Keywords: ordered probit, endogeneity, EM algorithm.

1 Introduction

This paper considers limited information maximum likelihood (LIML) estimation of ordered probit models with continuous endogenous regressors. Despite its asymptotic efficiency, LIML does not appear to be the estimator of choice for models in this class, perhaps because it suffers from a number of computational disadvantages, especially in large models. As a consequence, the LIML estimator has generally been avoided in favor of less efficient but computationally simpler estimation methods (Rivers and Vuong 1988, p. 351). Rivers and Vuong (1988) proposed a simple two-step estimator that requires only least squares and estimation of a standard (non-endogenous) ordered probit model. Newey (1987) proposed an asymptotically efficient two-step minimum chi-square estimator that applies generally to limited dependent variable models with continuous endogenous regressors.

Our main contribution is to propose a numerically stable LIML estimator based on the EM algorithm for this class of models. In Section 2 we formulate the model and discuss numerical difficulties that can arise with LIML estimation that directly maximizes the log-likelihood function. Section 3 describes the proposed EM algorithm. The algorithm has the following features. First, the algorithm breaks up the M-step into a sequence of conditional M-steps (Meng and Rubin 1993). The resulting ECM algorithm does not require numerical maximization and the M-steps can all be computed analytically, mostly by least squares. Second, we employ a parameterization such that certain parameter restrictions are naturally satisfied when the estimates are updated in the M-steps. These restrictions include positive definiteness of the covariance matrix and monotonicity of the cutpoint parameters. Third, as shown by Meng and Rubin (1993), each step of the ECM algorithm monotonically increases the likelihood function.

In Section 3, we also address two well-known drawbacks of the EM algorithm. First, the algorithm does not produce an estimate of the parameter covariance matrix, which is needed for statistical inference. For inference purposes, we suggest using the observed information matrix to estimate the parameter covariance matrix (Jamshidian and Jennrich 2000). As the analytic second derivatives of the log-likelihood are rather complicated, the observed

information can be estimated by numerically differentiating the analytic scores. Moreover, the so-called robust covariance matrix that allows for clustered dependence can be obtained in the usual manner. Second, another well-known problem of the EM algorithm is its slow convergence. To address this problem, we propose a simple modified parameter expansion (PX-ECM) algorithm that is known to accelerate convergence of the EM algorithm in other applications (Liu, Rubin and Wu 1998).

In addition to its asymptotic efficiency (for a correctly specified model), the main advantage of the LIML estimator over two-step type estimators is that statistical inference can be conducted using any of the standard trinity of classical tests. Section 4 discusses some hypotheses of interest for the endogenous ordered probit model. Rivers and Vuong (1988) proposed tests for the presence of endogeneity using the two-step estimator. For the LIML estimator, the test of the presence of endogeneity is a simple exclusion test on the covariance parameters. Lee (1992) proposed testing over-identification (instrument exogeneity) using Newey's (1987) asymptotically efficient two-step estimator. For the LIML estimator, we show how to implement the LR test of Anderson and Rubin (1949) for testing over-identifying restrictions.

Section 5 reports results from Monte Carlo simulations that examine the finite sample performance of the proposed ECM algorithms. The Monte Carlo design varies the degree of endogeneity, the degree of over-identification, and the quality of instruments to examine how they affect the finite sample properties of the algorithm and test statistics. In particular, we find that the PX-ECM algorithm can substantially accelerate convergence of the ECM algorithm. In Section 6, we fit the model to examine the empirical relation between social interactions, which are measured as ordered outcomes, and population density using survey data. Although the models have more than sixty parameters to estimate, we did not encounter any difficulties in using the PX-ECM algorithm to fit the data. The empirical application examines how the partial effects of population density on social interactions vary nonlinearly as a function of age.

2 Ordered Probit with Endogenous Regressors

The following formulation of the ordered probit model with endogenous regressors follows that of Newey (1987) and Rivers and Vuong (1988). The model is

    y*_i = Y_i'β + X_{1i}'γ + u_i,   i = 1, ..., n                                  (1a)
    Y_i = Π'X_i + V_i = Π_1'X_{1i} + Π_2'X_{2i} + V_i                               (1b)
    (u_i, V_i')' ~ N(0, Σ),   Σ = [ σ_{11}  σ_{21}' ; σ_{21}  Σ_{22} ]
    y_i = j  if  α_{j−1} ≤ y*_i < α_j,  j = 1, ..., m,  α_0 = −∞, α_m = +∞          (1c)

where y_i is the observed scalar dependent variable with m ordered outcomes, y*_i is the continuous latent variable underlying y_i, Y_i is the r × 1 vector of endogenous regressors, X_{1i} is the s × 1 vector of included exogenous regressors, and X_{2i} is the (k − s) × 1 vector of excluded exogenous regressors. The error terms (u_i, V_i')' are assumed to have a joint Gaussian distribution with mean zero and covariance matrix Σ. Denote the (r + s + kr + (r + 1)(r + 2)/2 + m − 1) × 1 parameter vector to be estimated as θ = (β', γ', vec(Π)', vech(Σ)', α')'.

The limited information maximum likelihood estimator of θ maximizes the joint log-likelihood function

    l(θ) = Σ_i l(y_i, Y_i | X_i, θ) = Σ_i [ l(y_i | Y_i, X_i, θ) + l(Y_i | X_i, θ) ]

To obtain the conditional log-likelihood l(y_i | Y_i, X_i, θ), note that y*_i | Y_i, X_i ~ N(μ_i, σ_{1·2}) where

    μ_i = Y_i'β + X_{1i}'γ + V_i'Σ_{22}^{−1}σ_{21}                                   (2a)
    σ_{1·2} = σ_{11} − σ_{21}'Σ_{22}^{−1}σ_{21}                                      (2b)

and

    p(y_i | Y_i, X_i, θ) = Φ( (α_{y_i} − μ_i)/√σ_{1·2} ) − Φ( (α_{y_i−1} − μ_i)/√σ_{1·2} )

where Φ(·) is the cumulative distribution function of the standard normal distribution. The marginal log-likelihood l(Y_i | X_i, θ) is the log multivariate normal density

    l(Y_i | X_i, θ) = −(1/2) ( r log(2π) + log|Σ_{22}| + V_i'Σ_{22}^{−1}V_i )         (3)

The limited information maximum likelihood estimates can, in principle, be obtained by directly maximizing the log-likelihood function l(θ). The maximization, however, must be done numerically and is the source of a number of computational difficulties with LIML estimation. First, we need to ensure positive definiteness of the estimated covariance matrix Σ. A common reparametrization to impose this constraint is to estimate its lower triangular Cholesky factor L with positive diagonals such that LL' = Σ.¹ However, the identifying normalization discussed below may be difficult to impose under this reparameterization. Second, we need to ensure monotonicity of the estimated cutoff points α_1 < ... < α_{m−1}. One way to do this is to use the reparametrization α_1 = δ_1, α_j = α_{j−1} + δ_j where δ_j = exp(d_j) for j = 2, ..., m−1, and estimate the unconstrained δ_1, d_2, ..., d_{m−1} instead of the α_j.

For identification purposes, we need to impose certain restrictions on the parameters of the model. One commonly used normalization is to set σ_{11} = 1.² Rivers and Vuong (1988) suggest an alternative normalization σ_{1·2} = 1, which simplifies the computation of the conditional log-likelihood l(y_i | Y_i, X_i, θ). We note that one difference between the two normalizations is that numerical equivalence of the two-step estimates discussed below and the LIML estimates holds under the latter normalization for just-identified models. To identify all cutoff parameters α_1, ..., α_{m−1}, the constant term must be excluded from the regressor list X_{1i}. Alternatively, we can arbitrarily fix a cutoff parameter and instead estimate the constant term as part of γ.
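The monotone cutpoint reparametrization just described is easy to sketch in code; the following is a minimal illustration (the function name is ours), with α_1 < ... < α_{m−1} holding by construction for any real d_j:

```python
import numpy as np

def cutpoints(delta1, d):
    """Map unconstrained (delta_1, d_2, ..., d_{m-1}) to monotone cutpoints:
    alpha_1 = delta_1 and alpha_j = alpha_{j-1} + exp(d_j) for j >= 2."""
    return delta1 + np.concatenate(([0.0], np.cumsum(np.exp(np.asarray(d)))))

# alpha = (0, 1, 3) recovered from delta_1 = 0 and d = (log 1, log 2)
alpha = cutpoints(0.0, [np.log(1.0), np.log(2.0)])
```

The unconstrained δ_1, d_2, ..., d_{m−1} can then be passed to any numerical optimizer without explicit inequality constraints.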
For the rest of this paper, we use the normalization σ_{1·2} = 1, set α_1 = 0, and include a constant term in X_{1i}. When m = 2, this model reduces to the binary probit model with endogenous regressors as commonly estimated.

¹ For alternative parameterizations of the covariance matrix, see Pinheiro and Bates (1996).
² For this normalization we can set the (1, 1) element of the Cholesky factor L of Σ to one.

3 EM Algorithm

As shown above, the log-likelihood function for limited dependent variable models with endogenous regressors is relatively straightforward to write down. However, perhaps due to the computational difficulties of the LIML estimator, a number of computationally simpler alternative estimators have been proposed and used to date. These two-step estimators typically require only least squares and a standard limited dependent variable estimator, but are generally less efficient than LIML (Rivers and Vuong 1988). The exception is Newey (1987), who develops an asymptotically efficient two-step estimator for a general class of limited dependent variable models with endogenous regressors.

To address the computational difficulties with the LIML estimator, we develop an EM algorithm as a computational device to obtain LIML estimates. The proposed EM algorithm has the following computational features. First, as explained below, the sequence of conditional M-steps can all be computed analytically. In particular, most parameter estimates can be updated via a least squares regression. Second, and related to the analytical tractability just described, we use the parameterization θ = (β', γ', vec(Π)', λ', vech(Σ_{22})', δ')' where λ = Σ_{22}^{−1}σ_{21} is r × 1 and δ is (m − 2) × 1 with typical element δ_j = α_j − α_{j−1} > 0 if m > 2.³ Third, the covariance matrix parameters Σ_{22} are updated in such a way that positive definiteness of Σ_{22} is guaranteed. Therefore, there is no need to rely on parameter transforms to impose the positive definiteness constraint. This is the case regardless of the normalization used.⁴ Similarly, the cutoff parameters δ_j are updated in a way that preserves their ordering, and there is no need to rely on parameter transforms.

³ The λ parametrization was also used in Smith and Blundell (1986) and Rivers and Vuong (1988).
⁴ For this reason, the algorithm in the Appendix is described using σ_{11} and σ_{1·2} without imposing a specific normalization. The unnormalized form is also used for the parameter expansion algorithm discussed below.
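To fix ideas before turning to the details of the algorithm, the following sketch simulates data from model (1) with r = 1 endogenous regressor. All names and parameter values are illustrative; we write u_i = λV_i + e_i with e_i ~ N(0, 1), so that Var(u_i | V_i) = 1 (the σ_{1·2} = 1 normalization) and λ = 0 corresponds to no endogeneity:

```python
import numpy as np

def simulate(n, beta, gamma, pi, lam, alphas, seed=0):
    """Simulate (1) with one endogenous regressor and Var(u | V) = 1."""
    rng = np.random.default_rng(seed)
    # exogenous regressors: constant, X_2, and excluded instruments
    X = np.column_stack([np.ones(n), rng.standard_normal((n, len(pi) - 1))])
    V = rng.standard_normal(n)                # reduced-form error (sigma_22 = 1)
    Y = X @ pi + V                            # endogenous regressor, equation (1b)
    u = lam * V + rng.standard_normal(n)      # structural error, sigma_{1.2} = 1
    ystar = beta * Y + X[:, :2] @ gamma + u   # latent variable, equation (1a)
    y = np.digitize(ystar, alphas) + 1        # observed outcome in {1, ..., m}, (1c)
    return y, Y, X

# constant and X_2 included exogenous; one excluded instrument; m = 3 outcomes
y, Y, X = simulate(1000, beta=1.0, gamma=np.array([0.0, 1.0]),
                   pi=np.array([0.0, 1.0, 1.0]), lam=0.5,
                   alphas=np.array([0.0, 2.0]))
```

Regressing y on Y and the included exogenous regressors while ignoring V would then exhibit the endogeneity bias whenever λ ≠ 0.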

The details of the proposed EM algorithm are given in the Appendix. Below we discuss some issues that are specific to the model under consideration.

3.1 E-Step

The E-step requires computation of the expected complete data log-likelihood q(θ) ≡ E[l(y*) | y] where y* = (y*_1, ..., y*_n) and y = (y_1, ..., y_n). As pointed out by Ruud (1991), the main difficulty is that the cutoff parameters α in the ordered dependent variable model (1) are not identified in the EM algorithm, as the latent variable y* in (1) depends on the parameter α. To remove this parameter dependency of y*, we follow Ruud (1991) and use the reparametrized latent variable model (7) given in the Appendix.

As the resulting complete data log-likelihood belongs to the exponential family, the E-step merely requires updating the complete data sufficient statistics evaluated at the current parameter estimates θ^(t). For the model under consideration, the sufficient statistics are the first two conditional moments from the truncated Gaussian and are given in (8).

3.2 M-Step

The first order conditions ∂q(θ)/∂θ = 0 for the M-step do not appear to have a closed form solution. The proposed algorithm breaks up the M-step into a sequence of conditional M-steps where we maximize over a subset of the parameters conditional on the remaining parameter values (Meng and Rubin 1993). More specifically, we update the current parameter estimates θ^(t) to θ^(t+1) in the following order.

1. Given θ^(t), obtain β^(t+1) via a least squares regression of ỹ_i − X_{1i}'γ^(t) − V_i'λ^(t) on Y_i, where ỹ_i as defined in (9) is evaluated at θ^(t).

2. Given (β^(t+1), γ^(t), vec(Π)^(t), λ^(t), vech(Σ_{22})^(t), δ^(t)), obtain γ^(t+1) via a least squares regression of ỹ_i − Y_i'β^(t+1) − V_i'λ^(t) on X_{1i}, with the same ỹ_i as in step 1.

3. Given (β^(t+1), γ^(t+1), vec(Π)^(t), λ^(t), vech(Σ_{22})^(t), δ^(t)), obtain Π^(t+1) via solving the system of linear equations given in (10).

4. Given (β^(t+1), γ^(t+1), vec(Π)^(t+1), λ^(t), vech(Σ_{22})^(t), δ^(t)), obtain λ^(t+1) via a least squares regression of ỹ_i − Y_i'β^(t+1) − X_{1i}'γ^(t+1) on V_i = Y_i − Π^(t+1)'X_i, with the same ỹ_i as in step 1.

5. Update Σ_{22}^(t+1) = Σ_i V_i V_i'/n with V_i from step 4. Note that this update ensures positive definiteness of Σ_{22}^(t+1).

6. If m > 2, update δ_j in the order j = 2, ..., m−1 via solving the quadratic equation (12), where the coefficients of the quadratic equation are evaluated at (β^(t+1), γ^(t+1), vec(Π)^(t+1), λ^(t+1), vech(Σ_{22})^(t+1), δ_2^(t+1), ..., δ_{j−1}^(t+1), δ_j^(t), ..., δ_{m−1}^(t)).

As shown by Meng and Rubin (1993), each step of the ECM algorithm monotonically increases the likelihood function.

3.3 Parameter Covariance

One of the main drawbacks of the EM algorithm is that it does not produce an estimate of the parameter covariance matrix, which is needed for statistical inference (Jamshidian and Jennrich 2000). The SE(C)M algorithm (Meng and Rubin 1991, van Dyk, Meng and Rubin 1995) is a well-known method to obtain the parameter covariance by running supplementary EM iterations once the parameter estimates have been obtained. The SE(C)M algorithm, however, requires evaluation of the second derivatives of the complete data log-likelihood and may be numerically unstable (Jamshidian and Jennrich 2000). For this reason, we follow the recommendation of Jamshidian and Jennrich (2000) and estimate the parameter covariance matrix by the inverse of the observed information matrix −∂²l(θ)/∂θ∂θ' evaluated at the EM estimates θ̂. As the analytic second derivatives are rather complicated, the Hessian matrix can be evaluated by numerical second differences of l(θ) or by numerical first differences of the scores ∂l(θ)/∂θ. The expressions for the analytic scores for the normalization σ_{1·2} = 1 are given in Appendix A.3.⁵

Alternatively, a robust covariance matrix that allows for within group (clustered) dependence can be used (Wooldridge 2001, Chapter 13). Let c_i be a variable that identifies the

⁵ The expressions for the analytic scores are slightly more complicated for the normalization σ_{11} = 1, as σ_{1·2} = 1 − λ'Σ_{22}λ depends on the parameters λ and Σ_{22}.

C independent groups or clusters. The robust covariance matrix is then obtained as

    ( ∂²l(θ)/∂θ∂θ' )^{−1} ( Σ_{j=1}^C g_j g_j' ) ( ∂²l(θ)/∂θ∂θ' )^{−1}

where g_j = Σ_{c_i=j} ∂l_i/∂θ is the sum of contributions to the scores from observations belonging to the j-th cluster.

3.4 Parameter Expansion

Another well-known drawback of the EM algorithm is its slow convergence. A solution proposed by Liu et al. (1998) is to expand the complete-data model with a larger parameter space without altering the original observed-data model. The key to implementing this parameter expansion (PX-EM) algorithm is to find a suitable expansion of the parameter space that accelerates convergence. The probit example (without endogeneity) considered in Liu et al. (1998) suggests a PX-ECM algorithm for the ordered probit model (1) in which the normalized (conditional) variance parameter σ_{1·2} is activated and included in the parameter vector to be estimated.

To describe the proposed PX-ECM algorithm, denote the expanded parameter vector as θ* = (β', γ', vec(Π)', λ', vech(Σ_{22})', σ_{1·2}, δ')' where the original parameter vector θ is expanded by one additional parameter σ_{1·2}. The E-step of PX-ECM is unchanged from the ECM algorithm, except that the conditional moments (8) are evaluated with σ_{1·2} from the current iteration value of θ*. The M-step of PX-ECM is a simple modification of the ECM algorithm. Steps 1, 2, 4, 5 are unchanged. For step 3, equation (10) is solved using σ_{1·2} = σ_{1·2}^(t). Between steps 5 and 6, we update the additional parameter to σ_{1·2}^(t+1) using equation (11). Note that this update ensures σ_{1·2}^(t+1) > 0. Then in step 6, we solve the quadratic equation (12) where σ_{1·2} is evaluated at σ_{1·2}^(t+1).

Finally, once the PX-ECM algorithm has converged, we rescale the expanded parameter

vector θ* to obtain the original identified parameter vector θ as

    θ = ( β'/√σ_{1·2}, γ'/√σ_{1·2}, vec(Π)', λ'/√σ_{1·2}, vech(Σ_{22})', δ'/√σ_{1·2} )'

Note that, as shown by Liu et al. (1998), monotonic convergence of the ECM algorithm is preserved for the PX-ECM algorithm. We examine the acceleration achieved using this PX-ECM algorithm in our Monte Carlo simulations below.

4 Inference

4.1 Testing Normality

As likelihood inference for nonlinear models depends on distributional assumptions, it is important to have easy-to-apply diagnostic checks. For the ordered probit model (1), the (r + 1) × 1 vector of error terms (u_i, V_i')' is assumed to be independently multivariate normally distributed. While there is a variety of tests available for testing multivariate normality, the difficulty here is that the residuals û_i are not observable due to the latent variable y*_i. Smith (1987) proposed a conditional moment test based on the departure of the third and fourth moments of the residuals from normality, while Butler and Chatterjee (1997) proposed a joint test for exogeneity and normality based on the GMM criterion. Although the test by Smith (1987) can be directly applied to the model under consideration, it is relatively cumbersome to implement as it requires evaluation of conditional expectations of all third and fourth moments of the vector error term.

We propose a much simpler informal test of normality as a diagnostic check. Since the marginal and conditional distributions of a multivariate normal distribution are also normal, the idea is to test normality separately for u_i | V_i and V_i. Because (1b) is essentially a linear regression, the usual tests for residual normality can be applied to V_i. For r = 1, one can use the Bera-Jarque test of normality or the normal quantile plot to detect departures from normality. For r > 1, V_i'Σ_{22}^{−1}V_i should be independent χ² with r degrees of freedom. A

simple visual diagnostic is then to compare the empirical quantiles of V_i'Σ_{22}^{−1}V_i with those from χ²(r) by a quantile-quantile plot.

To check the normality of u_i conditional on V_i, we apply the conditional moment test of Chesher and Irish (1987) to test y*_i | V_i ~ N(μ_i, σ_{1·2}). The test of Chesher and Irish (1987) is essentially a special case of Smith (1987) for a single equation limited dependent variable model (without simultaneity). Our proposal is to apply the much simpler test of Chesher and Irish (1987) assuming V_i is an observed regressor in (2a). The details of the test, which requires computation of the first four moment residuals (Chesher and Irish 1987), are provided in Appendix A.4.

4.2 Testing Endogeneity

Once we have a numerically stable method to obtain maximum likelihood estimates, the classical trinity of tests (Wald, likelihood ratio, Lagrange multiplier) is available for inference purposes. A hypothesis of particular interest for the model under consideration is the presence of endogeneity. Under the null hypothesis of no endogeneity, σ_{21} = 0, (1a) can be efficiently estimated by a standard ordered probit estimation routine. Rivers and Vuong (1988) proposed tests for endogeneity based on two-step estimators. Based on our ECM or PX-ECM algorithm, we can test the hypothesis λ = Σ_{22}^{−1}σ_{21} = 0 using any of the classical trinity of tests, as the null is a restriction on one of the estimated parameters. Alternatively, one could also test the hypothesis σ_{21} = Σ_{22}λ = 0 using the delta method, as this restriction is a nonlinear function of the estimated parameters Σ_{22} and λ.

4.3 Testing Instrument Exogeneity

When estimating models with endogeneity, whether one has a valid instrumental variable is always a practical concern. For the minimum distance based two-step estimator of Newey (1987), Lee (1992) developed an over-identification test for instrument exogeneity when k > r + s.
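Given LIML estimates λ̂ and an estimate of their covariance, the exclusion test of Section 4.2 reduces to a standard Wald statistic, asymptotically χ²(r) under the null; a minimal sketch with purely illustrative numbers:

```python
import numpy as np
from scipy.stats import chi2

def wald_lambda(lam_hat, cov_lam):
    """Wald test of H0: lambda = 0; returns (statistic, p-value)."""
    lam_hat = np.atleast_1d(np.asarray(lam_hat, dtype=float))
    cov_lam = np.atleast_2d(np.asarray(cov_lam, dtype=float))
    w = float(lam_hat @ np.linalg.solve(cov_lam, lam_hat))
    return w, chi2.sf(w, df=lam_hat.size)

# r = 1: lambda_hat = 0.3 with standard error 0.1 gives W = (0.3/0.1)^2 = 9
w, p = wald_lambda(0.3, 0.1**2)
```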
Here, we develop a likelihood ratio test of over-identification based on the LIML estimator.
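Mechanically, such a likelihood ratio test requires only the constrained and unconstrained maximized log-likelihoods and the number of over-identifying restrictions, k − s − r; a minimal sketch (the log-likelihood values below are purely illustrative):

```python
from scipy.stats import chi2

def overid_lr(l0, l1, k, s, r):
    """LR test of the over-identifying restrictions: 2(l1 - l0) is
    asymptotically chi2(k - s - r) under the null, where l0 (l1) is the
    constrained (unconstrained) maximized log-likelihood."""
    lr = 2.0 * (l1 - l0)
    return lr, chi2.sf(lr, df=k - s - r)

# k = 5, s = 2, r = 1 -> 2 over-identifying restrictions
lr, p = overid_lr(l0=-1520.3, l1=-1518.9, k=5, s=2, r=1)
```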

Denote the reduced form of (1) as

    y*_i = c_1'X_{1i} + c_2'X_{2i} + ε_i                                            (4a)
    Y_i = Π_1'X_{1i} + Π_2'X_{2i} + V_i                                             (4b)

where c_1, c_2, Π_1, Π_2 are parameters. Subtracting β times (4b) from (4a), we have

    y*_i = Y_i'β + X_{1i}'(c_1 − Π_1β) + X_{2i}'(c_2 − Π_2β) + ε_i − V_i'β

Comparing this with the structural form (1a), we can formulate the limited information maximum likelihood estimator as the constrained problem (Anderson and Rubin 1949)

    l_0 = max_{c_1, c_2, Π_1, Π_2, β, Σ} Σ_i l(y_i, Y_i | X_{1i}, X_{2i})  subject to  c_2 − Π_2β = 0        (5)

The test of instrument exogeneity is the test of the (over-identifying) constraints c_2 − Π_2β = 0. The likelihood ratio approach to testing these constraints compares l_0 to the unconstrained problem

    l_1 = max_{c_1, c_2, Π_1, Π_2, Σ} Σ_i l(y_i, Y_i | X_{1i}, X_{2i})                                       (6)

The likelihood ratio test statistic 2(l_1 − l_0) is asymptotically χ² with k − s − r degrees of freedom under the null of c_2 − Π_2β = 0. To compute the test statistic, one needs to estimate the unconstrained model (6) to obtain l_1. (l_0 can be obtained from the LIML estimates as discussed in the previous sections.) Fortunately, the unconstrained model (6) is relatively easy to estimate. By writing the joint log-likelihood as

    l(y_i, Y_i | X_{1i}, X_{2i}) = l(y_i | Y_i, X_{1i}, X_{2i}) + l(Y_i | X_{1i}, X_{2i})

we see that this model has the same structure as that discussed in Section 2, except that the endogenous regressors Y_i in the y*_i equation are replaced by X_{2i}. From the analytic scores in Appendix A.3, the maximum likelihood estimates for the Y_i equation parameters can be

obtained by a least squares regression of Y_i on X_{1i}, X_{2i}, and l(Y_i | X_{1i}, X_{2i}) can be evaluated as in (3).⁶ The conditional likelihood l(y_i | Y_i, X_{1i}, X_{2i}) is an ordered probit model with conditional mean

    μ_i = c_1'X_{1i} + c_2'X_{2i} + V_i'λ

instead of (2a), where V_i are the residuals from the Y_i regression. The unconstrained log-likelihood l_1 can therefore be evaluated by running least squares and estimating a standard ordered probit model.

4.4 Partial Effects

In applied work, the estimated parameters are usually not of direct interest. More commonly, we are interested in how the various regressors in (1a) affect the probability of the ordered outcomes y. The partial effect is defined as ∂Pr(y = j)/∂Z_h for a continuous regressor Z_h, and as Pr(y = j | D = 1, Z = Z̄) − Pr(y = j | D = 0, Z = Z̄) for a dummy regressor D. These partial effects generally depend on both the parameter vector θ and the regressors. One common approach is to plug in some representative values, such as the sample means or the medians of the regressors. Alternatively, one can average the partial effects evaluated for each observation in the sample. The latter is an estimate of the expected partial effect over the population. If there are several dummy regressors, the averaging approach may be preferable to plugging in sample means of the dummy variables.

As the partial effects are nonlinear functions of the parameters, confidence intervals are usually computed via a first order asymptotic approximation (the delta method). Appendix A.5 provides expressions for the approximate covariance matrix of the partial effects based on the delta method. Here we suggest an alternative Monte Carlo method which does not rely on linearization and does not necessarily center the confidence intervals about the point estimates. We simply simulate draws of the parameters θ ~ N(θ̂, Cov(θ̂)) and evaluate the partial effects at the drawn θ. By repeating this procedure many times, we obtain an empirical distribution of the partial effects from which we can draw inferences. If a simulated draw θ does not satisfy certain parameter restrictions (such as monotonicity of cutpoints), we redraw

⁶ Intuitively, this follows from the SUR structure of the reduced form (4).

until the restrictions are satisfied.

5 Monte Carlo Simulations

In this section we examine the finite sample properties of the proposed ECM and PX-ECM algorithms via Monte Carlo simulations. For the data generating process, we consider the case with r = 1 endogenous regressor Y, s = 2 included exogenous regressors X_1 = 1, X_2, up to three instruments Z_1, ..., Z_3, and m = 5 ordered outcomes. The parameter values are

    y*_i = Y_i + 1 − X_{2,i} + u_i,   (α_1, α_2, α_3, α_4) = (0, 1, 3, 6)
    Y_i = X_{2,i} + Z_{1,i} − · · · − Z_{k−2,i} + V_i
    (u_i, V_i)' ~ N(0, Σ),   Σ = [ 1/(1−ρ²)   ρ√σ_{22}/√(1−ρ²) ; ρ√σ_{22}/√(1−ρ²)   σ_{22} ]

The 2 × 2 covariance matrix Σ is parameterized so that the normalization σ_{1·2} = 1 holds and Cor(u_i, V_i) = ρ. In our simulations, we vary the correlation parameter ρ ∈ {±0.8, ±0.6, ±0.4, ±0.2, 0} and the variance parameter σ_{22} ∈ {1/2, 1, 2}. By varying ρ, we examine the performance of the algorithm as the degree of endogeneity changes. In particular, there is no endogeneity when ρ = 0. By varying σ_{22}, we examine the performance of the algorithm as the quality of the instruments changes. The instruments become poor as σ_{22} increases. In the simulations we also vary the number of instruments Z so that k = 3, 4, 5, where k = 3 corresponds to the just-identified case.

For each combination (k, ρ, σ_{22}), we draw the exogenous regressors X_2, Z_1, ..., Z_{k−2} from the standard normal distribution, independently from each other. The draw is made once and data for y and Y are generated using the same fixed regressor values across the replications. The simulations are repeated 1000 times for a sample size of n = 1000 for each combination (k, ρ, σ_{22}).

As the simulation design fixes the cutpoint parameters α, the distribution of the ordered outcome y changes as we change (k, ρ, σ_{22}). Figure 1 displays the interquartile range of the frequency of the simulated ordered outcomes for the case σ_{22} = 1.⁷ As is clear from Figure 1,

⁷ The distribution is similar for the cases σ_{22} = 1/2, 2 and is not displayed to conserve space.

the simulated y is not evenly distributed over the m = 5 possible outcomes, y = 1 being the most frequently observed and y = 5 the least. This is a common characteristic of ordered outcomes in real data. Also note that the frequency distribution is not symmetric about ρ = 0.

5.1 EM Algorithm

To investigate the finite sample performance of the proposed estimator, Figure 2 displays the root mean squared errors (RMSE) of the parameter estimates for the case k = 4, σ_{22} = 1. The figure compares the estimates from the EM algorithm with those from the two-step Amemiya GLS (AGLS) estimator proposed by Newey (1987).⁸ The RMSE from the two estimators are quite similar for most parameters, except for the excluded exogenous regressors (instruments) in the reduced form equation. The similar performance of the two estimators for the parameters of the y equation reflects the asymptotic efficiency of the AGLS estimator. The inferior performance of the AGLS estimator for the parameters on the instruments is due to the fact that these estimates are essentially least squares estimates which ignore Cor(u, V) = ρ. As such, the performance worsens with the magnitude of ρ. We note that for the just-identified case (k = 3), the RMSE for the parameter on the single instrument is quite similar between the two estimators. The case k = 5 exhibits the same feature as the k = 4 case in that the RMSE of the parameters for the instruments are noticeably larger for AGLS than for PX-ECM.

As the parameter β on the endogenous regressor is often the parameter of interest in applied work, Figure 3 displays the RMSE of the estimates β̂ of this parameter from the PX-ECM algorithm as we vary the number of instruments k − 2 and the variance parameter σ_{22}. The top figure shows that, conditional on k, the RMSE of β̂ deteriorates only slightly as the quality of the instruments becomes poor (σ_{22} becomes large). The bottom figure shows that, conditional on σ_{22}, the RMSE of β̂ improves noticeably as we move from a just-identified model (k = 3) to an over-identified model (k = 4, 5). Moving from k = 4 to k = 5 slightly

⁸ Numerical optimization for the AGLS estimator failed in a few cases (either in the first or second stage ordered probit estimation); these replications were discarded when computing the performance measures based on AGLS. The ECM and PX-ECM algorithms both converged successfully (see footnote 9 for the convergence criterion used) in all cases to almost identical estimates.

improves the RMSE, but not as much as moving from k = 3 to k = 4. While these results may be specific to the chosen data generating process, they do illustrate the potential gains in finite sample precision from having an over-identified system rather than a just-identified system.

Figure 4 compares the computational cost of the ECM and PX-ECM algorithms. The figure displays the interquartile range of the distribution of iteration counts over the Monte Carlo simulations as we vary the pair (k, σ_{22}).⁹ We observe the following. First, conditional on k, convergence becomes slightly slower as the instruments become poor (σ_{22} becomes large). Second, conditional on σ_{22}, convergence becomes faster as we increase k, particularly for the PX-ECM algorithm. Third, the PX-ECM algorithm generally improves on the ECM algorithm, sometimes substantially so as we increase k. For k = 5, σ_{22} = 2, ρ = 0.8, the median iteration count was 177 for the ECM algorithm, while that for the PX-ECM was 79, a gain of about 55%.¹⁰

5.2 Inference

The finite sample performance of the endogeneity tests described in Section 4.2 is displayed in Figures 5 and 6. We compare the following five tests: the Wald test from the two-step Amemiya GLS, two Wald tests, the Lagrange multiplier (score) test, and the likelihood ratio test. All tests are based on likelihood-based estimates from the EM algorithm, except the first, which is based on the two-step estimates. The hypothesis under test is λ = σ_{21}/σ_{22} = 0, except for one of the Wald tests where we test σ_{21} = λσ_{22} = 0 based on the delta method. As can be seen from Figures 5 and 6, the finite sample performance of all tests is quite similar both in terms of size and power for all (k, σ_{22}) pairs considered. Figure 5 indicates that there is very little size distortion. If anything, the tests tend to be slightly undersized, except for the case (k, σ_{22}) = (5, 2) in which they appear to be slightly oversized. The tests tend to get slightly more oversized as the instruments become poor (with an increase in σ_{22}),

⁹ A completion of both the E-step and the M-step is counted as one iteration. The convergence criterion was satisfaction of either l(θ^(t)) − l(θ^(t−1)) < ε or max_i |θ_i^(t) − θ_i^(t−1)| < 10⁻⁸, where l(·) is the observed log-likelihood and ε is a small tolerance.
¹⁰ Although the PX-ECM has one additional conditional M-step to update σ_{1·2}, total computation time can still be substantially reduced due to the decrease in iteration counts.

particularly for the case k = 5. Figure 6 shows that for the alternatives considered, the empirical power is nearly one for all cases except ρ = ±0.2.11 For the case ρ = 0.2, we find that the power increases with the number of instruments k, holding constant their quality σ_22. For a given number of instruments k, the power increases as the quality of the instruments improves (with smaller σ_22). The main message we take from this is that for testing endogeneity, the quality and number of instruments appear more important than the choice of the test statistic for values of λ close to zero.

Figure 7 examines the empirical size of the instrument exogeneity (over-identification) test discussed in Section 4.3.12 The empirical size is the rejection frequency of the test statistic based on the asymptotic χ² distribution with k − 3 degrees of freedom. As seen in Figure 7, the test is generally well sized. If anything, the test has a slight tendency to be undersized. However, there are no systematic changes in the finite sample performance of the test as we vary ρ and σ_22.

6 Empirical Application

As an empirical application, we examine the relationship between measures of social interactions and population density. A detailed discussion of the economic issues and data is given in Brueckner and Largey (2006). Brueckner and Largey (2006) examined the relationship for measures of social interactions that are either binary (dummy variable) or continuous. In this application, we consider three measures of social interactions that are coded as ordered outcomes: neisoc (how often the respondent socializes with neighbors), confide (number of people the respondent can confide in), and friends (number of close friends). These three ordered outcomes were treated as continuous variables in Brueckner and Largey (2006). Figure 8 provides the ordered coding of these three variables and sample frequencies of each

11 The empirical power is not size adjusted; size adjustment makes little difference as the size distortion is small. The alternatives considered in terms of λ vary from 0.14 to 0.94 for σ_22 = 1/2, from 0.20 to 1.33 for σ_22 = 1, and from 0.29 to 1.89 for σ_22 = 2, where the lower bounds correspond to the case ρ = ±0.2.
12 The results in Figure 7 are based on the LR test using LIML estimates from the EM algorithm. We also examined the performance of the over-identification test using the two-step Amemiya GLS estimates. The results were nearly identical to those from the LR tests and are not reported.

outcome. We note that while the sample outcomes for neisoc and friends are relatively balanced, the outcomes for confide are concentrated in the highest order category. The covariates that correlate with these measures of social interactions are described in Table 1. These covariates are mostly dummy variables, except for age (and age squared), number of children, and the log of census tract population density. Population density is considered an endogenous regressor for reasons explained in Brueckner and Largey (2006). As in Brueckner and Largey (2006), the instruments are log average density for urbanized areas (den_ua) and metropolitan statistical areas (den_msa) containing the tract.

The LIML estimates are reported in Table 2 for neisoc and confide. (We do not report estimates for friends as it fails the normality test, as discussed below.) The parameter λ, which measures the extent of endogeneity, is statistically significant for neisoc but not for confide. The implied correlation is Cor(u_i, V_i) = for neisoc and Cor(u_i, V_i) = for confide. The correlations thus appear to be practically small. The coefficients on the two instruments den_ua and den_msa in the reduced form equation for den_tract are both statistically significant, indicating that the instrument relevance condition is satisfied. The over-identification LR test reported at the bottom of Table 2 is insignificant for both neisoc and confide, indicating that instrument exogeneity is satisfied.

The tests for normality of the residuals discussed in Section 4.1 are reported in Table 3. The tests indicate that the third and fourth moments of the residuals V_i from the reduced form equation (1b) for the endogenous regressor depart from those implied by the normal distribution. The conditional moment test for normality of the residuals u_i from the structural equation (1a) fails to reject the null for neisoc and confide but indicates departure from normality for friends.

The statistic of interest in Brueckner and Largey (2006) is the relation between social interactions and population density. As the coefficient on den_tract in Table 2 is difficult to interpret in our nonlinear model, we display the partial effects of den_tract on the two measures of social interactions in Figure 9. These partial effects are evaluated at the sample median values of the covariates, except that we vary age over the relevant range. We note several interesting features of these partial effects. First, as the estimated parameter on

den_tract is negative for both social interaction measures, the probability of higher interaction decreases with density. However, the effect is not monotone. For confide, for example, the probability of higher interaction first increases and then decreases with density. Second, the parameter on age is statistically significant for both social interaction measures. However, the effect of age on the partial effects of population density is quite different between the two social interaction measures. For confide, the partial effects hardly vary with age, suggesting that the age effect is not of practical importance for the effect of population density. For neisoc, there is a noticeable age effect on the effect of population density. In particular, while the effects of population density on social interactions for middle ranked outcomes increase with age, those for low and high ranked outcomes decrease with age. Such nonlinear effects of covariates are hardly apparent in the parameter estimates, and Figure 9 highlights the importance of displaying the partial effects when interpreting the implications of the estimated model.

7 Concluding Remarks

The proposed ECM algorithm to estimate ordered probit models with endogenous regressors is numerically stable and works well even for models with a large number of parameters. As inference procedures using LIML are well established, we expect more widespread application of LIML estimation using the proposed algorithm in applied work. Although the present paper focused on the ordered probit model with endogenous regressors, the EM algorithm can be easily modified or extended to other classes of discrete or limited dependent variable models as considered by Newey (1987). The binary probit model with endogenous regressors is a special case of the model considered in this paper and requires no modification. One can also modify the EM algorithm to estimate limited dependent variable (e.g. Tobit) models with endogenous regressors.

A EM Algorithm

A.1 E-Step

The reparametrized latent variable model is

    y_i = j if  y*_i ≤ 0 (j = 1),  0 ≤ y*_i < 1 (1 < j < m),  0 < y*_i (j = m)    (7a)

where

    y*_i = μ_i + u_i,  u_i ~ N(−α_1, σ_1²)  for j = 1,
    y*_i ~ N( (μ_i − α_{j−1})/δ_j, σ_1²/δ_j² )  for 1 < j,    (7b)

μ_i and σ_1² are defined in (2), δ_j ≡ α_j − α_{j−1} > 0 for 1 < j < m, and δ_1 = δ_m = 1. The complete data log likelihood function can be written as

    l(y*, Y) = Σ_i [ l(y*_i | Y_i, X_i, θ) + l(Y_i | X_i, θ) ]
             = −(n/2) log(2π) − (n/2) log σ_1² − Σ_{y_i=1} (y*_i − μ_i + α_1)²/(2σ_1²)
               + Σ_{j=2}^m Σ_{y_i=j} [ log δ_j − (δ_j y*_i − μ_i + α_{j−1})²/(2σ_1²) ]
               − (nr/2) log(2π) − (n/2) log|Σ_22| − (1/2) Σ_i V_i′ Σ_22^{−1} V_i

From the first two moments of the truncated normal distribution (Johnson and Kotz 1970,

pp. 81-83), we have the conditional moments

    ŷ*_i ≡ E[y*_i | y_i = j]
        = μ_ij − [φ(z_{0,ij})/Φ(z_{0,ij})] √σ_1²,                                       j = 1
        = μ_ij + [(φ(z_{0,ij}) − φ(z_{1,ij}))/(Φ(z_{1,ij}) − Φ(z_{0,ij}))] √σ_1²/δ_j,   1 < j < m    (8a)
        = μ_ij + [φ(z_{0,ij})/Φ(−z_{0,ij})] √σ_1²,                                      j = m

    σ²_i ≡ E[(y*_i − ŷ*_i)² | y_i = j]
        = [1 − z_{0,ij} φ(z_{0,ij})/Φ(z_{0,ij}) − (φ(z_{0,ij})/Φ(z_{0,ij}))²] σ_1²,     j = 1
        = [1 + (z_{0,ij} φ(z_{0,ij}) − z_{1,ij} φ(z_{1,ij}))/(Φ(z_{1,ij}) − Φ(z_{0,ij}))
             − ((φ(z_{0,ij}) − φ(z_{1,ij}))/(Φ(z_{1,ij}) − Φ(z_{0,ij})))²] σ_1²/δ_j²,   1 < j < m    (8b)
        = [1 + z_{0,ij} φ(z_{0,ij})/Φ(−z_{0,ij}) − (φ(z_{0,ij})/Φ(−z_{0,ij}))²] σ_1²,   j = m

where

    μ_ij ≡ μ_i − α_1 (j = 1),  μ_ij ≡ (μ_i − α_{j−1})/δ_j (1 < j),
    z_{0,ij} ≡ −μ_ij δ_j/√σ_1²,  z_{1,ij} ≡ (1 − μ_ij) δ_j/√σ_1².

The expected complete data log likelihood conditional on the observables can then be written as

    q(θ) ≡ E[l(y*, Y) | y]
         = −(n/2) log(2π) − (n/2) log σ_1² − Σ_{y_i=1} [(ŷ*_i − μ_i1)² + σ²_i]/(2σ_1²)
           + Σ_{j=2}^m Σ_{y_i=j} ( log δ_j − δ_j² [(ŷ*_i − μ_ij)² + σ²_i]/(2σ_1²) )
           − (nr/2) log(2π) − (n/2) log|Σ_22| − (1/2) Σ_i V_i′ Σ_22^{−1} V_i

A.2 M-step

A.2.1 β update

    ∂q/∂β = Σ_{j=1}^m Σ_{y_i=j} δ_j (ŷ*_i − μ_ij) Y_i / σ_1²
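Before continuing with the M-step, the interior-category case of the moments in (8a)-(8b) can be sketched numerically: it is just the mean and variance of a normal variable truncated to (0, 1), so it can be cross-checked against scipy.stats.truncnorm. The function below is our own illustration, not code from the paper:

```python
import numpy as np
from scipy.stats import norm

def estep_moments_interior(mu_ij, sigma1, delta_j):
    """E-step moments for an interior category 1 < j < m: mean and variance
    of a N(mu_ij, (sigma1/delta_j)^2) variable truncated to (0, 1)."""
    s = sigma1 / delta_j                     # conditional standard deviation
    z0 = -mu_ij * delta_j / sigma1           # z_{0,ij}: standardized cutpoint 0
    z1 = (1.0 - mu_ij) * delta_j / sigma1    # z_{1,ij}: standardized cutpoint 1
    p = norm.cdf(z1) - norm.cdf(z0)
    lam = (norm.pdf(z0) - norm.pdf(z1)) / p
    mean = mu_ij + lam * s                                      # equation (8a)
    var = (1.0 + (z0 * norm.pdf(z0) - z1 * norm.pdf(z1)) / p
           - lam ** 2) * s ** 2                                 # equation (8b)
    return mean, var
```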

The first order condition ∂q/∂β = 0 can be written as

    Σ_i Y_i Y_i′ β = Σ_{y_i=1} (ŷ*_i + α_1) Y_i + Σ_{j=2}^m Σ_{y_i=j} (δ_j ŷ*_i + α_{j−1}) Y_i − Σ_i (X_1i′γ + V_i′λ) Y_i

The r × 1 parameter β can be computed as a least squares regression of ỹ_i − X_1i′γ − V_i′λ on Y_i, where

    ỹ_i = ŷ*_i + α_1 for y_i = 1,  ỹ_i = δ_j ŷ*_i + α_{j−1} for 1 < y_i = j.    (9)

A.2.2 γ update

    ∂q/∂γ = Σ_{j=1}^m Σ_{y_i=j} δ_j (ŷ*_i − μ_ij) X_1i / σ_1²

The first order condition ∂q/∂γ = 0 can be written as

    Σ_i X_1i X_1i′ γ = Σ_{y_i=1} (ŷ*_i + α_1) X_1i + Σ_{j=2}^m Σ_{y_i=j} (δ_j ŷ*_i + α_{j−1}) X_1i − Σ_i (Y_i′β + V_i′λ) X_1i

The s × 1 parameter γ can be computed as a least squares regression of ỹ_i − Y_i′β − V_i′λ on X_1i, with ỹ_i as defined in (9).

A.2.3 Π update

    ∂q/∂vec(Π) = Σ_{j=1}^m Σ_{y_i=j} δ_j (ŷ*_i − μ_ij) Dμ_i(Π) / σ_1² − (1/2) Σ_i D[V_i′Σ_22^{−1}V_i](Π)

where

    Dμ_i(Π) = −(I_r ⊗ X_i) λ = −(λ ⊗ X_i)
    D[V_i′Σ_22^{−1}V_i](Π) = −2 (I_r ⊗ X_i) Σ_22^{−1} V_i = −2 (Σ_22^{−1} ⊗ X_i) V_i
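The β update described above is a plain least-squares regression; a minimal sketch, with array shapes and names of our own choosing:

```python
import numpy as np

def update_beta(Y, X1, V, ytilde, gamma, lam):
    """Conditional M-step for beta: regress ytilde_i - X1_i'gamma - V_i'lam
    on Y_i by least squares. Rows index observations."""
    target = ytilde - X1 @ gamma - V @ lam
    beta, *_ = np.linalg.lstsq(Y, target, rcond=None)
    return beta
```

The γ update is the same computation with the roles of Y_i and X_1i swapped.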

The first order condition ∂q/∂vec(Π) = 0 can be written as

    [ (λλ′ + σ_1² Σ_22^{−1}) ⊗ Σ_i X_i X_i′ ] vec(Π)
        = vec( Σ_{i=1}^n X_i Y_i′ (λλ′ + σ_1² Σ_22^{−1}) ) − vec( Σ_i (ỹ_i − Y_i′β − X_1i′γ) X_i λ′ )    (10)

which is just a system of linear equations in the parameters Π.

A.2.4 λ update

    ∂q/∂λ = Σ_{j=1}^m Σ_{y_i=j} δ_j (ŷ*_i − μ_ij) V_i / σ_1²

The first order condition ∂q/∂λ = 0 can be written as

    Σ_i V_i V_i′ λ = Σ_i (ỹ_i − Y_i′β − X_1i′γ) V_i

The r × 1 parameter λ can be computed as a least squares regression of ỹ_i − Y_i′β − X_1i′γ on V_i, with ỹ_i as defined in (9).

A.2.5 Σ_22 update

    ∂q/∂vec(Σ_22) = −(n/2) vec(Σ_22^{−1}) + (1/2) Σ_i vec(Σ_22^{−1} V_i V_i′ Σ_22^{−1})

The first-order condition ∂q/∂vec(Σ_22) = 0 can be solved for Σ_22 as

    Σ_22 = (1/n) Σ_i V_i V_i′

A.2.6 σ_1² update

    ∂q/∂σ_1² = −n/(2σ_1²) + Σ_{j=1}^m Σ_{y_i=j} [ (δ_j ŷ*_i − δ_j μ_ij)² + δ_j² σ²_i ] / (2σ_1⁴)
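Equation (10) is linear in vec(Π) and can be solved directly. The sketch below uses our own shapes and a column-major vec convention, so the sign and stacking conventions are assumptions rather than the paper's exact notation; with λ = 0 and σ_1² = 1 the system collapses to the OLS reduced-form fit:

```python
import numpy as np

def update_Pi(X, Y, lam, sigma1_sq, Sigma22_inv, resid):
    """Conditional M-step for Pi solving the linear system (10).
    X: (n,k) exogenous variables, Y: (n,r) endogenous regressors,
    lam: (r,), resid_i = ytilde_i - Y_i'beta - X1_i'gamma: (n,)."""
    A = np.outer(lam, lam) + sigma1_sq * Sigma22_inv         # r x r
    lhs = np.kron(A, X.T @ X)                                # (A kron X'X) vec(Pi)
    rhs = (X.T @ Y @ A - np.outer(X.T @ resid, lam)).ravel(order="F")
    vecPi = np.linalg.solve(lhs, rhs)
    return vecPi.reshape(X.shape[1], Y.shape[1], order="F")  # Pi: (k, r)
```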

The first-order condition ∂q/∂σ_1² = 0 can be solved for σ_1² as

    σ_1² = (1/n) Σ_i ( (ỹ_i − μ_i)² + δ²_{y_i} σ²_i )    (11)

where ỹ_i is defined in (9) and n is the sample size.

A.2.7 δ_j update

    ∂q/∂δ_j = Σ_{y_i=j} ( 1/δ_j − δ_j (ŷ*_i² + σ²_i)/σ_1² + δ_j μ_ij ŷ*_i/σ_1² ) − Σ_{k=j+1}^m Σ_{y_i=k} δ_k (ŷ*_i − μ_ik)/σ_1²

The first-order condition ∂q/∂δ_j = 0 is a quadratic equation in δ_j:

    ( Σ_{y_i=j} (ŷ*_i² + σ²_i) + n_{j+1} + ⋯ + n_m ) δ_j²
      − ( Σ_{y_i=j} (μ_i − α_1 − δ_2 − ⋯ − δ_{j−1}) ŷ*_i
          − Σ_{k=j+1}^m Σ_{y_i=k} (δ_k ŷ*_i − μ_i + α_1 + δ_2 + ⋯ + δ_{j−1} + δ_{j+1} + ⋯ + δ_{k−1}) ) δ_j
      − n_j σ_1² = 0    (12)

where n_j is the number of observations with y_i = j and δ_m = 1.13 This quadratic equation has real roots, the larger of which is positive as required for δ_j.

A.3 Scores

This section provides expressions for the analytic scores for the normalization σ_1² = 1. Denote the joint log-likelihood function as

    l(θ) = Σ_{j=1}^m Σ_{y_i=j} log( Φ(z_{ji}) − Φ(z_{j−1,i}) ) − (nr/2) log(2π) − (n/2) log|Σ_22| − (1/2) Σ_i V_i′ Σ_22^{−1} V_i

13 For j = 2, the first sum in the coefficient of δ_j is Σ_{y_i=j} (μ_i − α_1) ŷ*_i, while for j = m − 1, the second double sum in the coefficient of δ_j becomes Σ_{y_i=m} (ŷ*_i − μ_i + α_1 + δ_2 + ⋯ + δ_{m−2}).
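Because the constant term −n_j σ_1² in (12) is negative while the leading coefficient is positive, the two roots have opposite signs and the larger one is the valid update. A sketch of that root selection (helper name is ours):

```python
import numpy as np

def delta_update(a, b, c):
    """Larger root of a*x^2 + b*x + c = 0. In the delta_j update a > 0 and
    c = -n_j * sigma1_sq < 0, so the discriminant is positive and the larger
    root is strictly positive, preserving monotone cutpoints."""
    disc = b * b - 4.0 * a * c
    return (-b + np.sqrt(disc)) / (2.0 * a)
```

This is how the parameterization keeps each updated δ_j > 0 without constrained optimization.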

where z_{ji} ≡ (α_j − μ_i)/√σ_1². Then

    ∂l/∂β = Σ_i φ̃_{ji} Y_i
    ∂l/∂γ = Σ_i φ̃_{ji} X_1i
    ∂l/∂vec(Π) = Σ_i ( −φ̃_{ji} (λ ⊗ X_i) + vec(X_i V_i′ Σ_22^{−1}) )
    ∂l/∂λ = Σ_i φ̃_{ji} V_i
    ∂l/∂vech(Σ_22) = −(n/2) D_r′ vec(Σ_22^{−1}) + (1/2) Σ_i D_r′ vec(Σ_22^{−1} V_i V_i′ Σ_22^{−1})
    ∂l/∂δ_j = (1/√σ_1²) Σ_{y_i=j} φ(z_{ji})/(Φ(z_{ji}) − Φ(z_{j−1,i})) − Σ_{k=j+1}^m Σ_{y_i=k} φ̃_{ki},  j = 2, …, m − 1

where D_r is the r² × r(r+1)/2 duplication matrix such that vec(Σ_22) = D_r vech(Σ_22) and

    √σ_1² φ̃_{ji} ≡ (φ(z_{j−1,i}) − φ(z_{ji}))/(Φ(z_{ji}) − Φ(z_{j−1,i}))

A.4 Conditional Moment Test

The first four moment residuals (Chesher and Irish 1987) for the ordered dependent variable equation (1a) are given by

    e_{i,1} = (φ(z_{j−1,i}) − φ(z_{j,i}))/(Φ(z_{j,i}) − Φ(z_{j−1,i}))
    e_{i,2} = (z_{j−1,i} φ(z_{j−1,i}) − z_{j,i} φ(z_{j,i}))/(Φ(z_{j,i}) − Φ(z_{j−1,i}))
    e_{i,3} = 2 e_{i,1} + (z_{j−1,i}² φ(z_{j−1,i}) − z_{j,i}² φ(z_{j,i}))/(Φ(z_{j,i}) − Φ(z_{j−1,i}))
    e_{i,4} = 3 e_{i,2} + (z_{j−1,i}³ φ(z_{j−1,i}) − z_{j,i}³ φ(z_{j,i}))/(Φ(z_{j,i}) − Φ(z_{j−1,i}))

where φ(·), Φ(·) are the density and distribution function of the standard normal, z_{j,i} ≡ (α_j − μ_i)/√σ_1², and μ_i is defined in (2a). z_{j,i} is evaluated at the maximum likelihood estimates of the parameters. Note that e_{i,1} = φ̃_{ji} as defined in Appendix A.3.
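These moment residuals can be checked numerically: under normality, averaging each residual over the categories with weights Pr(y = j) telescopes to zero, because the boundary terms z^p φ(z) cancel between adjacent categories and vanish at ±∞. A sketch with our own helper:

```python
import numpy as np
from scipy.stats import norm

def moment_residuals(z_lo, z_hi):
    """Chesher-Irish moment residuals e_1..e_4 for an observation whose
    category has standardized cutpoints (z_lo, z_hi); end categories use
    -inf / +inf, where z**p * phi(z) is taken as 0."""
    p = norm.cdf(z_hi) - norm.cdf(z_lo)
    def t(power):
        lo = 0.0 if np.isinf(z_lo) else z_lo ** power * norm.pdf(z_lo)
        hi = 0.0 if np.isinf(z_hi) else z_hi ** power * norm.pdf(z_hi)
        return (lo - hi) / p
    e1 = t(0)
    e2 = t(1)
    e3 = 2.0 * e1 + t(2)
    e4 = 3.0 * e2 + t(3)
    return np.array([e1, e2, e3, e4])
```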

The score χ² statistic for the conditional moment test of normality of the residuals u_i in (1a) is given by

    ι′ R (R′R)^{−1} R′ ι

where ι is an n × 1 vector of ones and R is an n × (2r + s + 3) matrix with typical row

    R_i = ( Y_i′ e_{i,1}, X_1i′ e_{i,1}, V_i′ e_{i,1}, e_{i,2}, e_{i,3}, e_{i,4} )

The score statistic can be obtained as the explained sum of squares from the regression of ι on R (where any collinear columns of R are dropped from the regression). Under the null of normality, the score statistic is asymptotically χ² distributed with degrees of freedom equal to the column rank of R.

A.5 Partial Effects

The outcomes of interest are

    Pr(y = j) = Pr(α_{j−1} ≤ y* < α_j) = Φ( (α_j − x′b)/√σ_11 ) − Φ( (α_{j−1} − x′b)/√σ_11 )

for j = 1, …, m, where x′b ≡ Y′β + X_1′γ (so x stacks Y and X_1 and b stacks β and γ). To simplify notation in what follows, define

    z̄_j ≡ (α_j − x′b)/√σ_11,
    z̄_{j,1} ≡ (α_j − x_{−k}′b_{−k} − b_k)/√σ_11,
    z̄_{j,0} ≡ (α_j − x_{−k}′b_{−k})/√σ_11,

where a negative subscript on a vector indicates the vector without the corresponding element.

A.5.1 Continuous regressor case

The marginal effect of a continuous regressor x_k is

    h(θ̂) ≡ ∂Pr(y = j)/∂x_k = ( φ(z̄_{j−1}) − φ(z̄_j) ) b_k/√σ_11

From the delta method, the approximate variance of the marginal effect h(θ̂) can be obtained as

    Var( h(θ̂) ) ≈ (∂h(θ̂)/∂θ)′ Cov(θ̂) (∂h(θ̂)/∂θ)
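The outer-product form ι′R(R′R)^{−1}R′ι above is exactly the explained sum of squares from regressing a vector of ones on R, which is how it is computed in practice; a sketch:

```python
import numpy as np

def cm_score_statistic(R):
    """Score statistic as the explained sum of squares from the no-intercept
    regression of a vector of ones on the moment-residual matrix R (n x q)."""
    ones = np.ones(R.shape[0])
    coef, *_ = np.linalg.lstsq(R, ones, rcond=None)
    fitted = R @ coef
    return float(fitted @ fitted)   # equals ones' R (R'R)^{-1} R' ones
```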

where

    ∂h(θ̂)/∂θ = ( z̄_j φ(z̄_j) ∂z̄_j/∂θ − z̄_{j−1} φ(z̄_{j−1}) ∂z̄_{j−1}/∂θ ) b_k/√σ_11 + ( φ(z̄_{j−1}) − φ(z̄_j) ) ∂(b_k/√σ_11)/∂θ

For the σ_1² = 1 normalization, the relevant subvector of the parameters is θ = (β′, γ′, λ′, vech(Σ_22)′, δ′)′ with

    ∂z̄_j/∂θ = ( −Y′/√σ_11, −X_1′/√σ_11, −z̄_j λ′Σ_22/σ_11, −(z̄_j/(2σ_11)) (λ′ ⊗ λ′) D_r, ι_{j−1}′/√σ_11, 0_{m−j−1}′ )′

    ∂(b_k/√σ_11)/∂θ = ( 0_{k−1}′, 1, 0_{r+s−k}′, −(b_k/σ_11) λ′Σ_22, −(b_k/(2σ_11)) (λ′ ⊗ λ′) D_r, 0_{m−2}′ )′ / √σ_11

where ι_j is a j × 1 vector of ones.

For the σ_11 = 1 normalization, the relevant subvector of the parameters is θ = (β′, γ′, δ′)′ with

    ∂z̄_j/∂θ = ( −Y′, −X_1′, ι_{j−1}′, 0_{m−j−1}′ )′,  ∂(b_k/√σ_11)/∂θ = ( 0_{k−1}′, 1, 0_{r+s−k}′, 0_{m−2}′ )′

A.5.2 Dummy regressor case

The marginal effect of a dummy variable regressor x_k is

    Δ_j(θ̂) ≡ Pr(y = j | x_k = 1) − Pr(y = j | x_k = 0) = Φ(z̄_{j,1}) − Φ(z̄_{j−1,1}) − Φ(z̄_{j,0}) + Φ(z̄_{j−1,0})

From the delta method, the approximate variance of the effect Δ_j(θ̂) can be obtained as

    Var( Δ_j(θ̂) ) ≈ (∂Δ_j(θ̂)/∂θ)′ Cov(θ̂) (∂Δ_j(θ̂)/∂θ)

where

    ∂Δ_j(θ̂)/∂θ = φ(z̄_{j,1}) ∂z̄_{j,1}/∂θ − φ(z̄_{j−1,1}) ∂z̄_{j−1,1}/∂θ − φ(z̄_{j,0}) ∂z̄_{j,0}/∂θ + φ(z̄_{j−1,0}) ∂z̄_{j−1,0}/∂θ

The derivatives ∂z̄_{j,1}/∂θ and ∂z̄_{j,0}/∂θ are trivial modifications of ∂z̄_j/∂θ given for the continuous regressor case above, where the element in X_1 corresponding to the dummy regressor is replaced by one and zero, respectively.
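Under the σ_11 = 1 normalization, the continuous-regressor partial effect and its delta-method variance can be sketched as follows. The gradient here is taken by central differences rather than from the analytic expressions above, and all names and parameter packings are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

def partial_effect(b, x, k, a_lo, a_hi):
    """(phi(z_{j-1}) - phi(z_j)) * b_k with z = alpha - x'b (sigma_11 = 1)."""
    return (norm.pdf(a_lo - x @ b) - norm.pdf(a_hi - x @ b)) * b[k]

def delta_method_var(fun, theta, cov, eps=1e-6):
    """Approximate Var(fun(theta_hat)) = g' Cov g using a central-difference
    gradient g of fun evaluated at theta."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        up, dn = theta.copy(), theta.copy()
        up[i] += eps
        dn[i] -= eps
        g[i] = (fun(up) - fun(dn)) / (2.0 * eps)
    return float(g @ cov @ g)
```

For a dummy regressor one would replace partial_effect by the difference of category probabilities at x_k = 1 and x_k = 0, with the same variance machinery.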

Figure 1: Distribution of simulated ordered outcomes. Each panel displays the interquartile range (and the lines connect the median) of the frequency of the simulated ordered outcomes over the 1000 Monte Carlo simulations for nine correlation parameter values ρ = −0.8, −0.6, …, +0.8. k is the total number of exogenous variables in the simulated model. The variance parameter is σ_22 = 1; the distributions for the cases σ_22 = 1/2 and σ_22 = 2 are similar.

Figure 2: Root mean squared error (RMSE) of each parameter estimate. The model is y* = bY + c_1 + c_2 X_2 + u, Y = π_1 + π_2 X_2 + π_3 Z_1 + π_4 Z_2 + V, Cor(u, V) = ρ, Var(V) = σ_22, λ = σ_21/σ_22 = ρ/√((1 − ρ²)σ_22). The true parameter values are b = c_1 = 1, c_2 = 1, π_1 = 0, π_2 = 1, π_3 = π_4 = 1, σ_22 = 1, λ = ρ/√(1 − ρ²), δ_2 = 1, δ_3 = 2, δ_4 = 3, for ρ = −0.8, −0.6, …, +0.8. pxem is from the PX-EM algorithm and agls is from the two-step Amemiya GLS (Newey 1987) estimator. Based on 1000 Monte Carlo replications.

Figure 3: Root mean squared error (RMSE) of the endogenous regressor parameter estimate β. The two figures display the same information but use different conditioning variables in each panel as we vary the number of exogenous regressors k and the variance parameter σ_22. RMSEs are based on estimates from the PX-EM algorithm over 1000 Monte Carlo replications.

Figure 4: Distribution of iteration counts of the EM and PX-EM algorithms. Each panel displays the interquartile range (and the lines connect the median) of the iterations required to estimate the endogenous ordered probit model for the simulated data set over 1000 Monte Carlo replications. The Monte Carlo design varies the correlation parameter ρ = −0.8, −0.6, …, +0.8, the total number of exogenous variables k, and the variance parameter σ_22.


G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication G. S. Maddala Kajal Lahiri WILEY A John Wiley and Sons, Ltd., Publication TEMT Foreword Preface to the Fourth Edition xvii xix Part I Introduction and the Linear Regression Model 1 CHAPTER 1 What is Econometrics?

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 4 Jakub Mućk Econometrics of Panel Data Meeting # 4 1 / 30 Outline 1 Two-way Error Component Model Fixed effects model Random effects model 2 Non-spherical

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models Journal of Finance and Investment Analysis, vol.1, no.1, 2012, 55-67 ISSN: 2241-0988 (print version), 2241-0996 (online) International Scientific Press, 2012 A Non-Parametric Approach of Heteroskedasticity

More information

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST

TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Econometrics Working Paper EWP0402 ISSN 1485-6441 Department of Economics TESTING FOR NORMALITY IN THE LINEAR REGRESSION MODEL: AN EMPIRICAL LIKELIHOOD RATIO TEST Lauren Bin Dong & David E. A. Giles Department

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Heteroskedasticity Text: Wooldridge, 8 July 17, 2011 Heteroskedasticity Assumption of homoskedasticity, Var(u i x i1,..., x ik ) = E(u 2 i x i1,..., x ik ) = σ 2. That is, the

More information

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

Binary Models with Endogenous Explanatory Variables

Binary Models with Endogenous Explanatory Variables Binary Models with Endogenous Explanatory Variables Class otes Manuel Arellano ovember 7, 2007 Revised: January 21, 2008 1 Introduction In Part I we considered linear and non-linear models with additive

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

More information

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 7: Cluster Sampling Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of roups and

More information

8. Hypothesis Testing

8. Hypothesis Testing FE661 - Statistical Methods for Financial Engineering 8. Hypothesis Testing Jitkomut Songsiri introduction Wald test likelihood-based tests significance test for linear regression 8-1 Introduction elements

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Introduction Large Sample Testing Composite Hypotheses. Hypothesis Testing. Daniel Schmierer Econ 312. March 30, 2007

Introduction Large Sample Testing Composite Hypotheses. Hypothesis Testing. Daniel Schmierer Econ 312. March 30, 2007 Hypothesis Testing Daniel Schmierer Econ 312 March 30, 2007 Basics Parameter of interest: θ Θ Structure of the test: H 0 : θ Θ 0 H 1 : θ Θ 1 for some sets Θ 0, Θ 1 Θ where Θ 0 Θ 1 = (often Θ 1 = Θ Θ 0

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Regression diagnostics

Regression diagnostics Regression diagnostics Kerby Shedden Department of Statistics, University of Michigan November 5, 018 1 / 6 Motivation When working with a linear model with design matrix X, the conventional linear model

More information

Testing an Autoregressive Structure in Binary Time Series Models

Testing an Autoregressive Structure in Binary Time Series Models ömmföäflsäafaäsflassflassflas ffffffffffffffffffffffffffffffffffff Discussion Papers Testing an Autoregressive Structure in Binary Time Series Models Henri Nyberg University of Helsinki and HECER Discussion

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

,..., θ(2),..., θ(n)

,..., θ(2),..., θ(n) Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 2 Jakub Mućk Econometrics of Panel Data Meeting # 2 1 / 26 Outline 1 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within

More information

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation Seung C. Ahn Arizona State University, Tempe, AZ 85187, USA Peter Schmidt * Michigan State University,

More information

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume

More information

Statistical Estimation

Statistical Estimation Statistical Estimation Use data and a model. The plug-in estimators are based on the simple principle of applying the defining functional to the ECDF. Other methods of estimation: minimize residuals from

More information

Statistical Analysis of List Experiments

Statistical Analysis of List Experiments Statistical Analysis of List Experiments Graeme Blair Kosuke Imai Princeton University December 17, 2010 Blair and Imai (Princeton) List Experiments Political Methodology Seminar 1 / 32 Motivation Surveys

More information

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES REVSTAT Statistical Journal Volume 13, Number 3, November 2015, 233 243 MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES Authors: Serpil Aktas Department of

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College Original December 2016, revised July 2017 Abstract Lewbel (2012)

More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

GMM estimation of spatial panels

GMM estimation of spatial panels MRA Munich ersonal ReEc Archive GMM estimation of spatial panels Francesco Moscone and Elisa Tosetti Brunel University 7. April 009 Online at http://mpra.ub.uni-muenchen.de/637/ MRA aper No. 637, posted

More information

Spatial Regression. 9. Specification Tests (1) Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Spatial Regression. 9. Specification Tests (1) Luc Anselin.   Copyright 2017 by Luc Anselin, All Rights Reserved Spatial Regression 9. Specification Tests (1) Luc Anselin http://spatial.uchicago.edu 1 basic concepts types of tests Moran s I classic ML-based tests LM tests 2 Basic Concepts 3 The Logic of Specification

More information

Birkbeck Working Papers in Economics & Finance

Birkbeck Working Papers in Economics & Finance ISSN 1745-8587 Birkbeck Working Papers in Economics & Finance Department of Economics, Mathematics and Statistics BWPEF 1809 A Note on Specification Testing in Some Structural Regression Models Walter

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid

Applied Economics. Regression with a Binary Dependent Variable. Department of Economics Universidad Carlos III de Madrid Applied Economics Regression with a Binary Dependent Variable Department of Economics Universidad Carlos III de Madrid See Stock and Watson (chapter 11) 1 / 28 Binary Dependent Variables: What is Different?

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

STA 216, GLM, Lecture 16. October 29, 2007

STA 216, GLM, Lecture 16. October 29, 2007 STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural

More information

Discrete Dependent Variable Models

Discrete Dependent Variable Models Discrete Dependent Variable Models James J. Heckman University of Chicago This draft, April 10, 2006 Here s the general approach of this lecture: Economic model Decision rule (e.g. utility maximization)

More information

Testing for Regime Switching in Singaporean Business Cycles

Testing for Regime Switching in Singaporean Business Cycles Testing for Regime Switching in Singaporean Business Cycles Robert Breunig School of Economics Faculty of Economics and Commerce Australian National University and Alison Stegman Research School of Pacific

More information

Lecture 3: Multiple Regression

Lecture 3: Multiple Regression Lecture 3: Multiple Regression R.G. Pierse 1 The General Linear Model Suppose that we have k explanatory variables Y i = β 1 + β X i + β 3 X 3i + + β k X ki + u i, i = 1,, n (1.1) or Y i = β j X ji + u

More information

GOODNESS OF FIT TESTS IN STOCHASTIC FRONTIER MODELS. Christine Amsler Michigan State University

GOODNESS OF FIT TESTS IN STOCHASTIC FRONTIER MODELS. Christine Amsler Michigan State University GOODNESS OF FIT TESTS IN STOCHASTIC FRONTIER MODELS Wei Siang Wang Nanyang Technological University Christine Amsler Michigan State University Peter Schmidt Michigan State University Yonsei University

More information

DEPARTMENT OF ECONOMICS AND FINANCE COLLEGE OF BUSINESS AND ECONOMICS UNIVERSITY OF CANTERBURY CHRISTCHURCH, NEW ZEALAND

DEPARTMENT OF ECONOMICS AND FINANCE COLLEGE OF BUSINESS AND ECONOMICS UNIVERSITY OF CANTERBURY CHRISTCHURCH, NEW ZEALAND DEPARTMENT OF ECONOMICS AND FINANCE COLLEGE OF BUSINESS AND ECONOMICS UNIVERSITY OF CANTERBURY CHRISTCHURCH, NEW ZEALAND Testing For Unit Roots With Cointegrated Data NOTE: This paper is a revision of

More information

Increasing the Power of Specification Tests. November 18, 2018

Increasing the Power of Specification Tests. November 18, 2018 Increasing the Power of Specification Tests T W J A. H U A MIT November 18, 2018 A. This paper shows how to increase the power of Hausman s (1978) specification test as well as the difference test in a

More information

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables Likelihood Ratio Based est for the Exogeneity and the Relevance of Instrumental Variables Dukpa Kim y Yoonseok Lee z September [under revision] Abstract his paper develops a test for the exogeneity and

More information

ECON 5350 Class Notes Functional Form and Structural Change

ECON 5350 Class Notes Functional Form and Structural Change ECON 5350 Class Notes Functional Form and Structural Change 1 Introduction Although OLS is considered a linear estimator, it does not mean that the relationship between Y and X needs to be linear. In this

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

The Bivariate Probit Model, Maximum Likelihood Estimation, Pseudo True Parameters and Partial Identification

The Bivariate Probit Model, Maximum Likelihood Estimation, Pseudo True Parameters and Partial Identification ISSN 1440-771X Department of Econometrics and Business Statistics http://business.monash.edu/econometrics-and-business-statistics/research/publications The Bivariate Probit Model, Maximum Likelihood Estimation,

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

Lecture 5: Spatial probit models. James P. LeSage University of Toledo Department of Economics Toledo, OH

Lecture 5: Spatial probit models. James P. LeSage University of Toledo Department of Economics Toledo, OH Lecture 5: Spatial probit models James P. LeSage University of Toledo Department of Economics Toledo, OH 43606 jlesage@spatial-econometrics.com March 2004 1 A Bayesian spatial probit model with individual

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College December 2016 Abstract Lewbel (2012) provides an estimator

More information

Finite-sample quantiles of the Jarque-Bera test

Finite-sample quantiles of the Jarque-Bera test Finite-sample quantiles of the Jarque-Bera test Steve Lawford Department of Economics and Finance, Brunel University First draft: February 2004. Abstract The nite-sample null distribution of the Jarque-Bera

More information