Identification and QML Estimation of Multivariate and Simultaneous Spatial Autoregressive Models


Kai Yang and Lung-fei Lee
Department of Economics, The Ohio State University

September 25, 2015
Second Draft: November 9, 2014
First Draft: May 28, 2014

Abstract

This paper investigates a simultaneous equation spatial autoregressive model which consists of a finite number of equations, incorporates simultaneous effects, own-variable spatial lags and cross-variable spatial lags as explanatory variables, and allows for correlation between disturbances across equations. In exposition, we first discuss a multivariate spatial autoregressive model that can be treated as a reduced form of the former model. We study parameter spaces, the identification of parameters, asymptotic properties of quasi-maximum likelihood estimation, and computational issues. Monte Carlo experiments illustrate the advantages of QML and FIML, namely broader applicability and efficiency improvement, compared to the instrumental-variables-based estimation methods in the existing literature.

Keywords: Multivariate spatial autoregression, identification, quasi-maximum likelihood, spatial simultaneous equations, full information maximum likelihood

JEL Classification Numbers: C3, C30, C3

The authors are grateful for valuable comments and suggestions from three anonymous referees, audiences at the China Meeting of the Econometric Society in Xiamen 2014, and seminar participants at the University of Colorado-Boulder and the Ohio State University.
Address: 30 Arps Hall, 1945 N. High St., Columbus, OH 43210, USA, telephone: , yang.840@osu.edu.
Address: 40 Arps Hall, 1945 N. High St., Columbus, OH 43210, USA, lee.777@osu.edu, fax:

1 Introduction

The single equation spatial autoregressive (SAR) model has received much attention in spatial econometrics. Anselin (1988) summarizes early developments in estimation and testing for SAR models, while Kelejian and Prucha (1998, 1999) and Lee (2004, 2007) propose and investigate the two stage least squares (2SLS), three stage least squares (3SLS), quasi-maximum likelihood (QML) and generalized method of moments (GMM) estimation methods. Though univariate SAR models have been well developed, work on the identification and estimation of multivariate SAR models, including simultaneous equations, is limited to a few exceptions: Kelejian and Prucha (2004), Baltagi and Bresson (2011), Baltagi and Deng (2012), Cohen-Cole et al. (2013) and Liu (2014).

Studies of multivariate SAR models have been empirically motivated. Multivariate and simultaneous SAR models have been employed in regional science, such as studies on migration, employment and population (de Graaff et al., 2012; Gebremariam et al., 2011); housing economics (Baltagi and Bresson, 2011; Jeanty, 2010); fiscal policy analysis, such as fiscal competition over taxes and public inputs (Hauptmeier et al., 2012) and interactions between government expenditures (Aller and Elhorst, 2011); and topics in agricultural economics (e.g. Wu and Lin, 2010). Furthermore, the multivariate and simultaneous SAR model can be seen as a set of best response functions derived from network games with multiple choices, capturing own-choice peer effects, cross-choice peer effects, simultaneous effects and correlated effects across disturbances. Cohen-Cole et al. (2013) propose a multi-choice network game framework with a linear-quadratic utility function. Their empirical example studies cross-choice peer effects between time spent watching TV and students' GPA by estimating bivariate SAR models using Kelejian and Prucha's (2004) generalized spatial 2SLS. The empirical results demonstrate the presence of cross-variable spatial effects.
The research on ML estimation of a multivariate SAR model is theoretically and empirically motivated as well. Kelejian and Prucha (2004) consider two stage least squares (2SLS) and three stage least squares (3SLS) estimation methods for a simultaneous equation SAR model, incorporating spatial lags in dependent variables and allowing for spatial correlation in disturbance terms. Their investigation of the estimation and of the estimators' large sample properties suggests computationally simple estimation methods for empirical studies. Baltagi and Deng (2012) extend the model to fit panel data by deriving a 3SLS estimator for simultaneous equations with random effects. Motivated by multi-choice network games, Cohen-Cole et al. (2013) investigate the identification of various bivariate spatial autoregressive models: the seemingly unrelated equations model with correlated disturbances that contains endogenous spatial effects and contextual effects; the triangular system model, which introduces a one-way cross-choice peer effect in a two-equation system; and a square system of simultaneous equations with simultaneity effects and two-way cross-choice peer effects. While the first two models can be identified, the last one requires exclusion restrictions. They also use 2SLS and 3SLS empirically to estimate all three models. Instead of IV-based estimation, Baltagi and Bresson (2011) employ the ML method to estimate a spatial seemingly unrelated regression panel model. The disturbance terms allow spatial lag and spatial error components. They also propose joint and conditional Lagrange multiplier tests for the presence of spatial correlation and random effects.

However, several problems remain to be solved. First, IV-based estimation methods, such as 2SLS and 3SLS, do not work for models without exogenous variables. Also, as our Monte Carlo experiments show, in some cases conventional IV methods can result in very large standard deviations of the estimators. IV-based estimation methods also suffer efficiency losses even when they are precise. Second, the parameter space of the spatial effects needs to be studied, since it is crucial for the stability of the multivariate SAR model and is related to the interpretation of empirical results. Third, identification conditions for the models' parameters, including the identification of parameters in multivariate SAR models and order and rank conditions of identification in simultaneous SAR models, should be developed.
Last, the resulting computational challenges of the QML/FIML methods are to be solved.

In this paper, we focus on the identification, quasi-maximum likelihood estimation, parameter spaces, and computational issues of the models. Specifically, this paper investigates identification and QML estimation of a multivariate SAR model taking own-variable spatial effects, cross-variable spatial effects and correlated effects into consideration [1]. It is specified as
$$Y_{nm} = W_n Y_{nm}\Psi_m + X_{nk}\Pi_{km} + V_{nm},$$
which consists of $m$ equations with $m$ endogenous variables (see Section 2 for a detailed description of this model). An extended linear-quadratic form central limit theorem is developed and employed to characterize large sample properties of the estimator. In addition, we extend the model to spatial simultaneous equations, specified as
$$Y_{nm}\Gamma_m = W_n Y_{nm}\Lambda_m + X_{nk}C_{km} + U_{nm},$$
which contains $m$ endogenous variables within a system of $m$ simultaneous equations (see Section 3 for a detailed description of this model). We investigate identification and asymptotic properties using full information maximum likelihood (FIML). In particular, we provide identification conditions which are analogous to the rank identification conditions in the usual linear simultaneous equations. Furthermore, we derive the asymptotic properties and finite sample performance of 2SLS and 3SLS estimation with optimal IV, in order to supplement the existing literature and for comparison (Appendix C and Section 4).

Monte Carlo experiments in Section 4 illustrate two advantages of QML estimation. First, while some multivariate SAR models without exogenous variables cannot be estimated by IV-based methods, they can still be estimated by QML/FIML. Second, when there are problems with IV, such as imprecise 2/3SLS estimators, the QMLE may still be efficient and precise. The QMLE and FIMLE can also gain efficiency, reducing the standard deviation of the 3SLSE by 5%-50%, even when IV-based estimators are precise. Proofs, details of derivations, a supplementary discussion of identification, and 2SLS/3SLS estimation with optimal IV are reported in the appendix.

[1] This model excludes simultaneity effects and cross-choice peer effects as explanatory variables.

2 A Multivariate Spatial Autoregressive Model

We study a multivariate spatial autoregressive model:
$$Y_{nm} = W_n Y_{nm}\Psi_m + X_{nk}\Pi_{km} + V_{nm}, \tag{1}$$
where $Y_{nm}$ is an $n \times m$ matrix of $m$ endogenous variables and $n$ is the number of observations. Neighborhood relationships are summarized by $W_n$, an $n \times n$ matrix. $X_{nk}$ is an $n \times k$ matrix which contains all $k$ exogenous variables. The $n \times m$ matrix $V_{nm}$ is the disturbance matrix. There are two parameter matrices: $\Psi_m$, an $m \times m$ matrix, and $\Pi_{km}$, a $k \times m$ matrix.
In addition, we assume that the covariance matrix of $v_{n,i}$, the $i$th row of $V_{nm}$, is $\Sigma_{vm}$ for $i = 1, 2, \ldots, n$ (see Assumption 1). This model includes own-variable spatial effects, represented by the diagonal elements of $\Psi_m$; cross-variable spatial effects, represented by the off-diagonal elements of $\Psi_m$; and correlated effects across equations, incorporated in the covariance matrix of the disturbances.

Vectorizing, the model becomes
$$\mathrm{vec}(Y_{nm}) = (\Psi_m' \otimes W_n)\,\mathrm{vec}(Y_{nm}) + (I_m \otimes X_{nk})\,\mathrm{vec}(\Pi_{km}) + \mathrm{vec}(V_{nm}).$$
Therefore, the sample average log likelihood function for (1) is
$$\frac{1}{mn}\ln L_{nm}(\Psi_m, \Pi_{km}, \Sigma_{vm}) = -\frac{1}{2}\ln(2\pi) - \frac{1}{2m}\ln|\Sigma_{vm}| + \frac{1}{mn}\ln|S_{nm}| - \frac{1}{2mn}\left[S_{nm}\mathrm{vec}(Y_{nm}) - (I_m \otimes X_{nk})\mathrm{vec}(\Pi_{km})\right]'\left(\Sigma_{vm}^{-1} \otimes I_n\right)\left[S_{nm}\mathrm{vec}(Y_{nm}) - (I_m \otimes X_{nk})\mathrm{vec}(\Pi_{km})\right], \tag{2}$$
where $S_{nm} = I_{nm} - \Psi_m' \otimes W_n$. Let $\Psi_{m0}$, $\Pi_{km0}$, $\Sigma_{vm0}$ and $S_{nm0}$ denote the true values corresponding to $\Psi_m$, $\Pi_{km}$, $\Sigma_{vm}$ and $S_{nm}$. The identification analysis in succeeding sections employs the expected likelihood function:
$$\frac{1}{mn}E\ln L_{nm}(\Psi_m, \Pi_{km}, \Sigma_{vm}) = -\frac{1}{2}\ln(2\pi) - \frac{1}{2m}\ln|\Sigma_{vm}| + \frac{1}{mn}\ln|S_{nm}| - \frac{1}{2mn}\,d_{nm}'\left(\Sigma_{vm}^{-1} \otimes I_n\right)d_{nm} - \frac{1}{2mn}\mathrm{Tr}\left[\left(\Sigma_{vm}^{-1} \otimes I_n\right)S_{nm}S_{nm0}^{-1}\left(\Sigma_{vm0} \otimes I_n\right)S_{nm0}'^{-1}S_{nm}'\right], \tag{3}$$
where $d_{nm} = S_{nm}S_{nm0}^{-1}(I_m \otimes X_{nk})\mathrm{vec}(\Pi_{km0}) - (I_m \otimes X_{nk})\mathrm{vec}(\Pi_{km})$.

2.1 Assumptions

The following assumptions are basic for the model.

Assumption 1. $v_{n,i}$, the $i$th row of $V_{nm}$, is an independent and identically distributed random vector of dimension $m$ with zero mean and covariance matrix $\Sigma_{vm}$. The elements of the disturbances satisfy the moment condition that $E|v_{n,ik}v_{n,il}v_{n,ip}v_{n,iq}|^{1+\delta}$ is bounded for any $i = 1, 2, \ldots, n$ and $k, l, p, q = 1, 2, \ldots, m$, for some positive constant $\delta$.
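The vectorized form of model (1) and the average log likelihood (2) can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' code: the function name `avg_loglik`, the simulated data and all parameter values are illustrative assumptions. It uses the identity $\mathrm{vec}(V)'(\Sigma^{-1} \otimes I_n)\mathrm{vec}(V) = \mathrm{Tr}(\Sigma^{-1}V'V)$ to avoid forming the large Kronecker weighting matrix.

```python
import numpy as np

def avg_loglik(Y, W, X, Psi, Pi, Sigma):
    """Average Gaussian quasi log likelihood (1/(mn)) ln L for
    Y = W Y Psi + X Pi + V, following the structure of equation (2)."""
    n, m = Y.shape
    V = Y - W @ Y @ Psi - X @ Pi                 # residual matrix V(theta), n x m
    S = np.eye(n * m) - np.kron(Psi.T, W)        # S = I_nm - Psi' kron W
    sign, logdet_S = np.linalg.slogdet(S)
    if sign <= 0:
        return -np.inf                           # Jacobian determinant not positive
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    # vec(V)' (Sigma^{-1} kron I_n) vec(V) equals Tr(Sigma^{-1} V'V)
    quad = np.trace(np.linalg.solve(Sigma, V.T @ V))
    return (-0.5 * np.log(2 * np.pi) - logdet_Sigma / (2 * m)
            + logdet_S / (m * n) - quad / (2 * m * n))

# Illustrative (assumed) design: row-normalized random weights, m = 2, k = 3.
rng = np.random.default_rng(0)
n, m, k = 30, 2, 3
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
X = rng.normal(size=(n, k))
Psi = np.array([[0.3, 0.1], [-0.2, 0.4]])
Pi = rng.normal(size=(k, m))
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
V = rng.multivariate_normal(np.zeros(m), Sigma, size=n)

# Generate Y from the vectorized form: vec(Y) = S^{-1} vec(X Pi + V).
S = np.eye(n * m) - np.kron(Psi.T, W)
vecY = np.linalg.solve(S, (X @ Pi + V).flatten(order="F"))
Y = vecY.reshape((n, m), order="F")

print(avg_loglik(Y, W, X, Psi, Pi, Sigma))
```

Column-major (`order="F"`) flattening is what makes `flatten` agree with the column-stacking vec operator used in the text.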

Assumption 1 allows correlation between disturbance terms from different equations for each spatial unit. This means that different endogenous variables for an individual unit can be influenced by correlated unobservables and/or common unobservables entering the disturbance processes. Moments higher than the fourth are bounded in order to derive asymptotic properties of the QML estimators in the following sections. Because the disturbance vectors are i.i.d., the absolute moment $E|v_{n,ik}v_{n,il}v_{n,ip}v_{n,iq}|^{1+\delta}$ is a constant, which will be denoted by $\mu_{1+\delta,klpq}$.

The parameter space of the coefficients $\Psi_m$ is important to guarantee that the system is stable and that the Jacobian matrix has a positive determinant. We have
$$|I_{nm} - \Psi_m' \otimes W_n| = \prod_{i=1}^{n}\prod_{j=1}^{m}(1 - \rho_j\lambda_i),$$
where $\rho_j$ is an eigenvalue of $\Psi_m$ and $\lambda_i$ is an eigenvalue of $W_n$. To obtain a positive determinant, each real $\rho_j$, $j = 1, \ldots, m$, should satisfy $1/\min_i(\lambda_i) < \rho_j < 1/\max_i(\lambda_i)$, where the $\lambda_i$'s are the real eigenvalues of $W_n$. For complex $\rho_j$ and real $\lambda_i$, $(1 - \rho_j\lambda_i)$ is a complex number and appears together with its complex conjugate $(1 - \bar\rho_j\lambda_i)$. As $(1 - \rho_j\lambda_i)(1 - \bar\rho_j\lambda_i) = |1 - \rho_j\lambda_i|^2 > 0$, because $1 - \rho_j\lambda_i$ cannot be zero, the cases with complex $\rho$ and real $\lambda$ contribute positively to the determinant and do not matter (LeSage and Pace (200)). So when $W_n$ has only real eigenvalues, the parameter space is completely determined by the real eigenvalues $\rho_j$ of $\Psi_m$ through $1/\min_i(\lambda_i) < \rho_j < 1/\max_i(\lambda_i)$. When both $\rho_j$ and $\lambda_i$ are complex, even though $(1 - \rho_j\lambda_i)(1 - \bar\rho_j\bar\lambda_i) = |1 - \rho_j\lambda_i|^2$ is nonnegative, it can be zero in the event that $\rho_j = 1/\lambda_i = \bar\lambda_i/|\lambda_i|^2$. To rule out this equality, an additional sufficient condition is $|\rho_j| < 1/|\lambda_i|$ for all complex $\lambda_i$ and each complex $\rho_j$. Since all the eigenvalues of $\Psi_m$ should lie in the regions indicated, a sufficient condition is $\rho(\Psi_m)\rho(W_n) < 1$, where $\rho(A)$ is the spectral radius of a matrix $A$.
$\rho(W_n)$ can be calculated directly, since it is the largest absolute value of the eigenvalues of $W_n$. By the spectral radius theorem, the spectral radius of $\Psi_m$ is no greater than any of its induced matrix norms. For instance, a stronger sufficient condition is $\|\Psi_m\|_1 < 1/\rho(W_n)$. Intuitively, it means that the sum of own-variable and cross-variable spatial effects to any variable, in absolute value, is bounded. Another sufficient condition is $\|\Psi_m\|_\infty < 1/\rho(W_n)$, which means that the sum of own-variable and cross-variable spatial effects from any variable, in absolute value, is bounded. We make the following assumption:

Assumption 2. The parameter space of the coefficients $\Pi_{km}$ is compact; the parameter space of the coefficients $\Psi_m$ is compact and such that $\rho(\Psi_m)\rho(W_n) < 1$; and the normalized covariance matrix $\bar\Sigma_{vm} = \Sigma_{vm}/\mathrm{Tr}(\Sigma_{vm})$ lies in a compact space and is nonsingular.

Compact parameter spaces are employed in the proofs of consistency and asymptotic properties. Compactness is desirable for proving the uniform convergence of the log likelihood function. Typically, the normalized covariance matrix $\bar\Sigma_{vm}$ is positive definite, unless some disturbances are perfectly collinear.

Assumption 3. Elements of $X_{nk}$ are exogenous constants and are uniformly bounded for all $n$.

The above assumption states the exogeneity of the regressors $X_{nk}$. Uniform boundedness simplifies our asymptotic analysis and the use of a central limit theorem [2]. Additional regularity conditions on $X_{nk}$ are imposed for identification purposes in Assumption 5 below.

Assumption 4. Row and column sums of $W_n$ in absolute value are bounded uniformly in $n$. For any possible $\Psi_m$ in its parameter space, $S_{nm}$ is nonsingular, and $S_{nm}^{-1}$ is bounded in row and column sums in absolute value uniformly in $\Psi_m$.

Assumption 4 is essential for the SAR model to be stable. It is a typical assumption for spatial econometric models (Kelejian and Prucha, 1998; Lee, 2004). As sequences in $n$, the row and column sum norms of $S_{nm}$ are bounded by the boundedness of $W_n$ and the compactness of the parameter space of $\Psi_m$. The sequence $S_{nm}^{-1}$ is also assumed to be bounded in both row and column sum norms. Lemma 1 indicates how these properties can be derived from properties of the parameters and the weight matrix.

Lemma 1. The sequences $S_{nm}$ and $S_{nm}^{-1}$ are bounded in column sum norm if $\|\Psi_m\|_\infty\|W_n\|_1 < 1$. They are bounded in row sum norm if $\|\Psi_m\|_1\|W_n\|_\infty < 1$.

The above lemma gives sufficient conditions for the uniform boundedness of $S_{nm}$ and $S_{nm}^{-1}$. The uniform boundedness assumptions thus require that the coefficients of own-variable and cross-variable spatial effects not be too large. For example, when $W_n$ is row normalized, $\|\Psi_m\|_1 < 1$ and $\|\Psi_m\|_\infty < 1/\|W_n\|_1$ are sufficient to satisfy the conditions of the lemma.

[2] It is possible to relax this boundedness assumption to some moment conditions on $X_{nk}$.
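The sufficient conditions discussed above are easy to verify for a given pair $(\Psi_m, W_n)$. The sketch below is illustrative only: the helper names `spectral_radius` and `check_parameter_space`, the ring-shaped weight matrix and the two candidate $\Psi$ matrices are assumptions chosen for the example, and the norm pairing follows the Kronecker-product identities $\|\Psi' \otimes W\|_1 = \|\Psi\|_\infty\|W\|_1$ and $\|\Psi' \otimes W\|_\infty = \|\Psi\|_1\|W\|_\infty$.

```python
import numpy as np

def spectral_radius(A):
    """Largest absolute value among the eigenvalues of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def check_parameter_space(Psi, W):
    """Check rho(Psi) * rho(W) < 1 and the two norm conditions of Lemma 1."""
    norm_1 = lambda A: np.abs(A).sum(axis=0).max()     # maximum column sum
    norm_inf = lambda A: np.abs(A).sum(axis=1).max()   # maximum row sum
    return {
        "spectral": spectral_radius(Psi) * spectral_radius(W) < 1.0,
        "column_sum": norm_inf(Psi) * norm_1(W) < 1.0,
        "row_sum": norm_1(Psi) * norm_inf(W) < 1.0,
    }

# Assumed example: six units on a ring, each with two neighbors of weight 1/2.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

Psi_ok = np.array([[0.3, 0.1], [-0.2, 0.4]])     # modest spatial effects
Psi_big = np.array([[0.9, 0.5], [0.5, 0.9]])     # eigenvalues 1.4 and 0.4

res_ok = check_parameter_space(Psi_ok, W)
res_big = check_parameter_space(Psi_big, W)
print(res_ok)
print(res_big)
```

The norm conditions are stronger than the spectral radius condition, so a parameter value can pass the first check while failing the other two.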

2.2 Model Identification and Consistency of the QMLE

The multivariate SAR model contains own-variable spatial lags and cross-variable spatial lags as explanatory variables, which result in endogenous regressors. The presence of endogenous variables on the right-hand side of an equation makes identification an issue that needs detailed analysis.

Let $J_i$ ($i = 1, 2, \ldots, m$) be a $1 \times m$ row vector with all zero elements except for the $i$th entry, which is 1. Let
$$\bar X_{1,n} = (J_1 \otimes I_n)S_{nm0}^{-1}\left(I_m \otimes (W_nX_{nk})\right)\mathrm{vec}(\Pi_{km0}), \quad \ldots, \quad \bar X_{m,n} = (J_m \otimes I_n)S_{nm0}^{-1}\left(I_m \otimes (W_nX_{nk})\right)\mathrm{vec}(\Pi_{km0}),$$
and define the matrix $A_{n\times(k+m)} = [X_{nk}, \bar X_{1,n}, \bar X_{2,n}, \ldots, \bar X_{m,n}]$.

Assumption 5. The limiting matrix $\lim_{n\to\infty}\frac{1}{n}\left(A_{n\times(k+m)}'A_{n\times(k+m)}\right)$ exists and is nonsingular.

Assumption 5 guarantees that the true parameters $\Pi_{km0}$ and $\Psi_{m0}$ can be identified by maximizing the log likelihood function. For a single equation SAR model, Assumption 5 reduces to $(X_{nk}, W_nS_{n0}^{-1}X_{nk}\pi_{k0})$ having full column rank, which is the same as the sufficient condition in Lee (2004). This sufficient condition guarantees the existence of the best IV estimator for each structural equation. This can be seen as follows. The $l$th equation of the original system (1) is
$$y_{nl} = W_nY_{nm}\psi_{\cdot l} + X_{nk}\pi_{kl} + v_{nl} = \sum_{r=1}^{m}W_ny_{nr}\psi_{rl} + X_{nk}\pi_{kl} + v_{nl}.$$
Since $W_ny_{nr} = W_n(J_r \otimes I_n)\mathrm{vec}(Y_{nm}) = (1 \otimes W_n)(J_r \otimes I_n)\mathrm{vec}(Y_{nm}) = (J_r \otimes I_n)(I_m \otimes W_n)\mathrm{vec}(Y_{nm})$ and $\mathrm{vec}(Y_{nm}) = S_{nm0}^{-1}\left[\mathrm{vec}(X_{nk}\Pi_{km0}) + \mathrm{vec}(V_{nm})\right]$, it is apparent that $(J_r \otimes I_n)(I_m \otimes W_n)S_{nm0}^{-1}\mathrm{vec}(X_{nk}\Pi_{km0})$ is a proper best IV for $W_ny_{nr}$ for each $r = 1, \ldots, m$.

The above assumption concerns the implied reduced form regression equation but not the implied variance structure. In order to identify the covariance matrix $\Sigma_{vm}$, we need the following assumption. Let $\sigma_m^2 = \mathrm{Tr}(\Sigma_{vm})$ and $\bar\Sigma_{vm} = \Sigma_{vm}/\sigma_m^2$.

Assumption 6.
$$\lim_{n\to+\infty}\left\{\frac{1}{mn}\mathrm{Tr}\left[\left(\bar\Sigma_{vm}^{-1} \otimes I_n\right)S_{nm}S_{nm0}^{-1}\left(\bar\Sigma_{vm0} \otimes I_n\right)S_{nm0}'^{-1}S_{nm}'\right] - \left|\left(\bar\Sigma_{vm}^{-1} \otimes I_n\right)S_{nm}S_{nm0}^{-1}\left(\bar\Sigma_{vm0} \otimes I_n\right)S_{nm0}'^{-1}S_{nm}'\right|^{\frac{1}{mn}}\right\} > 0$$
unless $\bar\Sigma_{vm} = \bar\Sigma_{vm0}$ and $\Psi_m = \Psi_{m0}$.

The intuition of Assumption 6 can be illustrated by some simpler sufficient conditions in a finite sample setting.

Proposition 1. (A) Whether $W_n$ is symmetric or asymmetric, suppose no linear combination of $W_n$, $W_n'$ and $W_n'W_n$ is proportional to $I_n$. Then
$$\frac{1}{mn}\mathrm{Tr}\left[\left(\bar\Sigma_{vm}^{-1} \otimes I_n\right)S_{nm}S_{nm0}^{-1}\left(\bar\Sigma_{vm0} \otimes I_n\right)S_{nm0}'^{-1}S_{nm}'\right] > \left|\left(\bar\Sigma_{vm}^{-1} \otimes I_n\right)S_{nm}S_{nm0}^{-1}\left(\bar\Sigma_{vm0} \otimes I_n\right)S_{nm0}'^{-1}S_{nm}'\right|^{\frac{1}{mn}}$$
unless $\bar\Sigma_{vm} = \bar\Sigma_{vm0}$ [3].
(B) Suppose $W_n$ is asymmetric and $I_n$, $W_n$, $W_n'$ and $W_n'W_n$ are linearly independent. Then the same strict inequality holds unless $\bar\Sigma_{vm} = \bar\Sigma_{vm0}$ and $\Psi_m = \Psi_{m0}$.

We will prove that Assumption 5 and the conditions in Proposition 1(A) are sufficient to identify all parameters in maximizing the likelihood function [4] in a finite sample.

Proposition 2. In the limiting case, under Assumptions 1-5, $\Psi_{m0}$ and $\Pi_{km0}$ can be identified; and the covariance matrix $\Sigma_{vm0}$ can be identified when Assumption 6 is satisfied.

In the finite sample situation:

Corollary 1. (I) Suppose Assumptions 1-4 are satisfied and $A_{n\times(k+m)}'A_{n\times(k+m)}$ is nonsingular. Then $\Psi_{m0}$ and $\Pi_{km0}$ can be identified. The covariance matrix $\Sigma_{vm0}$ can be identified when the conditions in Proposition 1(A) hold.
(II) Suppose Assumptions 1-4 are satisfied, the conditions in Proposition 1(B) hold, and $X_{nk}$ has full column rank. Then all the true parameters can be identified.

[3] Note that when $W_n$ is symmetric, $\Psi_m$ might not be identified. The conditions to identify $\Psi_m$ when $W_n$ is symmetric are $\Psi_m\Sigma_{vm0}^{-1}\Psi_m' = \Psi_{m0}\Sigma_{vm0}^{-1}\Psi_{m0}'$ and $\Psi_m\Sigma_{vm0}^{-1} + \Sigma_{vm0}^{-1}\Psi_m' = \Psi_{m0}\Sigma_{vm0}^{-1} + \Sigma_{vm0}^{-1}\Psi_{m0}'$. Consider a case with $m = 2$. Suppose $\Sigma_{vm0} = I_2$ and $\Psi_{m0} = \ldots$; both $\Psi_{m0}$ and $\Psi_m = \ldots$ satisfy the conditions. Hence, we cannot identify $\Psi_m$ in some special cases when $m > 1$.

[4] There are several sufficient conditions for the conditions in Proposition 1(A). For example, $W_n$ is symmetric and $I_n$, $W_n$ and $W_n^2$ are linearly independent; or $W_n$ is asymmetric and $I_n$, $W_n$, $W_n'$ and $W_n'W_n$ are linearly independent.

Identification is essential for the model and is a necessary requirement in proving the consistency of the QML estimators.

Some special multivariate SAR models may be regarded as the general multivariate SAR model with restrictions on parameters. Linear restrictions on the coefficients of some explanatory variables are common among empirical models. Consider restrictions of the form
$$\mathrm{vec}\begin{pmatrix}\Pi_{km} \\ \Psi_m\end{pmatrix} = R_m\theta_l,$$
where $\theta_l$ is an $l \times 1$ vector of free parameters in a multivariate SAR model with $l < m(k + m)$, and $R_m$ is a known $(m^2 + mk) \times l$ matrix representing the transformation from the free parameters in $\theta_l$ to the parameters in the model. The following corollary provides a formal statement of identification with linear restrictions on the coefficients of the multivariate system.

Corollary 2. Under Assumptions 1-4, with linear restrictions on $\Pi_{km}$ and $\Psi_m$ such that $\mathrm{vec}\begin{pmatrix}\Pi_{km} \\ \Psi_m\end{pmatrix} = R_m\theta_l$, an identification condition for $\theta_l$ is that the limiting matrix
$$\lim_{n\to+\infty}\frac{1}{n}R_m'\left[\Sigma_{vm}^{-1} \otimes \left(A_{n\times(k+m)}'A_{n\times(k+m)}\right)\right]R_m$$
exists and is nonsingular.

Consider the identification of a multivariate SAR model with restrictions in the following example:
$$Y_{n1} = W_nY_{n2}\Psi_{21} + X_{nk}\Pi_{k1} + V_{n1}, \qquad Y_{n2} = W_nY_{n2}\Psi_{22} + X_{nk}\Pi_{k2} + V_{n2},$$
where $m_1$ and $m_2$ are the numbers of columns of $Y_{n1}$ and $Y_{n2}$ respectively, i.e., the numbers of endogenous variables in the two subsystems. One interesting point is that the system does not include $W_nY_{n1}$. For this model, the transformation matrix is
$$R_m = I_m \otimes \begin{pmatrix} I_k & 0_{k\times m_2} \\ 0_{m_1\times k} & 0_{m_1\times m_2} \\ 0_{m_2\times k} & I_{m_2}\end{pmatrix}.$$
Then $(I_m \otimes A_{n\times(m+k)})R_m = I_m \otimes [X_{nk}, \bar X_{m_1+1,n}, \ldots, \bar X_{m,n}]$. Therefore, the identification condition is the existence and nonsingularity of
$$\lim_{n\to\infty}\frac{1}{n}[X_{nk}, \bar X_{m_1+1,n}, \ldots, \bar X_{m,n}]'[X_{nk}, \bar X_{m_1+1,n}, \ldots, \bar X_{m,n}],$$
which is a weaker condition than that in Assumption 5.

For estimation, one may consider the QML estimation of models with or without constraints.

Without constraints, the vector of parameters $\theta$ consists of $\mathrm{vec}(\Pi_{km})$, $\mathrm{vec}(\Psi_m)$, and the distinct parameters of $\Sigma_{vm}$. With constraints as above, $\theta$ consists of the free parameters and the distinct parameters of $\Sigma_{vm}$. Without loss of generality, we shall focus on the estimation of the model without constraints. Explicitly, $\theta = (\mathrm{vec}(\Pi_{km})', \mathrm{vec}(\Psi_m)', \mathrm{vec}^*(\Sigma_{vm})')'$ [5] is a $(km + m^2 + m(m+1)/2)$-dimensional column vector, where $\mathrm{vec}^*(\Sigma_{vm})$ selects only the free parameters from $\mathrm{vec}(\Sigma_{vm})$; there are only $m(m+1)/2$ of them because $\Sigma_{vm}$ is a variance matrix, which must be symmetric. For example, if $k = 3$ and $m = 2$, $\theta = (\pi_{11}, \pi_{21}, \pi_{31}, \pi_{12}, \pi_{22}, \pi_{32}, \psi_{11}, \psi_{21}, \psi_{12}, \psi_{22}, \sigma_{11}^2, \sigma_{12}^2, \sigma_{22}^2)'$.

In this section, we discuss and prove consistency of the QMLE. First, we prove that the difference between the average concentrated log likelihood function and the expected concentrated log likelihood function in (7) converges in probability to zero.

Lemma 2. Under Assumptions 1-4, as $n \to \infty$,
$$\frac{1}{mn}\left[\ln L_{nm}(\Psi_m, \Pi_{km}, \Sigma_{vm}) - E\ln L_{nm}(\Psi_m, \Pi_{km}, \Sigma_{vm})\right] \xrightarrow{p} 0$$
uniformly in $\Psi_m$, $\Pi_{km}$ and $\Sigma_{vm}$ in their compact parameter spaces.

As QML estimation is a nonlinear approach, estimation of a constrained model can be analyzed similarly. Consistency is summarized in the following theorem.

Theorem 1. Under Assumptions 1-6, the QML estimator $\hat\theta$, composed of $\hat\Pi_{km}$, $\hat\Psi_m$ and $\hat\Sigma_{vm}$, in the multivariate SAR model is consistent, i.e., $\hat\theta \xrightarrow{p} \theta_0$.

2.3 Asymptotic Normality

The asymptotic distribution of the QML estimator is useful for statistical inference. The asymptotic distribution of $\hat\theta$ can be derived from a Taylor expansion of the first order condition for (2), which gives
$$\sqrt{n}(\hat\theta - \theta_0) = -\left[\frac{1}{n}\frac{\partial^2\ln L_{nm}(\tilde\theta)}{\partial\theta\,\partial\theta'}\right]^{-1}\frac{1}{\sqrt{n}}\frac{\partial\ln L_{nm}(\theta_0)}{\partial\theta},$$
where $\tilde\theta$ lies between $\hat\theta$ and $\theta_0$. The first step is to derive the asymptotic distribution of $\frac{1}{\sqrt{n}}\frac{\partial\ln L_{nm}(\theta_0)}{\partial\theta}$. Let $F_{m,ij}$ be an $m \times m$ matrix with zero entries except the $ij$th and $ji$th elements, which are one, so that $F_{m,ij}$ is symmetric. Also let $E_{m,ij}$ be an $m \times m$ matrix with zero entries except the $ij$th element, which is one.
Specifically, let $e_{mi}$ be the $i$th unit column vector of dimension $m$, which is zero except for the $i$th element, which is one. Then $E_{m,ij} = e_{mi}e_{mj}'$, $F_{m,ij} = E_{m,ij} + E_{m,ij}' = e_{mi}e_{mj}' + e_{mj}e_{mi}'$ when $i \neq j$, and $F_{m,ii} = E_{m,ii}$. The first order derivatives of the likelihood function are:
$$\frac{\partial\ln L_{nm}(\theta_0)}{\partial\mathrm{vec}(\Pi_{km})} = \left(\Sigma_{vm0}^{-1} \otimes X_{nk}'\right)\mathrm{vec}(V_{nm}),$$
$$\frac{\partial\ln L_{nm}(\theta_0)}{\partial\Sigma_{vm,ij}} = \frac{1}{2}\left\{\mathrm{vec}(V_{nm})'\left[\left(\Sigma_{vm0}^{-1}F_{m,ij}\Sigma_{vm0}^{-1}\right) \otimes I_n\right]\mathrm{vec}(V_{nm}) - \mathrm{Tr}\left[\left(\Sigma_{vm0}^{-1}F_{m,ij}\right) \otimes I_n\right]\right\},$$
and
$$\frac{\partial\ln L_{nm}(\theta_0)}{\partial\Psi_{m,ij}} = \mathrm{vec}(\Pi_{km0})'(I_m \otimes X_{nk}')S_{nm0}'^{-1}\left[\left(E_{m,ij}\Sigma_{vm0}^{-1}\right) \otimes W_n'\right]\mathrm{vec}(V_{nm}) + \left\{\mathrm{vec}(V_{nm})'\left[\left(\Sigma_{vm0}^{-1}E_{m,ji}\right) \otimes W_n\right]S_{nm0}^{-1}\mathrm{vec}(V_{nm}) - \mathrm{Tr}\left[S_{nm0}^{-1}\left(E_{m,ji} \otimes W_n\right)\right]\right\}.$$

One observes that these first order derivatives have linear-quadratic forms. Specifically, they are characterized by the following general form:
$$B_{nm}'\mathrm{vec}(V_{nm}) + \left[\mathrm{vec}(V_{nm})'A_{nm}\mathrm{vec}(V_{nm}) - E\left(\mathrm{vec}(V_{nm})'A_{nm}\mathrm{vec}(V_{nm})\right)\right],$$
where $B_{nm}$ is an $nm \times 1$ constant vector with uniformly bounded elements, and $\{A_{nm}\}$ is a sequence of $nm \times nm$ constant matrices with uniformly bounded row and column sum norms. It differs from the linear-quadratic form in Kelejian and Prucha (1998) in that the elements of $\mathrm{vec}(V_{nm})$ are not independently distributed. Therefore, the central limit theorem with i.i.d. random variables for the linear-quadratic form in Kelejian and Prucha (1998) is not directly applicable to the multivariate case. We need to extend the linear-quadratic central limit theorem to the multivariate case [6]. Denote $B_{nm} = [b_{n1}', \ldots, b_{nk}', \ldots, b_{nm}']'$ and
$$A_{nm} = \begin{pmatrix} A_{n11} & \cdots & A_{n1m} \\ \vdots & A_{nkl} & \vdots \\ A_{nm1} & \cdots & A_{nmm}\end{pmatrix}$$
in blocks, where $b_{nk}$ is an $n \times 1$ vector and $A_{nkl}$ is an $n \times n$ matrix. Then the above linear-quadratic form becomes
$$Q_n = \sum_{k=1}^{m}b_{nk}'V_{nk} + \sum_{k=1}^{m}\sum_{l=1}^{m}\left[V_{nk}'A_{nkl}V_{nl} - E\left(V_{nk}'A_{nkl}V_{nl}\right)\right].$$

Lemma 3. Suppose $Q_n = \sum_{k=1}^{m}b_{nk}'V_{nk} + \sum_{k=1}^{m}\sum_{l=1}^{m}\left[V_{nk}'A_{nkl}V_{nl} - E(V_{nk}'A_{nkl}V_{nl})\right]$, where $b_{nk}$ is an $n \times 1$ constant vector with uniformly bounded elements and $A_{nkl}$ is an $n \times n$ constant matrix with uniformly bounded row and column sums. $V_{nk}$ and $V_{nl}$ are $n \times 1$ random vectors with $E(V_{nk}V_{nl}') = \sigma_{kl}^2I_n$ for any $k, l = 1, 2, \ldots, m$, which satisfy Assumption 1. Furthermore, suppose that the variance of $Q_n$ is $\sigma_{Q_n}^2$, which is $O(n)$, and that $\frac{1}{n}\sigma_{Q_n}^2$ is bounded away from zero. Then $\frac{Q_n}{\sigma_{Q_n}} \xrightarrow{d} N(0, 1)$.

[5] $\mathrm{vec}^*(\cdot)$ means vectorization excluding repeated identical parameters.

[6] Qu and Lee (2002) provide an extension of the linear-quadratic central limit theorem to a bivariate case.

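This CLT is essential. It shows that the first order derivatives of the properly normalized log likelihood function (the score vector) at the true parameter vector are asymptotically normally distributed. Indeed, as any linear combination of these first order derivatives has a linear-quadratic form with nonstochastic vectors and matrices, these combinations are asymptotically normal. Thus, by the Cramer-Wold device, these first order derivatives are jointly asymptotically normal.

The next step is to calculate the variances and covariances of those first order derivatives. Since the disturbances across equations can be correlated, Lemma 4 provides the covariances of linear-quadratic forms.

Lemma 4. Consider the two linear-quadratic forms $Q_n^{(a,b)} = B_{nm}'\mathrm{vec}(V_{nm}) + \left[\mathrm{vec}(V_{nm})'A_{nm}\mathrm{vec}(V_{nm}) - E(\mathrm{vec}(V_{nm})'A_{nm}\mathrm{vec}(V_{nm}))\right]$ and $Q_n^{(c,d)} = D_{nm}'\mathrm{vec}(V_{nm}) + \left[\mathrm{vec}(V_{nm})'C_{nm}\mathrm{vec}(V_{nm}) - E(\mathrm{vec}(V_{nm})'C_{nm}\mathrm{vec}(V_{nm}))\right]$, where $V_{nm}$ satisfies Assumption 1. Then the covariance $E[Q_n^{(a,b)}Q_n^{(c,d)}]$ is
$$\begin{aligned}
E[Q_n^{(a,b)}Q_n^{(c,d)}] ={}& B_{nm}'(\Sigma_{vm} \otimes I_n)D_{nm} + \mathrm{Tr}\left[A_{nm}(\Sigma_{vm} \otimes I_n)(C_{nm} + C_{nm}')(\Sigma_{vm} \otimes I_n)\right] \\
&+ \sum_{k=1}^{m}\sum_{l=1}^{m}\sum_{p=1}^{m}\sum_{i=1}^{n}\mu_{klp}\,b_{nk,i}\,C_{nlp,ii} + \sum_{k=1}^{m}\sum_{l=1}^{m}\sum_{p=1}^{m}\sum_{i=1}^{n}\mu_{klp}\,d_{nk,i}\,A_{nlp,ii} \\
&+ \sum_{k=1}^{m}\sum_{l=1}^{m}\sum_{p=1}^{m}\sum_{q=1}^{m}\sum_{i=1}^{n}\left[\left(\mu_{klpq} - \sigma_{kp}^2\sigma_{lq}^2 - \sigma_{kq}^2\sigma_{lp}^2 - \sigma_{kl}^2\sigma_{pq}^2\right)A_{nkl,ii}\,C_{npq,ii}\right],
\end{aligned}$$
where $E[\mathrm{vec}(V_{nm})\mathrm{vec}(V_{nm})'] = \Sigma_{vm} \otimes I_n$, $\mu_{klp} = E[v_{n,ik}v_{n,il}v_{n,ip}]$, $\mu_{klpq} = E[v_{n,ik}v_{n,il}v_{n,ip}v_{n,iq}]$, $A_{nkl,ii}$ denotes the $ii$th element of $A_{nkl}$, which is the $(k,l)$th block of $A_{nm}$, and $b_{nk,i}$ denotes the $i$th entry of the vector $b_{nk}$, which is the $k$th block of $B_{nm}$; $C_{nlp,ii}$ and $d_{nk,i}$ are defined similarly.

We have the covariance matrix
$$E\left[\left(\frac{1}{\sqrt{n}}\frac{\partial\ln L_{nm}(\theta_0)}{\partial\theta}\right)\left(\frac{1}{\sqrt{n}}\frac{\partial\ln L_{nm}(\theta_0)}{\partial\theta'}\right)\right] = \Omega_\theta + \Xi_\theta,$$
where the detailed expressions of $\Omega_\theta$ and $\Xi_\theta$ are in the appendix. This covariance matrix has two parts: the first part is the typical information matrix under normality, and the second part may contain higher order moments of the disturbances when the disturbances are not necessarily normal. In order to derive the information matrix, we calculate the second derivatives of the log likelihood function:
$$\frac{\partial^2\ln L_{nm}}{\partial\mathrm{vec}(\Pi_{km})\,\partial\mathrm{vec}'(\Pi_{km})} = -\Sigma_{vm}^{-1} \otimes (X_{nk}'X_{nk}),$$
$$\frac{\partial^2\ln L_{nm}}{\partial\mathrm{vec}(\Pi_{km})\,\partial\Sigma_{vm,ij}} = -\left[\left(\Sigma_{vm}^{-1}F_{m,ij}\Sigma_{vm}^{-1}\right) \otimes X_{nk}'\right]\mathrm{vec}(V_{nm}(\theta)),$$
$$\frac{\partial^2\ln L_{nm}}{\partial\mathrm{vec}(\Pi_{km})\,\partial\Psi_{m,ij}} = -\left[\left(\Sigma_{vm}^{-1}E_{m,ji}\right) \otimes (X_{nk}'W_n)\right]\mathrm{vec}(Y_{nm}),$$
$$\frac{\partial^2\ln L_{nm}}{\partial\Sigma_{vm,ij}\,\partial\Sigma_{vm,kl}} = \frac{n}{2}\mathrm{Tr}\left[\Sigma_{vm}^{-1}F_{m,ij}\Sigma_{vm}^{-1}F_{m,kl}\right] - \mathrm{vec}(V_{nm}(\theta))'\left[\left(\Sigma_{vm}^{-1}F_{m,ij}\Sigma_{vm}^{-1}F_{m,kl}\Sigma_{vm}^{-1}\right) \otimes I_n\right]\mathrm{vec}(V_{nm}(\theta)),$$
$$\frac{\partial^2\ln L_{nm}}{\partial\Sigma_{vm,ij}\,\partial\Psi_{m,kl}} = -\mathrm{vec}(V_{nm}(\theta))'\left[\left(\Sigma_{vm}^{-1}F_{m,ij}\Sigma_{vm}^{-1}E_{m,lk}\right) \otimes W_n\right]\mathrm{vec}(Y_{nm}),$$
and
$$\frac{\partial^2\ln L_{nm}}{\partial\Psi_{m,ij}\,\partial\Psi_{m,kl}} = -\mathrm{Tr}\left[S_{nm}^{-1}(E_{m,ji} \otimes W_n)S_{nm}^{-1}(E_{m,lk} \otimes W_n)\right] - \mathrm{vec}(Y_{nm})'\left[\left(E_{m,ij}\Sigma_{vm}^{-1}E_{m,lk}\right) \otimes (W_n'W_n)\right]\mathrm{vec}(Y_{nm}),$$
for $i, j, k, l = 1, \ldots, m$, where $V_{nm}(\theta) = Y_{nm} - W_nY_{nm}\Psi_m - X_{nk}\Pi_{km}$. It can be shown that $\Omega_\theta = -E\left(\frac{1}{n}\frac{\partial^2\ln L_{nm}(\theta_0)}{\partial\theta\,\partial\theta'}\right)$, because the Hessian matrix of the log quasi-likelihood function above depends only on linear and quadratic forms of $V_{nm}$. Therefore,
$$E\left[\left(\frac{1}{\sqrt{n}}\frac{\partial\ln L_{nm}(\theta_0)}{\partial\theta}\right)\left(\frac{1}{\sqrt{n}}\frac{\partial\ln L_{nm}(\theta_0)}{\partial\theta'}\right)\right] = -E\left(\frac{1}{n}\frac{\partial^2\ln L_{nm}(\theta_0)}{\partial\theta\,\partial\theta'}\right) + \Xi_\theta,$$
where, in general, $\Xi_\theta$ involves the skewness and kurtosis of the disturbances. $\Xi_\theta$ is zero if the disturbances are normally distributed.

Lemma 5. Under Assumptions 1-6, for any consistent estimate $\tilde\theta$ of $\theta_0$,
$$\frac{1}{n}\frac{\partial^2\ln L_{nm}(\tilde\theta)}{\partial\theta\,\partial\theta'} - E\left[\frac{1}{n}\frac{\partial^2\ln L_{nm}(\theta_0)}{\partial\theta\,\partial\theta'}\right] \xrightarrow{p} 0.$$

Using the limiting distribution of the first order derivatives and the limiting matrix of the second derivatives, we arrive at the theorem on the asymptotic distribution of the QMLE.

Theorem 2. Under Assumptions 1-6, and provided that $\Omega_\theta$ is nonsingular, $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N\left(0, \Omega_\theta^{-1} + \Omega_\theta^{-1}\Xi_\theta\Omega_\theta^{-1}\right)$.

2.4 Computation Algorithm

In this subsection, we pay some attention to the computation of the QML estimator of the multivariate model. First, with $m$ equations, there are $m(m+1)/2 + m^2 + km$ parameters to be estimated, so the problem may be subject to the curse of dimensionality if $m$ is large. Second, the log likelihood function contains a determinant $|I_{mn} - \Psi_m' \otimes W_n|$ involving $m^2$ parameters. This section briefly discusses computational techniques that can ease the computational burden.

For the determinant of the Jacobian transformation in the objective function, the following result suggests a way to evaluate it. For a square matrix $A_m$, the characteristic polynomial is
$$p_{A_m}(\rho) = c_m\rho^m + c_{m-1}\rho^{m-1} + c_{m-2}\rho^{m-2} + \cdots + c_1\rho + c_0,$$
where $c_0 = |A_m|$, $c_m = (-1)^m$, and $c_{m-j} = (-1)^{m-j}M_{j,A_m}$ with $M_{j,A_m} = \sum(\text{all } j \times j \text{ principal minors of } A_m)$ for $j = 1, 2, \ldots, m$. Recall that $|I_{mn} - \Psi_m' \otimes W_n| = \prod_{i=1}^{n}\prod_{j=1}^{m}(1 - \rho_j\lambda_i)$, where $\rho_j$ is an eigenvalue of $\Psi_m$ and $\lambda_i$ is an eigenvalue of $W_n$, for $j = 1, \ldots, m$ and $i = 1, \ldots, n$. There are $m$ terms in $\prod_{j=1}^{m}(1 - \rho_j\lambda_i)$ containing $\lambda_i$. As we know, $|\Psi_m - \rho I_m| = \prod_{j=1}^{m}(\rho_j - \rho) = p_{\Psi_m}(\rho)$. If we multiply both sides of $\prod_{j=1}^{m}(\rho_j - \rho) = p_{\Psi_m}(\rho)$ by $(-1)^m\lambda_i^m$ and let $\rho = 1/\lambda_i$, where $\lambda_i \neq 0$, we obtain $\prod_{j=1}^{m}(1 - \rho_j\lambda_i) = p_{\Psi_m}(1/\lambda_i)(-1)^m\lambda_i^m$. With the characteristic polynomial above, the log determinant term in the likelihood function becomes
$$\ln|I_{mn} - \Psi_m' \otimes W_n| = \sum_{i=1}^{n}\ln\left[1 + \sum_{j=1}^{m}(-1)^jM_{j,\Psi_m}\lambda_i^j\right].$$
So in general, the determinant of the Jacobian transformation can be evaluated efficiently by first computing the eigenvalues of $W_n$ and then all the principal minors of $\Psi_m$. Compared with the evaluation of such a determinant for a single equation SAR model, the additional computation is the evaluation of all the principal minors of $\Psi_m$ during the iterations of a maximization algorithm. For $m = 2$, only the trace and the determinant of $\Psi_2$ need to be evaluated in each iteration, as the above equation is simply
$$\ln|I_{2n} - \Psi_2' \otimes W_n| = \sum_{i=1}^{n}\ln\left[1 - \mathrm{Tr}(\Psi_2)\lambda_i + \det(\Psi_2)\lambda_i^2\right].$$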
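The principal-minor formula for the log determinant can be checked against a direct evaluation. The sketch below is a minimal illustration under assumed inputs: the function names `principal_minor_sums` and `logdet_jacobian`, the random row-normalized weight matrix and the particular $\Psi_3$ are all hypothetical. Because an asymmetric $W_n$ can have complex eigenvalues, the code sums log moduli of the $n$ factors, which equals the log determinant whenever the determinant is positive.

```python
import itertools
import numpy as np

def principal_minor_sums(Psi):
    """Return [M_1, ..., M_m], where M_j is the sum of all j x j principal minors."""
    m = Psi.shape[0]
    sums = []
    for j in range(1, m + 1):
        total = 0.0
        for idx in itertools.combinations(range(m), j):
            total += np.linalg.det(Psi[np.ix_(idx, idx)])
        sums.append(total)
    return sums

def logdet_jacobian(Psi, eigvals_W):
    """ln|I - Psi' kron W| from precomputed eigenvalues of W.

    Each factor is 1 + sum_j (-1)^j M_j lambda_i^j; the log moduli are summed,
    which is the log determinant when the determinant is positive."""
    terms = np.ones_like(eigvals_W, dtype=complex)
    for j, M_j in enumerate(principal_minor_sums(Psi), start=1):
        terms += (-1) ** j * M_j * eigvals_W ** j
    return np.sum(np.log(np.abs(terms)))

# Assumed example: asymmetric row-normalized W (n = 25) and a 3 x 3 Psi.
rng = np.random.default_rng(1)
n = 25
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
Psi = np.array([[0.20, 0.10, 0.00],
                [-0.10, 0.30, 0.05],
                [0.05, -0.20, 0.10]])

lam = np.linalg.eigvals(W)      # computed once, reusable across iterations
fast = logdet_jacobian(Psi, lam)
sign, direct = np.linalg.slogdet(np.eye(3 * n) - np.kron(Psi.T, W))
print(fast, direct)
```

In an optimizer, only `principal_minor_sums` depends on the current $\Psi_m$, so each iteration costs $O(n)$ after the one-time eigendecomposition rather than a determinant of an $nm \times nm$ matrix.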
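A three-step method that reduces the number of parameters to be estimated via a concentrated likelihood is as follows.

Step 1: Given $\Psi_m$ and $\Sigma_{vm}$, we estimate $\Pi_{km}$ by generalized least squares:
$$\mathrm{vec}(\hat\Pi_{km}) = \left[(I_m \otimes X_{nk})'\left(\Sigma_{vm}^{-1} \otimes I_n\right)(I_m \otimes X_{nk})\right]^{-1}(I_m \otimes X_{nk})'\left(\Sigma_{vm}^{-1} \otimes I_n\right)\left(I_{nm} - \Psi_m' \otimes W_n\right)\mathrm{vec}(Y_{nm}).$$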
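Step 1 can be sketched numerically as follows. This is an illustration with assumed names (`gls_Pi`, `ols_Pi`) and simulated inputs, not the authors' implementation. It also uses a standard property of seemingly unrelated regressions that the text does not state explicitly: because every equation shares the same regressor matrix $X_{nk}$, the GLS formula collapses to equation-by-equation OLS of $Y_{nm} - W_nY_{nm}\Psi_m$ on $X_{nk}$, whatever the value of $\Sigma_{vm}$.

```python
import numpy as np

def gls_Pi(Y, W, X, Psi, Sigma):
    """Step 1 as written: GLS for vec(Pi) given Psi and Sigma."""
    n, m = Y.shape
    IX = np.kron(np.eye(m), X)                        # I_m kron X
    Om = np.kron(np.linalg.inv(Sigma), np.eye(n))     # Sigma^{-1} kron I_n
    Sy = (Y - W @ Y @ Psi).flatten(order="F")         # S vec(Y)
    vec_pi = np.linalg.solve(IX.T @ Om @ IX, IX.T @ Om @ Sy)
    return vec_pi.reshape((X.shape[1], m), order="F")

def ols_Pi(Y, W, X, Psi):
    """Cheaper equivalent: column-by-column OLS of Y - W Y Psi on X."""
    return np.linalg.lstsq(X, Y - W @ Y @ Psi, rcond=None)[0]

# Assumed inputs for illustration only.
rng = np.random.default_rng(2)
n, m, k = 20, 2, 3
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
X = rng.normal(size=(n, k))
Y = rng.normal(size=(n, m))
Psi = np.array([[0.3, 0.1], [-0.2, 0.4]])
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])

Pi_gls = gls_Pi(Y, W, X, Psi, Sigma)
Pi_ols = ols_Pi(Y, W, X, Psi)
print(np.max(np.abs(Pi_gls - Pi_ols)))
```

The OLS form avoids building any $nm \times nm$ matrix, which matters when Step 1 is repeated inside an optimizer over $\Psi_m$.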

Substituting $\mathrm{vec}(\hat{\Pi}_{km})$ into the log likelihood function gives

$$\ln L_{nm} = \text{constant} - \frac{n}{2} \ln |\Sigma_{vm}| + \ln |I_{nm} - \Psi_m' \otimes W_n| - \frac{1}{2} \mathrm{vec}(Y_{nm})' (I_{nm} - \Psi_m' \otimes W_n)' \big[ \Sigma_{vm}^{-1} \otimes (I_n - X_{nk}(X_{nk}'X_{nk})^{-1}X_{nk}') \big] (I_{nm} - \Psi_m' \otimes W_n)\,\mathrm{vec}(Y_{nm}).$$

Let $\tilde{y}_{nm}(\Psi_m) = \big[ I_m \otimes (I_n - X_{nk}(X_{nk}'X_{nk})^{-1}X_{nk}') \big] (I_{nm} - \Psi_m' \otimes W_n)\,\mathrm{vec}(Y_{nm})$. It can be written as $(\tilde{y}_{n,1}'(\Psi_m), \ldots, \tilde{y}_{n,m}'(\Psi_m))'$, where $\tilde{y}_{n,j}(\Psi_m)$ is an $n$-dimensional column vector for $j = 1, \ldots, m$. Then, take the derivative with respect to $\sigma_m^{(kl)}$, which is the $kl$th element of $\Sigma_{vm}^{-1}$:

$$\frac{\partial \ln L_{nm}}{\partial \sigma_m^{(kl)}} = \frac{n}{2} \Sigma_{vm,kl} - \frac{1}{2} \tilde{y}_{n,k}'(\Psi_m)\,\tilde{y}_{n,l}(\Psi_m),$$

since $\partial \ln |\Sigma_{vm}^{-1}| / \partial \Sigma_{vm}^{-1} = \Sigma_{vm}$.

Step 2: $\hat{\Sigma}_{vm,kl}(\Psi_m) = \frac{1}{n} \tilde{y}_{n,k}'(\Psi_m)\,\tilde{y}_{n,l}(\Psi_m)$ for $k, l = 1, \ldots, m$.

Step 3: We maximize the concentrated log likelihood function with respect to $\Psi_m$:

$$\ln L_{nm} = \text{constant} - \frac{n}{2} \ln |\hat{\Sigma}_{vm}(\Psi_m)| + \sum_{i=1}^{n} \ln \Big[ 1 + \sum_{j=1}^{m} (-1)^j M_{j,\Psi_m} \lambda_i^j \Big] - \frac{1}{2} \mathrm{vec}(Y_{nm})' (I_{nm} - \Psi_m' \otimes W_n)' \big[ \hat{\Sigma}_{vm}^{-1}(\Psi_m) \otimes (I_n - X_{nk}(X_{nk}'X_{nk})^{-1}X_{nk}') \big] (I_{nm} - \Psi_m' \otimes W_n)\,\mathrm{vec}(Y_{nm}).$$

3 Identification and Estimation of a Simultaneous Equations Model

We extend our multivariate spatial autoregressive model to a simultaneous equations SAR model, which is

$$Y_{nm} \Gamma_m = W_n Y_{nm} \Lambda_m + X_{nk} C_{km} + U_{nm}, \qquad (4)$$

where $\Gamma_m$ is invertible with its diagonal elements normalized to ones, and the rows of $U_{nm}$, $u_{n,i}$ for $i = 1, \ldots, n$, are independent and identically distributed random vectors. The covariance matrix of $u_{n,i}$ is $\Sigma_{um}$. Then, the corresponding multivariate SAR model can be seen as a quasi-reduced form of model (4) with

$$\Psi_m = \Lambda_m \Gamma_m^{-1}, \quad \Pi_{km} = C_{km} \Gamma_m^{-1} \quad \text{and} \quad \Sigma_{vm} = \Gamma_m'^{-1} \Sigma_{um} \Gamma_m^{-1}.$$

The traditional simultaneous equations model (SEM) can be viewed as

$$Y_{nm} \Gamma_m = X_{nk} C_{km} + U_{nm}.$$

The difference between the traditional SEM and the simultaneous equations SAR model is that the latter contains the additional explanatory regressors $W_n Y_{nm}$, which introduce cross-sectional dependence among agents. Let $Z_{nm} = [Y_{nm}, W_n Y_{nm}, X_{nk}]$ and $\alpha_m = (\Gamma_m', -\Lambda_m', -C_{km}')'$; the model becomes $Z_{nm} \alpha_m = U_{nm}$.

3.1 Identification of Structural Parameters

For identification, we assume that $\Psi_m$ and $\Pi_{km}$ are identifiable from the multivariate SAR model. So it remains to consider the identification of the structural parameters $\Gamma_m$, $\Lambda_m$ and $C_{km}$. By the linear relationships $\Psi_m \Gamma_m = \Lambda_m$ and $\Pi_{km} \Gamma_m = C_{km}$, the identification issue is algebraically similar to the classic identification of structural parameters from reduced form parameters in a linear simultaneous equation system. The identification for the classic linear simultaneous equation system can be found in, e.g., Schmidt (1976).

Definition 3. A parameter of the structural model $Y_{nm} \Gamma_m = W_n Y_{nm} \Lambda_m + X_{nk} C_{km} + U_{nm}$ (or $Z_{nm} \alpha_m = U_{nm}$) is identified if and only if it can be deduced from knowledge of the reduced form parameters $\Psi_m$, $\Pi_{km}$ and $\Sigma_{vm}$.

The statement that structural parameters can be deduced from reduced form parameters means that the structural parameter vector is the unique solution to the equation system

$$\begin{pmatrix} \Lambda_m \\ C_{km} \end{pmatrix} \Gamma_m^{-1} = \begin{pmatrix} \Psi_m \\ \Pi_{km} \end{pmatrix}$$

given $\Psi_m$ and $\Pi_{km}$, which is equivalent to

$$\begin{pmatrix} \Psi_m \\ \Pi_{km} \end{pmatrix} \Gamma_m = \begin{pmatrix} \Lambda_m \\ C_{km} \end{pmatrix},$$

where $\Gamma_m$ is invertible with the scale normalization that its diagonal elements are all ones. Let $\theta_{\Psi,\Pi} = \begin{pmatrix} \Psi_m \\ \Pi_{km} \end{pmatrix}$; the equation system to be solved becomes $(\theta_{\Psi,\Pi}, I_{m+k})\,\alpha_m = 0$.
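The mapping from structural to quasi-reduced form parameters, and the linear system $(\theta_{\Psi,\Pi}, I_{m+k})\,\alpha_m = 0$, can be illustrated with a small numerical sketch. All matrix values below are arbitrary assumptions chosen for illustration ($m = 3$, $k = 2$); they do not come from the paper.

```python
import numpy as np

m, k = 3, 2

# Hypothetical structural parameters; Gamma has unit diagonal and is invertible.
Gamma = np.array([[1.0, 0.2, 0.0],
                  [-0.3, 1.0, 0.1],
                  [0.1, 0.0, 1.0]])
Lambda = np.array([[0.3, 0.1, 0.0],
                   [0.0, 0.2, 0.1],
                   [0.1, 0.0, 0.4]])
C = np.array([[1.0, 0.5, -0.5],
              [0.0, 1.0, 0.7]])
Sigma_u = np.array([[1.0, 0.3, 0.0],
                    [0.3, 1.0, 0.2],
                    [0.0, 0.2, 1.0]])

# Quasi-reduced form parameters.
Gamma_inv = np.linalg.inv(Gamma)
Psi = Lambda @ Gamma_inv
Pi = C @ Gamma_inv
Sigma_v = Gamma_inv.T @ Sigma_u @ Gamma_inv

# Stack theta = (Psi', Pi')' and alpha = (Gamma', -Lambda', -C')'.
theta = np.vstack([Psi, Pi])                # (m + k) x m
alpha = np.vstack([Gamma, -Lambda, -C])     # (2m + k) x m

# (theta, I_{m+k}) alpha should vanish up to rounding error.
residual = np.hstack([theta, np.eye(m + k)]) @ alpha
print(np.abs(residual).max())
```

Every column of $\alpha_m$ solves the same homogeneous system, which is why restrictions on an equation are needed to single out its own coefficient vector.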

We first consider the identification of parameters subject to exclusion restrictions.

Lemma 6. Suppose $\Phi$ is an $R \times (2m + k)$ matrix representing exclusion restrictions on the coefficients of the first equation such that $\Phi \alpha_{\cdot 1,m} = 0$, and there is no restriction on the disturbance variance. Then the necessary and sufficient rank condition for the identification of $\alpha_{\cdot 1,m}$ is $\mathrm{rank}(\Phi \alpha_m) = m - 1$, and the necessary order condition for identification is $R \geq m - 1$.

As in the usual linear simultaneous equations model, the order condition is a counting condition, which is particularly useful when variables are excluded from a structural equation. The issue for the simultaneous equations SAR model concerns the spatially lagged variables in the system. They are endogenous variables in the statistical sense. However, as we demonstrate below by an example, a structural equation may still be identifiable without the exclusion of exogenous regressors $X$ but with the exclusion of spatial lags (which are relevant for the whole system). For the purpose of counting excluded variables in a structural equation, it is appropriate to regard spatial lags as exogenous variables. This interpretation is possible because spatial lags bring in neighboring characteristics as extra exogenous variables for identification and estimation, as in a single SAR model with relevant exogenous variables. Without exogenous variables in the system, i.e., in a pure simultaneous equation SAR model, parameter identification may still be possible due to the structure of the disturbances, even though spatial lags would then bring in only neighboring disturbances rather than neighboring exogenous characteristics. With exclusion restrictions, one may count the number of included endogenous variables (not including spatial lags) and the number of excluded spatial lags and exogenous regressors.
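The rank condition of Lemma 6 can be checked numerically for a given parameter value. The sketch below uses a hypothetical three-equation system with one exogenous regressor, in which the first equation contains no spatial lags and the remaining two equations exclude $y_1$ and $W y_1$; the helper names and all numbers are assumptions made for illustration.

```python
import numpy as np


def selector(rows, width):
    """Exclusion matrix Phi: one unit row per excluded variable in Z."""
    Phi = np.zeros((len(rows), width))
    for r, col in enumerate(rows):
        Phi[r, col] = 1.0
    return Phi


def rank_condition_holds(Phi, alpha):
    """Rank condition of Lemma 6: rank(Phi @ alpha) == m - 1."""
    m = alpha.shape[1]
    return bool(np.linalg.matrix_rank(Phi @ alpha) == m - 1)


def build_alpha(Lambda22):
    # Columns are equations; rows follow Z = [y1, y2, y3, Wy1, Wy2, Wy3, x].
    Gamma = np.array([[1.0, 0.0, 0.0],
                      [0.5, 1.0, 0.3],
                      [-0.4, 0.2, 1.0]])
    Lambda = np.zeros((3, 3))
    Lambda[1:, 1:] = Lambda22
    C = np.array([[1.0, 0.7, -0.5]])
    return np.vstack([Gamma, -Lambda, -C])


alpha_full = build_alpha(np.array([[0.3, 0.1], [0.2, 0.4]]))   # rank-2 block
alpha_sing = build_alpha(np.array([[0.3, 0.1], [0.6, 0.2]]))   # rank-1 block

Phi_eq1 = selector([3, 4, 5], 7)   # equation 1 excludes Wy1, Wy2, Wy3
Phi_eq2 = selector([0, 3], 7)      # equation 2 excludes y1 and Wy1

print(rank_condition_holds(Phi_eq1, alpha_full))
print(rank_condition_holds(Phi_eq1, alpha_sing))
print(rank_condition_holds(Phi_eq2, alpha_full))
```

In this sketch the second equation satisfies the order condition ($R = 2 \geq m - 1 = 2$) while the rank of $\Phi \alpha_m$ is only one, which illustrates that the order condition is necessary but not sufficient.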
The order condition is equivalent to the condition that the number of excluded exogenous variables and spatial lags is at least as large as the number of included endogenous variables, not counting the one whose coefficient is normalized to unity in the structural equation.

Some empirical models utilize not only exclusion restrictions but also other linear restrictions, including normalization, to identify the true parameters. The following lemma summarizes identification with rank conditions.

Lemma 7. Suppose $\Phi$ is an $R \times (2m + k)$ matrix representing linear restrictions (including normalization) on the coefficients of the first equation such that $\Phi \alpha_{\cdot 1,m} = c$, where $c$ is a vector of constants, and there is no restriction on the disturbance variance. Then the necessary and sufficient rank condition for the identification of $\alpha_{\cdot 1,m}$ is $\mathrm{rank}(\Phi \alpha_m) = m$.

Restrictions on disturbances may also help identification. Discussions on identification with restrictions on disturbances are collected in Appendix B.

Since we propose full information maximum likelihood estimation, we should consider the identification of the whole system, where linear constraints may also hold across equations. If we do not consider restrictions on the disturbance terms, we may summarize the linear restrictions, including cross-equation ones and normalizations, in a matrix $\Psi$ such that $\Psi\,\mathrm{vec}(\alpha_m) = c$, where $c$ is an $R \times 1$ constant vector and $\Psi$ is an $R \times (2m^2 + km)$ matrix.

Lemma 8. A necessary and sufficient condition to identify the whole system is $\mathrm{rank}\{\Psi (I_m \otimes \alpha_m)\} = m^2$.

Examples

Two special groups of simultaneous equation models are relatively simple but useful in empirical studies. Their identification issues are investigated here.

Example 1.

Here, $Y_{n1}$ consists of a single dependent variable, which is explained by the regressors $X$ and $Y_2$, where $Y_2$ is a vector of endogenous explanatory variables subject to spatial interactions. Explicitly, the system is

$$Y_{n1} + Y_{n2} \Gamma_{21} = X_{nk} C_{k1} + U_{n1},$$
$$Y_{n2} \Gamma_{22} = W_n Y_{n2} \Lambda_{22} + X_{nk} C_{k2} + U_{n2},$$

where $Y_{n1}$ is an $n \times 1$ matrix, $Y_{n2}$ is an $n \times m_2$ matrix, $X_{nk}$ is an $n \times k$ matrix, $\Gamma_{21}$ is an $m_2 \times 1$ matrix, $\Gamma_{22}$ is an invertible $m_2 \times m_2$ matrix with diagonal elements normalized to ones, $\Lambda_{22}$ is an $m_2 \times m_2$ matrix, $C_{k1}$ is a $k \times 1$ matrix and $C_{k2}$ is a $k \times m_2$ matrix. Here $m = 1 + m_2$. These equations can be combined into the matrix form

$$(Y_{n1}, Y_{n2}) \begin{pmatrix} 1 & 0_{1 \times m_2} \\ \Gamma_{21} & \Gamma_{22} \end{pmatrix} = (W_n Y_{n1}, W_n Y_{n2}, X_{nk}) \begin{pmatrix} 0 & 0_{1 \times m_2} \\ 0_{m_2 \times 1} & \Lambda_{22} \\ C_{k1} & C_{k2} \end{pmatrix} + (U_{n1}, U_{n2}).$$

Define

$$\alpha_m = \begin{pmatrix} 1 & 0_{1 \times m_2} \\ \Gamma_{21} & \Gamma_{22} \\ 0 & 0_{1 \times m_2} \\ 0_{m_2 \times 1} & -\Lambda_{22} \\ -C_{k1} & -C_{k2} \end{pmatrix}.$$

The exclusion restrictions for the first equation in the first group can be represented by

$$\Phi_1 = \begin{pmatrix} 0 & 0_{1 \times m_2} & 1 & 0_{1 \times m_2} & 0_{1 \times k} \\ 0_{m_2 \times 1} & 0_{m_2 \times m_2} & 0_{m_2 \times 1} & I_{m_2} & 0_{m_2 \times k} \end{pmatrix}.$$

For the rank condition, we have

$$\Phi_1 \alpha_m = \begin{pmatrix} 0 & 0_{1 \times m_2} \\ 0_{m_2 \times 1} & -\Lambda_{22} \end{pmatrix}.$$

If $\mathrm{rank}(\Lambda_{22}) = m_2$, then $\mathrm{rank}(\Phi_1 \alpha_m) = m_2 = m - 1$, so the rank condition is satisfied and the structural parameters in the first equation can be identified. The possible identification of the equation for $Y_{n1}$ is intuitively appealing: the second set of equations for $Y_{n2}$ is, in its reduced form, a multivariate SAR model, which can be identified, so there are valid IVs for $Y_{n2}$ that can be used for an IV estimation of the equation for $Y_{n1}$.

Similarly, consider the first equation from the second group, which excludes $Y_{n1}$ and $W_n Y_{n1}$:

$$\Phi_2 = \begin{pmatrix} 1 & 0_{1 \times m_2} & 0 & 0_{1 \times m_2} & 0_{1 \times k} \\ 0 & 0_{1 \times m_2} & 1 & 0_{1 \times m_2} & 0_{1 \times k} \end{pmatrix}, \qquad \Phi_2 \alpha_m = \begin{pmatrix} 1 & 0_{1 \times m_2} \\ 0 & 0_{1 \times m_2} \end{pmatrix}.$$

This equation cannot be identified whenever $m_2 > 1$, since the rank condition is not satisfied. When $m_2 = 1$, $\mathrm{rank}(\Phi_2 \alpha_m) = 1$, which equals $m - 1$ because $m - 1 = m_2 = 1$, so the rank condition is satisfied. These situations are easy to understand. If $m_2 = 1$, we have a univariate SAR model $Y_{n2} = \lambda_{22} W_n Y_{n2} + X_{nk} C_{k2} + U_{n2}$, which is identifiable. However, if $m_2 > 1$, we have the multivariate SAR model

$$Y_{n2} = W_n Y_{n2} \Lambda_{22} \Gamma_{22}^{-1} + X_{nk} C_{k2} \Gamma_{22}^{-1} + U_{n2} \Gamma_{22}^{-1},$$

of which only the products of coefficient matrices are identifiable, but not the structural parameters separately.

For a similar equation system without any exogenous variables $X_{nk}$ in the equations, i.e., $Y_{n1} + Y_{n2} \Gamma_{21} = U_{n1}$ and $Y_{n2} \Gamma_{22} = W_n Y_{n2} \Lambda_{22} + U_{n2}$, the preceding identification analysis remains valid: $\Gamma_{21}$ can be identified in the first equation, but $\Gamma_{22}$ and $\Lambda_{22}$ in the second equation are underidentified.

Example 2.

Consider the system

$$Y_{n1} \Gamma_{11} + Y_{n2} \Gamma_{21} = W_n Y_{n1} \Lambda_{11} + X_{n1} C_{k_1 1} + U_{n1},$$
$$Y_{n2} = X_{n1} C_{k_1 2} + X_{n2} C_{k_2 2} + U_{n2},$$

where $Y_{n1}$ is an $n \times m_1$ matrix, $Y_{n2}$ is an $n \times m_2$ matrix, $X_{n1}$ is an $n \times k_1$ matrix, $X_{n2}$ is an $n \times k_2$ matrix, $\Gamma_{21}$ is an $m_2 \times m_1$ matrix, $\Gamma_{11}$ is an $m_1 \times m_1$ invertible matrix with diagonal elements normalized to ones, $\Lambda_{11}$ is an $m_1 \times m_1$ matrix, $C_{k_1 1}$ is a $k_1 \times m_1$ matrix, $C_{k_1 2}$ is a $k_1 \times m_2$ matrix and $C_{k_2 2}$ is a $k_2 \times m_2$ matrix. Here, $m = m_1 + m_2$, and $Y_1$ is a simultaneous SAR system of dependent variables explained by the regressors $X_1$ and $Y_2$, where $Y_2$ are endogenous regressors. Each equation in the second group of $Y_{n2}$ equations is a regression equation and hence can be identified as long as the regressors in $X_1$ and $X_2$ are not linearly dependent, which is assumed.

Consider the first equation in the first group. With $Z_{nm} = [Y_{n1}, Y_{n2}, W_n Y_{n1}, W_n Y_{n2}, X_{n1}, X_{n2}]$,

$$\alpha_m = \begin{pmatrix} \Gamma_{11} & 0_{m_1 \times m_2} \\ \Gamma_{21} & I_{m_2} \\ -\Lambda_{11} & 0_{m_1 \times m_2} \\ 0_{m_2 \times m_1} & 0_{m_2 \times m_2} \\ -C_{k_1 1} & -C_{k_1 2} \\ 0_{k_2 \times m_1} & -C_{k_2 2} \end{pmatrix}.$$

The exclusion restrictions for the first equation in the first group can be represented by

$$\Phi_1 = \begin{pmatrix} 0_{m_2 \times m_1} & 0_{m_2 \times m_2} & 0_{m_2 \times m_1} & I_{m_2} & 0_{m_2 \times k_1} & 0_{m_2 \times k_2} \\ 0_{k_2 \times m_1} & 0_{k_2 \times m_2} & 0_{k_2 \times m_1} & 0_{k_2 \times m_2} & 0_{k_2 \times k_1} & I_{k_2} \end{pmatrix}.$$

For the rank condition,

$$\Phi_1 \alpha_m = \begin{pmatrix} 0_{m_2 \times 1} & 0_{m_2 \times (m_1 - 1)} & 0_{m_2 \times m_2} \\ 0_{k_2 \times 1} & 0_{k_2 \times (m_1 - 1)} & -C_{k_2 2} \end{pmatrix}.$$

The rank condition requires that $\mathrm{rank}(C_{k_2 2}) = m_1 + m_2 - 1$. However, $\mathrm{rank}(C_{k_2 2}) \leq m_2$, so the structural parameters in the first equation cannot be identified unless $m_1 = 1$ and $\mathrm{rank}(C_{k_2 2}) = m_2$. When $m_1 = 1$, the first group consists of the single equation $Y_{n1} = W_n Y_{n1} \Lambda_{11} - Y_{n2} \Gamma_{21} + X_{n1} C_{k_1 1} + U_{n1}$, which together with the second set of equations implies

$$Y_{n1} = W_n Y_{n1} \Lambda_{11} + X_{n1} (C_{k_1 1} - C_{k_1 2} \Gamma_{21}) - X_{n2} C_{k_2 2} \Gamma_{21} + U_{n1} - U_{n2} \Gamma_{21}.$$

This is a univariate SAR equation, so $\Lambda_{11}$ and the combined parameters $(C_{k_1 1} - C_{k_1 2} \Gamma_{21})$ and $C_{k_2 2} \Gamma_{21}$ are identifiable. As $C_{k_1 2}$ and $C_{k_2 2}$ are identifiable from the regression equations of $Y_{n2}$, with full rank of $C_{k_2 2}$ the parameters $C_{k_1 1}$ and $\Gamma_{21}$ are identifiable. The rank condition $\mathrm{rank}(C_{k_2 2}) = m_2$ means that there are enough exclusion restrictions in the $Y_{n1}$ equation to find IVs for $Y_{n2}$. For the case $m_1 > 1$, identification is not possible, because the reduced form multivariate SAR equations can at best identify, for example, the product $\Lambda_{11} \Gamma_{11}^{-1}$.

3.2 Consistency and Asymptotic Properties of FIML

We propose a FIML method for the estimation of the simultaneous equations SAR model. The log likelihood function for (4) is

$$\frac{1}{mn} \ln L_{nm}(\Gamma_m, \Lambda_m, C_{km}, \Sigma_{um}) = -\frac{1}{2} \ln(2\pi) - \frac{1}{2m} \ln |\Sigma_{um}| + \frac{1}{mn} \ln |\Gamma_m' \otimes I_n - \Lambda_m' \otimes W_n| - \frac{1}{2mn} \big[ (\Gamma_m' \otimes I_n - \Lambda_m' \otimes W_n)\mathrm{vec}(Y_{nm}) - (I_m \otimes X_{nk})\mathrm{vec}(C_{km}) \big]' (\Sigma_{um}^{-1} \otimes I_n) \big[ (\Gamma_m' \otimes I_n - \Lambda_m' \otimes W_n)\mathrm{vec}(Y_{nm}) - (I_m \otimes X_{nk})\mathrm{vec}(C_{km}) \big]. \qquad (5)$$

It is apparent that this likelihood function is the same as that of the multivariate SAR model after substituting $\Psi_m = \Lambda_m \Gamma_m^{-1}$, $\Pi_{km} = C_{km} \Gamma_m^{-1}$ and $\Sigma_{vm} = \Gamma_m'^{-1} \Sigma_{um} \Gamma_m^{-1}$. In order to derive consistency and asymptotic normality for the FIML estimator of the structural parameters, we impose the following regularity conditions.

Assumption 7. The $i$th row of $U_{nm}$, $u_i$, is an independent and identically distributed random vector with zero mean and covariance matrix $\Sigma_{um}$. The elements of the disturbances satisfy the moment condition $E|u_{i,nk} u_{i,nl} u_{i,np} u_{i,nq}|^{1+\delta} < D_4$ for any $i = 1, 2, \ldots, n$ and $k, l, p, q = 1, 2, \ldots, m$, for some positive constants $D_4$ and $\delta > 0$.

Assumption 8. All structural parameters can be identified from the parameters in the quasi-reduced form. The quasi-reduced form of this simultaneous equation SAR model satisfies Assumptions 2-6.

In the multivariate SAR model, we prove the consistency of QML estimation without any restrictions on its parameters. The parameter space is $\Theta$. However, if there are restrictions on the multivariate SAR model, the parameter space, say $\Theta_c$, will have a smaller dimension but will still be assumed to be compact. Since the estimation of a restricted multivariate SAR model amounts to estimating its free parameters in a smaller compact set, the proof of consistency remains valid.
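The equivalence between the structural log likelihood for (4) and the quasi-reduced form log likelihood can be verified numerically. The sketch below evaluates the unscaled Gaussian log likelihood at hypothetical parameter values and at the implied values of $\Psi_m$, $\Pi_{km}$ and $\Sigma_{vm}$; the function name, the ring-shaped weight matrix and all numbers are assumptions for illustration, and the data are arbitrary draws rather than a simulated SAR process.

```python
import numpy as np


def gaussian_loglik(Y, X, W, Gamma, Lambda, C, Sigma_u):
    """Unscaled Gaussian log likelihood of Y Gamma = W Y Lambda + X C + U."""
    n, m = Y.shape
    U = Y @ Gamma - W @ Y @ Lambda - X @ C
    S = np.kron(Gamma.T, np.eye(n)) - np.kron(Lambda.T, W)
    logdet_S = np.linalg.slogdet(S)[1]
    logdet_Sigma = np.linalg.slogdet(Sigma_u)[1]
    # vec(U)' (Sigma^{-1} (x) I_n) vec(U) equals tr(Sigma^{-1} U'U).
    quad = np.trace(np.linalg.solve(Sigma_u, U.T @ U))
    return (-0.5 * m * n * np.log(2 * np.pi) - 0.5 * n * logdet_Sigma
            + logdet_S - 0.5 * quad)


n, m, k = 8, 2, 2
rng = np.random.default_rng(0)
Y = rng.standard_normal((n, m))     # arbitrary data, for the identity check only
X = rng.standard_normal((n, k))

# Hypothetical ring weight matrix: each unit has two neighbors.
eye = np.eye(n)
W = 0.5 * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))

Gamma = np.array([[1.0, 0.3], [-0.2, 1.0]])
Lambda = np.array([[0.3, 0.1], [0.0, 0.2]])
C = np.array([[1.0, 0.5], [-0.5, 1.0]])
Sigma_u = np.array([[1.0, 0.3], [0.3, 0.8]])

Gamma_inv = np.linalg.inv(Gamma)
Psi, Pi = Lambda @ Gamma_inv, C @ Gamma_inv
Sigma_v = Gamma_inv.T @ Sigma_u @ Gamma_inv

ll_structural = gaussian_loglik(Y, X, W, Gamma, Lambda, C, Sigma_u)
# The quasi-reduced form is the same model with Gamma replaced by the identity.
ll_reduced = gaussian_loglik(Y, X, W, np.eye(m), Psi, Pi, Sigma_v)
print(ll_structural, ll_reduced)
```

The two values agree because the factor $|\Gamma_m|^n$ in the Jacobian term is offset by the corresponding factor in $|\Sigma_{vm}|^{-n/2}$, while the quadratic forms coincide.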

Suppose the structural parameters of (4) can be identified from the reduced form parameters. Let $\alpha$ denote the vector of structural parameters and $\theta$ the vector of quasi-reduced form parameters. The MLE for the structural parameters maximizes $\ln L(\alpha)$, which is equal to $\ln L(\theta(\alpha))$, with respect to $\alpha$. Since all the structural parameters are assumed to be identified from the quasi-reduced form parameters, and the reduced form parameters can be consistently estimated, the structural parameters can be consistently estimated by maximizing the log likelihood function with respect to $\alpha$. Therefore, we have Corollary 3.

Corollary 3. Under Assumptions 7-8, the structural parameters in $\Gamma_m$, $\Lambda_m$, $C_{km}$ and $\Sigma_{um}$, with all the exclusion restrictions imposed, can be consistently estimated by the FIML.

The asymptotic distribution of the FIML estimators of the structural parameters can be derived by a Taylor expansion of the first order conditions. The first derivatives of the log likelihood function for (4) are

$$\frac{\partial \ln L_{nm}(\alpha_{m0})}{\partial C_{km,ij}} = e_{(i-1)m+j}' (\Sigma_{um0}^{-1} \otimes X_{nk}')\,\mathrm{vec}(U_{nm}), \quad \text{for } i = 1, \ldots, k \text{ and } j = 1, \ldots, m,$$

$$\frac{\partial \ln L_{nm}(\alpha_{m0})}{\partial \Sigma_{um,ij}} = \frac{1}{2} \Big\{ \mathrm{vec}(U_{nm})' \big[ (\Sigma_{um0}^{-1} F_{m,ij} \Sigma_{um0}^{-1}) \otimes I_n \big] \mathrm{vec}(U_{nm}) - \mathrm{Tr}\big[ (\Sigma_{um0}^{-1} F_{m,ij}) \otimes I_n \big] \Big\},$$

$$\frac{\partial \ln L_{nm}(\alpha_{m0})}{\partial \Lambda_{m,ij}} = \mathrm{vec}(C_{km0})' (I_m \otimes X_{nk}') (S_{nm0,s}')^{-1} \big[ (E_{m,ij} \Sigma_{um0}^{-1}) \otimes W_n' \big] \mathrm{vec}(U_{nm}) + \Big\{ \mathrm{vec}(U_{nm})' \big[ (\Sigma_{um0}^{-1} E_{m,ji}) \otimes W_n \big] S_{nm0,s}^{-1} \mathrm{vec}(U_{nm}) - \mathrm{Tr}\big[ S_{nm0,s}^{-1} (E_{m,ji} \otimes W_n) \big] \Big\},$$

and

$$\frac{\partial \ln L_{nm}(\alpha_{m0})}{\partial \Gamma_{m,ij}} = -\mathrm{vec}(C_{km0})' (I_m \otimes X_{nk}') (S_{nm0,s}')^{-1} \big[ (E_{m,ij} \Sigma_{um0}^{-1}) \otimes I_n \big] \mathrm{vec}(U_{nm}) - \Big\{ \mathrm{vec}(U_{nm})' \big[ (\Sigma_{um0}^{-1} E_{m,ji}) \otimes I_n \big] S_{nm0,s}^{-1} \mathrm{vec}(U_{nm}) - \mathrm{Tr}\big[ S_{nm0,s}^{-1} (E_{m,ji} \otimes I_n) \big] \Big\},$$

for $i, j = 1, \ldots, m$, where $e_{(i-1)m+j}$ is a $km \times 1$ vector with all zeros except the $((i-1)m+j)$th entry, which is one, and $S_{nm,s} = \Gamma_m' \otimes I_n - \Lambda_m' \otimes W_n$. It is apparent that the first order derivatives are linear-quadratic forms, and hence the CLT of Lemma 3 can be applied. The second order derivatives can also be written explicitly, and they share the same features as those of the multivariate SAR model.
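Therefore, the asymptotic distribution of the FIML estimator follows.

Corollary 4. Under Assumptions 7-8, suppose the information matrix $\Omega_{\alpha_m}$ is nonsingular. Then

$$\sqrt{n}\,(\hat{\alpha}_m - \alpha_{m0}) \xrightarrow{d} N\big(0,\ \Omega_{\alpha_m}^{-1} + \Omega_{\alpha_m}^{-1} \Xi_{\alpha_m} \Omega_{\alpha_m}^{-1}\big).$$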

4 Monte Carlo Experiments

Monte Carlo experiments are designed to investigate the finite sample properties of the QMLE and FIMLE. We first look at multivariate SAR models and then move on to simultaneous equation models. In all our experiments, the spatial weight matrix $W$ is generated from 760 counties in the Great Plains states of the USA. When two counties, $i$ and $j$, share a border, $w_{ij} = w_{ji} = 1$. When the sample size is $n$, we randomly pick an $n \times n$ block along the diagonal and row-normalize the matrix.

4.1 Monte Carlo Experiments for Multivariate SAR Models

Three groups of experiments are conducted. The first group focuses on the situation in which 2SLS/3SLS cannot be applied. An example is a multivariate SAR process without exogenous regressors.

Example 1: $y_1 = \psi_{11} W y_1 + \psi_{21} W y_2 + u_1$ and $y_2 = \psi_{12} W y_1 + \psi_{22} W y_2 + u_2$, where no exogenous regressors are present, so the equations cannot be estimated by 2SLS or 3SLS. This is a pure bivariate SAR system. For this system, the disturbances $u_1$ and $u_2$ are normally distributed with mean 0 and covariance matrix $\Sigma$, and the spatial effects matrix is $\Psi_m = \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{21} & \psi_{22} \end{pmatrix}$.

Table 1 presents the Monte Carlo results when the sample sizes are 100, 300 and 500. We conduct 500 repetitions for each design, which return the empirical mean and standard deviation of the QML estimates. All the estimates are significant and the biases for the $\psi$'s are about 4% or less, even when the sample size is just 300. When the sample size increases, the estimator bias decreases. Standard deviations also decrease at the rate of $\sqrt{n}$, which is consistent with the implication of Theorem 2.

Table 1: Monte Carlo Results for Example 1 (500 repetitions)

Parameter   TRUE   Statistic   QMLE       QMLE       QMLE
ψ_11        0.5    Mean
                   S.D.        (0.2855)   (0.83)     (0.606)
ψ_12               Mean
                   S.D.        (0.2820)   (0.808)    (0.56)
ψ_21               Mean
                   S.D.        (0.30)     (0.745)    (0.645)
ψ_22               Mean
                   S.D.        (0.2787)   (0.707)    (0.585)
σ_11               Mean
                   S.D.        (0.480)    (0.083)    (0.069)
σ_12               Mean
                   S.D.        (0.8)      (0.06)     (0.055)
σ_22               Mean
                   S.D.        (0.395)    (0.0779)   (0.064)

The second group of Monte Carlo experiments is conducted for the following model.

Example 2: $y_1 = \psi_{11} W y_1 + \psi_{21} W y_2 + \beta_1 x + u_1$ and $y_2 = \psi_{12} W y_1 + \psi_{22} W y_2 + \beta_2 x + u_2$, where the true $\beta_1 = \beta_2 = 1$ and $x$ is uniformly distributed on $U(0, 2)$. As suggested by the existing literature (Kelejian and Prucha, 2004; Cohen-Cole et al., 2013), we use $W x$, $W^2 x$ and $W^3 x$ as instrumental variables for 2SLS. Results for the 2SLS and QML estimators are reported in Table 2. The results suggest that the conventional IV (2SLS) method has large standard deviations for the 2SLSE of the spatial effects parameters, though the exogenous effects can be estimated much more precisely and with small biases. On the other hand, the QMLE has much smaller biases and standard deviations for the estimates of the spatial effects parameters than the 2SLSE. The standard deviations of the 2SLSEs of the spatial effects can be 5 to 30 times larger than those of the QMLEs.

The next example considers two equations with different exogenous variables.

Example 3: $y_1 = \psi_{11} W y_1 + \psi_{21} W y_2 + \beta_1 x_1 + u_1$ and $y_2 = \psi_{12} W y_1 + \psi_{22} W y_2 + \beta_2 x_2 + u_2$, where $\beta_1 = \beta_2 = 1$, and $x_1$ and $x_2$ are generated from the uniform distribution $U(0, 2)$. Here we use $W x$, $W^2 x$ and $W^3 x$ for both $x_1$ and $x_2$ as instrumental variables in the 2SLS estimation. The results in Table 3 suggest that the biases of the 2SLSE, 3SLSE and QMLE are small and the estimates are significant for all spatial effects and exogenous effects parameters. Comparing the precision of the estimates, while there is not much difference in the estimates of the $\beta$'s, the standard deviations of the QMLEs of the spatial effects are much smaller than those of the 2SLSE and 3SLSE. QML estimators show substantial efficiency improvement compared to 2SLSEs and
