Essays On Spatial Econometrics: Estimation Methods And Applications


City University of New York (CUNY)
CUNY Academic Works
Dissertations, Theses, and Capstone Projects, Graduate Center

Essays On Spatial Econometrics: Estimation Methods And Applications
Osman Dogan, Graduate Center, City University of New York

Part of the Economic Theory Commons

Recommended Citation: Dogan, Osman, "Essays On Spatial Econometrics: Estimation Methods And Applications" (2015). CUNY Academic Works.

This Dissertation is brought to you by CUNY Academic Works. It has been accepted for inclusion in All Graduate Works by Year: Dissertations, Theses, and Capstone Projects by an authorized administrator of CUNY Academic Works. For more information, please contact deposit@gc.cuny.edu.

ESSAYS ON SPATIAL ECONOMETRICS: ESTIMATION METHODS AND APPLICATIONS
by Osman Doğan

A dissertation submitted to the Graduate Faculty in Economics in partial fulfillment of the requirements for the degree of Doctor of Philosophy, The City University of New York, 2015

© 2015 Osman Doğan. All Rights Reserved.

This manuscript has been read and accepted for the Graduate Faculty in Economics in satisfaction of the dissertation requirement for the degree of Doctor of Philosophy.
Chair of Examining Committee: Dr. Wim Vijverberg
Executive Officer: Dr. Merih Uctum
Supervisory Committee: Dr. Wim Vijverberg, Dr. Merih Uctum, Dr. Thom Thurston
THE CITY UNIVERSITY OF NEW YORK

Abstract

ESSAYS ON SPATIAL ECONOMETRICS: ESTIMATION METHODS AND APPLICATIONS
by Osman Doğan
Adviser: Professor Wim Vijverberg

This dissertation consists of four essays on the estimation methods and applications of spatial econometrics models.

In the first essay, we consider a spatial econometric model containing spatial lags in the dependent variable and the disturbance terms with an unknown form of heteroskedasticity in the innovations. We first prove that the maximum likelihood estimator (MLE) is generally inconsistent when heteroskedasticity is not taken into account in the estimation. We show that the necessary condition for consistency of the MLE depends on the specification of the spatial weight matrices. Then, we extend the robust generalized method of moments (GMM) estimation approach in Lin and Lee (2010) to spatial models that allow for a spatial lag not only in the dependent variable but also in the disturbance term. We show the consistency of the robust GMM estimator and determine its asymptotic distribution. Finally, through a comprehensive Monte Carlo simulation, we compare the finite sample properties of the robust GMM estimator with those of other estimators proposed in the literature.

In the second essay, the finite sample properties of heteroskedasticity robust estimators suggested for spatial autoregressive models are compared through simulation studies. Most of the estimators suggested for the estimation of spatial autoregressive models are inconsistent in the presence of an unknown form of heteroskedasticity. The estimators formulated from the GMM and the Bayesian Markov Chain Monte Carlo (MCMC) frameworks can be robust to an unknown form of heteroskedasticity. In this essay, the finite sample properties of the robust GMM estimators and the Bayesian estimators based on the MCMC approach are compared for spatial autoregressive models. To this end, a comprehensive Monte Carlo simulation is designed for spatial models containing a spatial lag in the dependent variable and/or the disturbance term. In the second part of the study, two empirical applications are provided to show how heteroskedasticity robust estimators perform in applied research.

In the third essay, we investigate the properties of spatial autoregressive models that have a spatial moving average process in the disturbance term. The spatial moving average process introduces a different interaction structure among observations. In the first part of this essay, we describe the transmission and the effect of shocks under a spatial moving average process. In the second part, we investigate the necessary condition for consistency of the MLE of spatial models with a spatial moving average process in the disturbance term. We show that the MLE of the spatial autoregressive and spatial moving average parameters is generally inconsistent when heteroskedasticity is not considered in the estimation. We also show that the MLE of the parameters of the exogenous variables is inconsistent and determine its asymptotic bias. We provide simulation results to evaluate the performance of the MLE. The simulation results indicate that the MLE of both the autoregressive and the moving average parameters is substantially biased.

In the fourth essay, we analyze the effect of foreign direct investment (FDI) on economic activity through a spatially augmented Solow growth model that takes technological interdependence into account. The technological interdependence manifests itself through spatial externalities, which allow the technology level of a country to depend on the technology levels of its neighbors. Based on this modified growth model, we derive regression specifications and study the impact of FDI on economic growth. The spatial autocorrelation often cited in the empirical growth literature is properly accounted for through these new specifications. Estimations are carried out with tools from spatial econometrics. Our findings indicate that FDI inflows have a significant positive effect on the growth rate of host countries.

Acknowledgements

This dissertation would not have been possible without the encouragement and support of Dr. Wim Vijverberg, Dr. Merih Uctum, and Dr. Thom Thurston. This research was supported, in part, by a grant of computer time from the City University of New York High Performance Computing Center under NSF Grants CNS , CNS and ACI-263.

Contents

Preface
List of Tables
List of Figures

1 GMM Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances (with Suleyman Taspinar)
Introduction; The Model Specification and Theoretical Motivation; GMM Estimation of Spatial Autoregressive Models; Estimation Approach under Unknown Heteroskedasticity; The Inconsistency of Maximum Likelihood Estimator; Robust GMM Estimation of SARAR(1,1); Monte Carlo Experiments; Design; Simulation Results; Conclusion; Appendix; Some Useful Lemmas; The Inconsistency of the ML Estimator; Proof of Main Propositions; Simulation Results

2 Robust Estimation Methods for Spatial Autoregressive Models: A Comparison of Bayesian and Robust GMM Approach
Introduction; Model Specification and Assumptions; Robust GMM Estimation of Spatial Autoregressive Models; Bayesian MCMC Estimation of Spatial Autoregressive Models; Bayesian MCMC Estimation of SARAR,; Bayesian MCMC Estimation of SARAR,; Bayesian MCMC Estimation of SARAR(0,1); Bayesian MCMC Computation; Monte Carlo Experiments; Design; Simulation Results; Empirical Illustrations; Conditional Convergence; Housing Price Model; Conclusion; Appendix; Some Useful Lemmas; Monte Carlo Simulation Results; Variable Definitions and Summary Statistics

3 Heteroskedasticity of Unknown Form in Spatial Autoregressive Models with Moving Average Disturbance Term
Introduction; Model Specification and Assumptions; Spatial Processes for the Disturbance Term; The MLE of λ0 and ρ; The MLE of β; Monte Carlo Simulation; Design; Simulation Results; Conclusion; Appendix; Some Useful Lemmas; Proof of Proposition; Simulation Results for SARMA(0,1); Simulation Results for SARMA,; Surface Plots of RMSEs for SARMA,

4 The Effect of FDI on Economic Growth: A Spatial Econometric Approach (with Suleyman Taspinar)
Introduction; Cross-Country Regression Specifications; Description of Data; Estimation Approach; Interpretation of Parameters in SDPD Model; Empirical Results; Estimation Results for Non-spatial Panel Data Models; Estimation Results for the SDPD Model; Robustness Exercises; Conclusion; Appendix; Data and Sample of Countries; Monte Carlo Results for Bias Corrected QMLE; Web Appendix for The Effect of FDI on Economic Growth: A Spatial Econometric Approach

Bibliography

List of Tables

(λ0, β10, β20, β30, ρ0) = 0.8, 0.7, 0.4, .2,
(λ0, β10, β20, β30, ρ0) = 0.3, 0.7, 0.4, .2,
(λ0, β10, β20, β30, ρ0) = 0, 0.7, 0.4, .2,
(λ0, β10, β20, β30, ρ0) = 0.3, 0.7, 0.4, .2,
(λ0, β10, β20, β30, ρ0) = 0.8, 0.7, 0.4, .2,
(λ0, β10, β20, β30, ρ0) = 0.3, 0.7, 0.4, .2, 0.3, =
(λ0, β10, β20, β30, ρ0) = 0.8, 0.7, 0.4, .2, 0.3, =
Conditional Convergence: SARAR,
Conditional Convergence: Spatial Durbin Model (SDM)
Conditional Convergence: SARAR(0,1)
Conditional Convergence: SARAR,
Hedonic Housing-Price Equation: SARAR,
Spatial Durbin Model
Hedonic Housing-Price Equation: SARAR(0,1)
Hedonic Housing-Price Equation: SARAR,
N = N = N = N = N = N = N =
2.6 N = N = N = N = N =
Conditional Convergence, N=
Hedonic Housing-Price Equation, N=
Simulation Results for SARMA(0,1)
Simulation Results for SARMA,: =
Simulation Results for SARMA,: =
Simulation Results for SARMA,: =
Non-spatial Model Estimations
Bias Corrected QML Estimation Results
Space-time effect estimates of physical capital elasticity for column 3 in Table
Space-time effect estimates of human capital elasticity for column 3 in Table
Space-time effect estimates of drag elasticity for column 3 in Table
Space-time effect estimates of FDI semi-elasticity for column 3 in Table
Robustness Exercises: Bias Corrected QML Estimation Results
Variable Definitions and Descriptive Statistics
SDPD Monte Carlo Results
SDPD Monte Carlo Results Unified Approach

List of Figures

Weight Matrices
The effect of hyper-parameter r on the estimates of variance components v_i's
The effect of hyper-parameter r on the estimates of variance components v_i's
Scatter plots of dependent variables and their spatial lags
Posterior mean of variance components: Conditional convergence
Monitoring convergence: SARAR,
Posterior Distribution of λ and κ: SDM and SARAR,
Posterior mean of variance components: Hedonic Housing-Price Equation
Posterior distribution of autoregressive parameters
RMSE of ρ
RMSE of λ
The Effect of a Shock
The penalty functions for the dense weight matrix
RMSE of β and β
RMSE of λ and ρ
FDI Inflows and Stocks
Diffusion effects of FDI for column 3 in Table 4.2 when p = 0 and p =

GMM Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances (with Suleyman Taspinar)

Abstract

We consider a spatial econometric model containing a spatial lag in the dependent variable and the disturbance term with an unknown form of heteroskedasticity in the innovations. We first prove that the maximum likelihood (ML) estimator for spatial autoregressive models is generally inconsistent when heteroskedasticity is not taken into account in the estimation. We show that the necessary condition for the consistency of the ML estimator of the spatial autoregressive parameters depends on the structure of the spatial weight matrices. Then, we extend the robust generalized method of moments (GMM) estimation approach in Lin and Lee (2010) to the spatial model that allows for a spatial lag not only in the dependent variable but also in the disturbance term. We show the consistency of the robust GMM estimator and determine its asymptotic distribution. Finally, through a comprehensive Monte Carlo simulation, we compare the finite sample properties of the robust GMM estimator with those of other estimators proposed in the literature.

Author Keywords: Spatial autoregressive models, Unknown heteroskedasticity, Robustness, GMM, Asymptotics, MLE
JEL classification codes: C3, C2, C3

1.1 Introduction

Spatial econometric models, which have a long history in regional science and geography, have been receiving attention in economics in recent years. Spatial econometric models allow regression specifications through which spatial dependence among observations can be incorporated in economic analysis and in the estimation of models. Spatial dependence is a special form of cross-sectional dependence among observations, determined by the locations of the observations in space.

The estimation of models with spatial dependence requires special estimation techniques. There are three main estimation approaches: (i) the maximum likelihood (ML) estimation method, (ii) the generalized method of moments (GMM/IV) estimation method, and (iii) the Bayesian Markov Chain Monte Carlo (MCMC) estimation method. [1] For many spatial model specifications, ML estimation has been the most widely used technique and has often been the only technique that is implemented (Anselin, 1988; LeSage and Pace). However, formal results concerning the asymptotic properties of the quasi ML estimator have only recently been established, in Lee (2004), for pure spatial and spatial autoregressive models. ML estimation can involve significant computational difficulty because the likelihood function contains the determinant of a matrix whose dimensions depend on the sample size (Kelejian and Prucha, 1998; Das et al., 2003; Kelejian and Prucha, 2010). Several solutions have been suggested to overcome the computational burden of the ML method (Ord, 1975; Pace and Barry, 1997b,a; Barry and Pace, 1999; Smirnov and Anselin, 2001; LeSage and Pace, 2004).

[1] For the Bayesian MCMC approach, see LeSage (1997), Parent and LeSage (2007a) and LeSage and Pace.

The GMM and IV estimators have the advantage that they do not require any distributional assumption for the disturbance term and remain computationally more feasible than ML estimation. In the literature, different kinds of two-stage least squares (2SLS) estimators, corresponding to different sets of instrumental variables, have been suggested (Anselin, 1988; Kelejian and Prucha, 1998; Lee, 2003, 2007a; Kelejian and Prucha, 2007, 2010). The spatial structure of the regression equations motivates the selection of the instruments, which are usually constructed from the exogenous variables and the spatial weight matrices. Despite its computational simplicity, the 2SLS estimator is inefficient relative to the ML estimator. The inefficiency arises because the 2SLS estimator focuses only on the deterministic part of the endogenous variable (i.e., the spatial lag term), so the information in the stochastic part is not used in the estimation.

Kelejian and Prucha (1998, 2010) propose a multi-step estimation method that combines IV and GMM estimation for the spatial model that has a spatial autoregressive process in the dependent variable and in the disturbance term (for short, SARAR(1,1)). This kind of model specification is often referred to as the Kelejian-Prucha model (Elhorst, 2010). In the first step, the parameters of the exogenous variables and the autoregressive parameter of the spatial lag of the dependent variable are estimated by the 2SLS estimator. In the second step, residuals from the first step are used to estimate the autoregressive parameter of the spatial lag of the disturbance term by the GMM estimator. In the final step, the parameters are re-estimated by the 2SLS estimator after transforming the model via a Cochrane-Orcutt type transformation to account for the spatial correlation. However, the estimation approach in Kelejian and Prucha (1998) is inefficient relative to ML estimation. The extensive Monte Carlo results in Das et al. demonstrate that the difference in finite sample efficiency, measured by root mean squared error (RMSE), between the ML estimator and the GMM and IV estimators of Kelejian and Prucha (1998, 1999) is very small. Drukker et al. (2012) consider the SARAR(1,1) specification and allow for endogenous regressors in addition to the spatial lag of the dependent variable. Their estimation approach involves several steps and is an extension of the GMM/IV estimation method of Kelejian and Prucha (1998, 1999).

To increase the efficiency of the GMM estimator, Lee (2007a,c), Lin and Lee (2010), Liu et al. (2010), and Lee and Liu (2010) suggest sets of moment functions that are linear and quadratic in the disturbance term for GMM estimation.
In this approach, the linear moment functions are based on the deterministic part of the spatial lag term, and the quadratic moment functions are constructed to exploit the stochastic part of the spatial lag variable (i.e., the endogenous variable). The quadratic moment functions are chosen in such a way that the GMM estimator is asymptotically equivalent to the ML estimator when the disturbances are independent and identically distributed (i.i.d.) with a normal density.

When the disturbances are simply i.i.d., Liu et al. (2010) and Lee and Liu (2010) show that the one-step GMM estimator (joint GMM estimator) is more efficient than the quasi ML estimator for the case of a SARAR(1,1) and a SARAR(p,q), respectively.

Most of the estimation methods mentioned above are valid under the assumption that the disturbance terms are i.i.d. In many regression applications, heteroskedasticity is likely to be present. [2] In the presence of unknown heteroskedasticity, the ML and GMM estimators are generally not consistent. The ML estimator is inconsistent if the heteroskedasticity is not incorporated into the estimation. For a SARAR(1,0), Lin and Lee (2010) show that the likelihood function is not maximized at the true parameter values in the presence of unknown heteroskedasticity. The GMM estimators are also inconsistent, since the moment functions are often designed under the assumption that the disturbances are i.i.d.; hence, the orthogonality conditions for the moment functions might not be satisfied.

To handle unknown heteroskedasticity, Kelejian and Prucha (2010) extend their estimation approach by modifying the moment functions for the case of a SARAR(1,1). Badinger and Egger (2011) extend the robust estimation approach in Kelejian and Prucha (2010) to the case of a SARAR(p,q). Likewise, Lin and Lee (2010) suggest a one-step robust GMM estimator for the model with spatial dependence only in the dependent variable. [3]

In the present study, the one-step robust GMM estimation approach suggested by Lin and Lee (2010) is extended to the spatial model with a spatial autoregressive process in both the dependent variable and the disturbance term, under the assumption that there is an unknown form of heteroskedasticity in the disturbance term. We show that the ML estimator might not be consistent in the presence of unknown heteroskedasticity, as the probability limits of the first order conditions evaluated at the true parameter values are generally not zero. We show that the necessary condition for the consistency of the ML estimator of the spatial autoregressive parameters depends on the structure of the spatial weight matrices.
Then, a robust GMM estimator is derived from a set of moment functions composed of both linear and quadratic moment functions. The consistency of the estimator is established and its asymptotic distribution is determined. Finite sample properties are compared with those of other estimators through a comprehensive Monte Carlo simulation.

[2] For an example, see the empirical application in Lin and Lee.
[3] For a robust 2SLS estimator of SARAR(1,0), see Anselin.

This paper is organized as follows. In Section 2, the theoretical motivation for the case of a SARAR(1,1) is provided along with the model assumptions and their implications. In Section 3, the GMM estimators that have been suggested in the literature are reviewed. In Section 4.1, we show the inconsistency of the ML estimator in the presence of unknown heteroskedasticity and determine the asymptotic bias of the parameters of the exogenous variables. In Section 4.2, a robust GMM estimation method is considered for the case of a SARAR(1,1). The identification conditions are determined, and the main large sample properties of the robust GMM estimator are stated in three propositions. The Monte Carlo simulations are carried out in Section 5. Section 6 closes with concluding remarks.

1.2 The Model Specification and Theoretical Motivation

In the literature, spatial dependence in regression specifications is grouped into two broad categories, known as spatial lag and spatial error models. The spatial lag model includes functional forms in which the dependent variable at a point in space depends on the dependent variables of surrounding locations. The equilibrium outcome of theoretical economic models of interacting spatial units motivates this kind of specification. In spatial error models, cross-sectional correlations among the error terms are incorporated into the specification and estimation of models. Measurement error in data usually tends to vary systematically over space, which causes spatial dependence among the error terms of a specification. [4]

In this study, the following first order SARAR(1,1) specification is considered:

$$Y_n = \lambda_0 W_n Y_n + X_n \beta_0 + u_n, \qquad u_n = \rho_0 M_n u_n + \varepsilon_n, \tag{1.1}$$

where $Y_n$ is an $n \times 1$ vector of observations for the dependent variable, $X_n$ is an $n \times k$ matrix of nonstochastic exogenous variables, $W_n$ and $M_n$ are $n \times n$ spatial weight matrices of known constants with zero diagonal elements, and $\varepsilon_n$ is an $n \times 1$ vector of disturbances (or innovations).
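As a rough illustration of how data from the SARAR(1,1) specification above can be generated, the following Python sketch solves the two equations for $u_n$ and $Y_n$ in turn. The ring-shaped weight matrix, the parameter values, the variance pattern, and the function names `row_normalized_ring` and `simulate_sarar` are assumptions chosen for illustration only; they are not the dissertation's simulation design.

```python
import numpy as np

def row_normalized_ring(n):
    """Hypothetical weight matrix: each unit has two neighbours on a ring.
    The diagonal is zero and every row sums to one."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 1.0
    W[idx, (idx + 1) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_sarar(W, M, X, lam, rho, beta, sigma, rng):
    """Draw one sample from Y = lam*W*Y + X*beta + u, u = rho*M*u + eps,
    with independent N(0, sigma_i^2) innovations (normality is an assumption)."""
    n = W.shape[0]
    eps = sigma * rng.standard_normal(n)
    # Solve (I - rho*M) u = eps for the disturbance term.
    u = np.linalg.solve(np.eye(n) - rho * M, eps)
    # Solve (I - lam*W) Y = X*beta + u for the dependent variable.
    y = np.linalg.solve(np.eye(n) - lam * W, X @ beta + u)
    return y, u, eps

rng = np.random.default_rng(0)
n = 50
W = row_normalized_ring(n)
M = W                                    # assumption: same matrix for both lags
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta = np.array([1.0, 0.5])              # illustrative values
lam0, rho0 = 0.3, 0.2                    # illustrative values
sigma = rng.uniform(0.5, 2.0, n)         # heteroskedastic standard deviations
y, u, eps = simulate_sarar(W, M, X, lam0, rho0, beta, sigma, rng)
```

Solving the linear systems with `np.linalg.solve` avoids forming the matrix inverses explicitly, which is usually preferable for numerical stability.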
The variables $W_n Y_n$ and $M_n u_n$ are known, respectively, as the spatial lag of the dependent variable and of the disturbance term. The spatial effect parameters $\lambda_0$ and $\rho_0$ are known as the spatial autoregressive parameters. The above specification is fairly general in the sense that it allows for spatial spillovers in the dependent variable, exogenous variables and disturbances. [5] As spatial data are characterized by triangular arrays, the variables in (1.1) have the subscript $n$. [6]

[4] For the motivation of model specifications, see Anselin (1988, 2006) and LeSage and Pace.
[5] Elhorst (2010) names the model with spatial spillovers in the dependent variable, exogenous variable and disturbance term the Manski model. He states that the parameter estimates cannot be interpreted in a meaningful way for this kind of model, since the endogenous and exogenous effects cannot be distinguished from each other. See also Anselin.
[6] See Kelejian and Prucha.

Let $\Theta$ be the parameter space of the model. In order to distinguish the true parameter vector from other possible values in $\Theta$, the model is stated with the true parameter vector $\theta_0 = (\rho_0, \varsigma_0')'$ with $\varsigma_0 = (\lambda_0, \beta_0')'$. For notational simplicity, we denote $S_n(\lambda) = I_n - \lambda W_n$, $R_n(\rho) = I_n - \rho M_n$, $G_n(\lambda) = W_n S_n^{-1}(\lambda)$ and $H_n(\rho) = M_n R_n^{-1}(\rho)$. Also, at the true parameter values $(\rho_0, \lambda_0)$, we denote $S_n(\lambda_0) = S_n$, $R_n(\rho_0) = R_n$, $G_n(\lambda_0) = G_n$, $H_n(\rho_0) = H_n$ and $\bar{G}_n = R_n G_n R_n^{-1}$. Next, the assumptions that are required for the asymptotic properties of the estimators are stated, and their interpretations are considered for (1.1).

Assumption 1.1. The elements $\varepsilon_{ni}$ of the disturbance term $\varepsilon_n$ are distributed independently with mean zero and variance $\sigma_{ni}^2$, and $E|\varepsilon_{ni}|^{\nu} < \infty$ for some $\nu > 4$ for all $n$ and $i$.

This assumption allows independent and heteroskedastic disturbances. The elements of the disturbance term have moments higher than the fourth moment. This condition is specifically required for the application of the central limit theorem for the quadratic form given in Kelejian and Prucha (2010) for the GMM estimator. In addition, the variance of a quadratic form in $\varepsilon_n$ exists and is finite when the first four moments are finite. [7] Finally, Liapunov's inequality guarantees that the moments of order less than $\nu$ are also uniformly bounded for all $n$ and $i$.

[7] For the variance of the quadratic form in $\varepsilon_n$, see Lemma

Assumption 1.2. The spatial weight matrices $M_n$ and $W_n$ are uniformly bounded in absolute value in row and column sums. Moreover, $S_n^{-1}$, $S_n^{-1}(\lambda)$, $R_n^{-1}$ and $R_n^{-1}(\rho)$ exist and are uniformly bounded in absolute value in row and column sums for all values of $\rho$ and $\lambda$ in a compact parameter space.

In the literature, weight matrices are usually treated as exogenous and fixed. Lee (2004, 2007c) formulates the weight matrix as a function of the sample size. According to this formulation, the sequence of weight matrices $\{W_n\}$ is uniformly bounded in both row and column sums and its elements $w_{n,ij}$ are $O(1/h_n)$. The sequence $\{h_n\}$ can be bounded or divergent, with the property that $\lim_{n \to \infty} h_n/n = 0$, which implies that $h_n$ is allowed to diverge only at a rate slower than that of $n$. This formulation describes explicitly how the spatial weight matrix $W_n$ expands as the sample size increases.

For example, assume that an economy consists of $r$ regions and that each region is populated by $k$ agents. Then, the total number of observations from this economy is $n = rk$. In addition, in each region each agent is equally affected by the other agents of the same region, and there is no interaction among regions. Denote the row normalized spatial weight matrix of a region by $C_k$, which is given by $C_k = \frac{1}{k-1}(l_k l_k' - I_k)$, where $l_k$ is a $k$-dimensional vector of ones. Then, the spatial weight matrix $W_n$ for this economy is block diagonal, $W_n = I_r \otimes C_k$. Each off-diagonal element in a diagonal block is given by $\frac{1}{k-1}$, so that $w_{n,ij} = O(1/k)$. Then, $h_n/n = (k-1)/(kr) = O(1/r)$. Assume that the increase in $n$ is generated by the increase of both $r$ and $k$. Then, $h_n$ diverges to infinity while the fraction $h_n/n$ tends to zero. This kind of spatial weight matrix is used for large group interaction scenarios, which have important implications for the convergence rate of estimators (Lee). [8]

For large group interactions for which $\lim_{n \to \infty} h_n/n \neq 0$, consistency of estimators might not be available. As an example, Kelejian and Prucha (2002) and Yuzefovich et al. consider a row normalized spatial weight matrix that has equal weights for all observations. The spatial weight matrix is formulated as $W_n = \frac{1}{n-1}(l_n l_n' - I_n)$, where each off-diagonal element is $\frac{1}{n-1}$. In that case, $w_{n,ij} = O(1/n)$ and $\lim_{n \to \infty} h_n/n = \lim_{n \to \infty} (n-1)/n = 1$. With this specification, Kelejian and Prucha (2002) show that the OLS, 2SLS and ML estimators are inconsistent for spatial autoregressive models. In this study, we assume that $h_n$ is bounded.

The uniform boundedness of the terms in Assumptions 1.1 and 1.2 is motivated by the need to keep the spatial autocorrelation in the model at a tractable level (Kelejian and Prucha). [9] Assumption 1.2 also implies that the model in (1.1) represents an equilibrium relation for the dependent variable. By this assumption, the reduced form of the model becomes feasible as $Y_n = S_n^{-1} X_n \beta_0 + S_n^{-1} R_n^{-1} \varepsilon_n$. Finally, the statement of Assumption 1.2 is assumed to hold at the true and at arbitrary autoregressive parameter vectors. The uniform boundedness of $S_n^{-1}(\lambda)$ and $R_n^{-1}(\rho)$ is required for the ML estimator, not for the GMM estimator (Liu et al., 2010).

[8] For examples of this kind of weight matrix, see Case (1991).
[9] For a definition and some properties of uniform boundedness, see Kelejian and Prucha.

In the literature, the parameter space for the spatial autoregressive parameters $\lambda_0$ and $\rho_0$ is restricted to the interval $(-1, 1)$ when the spatial weight matrices are row normalized. [10] In that case, the matrices $S_n$ and $R_n$ are nonsingular. More general parameter spaces have also been considered in the literature. [11] Let $\nu_{jn}$ for $j = 1, \ldots, n$ be the eigenvalues of $W_n$. The spectral radius of $W_n$ is defined by $\tau_n = \max_j |\nu_{jn}|$. Then, $S_n$ is nonsingular for all values of $\lambda_0$ in the interval $(-1/\tau_n, 1/\tau_n)$. However, the computation of eigenvalues involves computational difficulties and becomes numerically unstable for spatial weight matrices with more than 1000 observations (Smirnov and Anselin, 2001).

Another formulation of the parameter space, based on the maximum row and column sums of the spatial weight matrices, is also considered in the literature. Denote by $R_i$ and $C_j$, respectively, the $i$th row sum and the $j$th column sum of $W_n$ in absolute value. Let the maximum row sum be given by $R = \max_i \sum_{j=1}^{n} |w_{n,ij}| = \max_i R_i$. Likewise, the maximum column sum is defined by $C = \max_j \sum_{i=1}^{n} |w_{n,ij}| = \max_j C_j$. Let $m = \max\{C, R\}$. Then, $S_n$ is nonsingular for all values of $\lambda_0$ in the interval $(-1/m, 1/m)$. [12]

The following assumptions are the usual regularity conditions required for the GMM estimator. Throughout this study, the vector of moment functions considered for the GMM estimator is of the form $g_n(\theta_0) = (\varepsilon_n' P_{1n} \varepsilon_n, \ldots, \varepsilon_n' P_{mn} \varepsilon_n, \varepsilon_n' Q_n)'$. The moment functions involving the constant matrices $P_{jn}$ for $j = 1, \ldots, m$ are known as quadratic moment functions. The last moment function, $Q_n' \varepsilon_n$, is the linear moment function, where the full column rank matrix $Q_n$ is $n \times r$ with $r \geq k + 1$. The matrices $P_{jn}$ and $Q_n$ are chosen in such a way that the orthogonality conditions of the population moment functions are not violated.
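The group-interaction weight matrix $W_n = I_r \otimes C_k$ and the two nonsingularity intervals discussed above can be checked numerically. The sketch below is a minimal illustration assuming $r = 4$ regions with $k = 5$ agents each (values chosen only for the example); the helper name `is_nonsingular` is hypothetical.

```python
import numpy as np

r, k = 4, 5                      # illustrative numbers of regions and agents
n = r * k

# Row normalized within-region matrix C_k = (l_k l_k' - I_k) / (k - 1).
C_k = (np.ones((k, k)) - np.eye(k)) / (k - 1)
# Block diagonal weight matrix W_n = I_r (Kronecker product) C_k.
W = np.kron(np.eye(r), C_k)

# Spectral radius tau_n = max_j |nu_j|.
tau = np.max(np.abs(np.linalg.eigvals(W)))
# m = max of the largest absolute row sum and the largest absolute column sum.
max_row_sum = np.abs(W).sum(axis=1).max()
max_col_sum = np.abs(W).sum(axis=0).max()
m = max(max_row_sum, max_col_sum)

def is_nonsingular(lam):
    """Check whether S_n(lam) = I_n - lam * W_n has full rank."""
    return np.linalg.matrix_rank(np.eye(n) - lam * W) == n
```

For this row normalized, symmetric example both bounds equal one, so the two intervals coincide with $(-1, 1)$: `is_nonsingular(0.9)` holds, while `is_nonsingular(1.0)` fails because $W_n$ has an eigenvalue equal to one. In general the spectral radius never exceeds a maximum absolute row or column sum, so the interval based on $m$ is contained in the one based on $\tau_n$ and avoids an eigenvalue computation.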
Let $\mathcal{P}_{1n}$ be the class of $n \times n$ constant matrices with zero trace and $\mathcal{P}_{2n}$ be the class of $n \times n$ constant matrices with zero diagonal elements. [13] The quadratic moment functions involving matrices from both classes satisfy the orthogonality conditions when the disturbance terms are i.i.d. As will be shown, when the disturbance terms are merely independent, matrices from the class $\mathcal{P}_{1n} \setminus \mathcal{P}_{2n}$ cannot be used to form quadratic moment functions. [14] Assumption 1.4 states regularity conditions for these matrices, and the last assumption characterizes the parameter space.

[10] Kelejian and Prucha (2010) state that the interval $(-1, 1)$ is not natural, in the sense that equivalent model formulations are possible by applying an arbitrary scale factor to the autoregressive parameters and its inverse to the weight matrices; the parameter space therefore depends on the scaling factor.
[11] Elhorst (2012) outlines a simple procedure for finding the parameter space for models with multiple spatial weight matrices.
[12] For a proof of this result, see Kelejian and Prucha.
[13] Note that $\mathcal{P}_{2n}$ is a subclass of $\mathcal{P}_{1n}$, i.e., $\mathcal{P}_{2n} \subset \mathcal{P}_{1n}$.
[14] Here, $\mathcal{P}_{1n} \setminus \mathcal{P}_{2n}$ denotes the set-theoretic difference of $\mathcal{P}_{1n}$ and $\mathcal{P}_{2n}$.

Assumption 1.3. The regressor matrix $X_n$ is an $n \times k$ matrix consisting of uniformly bounded constant elements. It has full column rank $k$. Moreover, $\lim_{n \to \infty} \frac{1}{n} X_n' X_n$ exists and is nonsingular.

Assumption 1.4. The IV matrix $Q_n$ has $r \geq k + 1$ linearly independent columns and its elements are uniformly bounded. The matrices $P_{jn}$ for $j = 1, \ldots, m$ are uniformly bounded in absolute value in row and column sums.

Assumption 1.5. The parameter space $\Theta$ is a compact subset of $\mathbb{R}^{k+2}$ and $\theta_0$ is in the interior of $\Theta$.

1.3 GMM Estimation of Spatial Autoregressive Models

The GMM estimation approach depends on the moment functions that are derived from the structure of the model. The endogenous variable $W_n Y_n$ on the right hand side of the model is given more explicitly by

$$W_n Y_n = W_n S_n^{-1} X_n \beta_0 + W_n S_n^{-1} R_n^{-1} \varepsilon_n = G_n X_n \beta_0 + G_n R_n^{-1} \varepsilon_n,$$

where $G_n = W_n S_n^{-1} = W_n (I_n - \lambda_0 W_n)^{-1}$ exists by Assumption 1.2. Thus, $W_n Y_n$ is a function of a non-stochastic term $G_n X_n \beta_0$ and a stochastic term $G_n R_n^{-1} \varepsilon_n$. Lee (2001, 2007a), Liu et al. (2010), Lee and Liu (2010) and Lin and Lee (2010) form moment functions based on the stochastic and non-stochastic terms. The non-stochastic term is instrumented by $Q_n = [R_n G_n X_n \beta_0, R_n X_n]$, which forms the linear moment function $Q_n' \varepsilon_n$. The linear moment matrix $Q_n$ is constructed from the expectation of $Z_n = [W_n Y_n, X_n]$. Given consistent initial estimates of $\lambda_0$, $\rho_0$ and $\beta_0$, the IV matrix $Q_n$ becomes available. Lee (2003) shows that the 2SLS estimator with $Q_n$ is best in the sense that its asymptotic variance-covariance matrix is the smallest among the class of 2SLS estimators based on linear moment conditions.

The stochastic part $G_n R_n^{-1} \varepsilon_n$ of $W_n Y_n$ is instrumented by $P_{jn} \varepsilon_n$, where $P_{jn} \in \mathcal{P}_{1n}$ and/or $P_{jn} \in \mathcal{P}_{2n}$ for $j = 1, \ldots, m$. In this case, the quadratic moment is of the form $\varepsilon_n' P_{jn} \varepsilon_n$, and the orthogonality (population moment) condition is satisfied when the disturbances are simply i.i.d. In that case, $E(\varepsilon_n' P_{jn} \varepsilon_n) = \mathrm{tr}(P_{jn} E(\varepsilon_n \varepsilon_n')) = 0$ for $P_{jn}$'s from either $\mathcal{P}_{1n}$ or $\mathcal{P}_{2n}$. [15] For both the stochastic and the non-stochastic term, the IVs are constructed in such a way that they are correlated with $W_n Y_n$ but uncorrelated with $\varepsilon_n$. [16] The consistency of the GMM estimator does not depend on a particular $P_{jn}$, but the asymptotic variance-covariance matrix is a function of the $P_{jn}$'s. Therefore, for the selection of the $P_{jn}$'s, the asymptotic efficiency of the estimators needs to be considered. Liu et al. (2010) and Lee and Liu (2010) provide the best selection of $P_{jn} \in \mathcal{P}_{1n}$ in the case of a SARAR(1,1) and a SARAR(p,q), respectively. [17] In the case of a SARAR(1,1) with i.i.d. normal innovations, the best selection is (1) $P_{1n} = R_n G_n R_n^{-1} - \frac{1}{n} \mathrm{tr}(R_n G_n R_n^{-1}) I_n$, and (2) $P_{2n} = H_n - \frac{1}{n} \mathrm{tr}(H_n) I_n$. Let $g_n(\theta) = (\varepsilon_n'(\theta) P_{1n} \varepsilon_n(\theta), \varepsilon_n'(\theta) P_{2n} \varepsilon_n(\theta), \varepsilon_n'(\theta) Q_n)'$ be the set of sample moment functions. Liu et al. (2010) show that, given the set of moment functions $g_n(\theta)$, any other moment function that can be added to this set is redundant. They also show that the ML estimator is characterized by the set of moment functions $g_n(\theta)$; therefore, the GMM estimator based on these moment functions is asymptotically equivalent to the ML estimator. When the innovations are simply i.i.d., Liu et al. (2010) suggest another best set of quadratic moment functions, so that the optimal GMM estimator is asymptotically more efficient than the quasi ML estimator.

When the disturbances are independent and heteroskedastic, some matrices $P_{jn}$ with the zero trace property cannot be used in the formation of the quadratic moment functions. Let $\Sigma_n = D(\sigma_{n1}^2, \ldots, \sigma_{nn}^2)$ be the diagonal variance matrix of the disturbance terms.
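If $P_{jn} \in \mathcal{P}_{1n} \setminus \mathcal{P}_{2n}$ for any $j = 1, \ldots, m$, then in general $E(\varepsilon_n' P_{jn} \varepsilon_n) = \mathrm{tr}(P_{jn} E(\varepsilon_n \varepsilon_n')) = \mathrm{tr}(P_{jn} \Sigma_n) \neq 0$. On the other hand, a $P_{jn}$ with the zero diagonal property is still available for the formation of the quadratic moments, since $\mathrm{tr}(P_{jn} E(\varepsilon_n \varepsilon_n')) = \mathrm{tr}(P_{jn} \Sigma_n) = 0$ for any $P_{jn} \in \mathcal{P}_{2n}$. Thus, the class of matrices with zero diagonal elements provides robustness to heteroskedasticity.

[15] $\mathrm{tr}(\cdot)$ returns the sum of the diagonal elements of an input matrix.
[16] Note that $\mathrm{cov}(Q_n, \varepsilon_n) = 0_{k+1}$ and $\mathrm{cov}(P_{jn} \varepsilon_n, \varepsilon_n) = 0$.
[17] Liu et al. (2010) also consider the best GMM estimation for the case of a SARAR(1,0) and a SARAR(0,1).

Lin and Lee (2010) extend the GMM estimation method in Lee (2001, 2007a) to a SARAR(1,0) that has an unknown form of heteroskedasticity in the innovations. The quadratic moment functions are based on the class $\mathcal{P}_{2n}$. Let $\varsigma_0 = (\lambda_0, \beta_0')'$ be the parameter vector of the model. Lin and Lee (2010) suggest the set of moment functions $g_n(\varsigma) = (\varepsilon_n'(\varsigma) P_n \varepsilon_n(\varsigma), \varepsilon_n'(\varsigma) Q_{2n})'$, where $P_n = G_n - D(G_n) \in \mathcal{P}_{2n}$ and $Q_{2n} = [G_n X_n \beta_0, X_n]$. [18] The optimal robust GMM estimator derived from $\min_{\varsigma \in \Theta} g_n'(\varsigma) \hat{\Omega}_n^{-1} g_n(\varsigma)$ is consistent and asymptotically normally distributed. Here, $\hat{\Omega}_n$ is an estimate of $\mathrm{var}(g_n(\varsigma_0)) = \Omega_n$ based on an initial consistent estimator of $\varsigma_0$. For the heteroskedastic case, the best selection of $P_{jn}$ is not available. Lin and Lee (2010) suggest that the selection from $\mathcal{P}_{2n}$ for the simply i.i.d. case can be used for the case of independently distributed disturbance terms. Thus, consistent estimates of $G_n - D(G_n)$ and $[G_n X_n \beta_0, X_n]$ are used in $g_n(\varsigma)$ for the robust optimal GMM estimator.

The computationally simple two-step GMM estimation approach in Kelejian and Prucha (1998, 1999) for the case of a SARAR(1,1) is based on two quadratic moment matrices from $\mathcal{P}_{1n}$: (1) $P_{1n} = v \left[ M_n' M_n - \frac{1}{n} \mathrm{tr}(M_n' M_n) I_n \right]$ with $v = 1 / \left[ 1 + \left( \frac{1}{n} \mathrm{tr}(M_n' M_n) \right)^2 \right]$, and (2) $P_{2n} = M_n$. When the innovations are heteroskedastic, the orthogonality condition of the quadratic moment function based on $P_{1n}$ is violated; therefore, Kelejian and Prucha (2010) consider a quadratic moment matrix from the class $\mathcal{P}_{2n}$. In that case, the first moment is formed with $P_{1n} = M_n' M_n - D(M_n' M_n)$. The linear moment conditions in Kelejian and Prucha (1998) are based on the linearly independent columns of the set $Q_{3n} = [X_n, W_n X_n, W_n^2 X_n, \ldots, M_n W_n X_n, M_n W_n^2 X_n, \ldots]$. The IV matrix $Q_{3n}$ provides an approximation for $E(Z_n)$ and $E(M_n Z_n)$.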
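The robustness argument above can be illustrated numerically: the population moment $\mathrm{tr}(P \Sigma_n)$ vanishes for a zero-diagonal matrix under any diagonal $\Sigma_n$, but generally not for a matrix that only has zero trace. The sketch below is an illustration under assumed inputs; the random weight matrix, the value $\lambda = 0.4$, the variance draws, and the names `P_zero_trace` and `P_zero_diag` are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30

# Hypothetical weight matrix: random positive weights, zero diagonal, row normalized.
A = rng.uniform(size=(n, n))
np.fill_diagonal(A, 0.0)
W = A / A.sum(axis=1, keepdims=True)

lam = 0.4                                   # illustrative autoregressive parameter
G = W @ np.linalg.inv(np.eye(n) - lam * W)  # G_n = W_n S_n^{-1}

# Zero trace but nonzero diagonal: a member of P_1n \ P_2n.
P_zero_trace = G - (np.trace(G) / n) * np.eye(n)
# Zero diagonal: a member of P_2n.
P_zero_diag = G - np.diag(np.diag(G))

# Diagonal variance matrix with unequal variances (heteroskedasticity).
Sigma = np.diag(rng.uniform(0.5, 3.0, n))

moment_iid = np.trace(P_zero_trace @ (2.0 * np.eye(n)))   # homoskedastic case
moment_het_trace = np.trace(P_zero_trace @ Sigma)         # zero-trace matrix, heteroskedastic case
moment_het_diag = np.trace(P_zero_diag @ Sigma)           # zero-diagonal matrix, heteroskedastic case

print(moment_iid, moment_het_trace, moment_het_diag)
```

With equal variances the zero-trace moment is zero up to rounding error; with unequal variances only the zero-diagonal version remains zero, which is why the robust moment functions are built from $\mathcal{P}_{2n}$.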
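18 $D(\cdot)$ is an operator that creates a matrix from the diagonal elements of an input matrix.

For the illustration of the two-step GMM estimation approach of Kelejian and Prucha 2010, let $g_{1n}(\rho, \tilde\varsigma) = (\varepsilon_n'(\theta)P_{1n}\varepsilon_n(\theta),\ \varepsilon_n'(\theta)P_{2n}\varepsilon_n(\theta))'$ be the set of sample moment functions, and let $\tilde\varsigma$ be an initial consistent estimator based on the instrument matrix $Q_{3n}$. The optimal GMM estimator of $\rho_0$ is defined as $\hat\rho_n = \operatorname{argmin}_{\rho}\ g_{1n}'(\rho, \tilde\varsigma)\,\hat\Psi_n^{+}\,g_{1n}(\rho, \tilde\varsigma)$, where $\hat\Psi_n$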

is an estimator of the variance matrix of the limiting distribution of the normalized sample moment $g_{1n}(\rho, \tilde\varsigma)$.¹⁹ The estimator $\hat\rho_n$ is used for the two-step GMM estimator of $\varsigma_0$, which is based on the linear instrument matrix $Q_{3n}$. Let $g_{2n}(\hat\rho_n, \varsigma) = Q_{3n}'\varepsilon_n(\hat\rho_n, \varsigma)$ be the sample moment function, where $\varepsilon_n(\hat\rho_n, \varsigma) = R_n(\hat\rho_n)S_n(\lambda)Y_n - R_n(\hat\rho_n)X_n\beta$. Then the optimal two-step GMM estimator of $\varsigma_0$ is defined by $\hat\varsigma_n = \operatorname{argmin}_{\varsigma}\ g_{2n}'(\hat\rho_n, \varsigma)\,\Upsilon_n^{-1}\,g_{2n}(\hat\rho_n, \varsigma)$, where $\Upsilon_n = Q_{3n}'Q_{3n}$.²⁰

As illustrated, the estimation approach in Kelejian and Prucha 1998, Kelejian and Prucha 2010 and Drukker et al. 2012 is characterized by a sequential two-step GMM estimation method.²¹ The sequential GMM estimation is motivated by computational simplicity, as ML estimation involves a significant computational burden for large samples. In addition, the Kelejian-Prucha methodology does not involve the computation of the inverse of the matrix $S_n$ in the GMM framework. A possible disadvantage of the two-step GMM approach is that the resulting estimators may be inefficient relative to the joint (one-step) GMM estimator derived by using the complete set of moment functions with an optimal weight matrix (Lee, 2007c; Lee and Liu).²²

19 For the explicit form of $\hat\Psi_n$, see Arraiz et al. Note that $\tilde\varsigma$ can be updated by using the weight matrix $I$ for an initial first-step $\hat\rho_n$.
20 The estimator $\hat\varsigma_n$ has been called the feasible generalized spatial two-stage least squares (FGS2SLS) estimator.
21 For the description of the estimation steps, see Arraiz et al. 2010 and Drukker et al.
22 For a different approach to the GMM estimation method, see Conley.

1.4 Estimation Approach under Unknown Heteroskedasticity

In this section, we consider GMM and ML estimation of spatial autoregressive models with heteroskedastic disturbances. In the first subsection, the necessary condition for the consistency of the ML estimator is studied. The results show that the ML estimator of the autoregressive parameters is generally inconsistent when heteroskedasticity is not incorporated into estimation. The next subsection covers a robust GMM estimation method for a spatial model with spatial dependence in the dependent variable and in the disturbance term. The results indicate that the robust GMM estimator is consistent and asymptotically

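normally distributed.

1.4.1 The Inconsistency of the Maximum Likelihood Estimator

Lin and Lee 2010 show that the MLE is inconsistent for the case of a SARAR(1,0). In this section, we show that the ML estimator is also inconsistent for the spatial model in 3.3 when there is an unknown form of heteroskedasticity in the innovation terms. Let $\zeta = (\theta', \sigma^2)'$ with $\theta = (\rho, \lambda, \beta')'$. The log-likelihood of the model in 3.3 under the assumption that the disturbances are i.i.d. $N(0, \sigma_0^2)$ is given by

$$\ln L_n(\zeta) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 + \ln|S_n(\lambda)| + \ln|R_n(\rho)| - \frac{1}{2\sigma^2}\left[S_n(\lambda)Y_n - X_n\beta\right]'R_n'(\rho)R_n(\rho)\left[S_n(\lambda)Y_n - X_n\beta\right]. \tag{1.2}$$

For notational simplicity, let $R_n(\rho)X_n = X_n(\rho)$ and $M_n(\rho) = I_n - P_n(\rho)$, where $P_n(\rho) = X_n(\rho)\left[X_n'(\rho)X_n(\rho)\right]^{-1}X_n'(\rho)$, and $\delta = (\rho, \lambda)'$. (The argument $\rho$ distinguishes the projection matrix $M_n(\rho)$ from the weight matrix $M_n$.) Note that $X_n'(\rho)M_n(\rho) = 0_{k\times n}$ and $M_n(\rho)X_n(\rho) = 0_{n\times k}$. The solution of the first order conditions for $\beta$ and $\sigma^2$ yields the following ML estimators.²³

$$\hat\beta_n(\delta) = \left[X_n'(\rho)X_n(\rho)\right]^{-1}X_n'(\rho)R_n(\rho)S_n(\lambda)Y_n, \tag{1.3a}$$

$$\hat\sigma_n^2(\delta) = \frac{1}{n}\varepsilon_n'(\theta)\varepsilon_n(\theta) = \frac{1}{n}Y_n'S_n'(\lambda)R_n'(\rho)M_n(\rho)R_n(\rho)S_n(\lambda)Y_n. \tag{1.3b}$$

For a given value of $\delta$, the ML estimators $\hat\beta_n(\delta)$ and $\hat\sigma_n^2(\delta)$ can be seen as OLS estimators from the regression equation $R_n(\rho)S_n(\lambda)Y_n = R_n(\rho)X_n\beta + \varepsilon_n$. Substitution of $R_n(\rho)S_n(\lambda)Y_n = R_n(\rho)X_n\beta + \varepsilon_n$ into $\hat\sigma_n^2(\delta)$ yields $\hat\sigma_n^2(\delta) = \frac{1}{n}\varepsilon_n'M_n(\rho)\varepsilon_n$. For the asymptotic argument of this section, we modify Assumption 1.3.

Assumption 1.6. The exogenous variables matrix $X_n$ is an $n \times k$ matrix consisting of constant elements that are uniformly bounded. It has full column rank $k$. Moreover, $\lim_{n\to\infty}\frac{1}{n}X_n'X_n$ and $\lim_{n\to\infty}\frac{1}{n}X_n'R_n'(\rho)R_n(\rho)X_n$ exist and are nonsingular for all values of $\rho$ in $\Theta$.

23 The first order conditions from 1.2 are given in the Appendix.

The compact parameter space contains $\rho_0$ by Assumption 1.5; therefore the modified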

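assumption also requires a finite and nonsingular limit for the term $\frac{1}{n}X_n'R_n'R_nX_n$. With this new assumption, the orders of certain terms can be obtained via the asymptotic analysis given in Appendix 3.4. From 1.3b, we have

$$\operatorname{plim}\,\hat\sigma_n^2(\delta_0) = \operatorname{plim}\,\frac{1}{n}\varepsilon_n'\varepsilon_n - \operatorname{plim}\,\frac{1}{n^2}\varepsilon_n'X_n(\rho_0)\left[\frac{1}{n}X_n'(\rho_0)X_n(\rho_0)\right]^{-1}X_n'(\rho_0)\varepsilon_n. \tag{1.4}$$

The first term on the right-hand side converges to $\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\sigma_i^2$ by Chebyshev's weak law of large numbers. The second term vanishes in probability, so that the average of the variances of the disturbance terms is asymptotically equivalent to $\hat\sigma_n^2(\delta_0)$, namely, $\hat\sigma_n^2(\delta_0) = \frac{1}{n}\sum_{i=1}^n\sigma_i^2 + o_p(1)$.²⁴ The concentrated log-likelihood function is obtained by substituting $\hat\beta_n(\delta)$ and $\hat\sigma_n^2(\delta)$ into 1.2:

$$\ln L_n(\delta) = -\frac{n}{2}\left(\ln(2\pi) + 1\right) - \frac{n}{2}\ln\hat\sigma_n^2(\delta) + \ln|S_n(\lambda)| + \ln|R_n(\rho)|. \tag{1.5}$$

The MLE $\hat\delta_n = (\hat\lambda_n, \hat\rho_n)'$ is the extremum estimator derived from the concentrated log-likelihood function. The first order conditions of the concentrated log-likelihood function with respect to $\rho$ and $\lambda$ are given by

$$\frac{\partial\ln L_n(\delta)}{\partial\rho} = -\frac{n}{2\hat\sigma_n^2(\delta)}\frac{\partial\hat\sigma_n^2(\delta)}{\partial\rho} - \operatorname{tr}\left(H_n(\rho)\right), \tag{1.6}$$

$$\frac{\partial\ln L_n(\delta)}{\partial\lambda} = -\frac{n}{2\hat\sigma_n^2(\delta)}\frac{\partial\hat\sigma_n^2(\delta)}{\partial\lambda} - \operatorname{tr}\left(G_n(\lambda)\right), \tag{1.7}$$

where $G_n(\lambda) = W_nS_n^{-1}(\lambda)$ and $H_n(\rho) = M_nR_n^{-1}(\rho)$. The consistency of the MLE $\hat\delta_n$ requires that the first order conditions evaluated at the true parameter value $\delta_0$ converge in probability to zero, i.e., $\operatorname{plim}\frac{1}{n}\frac{\partial\ln L_n(\delta_0)}{\partial\delta} = 0$. Writing $H_n = H_n(\rho_0)$, $G_n = G_n(\lambda_0)$ and $\bar G_n = R_nG_nR_n^{-1}$, this necessary condition for the consistency of the ML estimator of $\delta_0$ is

$$\frac{1}{n}\frac{\partial\ln L_n(\delta_0)}{\partial\delta} = \begin{pmatrix} \dfrac{\sum_{i=1}^n H_{n,ii}\sigma_i^2}{\sum_{i=1}^n\sigma_i^2} - \dfrac{1}{n}\operatorname{tr}(H_n) + o_p(1) \\[2ex] \dfrac{\sum_{i=1}^n \bar G_{n,ii}\sigma_i^2}{\sum_{i=1}^n\sigma_i^2} - \dfrac{1}{n}\operatorname{tr}(\bar G_n) + o_p(1) \end{pmatrix}. \tag{1.8}$$

24 For the asymptotic argument, see Appendix 3.4.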
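The concentration step that leads from 1.2 to 1.5 can be illustrated with a short numerical sketch. This is not the author's Matlab routine; the function names, the circular weight matrix, and all parameter values are assumptions chosen for illustration. For a given $(\rho, \lambda)$ the code computes $\hat\beta$ and $\hat\sigma^2$ by OLS on the transformed regression and checks that the full log-likelihood evaluated at these values coincides with the concentrated one.

```python
import numpy as np

def concentrated_loglik(delta, Y, X, W, M):
    """Concentrated Gaussian log-likelihood of a SARAR(1,1) model at delta = (rho, lam)."""
    rho, lam = delta
    n = Y.shape[0]
    I = np.eye(n)
    S, R = I - lam * W, I - rho * M
    y_t, X_t = R @ S @ Y, R @ X                      # transformed regression
    beta_hat, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)
    resid = y_t - X_t @ beta_hat
    sigma2_hat = resid @ resid / n
    # slogdet returns (sign, log|det|); the sign is positive for admissible (rho, lam).
    logdet = np.linalg.slogdet(S)[1] + np.linalg.slogdet(R)[1]
    value = -0.5 * n * (np.log(2 * np.pi) + 1) - 0.5 * n * np.log(sigma2_hat) + logdet
    return value, beta_hat, sigma2_hat

def full_loglik(rho, lam, beta, sigma2, Y, X, W, M):
    """Unconcentrated Gaussian log-likelihood, used here only as a cross-check."""
    n = Y.shape[0]
    I = np.eye(n)
    S, R = I - lam * W, I - rho * M
    e = R @ (S @ Y - X @ beta)
    logdet = np.linalg.slogdet(S)[1] + np.linalg.slogdet(R)[1]
    return (-0.5 * n * np.log(2 * np.pi) - 0.5 * n * np.log(sigma2)
            + logdet - e @ e / (2 * sigma2))

# Hypothetical data: circular world with one neighbour ahead and one behind.
rng = np.random.default_rng(1)
n = 40
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
M = W
X = rng.standard_normal((n, 3))
eps = rng.standard_normal(n)
Y = np.linalg.solve(np.eye(n) - 0.3 * W,
                    X @ np.array([0.7, 0.4, 1.2]) + np.linalg.solve(np.eye(n) - 0.3 * M, eps))

conc_value, beta_hat, sigma2_hat = concentrated_loglik((0.2, 0.1), Y, X, W, M)
full_value = full_loglik(0.2, 0.1, beta_hat, sigma2_hat, Y, X, W, M)
```

In practice the concentrated function would be passed to a numerical optimizer over $(\rho, \lambda)$; that step is omitted here.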

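Denote $\bar\sigma^2 = \frac{1}{n}\sum_{i=1}^n\sigma_i^2$, $\bar h_n = \frac{1}{n}\operatorname{tr}(H_n) = \frac{1}{n}\sum_{i=1}^n H_{n,ii}$, and $\bar g_n = \frac{1}{n}\operatorname{tr}(\bar G_n) = \frac{1}{n}\sum_{i=1}^n\bar G_{n,ii}$, where $\bar G_n = R_nG_nR_n^{-1}$. Then, 1.8 can be written in a more convenient form²⁵

$$\frac{1}{n}\frac{\partial\ln L_n(\delta_0)}{\partial\delta} = \begin{pmatrix} \dfrac{\frac{1}{n}\sum_{i=1}^n(H_{n,ii} - \bar h_n)(\sigma_i^2 - \bar\sigma^2)}{\bar\sigma^2} + o_p(1) \\[2ex] \dfrac{\frac{1}{n}\sum_{i=1}^n(\bar G_{n,ii} - \bar g_n)(\sigma_i^2 - \bar\sigma^2)}{\bar\sigma^2} + o_p(1) \end{pmatrix} = \begin{pmatrix} \dfrac{\operatorname{cov}(H_{n,ii}, \sigma_i^2)}{\bar\sigma^2} + o_p(1) \\[2ex] \dfrac{\operatorname{cov}(\bar G_{n,ii}, \sigma_i^2)}{\bar\sigma^2} + o_p(1) \end{pmatrix}, \tag{1.9}$$

which shows that the ML estimators $\hat\lambda_n$ and $\hat\rho_n$ are inconsistent unless $\operatorname{cov}(H_{n,ii}, \sigma_i^2)/\bar\sigma^2 = 0$ and $\operatorname{cov}(\bar G_{n,ii}, \sigma_i^2)/\bar\sigma^2 = 0$. The inconsistency of $\hat\lambda_n$ and $\hat\rho_n$ depends on the covariance between the variances of the elements of the disturbance terms and the diagonal elements of $H_n$ and $\bar G_n$. It is obvious that when $\varepsilon_n$ is homoskedastic, $\frac{1}{n}\frac{\partial\ln L_n(\delta_0)}{\partial\delta}$ is $o_p(1)$, as $\sigma_i^2 = \bar\sigma^2$ for $i = 1, \ldots, n$. This result also holds for the trivial case of $\rho_0 = \lambda_0 = 0$. Intuitively, the result in 1.9 indicates that the concentrated log-likelihood function is not maximized at the true parameter vector when the disturbance terms have unknown heteroskedasticity.

The ML estimator of $\beta_0$ in 1.3a is also inconsistent, since it is a function of the inconsistent estimators $\hat\lambda_n$ and $\hat\rho_n$. Explicitly,

$$\begin{aligned}\hat\beta_n(\hat\delta_n) = \beta_0 &+ (\lambda_0 - \hat\lambda_n)\,D_n(\hat\rho_n)X_n'(\rho_0)\bar G_nX_n(\rho_0)\beta_0 + (\lambda_0 - \hat\lambda_n)(\rho_0 - \hat\rho_n)\,D_n(\hat\rho_n)X_n'(\rho_0)H_n^s\bar G_nX_n(\rho_0)\beta_0 \\ &+ (\lambda_0 - \hat\lambda_n)(\rho_0 - \hat\rho_n)^2D_n(\hat\rho_n)X_n'(\rho_0)H_n'H_n\bar G_nX_n(\rho_0)\beta_0 + o_p(1),\end{aligned} \tag{1.10}$$

where $D_n(\hat\rho_n) = \left[X_n'(\hat\rho_n)X_n(\hat\rho_n)\right]^{-1}$ and $H_n^s = H_n + H_n'$.²⁶ The above result shows that the asymptotic bias of $\hat\beta_n(\hat\delta_n)$ depends on the weight matrices and the regressors matrix, and is not zero unless the autoregressive parameters are consistently estimated. For the special case of $\hat\lambda_n = \lambda_0 + o_p(1)$, the inconsistency of $\hat\rho_n$ has no effect on the asymptotic bias of $\hat\beta_n(\hat\delta_n)$, so that $\hat\beta_n(\hat\delta_n) = \beta_0 + o_p(1)$.

25 Note that $\bar\sigma^2 = \frac{1}{n}\sum_{i=1}^n\sigma_i^2$ is the average of the variances of the disturbance terms, and $\operatorname{tr}(\bar G_n) = \operatorname{tr}(G_n)$.
26 For the asymptotic argument, see the Appendix.

For the spatial autoregressive model, where $\rho_0 = 0$ in 3.3, the result in the second row of 1.9 simplifies to $\frac{1}{n}\frac{\partial\ln L_n(\lambda_0)}{\partial\lambda} = \operatorname{cov}(G_{n,ii}, \sigma_i^2)/\bar\sigma^2 + o_p(1)$, since $\bar G_n = G_n$. The term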

$D_n(\hat\rho_n)X_n'(\rho_0)\bar G_nX_n(\rho_0)\beta_0$ in 1.10 simplifies to $(X_n'X_n)^{-1}X_n'G_nX_n\beta_0$, so that $\hat\beta_n(\hat\lambda_n) = \beta_0 + (\lambda_0 - \hat\lambda_n)(X_n'X_n)^{-1}X_n'G_nX_n\beta_0 + o_p(1)$, which is the exact result stated in Lin and Lee 2010.

The concentrated log-likelihood function is nonlinear in $\delta$, which makes it hard to draw any general conclusion about the asymptotic bias of the MLE $\hat\delta_n = (\hat\lambda_n, \hat\rho_n)'$. For the spatial autoregressive model, Lin and Lee 2010 investigate the asymptotic bias of $\hat\beta_n(\hat\lambda_n)$ for a case of group interactions, where $W_n$ is assumed to be a block-diagonal matrix such that each block has a different number of units and each unit is equally affected by the other units. Lin and Lee 2010 show that when the covariates are i.i.d. with mean zero for all blocks, the asymptotic bias of the intercept is larger than that of the other coefficients, and the biases of all coefficients are negatively related to the average block size.

The specification in 3.3 with $\lambda_0 = 0$ is called the spatial error model (SEM) or SARAR(0,1) in the literature (LeSage and Pace). For this model, the necessary condition for the consistency of the ML estimator of $\rho_0$ is not satisfied, since the result in the first row of 1.9 is generally not zero. The MLE of $\beta_0$ for the SEM is given by $\hat\beta_n(\rho) = D_n(\rho)X_n'(\rho)R_n(\rho)Y_n$ for a given $\rho$, which is the OLS estimator from the artificial regression $R_n(\rho)Y_n = R_n(\rho)X_n\beta + \varepsilon_n$. Substituting $Y_n = X_n\beta_0 + R_n^{-1}\varepsilon_n$ into $\hat\beta_n(\rho)$ yields $\hat\beta_n(\rho) = \beta_0 + D_n(\rho)X_n'(\rho)R_n(\rho)R_n^{-1}\varepsilon_n$. Under Assumption 1.6, it can be shown that $\hat\beta_n(\hat\rho_n) = \beta_0 + o_p(1)$.²⁷ This result indicates that under an unknown form of heteroskedasticity, the MLE $\hat\beta_n(\hat\rho_n)$ has no asymptotic bias, even when the MLE $\hat\rho_n$ is inconsistent.

The spatial model specification with $\beta_0 = 0$ and $\rho_0 = 0$ in 3.3 is known as the pure spatial autoregressive model in the literature. The ML estimator of $\lambda_0$ for this kind of model is also inconsistent under heteroskedastic disturbances. The first order condition of the concentrated log-likelihood function of this model with respect to $\lambda$ is $\frac{\partial\ln L_n(\lambda)}{\partial\lambda} = \frac{1}{\hat\sigma_n^2(\lambda)}Y_n'W_n'S_n(\lambda)Y_n - \operatorname{tr}(G_n(\lambda))$, where $\hat\sigma_n^2(\lambda) = \frac{1}{n}\varepsilon_n'(\lambda)\varepsilon_n(\lambda)$ with $\varepsilon_n(\lambda) = S_n(\lambda)Y_n$. At $\lambda_0$, $\hat\sigma_n^2(\lambda_0) = \frac{1}{n}\sum_{i=1}^n\sigma_i^2 + o_p(1)$.
Then, $\frac{1}{n}\frac{\partial\ln L_n(\lambda_0)}{\partial\lambda} = \operatorname{cov}(G_{n,ii}, \sigma_i^2)/\bar\sigma^2 + o_p(1)$ by the same asymptotic argument applied in the derivation of 1.9. This result is the same as the one obtained in Lin and Lee 2010 for the case of a SARAR(1,0).

27 For the asymptotic argument, see Appendix 3.4.

In the special case where the spatial weight matrices are the same and the true parameter values $\lambda_0$ and $\rho_0$ are equal, the covariance terms in 1.9 are equal.²⁸ In this special case, the result in 1.9 simplifies to $\frac{1}{n}\frac{\partial\ln L_n(\delta_0)}{\partial\delta} = \operatorname{cov}(G_{n,ii}, \sigma_i^2)/\bar\sigma^2 + o_p(1)$, which is the necessary condition stated in Lin and Lee 2010 for a spatial model with only a spatial lag in the dependent variable. Despite this result, the asymptotic bias of the MLE $\hat\beta_n(\hat\delta_n)$ will not simplify to the one derived for a spatial model with a spatial lag in the dependent variable.

A natural question is under what conditions the covariance terms in 1.9 are zero. An obvious case is when both $\bar G_n$ and $H_n$ have diagonal elements that are equal. Then, the necessary condition for the consistency of $\hat\lambda_n$ and $\hat\rho_n$ is not violated, even if the disturbances are heteroskedastic. As an example, consider a circular world weight matrix with equal diagonal elements that relates each unit to the units in front and in back. In that case, both $H_n$ and $\bar G_n$ have equal diagonal elements. Another case arises when the weight matrices $W_n$ and $M_n$ are block-diagonal matrices with an identical submatrix in the diagonal blocks and zeros elsewhere. This is a special case of the group interactions example in Lee 2001, where all group sizes are equal and each neighbor of the same unit has equal weight.

In this section, we have shown that the ML estimators for autoregressive spatial models are generally inconsistent when heteroskedasticity is present in the disturbance terms. Besides its computational burden, the consistency of the ML estimator is not ensured.

1.4.2 Robust GMM Estimation of SARAR(1,1)

In this section, the robust GMM estimation method suggested by Lin and Lee 2010 is extended to the model in 3.3. For the estimation, we consider the set of population moment functions

$$g_n(\theta_0) = \left(\varepsilon_n'P_{1n}\varepsilon_n, \ldots, \varepsilon_n'P_{mn}\varepsilon_n,\ \varepsilon_n'Q_n\right)', \tag{1.11}$$

where $P_{jn} \in \mathcal{P}_{2n}$ for $j = 1, \ldots, m$. This set defines the orthogonality conditions that are considered for the estimation. Throughout this section, we assume that the model in 3.3 satisfies Assumptions 1.1-1.6. First, we discuss the identification of the parameter vector $\theta_0$ in the GMM framework and state conditions for the identification.
Then, we determine the large sample properties of the robust GMM estimator.²⁹

28 In this case, $W_n = M_n$ and $R_n = S_n$, so that $H_n = W_nS_n^{-1} = G_n$ and $\bar G_n = S_nG_nS_n^{-1} = S_nW_nS_n^{-1}S_n^{-1} = G_n$.
29 The arguments provided here are general, and issues about the selection of particular $P_{jn}$ and $Q_n$ are presented in the final part of this section.
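Returning briefly to the necessary condition of Section 1.4.1, the covariance term $\operatorname{cov}(G_{n,ii}, \sigma_i^2)/\bar\sigma^2$ can be evaluated directly for a given weight matrix and variance pattern. The sketch below is illustrative only and covers the case $\rho_0 = 0$, so that $\bar G_n = G_n$; the group sizes, the value `lam0 = 0.3`, and the function names are assumptions. It contrasts unequal group sizes, with variances tied to group size, against equal group sizes with arbitrary heteroskedasticity.

```python
import numpy as np

def group_weights(sizes):
    """Block-diagonal, row-normalized group-interaction matrix:
    each unit puts weight 1/(m-1) on every other member of its group."""
    n = sum(sizes)
    W = np.zeros((n, n))
    start = 0
    for m in sizes:
        W[start:start + m, start:start + m] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
        start += m
    return W

def score_limit_term(W, sigma2, lam):
    """cov(G_ii, sigma_i^2) / mean(sigma_i^2) with G = W (I - lam W)^{-1}."""
    n = W.shape[0]
    G = W @ np.linalg.inv(np.eye(n) - lam * W)
    g = np.diag(G)
    cov = np.mean((g - g.mean()) * (sigma2 - sigma2.mean()))
    return cov / sigma2.mean()

lam0 = 0.3  # illustrative value only

# Unequal groups; variance equals the group size if it exceeds 10,
# otherwise the squared inverse of the group size.
sizes_unequal = [3, 5, 12, 20]
sigma2_unequal = np.concatenate(
    [np.full(m, float(m) if m > 10 else 1.0 / m**2) for m in sizes_unequal])
term_unequal = score_limit_term(group_weights(sizes_unequal), sigma2_unequal, lam0)

# Equal groups; variances differ across units but diag(G) is constant.
sizes_equal = [5, 5, 5, 5]
sigma2_equal = np.random.default_rng(0).uniform(0.5, 5.0, sum(sizes_equal))
term_equal = score_limit_term(group_weights(sizes_equal), sigma2_equal, lam0)
```

With equal group sizes the diagonal of $G_n$ is constant, so the term is zero whatever the variances are; with unequal sizes it is not.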

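The identification of parameters in a GMM framework requires $\lim_{n\to\infty}\frac{1}{n}E\,g_n(\theta_0) = 0$.³⁰ For any value of the parameter vector $\theta \in \Theta$, consider the expectation of the set of moment functions in 1.11:

$$E\,g_n(\theta) = \begin{pmatrix} E\,\varepsilon_n'(\theta)P_{1n}\varepsilon_n(\theta) \\ E\,\varepsilon_n'(\theta)P_{2n}\varepsilon_n(\theta) \\ \vdots \\ E\,\varepsilon_n'(\theta)P_{mn}\varepsilon_n(\theta) \\ E\,Q_n'\varepsilon_n(\theta) \end{pmatrix}.$$

From 3.3, $\varepsilon_n(\theta)$ can be written in terms of the model parameters in the following way:

$$\varepsilon_n(\theta) = R_n(\rho)S_n(\lambda)S_n^{-1}R_n^{-1}\varepsilon_n + R_n(\rho)\left[S_n(\lambda)S_n^{-1}X_n\beta_0 - X_n\beta\right] = K_n(\delta)K_n^{-1}\varepsilon_n + R_n(\rho)k_n(\varsigma), \tag{1.12}$$

where $K_n(\delta) = R_n(\rho)S_n(\lambda)$, $K_n = R_nS_n$, $k_n(\varsigma) = \left[S_n(\lambda)S_n^{-1}X_n\beta_0 - X_n\beta\right]$, and $\varsigma = (\lambda, \beta')'$ are introduced for notational simplicity. Substituting 1.12 into 1.11 and taking expectations yields

$$E\,g_n(\theta) = \begin{pmatrix} k_n'(\varsigma)R_n'(\rho)P_{1n}R_n(\rho)k_n(\varsigma) + \operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{1n}K_n(\delta)K_n^{-1}\right) \\ k_n'(\varsigma)R_n'(\rho)P_{2n}R_n(\rho)k_n(\varsigma) + \operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{2n}K_n(\delta)K_n^{-1}\right) \\ \vdots \\ k_n'(\varsigma)R_n'(\rho)P_{mn}R_n(\rho)k_n(\varsigma) + \operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{mn}K_n(\delta)K_n^{-1}\right) \\ Q_n'R_n(\rho)k_n(\varsigma) \end{pmatrix}, \tag{1.13}$$

where $\Sigma_n = D(\sigma_1^2, \ldots, \sigma_n^2)$. The identification of the parameter vector $\theta_0$ can be verified from $E\,g_n(\theta) = 0$, i.e., $\theta_0$ is identified if $\theta_0$ is the unique solution of $E\,g_n(\theta) = 0$. The term $Q_n'R_n(\rho)k_n(\varsigma)$ in 1.13 can be written more explicitly as

$$Q_n'R_n(\rho)k_n(\varsigma) = Q_n'R_n(\rho)\left[X_n,\ G_nX_n\beta_0\right]\begin{pmatrix}\beta_0 - \beta \\ \lambda_0 - \lambda\end{pmatrix} = 0_{r\times1}. \tag{1.14}$$

30 See Lemma 2.3 in Newey and McFadden.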

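The unique solution of 1.14 is $(\beta_0', \lambda_0)'$ if the matrix $\left[Q_n'R_n(\rho)X_n,\ Q_n'R_n(\rho)G_nX_n\beta_0\right]$ has full column rank $k + 1$ for each possible value of $\rho \in \Theta$, by virtue of Lemma 1.1 of Appendix 1.7.1. Since the linear IV matrix $Q_n$ has column rank greater than or equal to $k + 1$, this rank condition is equivalent to the fact that the matrix $\left[X_n,\ G_nX_n\beta_0\right]$ has full column rank $k + 1$. Under this rank condition, the remaining moment equations in $E\,g_n(\theta)$ are for the identification of $\rho_0$. To this end, $K_n(\delta)$ is decomposed in the following way:³¹

$$K_n(\delta) = R_n(\rho)S_n(\lambda) = \left[R_n - (\rho - \rho_0)M_n\right]\left[S_n - (\lambda - \lambda_0)W_n\right] = K_n + (\rho_0 - \rho)M_nS_n + (\lambda_0 - \lambda)R_nW_n + (\rho_0 - \rho)(\lambda_0 - \lambda)M_nW_n. \tag{1.15}$$

Consider the terms with $P_{jn}$, $k_n'(\varsigma)R_n'(\rho)P_{jn}R_n(\rho)k_n(\varsigma) + \operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{jn}K_n(\delta)K_n^{-1}\right)$. Since $\beta_0$ and $\lambda_0$ are identified from the rank condition of the last moment equation, the first term in the $j$th moment equation is zero and the $K_n(\delta)$ term reduces to $K_n + (\rho_0 - \rho)M_nS_n$. The remaining term in the $j$th moment equation can be explicitly written as

$$\operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{jn}K_n(\delta)K_n^{-1}\right) = (\rho_0 - \rho)\operatorname{tr}\left(P_{jn}^sM_nR_n^{-1}\Sigma_n\right) + (\rho_0 - \rho)^2\operatorname{tr}\left(R_n'^{-1}M_n'P_{jn}M_nR_n^{-1}\Sigma_n\right) = 0, \tag{1.16}$$

where $P_{jn}^s = P_{jn} + P_{jn}'$. There are two roots for $\rho$ in 1.16. The first root is the true parameter value $\rho_0$, and the second root is

$$\rho = \rho_0 + \frac{\operatorname{tr}\left(P_{jn}^sM_nR_n^{-1}\Sigma_n\right)}{\operatorname{tr}\left(R_n'^{-1}M_n'P_{jn}M_nR_n^{-1}\Sigma_n\right)}. \tag{1.17}$$

31 $K_n(\delta)$ can be decomposed by using the identities $S_n(\lambda) = S_n - (\lambda - \lambda_0)W_n$ and $R_n(\rho) = R_n - (\rho - \rho_0)M_n$.

There are three cases in which $\rho_0$ is the unique root. If $\operatorname{tr}\left(P_{jn}^sM_nR_n^{-1}\Sigma_n\right) = 0$ and the denominator is not zero, then $\rho_0$ is the unique root. If the numerator is not zero but the denominator is zero, then the second root is not defined. In both cases, $\rho_0$ is uniquely identified. If there is more than one matrix for the quadratic moment equations, then there is another case in which $\rho_0$ can be uniquely identified. The condition for this case is that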

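the fraction in 1.17 must be different for each $P_{jn}$, $j = 1, \ldots, m$, so that the second root does not exist:

$$\frac{\operatorname{tr}\left(P_{in}^sM_nR_n^{-1}\Sigma_n\right)}{\operatorname{tr}\left(R_n'^{-1}M_n'P_{in}M_nR_n^{-1}\Sigma_n\right)} \neq \frac{\operatorname{tr}\left(P_{jn}^sM_nR_n^{-1}\Sigma_n\right)}{\operatorname{tr}\left(R_n'^{-1}M_n'P_{jn}M_nR_n^{-1}\Sigma_n\right)} \quad \text{for all } i \neq j. \tag{1.18}$$

When the rank condition for $Q_n'R_n(\rho)k_n(\varsigma) = 0$ fails, $\beta_0$ and $\lambda_0$ are not identified separately from the last moment equation in $E\,g_n(\theta)$. In this case, the column rank of the matrix $\left[X_n,\ G_nX_n\beta_0\right]$ is less than $k + 1$. This implies that there exists a constant vector $v$ such that $X_nv = G_nX_n\beta_0$. Using this relation in 1.14,

$$Q_n'R_n(\rho)k_n(\varsigma) = Q_n'R_n(\rho)\left[X_n(\beta_0 - \beta) + X_nv(\lambda_0 - \lambda)\right] = Q_n'R_n(\rho)X_n\left[(\beta_0 - \beta) + v(\lambda_0 - \lambda)\right] = 0_{r\times1}. \tag{1.19}$$

The regressors matrix $X_n$ has full column rank $k$ by Assumption 1.3. Thus, the matrix $Q_n'R_n(\rho)X_n$ in the above equation has full column rank $k$ for each $\rho \in \Theta$. This implies that all solutions of 1.19 satisfy the relation $\beta = \beta_0 - v(\lambda - \lambda_0)$ by virtue of Lemma 1.1 in Appendix 1.7.1. This indicates that $\beta_0$ and $\lambda_0$ are not separately identified from this moment equation, and that the identification of $\beta_0$ is feasible only once $\lambda_0$ is identified. The remaining moment equations in 1.13 are functions of $\delta = (\rho, \lambda)'$. Hence, these moment functions may provide identification for the parameter vector $\delta_0$. In this case, these moment equations simplify to $\operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{jn}K_n(\delta)K_n^{-1}\right) = 0$ for $j = 1, \ldots, m$, since $k_n(\varsigma) = X_n\left[(\beta_0 - \beta) + v(\lambda_0 - \lambda)\right] = 0$ at $\beta = \beta_0 - v(\lambda - \lambda_0)$. Lee 2001b makes the observation that these remaining moment equations correspond to the moment equations of the following process:

$$Y_n = \lambda_0W_nY_n + u_n, \qquad u_n = \rho_0M_nu_n + \varepsilon_n. \tag{1.20}$$

For the above process, $\varepsilon_n(\theta) = R_n(\rho)S_n(\lambda)Y_n = R_n(\rho)S_n(\lambda)S_n^{-1}R_n^{-1}\varepsilon_n = K_n(\delta)K_n^{-1}\varepsilon_n$. Thus, for the $j$th quadratic moment, $E\,\varepsilon_n'(\theta)P_{jn}\varepsilon_n(\theta) = \operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{jn}K_n(\delta)K_n^{-1}\right)$. Therefore, the identification of $\delta_0$ can be investigated from 1.20. When $M_n = W_n$, the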

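reduced form of 1.20 is $Y_n = (\rho_0 + \lambda_0)W_nY_n - \rho_0\lambda_0W_n^2Y_n + \varepsilon_n$. The identification of $\delta_0$ is not possible from this process, since $\lambda_0$ and $\rho_0$ cannot be distinguished from each other (Anselin, 1988, p. 88). Thus, only under the condition that $M_n \neq W_n$ can the identification issue be investigated from the equation $\operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{jn}K_n(\delta)K_n^{-1}\right) = 0$. This equation can be explicitly written as

$$\begin{aligned}\operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{jn}K_n(\delta)K_n^{-1}\right) = \operatorname{tr}\Big(\Sigma_n&\left[I_n + (\rho_0 - \rho)M_nS_nK_n^{-1} + (\lambda_0 - \lambda)R_nW_nK_n^{-1} + (\rho_0 - \rho)(\lambda_0 - \lambda)M_nW_nK_n^{-1}\right]' \\ &\times P_{jn}\left[I_n + (\rho_0 - \rho)M_nS_nK_n^{-1} + (\lambda_0 - \lambda)R_nW_nK_n^{-1} + (\rho_0 - \rho)(\lambda_0 - \lambda)M_nW_nK_n^{-1}\right]\Big) = 0.\end{aligned} \tag{1.21}$$

In order to simplify the notation, we introduce the following variables, with $H_n = M_nR_n^{-1}$ and $\bar G_n = R_nG_nR_n^{-1}$:

$$\begin{aligned}
\alpha_{\rho,j} &= \operatorname{tr}\left(\Sigma_nP_{jn}^sH_n\right), & \alpha_{\lambda,j} &= \operatorname{tr}\left(\Sigma_nP_{jn}^s\bar G_n\right), \\
\alpha_{\rho^2,j} &= \operatorname{tr}\left(\Sigma_nH_n'P_{jn}H_n\right), & \alpha_{\lambda^2,j} &= \operatorname{tr}\left(\Sigma_n\bar G_n'P_{jn}\bar G_n\right), \\
\alpha_{\rho\lambda,j} &= \operatorname{tr}\left(\Sigma_nP_{jn}^sH_n\bar G_n + \Sigma_n\bar G_n'P_{jn}^sH_n\right), & \alpha_{\rho^2\lambda,j} &= \operatorname{tr}\left(\Sigma_n\bar G_n'H_n'P_{jn}^sH_n\right), \\
\alpha_{\rho\lambda^2,j} &= \operatorname{tr}\left(\Sigma_n\bar G_n'H_n'P_{jn}^s\bar G_n\right), & \alpha_{\rho^2\lambda^2,j} &= \operatorname{tr}\left(\Sigma_n\bar G_n'H_n'P_{jn}H_n\bar G_n\right).
\end{aligned}$$

Using these variables, equation 1.21 simplifies to

$$\begin{aligned}\operatorname{tr}\left(\Sigma_nK_n'^{-1}K_n'(\delta)P_{jn}K_n(\delta)K_n^{-1}\right) ={}& \alpha_{\rho,j}(\rho_0 - \rho) + \alpha_{\lambda,j}(\lambda_0 - \lambda) + \alpha_{\rho^2,j}(\rho_0 - \rho)^2 + \alpha_{\lambda^2,j}(\lambda_0 - \lambda)^2 \\ &+ \alpha_{\rho\lambda,j}(\rho_0 - \rho)(\lambda_0 - \lambda) + \alpha_{\rho^2\lambda,j}(\rho_0 - \rho)^2(\lambda_0 - \lambda) + \alpha_{\rho\lambda^2,j}(\rho_0 - \rho)(\lambda_0 - \lambda)^2 \\ &+ \alpha_{\rho^2\lambda^2,j}(\rho_0 - \rho)^2(\lambda_0 - \lambda)^2 = 0 \quad \text{for } j = 1, \ldots, m.\end{aligned} \tag{1.22}$$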
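The polynomial expansion in 1.22 can be verified numerically for a single zero-diagonal matrix $P_{jn}$. The sketch below is a check under stated assumptions, not part of the original derivation: the weight matrices, parameter values, and $\Sigma_n$ are randomly generated placeholders, and the identities $M_nS_nK_n^{-1} = H_n$, $R_nW_nK_n^{-1} = \bar G_n$ and $M_nW_nK_n^{-1} = H_n\bar G_n$ are used to form the coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12

def random_weights(n, rng):
    """Hypothetical dense row-normalized weight matrix with zero diagonal."""
    A = rng.random((n, n))
    np.fill_diagonal(A, 0.0)
    return A / A.sum(axis=1, keepdims=True)

W, M = random_weights(n, rng), random_weights(n, rng)
lam0, rho0 = 0.4, 0.3            # "true" values (illustrative)
lam, rho = 0.1, -0.2             # trial values (illustrative)
a, b = rho0 - rho, lam0 - lam

I = np.eye(n)
S0, R0 = I - lam0 * W, I - rho0 * M
K0 = R0 @ S0
K_delta = (I - rho * M) @ (I - lam * W)

H = M @ np.linalg.inv(R0)                                  # H = M R^{-1}
Gbar = R0 @ W @ np.linalg.inv(S0) @ np.linalg.inv(R0)      # Gbar = R G R^{-1}
Sigma = np.diag(rng.uniform(0.5, 3.0, n))                  # heteroskedastic variances

P = rng.random((n, n))
np.fill_diagonal(P, 0.0)                                   # zero-diagonal moment matrix
Ps = P + P.T
tr = np.trace

# Left-hand side: tr(Sigma K'^{-1} K'(delta) P K(delta) K^{-1}).
A_mat = K_delta @ np.linalg.inv(K0)
direct = tr(Sigma @ A_mat.T @ P @ A_mat)

# Right-hand side: the eight alpha coefficients times powers of a and b.
alpha = {
    "rho": tr(Sigma @ Ps @ H),
    "lam": tr(Sigma @ Ps @ Gbar),
    "rho2": tr(Sigma @ H.T @ P @ H),
    "lam2": tr(Sigma @ Gbar.T @ P @ Gbar),
    "rholam": tr(Sigma @ Ps @ H @ Gbar + Sigma @ Gbar.T @ Ps @ H),
    "rho2lam": tr(Sigma @ Gbar.T @ H.T @ Ps @ H),
    "rholam2": tr(Sigma @ Gbar.T @ H.T @ Ps @ Gbar),
    "rho2lam2": tr(Sigma @ Gbar.T @ H.T @ P @ H @ Gbar),
}
polynomial = (alpha["rho"] * a + alpha["lam"] * b
              + alpha["rho2"] * a**2 + alpha["lam2"] * b**2
              + alpha["rholam"] * a * b + alpha["rho2lam"] * a**2 * b
              + alpha["rholam2"] * a * b**2 + alpha["rho2lam2"] * a**2 * b**2)
```

The two numbers agree up to floating-point error; there is no constant term because $\operatorname{tr}(\Sigma_nP_{jn}) = 0$ for a zero-diagonal $P_{jn}$.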

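The above system of equations can be written in matrix form in the following way:

$$\begin{pmatrix}
\alpha_{\rho,1} & \alpha_{\lambda,1} & \alpha_{\rho^2,1} & \alpha_{\lambda^2,1} & \alpha_{\rho\lambda,1} & \alpha_{\rho^2\lambda,1} & \alpha_{\rho\lambda^2,1} & \alpha_{\rho^2\lambda^2,1} \\
\alpha_{\rho,2} & \alpha_{\lambda,2} & \alpha_{\rho^2,2} & \alpha_{\lambda^2,2} & \alpha_{\rho\lambda,2} & \alpha_{\rho^2\lambda,2} & \alpha_{\rho\lambda^2,2} & \alpha_{\rho^2\lambda^2,2} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
\alpha_{\rho,m} & \alpha_{\lambda,m} & \alpha_{\rho^2,m} & \alpha_{\lambda^2,m} & \alpha_{\rho\lambda,m} & \alpha_{\rho^2\lambda,m} & \alpha_{\rho\lambda^2,m} & \alpha_{\rho^2\lambda^2,m}
\end{pmatrix}
\begin{pmatrix}
\rho_0 - \rho \\ \lambda_0 - \lambda \\ (\rho_0 - \rho)^2 \\ (\lambda_0 - \lambda)^2 \\ (\rho_0 - \rho)(\lambda_0 - \lambda) \\ (\rho_0 - \rho)^2(\lambda_0 - \lambda) \\ (\rho_0 - \rho)(\lambda_0 - \lambda)^2 \\ (\rho_0 - \rho)^2(\lambda_0 - \lambda)^2
\end{pmatrix} = 0_{m\times1}. \tag{1.23}$$

By Lemma 1.1 in Appendix 1.7.1, the system in 1.23 has a unique solution at $\delta_0$ if the columns of the above matrix do not have a linear combination with nonlinear non-zero constant coefficients of the form

$$\alpha_\rho c_1 + \alpha_\lambda c_2 + \alpha_{\rho^2}c_1^2 + \alpha_{\lambda^2}c_2^2 + \alpha_{\rho\lambda}c_1c_2 + \alpha_{\rho^2\lambda}c_1^2c_2 + \alpha_{\rho\lambda^2}c_1c_2^2 + \alpha_{\rho^2\lambda^2}c_1^2c_2^2 = 0, \tag{1.24}$$

where the $\alpha$s represent the column vectors of the above matrix and $c_1$ and $c_2$ are arbitrary nonzero constant coefficients. With this condition, $\rho_0$ and $\lambda_0$ are uniquely identified from the system in 1.23. Once $\lambda_0$ is identified, the identification of $\beta_0$ follows from the last moment function in 1.13.

Assumption 1.7 summarizes conditions for the identification of the parameter vector $\theta_0$ from the set of moment functions in $g_n(\theta)$ for sufficiently large $n$. The similarity of this assumption with Assumption 1.5 in Liu et al. 2010 is revealing: the main difference is that the identification conditions now involve the covariance matrix $\Sigma_n$.³²

32 See also the identification assumptions in Lee and Liu 2010 and Lin and Lee 2010, which have a similar structure.

Assumption 1.7. For the identification of the parameter vector $\theta_0 \in \Theta$, one of the following cases is assumed.
1. (i) The limiting matrix $\lim_{n\to\infty}\frac{1}{n}Q_n'R_n(\rho)\left[X_n,\ G_nX_n\beta_0\right]$ has full column rank $k + 1$ for each $\rho \in \Theta$,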

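(ii) The limiting value $\lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\left(P_{jn}^sH_n\Sigma_n\right) \neq 0$ for some $j$, and the limiting vector $\left[\lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\left(P_{1n}^sH_n\Sigma_n\right), \ldots, \lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\left(P_{mn}^sH_n\Sigma_n\right)\right]'$ is linearly independent of the limiting vector $\left[\lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\left(H_n'P_{1n}H_n\Sigma_n\right), \ldots, \lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\left(H_n'P_{mn}H_n\Sigma_n\right)\right]'$.
2. (i) The limiting matrix $\lim_{n\to\infty}\frac{1}{n}Q_n'R_n(\rho)X_n$ has full column rank $k$ for each $\rho \in \Theta$,
(ii) $W_n \neq M_n$,
(iii) The vectors $\alpha$ defined above do not have a linear combination with some nonlinear non-zero constant coefficients $c_1$ and $c_2$ in the form of $\alpha_\rho c_1 + \alpha_\lambda c_2 + \alpha_{\rho^2}c_1^2 + \alpha_{\lambda^2}c_2^2 + \alpha_{\rho\lambda}c_1c_2 + \alpha_{\rho^2\lambda}c_1^2c_2 + \alpha_{\rho\lambda^2}c_1c_2^2 + \alpha_{\rho^2\lambda^2}c_1^2c_2^2 = 0$.

The first condition in case 1 ensures the identification of $\beta_0$ and $\lambda_0$ from the linear moment function. The second condition in case 1 provides the identification of $\rho_0$ from the quadratic moment functions. In case 2, the quadratic moment functions ensure the identification of $\rho_0$ and $\lambda_0$ under the condition $W_n \neq M_n$. Once $\lambda_0$ is identified, the identification of $\beta_0$ follows from the first condition in case 2.

Let $\Omega_n = E\left[g_n(\theta_0)g_n'(\theta_0)\right]$. By Lemma 1.2 in Appendix 1.7.1, we obtain the variance-covariance matrix of the set of moment functions as

$$\Omega_n = \begin{pmatrix}
\operatorname{tr}\left[\Sigma_nP_{1n}\left(P_{1n}'\Sigma_n + \Sigma_nP_{1n}\right)\right] & \cdots & \operatorname{tr}\left[\Sigma_nP_{1n}\left(P_{mn}'\Sigma_n + \Sigma_nP_{mn}\right)\right] & 0_{1\times r} \\
\operatorname{tr}\left[\Sigma_nP_{2n}\left(P_{1n}'\Sigma_n + \Sigma_nP_{1n}\right)\right] & \cdots & \operatorname{tr}\left[\Sigma_nP_{2n}\left(P_{mn}'\Sigma_n + \Sigma_nP_{mn}\right)\right] & 0_{1\times r} \\
\vdots & & \vdots & \vdots \\
\operatorname{tr}\left[\Sigma_nP_{mn}\left(P_{1n}'\Sigma_n + \Sigma_nP_{1n}\right)\right] & \cdots & \operatorname{tr}\left[\Sigma_nP_{mn}\left(P_{mn}'\Sigma_n + \Sigma_nP_{mn}\right)\right] & 0_{1\times r} \\
0_{r\times1} & \cdots & 0_{r\times1} & Q_n'\Sigma_nQ_n
\end{pmatrix}. \tag{1.25}$$

The variance-covariance matrix $\Omega_n$ has the same structure as the one in Lin and Lee 2010. Let $\Gamma_n = E\,\frac{\partial g_n(\theta_0)}{\partial\theta'}$. A straightforward application of matrix calculus yields 1.26. Elements of $\Gamma_n$ are functions of matrices that are uniformly bounded in absolute value in row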

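and column sums, so that the order of the elements is at most $O(n)$, which in turn implies that $\frac{1}{n}\Gamma_n$ is bounded.

$$\Gamma_n = -\begin{pmatrix}
\operatorname{tr}\left(\Sigma_nH_n'P_{1n}^s\right) & \operatorname{tr}\left(\Sigma_n\bar G_n'P_{1n}^s\right) & 0_{1\times k} \\
\operatorname{tr}\left(\Sigma_nH_n'P_{2n}^s\right) & \operatorname{tr}\left(\Sigma_n\bar G_n'P_{2n}^s\right) & 0_{1\times k} \\
\vdots & \vdots & \vdots \\
\operatorname{tr}\left(\Sigma_nH_n'P_{mn}^s\right) & \operatorname{tr}\left(\Sigma_n\bar G_n'P_{mn}^s\right) & 0_{1\times k} \\
0_{r\times1} & Q_n'R_nG_nX_n\beta_0 & Q_n'R_nX_n
\end{pmatrix}. \tag{1.26}$$

Let $\Psi_n'\Psi_n$ be an arbitrary non-stochastic weighting matrix for the GMM objective function. The weighting matrix plays the role of a metric by which the sample moment functions are made as close as possible to zero. Assume that $\Psi_n$ converges to a constant matrix $\Psi_0$ that has full rank, and that $\lim_{n\to\infty}\frac{1}{n}\Psi_n\Gamma_n$ exists and has full rank (Hansen, 1982).³³ The following proposition shows that the generic GMM estimator based on the set of moment functions $g_n(\theta_0) = \left(\varepsilon_n'P_{1n}\varepsilon_n, \ldots, \varepsilon_n'P_{mn}\varepsilon_n,\ \varepsilon_n'Q_n\right)'$ with general $P_{jn}$s and $Q_n$ is consistent and has an asymptotic normal distribution.³⁴

Proposition 1.1. Suppose $P_{jn} \in \mathcal{P}_{2n}$ for $j = 1, \ldots, m$ and $Q_n$ is a linear IV matrix. Under Assumptions 1.1-1.7, the estimator $\hat\theta_n$ derived from the objective function $\min_{\theta\in\Theta} g_n'(\theta)\Psi_n'\Psi_ng_n(\theta)$ is a consistent robust GMM estimator (RGMME) of $\theta_0$. It has an asymptotic normal distribution, namely

$$\sqrt{n}\left(\hat\theta_n - \theta_0\right) \xrightarrow{d} N(0, \Upsilon), \tag{1.27}$$

where $\Upsilon = \lim_{n\to\infty}\left[\tfrac{1}{n}\Gamma_n'\Psi_n'\Psi_n\tfrac{1}{n}\Gamma_n\right]^{-1}\left[\tfrac{1}{n}\Gamma_n'\Psi_n'\Psi_n\tfrac{1}{n}\Omega_n\Psi_n'\Psi_n\tfrac{1}{n}\Gamma_n\right]\left[\tfrac{1}{n}\Gamma_n'\Psi_n'\Psi_n\tfrac{1}{n}\Gamma_n\right]^{-1}$.

33 For our case, the matrices $\{\Psi_n\}$ have dimensions equal to or bigger than $k + 2$. Let $Q_n$ be an $n \times (k + 1)$ linear IV matrix. In that case, $g_n: \mathbb{R}^n \times \mathbb{R}^{k+2} \to \mathbb{R}^{m+k+1}$, where $m + k + 1$ is the number of orthogonality conditions. Then, the matrices $\{\Psi_n\}$ are dimensioned $(m + k + 1) \times (m + k + 1)$. See Assumption 2.5 in Hansen 1982.
34 The details of the proofs for all propositions are given in the Appendix.
35 The structure of the asymptotic variance-covariance matrix $\Upsilon$ is also generic, and it has the same structure as the one given in Theorem 3.1 of Hansen 1982.

The estimator in Proposition 3.1 is the generic GMM estimator considered in Hansen 1982.³⁵ The variance-covariance matrix of the RGMME of Proposition 3.1 is a function of the unknown terms $\Gamma_n$ and $\Omega_n$. As usual, consistent estimates of these terms can be obtained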

from an initial consistent estimator of $\theta_0$. In the following proposition, consistent estimators for $\Gamma_n$ and $\Omega_n$ are given.

Proposition 1.2. Let $\hat\varepsilon_{in}$ be the residual of the model based on consistent initial estimates of $\theta_0$, and denote $\hat\Sigma_n = D\left(\hat\varepsilon_{1n}^2, \hat\varepsilon_{2n}^2, \ldots, \hat\varepsilon_{nn}^2\right)$. Then, under the assumed regularity conditions, (1) $\frac{1}{n}\hat\Omega_n - \frac{1}{n}\Omega_n = o_p(1)$, (2) $\frac{1}{n}\hat\Gamma_n - \frac{1}{n}\Gamma_n = o_p(1)$.

The proof of Proposition 1.2 utilizes the facts that the quadratic moment matrices are uniformly bounded in absolute value in row and column sums and that the disturbance terms have uniformly bounded fourth moments. These two properties ensure that the elements involving the trace operator in $\hat\Omega_n$ and $\hat\Gamma_n$ converge in probability to the corresponding elements of $\Omega_n$ and $\Gamma_n$. The remaining element in $\hat\Omega_n$ is $Q_n'\hat\Sigma_nQ_n$. The asymptotic argument for this term is in line with that of White 1980. Under certain regularity conditions, White 1980 shows that $\frac{1}{n}X_n'\hat\Sigma_nX_n$ converges almost surely to $\frac{1}{n}X_n'\Sigma_nX_n$, where $\hat\varepsilon_{in}$ is a consistent estimate of $\varepsilon_{in}$.

In Proposition 3.1, the GMM estimator is derived from the objective function with an arbitrary weighting matrix. It is clear that different choices of weighting matrices give rise to GMM estimators with different asymptotic covariance matrices. The optimal estimator is the one that has an asymptotic covariance matrix at least as small as that of any other GMM estimator. Hansen 1982 shows that the optimal GMM estimator is based on the weighting matrix $\Psi_n'\Psi_n = \Omega_n^{-1}$. This matrix plays a prominent role for the optimal GMM estimator under the following regularity condition.

Assumption 1.8. The limiting matrix $\lim_{n\to\infty}\frac{1}{n}\Omega_n$ exists and is nonsingular.

In 1.25, notice that the terms in $\Omega_n$ are functions of matrices that are uniformly bounded in absolute value in row and column sums. For example, a generic term is $\operatorname{tr}\left[\Sigma_nP_{in}\left(P_{jn}'\Sigma_n + \Sigma_nP_{jn}\right)\right]$, which has an order of $O(n)$. Therefore, $\frac{1}{n}\Omega_n$ is of order $O(1)$, which implies that $\frac{1}{n}\Omega_n$ is bounded. Proposition 1.2 yields a consistent estimator $\hat\Omega_n$ for this optimal weighting matrix. The next proposition shows that the optimal GMM estimator based on the weighting matrix $\hat\Omega_n^{-1}$ is consistent and asymptotically normal.
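The plug-in idea behind Proposition 1.2 can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: `omega_hat` is a hypothetical helper, the errors are taken to be independent, and each $P_{jn}$ has a zero diagonal. Under those assumptions the quadratic block of $E[g_ng_n']$ has $(i, j)$ element $\sum_{k,l}p^{(i)}_{kl}\left(p^{(j)}_{kl} + p^{(j)}_{lk}\right)\sigma_k^2\sigma_l^2$, the cross terms with the linear moments are zero, and the linear block is $Q_n'\Sigma_nQ_n$; the estimate replaces each $\sigma_i^2$ with a squared residual.

```python
import numpy as np

def omega_hat(P_list, Q, resid):
    """Plug-in estimate of E[g g'] for g = (e'P_1 e, ..., e'P_m e, e'Q)'.

    Assumes independent errors and zero-diagonal P_j; sigma_i^2 is replaced
    by the squared residual resid_i^2 (White-type plug-in).
    """
    s2 = resid ** 2
    m, r = len(P_list), Q.shape[1]
    Omega = np.zeros((m + r, m + r))
    S = np.outer(s2, s2)                       # S[k, l] = s2_k * s2_l
    for i, Pi in enumerate(P_list):
        for j, Pj in enumerate(P_list):
            Omega[i, j] = np.sum(S * Pi * (Pj + Pj.T))
    Omega[m:, m:] = Q.T @ (s2[:, None] * Q)    # Q' Sigma_hat Q
    return Omega

# Hypothetical inputs, for illustration only.
rng = np.random.default_rng(7)
n = 25
P1 = rng.random((n, n)); np.fill_diagonal(P1, 0.0)
P2 = rng.random((n, n)); np.fill_diagonal(P2, 0.0)
Q = rng.standard_normal((n, 4))
resid = rng.standard_normal(n) * rng.uniform(0.5, 2.0, n)

Omega = omega_hat([P1, P2], Q, resid)
Sigma_hat = np.diag(resid ** 2)
```

The quadratic block can equivalently be written as $\operatorname{tr}\left(\hat\Sigma_nP_{in}\hat\Sigma_n(P_{jn} + P_{jn}')\right)$, which is the form used in the check below.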

Proposition 1.3. Under Proposition 1.2 and Assumptions 1.1-1.8, the optimal robust GMM estimator derived from $\min_{\theta\in\Theta} g_n'(\theta)\hat\Omega_n^{-1}g_n(\theta)$ has the asymptotic distribution given by

$$\sqrt{n}\left(\hat\theta_{o,n} - \theta_0\right) \xrightarrow{d} N\left(0, \left(\lim_{n\to\infty}\frac{1}{n}\Gamma_n'\Omega_n^{-1}\Gamma_n\right)^{-1}\right). \tag{1.28}$$

An estimator of the asymptotic variance-covariance matrix of $\sqrt{n}\left(\hat\theta_{o,n} - \theta_0\right)$ is needed to make asymptotically valid inferences and construct asymptotically correct confidence regions. Proposition 1.2 guarantees that the consistent estimator for the asymptotic variance-covariance matrix in Proposition 1.3 is $\left(\frac{1}{n}\hat\Gamma_n'\hat\Omega_n^{-1}\hat\Gamma_n\right)^{-1}$, where $\Gamma_n$ and $\Omega_n$ are evaluated at $\hat\theta_{o,n}$.

The remaining issue is the selection of $Q_n$ and of the best possible $P_{jn}$s from the class $\mathcal{P}_{2n}$. The asymptotic variance-covariance matrix of the GMM estimator depends on the $P_{jn}$s and $Q_n$. By using the generalized Schwarz inequality, Lee 2001b shows that the best selection of $P_{jn}$s from the class $\mathcal{P}_{2n}$ is given by (i) $H_n - D(H_n)$ and (ii) $\bar G_n - D(\bar G_n)$. With a similar argument, Lee 2003 shows that the best IV matrix is $Q_n = \left[R_nG_nX_n\beta_0,\ R_nX_n\right]$. However, the arguments given in Lee 2001b, 2003 are based on the assumption that the disturbance terms are i.i.d. In the case of unknown heteroskedasticity, the application of the generalized Schwarz inequality to the variance-covariance matrix in equation 1.28 might not provide the best selection of $P_{jn}$s and $Q_n$, since it involves the unknown matrix $\Sigma_n$. Hence, Lin and Lee 2010 state that the consistently estimated $P_{jn}$s and $Q_n$ for the i.i.d. disturbances case may still be desirable. Therefore, the optimal robust GMM estimator in Proposition 1.3 is considered with consistently estimated quadratic moment matrices (i) $H_n - D(H_n)$ and (ii) $\bar G_n - D(\bar G_n)$, and the linear IV matrix $Q_n = \left[R_nG_nX_n\beta_0,\ R_nX_n\right]$. An initial consistent estimator of $\theta_0$ can be obtained from an initial GMM estimation with quadratic moment matrices (i) $M_n'M_n - D(M_n'M_n)$, (ii) $W_nM_n - D(W_nM_n)$, and the linear moment matrix $Q_n = \left[M_nW_nX_n,\ W_nX_n,\ X_n\right]$.³⁶

36 For other candidates, see Section 5.
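To make the construction of the moment vector and objective function concrete, the following sketch assembles $g_n(\theta)$ for a SARAR(1,1) model with the initial moment matrices described above. It is a sketch only: the function names are hypothetical, the data are simulated from an assumed circular-world design with $W_n = M_n$, the placement of transposes in the quadratic moment matrices is an assumption, and an identity weighting matrix stands in for a first-step weight.

```python
import numpy as np

def zero_diag(A):
    """Return A - D(A), i.e. A with its diagonal set to zero."""
    return A - np.diag(np.diag(A))

def residuals(theta, Y, X, W, M):
    """eps(theta) = R(rho) [S(lam) Y - X beta], with theta = (rho, lam, beta')'."""
    rho, lam, beta = theta[0], theta[1], theta[2:]
    I = np.eye(len(Y))
    return (I - rho * M) @ ((I - lam * W) @ Y - X @ beta)

def moment_vector(theta, Y, X, W, M, P_list, Q):
    """g(theta) = (e'P_1 e, ..., e'P_m e, e'Q)' evaluated at the residuals e."""
    e = residuals(theta, Y, X, W, M)
    quad = np.array([e @ P @ e for P in P_list])
    return np.concatenate([quad, Q.T @ e])

def gmm_objective(theta, Y, X, W, M, P_list, Q, weight):
    """Quadratic form g' A g for a given weighting matrix A."""
    g = moment_vector(theta, Y, X, W, M, P_list, Q)
    return float(g @ weight @ g)

# Simulated data under an assumed circular-world design with W = M.
rng = np.random.default_rng(3)
n, k = 40, 3
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx + 1) % n] = 0.5
W[idx, (idx - 1) % n] = 0.5
M = W
X = rng.standard_normal((n, k))
beta0, rho0, lam0 = np.array([0.7, 0.4, 1.2]), 0.3, 0.3
eps = rng.standard_normal(n)
I = np.eye(n)
Y = np.linalg.solve(I - lam0 * W, X @ beta0 + np.linalg.solve(I - rho0 * M, eps))

theta0 = np.concatenate([[rho0, lam0], beta0])
P_list = [zero_diag(M.T @ M), zero_diag(W @ M)]   # initial quadratic moment matrices
Q = np.hstack([M @ W @ X, W @ X, X])              # initial linear IV matrix
g0 = moment_vector(theta0, Y, X, W, M, P_list, Q)
obj0 = gmm_objective(theta0, Y, X, W, M, P_list, Q, np.eye(len(g0)))
```

A full estimator would minimize `gmm_objective` over $\theta$ with a numerical optimizer, first with a simple weight and then with the inverse of an estimated $\Omega_n$; those steps are left out of this sketch.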

1.5 Monte Carlo Experiments

1.5.1 Design

In order to study the finite sample properties of various robust and non-robust estimators, we design an extensive Monte Carlo experiment. The specifications that are used to generate 1000 replications of each Monte Carlo experiment are described below. For three different values of the sample size, $n$: 100, 500, and 1000, the data generating process follows from the model in 3.3. There are three regressors and no intercept term, such that $X_n = (x_{n,1}, x_{n,2}, x_{n,3})$ and $\beta_0 = (\beta_{10}, \beta_{20}, \beta_{30})'$, where $x_{n,1}$, $x_{n,2}$, and $x_{n,3}$ are independent random vectors that are generated from a Normal(0, 1). We let $W_n = M_n$ and set $\beta_{10} = 0.7$, $\beta_{20} = 0.4$ and $\beta_{30} = 1.2$ for all experiments. For the spatial autoregressive parameters $(\lambda_0, \rho_0)$, we employ combinations of $D = \{-0.8, -0.3, 0, 0.3, 0.8\}$ to allow for weak and strong spatial interactions.

We consider two specifications for the innovation vector $\varepsilon_n$. To generate the heteroskedastic errors, we follow Lin and Lee 2010 and consider a small group interactions structure for the spatial weight matrix (a block-diagonal weight matrix). For each sample size, we generate random groups where the size of each group is drawn from a Uniform(3, 20) distribution.³⁷ For each group, if the group size is greater than 10, we set the variance equal to the group size; otherwise we set the variance to the square of the inverse of the group size. Lin and Lee 2010 also consider creating heteroskedastic errors by simply letting the variances equal the inverse of the group sizes. We do not consider the latter case in our experiments. This small group interaction scenario is similar to the one in the Monte Carlo design of Kelejian and Prucha 2007, where they focus on a circular world in which the first and the last one third of the sample observations have 5 neighbors in front and 5 in back, while the middle third only has 1 neighbor in front and 1 in back. Figure 1.1 illustrates the weight matrices and variance processes for a sample of $n = 100$. As the figure shows, the Lin and Lee 2010 small group interactions set-up yields a richer design for heteroskedasticity.
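The group-interaction design described above can be sketched in a few lines. This is a hedged illustration, not the Matlab routine used for the experiments: the function names are hypothetical, the seed is arbitrary, and the way the last group absorbs the remainder so that the sizes sum exactly to $n$ is an assumption of this sketch (it can make the last group slightly larger than 20).

```python
import numpy as np

def draw_group_sizes(n, rng, low=3, high=20):
    """Draw integer group sizes from Uniform{low, ..., high} until they sum to n.

    Assumption of this sketch: if the remainder would be smaller than `low`,
    it is absorbed into the current (last) group.
    """
    sizes, remaining = [], n
    while remaining > 0:
        m = int(rng.integers(low, high + 1))
        if remaining - m < low:
            m = remaining
        sizes.append(m)
        remaining -= m
    return sizes

def group_interaction_design(n, rng):
    """Row-normalized block-diagonal weight matrix and group-based variances."""
    sizes = draw_group_sizes(n, rng)
    W = np.zeros((n, n))
    sigma2 = np.empty(n)
    start = 0
    for m in sizes:
        stop = start + m
        # Each unit is equally affected by the other members of its group.
        W[start:stop, start:stop] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
        # Variance rule: group size if it exceeds 10, else squared inverse size.
        sigma2[start:stop] = float(m) if m > 10 else 1.0 / m**2
        start = stop
    return W, sigma2, sizes

rng = np.random.default_rng(2015)
n = 100
W, sigma2, sizes = group_interaction_design(n, rng)
eps = np.sqrt(sigma2) * rng.standard_normal(n)   # eps_i = sigma_i * xi_i
```

The homoskedastic benchmark corresponds to replacing `sigma2` with a vector of ones.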
37 The weight matrices are row normalized.

We let the $i$-th element of the innovation vector $\varepsilon_n$ be $\varepsilon_{in} = \sigma_{in}\xi_{in}$, where $\sigma_{in}$ is the standard error for the $i$-th observation and the $\xi_{in}$s are i.i.d. Normal(0, 1). Also, in order to evaluate

each estimator's relative performance under heteroskedasticity, we consider a corresponding homoskedastic case in which the disturbances $\varepsilon_{in}$ are i.i.d. Normal(0, $\sigma_0^2$), where $\sigma_0^2 = 1$.

[Figure 1.1: Weight Matrices. Panels: (a) Small Group Interactions Scenario, (b) Circular World, (c) Variance processes.]

In all experiments, the following estimators are considered:
(i) Gaussian maximum likelihood estimator (MLE),
(ii) Generalized spatial two-stage least squares estimator (GS2SLSE) in Kelejian and Prucha 1998,³⁸
(iii) Best two-stage least squares estimator (B2SLSE) in Lee 2003 with IV set $Q_n = [\hat R_n\hat G_nX_n\hat\beta_n,\ \hat R_nX_n]$ based on initial estimates from the GS2SLSE, where $\rho_0$ is estimated by the MOM in Kelejian and Prucha 1998,
(iv) Best generalized method of moments estimator (BGMME) in Liu et al. 2010 based on initial estimates from the GS2SLSE,
(v) Robust generalized spatial two-stage least squares estimator (RGS2SLSE) in Kelejian and Prucha 2010,³⁹
(vi) Robust best two-stage least squares estimator (BR2SLSE) in Lee 2003 with IV set $Q_n = [\hat R_n\hat G_nX_n\hat\beta_n,\ \hat R_nX_n]$ based on initial estimates from the RGS2SLSE, where $\rho_0$ is estimated by the GMME in Kelejian and Prucha 2010,
(vii) Robust generalized method of moments estimator of Proposition 1.3 (RGMME 1) based on the initial estimates from the RGS2SLSE, and
(viii) Robust generalized method of moments estimator of Proposition 1.3 (RGMME 2) based on the initial estimates from the BR2SLSE.⁴⁰

1.5.2 Simulation Results

In Tables 1.1-1.7, the empirical mean (Mean), the bias (Bias), the empirical standard error (SD) and the root mean square error (RMSE) are reported for the estimates of each parameter. We do not present the results for all 8 estimators and all 25 combinations of $(\lambda_0, \rho_0)$ due to space limitations.⁴¹ The focus will be on the MLE, the RGMME 1, and the RGMME 2. In each table for these three estimators, the results for the homoskedastic and heteroskedastic disturbances are presented next to each other for easy comparison.

38 In Kelejian and Prucha 1998, $\rho_0$ is estimated by the method of moments, and $\delta_0 = (\beta_0', \lambda_0)'$ by the GS2SLSE. For short, we call both estimators simply the GS2SLSE.
39 In Kelejian and Prucha 2010, $\rho_0$ is estimated by the GMM method and $\delta_0 = (\beta_0', \lambda_0)'$ by the GS2SLSE. For short, we call both estimators simply the RGS2SLSE.
40 Matlab routines for estimation are available on request.
41 The experiments that are not presented here are available on request.

Before we evaluate the results in each table, a couple of general points need to be stated. First, as Arraiz et al. 2010 and Das et al. point out, if $\lambda_0$ is large in absolute value, it results in larger variances of the elements of the disturbance vector, which deteriorates the estimation precision. Yet, at the same time, the variation in the explanatory variable $W_nY_n$

is also larger, which tends to improve the estimation precision. The net of these opposing effects determines the estimation precision. A similar argument applies to the magnitude, in absolute value, of $\rho_0$. Second, as expected, regardless of the variance structure of the disturbances, all estimators improve in terms of bias, SD, and RMSE as the sample size increases. Third, for all sample sizes and non-zero combinations of $(\lambda_0, \rho_0)$, the MLE under heteroskedasticity is inconsistent and imposes severe bias on all parameters. Fourth, the results for the sample size of 100 for all estimators ought to be interpreted with caution. We emphasize that the sample size of 100 is intentionally chosen to observe the behavior of the estimators when the sample size is extremely small.⁴²

Table 1.1 presents the results for the specification which employs a strong spatial dependence in the dependent variable and a weak spatial dependence in the disturbances. For $n = 100$, the MLE performs poorly even when the disturbances are homoskedastic. It imposes significant bias on all parameters with much higher SDs, and thus much higher RMSEs. On the other hand, both RGMMEs impose a considerably smaller bias on both $\lambda_0$ and $\rho_0$ relative to the MLE, and almost no bias on $\beta_0$, under both homoskedasticity and heteroskedasticity. As the sample size increases, the RGMMEs improve faster than the MLE in terms of bias and SD. For $n = 500$ and $n = 1000$, the RGMMEs under heteroskedasticity impose trivial bias on all parameters. The MLE imposes trivial bias on all parameters under homoskedasticity only for $n = 1000$. Although the negative bias on both $\lambda$ and $\rho$ is smaller now, it is still significant under heteroskedasticity. Also, the RGMMEs are as efficient as the MLE for $n = 1000$ under homoskedasticity.

Table 1.2 presents the results for the specification which employs a weak spatial dependence both in the dependent variable and in the disturbances. We have similar findings in terms of biases, but all estimators are more precise compared to Table 1.1, which confirms the first general point stated above.
42 Lin and Lee 2010 employ numbers of groups of 100 and 200, where the size of each group is drawn from Uniform(3, 20) and rounded to the closest integer. This set-up yields two intervals from which the sample size is drawn: [300, 2000] and [600, 4000]. Arraiz et al. 2010 choose sample sizes of 486 and

Under heteroskedasticity, the MLE imposes significant bias on both $\lambda_0$ and $\rho_0$ regardless of the sample size but, surprisingly, not on $\beta_0$ for $n = 500$ and $n = 1000$. As expected, the RGMMEs perform better under heteroskedasticity for all sample sizes. For $n = 500$ and $n = 1000$ under homoskedasticity, all estimators impose small, trivial bias on both $\lambda_0$ and $\rho_0$, and the

RGMMEs are as efficient as the MLE.

Table 1.3 presents the results for the specification in which there is no spatial dependence in either the dependent variable or the disturbances. For this case, our large-sample result for the MLE suggests that the necessary condition for the consistency of the autoregressive parameters is not violated. That is, the case of $\rho_0 = \lambda_0 = 0$ implies $\frac{1}{n}\frac{\partial \ln L_n(\delta_0)}{\partial \delta} = o_p(1)$ even if the innovation terms are heteroskedastic. This observation also suggests that the MLE of the parameters of the exogenous variables is consistent. For $n = 100$, the MLE has significant biases, but as $n$ increases both the biases and the RMSEs decrease significantly. This pattern is consistent with the aforementioned large-sample results. For $n = 1000$ under heteroskedasticity, the MLE imposes trivial bias on all parameters and is as precise as the RGMMEs except for $\rho_0$. The RGMMEs perform as well as the MLE under homoskedasticity for all sample sizes.

Table 1.4 presents the results for the specification that employs a weak spatial dependence in the dependent variable and a strong spatial dependence in the disturbances. For $n = 100$, the RGMMEs perform better than the MLE, especially under heteroskedasticity, as expected. The MLE with heteroskedastic disturbances seems to be inconsistent and imposes significant bias on all parameters with larger SDs and RMSEs, and it does not improve in the larger samples. Under homoskedasticity, all estimators impose trivial biases in the larger samples and the RGMMEs are as efficient as the MLE.

Table 1.5 presents the results for the specification that employs a strong spatial dependence in the dependent variable and a weak spatial dependence in the disturbance terms. For all sample sizes under homoskedasticity, all estimators result in trivial bias on all parameters. Under heteroskedasticity, the MLE imposes significant bias on all parameters, especially on $\rho_0$, for $n = 100$. However, as the sample size increases the MLE improves and, surprisingly, imposes only a small bias on all parameters.
The RGMMEs are robust to heteroskedasticity and do not seem to be affected by the sample size. Under homoskedasticity, the RGMMEs are again as efficient as the MLE. Overall, the significant biases and high RMSEs of the MLE in the heteroskedastic case suggest inconsistency. The results in Tables 1.1-1.5 indicate that the relative size of the bias and the RMSE of the MLE depends on the true values of the autoregressive parameters.

Since the concentrated log-likelihood function is nonlinear in these parameters, it is hard to draw any general conclusion about their asymptotic biases. However, as seen in Tables 1.1-1.5, when the true values of the autoregressive parameters are large, the MLE imposes larger biases in general. As a result, in Table 1.1 and Table 1.5, the biases and RMSEs of the MLE of $\lambda_0$ are higher than those of the MLE of $\rho_0$, and the ordering is reversed in Table 1.4. The results in Tables 1.1-1.5 also indicate that, under heteroskedasticity, the biases of the MLE of $\beta_{10}$, $\beta_{20}$ and $\beta_{30}$ are generally higher when the biases of the autoregressive parameters are higher.

We compare the performance of the RGMME with other estimators suggested in the literature. We only present the estimation results for $n = 500$ and $n = 1000$ and for only two combinations of $(\lambda_0, \rho_0)$ due to space limitations. Tables 1.6 and 1.7 present the estimation results for both homoskedastic and heteroskedastic cases. For the homoskedastic case, all estimators perform better and are almost unbiased. The estimation results for the GS2SLSE and the B2SLSE indicate small biases for all parameters under homoskedasticity in both tables. In particular, the RMSEs of the GS2SLSE and the B2SLSE for $\lambda_0$, $\beta_{10}$, $\beta_{20}$ and $\beta_{30}$ are almost identical, which suggests that the set of linear instruments suggested in Kelejian and Prucha (1998) provides a reasonable approximation to the optimal instruments. The same pattern repeats itself in the comparison of the RGS2SLSE and the RB2SLSE for the biases and RMSEs under homoskedasticity. These results confirm the conclusion that the efficiency gain based on the set of optimal instruments in Lee (2003) is limited (Das et al.). As expected, the GS2SLSE and the B2SLSE impose significant biases on $\rho_0$ under heteroskedastic cases. As stated, the GS2SLSE of $\rho_0$ is inconsistent, as the orthogonality conditions of the moments in Kelejian and Prucha (1998) do not hold in the presence of heteroskedastic disturbances.
The results in Tables 1.6 and 1.7 for the GS2SLSE and the B2SLSE of $\rho_0$ are consistent with this asymptotic argument. The estimation approach in Kelejian and Prucha (1998) is a two-step GMM estimation method, as a result of which the inconsistency of the estimator of $\rho_0$ affects the estimates of the other parameters. Overall, the estimation results in Tables 1.6 and 1.7 indicate that the GS2SLSE, the MLE, the B2SLSE and the BGMME are inconsistent under heteroskedasticity. Finally, the

performance of the RGS2SLSE and the RB2SLSE is compared with that of the RGMME under heteroskedasticity. In Table 1.6, both the RGS2SLSE and the RB2SLSE impose significant biases on the autoregressive parameters. In Table 1.7, both the RGS2SLSE and the RB2SLSE impose small biases on the autoregressive parameters and they are as efficient as the RGMME. In general, our Monte Carlo results are consistent with our analytical large-sample results: the RGMME of Proposition 1.3 is consistent, and the MLE of the spatial autoregressive parameters is generally inconsistent when there is heteroskedasticity in the model.

1.6 Conclusion

Heteroskedasticity of unknown form has important consequences for the estimation of spatial econometric models. Asymptotic properties of estimators for spatial models are significantly affected in the presence of unknown heteroskedasticity. Therefore, heteroskedasticity should be accounted for in the design of any estimation approach. If heteroskedasticity is not accounted for in estimation, the ML estimator for spatial autoregressive models, including the SAR model, the Kelejian-Prucha model, and the SEM, is generally inconsistent. We show that the probability limit of the derivative of the concentrated log-likelihood function evaluated at the true parameter vector is generally not zero for the spatial autoregressive models; that is, the concentrated log-likelihood function is not maximized at the true parameter vector. This necessary condition for the consistency of the ML estimator of the autoregressive parameters depends on the structure of the spatial weight matrices. We also show that the ML estimator of the parameters of the exogenous variables in the SAR and the Kelejian-Prucha model is inconsistent, and we state the expressions of the corresponding asymptotic biases. For the SEM, we show that the MLE of the parameters of the exogenous variables is consistent, despite the fact that the MLE of the autoregressive parameter of the spatial lag of the disturbance terms is inconsistent.
Thus, besides its computational burden, the consistency of ML estimators for the autoregressive spatial models is not ensured in the presence of an unknown form of heteroskedasticity.

In the GMM estimation framework, heteroskedasticity of unknown form can be incorporated into estimation through the formation of the moment functions. We extend the robust GMM

approach in Lin and Lee (2010) to the spatial model that has a spatial lag in both the dependent variable and the disturbance term. For the GMM estimator, the quadratic moment matrices are constructed from the spatial weight matrices in such a way that the orthogonality conditions of the quadratic moment functions are not violated under the unknown form of heteroskedasticity. These quadratic moment functions are combined with the linear moment function for the GMM estimation. In particular, we show that the robust GMME is consistent and has a properly centered asymptotic normal distribution.

The small sample properties of the RGMM estimator, along with the ML and other estimators, are studied. In general, our Monte Carlo results are consistent with our analytical large-sample results, namely that the RGMME of Proposition 1.3 is consistent, and the ML estimator of the autoregressive parameters of spatial models is generally inconsistent when there is an unknown form of heteroskedasticity in the model. The RGMME of Proposition 1.3 has desirable finite sample properties under both heteroskedasticity and homoskedasticity. As the Monte Carlo experiments clearly indicate, researchers ought to be careful in interpreting the estimation results if the sample size is smaller than 500 and heteroskedastic errors might be present. The MLE clearly performs very poorly under these circumstances regardless of heteroskedasticity. It is quite convenient for researchers to estimate spatial econometric models with spatial dependence in both the dependent variable and the disturbance term via the ML method, as the spatial econometrics toolbox by James LeSage provides the routine. However, in our opinion, a more rigorous approach is (i) to estimate the model with the RGMMEs for sample sizes less than 500, and (ii) to estimate the model with the RGMMEs along with the MLE for larger samples to compare the parameter estimates.
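The role of the zero-diagonal quadratic moment matrices described above can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the line-shaped neighbor structure, the value of `lam`, and the variances in `sigma2` are hypothetical choices made for illustration and are not taken from the Monte Carlo design of this chapter. It uses the fact that $E(\varepsilon_n' P_n \varepsilon_n) = \mathrm{tr}(\Sigma_n P_n)$ for independent, mean-zero innovations, and compares a matrix with zero diagonal to one that only has zero trace.

```python
import numpy as np

def path_weights(n):
    """Row-normalized weights for n units on a line (neighbors: i-1 and i+1).
    Hypothetical neighbor structure, chosen only for illustration."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                W[i, j] = 1.0
    return W / W.sum(axis=1, keepdims=True)

n, lam = 6, 0.4                              # illustrative values (assumptions)
W = path_weights(n)
G = W @ np.linalg.inv(np.eye(n) - lam * W)   # G = W S^{-1}

# Hypothetical variances that grow with the number of neighbors.
sigma2 = np.array([1.0, 2.0, 2.0, 2.0, 2.0, 1.0])
Sigma = np.diag(sigma2)

P_robust = G - np.diag(np.diag(G))           # zero diagonal
P_naive = G - (np.trace(G) / n) * np.eye(n)  # zero trace only

# E[eps' P eps] = tr(Sigma P) for independent, mean-zero eps.
robust_mean = np.trace(Sigma @ P_robust)     # zero for any diagonal Sigma
naive_mean = np.trace(Sigma @ P_naive)       # non-zero under this heteroskedasticity
homo_mean = np.trace(np.eye(n) @ P_naive)    # zero when all variances are equal

print(robust_mean, naive_mean, homo_mean)
```

In this sketch the zero-trace matrix gives a valid moment condition only when the variances are equal, whereas the zero-diagonal matrix keeps the expectation at zero for any diagonal covariance matrix.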

1.7 Appendix

1.7.1 Some Useful Lemmas

Lemma 1.1. Let $A \in \mathbb{R}^{m \times n}$ be a matrix of coefficients and $x \in \mathbb{R}^{n}$ be a vector of unknowns. Consider the homogeneous equation $Ax = 0$. Then,
(1) There always exists a solution of $Ax = 0$.
(2) If $\mathrm{rank}(A) < n$, then infinitely many solutions exist.
(3) $Ax = 0$ has only the trivial solution $x = 0$ if and only if $\mathrm{rank}(A) = n$.

Proof. (1) Obviously, $x = 0$ satisfies the equation. This solution is the trivial one.
(2) First, we show that there exists a non-trivial solution when $\mathrm{rank}(A) < n$. Let $a_1, a_2, \dots, a_n$ be the column vectors of $A$. If $\mathrm{rank}(A) < n$, the columns of $A$ are linearly dependent. There exist real numbers $x_1, x_2, \dots, x_n$, not all zero, such that $a_1 x_1 + a_2 x_2 + \dots + a_n x_n = 0$. This implies that $x = (x_1, x_2, \dots, x_n)'$ satisfies $Ax = 0$. Therefore, there exists a non-trivial solution $x$. Second, we show that there exist infinitely many solutions when $\mathrm{rank}(A) < n$. Let $x \neq 0$ be the non-trivial solution; then $cx$ is also a solution for any $c \in \mathbb{R}$.
(3) Assume that $\mathrm{rank}(A) = n$. Consider the column space and the null space (or kernel) of $A$. In set notation, $\mathrm{col}(A) = \{y \in \mathbb{R}^m : y = Ax \text{ for some } x \in \mathbb{R}^n\}$ and $\ker(A) = \{x \in \mathbb{R}^n : Ax = 0\}$. It can be shown that the sum of the dimension of $\mathrm{col}(A)$ and the dimension of $\ker(A)$ is $n$.$^{43}$ That is, $\dim \mathrm{col}(A) + \dim \ker(A) = n$. Since $\dim \mathrm{col}(A) = \mathrm{rank}(A)$, the dimension of the kernel is $n - \mathrm{rank}(A)$. When $\mathrm{rank}(A) = n$, the dimension of the kernel is zero. This means that no non-zero vector satisfies $Ax = 0$. Thus, $x = 0$ is the only element of $\ker(A)$. On the other hand, if $x = 0$ is the only solution of $Ax = 0$, then the columns of $A$ are linearly independent, which implies $\mathrm{rank}(A) = n$.

Lemma 1.2. Let $A_n$, $B_n$ and $C_n$ be $n \times n$ matrices with $ij$th elements respectively denoted by $a_{n,ij}$, $b_{n,ij}$ and $c_{n,ij}$. Assume that $A_n$ and $B_n$ have zero diagonal elements, and $C_n$ has

$^{43}$ For a proof, see Exercise 4.4 in Abadir and Magnus (2005), page

50 uiformly bouded row ad colum sums i absolute value. Let q be vector with uiformly bouded elemets i absolute value. Assume that ε satisfies Assumptio. with covariace matrix deoted by Σ =Dσ 2,..., σ2. The, Eε A ε ε B ε = a,ij b,ij + b,ji σiσ 2 j 2 i= j= 2 Eε C ε 2 = = 3 varε C ε = = i= i= i= i= c 2,ii = tr Σ A B Σ + Σ B where Σ = Dσ, 2..., σ. 2 [ Eε 4 i 3σi] 4 + c,ii σi c,ij c,ij + c,ji σiσ 2 j 2 i= i= j= c 2 [,ii Eε 4 i 3σi 4 ] + tr 2 Σ C + trσ C C Σ + Σ C Σ C, c 2 [,ii Eε 4 i 3σi] 4 + i= j= c,ij c,ij + c,ji σiσ 2 j 2 c 2 [,ii Eε 4 i 3σi 4 ] + trσ C C Σ + Σ C Σ C. 4 Eε C ε = O, varε C ε = O, ε C ε = O p. 5 EC ε = 0, varc ε = O, C ε = O p, varq C ε = O, q C ε = O p. Proof. For, 2 ad 3 see Li ad Lee 200. For the rest of proof, let c, c 2, c 3, c 4 ad m be positive costat real umbers. 4 Eε C ε = trc Eε ε = trc Σ = i= c,iiσi 2. By hypothesis ad Assumptio., c,ii s ad σi 2 s are uiformly bouded. The, Eε C ε = O. The order of varε C ε ca be obtaied from 3. The first term i 3 is [ i= c2,ii Eε 4 i 3σi 4 ] = O, sice σ 2 i, Eε4 i ad c,ii are uiformly bouded i. The secod term i 3 is trσ C C Σ + Σ C Σ C = O, sice Σ C C Σ ad Σ C Σ C are uiformly bouded i both row ad colum sums. Thus, varε C ε = O. The ext result ca be obtaied by Markov s iequality: P ε C ε > m m Eε C ε = mo ad hece, ε C ε = O p. 5 By Assumptio., EC ε = 0 ad varc ε = C Σ C = i= σ2 i c,ic,i, where c,i is the ith colum of C. By hypothesis ad Assumptio., there are costats c ad c 2 such that σ 2 i c ad c,i c 2 i,. Hece, i= σ2 i c,ic,i i= σ2 i 37

51 c,i 2 c c 2 2 = O. The ext result follows from Chebyshev s iequality: P C ε EC ε > m m 2 varc ε = m 2 O. Hece, C ε = O p. Next, varq C ε = q C Σ C q = i= σ2 i k2 i, where k i is the ith elemet of q C. By hypothesis, there are costats c 3 ad c 4 such that q i c 3 ad j= c,ji c 4 i,. The, k i = j= q jc,ji c 3 c 4 i,. Thus, varq C ε = i= σ2 i k2 i c c 3 c 4 2 = O. The last result follows from Chebyshev s iequality: P q C ε Eq C ε > m m 2 varq C ε = m 2 O ad hece, q C ε = O p. Lemma.3. Assume that k matrix X has uiformly bouded elemets i absolute value, ad lim X R R X exits ad is osigular. Let M be give by I P, where P = R X X R R X X R. covariace matrix deoted by Σ =Dσ 2,..., σ2. The, Assume that ε satisfies Assumptio. with M ad P are uiformly bouded i absolute value i both row ad colum sums. 2 varp ε = O, P ε = o p, varε P ε = O, ε P ε = O p. 3 Elemets of P are O. Proof. Let K = X R R X. By hypothesis, K has fiite limit so that there exist costat c such that k,ij c for all i, j ad, where k,ij is the i, jth elemet of K. Let X = R X. Elemets of X are uiformly bouded sice both X ad R are uiformly bouded i absolute value i row ad colum sums. Deote i,jth elemet of X by x.ij, the there exists a costat c 2 such that x,ij c 2 for all i, j ad. The, P = X X R R X X = X K X = k s= k r= k,srx,s x,r, where x,r ad x,s are respectively the rth ad the sth colums of X. Deote i,jth elemet of j= k k s= r= k,rsx,is x,jr k2 c c 2 2 for all i ad. P by p,ij, the j= p,ij = Likewise, i= p,ij = i= k k s= r= k,rsx,is x,jr k2 c c 2 2 for all j ad. These results show that P is uiformly bouded i absolute value i row ad colum sums, which implies that M = I P is also uiformly bouded i absolute value i row ad colum sums. 2 These results directly follow from Lemma.2 4 ad Lemma The i,jth elemet of P is p,ij = k k s= r= k,rsx,is x,jr k2 c c 2 2 = O. 38
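The double-sum identity in Lemma 1.2(1), $E(\varepsilon_n' A_n \varepsilon_n \cdot \varepsilon_n' B_n \varepsilon_n) = \sum_{i}\sum_{j} a_{n,ij}(b_{n,ij} + b_{n,ji})\sigma_{ni}^2\sigma_{nj}^2$ for zero-diagonal $A_n$ and $B_n$, can be checked exactly on a small example. The sketch below is an illustration under assumptions that are not part of the text: the innovations take the values $\pm\sigma_{ni}$ with equal probability (independent, mean zero, variance $\sigma_{ni}^2$), so the expectation can be computed by enumerating all sign patterns, and the matrices and variances are arbitrary illustrative choices.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Arbitrary zero-diagonal matrices (illustrative, not from the text).
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, n))
np.fill_diagonal(A, 0.0)
np.fill_diagonal(B, 0.0)

sigma2 = np.array([0.5, 1.0, 2.0, 4.0])   # hypothetical heteroskedastic variances
sd = np.sqrt(sigma2)

# Exact expectation: average over all 2^n equally likely sign patterns.
lhs = 0.0
for signs in itertools.product((-1.0, 1.0), repeat=n):
    eps = sd * np.array(signs)
    lhs += (eps @ A @ eps) * (eps @ B @ eps)
lhs /= 2 ** n

# Double sum: sum_i sum_j a_ij (b_ij + b_ji) sigma_i^2 sigma_j^2.
rhs = np.sum(A * (B + B.T) * np.outer(sigma2, sigma2))

print(lhs, rhs)
```

Because both matrices have zero diagonals, fourth moments of the innovations never enter the calculation, which is why the two-point distribution used here is enough to check the identity.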

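1.7.2 The Inconsistency of the ML Estimator

The first order conditions of the log-likelihood function with respect to $\beta$ and $\sigma^2$ are
\[
\frac{\partial \ln L_n(\zeta)}{\partial \beta} = \frac{1}{\sigma^2}\, X_n' R_n'(\rho) R_n(\rho)\left[ S_n(\lambda) Y_n - X_n \beta \right], \tag{1.29a}
\]
\[
\frac{\partial \ln L_n(\zeta)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\, \varepsilon_n'(\theta)\varepsilon_n(\theta). \tag{1.29b}
\]
The solutions of the above first order conditions yield the ML estimators for $\beta$ and $\sigma^2$:
\[
\hat\beta_n(\delta) = \left[ \bar X_n'(\rho) \bar X_n(\rho) \right]^{-1} \bar X_n'(\rho) R_n(\rho) S_n(\lambda) Y_n, \tag{1.30a}
\]
\[
\hat\sigma_n^2(\delta) = \frac{1}{n}\, \varepsilon_n'(\theta)\varepsilon_n(\theta), \tag{1.30b}
\]
where $\bar X_n(\rho) = R_n(\rho) X_n$ and $\varepsilon_n(\theta) = R_n(\rho)\left[ S_n(\lambda) Y_n - X_n \hat\beta_n(\delta) \right]$. Explicitly, the MLE $\hat\sigma_n^2(\delta)$ can be written as
\[
\hat\sigma_n^2(\delta) = \frac{1}{n}\, Y_n' S_n'(\lambda) R_n'(\rho) M_n(\rho) R_n(\rho) S_n(\lambda) Y_n, \tag{1.31}
\]
where $M_n(\rho) = I_n - P_n(\rho)$ with $P_n(\rho) = \bar X_n(\rho)\left[ \bar X_n'(\rho) \bar X_n(\rho) \right]^{-1} \bar X_n'(\rho)$ is a projection-type matrix. Note that $\bar X_n'(\rho) M_n(\rho) = 0_{k \times n}$ and $M_n(\rho) \bar X_n(\rho) = 0_{n \times k}$. Substitution of $R_n(\rho) S_n(\lambda) Y_n = R_n(\rho) X_n \beta + \varepsilon_n$ into $\hat\sigma_n^2(\delta)$ yields $\hat\sigma_n^2(\delta) = \frac{1}{n}\varepsilon_n' M_n(\rho) \varepsilon_n$. At $\delta_0$, with $\bar X_n = R_n X_n$, the probability limit of $\hat\sigma_n^2(\delta_0)$ is
\[
\operatorname{plim} \hat\sigma_n^2(\delta_0) = \operatorname{plim} \frac{1}{n}\varepsilon_n'\varepsilon_n - \operatorname{plim} \frac{1}{n^2}\, \varepsilon_n' \bar X_n \left[ \frac{1}{n} \bar X_n' \bar X_n \right]^{-1} \bar X_n' \varepsilon_n. \tag{1.32}
\]
For the first term on the right hand side, we have $\frac{1}{n}\varepsilon_n'\varepsilon_n = \frac{1}{n}\sum_{i=1}^{n} \sigma_{ni}^2 + o_p(1)$ by the Chebyshev weak law of large numbers. The second term vanishes by virtue of Lemma 1.2(4) and Lemma 1.3. Therefore, we have
\[
\hat\sigma_n^2(\delta_0) = \frac{1}{n}\sum_{i=1}^{n} \sigma_{ni}^2 + o_p(1). \tag{1.33}
\]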
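The result in (1.33) says that the ML variance estimator evaluated at the true autoregressive parameters tracks the average of the individual variances. The following simulation is a rough sketch of that statement under assumptions of my own: the regressors (standing in for $R_n X_n$), the variances, the normal innovations and the sample size are all hypothetical choices, and the tolerance in the final comparison is informal.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 20000, 3                      # illustrative sizes (assumptions)

# Xbar stands in for R_n X_n; bounded regressors chosen for illustration.
Xbar = rng.uniform(-1.0, 1.0, size=(n, k))

# Hypothetical heteroskedastic variances and normal innovations.
sigma2 = rng.uniform(0.5, 2.5, size=n)
eps = rng.normal(size=n) * np.sqrt(sigma2)

# M eps = eps - Xbar (Xbar'Xbar)^{-1} Xbar' eps, computed without forming M.
coef, *_ = np.linalg.lstsq(Xbar, eps, rcond=None)
resid = eps - Xbar @ coef

sigma2_hat = resid @ resid / n       # (1/n) eps' M eps
avg_sigma2 = sigma2.mean()           # (1/n) sum_i sigma_i^2

print(sigma2_hat, avg_sigma2)
```

The projection is applied through a least-squares solve so that the $n \times n$ matrix $M_n$ never has to be stored.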

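Concentrating out $\beta$ and $\sigma^2$ from the log-likelihood function yields
\[
\ln L_n(\delta) = -\frac{n}{2}\left( \ln(2\pi) + 1 \right) - \frac{n}{2} \ln \hat\sigma_n^2(\delta) + \ln |S_n(\lambda)| + \ln |R_n(\rho)|. \tag{1.34}
\]
For the first order conditions with respect to $\rho$ and $\lambda$, the partial derivatives $\frac{\partial \hat\sigma_n^2(\delta)}{\partial \rho}$ and $\frac{\partial \hat\sigma_n^2(\delta)}{\partial \lambda}$ are required. These terms are given by
\[
\frac{\partial \hat\sigma_n^2(\delta)}{\partial \rho} = -\frac{2}{n}\left[ Y_n' S_n'(\lambda) R_n'(\rho) M_n(\rho) M_n S_n(\lambda) Y_n - Y_n' S_n'(\lambda) R_n'(\rho) P_n(\rho) H_n'(\rho) M_n(\rho) R_n(\rho) S_n(\lambda) Y_n \right], \tag{1.35}
\]
and
\[
\frac{\partial \hat\sigma_n^2(\delta)}{\partial \lambda} = -\frac{2}{n}\, Y_n' S_n'(\lambda) R_n'(\rho) M_n(\rho) R_n(\rho) W_n Y_n. \tag{1.36}
\]
The first order conditions of the concentrated log-likelihood function with respect to $\rho$ and $\lambda$ are given by
\[
\frac{\partial \ln L_n(\delta)}{\partial \rho} = -\frac{n}{2\hat\sigma_n^2(\delta)} \frac{\partial \hat\sigma_n^2(\delta)}{\partial \rho} - \mathrm{tr}\left( H_n(\rho) \right), \tag{1.37a}
\]
\[
\frac{\partial \ln L_n(\delta)}{\partial \lambda} = -\frac{n}{2\hat\sigma_n^2(\delta)} \frac{\partial \hat\sigma_n^2(\delta)}{\partial \lambda} - \mathrm{tr}\left( G_n(\lambda) \right), \tag{1.37b}
\]
where $G_n(\lambda) = W_n S_n^{-1}(\lambda)$ and $H_n(\rho) = M_n R_n^{-1}(\rho)$. Here $M_n$ without an argument is the spatial weight matrix of the disturbance term, while $M_n(\rho)$ is the projection-type matrix in (1.31). For the consistency of the ML estimators $\hat\lambda_n$ and $\hat\rho_n$, the necessary condition is $\operatorname{plim} \frac{1}{n}\frac{\partial \ln L_n(\delta_0)}{\partial \delta} = 0$. More explicitly, the probability limit of the following equation must be zero:
\[
\frac{1}{n}\frac{\partial \ln L_n(\delta_0)}{\partial \delta} =
\begin{pmatrix}
-\dfrac{n}{2\,\varepsilon_n' M_n(\rho_0) \varepsilon_n} \dfrac{\partial \hat\sigma_n^2(\delta_0)}{\partial \rho} - \dfrac{1}{n}\mathrm{tr}(H_n) \\[2ex]
-\dfrac{n}{2\,\varepsilon_n' M_n(\rho_0) \varepsilon_n} \dfrac{\partial \hat\sigma_n^2(\delta_0)}{\partial \lambda} - \dfrac{1}{n}\mathrm{tr}(G_n)
\end{pmatrix}, \tag{1.38}
\]
where $H_n = H_n(\rho_0)$ and $G_n = G_n(\lambda_0)$. The probability limits of $\frac{\partial \hat\sigma_n^2(\delta_0)}{\partial \rho}$ and $\frac{\partial \hat\sigma_n^2(\delta_0)}{\partial \lambda}$ are required for the above equation; they can be found by using the derivative expressions in (1.35) and (1.36).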

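The probability limit of the first term in the first row of (1.38) can be written as
\[
\operatorname{plim}\left( -\frac{n}{2\,\varepsilon_n' M_n(\rho_0) \varepsilon_n} \frac{\partial \hat\sigma_n^2(\delta_0)}{\partial \rho} \right)
= \operatorname{plim} \frac{\frac{1}{n} Y_n' S_n' R_n' M_n(\rho_0) M_n S_n Y_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
- \operatorname{plim} \frac{\frac{1}{n} Y_n' S_n' R_n' P_n(\rho_0) H_n' M_n(\rho_0) R_n S_n Y_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}. \tag{1.39}
\]
Each term is handled separately below by using the equalities $R_n S_n Y_n = R_n X_n \beta_0 + \varepsilon_n$ and $S_n Y_n = X_n \beta_0 + R_n^{-1} \varepsilon_n$. Note that $\bar X_n' M_n(\rho_0) = 0_{k \times n}$ and $M_n(\rho_0) \bar X_n = 0_{n \times k}$, where $\bar X_n = R_n X_n$. The first term on the r.h.s. of (1.39) can be written as
\[
\operatorname{plim} \frac{\frac{1}{n} Y_n' S_n' R_n' M_n(\rho_0) M_n S_n Y_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
= \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' M_n(\rho_0) M_n X_n \beta_0}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
+ \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' M_n(\rho_0) M_n R_n^{-1} \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}.
\]
Substitution of $M_n(\rho_0) = I_n - \bar X_n \left[ \bar X_n' \bar X_n \right]^{-1} \bar X_n'$ into the last term on the r.h.s. yields
\[
\operatorname{plim} \frac{\frac{1}{n} Y_n' S_n' R_n' M_n(\rho_0) M_n S_n Y_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
= \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' H_n \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
+ \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' M_n(\rho_0) M_n X_n \beta_0}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
- \operatorname{plim} \frac{\frac{1}{n^2} \varepsilon_n' \bar X_n \left[ \frac{1}{n} \bar X_n' \bar X_n \right]^{-1} \bar X_n' M_n R_n^{-1} \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}. \tag{1.40}
\]
By Lemma 1.2(5) and (1.33), the second term on the r.h.s. of (1.40) vanishes. The third term vanishes by Lemma 1.2(4) and (1.33). The probability limit of the first term on the r.h.s. of (1.40) can be found by the Chebyshev inequality. By Lemma 1.2(4), the variance of the term $\frac{1}{n}\varepsilon_n' H_n \varepsilon_n$ is $O(1/n) = o(1)$. Hence, $\operatorname{plim}\left[ \frac{1}{n}\varepsilon_n' H_n \varepsilon_n - E\left( \frac{1}{n}\varepsilon_n' H_n \varepsilon_n \right) \right] = \operatorname{plim}\left[ \frac{1}{n}\varepsilon_n' H_n \varepsilon_n - \frac{1}{n}\sum_{i=1}^{n} H_{n,ii}\sigma_{ni}^2 \right] = 0$. Combining this result with the result in (1.33), we get
\[
\operatorname{plim}\left[ \frac{\frac{1}{n} Y_n' S_n' R_n' M_n(\rho_0) M_n S_n Y_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n} - \frac{\frac{1}{n}\sum_{i=1}^{n} H_{n,ii}\sigma_{ni}^2}{\frac{1}{n}\sum_{i=1}^{n} \sigma_{ni}^2} \right] = 0. \tag{1.41}
\]
Now, we return to the second term on the r.h.s. of (1.39):
\[
\operatorname{plim} \frac{\frac{1}{n} Y_n' S_n' R_n' P_n(\rho_0) H_n' M_n(\rho_0) R_n S_n Y_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
= \operatorname{plim} \frac{\frac{1}{n} \beta_0' \bar X_n' P_n(\rho_0) H_n' M_n(\rho_0) \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
+ \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' \bar X_n \left[ \bar X_n' \bar X_n \right]^{-1} \bar X_n' H_n' M_n(\rho_0) \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}.
\]
The first term on the r.h.s. converges in probability to zero by Lemma 1.2(5) and (1.33). The second term on the r.h.s. converges in probability to zero by virtue of Lemma 1.2(4), (1.33) and Lemma 1.3. Hence,
\[
\operatorname{plim}\left[ -\frac{n}{2\,\varepsilon_n' M_n(\rho_0) \varepsilon_n} \frac{\partial \hat\sigma_n^2(\delta_0)}{\partial \rho} - \frac{\sum_{i=1}^{n} H_{n,ii}\sigma_{ni}^2}{\sum_{i=1}^{n} \sigma_{ni}^2} \right] = 0. \tag{1.42}
\]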

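Now, we evaluate the probability limit of the expression in the second row of (1.38):
\[
\begin{aligned}
\operatorname{plim}\left( -\frac{n}{2\,\varepsilon_n' M_n(\rho_0) \varepsilon_n} \frac{\partial \hat\sigma_n^2(\delta_0)}{\partial \lambda} \right)
&= \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' M_n(\rho_0) R_n G_n R_n^{-1} \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
+ \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' M_n(\rho_0) R_n G_n X_n \beta_0}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n} \\
&\quad + \operatorname{plim} \frac{\frac{1}{n} \beta_0' X_n' R_n' M_n(\rho_0) R_n G_n X_n \beta_0}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
+ \operatorname{plim} \frac{\frac{1}{n} \beta_0' X_n' R_n' M_n(\rho_0) R_n G_n R_n^{-1} \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}.
\end{aligned} \tag{1.43}
\]
The first term on the r.h.s. of (1.43) is handled separately later. First, the third and the fourth terms on the r.h.s. are zero since $X_n' R_n' M_n(\rho_0) = \bar X_n' M_n(\rho_0) = 0_{k \times n}$. The second term on the r.h.s. vanishes in probability, since
\[
\operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' M_n(\rho_0) R_n G_n X_n \beta_0}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
= \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' R_n G_n X_n \beta_0}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
- \operatorname{plim} \frac{\frac{1}{n^2} \varepsilon_n' \bar X_n \left[ \frac{1}{n} \bar X_n' \bar X_n \right]^{-1} \bar X_n' R_n G_n X_n \beta_0}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}. \tag{1.44}
\]
The numerators on the r.h.s. of (1.44) converge in probability to zero by Lemma 1.2(5) and Lemma 1.3, and for the term in the denominator we have $\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n = \frac{1}{n}\sum_{i=1}^{n} \sigma_{ni}^2 + o_p(1)$, as shown in (1.33). The overall result is zero since $\frac{1}{n}\sum_{i=1}^{n} \sigma_{ni}^2$ is uniformly bounded for all $n$ by Assumption 1.1. As for the first term on the r.h.s. of (1.43), let $\bar G_n = R_n G_n R_n^{-1}$. Then,
\[
\operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' M_n(\rho_0) \bar G_n \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
= \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' \bar G_n \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}
- \operatorname{plim} \frac{\frac{1}{n} \varepsilon_n' \bar X_n \left[ \bar X_n' \bar X_n \right]^{-1} \bar X_n' \bar G_n \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n}. \tag{1.45}
\]
By Lemma 1.2(4) and Lemma 1.3, the numerator of the last term on the r.h.s. of (1.45) converges in probability to zero as $n$ goes to infinity. The denominator converges to the uniformly bounded sum in (1.33). Hence, this term vanishes. Now, we can return to the first term on the r.h.s. of (1.45). By Lemma 1.2(4), the variance of $\frac{1}{n}\varepsilon_n' \bar G_n \varepsilon_n$ is $O(1/n) = o(1)$. Then, the Chebyshev inequality implies that $\operatorname{plim}\left[ \frac{1}{n}\varepsilon_n' \bar G_n \varepsilon_n - E\left( \frac{1}{n}\varepsilon_n' \bar G_n \varepsilon_n \right) \right] = \operatorname{plim}\left[ \frac{1}{n}\varepsilon_n' \bar G_n \varepsilon_n - \frac{1}{n}\sum_{i=1}^{n} \bar G_{n,ii}\sigma_{ni}^2 \right] = 0$. Therefore,
\[
\operatorname{plim}\left[ \frac{\frac{1}{n} \varepsilon_n' \bar G_n \varepsilon_n}{\frac{1}{n}\varepsilon_n' M_n(\rho_0) \varepsilon_n} - \frac{\frac{1}{n}\sum_{i=1}^{n} \bar G_{n,ii}\sigma_{ni}^2}{\frac{1}{n}\sum_{i=1}^{n} \sigma_{ni}^2} \right] = 0. \tag{1.46}
\]
These results imply that the probability limit in (1.43) is given by
\[
-\frac{n}{2\,\varepsilon_n' M_n(\rho_0) \varepsilon_n} \frac{\partial \hat\sigma_n^2(\delta_0)}{\partial \lambda} = \frac{\sum_{i=1}^{n} \bar G_{n,ii}\sigma_{ni}^2}{\sum_{i=1}^{n} \sigma_{ni}^2} + o_p(1). \tag{1.47}
\]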

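Combining (1.47) and (1.42), we obtain:
\[
\frac{1}{n}\frac{\partial \ln L_n(\delta_0)}{\partial \delta} =
\begin{pmatrix}
\dfrac{\sum_{i=1}^{n} H_{n,ii}\sigma_{ni}^2}{\sum_{i=1}^{n} \sigma_{ni}^2} - \dfrac{1}{n}\mathrm{tr}(H_n) + o_p(1) \\[2ex]
\dfrac{\sum_{i=1}^{n} \bar G_{n,ii}\sigma_{ni}^2}{\sum_{i=1}^{n} \sigma_{ni}^2} - \dfrac{1}{n}\mathrm{tr}(G_n) + o_p(1)
\end{pmatrix}. \tag{1.48}
\]
The result in (1.9) of the main text follows from (1.48).

The asymptotic bias of the MLE $\hat\beta_n(\delta)$ can be determined from (1.30a). Let $D_n(\rho) = \left[ \bar X_n'(\rho) \bar X_n(\rho) \right]^{-1}$. Then, the MLE $\hat\beta_n(\delta)$ can be written as
\[
\begin{aligned}
\hat\beta_n(\delta) &= \left[ \bar X_n'(\rho) \bar X_n(\rho) \right]^{-1} \bar X_n'(\rho) R_n(\rho) S_n(\lambda) Y_n \\
&= \beta_0 + D_n(\rho) \bar X_n'(\rho) R_n(\rho) R_n^{-1} \varepsilon_n + (\lambda_0 - \lambda) D_n(\rho) \bar X_n'(\rho) R_n(\rho) G_n X_n \beta_0 \\
&\quad + (\lambda_0 - \lambda) D_n(\rho) \bar X_n'(\rho) R_n(\rho) G_n R_n^{-1} \varepsilon_n,
\end{aligned} \tag{1.49}
\]
where we use $S_n(\lambda) = S_n + (\lambda_0 - \lambda) W_n$. Substitution of $R_n(\rho) = R_n + (\rho_0 - \rho) M_n$ into the MLE $\hat\beta_n(\delta)$ yields
\[
\begin{aligned}
\hat\beta_n(\delta) &= \beta_0 + D_n(\rho) \bar X_n' \varepsilon_n + (\rho_0 - \rho) D_n(\rho) X_n' M_n' \varepsilon_n + (\rho_0 - \rho) D_n(\rho) \bar X_n' H_n \varepsilon_n + (\rho_0 - \rho)^2 D_n(\rho) X_n' M_n' H_n \varepsilon_n \\
&\quad + (\lambda_0 - \lambda) D_n(\rho) \bar X_n' \bar G_n \bar X_n \beta_0 + (\lambda_0 - \lambda)(\rho_0 - \rho) D_n(\rho) \bar X_n' H_n^{s} \bar G_n \bar X_n \beta_0 \\
&\quad + (\lambda_0 - \lambda)(\rho_0 - \rho)^2 D_n(\rho) \bar X_n' H_n' H_n \bar G_n \bar X_n \beta_0 \\
&\quad + (\lambda_0 - \lambda) D_n(\rho) \bar X_n' \bar G_n \varepsilon_n + (\lambda_0 - \lambda)(\rho_0 - \rho) D_n(\rho) X_n' M_n' \bar G_n \varepsilon_n \\
&\quad + (\lambda_0 - \lambda)(\rho_0 - \rho) D_n(\rho) \bar X_n' H_n \bar G_n \varepsilon_n + (\lambda_0 - \lambda)(\rho_0 - \rho)^2 D_n(\rho) \bar X_n' H_n' H_n \bar G_n \varepsilon_n,
\end{aligned} \tag{1.50}
\]
where $\bar X_n = R_n X_n$, $\bar G_n = R_n G_n R_n^{-1}$, $H_n = M_n R_n^{-1}$ and $H_n^{s} = H_n + H_n'$. Assumption 1.6 along with Lemma 1.3 implies that $D_n(\rho)$ is uniformly bounded in absolute value in row and column sums. Then, using Lemma 1.2(5), the terms with $\varepsilon_n$ vanish in probability in the MLE $\hat\beta_n(\delta)$. Thus,
\[
\hat\beta_n(\delta) = \beta_0 + (\lambda_0 - \lambda) D_n(\rho) \bar X_n' \bar G_n \bar X_n \beta_0 + (\lambda_0 - \lambda)(\rho_0 - \rho) D_n(\rho) \bar X_n' H_n^{s} \bar G_n \bar X_n \beta_0 + (\lambda_0 - \lambda)(\rho_0 - \rho)^2 D_n(\rho) \bar X_n' H_n' H_n \bar G_n \bar X_n \beta_0 + o_p(1).
\]
The asymptotic bias of $\hat\beta_n(\hat\delta_n)$ follows from the above equation.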
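The leading terms in the two rows of (1.48) can be evaluated directly for a given weight matrix and variance pattern. The sketch below is illustrative only and rests on assumptions not made in the text: a line-shaped neighbor structure, $M_n = W_n$, the values of `lam0` and `rho0`, and variances equal to the number of neighbors. It also checks that each leading term equals the covariance between the diagonal elements and the variances divided by the average variance, which is the form used for the SEM.

```python
import numpy as np

def path_weights(n):
    """Row-normalized weights for n units on a line (hypothetical structure)."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                W[i, j] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def score_limit(D, sigma2):
    """sum_i D_ii sigma_i^2 / sum_i sigma_i^2 - tr(D)/n, the non-vanishing
    part of a row of (1.48) for a given matrix D."""
    d = np.diag(D)
    return d @ sigma2 / sigma2.sum() - d.mean()

n, lam0, rho0 = 8, 0.4, 0.3          # illustrative values (assumptions)
W = path_weights(n)
M = W                                # assumption: same weights for both lags
I = np.eye(n)
S, R = I - lam0 * W, I - rho0 * M
G = W @ np.linalg.inv(S)             # G = W S^{-1}
G_bar = R @ G @ np.linalg.inv(R)     # G_bar = R G R^{-1}
H = M @ np.linalg.inv(R)             # H = M R^{-1}

homo = np.ones(n)                                 # equal variances
hetero = (W > 0).sum(axis=1).astype(float)        # variance = number of neighbors

limits = {
    "rho_homo": score_limit(H, homo),
    "lambda_homo": score_limit(G_bar, homo),
    "rho_hetero": score_limit(H, hetero),
    "lambda_hetero": score_limit(G_bar, hetero),
}

# Equivalent covariance form: cov(D_ii, sigma_i^2) / mean(sigma_i^2).
d = np.diag(H)
cov_form = (np.mean(d * hetero) - d.mean() * hetero.mean()) / hetero.mean()

print(limits, cov_form)
```

Under equal variances both leading terms are zero, so the necessary condition holds; under the assumed heteroskedasticity they are not, which is the mechanism behind the inconsistency of the MLE described in the text.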

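The specification with $\lambda_0 = 0$ in (3.3) is called the spatial error model (SEM). For the SEM, the log-likelihood function simplifies to
\[
\ln L_n(\zeta) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln \sigma^2 + \ln |R_n(\rho)| - \frac{1}{2\sigma^2}\left( Y_n - X_n \beta \right)' R_n'(\rho) R_n(\rho) \left( Y_n - X_n \beta \right),
\]
where $\zeta = (\theta', \sigma^2)'$ with $\theta = (\rho, \beta')'$. The first order conditions yield $\hat\beta_n(\rho) = D_n(\rho) \bar X_n'(\rho) R_n(\rho) Y_n$ and $\hat\sigma_n^2(\rho) = \frac{1}{n}\varepsilon_n'(\theta)\varepsilon_n(\theta)$. The necessary condition for the consistency of the MLE $\hat\rho_n$ can be obtained from (1.48). From the first row of (1.48), we get
\[
\frac{1}{n}\frac{\partial \ln L_n(\rho_0)}{\partial \rho} = \frac{\mathrm{cov}\left( H_{n,ii}, \sigma_{ni}^2 \right)}{\bar\sigma^2} + o_p(1),
\]
which implies that the MLE $\hat\rho_n$ is generally inconsistent. Substitution of $Y_n = X_n \beta_0 + R_n^{-1} \varepsilon_n$ into $\hat\beta_n(\rho)$ yields
\[
\begin{aligned}
\hat\beta_n(\rho) &= \beta_0 + D_n(\rho) \bar X_n'(\rho) R_n(\rho) R_n^{-1} \varepsilon_n \\
&= \beta_0 + D_n(\rho) X_n' R_n' \varepsilon_n + (\rho_0 - \rho) D_n(\rho) X_n' M_n' \varepsilon_n + (\rho_0 - \rho) D_n(\rho) X_n' R_n' H_n \varepsilon_n + (\rho_0 - \rho)^2 D_n(\rho) X_n' M_n' H_n \varepsilon_n.
\end{aligned} \tag{1.51}
\]
Assumption 1.6 along with Lemma 1.3 implies that $D_n(\rho)$ is uniformly bounded in absolute value in row and column sums. By Lemma 1.2(5), the terms with $\varepsilon_n$ have variances of $O(1/n)$. Then, Chebyshev's inequality implies that $\hat\beta_n(\rho) = \beta_0 + o_p(1)$, so that $\hat\beta_n(\hat\rho_n)$ has no asymptotic bias.

1.7.3 Proof of Main Propositions

Proof of Proposition 3.1. The GMM estimator is an extremum estimator. The conditions for the consistency of extremum estimators are established in Theorem 4.1.1 of Amemiya (1985). Let $L_n(\theta)$ be the objective function of the GMM estimator. The GMM estimator $\hat\theta_n = \arg\min_{\theta \in \Theta} L_n(\theta) = \arg\min_{\theta \in \Theta} g_n'(\theta) \Psi_n' \Psi_n g_n(\theta)$ is consistent under the following conditions: (1) the parameter space $\Theta$ is a compact subset of $\mathbb{R}^{k+2}$ and $\theta_0 \in \Theta$, (2) $L_n(\theta)$ is continuous in $\theta$, (3) $L_n(\theta)$ converges to the non-stochastic function $E L_n(\theta)$ in probability uniformly in $\theta \in \Theta$ as $n \to \infty$, and (4) $L(\theta) = \lim_{n \to \infty} E L_n(\theta)$ attains a unique global minimum at $\theta_0$ (i.e., the identification conditions given in Assumption 1.7 are satisfied). Conditions (1), (2) and (4) are satisfied under our assumptions. For condition (3), it is enough to show that $\frac{1}{n}\Psi_n g_n(\theta)$ converges to its limit $E\left( \frac{1}{n}\Psi_n g_n(\theta) \right)$ uniformly in $\theta \in \Theta$. Let $\Psi_n = (\Psi_{n1}, \dots, \Psi_{nm}, \Psi_{nx})$, where $\Psi_{nj}$ is the $j$th column and $\Psi_{nx}$ is a submatrix. Also, let $\Psi_{i,n}$ be the $i$th row of the matrix $\Psi_n$ such that $\Psi_{i,n} = (\Psi_{i,n1}, \dots, \Psi_{i,nm}, \Psi_{i,nx})$,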

58 where Ψ i,j, j =,..., m, are scalars ad Ψ i,x is a row subvector with its dimesio r as the umber of rows of Q. It is sufficiet to show the uiform covergece of Ψ i, g θ for each i. More explicitly, Ψ i, g θ = ε θ m j= Ψ i,jp j ε θ+ψ i,x Q ε θ. Sice ε θ = R ρ S λy X β, S λ = S + λ 0 λw, ad R ρ = R + ρ 0 ρm, the ε θ = R + ρ 0 ρm h ς + R β + λ 0 λg X β 0 ad ς = λ, β. More explicitly, ε + λ 0 λg R ε where h ς = X β 0 ε θ = R h ς + ρ 0 ρ M h ς + ε + λ 0 λ G ε + ρ 0 ρ H ε + ρ 0 ρ λ 0 λ M G R ε, where H = M R ad G = R G R. Hece, m ε θ Ψ i,j P j ε θ = j= m h ςr + ρ 0 ρh ςm j= Ψ i,j P j m R h ς + ρ 0 ρm h ς + h ςr + ρ 0 ρh ςm Ψ i,j P j ε + λ 0 λg ε + ρ 0 ρh ε + ρ 0 ρλ 0 λm G R ε m + ε + λ 0 λε G + ρ 0 ρε H + ρ 0 ρλ 0 λε R G M Ψ i,j P j R h ς + ρ 0 ρm h ς + ε + λ 0 λε G + ρ 0 ρε H m + ρ 0 ρλ 0 λε R G M Ψ i,j P j ε + λ 0 λg ε + ρ 0 ρh ε j= + ρ 0 ρλ 0 λm G R ε. j= j= For otatioal simplificatios defie l θ ad q θ i the followig way. l θ = m h ςr + ρ 0 ρh ςm Ψ i,j P s j ε + λ 0 λg ε + ρ 0 ρh ε j= + ρ 0 ρλ 0 λm G R ε, 45

59 where P s j = P j + P j, ad q θ = ε + λ 0 λε G + ρ 0 ρε H + ρ 0 ρλ 0 λε R G M m Ψ i,j P j ε + λ 0 λg ε + ρ 0 ρh ε + ρ 0 ρλ 0 λm G R ε. j= More compactly, m ε θ Ψ i,j P j ε θ = j= m h ςr + ρ 0 ρh ςm j= Ψ i,j P j R h ς + ρ 0 ρm h ς + l θ + q θ. Notice that ad h ςr = β 0 β X R + λ 0 λ X β 0 G R h ςm = β 0 β X M + λ 0 λ X β 0 G M. 46

60 By expasio, l θ = λ 0 λ X m β 0 G R j= Ψ i,j P s j ε + β 0 β X m R j= Ψ i,j P s j ε + ρ 0 ρ λ 0 λ X m β 0 G M Ψ i,j P s j ε + ρ 0 ρ β 0 β X M j= m Ψ i,j P s j ε + λ 0 λ 2 m X β 0 G R Ψ i,j P s j G ε + λ 0 λ j= β 0 β X m R Ψ i,j P s j G ε + ρ 0 ρ λ 0 λ 2 j= m Ψ i,j P s j G ε + ρ 0 ρ λ 0 λ β 0 β X m M j= + ρ 0 ρ λ 0 λ m Ψ i,j P s j j= X β 0 G M j= Ψ i,j P s j G ε m X β 0 G R Ψ i,j P s j H ε + ρ 0 ρ β 0 β j= H ε + ρ 0 ρ 2 λ0 λ m X β 0 G M Ψ i,j P s j j= j= + ρ 0 ρ 2 β0 β X m M Ψ i,j P s j H ε + ρ 0 ρ λ 0 λ 2 j= m m j= Ψ i,j P s j M G R ε + ρ 0 ρ λ 0 λ β 0 β X R M G R ε + ρ 0 ρ 2 λ0 λ 2 + ρ 0 ρ 2 λ0 λ β 0 β m X M m X β 0 G M j= Ψ i,j P s j j= Ψ i,j P s j j= X R H ε X β 0 G R Ψ i,j P s j M G R ε M G R ε..52 Each matrix i the above expasio is uiformly bouded. Thus, applyig Lemma.2 5, all the terms o the r.h.s. of.52 coverge i probability to zero. Hece, l θ = o p 47

61 uiformly i θ Θ. 44 Similarly, q θ = m ε Ψ i,j P j ε + λ 0 λ ε G j= m Ψ i,j P s j ε + ρ 0 ρ λ 0 λ ε R j= ρ 0 ρ ε G m j= M G R ε + ρ 0 ρ 2 λ0 λ ε H ε G m ρ 0 ρ 2 ε j= m j= G M Ψ i,j P s j ε + ρ 0 ρ ε H m j= Ψ i,j P s j H ε + λ 0 λ 2 ρ0 ρ ε G m j= Ψ i,j P j G ε + ρ 0 ρ 2 ε H R G M m j= Ψ i,j P s j ε + λ 0 λ m j= Ψ i,j P s j Ψ i,j P s j M G R ε + λ 0 λ 2 m j= Ψ i,j P j H ε + λ 0 λ 2 Ψ i,j P j M G R ε..53 By Lemma.2 4, the variace of the first term o the r.h.s. of.53 has order O. The, by geeralized Chebyshev iequality, this term coverges i probability to zero sice m j= Ψ i,je[ε P j ε ] = m j= Ψ i,jtrσ P j = 0 for P j P 2, j =,..., m. All others term has the same structure. Therefore the same asymptotic argumet applies. Here, we just provide the asymptotic argumet for a geeric term. Agai by Lemma.2 4, the variace of all the remaiig terms have order O. Cosider the secod term of the r.h.s. of.53 ad deote it with ζθ. Takig expectatio yields E λ 0 λ ε G m j= Ψ i,j P s j ε = λ 0 λ m [ ] Ψ i,j E tr ε G P s j ε j= = λ 0 λ m Ψ i,j tr Σ G P s j. Let ɛ > 0 be ay real umber ad P be the probability measure. The, by the geeralized 44 The uiform covergece i θ follows sice Θ is a compact set ad l θ is a quadratic ad cotiuous fuctio i θ. Thus, l θ has a bouded rage. From this observatio, the uiform covergece follows from Lemma 2.4 of Newey ad McFadde 994, see p The same argumet ca also be see i the proof of Propositio 3. i Lee 2007a j= 48

62 Chebyshev iequality, we obtai P[ λ0 λ m ε G Ψ i,j P s j ε j= E λ0 λ ε G m j= ] Ψ i,j P s j ε > ɛ varζθ ɛ 2. By Lemma.2 4, varζθ is O. As, the r.h.s of the above equatio coverges to zero. Thus, we have λ0 λ m ε G Ψ i,j P s j ε = λ 0 λ m Ψ i,j tr Σ G P s j + op. j= j= Applyig the same asymptotic argumet to the remaiig terms, the equatio i.53 simplifies to q θ = λ 0 λ λ 0 λ m j= m j= + λ 0 λ 2 ρ0 ρ Ψ i,j tr Σ G P s j + ρ0 ρ m j= Ψ i,j tr Σ P s j M G R + λ0 λ ρ 0 ρ m j= tr Σ H P s j M G R + λ0 λ 2 tr Σ H P j H + λ0 λ 2 ρ0 ρ 2 Ψ i,j tr Σ H P s j + ρ0 ρ m j= Ψ i,j tr Σ G P s j M G R + λ0 λ ρ 0 ρ 2 m j= m j= Ψ i,j tr Σ G P s j H m j= Ψ i,j tr Σ G P j G + ρ0 ρ 2 Ψ i,j tr Σ R Ψ i,j m j= G M P j M G R Ψ i,j + o p,.54 uiformly i θ Θ. The uiform covergece i θ follows sice Θ is a compact set ad 49

63 q θ is quadratic ad cotiuous fuctio i θ. Hece, m ε θ Ψ i,j P j ε θ = h ςr + ρ 0 ρ m h ςm Ψ i,j P j R h ς j= j= m m + ρ 0 ρ M h ς + λ 0 λ j= tr Σ P s j H + ρ0 ρ λ 0 λ m j= Ψ i,j tr Σ P s j G + ρ0 ρ m j= Ψ i,j tr Σ H P s j G + λ0 λ 2 ρ0 ρ + ρ 0 ρ 2 λ0 λ m j= tr Σ G P j G + ρ0 ρ 2 tr Σ R j= Ψ i,j Ψ i,j tr Σ P s j M G R + λ0 λ ρ 0 ρ m j= Ψ i,j tr Σ H P s j M G R + λ0 λ 2 m j= Ψ i,j tr Σ G P s j M G R m j= Ψ i,j Ψ i,j tr Σ H P j H + λ0 λ 2 ρ0 ρ 2 G M P j M G R + op,.55 m j= Ψ i,j uiformly i θ Θ. The r.h.s of the above equatio is simply expectatio of the term i the l.h.s. The above relatio holds for all i. Therefore, Ψ g θ coverges to E Ψ g θ uiformly i θ Θ. By the idetificatio coditio ad the above uiform covergece result, GMM estimator ˆθ is cosistet. Next, we show the asymptotic ormality of the GMM estimator ˆθ. The first order coditio implies that g ˆθ θ Ψ Ψ g ˆθ = 0. By the mea value theorem at θ, we have ˆθ = θ 0 ˆθ θ 0 = g ˆθ Ψ g θ g ˆθ θ Ψ θ θ g ˆθ Ψ θ Ψ g θ θ g ˆθ θ Ψ Ψ g θ 0 Ψ Ψ g θ 0,.56 50

64 where We will show that g θ θ ad the fact that E gθ 0 θ g θ = θ coverges to ε θp s ε θp s 2. ε θp s ε θ θ ε θ θ m εθ θ Q ε θ θ..57 gθ E uiformly i θ Θ. Give this result θ gθ E is cotiuous i θ ad plim θ ˆθ = θ 0, we have g ˆθ = θ + o p = Γ + o p Amemiya, 985, Theorem 4..5, p.3. Sice εθ θ = M S λy M X β, R ρw Y, R ρx, the gradiet i.57 ca be writte as g θ = θ ε θp s M S λy M X β ε θp s R ρw Y ε θp s R ρx ε θp s 2 M S λy M X β ε θp s 2 R ρw Y ε θp s 2 R ρx.... ε θp m s M S λy M X β ε θp mr s ρw Y ε θp mr s ρx Q M S λy M X β Q R ρw Y Q R ρx The probability limit of the above gradiet is evaluated below. By usig S λ = S +λ 0 λw ad R ρ = R + ρ 0 ρm equalities, the rows of the above gradiet ivolvig P j is give as g θ = ε θ j, θp s j M S Y + λ 0 λε θp s j M W Y ε θp s j M X β, ε θp s j R W Y + ρ 0 ρε θp s j M W Y, ε θp s j R X + ρ 0 ρε θp s j M X. Each term of the above jth row is evaluated separately. Oe of the terms is ε θp s j R W Y = ε θp s j R G X β 0 + ε θp s j G ε. More explicitly, by substitutig the expasio of 5

65 ε θ ito this term: ε θp s j R G X β 0 = R h ς + ρ 0 ρ M h ς + ε + λ 0 λ G ε + ρ 0 ρh ε + ρ 0 ρ λ 0 λ M G R ε P s j R G X β 0 = h ςr P s j R G X β 0 + ρ0 ρ h ςm P s j R G X β 0 + ε P s j R G X β 0 + λ0 λ ε G P s j R G X β 0 + ρ0 ρ ε H P s j R G X β 0 + ρ0 ρ λ 0 λ ε R G M P s j R G X β Notice that all terms except first two elemets have the same structure; therefore, they are subject to the same asymptotic argumet. By Lemma.2 5, all terms except the first two terms vaish. Thus, ε θp s j R G X β 0 = h ςr P s j R G X β 0 + ρ 0 ρh ς M P s j R G X β 0 + o p. Similarly, by Lemmas.2 4 ad 5, we have ε θp s j G ε = R h ς + ρ 0 ρ M h ς + ε + λ 0 λ G ε + ρ 0 ρ H ε + ρ 0 ρ λ 0 λ M G R ε P s j G ε = h ςr P s j G ε + ρ0 ρ h ςm P s j G ε + ε P s j G ε + λ0 λ ε G P s j G ε + = tr Σ P s j G + ρ0 ρ ε H P s j G ε + + λ0 λ tr Σ G P s j G ρ0 ρ λ 0 λ tr Σ G P s j M G R ρ0 ρ λ 0 λ ε R G M P s j G ε + ρ0 ρ tr Σ H P s j G + op,.59 uiformly i θ Θ. Combiig these results, we get ε θp s j R W Y = h ςr P s j R G X β 0 + ρ0 ρ h ςm P s j R G X β tr Σ P s j G + λ0 λ tr Σ G P s j G ρ0 ρ λ 0 λ tr Σ G P s j M G R ρ0 ρ tr Σ H P s j G + op,.60 uiformly i θ Θ. Sice h ς 0 = 0 at θ 0, ε θ 0 P s j R W Y = tr Σ P s j G + op. 52

66 Now we tur to aother elemet i the jth row. A similar aalysis applies to ε θp s j M S Y = ε θp s j M X β 0 + ε θp s j M R ε. By substitutig the expasio of ε θ i this terms, we get ε θp s j M X β 0 = R h ς + ρ 0 ρ M h ς + ε + λ 0 λ G ε + ρ 0 ρ H + ρ 0 ρ λ 0 λ M G R ε P s j M X β 0 = h ςr P s j M X β 0 + ρ0 ρ h ςm P s j M X β 0 + ε P s j M X β 0 + λ0 λ ε G P s j M X β 0 + ρ0 ρ ε H P s j M X β 0 + ρ0 ρ λ 0 λ ε R G M P s j M X β 0 = h ςr P s j M X β 0 + ρ0 ρ h ςm P s j M X β 0 + o p,.6 uiformly i θ Θ by Lemmas.2 4 ad 5. Similarly, ε θp s j M R ε = R h ς + ρ 0 ρ M h ς + ε + λ 0 λ G ε + ρ 0 ρ H ε + ρ 0 ρ λ 0 λ M G R ε P s j M R = h ςr P s j M R ε + ρ0 ρ h ςm P s j M R ε + ε P s j M R ε + λ0 λ ε G P s j M R ε + ρ0 ρ ε H P s j M R ε + ρ0 ρ λ 0 λ ε R G M P s j M R ε = tr Σ P s j M R + λ0 λ tr Σ G P s j M R + ρ0 ρ tr Σ H P s j M R + ρ0 ρ λ 0 λ tr Σ R G M P s j M R + op,.62 ε uiformly i θ Θ. Sice H = M R, ε θp s j M R ε = tr Σ P s j H tr Σ H P s j H + λ0 λ tr Σ G P s j H + ρ0 ρ + ρ0 ρ λ 0 λ tr Σ H P s j M G R + o p

67 The, combiig the above results, we get ε θp s j M S Y = h ςr P s j M X β 0 + ρ0 ρ h ςm P s j M X β tr Σ P s j H + λ0 λ tr Σ G P s j H ρ0 ρ λ 0 λ tr Σ H P s j M G R ρ0 ρ tr Σ H P s j H + op,.64 uiformly i θ Θ. Sice h ς 0 = 0 at θ 0, ε θ 0 P s j M S Y = tr Σ P s j H + op. With the same lie of argumet, we have ε θ 0 P s j R X = o p ad ε θ 0 P s j M X β 0 = o p. All the remaiig terms i the jth row vaishes whe evaluated at the true parameter value. Now, we retur ε θ Q θ i.57. This term ca be writte as ε θ Q = Q θ M S Y + λ 0 λ Q M W Y Q M X β, Q R W Y + ρ 0 ρ Q M W Y, Q R X + ρ 0 ρ Q M X. The first term i the r.h.s of the above equatio vaishes whe evaluated at the true parameter θ 0. For the secod term, we have Q R W Y = Q R G X β 0 + Q G ε = Q R G X β 0 + o p. Likewise the last term coverges to Q R X. Combiig all the previous results, we get the relatio g ˆθ θ = Γ + o p uiformly i θ, where Γ is give i.26. By CLT i Theorem of Kelejia ad Prucha 200, Ψ g θ 0 = [ ε m j Ψ ] jp j ε + Ψ x Q d ε N0, lim Ψ Ω Ψ. The asymptotic distributio of ˆθ θ 0 i.27 ow follows from.56 by the Slutzky theorem. Proof of Propositio.2. We first show the cosistecy of ˆΩ by showig that each elemet i ˆΩ Ω is op. Notice that some of the elemets Ω are of the form: i= j= P,ijσ 2 i σ2 j, where P,ij = P a,ij Pb,ij + P b,ji by Lemma.2. Also, otice that P,ii = 0. Followig the same steps of Li ad Lee 200, we first show i= j= P,ijε 2 i ε2 j = i= j= P,ijσ 2 i σ2 j + o p. The, we show that this relatio still holds, whe ˆε i replaces ε i. As a iitial step, we eed to establish the uiform 54

68 boudedess of P i both the row ad colum sum orms. P b is uiformly bouded i both row ad colum sum orms ad therefore its elemets are uiformly bouded by Assumptio.4. Hece, there exists a costat c such that P b,ij + P b,ji c, for all i, j ad. This implies P,ij c P a,ij. Sice P a is bouded i both row ad colum sum orms, P is uiformly bouded i both row ad the colum sum orms. Hece, By expasio, ε 2 i ε2 j σ2 i σ2 j = ε2 i σ2 i ε2 j σ2 j + σ2 i ε2 j σ2 j + σ2 j ε2 i σ2 i. i= j= P,ij ε 2 i ε 2 j σiσ 2 j 2 = + P,ij σ i 2 ε 2 j σ 2 j + i= j= }{{} B i= j= P,ij ε 2 i σi 2 ε 2 j σj 2 } {{ } A i= j= P,ij σ 2 j ε 2 i σ 2 i } {{ } C..65 First, we express A, B ad C i terms of quadratic forms for otatioal simplificatio. To this ed, let u = u,..., u such that u i = ε 2 i σ2 i for i =,..., ad let Σ σ = σ 2,..., σ2. The, A = u P u, B = u P Σ σ, ad C = Σ σ P u. As E u P u = tr P Λ where Λ = E u u 4 = D µ σ4,..., µ4 σ 4, where µ 4 i = E ε 4 i. This implies E u P u = tr P Λ = 0 sice P,ii = 0 i. By Lemma.2 4, plim A = 0. By Lemma.2 5 ad Assumptio., plim B = 0 ad plim C = 0. Hece, plim i= j= P,ij ε 2 iε 2 j P,ij σiσ 2 j 2 = 0. i= j= Next, we will show that i= j= P,ij ˆε 2 iˆε2 j = i= j= P,ijε 2 i ε2 j + o p. By expasio, ˆε 2 iˆε2 j ε2 i ε2 j = ˆε 2 i ε2 iˆε 2 j ε 2 j + ε 2 j ˆε 2 i ε 2 i + ε 2 i ˆε 2 j ε 2 j. 55

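Then,

$$
\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\big(\hat\varepsilon_i^2\hat\varepsilon_j^2-\varepsilon_i^2\varepsilon_j^2\big)
= \underbrace{\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}(\hat\varepsilon_i^2-\varepsilon_i^2)(\hat\varepsilon_j^2-\varepsilon_j^2)}_{\vartheta_1}
+ \underbrace{\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_j^2(\hat\varepsilon_i^2-\varepsilon_i^2)}_{\vartheta_2}
+ \underbrace{\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_i^2(\hat\varepsilon_j^2-\varepsilon_j^2)}_{\vartheta_3}.
$$

From the model, we have $\hat\varepsilon_n = R_n(\hat\rho_n)\big(S_n(\hat\lambda_n)Y_n - X_n\hat\beta_n\big)$. By using the relations $R_n(\hat\rho_n) = R_n + (\rho_0-\hat\rho_n)M_n$ and $S_n(\hat\lambda_n) = S_n + (\lambda_0-\hat\lambda_n)W_n$, we get

$$
\begin{aligned}
\hat\varepsilon_n
&= \big[R_n + (\rho_0-\hat\rho_n)M_n\big]\big[S_nY_n + (\lambda_0-\hat\lambda_n)W_nY_n - X_n\hat\beta_n\big] \\
&= \big[R_n + (\rho_0-\hat\rho_n)M_n\big]\big[X_n(\beta_0-\hat\beta_n) + (\lambda_0-\hat\lambda_n)G_nX_n\beta_0 + R_n^{-1}\varepsilon_n + (\lambda_0-\hat\lambda_n)G_nR_n^{-1}\varepsilon_n\big].
\end{aligned}
$$

Let $h_n(\hat\varsigma_n) = X_n(\beta_0-\hat\beta_n) + (\lambda_0-\hat\lambda_n)G_nX_n\beta_0$, where $\hat\varsigma_n = (\hat\lambda_n, \hat\beta_n')'$. Hence,

$$
\hat\varepsilon_n = \varepsilon_n + \big[R_n + (\rho_0-\hat\rho_n)M_n\big]h_n(\hat\varsigma_n) + (\lambda_0-\hat\lambda_n)\bar G_n\varepsilon_n + (\rho_0-\hat\rho_n)H_n\varepsilon_n + (\lambda_0-\hat\lambda_n)(\rho_0-\hat\rho_n)M_nG_nR_n^{-1}\varepsilon_n.
$$

Let $e_{i,n}$ be the $i$-th row of the identity matrix $I_n$. Then, in scalar form, $\hat\varepsilon_i = \varepsilon_i + a_i + b_i + c_i + f_i$, where $a_i = e_{i,n}R_nh_n(\hat\varsigma_n) + (\rho_0-\hat\rho_n)e_{i,n}M_nh_n(\hat\varsigma_n)$, $b_i = (\lambda_0-\hat\lambda_n)e_{i,n}\bar G_n\varepsilon_n$, $c_i = (\rho_0-\hat\rho_n)e_{i,n}H_n\varepsilon_n$, and $f_i = (\lambda_0-\hat\lambda_n)(\rho_0-\hat\rho_n)e_{i,n}M_nG_nR_n^{-1}\varepsilon_n$. Then,

$$
\hat\varepsilon_i^2 = (\varepsilon_i + a_i + b_i + c_i + f_i)^2 = \varepsilon_i^2 + a_i^2 + b_i^2 + c_i^2 + f_i^2 + 2\varepsilon_ia_i + 2\varepsilon_ib_i + 2\varepsilon_ic_i + 2\varepsilon_if_i + 2a_ib_i + 2a_ic_i + 2a_if_i + 2b_ic_i + 2b_if_i + 2c_if_i.
$$

Next, we will evaluate all three terms $\vartheta_l$, $l = 1, 2, 3$, and show that they converge in probability to zero. First, consider $\vartheta_2$:

$$
\vartheta_2 = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_j^2\big(a_i^2 + b_i^2 + c_i^2 + f_i^2 + 2\varepsilon_ia_i + 2\varepsilon_ib_i + 2\varepsilon_ic_i + 2\varepsilon_if_i + 2a_ib_i + 2a_ic_i + 2a_if_i + 2b_ic_i + 2b_if_i + 2c_if_i\big).
$$

We focus on the terms with the highest orders in the $\varepsilon$'s. Consider $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_j^2\varepsilon_ib_i =$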

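$(\lambda_0-\hat\lambda_n)\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{l=1}^{n}P_{n,ij}\bar G_{n,il}\varepsilon_i\varepsilon_j^2\varepsilon_l$. By the Cauchy-Schwarz inequality,

$$
E\big|\varepsilon_i\varepsilon_l\varepsilon_j^2\big| \le \big[E(\varepsilon_i\varepsilon_l)^2\big]^{1/2}\big[E\varepsilon_j^4\big]^{1/2} \le \big[E\varepsilon_i^4\big]^{1/4}\big[E\varepsilon_l^4\big]^{1/4}\big[E\varepsilon_j^4\big]^{1/2} \le c,
$$

where $c$ is a constant, for all $i$, $j$, $l$ and $n$, since $\{\mu_i^4\}$ is a bounded sequence by Assumption 1.1. This implies

$$
E\Big|\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{l=1}^{n}P_{n,ij}\bar G_{n,il}\varepsilon_i\varepsilon_j^2\varepsilon_l\Big|
\le c\,\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}|P_{n,ij}|\sum_{l=1}^{n}|\bar G_{n,il}| = O(1),
$$

since $P_n$ and $\bar G_n$ are uniformly bounded in row and column sums. By the Markov inequality, $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{l=1}^{n}P_{n,ij}\bar G_{n,il}\varepsilon_i\varepsilon_j^2\varepsilon_l = O_p(1)$, i.e., it is stochastically bounded. Since $(\lambda_0-\hat\lambda_n) = o_p(1)$, the term $(\lambda_0-\hat\lambda_n)\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{l=1}^{n}P_{n,ij}\bar G_{n,il}\varepsilon_i\varepsilon_j^2\varepsilon_l$ converges in probability to zero. Another term of high order in the $\varepsilon$'s is

$$
\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_j^2f_i^2
= (\lambda_0-\hat\lambda_n)^2(\rho_0-\hat\rho_n)^2\,\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}P_{n,ij}\big(M_nG_nR_n^{-1}\big)_{ik}\big(M_nG_nR_n^{-1}\big)_{il}\varepsilon_j^2\varepsilon_k\varepsilon_l.
$$

From the proof of the previous term, it follows that

$$
E\Big|\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}P_{n,ij}\big(M_nG_nR_n^{-1}\big)_{ik}\big(M_nG_nR_n^{-1}\big)_{il}\varepsilon_j^2\varepsilon_k\varepsilon_l\Big|
\le c\,\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}|P_{n,ij}|\sum_{k=1}^{n}\big|\big(M_nG_nR_n^{-1}\big)_{ik}\big|\sum_{l=1}^{n}\big|\big(M_nG_nR_n^{-1}\big)_{il}\big| = O(1),
$$

since $P_n$ and $M_nG_nR_n^{-1}$ are uniformly bounded in row and column sums. An application of the Markov inequality provides that

$$
\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}P_{n,ij}\big(M_nG_nR_n^{-1}\big)_{ik}\big(M_nG_nR_n^{-1}\big)_{il}\varepsilon_j^2\varepsilon_k\varepsilon_l = O_p(1).
$$

Since $(\lambda_0-\hat\lambda_n) = o_p(1)$ and $(\rho_0-\hat\rho_n) = o_p(1)$, $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_j^2f_i^2$ converges in probability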

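to zero. The remaining terms in $\vartheta_2$ are of the same or lower order in the $\varepsilon$'s. A similar analysis with the Markov inequality can be applied to each of the remaining terms, which yields $\vartheta_2 = o_p(1)$. The structure of $\vartheta_3$ is the same as that of $\vartheta_2$, with the $i$'s replaced by $j$'s and vice versa. Hence, $\vartheta_3$ converges to zero in probability. Now we turn to the first term $\vartheta_1$:

$$
\begin{aligned}
\vartheta_1 = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}
&\big(a_i^2 + b_i^2 + c_i^2 + f_i^2 + 2\varepsilon_ia_i + 2\varepsilon_ib_i + 2\varepsilon_ic_i + 2\varepsilon_if_i + 2a_ib_i + 2a_ic_i + 2a_if_i + 2b_ic_i + 2b_if_i + 2c_if_i\big) \\
\times&\big(a_j^2 + b_j^2 + c_j^2 + f_j^2 + 2\varepsilon_ja_j + 2\varepsilon_jb_j + 2\varepsilon_jc_j + 2\varepsilon_jf_j + 2a_jb_j + 2a_jc_j + 2a_jf_j + 2b_jc_j + 2b_jf_j + 2c_jf_j\big).
\end{aligned}
$$

We will again focus on those terms with the highest order in the $\varepsilon$'s. These terms are $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}p_i^2q_j^2$, where $p, q \in \{b, c, f\}$. Let $p = q = b$ for exposition. Then,

$$
\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}b_i^2b_j^2
= (\lambda_0-\hat\lambda_n)^4\,\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\big(e_{i,n}\bar G_n\varepsilon_n\big)^2\big(e_{j,n}\bar G_n\varepsilon_n\big)^2
= (\lambda_0-\hat\lambda_n)^4\underbrace{\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k_1=1}^{n}\sum_{k_2=1}^{n}\sum_{l_1=1}^{n}\sum_{l_2=1}^{n}P_{n,ij}\bar G_{n,ik_1}\bar G_{n,ik_2}\bar G_{n,jl_1}\bar G_{n,jl_2}\varepsilon_{k_1}\varepsilon_{k_2}\varepsilon_{l_1}\varepsilon_{l_2}}_{L_n}.
$$

Applying the Cauchy-Schwarz inequality yields

$$
E\big|\varepsilon_{k_1}\varepsilon_{k_2}\varepsilon_{l_1}\varepsilon_{l_2}\big| \le \big[E\varepsilon_{k_1}^2\varepsilon_{k_2}^2\big]^{1/2}\big[E\varepsilon_{l_1}^2\varepsilon_{l_2}^2\big]^{1/2} \le \big[E\varepsilon_{k_1}^4\big]^{1/4}\big[E\varepsilon_{k_2}^4\big]^{1/4}\big[E\varepsilon_{l_1}^4\big]^{1/4}\big[E\varepsilon_{l_2}^4\big]^{1/4} \le c,
$$

for some $c$ for all $n$, since $\mu_{k_1}^4$, $\mu_{k_2}^4$, $\mu_{l_1}^4$ and $\mu_{l_2}^4$ are bounded by Assumption 1.1. Note that

$$
E|L_n| \le c\,\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}|P_{n,ij}|\sum_{k_1=1}^{n}|\bar G_{n,ik_1}|\sum_{k_2=1}^{n}|\bar G_{n,ik_2}|\sum_{l_1=1}^{n}|\bar G_{n,jl_1}|\sum_{l_2=1}^{n}|\bar G_{n,jl_2}| = O(1),
$$

and by the Markov inequality, $L_n$ is stochastically bounded, i.e., $L_n = O_p(1)$. Since $(\lambda_0-\hat\lambda_n) =$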

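$o_p(1)$, $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}b_i^2b_j^2$ converges in probability to zero. A similar analysis with an application of the Markov inequality ensures that each of the remaining combinations $p, q \in \{b, c, f\}$ in $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}p_i^2q_j^2$ is $o_p(1)$. The rest of the terms in $\vartheta_1$ are of smaller order in the $\varepsilon$'s and can easily be verified to converge in probability to zero. Hence, $\vartheta_1$ converges in probability to zero. Then, $\vartheta_1 = \vartheta_2 = \vartheta_3 = o_p(1)$ implies $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\hat\varepsilon_i^2\hat\varepsilon_j^2 - \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_i^2\varepsilon_j^2 = o_p(1)$. Combining this with the result $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\varepsilon_i^2\varepsilon_j^2 - \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\sigma_i^2\sigma_j^2 = o_p(1)$ yields

$$
\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\hat\varepsilon_i^2\hat\varepsilon_j^2 - \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\sigma_i^2\sigma_j^2 = o_p(1).
$$

The remaining term in $\Omega_n$ is $\frac{1}{n}Q_n'\Sigma_nQ_n = \frac{1}{n}\sum_{i=1}^{n}\sigma_i^2q_{i,n}'q_{i,n}$, where $q_{i,n}$ is the $i$th row of $Q_n$. The previous discussion applied to $\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}P_{n,ij}\sigma_i^2\sigma_j^2$ ensures that $\frac{1}{n}\sum_{i=1}^{n}\hat\varepsilon_i^2q_{i,n}'q_{i,n} - \frac{1}{n}\sum_{i=1}^{n}\sigma_i^2q_{i,n}'q_{i,n} = o_p(1)$. Then, it follows that $\hat\Omega_n - \Omega_n = o_p(1)$.

Next, we show the consistency of $\hat\Gamma_n$. One type of element involving the $\varepsilon$'s in $\Gamma_n$ is $\frac{1}{n}\sum_{i=1}^{n}\big(H_n'P_{jn}^{s}\big)_{ii}\sigma_i^2$. Since $P_{jn}^{s}$ and $H_n$ are all uniformly bounded in both row and column sums, so are the matrices $H_n'P_{jn}^{s}$. Hence, it follows from the same argument as in the proof of the consistency of $\hat\Omega_n$ that $\frac{1}{n}\sum_{i=1}^{n}\big(H_n'P_{jn}^{s}\big)_{ii}\hat\varepsilon_i^2 - \frac{1}{n}\sum_{i=1}^{n}\big(H_n'P_{jn}^{s}\big)_{ii}\sigma_i^2 = o_p(1)$. The other type of element involving the $\varepsilon$'s in $\Gamma_n$ is $\frac{1}{n}\sum_{i=1}^{n}\big(\bar G_n'P_{jn}^{s}\big)_{ii}\sigma_i^2$. By Assumption 1.2, $R_n$, $G_n$ and $R_n^{-1}$ are all uniformly bounded in both row and column sums. Hence, the matrices $\bar G_n'P_{jn}^{s}$ are also uniformly bounded in both row and column sums. By the same argument as in the proof of the consistency of $\hat\Omega_n$, it follows that $\frac{1}{n}\sum_{i=1}^{n}\big(\bar G_n'P_{jn}^{s}\big)_{ii}\hat\varepsilon_i^2 - \frac{1}{n}\sum_{i=1}^{n}\big(\bar G_n'P_{jn}^{s}\big)_{ii}\sigma_i^2 = o_p(1)$. Then, $\hat\Gamma_n - \Gamma_n = o_p(1)$.

Proof of Proposition 1.3. The proof follows in parallel to the proof of Proposition 1.3 in Lin and Lee (2010). By the generalized Schwarz inequality, the optimal weighting matrix in Proposition 3.1 is $\Omega_n^{-1}$. First, we show that $\frac{1}{n}g_n'(\theta)\hat\Omega_n^{-1}\frac{1}{n}g_n(\theta) - \frac{1}{n}g_n'(\theta)\Omega_n^{-1}\frac{1}{n}g_n(\theta) = o_p(1)$. Consider $\frac{1}{n}g_n'(\theta)\hat\Omega_n^{-1}\frac{1}{n}g_n(\theta) = \frac{1}{n}g_n'(\theta)\Omega_n^{-1}\frac{1}{n}g_n(\theta) + \frac{1}{n}g_n'(\theta)\big(\hat\Omega_n^{-1} - \Omega_n^{-1}\big)\frac{1}{n}g_n(\theta)$. Letting $\Psi_n = \Omega_n^{-1/2}$ in Proposition 3.1, Assumption 1.8 implies that $\Psi_0 = \lim_{n\to\infty}\Omega_n^{-1/2}$ exists. Because $\Psi_0$ is nonsingular, the identification condition corresponds to the unique root of $\lim_{n\to\infty}\frac{1}{n}E\,g_n(\theta) = 0$ at $\theta_0$, which is satisfied by Assumption 1.7. An argument similar to that in the proof of Proposition 3.1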

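ensures that $\frac{1}{n}g_n'(\theta)\Omega_n^{-1}\frac{1}{n}g_n(\theta)$ converges in probability to a well defined limit uniformly in $\theta \in \Theta$. Now, we show that $\frac{1}{n}g_n'(\theta)\big(\hat\Omega_n^{-1} - \Omega_n^{-1}\big)\frac{1}{n}g_n(\theta) = o_p(1)$ uniformly in $\theta \in \Theta$. Let $\|\cdot\|$ be the maximum row sum norm for vectors and matrices. Then, by the submultiplicative property of a matrix norm,

$$
\Big\|\frac{1}{n}g_n'(\theta)\big(\hat\Omega_n^{-1} - \Omega_n^{-1}\big)\frac{1}{n}g_n(\theta)\Big\| \le \Big(\frac{1}{n}\|g_n(\theta)\|\Big)^2\big\|\hat\Omega_n^{-1} - \Omega_n^{-1}\big\|.
$$

From the proof of Proposition 3.1, $\frac{1}{n}\big[g_n(\theta) - E\,g_n(\theta)\big] = o_p(1)$. Also, from the proof of Proposition 3.1,

$$
\begin{aligned}
\frac{1}{n}E\big(\varepsilon_n'(\theta)P_{jn}\varepsilon_n(\theta)\big)
&= \frac{1}{n}h_n'(\varsigma)R_n'(\rho)P_{jn}R_n(\rho)h_n(\varsigma)
+ (\lambda_0-\lambda)\frac{1}{n}\mathrm{tr}\big(\Sigma_nP_{jn}^{s}\bar G_n\big)
+ (\rho_0-\rho)\frac{1}{n}\mathrm{tr}\big(\Sigma_nP_{jn}^{s}H_n\big) \\
&\quad + (\rho_0-\rho)(\lambda_0-\lambda)\frac{1}{n}\mathrm{tr}\big(\Sigma_nP_{jn}^{s}M_nG_nR_n^{-1}\big)
+ (\lambda_0-\lambda)(\rho_0-\rho)\frac{1}{n}\mathrm{tr}\big(\Sigma_nH_n'P_{jn}^{s}\bar G_n\big) \\
&\quad + (\lambda_0-\lambda)^2(\rho_0-\rho)\frac{1}{n}\mathrm{tr}\big(\Sigma_n\bar G_n'P_{jn}^{s}M_nG_nR_n^{-1}\big)
+ (\rho_0-\rho)^2(\lambda_0-\lambda)\frac{1}{n}\mathrm{tr}\big(\Sigma_nH_n'P_{jn}^{s}M_nG_nR_n^{-1}\big) \\
&\quad + (\lambda_0-\lambda)^2\frac{1}{n}\mathrm{tr}\big(\Sigma_n\bar G_n'P_{jn}\bar G_n\big)
+ (\rho_0-\rho)^2\frac{1}{n}\mathrm{tr}\big(\Sigma_nH_n'P_{jn}H_n\big) \\
&\quad + (\rho_0-\rho)^2(\lambda_0-\lambda)^2\frac{1}{n}\mathrm{tr}\big(\Sigma_nR_n'^{-1}G_n'M_n'P_{jn}M_nG_nR_n^{-1}\big) = O(1),
\end{aligned}
$$

uniformly in $\theta \in \Theta$, as

$$
\begin{aligned}
\frac{1}{n}h_n'(\varsigma)R_n'(\rho)P_{jn}R_n(\rho)h_n(\varsigma)
&= (\lambda_0-\lambda)^2\frac{1}{n}(X_n\beta_0)'G_n'\big[R_n + (\rho_0-\rho)M_n\big]'P_{jn}\big[R_n + (\rho_0-\rho)M_n\big]G_nX_n\beta_0 \\
&\quad + (\lambda_0-\lambda)\frac{1}{n}(X_n\beta_0)'G_n'\big[R_n + (\rho_0-\rho)M_n\big]'P_{jn}^{s}\big[R_n + (\rho_0-\rho)M_n\big]X_n(\beta_0-\beta) \\
&\quad + \frac{1}{n}(\beta_0-\beta)'X_n'\big[R_n + (\rho_0-\rho)M_n\big]'P_{jn}\big[R_n + (\rho_0-\rho)M_n\big]X_n(\beta_0-\beta) = O(1),
\end{aligned}
$$

uniformly in $\theta \in \Theta$. Similarly,

$$
\frac{1}{n}E\big(Q_n'\varepsilon_n(\theta)\big) = \frac{1}{n}Q_n'R_n(\rho)h_n(\varsigma) = (\lambda_0-\lambda)\frac{1}{n}Q_n'\big[R_n + (\rho_0-\rho)M_n\big]G_nX_n\beta_0 + \frac{1}{n}Q_n'\big[R_n + (\rho_0-\rho)M_n\big]X_n(\beta_0-\beta) = O(1)
$$

uniformly in $\theta \in \Theta$. Hence, $\frac{1}{n}E\,g_n(\theta) = O(1)$ uniformly in $\theta \in \Theta$. Then, $\frac{1}{n}g_n(\theta) = O_p(1)$ uniformly in $\theta \in \Theta$ by the Markov inequality. These imply that $\frac{1}{n}g_n'(\theta)\big(\hat\Omega_n^{-1} - \Omega_n^{-1}\big)\frac{1}{n}g_n(\theta) = o_p(1)$ uniformly in $\theta \in \Theta$. This result shows the consistency of the optimal robust GMME. From the proof of Proposition 3.1, we have $\frac{1}{n}\frac{\partial g_n(\hat\theta_n)}{\partial\theta'} = -\Gamma_n + o_p(1)$ uniformly in $\theta$. To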

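find the limiting distribution, by (1.56),

$$
\begin{aligned}
\sqrt n\big(\hat\theta_{o,n} - \theta_0\big)
&= -\Big[\frac{1}{n}\frac{\partial g_n'(\hat\theta_{o,n})}{\partial\theta}\hat\Omega_n^{-1}\frac{1}{n}\frac{\partial g_n(\bar\theta_n)}{\partial\theta'}\Big]^{-1}\frac{1}{n}\frac{\partial g_n'(\hat\theta_{o,n})}{\partial\theta}\hat\Omega_n^{-1}\frac{1}{\sqrt n}g_n(\theta_0) \\
&= \big[\Gamma_n'\Omega_n^{-1}\Gamma_n\big]^{-1}\Gamma_n'\Omega_n^{-1}\frac{1}{\sqrt n}g_n(\theta_0) + o_p(1),
\end{aligned}
\tag{1.67}
$$

where $\bar\theta_n$ lies between $\hat\theta_{o,n}$ and $\theta_0$. Hence, the limiting distribution of $\sqrt n(\hat\theta_{o,n} - \theta_0)$ follows immediately from (1.67) by the CLT in Theorem 1 of Kelejian and Prucha (2010) and the Slutzky theorem.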
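The first step of the proof of Proposition 1.2 can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical ring-shaped weight matrix, normal disturbances and variances drawn from a uniform distribution; none of these choices come from the dissertation. It compares the element $\frac{1}{n}\sum_i\sum_j P_{n,ij}\sigma_i^2\sigma_j^2$ with its plug-in counterpart based on squared disturbances.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical row-normalized weights: each unit's two ring neighbors get 1/2."""
    eye = np.eye(n)
    return 0.5 * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))

def omega_element(P_a, P_b, v):
    """(1/n) * sum_ij P_a[i,j] * (P_b[i,j] + P_b[j,i]) * v[i] * v[j]."""
    C = P_a * (P_b + P_b.T)  # elementwise product; diagonal is zero when P_a has zero diagonal
    return v @ C @ v / len(v)

rng = np.random.default_rng(0)
n = 1500
P = ring_weights(n)
sigma2 = rng.uniform(0.5, 2.0, size=n)      # assumed heteroskedastic variances
eps = rng.normal(size=n) * np.sqrt(sigma2)  # independent disturbances

target = omega_element(P, P, sigma2)        # uses the (normally unknown) variances
plug_in = omega_element(P, P, eps ** 2)     # replaces sigma_i^2 with eps_i^2
print(target, plug_in)
```

In practice the disturbances are not observed, and the second half of the proof shows that the same conclusion holds when residuals replace them.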

1.7.4 Simulation Results

Table 1.1: $(\lambda_0, \beta_{10}, \beta_{20}, \beta_{30}, \rho_0) = (0.8, 0.7, 0.4, 1.2, 0.3)$.

Table 1.2: $(\lambda_0, \beta_{10}, \beta_{20}, \beta_{30}, \rho_0) = (0.3, 0.7, 0.4, 1.2, 0.3)$.

Table 1.3: $(\lambda_0, \beta_{10}, \beta_{20}, \beta_{30}, \rho_0) = (0, 0.7, 0.4, 1.2, 0)$.

Table 1.4: $(\lambda_0, \beta_{10}, \beta_{20}, \beta_{30}, \rho_0) = (0.3, 0.7, 0.4, 1.2, 0.8)$.

Table 1.5: $(\lambda_0, \beta_{10}, \beta_{20}, \beta_{30}, \rho_0) = (0.8, 0.7, 0.4, 1.2, 0.3)$.

Layout of Tables 1.1 to 1.5. Columns: Mean, Bias, SD and RMSE under homoskedasticity, followed by Mean, Bias, SD and RMSE under heteroskedasticity. Rows: $\lambda$, $\beta_1$, $\beta_2$, $\beta_3$ and $\rho$ for the MLE, RGMME 1 and RGMME 2, repeated for each sample size.

Table 1.6: $(\lambda_0, \beta_{10}, \beta_{20}, \beta_{30}, \rho_0) = (0.3, 0.7, 0.4, 1.2, 0.3)$, $n = 500$.

Table 1.7: $(\lambda_0, \beta_{10}, \beta_{20}, \beta_{30}, \rho_0) = (0.8, 0.7, 0.4, 1.2, 0.3)$, $n = 1000$.

Layout of Tables 1.6 and 1.7. Columns: as in Tables 1.1 to 1.5. Rows: $\lambda$, $\beta_1$, $\beta_2$, $\beta_3$ and $\rho$ for the GS2SLSE, MLE, B2SLSE, BGMME, RGS2SLSE, RB2SLSE and RGMME.

[Numerical table entries not recoverable.]
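The four summary statistics reported in the tables can be computed from a vector of Monte Carlo estimates as in the following minimal sketch. The sign convention for the bias (mean estimate minus true value), the population form of the standard deviation, and the example draws are assumptions made here for illustration, not details taken from the dissertation.

```python
import numpy as np

def mc_summary(draws, true_value):
    """Mean, Bias, SD and RMSE of Monte Carlo estimates of a scalar parameter."""
    draws = np.asarray(draws, dtype=float)
    mean = draws.mean()
    bias = mean - true_value                             # assumed sign convention
    sd = draws.std(ddof=0)                               # assumed population SD
    rmse = np.sqrt(np.mean((draws - true_value) ** 2))   # root mean squared error
    return {"Mean": mean, "Bias": bias, "SD": sd, "RMSE": rmse}

# Hypothetical estimates of a spatial parameter whose true value is 0.3.
estimates = [0.28, 0.33, 0.31, 0.26, 0.32]
summary = mc_summary(estimates, true_value=0.3)
print(summary)
```

With these conventions, RMSE² equals Bias² plus SD², which is a convenient consistency check on any reported row.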

2 Robust Estimation Methods for Spatial Autoregressive Models: A Comparison of Bayesian and Robust GMM Approach

Abstract

Most of the estimators suggested for the estimation of spatial autoregressive models are inconsistent in the presence of unknown forms of heteroskedasticity in the disturbance term. The estimators formulated from the generalized method of moments (GMM) and the Bayesian Markov Chain Monte Carlo (MCMC) frameworks can be robust to unknown forms of heteroskedasticity. In this study, the finite sample properties of the robust GMM estimators and the Bayesian estimators based on the MCMC approach are compared for spatial autoregressive models that have heteroskedasticity of an unknown form. To this end, a comprehensive Monte Carlo simulation is designed for spatial models containing a spatial lag in the dependent variable and/or the disturbance term. The simulation results indicate that the maximum likelihood (ML) and the Bayesian estimators impose relatively larger bias on the spatial autoregressive parameters. The robust GMM estimator is shown to perform relatively better for the specification where there is spatial dependence in both the dependent variable and the disturbance term.

Author Keywords: Spatial autoregressive models, Unknown heteroskedasticity, Robustness, GMM, MLE, Markov Chain Monte Carlo (MCMC)

JEL classification codes: C13, C21, C31

2.1 Introduction

Most of the estimation methods suggested in the literature are valid under the assumption that the disturbance terms of spatial models are i.i.d. or i.i.d. normal. However, in many regression applications, heteroskedasticity may well be present. For example, cross-sectional units usually differ in size and some other characteristics, which in turn implies that the disturbance terms in regression analyses across these cross-sectional units may be heteroskedastic. Heteroskedasticity may also be present in the random coefficient case, where the parameters of a model are random around fixed values. In this case, heteroskedasticity depends on the exogenous variables of the spatial models. Moreover, in regression analysis, many dependent variables are constructed by data aggregation. In such a case, heteroskedasticity arises from averaging over different numbers of observations when the data are aggregated. In the present study, we evaluate the performance of various heteroskedasticity robust estimators suggested in the literature for spatial autoregressive models. To this end, we conduct a Monte Carlo study and provide two empirical illustrations to show how these estimators perform in applied research.

In the presence of heteroskedastic disturbances, the ML and GMM estimators are generally inconsistent. The ML estimator is inconsistent if the heteroskedasticity is not incorporated into estimation, because the likelihood function is not maximized at the true parameter values. The GMM estimators are also inconsistent, since the moment functions are often designed under the assumption that disturbances are i.i.d. Hence, the orthogonality conditions for the moment functions may not be satisfied. To handle unknown forms of heteroskedasticity, Kelejian and Prucha (2010) extend their estimation approach by modifying the moment functions for the spatial model that has a spatial lag in the dependent variable and the disturbance term (SARAR(1,1) for short).
Badinger and Egger (2011) extend the robust estimation approach in Kelejian and Prucha (2010) to the case of the SARAR(p,q) specification. Lin and Lee (2010) suggest a one-step robust GMM estimator for the model with spatial dependence only in the dependent variable (SARAR(1,0) for short). For a robust 2SLS estimator of the SARAR(1,0) specification, see Anselin

An alternative to the ML and GMM/IV estimation methods is the Bayesian estimation method, which has been receiving attention in recent years. Bayesian data analysis is distinctly different from classical (or frequentist) analysis in its treatment of the parameters in a model. In Bayesian econometrics the parameter vector is a random variable, and Bayesian analysts formulate probabilistic statements about the parameters before observing any data. These ex-ante probabilities are called priors and usually take the form of a probability distribution with known moments. This notion of prior (or subjective) probabilities is totally absent in classical estimation. In classical econometrics, all estimation and inference is based on observed data. In both approaches, the likelihood function has the same functional form and reflects the relation between the data and the parameters of interest. In the Bayesian approach, the likelihood function is combined with prior distributions via Bayes' rule to construct the posterior probability distribution of the parameters. The posterior distribution function of the parameter vector contains the information necessary for estimation and inference. The Bayesian approach requires the evaluation of higher dimensional integrals to obtain posterior expectations, marginal likelihoods, and predictive densities.

The application of Bayesian methods to the estimation of spatial models follows the path of the progress that has been made in Bayesian computation techniques. Hepple (1995b) identifies four phases in the development of Bayesian computation techniques. In the first phase, Bayesian studies involved problems that can be characterized with well known probability distributions, such that the characteristics of posterior distributions, such as means and covariances, can be analytically derived. In the second phase, Bayesian studies focused on techniques through which problems involving multidimensional probability distributions can be reduced to univariate or bivariate integrations.
In this phase, numerical techniques for univariate and bivariate integration were used. In the third phase, Bayesian analysts worked on efficient procedures for higher order integrations. Gauss-Hermite, importance sampling and Monte Carlo integration techniques were used to tackle complicated and high-dimensional problems. In the fourth phase, Markov Chain Monte Carlo simulation techniques were introduced that make the computation of higher dimensional problems feasible. The advent of the MCMC approach represents a shift in thinking, where the focus

on the question of analytical moment calculation is replaced with the more general question of sampling from the posterior distributions (Chib, 2001; Albert and Chib, 1993; Casella and George, 1992). In the Gibbs sampling version of the MCMC approach, a joint posterior distribution is decomposed into conditional posterior distributions from which random draws (a simulated sample) can be obtained. The simulated samples are used to make inference about characteristics of the joint posterior distribution.

The development of Bayesian computation techniques provides a wide range of tools that can be applied to the estimation of spatial models. The early literature on the Bayesian perspective on spatial models uses combinations of tools developed during the first three phases. For example, Hepple (1979) analytically derives the joint posterior and marginal posterior distributions of parameters for a spatial model containing a spatial lag in the disturbance term. The posterior moments are calculated through univariate and bivariate numerical integration techniques. Anselin (1982, 1988) considers the Bayesian approach for pure spatial autoregressive and spatial error models. Diffuse priors for the parameters of the models are suggested and the marginal posterior distributions of the parameters are analytically derived. The posterior mean of the autoregressive parameter for a pure spatial autoregressive model in Anselin (1982) is estimated with univariate numerical integration. A small Monte Carlo simulation study in Anselin (1982) demonstrates that the Bayesian estimator performs as well as the ML estimator for larger values of the autoregressive parameter and larger samples.²

² To the best of our knowledge, Anselin (1982) is the first study in the spatial econometrics literature in which small sample properties of the ML estimator are compared with the frequentist properties of the Bayesian estimator.

Hepple (1995a,b) develops Bayesian analyses for major spatial specifications including the SARAR(1,0) model, the SARAR(0,1) or spatial error model (SEM), and spatial moving average models (SARMA(0,1) for short). In each case, the joint posterior distributions of the parameters are stated, from which the marginal posteriors of the spatial autoregressive parameters are analytically derived.
An analytical derivation of the marginal posterior distribution of the parameters of the exogenous variables is not available, as the spatial autoregressive parameters cannot be analytically integrated out from the joint posterior distribution. However, the dimension of the joint posterior can be reduced to two dimensions so that bivariate

numerical integration techniques can be used for the estimation of the marginal posterior moments. As a result, the estimate of the spatial autoregressive parameters can be obtained through a univariate numerical integration, and the estimates of the parameters of the exogenous variables can be obtained through a bivariate numerical integration.³ Hepple (2002, 2003) provides further analytical simplifications such that the moments of the marginal posterior distributions of the parameters of the exogenous variables in spatial models with only one autoregressive parameter can be obtained through univariate numerical integration. However, for the cases of SARAR(1,1) and SARMA(1,1), where there are two spatial autoregressive parameters, the calculation of these moments again requires bivariate integration over the parameter space of the autoregressive parameters.

³ Note that as the dimension of the spatial parameter increases, the dimension of the numerical integration will rise. For example, for the case of SARAR(1,1), the marginal posterior of the autoregressive parameters is two dimensional.

Recent studies use the MCMC approach to estimate spatial models. This approach is more appropriate for cases where marginal posterior distributions are difficult to simplify analytically and to integrate numerically. The MCMC approach is introduced for most types of spatial models in LeSage (1997) and LeSage and Pace (2009). Kakamu and Wago (2008) compare finite sample properties of the Bayesian estimators based on the MCMC approach with those of the ML estimator for the static panel spatial autoregressive model. The Monte Carlo simulation results in Kakamu and Wago (2008) show that the Bayesian estimator is virtually as efficient as the ML estimator.

In spatial models, the boundaries of the parameter space for the spatial autoregressive parameters are known, which facilitates the selection of proper uninformative priors for these parameters. Thus, in most studies, uniform priors over the parameter space are assigned to the spatial autoregressive parameters, as in LeSage and Pace (2009), LeSage and Fischer (2008), LeSage (1997), and Kakamu and Wago (2008). Oliveira and Song (2008) consider two versions of Jeffreys prior, called independence Jeffreys and Jeffreys-rule priors, for the spatial autoregressive parameter in the case of SARAR(0,1). Jeffreys priors are uninformative improper priors that are constructed from the information matrix. The appealing aspect of these priors is that they are robust to re-parametrization of the model. For the case of SARAR(0,1), Oliveira and Song (2008) state the marginal posterior density of the autoregressive parameter up to a constant of proportionality and use the adaptive rejection Metropolis sampling algorithm to simulate random draws. In a Monte Carlo study, they show that the Bayesian inference based on the independence Jeffreys prior is slightly better than the inference based on both uniform and Jeffreys-rule priors.

The Bayesian estimators mentioned above are designed for the cases where spatial models have homoskedastic disturbances. The treatment of an unknown form of heteroskedasticity in the Bayesian approach starts by assuming a structure for the unknown covariance matrix of the disturbance term. First, it is assumed that the covariance matrix can be decomposed into a constant component and a component that varies over observations. Second, a prior density function is assigned to each component so that the joint posterior density of the parameters can be constructed. The most important assumption is that the component that varies over observations is i.i.d. with a common prior density function. In this approach, a prior density function for the parameter of the common density function is also specified. In the literature, priors determined in this way are known as hierarchical priors. This feature of the Bayesian approach allows flexible models through which any distribution for the disturbance term can be approximated with a high degree of accuracy (Koop). Once the prior density functions are determined, the joint posterior density function can be easily constructed, from which a full set of conditional posterior density functions can be obtained to form an MCMC sampling scheme. The random draws (the simulated sample) are then used to calculate the relevant moments for the estimation.

LeSage (1997) is the first study in which the hierarchical Bayesian approach to unknown heteroskedasticity outlined above is introduced for spatial autoregressive models. Following Geweke (1993), LeSage (1997) assigns an i.i.d. chi-square distribution as the prior for the elements of the component of the covariance matrix that is assumed to vary over observations. The sampling for the spatial autoregressive parameter is carried out through the ratio-of-uniforms sampling algorithm. LeSage and Pace (2009) use a Metropolis-Hastings algorithm based on a tuned random walk procedure to sample from the conditional posterior density function of the autoregressive parameters, which forms a Metropolis-within-Gibbs sampling procedure for the whole model. LeSage and Fischer (2008) consider the hierarchical Bayesian approach for the estimation of a conditional autoregressive (CAR) spatial

model, in which the spatially structured random component of the model is assumed to have an autoregressive process.

It is of interest to investigate the performance of the aforementioned heteroskedasticity-robust GMM and Bayesian estimators through a Monte Carlo study and empirical applications. In this study, we compare the small sample properties of these estimators through a Monte Carlo study.⁴ In this respect, we make three contributions to the literature. First, this study is the first to compare the mentioned approaches to unknown heteroskedasticity. For the cases of SARAR(1,0), SARAR(1,1) and SARAR(0,1), the robust estimation methods from both perspectives are outlined. Second, we introduce the robust GMM estimator for the SARAR(0,1) specification along the lines of Lin and Lee (2010), and investigate its finite sample properties. Third, we provide two empirical examples based on real world data sets varying in size to shed light on the effect of heteroskedasticity on the parameter estimates.

⁴ To the best of our knowledge, there is no such study in the literature.

In the following, Section 2 provides the model assumptions that are required for the formal study of autoregressive spatial models. Sections 3 and 4 build the robust GMM approach and the Bayesian hierarchical approach for the cases of SARAR(1,0), SARAR(1,1) and SARAR(0,1). Section 5 contains the Monte Carlo experiments we use to provide finite sample properties of the estimators. In Section 6, we provide two empirical illustrations. Section 7 concludes.

2.2 Model Specification and Assumptions

In this study, the following first order SARAR(1,1) specification is considered:

$$
Y_n = \lambda_0W_nY_n + X_n\beta_0 + u_n, \qquad u_n = \rho_0M_nu_n + \varepsilon_n, \tag{2.1}
$$

where $Y_n$ is an $n \times 1$ vector of the dependent variable, $X_n$ is an $n \times k$ matrix of nonstochastic exogenous variables, $W_n$ and $M_n$ are $n \times n$ spatial weight matrices of known constants with zero diagonal elements, and $\varepsilon_n$ is an $n \times 1$ vector of disturbances (or innovations). The variables $W_nY_n$ and $M_nu_n$ are known, respectively, as the spatial lag of the dependent variable and the disturbance term. The spatial effect parameters $\lambda_0$ and $\rho_0$ are known as the spatial autoregressive parameters. The model specifications with $\lambda_0 \ne 0$, $\rho_0 = 0$ and $\lambda_0 = 0$, $\rho_0 \ne 0$ are known, respectively, as SARAR(1,0) and SARAR(0,1) in the literature. Let $\Theta$ be the parameter space of the model. In order to distinguish the true parameter vector from other possible values in $\Theta$, the model is stated with the true parameter vector $\theta_0 = (\rho_0, \varsigma_0')'$ with $\varsigma_0 = (\lambda_0, \beta_0')'$. For notational simplicity, we denote $S_n(\lambda) = I_n - \lambda W_n$, $R_n(\rho) = I_n - \rho M_n$, $G_n(\lambda) = W_nS_n^{-1}(\lambda)$ and $H_n(\rho) = M_nR_n^{-1}(\rho)$. Also, at the true parameter values $(\rho_0, \lambda_0)$, we denote $S_n(\lambda_0) = S_n$, $R_n(\rho_0) = R_n$, $G_n(\lambda_0) = G_n$, $H_n(\rho_0) = H_n$ and $\bar G_n = R_nG_nR_n^{-1}$.

The model is considered under the following assumptions.

Assumption 1: The elements $\varepsilon_i$ of the disturbance term $\varepsilon_n$ are distributed independently with mean zero and variance $\sigma_i^2$, and $E|\varepsilon_i|^{\nu} < \infty$ for some $\nu > 4$ for all $n$ and $i$.

This assumption allows independent and heteroskedastic disturbances. The elements of the disturbance term have moments higher than the fourth moment. This moment condition is required for the application of the central limit theorem for quadratic forms given in Kelejian and Prucha (2010). In addition, the variance of a quadratic form in $\varepsilon_n$ exists and is finite when the first four moments are finite.

Assumption 2: The spatial weight matrices $M_n$ and $W_n$ are uniformly bounded in absolute value in row and column sums. Moreover, $S_n^{-1}$, $S_n^{-1}(\lambda)$, $R_n^{-1}$ and $R_n^{-1}(\rho)$ exist and are uniformly bounded in absolute value in row and column sums for all values of $\rho$ and $\lambda$ in a compact parameter space.

The uniform boundedness of the terms in Assumptions 1 and 2 is motivated by the need to control spatial autocorrelations in the model at a tractable level (Kelejian and Prucha).⁵ Assumption 2 also implies that the model in (2.1) represents an equilibrium relation for the dependent variable. By this assumption, the reduced form of the model becomes feasible as $Y_n = S_n^{-1}X_n\beta_0 + S_n^{-1}R_n^{-1}\varepsilon_n$. Finally, the statement of Assumption 2 is assumed to hold at the true and at an arbitrary autoregressive parameter vector. The uniform boundedness of $S_n^{-1}(\lambda)$ and $R_n^{-1}(\rho)$ is required for the ML estimator, not for the GMM estimator (Liu et al., 2010).

⁵ For a definition and some properties of uniform boundedness see Kelejian and Prucha
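The reduced form above suggests a direct way to generate data from the SARAR(1,1) model with heteroskedastic innovations. The following is a minimal sketch: the ring-shaped weight matrix, the parameter values, the regressors and the variance pattern are all hypothetical choices for illustration, not the design used in this study.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical row-normalized weights with zero diagonal (two ring neighbors)."""
    eye = np.eye(n)
    return 0.5 * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))

def simulate_sarar(W, M, X, beta0, lam0, rho0, sigma, rng):
    """Draw Y from Y = lam0*W*Y + X*beta0 + u, u = rho0*M*u + eps, eps_i ~ N(0, sigma_i^2)."""
    n = W.shape[0]
    S = np.eye(n) - lam0 * W          # S_n = I - lambda_0 W
    R = np.eye(n) - rho0 * M          # R_n = I - rho_0 M
    eps = sigma * rng.normal(size=n)  # independent, heteroskedastic innovations
    u = np.linalg.solve(R, eps)       # u = R^{-1} eps
    Y = np.linalg.solve(S, X @ beta0 + u)  # Y = S^{-1} (X beta_0 + u)
    return Y, u, eps

rng = np.random.default_rng(1)
n = 200
W = M = ring_weights(n)
X = rng.normal(size=(n, 2))
beta0 = np.array([0.7, 0.4])
sigma = np.sqrt(rng.uniform(0.5, 2.0, size=n))  # assumed standard deviations
Y, u, eps = simulate_sarar(W, M, X, beta0, 0.3, 0.3, sigma, rng)
```

Solving the two linear systems avoids forming the inverses explicitly; applying $R_n(S_nY_n - X_n\beta_0)$ to the simulated data recovers the innovations.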

In the literature, the parameter space for the spatial autoregressive parameters $\lambda_0$ and $\rho_0$ is restricted to the interval $(-1, 1)$ when the spatial weight matrices are row normalized.⁶ The following assumptions are the usual regularity conditions required for the GMM estimator. Throughout this study, the vector of moment functions considered for the GMM estimator is of the form $g_n(\theta_0) = \big(\varepsilon_n'P_{1n}\varepsilon_n, \ldots, \varepsilon_n'P_{mn}\varepsilon_n, \varepsilon_n'Q_n\big)'$. The moment functions involving the constant matrices $P_{jn}$ for $j = 1, \ldots, m$ are known as quadratic moment functions. The last moment function $Q_n'\varepsilon_n$ is the linear moment function, where $Q_n$ is an $n \times k^*$ matrix with full column rank and with $k^* \ge k + 1$. The matrices $P_{jn}$ and $Q_n$ are chosen in such a way that the orthogonality conditions of the population moment functions are not violated. Let $\mathcal P_{2n}$ be the class of $n \times n$ constant matrices with zero diagonal elements. The quadratic moment functions involving matrices from this class satisfy the orthogonality conditions when the disturbance terms satisfy Assumption 1.⁷ Assumption 4 states regularity conditions for these matrices and the last assumption characterizes the parameter space.

⁶ There are some other formulations for the parameter space of the autoregressive parameter; for details see Kelejian and Prucha (2007).

⁷ To see this, let $\Sigma_n = \mathrm{Diag}(\sigma_1^2, \ldots, \sigma_n^2)$ be the covariance matrix of the disturbance terms. Then, $E(\varepsilon_n'P_{jn}\varepsilon_n) = \mathrm{tr}\big(P_{jn}E(\varepsilon_n\varepsilon_n')\big) = \mathrm{tr}(P_{jn}\Sigma_n) = 0$.

Assumption 3: The regressors matrix $X_n$ is an $n \times k$ matrix consisting of uniformly bounded constant elements. It has full column rank $k$. Moreover, $\lim_{n\to\infty}\frac{1}{n}X_n'X_n$ exists and is nonsingular.

Assumption 4: The elements of the IV matrix $Q_n$ are uniformly bounded. $P_{jn}$ for $j = 1, \ldots, m$ is uniformly bounded in absolute value in row and column sums.

Assumption 5: The parameter space $\Theta$ is a compact subset of $\mathbb R^{k+2}$ and $\theta_0 \in \mathrm{Int}(\Theta)$.

2.3 Robust GMM Estimation of Spatial Autoregressive Models

In this section, the robust GMM estimators for the cases of SARAR(1,0), SARAR(1,1) and SARAR(0,1) are reviewed. For the SARAR(1,0) specification, where $\rho_0 = 0$, the reduced form of the model is $Y_n = S_n^{-1}X_n\beta_0 + S_n^{-1}\varepsilon_n$. Then, the spatial lag of the dependent

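variable can be expressed as $W_nY_n = G_nX_n\beta_0 + G_n\varepsilon_n$. This result indicates that the endogenous variable $W_nY_n$ is a function of a non-stochastic term $G_nX_n\beta_0$ and a stochastic term $G_n\varepsilon_n$. This decomposition motivates the formation of the moment functions for the GMM estimation. The non-stochastic part is instrumented by $Q_n = (G_nX_n\beta_0, X_n)$, which forms the population moment function $Q_n'\varepsilon_n$ (Lee, 2003, 2007a). The stochastic part is instrumented by $P_{jn}\varepsilon_n$, for $j = 1, \ldots, m$, where $P_{jn} \in \mathcal P_{2n}$. In this case, the moment function is of the form $\varepsilon_n'P_{jn}\varepsilon_n$ (Lee, 2007a). The robust GMM estimator in Lin and Lee (2010) is based on a single quadratic moment function. Lee (2007a) shows that when the disturbance terms are i.i.d., the best selection of $P_{jn} \in \mathcal P_{2n}$ is $P_n = G_n - \mathrm{Diag}(G_n)$, where $\mathrm{Diag}(A)$ returns a diagonal matrix created from the diagonal elements of the matrix $A$. Lin and Lee (2010) suggest that this selection can also be used for the case where the disturbance terms satisfy Assumption 1. Consider the following set of moment functions:

$$
g_{1n}(\varsigma) = \begin{pmatrix} \varepsilon_n'(\varsigma)\big(G_n - \mathrm{Diag}(G_n)\big)\varepsilon_n(\varsigma) \\ (G_nX_n\beta_0, X_n)'\varepsilon_n(\varsigma) \end{pmatrix}, \tag{2.2}
$$

where $\varepsilon_n(\varsigma) = S_n(\lambda)Y_n - X_n\beta$. Let $\Omega_{1n} = \frac{1}{n}E\big[g_{1n}(\varsigma_0)g_{1n}'(\varsigma_0)\big]$ and $\Gamma_{1n} = -\frac{1}{n}E\Big[\frac{\partial g_{1n}(\varsigma_0)}{\partial\varsigma'}\Big]$. Straightforward calculations show that⁸

$$
\Omega_{1n} = \frac{1}{n}\begin{pmatrix} \mathrm{tr}\big[\Sigma_nP_n(P_n'\Sigma_n + \Sigma_nP_n)\big] & 0_{1\times(k+1)} \\ 0_{(k+1)\times 1} & Q_n'\Sigma_nQ_n \end{pmatrix}, \tag{2.3}
$$

$$
\Gamma_{1n} = \frac{1}{n}\begin{pmatrix} \mathrm{tr}\big[\Sigma_n(P_n + P_n')G_n\big] & 0_{1\times k} \\ Q_n'G_nX_n\beta_0 & Q_n'X_n \end{pmatrix}, \tag{2.4}
$$

where $\Sigma_n = \mathrm{Diag}(\sigma_1^2, \ldots, \sigma_n^2)$ is the covariance matrix of the disturbance terms.

⁸ Lemma 2 in the Appendix can be used to derive the $\Omega$ matrices in this section.

Then,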

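the robust GMM estimator is defined by

$$
\hat\varsigma_n = (\hat\lambda_n, \hat\beta_n')' = \mathrm{argmin}_{\varsigma\in\Theta}\; g_{1n}'(\varsigma)\hat\Omega_{1n}^{-1}g_{1n}(\varsigma), \tag{2.5}
$$

where $\hat\Omega_{1n}$ is a consistent estimate of $\Omega_{1n}$, which can be constructed from a 2SLS estimator based on the instrument matrix $(W_n^2X_n, W_nX_n, X_n)$.⁹ Note that $P_n$ and $Q_n$ also involve unknown parameters. To surmount this problem, an initial GMM estimator of $\varsigma_0$ can be obtained from the moment functions involving $P_n^{+} = W_n'W_n - \mathrm{Diag}(W_n'W_n)$ and $Q_n^{+} = (W_n^2X_n, W_nX_n, X_n)$.¹⁰ Lin and Lee (2010) show that the robust GMM estimator defined in (2.5) is consistent and asymptotically normally distributed, namely

$$
\sqrt n(\hat\varsigma_n - \varsigma_0) \xrightarrow{d} N\Big(0, \big[\lim_{n\to\infty}\Gamma_{1n}'\Omega_{1n}^{-1}\Gamma_{1n}\big]^{-1}\Big). \tag{2.6}
$$

⁹ In that case, for $\hat\Sigma_n$ in $\hat\Omega_{1n}$, we can use $\hat\Sigma_n = \mathrm{Diag}(\hat\varepsilon_1^2, \ldots, \hat\varepsilon_n^2)$, where the $\hat\varepsilon_i$ are residuals based on the 2SLS estimator.

¹⁰ For some other consistent initial estimators, see Lin and Lee (2010).

For the case of SARAR(1,1), the specification of the regression model implies $R_nY_n = \lambda_0R_nW_nY_n + R_nX_n\beta_0 + \varepsilon_n$, where instruments are needed for $R_nW_nY_n$. The endogenous variable can be written as $R_nW_nY_n = R_nG_nX_n\beta_0 + R_nG_nR_n^{-1}\varepsilon_n$, which motivates the following set of moment functions:

$$
g_{2n}(\theta) = \begin{pmatrix} \varepsilon_n'(\theta)\big(H_n - \mathrm{Diag}(H_n)\big)\varepsilon_n(\theta) \\ \varepsilon_n'(\theta)\big(\bar G_n - \mathrm{Diag}(\bar G_n)\big)\varepsilon_n(\theta) \\ (R_nG_nX_n\beta_0, R_nX_n)'\varepsilon_n(\theta) \end{pmatrix}, \tag{2.7}
$$

where $\varepsilon_n(\theta) = R_n(\rho)S_n(\lambda)Y_n - R_n(\rho)X_n\beta$. For the set of moment functions in (2.7),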

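let $\Omega_{2n} = \frac{1}{n}E\big[g_{2n}(\theta_0)g_{2n}'(\theta_0)\big]$ and $\Gamma_{2n} = -\frac{1}{n}E\Big[\frac{\partial g_{2n}(\theta_0)}{\partial\theta'}\Big]$, which are given as follows:

$$
\Omega_{2n} = \frac{1}{n}\begin{pmatrix}
\mathrm{tr}\big[\Sigma_nP_{1n}(P_{1n}'\Sigma_n + \Sigma_nP_{1n})\big] & \mathrm{tr}\big[\Sigma_nP_{1n}(P_{2n}'\Sigma_n + \Sigma_nP_{2n})\big] & 0_{1\times(k+1)} \\
\mathrm{tr}\big[\Sigma_nP_{2n}(P_{1n}'\Sigma_n + \Sigma_nP_{1n})\big] & \mathrm{tr}\big[\Sigma_nP_{2n}(P_{2n}'\Sigma_n + \Sigma_nP_{2n})\big] & 0_{1\times(k+1)} \\
0_{(k+1)\times 1} & 0_{(k+1)\times 1} & Q_n'\Sigma_nQ_n
\end{pmatrix}, \tag{2.8}
$$

$$
\Gamma_{2n} = \frac{1}{n}\begin{pmatrix}
\mathrm{tr}\big[\Sigma_nH_n'(P_{1n} + P_{1n}')\big] & \mathrm{tr}\big[\Sigma_n\bar G_n'(P_{1n} + P_{1n}')\big] & 0_{1\times k} \\
\mathrm{tr}\big[\Sigma_nH_n'(P_{2n} + P_{2n}')\big] & \mathrm{tr}\big[\Sigma_n\bar G_n'(P_{2n} + P_{2n}')\big] & 0_{1\times k} \\
0_{(k+1)\times 1} & Q_n'R_nG_nX_n\beta_0 & Q_n'R_nX_n
\end{pmatrix}, \tag{2.9}
$$

where $P_{1n} = H_n - \mathrm{Diag}(H_n)$, $P_{2n} = \bar G_n - \mathrm{Diag}(\bar G_n)$, $Q_n = (R_nG_nX_n\beta_0, R_nX_n)$ and $\Sigma_n = \mathrm{Diag}(\sigma_1^2, \ldots, \sigma_n^2)$. Then, the robust GMM estimator is defined by

$$
\hat\theta_n = \mathrm{argmin}_{\theta\in\Theta}\; g_{2n}'(\theta)\hat\Omega_{2n}^{-1}g_{2n}(\theta), \tag{2.10}
$$

where $\hat\Omega_{2n}$ is a consistent estimate of $\Omega_{2n}$, which can be obtained from an initial GMM estimator based on $P_{1n}^{+} = M_n'M_n - \mathrm{Diag}(M_n'M_n)$, $P_{2n}^{+} = M_n'W_n - \mathrm{Diag}(M_n'W_n)$ and $Q_n^{+} = (M_nW_nX_n, W_nX_n, X_n)$.¹¹ It can be shown that¹²

$$
\sqrt n(\hat\theta_n - \theta_0) \xrightarrow{d} N\Big(0, \big[\lim_{n\to\infty}\Gamma_{2n}'\Omega_{2n}^{-1}\Gamma_{2n}\big]^{-1}\Big). \tag{2.11}
$$

¹¹ The estimator in Kelejian and Prucha (2010) can also be used as an initial estimator. For some other candidates, see the first essay. The details about the estimates of $\Omega_{2n}$ and $\Gamma_{2n}$ are provided in the first essay of this dissertation.

¹² This result is proved in the first essay of this dissertation.

Next, we consider the robust GMM estimator for the SARAR(0,1) specification, which is nested in the SARAR(1,1) model. The set of moment functions in (2.7) implies the following set of moment functions for the one-step GMM estimation of SARAR(0,1):

$$
g_{3n}(\rho, \beta) = \begin{pmatrix} \varepsilon_n'(\rho,\beta)\big(H_n - \mathrm{Diag}(H_n)\big)\varepsilon_n(\rho,\beta) \\ (R_nX_n)'\varepsilon_n(\rho,\beta) \end{pmatrix}, \tag{2.12}
$$

where $\varepsilon_n(\rho,\beta) = R_n(\rho)Y_n - R_n(\rho)X_n\beta$. Denote $\delta = (\rho, \beta')'$. For this set of moment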

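functions, let $\Omega_{3n} = E[g_{3n}(\delta_0)\, g_{3n}'(\delta_0)]$ and $\Gamma_{3n} = -E\big[\partial g_{3n}(\delta_0) / \partial \delta'\big]$, which are given as follows:

$$\Omega_{3n} = \begin{pmatrix} \mathrm{tr}[\Sigma_n P_n^{+}(\Sigma_n P_n^{+\prime} + \Sigma_n P_n^{+})] & 0_{1 \times k} \\ 0_{k \times 1} & X_n' R_n' \Sigma_n R_n X_n \end{pmatrix}, \tag{2.13}$$

$$\Gamma_{3n} = \begin{pmatrix} \mathrm{tr}[\Sigma_n (P_n^{+} + P_n^{+\prime}) H_n] & 0_{1 \times k} \\ 0_{k \times 1} & X_n' R_n' R_n X_n \end{pmatrix}, \tag{2.14}$$

where $P_n^{+} = H_n - \mathrm{Diag}(H_n)$. Then, the one-step GMM estimator is defined by

$$\hat{\delta}_n = \arg\min_{\delta \in \Theta} g_{3n}'(\delta)\, \hat{\Omega}_{3n}^{-1} g_{3n}(\delta). \tag{2.15}$$

The robust GMM estimator $\hat{\delta}_n$ is consistent and asymptotically normally distributed with the asymptotic covariance matrix $\lim_{n \to \infty}\big[\tfrac{1}{n}\,\Gamma_{3n}' \Omega_{3n}^{-1} \Gamma_{3n}\big]^{-1}$.[13] Initial consistent estimators are required for the estimator defined in (2.15). An initial robust GMM estimator can be obtained from the set of moment functions $g_n(\rho, \beta) = \big(\varepsilon_n'(\rho, \beta) P_n \varepsilon_n(\rho, \beta),\; \varepsilon_n'(\rho, \beta) X_n\big)'$, where $P_n = M_n'M_n - \mathrm{Diag}(M_n'M_n)$.[14]

Liu et al. (2010) suggest a one-step best GMM estimator for the case of a SARAR(0,1) when the disturbance terms are simply i.i.d. In the case of heteroskedastic innovations, the estimators in Liu et al. (2010) are inconsistent, since the orthogonality conditions of the moment functions are violated. Lee (2001) suggests a feasible generalized least squares (GLS) estimator for the parameters of the exogenous variables and a robust method of moments (MOM) estimator for the autoregressive parameter when the disturbance terms are i.i.d. The estimation approach in Lee (2001) can easily be extended to the case of heteroskedastic innovations. Let $\hat{\beta}_{ols} = (X_n'X_n)^{-1} X_n' Y_n$ and $\hat{u}_n = Y_n - X_n \hat{\beta}_{ols}$. Following Lee (2001), we consider the following robust MOM estimator for $\rho_0$:

$$\hat{\rho}_n = \arg\min_{\rho \in \Theta} \big[\hat{u}_n' R_n'(\rho)\, P_n^{+} R_n(\rho)\, \hat{u}_n\big]^2, \tag{2.16}$$

[13] This result directly follows from Proposition 3 of the first essay in this dissertation.
[14] Note that $P_n$ is suggested by Kelejian and Prucha.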

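where $P_n^{+} = H_n - \mathrm{Diag}(H_n)$. Lee (2001) shows that the estimator in (2.16) is the best MOM estimator in the sense that it is the most efficient estimator in the class of MOM estimators obtained from quadratic moment functions involving matrices $P_{jn} \in \mathcal{P}_{2n}$, when the disturbances are i.i.d. For the heteroskedastic case, the best quadratic moment matrix may not be available, as the variance of $\hat{\rho}_n$ involves the unknown $\Sigma_n$. Despite this, the best quadratic moment matrix $P_n^{+}$ of the i.i.d. case can be used for the case of heteroskedastic innovations.[15] For notational simplification, let $P^{s} = P + P'$ for any matrix $P$. The limiting distribution of the estimator in (2.16) is

$$\sqrt{n}\,(\hat{\rho}_n - \rho_0) \xrightarrow{d} N\Big(0, \lim_{n \to \infty}\Big[\tfrac{1}{n}\,\mathrm{tr}^{2}(\Sigma_n P_n^{+s} H_n)\, \mathrm{tr}^{-1}(\Sigma_n P_n^{+} \Sigma_n P_n^{+s})\Big]^{-1}\Big). \tag{2.17}$$

Lee (2001) also analytically derives the consistent root from the objective function defined in (2.16). Writing $A_n = H_n - \mathrm{Diag}(H_n)$, the consistent root is

$$\hat{\rho}_n = \frac{\hat{u}_n' A_n^{s} M_n \hat{u}_n - \Big[\big(\hat{u}_n' A_n^{s} M_n \hat{u}_n\big)^2 - 4\,\big(\hat{u}_n' M_n' A_n M_n \hat{u}_n\big)\big(\hat{u}_n' A_n \hat{u}_n\big)\Big]^{1/2}}{2\, \hat{u}_n' M_n' A_n M_n \hat{u}_n}. \tag{2.18}$$

The robust MOM estimators defined in (2.16) and (3.3) with $P_n^{+} = H_n - \mathrm{Diag}(H_n)$ are not feasible, since $P_n^{+}$ involves the unknown parameter $\rho_0$. As usual, $P_n^{+}$ can be made available with an initial consistent estimator of $\rho_0$. An initial estimator of $\rho_0$ can be found by replacing $P_n^{+}$ in (2.16) and (3.3) with $P_{2n} = M_n$ or $P_{3n} = M_n'M_n - \mathrm{Diag}(M_n'M_n)$. Once an estimate of $\rho_0$ is obtained, the GLS estimator of $\beta_0$ can be written as

$$\hat{\beta}_{gls} = \big[X_n' R_n'(\hat{\rho}_n) \Sigma_n^{-1} R_n(\hat{\rho}_n) X_n\big]^{-1} X_n' R_n'(\hat{\rho}_n) \Sigma_n^{-1} R_n(\hat{\rho}_n) Y_n,$$

where $\Sigma_n = \mathrm{Diag}\{\sigma_{1n}^2, \ldots, \sigma_{nn}^2\}$. It can be shown that

$$\sqrt{n}\,(\hat{\beta}_{gls} - \beta_0) \xrightarrow{d} N\Big(0, \Big[\lim_{n \to \infty} \tfrac{1}{n}\, X_n' R_n' \Sigma_n^{-1} R_n X_n\Big]^{-1}\Big). \tag{2.19}$$

[15] Lin and Lee (2010) also use quadratic moment matrices from the i.i.d. case for the heteroskedastic case.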
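The quadratic structure behind the MOM objective in (2.16) can be illustrated numerically. The following is a minimal sketch, not the author's code: the helper names `zero_diag` and `mom_roots` are hypothetical, and the moment matrix $M'M - \mathrm{Diag}(M'M)$ is used only as one possible initial choice. With $R(\rho) = I - \rho M$, the moment $\hat{u}'R'(\rho) P R(\rho)\hat{u}$ is a quadratic in $\rho$, so its roots are available in closed form.

```python
import numpy as np

def zero_diag(A):
    """Return A - Diag(A); then tr(P @ Sigma) = 0 for every diagonal Sigma."""
    return A - np.diag(np.diag(A))

def mom_roots(u, M, P):
    """Roots in rho of u' R(rho)' P R(rho) u = 0, where R(rho) = I - rho * M.

    Expanding the quadratic form gives a*rho**2 - b*rho + c = 0 with
    a = u'M'PMu, b = u'(P + P')Mu and c = u'Pu.
    Returns (smaller root, larger root), or None if no real root exists.
    """
    Mu = M @ u
    a = Mu @ P @ Mu
    b = u @ (P + P.T) @ Mu
    c = u @ P @ u
    disc = b**2 - 4.0 * a * c
    if a == 0 or disc < 0:
        return None
    root = np.sqrt(disc)
    return (b - root) / (2.0 * a), (b + root) / (2.0 * a)

# Illustrative use with an arbitrary row-normalized weight matrix (assumption).
rng = np.random.default_rng(0)
n = 8
M = zero_diag(rng.uniform(size=(n, n)))
M = M / M.sum(axis=1, keepdims=True)
P = zero_diag(M.T @ M)
```

Which of the two roots is the consistent one depends on the model; the sketch returns both and leaves that choice to the user.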

The above arguments indicate that the asymptotic distribution of $\hat{\beta}_{gls}$ does not depend on the asymptotic distribution of the initial consistent estimator of $\rho_0$. However, the efficient estimation of $\rho_0$ affects the power of test statistics designed for the presence of spatial autocorrelation. The efficiency of $\hat{\beta}_{gls}$ and the MOM estimator $\hat{\rho}_n$ can be compared with that of the one-step GMM estimators defined in (2.15). Straightforward calculations show that the robust one-step GMM estimator of $\rho_0$ is asymptotically equivalent to the MOM estimator $\hat{\rho}_n$.[16] The robust one-step GMM estimator of $\beta_0$ is asymptotically equivalent to the OLS estimator from the regression $R_n Y_n = R_n X_n \beta_0 + \varepsilon_n$.[17] Therefore, the GLSE $\hat{\beta}_{gls}$ is more efficient. However, the GLSE is not feasible since it involves the unknown covariance matrix $\Sigma_n$. The covariance of the robust one-step GMM estimator of $\beta_0$ is given by $(X_n' R_n' R_n X_n)^{-1} X_n' R_n' \Sigma_n R_n X_n (X_n' R_n' R_n X_n)^{-1}$. In this case, as White (1980) shows, the exact structure of $\Sigma_n$ is not required to make appropriate inferences. That is, the middle term $X_n' R_n' \Sigma_n R_n X_n$ can be consistently estimated without being able to estimate $\Sigma_n$ consistently. A consistent estimate is $X_n' R_n'(\hat{\rho}_n) \hat{\Sigma}_n R_n(\hat{\rho}_n) X_n$, where $\hat{\rho}_n$ is a consistent estimator of $\rho_0$ and $\hat{\Sigma}_n = \mathrm{Diag}(\hat{\varepsilon}_{1n}^2, \ldots, \hat{\varepsilon}_{nn}^2)$, with OLS residuals $\hat{\varepsilon}_{in}$ from the regression $Y_n = X_n \beta_0 + \varepsilon_n$.

[16] They have the same limiting covariance matrix. This result will not be valid for the i.i.d. innovations case. See Liu et al.
[17] This can be seen from the linear moment function in the second row of (2.12). From (2.13) and (2.14), the covariance of the robust GMM estimator of $\beta_0$ is $(X_n' R_n' R_n X_n)^{-1} X_n' R_n' \Sigma_n R_n X_n (X_n' R_n' R_n X_n)^{-1}$, which is the covariance of the OLS estimator obtained from the regression $R_n Y_n = R_n X_n \beta_0 + \varepsilon_n$.

2.4 Bayesian MCMC Estimation of Spatial Autoregressive Models

The model in (2.1) under Assumption 1 has an unknown form of heteroskedasticity. Let $\Sigma_n = \mathrm{Diag}(\sigma_{1n}^2, \ldots, \sigma_{nn}^2)$ be the diagonal variance-covariance matrix of the disturbance terms. For Bayesian estimation, we assume that $\Sigma_n = \sigma_0^2 V_n$, where $V_n = \mathrm{Diag}(v_1, \ldots, v_n)$ with $v_i = \sigma_{in}^2 / \sigma_0^2$ for $i = 1, \ldots, n$. This assumption indicates that the heteroskedasticity specification has two components: (i) a constant component $\sigma_0^2$, and (ii) a component $v_i$ that varies over observations. The constant component is an arbitrary term that facilitates the Bayesian analysis (Koop, 2003, Griffiths). In this section, we review the Bayesian

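MCMC approach for the spatial autoregressive models.

2.4.1 Bayesian MCMC Estimation of SARAR(1,0)

First, we consider the case of SARAR(1,0), where $\rho_0 = 0$ in (2.1). Let $\nu = (v_1, \ldots, v_n)'$ be an $n \times 1$ vector. Then, the likelihood function of this model can be written as

$$L(Y_n \mid \varsigma, \sigma^2, \nu) = (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |S_n(\lambda)| \exp\Big\{-\tfrac{1}{2}\,(S_n(\lambda) Y_n - X_n \beta)'(\sigma^2 V_n)^{-1}(S_n(\lambda) Y_n - X_n \beta)\Big\}. \tag{2.20}$$

It will be convenient to write the exponent of the above likelihood function in terms of generalized least squares (GLS) quantities. To this end, for a given value of $\lambda$, the GLS estimator from the regression $S_n(\lambda) Y_n = X_n \beta + \varepsilon_n$ is $\hat{\beta} = (X_n' V_n^{-1} X_n)^{-1} X_n' V_n^{-1} S_n(\lambda) Y_n$. Let $SSE = (S_n(\lambda) Y_n - X_n \hat{\beta})'(\sigma^2 V_n)^{-1}(S_n(\lambda) Y_n - X_n \hat{\beta})$ be the weighted residual sum of squares. Then, the exponent of the likelihood function can be written as

$$\exp\Big\{-\tfrac{1}{2}\,(S_n(\lambda) Y_n - X_n \beta)'(\sigma^2 V_n)^{-1}(S_n(\lambda) Y_n - X_n \beta)\Big\} = \exp\Big\{-\tfrac{1}{2}\Big[SSE + (\beta - \hat{\beta})' X_n'(\sigma^2 V_n)^{-1} X_n (\beta - \hat{\beta})\Big]\Big\}. \tag{2.21}$$

Thus, the likelihood function in (2.20) can be written in terms of GLS quantities in the following way:

$$L(Y_n \mid \varsigma, \sigma^2, \nu) = (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |S_n(\lambda)| \exp\Big\{-\tfrac{1}{2}\Big[SSE + (\beta - \hat{\beta})' X_n'(\sigma^2 V_n)^{-1} X_n (\beta - \hat{\beta})\Big]\Big\}. \tag{2.22}$$

Following Geweke (1993) and LeSage and Pace (2009), we assume the following prior distributions for the parameters of SARAR(1,0): (i) $\beta \sim N(c, T)$, (ii) $\sigma^2 \sim IG(a, b)$, where $IG(a, b)$ is the inverse gamma density function with shape parameter $a$ and scale parameter $b$, (iii) $r / v_i \sim$ i.i.d. $\chi^2(r)$ for $i = 1, \ldots, n$, where the degrees of freedom parameter $r$ has a gamma distribution, namely $r \sim \Gamma(m, k)$, (iv) $\lambda \sim U(-1/\tau, 1/\tau)$, where $U$ denotes the uniform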

distribution function, and $\tau$ is the spectral radius of $W_n$.[18] All prior distribution functions are assumed to be independent. The prior distribution of $r / v_i$ is suggested by Geweke (1993). A non-spatial linear regression model with covariance matrix $\sigma_0^2 V_n$, where the $v_i$'s are independently and identically distributed, corresponds to a linear regression model with disturbances having a scale mixture of normal distributions (Geweke, 1993, Chib, 2001). In particular, Geweke (1993) shows that the posterior distribution function of the parameters of a non-spatial linear regression model under the prior assumption $r / v_i \sim$ i.i.d. $\chi^2(r)$ for $i = 1, \ldots, n$ is proportional to the posterior distribution function of a non-spatial linear regression model whose disturbances are i.i.d. $t(r)$, where $t$ denotes the univariate Student-t distribution. The prior density functions are given in the following equations.

$$p(\beta) = (2\pi)^{-k/2} |T|^{-1/2} \exp\Big\{-\tfrac{1}{2}\,(\beta - c)' T^{-1} (\beta - c)\Big\}, \tag{2.23}$$

$$p(\sigma^2) = \frac{b^{a}}{\Gamma(a)}\, (\sigma^2)^{-(a+1)} \exp\Big\{-\frac{b}{\sigma^2}\Big\}, \tag{2.24}$$

$$p(\nu) = \Big(\frac{r}{2}\Big)^{nr/2} \Big[\Gamma\Big(\frac{r}{2}\Big)\Big]^{-n} \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\Big\{-\frac{r}{2 v_i}\Big\}, \tag{2.25}$$

$$p(r) = \frac{1}{k^{m}\, \Gamma(m)}\, r^{m-1} \exp\Big\{-\frac{r}{k}\Big\}, \quad \text{for } r > 0 \text{ and } m, k > 0, \tag{2.26}$$

$$p(\lambda) = \begin{cases} \tau / 2, & \text{if } \lambda \in (-1/\tau, 1/\tau), \\ 0, & \text{otherwise.} \end{cases} \tag{2.27}$$

Note that $p(\nu)$ is the joint prior distribution function of $v_i$ for $i = 1, \ldots, n$.[19] In the formulation of the prior density functions, $\Gamma(\cdot)$ denotes the Gamma function defined by $\Gamma(a) = \int_0^{\infty} \mu^{a-1} e^{-\mu}\, d\mu$. The uniform prior distribution for the autoregressive parameter in (2.27) indicates that all values in the interval $(-1/\tau, 1/\tau)$ are equally probable, where $\tau$ is the spectral radius of $W_n$. LeSage and Fischer (2008) introduce an alternative prior distribution function

[18] Note that the parameter space of the autoregressive parameter is $(-1/\tau, 1/\tau)$. For a proof, see Kelejian and Prucha.
[19] The density of $v_i$ can be obtained from that of $r / v_i$. To this end, for any $r$, let $X = r / v_i$ with density function $p_X$. Define $Y = r / X = v_i$; then the derived density function of $Y$ is given by $p_Y(y) = p_X(r / y)\, |dx / dy|$. Straightforward calculation shows that $p(v_i) = \big(\tfrac{r}{2}\big)^{r/2} \big[\Gamma(r/2)\big]^{-1} v_i^{-(r+2)/2} \exp\{-\tfrac{r}{2 v_i}\}$. Also note that $1 / v_i \sim \chi^2(r) / r$; since $\chi^2(r) / r = \Gamma(r/2, 2/r)$, we have $1 / v_i \sim \Gamma(r/2, 2/r)$, which implies that $v_i \sim \text{Inv-}\Gamma(r/2, r/2)$.

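for $\lambda$ based on the Beta function. In this formulation,

$$p(\lambda) = \frac{1}{\mathrm{Beta}(d, d)}\, \frac{(1 + \lambda)^{d-1} (1 - \lambda)^{d-1}}{2^{2d-1}} = \frac{\Gamma(2d)}{\Gamma(d)\Gamma(d)}\, \frac{(1 + \lambda)^{d-1} (1 - \lambda)^{d-1}}{2^{2d-1}}, \tag{2.28}$$

where $\mathrm{Beta}(d, d) = \int_0^{1} \mu^{d-1} (1 - \mu)^{d-1}\, d\mu$ is the standard Beta function. LeSage and Fischer (2008) show that when $d$ is near unity and $\lambda \in (-1, 1)$, $p(\lambda)$ produces an uninformative prior over the interval $(-1, 1)$ such that the end-points get zero prior weight, which is consistent with the parameter space of $\lambda$.

Let $p(\beta, \sigma^2, \nu, r, \lambda \mid Y_n)$ be the joint posterior density function of the parameters. Bayes' Theorem indicates that the joint posterior density is proportional to the product of the likelihood function and the prior densities. Under the assumption that the prior distributions of the parameters are independent, the joint posterior density can be written as

$$\begin{aligned} p(\beta, \sigma^2, \nu, r, \lambda \mid Y_n) &\propto L(Y_n \mid \varsigma, \sigma^2, v_1, \ldots, v_n)\, p(\beta)\, p(\sigma^2)\, p(\nu)\, p(r)\, p(\lambda) \\ &\propto (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |S_n(\lambda)| \exp\Big\{-\tfrac{1}{2}\Big[SSE + (\beta - \hat{\beta})' X_n'(\sigma^2 V_n)^{-1} X_n (\beta - \hat{\beta})\Big]\Big\} \\ &\quad \times |T|^{-1/2} \exp\Big\{-\tfrac{1}{2}\,(\beta - c)' T^{-1} (\beta - c)\Big\} \times (\sigma^2)^{-(a+1)} \exp\Big\{-\frac{b}{\sigma^2}\Big\} \\ &\quad \times \Big(\frac{r}{2}\Big)^{nr/2} \Big[\Gamma\Big(\frac{r}{2}\Big)\Big]^{-n} \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\Big\{-\frac{r}{2 v_i}\Big\} \times r^{m-1} \exp\Big\{-\frac{r}{k}\Big\} \times p(\lambda). \end{aligned} \tag{2.29}$$

The MCMC approach requires a full set of conditional posterior density functions of the parameters to form an algorithm through which random draws (a simulated sample) can be obtained. The simulated sample obtained from the conditional posterior densities is a Markov chain with the property that its limiting distribution is the joint posterior distribution (Chib, 2001, Albert and Chib, 1993, Casella and George, 1992, Tierney, 1994). Conditional on the other parameters of the model, the conditional posterior density of $\beta$ can be obtained by collecting all terms in (2.29) that are not multiplicatively separable from components that include $\beta$. Let $p(\beta \mid \sigma^2, \nu, r, \lambda, Y_n)$ be the conditional posterior density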

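function of $\beta$. Then,

$$p(\beta \mid \sigma^2, \nu, r, \lambda, Y_n) \propto \exp\Big\{-\tfrac{1}{2}\Big[SSE + (\beta - \hat{\beta})' X_n'(\sigma^2 V_n)^{-1} X_n (\beta - \hat{\beta}) + (\beta - c)' T^{-1} (\beta - c)\Big]\Big\}. \tag{2.30}$$

By completion of the square (see the lemma in the Appendix), the last two terms of the exponent in (2.30) can be written as

$$\begin{aligned} &(\beta - \hat{\beta})' X_n'(\sigma^2 V_n)^{-1} X_n (\beta - \hat{\beta}) + (\beta - c)' T^{-1} (\beta - c) \\ &\quad = (\beta - \bar{\beta})' \big[X_n'(\sigma^2 V_n)^{-1} X_n + T^{-1}\big] (\beta - \bar{\beta}) + (\hat{\beta} - c)' \big[\big(X_n'(\sigma^2 V_n)^{-1} X_n\big)^{-1} + T\big]^{-1} (\hat{\beta} - c), \end{aligned} \tag{2.31}$$

where $\bar{\beta} = \big[X_n' V_n^{-1} X_n + \sigma^2 T^{-1}\big]^{-1} \big[X_n' V_n^{-1} S_n(\lambda) Y_n + \sigma^2 T^{-1} c\big]$. Using (2.31), the conditional posterior density of $\beta$ can be written as

$$p(\beta \mid \sigma^2, \nu, r, \lambda, Y_n) \propto \exp\Big\{-\tfrac{1}{2}\,(\beta - \bar{\beta})' \big[X_n'(\sigma^2 V_n)^{-1} X_n + T^{-1}\big] (\beta - \bar{\beta})\Big\}, \tag{2.32}$$

where the terms that are multiplicatively separable from $\beta$ are absorbed into the constant of proportionality. The kernel of $p(\beta \mid \sigma^2, \nu, r, \lambda, Y_n)$ in (2.32) is proportional to the kernel of a multivariate normal density with mean $\bar{\beta}$ and covariance $\big[X_n'(\sigma^2 V_n)^{-1} X_n + T^{-1}\big]^{-1}$. Thus,

$$\beta \mid \sigma^2, \nu, r, \lambda, Y_n \sim N\Big(\bar{\beta},\; \big[X_n'(\sigma^2 V_n)^{-1} X_n + T^{-1}\big]^{-1}\Big). \tag{2.33}$$

Now, we return to the joint posterior density in (2.29) to determine the conditional posterior distribution of $\sigma^2$. By collecting all terms in (2.29) that are not multiplicatively separable from the components that include $\sigma^2$, the conditional posterior density of $\sigma^2$ can be obtained as follows:

$$p(\sigma^2 \mid \beta, \nu, r, \lambda, Y_n) \propto (\sigma^2)^{-(n/2 + a + 1)} \exp\Big\{-\frac{1}{\sigma^2}\, \frac{(S_n(\lambda) Y_n - X_n \beta)' V_n^{-1} (S_n(\lambda) Y_n - X_n \beta) + 2b}{2}\Big\}. \tag{2.34}$$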
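The normal conditional posterior in (2.33) translates directly into a Gibbs draw for $\beta$. The sketch below is an illustration under stated assumptions rather than the author's implementation: the function name `beta_posterior` and its argument layout are hypothetical, and $V_n$ is passed as the vector of its diagonal elements.

```python
import numpy as np

def beta_posterior(Sy, X, v, sigma2, c, T):
    """Mean and covariance of beta | sigma2, nu, lambda, Y as in (2.33).

    Sy     : S(lambda) @ Y, shape (n,)
    X      : regressors, shape (n, k)
    v      : variance components v_i, i.e. the diagonal of V, shape (n,)
    sigma2 : current value of the constant variance component
    c, T   : prior mean (k,) and prior covariance (k, k) of beta
    """
    XtVinv = X.T / v                       # X' V^{-1}
    T_inv = np.linalg.inv(T)
    precision = (XtVinv @ X) / sigma2 + T_inv
    cov = np.linalg.inv(precision)
    mean = cov @ ((XtVinv @ Sy) / sigma2 + T_inv @ c)
    return mean, cov

# Illustrative draw with simulated inputs (all values are assumptions).
rng = np.random.default_rng(1)
n, k = 50, 2
X = rng.standard_normal((n, k))
v = rng.uniform(0.5, 2.0, size=n)
Sy = X @ np.array([1.0, 1.0]) + rng.standard_normal(n)
mean, cov = beta_posterior(Sy, X, v, sigma2=1.0, c=np.zeros(k), T=1e10 * np.eye(k))
beta_draw = rng.multivariate_normal(mean, (cov + cov.T) / 2)
```

With a very diffuse prior (large $T$), the posterior mean essentially coincides with the GLS estimator $\hat{\beta}$, which is the sense in which the Bayesian estimator mimics GLS.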

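The comparison of the above kernel with that of an inverse gamma density function indicates that the conditional posterior distribution of $\sigma^2$ is an inverse gamma distribution with the shape parameter $a^{*} = \tfrac{n}{2} + a$ and with the scale parameter $b^{*} = \tfrac{1}{2}\big[(S_n(\lambda) Y_n - X_n \beta)' V_n^{-1} (S_n(\lambda) Y_n - X_n \beta) + 2b\big]$. Thus,

$$\sigma^2 \mid \beta, \nu, r, \lambda, Y_n \sim IG(a^{*}, b^{*}). \tag{2.35}$$

In the same way, the conditional posterior distribution of $\nu$ can be obtained by ignoring all terms in the joint posterior density function that are not related to $\nu$. Thus,

$$p(\nu \mid \beta, \sigma^2, r, \lambda, Y_n) \propto \prod_{i=1}^{n} v_i^{-1/2} \exp\Big\{-\tfrac{1}{2}\,(S_n(\lambda) Y_n - X_n \beta)'(\sigma^2 V_n)^{-1}(S_n(\lambda) Y_n - X_n \beta)\Big\} \times \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\Big\{-\frac{r}{2 v_i}\Big\}. \tag{2.36}$$

For notational simplification, let $e = S_n(\lambda) Y_n - X_n \beta$. Then $(S_n(\lambda) Y_n - X_n \beta)'(\sigma^2 V_n)^{-1}(S_n(\lambda) Y_n - X_n \beta) = e'(\sigma^2 V_n)^{-1} e = \sum_{j=1}^{n} \frac{e_j^2}{\sigma^2 v_j}$, where $e_j$ is the $j$th element of $e$. Then, the equation in (2.36) can be written as

$$p(\nu \mid \beta, \sigma^2, r, \lambda, Y_n) \propto \prod_{i=1}^{n} v_i^{-1/2} \exp\Big\{-\sum_{j=1}^{n} \frac{e_j^2}{2 \sigma^2 v_j}\Big\} \times \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\Big\{-\frac{r}{2 v_i}\Big\} = \prod_{i=1}^{n} v_i^{-(r+3)/2} \exp\Big\{-\frac{\sigma^{-2} e_i^2 + r}{2 v_i}\Big\}.$$

The above result shows that the conditional posterior density of each $v_i$ can be written as

$$p(v_i \mid \beta, \sigma^2, r, \lambda, Y_n) \propto v_i^{-(r+3)/2} \exp\Big\{-\frac{\sigma^{-2} e_i^2 + r}{2 v_i}\Big\}, \quad \text{for } i = 1, \ldots, n. \tag{2.37}$$

The functional form on the right hand side of (2.37) does not correspond to any known form of density function. Geweke (1993, p. S26) suggests a method through which the conditional posterior distribution of $v_i$ can be determined. Let $\Psi = (\sigma^{-2} e_i^2 + r) / v_i$. Then, the result in (2.37) implies that the density function of $\Psi$ is proportional to $\Psi^{(r-1)/2} \exp\{-\Psi / 2\}$, which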

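implies[20]

$$\frac{\sigma^{-2} e_i^2 + r}{v_i} \,\Big|\, \beta, \sigma^2, r, \lambda, Y_n \sim \chi^2(r + 1), \quad \text{for } i = 1, \ldots, n. \tag{2.38}$$

Now, we return to the conditional posterior distribution of the hyper-parameter $r$. Collecting all terms that are not multiplicatively separable from the components that include $r$ in the joint posterior density yields

$$p(r \mid \beta, \sigma^2, \nu, \lambda, Y_n) \propto \Big(\frac{r}{2}\Big)^{nr/2} r^{m-1} \Big[\Gamma\Big(\frac{r}{2}\Big)\Big]^{-n} \prod_{i=1}^{n} \Big[v_i^{-r/2} \exp\Big\{-\frac{r}{2 v_i}\Big\}\Big] \exp\Big\{-\frac{r}{k}\Big\}. \tag{2.39}$$

The above density does not correspond to a standard distribution; therefore, a Metropolis-Hastings type algorithm can be employed to obtain random draws. Finally, the conditional posterior distribution of $\lambda$ takes the form

$$p(\lambda \mid \beta, \sigma^2, \nu, r, Y_n) \propto |S_n(\lambda)| \exp\Big\{-\tfrac{1}{2}\,(S_n(\lambda) Y_n - X_n \beta)'(\sigma^2 V_n)^{-1}(S_n(\lambda) Y_n - X_n \beta)\Big\}. \tag{2.40}$$

The uniform prior density for $\lambda$, which is absorbed in the proportionality constant, is used for the result in (2.40). For the case where the prior takes the form suggested by Parent and LeSage (2007b), the conditional posterior distribution of $\lambda$ takes the following form:

$$p(\lambda \mid \beta, \sigma^2, \nu, r, Y_n) \propto |S_n(\lambda)| \exp\Big\{-\tfrac{1}{2}\,(S_n(\lambda) Y_n - X_n \beta)'(\sigma^2 V_n)^{-1}(S_n(\lambda) Y_n - X_n \beta)\Big\} \times \frac{1}{\mathrm{Beta}(d, d)}\, \frac{(1 + \lambda)^{d-1} (1 - \lambda)^{d-1}}{2^{2d-1}}. \tag{2.41}$$

Neither of the conditional posterior densities in (2.40) and (2.41) is in the form of a known density. As with the hyper-parameter $r$, sampling for $\lambda$ can be accomplished through the Metropolis-Hastings approach.[21]

[20] This result can be seen through the derived density function of $\Psi$. Since $|dv_i / d\Psi| = (\sigma^{-2} e_i^2 + r) / \Psi^2$, we have $p(\Psi) \propto \big[(\sigma^{-2} e_i^2 + r) / \Psi\big]^{-(r+3)/2} \exp\{-\Psi / 2\}\, (\sigma^{-2} e_i^2 + r) / \Psi^2 \propto \Psi^{(r-1)/2} \exp\{-\Psi / 2\}$. The kernel of $p(\Psi)$ implies $\Psi \sim \chi^2(r + 1)$.
[21] The Metropolis-Hastings algorithm is described in the section on Bayesian MCMC computation.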
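The conditional distributions (2.35), (2.38) and (2.40) can be turned into draw and evaluation routines as sketched below. This is a minimal illustration, not the author's code: the function names are hypothetical, $V_n$ is represented by the vector of its diagonal elements, and the log of the kernel in (2.40) is used for numerical stability.

```python
import numpy as np

def draw_sigma2(e, v, a, b, rng):
    """Draw sigma^2 from IG(a*, b*) with a* = n/2 + a, b* = (e'V^{-1}e + 2b)/2."""
    a_star = e.size / 2.0 + a
    b_star = (np.sum(e**2 / v) + 2.0 * b) / 2.0
    # If G ~ Gamma(shape=a*, scale=1), then b*/G ~ inverse gamma(a*, b*).
    return b_star / rng.gamma(a_star)

def draw_v(e, sigma2, r, rng):
    """Draw each v_i using (e_i^2/sigma^2 + r) / v_i ~ chi-square(r + 1)."""
    return (e**2 / sigma2 + r) / rng.chisquare(r + 1.0, size=e.size)

def log_kernel_lambda(lam, Y, X, W, beta, v, sigma2):
    """Log of the kernel in (2.40): log|S(lam)| - e'V^{-1}e / (2 sigma^2)."""
    S = np.eye(Y.size) - lam * W
    sign, logdet = np.linalg.slogdet(S)
    if sign <= 0:
        return -np.inf
    e = S @ Y - X @ beta
    return logdet - 0.5 * np.sum(e**2 / v) / sigma2
```

Working with the log kernel avoids overflow in the determinant and the exponential when $n$ is large; only differences of log kernels are needed in a Metropolis-Hastings acceptance ratio.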

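2.4.2 Bayesian MCMC Estimation of SARAR(1,1)

For the case of SARAR(1,1), the likelihood function takes the following form:

$$L(Y_n \mid \theta, \sigma^2, \nu) = (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |S_n(\lambda)|\, |R_n(\rho)| \exp\Big\{-\frac{1}{2\sigma^2}\,\big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big)\Big\}. \tag{2.42}$$

Along the same lines of argument presented for the case of SARAR(1,0), the Bayesian MCMC approach can be introduced. For the common parameters, the same prior distributions introduced in the previous section for SARAR(1,0) are assumed. For the spatial autoregressive parameter $\rho$, either a uniform prior or a Beta prior can be assumed. Again, the kernel of the likelihood function can be written in terms of GLS quantities from the regression $R_n(\rho) S_n(\lambda) Y_n = R_n(\rho) X_n \beta + \varepsilon_n$, where the GLS estimator is $\hat{\beta} = \big[X_n' R_n'(\rho) V_n^{-1} R_n(\rho) X_n\big]^{-1} X_n' R_n'(\rho) V_n^{-1} R_n(\rho) S_n(\lambda) Y_n$ for a given value of $(\lambda, \rho)$. Let $SSE = \big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \hat{\beta}\big)'(\sigma^2 V_n)^{-1}\big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \hat{\beta}\big)$ be the weighted residual sum of squares. Then, the likelihood function in (2.42) can be written in terms of GLS quantities in the following way:

$$L(Y_n \mid \theta, \sigma^2, \nu) = (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |S_n(\lambda)|\, |R_n(\rho)| \exp\Big\{-\tfrac{1}{2}\Big[SSE + (\beta - \hat{\beta})' X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) X_n (\beta - \hat{\beta})\Big]\Big\}. \tag{2.43}$$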
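For reference, the logarithm of the SARAR(1,1) likelihood in (2.42) can be evaluated as in the following sketch. The function name `sarar_loglik` and the argument order are assumptions made for illustration; $V_n$ is again passed as the vector of its diagonal elements.

```python
import numpy as np

def sarar_loglik(lam, rho, beta, sigma2, v, Y, X, W, M):
    """Log of the likelihood in (2.42) for given parameter values."""
    n = Y.size
    S = np.eye(n) - lam * W
    R = np.eye(n) - rho * M
    sign_S, logdet_S = np.linalg.slogdet(S)
    sign_R, logdet_R = np.linalg.slogdet(R)
    if sign_S <= 0 or sign_R <= 0:
        return -np.inf                      # outside the admissible region
    e = R @ (S @ Y - X @ beta)              # R(rho)S(lam)Y - R(rho)X beta
    return (-0.5 * n * np.log(2.0 * np.pi)
            - 0.5 * n * np.log(sigma2)
            - 0.5 * np.sum(np.log(v))
            + logdet_S + logdet_R
            - 0.5 * np.sum(e**2 / v) / sigma2)
```

Setting `rho = 0` gives the SARAR(1,0) log-likelihood corresponding to (2.20), and setting `lam = 0` gives the SARAR(0,1) case, so one routine covers the three specifications.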

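Assuming the independent prior distributions introduced in the previous section, the joint posterior density for the case of SARAR(1,1) can be written as

$$\begin{aligned} p(\beta, \sigma^2, \nu, r, \lambda, \rho \mid Y_n) &\propto L(Y_n \mid \theta, \sigma^2, \nu)\, p(\beta)\, p(\sigma^2)\, p(\nu)\, p(r)\, p(\lambda)\, p(\rho) \\ &\propto (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |S_n(\lambda)|\, |R_n(\rho)| \exp\Big\{-\tfrac{1}{2}\Big[SSE + (\beta - \hat{\beta})' X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) X_n (\beta - \hat{\beta})\Big]\Big\} \\ &\quad \times |T|^{-1/2} \exp\Big\{-\tfrac{1}{2}\,(\beta - c)' T^{-1} (\beta - c)\Big\} \times (\sigma^2)^{-(a+1)} \exp\Big\{-\frac{b}{\sigma^2}\Big\} \\ &\quad \times \Big(\frac{r}{2}\Big)^{nr/2} \Big[\Gamma\Big(\frac{r}{2}\Big)\Big]^{-n} \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\Big\{-\frac{r}{2 v_i}\Big\} \times r^{m-1} \exp\Big\{-\frac{r}{k}\Big\} \times p(\lambda)\, p(\rho). \end{aligned} \tag{2.44}$$

Conditional on the other parameters of the model, the conditional posterior density of $\beta$ can be obtained by collecting all terms in (2.44) that are not multiplicatively separable from components that include $\beta$. This operation yields

$$\beta \mid \sigma^2, \nu, r, \rho, \lambda, Y_n \sim N\Big(\bar{\beta},\; \big[X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) X_n + T^{-1}\big]^{-1}\Big), \tag{2.45}$$

where $\bar{\beta} = \big[X_n' R_n'(\rho) V_n^{-1} R_n(\rho) X_n + \sigma^2 T^{-1}\big]^{-1} \big[X_n' R_n'(\rho) V_n^{-1} R_n(\rho) S_n(\lambda) Y_n + \sigma^2 T^{-1} c\big]$. The same line of argument as in the previous section shows that

$$\sigma^2 \mid \beta, \nu, r, \rho, \lambda, Y_n \sim IG(a^{*}, b^{*}), \tag{2.46}$$

$$a^{*} = \frac{n}{2} + a, \qquad b^{*} = \frac{\big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big) + 2b}{2}.$$

Let $e = R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta$ with $i$th element denoted by $e_i$. The same argument as for the result in (2.38) implies that

$$\frac{\sigma^{-2} e_i^2 + r}{v_i} \,\Big|\, \beta, \sigma^2, r, \lambda, \rho, Y_n \sim \chi^2(r + 1), \quad \text{for } i = 1, \ldots, n. \tag{2.47}$$

Now, we return to the conditional posterior distribution of $r$. The argument in the previous section shows that the prior density of $r$ has no relevance for the conditional posteriors of the other parameters. Moreover, $r$ does not enter the likelihood function, so that $p(r \mid \beta, \sigma^2, \nu, \lambda, \rho, Y_n) = p(r \mid \nu)$. Therefore, the conditional posterior of $r$ is the same as the one stated in (2.39). Finally, the conditional posterior distributions of the autoregressive parameters take the following forms:

$$p(\lambda \mid \beta, \sigma^2, \nu, r, \rho, Y_n) \propto |S_n(\lambda)| \exp\Big\{-\frac{1}{2\sigma^2}\,\big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big)\Big\}, \tag{2.48}$$

$$p(\rho \mid \beta, \sigma^2, \nu, r, \lambda, Y_n) \propto |R_n(\rho)| \exp\Big\{-\frac{1}{2\sigma^2}\,\big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) S_n(\lambda) Y_n - R_n(\rho) X_n \beta\big)\Big\}. \tag{2.49}$$

2.4.3 Bayesian MCMC Estimation of SARAR(0,1)

The likelihood function of the spatial error model (SEM), or SARAR(0,1), is given by

$$L(Y_n \mid \beta, \sigma^2, \rho, \nu) = (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |R_n(\rho)| \exp\Big\{-\frac{1}{2\sigma^2}\,\big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big)\Big\}. \tag{2.50}$$

The exponent of the likelihood function can be written in terms of GLS quantities from the regression $R_n(\rho) Y_n = R_n(\rho) X_n \beta + \varepsilon_n$. Define $\hat{\beta} = \big[X_n' R_n'(\rho) V_n^{-1} R_n(\rho) X_n\big]^{-1} X_n' R_n'(\rho) V_n^{-1} R_n(\rho) Y_n$ and $SSE = \big(R_n(\rho) Y_n - R_n(\rho) X_n \hat{\beta}\big)'(\sigma^2 V_n)^{-1}\big(R_n(\rho) Y_n - R_n(\rho) X_n \hat{\beta}\big)$. Then, the exponent in (2.50) can be written as $-\frac{1}{2\sigma^2}\big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big) = -\tfrac{1}{2}\big[SSE + (\beta - \hat{\beta})' X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) X_n (\beta - \hat{\beta})\big]$. Employing the same prior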

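density functions as in the previous section, the joint posterior density can be written as

$$\begin{aligned} p(\beta, \sigma^2, \nu, r, \rho \mid Y_n) &\propto L(Y_n \mid \beta, \sigma^2, \rho, \nu)\, p(\beta)\, p(\sigma^2)\, p(\nu)\, p(r)\, p(\rho) \\ &\propto (2\pi)^{-n/2} (\sigma^2)^{-n/2} \prod_{i=1}^{n} v_i^{-1/2}\, |R_n(\rho)| \exp\Big\{-\tfrac{1}{2}\Big[SSE + (\beta - \hat{\beta})' X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) X_n (\beta - \hat{\beta})\Big]\Big\} \\ &\quad \times |T|^{-1/2} \exp\Big\{-\tfrac{1}{2}\,(\beta - c)' T^{-1} (\beta - c)\Big\} \times (\sigma^2)^{-(a+1)} \exp\Big\{-\frac{b}{\sigma^2}\Big\} \\ &\quad \times \Big(\frac{r}{2}\Big)^{nr/2} \Big[\Gamma\Big(\frac{r}{2}\Big)\Big]^{-n} \prod_{i=1}^{n} v_i^{-(r+2)/2} \exp\Big\{-\frac{r}{2 v_i}\Big\} \times r^{m-1} \exp\Big\{-\frac{r}{k}\Big\} \times p(\rho). \end{aligned} \tag{2.51}$$

The same argument as in the previous sections shows that the conditional posterior densities of the parameters are given as

$$\beta \mid \sigma^2, \nu, r, \rho, Y_n \sim N\Big(\bar{\beta},\; \big[X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) X_n + T^{-1}\big]^{-1}\Big), \tag{2.52}$$

where $\bar{\beta} = \big[X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) X_n + T^{-1}\big]^{-1} \big[X_n' R_n'(\rho)(\sigma^2 V_n)^{-1} R_n(\rho) Y_n + T^{-1} c\big]$. The same line of argument as in the previous section shows that

$$\sigma^2 \mid \beta, \nu, r, \rho, Y_n \sim IG(a^{*}, b^{*}), \tag{2.53}$$

$$a^{*} = \frac{n}{2} + a, \qquad b^{*} = \frac{\big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big) + 2b}{2},$$

$$\frac{\sigma^{-2} e_i^2 + r}{v_i} \,\Big|\, \beta, \sigma^2, r, \rho, Y_n \sim \chi^2(r + 1), \quad \text{for } i = 1, \ldots, n, \tag{2.54}$$

where $e_i$ is the $i$th element of $e = R_n(\rho) Y_n - R_n(\rho) X_n \beta$. Again, the conditional posterior of $r$ is the same as the one stated in (2.39). Finally, the conditional posterior distribution of the autoregressive parameter $\rho$ takes the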

following form:

$$p(\rho \mid \beta, \sigma^2, \nu, r, Y_n) \propto |R_n(\rho)| \exp\Big\{-\frac{1}{2\sigma^2}\,\big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big)' V_n^{-1} \big(R_n(\rho) Y_n - R_n(\rho) X_n \beta\big)\Big\}. \tag{2.55}$$

2.4.4 Bayesian MCMC Computation

In Bayesian analysis, the joint posterior distribution of the parameters embodies all information about the parameters. A cost (loss) function is specified as a criterion for determining an optimal point estimate. Let $\zeta = (\rho, \lambda, \beta', \sigma^2, \nu', r)'$ be the parameter vector. One of the most popular loss functions is the weighted squared error function, $C(\hat{\zeta}, \zeta) = (\hat{\zeta} - \zeta)' Q (\hat{\zeta} - \zeta)$, where $Q$ is a positive definite matrix. Then, the Bayesian point estimate is defined by

$$\hat{\zeta} = \arg\min_{\hat{\zeta}} E_{\zeta \mid Y_n}\big[C(\hat{\zeta}, \zeta)\big] = \arg\min_{\hat{\zeta}} \int_{\Theta} (\hat{\zeta} - \zeta)' Q (\hat{\zeta} - \zeta)\, p(\zeta \mid Y_n)\, d\zeta, \tag{2.56}$$

where $p(\zeta \mid Y_n)$ is the posterior density of $\zeta$. The solution of the above minimization problem yields $\hat{\zeta} = E(\zeta \mid Y_n) = \int_{\Theta} \zeta\, p(\zeta \mid Y_n)\, d\zeta$, which is the mean of the posterior distribution of $\zeta$ (Koop et al., 2007, p. 34). A closed form solution for $\hat{\zeta}$ cannot be obtained from the joint posterior density functions in (2.29), (2.44) and (2.51). In the MCMC approach, a Markov chain is constructed from the conditional posterior distributions so that a large simulated sample of the parameters can be obtained. The mean of the simulated sample is then used as an estimate of $E(\zeta \mid Y_n)$ (Chib, 2001, Casella and George, 1992).

The previous section shows that the joint posterior densities of spatial autoregressive models can be decomposed into conditional posterior density functions of the parameters, from which random draws can be obtained. When conditional posterior distributions are known, the Gibbs sampling method can be used to draw from the conditional distributions. As the conditional posterior distributions of $\beta$, $\sigma^2$ and $v_i$ are in the form of known distributions, random draws for these parameters can be easily obtained through the Gibbs sampling method. Unfortunately, the conditional posterior distributions of the autoregressive parameters and the hyper-parameter $r$ do not correspond to known distributions. The sampling for these parameters can be accomplished through the Metropolis-Hastings algorithm. As a result,

the sampling scheme for spatial autoregressive models consists of a combination of Gibbs sampling and the Metropolis-Hastings algorithm. This whole sampling procedure is called Metropolis within Gibbs sampling (Geweke and Keane, 2001). The Metropolis-Hastings sampling method requires a proposal distribution (or candidate generating distribution) from which the candidate values for the autoregressive parameters can be generated. LeSage and Pace (2009) use a normal distribution as the proposal distribution with a tuned random-walk procedure. According to this method, the candidate values $\lambda^{new}$, $\rho^{new}$ are generated by

$$\lambda^{new} = \lambda^{old} + c \times \phi, \tag{2.57}$$

$$\rho^{new} = \rho^{old} + c \times \phi, \quad \text{where } \phi \sim N(0, 1). \tag{2.58}$$

The constant parameter $c$ in (2.57) and (2.58) is called the tuning parameter, which ensures that the sampler moves over the entire conditional posterior distributions of the autoregressive parameters. The increment random variable denoted by $\phi$ in (2.57) and (2.58) is symmetric, so that the acceptance probability for the candidate values $\lambda^{new}$, $\rho^{new}$ takes the following form:

$$p(\lambda^{new}, \lambda^{old}) = \min\Big\{1,\; \frac{p(\lambda^{new} \mid \beta, \sigma^2, \nu, r, \rho^{old}, Y_n)}{p(\lambda^{old} \mid \beta, \sigma^2, \nu, r, \rho^{old}, Y_n)}\Big\}, \tag{2.59}$$

$$p(\rho^{new}, \rho^{old}) = \min\Big\{1,\; \frac{p(\rho^{new} \mid \beta, \sigma^2, \nu, r, \lambda^{old}, Y_n)}{p(\rho^{old} \mid \beta, \sigma^2, \nu, r, \lambda^{old}, Y_n)}\Big\}. \tag{2.60}$$

The new candidates $\lambda^{new}$ and $\rho^{new}$ are accepted, respectively, with probability $p(\lambda^{new}, \lambda^{old})$ and $p(\rho^{new}, \rho^{old})$. The tuning parameter (or the spread of the candidate generating density) affects the behavior of the chain in at least two ways: (i) it affects the acceptance rate of new candidate values through the acceptance probability, and (ii) it affects the region where the new candidate values are sampled. To illustrate the role of the tuning parameter, consider a chain that has converged so that the new candidates are sampled around the mode of the marginal posterior densities of the autoregressive parameters. If the tuning parameter is chosen to be large, to allow the newly generated candidates to be far from the current value or mode,

then the new candidates will have a low probability of being accepted. In that case, the chain may get stuck at the current value (the candidates will never be accepted). Reducing the tuning parameter will correct this problem. On the other hand, a small tuning parameter value will reduce the chance of getting candidates that are far away from the current value. In this case, it will take a longer time for the chain to traverse the support of the density; as a result, low probability regions of the density support will be under-sampled. These two issues raise the question of how to choose the optimal tuning parameter. For our case, the literature has suggested a tuning parameter that results in an acceptance rate of approximately 0.5 (Chib and Greenberg, 1995, Chib, 2001). In practice, the tuning parameter $c$ is adjusted in such a way that the acceptance rate falls between 40% and 60%.[22]

The sampling from the conditional posterior density of the hyper-parameter $r$ can be carried out with an argument similar to the one described above. The hyper-parameter $r$ plays a central role in generating heteroskedasticity robust estimators. In the preceding section, the prior of $v_i$ is assumed to be $IG(r/2, r/2)$. Then, the prior mean and variance of $v_i$ are respectively given as $\frac{r}{r-2}$ and $\frac{2r^2}{r^3 - 8r^2 + 20r - 16}$. The prior variance approaches zero and the prior mean approaches 1 as $r$ goes to infinity. This observation implies that $v_i$ converges in probability to its mean as $r \to \infty$. Thus, higher values of $r$ are associated with the homoskedastic assumption and lower values imply a heteroskedastic assumption. Geweke (1993) shows that in the case where uninformative priors are assumed, the second moment of the joint posterior density exists only if $r > 4$. For our case, uninformative priors (uniform distributions) are assumed for the autoregressive parameters; therefore, $r$ must be bigger than 4. In our Monte Carlo simulation we set $r = 5$, as the data generating process is assumed to be heteroskedastic.
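The tuned random-walk step in (2.57)-(2.60) can be sketched as follows. This is a minimal illustration under stated assumptions: the names `mh_step` and `tune` are hypothetical, the target is supplied as a log kernel, and the 40%/60% thresholds and the factor 1.1 follow the rule of thumb attributed to LeSage and Pace (2009) in the text.

```python
import numpy as np

def mh_step(current, log_kernel, c, rng):
    """One random-walk Metropolis step with a N(0, 1) increment scaled by c.

    Returns the new state and a flag indicating whether the candidate
    was accepted. The acceptance probability is min{1, kernel ratio}.
    """
    candidate = current + c * rng.standard_normal()
    log_ratio = log_kernel(candidate) - log_kernel(current)
    if np.log(rng.uniform()) < min(0.0, log_ratio):
        return candidate, True
    return current, False

def tune(c, acceptance_rate, low=0.4, high=0.6, factor=1.1):
    """Shrink c when too few candidates are accepted, enlarge it when too many."""
    if acceptance_rate < low:
        return c / factor
    if acceptance_rate > high:
        return c * factor
    return c

# Illustrative use on a kernel supported on (-1, 1) (an assumed toy target).
def toy_log_kernel(x):
    return -0.5 * (x / 0.2) ** 2 if abs(x) < 1.0 else -np.inf

rng = np.random.default_rng(4)
state, c, accepted = 0.0, 0.5, 0
path = []
for it in range(500):
    state, ok = mh_step(state, toy_log_kernel, c, rng)
    accepted += ok
    c = tune(c, accepted / (it + 1))
    path.append(state)
```

Candidates outside the support receive a log kernel of minus infinity and are therefore always rejected, which is how a uniform prior on a bounded interval is enforced in this sketch.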
Finally, the elicitation of the parameters of the prior distributions for $\beta$ and $\sigma^2$ is required. The prior distribution for $\beta$ can be made relatively diffuse by assigning a large prior variance. This can be accomplished by setting $c = 0$ and $T = I_k \times 10^{5}$. The conditional posterior distribution of $\sigma^2$ is an inverse Gamma distribution. The analytical results provided for the mean of this distribution indicate that a relatively diffuse prior for $\sigma^2$ can

[22] In practice, $c$ can be set to 0.5 at the initial step. LeSage and Pace (2009) suggest $c = c / 1.1$ for the case where the acceptance rate falls below 40%, and $c = 1.1 c$ for the case where the acceptance rate rises above 60%. This can be accomplished through a while loop in the algorithm. See also Holloway et al.

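be obtained by assigning values of $a = 0$ and $b = 0$.[23] For illustration, a single pass through the Metropolis within Gibbs sampling scheme consists of the following steps:

1. Let $\zeta^{0} = (\rho^{0}, \lambda^{0}, \beta^{0\prime}, \sigma^{2,0}, \nu^{0\prime}, r^{0})'$ be the initial parameter values.
2. Update $\beta$: Draw $\beta^{1}$ from $p(\beta \mid \rho^{0}, \lambda^{0}, \sigma^{2,0}, \nu^{0}, r^{0}, Y_n)$.
3. Update $\sigma^2$: Draw $\sigma^{2,1}$ from $p(\sigma^2 \mid \rho^{0}, \lambda^{0}, \beta^{1}, \nu^{0}, r^{0}, Y_n)$.
4. Update $\nu$: Draw $v_i^{1}$ from $p\big(\tfrac{\sigma^{-2} e_i^2 + r}{v_i} \mid \rho^{0}, \lambda^{0}, \beta^{1}, \sigma^{2,1}, r^{0}, Y_n\big)$ for $i = 1, \ldots, n$.
5. Update $\rho$: Calculate $\rho^{new} = \rho^{0} + c \times \phi$,
   (a) Calculate $p(\rho^{new}, \rho^{0}) = \min\Big\{1,\; \frac{p(\rho^{new} \mid \lambda^{0}, \beta^{1}, \sigma^{2,1}, \nu^{1}, r^{0}, Y_n)}{p(\rho^{0} \mid \lambda^{0}, \beta^{1}, \sigma^{2,1}, \nu^{1}, r^{0}, Y_n)}\Big\}$,
   (b) Draw $U$ from Uniform(0,1),
   (c) Set $\rho^{1} = \rho^{new}$ if $p(\rho^{new}, \rho^{0}) > U$, otherwise $\rho^{1} = \rho^{0}$.
6. Update $\lambda$: Calculate $\lambda^{new} = \lambda^{0} + c \times \phi$,
   (a) Calculate $p(\lambda^{new}, \lambda^{0}) = \min\Big\{1,\; \frac{p(\lambda^{new} \mid \rho^{1}, \beta^{1}, \sigma^{2,1}, \nu^{1}, r^{0}, Y_n)}{p(\lambda^{0} \mid \rho^{1}, \beta^{1}, \sigma^{2,1}, \nu^{1}, r^{0}, Y_n)}\Big\}$,
   (b) Draw $U$ from Uniform(0,1),
   (c) Set $\lambda^{1} = \lambda^{new}$ if $p(\lambda^{new}, \lambda^{0}) > U$, otherwise $\lambda^{1} = \lambda^{0}$.

The sequence $\{\zeta^{0}, \zeta^{1}, \ldots, \zeta^{M}\}$ obtained from the Metropolis within Gibbs sampling forms a Markov chain whose probability density function converges to the joint posterior density $p(\zeta \mid Y_n)$ as $M$ goes to infinity (Chib, 2001, Tierney, 1994).[24] To discard the effect of the initial value $\zeta^{0}$ on the chain, iterates from some initial periods (burn-in repetitions) are usually excluded from the chain. Let $h(\cdot)$ be a function defined on the parameter space of the model. Under a suitable law of large numbers for Markov chains, it can be shown that $\frac{1}{M} \sum_{j=1}^{M} h(\zeta^{j}) \to \int_{\Theta} h(\zeta)\, p(\zeta \mid Y_n)\, d\zeta$ as $M \to \infty$. Note that the convergence is established

[23] For the case of SARAR(1,0), we show that $\sigma^2 \mid \beta, \nu, r, \lambda, Y_n \sim IG(a^{*}, b^{*})$, which is given in (2.35). Then, the conditional posterior mean is $E(\sigma^2 \mid \beta, \nu, r, \lambda, Y_n) = \frac{b^{*}}{a^{*} - 1}$. The choice of $a = 0$ and $b = 0$ eliminates the effect of the prior distribution on the conditional posterior mean.
[24] Theorem 1 of Tierney (1994) and Theorem 2 of Chib (2001) give conditions under which the convergence holds. Theorems 4-5 in Geweke (1993, pp. S28-S29) show that these conditions hold for a non-spatial linear regression model. Therefore, it can be shown that these conditions also hold for linear spatial models.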

in terms of the simulated sample size $M$ and not in terms of the data sample size $n$. Thus, posterior moments such as the mean and the second moment can be easily estimated by taking $h(\zeta) = \zeta$ and $h(\zeta) = \zeta\zeta'$, respectively.

The hyper-parameter $r$ is not updated in the Metropolis within Gibbs sampling scheme illustrated above.[25] The hyper-parameter affects the estimates of the variance components $v_i$ through the conditional posterior distributions given in (2.38), (2.47) and (2.54). Figures 2.1 and 2.2 illustrate the effect of this parameter on the posterior mean of $\nu$ for sample sizes of 100 and 300, respectively, for the case of a heteroskedastic SARAR(1,0) specification. Both figures are based on a small group interaction scenario described in Section 5, where the group sizes are determined by random draws from the interval (3, 20).[26] Figure 2.1(a) and Figure 2.2(a) show the true group variances and the resulting disturbance terms. The other plots in Figures 2.1 and 2.2 show the effects of various $r$ values on the estimates of the variance components $v_i$. Figure 2.1(b) and Figure 2.2(b) show that the estimates are similar for $r = 5, 6, 7$. On the other hand, Figure 2.1(c) and Figure 2.2(c) indicate that $r$-values bigger than 10 give estimates close to unity, suggesting a homoskedastic assumption. These two examples indicate that a hyper-parameter value near 5 is an optimal choice for the estimation of heteroskedastic models.

[25] Following LeSage and Pace (2009), we assume that this parameter is known. In the Monte Carlo design, we set $r = 5$.
[26] The block diagonal weight matrix is row normalized. Members of a group have the same variance.
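The single pass described above, with $r$ held fixed, can be assembled into a complete sampler. The following is a minimal sketch for the SARAR(1,0) case only, not the author's implementation. Assumptions are flagged in the comments: a row-normalized $W$ (so that the uniform prior is on $(-1, 1)$), prior mean $c = 0$ for $\beta$, a hypothetical diffuse prior variance `prior_var`, and tuning of the random-walk scale during burn-in only. All names are illustrative.

```python
import numpy as np

def log_kernel(lam, Y, Xb, W, v, sigma2):
    """Log kernel of lambda as in (2.40); uniform prior on (-1, 1) is assumed."""
    if abs(lam) >= 1.0:
        return -np.inf
    S = np.eye(Y.size) - lam * W
    sign, logdet = np.linalg.slogdet(S)
    if sign <= 0:
        return -np.inf
    e = S @ Y - Xb
    return logdet - 0.5 * np.sum(e**2 / v) / sigma2

def sar_metropolis_within_gibbs(Y, X, W, n_iter=600, burn_in=100, r=5.0,
                                a=0.0, b=0.0, prior_var=1e5, c_tune=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n, k = X.shape
    T_inv = np.eye(k) / prior_var            # assumed prior: beta ~ N(0, prior_var * I)
    lam, sigma2, v = 0.0, 1.0, np.ones(n)    # initial values
    accepted = 0
    draws = {"lam": [], "beta": [], "sigma2": [], "v": []}
    for it in range(n_iter):
        Sy = Y - lam * (W @ Y)
        # Step: beta | rest ~ N(mean, cov), see (2.33) with c = 0.
        XtVinv = X.T / v
        cov = np.linalg.inv((XtVinv @ X) / sigma2 + T_inv)
        cov = (cov + cov.T) / 2.0
        beta = rng.multivariate_normal(cov @ (XtVinv @ Sy) / sigma2, cov)
        # Step: sigma^2 | rest ~ IG(n/2 + a, (e'V^{-1}e + 2b)/2), see (2.35).
        e = Sy - X @ beta
        sigma2 = ((np.sum(e**2 / v) + 2.0 * b) / 2.0) / rng.gamma(n / 2.0 + a)
        # Step: (e_i^2/sigma^2 + r)/v_i ~ chi-square(r + 1), see (2.38).
        v = (e**2 / sigma2 + r) / rng.chisquare(r + 1.0, size=n)
        # Step: random-walk Metropolis update of lambda, see (2.57) and (2.59).
        Xb = X @ beta
        cand = lam + c_tune * rng.standard_normal()
        log_ratio = (log_kernel(cand, Y, Xb, W, v, sigma2)
                     - log_kernel(lam, Y, Xb, W, v, sigma2))
        if np.log(rng.uniform()) < log_ratio:
            lam, accepted = cand, accepted + 1
        if it < burn_in:                      # tune only during burn-in (assumption)
            rate = accepted / (it + 1)
            if rate < 0.4:
                c_tune /= 1.1
            elif rate > 0.6:
                c_tune *= 1.1
        else:
            draws["lam"].append(lam)
            draws["beta"].append(beta)
            draws["sigma2"].append(sigma2)
            draws["v"].append(v)
    return {name: np.array(vals) for name, vals in draws.items()}
```

Posterior means are then obtained by averaging the retained draws, for example `draws["lam"].mean()` and `draws["beta"].mean(axis=0)`, in line with taking $h(\zeta) = \zeta$.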

Figure 2.1: The effect of hyper-parameter $r$ on the estimates of variance components $v_i$'s. Panels: (a) Group variances and disturbance terms of the true data generating process; (b) Posterior mean of variance components $v_i$ for $r = 5, 6, 7$; (c) Posterior mean of variance components $v_i$ for further values of $r$ (legend includes $r = 5$ and $r = 10$).

Figure 2.2: The effect of hyper-parameter $r$ on the estimates of variance components $v_i$'s. Panels: (a) Group variances and disturbance terms of the true data generating process; (b) Posterior mean of variance components $v_i$ for $r = 5, 6, 7$; (c) Posterior mean of variance components $v_i$ for further values of $r$ (legend includes $r = 5$ and $r = 10$).

2.5 Monte Carlo Experiments

In the previous sections, we provided analytical results for the estimators. The conditional posterior distributions of the parameters indicate that the Bayesian estimators resemble the generalized least squares (GLS) estimators, such that the effect of outliers is down-weighted in the estimation. In the robust GMM approach, the sample moment functions

are weighted with the inverse of their covariances so that they are made as close as possible to zero. Despite these analytical results, it is difficult to make any general statement about the performance of the estimators in finite samples. Hence, to evaluate the finite sample properties of the estimators, we turn to Monte Carlo simulations.

2.5.1 Design

In this section, the small sample properties of the ML estimator, the robust GMM estimator and the Bayesian estimator are compared through Monte Carlo simulation experiments. The model is specified as

$$Y_n = \lambda_0 W_n Y_n + X_n \beta_0 + u_n, \qquad u_n = \rho_0 M_n u_n + \varepsilon_n. \tag{2.61}$$

The Monte Carlo experiments include the specifications (i) SARAR(1,1): $\lambda_0 \neq 0$, $\rho_0 \neq 0$, (ii) SARAR(1,0): $\lambda_0 \neq 0$, $\rho_0 = 0$, and (iii) SARAR(0,1): $\lambda_0 = 0$, $\rho_0 \neq 0$. There are two regressors and no intercept term, such that $X_n = [x_{n,1}, x_{n,2}]$ and $\beta_0 = (\beta_{10}, \beta_{20})'$, where $x_{n,1}$ and $x_{n,2}$ are independent random vectors that are generated from a Normal(0,1). We consider $n = 100, 500, 1000$. We let $W_n = M_n$ and set $\beta_{10} = 1$ and $\beta_{20} = 1$ for all experiments. For the spatial autoregressive parameters $\lambda_0$, $\rho_0$, we employ combinations from the set $K = (-0.9, -0.6, -0.3, 0, 0.3, 0.6, 0.9)$ to allow for weak and strong spatial interactions.

The row normalized spatial weight matrix is based on a small group interaction scenario described in Lin and Lee (2010). In this scenario, the weight matrix is a block diagonal matrix that has non-identical blocks, with the group sizes drawn from Uniform(3,20). Let $g = 1, \ldots, G$ index the groups. Then, before row normalization, the $(i, j)$th element of the weight matrix $w_{ij}$ is defined as

$$w_{ij} = \begin{cases} 1, & \text{if } i, j \in g \text{ with } i \neq j, \\ 0, & \text{otherwise,} \end{cases} \quad \text{for } g = 1, \ldots, G. \tag{2.62}$$

The members of a group have the same variance, and we use the group size to create heteroskedasticity. If the group size is greater than 10, we set the variance equal to the group size; otherwise we set the variance to the square of the inverse of the group size.
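The group interaction design above can be sketched in code. This is an illustration under stated assumptions: the function names are hypothetical, group sizes are drawn as integers from 3 to 20 inclusive, and groups are added until the target sample size is reached, so the realized $n$ may slightly exceed the target. How the dissertation matches the exact sample size is not stated in the text.

```python
import numpy as np

def draw_group_sizes(n_target, rng, low=3, high=20):
    """Draw integer group sizes until their sum reaches n_target (assumption)."""
    sizes = []
    while sum(sizes) < n_target:
        sizes.append(int(rng.integers(low, high + 1)))
    return sizes

def group_weight_matrix(group_sizes):
    """Row-normalized block diagonal W and group-based variances.

    Within a group of size m, each member gives weight 1/(m - 1) to every
    other member; the diagonal is zero. The variance is m if m > 10 and
    1/m^2 otherwise, and is shared by all members of the group.
    """
    n = int(np.sum(group_sizes))
    W = np.zeros((n, n))
    variances = np.empty(n)
    start = 0
    for m in group_sizes:
        stop = start + m
        W[start:stop, start:stop] = (np.ones((m, m)) - np.eye(m)) / (m - 1)
        variances[start:stop] = float(m) if m > 10 else 1.0 / m**2
        start = stop
    return W, variances

# Illustrative use (the seed and target size are arbitrary).
rng = np.random.default_rng(6)
sizes = draw_group_sizes(100, rng)
W, variances = group_weight_matrix(sizes)
eps = np.sqrt(variances) * rng.standard_normal(W.shape[0])   # heteroskedastic innovations
```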

Then, we generate the $i$-th element of the innovation vector $\varepsilon_n$ according to

$$\varepsilon_{in} = \sigma_{in} \xi_{in}, \tag{2.63}$$

where $\sigma_{in}$ is the standard error for the $i$-th observation and the $\xi_{in}$'s are i.i.d. Normal(0,1). In all experiments, the following estimators are considered: (1) the Gaussian maximum likelihood estimator (MLE), (2) the robust generalized method of moments estimators (RGMME) in (2.5), (2.10) and (2.16), (3) the Bayesian estimators (BE). For each specification, the Monte Carlo experiment is based on 1000 repetitions. For the BE, we run the MCMC algorithm 6000 times and discard the first 1000 iterations.

2.5.2 Simulation Results

The simulation results are presented in the Appendix. In each table, the empirical mean (Mean), the bias (Bias), the empirical standard error (SD) and the root mean square error (RMSE) of the parameter estimates are presented for the three estimators next to each other for easy comparison. First, we consider the simulation results for the case of SARAR(1,0), which are presented in Tables 2.9-2.11 in the Appendix. Table 2.9 shows the simulation results for N = 100. The MLE imposes significant bias on $\lambda_0$ for the case where there is strong negative spatial dependence, relative to the case of strong positive spatial dependence. The same pattern shows itself for the BE estimates. The RGMME imposes relatively smaller bias on $\lambda_0$ in all cases. In terms of finite sample efficiency measured by RMSE, the MLE and BE are more efficient. Table 2.9 also reports the estimation results for $\beta_{10}$ and $\beta_{20}$. Theoretically, the biases reported for $\lambda_0$ also contaminate the estimates of $\beta_{10}$ and $\beta_{20}$ for the case of the MLE (Lin and Lee, 2010). The results for $\beta_{10}$ and $\beta_{20}$ show that this theoretical prediction holds only when $\lambda_0 = -0.9$. Surprisingly, the MLE imposes trivial biases on $\beta_{10}$ and $\beta_{20}$ for the other cases. In terms of finite sample efficiency, the MLE and the BE are relatively more efficient. Table 2.10 reports the estimation results for N = 500. The MLE results repeat the same pattern for $\lambda_0$, $\beta_{10}$ and $\beta_{20}$. Again, the BE imposes significant bias on $\lambda_0$ when there is strong negative spatial dependence.
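The summary statistics reported in the tables can be computed from the Monte Carlo replications as sketched below. The function name `summarize` is hypothetical, and the standard deviation is computed with divisor equal to the number of replications (an assumption), so that RMSE squared equals Bias squared plus SD squared exactly.

```python
import numpy as np

def summarize(estimates, true_value):
    """Empirical mean, bias, SD and RMSE of a parameter across replications."""
    est = np.asarray(estimates, dtype=float)
    mean = est.mean()
    bias = mean - true_value
    sd = est.std()                                   # divisor: number of replications
    rmse = np.sqrt(np.mean((est - true_value) ** 2))
    return {"Mean": mean, "Bias": bias, "SD": sd, "RMSE": rmse}

# Toy example with three hypothetical replications of an estimator.
stats = summarize([1.0, 2.0, 3.0], true_value=1.5)
```

An estimator can therefore have a smaller RMSE than a competitor despite a larger bias, whenever its sampling variability is sufficiently smaller.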
The RGMME imposes almost o bias 03

117 o λ 0. The fiite sample efficiecy is higher for all estimators relative to Table 2.9. The same patter ca be see i Table 2. whe N = 000. Overall, the MLE ad the BE impose sigificat bias o λ 0 whe there is strog egative spatial depedece. The amout bias does ot decrease as the sample size icreases, which suggest the icosistecy for the case of the MLE. Next, we tur to the simulatio results for the case of SARAR0, preseted i Tables Theoretically, the MLE of ρ 0 is agai icosistet ad the MLE of β 0 ad β 20 is cosistet. The results i Tables are cosistet with this large sample result. That is, the MLE imposes sigificat bias o ρ 0 ad its magitude does ot decrease as the sample size icreases. The RGMME imposes almost o bias o ρ 0, ad as the sample icreases its fiite sample efficiecy measured with RMSE icreases sigificatly. For all three sample sizes, the BE imposes sigificat amout of bias o ρ 0 whe ρ 0 = 0.6 ad ρ 0 = 0.9. I terms of fiite sample efficiecy, it is slightly more efficiet tha the RGMME for the estimatio of β 0 ad β 20. The SARAR0, specificatio R ρy = R ρx β 0 +ε is a geeralized least squares model with covariace Σ for a give value of ρ. Our aalytical results i Sectio 3 shows that the RGMME of β 0 is asymptotically equivalet to the OLSE. O the other had, the mea of the coditioal posterior distributio of β 0 give i Sectio 4 idicates that the Bayesia estimator of β 0 mimics the GLS estimator, which is more efficiet tha the OLSE. The Mote Carlo results cofirm this aalytical result by idicatig that the BE estimator performs better i terms of fiite sample efficiecy as measured by RMSEs. Tables reports the estimatio results for the case of SARAR,. First, we will evaluate the performace of the estimator i terms of bias amout through the estimatio results reported i the tables. The, we will evaluate the estimators i terms of fiite sample efficiecy through surface plots provided i Appedix We report oly the estimatio results for the autoregressive parameters. 
Theoretically, the MLE is inconsistent for all parameters in this specification when the true values of the autoregressive parameters are not zero. When $\lambda_0 = 0$, only the MLE of $\rho_0$ is inconsistent. For the trivial case where $\rho_0 = \lambda_0 = 0$, heteroskedasticity has no effect on the large sample properties of the MLE.

We start with the estimation results for $\lambda_0$. For all sample sizes, the MLE imposes significant bias on $\lambda_0$, and the amount of bias is relatively smaller for higher values of $\lambda_0$. The finite sample efficiency of the MLE is higher in the cases where there is positive spatial dependence in the dependent variable, as can be seen from the reported RMSEs. The RGMME imposes relatively smaller bias on $\lambda_0$, and the magnitude of the bias decreases significantly as the sample size increases to N = 1000. When there is strong positive spatial dependence in the disturbance terms, i.e., when $\rho_0$ is high, the RGMME imposes significant bias on $\lambda_0$. In contrast, the BE imposes significant bias on $\lambda_0$ in all cases and for all sample sizes.

Now, we turn to the estimation results for $\rho_0$. First, when the sample size is N = 100, all estimators impose significant bias on $\rho_0$. As expected, the MLE imposes significant bias on $\rho_0$ in all cases and for all sample sizes. The RGMME imposes mostly trivial bias on $\rho_0$ when N = 1000. Despite this, in the cases where there is strong negative or positive spatial dependence in the model, the RGMME occasionally imposes significant bias on $\rho_0$. Finally, the BE imposes significant bias on $\rho_0$ in all cases and for all sample sizes.

Now, we evaluate the estimators in terms of finite sample efficiency measured by RMSE through the surface plots provided in the appendix. Figure 2.9 shows the surface plots of the RMSEs of $\rho_0$ for all estimators. The surface plots in the first, second and third rows are for N = 100, N = 500 and N = 1000, respectively. The plots in the first column of Figure 2.9 show that the MLE has higher RMSEs when $\rho_0$ and $\lambda_0$ are near 1 in absolute value. The same pattern can be seen for the BE. When N = 1000, the RGMME has higher RMSEs in the cases where $\rho_0$ is near 1 in absolute value. Figure 2.10 shows the surface plots of the RMSEs for $\lambda_0$. The plots in the first column show that RMSEs are generally higher for the MLE when $\lambda_0$ is near 1 in absolute value. For the RGMME, RMSEs are higher when $\rho_0$ and $\lambda_0$ are near 1 in absolute value.
The BE shows a similar pattern for the RMSEs of $\lambda_0$. Finally, in terms of magnitude, a glance at the simulation results in the tables suggests that the RGMME has smaller RMSEs.
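The summary measures reported in the simulation tables (empirical mean, bias, SD and RMSE) can be computed from the Monte Carlo replications as in the following sketch. The function name is hypothetical, and the use of the population form of the standard deviation (dividing by the number of replications) is an assumption; under that convention the squared RMSE equals the squared bias plus the squared SD.

```python
import math
import statistics


def mc_summary(estimates, true_value):
    """Summarize Monte Carlo estimates of one parameter."""
    mean = statistics.fmean(estimates)
    bias = mean - true_value
    sd = statistics.pstdev(estimates)  # population SD (assumption)
    rmse = math.sqrt(statistics.fmean((e - true_value) ** 2 for e in estimates))
    return {"mean": mean, "bias": bias, "sd": sd, "rmse": rmse}


# Toy replications of an estimator of lambda_0 = 0.6 (illustrative numbers only)
summary = mc_summary([0.5, 0.7, 0.9], true_value=0.6)
```

Because RMSE combines bias and dispersion, an estimator with larger bias can still have a smaller RMSE, which is the pattern discussed above for the MLE and the BE relative to the RGMME in small samples.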

2.6 Empirical Illustrations

In this section, we use two empirical examples to compare parameter estimates based on the OLS, ML, robust GMM and Bayesian estimators.$^{27}$ For the first example, we estimate the convergence equation of the spatially augmented Solow growth model described in Ertur and Koch (2007) to test the conditional convergence hypothesis for a sample of 91 countries over the period 1960 to 1995. In the second application, the hedonic housing-price model described in Harrison and Rubinfeld (1978) is estimated for a sample of 506 observations on census tracts in the Boston Standard Metropolitan Area in 1970. For these applications, detailed definitions and descriptive statistics of all variables are provided in the appendix. Figure 2.3 shows the scatter plots of the dependent variables and their spatial lags. The scatter plots indicate positive associations between the dependent variables and their spatial lags.

Figure 2.3: Scatter plots of dependent variables and their spatial lags. (a) Countries' growth rates (in log) against the distance weighted sum of growth rates of all countries (in log). (b) Median house values (in $1000) against the average of median house values in the nearest 10 census tracts (in $1000).

$^{27}$ For both applications, the Bayesian estimates are based on 10000 draws retained from a total of 15000 draws of MCMC sampling, with the first 5000 draws discarded for burn-in. For the Bayesian estimates, we present posterior means and t-statistics calculated from the standard deviations of the MCMC draws.
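The way the Bayesian columns are summarized (discard the burn-in draws, then report the posterior mean and a t-statistic formed from the standard deviation of the retained draws) can be illustrated as follows. This is a sketch with a hypothetical function name, and the use of the sample standard deviation is an assumption.

```python
import statistics


def posterior_summary(draws, burn_in):
    """Posterior mean, SD and mean/SD ratio from the draws kept after burn-in."""
    kept = draws[burn_in:]
    mean = statistics.fmean(kept)
    sd = statistics.stdev(kept)  # sample SD of retained draws (assumption)
    return mean, sd, mean / sd


# Toy chain: five burn-in draws far from the target, then three retained draws
mean, sd, t_stat = posterior_summary([100.0] * 5 + [1.0, 2.0, 3.0], burn_in=5)
```

With 15000 total draws and a burn-in of 5000, the same call would retain the 10000 draws described in the footnote.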

2.6.1 Conditional Convergence

Ertur and Koch (2007) modify the textbook Solow growth model by assuming that the structure of technological interdependence is determined by the geographic distance between countries. The technological interdependence is formulated in the following way:

$$A_{it} = A_0 e^{gt} k_{it}^{\theta} \prod_{j=1}^{N} A_{jt}^{\gamma w_{ij}}, \tag{2.64}$$

where $N$ is the total number of countries. The aggregate level of technology $A_{it}$ of country $i$ at time $t$ depends on the level of physical capital per worker $k_{it}$ and on the product of the weighted stocks of knowledge of other countries, denoted by the term $\prod_{j}^{N} A_{jt}^{\gamma w_{ij}}$. The parameter $\theta \in [0, 1)$ represents the strength of the externalities generated by the physical capital accumulation of country $i$ in the production of technology. The parameter $g$ is the growth rate of the exogenous portion of technology and is assumed to be the same for all countries. The parameter $\gamma \in [0, 1)$ reflects the degree of technological interdependence, and the weight $w_{ij}$ is an exogenous term that represents the geographic proximity between countries $i$ and $j$. Ertur and Koch (2007) consider the following two specifications for the weight $w_{ij}$:

$$w_{ij} = \begin{cases} 0, & \text{if } i = j, \\ d_{ij}^{-2}, & \text{otherwise,} \end{cases} \qquad \text{and} \qquad w_{ij} = \begin{cases} 0, & \text{if } i = j, \\ \exp\{-2 d_{ij}\}, & \text{otherwise,} \end{cases} \tag{2.65}$$

where $d_{ij}$ is the great circle distance between country capitals.

The other equations of the spatially augmented Solow growth model are the same as those of the standard textbook growth model. The production function is the Cobb-Douglas production function with constant returns to scale in labor and capital, formulated as $Y_{it} = A_{it} K_{it}^{\alpha} L_{it}^{1-\alpha}$, where $Y_{it}$ is output, $K_{it}$ is physical capital and $L_{it}$ is the level of labor. The level of output in a country depends on the levels of output and production inputs of the other countries through the process of $A_{it}$. Thus, the augmented Solow growth model turns into an interdependent system in which countries cannot be treated as isolated units of analysis.
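The two distance-based weight specifications can be illustrated with a small sketch. It is not the construction used for the estimates reported here. The haversine formula for the great circle distance, the rescaling of distances by `scale_km` (needed so that the exponential kernel does not underflow), the row normalization, and the three capital coordinates are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great circle distance in kilometers via the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def distance_weights(coords, kernel="inverse_square", scale_km=1000.0):
    """Row-normalized weights with w_ii = 0 and d_ij^-2 or exp(-2 d_ij) off the diagonal.

    Distances are divided by scale_km first (assumption). Coincident points
    would give d_ij = 0 and are not handled by this sketch.
    """
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = great_circle_km(*coords[i], *coords[j]) / scale_km
            W[i][j] = d ** -2 if kernel == "inverse_square" else math.exp(-2.0 * d)
        total = sum(W[i])
        W[i] = [w / total for w in W[i]]  # row normalization (assumption)
    return W


# Approximate (latitude, longitude) of Paris, Berlin and Madrid
capitals = [(48.86, 2.35), (52.52, 13.40), (40.42, -3.70)]
W_inverse = distance_weights(capitals, kernel="inverse_square")
W_exp = distance_weights(capitals, kernel="exponential")
```

In both cases a nearer capital receives a larger weight, so the spatial lag of a variable is dominated by nearby countries.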
As in the textbook Solow model, the growth rate of labor, denoted by $n_i$, is exogenous, and each country saves a fixed fraction of its output, denoted by $s_i$, for capital investment. The evolution of the capital stock is formulated in terms of the capital labor ratio $k_{it}$ and is assumed to be governed by the following equation:

$$\frac{dk_{it}}{dt} = s_i y_{it} - (n_i + \delta) k_{it},$$

where $\delta$ is the depreciation rate of capital and is assumed to be the same for all countries. The production function is characterized by diminishing marginal returns in the production inputs, which ensures the existence of a steady-state solution for the modified growth model. The convergence dynamics of an economy to its steady state can be explored by log-linearizing the equation for the evolution of capital around the steady state level. The speed of transition to the steady state equilibrium is measured by the convergence rate, which is assumed to be the same for all countries. The empirical equation that corresponds to the convergence dynamics of the spatially augmented Solow growth model is recovered as

$$\frac{\ln y_i(t) - \ln y_i(0)}{T} = \beta_0 + \beta_1 \ln y_i(0) + \beta_2 \ln s_i + \beta_3 \ln(n_i + g + \delta) + \theta_1 \sum_{j \neq i}^{N} w_{ij} \ln y_j(0) + \theta_2 \sum_{j \neq i}^{N} w_{ij} \ln s_j + \theta_3 \sum_{j \neq i}^{N} w_{ij} \ln(n_j + g + \delta) + \lambda \sum_{j \neq i}^{N} w_{ij} \frac{\ln y_j(t) - \ln y_j(0)}{T} + \varepsilon_i, \tag{2.66}$$

where

$$\beta_1 = -\frac{1 - e^{-\kappa T}}{T}, \qquad \beta_2 = -\beta_3 = \frac{\alpha + \phi}{1 - \alpha - \phi} \cdot \frac{1 - e^{-\kappa T}}{T}, \qquad \theta_3 = -\theta_2 = \frac{\alpha \gamma}{1 - \alpha - \phi} \cdot \frac{1 - e^{-\kappa T}}{T},$$

and $\beta_0$ and $\theta_1$ are likewise functions of the structural parameters of the model (see Ertur and Koch, 2007). This equation is derived under the assumption that the convergence rate, denoted by $\kappa$, has the same value for all countries.$^{28}$ All countries are assumed to have the same steady state growth rate $g$, so that $g + \delta$ is a common constant. For the estimation, we consider only one of the two $w_{ij}$ specifications for the weight matrix.$^{29}$

We provide estimation results based on the OLS, ML, RGMM and Bayesian estimators for the case of SARAR(1,0), the spatial Durbin model (SDM), SARAR(0,1) and SARAR(1,1). The OLS results are repeated in each table for easy comparison. The diagnostic tests for heteroskedasticity indicate the presence of heteroskedasticity in all models. We use two versions of the Lagrange multiplier (LM) test: (i) the Breusch-Pagan LM test, and (ii) the Koenker and Bassett variant of the LM test.$^{30}$ These tests are specified based on the squares of the explanatory variables, and we report the test results for the OLS and ML estimates, as the other estimators are robust to heteroskedasticity. In each specification, the null hypothesis of homoskedasticity is rejected by both tests.

Figure 2.4: Posterior mean of variance components: Conditional convergence. Each panel plots the posterior means of $v_i$ against the observations: (a) SARAR(1,0), (b) SDM, (c) SARAR(0,1), (d) SARAR(1,1).

$^{28}$ Note that the autoregressive parameter $\lambda$ in (2.66) and $\rho$ in $\beta_0$ are complicated functions of the parameters of the model, see Ertur and Koch (2007).
$^{29}$ The estimation results based on the weight matrix with the other $w_{ij}$ specification are similar.
$^{30}$ The Breusch-Pagan Lagrange multiplier test is sensitive to the assumption of normality of the disturbance terms. The Koenker and Bassett variant is robust to the assumption of normality. For the case of SARAR(1,0), Lin and Lee (2010) show that the Breusch-Pagan LM test has desirable finite sample properties.
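To make the Koenker and Bassett variant concrete, the sketch below computes it for a single regressor as the sample size times the R-squared from regressing squared OLS residuals on the squared regressor. It is a simplified illustration with hypothetical function names and artificial data, not the multivariate version applied in this section. Under homoskedasticity the statistic is compared with a chi-squared distribution with one degree of freedom (5% critical value about 3.84).

```python
import statistics


def simple_ols(x, y):
    """Intercept and slope from regressing y on a constant and x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def koenker_lm(x, y):
    """Studentized LM statistic: n * R^2 of squared residuals on x squared."""
    a, b = simple_ols(x, y)
    e2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    z = [xi ** 2 for xi in x]  # test variable: square of the regressor
    c, d = simple_ols(z, e2)
    mean_e2 = statistics.fmean(e2)
    ss_tot = sum((v - mean_e2) ** 2 for v in e2)
    ss_res = sum((v - (c + d * zi)) ** 2 for v, zi in zip(e2, z))
    return len(x) * (1.0 - ss_res / ss_tot)


# Artificial data: the disturbance alternates in sign; its size grows with x
# in the first series and stays constant in the second.
x = [float(i) for i in range(1, 41)]
sign = [(-1.0) ** i for i in range(1, 41)]
y_hetero = [2 + 3 * xi + s * xi for xi, s in zip(x, sign)]
y_homo = [2 + 3 * xi + s for xi, s in zip(x, sign)]
lm_hetero = koenker_lm(x, y_hetero)
lm_homo = koenker_lm(x, y_homo)
```

The first series produces a large statistic because the squared residuals track the squared regressor, while the second does not.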

Table 2.1: Conditional Convergence: SARAR(1,0)
Columns: OLSE, MLE, RGMME, BE
Intercept    [0.9222] [.0808] [.309] [.9879]
log y_1960    [ ] [ ] [-3.375] [-4.337]
log s    [8.0389] [7.424] [5.4575] [7.878]
log(n + g + δ)    [ ] [ ] [-2.772] [-2.373]
W [log y_1995 − log y_1960]    [2.9964] [.9523] [3.8975]
Moran's I Test    [0.0000]
LM SEM    [0.0005]
LM SAR    [0.0002]
Robust LM SEM    [0.402]
Robust LM SAR    [0.309]
Breusch-Pagan LM    [0.0047] [0.043]
Breusch-Pagan-Koenker LM    [0.062] [ ]
Jarque-Bera Test    [0.5000] [0.645]
Implied κ    [3.2063] [3.4685] [2.9563] [3.7394]
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics.

Furthermore, the difference between the test statistics of these two versions suggests that the assumption of normality is erroneous, as indicated by the Jarque-Bera test of normality. The presence of heteroskedasticity can also be tested from the Bayesian perspective by investigating the posterior means of the variance components. Recall that the Bayesian estimators are based on the assumption that the covariance matrix $\Sigma_n$ can be decomposed as $\sigma^2 V$, where $V = \mathrm{Diag}(v_1, \ldots, v_n)$. Thus, estimates of the $v_i$'s that deviate substantially from unity indicate the presence of heteroskedasticity. Figure 2.4 shows the plots of the posterior means of the $v_i$'s for each specification. The substantial deviation of the posterior means of $v_i$ from unity indicates the presence of heteroskedasticity in each specification.

Now, we turn to the estimation results of the models reported in the tables. Table 2.1 provides the estimation results for the SARAR(1,0) specification. All four estimators provide similar estimates for the parameters of the exogenous variables. As a result, the estimates of the convergence parameter $\kappa$ are similar. The OLS column in Table 2.1 contains five diagnostic tests to detect spatial autocorrelation among the residuals of the OLS model. The Moran's I test is a diffuse test in the sense that the alternative is an unspecified form of spatial correlation. In the case of the LM tests, the alternative hypothesis is stated either as SARAR(1,0), denoted by LM SAR, or as SARAR(0,1), denoted by LM SEM. As indicated in Table 2.1, these three tests reject the null hypothesis of no spatial correlation. The OLS column also includes two robust versions of the LM test: (i) the robust LM SAR and (ii) the robust LM SEM. These tests are locally robust to misspecification of the alternative hypothesis (Anselin and Bera, 1998). Both robust tests are insignificant. Finally, we turn to the estimation results for the autoregressive parameter $\lambda$ in Table 2.1. The estimate of the autoregressive parameter is higher in the case of the BE. Our Monte Carlo results for a similar sample size reported in Table 2.9 indicate that the RGMME overestimates $\lambda$, while the MLE and the BE underestimate $\lambda$. The estimates reported for this empirical illustration are not consistent with the Monte Carlo results.

Table 2.2 reports the estimation results when there are exogenous spatial interactions of the covariates. This specification, which is given in (2.66), corresponds to the theoretical convergence equation of the spatially augmented Solow growth model. For the RGMME results, we use initial estimates from the BE, since the initial GMM and 2SLS estimators described in Section 3 report estimates of the autoregressive parameter bigger than 1.
First, the joint test for the spatial lags of the exogenous variables, denoted by Wald-SARAR(1,0), is significant, suggesting that these variables have explanatory power. On the other hand, the Bayesian posterior odds ratio is in favor of the SARAR(1,0) specification.$^{31}$ While the spatial lags of initial income (W log y_1960) and the saving rate (W log s) are significant in the case of the RGMME, the only significant spatial lag of the exogenous variables is that of initial income in the case of the ML and Bayesian estimators. Another important difference is that the estimates for initial income from the spatial model reported by the ML, RGMM and Bayesian estimators are significantly different from the ones reported by the OLSE. As a result, the estimates of the convergence rate are different. The estimates reported in the literature range between 1% and 3% (Durlauf et al.). Therefore, the OLS model underestimates $\kappa$, while all estimates from the spatial models are in this range, where the RGMME assigns the largest value. Finally, we compare the estimates of the autoregressive parameter reported by the RGMM, ML and Bayesian estimators. The estimates do not agree, and the RGMME assigns the largest value of 0.820, which is consistent with the Monte Carlo results reported in Table 2.9, where the RGMME overestimates and the MLE and the BE underestimate the autoregressive parameter.

$^{31}$ The posterior model probabilities for the SARAR(1,0) and the SDM are calculated by the Matlab function model_probs of the Spatial Econometrics Toolbox provided by James LeSage. Also note that the log-likelihood value reported for the SDM is higher than the one reported for the SARAR(1,0).

Table 2.2: Conditional Convergence: Spatial Durbin Model (SDM)
Columns: OLSE, MLE, RGMME, Bayesian
Intercept    [0.9222] [0.3529] [-.774] [.3526]
log y_1960    [ ] [ ] [-4.934] [-4.673]
log s    [8.0389] [7.3974] [5.403] [6.7477]
log(n + g + δ)    [ ] [ ] [-3.00] [ ]
W [log y_1995 − log y_1960]    [3.8548] [0.29] [3.2304]
W log s    [ ] [ ] [ ]
W log(n + g + δ)    [.774] [.4345] [.4906]
W log y_1960    [3.086] [ ] [2.2860]
Moran's I Test    [0.0000]
LM SEM    [0.0005]
LM SAR    [0.0002]
Robust LM SEM    [0.402]
Robust LM SAR    [0.309]
Breusch-Pagan LM    [0.0047] [ ]
Breusch-Pagan-Koenker LM    [0.062] [ ]
Jarque-Bera Test    [0.5000] [0.787]
Wald-SARAR(1,0) Test    [0.064] [0.000]
Implied κ    [3.2063] [3.6067] [3.4286] [5.934]
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics.

Table 2.3: Conditional Convergence: SARAR(0,1)
Columns: MLE, RGMME, Bayesian
Intercept    [0.7859] [0.7839] [0.909]
log y_1960    [ ] [ ] [ ]
log s    [7.7483] [5.5677] [6.5803]
log(n + g + δ)    [-3.5] [ ] [ ]
W u    [4.375] [2.5650] [3.559]
Common factor test    [0.095] [0.974]
Breusch-Pagan LM    [0.0047] [ ]
Breusch-Pagan-Koenker LM    [0.062] [ ]
Jarque-Bera Test    [0.5000] [ ]
Implied κ    [3.3676] [2.766] [2.8844]
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics. The Wald and likelihood ratio tests are used for the common factor restrictions.

The SDM specification nests the SARAR(0,1), whose estimates are given in Table 2.3. The parameter estimates reported in Table 2.3 for all exogenous variables are almost the same. The estimators report slightly different estimates for the autoregressive parameter (W u). The diagnostic test results for the common factor restriction imply that the SARAR(0,1) specification may explain the sample data better. In addition, the Bayesian posterior odds ratio is in favor of the SARAR(0,1) specification. When we compare this model with the SARAR(1,0) specification, the posterior odds ratio is in favor of the SARAR(1,0). The convergence rates reported in Table 2.3 are higher than the ones obtained from Table 2.1 under the SARAR(1,0) specification.

Finally, the parameter estimates for the SARAR(1,1) specification are given in Table 2.4. The estimates of the autoregressive parameters obtained from the RGMME are very different from the ones reported by the ML and Bayesian estimators. The estimates obtained from the MLE and the BE for $\lambda_0$ are negative. This result indicates that the distance based average growth rate of all other countries has negative spillover effects for a particular country. On the other hand, the RGMME reports a positive estimate for $\lambda_0$ and a negative estimate for $\rho_0$. The parameter estimates of the exogenous variables are similar across estimators in terms of magnitudes and signs. It is interesting that all exogenous variables have very small t-statistics in the case of the Bayesian estimator. In the case of the RGMME and the MLE, the t-statistics are calculated from the analytical covariance matrix. On the other hand, these statistics are simply calculated from the MCMC draws for the Bayesian estimator. The small t-statistics of the BE suggest that the variation in the sampling draws is significantly higher for this application.$^{32}$ Figure 2.5a shows the first 1000 draws for the autoregressive parameters to monitor the sampling process for these parameters. The plots in Figure 2.5a suggest that the sampler does not encounter any problem for these parameters.

Since the estimation results for the autoregressive parameters differ significantly across the estimators, we investigate the optimization procedures of the estimators.$^{33}$ For both the RGMME and the MLE, the non-linear optimization is a simplex search method that does not use numerical or analytic gradients.$^{34}$ To ensure that the estimates reflect the global optimum, we use different initial values for the parameters and confirm that the estimates do not change. Figure 2.5b shows the objective function values against the iteration numbers for the case of the RGMME and the MLE. The plots show that the increments in the objective functions slow down toward the end of the search, suggesting that the numerical routines converged to an optimum. We suggest that this specification should be rejected for three main reasons: (i) the negative autoregressive parameters are counterintuitive for this application, (ii) the estimate of the autoregressive parameter of the disturbance term is insignificant in the case of the RGMME, and (iii) the Monte Carlo results for the case of SARAR(1,1) reported in Tables 2.15 and 2.18 for the sample size of 100 indicate that all estimators perform poorly.

To summarize, from the economic theory standpoint, the correct specification is the SDM, as it is the empirical equation that corresponds to the theoretical convergence equation of the Solow growth model. From the statistical standpoint, the estimation results from the RGMM and ML estimators show that the common factor test results are not in agreement with the robust LM test results. In such situations, Elhorst (2010) suggests that the SDM should be adopted, since the estimates from this model are at worst inefficient.

$^{32}$ The diagnostic tests indicate the convergence of the sampler. We use the Matlab function coda of the Spatial Econometrics Toolbox to monitor the convergence of the sampler.
$^{33}$ This model is also estimated with the two-step robust GMM estimator suggested in Kelejian and Prucha (2010). The estimates obtained from this estimator are similar to the ones reported for the RGMME.
$^{34}$ In the case of the MLE, the concentrated log-likelihood function is maximized over the autoregressive parameters. On the other hand, all parameters are jointly estimated in the case of the RGMME. For both estimators, the Matlab built-in function fminsearch is used.
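The multi-start check described above (restart a derivative-free search from several initial values and confirm that the optimum does not move) can be sketched as follows. Two things here are stand-ins and not what was used for the estimates: the optimizer is a simple compass (coordinate) search, not the Nelder-Mead simplex method behind fminsearch, and the objective is a toy convex function of two parameters, not the GMM or likelihood objective. All names are hypothetical.

```python
def compass_search(f, x0, step=0.5, tol=1e-8, max_iter=10000):
    """Derivative-free minimization: try +/- step along each coordinate,
    halving the step whenever no move improves the objective."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        improved = False
        for k in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[k] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2.0
            if step < tol:
                break
    return x, fx


def multistart(f, starts, **kwargs):
    """Run the search from every starting point and collect the optima."""
    return [compass_search(f, s, **kwargs) for s in starts]


def toy_objective(p):
    """Toy convex objective in (lambda, rho) with minimum at (0.3, -0.2)."""
    a, b = p[0] - 0.3, p[1] + 0.2
    return a * a + b * b + 0.5 * a * b


starts = [(-0.9, -0.9), (0.0, 0.0), (0.9, 0.9)]
results = multistart(toy_objective, starts)
```

If the optima returned from different starting points disagree, the objective likely has several local optima and a single run cannot be trusted; agreement across starts is supporting evidence, not proof, of a global optimum.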

Table 2.4: Conditional Convergence: SARAR(1,1)
Columns: OLSE, MLE, RGMME, Bayesian
Intercept    [0.9222] [2.453] [.4390] [-0.007]
log y_1960    [ ] [ ] [ ] [ ]
log s    [8.0389] [7.0957] [4.6429] [0.075]
log(n + g + δ)    [ ] [ ] [-.9286] [ ]
W [log y_1995 − log y_1960]    [ ] [2.2493] [ ]
W u    [23.038] [-0.560] [ ]
Moran's I Test    [0.0000]
LM SEM    [0.0005]
LM SAR    [0.0002]
Robust LM SEM    [0.402]
Robust LM SAR    [0.309]
Breusch-Pagan LM    [0.0047] [0.004]
Breusch-Pagan-Koenker LM    [0.062] [0.025]
Jarque-Bera Test    [0.5000] [0.508]
Implied κ    [3.2063] [4.009] [2.8558] [ ]
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics.

Figure 2.5: Monitoring convergence: SARAR(1,1). (a) MCMC sampling for the autoregressive parameters: draws of λ and ρ against MCMC iterations. (b) Objective function values against the iteration numbers for the RGMME and the MLE.

Though our Monte Carlo design does not contain the SDM, our simulation results for the SARAR(1,0) specification can be used to establish which set of estimates in Table 2.2 is likely to be closest to the truth. The simulation results in Table 2.9 suggest that the estimates reported in the RGMME column are likely to be much closer to the truth.

Figure 2.6: Posterior distribution of λ and κ: SDM and SARAR(1,0). Each panel overlays the posterior densities for r = 4, 5, 6, 7 and 100: (a) posterior distribution of λ: SARAR(1,0), (b) posterior distribution of κ: SARAR(1,0), (c) posterior distribution of λ: SDM, (d) posterior distribution of κ: SDM.

As a final exercise, we investigate the effect of different hyper-parameter values on the Bayesian estimates. Besides the SDM, we also consider the SARAR(1,0) specification, since it has the largest posterior model probability. In Figure 2.6, we show the effect of the hyper-parameter $r$ on the estimates of the autoregressive ($\lambda$) and convergence ($\kappa$) parameters. The density plots in the figure indicate that the estimates of these parameters are not significantly affected by different values of the hyper-parameter in this application.

Housing Price Model

In the second illustrative application, the Harrison and Rubinfeld (1978) data set described in Gilley and Pace (1996) is used to estimate a hedonic housing-price model for a sample of 506 observations of census tracts in the Boston Standard Metropolitan Area in 1970. This data set has been used by many authors to illustrate various econometric techniques, see for example LeSage (1999) and Baltagi and Liu. In this application, the median house price is explained by thirteen variables that are described more fully in Table 2.22 of Appendix 2.8.3: per-capita crime rate by town, proportion of residential land zoned for lots over 25,000 sq.ft., proportion of non-retail business acres per town, Charles River dummy variable, nitric oxides concentration (parts per 10 million), average number of rooms per dwelling, proportion of owner-occupied units built prior to 1940, weighted distance to five Boston employment centers, index of accessibility to radial highways, full-value property tax rate, pupil-teacher ratio by town, proportion of blacks per town, and proportion of lower class per town.$^{35}$

We construct the weight matrix based on the nearest 10 census tracts.$^{36}$ Let $J_i$ be the set containing the nearest 10 census tracts for census tract $i$. Then, the element $w_{ij}$ is determined according to

$$w_{ij} = \begin{cases} 0, & \text{if } i = j \text{ or } j \notin J_i, \\ 1, & \text{if } j \in J_i, \end{cases} \qquad \text{for } i, j = 1, \ldots, N. \tag{2.67}$$

$^{35}$ The dependent variable is in log form. Following LeSage (1999), we scale each variable by subtracting the mean and dividing by the standard deviation.
$^{36}$ This kind of weight matrix is used in Pace et al. (2012a) to illustrate the performance of various estimators when the data generating process assumes an autoregressive process for the exogenous variables.
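A nearest-neighbor weight matrix of the kind in (2.67), row normalized so that each spatial lag is a simple average over the neighbors, can be sketched as below. The function names are hypothetical, and the use of Euclidean distance between planar coordinates is an assumption. The six points on a line are illustrative only.

```python
import math


def knn_weight_matrix(coords, k=10):
    """Row-normalized k-nearest-neighbor weights: w_ij = 1/k if j is among
    the k nearest units to i, and 0 otherwise (including i = j)."""
    n = len(coords)
    if k >= n:
        raise ValueError("k must be smaller than the number of units")
    W = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        # Euclidean distance to every other unit (assumption), nearest first
        dist = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        for _, j in dist[:k]:
            W[i][j] = 1.0 / k
    return W


def spatial_lag(W, v):
    """Spatial lag W v: for each unit, the average of v over its neighbors."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


coords = [(float(i), 0.0) for i in range(6)]
W = knn_weight_matrix(coords, k=2)
lagged = spatial_lag(W, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```

Note that such a matrix is generally not symmetric: unit j can be among the nearest neighbors of unit i without the reverse being true.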

The weight matrix is row normalized so that the spatial lag terms are a simple average of the variables in the nearest 10 census tracts. The estimation results are given in Tables 2.5 to 2.8. First of all, the null hypothesis of no heteroskedasticity is rejected in all specifications, as indicated by the results of the Breusch-Pagan LM test and its Koenker version. The posterior means of the variance components $v_i$ are presented in Figure 2.7. The estimates of the variance components indicate that the constant variance assumption may not hold for this application.

Figure 2.7: Posterior mean of variance components: Hedonic Housing-Price Equation. Each panel plots the posterior means of $v_i$ against the observations: (a) SARAR(1,0), (b) SDM, (c) SARAR(0,1), (d) SARAR(1,1).

The estimation results in Table 2.5 are for the SARAR(1,0) specification. The first column contains the OLS estimation results. The Moran's I test and the robust LM test statistics in this column strongly indicate the presence of spatial autocorrelation among the residuals of the OLS model. In particular, the robust LM test statistic for the presence of the spatial lag of the disturbance term (Robust LM SEM) suggests that the SARAR(0,1) may explain the data better. The parameter estimates are mostly in agreement, except for the pollution variable (denoted by Nitric) and the Charles River dummy variable. The Bayesian estimator reports a negative, insignificant estimate for the pollution variable. In the case of the Charles River dummy variable, all spatial models in all subsequent specifications have insignificant estimates. LeSage (1999) states that this locational dummy variable is unnecessary, since the spatial models account explicitly for the spatial nature of the data.

Table 2.5: Hedonic Housing-Price Equation: SARAR(1,0)
Columns: OLSE, MLE, RGMME, Bayesian
Crime    [-7.862] [ ] [ ] [ ]
Zoning    [2.360] [ ] [ ] [2.263]
Industry    [.0032] [.0068] [.998] [.0265]
Charles River    [2.9285] [0.526] [0.4754] [-0.297]
Nitric    [ ] [ ] [-3.386] [-.0338]
Rooms    [5.4355] [7.0329] [ ] [.9405]
House age    [0.3987] [0.364] [0.8279] [ ]
Distance    [-6.549] [-5.780] [ ] [-5.628]
Access    [5.3780] [5.7227] [5.4665] [4.8265]
Tax rate    [-4.66] [-4.394] [-5.267] [-5.2]
Pupil/teacher    [-7.360] [ ] [-5.663] [ ]
Black pop    [3.8507] [3.9870] [2.6058] [6.557]
Lower class    [-4.38] [-.7367] [ ] [ ]
W dep    [77.059] [6.8564] [2.965]
Moran's I Test    [0.0000]
Robust LM SARAR(1,1)    [0.000]
Robust LM SARAR(0,1)    [0.0000]
Breusch-Pagan LM    [0.0000] [0.0000]
Breusch-Pagan-Koenker LM    [0.0000] [0.0000]
Jarque-Bera Test
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics.

Table 2.6: Spatial Durbin Model
Columns: MLE, RGMME, Bayesian
Crime    [ ] [ ] [-5.408]
Zoning    [2.854] [2.325] [.6803]
Industry    [-0.295] [0.554] [ ]
Charles River    [-0.05] [ ] [-0.433]
Nitric    [ ] [ ] [-.9494]
Rooms    [3.029] [4.5845] [5.0546]
House age    [ ] [-3.37] [ ]
Distance    [ ] [-.056] [-.0934]
Access    [3.3526] [3.303] [4.047]
Tax rate    [ ] [ ] [ ]
Pupil/teacher    [-.705] [ ] [ ]
Black pop    [4.6565] [5.2782] [8.0644]
Lower class    [ ] [ ] [-6.392]
W dep    [8.565] [7.3028] [2.9062]
W Crime    [-.0767] [ ] [ ]
W Zoning    [0.8973] [0.8668] [0.9845]
W Industry    [ ] [-0.25] [0.4285]
W Charles River    [0.5753] [.085] [.0722]
W Nitric    [0.9544] [0.3507] [0.2944]
W Rooms    [-3.0] [-2.696] [ ]
W House age    [4.2999] [4.3658] [5.0684]
W Distance    [-0.360] [ ] [-0.20]
W Access    [-.3834] [-.2866] [-.202]
W Tax rate    [.789] [.5522] [.095]
W Pupil/teacher    [ ] [-0.546] [ ]
W Black pop    [ ] [ ] [ ]
W Lower class    [2.653] [0.7366] [-0.959]
Breusch-Pagan LM    [0.0000]
Breusch-Pagan-Koenker LM    [0.0000]
Jarque-Bera Test    [0.00]
Wald Test-SARAR(1,1)    [0.0000] [0.0000]
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics.

Table 2.7: Hedonic Housing-Price Equation: SARAR(0,1)
Columns: MLE, RGMME, Bayesian
Crime    [ ] [-5.768] [-5.232]
Zoning    [.7685] [2.5402] [.9732]
Industry    [-0.20] [-0.738] [0.0486]
Charles River    [0.2208] [0.576] [0.2207]
Nitric    [ ] [ ] [ ]
Rooms    [9.4200] [4.2762] [8.6843]
House age    [ ] [-.394] [-2.952]
Distance    [-2.376] [ ] [ ]
Access
Tax rate    [-4.286] [-4.292] [ ]
Pupil/teacher    [ ] [-4.078] [ ]
Black pop    [5.6742] [5.2849] [5.3545]
Lower class    [-.0588] [-6.669] [ ]
W u    [28.632] [2.40] [4.683]
Common factor test    [0.0000] [0.358]
Breusch-Pagan LM    [0.0000]
Breusch-Pagan-Koenker LM    [0.0000]
Jarque-Bera Test    [0.00]
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics.

Table 2.8: Hedonic Housing-Price Equation: SARAR(1,1)
Columns: MLE, RGMME, Bayesian
Crime    [-7.079] [-5.606] [-.9887]
Zoning    [2.0052] [2.5650] [0.677]
Industry    [0.424] [0.2723] [-0.080]
Charles River    [0.253] [0.48] [-0.039]
Nitric    [ ] [-.9599] [ ]
Rooms    [9.5035] [4.777] [4.4940]
House age    [-2.442] [-2.078] [-.2533]
Distance    [-3.732] [ ] [-.2558]
Access    [5.483] [3.8237] [.5656]
Tax rate    [-4.460] [ ] [-.5796]
Pupil/teacher    [-3.536] [ ] [-.05]
Black pop    [5.3796] [4.604] [2.3654]
Lower class    [ ] [-6.746] [ ]
W dep    [4.9553] [3.592] [4.962]
W u    [5.3025] [5.8408] [.8537]
Breusch-Pagan LM    [0.0000]
Breusch-Pagan-Koenker LM    [0.0000]
Jarque-Bera Test    [0.000]
σ², ln L, R², N
Note: The brackets contain the t-statistics for the parameter estimates and p-values for the test statistics.

The effects of the spatial lags of the covariates in the SDM are presented in Table 2.6. The significant Wald test statistics for the case of the MLE and the RGMME suggest the need for the inclusion of the spatial lags of the exogenous variables. In comparison with the SARAR(1,0) specification, the posterior odds ratio is in favor of the SDM. Again, most of the estimates are similar, and the MLE assigns a relatively higher estimate to the autoregressive parameter (W dep). The BE estimate is lower than the MLE and RGMME estimates.

Table 2.7 reports the estimation results for the case of SARAR(0,1), including the autoregressive parameter estimates for the MLE, the RGMME and the BE. The relatively higher robust LM test statistic in the OLS column of Table 2.5 indicates that the SEM model is more likely to explain the data. The simulation results in Table 2.13 show that all estimators underestimate the autoregressive parameter when there is positive spatial dependence. Since the amount of bias is relatively smaller for the RGMME, its estimate is likely to be much closer to the truth.

For the SARAR(1,1) specification, the estimation results are reported in Table 2.8. For this specification, all estimators report similar estimates for the autoregressive parameters. Regarding the explanatory variables, the Bayesian estimates have smaller t-statistics, sometimes rendering the slope estimates statistically insignificant. For example, the pollution variable, denoted by Nitric, is no longer significant in the case of the Bayesian estimator.

Next, we investigate which model best explains the sample data for this application. When we compare the first three specifications, SARAR(1,0), SDM and SARAR(0,1), the classical tests, including the likelihood ratio and Wald tests for the common factor restrictions in Table 2.7, do not agree for this application. The Bayesian posterior odds ratio is in favor of the SDM in every case.
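The common factor restrictions tested above can be made explicit with a short derivation. The symbols $\theta$ for the coefficients on the spatially lagged covariates and $I_n$ for the identity matrix are introduced here for illustration, and a single weight matrix $W_n = M_n$ is assumed, as in the Monte Carlo design.

```latex
% SARAR(0,1): spatial dependence only in the disturbance term
Y_n = X_n \beta + u_n, \qquad u_n = \rho W_n u_n + \varepsilon_n .

% Premultiply the outcome equation by (I_n - \rho W_n):
(I_n - \rho W_n) Y_n = (I_n - \rho W_n) X_n \beta + \varepsilon_n
\;\Longrightarrow\;
Y_n = \rho W_n Y_n + X_n \beta - \rho W_n X_n \beta + \varepsilon_n .

% Compare with the SDM:
Y_n = \lambda W_n Y_n + X_n \beta + W_n X_n \theta + \varepsilon_n .

% The two coincide under the common factor restrictions
\lambda = \rho, \qquad \theta = -\lambda \beta .
```

The SARAR(0,1) model is therefore the SDM with $\theta = -\lambda \beta$ imposed, one restriction per covariate, which is what the Wald and likelihood ratio common factor tests examine.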
As for the comparison of the SDM and SARAR(1,1) specifications, we suggest that the SDM should be adopted for two main reasons: (i) the spatial lags of the exogenous variables are jointly significant, as indicated by the Wald test statistics in Table 2.6, and (ii) even if the true data generating process for this application is the SARAR(1,1) model, the parameter estimates from the SDM will still be consistent but inefficient. On the other hand, in the case where the true data generating process is the SDM, the parameter estimates based on the SARAR(1,1) specification will suffer from omitted variable bias.[37]

Our Monte Carlo results in Table 2.10 characterize the properties of the estimators for the case of the SARAR(1,0) for a sample size of 500. Since the SDM is a simple extension of the SARAR(1,0) specification obtained by including the spatial lags of the exogenous variables, the simulation results reported in Table 2.10 are also valid for the SDM. In terms of average absolute bias, the RGMME performs better than the ML and Bayesian estimators.[38] Therefore, we conclude that the estimates in the RGMME column of Table 2.6 are the most likely to be close to the truth.

The notable differences in statistical inference have important policy implications for this application. For example, the Zoning variable is significant in the RGMME column, but it is insignificant in the Bayesian column. The Distance variable is significant in the MLE column, but it is insignificant in both the RGMME and Bayesian columns. A reverse situation is present for the Pupil/teacher variable.

[37] LeSage (1999) compares SARAR(1,0), SARAR(0,1) and SARAR(1,1) for this application. He suggests that the SARAR(1,1) model explains the data better. He did not compare the SARAR(1,1) with the SDM.

[38] The average absolute bias of the estimators in Table 2.10 is as follows: (i) 0.65% for the RGMME, (ii) 5.95% for the MLE, and (iii) 5.75% for the Bayesian estimator.
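The summary statistics used to rank the estimators (mean, bias, standard deviation, RMSE and average absolute bias) can be computed from a set of Monte Carlo estimates as in the following minimal sketch. The draws are simulated placeholders, not the dissertation's results, and the function names are hypothetical. How the percentage in footnote 38 is normalized is not stated here, so the helper simply averages the absolute biases.

```python
import math
import random


def summarize(draws, true_value):
    """Return (mean, bias, std, rmse) for Monte Carlo estimates of one parameter."""
    n = len(draws)
    mean = sum(draws) / n
    bias = mean - true_value
    # Divide by n (not n - 1) so that rmse**2 == bias**2 + std**2 holds exactly.
    std = math.sqrt(sum((d - mean) ** 2 for d in draws) / n)
    rmse = math.sqrt(sum((d - true_value) ** 2 for d in draws) / n)
    return mean, bias, std, rmse


def average_absolute_bias(biases):
    """Average of |bias| over a collection of parameters or designs."""
    return sum(abs(b) for b in biases) / len(biases)


# Hypothetical example: 1000 simulated estimates of an autoregressive
# parameter whose true value is 0.3 (placeholder numbers only).
rng = random.Random(0)
true_lambda = 0.3
draws = [rng.gauss(0.25, 0.09) for _ in range(1000)]

mean, bias, std, rmse = summarize(draws, true_lambda)
print(f"mean={mean:.3f} bias={bias:.3f} std={std:.3f} rmse={rmse:.3f}")
print(f"average absolute bias={average_absolute_bias([bias, -0.01, 0.02]):.3f}")
```

Because the RMSE combines bias and dispersion, an estimator with a small bias but a large standard deviation can still have a larger RMSE than a biased but tightly concentrated one, which is why bias and efficiency are reported separately.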

Figure 2.8: Posterior distribution of autoregressive parameters. Panels: (a) Posterior distribution of λ: SARAR(1,0); (b) Posterior distribution of λ: SDM; (c) Posterior distribution of ρ: SARAR(0,1); (d) Posterior distribution of λ: SARAR(1,1). Each panel plots the posterior densities (vertical axis: Posterior Distributions; horizontal axis: λ values) for r = 4, 5, 6, 7 and 100, with the posterior mean for each value of r given in the legend.

As a final exercise, we investigate the effect of the hyper-parameter r on the estimates of the autoregressive parameter for each specification. Figure 2.8 contains the nonparametric density estimates of the autoregressive parameter based on the draws from the Metropolis-within-Gibbs sampler. The density plots in Figure 2.8 indicate that the posterior distributions of λ are in agreement for r = 4, 5, 6, 7. On the other hand, the resulting distributions for r = 100 deviate significantly from the other distributions, with higher posterior means.
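One way to see why r = 100 behaves differently from r = 4, ..., 7 is to look at the prior that r controls. The sketch below assumes a LeSage-type specification in which the observation-specific variance scalars v_i satisfy r / v_i ~ chi-square(r) independently; that prior is an assumption made here for illustration and is not restated in this section. The function names are hypothetical.

```python
import random


def draw_variance_scalars(r, n, rng):
    """Draw n prior values of v_i under the assumed prior r / v_i ~ chi-square(r)."""
    # A chi-square(r) variate is a Gamma variate with shape r/2 and scale 2.
    return [r / rng.gammavariate(r / 2.0, 2.0) for _ in range(n)]


def share_near_one(draws, lo=0.8, hi=1.25):
    """Fraction of draws inside [lo, hi], i.e. close to the homoskedastic value 1."""
    return sum(hi >= v >= lo for v in draws) / len(draws)


rng = random.Random(42)
for r in (4, 5, 6, 7, 100):
    v = draw_variance_scalars(r, 20000, rng)
    print(f"r={r:3d}  share of v_i in [0.80, 1.25] = {share_near_one(v):.3f}")
```

For small r the prior puts substantial mass on variance scalars far from one, which allows individual observations to be treated as high-variance outliers. For large r the prior concentrates near one, which is the sense in which a large r corresponds to an (almost) homoskedastic model.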

The percentage deviations in the resulting posterior means between the case of r = 100 and r = 5 are given as 3% for SARAR(1,0), 6% for SDM, 2% for SARAR(0,1) and % for SARAR(1,1). Thus, the estimates in the case of SARAR(1,0) and SARAR(0,1) are relatively more sensitive to heteroskedasticity in this application. The relatively higher value of r = 100 is associated with the homoskedastic assumption, so the estimates based on this value should be in close agreement with those of the MLE. Our Monte Carlo results show that the Bayesian estimates are mostly lower than the MLE estimates, which in turn implies that a substantial amount of robustification is taking place in the case of the Bayesian estimator.

2.7 Conclusion

This study examines the effect of unknown forms of heteroskedasticity on spatial models that have a spatial autoregressive process in the dependent variable and/or the disturbance term. The GMM and Bayesian estimation methods have the advantage that the estimators formulated from these frameworks can be robust to an unknown form of heteroskedasticity. In the GMM framework, the unknown heteroskedasticity is incorporated into estimation through the formulation of appropriate moment functions. The quadratic moment functions considered for the estimation are designed in such a way that the orthogonality condition of the moment functions is not violated in the presence of heteroskedastic disturbance terms. In the Bayesian framework, the specification of the heteroskedasticity is assumed to have two components: (i) a constant component and (ii) a component that varies across observations. The component that varies over observations is assumed to be i.i.d. to facilitate the estimation. The mean of the resulting posterior conditional distribution of the parameters of the exogenous variables indicates that the Bayesian estimator mimics the generalized least squares estimator (GLSE), where the aberrant observations are automatically downweighted in the estimation.
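The orthogonality argument for the quadratic moment functions can be illustrated numerically. For independent, zero-mean disturbances with variances σ_i², the expectation of ε'Pε is the sum of p_ii σ_i², so it vanishes for every variance pattern only when P has a zero diagonal. The sketch below compares a matrix with zero trace to one with zero diagonal; the weight matrix, the variances, and all names are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expected_quadratic_form(p, sigma2):
    """E[eps' P eps] for independent zero-mean eps with Var(eps_i) = sigma2[i].

    Cross terms E[eps_i * eps_j] vanish for i != j, so only the diagonal of P matters.
    """
    return sum(p[i][i] * sigma2[i] for i in range(len(p)))


# Hypothetical row-normalized weight matrix for three units on a line.
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
n = len(W)
W2 = matmul(W, W)
trace_w2 = sum(W2[i][i] for i in range(n))

# Zero trace but non-zero diagonal: W^2 - (tr(W^2) / n) * I.
P_TRACE_ZERO = [[W2[i][j] - (trace_w2 / n if i == j else 0.0) for j in range(n)]
                for i in range(n)]
# Zero diagonal: W^2 - Diag(W^2).
P_ZERO_DIAG = [[0.0 if i == j else W2[i][j] for j in range(n)] for i in range(n)]

HOMOSKEDASTIC = [1.0, 1.0, 1.0]
HETEROSKEDASTIC = [1.0, 4.0, 1.0]

for name, p in (("zero trace", P_TRACE_ZERO), ("zero diagonal", P_ZERO_DIAG)):
    print(name,
          expected_quadratic_form(p, HOMOSKEDASTIC),
          expected_quadratic_form(p, HETEROSKEDASTIC))
```

In this example the zero-trace matrix gives a valid moment condition only when the variances are equal, while the zero-diagonal matrix gives a zero expectation under both variance patterns.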
To compare the finite sample properties of the heteroskedasticity robust estimators, a Monte Carlo simulation is designed for the case of SARAR(1,0), SARAR(0,1) and SARAR(1,1). For the SARAR(1,0) specification, the simulation results show that the MLE of the autoregressive parameter exhibits significant bias and that the bias does not decrease as the sample size increases. The Bayesian estimator (BE) of the spatial autoregressive parameter exhibits significant bias when there exists strong negative spatial dependence. The RGMME exhibits only trivial bias for all parameters and all sample sizes. For the case of SARAR(0,1), the simulation results repeat the same pattern for the spatial autoregressive parameter. Namely, the MLE exhibits significant bias in all sample sizes, and the BE exhibits significant bias for the cases where there exists strong spatial dependence. The analytical results in Sections 3 and 4 show that the BE may be more efficient as it mimics the GLSE. The finite sample efficiency measured with RMSEs indicates that the BE is more efficient for the case of SARAR(0,1). The simulation results for the case of SARAR(1,1) indicate that when the sample size is small (N = 100), all estimators of the autoregressive parameters exhibit significant bias. As the sample size increases, the amount of bias decreases only in the case of the RGMME. Despite this general pattern, for the cases where there is strong positive or negative spatial dependence, the RGMME occasionally exhibits a substantial amount of bias in the autoregressive parameters. In terms of finite sample efficiency, the RGMME performs better in all cases.

In the final section, we present two empirical illustrations to show how the heteroskedasticity robust estimators perform in applied research. In the first empirical application, the conditional convergence hypothesis is tested through a spatially augmented Solow growth model for 9 countries. Although all estimators report similar convergence rates, the estimates of the autoregressive parameters show substantial variation. For the SDM and SARAR(1,1) specifications, the estimates of the autoregressive parameters do not agree. For the SARAR(1,1) specification, the MLE and the BE report implausible estimates for the autoregressive parameters.
In the second application, a housing price equation is estimated for a sample of 506 observations. We show that the SDM is the most likely candidate that explains the sample data. The estimation results from this specification indicate that the estimate for the autoregressive parameter varies across the estimators. In addition, the estimates of some exogenous variables are not in agreement, which in turn implies a different set of inferences across estimators. For both empirical applications, we suggest that the set of estimates reported in the RGMME columns is likely to be the most accurate, since the Monte Carlo results indicate that the RGMME has the smallest average absolute bias.

Overall, we show that the estimates of the spatial autoregressive parameters can vary significantly across the estimators. The estimates of the autoregressive parameters play an important role in the estimation of the direct and indirect marginal effects for the interpretation of the partial effects of covariates. The effect of heteroskedasticity on these impact estimates can be explored in future studies.
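To make concrete why the autoregressive estimate matters for the marginal effects, the sketch below computes average direct, indirect and total impacts for a model with a spatial lag in the dependent variable only, using the commonly used definition based on the matrix of partial derivatives (I - λW)^{-1} β_k. Everything here is an illustrative assumption: the weight matrix, the two values of λ, the coefficient, and the function names are hypothetical, and the inverse is approximated by a truncated power series, which is valid when W is row-normalized and |λ| is below one.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spatial_multiplier(w, lam, terms=200):
    """Approximate (I - lam * W)^{-1} by the sum of lam^k W^k for k = 0..terms."""
    n = len(w)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in total]
    for _ in range(terms):
        # After this step, power holds lam^k W^k for the next k.
        power = [[lam * x for x in row] for row in matmul(power, w)]
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total


def impact_measures(w, lam, beta_k, terms=200):
    """Return (direct, indirect, total) average impacts of covariate k."""
    s = spatial_multiplier(w, lam, terms)
    n = len(w)
    direct = beta_k * sum(s[i][i] for i in range(n)) / n   # average own-unit effect
    total = beta_k * sum(sum(row) for row in s) / n        # average row sum
    return direct, total - direct, total


# Hypothetical row-normalized weight matrix and two hypothetical estimates of lambda.
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
for lam in (0.4, 0.5):
    direct, indirect, total = impact_measures(W, lam, beta_k=1.0)
    print(f"lambda={lam}: direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

With a row-normalized W the average total impact equals β_k / (1 - λ), so even a moderate difference in the estimate of λ across estimators changes the implied total and indirect effects noticeably.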

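2.8 Appendix

2.8.1 Some Useful Lemmas

Lemma 2.1 (Completion of Square). Let $A$ and $B$ be positive definite matrices. Define $C = (A^{-1} + B^{-1})^{-1}$ and $D = A + B$. Let $x$, $\alpha$ and $\gamma$ be vectors. Then, there exists a vector $\mu$ such that

$$(x-\alpha)'A^{-1}(x-\alpha) + (x-\gamma)'B^{-1}(x-\gamma) = (x-\mu)'C^{-1}(x-\mu) + (\alpha-\gamma)'D^{-1}(\alpha-\gamma), \tag{2.68}$$

where $\mu = C(A^{-1}\alpha + B^{-1}\gamma)$.

Proof. See Abadir and Magnus (2005, Exercise 8.5, p. 26).

Lemma 2.2. Let $A_n$, $B_n$ and $C_n$ be $n \times n$ matrices with $ij$th elements respectively denoted by $a_{n,ij}$, $b_{n,ij}$ and $c_{n,ij}$. Assume that $A_n$ and $B_n$ have zero diagonal elements, and $C_n$ has uniformly bounded row and column sums in absolute value. Assume that $\varepsilon_n$ satisfies Assumption 1 with covariance matrix denoted by $\Sigma_n = \mathrm{Diag}\{\sigma_{n1}^2, \ldots, \sigma_{nn}^2\}$. Then,

(1) $$E\left(\varepsilon_n' A_n \varepsilon_n \cdot \varepsilon_n' B_n \varepsilon_n\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{n,ij}\left(b_{n,ij} + b_{n,ji}\right)\sigma_{ni}^2\sigma_{nj}^2 = \mathrm{tr}\left[\Sigma_n A_n\left(B_n'\Sigma_n + \Sigma_n B_n\right)\right].$$

(2) $$E\left(\varepsilon_n' C_n \varepsilon_n\right)^2 = \sum_{i=1}^{n} c_{n,ii}^2\left[E(\varepsilon_{ni}^4) - 3\sigma_{ni}^4\right] + \left(\sum_{i=1}^{n} c_{n,ii}\sigma_{ni}^2\right)^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} c_{n,ij}\left(c_{n,ij} + c_{n,ji}\right)\sigma_{ni}^2\sigma_{nj}^2$$
$$= \sum_{i=1}^{n} c_{n,ii}^2\left[E(\varepsilon_{ni}^4) - 3\sigma_{ni}^4\right] + \mathrm{tr}^2\left(\Sigma_n C_n\right) + \mathrm{tr}\left(\Sigma_n C_n C_n'\Sigma_n + \Sigma_n C_n \Sigma_n C_n\right).$$

(3) $$\mathrm{var}\left(\varepsilon_n' C_n \varepsilon_n\right) = \sum_{i=1}^{n} c_{n,ii}^2\left[E(\varepsilon_{ni}^4) - 3\sigma_{ni}^4\right] + \sum_{i=1}^{n}\sum_{j=1}^{n} c_{n,ij}\left(c_{n,ij} + c_{n,ji}\right)\sigma_{ni}^2\sigma_{nj}^2$$
$$= \sum_{i=1}^{n} c_{n,ii}^2\left[E(\varepsilon_{ni}^4) - 3\sigma_{ni}^4\right] + \mathrm{tr}\left(\Sigma_n C_n C_n'\Sigma_n + \Sigma_n C_n \Sigma_n C_n\right).$$

Proof. See Lin and Lee (2010).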
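The completion-of-square identity in Lemma 2.1 can be verified by direct expansion. The following is a sketch of that check rather than the cited proof, and it assumes the reading $C = (A^{-1} + B^{-1})^{-1}$, $D = A + B$ and $\mu = C(A^{-1}\alpha + B^{-1}\gamma)$, with $A$ and $B$ symmetric.

```latex
\begin{align*}
(x-\alpha)'A^{-1}(x-\alpha) + (x-\gamma)'B^{-1}(x-\gamma)
  &= x'C^{-1}x - 2x'(A^{-1}\alpha + B^{-1}\gamma)
     + \alpha'A^{-1}\alpha + \gamma'B^{-1}\gamma \\
  &= (x-\mu)'C^{-1}(x-\mu)
     + \alpha'A^{-1}\alpha + \gamma'B^{-1}\gamma - \mu'C^{-1}\mu ,
\end{align*}
% since C^{-1}\mu = A^{-1}\alpha + B^{-1}\gamma. Next, C = A D^{-1} B = B D^{-1} A gives
\begin{align*}
A^{-1} - A^{-1} C A^{-1} &= D^{-1}(D - B)A^{-1} = D^{-1}, \\
B^{-1} - B^{-1} C B^{-1} &= D^{-1}(D - A)B^{-1} = D^{-1}, \\
A^{-1} C B^{-1} &= D^{-1},
\end{align*}
% so the remainder collapses to the second quadratic form:
\begin{align*}
\alpha'A^{-1}\alpha + \gamma'B^{-1}\gamma - \mu'C^{-1}\mu
  &= \alpha'D^{-1}\alpha - 2\alpha'D^{-1}\gamma + \gamma'D^{-1}\gamma
   = (\alpha-\gamma)'D^{-1}(\alpha-\gamma).
\end{align*}
```

In the scalar case with $A = a$ and $B = b$, this reduces to $\alpha^2/a + \gamma^2/b - (\alpha b + \gamma a)^2 / (ab(a+b)) = (\alpha-\gamma)^2/(a+b)$.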

147 2.8.2 Mote Carlo Simulatio Results Mote Carlo Simulatio Results for SARAR,0 Table 2.9: N = 00 λ MLE λ RGMME λ BE λ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [0.265]0.5[0.305] -0.95[-0.05]0.396[0.397] -0.66[0.284]0.5[0.322] [0.22]0.59[0.265] -0.6[-0.0]0.556[0.556] [0.228]0.3[0.263] [0.056]0.87[0.95] [0.065]0.662[0.666] [0.063]0.62[0.74] [-0.052]0.208[0.24] 0.080[0.080]0.58[0.586] [-0.060]0.9[0.200] [-0.2]0.96[0.230] 0.37[0.07]0.457[0.462] 0.65[-0.35]0.87[0.230] [-0.]0.58[0.93] 0.653[0.053]0.296[0.30] 0.48[-0.9]0.5[0.93] [-0.035]0.044[0.056] 0.928[0.028]0.086[0.09] 0.86[-0.039]0.042[0.058] β,mle β,rgmme β,be [0.024]0.366[0.366] 0.992[-0.008]0.368[0.368].028[0.028]0.346[0.347] [0.008]0.368[0.368] 0.985[-0.05]0.37[0.372].03[0.03]0.347[0.347] [-0.00]0.377[0.378] 0.985[-0.05]0.389[0.390] 0.99[-0.009]0.343[0.344] [-0.022]0.378[0.378] 0.976[-0.024]0.383[0.383] 0.98[-0.09]0.352[0.352] [0.007]0.367[0.367].003[0.003]0.374[0.374].006[0.006]0.348[0.348] [0.025]0.373[0.374].06[0.06]0.376[0.376].027[0.027]0.350[0.35] [-0.007]0.375[0.375] 0.959[-0.04]0.385[0.388].00[0.00]0.347[0.347] β 2,MLE β 2,RGMME β 2,BE [0.044]0.382[0.384].02[0.02]0.376[0.377].053[0.053]0.363[0.367] [0.005]0.37[0.37] 0.984[-0.06]0.370[0.370].000[0.000]0.339[0.339] [0.005]0.380[0.380] 0.998[-0.002]0.385[0.385].004[0.004]0.357[0.357] 0.000[-0.000]0.355[0.355] 0.997[-0.003]0.362[0.362] 0.999[-0.00]0.329[0.329] [0.006]0.38[0.38] 0.999[-0.00]0.387[0.387].003[0.003]0.356[0.356] [0.003]0.38[0.38] 0.986[-0.04]0.390[0.390].005[0.005]0.348[0.348] [0.029]0.377[0.378] 0.998[-0.002]0.38[0.38].035[0.035]0.355[0.357] 34

148 Table 2.0: N = 500 λ MLE λ RGMME λ BE λ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [0.274]0.4[0.297] [0.002]0.20[0.20] [0.27]0.098[0.288] [0.50]0.20[0.92] [0.003]0.88[0.88] -0.46[0.39]0.8[0.82] [0.053]0.24[0.35] [0.00]0.69[0.69] [0.044]0.24[0.32] [-0.07]0.0[0.] [-0.000]0.8[0.8] [-0.022]0.09[0.] [-0.048]0.084[0.097] 0.296[-0.004]0.090[0.090] 0.249[-0.05]0.085[0.099] [-0.049]0.060[0.078] 0.602[0.002]0.085[0.085] 0.549[-0.05]0.059[0.078] [-0.06]0.06[0.023] 0.900[0.000]0.02[0.02] 0.882[-0.08]0.06[0.024] β,mle β,rgmme β,be [0.025]0.66[0.68].004[0.004]0.65[0.65].026[0.026]0.55[0.57] [0.04]0.73[0.73].002[0.002]0.7[0.7].05[0.05]0.59[0.60] [0.008]0.67[0.67].003[0.003]0.66[0.66].005[0.005]0.54[0.54] [-0.00]0.69[0.69] 0.986[-0.04]0.69[0.69] 0.992[-0.008]0.58[0.58] [0.000]0.69[0.69] 0.997[-0.003]0.68[0.68] 0.999[-0.00]0.57[0.57] [0.006]0.7[0.7] 0.999[-0.00]0.70[0.70].008[0.008]0.60[0.60] [0.003]0.72[0.72] 0.989[-0.0]0.7[0.7].007[0.007]0.63[0.63] β 2,MLE β 2,RGMME β 2,BE [0.00]0.70[0.70] 0.989[-0.0]0.68[0.68].03[0.03]0.58[0.59] [-0.002]0.66[0.66] 0.988[-0.02]0.65[0.66].000[0.000]0.57[0.57] [-0.004]0.68[0.68] 0.990[-0.00]0.68[0.68].000[0.000]0.58[0.58] [-0.002]0.73[0.73] 0.994[-0.006]0.74[0.74] 0.997[-0.003]0.59[0.59] [-0.004]0.67[0.67] 0.992[-0.008]0.67[0.67] 0.999[-0.00]0.56[0.56] [-0.004]0.7[0.7] 0.988[-0.02]0.70[0.70] 0.999[-0.00]0.59[0.59] [-0.00]0.70[0.70] 0.986[-0.04]0.68[0.69].004[0.004]0.60[0.60] 35

149 Table 2.: N = 000 λ MLE λ RGMME λ BE λ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [0.263]0.084[0.276] [-0.004]0.40[0.4] [0.237]0.08[0.25] [0.62]0.083[0.82] [-0.002]0.33[0.33] -0.45[0.49]0.085[0.7] [0.06]0.082[0.02] [-0.002]0.4[0.4] [0.054]0.08[0.098] [-0.008]0.075[0.076] [-0.004]0.09[0.09] -0.0[-0.0]0.076[0.076] [-0.044]0.062[0.076] 0.298[-0.002]0.067[0.067] 0.257[-0.043]0.062[0.075] [-0.047]0.039[0.06] 0.598[-0.002]0.038[0.038] 0.554[-0.046]0.038[0.060] [-0.07]0.0[0.020] 0.899[-0.00]0.00[0.00] 0.883[-0.07]0.0[0.02] β,mle β,rgmme β,be [0.05]0.2[0.22] 0.999[-0.00]0.20[0.20].06[0.06]0.09[0.0] [0.002]0.2[0.2] 0.993[-0.007]0.2[0.2].002[0.002]0.0[0.0] [-0.002]0.6[0.6] 0.995[-0.005]0.6[0.6] 0.998[-0.002]0.03[0.03] 0.003[0.003]0.[0.].00[0.00]0.[0.].003[0.003]0.00[0.00] [0.007]0.7[0.7].003[0.003]0.7[0.7].006[0.006]0.04[0.04] [0.009]0.3[0.3].00[0.00]0.3[0.3].007[0.007]0.02[0.02] [0.009]0.5[0.5] 0.995[-0.005]0.4[0.4].02[0.02]0.04[0.05] β 2,MLE β 2,RGMME β 2,BE [0.04]0.5[0.6] 0.997[-0.003]0.3[0.3].07[0.07]0.03[0.05] [0.0]0.7[0.7].00[0.00]0.6[0.6].0[0.0]0.04[0.05] [-0.004]0.0[0.0] 0.993[-0.007]0.09[0.09] 0.997[-0.003]0.098[0.098] [-0.002]0.22[0.22] 0.996[-0.004]0.22[0.22] 0.998[-0.002]0.07[0.07] [0.004]0.5[0.5].00[0.00]0.5[0.5].004[0.004]0.0[0.0] [0.003]0.3[0.3] 0.996[-0.004]0.2[0.2].006[0.006]0.02[0.03] 0.9.0[0.0]0.3[0.3] 0.998[-0.002]0.2[0.2].03[0.03]0.04[0.04] 36

150 Mote Carlo Simulatio Results for SARAR0, Table 2.2: N = 00 ρ MLE ρ RGMME ρ BE λ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [0.402]0.43[0.427] -.69[-0.269]0.853[0.894] -0.40[0.499]0.07[0.5] [0.238]0.64[0.289] [-0.203]0.685[0.74] [0.302]0.2[0.325] [0.066]0.206[0.27] -0.49[-0.9]0.598[0.628] [0.098]0.56[0.84] [-0.059]0.26[0.224] -0.46[-0.46]0.428[0.452] [-0.066]0.7[0.84] [-0.23]0.24[0.247] 0.9[-0.09]0.325[0.342] 0.32[-0.68]0.8[0.247] [-0.2]0.74[0.2] 0.533[-0.067]0.83[0.95] 0.403[-0.97]0.58[0.253] [-0.038]0.056[0.068] 0.892[-0.008]0.064[0.064] 0.798[-0.02]0.068[0.23] β,mle β,rgmme β,be [-0.00]0.37[0.37].005[0.005]0.365[0.365] 0.998[-0.002]0.350[0.350] [-0.05]0.376[0.376] 0.986[-0.04]0.370[0.370] 0.985[-0.05]0.352[0.353] [0.003]0.379[0.379].000[-0.000]0.378[0.378].004[0.004]0.353[0.353] [-0.003]0.365[0.365] 0.997[-0.003]0.366[0.366] 0.997[-0.003]0.345[0.345] [-0.009]0.374[0.374] 0.99[-0.009]0.373[0.373] 0.992[-0.008]0.349[0.349] [-0.009]0.378[0.378] 0.99[-0.009]0.377[0.377] 0.989[-0.0]0.359[0.360] [-0.009]0.364[0.364] 0.99[-0.009]0.363[0.363] 0.994[-0.006]0.334[0.334] β 2,MLE β 2,RGMME β 2,BE [-0.003]0.36[0.36] 0.992[-0.008]0.354[0.354] 0.997[-0.003]0.346[0.346] [0.007]0.356[0.356].007[0.007]0.354[0.354].008[0.008]0.334[0.334] [-0.002]0.378[0.378] 0.997[-0.003]0.377[0.377] 0.996[-0.004]0.354[0.354] [-0.005]0.333[0.333] 0.994[-0.006]0.333[0.334] 0.995[-0.005]0.32[0.32] [0.05]0.373[0.373].07[0.07]0.373[0.373].009[0.009]0.347[0.347] [-0.00]0.358[0.358] 0.990[-0.00]0.357[0.358] 0.988[-0.02]0.338[0.338] [0.009]0.364[0.364].009[0.009]0.364[0.364].007[0.007]0.343[0.343] 37

151 Table 2.3: N = 500 ρ MLE ρ RGMME ρ BE λ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [0.295]0.25[0.32] [-0.057]0.276[0.28] [0.363]0.0[0.377] [0.47]0.28[0.95] -0.66[-0.06]0.23[0.239] [0.9]0.0[0.220] [0.055]0.29[0.40] [-0.032]0.92[0.95] [0.073]0.6[0.37] [-0.02]0.6[0.8] -0.03[-0.03]0.47[0.50] [-0.026]0.05[0.08] [-0.050]0.093[0.06] 0.284[-0.06]0.02[0.03] 0.220[-0.080]0.084[0.6] [-0.052]0.064[0.083] 0.589[-0.0]0.063[0.064] 0.487[-0.3]0.058[0.27] [-0.08]0.08[0.026] 0.897[-0.003]0.06[0.06] 0.829[-0.07]0.025[0.076] β,mle β,rgmme β,be [0.008]0.65[0.65].00[0.00]0.60[0.60].008[0.008]0.56[0.56] [-0.006]0.68[0.68] 0.994[-0.006]0.67[0.67] 0.995[-0.005]0.59[0.59] [-0.00]0.68[0.68].000[-0.000]0.67[0.67] 0.998[-0.002]0.60[0.60] [-0.00]0.72[0.72].000[-0.000]0.72[0.72] 0.999[-0.00]0.62[0.62] [0.00]0.69[0.70].00[0.00]0.69[0.70].00[0.00]0.60[0.6] [-0.00]0.7[0.7] 0.999[-0.00]0.7[0.7] 0.998[-0.002]0.62[0.62] [0.006]0.58[0.58].006[0.006]0.58[0.58].006[0.006]0.49[0.49] β 2,MLE β 2,RGMME β 2,BE [-0.00]0.6[0.6].000[0.000]0.58[0.58].000[-0.000]0.53[0.53] [0.005]0.66[0.66].004[0.004]0.64[0.64].004[0.004]0.55[0.55] [-0.004]0.67[0.67] 0.996[-0.004]0.67[0.67] 0.995[-0.005]0.58[0.58] 0.008[0.008]0.75[0.75].008[0.008]0.75[0.75].008[0.008]0.67[0.67] [-0.009]0.62[0.62] 0.99[-0.009]0.62[0.62] 0.993[-0.007]0.53[0.53] [-0.003]0.68[0.68] 0.997[-0.003]0.67[0.67] 0.999[-0.00]0.59[0.59] [0.003]0.64[0.64].003[0.003]0.64[0.64].000[0.000]0.58[0.58] 38

152 Table 2.4: N = 000 ρ MLE ρ RGMME ρ BE λ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [0.39]0.079[0.328] -0.92[-0.02]0.82[0.83] [0.36]0.074[0.368] [0.8]0.087[0.20] -0.66[-0.06]0.58[0.59] [0.2]0.08[0.226] [0.060]0.088[0.06] [-0.027]0.3[0.34] [0.078]0.08[0.2] [-0.009]0.082[0.082] -0.03[-0.03]0.03[0.04] -0.0[-0.0]0.074[0.075] [-0.048]0.065[0.08] 0.290[-0.00]0.07[0.072] 0.224[-0.076]0.058[0.096] [-0.054]0.044[0.069] 0.592[-0.008]0.042[0.043] 0.486[-0.4]0.039[0.2] [-0.09]0.03[0.023] 0.898[-0.002]0.0[0.0] 0.828[-0.072]0.08[0.074] β,mle β,rgmme β,be [-0.005]0.[0.] 0.995[-0.005]0.08[0.08] 0.995[-0.005]0.02[0.02] [-0.003]0.5[0.5] 0.997[-0.003]0.4[0.4] 0.997[-0.003]0.05[0.05] [0.003]0.3[0.3].002[0.002]0.3[0.3].003[0.003]0.03[0.04] 0.000[0.000]0.3[0.3].000[0.000]0.3[0.3].000[0.000]0.03[0.03] [-0.004]0.3[0.3] 0.996[-0.004]0.3[0.3] 0.997[-0.003]0.03[0.03] [0.003]0.3[0.3].003[0.003]0.2[0.2].003[0.003]0.04[0.04] [0.000]0.06[0.06].000[0.000]0.06[0.06].000[0.000]0.097[0.097] β 2,MLE β 2,RGMME β 2,BE [0.003]0.[0.].003[0.003]0.08[0.08].003[0.003]0.03[0.04] [-0.008]0.09[0.09] 0.992[-0.008]0.09[0.09] 0.993[-0.007]0.00[0.00] [0.003]0.6[0.6].003[0.003]0.6[0.6].004[0.004]0.05[0.06] [-0.004]0.3[0.3] 0.996[-0.004]0.3[0.3] 0.997[-0.003]0.04[0.04] [0.00]0.2[0.2].00[0.00]0.2[0.2].00[0.00]0.03[0.03] [0.005]0.2[0.2].004[0.004]0.2[0.2].004[0.004]0.03[0.03] [0.003]0.[0.].003[0.003]0.[0.].00[0.00]0.04[0.04] 39

153 Mote Carlo Simulatio Results for SARAR, Table 2.5: N = 00 λ MLE λ RGMME λ BE λ ρ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [-0.075]0.947[0.950] [-0.080]0.464[0.47] [0.240]0.0[0.260] [-0.60].460[.469] [-0.025]0.53[0.53] [0.333]0.096[0.347] [-0.7].538[.543] [0.093]0.60[0.609] [0.378]0.09[0.394] [0.096].302[.305] -0.70[0.99]0.562[0.596] [0.48]0.30[0.437] [0.225].32[.54] [0.203]0.62[0.654] [0.492]0.76[0.523] [0.407]0.865[0.956] [0.325]0.803[0.867] [0.622]0.295[0.688] [0.997]0.739[.24] -0.06[0.794].026[.298] 0.084[0.984]0.570[.37] [-0.22].269[.286] -0.77[-0.7]0.597[0.608] -0.52[0.079]0.09[0.20] [-0.235]2.280[2.292] -0.6[-0.0]0.737[0.737] [0.66]0.095[0.9] [-0.090]2.086[2.088] -0.53[0.069]0.750[0.754] [0.225]0.08[0.249] [0.083].923[.925] [0.227]0.907[0.934] [0.305]0.28[0.33] [0.290].742[.766] [0.352]0.978[.040] -0.9[0.409]0.32[0.429] [0.70]0.832[.094] [0.55].47[.257] [0.577]0.48[0.595] [.66]0.476[.259] 0.578[.78]0.863[.460] 0.292[0.892]0.235[0.923] [-0.24].449[.465] [-0.57]0.659[0.677] [-0.09]0.02[0.49] [-0.080]2.8[2.82] [-0.054]0.658[0.660] [-0.039]0.09[0.6] [0.0]2.2[2.2] [0.023]0.825[0.826] [0.028]0.25[0.28] [0.56].82[.827] -0.04[0.96].092[.09] -0.94[0.06]0.28[0.66] [0.298].634[.66] 0.023[0.323].64[.208] [0.226]0.33[0.263] [0.598]0.92[.099] 0.27[0.57].38[.273] 0.099[0.399]0.39[0.423] [0.985]0.250[.06] 0.728[.028]0.782[.29] 0.425[0.725]0.53[0.74] [-0.9].254[.260] -0.23[-0.23]0.740[0.770] [-0.278]0.22[0.303] [0.55].38[.327] -0.08[-0.08]0.793[0.797] -0.20[-0.20]0.24[0.236] [0.87].308[.32] 0.004[0.004]0.764[0.764] -0.33[-0.33]0.32[0.87] [0.5].553[.557] 0.047[0.047]0.95[0.952] [-0.070]0.35[0.52] [0.282].067[.03] 0.263[0.263].49[.79] 0.052[0.052]0.38[0.47] [0.479]0.334[0.584] 0.508[0.508]0.970[.095] 0.27[0.27]0.32[0.254] [0.756]0.300[0.83] 0.837[0.837]0.687[.083] 0.52[0.52]0.34[0.529] [-0.092]0.640[0.647] 0.05[-0.95]0.770[ [-0.389]0.30[0.40] [0.70]0.846[0.863] 0.85[-0.5]0.948[0.954] -0.03[-0.33]0.34[0.340] [0.43]0.96[0.927] 0.245[-0.055]0.977[0.979] 
0.046[-0.254]0.20[0.28] [0.49]0.675[0.69] 0.327[0.027].6[.6] 0.06[-0.94]0.33[0.235] [0.72]0.792[0.8] 0.487[0.87].460[.472] 0.202[-0.098]0.24[0.58] [0.269]0.664[0.76] 0.680[0.380].057[.23] 0.334[0.034]0.23[0.28] [0.529]0.58[0.552] 0.99[0.69]0.923[.] 0.589[0.289]0.33[0.38] [-0.04]0.82[0.89] 0.479[-0.2]0.528[0.542] 0.20[-0.399]0.40[0.423] [0.072]0.394[0.400] 0.560[-0.040]0.562[0.564] 0.239[-0.36]0.26[0.382] [0.074]0.328[0.336] 0.590[-0.00]0.846[0.846] 0.28[-0.39]0.9[0.34] [0.020]0.46[0.46] 0.599[-0.00]0.874[0.874] 0.330[-0.270]0.23[0.296] [0.03]0.633[0.634] 0.762[0.62]0.984[0.998] 0.386[-0.24]0.7[0.244] [0.23]0.223[0.255] 0.865[0.265]0.932[0.969] 0.49[-0.09]0.8[0.6] [0.304]0.06[0.322] 0.997[0.397]0.956[.035] 0.633[0.033]0.86[0.89] [-0.06]0.63[0.64] 0.80[-0.090]0.673[0.679] 0.638[-0.262]0.47[0.300] [-0.024]0.472[0.473] 0.907[0.007]0.569[0.569] 0.638[-0.262]0.2[0.285] [-0.024]0.49[0.5] 0.93[0.03]0.635[0.636] 0.644[-0.256]0.7[0.282] [-0.03]0.45[0.48] 0.945[0.045]0.578[0.580] 0.657[-0.243]0.8[0.270] [-0.022]0.30[0.32].09[0.9]0.98[0.988] 0.666[-0.234]0.3[0.269] [0.03]0.03[0.04].072[0.72].79[.9] 0.678[-0.222]0.8[0.287] [0.082]0.0[0.37] 0.929[0.029].035[.036] 0.432[-0.468]0.605[0.765] 40

154 Table 2.6: N = 500 λ MLE λ RGMME λ BE λ ρ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [-.580]4.626[4.888] [0.0]0.306[0.306] -0.78[0.82]0.060[0.9] [-2.577]4.727[5.384] [0.08]0.299[0.299] -0.65[0.249]0.072[0.259] [-2.485]4.402[5.055] -0.85[0.049]0.3[0.35] [0.33]0.078[0.340] [-2.273]4.265[4.833] [0.098]0.325[0.340] [0.444]0.088[0.453] [-.67]3.668[3.849] -0.72[0.79]0.424[0.460] [0.593]0.09[0.600] [-0.009]2.804[2.804] [0.36]0.740[0.824] [0.826]0.09[0.83] [.238].333[.89] 0.04[.004]0.948[.38] 0.382[.282]0.087[.285] [-0.268]3.669[3.679] [-0.026]0.280[0.28] [-0.003]0.079[0.079] [-0.969]3.926[4.043] -0.65[-0.05]0.272[0.272] [0.07]0.082[0.08] [-.78]3.779[3.958] [0.06]0.293[0.293] -0.44[0.59]0.086[0.8] [-0.744]3.248[3.332] [0.053]0.305[0.309] [0.27]0.087[0.284] [-0.73]2.582[2.588] [0.36]0.47[0.439] -0.79[0.42]0.089[0.430] [0.463].758[.88] [0.364]0.680[0.77] 0.033[0.633]0.085[0.639] [.085]0.904[.42] 0.355[0.955]0.827[.263] 0.459[.059]0.077[.06] [0.445].963[2.02] [-0.035]0.248[0.250] [-0.44]0.084[0.67] [0.044]2.28[2.28] [-0.033]0.262[0.264] [-0.080]0.087[0.8] [-0.82]2.72[2.79] [-0.022]0.27[0.272] [0.008]0.087[0.087] [-0.045].658[.658] [-0.005]0.527[0.527] -0.9[0.09]0.090[0.42] [0.065].77[.78] -0.25[0.085]0.453[0.46] [0.243]0.086[0.258] [0.449]0.985[.083] [0.292]0.599[0.666] 0.55[0.455]0.082[0.462] [0.93]0.27[0.956] 0.535[0.835]0.7[.096] 0.52[0.82]0.078[0.824] [0.463].07[.8] [-0.032]0.206[0.208] [-0.243]0.084[0.257] [0.235]0.780[0.85] [-0.037]0.247[0.249] -0.86[-0.86]0.087[0.205] [0.055]0.755[0.757] [-0.042]0.262[0.266] -0.2[-0.2]0.083[0.40] [-0.03]0.890[0.890] [-0.023]0.269[0.270] [-0.025]0.085[0.089] [0.075]0.977[0.980] 0.055[0.055]0.442[0.445] 0.094[0.094]0.082[0.24] [0.299]0.943[0.989] 0.263[0.263]0.658[0.709] 0.266[0.266]0.074[0.276] [0.74]0.8[0.736] 0.748[0.748]0.609[0.964] 0.587[0.587]0.08[0.593] [0.327]0.448[0.555] 0.274[-0.026]0.47[0.50] 0.00[-0.299]0.08[0.30] [0.088]0.456[0.464] 0.265[-0.035]0.72[0.75] 0.05[-0.249]0.082[0.262] [-0.00]0.368[0.368] 
0.267[-0.033]0.204[0.206] 0.0[-0.99]0.082[0.25] [-0.090]0.63[0.637] 0.273[-0.027]0.348[0.349] 0.65[-0.35]0.080[0.57] [0.009]0.447[0.447] 0.37[0.07]0.438[0.439] 0.263[-0.037]0.075[0.084] [0.78]0.83[0.255] 0.482[0.82]0.603[0.630] 0.392[0.092]0.072[0.6] [0.49]0.54[0.55] 0.860[0.560]0.59[0.84] 0.653[0.353]0.086[0.363] [0.43]0.329[0.359] 0.584[-0.06]0.090[0.092] 0.33[-0.287]0.074[0.296] [-0.009]0.332[0.332] 0.579[-0.02]0.08[0.0] 0.337[-0.263]0.074[0.273] [-0.27]0.253[0.283] 0.568[-0.032]0.30[0.34] 0.362[-0.238]0.075[0.250] [-0.36]0.75[0.222] 0.572[-0.028]0.247[0.249] 0.399[-0.20]0.074[0.24] [-0.067]0.53[0.67] 0.597[-0.003]0.486[0.486] 0.460[-0.40]0.069[0.56] [0.053]0.4[0.5] 0.704[0.04]0.527[0.538] 0.553[-0.047]0.067[0.082] [0.292]0.098[0.308] 0.969[0.369]0.734[0.82] 0.690[0.090]0.87[0.208] [-0.002]0.39[0.39] 0.894[-0.006]0.025[0.026] 0.730[-0.70]0.064[0.82] [-0.042]0.34[0.40] 0.893[-0.007]0.030[0.03] 0.726[-0.74]0.066[0.86] [-0.069]0.07[0.27] 0.897[-0.003]0.49[0.49] 0.726[-0.74]0.068[0.87] [-0.065]0.092[0.3] 0.887[-0.03]0.209[0.20] 0.728[-0.72]0.067[0.85] [-0.047]0.495[0.497] 0.95[0.05]0.563[0.565] 0.735[-0.65]0.080[0.84] [0.0]0.079[0.079].055[0.55]0.834[0.848] 0.730[-0.70]0.87[0.252] [0.064]0.3[0.46] 0.994[0.094].072[.076] 0.298[-0.602]0.888[.073] 4

155 Table 2.7: N = 000 λ MLE λ RGMME λ BE λ ρ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [-0.83].302[.35] [-0.003]0.94[0.94] [0.32]0.052[0.42] [-0.7]2.272[2.380] [0.04]0.208[0.208] [0.2]0.052[0.27] [-0.927]2.47[2.589] [0.025]0.23[0.24] -0.65[0.285]0.064[0.293] [-0.635]2.340[2.424] [0.044]0.20[0.206] [0.364]0.088[0.375] [-0.262]2.084[2.0] [0.07]0.88[0.20] -0.43[0.469]0.23[0.484] [0.87].734[.744] [0.03]0.309[0.326] [0.628]0.98[0.658] [0.900]0.678[.27] [0.460]0.806[0.928] 0.45[.045]0.39[.093] [0.259].353[.377] -0.69[-0.09]0.93[0.94] -0.66[-0.06]0.057[0.059] [0.052].68[.682] [-0.002]0.203[0.203] [0.060]0.06[0.086] [-0.38]2.3[2.37] [0.007]0.26[0.26] [0.43]0.066[0.58] [-0.60].987[.994] [0.02]0.253[0.254] [0.248]0.067[0.257] [0.86].57[.582] [0.072]0.248[0.259] [0.393]0.068[0.399] [0.503].338[.429] [0.38]0.34[0.367] 0.003[0.603]0.072[0.607] [.033]0.648[.220] [0.594]0.784[0.984] 0.42[.02]0.085[.024] [0.460]0.576[0.737] [-0.020]0.7[0.72] [-0.37]0.064[0.5] [0.245]0.603[0.65] [-0.020]0.79[0.80] [-0.068]0.063[0.092] [-0.02].000[.000] -0.37[-0.07]0.20[0.202] [0.0]0.065[0.066] [0.076]0.874[0.877] [0.006]0.29[0.29] -0.87[0.3]0.066[0.3] [0.249]0.65[0.663] [0.043]0.240[0.244] [0.240]0.065[0.248] [0.459]0.526[0.698] -0.57[0.43]0.45[0.439] 0.39[0.439]0.06[0.443] [0.826]0.423[0.928] 0.286[0.586]0.687[0.903] 0.54[0.84]0.062[0.86] [0.343]0.503[0.609] -0.06[-0.06]0.38[0.39] [-0.208]0.065[0.28] [0.08]0.466[0.478] [-0.025]0.53[0.55] -0.59[-0.59]0.064[0.7] [-0.035]0.235[0.237] -0.03[-0.03]0.69[0.69] [-0.092]0.062[0.] 
[0.00]0.070[0.070] -0.03[-0.03]0.9[0.9] -0.0[-0.0]0.059[0.060] [0.6]0.079[0.40] [-0.005]0.239[0.239] 0.096[0.096]0.062[0.4] [0.278]0.404[0.490] 0.090[0.090]0.352[0.363] 0.263[0.263]0.057[0.269] [0.644]0.94[0.673] 0.524[0.524]0.589[0.788] 0.589[0.589]0.06[0.592] [0.248]0.42[0.489] 0.290[-0.00]0.00[0.00] 0.060[-0.240]0.060[0.248] [0.033]0.45[0.47] 0.293[-0.007]0.09[0.09] 0.096[-0.204]0.056[0.22] [-0.44]0.206[0.25] 0.283[-0.07]0.30[0.32] 0.37[-0.63]0.058[0.73] [-0.]0.08[0.37] 0.28[-0.09]0.59[0.60] 0.93[-0.07]0.058[0.22] [-0.04]0.06[0.07] 0.265[-0.035]0.252[0.254] 0.272[-0.028]0.057[0.063] [0.28]0.425[0.444] 0.357[0.057]0.349[0.354] 0.406[0.06]0.053[0.8] [0.444]0.43[0.466] 0.738[0.438]0.534[0.69] 0.660[0.360]0.068[0.366] [0.099]0.34[0.329] 0.59[-0.009]0.058[0.058] 0.378[-0.222]0.054[0.229] [-0.077]0.305[0.35] 0.589[-0.0]0.068[0.069] 0.388[-0.22]0.053[0.28] [-0.90]0.63[0.250] 0.586[-0.04]0.084[0.085] 0.40[-0.90]0.054[0.98] [-0.62]0.078[0.80] 0.579[-0.02]0.02[0.04] 0.440[-0.60]0.05[0.68] [-0.089]0.06[0.39] 0.567[-0.033]0.64[0.68] 0.484[-0.6]0.05[0.27] [0.05]0.4[0.5] 0.596[-0.004]0.320[0.320] 0.569[-0.03]0.053[0.06] [0.247]0.26[0.278] 0.99[0.39]0.526[0.65] 0.704[0.04]0.202[0.228] [0.03]0.20[0.2] 0.898[-0.002]0.05[0.05] 0.796[-0.04]0.040[0.2] [-0.077]0.3[0.52] 0.896[-0.004]0.08[0.09] 0.782[-0.8]0.043[0.25] [-0.093]0.082[0.24] 0.896[-0.004]0.022[0.022] 0.780[-0.20]0.046[0.28] [-0.05]0.066[0.083] 0.894[-0.006]0.03[0.032] 0.779[-0.2]0.047[0.30] [-0.027]0.067[0.073] 0.889[-0.0]0.55[0.56] 0.775[-0.25]0.059[0.38] [-0.008]0.076[0.076] 0.945[0.045]0.43[0.433] 0.772[-0.28]0.6[0.206] [0.029]0.633[0.634].027[0.27]0.826[0.836] 0.302[-0.598]0.903[.084] 42

156 Table 2.8: N = 00 ρ MLE ρ RGMME ρ BE λ ρ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [-2.305]3.37[3.893] -.04[-0.4]0.73[0.744] [0.29]0.0[0.3] [-2.025]3.200[3.787] [-0.245]0.87[0.853] [0.6]0.05[0.57] [-.400]2.72[3.060] -0.68[-0.38]0.676[0.747] -0.49[-0.9]0.25[0.73] [-0.954]2.297[2.488] -0.38[-0.38]0.595[0.674] -0.33[-0.33]0.66[0.355] [-0.732].870[2.008] 0.002[-0.298]0.548[0.624] -0.67[-0.467]0.202[0.509] [-0.704].77[.905] 0.32[-0.279]0.538[0.606] 0.069[-0.53]0.298[0.609] [-.07].885[2.68] 0.38[-0.59]0.706[0.876] 0.263[-0.637]0.628[0.895] [-2.269]3.459[4.37] -.062[-0.62]0.800[0.86] [0.404]0.095[0.45] [-3.229]4.52[5.259] [-0.338]0.948[.006] [0.206]0.06[0.232] [-2.609]3.986[4.764] [-0.306]0.787[0.844] -0.30[-0.00]0.34[0.35] [-2.444]3.963[4.656] [-0.403]0.804[0.899] -0.29[-0.29]0.52[0.267] [-.823]3.34[3.782] -0.08[-0.408]0.693[0.805] [-0.377]0.65[0.4] [-.985]3.34[3.886] 0.38[-0.462]0.702[0.84] 0.49[-0.45]0.77[0.484] [-.986]3.38[3.73] 0.47[-0.753]0.77[.040] 0.539[-0.36]0.83[0.405] [-2.99]3.77[4.365] [-0.099]0.986[0.99] -0.48[0.482]0.04[0.494] [-4.8]4.570[6.52] -0.93[-0.33]0.987[.035] [0.267]0.2[0.293] [-3.920]4.685[6.09] [-0.343]0.856[0.922] [0.042]0.37[0.43] [-3.562]4.574[5.798] [-0.426]0.779[0.888] -0.54[-0.54]0.54[0.28] [-2.850]4.57[5.04] -0.43[-0.443]0.777[0.894] [-0.304]0.62[0.344] [-2.664]3.969[4.780] 0.7[-0.483]0.707[0.857] 0.24[-0.386]0.65[0.420] [-2.55]3.858[4.606] 0.229[-0.67]0.639[0.927] 0.568[-0.332]0.42[0.36] [-2.929]4.337[5.234] [-0.093]0.954[0.959] [0.553]0.7[0.566] [-4.796]5.026[6.947] [-0.290].077[.5] [0.354]0.33[0.378] [-4.640]5.066[6.869] -0.64[-0.34]0.897[0.960] -0.73[0.27]0.4[0.90] [-3.863]4.939[6.270] [-0.350]0.846[0.95] [-0.069]0.56[0.70] [-3.533]4.822[5.978] [-0.376]0.85[0.898] 0.086[-0.24]0.62[0.269] [-3.02]4.440[5.370] 0.42[-0.458]0.668[0.809] 0.280[-0.320]0.53[0.355] [-2.963]4.342[5.256] 0.299[-0.60]0.62[0.865] 0.597[-0.303]0.3[0.330] [-2.706]4.292[ [-0.085].043[.047] -0.24[0.659]0.35[0.672] [-5.780]5.09[7.74] 
[-0.329].63[.209] -0.45[0.455]0.43[0.477] [-5.353]5.422[7.69] [-0.294].00[.39] [0.24]0.48[0.283] [-4.563]5.404[7.073] -0.34[-0.34]0.933[0.984] 0.037[0.037]0.54[0.59] [-4.42]5.299[6.726] [-0.380]0.80[0.886] 0.82[-0.8]0.58[0.98] [-3.646]5.03[6.23] 0.63[-0.437]0.727[0.848] 0.364[-0.236]0.45[0.277] [-3.279]4.73[5.742] 0.357[-0.543]0.726[0.907] 0.639[-0.26]0.37[0.294] [-3.303]4.598[5.66] -.046[-0.46].02[.2] [0.80]0.67[0.88] [-6.045]5.326[8.056] [-0.274].58[.90] 0.03[0.63]0.46[0.630] [-5.979]5.603[8.94] [-0.349].75[.225] 0.092[0.392]0.54[0.42] [-4.785]5.736[7.470] [-0.290].70[.206] 0.98[0.98]0.59[0.254] [-4.320]5.548[7.03] -0.00[-0.30]0.90[0.96] 0.35[0.05]0.5[0.52] [-3.72]5.22[6.4] 0.83[-0.47]0.808[0.909] 0.467[-0.33]0.4[0.94] [-3.82]4.692[5.669] 0.458[-0.442]0.69[0.820] 0.688[-0.22]0.72[0.273] [-4.09]4.698[6.229] -.4[-0.24].205[.224] 0.32[.032]0.270[.067] [-6.559]5.645[8.654] [-0.222].246[.265] 0.295[0.895]0.98[0.97] [-5.905]5.93[8.369] -0.63[-0.33].453[.49] 0.362[0.662]0.92[0.689] [-5.80]5.978[7.90] [-0.350].32[.358] 0.438[0.438]0.90[0.478] [-3.975]5.550[6.826] 0.085[-0.25]0.96[0.94] 0.539[0.239]0.85[0.302] [-3.373]5.060[6.08] 0.372[-0.228]0.739[0.774] 0.623[0.023]0.97[0.98] [-4.459]5.25[6.793] 0.527[-0.373]0.733[0.823] 0.46[-0.484]0.553[0.735] 43

157 Table 2.9: N = 500 ρ MLE ρ RGMME ρ BE λ ρ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [-5.42]5.263[7.555] [-0.050]0.423[0.425] [0.237]0.073[0.248] [-2.855]5.090[5.836] [-0.040]0.324[0.326] [0.038]0.09[0.098] [-0.956]3.496[3.624] [-0.080]0.306[0.36] [-0.43]0.02[0.76] [-0.20].522[.527] [-0.093]0.259[0.275] [-0.268]0.04[0.287] [-0.252]0.973[.005] 0.79[-0.2]0.23[0.26] [-0.364]0.0[0.378] [-0.75].9[2.040] 0.460[-0.40]0.24[0.279] 0.99[-0.40]0.088[0.40] [-.220]2.804[3.058] 0.599[-0.30]0.340[0.454] 0.586[-0.34]0.065[0.32] [-6.897]5.252[8.669] -0.92[-0.02]0.406[0.406] [0.307]0.08[0.38] [-3.67]5.430[6.554] [-0.008]0.34[0.34] [0.5]0.094[0.49] [-.230]3.695[3.894] [-0.055]0.304[0.309] [-0.059]0.00[0.6] [-0.478]2.2[2.66] -0.08[-0.08]0.263[0.276] -0.20[-0.20]0.02[0.225] [-0.435].430[.495] 0.93[-0.07]0.260[0.282] 0.003[-0.297]0.00[0.33] [-0.999]2.557[2.745] 0.443[-0.57]0.237[0.285] 0.243[-0.357]0.085[0.367] [-.52]2.872[3.095] 0.6[-0.289]0.29[0.40] 0.606[-0.294]0.064[0.30] [-7.370]5.365[9.6] [0.008]0.429[0.429] [0.394]0.088[0.404] [-4.380]5.686[7.77] -0.65[-0.05]0.380[0.380] [0.93]0.094[0.25] [-.754]4.275[4.62] [-0.024]0.36[0.37] -0.27[0.029]0.03[0.07] [-0.784]2.807[2.94] [-0.060]0.30[0.307] -0.9[-0.9]0.05[0.58] [-0.625]2.47[2.236] 0.208[-0.092]0.248[0.264] 0.073[-0.227]0.094[0.246] [-0.826]2.390[2.528] 0.448[-0.52]0.23[0.277] 0.297[-0.303]0.078[0.33] [-.435]3.390[3.68] 0.639[-0.26]0.246[0.359] 0.636[-0.264]0.062[0.27] [-7.622]5.589[9.45] [0.03]0.428[0.429] -0.40[0.490]0.092[0.499] [-4.69]5.956[7.537] -0.60[-0.00]0.376[0.376] [0.294]0.02[0.3] [-2.65]4.769[5.237] -0.33[-0.03]0.35[0.35] -0.76[0.24]0.03[0.6] [-0.75]2.9[3.006] [-0.035]0.300[0.302] [-0.025]0.098[0.0] [-0.569]2.230[2.30] 0.22[-0.079]0.275[0.286] 0.5[-0.49]0.090[0.74] [-.27]3.343[3.558] 0.468[-0.32]0.229[0.264] 0.370[-0.230]0.078[0.243] [-.367]3.288[3.56] 0.647[-0.253]0.28[0.334] 0.67[-0.229]0.069[0.239] [-8.05]5.80[9.899] [0.00]0.428[0.428] [0.607]0.00[0.65] [-4.602]6.80[7.706] [0.05]0.390[0.39] 
-0.85[0.45]0.06[0.428] [-2.622]5.326[5.937] -0.34[-0.04]0.37[0.372] [0.245]0.08[0.267] [-0.850]3.35[3.457] -0.09[-0.09]0.327[0.327] 0.090[0.090]0.0[0.35] [-0.49]2.35[2.366] 0.235[-0.065]0.283[0.290] 0.256[-0.044]0.087[0.097] [-.020]3.80[3.340] 0.485[-0.5]0.242[0.268] 0.448[-0.52]0.073[0.68] [-.309]3.63[3.423] 0.69[-0.209]0.209[0.295] 0.70[-0.90]0.073[0.204] [-8.36]6.047[0.37] [0.05]0.448[0.448] -0.25[0.775]0.7[0.784] [-4.976]6.508[8.92] [0.007]0.403[0.403] -0.09[0.58]0.5[0.593] [-2.06]5.84[5.579] [0.08]0.359[0.359] 0.05[0.405]0.0[0.49] [-0.820]3.600[3.692] -0.06[-0.06]0.340[0.340] 0.236[0.236]0.098[0.256] [-0.675]3.029[3.04] 0.255[-0.045]0.307[0.3] 0.38[0.08]0.090[0.2] [-0.789]2.88[2.927] 0.52[-0.079]0.253[0.265] 0.55[-0.049]0.075[0.090] [-.07]2.550[2.780] 0.728[-0.72]0.28[0.278] 0.764[-0.36]0.04[0.7] [-7.697]6.489[0.067] -0.86[0.039]0.444[0.446] 0.76[.076]0.75[.09] [-5.0]6.775[8.486] -0.59[0.009]0.42[0.42] 0.285[0.885]0.60[0.899] [-2.48]5.447[5.855] [0.006]0.375[0.375] 0.390[0.690]0.47[0.706] [-0.702]3.58[3.587] [-0.002]0.367[0.367] 0.503[0.503]0.26[0.59] [-0.624]2.84[2.882] 0.302[0.002]0.366[0.366] 0.67[0.37]0.20[0.339] [-0.89]2.450[2.583] 0.602[0.002]0.303[0.303] 0.75[0.5]0.38[0.80] [-6.049]6.593[8.948] 0.80[-0.090]0.253[0.269] 0.03[-0.887]0.782[.83] 44

158 Table 2.20: N = 000 ρ MLE ρ RGMME ρ BE λ ρ Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] Mea[Bias]Std.[RMSE] [-.397]2.237[2.637] [-0.027]0.246[0.247] -0.70[0.99]0.064[0.209] [-0.696]2.452[2.549] -0.65[-0.05]0.244[0.249] [0.024]0.074[0.078] [0.046]0.873[0.875] [-0.052]0.27[0.223] [-0.24]0.082[0.49] [-0.05]0.366[0.369] [-0.052]0.75[0.83] [-0.226]0.094[0.244] [-0.95]0.289[0.349] 0.247[-0.053]0.3[0.4] 0.02[-0.288]0.094[0.303] [-0.293]0.499[0.579] 0.559[-0.04]0.00[0.08] 0.30[-0.290]0.04[0.308] [-0.34]0.727[0.792] 0.780[-0.20]0.250[0.277] 0.683[-0.27]0.092[0.236] [-2.769]3.683[4.608] -0.90[-0.00]0.264[0.264] -0.6[0.289]0.067[0.296] [-.844]3.74[4.7] [-0.025]0.259[0.260] [0.08]0.077[0.33] [-0.268].67[.693] [-0.037]0.22[0.224] -0.35[-0.05]0.083[0.098] [-0.2]0.243[0.27] [-0.038]0.96[0.200] -0.7[-0.7]0.079[0.89] [-0.268]0.65[0.35] 0.238[-0.062]0.67[0.78] 0.039[-0.26]0.074[0.272] [-0.394]0.7[0.83] 0.538[-0.062]0.28[0.43] 0.294[-0.306]0.06[0.32] [-0.60].634[.744] 0.736[-0.64]0.249[0.298] 0.652[-0.248]0.05[0.253] [-4.02]4.307[5.886] [-0.020]0.260[0.260] [0.36]0.070[0.368] [-2.494]4.500[5.45] [-0.007]0.265[0.265] -0.42[0.88]0.078[0.204] [-0.300].904[ [-0.009]0.227[0.227] [0.034]0.080[0.087] [-0.05]0.309[0.327] [-0.030]0.203[0.206] [-0.095]0.075[0.2] [-0.245]0.508[0.564] 0.243[-0.057]0.76[0.85] 0.099[-0.20]0.070[0.23] [-0.39].027[.099] 0.525[-0.075]0.50[0.68] 0.336[-0.264]0.058[0.270] [-0.497].476[.557] 0.740[-0.60]0.203[0.258] 0.668[-0.232]0.046[0.236] [-4.468]4.746[6.58] [-0.005]0.256[0.256] [0.452]0.079[0.458] [-2.535]4.88[5.444] [0.00]0.260[0.26] -0.37[0.283]0.078[0.293] [-0.475]2.566[2.609] -0.36[-0.06]0.242[0.242] -0.78[0.22]0.079[0.45] [-0.034]0.530[0.53] -0.08[-0.08]0.25[0.25] -0.02[-0.02]0.076[0.077] [-0.47]0.366[0.394] 0.263[-0.037]0.9[0.94] 0.77[-0.23]0.068[0.4] [-0.327].096[.44] 0.533[-0.067]0.55[0.69] 0.398[-0.202]0.057[0.20] [-0.702]2.00[2.25] 0.743[-0.57]0.72[0.233] 0.695[-0.205]0.049[0.2] [-5.286]4.850[7.74] [-0.024]0.278[0.279] [0.546]0.092[0.554] 
[-3.289]5.449[6.365] [-0.008]0.270[0.270] -0.2[0.389]0.088[0.399] [-0.423]2.793[2.825] [0.002]0.239[0.239] [0.233]0.079[0.246] [0.054]0.770[0.772] -0.09[-0.09]0.225[0.226] 0.087[0.087]0.077[0.6] [-0.058]0.349[0.353] 0.289[-0.0]0.204[0.205] 0.270[-0.030]0.067[0.073] [-0.37].363[.400] 0.535[-0.065]0.7[0.83] 0.469[-0.3]0.054[0.42] [-0.623].776[.882] 0.76[-0.39]0.49[0.204] 0.737[-0.63]0.054[0.72] [-5.600]5.209[7.649] -0.92[-0.02]0.272[0.272] -0.23[0.687]0.07[0.696] [-3.238]5.687[6.544] [-0.004]0.274[0.274] [0.530]0.096[0.539] [-0.430]3.80[3.209] [0.007]0.26[0.26] 0.076[0.376]0.09[0.387] [0.76]0.829[0.848] 0.00[0.00]0.232[0.232] 0.223[0.223]0.08[0.238] [0.08]0.578[0.578] 0.289[-0.0]0.223[0.223] 0.389[0.089]0.064[0.0] [-0.68]0.853[0.869] 0.565[-0.035]0.94[0.97] 0.567[-0.033]0.058[0.067] [-0.566].439[.547] 0.790[-0.0]0.53[0.88] 0.790[-0.0]0.073[0.32] [-6.305]5.275[8.22] -0.92[-0.02]0.266[0.266] 0.005[0.905]0.76[0.922] [-3.03]5.924[6.687] [0.007]0.290[0.290] 0.85[0.785]0.44[0.798] [-0.348]3.292[3.30] [0.003]0.263[0.263] 0.35[0.65]0.30[0.629] [0.07].263[.265] -0.0[-0.0]0.249[0.249] 0.452[0.452]0.6[0.467] [-0.024]0.249[0.250] 0.290[-0.00]0.232[0.232] 0.599[0.299]0.0[0.36] [-0.225]0.955[0.98] 0.594[-0.006]0.224[0.224] 0.725[0.25]0.09[0.66] [-.88]3.825[4.235] 0.847[-0.053]0.8[0.88] 0.002[-0.898]0.772[.84] 45

Figures of RMSEs for SARAR(1,1)

Figure 2.9: RMSE of ρ. Surface plots of RMSE over the (ρ, λ) grid. Panels: (a) ρ MLE, N = 100; (b) ρ GMME, N = 100; (c) ρ BE, N = 100; (d) ρ MLE, N = 500; (e) ρ GMME, N = 500; (f) ρ BE, N = 500; (g) ρ MLE, N = 1000; (h) ρ GMME, N = 1000; (i) ρ BE, N = 1000.

Figure 2.10: RMSE of λ. Surface plots of RMSE over the (ρ, λ) grid. Panels: (a) λ MLE, N = 100; (b) λ GMME, N = 100; (c) λ BE, N = 100; (d) λ MLE, N = 500; (e) λ GMME, N = 500; (f) λ BE, N = 500; (g) λ MLE, N = 1000; (h) λ GMME, N = 1000; (i) λ BE, N = 1000.

2.8.3 Variable Definitions and Summary Statistics

This section contains variable definitions and descriptive statistics for the data sets used in the empirical illustrations.

Table 2.21: Conditional Convergence, N = 91
Variables: Definition
ln y_1960: logarithm of real GDP per worker in 1960
Investment rate (s): logarithm of average share of gross investment in GDP
Pop. growth rate: logarithm of average growth of working-age population (ages 15 to 60)
Average growth rate: average growth of real GDP per worker
Source: The data is taken from the Journal of Applied Econometrics Data Archive.

Table 2.22: Hedonic-Housing Price Equation, N = 506
Variables: Definition
Crime: per capita crime by town
Zoning: proportion of residential land zoned for lots over 25,000 sq.ft.
Industry: proportion of non-retail business acres per town
Charles River: dummy variable (1 if tract bounds river)
Nitric: nitric oxides concentration (parts per 10 million)
Rooms: average number of rooms per dwelling
House age: proportion of owner-occupied units built prior to 1940
Distance: weighted distances to five Boston employment centers
Access: index of accessibility to radial highways
Tax rate: full-value property tax rate per $10,000
Pupil/teacher: pupil-teacher ratio by town
Black pop.: 1000Bk, where Bk is the proportion of blacks by town
Lower class: lower status of the population
House price: median value of owner-occupied house price in $1000
Source: This data set is available in Kelley Pace's Spatial Statistics Toolbox.

3 Heteroskedasticity of Unknown Form in Spatial Autoregressive Models with Moving Average Disturbance Term

Abstract

In this study, I investigate the necessary condition for consistency of the maximum likelihood estimator (MLE) of spatial models with a spatial moving average process in the disturbance term. I show that the MLE of the spatial autoregressive and spatial moving average parameters is generally inconsistent when heteroskedasticity is not considered in the estimation. I also show that the MLE of the parameters of the exogenous variables is inconsistent, and I determine its asymptotic bias. I provide simulation results to evaluate the performance of the MLE. The simulation results indicate that the MLE imposes a substantial amount of bias on both the autoregressive and moving average parameters.

Author Keywords: Spatial dependence, Spatial Moving Average, Spatial Autoregressive, Maximum likelihood estimator, MLE, Asymptotics, Heteroskedasticity, SARMA(1,1).

JEL-classification: C13, C21, C31

3.1 Introduction

The spatial dependence among the disturbance terms of a spatial model is generally assumed to take the form of a spatial autoregressive process. The spatial model that has a spatial lag in the dependent variable and an autoregressive process in the disturbance term is known as the SARAR model. The main characteristic of an autoregressive process is that the effect of a location-specific shock is transmitted to all other locations, with its effects gradually fading away for higher-order neighbors. The spatial autoregressive process may not be appropriate if there is strong evidence of localized transmission of shocks. That is, the autoregressive process is not the correct specification when the effects of shocks are contained within a small region and are not transmitted to other regions. An alternative to an autoregressive process is a spatial moving average process, where the effects of shocks are more localized. Haining (1978), Anselin (1988), and more recently Hepple (2003) and Fingleton (2008a,b) consider a spatial moving average process for the disturbance terms. The spatial model that contains a spatial lag of the dependent variable and a spatial moving average process for the disturbance term is known as the SARMA model.

In the literature, various estimation methods have been proposed (Kelejian and Prucha, 1998, 1999; Das et al., 2003; Kelejian and Prucha, 2010; Liu et al., 2010; Lee, 2004, 2007a; Lee and Liu, 2010; LeSage, 1997; LeSage and Pace, 2009; Hepple, 1995a). The ML method is the best-known and most common estimator used in the literature for both SARAR and SARMA specifications. Lee (2004) shows the first-order asymptotic properties of the MLE for the case of SARAR(1,0). Generalized method of moments (GMM) estimators are also considered for the estimation of spatial models. Kelejian and Prucha (1998, 1999) suggest a two-step GMM estimator for the SARAR(1,1) specification. One disadvantage of the two-step GMME is that it is usually inefficient relative to the MLE (Prucha, 2014; Liu et al., 2010; Lee, 2007c). To increase efficiency, Lee (2007a), Liu et al. (2010) and Lee and Liu (2010) formulate one-step GMMEs based on a set of linear and quadratic moment functions. In this approach, the reduced form of spatial models motivates the formulation of the moment functions. The reduced equations indicate that the endogenous variable, i.e., the spatial lag term, is a function of a stochastic and a non-stochastic term. The linear moment functions are based on the orthogonality condition between the non-stochastic term and the disturbance term, while the quadratic moment functions are formulated for the stochastic term. Then, the parameter vector is estimated simultaneously with a one-step GMME. Lee (2007a) shows that the one-step GMME can be asymptotically equivalent to the MLE when the disturbance terms are i.i.d. normal. In the case where the disturbances are simply i.i.d., Liu et al. (2010) and Lee and Liu (2010) suggest a one-step GMME that can be more efficient than the quasi MLE.

Fingleton (2008a,b) extends the two-step GMME suggested by Kelejian and Prucha (1998, 1999) to spatial models that have a moving average process in the disturbance term, i.e., SARMA(1,1). Baltagi and Liu (2011) modify the moment functions considered in Fingleton (2008a) in the manner of Arnold and Wied (2010), and suggest a GMME for the case of SARMA(0,1). The spatial moving average parameter in both Fingleton (2008a) and Baltagi and Liu (2011) is estimated by a non-linear least squares estimator (NLSE). The asymptotic distribution of the NLSE of the spatial moving average parameter is not provided in either Fingleton (2008a) or Baltagi and Liu (2011). Recently, Kelejian and Prucha (2010) and Drukker et al. (2013) provide a basic theorem regarding the asymptotic distribution of their estimator under fairly general conditions. The estimation approach suggested in Kelejian and Prucha (2010) and Drukker et al. (2013) can easily be adapted for the estimation of the SARMA(1,1) and SARMA(0,1) models. Finally, although the Kelejian and Prucha approach in Fingleton (2008a) and Baltagi and Liu (2011) has computational advantages, it may be inefficient relative to the ML method.

In the presence of an unknown form of heteroskedasticity, Lin and Lee (2010) show that the MLE for the case of SARAR(1,0) may not be consistent, as the log-likelihood function is not maximized at the true parameter vector. They suggest a robust GMME for the SARAR(1,0) specification by modifying the moment functions considered in Lee (2007a). Likewise, Kelejian and Prucha (2010) modify the moment functions of their previous two-step GMME to allow for an unknown form of heteroskedasticity. Fingleton (2008a) and Baltagi and Liu (2011) do not compare the finite sample efficiency of their estimators with the MLE.

The spatial moving average model introduces a different interaction structure. Therefore, it is of interest to investigate the implications of a moving average process for estimation and testing issues. In this paper, I investigate the effect of heteroskedasticity on the MLE for the cases of SARMA(1,1) and SARMA(0,1) along the lines of Lin and Lee (2010). The analytical results show that when heteroskedasticity is not considered in the estimation, the necessary condition for the consistency of the MLE is generally not satisfied for either the SARMA(1,1) or the SARMA(0,1) model. For the SARMA(1,1) specification, I also show that the MLE of the other parameters is inconsistent, and I determine its asymptotic bias. My simulation results indicate that the MLE imposes a substantial amount of bias on the spatial autoregressive and moving average parameters. However, the simulation results also show that the MLE of the other parameters reports a negligible amount of bias in large samples.

The rest of this paper is organized as follows. In Section 3.2, I specify the SARMA(1,1) model in more detail and list the assumptions that are required for the asymptotic analysis. In Section 3.3, I briefly discuss the implications of the spatial processes proposed for the disturbance term in the literature. Section 3.4 investigates the necessary condition for the consistency of the MLE of the autoregressive and moving average parameters. Section 3.5 provides expressions for the asymptotic bias of the MLE of the parameters of the exogenous variables. The section that follows contains a small Monte Carlo simulation. The final section closes with concluding remarks.

3.2 Model Specification and Assumptions

In this study, the following first-order SARMA(1,1) specification is considered:

$$Y_n = \lambda_0 W_n Y_n + X_n \beta_0 + u_n, \qquad u_n = \varepsilon_n - \rho_0 M_n \varepsilon_n, \qquad (3.1)$$

where $Y_n$ is an $n \times 1$ vector of observations of the dependent variable, $X_n$ is an $n \times k$ matrix of non-stochastic exogenous variables with an associated $k \times 1$ vector of population coefficients $\beta_0$, $W_n$ and $M_n$ are $n \times n$ spatial weight matrices of known constants with zero diagonal elements, and $\varepsilon_n$ is an $n \times 1$ vector of disturbances. The variables $W_n Y_n$ and $M_n \varepsilon_n$ are known as the spatial lags of the dependent variable and the disturbance term, respectively.
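To make the data generating process in (3.1) concrete, the following is a minimal sketch of how one draw of $Y_n$ could be generated through the reduced form, i.e., by solving $(I_n - \lambda_0 W_n) Y_n = X_n \beta_0 + u_n$ with $u_n = \varepsilon_n - \rho_0 M_n \varepsilon_n$. Everything specific here is an assumption made for illustration and is not taken from the study: the row-normalized rook lattice, the choice $W_n = M_n$, the parameter values, the heteroskedastic standard deviations, and the helper names `rook_weights`, `solve` and `simulate_sarma`.

```python
import random


def rook_weights(rows, cols):
    """Row-normalized rook-contiguity weight matrix for a rows x cols lattice."""
    n = rows * cols
    W = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs = [(a, b) for a, b in nbrs if 0 <= a < rows and 0 <= b < cols]
            for a, b in nbrs:
                W[r * cols + c][a * cols + b] = 1.0 / len(nbrs)
    return W


def matvec(A, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def simulate_sarma(W, M, X, beta, lam, rho, sigmas, rng):
    """One draw (Y, u) from Y = lam*W*Y + X*beta + u, u = eps - rho*M*eps."""
    n = len(W)
    # Independent innovations with unit-specific standard deviations.
    eps = [rng.gauss(0.0, s) for s in sigmas]
    u = [e - rho * me for e, me in zip(eps, matvec(M, eps))]
    rhs = [xb + ui for xb, ui in zip(matvec(X, beta), u)]
    S = [[(1.0 if i == j else 0.0) - lam * W[i][j] for j in range(n)]
         for i in range(n)]
    return solve(S, rhs), u


if __name__ == "__main__":
    rng = random.Random(42)
    W = rook_weights(4, 4)
    X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(16)]
    # Hypothetical heteroskedasticity: sd grows with the number of neighbors.
    sigmas = [float(sum(1 for w in row if w > 0.0)) for row in W]
    Y, u = simulate_sarma(W, W, X, [1.0, 0.5], 0.4, 0.3, sigmas, rng)
    print([round(y, 3) for y in Y[:4]])
```

Solving the linear system rather than inverting $S_n$ explicitly keeps the sketch short; for a Monte Carlo design with a fixed $W_n$ one would normally factor $S_n$ once and reuse it across replications.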

The spatial effect parameters $\lambda_0$ and $\rho_0$ are known as the spatial autoregressive and moving average parameters, respectively. As spatial data are characterized by triangular arrays, the variables in (3.1) have subscript $n$. [2] The model specifications with $\lambda_0 \neq 0$, $\rho_0 \neq 0$ and with $\lambda_0 = 0$, $\rho_0 \neq 0$ are known, respectively, as SARMA(1,1) and SARMA(0,1) in the literature. Let $\Theta$ be the parameter space of the model. In order to distinguish the true parameter vector from other possible values in $\Theta$, the model is stated with the true parameter vector $\theta_0 = (\beta_0', \delta_0')'$ with $\delta_0 = (\lambda_0, \rho_0)'$. For notational simplicity, we denote $S_n(\lambda) = I_n - \lambda W_n$, $R_n(\rho) = I_n - \rho M_n$, $G_n(\lambda) = W_n S_n^{-1}(\lambda)$, $H_n(\rho) = M_n R_n^{-1}(\rho)$, $\bar{X}_n(\rho) = R_n^{-1}(\rho) X_n$, and $\bar{G}_n(\delta) = R_n^{-1}(\rho) G_n(\lambda) R_n(\rho)$. Also, at the true parameter values $(\rho_0, \lambda_0)$, we denote $S_n(\lambda_0) = S_n$, $R_n(\rho_0) = R_n$, $G_n(\lambda_0) = G_n$, $H_n(\rho_0) = H_n$, $\bar{X}_n(\rho_0) = \bar{X}_n$, and $\bar{G}_n(\delta_0) = \bar{G}_n$. The model in (3.1) is considered under the following assumptions.

[2] See Kelejian and Prucha.

Assumption 3.1. The elements $\varepsilon_{ni}$ of the disturbance term $\varepsilon_n$ are distributed independently with mean zero and variance $\sigma_{ni}^2$, and $E|\varepsilon_{ni}|^{\nu} < \infty$ for some $\nu > 4$ for all $n$ and $i$.

The elements of the disturbance term have moments higher than the fourth moment. The existence-of-moments condition is required for the application of the central limit theorem for the quadratic form given in Kelejian and Prucha (2010). In addition, the variance of a quadratic form in $\varepsilon_n$ exists and is finite when the first four moments are finite. Finally, Liapunov's inequality guarantees that the moments of order less than $\nu$ are also uniformly bounded for all $n$ and $i$.

Assumption 3.2. The spatial weight matrices $M_n$ and $W_n$ are uniformly bounded in absolute value in row and column sums. Moreover, $S_n^{-1}$, $S_n^{-1}(\lambda)$, $R_n^{-1}$ and $R_n^{-1}(\rho)$ exist and are uniformly bounded in absolute value in row and column sums for all values of $\rho$ and $\lambda$ in a compact parameter space.

The uniform boundedness of the terms in Assumption 3.2 is motivated by the need to keep spatial autocorrelations in the model at a tractable level (Kelejian and Prucha). [3] Assumption 3.2 also implies that the model in (3.1) represents an equilibrium relation for the dependent variable. By this assumption, the reduced form of the model becomes feasible as $Y_n = S_n^{-1} X_n \beta_0 + S_n^{-1} R_n \varepsilon_n$. The uniform boundedness of $S_n^{-1}(\lambda)$ and $R_n^{-1}(\rho)$ in Assumption 3.2 is only required for the MLE, not for the GMME (Liu et al., 2010). When $W_n$ is row-normalized, a closed subset of the interval $(1/\lambda_{\min}, 1)$, where $\lambda_{\min}$ is the smallest eigenvalue of $W_n$, can be considered as the parameter space for $\lambda_0$. Analogously, a closed subset of $(1/\rho_{\min}, 1)$, where $\rho_{\min}$ is the smallest eigenvalue of $M_n$, can be the parameter space of $\rho_0$ (LeSage and Pace, 2009). [4] The next assumption states the regularity conditions for the exogenous variables.

[3] For a definition and some properties of uniform boundedness, see Kelejian and Prucha.

[4] There are some other formulations for the parameter spaces in the literature. For details, see Kelejian and Prucha (2010) and LeSage and Pace. Note that the parameter spaces for $\beta_0$ and $\sigma_0^2$ are not required to be compact. As shown in (3.8a) and (3.8b), the MLE of these parameters is an OLS-type estimator; hence boundedness is enough for the parameter spaces.

Assumption 3.3. The matrix $X_n$ is an $n \times k$ matrix consisting of constant elements that are uniformly bounded. It has full column rank $k$. Moreover, $\lim_{n\to\infty} \frac{1}{n} X_n' X_n$ and $\lim_{n\to\infty} \frac{1}{n} \bar{X}_n'(\rho) \bar{X}_n(\rho)$ exist and are nonsingular for all values of $\rho$ in a compact parameter space.

3.3 Spatial Processes for the Disturbance Term

In the literature, there are three main parametric processes to model spatial autocorrelation among disturbance terms: (i) the spatial autoregressive process (SAR), (ii) the spatial moving average process (SMA), and (iii) the spatial error components model (SEC). The implied covariance structure is different under each specification. In this section, I describe the transmission and the effect of shocks under each specification. The SAR process is specified as

$$u_n = \rho_0 M_n u_n + \varepsilon_n, \qquad (3.2)$$

where $u_n$ is an $n \times 1$ vector of regression disturbances, and $\varepsilon_n$ is an $n \times 1$ vector of i.i.d. innovations with variance $\sigma_0^2$. Under the assumption of an equilibrium, i.e., $I_n - \rho_0 M_n$ is invertible, the reduced form of (3.2) is $u_n = (I_n - \rho_0 M_n)^{-1} \varepsilon_n$ with the covariance matrix $E(u_n u_n') = \Omega_n = \sigma_0^2 (I_n - \rho_0 M_n)^{-1} (I_n - \rho_0 M_n')^{-1}$. Note that even if the innovations are homoskedastic, the diagonal elements of $\Omega_n$ are not equal, suggesting heteroskedasticity for the regression disturbances. An expansion of $(I_n - \rho_0 M_n)^{-1}$ for $|\rho_0| < 1$ yields $(I_n - \rho_0 M_n)^{-1} = \sum_{j=0}^{\infty} \rho_0^j M_n^j = I_n + \rho_0 M_n + \rho_0^2 M_n^2 + \cdots$. Hence, the SAR specification of the disturbance term implies that a shock at location $i$ is transmitted to all other locations. The first term $I_n$ implies that the shock at location $i$ directly affects location $i$, and through the other terms, denoted by the powers of $M_n$, it affects higher-order neighbors. Eventually, the shock feeds back to location $i$ through the interconnectedness of neighbors. Note that $|\rho_0| < 1$ ensures that the magnitude of the transmitted shock decreases for the higher orders of neighbors. As a result, the SAR specification allows researchers to model global transmission of shocks, where the full effect of a shock to location $i$ is the sum of the initial shock and the feedback from other locations.

If a more localized spatial dependence is conjectured for an economic model, then a spatial moving average (SMA) specification is more suitable (Haining, 1978; Hepple, 2003; Fingleton, 2008a,b). The SMA process is specified as

$$u_n = \varepsilon_n - \rho_0 M_n \varepsilon_n, \qquad (3.3)$$

where $\rho_0$ is the spatial moving average parameter. The reduced form does not involve the inverse of a square matrix. Hence, the transmission of a shock emanating from location $i$ is limited to its immediate neighbors given by the nonzero elements in the $i$th row of $M_n$. Under this specification, the covariance matrix of $u_n$ is $\Omega_n = \sigma_0^2 R_n R_n' = \sigma_0^2 \left( I_n - \rho_0 (M_n + M_n') + \rho_0^2 M_n M_n' \right)$. The spatial covariance is limited to the nonzero elements of $M_n + M_n'$ and $M_n M_n'$. In comparison with the SAR specification, the range of covariance induced by the SMA model is much smaller.

Kelejian and Robinson (1993) suggest another specification, which is called the spatial error components (SEC) model. This specification is similar to the SMA process in the sense that the implied covariance matrix does not involve a matrix inverse. Formally, the SEC model is given by $u_n = M_n \varepsilon_n + \epsilon_n$, where $\varepsilon_n$ is a vector of regional innovations, whereas $\epsilon_n$ is a vector of locational innovations. Assuming that $\varepsilon_n$ and $\epsilon_n$ are independent, the variance-covariance matrix becomes $\Omega_n = \sigma_{\epsilon}^2 I_n + \sigma_{\varepsilon}^2 M_n M_n'$, which indicates that the spatial correlation in a SEC specification is even more localized.

There have been some direct attempts to parametrize the covariance matrix of $u_n$, rather than defining a process for the disturbance term. For example, Besag (1974) considers a conditional first-order autoregressive model (CAR) such that the covariance matrix of $u_n$ takes the form $\Omega_n = \sigma_0^2 (I_n - \rho_0 M_n)^{-1}$, where $M_n$ is assumed to be a symmetric contiguity matrix. This covariance structure implies a process of $u_n = (I_n - \rho_0 M_n)^{-1/2} \varepsilon_n$. As in the case of the SAR process, a shock in a location is transmitted to all other locations, but now with a smaller amplitude. Another example is $\Omega_n = \sigma_0^2 (I_n + \rho_0 M_n)$, where $M_n$ is assumed to be symmetric (Richardson et al., 1992; Hepple, 1995b). In this case, the spatial correlation is restricted to first-order neighbors, i.e., the non-zero elements of $M_n$.

The elements of $\Omega_n$ can also be specified through a covariance generating function. For example, in Ripley (2005), the covariance generating function is defined in terms of the distance between two locations in such a way that the resulting covariance matrix is always non-negative definite. Let $d_{ij}$ be the distance between locations $i$ and $j$, and $\Omega_{ij,n}$ be the covariance between these two locations. Then, the covariance generating function is defined by

$$\Omega_{ij,n} = \begin{cases} \sigma_0^2 \, \dfrac{2}{\pi} \left[ \cos^{-1}\!\left( \dfrac{d_{ij}}{2\psi} \right) - \dfrac{d_{ij}}{2\psi} \left( 1 - \dfrac{d_{ij}^2}{4\psi^2} \right)^{1/2} \right], & \text{if } d_{ij} \le 2\psi, \\ 0, & \text{otherwise.} \end{cases} \qquad (3.4)$$

Intuitively, $\Omega_{ij,n}$ is proportional to the intersection area of two discs of common radius centered on locations $i$ and $j$. The covariance generating function in (3.4) depends on the single parameter $\psi$, and has a fairly linear negative relationship with $d_{ij}$ (Richardson et al., 1992; Ripley). Another covariance generating function family, first introduced by Whittle in 1954, is a two-parameter family defined in terms of Gamma and Bessel functions. This family has the following specification:

$$\Omega_{ij,n} = \sigma_0^2 \left[ 2^{\nu - 1} \Gamma(\nu) \right]^{-1} (\delta d_{ij})^{\nu} K_{\nu}(\delta d_{ij}), \qquad (3.5)$$

where $K_{\nu}$ is the modified Bessel function, and $\Gamma$ is the standard Gamma function. The parameters $\nu > 0$ and $\delta > 0$ are respectively known as a shape parameter and a spatial parameter. The spatial parameter $\delta$ determines how far the spatial correlation will stretch. For the special case where $\nu = 1/2$, this covariance generating function gives exponentially decaying spatial correlations (Richardson et al., 1992). There is also a more general exponential covariance generating function that depends on two parameters. This function is specified by $\Omega_{ij,n} = \sigma_0^2 \gamma \exp(-\lambda d_{ij})$, where $\gamma$ and $\lambda$ are parameters that need to be estimated. This function also exhibits exponential decay for the spatial correlations. In the literature, there are some other covariance generating function families. However, the majority of these functions do not necessarily ensure that $\Omega_n$ is a positive definite matrix (Haining, 1987; Richardson et al., 1992). The formal properties of the MLE for spatial models that have a covariance structure determined by a parametric function are investigated in an early study by Mardia and Marshall (1984). In that study, the authors state conditions under which the MLE is consistent and has an asymptotic normal distribution.

In the present study, the spatial model specified in (3.1) is considered. The interaction between the spatial autoregressive process and the moving average process for this model induces a complicated pattern for the transmission of a location-specific shock. Under Assumption 3.2, the reduced form of the model is given by $Y_n = S_n^{-1} X_n \beta_0 + S_n^{-1} R_n \varepsilon_n$. The last term in the reduced form can be written as

$$S_n^{-1} R_n \varepsilon_n = \varepsilon_n - \rho_0 M_n \varepsilon_n + \sum_{l=1}^{\infty} \lambda_0^l W_n^l \varepsilon_n - \rho_0 \sum_{l=1}^{\infty} \lambda_0^l W_n^l M_n \varepsilon_n.$$

In this representation, the higher powers of $W_n$ do not have zero diagonal elements, which in turn implies that the total effect of a region-specific shock also contains the feedback effects passed through other locations. The corresponding expression in the case of the SARAR(1,1) specification is given by $S_n^{-1} R_n^{-1} \varepsilon_n = \sum_{l=0}^{\infty} \lambda_0^l W_n^l \sum_{k=0}^{\infty} \rho_0^k M_n^k \varepsilon_n$. Again, the induced pattern involves the interaction of two weight matrices and two parameters.

Following Fingleton (2008a), I illustrate the transmission pattern for a shock under each specification by using a rook weight matrix over a 5 × 5 lattice. Figure 3.1 shows the impact of a shock emanating from the unit located at the center of the lattice. [5] In the case of SAR and SARAR(1,1), the effect of the shock is more vigorous over the whole lattice. For the SMA specification, the shock is only transmitted to the immediate units, as shown in Figure 3.1b. In contrast, the effect of the shock gradually dies out under the SARMA(1,1) model.

[5] For easy comparison, we set $\lambda_0 = 0.9$ for SAR, $\rho_0 = 0.9$ for SMA, $(\lambda_0, \rho_0) = (0.5, 0.9)$ for SARAR(1,1), and $(\lambda_0, \rho_0) = (0.5, 0.9)$ for SARMA(1,1). The disturbance of the unit located at the center of the lattice is increased by 3.
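The distance-based covariance generating functions discussed in this section can be evaluated directly. The sketch below is illustrative only and rests on stated assumptions: the disc-intersection function in (3.4) is normalized by $2/\pi$ so that it equals $\sigma_0^2$ at zero distance, the exponential function is written with a negative exponent, and the Whittle family in (3.5) is evaluated only at $\nu = 1/2$, where the closed form $K_{1/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}$ avoids the need for a Bessel routine. The function names are hypothetical.

```python
import math


def circular_cov(d, psi, sigma2=1.0):
    """Disc-intersection covariance: zero beyond distance 2*psi."""
    if d > 2.0 * psi:
        return 0.0
    r = d / (2.0 * psi)
    return sigma2 * (2.0 / math.pi) * (math.acos(r) - r * math.sqrt(1.0 - r * r))


def exponential_cov(d, lam, gamma=1.0, sigma2=1.0):
    """Two-parameter exponential covariance, assuming decay rate lam > 0."""
    return sigma2 * gamma * math.exp(-lam * d)


def whittle_half(d, delta, sigma2=1.0):
    """Whittle family at nu = 1/2 via the closed form of K_{1/2}."""
    if d == 0.0:
        return sigma2  # limiting value as the distance goes to zero
    nu = 0.5
    x = delta * d
    bessel_k = math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)
    return sigma2 * (x ** nu) * bessel_k / (2.0 ** (nu - 1.0) * math.gamma(nu))


if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5):
        print(d, round(circular_cov(d, psi=1.0), 4),
              round(whittle_half(d, delta=1.0), 4))
```

At $\nu = 1/2$ the Whittle covariance collapses to $\sigma_0^2 \exp(-\delta d_{ij})$, which is the exponential decay mentioned in the text; the compact support of the disc-intersection function is what makes the covariance exactly zero beyond $2\psi$.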

Figure 3.1: The Effect of a Shock. Panels: (a) The effect of a shock: SAR; (b) The effect of a shock: SMA; (c) The effect of a shock: SARAR(1,1); (d) The effect of a shock: SARMA(1,1).
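The contrast between global and local transmission can be reproduced numerically. The following is a minimal sketch, not the code behind the figure: it assumes a row-normalized rook matrix on a 5 × 5 lattice, a parameter value of 0.9 for both processes, and a shock of size 3 at the central unit. The SAR response is computed with the truncated series $\sum_j \rho^j M^j \varepsilon$, and the helper names are hypothetical.

```python
def rook_weights(rows, cols):
    """Row-normalized rook-contiguity weight matrix for a rows x cols lattice."""
    n = rows * cols
    M = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs = [(a, b) for a, b in nbrs if 0 <= a < rows and 0 <= b < cols]
            for a, b in nbrs:
                M[r * cols + c][a * cols + b] = 1.0 / len(nbrs)
    return M


def matvec(A, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def sma_impact(M, rho, shock):
    """SMA response u = eps - rho * M * eps (no matrix inverse involved)."""
    return [s - rho * m for s, m in zip(shock, matvec(M, shock))]


def sar_impact(M, rho, shock, terms=200):
    """SAR response u = (I - rho*M)^{-1} eps via a truncated power series."""
    total = shock[:]
    term = shock[:]
    for _ in range(terms):
        term = [rho * t for t in matvec(M, term)]
        total = [a + b for a, b in zip(total, term)]
    return total


if __name__ == "__main__":
    M = rook_weights(5, 5)
    shock = [0.0] * 25
    shock[12] = 3.0  # central unit of the 5 x 5 lattice
    for name, u in (("SMA", sma_impact(M, 0.9, shock)),
                    ("SAR", sar_impact(M, 0.9, shock))):
        print(name)
        for r in range(5):
            print("  ".join(f"{u[r * 5 + c]:7.3f}" for c in range(5)))
```

Under the SMA process only the central unit and its four rook neighbors respond, and with a positive parameter the neighbors move in the opposite direction because of the minus sign in the specification. Under the SAR process every unit on the lattice responds.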

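3.4 The MLE of $\lambda_0$ and $\rho_0$

The log-likelihood function for the model in (3.1), under the assumption that the disturbance terms of the model are i.i.d. normal with mean zero and variance $\sigma_0^2$, can be written as

$$\ln L_n(\zeta) = -\frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln \sigma^2 + \ln |S_n(\lambda)| - \ln |R_n(\rho)| - \frac{1}{2\sigma^2} \left( S_n(\lambda) Y_n - X_n \beta \right)' R_n'^{-1}(\rho) R_n^{-1}(\rho) \left( S_n(\lambda) Y_n - X_n \beta \right), \qquad (3.6)$$

where $\zeta = (\theta', \sigma^2)'$. The first-order conditions with respect to $\beta$ and $\sigma^2$ are respectively given by

$$\frac{\partial \ln L_n(\zeta)}{\partial \beta} = \frac{1}{\sigma^2} \bar{X}_n'(\rho) R_n^{-1}(\rho) \left( S_n(\lambda) Y_n - X_n \beta \right), \qquad (3.7a)$$

$$\frac{\partial \ln L_n(\zeta)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \varepsilon_n'(\theta) \varepsilon_n(\theta), \qquad (3.7b)$$

where $\varepsilon_n(\theta) = R_n^{-1}(\rho) \left( S_n(\lambda) Y_n - X_n \beta \right)$. The solutions of the first-order conditions for a given $\delta$ yield the MLE of $\beta_0$ and $\sigma_0^2$:

$$\hat{\beta}_n(\delta) = \left[ \bar{X}_n'(\rho) \bar{X}_n(\rho) \right]^{-1} \bar{X}_n'(\rho) R_n^{-1}(\rho) S_n(\lambda) Y_n, \qquad (3.8a)$$

$$\hat{\sigma}_n^2(\theta) = \frac{1}{n} \varepsilon_n'(\theta) \varepsilon_n(\theta). \qquad (3.8b)$$

Concentrating the log-likelihood function by eliminating $\sigma^2$ gives the following equation:

$$\ln L_n(\theta) = -\frac{n}{2} \ln(2\pi) - \frac{n}{2} - \frac{n}{2} \ln \left( \frac{\varepsilon_n'(\theta) \varepsilon_n(\theta) / n}{|S_n(\lambda)|^{2/n} \, |R_n(\rho)|^{-2/n}} \right). \qquad (3.9)$$

The above representation is useful for exploring the role of the Jacobian terms $|S_n(\lambda)|$ and $|R_n(\rho)|$ in the ML estimation. The MLE of $\theta$ is the extremum estimator obtained from the maximization of (3.9). In an equivalent way, the MLE of $\theta_0$ can be defined by

$$\hat{\theta}_n = \underset{\theta \in \Theta}{\operatorname{argmin}} \left\{ \frac{\varepsilon_n'(\theta) \varepsilon_n(\theta)}{|S_n(\lambda)|^{2/n} \, |R_n(\rho)|^{-2/n}} \right\}. \qquad (3.10)$$

In the special case where $|S_n(\lambda)| = |R_n(\rho)| = 1$, the MLE is the NLSE obtained from the minimization of $\varepsilon_n'(\theta) \varepsilon_n(\theta)$, i.e., $\hat{\theta}_{NLSE,n} = \operatorname{argmin}_{\theta \in \Theta} \varepsilon_n'(\theta) \varepsilon_n(\theta)$. It is clear that the Jacobian terms $|S_n(\lambda)|$ and $|R_n(\rho)|$ play the role of a weight or a penalty on $\varepsilon_n'(\theta) \varepsilon_n(\theta)$. The penalty is a function of the autoregressive parameters and the spatial weight matrices, which can be defined as $f_n(\lambda, \rho, W_n, M_n) = |S_n(\lambda)|^{2/n} \, |R_n(\rho)|^{-2/n}$. For the SARAR(1,1) specification, the last term in (3.9) is given by $-\frac{n}{2} \ln \left( \frac{\varepsilon_n'(\theta) \varepsilon_n(\theta) / n}{|S_n(\lambda)|^{2/n} |R_n(\rho)|^{2/n}} \right)$, where $\varepsilon_n(\theta) = R_n(\rho) \left( S_n(\lambda) Y_n - X_n \beta \right)$. Therefore, in the case of SARAR(1,1), the MLE of $\theta_0$ is given by

$$\hat{\theta}_n = \underset{\theta \in \Theta}{\operatorname{argmin}} \left\{ \frac{\varepsilon_n'(\theta) \varepsilon_n(\theta)}{|S_n(\lambda)|^{2/n} \, |R_n(\rho)|^{2/n}} \right\}. \qquad (3.11)$$

It is hard to make any general statement about the effects and magnitudes of the penalty functions in both cases. Hepple (1976) illustrates that the Jacobian term imposes a substantial penalty for the SARAR(0,1) specification. To illustrate the effect of the penalty functions for the cases of SARMA(1,1) and SARAR(1,1), I use a distance-based weight matrix for a sample of 91 countries such that each country is connected to every other country. The elements of the weight matrices are specified by

$$w_{ij} = m_{ij} = \begin{cases} 0, & \text{if } i = j, \\ \dfrac{d_{ij}^{-2}}{\sum_{j=1}^{91} d_{ij}^{-2}}, & \text{if } i \neq j, \end{cases} \qquad (3.12)$$

where the distance $d_{ij}$ between countries $i$ and $j$ is measured by the great-circle distance between country capitals. [6] Figure 3.2 shows the surface plots of the penalty functions over a grid of spatial parameters.

[6] $d_{ij} = R_0 \times \arccos \left[ \cos |\text{longitude}_i - \text{longitude}_j| \cos \text{latitude}_i \cos \text{latitude}_j + \sin \text{latitude}_i \sin \text{latitude}_j \right]$, where $R_0$ is the Earth's radius.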
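A weight matrix of the inverse-squared-distance type in (3.12) can be built as sketched below. This is an illustration under stated assumptions, not the construction used for the sample in the text: the four capital coordinates are approximate values chosen for the example, the Earth radius of 6371 km is a conventional mean value, and the function names are hypothetical. The great-circle distance follows the spherical law of cosines given in the footnote, with the argument of the arc cosine clamped to guard against rounding.

```python
import math

EARTH_RADIUS_KM = 6371.0  # conventional mean radius, assumed for illustration


def great_circle(lat_i, lon_i, lat_j, lon_j, radius=EARTH_RADIUS_KM):
    """Great-circle distance (km) from latitudes/longitudes in degrees."""
    la_i, la_j = math.radians(lat_i), math.radians(lat_j)
    dlon = math.radians(abs(lon_i - lon_j))
    c = (math.cos(dlon) * math.cos(la_i) * math.cos(la_j)
         + math.sin(la_i) * math.sin(la_j))
    return radius * math.acos(max(-1.0, min(1.0, c)))


def inverse_distance_weights(coords):
    """Row-normalized weights w_ij = d_ij^-2 / sum_j d_ij^-2, with w_ii = 0."""
    n = len(coords)
    W = []
    for i in range(n):
        raw = [0.0 if i == j else great_circle(*coords[i], *coords[j]) ** -2
               for j in range(n)]
        total = sum(raw)
        W.append([r / total for r in raw])
    return W


if __name__ == "__main__":
    # Approximate (latitude, longitude) of four capitals, for illustration.
    capitals = {
        "Paris": (48.86, 2.35),
        "London": (51.51, -0.13),
        "Washington": (38.90, -77.04),
        "Tokyo": (35.68, 139.69),
    }
    W = inverse_distance_weights(list(capitals.values()))
    for name, row in zip(capitals, W):
        print(name, [round(w, 4) for w in row])
```

Because every pair of units has a finite distance, the resulting matrix is dense: each row has nonzero weight on every other unit, although nearby units dominate because of the squared inverse distance.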

Figure 3.2: The penalty functions for the dense weight matrix. Panels: (a) The penalty function for SARMA(1,1); (b) The penalty function for SARAR(1,1).

For the SARAR(1,1) specification, the value of the penalty function decreases as the parameter combination $(\lambda, \rho)$ moves away from $(0, 0)$ in any direction, as shown in Figure 3.2b. [7] On the other hand, there is no such monotonic decrease in the penalty function under the SARMA(1,1) specification, as illustrated in Figure 3.2a. The penalty function of SARMA(1,1) takes relatively larger values when there is strong spatial dependence in the disturbance term, i.e., when $\rho$ is near $-1$ or $1$. In contrast, the penalty function has smaller values when there is strong spatial dependence in the dependent variable. This pattern indicates that the sum $\varepsilon_n'(\theta) \varepsilon_n(\theta)$ is penalized as $\rho$ moves toward either $-1$ or $1$. In the case of SARAR(1,1), this sum gets larger as $(\lambda, \rho)$ moves toward $(\pm 1, \pm 1)$ in any direction, suggesting that the solution of the minimization problem is restricted to the region $(-1, +1) \times (-1, +1)$. Finally, in a small neighborhood of $(0, 0)$, the surface plots in Figure 3.2 indicate that the penalty functions take values around 1, suggesting that the parameter estimates from the MLE can be similar to those from the NLSE under both specifications.

Next, I investigate the effect of heteroskedasticity on the MLE for the case of SARMA(1,1). I assume that the true data generating process is characterized by Assumption 3.1.

[7] For SARAR(1,1), the penalty function is $f_n(\lambda, \rho, W_n, M_n) = |S_n(\lambda)|^{2/n} \, |R_n(\rho)|^{2/n}$.
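The behavior of the penalty functions can be checked numerically on a small example. The sketch below assumes that the penalty divides the sum of squared residuals and takes the form $|S_n(\lambda)|^{2/n} |R_n(\rho)|^{-2/n}$ for SARMA(1,1) and $|S_n(\lambda)|^{2/n} |R_n(\rho)|^{2/n}$ for SARAR(1,1). It uses a row-normalized rook matrix on a 4 × 4 lattice with $W_n = M_n$ instead of the dense distance-based matrix, and the helper names are hypothetical. The log-determinants are obtained by Gaussian elimination with partial pivoting.

```python
import math


def rook_weights(rows, cols):
    """Row-normalized rook-contiguity weight matrix for a rows x cols lattice."""
    n = rows * cols
    W = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs = [(a, b) for a, b in nbrs if 0 <= a < rows and 0 <= b < cols]
            for a, b in nbrs:
                W[r * cols + c][a * cols + b] = 1.0 / len(nbrs)
    return W


def log_abs_det(A):
    """log|det(A)| by Gaussian elimination with partial pivoting."""
    n = len(A)
    a = [row[:] for row in A]
    total = 0.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            return float("-inf")  # singular matrix
        a[k], a[p] = a[p], a[k]
        total += math.log(abs(a[k][k]))
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
    return total


def penalty(W, M, lam, rho, model):
    """|S(lam)|^(2/n) * |R(rho)|^(-2/n) for SARMA, (+2/n) for SARAR."""
    n = len(W)

    def eye_minus(A, c):
        return [[(1.0 if i == j else 0.0) - c * A[i][j] for j in range(n)]
                for i in range(n)]

    log_s = log_abs_det(eye_minus(W, lam))
    log_r = log_abs_det(eye_minus(M, rho))
    sign = -1.0 if model == "SARMA" else 1.0
    return math.exp((2.0 / n) * (log_s + sign * log_r))


if __name__ == "__main__":
    W = rook_weights(4, 4)
    for lam, rho in [(0.0, 0.0), (0.5, 0.5), (0.0, 0.8), (0.8, 0.0)]:
        print((lam, rho),
              round(penalty(W, W, lam, rho, "SARMA"), 4),
              round(penalty(W, W, lam, rho, "SARAR"), 4))
```

Under these assumptions both penalties equal 1 at $(0, 0)$, the SARAR(1,1) penalty falls as the parameters move away from the origin, and the SARMA(1,1) penalty rises with $|\rho|$ and falls with $|\lambda|$, which is the qualitative pattern described for the surface plots.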


More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

Asymptotic Results for the Linear Regression Model

Asymptotic Results for the Linear Regression Model Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is

More information

MA Advanced Econometrics: Properties of Least Squares Estimators

MA Advanced Econometrics: Properties of Least Squares Estimators MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample

More information

Lecture 33: Bootstrap

Lecture 33: Bootstrap Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece

More information

Cox-type Tests for Competing Spatial Autoregressive Models with Spatial Autoregressive Disturbances

Cox-type Tests for Competing Spatial Autoregressive Models with Spatial Autoregressive Disturbances Cox-type Tests for Competig Spatial Autoregressive Models with Spatial Autoregressive Disturbaces Fei Ji a,, Lug-fei Lee a a Departmet of Ecoomics, The Ohio State Uiversity, Columbus, OH 430 USA Abstract

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula Joural of Multivariate Aalysis 102 (2011) 1315 1319 Cotets lists available at ScieceDirect Joural of Multivariate Aalysis joural homepage: www.elsevier.com/locate/jmva Superefficiet estimatio of the margials

More information

Solution to Chapter 2 Analytical Exercises

Solution to Chapter 2 Analytical Exercises Nov. 25, 23, Revised Dec. 27, 23 Hayashi Ecoometrics Solutio to Chapter 2 Aalytical Exercises. For ay ε >, So, plim z =. O the other had, which meas that lim E(z =. 2. As show i the hit, Prob( z > ε =

More information

GMM estimation of spatial autoregressive models with unknown heteroskedasticity

GMM estimation of spatial autoregressive models with unknown heteroskedasticity Accepted Mauscript GMM estimatio of spatial autoregressive models with ukow heteroskedasticity Xu Li, Lug-fei Lee PII: S0304-4076(09)00288-7 DOI: 0.06/j.jecoom.2009.0.035 Referece: ECONOM 3288 To appear

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 9 Multicolliearity Dr Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Multicolliearity diagostics A importat questio that

More information

Regression with an Evaporating Logarithmic Trend

Regression with an Evaporating Logarithmic Trend Regressio with a Evaporatig Logarithmic Tred Peter C. B. Phillips Cowles Foudatio, Yale Uiversity, Uiversity of Aucklad & Uiversity of York ad Yixiao Su Departmet of Ecoomics Yale Uiversity October 5,

More information

5.4 The spatial error model Regression model with spatially autocorrelated errors

5.4 The spatial error model Regression model with spatially autocorrelated errors 54 The spatial error model 54 Regressio model with spatiall autocorrelated errors I a multiple regressio model, the depedet variable Y depeds o k regressors X (=), X,, X k ad a disturbace ε: (4) is a x

More information

Specification and Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances

Specification and Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances Specificatio ad Estimatio of Spatial Autoregressive Models with Autoregressive ad Heteroskedastic Disturbaces Harry H. Kelejia ad Igmar R. Prucha Departmet of Ecoomics Uiversity of Marylad, College Park,

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

Study the bias (due to the nite dimensional approximation) and variance of the estimators

Study the bias (due to the nite dimensional approximation) and variance of the estimators 2 Series Methods 2. Geeral Approach A model has parameters (; ) where is ite-dimesioal ad is oparametric. (Sometimes, there is o :) We will focus o regressio. The fuctio is approximated by a series a ite

More information

Lecture 24: Variable selection in linear models

Lecture 24: Variable selection in linear models Lecture 24: Variable selectio i liear models Cosider liear model X = Z β + ε, β R p ad Varε = σ 2 I. Like the LSE, the ridge regressio estimator does ot give 0 estimate to a compoet of β eve if that compoet

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Stochastic Simulation

Stochastic Simulation Stochastic Simulatio 1 Itroductio Readig Assigmet: Read Chapter 1 of text. We shall itroduce may of the key issues to be discussed i this course via a couple of model problems. Model Problem 1 (Jackso

More information

[412] A TEST FOR HOMOGENEITY OF THE MARGINAL DISTRIBUTIONS IN A TWO-WAY CLASSIFICATION

[412] A TEST FOR HOMOGENEITY OF THE MARGINAL DISTRIBUTIONS IN A TWO-WAY CLASSIFICATION [412] A TEST FOR HOMOGENEITY OF THE MARGINAL DISTRIBUTIONS IN A TWO-WAY CLASSIFICATION BY ALAN STUART Divisio of Research Techiques, Lodo School of Ecoomics 1. INTRODUCTION There are several circumstaces

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS J. Japa Statist. Soc. Vol. 41 No. 1 2011 67 73 A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS Yoichi Nishiyama* We cosider k-sample ad chage poit problems for idepedet data i a

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15 17. Joit distributios of extreme order statistics Lehma 5.1; Ferguso 15 I Example 10., we derived the asymptotic distributio of the maximum from a radom sample from a uiform distributio. We did this usig

More information

1 General linear Model Continued..

1 General linear Model Continued.. Geeral liear Model Cotiued.. We have We kow y = X + u X o radom u v N(0; I ) b = (X 0 X) X 0 y E( b ) = V ar( b ) = (X 0 X) We saw that b = (X 0 X) X 0 u so b is a liear fuctio of a ormally distributed

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

MATHEMATICAL SCIENCES PAPER-II

MATHEMATICAL SCIENCES PAPER-II MATHEMATICAL SCIENCES PAPER-II. Let {x } ad {y } be two sequeces of real umbers. Prove or disprove each of the statemets :. If {x y } coverges, ad if {y } is coverget, the {x } is coverget.. {x + y } coverges

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Estimation of the Mean and the ACVF

Estimation of the Mean and the ACVF Chapter 5 Estimatio of the Mea ad the ACVF A statioary process {X t } is characterized by its mea ad its autocovariace fuctio γ ), ad so by the autocorrelatio fuctio ρ ) I this chapter we preset the estimators

More information

First, note that the LS residuals are orthogonal to the regressors. X Xb X y = 0 ( normal equations ; (k 1) ) So,

First, note that the LS residuals are orthogonal to the regressors. X Xb X y = 0 ( normal equations ; (k 1) ) So, 0 2. OLS Part II The OLS residuals are orthogoal to the regressors. If the model icludes a itercept, the orthogoality of the residuals ad regressors gives rise to three results, which have limited practical

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Lecture Notes for Analysis Class

Lecture Notes for Analysis Class Lecture Notes for Aalysis Class Topological Spaces A topology for a set X is a collectio T of subsets of X such that: (a) X ad the empty set are i T (b) Uios of elemets of T are i T (c) Fiite itersectios

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

POLS, GLS, FGLS, GMM. Outline of Linear Systems of Equations. Common Coefficients, Panel Data Model. Preliminaries

POLS, GLS, FGLS, GMM. Outline of Linear Systems of Equations. Common Coefficients, Panel Data Model. Preliminaries Outlie of Liear Systems of Equatios POLS, GLS, FGLS, GMM Commo Coefficiets, Pael Data Model Prelimiaries he liear pael data model is a static model because all explaatory variables are dated cotemporaeously

More information

Lecture 3 The Lebesgue Integral

Lecture 3 The Lebesgue Integral Lecture 3: The Lebesgue Itegral 1 of 14 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 3 The Lebesgue Itegral The costructio of the itegral Uless expressly specified

More information

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression REGRESSION 1 Outlie Liear regressio Regularizatio fuctios Polyomial curve fittig Stochastic gradiet descet for regressio MLE for regressio Step-wise forward regressio Regressio methods Statistical techiques

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Chapter 10: Power Series

Chapter 10: Power Series Chapter : Power Series 57 Chapter Overview: Power Series The reaso series are part of a Calculus course is that there are fuctios which caot be itegrated. All power series, though, ca be itegrated because

More information

Ω ). Then the following inequality takes place:

Ω ). Then the following inequality takes place: Lecture 8 Lemma 5. Let f : R R be a cotiuously differetiable covex fuctio. Choose a costat δ > ad cosider the subset Ωδ = { R f δ } R. Let Ωδ ad assume that f < δ, i.e., is ot o the boudary of f = δ, i.e.,

More information

Chimica Inorganica 3

Chimica Inorganica 3 himica Iorgaica Irreducible Represetatios ad haracter Tables Rather tha usig geometrical operatios, it is ofte much more coveiet to employ a ew set of group elemets which are matrices ad to make the rule

More information

Output Analysis and Run-Length Control

Output Analysis and Run-Length Control IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Cox-type Tests for Competing Spatial Autoregressive Models with Spatial Autoregressive Disturbances

Cox-type Tests for Competing Spatial Autoregressive Models with Spatial Autoregressive Disturbances Cox-type Tests for Competig Spatial Autoregressive Models with Spatial Autoregressive Disturbaces Fei Ji a,, Lug-fei Lee a a Departmet of Ecoomics, Ohio State Uiversity, Columbus, OH 430 USA Abstract I

More information

A Note on Box-Cox Quantile Regression Estimation of the Parameters of the Generalized Pareto Distribution

A Note on Box-Cox Quantile Regression Estimation of the Parameters of the Generalized Pareto Distribution A Note o Box-Cox Quatile Regressio Estimatio of the Parameters of the Geeralized Pareto Distributio JM va Zyl Abstract: Makig use of the quatile equatio, Box-Cox regressio ad Laplace distributed disturbaces,

More information

Random Matrices with Blocks of Intermediate Scale Strongly Correlated Band Matrices

Random Matrices with Blocks of Intermediate Scale Strongly Correlated Band Matrices Radom Matrices with Blocks of Itermediate Scale Strogly Correlated Bad Matrices Jiayi Tog Advisor: Dr. Todd Kemp May 30, 07 Departmet of Mathematics Uiversity of Califoria, Sa Diego Cotets Itroductio Notatio

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Linear Regression Models, OLS, Assumptions and Properties

Linear Regression Models, OLS, Assumptions and Properties Chapter 2 Liear Regressio Models, OLS, Assumptios ad Properties 2.1 The Liear Regressio Model The liear regressio model is the sigle most useful tool i the ecoometricia s kit. The multiple regressio model

More information

ARIMA Models. Dan Saunders. y t = φy t 1 + ɛ t

ARIMA Models. Dan Saunders. y t = φy t 1 + ɛ t ARIMA Models Da Sauders I will discuss models with a depedet variable y t, a potetially edogeous error term ɛ t, ad a exogeous error term η t, each with a subscript t deotig time. With just these three

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

A supplement to Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive. Appendix A: Some Useful Lemmas

A supplement to Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive. Appendix A: Some Useful Lemmas A supplemet to Asymptotic Distributios of Quasi-Maximum Likelihood Estimators for Spatial Autoregressive Models (for referece oly; ot for publicatio) Appedix A: Some Useful Lemmas A. Uiform Boudedess of

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Lecture 7: Density Estimation: k-nearest Neighbor and Basis Approach

Lecture 7: Density Estimation: k-nearest Neighbor and Basis Approach STAT 425: Itroductio to Noparametric Statistics Witer 28 Lecture 7: Desity Estimatio: k-nearest Neighbor ad Basis Approach Istructor: Ye-Chi Che Referece: Sectio 8.4 of All of Noparametric Statistics.

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Statistical Analysis on Uncertainty for Autocorrelated Measurements and its Applications to Key Comparisons

Statistical Analysis on Uncertainty for Autocorrelated Measurements and its Applications to Key Comparisons Statistical Aalysis o Ucertaity for Autocorrelated Measuremets ad its Applicatios to Key Comparisos Nie Fa Zhag Natioal Istitute of Stadards ad Techology Gaithersburg, MD 0899, USA Outlies. Itroductio.

More information

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector Summary ad Discussio o Simultaeous Aalysis of Lasso ad Datzig Selector STAT732, Sprig 28 Duzhe Wag May 4, 28 Abstract This is a discussio o the work i Bickel, Ritov ad Tsybakov (29). We begi with a short

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if LECTURE 14 NOTES 1. Asymptotic power of tests. Defiitio 1.1. A sequece of -level tests {ϕ x)} is cosistet if β θ) := E θ [ ϕ x) ] 1 as, for ay θ Θ 1. Just like cosistecy of a sequece of estimators, Defiitio

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

Probability and Statistics

Probability and Statistics ICME Refresher Course: robability ad Statistics Staford Uiversity robability ad Statistics Luyag Che September 20, 2016 1 Basic robability Theory 11 robability Spaces A probability space is a triple (Ω,

More information

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Session 5. (1) Principal component analysis and Karhunen-Loève transformation

Session 5. (1) Principal component analysis and Karhunen-Loève transformation 200 Autum semester Patter Iformatio Processig Topic 2 Image compressio by orthogoal trasformatio Sessio 5 () Pricipal compoet aalysis ad Karhue-Loève trasformatio Topic 2 of this course explais the image

More information

Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data

Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data Proceedigs 59th ISI World Statistics Cogress, 5-30 August 013, Hog Kog (Sessio STS046) p.09 Kolmogorov-Smirov type Tests for Local Gaussiaity i High-Frequecy Data George Tauche, Duke Uiversity Viktor Todorov,

More information

U8L1: Sec Equations of Lines in R 2

U8L1: Sec Equations of Lines in R 2 MCVU U8L: Sec. 8.9. Equatios of Lies i R Review of Equatios of a Straight Lie (-D) Cosider the lie passig through A (-,) with slope, as show i the diagram below. I poit slope form, the equatio of the lie

More information

Supplementary Material to A General Method for Third-Order Bias and Variance Corrections on a Nonlinear Estimator

Supplementary Material to A General Method for Third-Order Bias and Variance Corrections on a Nonlinear Estimator Supplemetary Material to A Geeral Method for Third-Order Bias ad Variace Correctios o a Noliear Estimator Zheli Yag School of Ecoomics, Sigapore Maagemet Uiversity, 90 Stamford Road, Sigapore 178903 emails:

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

¹Y 1 ¹ Y 2 p s. 2 1 =n 1 + s 2 2=n 2. ¹X X n i. X i u i. i=1 ( ^Y i ¹ Y i ) 2 + P n

¹Y 1 ¹ Y 2 p s. 2 1 =n 1 + s 2 2=n 2. ¹X X n i. X i u i. i=1 ( ^Y i ¹ Y i ) 2 + P n Review Sheets for Stock ad Watso Hypothesis testig p-value: probability of drawig a statistic at least as adverse to the ull as the value actually computed with your data, assumig that the ull hypothesis

More information

Introductory statistics

Introductory statistics CM9S: Machie Learig for Bioiformatics Lecture - 03/3/06 Itroductory statistics Lecturer: Sriram Sakararama Scribe: Sriram Sakararama We will provide a overview of statistical iferece focussig o the key

More information

Statistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions

Statistical and Mathematical Methods DS-GA 1002 December 8, Sample Final Problems Solutions Statistical ad Mathematical Methods DS-GA 00 December 8, 05. Short questios Sample Fial Problems Solutios a. Ax b has a solutio if b is i the rage of A. The dimesio of the rage of A is because A has liearly-idepedet

More information

GEL estimation and tests of spatial autoregressive models

GEL estimation and tests of spatial autoregressive models GEL estimatio ad tests of spatial autoregressive models Fei Ji a ad Lug-fei Lee b a School of Ecoomics, Shaghai Uiversity of Fiace ad Ecoomics, ad Key Laboratory of Mathematical Ecoomics (SUFE), Miistry

More information

On the Bootstrap for Spatial Econometric Models

On the Bootstrap for Spatial Econometric Models O the Bootstrap for Spatial Ecoometric Models Fei Ji a, Lug-fei Lee a a Departmet of Ecoomics, The Ohio State Uiversity, Columbus, OH 4310 USA Abstract This paper is cocered with the use of the bootstrap

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

Math 61CM - Solutions to homework 3

Math 61CM - Solutions to homework 3 Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig

More information