There are two parts to this lab. The first is intended to demonstrate how to request and interpret the spatial diagnostics of a standard OLS regression model using GeoDa. The diagnostics provide information about the type of spatial process underlying your data and inform your selection of an appropriate spatial regression model (i.e., spatial error or spatial lag in GeoDa). The second part is intended to introduce how to specify and interpret two spatial regression models: the spatial error model and the spatial lag model. The two approaches have different assumptions and theoretical implications about the form of the spatial process being analyzed. The spatial error model identifies spatial autocorrelation in the error structure of the specified regression model. In contrast, the spatial lag model identifies spatial autocorrelation in the covariance structure of the dependent variable.

Objectives.
- Conduct an OLS regression analysis in GeoDa using multiple weights matrices
- Examine the spatial and non-spatial diagnostics
- Save and explore the residuals from the OLS model
- Specify and examine the diagnostics of a spatial lag and spatial error regression analysis
- Compare the results of the models and interpret the substantive implications

Part 1: Spatial Diagnostics

OLS. Open GeoDa and load south00.shp using FIPS as the key field. What are important correlates of child poverty that should be included in the regression model? In GeoDa, you can run a series of standard OLS regressions; note that the assumptions of linearity and normality apply. Decisions about variable transformations and outliers should be made before running an OLS regression. The results of the regression, of course, also can assist this analytical process.
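For reference, the classic model and the two spatial alternatives described in the introduction are commonly written as below. The notation is an assumption chosen for this sketch rather than something GeoDa prints: W is the spatial weights matrix, ρ the spatial lag coefficient, λ the spatial error coefficient, and u an uncorrelated error term.

```latex
\begin{align}
\text{Classic (OLS):} \quad & y = X\beta + u \\
\text{Spatial lag:} \quad & y = \rho W y + X\beta + u \\
\text{Spatial error:} \quad & y = X\beta + \varepsilon, \qquad \varepsilon = \lambda W \varepsilon + u
\end{align}
```

In the lag form the dependence enters through the neighbors' values of the dependent variable (Wy); in the error form it enters only through the disturbance ε.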
Regression
- Run an OLS regression analysis of child poverty and some reasonable correlates (Regress>)
- Change the output title; this helps keep your records organized when you run multiple models (e.g., OLS1)
  - Change the output title with each run, or it will overwrite the original file; it does not append to a single file
  - The output file is saved to the directory where the data are located
  - The extension is *.OLS and can be read in WordPad or MS Word
- Specify the output format
  - The Predicted Value and Residual option is not useful with large data sets since it prints the values for each observation and, thus, creates a huge output (text) file
    - This information can be added to the data table at another point
  - The Coefficient Variance Matrix option provides the variance of the estimates (on the diagonal) and all covariances
    - Used to carry out customized tests of constraints on the model coefficients in statistical packages other than GeoDa (e.g., STATA)
  - The Moran's I z-value option reports an estimate of the spatial autocorrelation in the residuals of the model you are specifying
    - Select this option; the Moran's I value is reported automatically, but tests for statistical significance are reported only when you select this option
- Specify the regression model
Voss & Curtis 1
  - Dependent Variable: child poverty, SQRTPPOV (square root transformed) or PPOV, if you prefer
  - Independent Variables: What shall we explore?
  - Choose weights matrix (necessary to get spatial diagnostics): Which should we use?
  - Choose Classic model
  - Note: In GeoDa the include constant term option is checked by default; uncheck if you have reason to exclude a constant from your model (e.g., fixed effects model)
- Run the model by clicking on the Run button
- Choose Save if you want to add predicted values and residuals to the data table; this is an option only after running the model
  - If you select the OK button before you select the Save button, you will need to rerun the model to get the estimates
  - Name the variables (predicted values and/or residuals) something meaningful (e.g., OLS1_RES)
  - You will need to create a new shapefile to permanently append the new variables to your table (it is like a working file in SAS) (activate the table object>file>save to Shape File As)

Output File
- An output window automatically appears when selecting OK
- The file also can be viewed in WordPad or MS Word; Notepad is not recommended (it can open the file, but the format is messy)
- File content:
  - Summary statistics of the model and measures of fit
  - Parameter estimates
  - Model diagnostics
- The F statistic reported in the top section is a test of the null hypothesis that all regression coefficients are jointly 0
  - Not that useful, unless your model is way off base
- 3 important statistics reported at the top for model comparisons:
  - Log likelihood: higher, better (less negative)
  - Akaike Information Criterion (AIC): lower, better (-2L + 2K)
  - Schwarz Criterion (SC): lower, better (-2L + K x ln(N))
  - where L is the log likelihood, K is the number of parameters, and ln(N) is the natural log of the number of observations

Standard Diagnostics
- Multicollinearity (condition number): not a test statistic, per se, but a diagnostic to suggest problems with the stability of the regression results due to multicollinearity
  - > 30 is problematic, in general
  - Note: high values are common when interaction terms are used since the independent variables are powers and cross products of each other
  - Additional note: I have found this diagnostic to be unreliable in GeoDa, especially with small data sets; examine multicollinearity in other statistical packages (e.g., SAS)
- Normality: Jarque-Bera test
  - Chi-square distribution with 2 df
  - Tests the assumption of normality in the errors
- Heteroskedasticity is tested on three null hypotheses
  - Breusch-Pagan: assumes heteroskedasticity is a function of the squares of the explanatory variables
  - Koenker-Bassett: same as BP, except residuals are studentized (made robust to non-normality)
  - White: does not assume a specific functional form of heteroskedasticity
    - An NA is sometimes reported for this test when interactions are included in the model because all square powers and cross products are considered in this test for heteroskedasticity
- Moran's I (Error)
  - This is the global value, as reported in the scatter plot, less any explanatory value of the predictors, and is derived from the errors of the regression model
  - Usually observe some reduction (compared to the original MI on the outcome)
    - What was our original statistic? How do the values compare?
  - Tests for statistical significance are not reported (i.e., NA is reported) if you did not select the Moran's I z-value option when you specified the output
- Lagrange Multiplier
  - In general, the LM is used in mathematical optimization problems and is a method for finding the local extreme values of a function of several variables subject to one or more constraints
  - Here, the LM gives some indication of which type of spatial regression model is most appropriate
  - Compare as you add predictors; do not run with the first model output
    - We are trying to eliminate spatial autocorrelation from our model and can inappropriately estimate it if we haven't exhausted the alternatives to a spatial dependence regression model
  - Error, lag, or SARMA (both lag and error)?
    - Only consider the robust LM statistics when the standard LM values are statistically significant
    - A larger LM suggests the more likely model
    - SARMA is always significant, it seems, and is not that useful in practice
      - It tends to be significant when either lag or error is indicated, not just when a higher order model is
      - The value can be compared with the standard LM values; if similar, then it is not picking up a higher order model
  - Which model is indicated? Have we exhausted other explanations?
  - What about a trend surface or other techniques to address spatial heterogeneity?

Residuals. The predicted and residual values are appended at the end of the table if you choose this option under the Save button when specifying the regression model (open data table).

Maps
- Predicted value maps (Map>Std Dev>predicted value variable saved to table)
  - In essence, smoothed maps since the random variability due to factors other than those in the model has been smoothed out
- Residual maps (Map>Std Dev>residual value variable saved to table)
  - Gives a sense of spatial autocorrelation patterns since they suggest any under- or over-prediction in sub-regions
- Quantile maps of predicted values and residual values (Map>Quantile>variable)
  - Predicted value quantile map shows where predicted poverty is higher (darker) and lower (lighter)
  - Residual value map is more intuitive, for me, and shows over-prediction (lighter) and under-prediction (darker)
  - Where is the model over-predicting? Under-predicting? Is there evidence of spatial clustering? What about the possibility of spatial regimes?

Moran Scatter Plot & LISA Map
- Run a Moran scatter plot on the residuals (Space>Univariate Moran)
  - Use the same weights matrix that you used in the regression model
  - It is purely descriptive
  - Through this approach, we are not able to obtain reliable estimates for significance tests or LISA map construction because the permutation function ignores the fact that OLS residuals are already correlated by construction
  - Still, it gives you some sense and it is usually not far off base
- Construct a LISA map (Space>Univariate LISA)
  - Use the same weights matrix that you used in the regression model
  - Again, purely descriptive, but somewhat useful in identifying geographic areas where the model does not explain the spatial distribution of the dependent variable

Things to Consider.
- What do you think is indicated by the tests for spatial dependence based on the OLS residuals in terms of what model might be a good fit for your data (error or lag)?
- How, if at all, do the diagnostics for spatial dependence differ when you use different spatial weights matrices?
- How does the patterning of positive and negative residuals in the choropleth maps of your OLS residuals relate to your model diagnostics?
- What clustering is evidenced in the residuals using LISA maps? Do you think there might be any processes or omitted variables that could help explain the clustering in the residuals?

Part 2: Spatial Regression Models

Spatial Regression. The specifications of the spatial regression model should be based on the results from the standard OLS model to make meaningful comparisons.
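One way to carry the OLS Lagrange Multiplier diagnostics from Part 1 into a starting specification for Part 2 is the rule of thumb sketched below: look at the standard LM tests first, consult the robust versions only when both standard tests are significant, and let the larger statistic break a tie. The function name `suggest_model`, the tuple layout, and the 0.05 cutoff are assumptions made for this illustration; GeoDa does not provide such a function, and the rule is a guide rather than a substitute for judgment about omitted variables or spatial heterogeneity.

```python
def suggest_model(lm_lag, lm_error, rlm_lag, rlm_error, alpha=0.05):
    """Suggest 'ols', 'lag', or 'error' from LM diagnostics.

    Each argument is a (statistic, p_value) tuple copied by hand from the
    OLS output. The names and the alpha cutoff are hypothetical choices.
    """
    lag_sig = lm_lag[1] < alpha
    err_sig = lm_error[1] < alpha

    # Neither standard test is significant: no case for a spatial model.
    if not lag_sig and not err_sig:
        return "ols"

    # Exactly one standard test is significant: follow it.
    if lag_sig != err_sig:
        return "lag" if lag_sig else "error"

    # Both standard tests are significant: consult the robust versions.
    rlag_sig = rlm_lag[1] < alpha
    rerr_sig = rlm_error[1] < alpha
    if rlag_sig != rerr_sig:
        return "lag" if rlag_sig else "error"

    # Robust tests agree (both or neither significant): larger statistic wins.
    return "lag" if rlm_lag[0] > rlm_error[0] else "error"


# Illustrative numbers only, not results from south00.shp.
print(suggest_model((45.0, 0.0001), (60.0, 0.0001), (2.0, 0.16), (17.0, 0.0001)))
```

Rerun the comparison each time you add predictors or switch weights matrices, since the indicated model can change.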
Regression
- Run a model that you think ought to reasonably explain child poverty in the South (Regress>)
- Specify the output format
- Specify the regression model
  - Dependent Variable: child poverty, SQRTPPOV (square root transformed) or PPOV, if you prefer
  - Independent Variables: What shall we explore? (should be consistent with OLS to be compared)
  - Choose weights matrix: Which should we use? (should be consistent with OLS to be compared)
  - Run both the Spatial Error and Spatial Lag options for comparison
- Save the residuals, predicted values, and prediction errors (choose the Save button and give the variables a meaningful name)
  - Remember that you will need to create a new shapefile to permanently append the new variables to your table (it is like a working file in SAS) (activate the table object>file>save to Shape File As)

Output File
- An output window automatically appears when selecting OK
- The file content is similar to that reported for the classic OLS regression
  - Summary statistics of the model and measures of fit
  - Parameter estimates
  - Model diagnostics
- A pseudo R-squared is reported and can be compared to the OLS models, yet the log likelihood, AIC and SC are better for model comparisons
  - To review
    - Log likelihood: bigger, better (less negative)
    - AIC and SC: lower, better
- Review the autoregressive coefficient (ρ, spatial lag, or λ, spatial error)
  - Is it significant? What is the direction? Is it what you expected?
- Review the explanatory variables
  - Check the signs, significance, and magnitude
- Check model heteroskedasticity
  - Only the Breusch-Pagan test is reported (tests on random coefficients that assume a functional form based on the squares of the explanatory variables)
  - Also, can plot the model residuals (Explore>Scatter Plot)
    - Y: residual values
    - X: predicted values
- Check the likelihood ratio test for the specified spatial form (lag or error, depending on the model)
  - This test compares the spatial model to the non-spatial alternative
  - What is missing in the GeoDa diagnostics is a direct comparison with the alternative spatial model (lag vs. error); can get this through SpaceStat, R, and, hopefully, in future versions of GeoDa
  - For now, we compare the two models on a number of different points (LL, AIC, etc.)

Predicted Values, Prediction Errors and Residuals (shown in spatial lag notation)
- Predicted Values: the estimated value of child poverty, $\hat{y} = (I - \hat{\rho} W)^{-1} X \hat{\beta}$
- Prediction Errors: the difference between the observed and predicted values of child poverty, obtained by considering the exogenous variables alone, $e = y - \hat{y} = (I - \hat{\rho} W)^{-1} \hat{u}$
- Residuals: estimates for the model error term $u$, $\hat{u} = (I - \hat{\rho} W) y - X \hat{\beta}$
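The three quantities saved from a spatial lag model can be checked numerically on a toy example. The sketch below is plain Python with no dependencies; the three-observation data, the coefficient values, and the helper names (`matvec`, `solve`, `lag_model_quantities`) are all hypothetical and chosen only to make the algebra visible. It assumes the lag-model definitions: predicted value (I - ρW)^-1 Xβ, residual (I - ρW)y - Xβ, and prediction error y minus the predicted value.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lag_model_quantities(y, X, W, rho, beta):
    """Return (predicted, prediction_error, residual, A) with A = I - rho*W."""
    n = len(y)
    A = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)]
         for i in range(n)]
    xb = matvec(X, beta)
    predicted = solve(A, xb)                      # (I - rho W)^-1 X beta
    prediction_error = [yi - pi for yi, pi in zip(y, predicted)]
    residual = [ay - v for ay, v in zip(matvec(A, y), xb)]  # (I - rho W) y - X beta
    return predicted, prediction_error, residual, A


# Hypothetical toy data: three areas in a row, row-standardized weights.
W = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 4.0]]   # constant plus one predictor
y = [3.0, 4.0, 5.0]
rho, beta = 0.4, [1.0, 0.5]                # assumed estimates, not fitted here

predicted, prediction_error, residual, A = lag_model_quantities(y, X, W, rho, beta)
print(predicted, prediction_error, residual)
```

Applying `solve(A, residual)` reproduces `prediction_error`, which is why the prediction errors remain spatially correlated while the residuals should not be.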
- Construct a univariate Moran scatter plot for the residuals and prediction errors (Space>Univariate Moran)
  - Residuals: should be close to 0 since spatial autocorrelation has been purged from the model or, alternatively phrased, captured in the ρ or λ parameter
  - Prediction Errors: is about the same as the original OLS MI statistic
    - This is okay since, by definition, they are spatially correlated; the prediction errors are an estimate for the spatially transformed errors
- Compare the scatter plot of the lag and error model residuals
  - What does this comparison indicate?

Things to Consider.
- Which model, given all of the information we've explored, is a better fit for our data?
- What does this model selection mean, conceptually, in terms of our outcome variable?
- What, if any, substantive information is gained through spatial regression techniques?
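When weighing which model fits better, it can help to tabulate the fit statistics from the OLS, lag, and error output files side by side. The sketch below assumes the standard definitions AIC = -2L + 2K and SC = -2L + K ln(N), where L is the log likelihood, K the number of parameters, and N the number of observations. The function names and all numbers are hypothetical, not results from the lab data.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: lower is better."""
    return -2.0 * log_lik + 2.0 * k


def schwarz(log_lik, k, n):
    """Schwarz Criterion: lower is better."""
    return -2.0 * log_lik + k * math.log(n)


def rank_models(fits, n):
    """Sort models by AIC, best first.

    fits maps a model label to (log_likelihood, number_of_parameters),
    both copied by hand from each output file.
    """
    rows = [(name, ll, aic(ll, k), schwarz(ll, k, n))
            for name, (ll, k) in fits.items()]
    return sorted(rows, key=lambda row: row[2])


# Made-up values for illustration only.
fits = {"OLS1": (-520.0, 5), "LAG1": (-495.0, 6), "ERR1": (-490.0, 5)}
for name, ll, a, sc in rank_models(fits, n=1000):
    print(f"{name}: LL={ll:.1f}  AIC={a:.1f}  SC={sc:.1f}")
```

Compare only models fitted to the same dependent variable and the same observations; the criteria are not comparable across SQRTPPOV and PPOV.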