Introduction to Spatial Statistics and Modeling for Regional Analysis Dr. Xinyue Ye, Assistant Professor Center for Regional Development (Department of Commerce EDA University Center) & School of Earth, Environment and Society, Bowling Green State University, OH Email: xye@bgsu.edu http://personal.bgsu.edu/~xye/ NARSC 2011
Acknowledgement: GeoDa Center, Arizona State University; China Data Center, University of Michigan
Outline Space ESDA (Moran, LISA) Spatial Modeling (Gravity Model, Spatial Regression) Some new advances (open source package <code as the text>, comparative space time dynamics)
Space: the importance of space to many socioeconomic theories (Goodchild et al., 2000). "... GIScience is applicable to varying degrees in any space, ... such as the three-dimensional space of the human brain, ..." (Goodchild, 2006). Spatial statistics and modeling take space into account.
Spatial Analysis for Regional Development: All activities have both spatial and temporal dimensions that cannot be meaningfully separated (Miller, 2004). Integrating space and time generates much closer interaction between geography and the other social sciences, while providing new perspectives on the role of geography in regional development. The increasing availability of space-time data has outpaced the development of space-time analytical techniques across the social sciences. How spatial and temporal factors are treated in studies of economic growth determines how both reality and the theoretical framework are examined. Space-time analysis and data are equally important in regional analysis: spatial intelligence for regional analysis.
Hence, Bode and Rey (2006) call for further research on integrating space into formal theoretical models of growth and convergence as well as on developing the next generation of analytical methods needed to implement those models. In addition, they maintain that these are the preconditions for reliable policy recommendations, one of the primary goals of economic research (Bode and Rey, 2006).
"Everything is related to everything else, but near things are more related than distant things" (Tobler, 1970). Spatial dependency and heterogeneity reflect the inherent nature of many socio-economic processes. Geo-referenced data (attribute and locational information) often exhibit spatial dependency and spatial heterogeneity. Spatial relationships are modelled with spatial weight matrices, which encode similarities (e.g. neighbourhood matrices) or dissimilarities (distance matrices) between spatial objects.
Spatial weight matrix: $W = [w_{ij}]$, with $w_{ij} = 0$ if $i = j$ and $w_{ij} > 0$ if $i$ and $j$ are spatially connected. If $w^{*}_{ij} = w_{ij} / \sum_j w_{ij}$, then $W^{*}$ is called row-standardized. W can measure similarity (e.g. connectivity) or dissimilarity (distances).
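As a minimal sketch of the row-standardization defined above, the following uses NumPy on a small binary contiguity matrix. The four-unit layout and the function name `row_standardize` are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def row_standardize(W):
    """Return W* with w*_ij = w_ij / sum_j w_ij.

    Rows that sum to zero (units with no neighbors) are left as zeros.
    """
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    # Guard against division by zero for isolated units.
    safe = np.where(row_sums == 0, 1.0, row_sums)
    return W / safe

# Hypothetical example: four units arranged on a line, 0-1-2-3,
# with binary contiguity weights and a zero diagonal.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
W_star = row_standardize(W)
print(W_star)
```

After row-standardization each non-empty row sums to one, so multiplying W* by a variable gives the average of that variable over each unit's neighbors.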
Exploratory data analysis (EDA): the process of getting acquainted with data, interactively formulating hypotheses instead of testing them. Exploratory spatial data analysis (ESDA) can reveal complex spatial phenomena not identified otherwise (Anselin, 1993), and it forms the basis for formulating spatially explicit research questions.
Tools and understanding: Tukey (1977) states that a detective investigating a crime needs both tools and understanding. If he has no fingerprint powder, he will fail to find fingerprints on most surfaces. If he does not understand where the criminal is likely to have put his fingers, he will not look in the right places. Equally, the analyst of data needs both tools and understanding. "Tools" implies both ready-to-use software executables and prototype approaches, while "understanding" signifies knowledge of the principles for choosing appropriate tools. Hence, without cross-fertilization between spatial statistics and regional science, spatial statistics will remain a technique instead of a solution to pressing regional development issues (Rey and Ye, 2010).
To detect spatial patterns, some standard global and newer local spatial statistics have been developed. These include Moran's I and Geary's C (see Cliff and Ord 1973, 1981), G statistics (Getis 1992), and LISA (Anselin 1995), among others.
Global measures: a single value for the entire data set. Local measures: a unique number for each location. An equivalent local measure can be calculated for most global measures. Moran's I is the most commonly used, and its local version is called Anselin's LISA.
Global Moran's I: $I = \dfrac{N}{\sum_i \sum_j W_{ij}} \cdot \dfrac{\sum_i \sum_j W_{ij} (X_i - \bar{X})(X_j - \bar{X})}{\sum_i (X_i - \bar{X})^2}$
Where $N$ is the number of cases, $X_i$ is the variable value at a particular location, $X_j$ is the variable value at another location, $\bar{X}$ is the mean of the variable, and $W_{ij}$ is a weight indexing the location of $i$ relative to $j$. Moran's I applies to a continuous variable for polygons or points. It ranges between -1.0 and +1.0. A value of 0 indicates no spatial autocorrelation (approximately: technically the expected value is $-1/(N-1)$). When autocorrelation is high, the I coefficient is close to 1 or -1. Negative/positive values indicate negative/positive autocorrelation. Think of it as the correlation between neighboring values on a variable. More precisely, it is the correlation between a variable, X, and the spatial lag of X formed by averaging all the values of X for the neighboring polygons.
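The global Moran's I formula can be sketched directly in NumPy. This is a minimal illustration rather than a reference implementation: the function name `morans_i`, the four-unit line layout, and the two test vectors are assumptions made for the example.

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I for values x and spatial weight matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    n = x.size
    cross = z @ W @ z                # sum_i sum_j W_ij z_i z_j
    total_ss = (z ** 2).sum()        # sum_i z_i^2
    return (n / W.sum()) * (cross / total_ss)

# Hypothetical binary contiguity for four units on a line: 0-1-2-3.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# A smooth gradient: neighbors are similar, so I is positive.
I_trend = morans_i([1, 2, 3, 4], W)
# An alternating pattern: neighbors are dissimilar, so I is negative.
I_alt = morans_i([1, -1, 1, -1], W)
print(I_trend, I_alt)
```

The two inputs illustrate the sign convention: similar neighboring values give positive autocorrelation, dissimilar neighboring values give negative autocorrelation.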
Calculating Anselin's LISA: the local Moran statistic for areal unit $i$ is $I_i = z_i \sum_j w_{ij} z_j$, where $z_i$ is the original variable $x_i$ in standardized form, $z_i = (x_i - \bar{x}) / SD_x$ (it can also be in deviation form, $x_i - \bar{x}$), and $w_{ij}$ is the spatial weight. The summation over $j$ runs across row $i$ of the spatial weights matrix.
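A minimal sketch of the local Moran statistic in standardized form follows. The function name `local_moran`, the example data, and the choice of the population standard deviation (NumPy's default, `ddof=0`) are assumptions for illustration; the slides do not specify which standard deviation is used.

```python
import numpy as np

def local_moran(x, W):
    """Local Moran I_i = z_i * sum_j w_ij z_j with standardized z."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    # Standardize with the population SD (ddof=0); this is an assumption.
    z = (x - x.mean()) / x.std()
    lag = W @ z                      # row-wise weighted sum of neighbors' z
    return z * lag

# Hypothetical row-standardized contiguity for four units on a line.
W_star = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
x = [1, 2, 3, 4]
I_local = local_moran(x, W_star)
print(I_local)
print(I_local.mean())
```

Under these particular choices (row-standardized weights with no isolated units, population SD), the mean of the local values equals the global Moran's I, which is one way to see the local statistic as a decomposition of the global one.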
Generic formulation. Local: $G_i = \sum_{j=1}^{n} w_{ij} a_{ij}$. Global: $G = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} a_{ij}$, where $w_{ij}$ is the spatial proximity between $i$ and $j$, and $a_{ij}$ is the measured relation between an object and its neighbors.
Gravity Model
Spatial Regression Models: ESDA and OLS diagnostics reveal the existence of spatial autocorrelation. Identify the source. (1) Regression residuals (LM-Error): a mismatch between the process and the spatial units leads to systematic errors that are correlated across spatial units. (2) Dependent variable (LM-Lag): the underlying socio-economic process has led to a clustered distribution of variable values, so neighboring values influence unit values.
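Spatial Autocorrelation in Residuals => Spatial Error Model: $y = X\beta + \varepsilon$, $\varepsilon = \lambda W \varepsilon + \xi$. Here $\varepsilon$ is the vector of error terms, which enters spatially weighted as $W\varepsilon$; $\lambda$ is the spatial autoregressive coefficient on the errors; and $\xi$ is the vector of uncorrelated, homoskedastic errors. The model incorporates spatial effects through the error term.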
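To make the error structure concrete, the sketch below simulates data from a spatial error model by solving $(I - \lambda W)\varepsilon = \xi$ for $\varepsilon$. All numbers here (the weights, $\lambda = 0.5$, $\beta$, the random seed) are hypothetical values chosen for illustration; this shows the data-generating process only, not an estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical row-standardized contiguity for four units on a line.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
n = W.shape[0]
lam = 0.5                                   # assumed error coefficient
beta = np.array([1.0, 2.0])                 # assumed regression coefficients

X = np.column_stack([np.ones(n), rng.normal(size=n)])
xi = rng.normal(size=n)                     # uncorrelated, homoskedastic errors

# eps = lam * W @ eps + xi  is equivalent to  (I - lam W) eps = xi.
eps = np.linalg.solve(np.eye(n) - lam * W, xi)
y = X @ beta + eps
print(y)
```

Because each unit's error depends on its neighbors' errors, OLS residuals from such data tend to be spatially autocorrelated even though the regression part $X\beta$ is correctly specified.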
Spatial Autocorrelation in Dep. Variable => Spatial Lag Model: $y = \rho W y + X\beta + \varepsilon$. Here $y$ is the vector of the dependent variable, $Wy$ is its spatial lag (the weighted values of neighboring units), and $\rho$ is the spatial autoregressive coefficient. The model incorporates spatial effects by including a spatially lagged dependent variable as an additional predictor.
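Because $y$ appears on both sides of the spatial lag model, it helps to see the reduced form $y = (I - \rho W)^{-1}(X\beta + \varepsilon)$. The sketch below simulates from it; the weights, $\rho = 0.4$, $\beta$, and the random seed are hypothetical values for illustration, and no estimation is attempted.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical row-standardized contiguity for four units on a line.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
n = W.shape[0]
rho = 0.4                                   # assumed spatial lag coefficient
beta = np.array([1.0, 2.0])                 # assumed regression coefficients

X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = rng.normal(size=n)

# Solve (I - rho W) y = X beta + eps for y (the reduced form).
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)
spatial_lag = W @ y                         # average of neighbors' y
print(y, spatial_lag)
```

The reduced form shows that a change in X at one unit spreads to its neighbors through $(I - \rho W)^{-1}$, which is why the lag term cannot be treated as an ordinary exogenous regressor.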
Some new advances: comparative space-time analysis of regional systems. The two data sets are relative per capita gross domestic product (GDP)/income over the 1978-1998 period at the province (mainland China) and state (contiguous United States) levels.
Some Highlights: Comparative analyses between different economic systems currently lack an inferential basis. There are lasting debates among the convergence, divergence, inverted-U, and Neo-Marxist uneven development schools; however, the findings are mixed and sometimes conflicting. Other issues include the presence of spatial dependence and the partitioning of the economic units, and the treatment of space and time.
Distance-Based Local Markov Transition Local indicators of spatial autocorrelation (LISA) LISA transition between two time points
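One simple way to picture a LISA transition between two time points is to classify each unit into a Moran scatterplot quadrant (High-High, Low-High, Low-Low, High-Low) at each date and count the moves. The sketch below does only that; it is a simplified illustration with hypothetical data and function names, and it is not claimed to reproduce the distance-based method used in the presentation.

```python
import numpy as np

QUADRANTS = ("HH", "LH", "LL", "HL")

def quadrants(x, W):
    """Moran scatterplot quadrant index (into QUADRANTS) for each unit."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()                     # own value relative to the mean
    lag = np.asarray(W, dtype=float) @ z  # neighbors' average deviation
    q = np.empty(x.size, dtype=int)
    q[(z >= 0) & (lag >= 0)] = 0         # High-High
    q[(z < 0) & (lag >= 0)] = 1          # Low-High
    q[(z < 0) & (lag < 0)] = 2           # Low-Low
    q[(z >= 0) & (lag < 0)] = 3          # High-Low
    return q

def lisa_transitions(x_t0, x_t1, W):
    """4x4 count matrix: rows = quadrant at t0, columns = quadrant at t1."""
    q0, q1 = quadrants(x_t0, W), quadrants(x_t1, W)
    counts = np.zeros((4, 4), dtype=int)
    np.add.at(counts, (q0, q1), 1)
    return counts

# Hypothetical row-standardized contiguity for four units on a line.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
counts = lisa_transitions([1, 2, 3, 4], [1, 4, 2, 3], W)
print(QUADRANTS)
print(counts)
```

Dividing each row of the count matrix by its row total would give estimated transition probabilities between quadrants.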
move from Low-High section to Low-Low section?
LISA time paths in China (left) and the United States (right)
Inferential Approaches random labeling and spatial permutations of the relative values for two maps (two regional systems) simultaneously.
relative mobility in a Classic (and Local Moran) Markov transition matrix
Relative Mobility of Classic and Local Moran Markov in China and the U.S. (999 permutations)
Covariance networks of per capita incomes in China and the United States, 1978-1998
Spider graphs of Zhejiang Province (China) and California (the United States)
The convex hulls of poor regions in China and the United States, 1978 and 1998
Special issues
2012 Ye, X. and L. Liu (eds.) Special Issue on Spatial Crime Analysis and Modeling, Annals of GIS
2012 Wei, Y.H.D. and Ye, X. (eds.) Special Issue on Urbanization, Land Use, and Sustainable Development in China, Stochastic Environmental Research & Risk Assessment
2012 Bao, S., Ye, X. and Li, B. (eds.) Special Issue on Spatial Intelligence for Urban and Regional Analysis, International Journal of Applied Geospatial Research
2011 Neil, R., Carroll, M. and Ye, X. (eds.) Special Issue on Recession, Resilience, and Recovery, Economic Development Quarterly
2011 Ye, X. and Y.H.D. Wei (eds.) Special Issue on Globalization, Regional Development, and Public Policy in Asia, Regional Science Policy & Practice