Context-dependent spatial analysis: A role for GIS?

J Geograph Syst (2000) 2:71±76 ( Springer-Verlag 2000 Context-dependent spatial analysis: A role for GIS? A. Stewart Fotheringham Department of Geography, University of Newcastle, Newcastle-upon-Tyne NE1 7RU, United Kingdom (e-mail: Stewart.Fotheringham@ncl.ac.uk) Abstract. The relationship between spatial analysis and GIS has been a debating point for over a decade. Some see GIS as leading the way to a new era in which the desire and the ability to analyse spatial data is widespread, not just in geography but throughout all disciplines in which spatial data are encountered. Others see GIS as simply providing a medium for the recycling of out-dated spatial analytical techniques and models. This paper describes a role for GIS through the development of new forms of `local' or `contextdependent' spatial analytical methods in which the focus is on exceptions to the general trend represented by the more traditional `global' methods. Key words: Local, global, context, spatial analysis, GIS 1 Introduction Although my research interests have meandered between spatial interaction/ spatial choice modelling and spatial statistics, a common theme to much of my work has been an interest in identifying and understanding di erences across space rather than similarities. The development of spatial disaggregations of global statistics was the subject of my PhD at McMaster University on origin-speci c distance-decay parameters in spatial interaction models (still a major interest) and it is the subject of recent research at the University of Newcastle on Geographically Weighted Regression. This concern for the local and the importance of context encompasses the dissection of global statistics into their local constituents; the concentration on local exceptions rather than the search for global regularities; and the production of local or mappable statistics rather than on `whole-map' values. This trend is important not only because it brings issues of space to the fore in analytical methods, but also because it refutes the rather naive criticism that quantitative geography is unduly concerned with the search for global generalities and `laws'. The raison d'etre for the development of local statistics is the low probability in many situations that the `average' results obtained from the analysis of a spatial data set drawn from a broad region apply equally to all parts of

72 A.S. Fotheringham that region, the assumption of traditional global statistics. Simply reporting one global set of results and ignoring any possible spatial variations in those results is as limiting as reporting a mean value of a spatial distribution without seeing a map of the data. Consequently, a variety of local statistical techniques for spatial data have been developed that allow the investigation of spatial variations in relationships and, consequently, the generation of vastly increased amounts of information on spatial processes. Obviously, local forms of spatial analysis are important to GIS because they result in geocoded output that can be mapped. Traditional statistics and models usually produce a single output, such as a regression coe½cient or a goodness-of- t statistic. Consequently, there is little scope, and indeed there seems little point, in integrating such techniques within a GIS (Fotheringham 2000; Wegener and Fotheringham 2000). However, local or context-dependent forms of statistics and models produce output for a set of geocoded locations so that this output can be mapped, linked to other displays, and queried, using the standard features of a GIS. While the development of local statistics has taken place completely independently of GIS in disciplines such as statistics, political science and economics, some of the impetus for the development of local statistics in geography has resulted from the growing interest in integrating advanced forms of spatial analysis and GIS ( Fotheringham and Wegener 2000). The remainder of this paper discusses the nature of local variations in relationships and then describes some local techniques for the analysis of spatial data. 2 The nature of local variations in relationships Suppose we had a set of spatial data for a region and we used these data to calibrate a regression model (it does not matter what type of regression model). The classic approach would be to use all the data in a single calibration of the model. This calibration would produce a single estimate of each parameter in the model depicting an `average' relationship across the region. It is implicitly assumed that the relationship being examined does not vary over the study region and that the computed `average' parameter estimate is therefore a good representation of the relationship for all parts of the region. One rather crude way to examine the robustness of this assumption would be to subdivide the study region into smaller spatial units and to calibrate the regression model separately with the data from each smaller unit. The degree to which the local parameter estimates varied over space would indicate whether our assumption of spatial stationarity was justi ed. Of course, we would expect there to be some variation in the parameter estimates if we were to do this simply because of random sampling variations. The contribution of this source of spatial non-stationarity is not usually of great interest but it does need to be accounted for in assessing the statistical signi cance of any spatial variation in results. That is, we are only interested in relatively large variations in parameter estimates which are unlikely to be caused by random sampling and which therefore constitute interesting spatial non-stationarity. The task of local statistics is to highlight such interesting spatial non-stationarity where it exists. Suppose that our experiment demonstrated that there were some signi cant variations in a particular relationship over space. What would be the

Context-dependent spatial analysis 73 interpretation of such variation? Actually there are two, contradictory interpretations, both of which, however, have important implications for spatial analysis. One is that, for whatever reasons, some relationships are intrinsically di erent across space. Perhaps, for example, there are spatial variations in people's attitudes or preferences or there are di erent administrative, political or other contextual issues that produce di erent responses to the same stimuli over space. This idea that human behaviour can vary intrinsically over space is consistent with post-modernist beliefs on the importance of place and locality as frames for understanding such behaviour. The identi cation of these local variations would then be a useful pre-cursor to more intensive studies which would attempt to highlight why such di erences occur. The other reason why measured relationships might exhibit signi cant spatial non-stationarity is that the model from which the relationships are estimated is a gross misspeci cation of reality and that one or more relevant variables are either omitted from the model or are represented by an incorrect functional form. This view, more in line with the positivist school of thought, runs counter to that discussed above in that it assumes a global statement of behaviour can be made but that the structure of our model is not su½ciently well-formed to allow us to make it. In this case mapping local statistics is useful in order to understand the nature of the model misspeci cation more clearly. For what parts of the study region does the model replicate observed data less accurately and does the spatial distribution of these parts suggest the addition of an extra explanatory variable to the model? In this sense, local analysis is seen as a model-building procedure in which the ultimate goal is to produce a global model that exhibits no signi cant spatial non-stationarity. 3 Attempts to measure local variations in relationships Within the last several years, there has been a relatively urry of academic work re ecting the calls of Fotheringham and Rogerson (1993) and Openshaw (1993) for greater attention to be given to local or mappable statistics. These attempts are reviewed in Fotheringham and Brunsdon (1999) and are not described in depth here. It is su½cient to state that local statistics are becoming of great interest in the analysis of spatial data across a variety of disciplines and they can be found in many di erent forms. For example, until relatively recently most applications of spatial point pattern analysis involved the calculation of some global statistic that described the whole point pattern and from which a conclusion was reached related to the clustered, dispersed or random nature of the whole pattern. Clearly, such an analysis is potentially awed in that interesting spatial variations in the point pattern are subsumed in the calculation of the average or global statistic. In many instances, particularly in the study of disease, such an approach would appear to be contrary to the purpose of the study, namely to identify any interesting local clusters. Amongst the rst example of a local point pattern analysis technique was the Geographical Analysis Machine (GAM) developed by Openshaw et al. (1987). This technique allows point patterns to be examined over small regions of the map and interesting local clusters can then be depicted on the map. If these results are displayed using a GIS, the interesting local clusters can be queried or interrogated in the usual way. More formal local statistics for univariate data have recently been developed by Getis and Ord (1992), Ord and Getis (1995) and by Anselin (1995).

74 A.S. Fotheringham The increasing availability of large and complex spatial datasets has led to several attempts being made to produce localised versions of traditionally global multivariate techniques, with perhaps the greatest challenge being to produce local versions of regression analysis. One of the earliest attempts to do this is the expansion method (Casetti 1972; Jones and Casetti 1992) which attempts to measure parameter `drift'. In this framework, parameters of a global model are expanded in terms of other attributes. If the parameters of the regression model are made functions of geographic space, trends in parameter estimates over space can then be measured ( Eldridge and Jones 1991). More recently, Geographically Weighted Regression (GWR) (Brunsdon et al. 1996; 1998; Fotheringham et al. 1997, 1999) has been developed to extend the traditional regression framework by allowing local rather than global parameters to be estimated. That is, the model to be estimated has the general form: y i ˆ a i0 X k a ik x ik e i 1 where y represents the dependent variable, x k represents the kth independent variable, e represents an error term and a ik is the value of the kth parameter at location i. In the calibration of this model it is assumed that observed data near to point i have more in uence in the estimation of the a ik s than do data located farther from point i. In essence, the equation measures the relationships inherent in the model around each point i. This `sphere of local in uence' is calibrated within the GWR framework by nding the sphere of local in uence that maximises the t to the data. Thus the weighting of any data point in GWR is not constant but varies with the point i at which the regression parameters are being estimated. Data from observations closer to i are weighted more heavily than those from farther away. Hence the estimator for the parameters in GWR is: ^a i ˆ x t w i x 1 x t w i y 2 where w i is an n by n matrix whose o -diagonal elements are zero and whose diagonal elements denote the geographical weighting of observed data for point i. Various types of spatial weighting functions can be used to de ne the weights in this procedure although in practice the exact form of the weighting function is relatively unimportant compared with the determination of the appropriate rate of distance-decay in whatever function is used. It is quite possible within GWR to have a separate weighting function for each point at which the local regression model is calibrated. In this way, the sphere of local in uence can be very localised in areas rich in data points and can extend in areas where data points become more sparse. It should be noted that as well as producing localised parameter estimates, the GWR technique described above will produce localised versions of all standard regression diagnostics including goodness-of- t measures such as r-squared. The latter can be particularly informative in understanding the application of the model being calibrated and in exploring the possibility of adding additional explanatory variables to the model. A list of recent GWR publications and code is provided at the GWR web site: www.ncl.ac.uk/geography/gwr/. It is also worth noting that

Context-dependent spatial analysis 75 the concept of geographical weighting can be applied to many statistical techniques and not just to a regression model. In this way it would be possible to derive a suite of local statistics, each of which mirrors a traditional global statistic. For instance, it is possible to derive a geographically weighted mean, a geographically weighted correlation coe½cient, and perform a geographically weighted principle components analysis. A nal example of a local modelling technique, and perhaps one of the earliest, is the spatial disaggregation of spatial ow or spatial interaction models ( Fotheringham 1981). That is, these models can be calibrated separately for each origin in a system (or for each destination if that seems useful) to provide information on spatial variations in the determinants of destination choice. From an accumulation of mapped examples of origin-speci c parameter estimates from spatial interaction models, it has been possible to identify the source of a severe misspeci cation bias in the general spatial interaction modelling formula (Fotheringham, 1984, 1986). It is worth stressing that such identi cation only came to light through an investigation of spatial variations in localised parameters and would have been missed in the calibration of a global model. The development of what have become known as `competing destinations' models in this way is a good example of the use of local models to identify misspeci cation in global models and to improve on global statements of spatial behaviour. 4 Summary The development of `local' rather than the `global' in spatial analysis and modelling is interesting for several reasons. First, it demonstrates a concern for identifying local exceptions rather than broad generalisations. Second, it provides an important link between statistical analysis and the potentially powerful visual display environments of various GIS and statistical graphics packages where the all-important display is the map. Local statistics should be mapped to highlight spatial trends, spatial variations and local areas of interest. Third, the development of local statistics allows relationships to be explored in more detail and will inevitably lead to a better understanding of spatial processes. References Anselin L (1995) Local indicators of spatial association ± LISA. Geographical Analysis 27:93±115 Brunsdon CF, Fotheringham AS, Charlton ME (1996) Geographically weighted regression: A method for exploring spatial nonstationarity. Geographical Analysis 28:281±298 Brunsdon CF, Fotheringham AS, Charlton ME (1998) Spatial nonstationarity and autoregressive models. Environment and Planning A 30:957±973 Casetti E (1972) Generating models by the expansion method: Applications to geographic research. Geographical Analysis 4:81±91 Eldridge JD, Jones III JP (1991) Warped space: a geography of distance decay Professional Geographer 43:500±511 Fotheringham AS (1981) Spatial structure and distance-decay parameters, Annals of the Association of American Geographers 71:425±436 Fotheringham AS (1984) Spatial ows and spatial patterns. Environment and Planning A 16:529± 543

76 A.S. Fotheringham Fotheringham AS (1986) Modelling hierarchical destination choice. Environment and Planning A 18:401±418 Fotheringham AS (2000) GIS-Based spatial modelling: a step forwards or a step backwards? In: Fotheringham AS, Wegener M (eds.) Spatial models and GIS: New potential and new models. Taylor and Francis, London, pp. 21±30 Fotheringham AS, Brunsdon CF (1999) Local forms of spatial analysis. Geographical Analysis (in press) Fotheringham AS, Wegener M (eds.) (2000) Spatial models and GIS: New potential and new models. Taylor and Francis, London Fotheringham AS, Brunsdon FC, Charlton ME (1997) Measuring spatial variations in relationships with geographically weighted regression. In: Fischer MM, Getis A (eds.) Recent developments in spatial analysis, spatial statistics, behavioural modelling and computational intelligence. Springer, Berlin Heidelberg New York, pp. 60±82 Fotheringham AS, Brunsdon CF, Charlton ME (1999) Geographically weighted regression: A natural evolution of the expansion method for spatial data analysis. Environment and Planning A 30:1905±1927 Getis A, Ord JK (1992) The analysis of spatial association by use of distance statistics. Geographical Analysis 24: 189±206 Jones JP, Casetti E (1992) Applications of the expansion method. Routledge, London Openshaw S (1993) Exploratory space-time-attribute pattern analysers. In: Fotheringham AS, Rogerson PA (eds.) Spatial Analysis and GIS, Taylor and Francis, London, pp. 147±163 Openshaw S, Charlton ME, Wymer C, Craft AW (1987) A mark I geographical analysis machine for the automated analysis of point data sets. International Journal of Geographical Information Systems 1:359±377 Ord JK, Getis A (1995) Local spatial autocorrelation statistics: Distributional issues and an application. Geographical Analysis 27:286±306 Wegener M, Fotheringham AS (2000) New spatial models: Achievements and challenges. In: Fotheringham AS, Wegener M (eds.) Spatial models and GIS: New potential and new models. Taylor and Francis, London, pp. 261±269.