Formalization of GIS functionality

Over the past four decades humans have invested significantly in the construction of tools for handling digital representations of spaces and their contents. These include architectural design tools such as Autodesk's AutoCAD, geographic information systems (GIS; Longley et al., 2010) such as Esri's ArcGIS, and tools for the manipulation of images such as Adobe's Photoshop. All of them support methods for acquiring and constructing representations, as well as tools for analysis, modeling, visualization, and many other functions.

The various software packages that were developed over the same period for statistical analysis, such as S or SAS, implemented the body of statistical methodology and theory that had evolved over the previous century or more, with its established terminology. As a result there is substantial consistency between packages in how they implement standard tests such as Student's t. With the obvious exception of geometry, however, the body of theory available to developers of spatial technologies was much more limited (Goodchild, 1988a), and there was no equivalent of the extensive programs of statistical instruction available in institutions of higher education. Instead, terms were often invented as needed, and sometimes reflect physical analogies, the details of data, or the processing steps of algorithms rather than rigorously defined principles or the conceptual objectives that lie behind specific functions. In the case of GIS, the term "overlay" referred to the operation of combining two or more layers of geographic information, with obvious allusion to the physical overlaying of transparent maps, while even terms with rigorous definitions in established disciplines, such as "centroid" or "topology", were varied in meaning by an emerging community that believed it owed little to established theory.
Digital spatial technologies had evolved enormously by the turn of the new century, and it had become possible to assert with some confidence that these technologies were capable of virtually any conceivable operation on spatial data and, in effect, that they allowed their users to explore a large number of fundamental spatial concepts. Substantial standardization of representations had been achieved through the efforts of consensus groups such as the Open Geospatial Consortium, allowing significant interoperability between systems, at least at the syntactic level. Very little standardization had been achieved, however, in the domain of functionality. Many spatial tools, especially GIS, had acquired a reputation for being difficult to learn and use, with complex and poorly designed user interfaces and little in the way of conceptual frameworks or organizing principles. We posit that the interfaces of spatial software provide us with a natural experiment in the identification of fundamental spatial concepts. As users have found more and more spatial concepts to explore, developers have responded with implementations, but without any overall design, any guiding principles of granularity, or any basic functional taxonomy. We use the example of GIS to illustrate this argument, and review prior attempts to bring some structure to GIS interface design.

Esri's most recent version of its desktop product, ArcMap, includes ArcToolbox, a collection of functions to manipulate the representation of the geographic information contained in its database. The toolbox is organized under the 18 headings shown in Table 1.

Title                            Count of functions
3D Analyst Tools                                 34
Analysis Tools                                   19
Cartography Tools                                43
Conversion Tools                                 46
Data Interoperability Tools                       2
Data Management Tools                           178
Editing Tools                                     7
Geocoding Tools                                   7
Geostatistical Analyst Tools                     22
Linear Referencing Tools                          7
Multidimension Tools                              7
Network Analyst Tools                            21
Parcel Fabric Tools                               4
Schematics Tools                                  5
Server Tools                                     14
Spatial Analyst Tools                           171
Spatial Statistics Tools                         26
Tracking Analyst Tools                            2
Total                                           615

Table 1: Organization of GIS functions in the ArcMap 10 Toolbox

While some of these headings are distinct, it would be virtually impossible for a user to determine whether a given spatial concept would be regarded by Esri's designers as an instance of 3D analysis, analysis, geostatistical analysis, spatial analysis, or spatial statistics; and many other categories overlap substantially. Moreover, the extreme variation in counts, from 2 to 178, suggests that the headings do a poor job of organizing the functions into comparable groups.

Various efforts have been made by the GIS research community to bring some semblance of order to this apparent chaos. Of these, perhaps the most successful is the map algebra developed initially by Tomlin (1990) and similar in many respects to the image algebras of image processing (e.g., Ritter and Wilson, 2001). Operations on raster representations were organized into four distinct types: local operations, performed on each raster cell or pixel independently; focal operations, which compare each pixel with its immediate neighbors; zonal operations, performed on contiguous cells sharing a common attribute; and global operations, performed on all cells. The scheme was sufficiently general to be adopted by several software packages, including elements of Esri's ArcGIS. Significant extensions were later made by van Deursen (1995), Takeyama and Couclelis (1997), and others.
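The four types of raster operation described above can be illustrated with a minimal sketch in plain Python. It is not Tomlin's notation or any package's API: the function names, the 4 x 4 raster, and the zone labels are all hypothetical, the focal neighborhood is assumed to be the cell plus its up to eight adjacent cells, and zones are identified simply by a shared label (each label happens to form one contiguous block in the sample data).

```python
# Hypothetical 4x4 raster of values and a matching raster of zone labels.
values = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
zones = [
    ["a", "a", "b", "b"],
    ["a", "a", "b", "b"],
    ["c", "c", "b", "b"],
    ["c", "c", "c", "c"],
]


def local_op(grid, fn):
    """Local: apply fn to each cell independently of all other cells."""
    return [[fn(v) for v in row] for row in grid]


def focal_mean(grid):
    """Focal: mean of each cell together with its immediate neighbors."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Clip the 3x3 window at the raster edges.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_sum(grid, zone_grid):
    """Zonal: sum the values of all cells that share a zone label."""
    totals = {}
    for row, zone_row in zip(grid, zone_grid):
        for v, z in zip(row, zone_row):
            totals[z] = totals.get(z, 0) + v
    return totals


def global_max(grid):
    """Global: a single result computed from every cell in the raster."""
    return max(v for row in grid for v in row)


doubled = local_op(values, lambda v: v * 2)   # one output cell per input cell
smoothed = focal_mean(values)                 # one output cell per neighborhood
per_zone = zonal_sum(values, zones)           # one result per zone
highest = global_max(values)                  # one result for the whole raster
```

The sketch makes the point of the surrounding discussion visible: the four types are distinguished by which cells an algorithm reads, not by what the user is trying to learn from the result.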

Note, however, that the scheme still places emphasis on the algorithmic operations of each function, rather than on its conceptual goals. The higher-level approaches of Kemp (1997a,b) and Vckovski (1998) to operations on continuous-field conceptualizations both come closer to the latter. Several taxonomies have also been proposed (see, for example, Berry, 1987; Dangermond, 1983; Maguire and Dangermond, 1991; Rhind and Green, 1988), though none is substantially more satisfactory than the ArcGIS taxonomy discussed earlier.

Goodchild (1988b) grounded his taxonomy in the GIS database, arguing that the latter formalized the representation of spatial data, and thus could provide a more formal grounding for a taxonomy of functions. His taxonomy is limited to discrete-object representations (for a discussion of the field/object dichotomy see, for example, Longley et al., 2010). Expressed in the terms of the Unified Modeling Language (e.g., Zeiler, 1999), it comprises six types of operation: (1) analyze the attributes of a single class of objects (statistical analysis); (2) analyze one class of objects using both locational and attribute information; (3) analyze the attributes of an association class; (4) analyze more than one class of objects; (5) create a new association class from existing classes; and (6) create a new class from one or more existing classes. The results of further analysis along these lines were presented in a later paper by Burrough (1992). With further development, one might seek to devise a test of the completeness of this scheme, and to create a more formal description.

Texts on spatial analysis using GIS (see, for example, O'Sullivan and Unwin, 2003; Bailey and Gatrell, 1995) also provide some guidance in this area. Of these, Mitchell (1999, 2005) perhaps comes closest to grounding GIS analysis in spatial concepts, organizing his discussion around the conceptual objectives of analysis rather than the algorithmic mechanics. An earlier example of Mitchell's approach can be found in the work of de Man (1988).
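To make the six operation types of the discrete-object taxonomy listed above more concrete, the following is one possible reading of them in Python. It is an illustration only, not the formulation of the cited paper: the `Feature` class, the two object classes (towns and clinics), their coordinates and values, and the particular analysis chosen for each type are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot
from statistics import mean


@dataclass
class Feature:
    """One discrete object: a point location plus a single attribute."""
    name: str
    x: float
    y: float
    value: float


# Two hypothetical object classes (all names and numbers are invented).
towns = [Feature("T1", 0, 0, 10), Feature("T2", 4, 0, 30)]
clinics = [Feature("C1", 0, 3, 1), Feature("C2", 4, 1, 2)]

# Type 1: analyze the attributes of a single class (statistical analysis).
mean_value = mean(t.value for t in towns)

# Type 2: analyze one class using both location and attributes,
# here an attribute-weighted mean center.
total = sum(t.value for t in towns)
center = (
    sum(t.x * t.value for t in towns) / total,
    sum(t.y * t.value for t in towns) / total,
)

# Type 5: create a new association class from existing classes,
# here one record per (town, clinic) pair carrying their distance.
distances = {
    (t.name, c.name): hypot(t.x - c.x, t.y - c.y)
    for t in towns
    for c in clinics
}

# Type 3: analyze the attributes of an association class.
longest_pair = max(distances, key=distances.get)

# Type 4: analyze more than one class of objects,
# here the nearest clinic to each town.
nearest = {
    t.name: min(clinics, key=lambda c: distances[(t.name, c.name)]).name
    for t in towns
}

# Type 6: create a new class from one or more existing classes,
# here the towns lying within 2 distance units of their nearest clinic.
served = [t for t in towns if distances[(t.name, nearest[t.name])] <= 2]
```

In this reading the association class is simply a table keyed by pairs of objects, which is why the distance table (type 5) is built before it is analyzed (type 3) or used to relate the two classes (types 4 and 6).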
Another line of attack, and perhaps the most formally grounded of all of these, is represented by the work of Albrecht (1995, 1998). Work on the representation of geographic information had, by the 1990s, resulted in a widely accepted and formal model. The Open Geospatial Consortium's Simple Feature Model is one expression of this consensus. Given this, it is tempting to explore the formalization of GIS functionality through its grounding in representation. Several of the spatial analysis texts, especially that of Bailey and Gatrell (1995), do this to a limited degree, but without the formal structure provided by the Simple Feature Model. Moreover, since the latter is concerned exclusively with operations on discrete objects, the approach misses the opportunity to include operations on fields, or on field/object combinations or transformations. Albrecht concludes in the 1998 paper that there are 20 universal analytical GIS operations.

References

Albrecht, J., 1995. Semantic net of universal elementary GIS functions. In Proceedings, AutoCarto 12, Charlotte, NC, pp. 235-244.

Albrecht, J., 1998. Universal analytical GIS operations: a task-oriented systematization of data structure-independent GIS functionality. In M. Craglia and H.J. Onsrud, editors, Geographic Information Research: Transatlantic Perspectives, pp. 577-591. London: CRC Press.

Bailey, T.C. and A.C. Gatrell, 1995. Interactive Spatial Data Analysis. Harlow, UK: Longman Scientific and Technical.

Berry, J.K., 1987. Fundamental operations in computer-assisted map analysis. International Journal of Geographical Information Systems 1: 119-136.

Burrough, P.A., 1992. Development of intelligent geographical information systems. International Journal of Geographical Information Systems 6(1): 1-12.

Dangermond, J., 1983. A classification of software components commonly used in geographic information systems. In D.J. Peuquet and J. O'Callaghan, editors, Design and Implementation of Computer-Based Geographic Information Systems, pp. 70-91. Amherst, NY: International Geographical Union, Commission on Geographical Data Sensing and Processing.

van Deursen, W.P.A., 1995. Geographical Information Systems and Dynamic Models: Development and Application of a Prototype Spatial Modelling Language. Utrecht: Koninklijk Nederlands Aardrijkskundig Genootschap/Faculteit Ruimtelijke Wetenschappen Universiteit Utrecht.

Goodchild, M.F., 1988a. A spatial analytic perspective on geographical information systems. International Journal of Geographical Information Systems 1: 327-334.

Goodchild, M.F., 1988b. Towards an enumeration and classification of GIS functions. Proceedings, International Geographic Information Systems (IGIS) Symposium: The Research Agenda II: 67-77. Washington, DC: National Aeronautics and Space Administration.

Kemp, K.K., 1997a. Fields as a framework for integrating GIS and environmental process models. Part one: Representing spatial continuity. Transactions in GIS 1(3): 219-234.

Kemp, K.K., 1997b. Fields as a framework for integrating GIS and environmental process models. Part two: Specifying field variables. Transactions in GIS 1(3): 235-246.

Longley, P.A., M.F. Goodchild, D.J. Maguire, and D.W. Rhind, 2010. Geographic Information Systems and Science. Third Edition. Hoboken, NJ: Wiley.

Maguire, D.J. and J. Dangermond, 1991. The functionality of GIS. In D.J. Maguire, M.F. Goodchild, and D.W. Rhind, editors, Geographical Information Systems: Principles and Applications 1: 319-335. Harlow, UK: Longman Scientific & Technical.

de Man, W.H.E., 1988. Establishing a geographical information system in relation to its use. A process of strategic choices. International Journal of Geographical Information Systems 2(3): 245-262.

Mitchell, A., 1999. The ESRI Guide to GIS Analysis. I. Geographic Patterns and Relationships. Redlands, CA: ESRI Press.

Mitchell, A., 2005. The ESRI Guide to GIS Analysis. II. Spatial Measurements and Statistics. Redlands, CA: ESRI Press.

O'Sullivan, D. and D.J. Unwin, 2003. Geographic Information Analysis. Hoboken, NJ: Wiley.

Rhind, D.W. and N.P.A. Green, 1988. Design of a geographical information system for a heterogeneous scientific community. International Journal of Geographical Information Systems 2(2): 171-190.

Ritter, G.X. and J.N. Wilson, 2001. Handbook of Computer Vision Algorithms in Image Algebra. Second Edition. Boca Raton: CRC Press.

Takeyama, M. and H. Couclelis, 1997. Map dynamics: Integrating cellular automata and GIS through geo-algebra. International Journal of Geographical Information Science 11(1): 73-91.

Tomlin, C.D., 1990. Geographic Information Systems and Cartographic Modeling. Englewood Cliffs, NJ: Prentice Hall.

Vckovski, A., 1998. Interoperable and Distributed Processing in GIS. London: Taylor and Francis.

Zeiler, M., 1999. Modeling Our World: The ESRI Guide to Geodatabase Design. Redlands, CA: ESRI Press.