THE PENNSYLVANIA STATE UNIVERSITY
SCHREYER HONORS COLLEGE

DEPARTMENT OF GEOGRAPHY

ASSESSING THE COGNITIVE ADEQUACY OF TOPOLOGICAL CALCULI: TRANSLATION VS. SCALING

JINLONG YANG
Spring 2011

A thesis submitted in partial fulfillment of the requirements for a baccalaureate degree in Geography with honors in Geography

Reviewed and approved* by the following:

Alexander Klippel
Assistant Professor of Geography
Thesis Supervisor

Roger M. Downs
Professor of Geography
Honors Adviser

* Signatures are on file in the Schreyer Honors College.

ABSTRACT

Movement patterns at the geographic scale are pervasive in the world we live in. Developing formalisms for capturing the spatio-temporal information of such movement patterns is becoming a central focus in spatial information science. To facilitate the meaningful interpretation of movement patterns, it is critical to design formalisms that are similar to humans' conceptualization of space for both static and dynamically changing spatial relations. The research reported in this thesis focuses on cognitively assessing the adequacy of topological calculi in capturing translation and scaling movements. The results show that topology plays a dominant role in conceptualizing geographic movement patterns, but that domain semantics influences the saliency of topologically distinguished ending relations.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
Chapter 1 Introduction
Chapter 2 Experiments
    Materials
    Participants
    Procedure
    Data Collection
Chapter 3 Results
    Basic statistics
    Cluster analysis
    Multi-dimensional scaling
    Grouping raw frequencies
    Linguistic Analysis
Chapter 4 Conclusions
Bibliography
Appendix A. ANOVA of the number of groups created over eight scenarios
Appendix B. ANOVA of the amount of time that participants spent on the grouping task over eight scenarios
Appendix C. Dendrograms of average linkage and complete linkage

LIST OF FIGURES

Figure 1. Conceptual neighborhood graph.
Figure 2. Design layout of icons in hurricane scenario and lake scenario.
Figure 3. Screenshots of the grouping interface of CatScan.
Figure 4. Box plots of the number of groups created by participants in the eight scenarios.
Figure 5. Box plot of the amount of time that participants spent on the grouping task over eight scenarios.
Figure 6. Dendrograms of four translation movement scenarios.
Figure 7. Dendrograms of four scaling movement scenarios.
Figure 8. MDS plots of four translation movement scenarios.
Figure 9. MDS plots of four scaling movement scenarios.
Figure 10. Screenshot of the KlipArt tool.
Figure 11. Line chart of the number of participants who placed all eight topological equivalence icons into the same group for each topologically distinguished ending relation over four translation scenarios.
Figure 12. Line chart of the number of participants who placed all eight topological equivalence icons into the same group for each topologically distinguished ending relation over four scaling scenarios.

LIST OF TABLES

Table 1. The design of icons in translation movement scenarios.
Table 2. The design of icons in scaling movement scenarios.
Table 3. An overview of participant information over eight scenarios.
Table 4. The number (and percentage) of participants who used terms relating to topology or domain semantics in each scenario.

ACKNOWLEDGEMENTS

The completion of this research would not have been possible without the support of many people. I would first like to thank my parents, who provided love, support, and encouragement throughout my studies. I am heartily thankful to my thesis supervisor, Dr. Alexander Klippel, for his continuous guidance, support, and encouragement over the past two years. I especially want to thank my honors adviser, Dr. Roger Downs, for his comments on my thesis. I am indebted to my colleagues Rui Li, who ran experiments and exchanged ideas with me, and Frank Hardisty, who supplied me with technical support. I owe special thanks to Ping Zhao for lending me a laptop when my own was out of order.

Chapter 1 Introduction

Humans live in a dynamic spatial world. It is, therefore, important to develop an understanding of how humans think about space and time and about situations in which tools such as maps and computers (e.g., GPS and GIS) assist spatio-temporal thinking processes. However, computers characterize space in a quantitative way (e.g., object A, a chair, is 3.45 meters from object B, the door), while humans tend to characterize space in a qualitative manner (e.g., object A is near object B). To bridge this gap, it is essential to develop formalisms that are similar to humans' characterizations of space. The study reported in this thesis focuses on assessing the cognitive adequacy of topology, a qualitative calculus for characterizing spatial relations, to model conceptualizations of two of the three major types of movement identified by Egenhofer and Al-Taha (1992): translation movement (e.g., a hurricane moving toward or across a peninsula) and scaling movement (e.g., a lake extending and shrinking due to rainfall).

Topology plays a central role in understanding the formal and cognitive characterization of movement patterns (Kurata & Egenhofer, 2009). On the one hand, it provides a way to filter out unnecessary details from humans' conceptualizations of spatial relations. On the other hand, topology has shown potential to be an essential cognitive invariant in geographic event conceptualization (Egenhofer & Mark, 1995). Topologically distinguished relations can be formally arranged in so-called conceptual neighborhood graphs (CNG) (Freksa, 1992). Research from cognitive science has shown that, in humans' conceptualization of movement patterns, the ending relation of a movement pattern is of critical importance (Regier & Zheng, 2007). Hence, the movement patterns we focus on are distinguished on the basis of the topological relations in which two spatially extended regions end.

We derived nine topologically distinguished ending relations from the two most prominent topological calculi in spatial information science, Egenhofer's 9-intersection model (Egenhofer & Herring, 1994) and the region connection calculus (RCC) (Randell, Cui, & Cohn, 1992). We focus on two spatially extended entities that are disconnected (DC) at the start. Depending on where the movement ends, nine ending relations are distinguished based on their conceptual paths through the conceptual neighborhood graph (Figure 1).

Figure 1. Conceptual neighborhood graph.

The nine topologically distinguished ending relations are elaborated below using two examples: a hurricane scenario (translation movement) and a lake scenario (scaling movement).

DC1. CNG path: DC. The hurricane stops before it touches the peninsula. The lake stops extending before it touches the house.

EC1. CNG path: DC → EC. The hurricane stops when it just touches the peninsula. The lake stops extending when it just touches the house.

PO1. CNG path: DC → EC → PO. The hurricane stops when half of its area overlaps the peninsula. The lake stops extending when it has engulfed half of the house.

TPP1. CNG path: DC → EC → PO → TPP. The hurricane stops when it just completely overlaps the peninsula but is still connected to the ocean. The lake stops extending when the house is just fully submerged by the lake.

NTPP. CNG path: DC → EC → PO → TPP → NTPP. The hurricane stops when it completely overlaps the peninsula. The lake stops extending when the house is completely submerged by the lake (the edge of the house is not attached to the edge of the lake).

TPP2. CNG path: DC → EC → PO → TPP → NTPP → TPP. Same as TPP1, but the hurricane connects to the other coast of the peninsula. The lake first extends until the house is fully engulfed, and then shrinks until it reaches the level of TPP again.

PO2. CNG path: DC → EC → PO → TPP → NTPP → TPP → PO. Same as PO1, but the hurricane stops when half of its area overlaps the other side of the peninsula. The lake first extends until the house is fully engulfed, and then shrinks until it reaches the level of PO again.

EC2. CNG path: DC → EC → PO → TPP → NTPP → TPP → PO → EC. Same as EC1, but the hurricane stops when it is externally connected to the other side of the peninsula. The lake first extends until the house is fully engulfed, and then shrinks until it reaches the level of EC again.

DC2. CNG path: DC → EC → PO → TPP → NTPP → TPP → PO → EC → DC. Same as DC1, but the hurricane stops when it is disconnected from the other side of the peninsula. The lake first extends until the house is fully engulfed, and then shrinks until it reaches the level of DC again.
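The nine ending relations listed above are the successive stopping points along a single path through the conceptual neighborhood graph. As a minimal sketch of that idea (the names `CNG_PATH` and `ending_relations` are illustrative and do not come from the experiment software), each ending relation can be derived as a prefix of the full path, with a suffix of 1 or 2 for relations that are visited twice:

```python
# Full path through the conceptual neighborhood graph, as described in the text.
CNG_PATH = ["DC", "EC", "PO", "TPP", "NTPP", "TPP", "PO", "EC", "DC"]


def ending_relations(path):
    """Return (label, path_prefix) pairs, one per possible stopping point."""
    total = {rel: path.count(rel) for rel in path}  # how often each relation occurs
    seen = {}
    result = []
    for i, rel in enumerate(path):
        seen[rel] = seen.get(rel, 0) + 1
        # Relations visited twice get a 1/2 suffix; NTPP occurs once and gets none.
        label = rel + (str(seen[rel]) if total[rel] > 1 else "")
        result.append((label, path[: i + 1]))
    return result


for label, prefix in ending_relations(CNG_PATH):
    print(f"{label:5s} {' -> '.join(prefix)}")
```

Representing each ending relation by its path prefix keeps the distinction between, for example, EC1 and EC2 explicit: both end in EC, but they differ in the relations passed through on the way.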

Two questions are addressed through the experiments reported here:

a) Does topology play a dominant role in conceptualizing different (geographic) movement patterns (i.e., translation and scaling)?
b) How does domain semantics influence the salience of topological relations in conceptualizing geographic events?

To shed light on these two questions, we designed animated stimuli of geographic events based on the nine topologically distinguished ending relations, and assessed the cognitive adequacy of topological calculi in translation movement and scaling movement through behavioral experiments.

Chapter 2 Experiments

To shed light on how humans conceptualize translation and scaling movement patterns, we designed eight sets of animated icons in Adobe Flash 8 that depict different geographic (and other) events. Four of these events depict translation movement (such as a hurricane moving toward or across a peninsula), whereas the other four depict scaling movement (such as a lake extending and shrinking due to rainfall). Each scenario has a moving entity and a reference entity, both of which are spatially extended. Details of the eight scenarios are listed in Table 1 and Table 2.

Table 1. The design of icons in translation movement scenarios in the experiment.

Scenario          Moving Entity        Reference Entity
Hurricane (HUR)   A hurricane          A peninsula
Ship (SHI)        A ship               A shallow water area
Tornado (TOR)     A tornado            A city
Geometry (GeoT)   A gray circle area   A gray triangle area

Table 2. The design of icons in scaling movement scenarios in the experiment.

Scenario          Moving Entity        Reference Entity
Desert (DES)      A desert area        A recreation park (symbolized by a letter R enclosed by a boundary)
Lake (LAK)        A lake               A house
Oil spill (OIL)   An oil spill         An island
Geometry (GeoS)   A gray circle area   A black diamond area

Materials

For each scenario, we created 72 animated icons for the nine topologically distinguished ending relations (i.e., eight icons for each ending relation). All icons are 120 × 120 pixels in size. In the translation movement scenarios, the reference entity (e.g., peninsula, shallow water area, or city) is placed in the middle area of the icon (Figure 2, left). The starting point of the moving entity is randomly selected from a starting region, which is disconnected from the reference entity (Figure 2). Likewise, an ending point is randomly selected from an ending region, which is on the other side of the reference entity. Thus, a path (straight from the starting

point to the ending point) is generated for the moving entity. Note that the path is used only to determine the direction in which an entity moves; the moving entity does not necessarily reach the ending point of the path. When the animation starts, the moving entity moves along the path at a constant speed. Depending on where the moving entity stops relative to the reference entity, nine different topological ending relations are distinguished. Within the eight animated icons of the same topological equivalence class (e.g., DC1), the starting points and directions of the moving entities differ from each other, but the topological ending relations between moving entity and reference entity are the same.

Figure 2. Design layout of icons in the hurricane scenario (translation movement, left icon) and the lake scenario (scaling movement, right icon).

In the scaling scenarios, the reference entity (recreation park, house, or island) is placed in the central area of the icon. The coordinates of the moving entity (e.g., desert, lake, or oil spill) are randomly selected from a starting region that is disconnected from the reference entity (Figure 2, right). When the animation starts, the moving entity expands at a constant speed. Depending on where the moving entity stops, nine topological ending relations are distinguished. As in the translation movement scenarios, the starting points of the eight animated icons in each

topological equivalence class are different from each other, but the topological ending relations between moving entity and reference entity are the same.

To ensure that the movements are perceptually clear in all animated icons, each movement lasts at least 2.0 seconds and is followed by a 1.5-second pause showing the ending relation between the moving entity and the reference entity before the animation restarts.

Participants

We recruited 199 undergraduate students from introductory-level geography courses at The Pennsylvania State University as participants. All participants received 10 USD for their participation. To ensure that participants were clearly aware of the domain semantics of the scenario they worked on, we checked the linguistic descriptions provided by each participant and replaced those participants who did not mention the scenario in their descriptions (e.g., some participants referred to the hurricane in the hurricane scenario as a "circle" or "white ball"). In addition, the data of two participants were accidentally overwritten. Details for the final 20 participants in each scenario are listed in Table 3.

Table 3. An overview of participant information over eight scenarios.

Scenario                 # of participants   # of female   Average age
Hurricane                20                  9             21.7
Ship                     20                  9             20.3
Tornado                  20                  9             21.3
Geometry (translation)   20                  4             21.1
Desert                   20                  5             21.5
Lake                     20                  7             19.7
Oil spill                20                  8             20.6
Geometry (scaling)       20                  10            20.6
Total                    160                 61            20.8

Procedure

All experiments were carried out as group experiments in a GIS lab in the Department of Geography at Penn State University. The GIS lab is equipped with 16 Dell desktops with 24-inch widescreen LCD monitors. To ensure that each participant could work on the experiment individually, view blocks were set up between participants so that they could not see each other's screens. All experiment tasks were performed in CatScan, a custom-made software tool (Klippel, Worboys, & Duckham, 2008) that supports animation presentation and the collection of data (grouping behavior and linguistic descriptions).

In the experiments, the participants were randomly assigned to a desktop computer in the GIS lab and then asked to enter their basic personal information (i.e., gender, age, field of study, etc.). After that, the participants were given a written introduction, which explained the scenario of the animated icons in that experiment and contained basic instructions for carrying out the experiment. To briefly train the participants in performing grouping tasks in CatScan, a trial was provided in which the participants were asked to group a set of animal icons (dogs, cats, and camels) based on their own criteria. When the participants had finished grouping all animal icons in the trial, they were able to proceed to the main experiment.

In the grouping interface of the main experiment (Figure 3), all 72 animated icons were presented in the left panel of the interface. Animated icons in the left panel could be placed into groups in the right panel by dragging and dropping them with the mouse. Groups could be created ("New Group" button) or deleted ("Delete Group" button). A third button, "Compact Icons", allowed participants to move all icons remaining in the left panel to the top. The "Finish" button was activated only after all icons had been placed into groups. All groups created by participants were automatically framed in distinct colors to help participants find a previously created group at a later time. On average, the grouping task took 15 minutes. It is

important to note that the participants were clearly informed in the instructions that there was no right or wrong answer with respect to their grouping criteria or the number of groups they created. After the participants finished the grouping task, they were shown the groups they had created, one group at a time. For each group, they were asked to provide a short label (no more than five words) and a linguistic description explaining the criteria they used to create that group.

Figure 3. Screenshots of the grouping interface of CatScan. The top one shows the initial screen that participants saw. The bottom one shows a simulated ongoing experiment in which a participant has created five groups.

Data Collection

The following experiment data were automatically collected by CatScan during the experiment sessions:

1. The basic personal information of each participant (e.g., gender, age, field of study, etc.).
2. Which icons were placed into the same groups by participants.
3. The number of groups created by each participant.
4. The time (in seconds) each participant spent on performing the grouping task.
5. The linguistic description (short/long) of each group created by participants.

The grouping behavior of each participant was recorded in a 72 × 72 similarity matrix (72 is the number of icons we used in each scenario). All possible similarities between pairs of icons in a scenario are encoded in this symmetric matrix. The similarity between each pair of icons is binary encoded: a pair of icons coded as 0 indicates that the two icons were not placed into the same group; a pair coded as 1 indicates that they were placed into the same group. An overall similarity matrix (OSM) is obtained by summing over the similarity matrices of the 20 participants in each scenario. Hence, the value of each cell in the OSM ranges from 0 (none of the 20 participants placed this pair of icons into the same group) to 20 (all 20 participants placed this pair of icons into the same group).
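The encoding described above can be illustrated with a short sketch. It is not CatScan's code: the input format (a list of groups, each a list of icon indices) and the function names are assumptions made for illustration, as is the choice to code an icon as similar to itself on the diagonal.

```python
def similarity_matrix(groups, n_icons):
    """Binary similarity matrix for one participant.

    groups: list of groups, each a list of icon indices (hypothetical format).
    Cell [i][j] is 1 if icons i and j share a group, otherwise 0.
    The diagonal is set to 1 here, which is an assumption of this sketch.
    """
    matrix = [[0] * n_icons for _ in range(n_icons)]
    for group in groups:
        for i in group:
            for j in group:
                matrix[i][j] = 1
    return matrix


def overall_similarity_matrix(matrices):
    """Cell-wise sum of the participants' similarity matrices (the OSM)."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


# Toy example with 4 icons and 2 participants (the experiment used 72 and 20).
participant_a = similarity_matrix([[0, 1], [2, 3]], n_icons=4)
participant_b = similarity_matrix([[0, 1, 2], [3]], n_icons=4)
osm = overall_similarity_matrix([participant_a, participant_b])
for row in osm:
    print(row)
```

In the toy example, icons 0 and 1 were grouped together by both participants, so their OSM cell is 2, while icons 0 and 3 were never grouped together, so their cell is 0.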

Chapter 3 Results

Basic statistics

The number of groups created by participants is shown in Figure 4 as box plots. The box plots reveal that the number of groups created is broadly similar across the eight scenarios. There are four outliers: participant #11 in the hurricane scenario created 29 groups; participant #13 in the tornado scenario created 16 groups; participant #20 in the desert scenario created 18 groups; and participant #9 in the geometry (scaling) scenario created 17 groups. ANOVA (Appendix A) reveals that:

a) There are no statistically significant differences in the number of groups created among the four translation movement scenarios;
b) Among the four scaling movement scenarios, only the number of groups created in the geometry (scaling) scenario is statistically different from the number of groups created in the desert scenario (p < 0.05);
c) Over all eight scenarios, only the geometry (scaling) scenario is statistically different from the desert scenario (p = 0.019) and the ship scenario (p = 0.045) with regard to the number of groups created by participants.
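For readers unfamiliar with the test, the following sketch shows how a one-way ANOVA F statistic is computed from per-scenario samples. The sample values are hypothetical and are not the experiment's data; the function name is illustrative, and the sketch does not cover the pairwise comparisons behind the individual p-values reported above (those are given in Appendix A).

```python
def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical numbers of groups created in three scenarios (not real data).
hypothetical = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_b, df_w = one_way_anova_f(hypothetical)
print(f_stat, df_b, df_w)
```

A large F value indicates that the differences between scenario means are large relative to the spread within scenarios; the corresponding p-value is then read from the F distribution with the two degrees of freedom.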

Figure 4. Box plots of the number of groups created by participants in the eight scenarios. The Y-axis represents the number of groups created.

Figure 5 shows the time that participants spent on the grouping task in each scenario, again in the form of box plots. The box plots indicate that the time participants spent on the grouping task is similar across the four translation movement scenarios, and ANOVA (Appendix B) likewise shows no significant differences among the translation movement scenarios. Among the four scaling movement scenarios, the grouping time of the desert scenario is slightly shorter and the grouping time of the geometry (scaling) scenario is slightly longer than that of the other two scenarios. ANOVA indicates that the only significant difference is between the grouping time of the desert scenario and that of the geometry (scaling) scenario (p < 0.05). We did not compare the time between translation movement

scenarios and scaling movement scenarios because, in our experimental design, the translation movements are generally shorter in duration than the scaling movements.

Figure 5. Box plot of the amount of time that participants spent on the grouping task over eight scenarios. The Y-axis represents the amount of time (in seconds).

Cluster analysis

We used cluster analysis to examine the similarities among icons based on the overall similarity matrices. Comparing different clustering methods has been suggested as a way to cross-validate the interpretation (Clatworthy, et al., 2005). Thus, three different clustering methods are used here:

average linkage, complete linkage, and Ward's method. We mainly examined the dendrograms from Ward's method to identify patterns, as it usually gives a near-optimal solution (Romesburg, 2004). The dendrograms generated from average linkage and complete linkage (Appendix C) were used to cross-validate our interpretation.

The first observation is that, over all eight scenarios, icons with the same topologically equivalent CNG paths form distinct groups¹ (Figure 6 and Figure 7). This suggests that topology is a dominant criterion in participants' grouping behavior. On the other hand, clusters of icons with different topological ending relations diverge from the main stream at different conceptual distances, which indicates that the nine topological ending relations are not equally salient.

¹ Three exceptions occurred here. In the hurricane scenario, a TPP1 icon falls into the cluster of PO1 icons, and the EC2 and DC2 icons form two mixed clusters. In the tornado scenario, PO2 icons are mixed with TPP2 icons.
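The dendrograms in this chapter were produced with CLUSTAN. Purely as an illustration of how a dendrogram's merge order can be derived from an overall similarity matrix, the sketch below implements a naive agglomerative procedure for average and complete linkage. It makes two assumptions that the text does not state: similarities are turned into dissimilarities by subtracting them from the number of participants, and ties are broken by position. Ward's method is not covered by this sketch.

```python
def to_dissimilarity(osm, n_participants):
    """Assumed conversion: dissimilarity = n_participants - similarity."""
    return [[n_participants - value for value in row] for row in osm]


def agglomerate(dist, linkage="average"):
    """Naive agglomerative clustering on a symmetric dissimilarity matrix.

    linkage: "average" or "complete".
    Returns the merge history as (members_a, members_b, merge_height) tuples.
    """
    clusters = [[i] for i in range(len(dist))]
    history = []

    def between(a, b):
        pairs = [dist[i][j] for i in a for j in b]
        if linkage == "complete":
            return max(pairs)           # farthest pair of members
        return sum(pairs) / len(pairs)  # mean over all pairs of members

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = between(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        history.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return history


# Toy overall similarity matrix for 4 icons and 2 participants (hypothetical).
toy_osm = [[2, 2, 1, 0], [2, 2, 1, 0], [1, 1, 2, 1], [0, 0, 1, 2]]
dist = to_dissimilarity(toy_osm, n_participants=2)
history = agglomerate(dist, linkage="average")
for members_a, members_b, height in history:
    print(members_a, members_b, height)
```

The merge heights correspond to the levels at which branches join in a dendrogram: icons that many participants grouped together merge early (at low heights), while clusters that were rarely mixed merge late.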

Figure 6. The dendrograms of four translation movement scenarios generated by cluster analysis in CLUSTAN™.

Figure 7. The dendrograms of four scaling movement scenarios generated by cluster analysis in CLUSTAN™.

In the translation movement scenarios, the dendrograms of the hurricane scenario and the ship scenario show a very similar structure, which consists of three main clusters. The icons with no overlap or partial overlap relations (DC1/DC2, EC1/EC2, and PO1/PO2) form two main clusters, depending on whether the moving entity (i.e., hurricane or ship) has crossed the reference entity (i.e., peninsula or shallow water area). The third main cluster is formed by icons whose ending relations are proper part relations (TPP1, TPP2, and NTPP). The only difference is that in the hurricane scenario the PO2 icons are conceptually closer to the EC2 and DC2 icons, while in the ship scenario, the PO2 icons are conceptually closer to the PO1, PO2, and NTPP icons. In the tornado scenario, icons with non-overlap ending relations (DC1, DC2, EC1, and EC2) are clearly separated from icons with overlap ending relations (PO1, PO2, TPP1, TPP2, and NTPP). The geometry (translation) scenario, in which domain semantics is absent, has a more distinct structure in its dendrogram. The DC1 and EC1 icons are clustered together, and so too are the EC2 and DC2 icons. The partially overlapping icons (PO1 and PO2) form a distinct cluster, and the proper part icons (TPP1, TPP2, and NTPP) form another cluster.

In the scaling movement scenarios, the grouping structures of the desert, lake, and oil spill scenarios are similar but show scenario-specific differences. In all three scenarios, the DC1, EC1, and PO1 icons are clustered together, and the EC2/DC2, PO2/TPP2, and NTPP/TPP1 icons are paired with each other. There are, however, some differences. In the desert scenario and the lake scenario, the DC1 and EC1 icons are merged first and are then joined by the PO1 icons, while in the oil spill scenario, the EC1 and PO1 icons are merged first and are then joined by the DC1 icons. This structural difference may result from the fact that the point at which a disaster begins is fuzzy and depends on the scenario: an oil spill reaching the coast of an island (EC1) will cause damage, while a desert reaching a recreation park (EC1) or a lake reaching a house (EC1) will not. Furthermore, the cluster of the PO2 and TPP2 icons is conceptually closer to the cluster of the DC2 and EC2

icons in the desert scenario and the oil spill scenario, while in the lake scenario, the cluster of the PO2 and TPP2 icons is closer to the cluster of the NTPP and TPP1 icons. Similar to what we found in the translation movement scenarios, the dendrogram of the geometry (scaling) scenario shows a very different pattern from those of the other three real-world scenarios.

These patterns in the cluster analysis suggest three points. First, topology does play a dominant role in humans' conceptualization of movement patterns. Second, from a cognitive perspective, the similarities among the nine topologically distinguished ending relations vary as a function of movement type (i.e., translation and scaling) and scenario. Third, within each type of movement, the grouping behavior of participants is influenced by contextual (semantic) information.

Multi-dimensional scaling

To explore the similarities among icons from a different perspective, we performed a multi-dimensional scaling (MDS) analysis based on the overall similarity matrices (OSM) with the software CLUSTAN™. We further visualized the MDS plots with a custom-made program in CatScan. Some interesting patterns emerged when we examined the MDS plots (Figure 8 and Figure 9).

Figure 8. The MDS plots of four translation movement scenarios: hurricane scenario (top left), ship scenario (top right), tornado scenario (bottom left), and geometry (translation) scenario (bottom right).

Figure 9. The MDS plots of four scaling movement scenarios: desert scenario (top left), lake scenario (top right), oil spill scenario (bottom left), and geometry (scaling) scenario (bottom right).

First, in the MDS plots of all eight scenarios, icons with the same topological ending relation largely form their own cluster, though there are some exceptions in which icons whose topological ending relations are neighbors in the conceptual neighborhood graph overlap with each other. This finding supports the conclusion we drew from the cluster analysis: topology does play a dominant role in humans' conceptualization of movement patterns.

Second, in the MDS plot of the hurricane scenario, all icons are distributed along a virtual arc in an order (counter-clockwise) that is identical to the order in the CNG. This pattern also exists in the MDS plots of the other two real-world translation movement scenarios (i.e., the ship scenario and the tornado scenario).

Third, in the MDS plot of the lake scenario, three main clusters can be clearly identified. The first main cluster is formed by all the DC1 and EC1 icons, in which the house has never been submerged by the lake. We named this main cluster "No Disaster". The second main cluster is formed exclusively by the PO1 icons, in which the house is partially submerged by the lake at the end of the animation. We named this main cluster "Medium Disaster". The third main cluster is formed by all other icons in the lake scenario. In these icons, the house has been completely submerged by the lake. We named this main cluster "Complete Disaster". We found the same pattern in the desert scenario. The MDS plot of the oil spill scenario, however, tells a slightly different story. All the EC1 icons, instead of falling into the "No Disaster" cluster, join the "Medium Disaster" cluster together with all the PO1 icons. This pattern matches the structure we saw in the cluster analysis, in which the EC1 icons are closer to the DC1 icons in the desert scenario and the lake scenario but are closer to the PO1 icons in the oil spill scenario. From a domain semantics perspective, it is not surprising that participants considered an oil spill touching the coast of an island to be a "Medium Disaster" instead of "No Disaster": when an oil spill reaches the coast of an island, the beach is contaminated by oil. The three main clusters we identified confirm that domain semantics does influence the grouping behavior of participants. This suggests that the conceptual distances between the nine topologically distinguished relations are influenced by domain semantics.

Last, the MDS plots of the geometry scenarios (for both translation movement and scaling movement) show very distinct patterns. There are four main clusters in the MDS plot of

the geometry (translation) scenario. All the DC and EC icons are distributed in the upper part of the plot. Depending on whether the gray circle has crossed the gray triangle area (DC2 and EC2) or not (DC1 and EC1) in the animation, two main clusters are identified. All the icons whose ending relations are proper part (TPP1, TPP2, and NTPP) form a main cluster in the bottom left corner of the MDS plot, whereas all the icons whose ending relations are partial overlap (PO1 and PO2) form a main cluster in the right part of the MDS plot. In the MDS plot of the geometry (scaling) scenario, icons with the same topological ending relation form a distinct cluster. The average conceptual distances among clusters of icons are greater than in the real-world scenarios.

For both translation movement and scaling movement, the MDS plots of scenarios with contextual information (all six real-world scenarios) are completely different from the MDS plots of scenarios without contextual information (geometry scenarios). This confirms that participants' grouping behavior is influenced by contextual (semantic) information.

Grouping raw frequencies

To further analyze participants' conceptualization of movement patterns, we performed an analysis of raw grouping frequencies using our custom-made visual analysis tool KlipArt (Klippel, Hardisty, & Weaver, 2009). This tool allows us to dynamically explore each participant's grouping behavior in more detail (Figure 10) and to examine the linguistic description that participants provided for each group they created. For example, we can choose all the DC1 icons and examine how participants placed these eight icons into groups. The linguistic description participants provided for each group is displayed in the software interface when icons from that group are selected (not depicted in Figure 10). This function enables us to shed light on the rationale behind participants' grouping behavior.

Figure 10. A screenshot of the KlipArt tool (only the grouping behavior workspace). This figure shows the grouping behavior of 20 participants on the eight DC1 icons (#000-007) in the lake scenario. Each yellow square represents a participant. A black arrow connecting a participant to a set of icons indicates that these icons were placed into the same group by that participant. For instance, participant #11 placed the eight DC1 icons into two groups: one group consists of icons #005 and #007; the other group consists of icons #000-004 and #006. Moreover, participants #2, 3, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18, and 19 placed all eight DC1 icons into one group.

To assess the cognitive adequacy of topological calculi from a statistical perspective, we used KlipArt to extract the number of participants who placed all icons of a topological equivalence class (e.g., all eight DC1 icons) into the same group. This enables us to perform Chi-square analysis on participants' grouping behavior.
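The count described above can be sketched as follows. This is not KlipArt's implementation; the data layout (each participant's grouping as a list of groups of icon indices) and the function names are assumptions, and the participants in the example are hypothetical, with the second one mirroring the two-group split described for participant #11 in Figure 10.

```python
def kept_together(groups, class_icons):
    """True if a single group contains every icon of the equivalence class."""
    wanted = set(class_icons)
    return any(wanted <= set(group) for group in groups)


def count_intact(all_groupings, class_icons):
    """Number of participants who placed all icons of the class into one group."""
    return sum(1 for groups in all_groupings if kept_together(groups, class_icons))


# Hypothetical groupings of 10 icons; icons 0-7 stand for the eight DC1 icons.
dc1_icons = range(8)
groupings = [
    [[0, 1, 2, 3, 4, 5, 6, 7], [8, 9]],     # DC1 icons kept in one group
    [[5, 7], [0, 1, 2, 3, 4, 6], [8, 9]],   # DC1 icons split into two groups
    [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]],       # one large group also counts
]
intact = count_intact(groupings, dc1_icons)
print(intact)
```

Note that under this reading a participant who puts the eight icons into one group together with icons from other classes is still counted; whether such cases were counted in the original analysis is not stated in the text.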

We first focus on whether the number of participants who placed all eight topologically equivalent icons into the same group differs significantly across the nine topologically defined ending relations within each scenario. The results of the Chi-square analysis show that only the tornado scenario yields a significant difference among the nine topological equivalence classes (χ² = 18.183, df = 8, p < 0.05).

Second, we focus on whether the number of participants who placed all eight topologically equivalent icons into the same group differs significantly across the eight scenarios for each topologically distinguished ending relation. We performed a Chi-square analysis over all eight scenarios. The only ending relation that shows a statistically significant difference over the eight scenarios is PO2 (χ² = 14.545, df = 7, p < 0.05). This may indicate that the saliency of PO2 is influenced by domain semantics. We followed up with two separate Chi-square analyses, one for the translation movement scenarios and one for the scaling movement scenarios. No significant difference is found within the translation movement scenarios or within the scaling movement scenarios. By additionally comparing the two line charts we created based on the raw counts (Figure 11 and Figure 12), we can infer that the PO2 icons are more frequently grouped together by participants in scaling movement scenarios than in translation movement scenarios.
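As a rough illustration of this kind of test, the sketch below computes a Chi-square statistic for a set of counts. It assumes a goodness-of-fit test against equal expected frequencies, which is consistent with the reported degrees of freedom (number of categories minus one) but is not stated explicitly in the text. The counts are hypothetical placeholders, not the study data. The critical values are the standard 0.05-level values of the Chi-square distribution for 7 and 8 degrees of freedom.

```python
# Standard critical values of the Chi-square distribution at alpha = 0.05.
CRITICAL_05 = {7: 14.067, 8: 15.507}


def chi_square_equal_expected(observed):
    """Goodness-of-fit statistic against equal expected counts (assumption)."""
    expected = sum(observed) / len(observed)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, len(observed) - 1


def is_significant(statistic, df):
    """Compare a statistic with the 0.05 critical value for the given df."""
    return statistic > CRITICAL_05[df]


# Hypothetical counts for nine ending relations in one scenario (not real data).
hypothetical_counts = [15, 14, 6, 5, 12, 4, 13, 14, 16]
statistic, df = chi_square_equal_expected(hypothetical_counts)
print(round(statistic, 3), df, is_significant(statistic, df))

# The two statistics reported in the text exceed their critical values.
print(is_significant(18.183, 8), is_significant(14.545, 7))
```

Both reported statistics lie above the corresponding critical values (15.507 for df = 8 and 14.067 for df = 7), which agrees with the stated p < 0.05.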

Figure 11. Shown is the line chart of the number of participants who placed all eight topologically equivalent icons into the same group for each topologically distinguished ending relation over the four translation scenarios.

Figure 12. Shown is the line chart of the number of participants who placed all eight topologically equivalent icons into the same group for each topologically distinguished ending relation over the four scaling scenarios.

Linguistic Analysis

Language is like a window to cognition. To shed more light on participants' grouping behavior, we followed up with an analysis of the linguistic descriptions that participants provided. Here we focused on two aspects: topology and domain semantics. Participants who described movement patterns using terms relating to topology or domain semantics were identified by examining their short and long descriptions.

Table 4. The number (and percentage) of participants who used terms relating to topology or domain semantics in each scenario.

Movement      Scenario    Topology          Domain semantics
                          Count    %        Count    %
Translation   Hurricane   18       90.0     4        20.0
              Ship        14       70.0     4        20.0
              Tornado     19       95.0     8        40.0
              Geometry    20       100.0    N/A      N/A
              Total       51       88.8     16       26.7
Scaling       Desert      19       95.0     7        35.0
              Lake        18       90.0     13       65.0
              Oil spill   19       95.0     7        35.0
              Geometry    20       100.0    N/A      N/A
              Total       56       95.0     27       45.0
Total                     107      91.9     43       35.8

Note: For topology, the total counts sum the three real-world scenarios of each movement type, whereas the total percentages are averaged over all scenarios of that type, including geometry.

The results (Table 4) show that the overwhelming majority of participants (91.9% on average over the eight scenarios) used terms relating to topology, such as "inside", "on the edge", and "overlap", in their descriptions of movement patterns. Participants more frequently used terms relating to domain semantics to describe scaling movement (27 out of 60 participants) than translation movement (16 out of 60 participants). For example, 13 out of 20 participants in the lake scenario used terms such as "flooded", "risk", "damage", and "threaten" when describing movement patterns. However, only 4 out of 20 participants in the hurricane scenario used terms such as "impact", "destructing", "affect", and "cause damage" in their descriptions. This finding may suggest that domain semantics has more influence on conceptualizing scaling movement than translation movement.
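The percentages in Table 4 can be recomputed from the counts. The short sketch below does this, assuming 20 participants per scenario (as in Table 4); the variable and function names are chosen for this illustration only. Because every scenario has the same number of participants, the mean of the scenario percentages equals the pooled percentage.

```python
N_PER_SCENARIO = 20  # participants per scenario, as in Table 4

# Counts copied from Table 4.
topology = {
    "translation": {"hurricane": 18, "ship": 14, "tornado": 19, "geometry": 20},
    "scaling": {"desert": 19, "lake": 18, "oil spill": 19, "geometry": 20},
}
domain_semantics = {
    "translation": {"hurricane": 4, "ship": 4, "tornado": 8},
    "scaling": {"desert": 7, "lake": 13, "oil spill": 7},
}


def mean_percentage(counts):
    """Mean percentage over scenarios that all have N_PER_SCENARIO participants."""
    return 100 * sum(counts.values()) / (N_PER_SCENARIO * len(counts))


def overall_percentage(table):
    """Mean percentage over every scenario of both movement types."""
    counts = [c for block in table.values() for c in block.values()]
    return 100 * sum(counts) / (N_PER_SCENARIO * len(counts))


for movement in ("translation", "scaling"):
    print(movement,
          round(mean_percentage(topology[movement]), 1),
          round(mean_percentage(domain_semantics[movement]), 1))
print("overall",
      round(overall_percentage(topology), 1),
      round(overall_percentage(domain_semantics), 1))
```

The topology percentages include the geometry scenarios, while the domain semantics percentages cover only the three real-world scenarios of each movement type, since domain semantics does not apply to geometric figures.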

Chapter 4 Conclusions

Based on the findings discussed in the previous chapter, two main conclusions can be drawn. First, topology plays a dominant role in conceptualizing both translation and scaling movement patterns. As we have shown in the cluster analysis and multi-dimensional scaling, icons with the same topologically distinguished ending relation are conceptually closer to each other than to icons with other topologically distinguished ending relations. In addition, the linguistic descriptions from participants reveal that topology is the main criterion in their grouping behavior.

Second, from a cognitive perspective, the nine topologically distinguished ending relations are not equally salient across different scenarios. The similarities between these ending relations are influenced by domain semantics. In both translation and scaling movement scenarios, the nine topological ending relations tend to be aggregated based on domain semantics. In contrast, the scenarios using geometric figures exhibit no such influence, although they show that, even for purely geometric figures, certain topological relations are conceptually closer than others.

Bibliography

Clatworthy, J., Buick, D., Hankins, M., Weinman, J., & Horne, R. (2005). The use and reporting of cluster analysis in health psychology: A review. British Journal of Health Psychology, 10(3), 329-358.
Egenhofer, M. J., & Herring, J. (1994). Categorizing binary topological relations between regions, lines, and points in geographic databases. The, 9, 94-91.
Egenhofer, M., & Mark, D. (1995). Naive geography. Spatial Information Theory: A Theoretical Basis for GIS, 1-15.
Egenhofer, M., & Al-Taha, K. (1992). Reasoning about gradual changes of topological relationships. Theories and methods of spatio-temporal reasoning in geographic space, 196-219.
Freksa, C. (1992). Temporal reasoning based on semi-intervals. Artificial Intelligence, 54(1-2), 199-227.
Klippel, A., Hardisty, F., & Weaver, C. (2009). Star plots: How shape characteristics influence classification tasks. Cartography and Geographic Information Science, 36(2), 149-163.
Klippel, A., Worboys, M., & Duckham, M. (2008). Identifying factors of geographic event conceptualisation. International Journal of Geographical Information Science, 22(2), 183-204.
Kurata, Y., & Egenhofer, M. (2009). Interpretation of behaviors from a viewpoint of topology. In B. Gottfried & H. Aghajan (Eds.), Behaviour monitoring and interpretation. Ambient intelligence and smart environments. Amsterdam: IOS Press.
Randell, D. A., Cui, Z., & Cohn, A. G. (1992). A spatial logic based on regions and connection. KR, 92, 165-176.
Regier, T., & Zheng, M. (2007). Attention to endpoints: A cross-linguistic constraint on spatial meaning. Cognitive Science, 31(4), 705-719.
Romesburg, C. (2004). Cluster analysis for researchers. Lulu Press.

Appendix A. ANOVA of the number of groups created over eight scenarios

Univariate Analysis of Variance

Between-Subjects Factors

Factor     Value   Label         N
Scenario   1.00    hurricane     20
           2.00    ship          20
           3.00    tornado       20
           4.00    geo_trans     20
           5.00    desert        20
           6.00    lake          20
           7.00    oil           20
           8.00    geo_scaling   20

Descriptive Statistics (Dependent Variable: Groups)

Scenario      Mean     Std. Deviation   N
hurricane     6.5500   5.59582          20
ship          6.5000   2.91096          20
tornado       7.1500   3.21632          20
geo_trans     6.8500   3.92395          20
desert        6.1000   3.80996          20
lake          7.0000   2.73380          20
oil           6.6000   3.33088          20
geo_scaling   8.8500   3.13344          20
Total         6.9500   3.68372          160

Levene's Test of Equality of Error Variances (a) (Dependent Variable: Groups)

F      df1   df2   Sig.
.450   7     152   .869

Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + Scenario

Tests of Between-Subjects Effects (Dependent Variable: Groups)

Source            Type III Sum of Squares   df    Mean Square   F         Sig.
Corrected Model   97.400 (a)                7     13.914        1.027     .415
Intercept         7728.400                  1     7728.400      570.196   .000
Scenario          97.400                    7     13.914        1.027     .415
Error             2060.200                  152   13.554
Total             9886.000                  160
Corrected Total   2157.600                  159

a. R Squared = .045 (Adjusted R Squared = .001)
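The derived quantities in the ANOVA table above follow from the sums of squares and degrees of freedom. The sketch below recomputes them using the standard one-way ANOVA formulas; it also computes the standard error of a pairwise mean difference, assuming 20 participants in each scenario. The variable names are chosen for this illustration.

```python
import math

# Values copied from the Tests of Between-Subjects Effects table (Groups).
ss_scenario, df_scenario = 97.400, 7
ss_error, df_error = 2060.200, 152
ss_corrected_total, df_corrected_total = 2157.600, 159
n_per_scenario = 20

ms_scenario = ss_scenario / df_scenario   # mean square for Scenario
ms_error = ss_error / df_error            # mean square for Error
f_ratio = ms_scenario / ms_error          # F statistic
r_squared = ss_scenario / ss_corrected_total
adj_r_squared = 1 - (1 - r_squared) * df_corrected_total / df_error

# Standard error of the difference between two scenario means (equal n).
se_difference = math.sqrt(2 * ms_error / n_per_scenario)

print(round(ms_scenario, 3), round(ms_error, 3), round(f_ratio, 3))
print(round(r_squared, 3), round(adj_r_squared, 3), round(se_difference, 5))
```

The recomputed values agree with the table entries to the reported precision, and the standard error matches the value that appears in the post hoc comparisons for this dependent variable.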

Post Hoc Tests

Multiple Comparisons (Dependent Variable: Groups)

The standard error of every mean difference is 1.16421. Lower and upper bounds refer to the 95% confidence interval.

LSD

(I) Scenario   (J) Scenario   Mean Difference (I-J)   Sig.    Lower Bound   Upper Bound
hurricane      ship           .0500                   .966    -2.2501       2.3501
               tornado        -.6000                  .607    -2.9001       1.7001
               geo_trans      -.3000                  .797    -2.6001       2.0001
               desert         .4500                   .700    -1.8501       2.7501
               lake           -.4500                  .700    -2.7501       1.8501
               oil            -.0500                  .966    -2.3501       2.2501
               geo_scaling    -2.3000                 .050    -4.6001       .0001
ship           hurricane      -.0500                  .966    -2.3501       2.2501
               tornado        -.6500                  .577    -2.9501       1.6501
               geo_trans      -.3500                  .764    -2.6501       1.9501
               desert         .4000                   .732    -1.9001       2.7001
               lake           -.5000                  .668    -2.8001       1.8001
               oil            -.1000                  .932    -2.4001       2.2001
               geo_scaling    -2.3500 *               .045    -4.6501       -.0499
tornado        hurricane      .6000                   .607    -1.7001       2.9001
               ship           .6500                   .577    -1.6501       2.9501
               geo_trans      .3000                   .797    -2.0001       2.6001
               desert         1.0500                  .369    -1.2501       3.3501
               lake           .1500                   .898    -2.1501       2.4501
               oil            .5500                   .637    -1.7501       2.8501
               geo_scaling    -1.7000                 .146    -4.0001       .6001
geo_trans      hurricane      .3000                   .797    -2.0001       2.6001
               ship           .3500                   .764    -1.9501       2.6501
               tornado        -.3000                  .797    -2.6001       2.0001
               desert         .7500                   .520    -1.5501       3.0501
               lake           -.1500                  .898    -2.4501       2.1501
               oil            .2500                   .830    -2.0501       2.5501
               geo_scaling    -2.0000                 .088    -4.3001       .3001
desert         hurricane      -.4500                  .700    -2.7501       1.8501
               ship           -.4000                  .732    -2.7001       1.9001
               tornado        -1.0500                 .369    -3.3501       1.2501
               geo_trans      -.7500                  .520    -3.0501       1.5501
               lake           -.9000                  .441    -3.2001       1.4001
               oil            -.5000                  .668    -2.8001       1.8001
               geo_scaling    -2.7500 *               .019    -5.0501       -.4499
lake           hurricane      .4500                   .700    -1.8501       2.7501
               ship           .5000                   .668    -1.8001       2.8001
               tornado        -.1500                  .898    -2.4501       2.1501
               geo_trans      .1500                   .898    -2.1501       2.4501
               desert         .9000                   .441    -1.4001       3.2001
               oil            .4000                   .732    -1.9001       2.7001
               geo_scaling    -1.8500                 .114    -4.1501       .4501
oil            hurricane      .0500                   .966    -2.2501       2.3501
               ship           .1000                   .932    -2.2001       2.4001
               tornado        -.5500                  .637    -2.8501       1.7501
               geo_trans      -.2500                  .830    -2.5501       2.0501
               desert         .5000                   .668    -1.8001       2.8001
               lake           -.4000                  .732    -2.7001       1.9001
               geo_scaling    -2.2500                 .055    -4.5501       .0501
geo_scaling    hurricane      2.3000                  .050    -.0001        4.6001
               ship           2.3500 *                .045    .0499         4.6501
               tornado        1.7000                  .146    -.6001        4.0001
               geo_trans      2.0000                  .088    -.3001        4.3001
               desert         2.7500 *                .019    .4499         5.0501
               lake           1.8500                  .114    -.4501        4.1501
               oil            2.2500                  .055    -.0501        4.5501

Bonferroni

(I) Scenario   (J) Scenario   Mean Difference (I-J)   Sig.    Lower Bound   Upper Bound
hurricane      ship           .0500                   1.000   -3.6521       3.7521
               tornado        -.6000                  1.000   -4.3021       3.1021
               geo_trans      -.3000                  1.000   -4.0021       3.4021
               desert         .4500                   1.000   -3.2521       4.1521
               lake           -.4500                  1.000   -4.1521       3.2521
               oil            -.0500                  1.000   -3.7521       3.6521
               geo_scaling    -2.3000                 1.000   -6.0021       1.4021
ship           hurricane      -.0500                  1.000   -3.7521       3.6521
               tornado        -.6500                  1.000   -4.3521       3.0521
               geo_trans      -.3500                  1.000   -4.0521       3.3521
               desert         .4000                   1.000   -3.3021       4.1021
               lake           -.5000                  1.000   -4.2021       3.2021
               oil            -.1000                  1.000   -3.8021       3.6021
               geo_scaling    -2.3500                 1.000   -6.0521       1.3521
tornado        hurricane      .6000                   1.000   -3.1021       4.3021
               ship           .6500                   1.000   -3.0521       4.3521
               geo_trans      .3000                   1.000   -3.4021       4.0021
               desert         1.0500                  1.000   -2.6521       4.7521
               lake           .1500                   1.000   -3.5521       3.8521
               oil            .5500                   1.000   -3.1521       4.2521
               geo_scaling    -1.7000                 1.000   -5.4021       2.0021
geo_trans      hurricane      .3000                   1.000   -3.4021       4.0021
               ship           .3500                   1.000   -3.3521       4.0521
               tornado        -.3000                  1.000   -4.0021       3.4021
               desert         .7500                   1.000   -2.9521       4.4521
               lake           -.1500                  1.000   -3.8521       3.5521
               oil            .2500                   1.000   -3.4521       3.9521
               geo_scaling    -2.0000                 1.000   -5.7021       1.7021
desert         hurricane      -.4500                  1.000   -4.1521       3.2521
               ship           -.4000                  1.000   -4.1021       3.3021
               tornado        -1.0500                 1.000   -4.7521       2.6521
               geo_trans      -.7500                  1.000   -4.4521       2.9521
               lake           -.9000                  1.000   -4.6021       2.8021
               oil            -.5000                  1.000   -4.2021       3.2021
               geo_scaling    -2.7500                 .544    -6.4521       .9521
lake           hurricane      .4500                   1.000   -3.2521       4.1521
               ship           .5000                   1.000   -3.2021       4.2021
               tornado        -.1500                  1.000   -3.8521       3.5521
               geo_trans      .1500                   1.000   -3.5521       3.8521
               desert         .9000                   1.000   -2.8021       4.6021
               oil            .4000                   1.000   -3.3021       4.1021
               geo_scaling    -1.8500                 1.000   -5.5521       1.8521
oil            hurricane      .0500                   1.000   -3.6521       3.7521
               ship           .1000                   1.000   -3.6021       3.8021
               tornado        -.5500                  1.000   -4.2521       3.1521
               geo_trans      -.2500                  1.000   -3.9521       3.4521
               desert         .5000                   1.000   -3.2021       4.2021
               lake           -.4000                  1.000   -4.1021       3.3021
               geo_scaling    -2.2500                 1.000   -5.9521       1.4521
geo_scaling    hurricane      2.3000                  1.000   -1.4021       6.0021
               ship           2.3500                  1.000   -1.3521       6.0521
               tornado        1.7000                  1.000   -2.0021       5.4021
               geo_trans      2.0000                  1.000   -1.7021       5.7021
               desert         2.7500                  .544    -.9521        6.4521
               lake           1.8500                  1.000   -1.8521       5.5521
               oil            2.2500                  1.000   -1.4521       5.9521

Based on observed means. The error term is Mean Square(Error) = 13.554.
*. The mean difference is significant at the .05 level.
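The relation between the LSD and Bonferroni significance values can be illustrated with a short sketch. It assumes the usual definition of the Bonferroni adjustment, namely the unadjusted p-value multiplied by the number of pairwise comparisons and capped at 1; this definition is an assumption here, since it is not stated in the output itself.

```python
def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_comparisons)


n_scenarios = 8
n_pairs = n_scenarios * (n_scenarios - 1) // 2  # unordered pairs of scenarios

# LSD significance values copied from the Groups comparisons above.
p_ship_vs_geo_scaling = 0.045
p_desert_vs_geo_scaling = 0.019

print(n_pairs)
print(bonferroni(p_ship_vs_geo_scaling, n_pairs))
print(round(bonferroni(p_desert_vs_geo_scaling, n_pairs), 3))
```

For desert versus geo_scaling the sketch gives approximately .532 rather than the tabulated .544. A plausible reason is that the LSD value used here is already rounded to three decimals, whereas the tabulated Bonferroni value would be computed from the unrounded p-value.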

Appendix B. ANOVA of the amount of time that participants spent on the grouping task over eight scenarios

Univariate Analysis of Variance

Between-Subjects Factors

Factor     Value   Label         N
Scenario   1.00    hurricane     20
           2.00    ship          20
           3.00    tornado       20
           4.00    geo_trans     20
           5.00    desert        20
           6.00    lake          20
           7.00    oil           20
           8.00    geo_scaling   20

Descriptive Statistics (Dependent Variable: Time)

Scenario      Mean        Std. Deviation   N
hurricane     794.6590    356.62748        20
ship          767.8620    408.67363        20
tornado       942.6490    402.64376        20
geo_trans     752.1215    396.60155        20
desert        771.1055    352.82467        20
lake          1031.1185   492.87001        20
oil           972.1540    448.17817        20
geo_scaling   1150.9285   393.78093        20
Total         897.8247    423.26694        160

Levene's Test of Equality of Error Variances (a) (Dependent Variable: Time)

F      df1   df2   Sig.
.675   7     152   .693

Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + Scenario

Tests of Between-Subjects Effects (Dependent Variable: Time)

Source            Type III Sum of Squares   df    Mean Square   F         Sig.
Corrected Model   3.084E6 (a)               7     440524.219    2.636     .013
Intercept         1.290E8                   1     1.290E8       771.755   .000
Scenario          3083669.532               7     440524.219    2.636     .013
Error             2.540E7                   152   167118.158
Total             1.575E8                   160
Corrected Total   2.849E7                   159

a. R Squared = .108 (Adjusted R Squared = .067)

Post Hoc Tests

Multiple Comparisons (Dependent Variable: Time)

The standard error of every mean difference is 129.27419. Lower and upper bounds refer to the 95% confidence interval.

LSD

(I) Scenario   (J) Scenario   Mean Difference (I-J)   Sig.    Lower Bound   Upper Bound
hurricane      ship           26.7970                 .836    -228.6092     282.2032
               tornado        -147.9900               .254    -403.3962     107.4162
               geo_trans      42.5375                 .743    -212.8687     297.9437
               desert         23.5535                 .856    -231.8527     278.9597
               lake           -236.4595               .069    -491.8657     18.9467
               oil            -177.4950               .172    -432.9012     77.9112
               geo_scaling    -356.2695 *             .007    -611.6757     -100.8633
ship           hurricane      -26.7970                .836    -282.2032     228.6092
               tornado        -174.7870               .178    -430.1932     80.6192
               geo_trans      15.7405                 .903    -239.6657     271.1467
               desert         -3.2435                 .980    -258.6497     252.1627
               lake           -263.2565 *             .043    -518.6627     -7.8503
               oil            -204.2920               .116    -459.6982     51.1142
               geo_scaling    -383.0665 *             .004    -638.4727     -127.6603
tornado        hurricane      147.9900                .254    -107.4162     403.3962
               ship           174.7870                .178    -80.6192      430.1932
               geo_trans      190.5275                .143    -64.8787      445.9337
               desert         171.5435                .187    -83.8627      426.9497
               lake           -88.4695                .495    -343.8757     166.9367
               oil            -29.5050                .820    -284.9112     225.9012
               geo_scaling    -208.2795               .109    -463.6857     47.1267
geo_trans      hurricane      -42.5375                .743    -297.9437     212.8687
               ship           -15.7405                .903    -271.1467     239.6657
               tornado        -190.5275               .143    -445.9337     64.8787
               desert         -18.9840                .883    -274.3902     236.4222
               lake           -278.9970 *             .032    -534.4032     -23.5908
               oil            -220.0325               .091    -475.4387     35.3737
               geo_scaling    -398.8070 *             .002    -654.2132     -143.4008
desert         hurricane      -23.5535                .856    -278.9597     231.8527
               ship           3.2435                  .980    -252.1627     258.6497
               tornado        -171.5435               .187    -426.9497     83.8627
               geo_trans      18.9840                 .883    -236.4222     274.3902
               lake           -260.0130 *             .046    -515.4192     -4.6068
               oil            -201.0485               .122    -456.4547     54.3577
               geo_scaling    -379.8230 *             .004    -635.2292     -124.4168
lake           hurricane      236.4595                .069    -18.9467      491.8657
               ship           263.2565 *              .043    7.8503        518.6627
               tornado        88.4695                 .495    -166.9367     343.8757
               geo_trans      278.9970 *              .032    23.5908       534.4032
               desert         260.0130 *              .046    4.6068        515.4192
               oil            58.9645                 .649    -196.4417     314.3707
               geo_scaling    -119.8100               .356    -375.2162     135.5962
oil            hurricane      177.4950                .172    -77.9112      432.9012
               ship           204.2920                .116    -51.1142      459.6982
               tornado        29.5050                 .820    -225.9012     284.9112
               geo_trans      220.0325                .091    -35.3737      475.4387
               desert         201.0485                .122    -54.3577      456.4547
               lake           -58.9645                .649    -314.3707     196.4417
               geo_scaling    -178.7745               .169    -434.1807     76.6317
geo_scaling    hurricane      356.2695 *              .007    100.8633      611.6757
               ship           383.0665 *              .004    127.6603      638.4727
               tornado        208.2795                .109    -47.1267      463.6857
               geo_trans      398.8070 *              .002    143.4008      654.2132
               desert         379.8230 *              .004    124.4168      635.2292
               lake           119.8100                .356    -135.5962     375.2162
               oil            178.7745                .169    -76.6317      434.1807

Bonferroni

(I) Scenario   (J) Scenario   Mean Difference (I-J)   Sig.    Lower Bound   Upper Bound
hurricane      ship           26.7970                 1.000   -384.2833     437.8773
               tornado        -147.9900               1.000   -559.0703     263.0903
               geo_trans      42.5375                 1.000   -368.5428     453.6178
               desert         23.5535                 1.000   -387.5268     434.6338
               lake           -236.4595               1.000   -647.5398     174.6208
               oil            -177.4950               1.000   -588.5753     233.5853
               geo_scaling    -356.2695               .184    -767.3498     54.8108
ship           hurricane      -26.7970                1.000   -437.8773     384.2833
               tornado        -174.7870               1.000   -585.8673     236.2933
               geo_trans      15.7405                 1.000   -395.3398     426.8208
               desert         -3.2435                 1.000   -414.3238     407.8368
               lake           -263.2565               1.000   -674.3368     147.8238
               oil            -204.2920               1.000   -615.3723     206.7883
               geo_scaling    -383.0665               .099    -794.1468     28.0138
tornado        hurricane      147.9900                1.000   -263.0903     559.0703
               ship           174.7870                1.000   -236.2933     585.8673
               geo_trans      190.5275                1.000   -220.5528     601.6078
               desert         171.5435                1.000   -239.5368     582.6238
               lake           -88.4695                1.000   -499.5498     322.6108
               oil            -29.5050                1.000   -440.5853     381.5753
               geo_scaling    -208.2795               1.000   -619.3598     202.8008
geo_trans      hurricane      -42.5375                1.000   -453.6178     368.5428
               ship           -15.7405                1.000   -426.8208     395.3398
               tornado        -190.5275               1.000   -601.6078     220.5528
               desert         -18.9840                1.000   -430.0643     392.0963
               lake           -278.9970               .910    -690.0773     132.0833
               oil            -220.0325               1.000   -631.1128     191.0478
               geo_scaling    -398.8070               .068    -809.8873     12.2733
desert         hurricane      -23.5535                1.000   -434.6338     387.5268
               ship           3.2435                  1.000   -407.8368     414.3238
               tornado        -171.5435               1.000   -582.6238     239.5368
               geo_trans      18.9840                 1.000   -392.0963     430.0643
               lake           -260.0130               1.000   -671.0933     151.0673
               oil            -201.0485               1.000   -612.1288     210.0318
               geo_scaling    -379.8230               .107    -790.9033     31.2573
lake           hurricane      236.4595                1.000   -174.6208     647.5398
               ship           263.2565                1.000   -147.8238     674.3368
               tornado        88.4695                 1.000   -322.6108     499.5498
               geo_trans      278.9970                .910    -132.0833     690.0773
               desert         260.0130                1.000   -151.0673     671.0933
               oil            58.9645                 1.000   -352.1158     470.0448
               geo_scaling    -119.8100               1.000   -530.8903     291.2703
oil            hurricane      177.4950                1.000   -233.5853     588.5753
               ship           204.2920                1.000   -206.7883     615.3723
               tornado        29.5050                 1.000   -381.5753     440.5853
               geo_trans      220.0325                1.000   -191.0478     631.1128
               desert         201.0485                1.000   -210.0318     612.1288
               lake           -58.9645                1.000   -470.0448     352.1158
               geo_scaling    -178.7745               1.000   -589.8548     232.3058
geo_scaling    hurricane      356.2695                .184    -54.8108      767.3498
               ship           383.0665                .099    -28.0138      794.1468
               tornado        208.2795                1.000   -202.8008     619.3598
               geo_trans      398.8070                .068    -12.2733      809.8873
               desert         379.8230                .107    -31.2573      790.9033
               lake           119.8100                1.000   -291.2703     530.8903
               oil            178.7745                1.000   -232.3058     589.8548

Based on observed means. The error term is Mean Square(Error) = 167118.158.
*. The mean difference is significant at the .05 level.

Appendix C. Dendrograms of average linkage and complete linkage