Geographic Analysis of Linguistically Encoded Movement Patterns: A Contextualized Perspective

Alexander Klippel¹, Alan MacEachren¹, Prasenjit Mitra², Ian Turton¹, Xiao Zhang², Anuj Jaiswal², Kean Soon¹, Jared Oyler¹, Rui Li¹

¹ GeoVISTA Center, Department of Geography, The Pennsylvania State University, PA, USA
² Information Science and Technology, The Pennsylvania State University, PA, USA

In this paper, we present an approach and outline the initial steps toward a system to support theoretically informed, analyst-guided extraction and contextualization of statements about movement from text. Linguistic geographic references in various text corpora (e.g., web blogs) provide potentially important information about the movement of entities (people, vehicles, etc.) and about their underlying spatial behaviors. These qualitative geographic references can only be interpreted if put in an appropriate context. Progress on geographic information retrieval from text has been relatively slow and has focused primarily on extracting and disambiguating place names. While that is an important and sometimes hard task (e.g., there are 1042 instances of the name Columbia in the Geographic Names Information System), place name extraction is just a small part of the challenge.

Our focus is on the analysis of movement patterns characterized linguistically. The initial focus is on route directions provided on the web. This comparatively restricted domain allows us to create a first system for the analysis (visual and conceptual) of movement patterns as characterized in natural language descriptions. A goal of the work reported here is to demonstrate the effectiveness of our approach before extending it to other domains (such as adding a temporal component, multiple agents, etc.). We briefly characterize the basic components of our proposed analytical framework.

Extracting route directions

Classically, route directions contain an origin, a destination, and a body of route information. The flexibility of natural language, however, makes it possible to hide information regarding, for example, origin and destination somewhere on a website, or to have one destination but several origins (or vice versa). Our first task was therefore to identify individual route directions on websites. For this purpose, we crawled over 11,000 reference documents containing route directions using different keywords. These documents were analyzed (for example, using word trees and document stemming) to define regular expressions that identify individual route directions.

Mapping geographic way marks

Once the route directions are extracted and stored as an XML file, a second component of our system parses them to identify geographic way marks that can be displayed on a map for an analyst to inspect. First, origin, destination, and route direction body are tokenized. Second, the algorithm passes through the token stream extracting geocodable tokens such as zip codes and telephone numbers, which can be recognized with simple regular expressions. Third, each token is matched against a database of street name suffixes generated from the US Census TIGER dataset (e.g., Road, Blvd, St). Fourth, the proper street names are identified and looked up using the GeoNames system (geonames.org). Fifth, a pass is made through the token stream to extract addresses, which are defined as all the tokens from a number followed by a street token up to a zip code token. Addresses are geocoded at geocoder.us, populated places at geonames.org, and roads are resolved using a local database of OpenStreetMap data. Finally, the token stream is passed to a display renderer, which includes a text window linked dynamically to a map window. Brushing over a geographically encoded text token highlights the related map element, allowing the analyst to see whether the system selected the right place or to choose the correct one where multiple places have been returned.
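The token-stream passes described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the function names, the regular expressions, the tiny suffix list (standing in for the TIGER-derived database), and the `max_len` window are all assumptions made for illustration, and the geocoding lookups (geocoder.us, geonames.org, OpenStreetMap) are left out entirely.

```python
import re

# Hypothetical stand-in for the street-suffix database derived from TIGER data.
STREET_SUFFIXES = {"road", "rd", "street", "st", "avenue", "ave",
                   "boulevard", "blvd", "drive", "dr", "lane", "ln"}

# Assumed patterns; real-world zip codes and phone numbers vary more than this.
ZIP_RE = re.compile(r"^\d{5}(?:-\d{4})?$")
PHONE_RE = re.compile(r"^\d{3}[-.]\d{3}[-.]\d{4}$")
NUMBER_RE = re.compile(r"^\d+$")


def tokenize(text):
    """Split on whitespace and strip surrounding punctuation from each token."""
    tokens = (raw.strip(".,;:()") for raw in text.split())
    return [t for t in tokens if t]


def tag_tokens(tokens):
    """Attach a coarse tag to each token: PHONE, ZIP, NUMBER, STREET, or OTHER."""
    tagged = []
    for tok in tokens:
        if PHONE_RE.match(tok):
            tag = "PHONE"
        elif ZIP_RE.match(tok):
            tag = "ZIP"      # note: a five-digit house number would also land here
        elif NUMBER_RE.match(tok):
            tag = "NUMBER"
        elif tok.lower() in STREET_SUFFIXES:
            tag = "STREET"
        else:
            tag = "OTHER"
        tagged.append((tok, tag))
    return tagged


def extract_addresses(tagged, max_len=12):
    """Collect token runs that start at a NUMBER, contain a STREET token,
    and end at the next ZIP token (searched within an assumed window)."""
    addresses = []
    i = 0
    while i < len(tagged):
        if tagged[i][1] == "NUMBER":
            seen_street = False
            for j in range(i + 1, min(i + max_len, len(tagged))):
                tag = tagged[j][1]
                if tag == "STREET":
                    seen_street = True
                elif tag == "ZIP":
                    if seen_street:
                        addresses.append(" ".join(t for t, _ in tagged[i:j + 1]))
                        i = j  # resume scanning after this address
                    break
        i += 1
    return addresses


text = ("Start at 123 Main St, Springfield, PA 19064. "
        "Call 610-555-0100 for help. Turn left on Oak Road.")
tagged = tag_tokens(tokenize(text))
print(extract_addresses(tagged))  # ['123 Main St Springfield PA 19064']
```

In a fuller pipeline, each extracted address and each STREET-tagged name would then be handed to the respective geocoding service, and ambiguous suffixes such as "St" or "Dr" would need the contextual checks that this sketch omits.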

Fig. 1. Screenshot of tagged route directions (left) and their mapping (right, multi-road example).

Fig. 2. Screenshot of the tagging tool.

An ontology of meaningful units of route parts

So far we have discussed the identification of web documents containing route directions, an initial extraction of directions and their parsing into origin, destination, and route direction body, and the identification of geocodable route parts that can be mapped to support iterative analysis of the linguistically encoded movement pattern and refinement of the route specified. To deepen our analysis and to pursue our long-term goal of automatically extracting and geo-locating movement patterns, we randomly selected 30 documents (from the 11,000 identified) to be hand tagged. To support this step, we developed a document tagging tool (Figure 2) that supports the creation and application of a taxonomy of route direction concepts, i.e., the conceptually primitive, meaningful route parts (as shown in the right part of Figure 2). This taxonomy of route direction elements is the basis for an ontological characterization and can be exported as an OWL file for further processing using ontology editors such as ConceptVISTA. We will analyze each meaningful route part as a spatial scene and formally characterize the spatial information the linguistic expressions provide, i.e., the characteristics of the route and of the spatial environment in which it is embedded. Linguistic underspecificity may be resolved using contextual information such as that provided by the map.

Conclusions and Outlook

We present a first-stage implementation of our approach that supports trial applications to validate the effectiveness of system components in supporting the analysis of linguistically characterized movement patterns. Our next step is to implement an initial, integrated, visual workbench for the analysis of linguistic movement patterns. We will use the workbench to compare different visual-computational tools to each other and to the base case of manual text analysis.

A further aspect of validation focuses on the more theoretical aspects of movement patterns. The ontology developed and used to identify conceptual primitives in route directions will be tested in an inter-rater agreement task by having multiple users annotate a randomly selected test set of websites containing route directions. We expect the result to confirm the cognitive validity of our approach, i.e., that the identified route parts are indeed meaningful. Our long-term goal focuses on refining each part of our analytical framework to address two specific objectives: (1) build the conceptual/theoretical/data model framework needed to represent, extract, map, and interpret geographic accounts of movement found in text; and (2) apply the framework to creating methods, tools, and a geovisual analytics workspace to accomplish these goals.

Acknowledgements

Research for this paper was funded by the National Geospatial-Intelligence Agency (NGA) through the NGA University Research Initiative (NURI) program. The views, opinions, and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the National Geospatial-Intelligence Agency or the U.S. Government.