vizlib: Using The Seven Stages of Visualization to Explore Population Trends and Processes in Local Authority Research

Similar documents
Point data in mashups: moving away from pushpins in maps

City, University of London Institutional Repository

City, University of London Institutional Repository

Visualisation of Uncertainty in a Geodemographic Classifier

Exploring Digital Welfare data using GeoTools and Grids

Implementing Visual Analytics Methods for Massive Collections of Movement Data

Appropriate Selection of Cartographic Symbols in a GIS Environment

Regionalizing and Understanding Commuter Flows: An Open Source Geospatial Approach

Population 24/7 Download: User Guide. Purpose

Visitor Flows Model for Queensland a new approach

Estimating Large Scale Population Movement ML Dublin Meetup

Explore. history landscapes family.

Fuzzy Geographically Weighted Clustering

Future Proofing the Provision of Geoinformation: Emerging Technologies

City Research Online. Permanent City Research Online URL:

Neighborhood Locations and Amenities

A GIS tool for the digitisation and visualisation of footpath hazards

City, University of London Institutional Repository

Population 24/7. David Martin, University of Southampton

Designing Interactive Graphics for Foraging Trips Exploration

An Open Source Geodemographic Classification of Small Areas In the Republic of Ireland Chris Brunsdon, Martin Charlton, Jan Rigby

Assessing the impact of seasonal population fluctuation on regional flood risk management

Treemaps and Choropleth Maps Applied to Regional Hierarchical Statistical Data

GeoWorlds: Integrated Digital Libraries and Geographic Information Systems

of places Key stage 1 Key stage 2 describe places

Three-Dimensional Visualization of Activity-Travel Patterns

Geographical Information System (GIS) Prof. A. K. Gosain

Map Name(s) and Date(s) Historical Map - Slice A

Visualizing Geospatial Graphs

a system for input, storage, manipulation, and output of geographic information. GIS combines software with hardware,

The Journal of Database Marketing, Vol. 6, No. 3, 1999, pp Retail Trade Area Analysis: Concepts and New Approaches

Encapsulating Urban Traffic Rhythms into Road Networks

Design and Development of a Large Scale Archaeological Information System A Pilot Study for the City of Sparti

GEOGRAPHICAL STATISTICS & THE GRID

GIS-BASED VISUALIZATION FOR ESTIMATING LEVEL OF SERVICE Gozde BAKIOGLU 1 and Asli DOGRU 2

MineScape Geological modelling, mine planning and design

Capital, Institutions and Urban Growth Systems

Mapping Travel-To-Work Flows. Oliver O Brien and James Cheshire Department of Geography, UCL

DATA 301 Introduction to Data Analytics Geographic Information Systems

Geospatial Data Visualization

Source Protection Zones. National Dataset User Guide

Introduction to Contour Maps

Geographic Analysis of Linguistically Encoded Movement Patterns A Contextualized Perspective

Land-Line Technical information leaflet

Understanding China Census Data with GIS By Shuming Bao and Susan Haynie China Data Center, University of Michigan

GEOGRAPHY POLICY STATEMENT. The study of geography helps our pupils to make sense of the world around them.

GEOGRAPHY 350/550 Final Exam Fall 2005 NAME:

CHAPTER 9 DATA DISPLAY AND CARTOGRAPHY

Profiling Burglary in London using Geodemographics

Effective Visualization Tool for Job Searching

An Introduction to Geographic Information System

GOVERNMENT MAPPING WORKSHOP RECOVER Edmonton s Urban Wellness Plan Mapping Workshop December 4, 2017

Using Social Media for Geodemographic Applications

A Review: Geographic Information Systems & ArcGIS Basics

Crime Analysis. GIS Solutions for Intelligence-Led Policing

In this exercise we will learn how to use the analysis tools in ArcGIS with vector and raster data to further examine potential building sites.

INTRODUCTION. In March 1998, the tender for project CT.98.EP.04 was awarded to the Department of Medicines Management, Keele University, UK.

Lecture 5. Symbolization and Classification MAP DESIGN: PART I. A picture is worth a thousand words

Extracting Patterns of Individual Movement Behaviour from a Massive Collection of Tracked Positions

Multiscale Spatio-Temporal Data Aggregation and Mapping for Urban Data Exploration

ArcGIS for Desktop. ArcGIS for Desktop is the primary authoring tool for the ArcGIS platform.

Canadian Historical GIS Partnership Development: Taking Steps for Historical Mapping in Canada

Joint-accessibility Design (JAD) Thomas Straatemeier

Reaxys Medicinal Chemistry Fact Sheet

Hennig, B.D. and Dorling, D. (2014) Mapping Inequalities in London, Bulletin of the Society of Cartographers, 47, 1&2,

Estonian Place Names in the National Information System and the Place Names Register *

Geographical Inequalities and Population Change in Britain,

Developing a Community Geographical Information System (GIS) in Rural India

geographic patterns and processes are captured and represented using computer technologies

Using OS Resources - A fieldwork activity for Key Stage 2

Introducing the WISERD Geoportal. WISERD DATA TEAM Dr Robert Berry & Dr Richard Fry, University of Glamorgan Dr Scott Orford, Cardiff University

DATA DISAGGREGATION BY GEOGRAPHIC

TOWARDS THE DEVELOPMENT OF A TAXONOMY FOR VISUALISATION OF STREAMED GEOSPATIAL DATA

A route map to calibrate spatial interaction models from GPS movement data

Exploratory Data Analysis and Decision Making with Descartes and CommonGIS

GIS = Geographic Information Systems;

Theory, Concepts and Terminology

Methodological issues in the development of accessibility measures to services: challenges and possible solutions in the Canadian context

Geog183: Cartographic Design and Geovisualization Spring Quarter 2018 Lecture 3: Thematic cartography, geovisualization and visual analytics

ENV208/ENV508 Applied GIS. Week 1: What is GIS?

HOLY CROSS CATHOLIC PRIMARY SCHOOL

Oregon Department of Transportation. Geographic Information Systems Strategic Plan

GEOGRAPHY POLICY. Date: March Signed: Review: March 2019

DATA SCIENCE SIMPLIFIED USING ARCGIS API FOR PYTHON

Give 4 advantages of using ICT in the collection of data. Give. Give 4 disadvantages in the use of ICT in the collection of data

INDIANA ACADEMIC STANDARDS FOR SOCIAL STUDIES, WORLD GEOGRAPHY. PAGE(S) WHERE TAUGHT (If submission is not a book, cite appropriate location(s))

EXPLORATORY SPATIAL DATA ANALYSIS OF BUILDING ENERGY IN URBAN ENVIRONMENTS. Food Machinery and Equipment, Tianjin , China

Your web browser (Safari 7) is out of date. For more security, comfort and the best experience on this site: Update your browser Ignore

Spatial Data Science. Soumya K Ghosh

World Geography. WG.1.1 Explain Earth s grid system and be able to locate places using degrees of latitude and longitude.

Exploring representational issues in the visualisation of geographical phenomenon over large changes in scale.

Orienteering, the map and child development. (Organised outdoor play with maps)

CONCEPTUAL DEVELOPMENT OF AN ASSISTANT FOR CHANGE DETECTION AND ANALYSIS BASED ON REMOTELY SENSED SCENES

Visualisation of Spatial Data

Mapping UK. David Martin, University of Southampton. Open Data Workshop, Nottingham, 21 June 2011

Class 4J Spring Term Irian Jaya/Papua New Guinea Adapted from QCA Geography Unit 10 incorporating some elements of Unit 25

Fairfield Public Schools

DataShine Automated Thematic Mapping of 2011 Census Quick Statistics

Lecture 2. A Review: Geographic Information Systems & ArcGIS Basics

An interactive tool for teaching map projections

Transcription:

vizlib: Using The Seven Stages of Visualization to Explore Population Trends and Processes in Local Authority Research Robert Radburn, Jason Dykes, Jo Wood gicentre, School of Informatics, City University London, EC1V 0HB Telephone: +44 (0)20 7040 0212, Fax: +44 (0)20 7040 8584 E: {sbbd476 jad7 jwo} @soi.city.ac.uk, W: http://www.soi.city.ac.uk/~ {sbbd476 jad7 jwo} KEYWORDS: libraries, local authority, exploratory data analysis, visualization, Processing 1. Introduction We use data visualization to explore patterns in a large data set of library loans maintained by Leicestershire County Council (LCC). This work has resulted in hypotheses and insights that may influence policy. Such an approach is increasingly accessible to local authority researchers through high-level languages and toolkits including Processing (Fry and Reas, 2007), Prefuse (Heer et al., 2005) and ProtoVis (Heer & Bostock, 2009). Fry (2008) proposes a seven-stage process for manipulating and making sense of data from its acquisition through its visualization and ultimately to its interpretation. His model (Figure 1) is not rigid, but draws attention to the interdependencies between the various elements of the visual data analysis process. It may be useful as a framework for data visualization in local authorities. Figure 1: The Seven Stages of Visualization (Fry, 2004) The model structured our analysis and we use it here to describe data visualization with Fry s open set of software tools Processing (Fry, 2008) 1. This software sketchbook is being used in a variety of scientific domains to rapidly explore data with flexible interactive visualization prototypes (e.g. Meyer et al., 2009; Slingsby et al., 2009; Wood et al., in press). Before embarking on the seven stages of visualization, questions must be considered that are answerable with the data in hand: the more specific you can make your question, the more specific and clear the visual result will be (Fry, 2008.) We used the Leicestershire Library Services (LLS) TALIS database to address three questions pertinent to LLS: How does performance vary across the 54 libraries in Leicestershire? In which areas are the best customers living? Can the area you live in contribute to predictions of usage? 1 http://processing.org

2. The Seven Stages of Data Visualization Stage 1. ACQUIRE Obtain the Data The TALIS database contains a rich spatio-temporal record of every item loaned from a Leicestershire library. Data were obtained for 435,000 active library users over a two-year period. Stage 2. PARSE Structure the Data Fry s approach diverts effort from database design into analytical sketching (Fry, 2008). Flat text files were produced from TALIS with a consistent structure for visual prototyping. Stage 3. FILTER Identify the Data of Interest Data quality checks where undertaken and personal data removed from records. Population weighted centroids and OAC (Vickers et al., 2005; Vickers and Rees, 2007) codes were added to output areas (OAs) - the geography used to consider the home locations of library members. Stage 4. MINE Methods to Uncover Patterns in the Data The recency / frequency marketing technique used in LCC to segment records into quintiles was applied to structure the data (Radburn et al., 2007). Members of each library were categorized into one of twenty-five recency / frequency (RF) combinations. Signed chi statistics were calculated for individual libraries and OAs in line with the research questions to relate observed numbers with values expected according to regional population weighted averages. Figure 2: Sketch produced with LLS marketing manager outlining ideas on how the data could be depicted and interrogated. Note RF matrices (bottom left).

Stage 5. REPRESENT Visual Model for the Data Processing is designed for small-scale visualization that can be rapidly modified as the process of visual enquiry progresses. It encourages an exploratory approach to data visualization as code that links graphical methods can be quickly configured and applied to text files that have been parsed, filtered and mined (Radburn et al., 2009). Understanding of the context in which LLS use their data was developed through sketching with paper and pencil (Figure 2). The ideas generated were rapidly incorporated into interactive Processing data sketches that loaded flat files generated as a result of the parsing, filtering and mining of the TALIS data (Figures 3-7). 3. Findings Various graphical techniques were developed in Processing to depict the filtered and mined data and combined with dynamic behaviours to interactively explore the three research questions: (i) In which areas are the best customers living? Spider plots LLS wanted to link library and customer (see Figure 2, top centre). Lines linking libraries with the home locations of particular groups of their users give an indication of how the different facilities compete. For example, many of the most frequent and recent users of Market Harborough library come from nearby villages containing libraries (Figure 3). Figure 3: Locations of Market Harborough library best users and competing village libraries. OAs are coloured with diverging RdBu scheme showing variation from expected membership levels given the regional proportions (red higher than expected, blue lower). Library locations are shown using British National Grid with symbol sizes representing number of registered users.

Figure 4: Spatial treemap of Leicestershire OAs with diverging RdBu scheme showing variation from expected membership (red higher than expected, blue lower). Library locations are shown using British National Grid with symbol sizes representing number of registered users. Users of Thurmaston library are highlighted. (ii) Can the area you live in contribute to predictions of usage? Spatial Treemaps These non-occluding space-filling layouts that preserve aspects of underlying geography (Wood and Dykes, 2008) may reveal subtle patterns in the data. Local variations in library performance became evident through these data rich views. These include lower numbers of members than expected in Thurmaston and subsequent consideration of possible explanations, more sophisticated modelling and local service provision (Figure 4).

Figure 5: RF plots for 54 Leicestershire libraries - fixed size spatial treemap where circle size represents number of registered users per library. Top - numbers of users with scaled global sequential YlOrBn scheme. Bottom - signed-chi statistic with global diverging RdBu scheme (red for higher than expected, blue for lower than expected).

(iii) How does performance vary across the 54 libraries in Leicestershire? RF / CHI matrix plots: A matrix showing users in recency (by column) / frequency (by row) quintiles for each library (Figure 5) met LLS needs (see Figure 2, bottom left). The most recent and frequent users are located at the top right of these RF plots (Novos, 2004). Considering RF plots for 54 libraries concurrently was deemed useful by LLS particularly with animated transitions between spatial and spatially ordered views (Radburn et al., 2009). Concurrently visualizing signed chi statistics (Figure 5, bottom) reveals high numbers of low recency users in some larger libraries (Melton, Coalville, Loughborough, Wigston) but not others (e.g. Hinckley). Loughborough has more lapsed frequent users than expected in red. Stage 6. REFINE Improve the Representation of the Data Feedback on the visualization process was very positive with the LLS Marketing Manager commenting on how large complex data can be fully interrogated with different views. The speed with which new avenues of analysis could be explored was impressive, vindicating the use of Processing as a technology. Functionality developed through iterative refinement included: Quartile plots concentric distances travelled by the 25%, 50% and 75% closest users (Figure 6) reveal that despite the long-legged spiders most citizens use their local library. Standard Ellipse summarizing point distributions and indicating the directional pattern of usage. Figure 7 suggests that road and river networks affect spatial usage patterns. Although Birstall and Thurmaston libraries are in close proximity, there is little overlap between best users. This is useful information for LLS when deciding on opening hours as opening one library may not ensure recent / frequent usage by those in the neighbouring catchment. Stage 7. INTERACT Allow Users to Control What They See and How They See It Our visualization was an iterative process that involved data users and their expertise continually to drive the analysis. High levels of control were achieved through discussion and rapid development. Processing provided considerable scope for developing innovative and interactive graphics quickly and flexibly through access to low level functionality through high-level commands with little hindrance (Dykes, 2005). All three authors were able to develop and share Processing code during the visualization process. Interactive software controls were provided at all stages in all prototypes enabling view and focus to be changed through direct manipulation so that the graphical representations introduced here could be accessed (see Figure 1). 4. Conclusion The processes required to manipulate data from acquisition through to visualization are significant. Using Processing to apply Fry s visualization model enabled us to iteratively and efficiently ask a huge number of new questions of a large underused data set through interactive graphical means. This has produced some useful hypotheses. But how do we know what to do with the provisional and partial answers that result? We show that visualization has potential in the exploration of local authority data holdings. If visualization of this type, and indeed such large data sets, is to be used effectively to influence and develop policy in organizations that hold them then a focus on an eight stage of visualization may be important: act. This is a non-trivial stage with significant social as well as analytical and technical elements that require consideration. Acknowledgments This work was supported by an ESRC UPTAP User Fellowship RES-163-27-0017 and Leicestershire County Council. Usage data kindly provided by Leicestershire Library Service.

Figure 6: Quartile plots for all 54 libraries with geographic locations. Grey symbols are sized by borrower population. Red circles show distances travelled by 25, 50 and 75% of best users. Figure 7: Home locations of best users from neighbouring Birstall and Thurmaston libraries with standard ellipse and 1:50,000 LandRanger. Crown Copyright/database right 2009. An Ordnance Survey/EDINA supplied service.

References Dykes, J. (2005). Facilitating Interaction for Geovisualization. In Exploring Geovisualization. Amsterdam: Elsevier, pp. 265-291. Fry, B. (2004). Computational Information Design, Massachusetts Institute of Technology. Fry, B. (2008). Visualizing Data. Sebastopol, CA: O'Reilly, 382 pp. Fry, B., & Reas, C. (2007). Processing: A Programming Handbook for Visual Designers and Artists. Cambridge, MA: The MIT Press, 736 pp. Heer, J. and Bostock. M. (2009). ProtoVis: A Graphical Toolkit for Visualization. IEEE Transactions on Visualization and Computer Graphics, 15(6), 1121-1128. Meyer, M., Munzner, T. & Pfister, H. (2009). MizBee: A Multiscale Synteny Browser. IEEE Transactions on Visualization and Computer Graphics, 15(6), 897-904. Novos, J. (2004). Drilling down: Turning Customer data into profits with a spreadsheet, booklocker.com, 196 pp. Heer, J., Card, S.K & Landay, J.A. (2005). Prefuse: a toolkit for interactive information visualization, Proceedings of the SIGCHI Conference on Human factors in Computing Systems, 421-430. ACM New York, NY, USA. Radburn, R., Thomas, N., Forster, P., and Pye, S. (2007). Beyond the comfort zone. Public Library Journal, 22(3), 20-23. Radburn. R., Dykes, J. and Wood. J. (2009). vizlib: Developing Capacity for Exploratory Data Analysis in Local Government Visualization of Library Customer Behaviour, Proceedings of GISRUK, 1-3 April, Durham UK. Slingsby, A., Dykes, J. & Wood, J. (2009). Configuring Hierarchical Layouts to Address Research Questions. IEEE Transactions on Visualization and Computer Graphics, 15(6), 977-984. Vickers, D., Rees, P., and Birkin, M. (2005). Creating the National Classification of Census Output Areas: Data, Methods and Results, Working paper 03/3, School of Geography, University of Leeds. Vickers, D. & Rees, P. (2007). Creating the National Statistics 2001 Output Area Classification. Journal of the Royal Statistical Society, Series A, 170(2), 379-404. Wood, J., and Dykes, J. (2008). Spatially Ordered Treemaps, IEEE Transactions on Visualization and Computer Graphics, 14(6), 1348-1355. Wood, J., Dykes, J. & Slingsby, A., (in press) Visualization of Origins, Destinations and Flows with OD Maps. The Cartographic Journal. Biographies Robert Radburn was an ESRC UPTAP research fellow at the gicentre, City University London developing capacity for visual exploratory analysis in local government, and is currently Research and Intelligence Team Leader at Leicestershire County Council. Dr. Jason Dykes is a Senior Lecturer at the gicentre, City University London undertaking applied and theoretical research in, around and between information visualization, interactive analytical cartography and human-centred design. Dr. Jo Wood is a Reader in geographic information at the gicentre, City University London with research interests in geovisualization, terrain modelling and object oriented programming for spatial sciences.