Geob 370 Lab Crime in the City


Crime in the City

In this lab you'll be introduced to some important GIScience concepts (the modifiable areal unit problem and the effects it has on statistics derived from spatial data) and some cartographic axioms ("do not map totals"). You will also learn how to relate spatial and attribute data in ArcMap (JOINS), and how to work with Statistics Canada census files (census geography and census attributes).

Introduction

As mentioned in class, traditional (aspatial) statistical approaches should not be applied blindly to spatial data because geography creates dependencies in the data that violate traditional statistical assumptions. One of the geographic dependencies that is ever-present in any spatial analysis is the dependency of the results on the spatial unit selected for the analysis (that is, when analysing socioeconomic, crime or health data, should we use neighbourhoods as the unit of analysis, census tracts, cities, or some other unit?). This dependency of the results on the spatial units is formally known as the Modifiable Areal Unit Problem (MAUP).

The problem consists of two interrelated parts. First, there is uncertainty about what constitutes the objects of spatial study: the scale and aggregation effects. Second, there are the implications that this uncertainty presents when interpreting the results of spatial analyses.

The scale effect is the tendency for different statistical values (e.g., a mean, correlation coefficient, or regression coefficient) to be obtained from the same set of data when that data is grouped at different levels of spatial resolution (e.g., individual households can be grouped into dissemination areas, census tracts, cities, regions). For example, if you determine the relation between two variables using dissemination area data, and then using census tract data, the relations will likely be different, as you'll discover later in this lab.
The aggregation or zoning effect is observed because we can group individuals (or crime incidents, or disease clusters) into many different configurations of larger units. If we were to aggregate the dissemination areas into zones of similar size to census tracts, but in a different spatial arrangement than that used by Statistics Canada, we would likely find that the statistics (e.g., the mean income for our census tracts) were different between the two groups of data. That is, using different aggregation schemes can also result in very different sets of statistical relations being observed amongst a set of variables.

The second part of the MAUP follows from the uncertainty described above in choosing aggregated zonal units (i.e., there are no natural geographies when working with socioeconomic data¹; census tracts and dissemination areas are artificial units). Since different areal aggregations (or zonations) of the same data produce different results, generalizing the results of spatial studies is fraught with difficulties. That is, how dependent are the results of any analysis on the choice of spatial units used (i.e., the specific aggregation scheme)? How can we compare the results of studies from one city to another, when simply changing the spatial units used in both studies could dramatically alter the conclusions reached?

¹ Note, however, that in physical geography some features, such as drainage basins, do have natural boundaries.
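The scale and zoning effects described above can be illustrated with a small sketch. The data are entirely hypothetical (16 made-up "households" with two invented variables), and the two zoning schemes are arbitrary groupings chosen to make the effect obvious; nothing here comes from the lab's census or crime files.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def zone_means(values, zone_ids):
    """Aggregate individual values to zone means (one mean per zone id)."""
    totals, counts = {}, {}
    for v, z in zip(values, zone_ids):
        totals[z] = totals.get(z, 0.0) + v
        counts[z] = counts.get(z, 0) + 1
    return [totals[z] / counts[z] for z in sorted(totals)]

# Hypothetical households: a shared trend (0..3) plus household-level noise.
dx = [1, -1, 1, -1]   # noise in variable x
dy = [1, 1, -1, -1]   # noise in variable y (unrelated to dx)
x = [trend + dx[k] for trend in range(4) for k in range(4)]
y = [trend + dy[k] for trend in range(4) for k in range(4)]

# Two hypothetical ways of grouping the same 16 households into 4 zones.
zoning_a = [i // 4 for i in range(16)]  # zones follow the shared trend
zoning_b = [i % 4 for i in range(16)]   # zones cut across the trend

r_households = pearson(x, y)
r_zoning_a = pearson(zone_means(x, zoning_a), zone_means(y, zoning_a))
r_zoning_b = pearson(zone_means(x, zoning_b), zone_means(y, zoning_b))

print(round(r_households, 3))  # about 0.556 at the household level
print(round(r_zoning_a, 3))    # 1.0 after aggregating with zoning A
print(round(r_zoning_b, 3))    # 0.0 after aggregating with zoning B
```

The same 16 records give a moderate correlation individually, a perfect one under the first zoning, and none under the second, which is the kind of dependence on spatial units that the MAUP describes.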

The MAUP has been known since the early 1930s when, in a study of the scale effect in census data, the authors noted that although high correlations could be observed amongst attributes when examined at the census tract level, the correlations (i.e., associations) amongst the same attributes at the individual level were very low (i.e., while at the census tract level it appeared as though there was a strong relation between race and literacy, at the individual level there was no relation between race and literacy) (Gehlke and Biehl 1934). This particular effect of the MAUP is more formally known as the ecological fallacy, and is one of the serious problems associated with the MAUP. Given that many policy decisions are made on the basis of statistical associations derived from the analysis of spatial data, more attention should be paid to the problem (e.g., if funding for multicultural activities is dependent on the percentage of the neighbourhood's population that is ethnic, changing the neighbourhood boundaries can dramatically alter the percentages and thereby the funding allocation).

In this lab you will explore how scale affects the relations amongst crime occurrences and census variables, and explore some of ArcGIS's data manipulation and presentation techniques. In particular, you will be using a simple regression analysis to model the relation between mischief crimes (within a DA or CT) and auto thefts (within the associated DA or CT), and how dwelling value relates to both types of crime. That is, for example, can knowing the number of mischief crimes in a CT enable us to predict how many cars will be stolen within that CT?² By examining the residuals we can identify areas where the number of auto thefts is above or below what the model predicts and look for reasons behind those differences (i.e., what differentiates one set of areas from another; why isn't the relation between the two types of crimes constant across areas and scales of analysis?).
Your lab report is due in two weeks.

The Data

In order to make the production and interpretation of the results easier, you will use only the data for Vancouver in this lab. (The spatial data files listed below have already been clipped to Vancouver [extracted from DMTI files], and imported into a geodatabase.) The files you will need for this lab include:

Census Geography
  2006 census tracts (CTs) (named CT_Van) (extracted from GVRDct06_Carto)
  2006 dissemination areas (DAs) (named DA_Van) (extracted from GVRDda06_Carto)

Ancillary spatial data files
  City boundary (named City)
  Parks and lakes (named Parks_Lakes)

Crime data
  Information on types of crime geocoded to the block level (named Van_crimes)

Use GetData to obtain the geography files you will be using in this lab (Geob 370; Lab 3). You should end up with a geodatabase called MAUP.gdb in C:\data\Lab3. Although for this lab you will be obtaining the census data using Abacus, Joey will demonstrate in the lab how to obtain the census attribute data files from the Departmental server, and where the descriptions of the census data files can be found. You will be using DMTI census files in this analysis, and may wish to do so in your final projects³.

Important: I have not provided an MXD file for this lab, so you will need to create your own. Remember to save your work often as you progress through the lab.

Starting ArcMap 10

As you should know by now, ArcMap will store any newly created files in a default geodatabase (C:\Documents and Settings\...\My Documents\ArcGIS\Default.gdb). Furthermore, upon starting, ArcMap will ask you to select a template (My Template: Blank Map). Since the data for this lab (and, we strongly suggest, the data you will be using in your projects, in order to make things less complicated) should be stored in C:\data, your first step whenever starting an ArcMap project is to set the default database to reside in C:\data:

1) Click on the folder icon to the right of C:\Documents \Default.gdb
2) Navigate to C:\data. This may require you to:
   a. Click on Go up one level
   b. Click on Connect to Folder
   c. Navigate to C:\data\Lab3
   d. Click on OK
3) Select MAUP.gdb.
4) Click on Add.
5) Click on OK.

To confirm that the default geodatabase has been set correctly⁴:

1) Click on Catalog
2) Select Folder Connections
3) Confirm that C:\data\Lab3 is listed as a Folder Connection.
4) Right-mouse click on MAUP
5) Ensure that MAUP is identified as the default geodatabase (if so, the entry Make Default Geodatabase will not be clickable).
6) You should confirm where the default geodatabase resides every time you start up ArcMap.

Add all of the files in the MAUP geodatabase to your map.

² A strong correlation could imply that the two types of crimes are committed by similar individuals.

³ Unfortunately, you cannot use the DMTI census geography files with the Statistics Canada census geography files since they do not match (that is, there is a slight shift between the two sets of files, and the boundaries will not line up if you attempt to combine them). To see the differences, look at how Parks_Lakes, derived from Statistics Canada data, doesn't line up with the DMTI files (e.g., look at the coastline around Stanley Park).

⁴ If you correctly perform the steps above, the following steps are redundant, but being doubly sure that your results will be stored in the appropriate location is always a useful exercise.

Crime data

The crime data being used in this lab was downloaded from the City of Vancouver VanMap web site (http://former.vancouver.ca/vanmap/c/crimedata.htm). Each incident address in the file was generalized to the (hundred) block level in order to prevent the identification of the actual incident location or address. The original file listed the type of crime in a single column (TYPE); however, for spatial analytical purposes presenting the incident types in this format is inconvenient. Thus, to make things easier for you, I have added two columns (Mischief and AutoTheft) to the original file and used Select by Attributes and then the Field Calculator to fill in the columns: a value of 1 in a column indicates that that type of crime (mischief under/over $5000 or theft of auto under/over $5000) occurred at that address.

Census data (attributes)

You have been provided with the census geography files for Vancouver for both the census tracts and the dissemination areas (DAs). To obtain census attribute data at the DA level we will follow the instructions provided by Tom Brittnacher (UBC GIS Librarian): Census Data and GIS. Complete instructions on how to obtain CT-level attribute data, as well as DA-level data, are provided in Tom's article. Below is a summary of the steps required to obtain the DA-level data for this lab:

1. Go to the Abacus website: http://abacus.library.ubc.ca/jspui/
2. Click the UBC Member link on the right under Login to Abacus.
3. Log in using your CWL or library barcode and PIN.
4. In the search box, type topic-based tabulations.
5. Click on Census of Canada. Topic-based Tabulations, 2006.
6. You will see a number of tables listed that cover a variety of topics and geographic areas. For the purposes of this lab we are looking for data related to the average dwelling value (look for Housing and Shelter Costs - Value of Dwelling (14)).
7. Click the link at the left of the appropriate row in order to download the file (you may need to right-mouse click). It is in Beyond 20/20 format, but is compressed into a ZIP file. Save the file.
8. After it has been downloaded, find the file and unzip it (typically: right-mouse click and Open with Windows Explorer) into C:\data\Lab3.
9. Open the file in Beyond 20/20 (double-clicking on the IVT file you just unzipped should open the file in Beyond 20/20).
10. Select Geography and drag it to the row heading (Figure 1).

Figure 1 Selecting Geography in Beyond 20/20

11. Select Value of dwelling and drag it to the column heading (Figure 2).

Figure 2 Selecting Value of dwelling in Beyond 20/20

12. Scroll to the right until you see the column labelled Average value of dwelling ($). Select it, and then right-mouse click and select Show. All of the other columns should disappear.
13. In order to select the British Columbia dissemination areas (all those beginning with 59) you'll need to scroll down through the list of geographies (about four-fifths of the way down through the table) and then click on the first BC entry, listed as British Columbia (59) (Figure 3).

Figure 3 Selecting British Columbia Geographies in Beyond 20/20

14. Scroll down the table until you find the last BC entry, just before Yukon Territory. Hold down the shift key and click on the last entry for British Columbia. All the BC DAs should now be coloured black. Click on Item in the menu and select Show. Only BC DAs should remain (Figure 4).

Figure 4 Select Vancouver's EAs in Beyond 20/20

15. Click the Next Label button (highlighted in Fig. 4) until the geography shows up as eight-digit dissemination area codes (along with some text-only rows).
16. To export the table, click on File in the menu and select Save As. Change the type to Excel Worksheet (*.xls). Navigate to C:\data\Lab3, and give the file a name (DADwelling.xls). Click OK.
17. Open the file you just saved in Excel⁵. You should delete the first two rows and the fourth row. The column heading Average value of dwelling $ includes spaces and symbols, which ArcGIS won't accept. Change the heading to AVG_Dwelling. Change the first column heading to DAID⁶ (Figure 5).

Figure 5 Excel showing the Beyond 20/20 results

18. Save the Excel file and close it (Excel may require that you save the file as an .xlsx file).

Attribute Join

The next step in using the census data is to link the attribute file (DADwelling) to the geography file (DA_Van) using a Join (attributes from a table). Your first step will be to Import the Excel file into the MAUP geodatabase:

1) Click on Catalog
2) Right-mouse click on MAUP (the geodatabase)
3) Select Import > Table (Single)
4) Click on the Folder icon (right side of Input Table selection box)
5) You may need to navigate to C:\data\Lab3 (not in the MAUP geodatabase)
6) Select the Excel file (double-click on the name), and then select the worksheet (likely called DADwelling$), and then click on Add.
7) Name the output table DADwelling, and then click on OK.

Before joining the attribute table to the census geography file, you should open the attribute table of the geography file (DA_Van) and note the fields that are present, and how many records (spatial units) are present in the file. After completing the join, note how many records/rows remain in the joined feature class attribute table (and, by association, how many geographic areas remain). You will discover that DA_Van has fewer records after the join. The reason for the difference is that Statistics Canada does not provide data for a dissemination area if the population is less than a specified amount⁷.

You should now Join the attribute table (DADwelling) to DA_Van using DAUID as the linking (field) variable⁸. You must select the Join Option Keep only matching records, or else you will run into problems later on. Clicking on Validate Join will tell you how many records will be lost after the join.

In order to be able to manipulate the data in the later steps of this lab you need to Export the Data into MAUP.gdb after you have joined the attribute data to the geography file (exporting the data makes the joins permanent). To export, right-mouse click on the layer name, select Data > Export Data, and navigate to C:\data\Lab3\MAUP.gdb. You may have to set the Save as type: to File and Personal Geodatabase feature classes. Name the exported file DA_VanD, and add it to the Table of Contents.

To speed things up, I have already added the census attribute average dwelling value to CT_Van (as Avg_Dwelling).

Question 1: How many people must live in a dissemination area before Statistics Canada does not suppress the data? Review this document (Introducing the Dissemination Area for 2001; a local copy is here) and search for Minimum population. Exactly how many dissemination areas are lost after joining the attribute table to the DA layer? Why does Statistics Canada suppress such data? What is random rounding and how does that affect the data? Would the effect of random rounding on any examination of analytical results be greater when working with DA files or with CT files (and why)? [5 marks]

⁵ If Excel does not allow you to open DADwelling.xls, you may need to Save As the file in Beyond 20/20 as a CSV file and read that file into Excel (DADwelling.csv). In addition, you may then need to set the FORMAT of the DAID column explicitly to TEXT within Excel before saving it and importing it into ArcMap.

⁶ Note: Although the DA identifiers are numbers, they must be formatted as text within Excel in order for ArcMap to properly match the DA identifiers in the Excel file to the (text) DA identifiers in the geography file (DA_Van).

Spatial Joins

The next step in preparing the data for analysis is to combine the crime data with the census data using spatial joins. The spatial joins will allow us to summarize the number of crime events within the respective spatial units (DA or CT). Right-mouse click on DA_VanD, select Joins and Relates, and then select Join.
Select Join data from another layer based on spatial location (a choice under What do you want to join to this layer?). The layer to join to this layer will be Van_Crimes. Under How do you want the attributes to be summarized? select Sum. Save the results into the MAUP geodatabase as DA_Crimes.

Do a spatial join using CT_Van and Van_Crimes, and save the results into the geodatabase as CT_Crimes. Open the attribute tables of the two new files, and note the variables that have been added. The variables that will be used in the following analyses are Avg_Dwelling, Count_, Sum_Mischief and Sum_AutoTheft. Count_ represents the total number of all crimes reported within that spatial unit (be it a DA or CT).

⁷ This loss of records after joining attribute files to spatial data is something you should always be on the watch for, since it could influence any subsequent interpretations of the data.

⁸ For those who can't recall how to perform a simple join: Right-mouse click on the geography layer (here, DA_Van), select Joins and Relates, select Join. You will be joining attributes from a table. Select the appropriate tables and field names. To learn more, search ArcMap help for Essentials of joining tables.
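The logic behind the attribute join (keep only matching records, with text keys) and the spatial join (count the crimes in each unit and sum the 0/1 columns) can be sketched in plain Python. This is only an illustration of the idea, not what ArcMap does internally: the DA codes, dwelling values, coordinates, rectangular "DAs" and TYPE strings below are all made up, and real DAs are polygons rather than boxes.

```python
def attribute_join(features, table, feature_key, table_key):
    """Inner join: keep only features with a matching table row."""
    lookup = {row[table_key]: row for row in table}
    return [{**f, **lookup[f[feature_key]]}
            for f in features if f[feature_key] in lookup]

def spatial_sum(zones, points, fields):
    """Count the points inside each zone's box and sum the given 0/1 fields."""
    results = []
    for zone in zones:
        xmin, ymin, xmax, ymax = zone["bbox"]
        inside = [p for p in points
                  if xmin <= p["x"] < xmax and ymin <= p["y"] < ymax]
        row = dict(zone, Count_=len(inside))
        for field in fields:
            row["Sum_" + field] = sum(p[field] for p in inside)
        results.append(row)
    return results

# Hypothetical geography: three box-shaped DAs with text identifiers.
da_van = [
    {"DAUID": "59150001", "bbox": (0, 0, 1, 1)},
    {"DAUID": "59150002", "bbox": (1, 0, 2, 1)},
    {"DAUID": "59150003", "bbox": (2, 0, 3, 1)},
]
# Hypothetical census table: the middle DA is suppressed, so it has no row.
da_dwelling = [
    {"DAID": "59150001", "AVG_Dwelling": 650000},
    {"DAID": "59150003", "AVG_Dwelling": 480000},
]
# Hypothetical incidents with a single TYPE column, as in the original file.
crimes = [
    {"x": 0.2, "y": 0.5, "TYPE": "Mischief Under $5000"},
    {"x": 0.7, "y": 0.1, "TYPE": "Theft Of Auto Over $5000"},
    {"x": 0.9, "y": 0.9, "TYPE": "Mischief Over $5000"},
    {"x": 0.4, "y": 0.4, "TYPE": "Commercial Break and Enter"},
    {"x": 1.5, "y": 0.5, "TYPE": "Mischief Under $5000"},
    {"x": 2.5, "y": 0.5, "TYPE": "Theft Of Auto Under $5000"},
]
# Derive the two indicator columns (1 = that type of crime occurred).
for crime in crimes:
    crime["Mischief"] = int(crime["TYPE"].startswith("Mischief"))
    crime["AutoTheft"] = int(crime["TYPE"].startswith("Theft Of Auto"))

da_vand = attribute_join(da_van, da_dwelling, "DAUID", "DAID")
lost = len(da_van) - len(da_vand)   # DAs dropped by the join
da_crimes = spatial_sum(da_vand, crimes, ["Mischief", "AutoTheft"])

print(lost)
for row in da_crimes:
    print(row["DAUID"], row["Count_"], row["Sum_Mischief"], row["Sum_AutoTheft"])
```

Because the keys are compared as text, a table whose identifiers were stored as numbers (59150001 rather than "59150001") would match nothing in this sketch, which mirrors the reason the DAID column must be formatted as text before the join.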

Data Exploration

Simple regression and spatial autocorrelation

In order to explore the relation between the number of mischief crimes and the number of auto thefts (in either a DA or a CT) we will perform a regression analysis (that is, does the number of mischief crimes in a DA or CT predict what the number of auto thefts will be in that DA or CT?). ArcMap 10 contains a regression routine that can perform this analysis. We will explore the effect of scale on the relation by comparing the results of the DA analysis with those using the CTs. To conduct a regression analysis:

1) Display the ArcToolbox window.
2) Select Spatial Statistics Tools, and then
3) Select Modeling Spatial Relations, and then
4) Select Ordinary Least Squares. In the pop-up window that appears, fill in the fields following the example presented below (be sure to fill in the Additional Options) (Figure 6).
5) Click on Show Help >> and then on Tool Help in order to learn more about ordinary least squares and how to interpret the results.

For the DA analysis, use DA_Crimes⁹ as the Input Feature Class and adjust the names of the optional output tables accordingly. The Output Feature Class maps show the residuals¹⁰ of the regression analyses (accept the default classification). The coefficients and diagnostic output tables should appear at the bottom of the Table of Contents (List by Source view)¹¹.

Question 2: What are the regression equations that describe the relation between the number of mischief crimes and the number of auto thefts for the DA data and the CT data? The numbers you are looking for can be found in the VanDA_COT and VanCT_COT tables, respectively. Recall that the general form of a regression equation is y = a + bx; here, y refers to the predicted number of auto thefts, a refers to the Intercept, and b refers to the coefficient associated with x, the observed number of mischief crimes. What is the R² value associated with each regression equation (found in the DOT tables)? [3]

⁹ Note that the VanDA_OLS map will have far fewer DAs present than does DA_Crimes. The reason for the dramatic loss of DAs is that if there have been no auto thefts in a DA then that DA is excluded from the analysis.

¹⁰ What are the residuals? Recall that the regression is modelling the relation between the number of mischief crimes in a DA or CT and the number of auto thefts in that area. In effect, you are using the number of mischief crimes in a DA or CT to predict the number of auto thefts in that area. The regression model produces a regression equation (a linear model between mischief crimes and auto thefts) that summarizes the relation for the entire study region (all of the City of Vancouver). You can then use the regression equation to determine what the predicted number of auto thefts should be in each DA or CT given the number of mischief crimes for that area. Subtracting the predicted value from the actual value recorded in that DA or CT produces a residual. The larger the absolute value of the residual, the greater the difference between the predicted value and the actual value. A positive residual indicates that the reported number of auto thefts is greater than that predicted by the model, while a negative residual indicates that the number of auto thefts is lower than that predicted by the model. If geography matters, then the residuals will reflect the influence of (e.g.) the surrounding neighbourhood characteristics on each DA's or CT's number of mischief crimes and number of stolen automobiles, and the residuals will be spatially autocorrelated.

¹¹ It might help you to interpret the results if you review the explanations you'll find if you go to Help > ArcGIS Desktop Help, and type in Interpreting OLS Results as your question.
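To make the regression equation y = a + bx and the definition of a residual concrete, here is a minimal sketch of a one-variable least-squares fit. The five pairs of counts are invented for illustration and are not taken from DA_Crimes or CT_Crimes; the ArcMap tool also reports many diagnostics that this sketch leaves out.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical counts for five spatial units.
mischief = [2, 4, 6, 8, 10]     # x: observed mischief crimes
auto_theft = [1, 3, 2, 5, 4]    # y: observed auto thefts

a, b = fit_line(mischief, auto_theft)
predicted = [a + b * x for x in mischief]

# Residual = actual value minus predicted value.
residuals = [y - p for y, p in zip(auto_theft, predicted)]

# R-squared = 1 - (residual sum of squares / total sum of squares).
mean_y = sum(auto_theft) / len(auto_theft)
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in auto_theft)
r_squared = 1 - ss_res / ss_tot

print(round(a, 2), round(b, 2))            # intercept 0.6, slope 0.4
print([round(r, 2) for r in residuals])    # [-0.4, 0.8, -1.0, 1.2, -0.6]
print(round(r_squared, 2))                 # 0.64
```

In this made-up example the second and fourth units have positive residuals (more auto thefts than the line predicts) and the others have negative residuals, which is the distinction the residual maps display.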

Figure 6 Establishing the OLS parameters

In order to determine how well the global aspatial regression describes the relation between the number of mischief crimes and number of auto thefts, we can look at the spatial autocorrelation of the residuals. If an aspatial model adequately describes the statistical relation between the variables then the residuals will not be spatially autocorrelated. If geography does matter (or the regression model ignores important variables) then the aspatial model will normally produce residuals that are spatially autocorrelated. To determine whether or not the residuals are spatially autocorrelated we can perform the following Moran's I test on the residuals:

1) Within ArcToolbox,
2) Select Spatial Statistics Tools
3) Select Analyzing Patterns
4) Select Spatial Autocorrelation (Moran's I)
5) In the pop-up window that appears
   a. Select VanCT_OLS as the Input Feature Class
   b. Select Residual as the Input Field
   c. Select Generate Report
   d. Accept the defaults for all other choices
   e. Click on OK
6) To view the results of your spatial autocorrelation test:
   a. Click on Geoprocessing (menu choice)
   b. Select Results
7) The ArcToolbox window should now display Results:
   a. Expand the Current Session
   b. Expand Spatial Autocorrelation (Morans I)
   c. If you double-click on the entry Report File: MoransI_Result.html within the Spatial Autocorrelation (Morans I) results you can view the graphical output (i.e., what is the Moran's I value, the Z value and how significant are the results?). (Read the URL for the file in order to determine where ArcMap has written the report file.)
8) You can switch between ArcToolbox and Results by clicking on the tabs (which may be along the bottom of the ArcToolbox / Results window, or on the side, depending on how you established your working environment within ArcMap).
9) Repeat the analysis for the results presented in the VanDA_OLS file.

Question 3: What are the Moran's I values and the significance levels for the CT and DA regression results? Are the residuals spatially autocorrelated? What do the results mean in terms of the distribution of auto thefts? Do the aspatial regressions produce results that adequately represent the geographic nature of the relation between mischief and auto theft? Do you feel that the OLS results are still valid, even though some spatial autocorrelation was observed in the residuals? [4]

Displaying the results of your data analyses

In order to create a map (a layout in ArcGIS terminology) that includes two sets of maps (the CT and DA results), you will need to Insert a new Data Frame (into the Data View's Table of Contents) and copy the DA regression results layer into that data frame, as well as copy the City and Parks_Lakes layers into the new data frame. To copy a layer from one data frame to another, right-mouse click on the layer and copy it, then left-mouse select the new Data Frame name, right-mouse click, and paste it.
To make it easier to keep track of things, left-mouse select each data frame's name, and use Properties / General to change the name of the first data frame from Layers to CT Map, and change the name of the New Data Frame to DA Map. In the Data View you will need to Activate each Data Frame in order to switch between the two sets of results.

You can now create a layout (View / Layout View) with the two data frames so that a single map shows the two sets of results (the DA and CT maps of the residuals)¹². Use the Size and Position tab of each data frame's properties to ensure that the two displays have the same width and height, so that a single bar scale can be used for both maps. Add the R² values to each data frame's layout in the bottom right corner (these values can be found in the DOT tables), as well as the Moran's I values. Note that the R² value (the coefficient of determination) is larger, indicating a stronger relation between the number of mischief crimes and the number of auto thefts, when working with the census tract data than with the dissemination area data. Such an increase in R² is expected when working with aggregated units such as census tracts.

¹² Additional instructions are provided on this page: http://www.geog.ubc.ca/courses/geob370/labs/arcgis_layout.htm
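The Moran's I values added to the layout come from the test run earlier on the regression residuals. As a rough illustration of what the statistic measures, the sketch below computes Moran's I for four hypothetical units arranged in a row, using simple binary "shares a border" weights. That weighting scheme is an assumption made for brevity; it is not necessarily the conceptualization ArcMap applies by default, and the sketch does not compute the Z value or significance level that the ArcMap report provides.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary weights (1 for each listed neighbour)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                       # deviations from the mean
    s0 = sum(len(nbrs) for nbrs in neighbours.values())  # sum of all weights
    cross = sum(z[i] * z[j]
                for i, nbrs in neighbours.items() for j in nbrs)
    return (n / s0) * cross / sum(d * d for d in z)

# Hypothetical layout: four units in a row, each bordering the next one.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Made-up residuals: similar values side by side vs. values that alternate.
clustered = [2.0, 1.0, -1.0, -2.0]
alternating = [2.0, -2.0, 1.0, -1.0]

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)

print(round(i_clustered, 3))    # 0.4: neighbours tend to have similar residuals
print(round(i_alternating, 3))  # -0.933: neighbours tend to have opposite residuals
```

Positive values indicate that neighbouring units tend to have similar residuals, which is the pattern to watch for when judging whether the aspatial regression has missed something geographic.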

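The Moran's I values requested in Question 3 can be reproduced in miniature to see what the statistic measures. The sketch below is a simplified illustration, not ArcMap's implementation: it assumes a hypothetical row of six areas, binary weights (each area neighbours only the areas directly before and after it), and invented residual values. It computes only the index itself, not the Z value or significance level that the ArcMap report also provides.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary (0/1) spatial weights.

    values     -- one value (e.g. a regression residual) per area
    neighbours -- dict mapping each area index to a list of neighbour indices
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights: one unit per (area, neighbour) pair.
    w_sum = sum(len(nb) for nb in neighbours.values())
    # Cross-products of deviations for every neighbouring pair.
    cross = sum(dev[i] * dev[j] for i, nb in neighbours.items() for j in nb)
    denom = sum(d * d for d in dev)
    return (n / w_sum) * (cross / denom)

# Hypothetical layout: six areas in a row, each touching its immediate
# neighbours only.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

clustered = [-3, -2, -1, 1, 2, 3]     # similar residuals sit next to each other
alternating = [1, -1, 1, -1, 1, -1]   # neighbouring residuals always differ

print(round(morans_i(clustered, chain), 3))    # 0.643 (positive autocorrelation)
print(round(morans_i(alternating, chain), 3))  # -1.0 (negative autocorrelation)
```

Values well above zero indicate that similar residuals cluster together in space, which is the pattern that casts doubt on an aspatial regression.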
For display purposes,¹³ place City as the top-most layer, using a hollow fill and an outline of 2.0. That will enable you to clearly see the boundaries of the city. Also add the Parks_Lakes layer; colour the parks green and the lakes blue using the Category field [Symbology: Categories / Unique Values].

Question 4: Describe the patterns of the residuals: how the two maps present (dis)similar patterns, and what the patterns say about the number of auto thefts relative to the number of mischief crimes in Vancouver. What impact has scale (i.e., using census tracts versus dissemination areas) had on the results? Visually identify two census tracts that appear to exhibit particularly high variability (that is, where the dissemination area map indicates that the single value assigned to the census tract poorly reflects the variability present in the DAs). It might help to zoom into an area and use the Identify tool in order to obtain specific values. [10] [4 of which will be for your map]

Exploratory Spatial Analyses

The Vancouver Police Department would like to see if there are areas of the city that exhibit similar patterns with respect to the number of mischief crimes and auto thefts. They also feel that property values should be considered when developing these crime neighbourhoods. ArcMap has a Spatial Statistics Tool, within Mapping Clusters, that will produce exactly what the Police Department is looking for: Grouping Analysis.¹⁴ After some preliminary analyses it was decided that, for both the census tracts and the dissemination areas, five groups (aka crime neighbourhoods) would be an appropriate number to work with.¹⁵ The grouping analysis parameters for the census tract analysis are presented below (Figure 7); for the DA analysis change the output file names to DA_group5 but keep the parameter settings the same.

Using the Output Report Files you should be able to determine the characteristics of each of the different crime neighbourhoods. With that information you should be able to provide a meaningful name for each neighbourhood type (that is, when producing your map the different areas should be given a meaningful name rather than simply being labelled 1, 2, 3, etc.). For example, by looking at the Parallel Box Plot, we can see that the census tracts assigned to group 1 exhibit the highest number of mischief crimes as well as the highest number of auto thefts in the city; the property values in those census tracts, however, are below the city average. On the other hand, census tracts belonging to group 2 exhibit very low numbers of both mischief crimes and auto thefts, while their property values are above the city average. Given this information you should be able to provide an appropriate name for each of the groups.

¹³ To move one layer so that it displays above another, you need to select the List by Drawing Order option of the Table of Contents.
¹⁴ Another very useful overview of grouping analysis is provided here.
¹⁵ This number was determined, in part, by running Grouping Analysis with the number of groups set to 15 and checking on Evaluate Optimal Number of Groups. The results of this analysis helped identify what an optimal number of groups would be.
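The step from grouping output to group names can be sketched in a few lines. The example below is an assumption-laden illustration rather than ArcMap's Grouping Analysis: it swaps in a plain k-means procedure with no spatial constraint, uses six hypothetical census tracts with invented values, and asks for two groups rather than five. Variables are standardized first so that dwelling values (in thousands of dollars) do not dominate the crime counts.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Convert each column to z-scores (mean 0, standard deviation 1)."""
    cols = list(zip(*rows))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in rows]

def nearest(point, centres):
    """Index of the centre closest to `point` (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, ctr)) for ctr in centres]
    return dists.index(min(dists))

def kmeans(points, start_centres, max_iter=50):
    """Plain k-means: assign to nearest centre, recompute centres, repeat."""
    centres = [list(c) for c in start_centres]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(pt, centres) for pt in points]
        if new_labels == labels:   # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centres)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centres[k] = [mean(col) for col in zip(*members)]
    return labels, centres

# Hypothetical tracts: (mischief crimes, auto thefts, dwelling value in $1000s)
tracts = [
    (120, 90, 450), (110, 95, 480), (130, 85, 430),
    (20, 15, 900), (25, 10, 950), (15, 12, 1000),
]
z = standardize(tracts)
labels, centres = kmeans(z, start_centres=[z[0], z[3]])

for k, centre in enumerate(centres):
    summary = ", ".join(f"{name} z={value:+.2f}" for name, value in
                        zip(("mischief", "auto theft", "dwelling value"), centre))
    print(f"group {k}: {summary}")
```

A group whose centre is positive on both crime variables and negative on dwelling value is above the overall average for crime and below it for property values, which is the kind of reading the Parallel Box Plot supports when choosing a name such as "high crime, lower value".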

Question 5: Produce a map showing the results of the grouping analysis for both the DAs and the CTs (similar in style to the map you produced in Question 4), with the groups named (as above). Provide a note that briefly describes what the two maps are showing (a paragraph or two written such that the City Police could understand what the maps are showing). In general, what do the results of the grouping analysis tell you about income (as reflected in the dwelling values) and security in the City of Vancouver (that is, is there a relation between areas of the city with higher dwelling values and criminal activities)? [10]

Figure 7: Establishing the grouping analysis parameters

Variable normalization

When working with census, crime and health data, it is often very important to normalize the variables prior to mapping them (i.e., converting an extensive variable to an intensive variable). For example, consider producing a map showing the number of mischief crimes within a CT. Produce a map that shows just that, using the CT_Crimes layer with Sum_Mischief as the variable being mapped, and the Quantities / Graduated Colors / Natural Breaks classification scheme (5 breaks). View the map and examine where the high and low values can be found.

Now, normalize the data (an option found in the Symbology tab of the Layer Properties window) by dividing the values by the Normalization variable Count_. Notice that the pattern changes: areas where mischief makes up a larger share of all crimes are now highlighted. Next, change the normalization variable to Shape_Area. The values being mapped now represent the intensity of mischief crimes (i.e., the number of mischief crimes per square metre).

Each map provides a different perspective on the geographic distribution of mischief crime in Vancouver. The important decision when working with extensive data then becomes the choice of normalizing variable (that is, which variable Y should be used to normalize variable X, and what message should the map send?).

Question 6: What are some of the census tracts that change the most after the normalizations (use Identify to determine the census tract numbers)? If you were working as a spatial analyst for the Vancouver Police, how would you describe the different perspectives presented on each map to your superiors? [4]

References

Amrhein, C. 1995. Searching for the elusive aggregation effect: Evidence from statistical simulations. Environment & Planning A, 27(1): 105.
Green, M. and Flowerdew, R. 1996. New evidence on the modifiable areal unit problem. Pages 41-54 in P. Longley and M. Batty (eds) Spatial analysis: modelling in a GIS environment. Cambridge: GeoInformation International.
Martin, D. 1991. Geographic Information Systems and their socioeconomic applications. London: Routledge.
Openshaw, S. 1996. Developing GIS-relevant zone-based spatial analysis methods. Pages 55-73 in P. Longley and M. Batty (eds) Spatial analysis: modelling in a GIS environment. Cambridge: GeoInformation International.
Wrigley, N., Holt, T., Steel, D., and Tranmer, M. 1996. Analysing, modelling, and resolving the ecological fallacy. Pages 25-40 in P. Longley and M. Batty (eds) Spatial analysis: modelling in a GIS environment. Cambridge: GeoInformation International.
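Worked example for the Variable normalization section: the two normalizations are simple divisions, and a few invented numbers show how they can reorder the map. The field names Sum_Mischief, Count_ and Shape_Area come from the lab; the three tracts and their values are hypothetical, and Count_ is assumed here to hold the total number of crimes in the tract.

```python
# Hypothetical census tracts (values invented for illustration only).
tracts = [
    {"ct": "A", "Sum_Mischief": 120, "Count_": 600, "Shape_Area": 2_000_000.0},
    {"ct": "B", "Sum_Mischief": 60, "Count_": 100, "Shape_Area": 500_000.0},
    {"ct": "C", "Sum_Mischief": 90, "Count_": 300, "Shape_Area": 4_000_000.0},
]

def rank_by(rows, key):
    """Tract labels ordered from the highest to the lowest mapped value."""
    return [row["ct"] for row in sorted(rows, key=key, reverse=True)]

# Extensive variable: the raw count of mischief crimes.
by_count = rank_by(tracts, lambda r: r["Sum_Mischief"])
# Normalized by Count_: mischief as a share of all crimes in the tract.
by_share = rank_by(tracts, lambda r: r["Sum_Mischief"] / r["Count_"])
# Normalized by Shape_Area: mischief crimes per square metre.
by_density = rank_by(tracts, lambda r: r["Sum_Mischief"] / r["Shape_Area"])

print("raw count:", by_count)      # ['A', 'C', 'B']
print("share of crime:", by_share) # ['B', 'C', 'A']
print("per unit area:", by_density) # ['B', 'A', 'C']
```

Tract A leads on raw counts but falls to last place once mischief is expressed as a share of all crime, which is the kind of change Question 6 asks you to look for.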