Inclusion of Non-Street Addresses in Cancer Cluster Analysis

Similar documents
Acknowledgments xiii Preface xv. GIS Tutorial 1 Introducing GIS and health applications 1. What is GIS? 2

Census Geography, Geographic Standards, and Geographic Information

Visualization of Origin- Destination Commuter Flow Using CTPP Data and ArcGIS

Utilizing Data from American FactFinder with TIGER/Line Shapefiles in ArcGIS

Tracey Farrigan Research Geographer USDA-Economic Research Service

ZIP Code Tabulation Areas For Census 2000

FleXScan User Guide. for version 3.1. Kunihiko Takahashi Tetsuji Yokoyama Toshiro Tango. National Institute of Public Health

BROOKINGS May

Long Island Breast Cancer Study and the GIS-H (Health)

GEOCODING SELF-REPORTED ADDRESSES: LESSONS LEARNED. Karyn Backus CT Department of Public Health

Modeling Urban Sprawl: from Raw TIGER Data with GIS

Visualization of Commuter Flow Using CTPP Data and GIS

Spatial Organization of Data and Data Extraction from Maptitude

Defining Statistically Significant Spatial Clusters of a Target Population using a Patient-Centered Approach within a GIS

Neighborhood social characteristics and chronic disease outcomes: does the geographic scale of neighborhood matter? Malia Jones

Areal Interpolation Methods using Land Cover and Street Data. Jeff Bourdier GIS Master s s Project Summer 2006

Appendix B: Data Sources. In this exercise you will: Find data sources Download needed data

Rural Alabama. Jennifer Zanoni. Geography Division U.S. Census Bureau. Alabama State Data Center 2018 Data Conference Tuscaloosa, Alabama

GIS Level 2. MIT GIS Services

Bayesian Hierarchical Models

Demographic Data. How to get it and how to use it (with caution) By Amber Keller

Census 2000 ZCTAs ZIP Code Tabulation Areas Technical Documentation 2000

2010 Census Data Release and Current Geographic Programs. Michaellyn Garcia Geographer Seattle Regional Census Center

Working with Census 2000 Data from MassGIS

GIS 520 Data Cardinality. Joining Tabular Data to Spatial Data in ArcGIS

Geographic Products and Data. Improvements in Spatial Accuracy and Accessing Data

The History Behind Census Geography

Exercise on Using Census Data UCSB, July 2006

Analyzing the Geospatial Rates of the Primary Care Physician Labor Supply in the Contiguous United States

Updating the Urban Boundary and Functional Classification of New Jersey Roadways using 2010 Census data

Frontier and Remote (FAR) Area Codes: A Preliminary View of Upcoming Changes John Cromartie Economic Research Service, USDA

Tax Jurisdiction Sourcing Data Bases

Targeted LiDAR use in Support of In-Office Address Canvassing (IOAC) March 13, 2017 MAPPS, Silver Spring MD

WlLPEN L. GORR KRISTEN S. KURLAND. Universitats- und Landesbibliothek. Bibliothek Architektur und Stadtebau ESRI

Colm O Muircheartaigh

The Road to Data in Baltimore

A Comprehensive Method for Identifying Optimal Areas for Supermarket Development. TRF Policy Solutions April 28, 2011

Cluster Analysis using SaTScan

GIS Lecture 5: Spatial Data

The History Behind Census Geography

SaTScan TM. User Guide. for version 7.0. By Martin Kulldorff. August

John Laznik 273 Delaplane Ave Newark, DE (302)

Cluster Analysis using SaTScan. Patrick DeLuca, M.A. APHEO 2007 Conference, Ottawa October 16 th, 2007

GIS for Actuaries Part 1 CAS RPM Garrett Bradford March 15, 2016

Rural Pennsylvania: Where Is It Anyway? A Compendium of the Definitions of Rural and Rationale for Their Use

The Joys of Geocoding (from a Spatial Statistician s Perspective)

Dan Goldberg GIS Research Laboratory Department of Computer Science University of Southern California

Railway suicide clusters: how common are they and what predicts them? Lay San Too Jane Pirkis Allison Milner Lyndal Bugeja Matthew J.

Spatial Clusters of Rates

USING CLUSTERING SOFTWARE FOR EXPLORING SPATIAL AND TEMPORAL PATTERNS IN NON-COMMUNICABLE DISEASES

Hennepin GIS. Tree Planting Priority Areas - Analysis Methodology. GIS Services April 2018 GOAL:

Institutional Research with Public Data and Open Source Software

Luc Anselin Spatial Analysis Laboratory Dept. Agricultural and Consumer Economics University of Illinois, Urbana-Champaign

Valdosta State University Strategic Research & Analysis

NSF CNH-Ex # Political Fragmentation Indicator Database (Version 3.01)

GIS Lecture 4: Data. GIS Tutorial, Third Edition GIS 1

Dasymetric Mapping Using High Resolution Address Point Datasets

Urban Growth and Development Using SLEUTH: Philadelphia Metropolitan Region

A spatial scan statistic for multinomial data

Using Geographic Information Systems for Exposure Assessment

Improving TIGER, Hidden Costs: The Uncertain Correspondence of 1990 and 2010 U.S. Census Geography

Techniques for Science Teachers: Using GIS in Science Classrooms.

Using American Factfinder

These modules are covered with a brief information and practical in ArcGIS Software and open source software also like QGIS, ILWIS.

Housing Action Illinois Membership Coverage and Recommended New Office Location

1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX

An Introduction to SaTScan

Are You Maximizing The Value Of All Your Data?

Data Aggregation with InfraWorks and ArcGIS for Visualization, Analysis, and Planning

Medical GIS: New Uses of Mapping Technology in Public Health. Peter Hayward, PhD Department of Geography SUNY College at Oneonta

CWPP_Wildland_Urban_Interface_Boundaries

GIS for the Non-Expert

Outline. Practical Point Pattern Analysis. David Harvey s Critiques. Peter Gould s Critiques. Global vs. Local. Problems of PPA in Real World

Catalogue No. 92F0146GIE. Dissemination Area Reference Maps 2001 Census Reference Guide

Background of Project

Analysis of Bank Branches in the Greater Los Angeles Region

Using ArcGIS Server to Bring Geospatial Analysis

Outline. ArcGIS? ArcMap? I Understanding ArcMap. ArcMap GIS & GWR GEOGRAPHICALLY WEIGHTED REGRESSION. (Brief) Overview of ArcMap

Lecture 9: Geocoding & Network Analysis

Geospatial Analysis of Job-Housing Mismatch Using ArcGIS and Python

GIS Spatial Statistics for Public Opinion Survey Response Rates

Developing Spatial Data to Support Statistical Analysis of Education

The Building Blocks of the City: Points, Lines and Polygons

DRAFT RURAL-URBAN POPULATION CHANGE IN PUERTO RICO, 1990 TO 2000

Outline. Chapter 1. A history of products. What is ArcGIS? What is GIS? Some GIS applications Introducing the ArcGIS products How does GIS work?

Demographic characteristics of the School of International Studies 9 th Grade class and their success their first semester.

College Of Science Engineering Health: GIS at UWF

Presented by Mark Coding & Rohan Richards National Spatial Data Management Division, Jamaica Esri User Conference 2014

Popular Mechanics, 1954

Keywords: Air Quality, Environmental Justice, Vehicle Emissions, Public Health, Monitoring Network

Changes in Transportation Infrastructure and Commuting Patterns in U.S. Metropolitan Areas,

Introducing GIS analysis

Big-Geo-Data EHR Infrastructure Development for On-Demand Analytics

Policy Paper Alabama Primary Care Service Areas

Census Transportation Planning Products (CTPP)

Deriving Population Exposure Fatality Rate Estimates for Tornado Outbreaks Using Geographic Information Systems (GIS)

Sidestepping the box: Designing a supplemental poverty indicator for school neighborhoods

Geodatabase for Sustainable Urban Development. Presented By Rhonda Maronn Maurice Johns Daniel Ashney Jack Anliker

1 ST High Level Forum on United Nations Global Geoespatial Information Management GEOGRAPHIC REFERENCING OF ECONOMIC UNITS. México

Neighborhood Locations and Amenities

Transcription:

Inclusion of Non-Street Addresses in Cancer Cluster Analysis Sue-Min Lai, Zhimin Shen, Darin Banks Kansas Cancer Registry University of Kansas Medical Center

KCR (Kansas Cancer Registry) KCR: population-based central registry for cancer incidence information in Kansas Kansas residents diagnosed or treated with cancer in other states, such as MO, NE, CO, OK, TX, are also registered in the KCR database through data exchange. Approximately 13,000 cancer cases collected annually About 16%-18% with no standard street address

Address Types Street address: 1200 Main St, Kansas City, KS 66160 Non-Street address: PO BOX 32, city name, KS 12345 RR 2 BOX 269, city name, KS 12345 RFD 4 BOX 15, city name, KS 12345 HC 69 BOX 17, city name, KS 12345

Purpose Use datasets (cancer dataset from KCR, population dataset, GIS files) : To identify the address characteristics for the prostate cancer patients collected in Kansas To examine the effect of the non-street addresses on the results of cluster analysis To test the viability of using ZCTAs as the areal units for cluster analysis

ZCTA ZCTA(ZIP Code Tabulation Area), an entity developed by the U.S. Census Bureau for tabulating summary statistics Initially designed to overcome difficulties in defining the land area covered by each USPS ZIP Code Each ZCTA aggregates the census blocks that have the same predominant ZIP code associated with the residential mailing addresses in the Census Bureau s Master Address File

Data Used Cancer Cases: 9,356 prostate cancer cases diagnosed between 1995 and 1999 for Kansas residents Population: U.S. Census 2000 Summary File, classified by gender and by eighteen age groups Roads file for geocoding reference: US Census TIGER/Line 2000 program, converted to ESRI coverage and shapefile formats County/ZCTA boundary files: Provided by the Kansas State Data Access and Support Center (DASC), projected to the reference system of Lambert Conformal Conic, with the map unit of meter

Counties:105

Census Tracts:727

ZCTAs:721

Beale Codes Beale Codes: measure the rurality degree From the USDA Economic Research Service, also named ERS Rural-Urban Continuum Codes Divided into metropolitan counties, and non-metropolitan counties Metropolitan counties: 3 sub metro groupings by the population size of their metro area >=1 million, 250,000-1 million, <=250,000 Non-metropolitan counties:6 non-metro groupings by degree of urbanization and adjacency to a metro areas, >=20,000, 2,500-1999, <2,500, adjacency or not Beale code 1 through 9: 105 Kansas Counties, 6, 4, 7, 3, 8, 11, 23, 4, 39

Geocoding ArcInfo 8.3 desktop, match score set at 80 for both automatic and interactive process, side offset was set at 5 meters Cases with street addresses were geocoded to the points along the street segments, and then spatially joined to the ZCTA/county polygons Cases with or without street addresses were also geocoded directly to ZCTAs by patient mailing ZIP codes

Geocoding Continued USPS online address lookup: zip codes Yahoo online map: missing street number range in the reference files Zabasearch: patient identifiers TIGER/Line files and cases by individual counties for interactive geocoding, (237mb for shapefile, 156 mb for DBF table, slow in query, selection, sorting)

SaTScan To test both the presence of clustering and the general location of the clusters Circular window, Likelihood Ratio, simulation by Monte Carlo randomization, ranking, p value Assume that the count of prostate cancers in each polygon follows the Poisson distribution Null hypothesis that within any age group, the risk of getting prostate cancer is the same across the ZCTA or county polygons

SaTScan Input 1. Coordinates table: ZCTA/County FIPS Code (Federal Information Processing Standards), X coordinate, and Y coordinate. ZCTA X, Y: centroid coordinates of the ZCTA polygons County X, Y: centroid coordinates of the largest incorporated place 2. Case table: ZCTA/County FIPS Code, 18 age groups, and prostate cancer case counts 3. Population table: ZCTA/County FIPS Code, 18 age groups, population size, time (null variable) 4. Other parameters setting: purely spatial, scanning high rates only, 999 Monte Carlo replications, Cartesian coordinates, 50% population at risk for maximum spatial cluster size.

Distribution of Cases with No Street Addresses Beale Code* Problematic Cases Total Cases Percentage (%) 1 86 1966 4 2 115 2096 5 3 120 1163 10 4 96 556 17 5 105 880 12 6 212 668 32 7 279 1099 25 8 58 128 45 9 418 800 52 * Higher Beale score indicates lower urbanization level

In Kansas, prostate cancer cases with no standard street addresses were more likely to be collected in the rural areas than in the urban areas.

Counties: 84 % vs. 100 % Cases

Without including the non-street addresses, the polygons in urban areas could appear as significant cluster.

ZCTAs: 84% Cancer Cases

ZCTAs : 100% Cancer Cases

Single Race Only: White

10 Counties/164 ZCTAs around Wichita

Conclusions: When analysis only includes the cases with street addresses, insignificant cancer clusters could appear significant statistically, which is more likely to happen in the urban areas in Kansas, and the identified clusters often appears with higher SMR. After adjusting for the covariates of underlying population density and age with real-world case dataset, ZCTAs produced a meaningful results, similar to counties or census tracts in both the presence of the significant clustering and the general location of the clusters. After considering race and urbanization factors, the clustering presence and cluster location remain by ZCTAs, worthy of further field investigation

Conclusion Continued: With the limit of boundary instability, there is the potential to conduct cluster analysis with cancer registry data using the aggregated units of ZCTAs 1. simplify the time-consuming geocoding operation in comparison to Census Tracts 2. utilize the geographically based census statistics in comparison to USPS ZIP codes 3. produce a geographic pattern with a finer resolution in comparison to counties

THANK YOU!