Evaluating Travel Impedance Agreement among Online Road Network Data Providers
Derek Marsh, Eric Delmelle, Coline Dony
Department of Geography and Earth Sciences, University of North Carolina at Charlotte
Google Maps: 205 miles
Rand McNally: 205.4 miles
Yahoo: 205.93 miles
MapQuest Open: 205.38 miles
Online geographic data providers
Web services such as Google Maps, Bing Maps, and MapQuest provide unprecedented access to spatial data and analytical tools:
  geocoding addresses
  identifying points of interest
  determining travel directions
Simple network analysis without the need for a GIS network dataset
No data preparation necessary
Available to GIS and non-GIS users alike
Online geographic data providers
For sizeable use, providers generally require a paid license; directions service requests are limited otherwise:
  Google Maps: 2,500/day
  Bing Maps: 10,000/90 days
  MapQuest: 5,000/day
An alternative is openly sourced, public domain volunteered geographic information (VGI):
  MapQuest Open: unlimited
Volunteered Geographic Information
"The widespread engagement of large numbers of private citizens, often with little in the way of formal qualifications, in the creation of geographic information" (Goodchild 2007)
One of the most successful examples of VGI, OpenStreetMap (OSM), offers a free, editable map of the world with no restrictions governing use for spatial analysis
VGI data quality
Despite VGI's potential, the question remains: what is the quality of these data?
Because participants potentially lack any formal training in geographic data collection, central coordination is weak to non-existent, and adherence to a particular data structure is not required, no assumptions can be made about the overall quality of uploaded data (Goodchild & Li 2012)
Literature: VGI data quality, comparative assessments
Girres & Touya (2010): in comparison to the French National Mapping Agency, point positional displacement was on average 6.65 meters
Haklay (2010): in comparison to the Ordnance Survey of Great Britain, greater than 81% overlap among major roads and an average point displacement of 6 meters for the OSM dataset within study sites across London
Ciepłuch et al. (2010): in comparison to Google Maps and Bing Maps, accuracy is inconsistent among all three providers
VGI data quality: indicator assessments
"If one individual contributes an error, others can be expected to edit and correct the error, and the success of this mechanism rises in proportion to the number who look at the contribution" (Linus's Law) (Goodchild & Li 2012)
Haklay et al. (2010): positional accuracy improved with an increase in the number of contributors up to a threshold (n > 13), at which improvement stabilized
Keßler & Groot (2010): without a reference dataset, the volume of user contribution to an area or object in OSM is positively correlated with the trustworthiness of the dataset
Research objectives and questions
I. Evaluating the uncertainty of travel impedance estimates
  What is the degree of uncertainty in travel impedance estimates among online road network data providers?
  Do routes calculated using VGI data present significantly different travel impedance estimates in comparison to commercial online spatial datasets?
II. VGI user contribution: applying Linus's Law at the network object level
  Is there a correlation between the number of contributors and the level of agreement?
Methodology: workflow overview
1. Identify O-D pairs: origins and destinations are snapped to the network (tertiary roads) and stored as lat/long points
2. Travel estimation ($d_{ij,k}$, $t_{ij,k}$): batch routing through each network provider API (Google Maps, ArcGIS Online, OpenStreetMap); travel time and distance estimates are returned in JavaScript Object Notation (JSON)
3. Disagreement assessment: (1) difference ($\Delta d$, $\Delta t$) among online providers, (2) percent difference, (3) correlation ($r$)
4. Linus's Law: identify route road segments (network metadata API: MapQuest Open, OpenStreetMap), store contributor information (network metadata API: OpenStreetMap), and compute the distance-weighted route contributor average
$$C_a = \frac{\sum (w \cdot c_w)}{w_t}$$
where $w$ = segment distance (selected), $w_t$ = total segment distance, and $c_w$ = number of segment contributors
Case study area
North Carolina offers several clear urban locations, a diverse road network, and a range of topographical environments in which to assess road network uncertainty.
Methodology: identify O-D pairs
Workflow: origins and destinations, network snapping (tertiary roads), O-D pairs stored as lat/long points
Remove limited-access roads from the network dataset; origins and destinations are selected from tertiary roads
The modified dataset is segmented at nodes; begin nodes serve as candidate origin and destination points
The specific implementation depends on the study area and is discussed further in the results
Select 2n randomly distributed candidate points to form n origin-destination (OD) pairs
Store OD pairs in a text file as latitude, longitude, and unique identifier
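The selection step above can be sketched in Python. This is a minimal illustration rather than the authors' script: the function names, the comma-separated layout, and the fixed random seed are assumptions, and the candidate coordinates are purely illustrative.

```python
import random


def make_od_pairs(candidates, n, seed=0):
    """Draw 2n distinct candidate points and pair them into n OD pairs.

    candidates: list of (lat, lng) tuples (e.g. begin nodes of road segments).
    Returns tuples of (ODIndex, originlat, originlng, destinationlat, destinationlng).
    """
    if 2 * n > len(candidates):
        raise ValueError("need at least 2n candidate points")
    rng = random.Random(seed)                # seeded so the draw is repeatable
    chosen = rng.sample(candidates, 2 * n)   # sampling without replacement
    origins, destinations = chosen[:n], chosen[n:]
    return [
        (idx, o[0], o[1], d[0], d[1])
        for idx, (o, d) in enumerate(zip(origins, destinations), start=1)
    ]


def format_od_lines(pairs):
    """Render OD pairs as text lines (assumed comma-separated layout)."""
    header = "ODIndex,originlat,originlng,destinationlat,destinationlng"
    return [header] + [",".join(str(v) for v in row) for row in pairs]


# Illustrative candidate points (lat, lng); not the study's node set.
candidates = [
    (35.2458, -80.8045), (35.2261, -80.9443), (35.0783, -80.8225),
    (35.0758, -80.8946), (35.2163, -80.7872), (35.1013, -80.8255),
]
pairs = make_od_pairs(candidates, n=3)
lines = format_od_lines(pairs)
```

Writing `lines` to a text file, one entry per line, gives the stored OD list. Sampling without replacement ensures that no point is used twice, so each of the 2n points appears in exactly one pair.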
Results: OD selection
Example of North Carolina (total = 100,000 OD pairs): 14,300 pairs are selected in each of seven distance intervals: 0-50 kilometers (km), 50-100 km, 100-150 km, 150-200 km, 200-250 km, 250-500 km, and 500-1000 km. It was necessary to widen the category intervals for the longer distances to achieve an equally stratified sample.
Road network, State of North Carolina:
  Exclude interstate highways
  Identify begin and end nodes of all resulting road segments
  Exclude begin nodes in the proximity of highways (incorrect snapping)
  P = 306,788; Q = 47,059,285,078
  P = 30,000 (P_c = 300 (*)); Q = 449,985,000
(*) 300 pairs of vertices were selected at random for each county (stratified random sampling of vertices)
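The reported Q values are consistent with counting unordered pairs among P points, Q = P(P - 1)/2. The slide does not state this formula, so it is an inference from the numbers. A short Python check under that assumption, with 100 counties inferred from 30,000 / 300:

```python
from math import comb


def possible_pairs(p):
    """Number of unordered point pairs among p points: p * (p - 1) / 2."""
    return comb(p, 2)


all_points = 306_788             # P before sampling (from the slide)
points_per_county = 300          # P_c (from the slide)
counties = 100                   # assumption: inferred from 30,000 / 300
sampled_points = points_per_county * counties

q_all = possible_pairs(all_points)
q_sampled = possible_pairs(sampled_points)
print(q_all)      # 47059285078
print(q_sampled)  # 449985000
```

Sampling 300 points per county reduces the candidate pair space by roughly two orders of magnitude before the 100,000 OD pairs are drawn.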
Results: OD selection, example of North Carolina
[Figure: all pairs of OD points]
[Figure: spider map of OD pairs originating or terminating in Ashe County]
Methodology: travel estimation ($d_{ij,k}$, $t_{ij,k}$)
Online data providers (k):
  Reference datasets: Google Maps (Tele Atlas), ArcGIS Online (NAVTEQ)
  VGI dataset: OpenStreetMap
Technical issues:
  Google Directions API limited to 2,500 requests per day
  ArcGIS Online requires a license
  OpenStreetMap directions algorithm provided by MapQuest Open
  Assumes no significant difference due to heuristic or routing algorithm
  Travel estimates do not account for traffic or other real-time data
  Precision limited to 1/10th of a mile
Batch routing in Python: for each OD pair, a URL string is formed that includes the network provider web address, the OD coordinates, and the routing specifications. A new URL is created for each provider, k. Results are returned in JavaScript Object Notation (JSON), an easily read data format that uses key-value pairs, from which the travel time and distance estimates are extracted.
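A hedged sketch of the URL-building and JSON-parsing steps follows. The endpoint addresses, query parameter names, and JSON keys are placeholders invented for illustration; each real provider defines its own request and response layout, which is not reproduced here. No network request is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoints: placeholders, not the providers' real addresses.
PROVIDERS = {
    "Google": "https://routing.example.com/google/directions",
    "ArcGIS": "https://routing.example.com/arcgis/route",
    "OSM": "https://routing.example.com/osm/route",
}


def build_url(base, origin, destination):
    """Form one request URL from a provider address and OD coordinates."""
    query = urlencode({
        "origin": f"{origin[0]},{origin[1]}",            # assumed parameter name
        "destination": f"{destination[0]},{destination[1]}",
        "units": "miles",                                # assumed routing option
    })
    return f"{base}?{query}"


def parse_estimate(payload):
    """Extract (miles, minutes) from a JSON string with an assumed key layout."""
    doc = json.loads(payload)
    return round(doc["distance_miles"], 1), doc["time_minutes"]


origin, destination = (35.2458, -80.8045), (35.2261, -80.9443)
# One URL per provider k for the same OD pair.
urls = {k: build_url(base, origin, destination) for k, base in PROVIDERS.items()}

# Stand-in for a provider response; a real run would fetch each URL instead.
sample_response = '{"distance_miles": 10.4, "time_minutes": 18}'
estimate = parse_estimate(sample_response)
```

In a batch run, the loop over OD pairs would also need to respect each provider's request quota, for example by spreading requests over several days.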
Methodology: travel impedance estimates
$d_{ij}$: travel distance; $t_{ij}$: travel time
Batch routing through each network provider API (Google Maps, ArcGIS Online, OpenStreetMap) returns JSON, from which travel time and distance estimates ($d_{ij,k}$, $t_{ij,k}$) are tabulated:

ODIndex  originlat  originlng  destinationlat  destinationlng  GoogleMile  GoogleMin  ArcGISMile  ArcGISMin  OSMMile  OSMMin
1        35.2458    -80.8045   35.2261         -80.9443        10.4        18         11.1        17.4       10.4     17.0
2        35.0783    -80.8225   35.0758         -80.8946        5.5         13         5.5         13.0       5.5      12.1
3        35.2163    -80.7872   35.1013         -80.8255        10.3        21         10.2        19.7       10.1     16.5
4        35.2355    -80.7941   35.2922         -80.9497        11.5        20         12.1        20.2       12.1     20.2
5        35.2606    -80.8525   35.0418         -80.8542        18.9        24         18.8        24.7       18.9     22.1
6        35.2185    -80.7703   35.4468         -80.8836        21.4        25         21.8        27.1       21.9     25.9
7        35.2212    -80.8299   35.1972         -80.7583        5.1         8          4.8         8.0        4.9      7.2
8        35.3304    -80.7344   35.2503         -80.9238        14.7        17         14.7        18.5       14.8     19.0
9        35.0441    -80.7744   35.3107         -80.7203        26.9        28         26.7        30.0       27.0     30.3
10       35.4195    -80.8763   35.1178         -80.9753        27.7        32         27.6        34.5       27.7     35.8
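The three disagreement measures named in the workflow (difference, percent difference, correlation) can be illustrated on the distance columns of the table above. This is a sketch only: the slides do not define percent difference, so the version here, relative to a chosen reference provider, is an assumption, and Pearson's r is assumed for the correlation.

```python
from math import sqrt

# Distance estimates in miles for the ten OD pairs tabulated above.
google = [10.4, 5.5, 10.3, 11.5, 18.9, 21.4, 5.1, 14.7, 26.9, 27.7]
arcgis = [11.1, 5.5, 10.2, 12.1, 18.8, 21.8, 4.8, 14.7, 26.7, 27.6]
osm = [10.4, 5.5, 10.1, 12.1, 18.9, 21.9, 4.9, 14.8, 27.0, 27.7]


def differences(a, b):
    """Pairwise difference a - b for each OD pair."""
    return [x - y for x, y in zip(a, b)]


def percent_differences(a, b):
    """Difference as a percentage of the reference b (assumed definition)."""
    return [100.0 * (x - y) / y for x, y in zip(a, b)]


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


delta_d = differences(osm, google)
pct_d = percent_differences(osm, google)
r_osm_google = pearson_r(osm, google)
r_arcgis_google = pearson_r(arcgis, google)
```

Swapping in the `...Min` columns gives the same measures for travel time.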
Results
Low uncertainty in estimated travel distance
ArcGIS Online overestimates
Results: correlation coefficients, North Carolina
Outlier(s): Google Maps includes ferries in the routing calculation
What about contributors?
Methodology: Linus's Law
Identify route road segments (network metadata API: MapQuest Open, OpenStreetMap) and store contributor information (network metadata API: OpenStreetMap)
Distance-weighted route contributor average:
$$C_a = \frac{\sum (w \cdot c_w)}{w_t}$$
where $w$ = segment distance (selected), $w_t$ = total segment distance, and $c_w$ = number of segment contributors
Fewer contributors are required to validate shorter road segments, but a higher proportion of contributors is needed to verify the accuracy of a longer route
A sample of road segments is used from the total route; thus, the user average is proportional to the length of known road segments
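A minimal Python sketch of the distance-weighted contributor average follows, assuming each sampled road segment is available as a (distance, contributor count) pair. The function name and the segment values are hypothetical.

```python
def route_contributor_average(segments):
    """Distance-weighted mean number of contributors along a route.

    segments: list of (segment_distance, contributor_count) tuples for the
    sampled road segments of one route.
    """
    total_distance = sum(w for w, _ in segments)   # total segment distance
    if total_distance == 0:
        raise ValueError("total segment distance must be positive")
    weighted = sum(w * c for w, c in segments)     # sum of distance * contributors
    return weighted / total_distance


# Hypothetical route: a 2-mile segment edited by 1 contributor and
# two 1-mile segments edited by 4 and 6 contributors.
segments = [(2.0, 1), (1.0, 4), (1.0, 6)]
c_a = route_contributor_average(segments)
print(c_a)  # 3.0
```

The unweighted mean of the three counts would be about 3.67; weighting by distance pulls the average toward the long, lightly edited segment, which is the intent of using segment length as the weight.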
Results: Linus's Law
North Carolina OD pairs
Level of uncertainty decreases as the number of contributors increases
Initial increase in uncertainty corresponds to the greatest sample of contributor averages (overall average = 3.27)
Large number of outliers
Results at different distances
[Figure panels: 0-25 mi, 25-75 mi, 75-250 mi, >250 mi]
Discussion and conclusion
Correlation coefficients and percent difference both indicated relatively high agreement.
1. Uncertainty was extremely low at long travel distances
2. Shorter, county-wide distances showed greater uncertainty among all providers
3. The VGI dataset OSM was as reliable as the two commercial providers in estimating travel distance
4. OpenStreetMap may be a viable dataset for routing and navigation purposes within the selected study areas
Discussion and conclusion
VGI user contribution: applying Linus's Law at the network object level
1. Disagreement decreases with an increasing number of contributors
2. The relationship is not uniform across different route lengths
Future research opportunities
The approach could be expanded to new areas of the OSM dataset (e.g., other regions and countries)
  Urban travel
  Rural travel
Analyze overlap among individual routes to explain where and why travel impedance uncertainty occurred
Does the Linus's Law trend hold in other states and other countries?
Thank you
Results: correlation coefficients, Mecklenburg County
Greater uncertainty across all providers
Correlation still high
Same pattern of under/overestimation
Greater uncertainty at 20-35 miles
Results: percent difference, Mecklenburg County
Trends in the correlation plots are corroborated by percent difference
ArcGIS Online produces greater uncertainty around 15 miles
OSM has greater uncertainty at 30 miles