Quality Assessment of Shuttle Radar Topography Mission Digital Elevation Data

Third International Conference on Geographic Information Science
College Park, Maryland, October 20-23

Ashton Shortridge
Dept. of Geography, Michigan State University
ashton@msu.edu

Thanks to:
  - Scott Oppman, Oakland County (Michigan) Information Technology Dept., for data!
  - NASA personnel and many others for flying the SRTM & processing all that data!

Shuttle Radar Topography Mission
  - Flown in February 2000
  - Collected data over 80% of Earth's land area
  - All land between 60 degrees N and 56 degrees S
  - Data released at 1 arc second interval for the US
  - Released at 3 arc second interval for the rest of the world

Data Collection
  - Radar signals transmitted from the Shuttle
  - Received back at two antennas
      - One in the shuttle bay
      - One on the end of a 60 m boom
  - Difference between the two signals used to reconstruct elevation
  - http://www.jpl.nasa.gov/srtm/missionoverview.html

Data Resources
  - Information: http://srtm.usgs.gov/
  - Americas Download: http://seamless.usgs.gov/
  - Global Download: http://glcfapp.umiacs.umd.edu:8080/esdi/index.jsp
  - [Figure: Galapagos, http://photojournal.jpl.nasa.gov/catalog/pia02735]
[Figure: GTOPO 30 Galapagos, ~1 km cells]

Quality?
  - How to consider data quality when this is the best available data for most everywhere?!
  - Find a data-rich location
  - Examine error using careful methods
  - Quantify error and correlate with other characteristics

Outline
  - Case study: part of Oakland County, MI
  - Available data
  - A non-raster based methodology for evaluating raster data accuracy
  - Relationship between error & land cover

Ortonville and the Shuttle Mission
  - Study site: northern Oakland County, MI
  - Ortonville (population 1,535) and environs
  - 8.4 km x 6.7 km region
  - Area facing rapid development at Detroit urban fringe
  - Diverse land cover
  - Varied topography (for MI!)

Survey Elevations
  - Oakland County GIS contracted for a detailed, high-accuracy countywide DEM from Woolpert LLC
  - Aerial photography collected in April 2000
  - Derived points and breaklines
  - Stated accuracy: 1 foot vertical, 2.5 ft horizontal
  - 46,065 points
  - Range: 277 to 360 m; mean: 310.6 m
  - [Figure: survey points; extent labels 6.6 km and 8.3 km]
Oakland Data Characteristics
  - Irregular postings
  - Michigan State Plane, southern zone
  - NAD 83, units international feet
  - Vertical units: international feet (NAVD 88)

DEM Data
  - Data obtained from http://seamless.usgs.gov/
  - 1 arc second NED
      - NAD83, vertical units meters, NAVD 88
  - 3 arc second SRTM (to match with non-US product)
      - WGS84, vertical units meters
  - [Figure: the data as DEMs; panels SRTM and NED]

Land Cover Data
  - 1 arc second 1992 NLCD Land Cover
      - Modified Anderson Level 2 from Landsat TM
      - NAD83, from Seamless
  - 30 meter 2001 Land Cover
      - Michigan GAP, multiple Anderson levels from Landsat TM
      - Michigan GeoRef (oblique Mercator)
  - Relatively incompatible
[Figure: land cover maps, 1992 NLCD and 2001 GAP-IFMAP. Legend for both: 11 (Water), 21 (LI Resid), 22 (HI Resid), 23 (Com/Ind), 41 (Dec. Forest), 42 (Evg. Forest), 43 (Mix Forest), 81 (Pasture), 82 (Row Crops), 91 (Wood. WL), 92 (Em. H. WL)]

Methods
  - How to integrate this data?
  - Different datums, coordinate systems, vertical units, spatial resolutions...
  - Identify a method that is gentlest on the original data

A Raster Methodology
  - Decide upon a common system
      - Datum / projection / coordinate system
      - Origin, dimensions, cell size
  - Preprocess data to that system
      - Project, resample, clip rasters
      - Project, convert Oakland Co. points to raster
  - Subtract 'True' from 'DEM' & intersect with land cover

Alternative, Point-Based Methodology (I)
  - Assume elevations are gridded spot heights
      - Not areal averages
  - Decide upon a common system
      - Datum / projection / coordinate system
      - Locations at which to conduct analysis
      - I chose to compare at the DEM locations
  - Preprocess data to that system
      - Convert rasters to points, project the points
      - Project Oakland Co. points

Alternative, Point-Based Methodology (II)
  - Interpolate 'True' heights at SRTM and NED spot locations
      - IDW, power 2, closest 6 neighbors
  - Interpolate land cover classes at SRTM and NED spot locations
      - Nearest neighbor
  - Subtract 'True' from 'DEM' & intersect with land cover
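The point-based steps above (IDW with power 2 over the 6 closest survey points, nearest-neighbor land cover, then 'DEM' minus 'True') can be sketched roughly as follows. The original analysis was done in R; this Python version, its function names, the brute-force neighbor search, and all sample coordinates are assumptions made for illustration, not the author's code.

```python
import math

def idw_height(x, y, survey_points, power=2, k=6):
    """Interpolate a 'true' height at (x, y) from the k nearest survey points."""
    # Brute-force distance sort; adequate for a sketch, slow for 46,065 points.
    nearest = sorted(
        (math.hypot(px - x, py - y), pz) for px, py, pz in survey_points
    )[:k]
    # A survey point exactly at the target supplies the height directly.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * z for w, (_, z) in zip(weights, nearest)) / sum(weights)

def nearest_class(x, y, landcover_points):
    """Assign the class code of the closest land cover cell centre."""
    return min(landcover_points, key=lambda p: math.hypot(p[0] - x, p[1] - y))[2]

# Hypothetical survey points (x, y, height in m) in a common projected system.
survey = [(0, 0, 300.0), (10, 0, 310.0), (0, 10, 320.0), (10, 10, 330.0),
          (5, -5, 305.0), (-5, 5, 315.0), (100, 100, 999.0)]
# Hypothetical land cover cell centres (x, y, class code).
landcover = [(4, 4, 41), (40, 40, 82)]

# One hypothetical DEM spot height to evaluate.
dem_x, dem_y, dem_z = 5.0, 5.0, 318.5
true_z = idw_height(dem_x, dem_y, survey)
error = dem_z - true_z            # positive means the DEM is too high
lc_class = nearest_class(dem_x, dem_y, landcover)
```

Evaluating at the DEM locations keeps the DEM heights untouched; only the survey heights and land cover classes are interpolated.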
Platforms
  - Methodology 1 (Raster) implemented in Arc 8.2 (ESRI)
  - Methodology 2 (Point) implemented in R 1.9.1 (Open Source Statistics Software)
  - Descriptive statistics & LC correlations in R for both approaches

Results - Error

Error Statistics
Method   Data   Mean (m)   SD (m)   RMSE (m)
Raster   SRTM   2.92       3.79     4.78
Raster   NED    1.06       1.49     1.83
Points   SRTM   2.95       3.93     4.92
Points   NED    1.07       1.51     1.85

[Figure: error maps and land cover. Panel labels: NED Error (Raster), DEM Error (Point), Error, 1992 NLCD. Scale ranges shown: -9.4 to 15.1 m and -9 to 22.7 m. Land cover legend: 11 (Water), 21 (LI Resid), 22 (HI Resid), 23 (Com/Ind), 41 (Dec. Forest), 42 (Evg. Forest), 43 (Mix Forest), 81 (Pasture), 82 (Row Crops), 91 (Wood. WL), 92 (Em. H. WL)]
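As a quick consistency check on the error statistics above: RMSE, mean error (bias), and standard deviation are linked by RMSE^2 = mean^2 + SD^2. The identity is exact for the population SD and very close for the sample SD when there are thousands of points. The sketch below plugs in the reported figures; the helper name is hypothetical, and the rows not labeled NED are taken here to be the SRTM figures.

```python
import math

def rmse_from_mean_sd(mean, sd):
    """RMSE implied by a bias (mean error) and a standard deviation."""
    return math.sqrt(mean ** 2 + sd ** 2)

# (mean, SD, reported RMSE) in meters, copied from the slide.
reported = {
    "SRTM, raster method": (2.92, 3.79, 4.78),
    "NED, raster method": (1.06, 1.49, 1.83),
    "SRTM, points method": (2.95, 3.93, 4.92),
    "NED, points method": (1.07, 1.51, 1.85),
}

for label, (mean, sd, rmse) in reported.items():
    implied = rmse_from_mean_sd(mean, sd)
    print(f"{label}: implied RMSE {implied:.2f} m, reported {rmse:.2f} m")
```

The implied and reported values agree to within rounding of the inputs, which shows that a large share of the SRTM RMSE comes from the positive bias rather than from scatter alone.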
Land Cover and Error
  - Error split by overlying land cover type
  - Significant difference (p-value < 2.2e-16) in mean error between land cover classes
      - One-way test of means
      - Kruskal-Wallis rank sum test
  - Forest classes associated with substantial positive error (biased too high)

[Figures: Error by LC Class (2001), with labels Upland Oak Forest, Mixed Dec., Pines, Upland Mixed For.; Error by LC Class (1992); NED Error by LC Class (2001), with label Forest; NED Error by LC Class (1992)]

Wilcoxon Rank-Sum Test (SRTM)

        mu=0      mu=3      mu=4
Class   p-value   p-value   p-value   Description
11      0.398     1         1         Open Water
21      ***       1         1         Low Intensity Residential
22      0.158     1         1         High Intensity Residential
23      0.622     1         1         Commercial/Indust/Transport
41      ***       ***       ***       Deciduous Forest
42      ***       ***       ***       Evergreen Forest
43      ***       0.003     0.075     Mixed Forest
81      ***       1         1         Pasture/Hay
82      ***       1         1         Row Crops
91      ***       0.033     1         Woody Wetlands
92      ***       1         1         Emergent Herb. Wetlands

*** indicates p << 0.0001
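For readers unfamiliar with the Kruskal-Wallis rank sum test mentioned above, here is a minimal sketch of its H statistic with the usual tie correction. The study ran its tests in R (where `kruskal.test` provides this); the pure-Python functions and the tiny made-up error samples below are assumptions for illustration only. The p-value would come from comparing H with a chi-square distribution on (number of classes - 1) degrees of freedom, which is left out here.

```python
from collections import defaultdict

def average_ranks(values):
    """Return 1-based ranks (ties share their mean rank) and the tie group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def kruskal_wallis_h(groups):
    """groups: dict mapping land cover code -> list of errors (m)."""
    labels, values = [], []
    for code, errs in groups.items():
        labels.extend([code] * len(errs))
        values.extend(errs)
    n = len(values)
    ranks, tie_sizes = average_ranks(values)
    rank_sums = defaultdict(float)
    for code, r in zip(labels, ranks):
        rank_sums[code] += r
    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[c] ** 2 / len(groups[c]) for c in groups
    ) - 3.0 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes) / float(n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction

# Made-up errors (m) for two classes; not the study data.
sample = {41: [4.0, 5.0, 6.0], 81: [1.0, 2.0, 3.0]}
h_stat = kruskal_wallis_h(sample)
```

A large H means the classes' error distributions occupy clearly different rank positions, which is the pattern the forest classes produce in this study.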
Discussion - Methods
  - Point-based method minimized change to elevations
      - Interpolation must occur to evaluate error at each node in projected NED and SRTM
  - Differences with raster method were slight
      - Elevation differences reduced
      - RMSE ~1/10th meter lower
      - Effect of forested land cover reduced
      - Still highly significantly biased

Discussion - SRTM Error
  - RMSE is well within specifications
      - < 16 meters (4.9 m for study area)
  - Error significantly higher than zero
      - Average ~3 m; SRTM is too high
  - Significantly more bias in forested areas
      - Means in the 4-6 meter range

Discussion - SRTM Error (II)
  - RMSE magnitude strongly linked to forested land cover
      - Returns not striking the ground
      - RMSE in the 6-8 meter range
  - Opportunity for statistical error models
      - Employ land cover characteristics to adjust (co)variance models
      - Identify canopy height?

Conclusions
  - SRTM meets basic specs
      - But mean error is positive (biased too high)
      - And variation of error is correlated with land cover
  - Forests and error
      - Forests introduce positive bias
      - Evergreen forests may be more error-prone
      - Expect regions not experiencing leaf-off conditions to have higher error than Michigan
  - Preprocessing choices make slight difference

SRTM Error Split into Land Cover Classes (1992 NLCD Land Cover Classification System)

              Error (m)
Code   NPts   Mean    STD    RMSE   Description
11       74   -0.38   3.16   3.16   Open Water
21      202    1.35   2.39   2.74   Low Intensity Residential
22       24    0.52   2.36   2.37   High Intensity Residential
23       64    0.04   1.84   1.82   Commercial/Indust/Trans
41     2740    4.75   4.17   6.32   Deciduous Forest
42      252    6.08   4.13   7.35   Evergreen Forest
43       28    5.04   4.05   6.42   Mixed Forest
81     1447    1.34   2.72   3.03   Pasture/Hay
82     1289    0.91   2.82   2.96   Row Crops
91      722    3.44   3.40   4.83   Woody Wetlands
92      202    1.08   3.21   3.37   Emergent Herb. Wetlands
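A per-class table like the one above can be produced by grouping each point's error by its land cover code and summarizing each group. The sketch below assumes the errors are already paired with class codes; the function name and the handful of made-up records are hypothetical and are not the study data. It uses the sample standard deviation, which is an assumption about how the STD column was computed.

```python
import math
import statistics
from collections import defaultdict

def summarize_by_class(records):
    """records: iterable of (land_cover_code, error_m) pairs."""
    by_class = defaultdict(list)
    for code, err in records:
        by_class[code].append(err)
    summary = {}
    for code, errs in sorted(by_class.items()):
        summary[code] = {
            "n": len(errs),
            "mean": statistics.mean(errs),
            # Sample SD needs at least two points.
            "sd": statistics.stdev(errs) if len(errs) > 1 else float("nan"),
            "rmse": math.sqrt(sum(e * e for e in errs) / len(errs)),
        }
    return summary

# Made-up errors (m) for illustration only.
records = [(41, 4.0), (41, 6.0), (41, 5.0), (82, 1.0), (82, -1.0), (82, 3.0)]
summary = summarize_by_class(records)

for code, row in summary.items():
    print(f"{code:>4} {row['n']:>5} {row['mean']:6.2f} {row['sd']:6.2f} {row['rmse']:6.2f}")
```

Reporting mean and RMSE side by side, as the slide does, separates systematic bias (the forest classes) from overall error magnitude.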