The role of topological outliers in the spatial analysis of georeferenced social media data
René Westerholt, Heidelberg University
Seminar on Spatial urban analytics: big data, methodologies, and behavioural implications
Geography Colloquium, Harvard University
06 April 2017
Companion paper
GIScience Research Group / Institute of Geography / Heidelberg University / René Westerholt
What is a topological outlier?
  A spatial unit that interacts in an unusual way and causes topologically induced variance (Tiefelsdorf 1999).
  [Figure panels: Local areas; Counties]
Why topological outliers in social media data?
  Users perceive space in different ways
    Environmental acoustics (Iosa et al. 2012)
    Age (Sugovic & Witt 2013)
    Emotional and bodily states (Zadra & Clore 2011)
    ...
  Individual linguistic skills
  Technical restrictions (e.g., 140 characters)
  ...
  Leads to inevitable heterogeneity
  Leads to erroneous connections within spatial analyses
Research questions
  1. What effects do (strong) topological outliers have on the spatial analysis of social media feeds?
  2. What is the role of scale?
  Methods:
    Semivariogram
    Covariance
    Moran's I / Moran scatterplot
    Eigenvalue analysis of spatial weights
    ...
Data
  Synthetic dataset
    2 different scales
    2 different Gaussians
    Partial overlap
  Twitter dataset
    23 million tweets
    London, one year
    NLP (LDA)
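The slide does not say how the synthetic dataset was constructed. As a minimal sketch of one plausible reading (two point clouds drawn from Gaussians with different spatial spreads whose extents partly overlap), something like the following could be used. The function name, the cluster centres, the standard deviations, the sample size and the seed are all illustrative assumptions, not the values used in the talk.

```python
import random

def make_two_scale_points(n_per_cluster=200, seed=42):
    """Draw 2-D points from two isotropic Gaussians with different spreads.

    All parameters below are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # (centre_x, centre_y, standard deviation, label)
    clusters = [
        (0.0, 0.0, 1.0, "small-scale"),  # tight cluster
        (3.0, 0.0, 5.0, "large-scale"),  # wide cluster that partly covers the tight one
    ]
    points = []
    for cx, cy, sd, label in clusters:
        for _ in range(n_per_cluster):
            points.append((rng.gauss(cx, sd), rng.gauss(cy, sd), label))
    return points

points = make_two_scale_points()
```

Because the wide cluster covers the tight one, a single fixed neighbourhood definition will connect observations that stem from processes operating at different scales, which is the situation the later slides examine.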
Semivariogram from tweets
  Unusual shape
  Lack of spatial structure appears quickly
  Semivariogram levels off to nonspatial variance
  Clustering and repulsion at small scales
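For readers unfamiliar with the method, the standard empirical semivariogram averages half the squared attribute difference over all point pairs whose separation falls in a given distance bin. The sketch below is a generic textbook estimator, not the code used for the tweet analysis; the function name, the equal-width binning and the toy input are assumptions made for illustration.

```python
import math

def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Return one semivariance per distance bin (None for empty bins).

    Bin b covers distances in [b * bin_width, (b + 1) * bin_width).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            b = int(d // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]

# Toy example: four points on a line with a linear trend in the attribute.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 3.0, 4.0]
gamma = empirical_semivariogram(coords, values, bin_width=1.0, n_bins=3)
# gamma == [None, 0.5, 2.0]
```

A semivariogram that flattens at very short distances, as reported on the slide, would show up here as near-constant values from the first non-empty bins onward.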
Eigenvalue analysis
Moran scatterplot
  Positive slope = positive spatial autocorrelation
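The link between the scatterplot slope and spatial autocorrelation can be made concrete: with a row-standardised weights matrix, the least-squares slope of the spatial lag against the mean-centred attribute equals global Moran's I. The following is a minimal sketch of that relationship in plain Python; the function names and the four-unit toy example are illustrative assumptions and are not taken from the talk.

```python
def row_standardise(w):
    """Scale each row of a weights matrix so that it sums to 1."""
    out = []
    for row in w:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def morans_i(values, w):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

def scatterplot_slope(values, w):
    """OLS slope of the spatial lag (W z) regressed on centred values z."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    lag = [sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]
    return sum(zi * li for zi, li in zip(z, lag)) / sum(zi * zi for zi in z)

# Toy example: four units in a row, each linked to its direct neighbours.
binary_w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
w = row_standardise(binary_w)
values = [1.0, 2.0, 3.0, 4.0]
i_stat = morans_i(values, w)
slope = scatterplot_slope(values, w)
# Both are approximately 0.4: similar values sit next to each other.
```

The equality holds only for row-standardised weights; with other weighting schemes the slope and Moran's I differ by a scaling factor.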
Spatial artefact patterns
  Small-scale interacting with large-scale
  Positive slope = positive spatial autocorrelation
  Large-scale interacting with small-scale
Spatial artefact patterns
  Artefacts are a function of topological variability
  Pattern is a function of attribute and spatial lag
Effect of differing scales
Conclusions
  Topological outliers lead to unexpected analysis outcomes
  The type of spatial weights is very important; contradictions are possible
  Erroneous spatial interaction causes fake spatial processes
  Scale differences have an impact on how fake processes operate
  Spatial analytical approaches must account for topological outliers
  Heterogeneity is also an opportunity: it allows a better understanding of how neighbourhoods are composed
Thank you! Questions?
René Westerholt
westerholt@uni-heidelberg.de

Further information:
  Westerholt, R., Resch, B., & Zipf, A. (2015). A local scale-sensitive indicator of spatial autocorrelation for assessing high- and low-value clusters in multiscale datasets. International Journal of Geographical Information Science, 29(5), 868-887.
  Westerholt, R., Steiger, E., Resch, B., & Zipf, A. (2016). Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis. PLOS ONE, 11(9), e0162360.
  Steiger, E., Westerholt, R., & Zipf, A. (2016). Research on social media feeds: A GIScience perspective. European Handbook of Crowdsourced Geographic Information, 237-254.