Space-time data. Simple space-time analyses. PM10 in space. PM10 in time

Space-time data
- Observations taken over space and over time
- Z(s, t): indexed by space, s, and time, t
- Here, consider geostatistical/time data: Z(s, t) exists for all locations and all times
- May consider temporal extensions of other types of data later
- Key point: many different methods / approaches. What question do you want to answer?
- Some possible questions / goals:
  - Describe spatial pattern for obs. taken at same time
  - Describe temporal pattern for obs. taken at same location
  - Does the spatial pattern change over time? Or does the temporal pattern change over space?
  - Predict / map values in (space, time)
  - Fit a dynamic model to (space, time) data

© Philip M. Dixon (Iowa State Univ.), Spatial Data Analysis - Part 8, Spring 2018

Simple space-time analyses
- Describe spatial pattern for obs. taken at same time
  - Divide the data by times (or time bins)
  - Describe spatial pattern at each time
  - Can plot spatial data at each time
- Describe temporal pattern for obs. taken at same location
  - Divide data by individual locations
  - Apply time-series methods to each location
  - Can overlay multiple time series on one plot

PM10 in space
[Figure: maps of PM10, one panel per day from 2009-12-22 to 2009-12-31; color classes from 1.025 to 44.07]

PM10 in time
[Figure: PM10 time series (air.zoo), Dec 22 to Dec 30; vertical axis 0 to 40]
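The two "divide the data" steps above can be sketched in a few lines. This is only an illustration under assumed inputs: the records, the location and time labels, and the helper name `split_by` are hypothetical and do not come from the PM10 data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (location id, time label, observed value).
records = [
    ("A", "day1", 10.0), ("B", "day1", 14.0), ("C", "day1", 12.0),
    ("A", "day2", 20.0), ("B", "day2", 26.0), ("C", "day2", 23.0),
]

def split_by(records, key_index):
    """Group records by the field at key_index (0 = location, 1 = time)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_index]].append(rec)
    return dict(groups)

by_time = split_by(records, 1)      # one spatial data set per time
by_location = split_by(records, 0)  # one time series per location

# A simple spatial summary at each time: the mean over locations.
mean_by_time = {t: mean(v for _, _, v in recs) for t, recs in by_time.items()}

# The time series at each location, in the order the records were given.
series_by_location = {s: [v for _, _, v in recs] for s, recs in by_location.items()}

print(mean_by_time)
print(series_by_location)
```

Each entry of `by_time` could then be mapped or given its own variogram, and each entry of `series_by_location` could be passed to a time-series method.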

What does same pattern mean?
- Same pattern in values: focus on Z(s).
- Same spatial structure: focus on patchiness and variability; look at semivariograms
- First implies second; second doesn't imply first
- Example: How accurate is a forecast of snow amount? Does the forecast have the same pattern as reality?
  - Same values: predict snow amount at a location well
  - Same structure: predict there will be a patch of heavy snow somewhere in central IA

Less simple space-time analyses
- Summarize temporal pattern by one or a few numbers
  - Then look at spatial variation in those summaries
  - Called empirical orthogonal functions (EOF); statisticians call this Principal Components Analysis (PCA)
- Extend geostatistical analyses to space-time (3D)
  - Convert time to space
  - Or, assume time and space independent
- Model how process evolves over time

Summarize pattern by reducing dimensionality
- Describes temporal pattern in values (Z(s)) over space
- Example data: 20 var. measured on 3 samples / 20 times at 3 locations
[Figure: Z value (2 to 8) against location (1 to 3)]

The three variables are correlated
[Figure: scatterplot matrix of Location 1, Location 2 and Location 3]

Correspond to spatial patterns at each time
- Some are high/low/high; some are low/high/low; some are flat
[Figure: Z value (2 to 8) against location (1 to 3)]

One variable will summarize much of the info in 3 variables
[Figure: value of X (2 to 8) against axis 1 score, for Location 1, Location 2 and Location 3]

How that axis 1 score is computed:
- Have $Z_{ij}$: value of the j'th variable for obs. i
- Center each observation by that variable's mean: $Z^*_{ij} = Z_{ij} - \bar{Z}_j$
- $Z^*_{ij}$ has mean 0 for each variable j
- Compute a weighted average of the $Z^*_j$'s for each observation: $S_i = \alpha_1 Z^*_{i1} + \alpha_2 Z^*_{i2} + \alpha_3 Z^*_{i3}$
- Because each centered variable has mean 0, the $S_i$ have mean 0
- For these data, $\alpha_1 = 0.903$, $\alpha_2 = 0.965$, $\alpha_3 = 1.361$.
- How are the $\alpha_j$ values determined?
  - Simpler example: 2 variables, both centered to mean 0
  - Want to summarize both variables by one new score
  - Define a line, project each point onto that line

Empirical orthogonal functions
- Some lines not so good: little spread in the projected points
- Some lines very good: lots of spread in the projected points
[Figure: two scatterplots of X2 against X1, each with a candidate projection line]
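The "define a line, project each point, look at the spread" idea can be tried directly by scanning candidate lines. The sketch below assumes two hypothetical centered variables and a brute-force scan in 1-degree steps; it illustrates the criterion and is not how PCA software finds the axis.

```python
import math

# Hypothetical centered data: two variables, both with mean 0.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 0.9, 2.1]

def projected_variance(angle, x1, x2):
    """Sample variance of the scores obtained by projecting each point onto
    the line through the origin at the given angle (radians)."""
    a1, a2 = math.cos(angle), math.sin(angle)   # unit-length weights
    scores = [a1 * u + a2 * v for u, v in zip(x1, x2)]
    m = sum(scores) / len(scores)
    return sum((s - m) ** 2 for s in scores) / (len(scores) - 1)

# Scan candidate lines in 1-degree steps and keep the one with most spread.
angles = [math.radians(d) for d in range(180)]
best = max(angles, key=lambda a: projected_variance(a, x1, x2))
best_degrees = round(math.degrees(best))

worst = min(angles, key=lambda a: projected_variance(a, x1, x2))
print(best_degrees, projected_variance(best, x1, x2), projected_variance(worst, x1, x2))
```

The weights (cos, sin) of the best line play the role of the α's for two variables. The line with the least spread is perpendicular to the best one, up to the resolution of the scan.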

EOF/Principal Component Analysis
- Find α's that maximize spread of the projected scores
- An eigenvector / eigenvalue problem (details on request)
- Continue beyond one axis:
  - Consider all 2nd axes that are orthogonal to axis 1
  - Find the one with maximum variance of projected scores
[Figure: axis 2 score against axis 1 score]

User's guide to EOF / PCA
- Axes are orthogonal, so axis 1 scores are not correlated with axis 2, or axis 3, ...
- Could be good or not-so-good
  - Good: axis scores represent independent pieces of information
  - Not-so-good: physical things are probably not orthogonal, so may be hard to relate EOF axes to physical things
- Can decompose total variability into contributions from each axis
  - Total variability = Var(variable 1) + Var(variable 2) + ...
  - Eigenvalue for each axis is the variance of scores on that axis
  - Sum of eigenvalues = total variability
  - Contribution of each axis usually expressed as percentage

User's guide to EOF / PCA
- What is the temporal pattern represented by an EOF?
  - Plot scores on an axis vs time
- How strongly is a temporal pattern expressed at each location?
  - Display as map of the α's for each location.

Example: 3 locations, 20 times data set
- Variances of each variable: 0.960, 1.304, 2.102. Total variability = 4.367
- Variance of scores on each axis: 3.598, 0.592, 0.176 (sum: 4.367)
- % variance: 82.4, 13.6, 4.0
- First axis represents most of the variability across the three locations.
[Figure: first-axis results (test.eof1d$pc1) in color classes, with a time axis from Jan 2014 to May 2015]
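The eigenvector problem and the variance decomposition described above can be sketched without a linear algebra library by using power iteration. Power iteration is a technique swapped in for illustration, not necessarily what produced the numbers on the slides. The data values and function names are hypothetical.

```python
import math

# Hypothetical data: 6 times (rows) at 3 locations (columns).
rows = [
    [5.1, 4.0, 3.2], [6.0, 4.9, 4.8], [5.5, 4.4, 3.9],
    [7.2, 5.8, 6.1], [6.4, 5.1, 5.0], [5.8, 4.6, 4.1],
]

def covariance_matrix(rows):
    """Return the sample covariance matrix and the centered data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[j] * c[k] for c in centered) / (n - 1) for k in range(p)]
           for j in range(p)]
    return cov, centered

def leading_axis(cov, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric matrix.
    Assumes the starting vector is not orthogonal to the leading axis."""
    p = len(cov)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        new = [sum(cov[j][k] * vec[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    value = sum(vec[j] * cov[j][k] * vec[k] for j in range(p) for k in range(p))
    return value, vec

cov, centered = covariance_matrix(rows)
eigenvalue, alpha = leading_axis(cov)

# Axis 1 scores: weighted sum of the centered variables at each time.
scores = [sum(a * c for a, c in zip(alpha, row)) for row in centered]
score_variance = sum(s * s for s in scores) / (len(scores) - 1)

# Total variability is the sum of the variances of the variables.
total_variability = sum(cov[j][j] for j in range(len(cov)))
percent_axis1 = 100.0 * eigenvalue / total_variability
print(round(percent_axis1, 1))
```

Here `alpha` has unit length, so `score_variance` matches `eigenvalue`. Software that scales the weights differently reports the same percentages.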

Two views of PCA / EOF
- Decomposing variation: EOF 1 represents 82.4% of the variation
- Reduced rank approximation to a matrix

Reduced rank approximations
- Calculate two vectors: one for locations, L, one for times, T
- Already have locations: EOF scores for axis 1
- Based on that, can calculate vector for times: intuitively, how strong is the spatial pattern at that time
- Can approximate matrix A by $L T^\top$, i.e., $A_{ij} \approx L_i T_j$
- Approximates matrix with 60 values by 20 + 3
- Can improve approximation by adding 2nd axis: $A \approx L^{(1)} T^{(1)\top} + L^{(2)} T^{(2)\top}$
- Known as Singular Value Decomposition
- Very closely related to the eigendecomposition used in PCA

PCA on covariance or PCA on correlation matrix?
- Have presented simplest (classic) form of EOFs. Statistical view: PCA on covariance matrix
- Total variability can be driven by one (or a very few) variables with large variance
  - E.g., Var X1 = 100, Var X2 = 2, Var X3 = 2, Var X4 = 2
  - Most important axis will be X1 because it has a large variance
- EOF/PCA analysis of covariance matrix only makes sense when each variable has same units
- Statistical PCA more frequently done on correlation matrix
  - Covariance matrix: center each variable
  - Correlation matrix: center and standardize each variable (so each Z has sd = Var = 1)
- An axis then represents a bundle of correlated variables, i.e., variables that change together (positively or negatively)
- Axes get large eigenvalues by representing many variables that change together.

EOF extensions: rotated axes
- Rotate axes to make them more physically relevant
- Choice of rotation still determined by data, not outside knowledge
- Many algorithms, each chooses rotation differently
- Orthogonal rotations: axes still orthogonal
  - Most common is varimax algorithm: make α's close to 0 or ±1, so axes tend to either ignore a variable or include it completely
- Oblique rotations: allow axes to be non-orthogonal
  - Much more difficult to define criteria for good set of axes
- Statistics: known as factor analysis

EOF extensions: Extended EOF
- Analysis based on covariance between two locations treats each time as an independent observation
- What if times are not independent?
- Extended EOF: incorporate temporal correlation
- See references for additional information
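The reduced rank approximation described earlier, where A_ij is approximated by L_i T_j, can be sketched with an alternating least-squares iteration that converges to the leading term of the singular value decomposition. The matrix values and the function name `rank_one` are hypothetical, and the starting vector is an arbitrary choice.

```python
def rank_one(A, iterations=200):
    """Alternating least squares for the best rank-one fit A[i][j] ~ L[i] * T[j].
    Starts from the first row of A, which is assumed to be nonzero."""
    n_rows, n_cols = len(A), len(A[0])
    T = list(A[0])
    L = [0.0] * n_rows
    for _ in range(iterations):
        # Given T, the least-squares L is A T / (T . T).
        tt = sum(t * t for t in T)
        L = [sum(A[i][j] * T[j] for j in range(n_cols)) / tt for i in range(n_rows)]
        # Given L, the least-squares T is A' L / (L . L).
        ll = sum(l * l for l in L)
        T = [sum(A[i][j] * L[i] for i in range(n_rows)) / ll for j in range(n_cols)]
    return L, T

# Hypothetical centered matrix: 3 locations (rows) by 4 times (columns).
A = [
    [1.0, -1.1, 2.0, -1.9],
    [2.1, -2.0, 3.9, -4.0],
    [2.9, -3.0, 6.1, -6.0],
]
L, T = rank_one(A)
approx = [[L[i] * T[j] for j in range(len(T))] for i in range(len(L))]
squared_error = sum((A[i][j] - approx[i][j]) ** 2
                    for i in range(len(L)) for j in range(len(T)))

# Storage for the example in the text: 3 locations by 20 times.
full_values = 3 * 20        # values in the full matrix
rank_one_values = 3 + 20    # values in one location vector plus one time vector
print(squared_error, full_values, rank_one_values)
```

Only the product L_i T_j is determined. L can be multiplied by any nonzero constant if T is divided by the same constant, which is why different programs report differently scaled vectors.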

PM10 in Germany
- sd's for each EOF are: 23.3, 12.7, 8.7, 7.7, 6.2, 5.6, 3.2, 2.8, and 1.9
- First EOF accounts for 58.3% of total variability
[Figure: first EOF time series (air.eof.time[, 1]), Dec 22 2009 to Dec 31 2009]

PM10 in Germany
[Figure: daily maps, 2009-12-22 to 2009-12-31, in two sets of panels labeled predeof1 and PM10]

Space-time data
- What about pattern as strength/scale of heterogeneity?
- Semivariogram for spatial pattern, autocorrelation function for temporal pattern
- Autocorrelation function is $\mathrm{Cor}(Y_t, Y_{t+\delta})$ for different $\delta$
  - $\delta$ is the time lag (equivalent to distance in space)
  - How strongly correlated are obs. one time apart ($\delta = 1$)?
  - How quickly does correlation die out with time lag?
- Can describe autocorrelation / autocovariance as semivariance in time, under assumption of 2nd order stationarity:
  $\gamma_{\mathrm{time}}(\delta) = \tfrac{1}{2} E\left[Z(s, t) - Z(s, t + \delta)\right]^2 = \sigma^2 (1 - \rho(\delta))$

Space-time data
- Lots of things you could do
- My approach is to choose methods that answer your questions
- Is the spatial pattern similar at each time?
  - Estimate and fit variograms for each time
  - How similar are they?
  - Could construct an approximate test based on weighted SS from fits
- Predict over space at each time separately.
  - Could divide data by time, estimate time-specific variogram
  - Nothing new, but ignores any temporal dependence
- If believe similar spatial patterns each time: better estimate of variogram by combining information across times
  - All times have same variogram: compute average of time-specific variograms
  - Variogram changes smoothly, $\gamma_t$ similar to $\gamma_{t-1}$: use the smoothed estimate $\tilde{\gamma}_t(h) = \lambda \hat{\gamma}_t(h) + (1 - \lambda) \hat{\gamma}_{t-1}(h)$
- Combine spatial and temporal dependence: space-time kriging
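The autocorrelation function and the semivariance in time described earlier can be estimated from the series at a single location roughly as follows. The series values are hypothetical, and the estimators are the usual simple ones, chosen here as an assumption. The identity γ(δ) = σ²(1 − ρ(δ)) holds for the population quantities under second-order stationarity, so sample estimates agree only approximately.

```python
def autocorrelation(y, lag):
    """Sample autocorrelation at the given lag (divisor n for both terms)."""
    n = len(y)
    m = sum(y) / n
    c0 = sum((v - m) ** 2 for v in y) / n
    ck = sum((y[t] - m) * (y[t + lag] - m) for t in range(n - lag)) / n
    return ck / c0

def temporal_semivariance(y, lag):
    """Half the average squared difference between values lag steps apart."""
    diffs = [(y[t] - y[t + lag]) ** 2 for t in range(len(y) - lag)]
    return 0.5 * sum(diffs) / len(diffs)

# Hypothetical daily series at one location.
series = [12.0, 14.5, 13.8, 17.2, 21.0, 19.4, 15.1, 11.9, 10.4, 12.7]

for lag in range(1, 4):
    print(lag, autocorrelation(series, lag), temporal_semivariance(series, lag))
```

Plotting the first column of output against lag gives the autocorrelation plot, and plotting the second gives the temporal analogue of a sample semivariogram.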

PM10 in Germany
[Figure: plot against distance (dist), horizontal axis 0 to 200]

PM10 in Germany
[Figure: plot against distance (dist)]

PM10 in Germany
[Figure: plot against distance (dist)]

PM10 in Germany
[Figure: estimated autocorrelation (-1.0 to 1.0) against time lag (0 to 6)]

Space-time kriging
- Consider data as one long vector for all locations and times
- Pools information from all times to estimate a single spatial variogram
- Pools information from all locations to estimate a single temporal variogram
- Once you have variograms, predicting is easy: use the big variance-covariance (VC) matrix to krige.
- Two relatively simple models for the space-time variogram
  - Metric ST model
  - Separable ST model

log PM10 in Germany
[Figure: vertical axis time lag (days), 0 to 8; color scale 0.0 to 0.7]

log PM10 in Germany
[Figure: one curve per time lag, lag0 to lag9]

Space-time kriging: Metric model
- Define time as a third coordinate
- Define an anisotropy coefficient, $\theta$, (geometric anisotropy ideas) to relate one unit of time to equivalent distance:
  $h_{ij} = \sqrt{\lVert s_i - s_j \rVert^2 + \theta (t_i - t_j)^2}$
- Fit one joint variogram
- No issues with nugget or sill
  - Nugget is $\gamma(h)$ for $h$ close to 0 (similar time, similar location)
  - Sill is $\gamma(h)$ for large separation in time, or distant locations, or both
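The metric model's combined separation can be computed as below. This sketch assumes that h is the square root of the squared spatial distance plus θ times the squared time difference. The exponential variogram, its parameter values and both function names are hypothetical choices for illustration.

```python
import math

def metric_distance(s_i, s_j, t_i, t_j, theta):
    """Space-time separation with time treated as a scaled extra coordinate."""
    spatial_sq = sum((a - b) ** 2 for a, b in zip(s_i, s_j))
    return math.sqrt(spatial_sq + theta * (t_i - t_j) ** 2)

def exponential_variogram(h, nugget, partial_sill, range_par):
    """One joint variogram in h; an exponential form is assumed here."""
    if h == 0.0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_par))

# Two hypothetical observations: 30 distance units and 2 days apart.
h = metric_distance((0.0, 0.0), (18.0, 24.0), 0.0, 2.0, theta=100.0)
gamma = exponential_variogram(h, nugget=0.05, partial_sill=0.6, range_par=50.0)
print(h, gamma)
```

With θ = 100 in this example, one day counts as 10 distance units, so a pair of observations far apart in time is treated like a pair far apart in space.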

log PM10 in Germany
[Figure: panels "sample" and "metric"; vertical axis time lag (days), 0 to 8; color scale 0.0 to 0.7]

log PM10 in Germany
[Figure: panels "sample" and "metric"; one curve per time lag, lag0 to lag9]

Space-time data: Separable model
- Write space-time covariance as a product of a spatial covariance function and a temporal covariance function:
  $\mathrm{Cov}\left[Z(s_i, t_i), Z(s_j, t_j)\right] = \sigma^2 \, \mathrm{Cor}_{\mathrm{space}}(s_i, s_j) \, \mathrm{Cor}_{\mathrm{time}}(t_i, t_j)$
- Single sill
- Sometimes sum used instead of product
- Commonly used because it's simple
  - Simplifies a lot of matrix computations
  - Dominant model, especially prior to 2000
- But it's probably too simple to be realistic
  - Assumes no space-time interaction

log PM10 in Germany
[Figure: panels "sample" and "separable"; vertical axis time lag (days), 0 to 8; color scale 0.0 to 0.7]
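One reason the separable model simplifies matrix computations is that the full covariance matrix is a Kronecker product of a small temporal matrix and a small spatial matrix. The sketch below assumes exponential correlation functions, one-dimensional site coordinates, and data ordered with all locations at the first time, then all locations at the second time. All names and parameter values are hypothetical.

```python
import math

def exp_cor(d, range_par):
    """Exponential correlation as a function of separation d (assumed form)."""
    return math.exp(-abs(d) / range_par)

def separable_cov(h, u, sigma2, range_s, range_t):
    """Covariance at spatial separation h and time separation u."""
    return sigma2 * exp_cor(h, range_s) * exp_cor(u, range_t)

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

sites = [0.0, 1.0, 3.0]     # hypothetical 1-D site coordinates
times = [0.0, 1.0]          # hypothetical observation times
sigma2, range_s, range_t = 2.0, 1.5, 3.0

R_space = [[exp_cor(a - b, range_s) for b in sites] for a in sites]
R_time = [[exp_cor(a - b, range_t) for b in times] for a in times]

# Full 6 x 6 covariance matrix, built from a 2 x 2 and a 3 x 3 matrix.
full = [[sigma2 * v for v in row] for row in kron(R_time, R_space)]

# Entry for (time 1, site 2) against (time 0, site 0).
print(full[1 * 3 + 2][0 * 3 + 0], separable_cov(3.0, 1.0, sigma2, range_s, range_t))
```

"No space-time interaction" shows up here as the fact that the spatial correlation between two sites is the same at every time lag, apart from a constant factor.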

log PM10 in Germany
[Figure: panels "sample" and "separable"; one curve per time lag, lag0 to lag9]

log PM10 in Germany
[Figure: daily maps of PM10, 2009-12-22 to 2009-12-31; color scale 5 to 40]

Hierarchical modeling
- Kriging is a dumb predictor
  - Only uses observed values and their patterns
- Space-time data often arise because of ecological/environmental processes
  - Blob of pollutant gets carried downstream or downwind
- Use knowledge of the process(es) to predict Z(s, t) given information about flow fields and Z(s, t - 1)
- Hierarchical models provide a way to think about modeling such data
- Two separate levels in a hierarchical model
  - Process model: dynamical model describing how the system works; probably includes variables not directly measured
  - Observation model: how what you observe relates to the quantities in the process model

Hierarchical modeling
- Classic example: predicting location of a deep-space satellite
- Process model has 9 state variables: (X,Y,Z) position, (X,Y,Z) velocity, (X,Y,Z) acceleration
- Observation model has intermittent fixes on position (X,Y,Z)
- Data measured with error
- Use data to estimate parameters in process and observation models
- Use model to predict position each day
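The split between a process model and an observation model can be sketched for one coordinate of the satellite example, which covers three of the nine state variables. The constant-acceleration dynamics, the function names, the starting state and the schedule of fixes are all assumptions for illustration. Measurement error is left out so that the example is deterministic.

```python
def step(state, dt):
    """Process model: advance (position, velocity, acceleration) by dt,
    assuming the acceleration stays constant over the step."""
    pos, vel, acc = state
    return (pos + vel * dt + 0.5 * acc * dt ** 2, vel + acc * dt, acc)

def observe(state):
    """Observation model without its error term: only position is seen."""
    return state[0]

state = (0.0, 1.0, 2.0)   # hypothetical starting position, velocity, acceleration
states = []
fixes = {}                 # day -> observed position, available only on some days
for day in range(1, 7):
    state = step(state, 1.0)
    states.append(state)
    if day % 3 == 0:       # intermittent fixes: every third day (assumed schedule)
        fixes[day] = observe(state)

print(states[0], fixes)
```

Velocity and acceleration never appear in `fixes`. They are examples of variables in the process model that are not directly measured.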

Hierarchical models
- When all error distributions are normal, estimate parameters by maximum likelihood
  - Procedure to predict state variables known as the Kalman filter
- More generally, likelihood has integrals that cannot be analytically solved
  - Bayesian inference
  - Adds third level to model: specification of prior distributions
- Spatial application: environmental contaminant data with "< detection limit" observations
  - Process model: Z(s) follows a spatial correlation model
  - Observational model: data are Z(s) when above detection limit and "< dl" when not

Hierarchical models
- Details very problem-specific
- If you're interested in more:
  - Stat 444: Bayesian statistics
  - Stat 534: Ecological statistics; ends with section on hierarchical modeling for ecological data
  - Banerjee, Carlin and Gelfand or Wikle and Cressie books
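The Kalman filter mentioned above can be sketched for the simplest possible case: a single state variable that follows a random walk and is observed with normal error. This is a deliberately reduced illustration with known variances, not the nine-variable satellite model. The function name and all parameter values are hypothetical, and `None` marks a time with no observation.

```python
def kalman_local_level(observations, x0, p0, q, r):
    """Scalar Kalman filter.
    Process model:     x[t] = x[t-1] + w,  Var(w) = q
    Observation model: y[t] = x[t] + v,    Var(v) = r
    Returns a list of (state estimate, estimate variance) for each time."""
    x, p = x0, p0
    out = []
    for y in observations:
        # Predict: the estimate carries over and its variance grows by q.
        p = p + q
        if y is not None:
            # Update: move toward the observation by the gain k.
            k = p / (p + r)
            x = x + k * (y - x)
            p = (1.0 - k) * p
        out.append((x, p))
    return out

# Hypothetical observations with two missing times.
obs = [1.2, None, 1.9, 2.4, None, 3.1]
for estimate, variance in kalman_local_level(obs, x0=0.0, p0=10.0, q=0.1, r=0.5):
    print(round(estimate, 3), round(variance, 3))
```

At times without an observation only the predict step runs, so the variance of the estimate increases until the next fix arrives.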