Simple example of analysis on spatial-temporal data set


- I used the ground-level ozone data in North Carolina (from Suhasini Subba Rao's website)
- The original data consist of 920 days of data over 72 locations in North Carolina
- I used the first 50 days of data over 28 locations
- Missing values are imputed in a simple way
- The data are standardized, so that the mean is close to zero and the distribution is close to Normal

© Mikyoung Jun (Texas A&M), Stat647 Lecture 16, October 23, 2012

Location
[Figure: map of the observation locations]

Normality
[Figure: histogram of the ozone values (x-axis: ozone, y-axis: Frequency) and Normal Q-Q plot (x-axis: Theoretical Quantiles, y-axis: Sample Quantiles)]

Temporal correlation
[Figure: sample ACF plots for the series o3[, 10] and o3[, 15] (x-axis: Lag, 0 to 30; y-axis: ACF, 0 to 1)]

Two simple models considered
- Model 1: isotropic, fully symmetric Matérn model. Parameters: $\alpha$, $\beta_1$, $\beta_2$, $\nu$
- Model 2: separable model, with Matérn models for both space and time. I tried the same number of covariance parameters as above (fixing the spatial and temporal smoothness to be equal), as well as the case with different spatial and temporal smoothness parameter values
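To make the difference between the two model types concrete, here is a minimal sketch. The slides do not give the exact parameterization, so everything below is an assumption: the function names are hypothetical, the exponential covariance (the Matérn family at $\nu = 1/2$) stands in for the general Matérn to avoid Bessel functions, and the non-separable model is taken to depend on a single scaled space-time distance.

```python
import numpy as np

def exp_cov(d, alpha, beta):
    """Exponential covariance alpha * exp(-d / beta) (Matern with nu = 1/2)."""
    return alpha * np.exp(-np.asarray(d, dtype=float) / beta)

def pairwise_dist(x):
    """Euclidean distances between the rows of x (shape n x p)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def space_time_covs(coords, times, alpha, beta1, beta2):
    """Return (separable, single-distance) covariance matrices.

    Rows/columns are ordered time-major: index = t * n_sites + site.
    This parameterization is an assumption, not the one fitted in the slides.
    """
    n_s, n_t = len(coords), len(times)
    h = pairwise_dist(coords)                     # spatial lags
    u = np.abs(times[:, None] - times[None, :])   # temporal lags

    # Separable: C(h, u) = alpha * exp(-h / beta1) * exp(-u / beta2)
    sep = np.kron(exp_cov(u, 1.0, beta2), exp_cov(h, alpha, beta1))

    # Non-separable: one covariance applied to a scaled space-time distance
    H = np.tile(h, (n_t, n_t))
    U = np.kron(u, np.ones((n_s, n_s)))
    iso = exp_cov(np.sqrt((H / beta1) ** 2 + (U / beta2) ** 2), alpha, 1.0)
    return sep, iso

# Hypothetical toy data: 4 sites in the unit square, 3 time points.
rng = np.random.default_rng(1)
coords = rng.uniform(size=(4, 2))
times = np.arange(3.0)
sep, iso = space_time_covs(coords, times, alpha=1.0, beta1=0.5, beta2=2.0)
```

The two matrices agree within a single time point (where the temporal lag is zero) and differ only in how spatial and temporal lags interact.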

Fitted results
- Model 1: $\alpha = 0.158$, $\beta_1 = 0.00770$ (miles), $\beta_2 = 105.617$ (days), $\nu = 0.115$, LogLik = 108.584
- Model 2: $\alpha = 0.034$, $\beta_1 = 0.301$, $\beta_2 = 108.853$, $\nu = 0.114$, LogLik = 107.859
- Model 2 (different smoothness): $\alpha = 0.147$, $\beta_1 = 0.295$, $\beta_2 = 109.324$, $\nu_1 = 1.205$, $\nu_2 = 0.114$, LogLik = 107.859

Now... How to simulate random fields?

Simulation of Random Fields
- Now we will see some methods of simulating spatial data
- Why do we care about it?
  - Sometimes we have to demonstrate that our statistical method works by showing that it exhibits satisfactory long-run behavior
  - We may have to perform randomization tests or other hypothesis tests (we will see an example in a minute)
  - Often we lack replication in spatial data sets

Example of usage of simulated spatial data set
- Jun, Knutti, and Nychka (2008, JASA)
- We have 20 climate models (each model is a gigantic system of PDEs)
- The assumption is that some of these climate models have correlated errors
- For a given time point, we have one output from each climate model
- If we estimate the correlation between a certain pair of model errors, we cannot do inference on its statistical significance
- What do we do?

Example of usage of simulated spatial data set
- Our idea is to build a spatial (or spatial-temporal) model for each climate model
- Based on our spatial model, we do independent simulations many times for each climate model
- If the climate model errors are independent, the correlation that we get from the independent simulations should be similar to the actual correlation that we get from the actual climate model outputs
- Our spatial model, and simulation from it, gives us grounds for testing the significance of the correlation
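The logic of such a test can be sketched as follows. This is only an illustration of the general idea, not the procedure used in Jun, Knutti, and Nychka (2008): the function names, the covariance matrices, and the two-sided Monte Carlo p-value are all assumptions made for the example.

```python
import numpy as np

def null_correlations(sigma_a, sigma_b, n_sim=1000, seed=0):
    """Correlations between independently simulated Gaussian fields.

    sigma_a, sigma_b: covariance matrices of the two fitted spatial models
    (assumed zero-mean). Returns n_sim sample correlations, computed across
    locations, that arise when the two fields are truly independent.
    """
    rng = np.random.default_rng(seed)
    chol_a = np.linalg.cholesky(sigma_a)
    chol_b = np.linalg.cholesky(sigma_b)
    n = sigma_a.shape[0]
    out = np.empty(n_sim)
    for i in range(n_sim):
        a = chol_a @ rng.standard_normal(n)   # independent draw for model A
        b = chol_b @ rng.standard_normal(n)   # independent draw for model B
        out[i] = np.corrcoef(a, b)[0, 1]
    return out

def mc_p_value(observed_corr, null_corrs):
    """Two-sided Monte Carlo p-value, with the usual +1 correction."""
    exceed = np.sum(np.abs(null_corrs) >= abs(observed_corr))
    return (exceed + 1) / (len(null_corrs) + 1)

# Hypothetical example: 40 locations on a line, exponential covariance.
s = np.linspace(0.0, 1.0, 40)
sigma = np.exp(-np.abs(s[:, None] - s[None, :]) / 0.2)
null = null_correlations(sigma, sigma, n_sim=200)
p_large = mc_p_value(0.999, null)   # an observed correlation near 1
p_zero = mc_p_value(0.0, null)      # an observed correlation of 0
```

Because the simulated fields are spatially dependent, the null distribution of the correlation is wider than it would be for independent observations, which is why the spatial model is needed for the test.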

Jun et al. (2008, Tellus)

Jun et al. (2008, JASA)

Jun et al. (2008, Tellus)

Jun et al. (2008, Tellus)

Simulations and Gaussianity
- From now on, we will assume the spatial field has a Gaussian distribution
- This is because all we know about the process is its mean and covariance structure
- The Gaussian distribution is special in that it is completely determined by its mean and covariance
- We cannot do much if the process is not Gaussian

Conditional vs unconditional simulation
- Suppose our spatial domain is $D \subset \mathbb{R}^2$ and we have observations at $s_1, \dots, s_m$ for some $m$
- We consider simulating values of the spatial process (with the same mean and the same covariance, and thus the same distribution under Gaussianity) at locations of $D$ other than $s_1, \dots, s_m$
- Conditional simulation means that we respect the observed values. That is, our simulated values at the observation locations should be the same as the actual observations
- Unconditional simulation means that we do not have such a restriction
- Obviously, unconditional simulation is easier than conditional simulation

Unconditional simulation of Gaussian random fields
- Suppose we consider simulating $Y \sim N(\mu, \Sigma)$
- Note that since $\Sigma$ is symmetric and positive definite, we can write $\Sigma = \Sigma^{1/2} (\Sigma^{1/2})^T$
- Then if $X \sim N(0, I)$ ($I$ is an identity matrix with the same dimension as $\Sigma$), we can show that $\mu + \Sigma^{1/2} X$ has the same distribution as $Y$
- Now the question is how to find $\Sigma^{1/2}$
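The claim about $\mu + \Sigma^{1/2} X$ can be checked directly. An affine transformation of a Gaussian vector is again Gaussian, so it is enough to match the mean and the covariance:

$$
\begin{aligned}
E[\mu + \Sigma^{1/2} X] &= \mu + \Sigma^{1/2} E[X] = \mu, \\
\mathrm{Cov}(\mu + \Sigma^{1/2} X) &= \Sigma^{1/2} \, \mathrm{Cov}(X) \, (\Sigma^{1/2})^T = \Sigma^{1/2} I (\Sigma^{1/2})^T = \Sigma.
\end{aligned}
$$

Note that the argument only uses $\Sigma = \Sigma^{1/2} (\Sigma^{1/2})^T$, so any matrix with this property will do; it does not need to be symmetric.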

Unconditional simulation of Gaussian random fields
1. Cholesky decomposition
- As we discussed in the previous lecture, we can decompose $\Sigma = U^T U$, where $U$ is an upper triangular matrix, and we can take all of the diagonal entries of $U$ to be positive (in which case $U$ is unique)
- You can use $\Sigma^{1/2} = U^T$
2. Eigenvalue decomposition
- There is another way of decomposing $\Sigma$
- That is, we can let $\Sigma = P \Lambda P^T$, where $\Lambda$ is a diagonal matrix whose diagonal values are the eigenvalues of $\Sigma$ (they should all be positive)
- Also, $P$ should be an orthonormal matrix ($P^T P = I$)
- Since the diagonal entries of $\Lambda$ are positive, we can let $\Sigma^{1/2} = P \Lambda^{1/2} P^T$
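Both choices of $\Sigma^{1/2}$ can be tried out with NumPy. The sketch below is illustrative only: the function names and the toy exponential covariance are assumptions. Note that `np.linalg.cholesky` returns the lower-triangular factor $L$ with $\Sigma = L L^T$, which corresponds to $U^T$ in the notation above.

```python
import numpy as np

def sqrt_cholesky(sigma):
    """Square root from the Cholesky factor.

    NumPy returns the lower-triangular L with sigma = L @ L.T,
    which plays the role of U^T in Sigma = U^T U.
    """
    return np.linalg.cholesky(sigma)

def sqrt_eigen(sigma):
    """Symmetric square root P diag(sqrt(lambda)) P^T."""
    lam, p = np.linalg.eigh(sigma)   # eigenvalues ascending, columns of p orthonormal
    lam = np.clip(lam, 0.0, None)    # guard against tiny negative round-off
    return (p * np.sqrt(lam)) @ p.T  # scales column j of p by sqrt(lam[j])

def simulate(mu, sigma_half, n_draws, rng):
    """Return an (n, n_draws) array whose columns are draws of mu + Sigma^{1/2} X."""
    x = rng.standard_normal((sigma_half.shape[1], n_draws))
    return mu[:, None] + sigma_half @ x

# Hypothetical example: 5 points on a line with an exponential covariance.
rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 5)
sigma = np.exp(-np.abs(s[:, None] - s[None, :]) / 0.5)
mu = np.zeros(5)

y_chol = simulate(mu, sqrt_cholesky(sigma), 20000, rng)
y_eig = simulate(mu, sqrt_eigen(sigma), 20000, rng)
```

The two square roots are different matrices, but draws built from either one have the same distribution; with many draws, `np.cov(y_chol)` and `np.cov(y_eig)` should both be close to `sigma`.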

Conditional simulation of Gaussian random fields
- Suppose we want to simulate the random field $Z(s)$, $s \in D$
- Suppose also that we have observations $Z(s_1), \dots, Z(s_m)$
- We denote the simulated values of $Z$ as $S$
- A conditional simulation produces $n = m + k$ values such that $S(s) = [Z(s_1), \dots, Z(s_m), S(s_{m+1}), \dots, S(s_{m+k})]$

Conditional simulation of Gaussian random fields
1. Sequential simulation for a Gaussian random field
- Note the fact from the multivariate Gaussian distribution that if
$$\begin{pmatrix} Z(s_0) \\ Z(s) \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_0 \\ \mu \end{pmatrix}, \begin{pmatrix} \sigma^2 & c^T \\ c & \Sigma \end{pmatrix} \right)$$
then
$$Z(s_0) \mid Z(s) \sim N\left( \mu_0 + c^T \Sigma^{-1} (Z(s) - \mu), \; \sigma^2 - c^T \Sigma^{-1} c \right)$$
- Using the above fact, we calculate the conditional distribution of $S(s_{m+i})$ given $[Z(s_1), \dots, Z(s_m), S(s_{m+1}), \dots, S(s_{m+i-1})]$, which is Gaussian
2. Conditioning a simulation by kriging
- Consider the decomposition $Z(s) = p_{sk}(s; Z) + [Z(s) - p_{sk}(s; Z)]$
- Then we replace $Z(s) - p_{sk}(s; Z)$ by $S(s) - p_{sk}(s; S_m)$, where $p_{sk}(s; S_m)$ denotes the simple kriging predictor at location $s$ based on the values of the unconditional simulation at $s_1, \dots, s_m$
- Then we can show that the resulting quantity, $p_{sk}(s; Z) + S(s) - p_{sk}(s; S_m)$, has the desired property: it reproduces the observations at $s_1, \dots, s_m$
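Both approaches can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not code from the lecture: the function names are hypothetical, the mean is a known constant `mu`, the covariance is exponential with made-up parameters, and there is no nugget (so simple kriging interpolates the data exactly).

```python
import numpy as np

def exp_cov(a, b, alpha=1.0, beta=0.3):
    """Exponential covariance between the rows of a and the rows of b (assumed model)."""
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    return alpha * np.exp(-d / beta)

def simple_kriging(c, sigma_obs, values, mu=0.0):
    """p_sk = mu + c Sigma^{-1} (values - mu); each row of c holds c^T for one target."""
    return mu + c @ np.linalg.solve(sigma_obs, values - mu)

def condition_by_kriging(obs_xy, z_obs, new_xy, rng, mu=0.0):
    """Return p_sk(s; Z) + S(s) - p_sk(s; S_m) at all n = m + k locations."""
    all_xy = np.vstack([obs_xy, new_xy])
    m = len(obs_xy)
    sigma_all = exp_cov(all_xy, all_xy)
    # Unconditional simulation S at every location (Cholesky method)
    s = mu + np.linalg.cholesky(sigma_all) @ rng.standard_normal(len(all_xy))
    sigma_obs = sigma_all[:m, :m]
    c = sigma_all[:, :m]   # covariances between every location and the data sites
    return (simple_kriging(c, sigma_obs, z_obs, mu)
            + s - simple_kriging(c, sigma_obs, s[:m], mu))

def sequential_simulation(obs_xy, z_obs, new_xy, rng, mu=0.0):
    """Draw each new value from its Gaussian conditional given all earlier values."""
    xy, vals = obs_xy.copy(), z_obs.copy()
    for s0 in new_xy:
        s0 = s0[None, :]
        sigma = exp_cov(xy, xy)
        c = exp_cov(xy, s0)[:, 0]
        w = np.linalg.solve(sigma, c)             # Sigma^{-1} c
        mean = mu + w @ (vals - mu)               # mu_0 + c^T Sigma^{-1} (Z - mu)
        var = exp_cov(s0, s0)[0, 0] - w @ c       # sigma^2 - c^T Sigma^{-1} c
        draw = mean + np.sqrt(max(var, 0.0)) * rng.standard_normal()
        xy = np.vstack([xy, s0])
        vals = np.append(vals, draw)
    return vals

# Hypothetical toy data: m = 6 observed sites, k = 4 new sites in the unit square.
rng = np.random.default_rng(0)
obs_xy = rng.uniform(size=(6, 2))
new_xy = rng.uniform(size=(4, 2))
z_obs = rng.standard_normal(6)

cond_krig = condition_by_kriging(obs_xy, z_obs, new_xy, rng)
cond_seq = sequential_simulation(obs_xy, z_obs, new_xy, rng)
```

In both outputs the first $m$ entries equal the observations (up to round-off for the kriging version), and the remaining $k$ entries are the simulated values. The sequential version solves a growing linear system at each step, so it becomes expensive as $k$ increases.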