Chapter 4 - Fundamentals of spatial processes Lecture notes


STK4150 - Intro 1
Chapter 4 - Fundamentals of spatial processes. Lecture notes.
Odd Kolbjørnsen and Geir Storvik, January 30, 2017

STK4150 - Intro 2
Spatial processes
- Typically correlation between nearby sites
  - Mostly positive correlation
  - Negative correlation when there is competition
- Often part of a space-time process
  - Temporal snapshot
  - Temporal aggregation
- Statistical analysis: incorporate spatial dependence into spatial statistical models
- Active research field; computer-intensive tasks have led to specialized software


STK4150 - Intro 4
Hierarchical (statistical) models
- Data model
- Process model
In a time series setting: state-space models. Example:
  Y_t = α Y_{t−1} + W_t,  t = 2, 3, ...      (process model)
  Z_t = β Y_t + η_t,      t = 1, 2, 3, ...   (data model)
  W_t ~ ind. (0, σ²_W),   η_t ~ ind. (0, σ²_η)
We will do a similar type of modelling now, separating the process model and the data model:
  Z(s_i) = Y(s_i) + ε(s_i),  ε(s_i) iid
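
The state-space example above can be simulated directly; this is a minimal NumPy sketch, with made-up parameter values (α = 0.8, β = 1, σ_W = 1, σ_η = 0.5) and Gaussian noise assumed for both levels:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 0.8, 1.0
sigma_W, sigma_eta = 1.0, 0.5   # illustrative values, not from the notes
T = 200

Y = np.zeros(T)                 # latent process
Z = np.zeros(T)                 # observations
Y[0] = rng.normal(0.0, sigma_W)
Z[0] = beta * Y[0] + rng.normal(0.0, sigma_eta)
for t in range(1, T):
    Y[t] = alpha * Y[t - 1] + rng.normal(0.0, sigma_W)   # process model
    Z[t] = beta * Y[t] + rng.normal(0.0, sigma_eta)      # data model
```

The same two-level structure (latent process plus noisy data) carries over to the spatial model Z(s_i) = Y(s_i) + ε(s_i), with the time index replaced by a spatial location.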

STK4150 - Intro 5
Spatial prediction
- An unknown function is of interest, i.e. Y(s), s ∈ D
- Standard problem: observe Z = (Z_1, ..., Z_m), where
  Z_i = Y(s_i) + ε_i,  i = 1, ..., m
- Want to predict the function at an unobserved position s_0, i.e. Y(s_0)
- Multiple methods; assumptions and frameworks differ:
  - Interpolation, regression, splines
  - Kriging
  - Hierarchical model (with a Gaussian random field)

Methods for spatial prediction
Regression: find the function minimizing least squares,
  f(s) = Σ_{i=1}^K β_i f_i(s) = β^T f(s)
Know: β̂ = (F^T F)^{-1} F^T Z, with F_ij = f_j(s_i).
Kriging: find the optimal unbiased linear predictor under squared error loss,
  min_{k,L} E{(Y(s_0) − k − L^T Z)²}
Hierarchical model: find the optimal predictor under squared error loss,
  min_{a(Z)} E{(Y(s_0) − a(Z))²}
Know: a*(Z) = E{Y(s_0) | Z}

Geostatistical models
Assume {Y(s)} is a Gaussian process:
  (Y(s_1), ..., Y(s_m)) is multivariate Gaussian for all m and s_1, ..., s_m.
Need to specify
  μ(s) = E(Y(s))
  C_Y(s, s') = cov(Y(s), Y(s'))
Assuming 2nd order stationarity:
  μ(s) = μ, for all s
  C_Y(s, s') = C_Y(s − s'), for all s, s'
Common extension: μ(s) = x(s)^T β
Often: Z(s_i) | Y(s_i), σ²_ε ~ independent N(Y(s_i), σ²_ε)

STK4150 - Intro 8
Covariance function and variogram
Dependence can be specified through covariance functions or the variogram:
  2γ_Y(h) ≡ var[Y(s + h) − Y(s)]
          = var[Y(s + h)] + var[Y(s)] − 2cov[Y(s + h), Y(s)]
          = 2C_Y(0) − 2C_Y(h)
- Variograms are more general than covariance functions: the variogram can exist even if var[Y(s)] = ∞!
- Variograms model relative changes rather than the process itself.
- In geostatistics it is common to use variograms.
- Covariance formulas have corresponding variogram formulas.
- γ_Y(h) is sometimes called the semi-variogram.
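
The relation γ_Y(h) = C_Y(0) − C_Y(h) is easy to check numerically. A small sketch, assuming an exponential covariance with illustrative parameters σ² = 1, θ₁ = 2 (not values from the notes):

```python
import numpy as np

def cov_exp(h, sigma2=1.0, theta1=2.0):
    """Exponential covariance: C_Y(h) = sigma^2 * exp(-|h|/theta1)."""
    return sigma2 * np.exp(-np.abs(h) / theta1)

def semivariogram(h, sigma2=1.0, theta1=2.0):
    """gamma_Y(h) = C_Y(0) - C_Y(h), valid when var[Y(s)] is finite."""
    return cov_exp(0.0, sigma2, theta1) - cov_exp(h, sigma2, theta1)
```

Note how the semivariogram is 0 at lag 0 and increases towards C_Y(0) = σ² (the sill) as the lag grows.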

Stationary, isotropic, anisotropic
Strong stationarity: for any (s_1, ..., s_m) and any h,
  [Y(s_1), ..., Y(s_m)] =_d [Y(s_1 + h), ..., Y(s_m + h)]
Stationarity in mean:
  E(Y(s)) = μ, for all s ∈ D
Stationary covariance (depends only on the lag):
  cov(Y(s), Y(s + h)) = C_Y(h), for all s, s + h ∈ D
Isotropic covariance (depends only on the length of the lag):
  cov(Y(s), Y(s + h)) = C_Y(||h||), for all s, s + h ∈ D
Geometric anisotropy in covariance (rotated coordinate system):
  cov(Y(s), Y(s + h)) = C_Y(||Ah||), for all s, s + h ∈ D
Weak stationarity: stationarity in mean and a stationary covariance (recall time series).

Isotropic covariance functions/variograms
Matérn covariance function:
  C_Y(h; θ) = σ₁² {2^{θ₂−1} Γ(θ₂)}^{-1} (||h||/θ₁)^{θ₂} K_{θ₂}(||h||/θ₁)
Powered exponential:
  C_Y(h; θ) = σ₁² exp{−(||h||/θ₁)^{θ₂}}
Exponential:
  C_Y(h; θ) = σ₁² exp{−||h||/θ₁}
Gaussian:
  C_Y(h; θ) = σ₁² exp{−(||h||/θ₁)²}
Note: there are many different ways to define the range parameter θ₁, e.g. σ₁² exp{−3(||h||/R)^ν}.
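
The powered-exponential family contains the exponential (θ₂ = 1) and Gaussian (θ₂ = 2) covariances as special cases, so one function covers all three; a minimal NumPy sketch (the Matérn would additionally need a modified Bessel function, e.g. scipy.special.kv):

```python
import numpy as np

def powered_exponential(h, sigma2, theta1, theta2):
    """C_Y(h) = sigma^2 * exp{-(|h|/theta1)^theta2}, valid for 0 < theta2 <= 2."""
    return sigma2 * np.exp(-(np.abs(h) / theta1) ** theta2)

# exponential and Gaussian covariances as special cases, at lag h = 0.5
exp_cov = powered_exponential(0.5, 1.0, 1.0, 1.0)   # theta2 = 1: exponential
gau_cov = powered_exponential(0.5, 1.0, 1.0, 2.0)   # theta2 = 2: Gaussian
```

At lag zero both reduce to σ₁², the (partial) sill, regardless of θ₂.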

STK4150 - Intro 11
Stationary spatial covariance functions (figure)

STK4150 - Intro 12
Simulation of a stationary spatial random field (figure)

Isotropic covariance
The benefit of the isotropic assumption is that we only need 1D functions.
Note: not all covariance functions/variograms that are valid in 1D are valid as isotropic covariance functions in higher dimensions.

STK4150 - Intro 14
Bochner's theorem
A covariance function needs to be positive definite.
Theorem (Bochner, 1955): If ∫ |C_Y(h)| dh < ∞, then a valid real-valued covariance function can be written as
  C_Y(h) = ∫ cos(ω^T h) f_Y(ω) dω
where f_Y(ω) ≥ 0 is symmetric about ω = 0.
f_Y(ω) is the spectral density of C_Y(h).

STK4150 - Intro 15
Nugget effect and sill
We have
  C_Z(h) = cov[Z(s), Z(s + h)]
         = cov[Y(s) + ε(s), Y(s + h) + ε(s + h)]
         = cov[Y(s), Y(s + h)] + cov[ε(s), ε(s + h)]
         = C_Y(h) + σ²_ε I(h = 0)
Assume
  C_Y(0) = σ²_Y,  lim_{h→0} [C_Y(0) − C_Y(h)] = 0
Then
  C_Z(0) = σ²_Y + σ²_ε                       (sill)
  lim_{h→0} [C_Z(0) − C_Z(h)] = σ²_ε = c_0   (nugget effect)
It is possible to include a nugget effect also in the Y-process.

STK4150 - Intro 16
Nugget/sill (figure)

STK4150 - Intro 17
Nugget/sill
- The sill is the fixed, finite level which the variogram converges towards at large lags (i.e. h → ∞).
- Variograms without a sill (γ(h) → ∞ as h → ∞) have no equivalent correlation function.
- The nugget effect is variability below the resolution of our model.
- In the setting with point observations, i.e. Z_i = Y(s_i) + ε_i, it is difficult (often impossible) to distinguish a nugget effect in the random field Y(s_i) from the observation error ε_i. This becomes a modelling choice.
- Some software mixes the nugget effect and the observation error.

STK4150 - Intro 18
Estimation of the variogram/covariance function
  2γ_Z(h) ≡ var[Z(s + h) − Z(s)] = E[Z(s + h) − Z(s)]²   (constant expectation)
Can estimate from all pairs of sites separated by distance h.
Problem: few/no pairs for every h. Simplifications:
- Isotropy: γ_Z(h) = γ⁰_Z(||h||)
- Lag bins: 2γ̂⁰_Z(h) = ave{(Z(s_i) − Z(s_j))²; ||s_i − s_j|| ∈ T(h)}
If there are covariates, use the residuals.
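
The lag-bin estimator above can be sketched in a few lines of NumPy. Everything here is illustrative: the sites and "residuals" are simulated (iid, so the true variogram is flat at σ²_Z), and the bin edges are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.uniform(0, 10, size=(100, 2))   # simulated 2D sites
z = rng.normal(size=100)                # simulated residuals (iid here)

def empirical_variogram(s, z, bins):
    """gamma_hat(h): half the average of (z_i - z_j)^2 over pairs whose
    distance falls in each lag bin T(h)."""
    d = np.sqrt(((s[:, None, :] - s[None, :, :]) ** 2).sum(-1))
    sq = (z[:, None] - z[None, :]) ** 2
    iu = np.triu_indices(len(z), k=1)   # count each pair once
    d, sq = d[iu], sq[iu]
    gamma = np.empty(len(bins) - 1)
    for k in range(len(bins) - 1):
        m = (d >= bins[k]) & (d < bins[k + 1])
        gamma[k] = 0.5 * sq[m].mean() if m.any() else np.nan
    return gamma

gam = empirical_variogram(s, z, bins=np.linspace(0, 5, 6))
```

For iid data every bin should be close to the variance of z; spatially correlated residuals would instead show small values in the short-lag bins, rising towards the sill.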

Empirical variogram: Boreality data (figure)

STK4150 - Intro 20
Testing for independence
Under independence: γ⁰_Z(h) = σ²_Z.
Test statistic: F = γ̂_Z(h_1)/σ̂²_Z, where h_1 is the smallest observed distance.
Reject H_0 for |F − 1| large.
Permutation test: keep the spatial positions and the values of the residuals, but scramble the pairing, i.e. reassign the residuals to new spatial positions. Recalculate F for all permutations of Z (or for a random sample of permutations). If the observed F is above the 97.5% percentile, reject H_0.
Boreality example: p-value = 0.0001
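
A sketch of the permutation test in NumPy, with simulated sites and iid residuals (so H_0 holds by construction); the shortest-distance cutoff h_1 = 1.5, the 199 permutations, and the two-sided comparison via |F − 1| are all illustrative choices, not taken from the notes:

```python
import numpy as np

rng = np.random.default_rng(2)
s = rng.uniform(0, 10, size=(60, 2))
z = rng.normal(size=60)                  # iid residuals: H0 is true here

def F_stat(s, z, h1=1.5):
    """gamma_hat at the smallest distances divided by the sample variance."""
    d = np.sqrt(((s[:, None, :] - s[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices(len(z), k=1)
    close = d[iu] <= h1                  # pairs at the shortest lags
    gamma_h1 = 0.5 * ((z[:, None] - z[None, :]) ** 2)[iu][close].mean()
    return gamma_h1 / z.var()

F_obs = F_stat(s, z)
# scramble the pairing: same positions, same values, new assignment
perm = np.array([F_stat(s, rng.permutation(z)) for _ in range(199)])
# two-sided: how extreme is |F - 1| under the permutation distribution?
p_value = (1 + np.sum(np.abs(perm - 1) >= np.abs(F_obs - 1))) / (1 + len(perm))
```

With spatially correlated residuals, γ̂_Z(h_1) would be small relative to σ̂²_Z, pushing F away from 1 and the p-value towards 0.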

STK4150 - Intro 21
Prediction in multivariate Gaussian models
(Z(s_1), ..., Z(s_m), Y(s_0)) is multivariate Gaussian.
Can use the rules for conditional distributions: if
  (X_1, X_2) ~ MVN((μ_1, μ_2), [Σ_11 Σ_12; Σ_21 Σ_22])
then
  E(X_1 | X_2) = μ_1 + Σ_12 Σ_22^{-1} (X_2 − μ_2)
  var(X_1 | X_2) = Σ_11 − Σ_12 Σ_22^{-1} Σ_21
Expectations: as for ordinary linear regression. Covariances: new!

Prediction in the spatial model
  (Y(s_0), Z) ~ MVN((μ(s_0), μ_Z), [C_Y(s_0, s_0) c(s_0)^T; c(s_0) C_Z])
where
  c(s_0) = cov[Z, Y(s_0)] = cov[Y, Y(s_0)] = (C_Y(s_0, s_1), ..., C_Y(s_0, s_m))^T = c_Y(s_0)
  C_Z = {C_Z(s_i, s_j)},  C_Z(s_i, s_j) = C_Y(s_i, s_i) + σ²_ε if s_i = s_j, and C_Y(s_i, s_j) if s_i ≠ s_j
This gives
  E(Y(s_0) | Z) = μ(s_0) + c(s_0)^T C_Z^{-1} (Z − μ_Z)
  var(Y(s_0) | Z) = C_Y(s_0, s_0) − c(s_0)^T C_Z^{-1} c(s_0)

Kriging = prediction
Model:
  Y(s) = x(s)^T β + δ(s)
  Z_i = Y(s_i) + ε_i
Prediction of Y(s_0), i.e. at an unobserved location, using linear predictors {L^T Z + k}.
The optimal predictor minimizes
  MSPE(L, k) ≡ E[Y(s_0) − L^T Z − k]² = var[Y(s_0) − L^T Z − k] + {E[Y(s_0) − L^T Z − k]}²
Note: we do not make any distributional assumptions.

STK4150 - Intro 24
Kriging
  MSPE(L, k) = var[Y(s_0) − L^T Z − k] + {E[Y(s_0) − L^T Z − k]}²
             = var[Y(s_0) − L^T Z − k] + {μ_Y(s_0) − L^T μ_Z − k}²
The second term is zero if k = μ_Y(s_0) − L^T μ_Z.
First term (with c(s_0) = cov[Z, Y(s_0)]):
  var[Y(s_0) − L^T Z − k] = C_Y(s_0, s_0) − 2L^T c(s_0) + L^T C_Z L
Differentiating wrt L^T:
  −2c(s_0) + 2C_Z L = 0
giving
  L* = C_Z^{-1} c(s_0)
  Ŷ(s_0) = μ_Y(s_0) + c(s_0)^T C_Z^{-1} [Z − μ_Z]
  MSPE(L*, k*) = C_Y(s_0, s_0) − c(s_0)^T C_Z^{-1} c(s_0)

Kriging (simple kriging)
The kriging prediction with a known mean is
  Ŷ(s_0) = μ_Y(s_0) + c_Y(s_0)^T C_Z^{-1} (Z − μ_Z)
where
  c(s_0) = cov[Z, Y(s_0)] = cov[Y, Y(s_0)] = (C_Y(s_0, s_1), ..., C_Y(s_0, s_m))^T = c_Y(s_0)
  C_Z = {C_Z(s_i, s_j)},  C_Z(s_i, s_j) = C_Y(s_i, s_i) + σ²_ε if s_i = s_j, and C_Y(s_i, s_j) if s_i ≠ s_j
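
The simple-kriging predictor and its MSPE can be sketched directly from these formulas. This is an illustrative NumPy implementation, with simulated sites, an assumed exponential C_Y, and made-up parameters (σ² = 1, θ₁ = 2, σ²_ε = 0.1, μ = 0):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 40
s = rng.uniform(0, 10, size=(m, 2))       # observation sites
s0 = np.array([[5.0, 5.0]])               # prediction site
sigma2, theta1, sigma2_eps, mu = 1.0, 2.0, 0.1, 0.0

def C_Y(a, b):
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))
    return sigma2 * np.exp(-d / theta1)   # exponential covariance

C_Z = C_Y(s, s) + sigma2_eps * np.eye(m)  # add observation-error variance
c0 = C_Y(s, s0)[:, 0]                     # c_Y(s0)

# simulate one data vector from the model, then predict at s0
Z = mu + np.linalg.cholesky(C_Z) @ rng.normal(size=m)
w = np.linalg.solve(C_Z, c0)              # C_Z^{-1} c_Y(s0)
Y_hat = mu + w @ (Z - mu)                 # simple-kriging predictor
mspe = sigma2 - c0 @ w                    # C_Y(s0,s0) - c0^T C_Z^{-1} c0
```

The MSPE is always between 0 and the prior variance C_Y(s_0, s_0): conditioning on data can only reduce uncertainty.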

Gaussian assumptions
Recall now that Y, Z are MVN:
  (Z, Y(s_0)) ~ MVN((μ_Z, μ_Y(s_0)), [C_Z c_Y(s_0); c_Y(s_0)^T C_Y(s_0, s_0)])
This gives
  Y(s_0) | Z ~ N(μ_Y(s_0) + c(s_0)^T C_Z^{-1} [Z − μ_Z], C_Y(s_0, s_0) − c(s_0)^T C_Z^{-1} c(s_0))
Same as kriging! The book derives this directly, without using the formula for the conditional distribution.

STK4150 - Intro 27
Compare to interpolation
Assume no observation error, and consider deviations from μ(s), i.e. f(s) = Y(s) − μ(s).
Set f_i(s) = C_Y(s − s_i). Then:
  F = C_Z (= C_Y, no observation error)
  (F^T F)^{-1} F^T = C_Z^{-1}   (symmetry)
  β̂ = C_Z^{-1} Z
  f̂(s_0) = Σ_{i=1}^m β̂_i C_Y(s_0 − s_i) = β̂^T c(s_0) = Z^T C_Z^{-1} c(s_0)
If we include μ(s) we get the same result again:
  Ŷ(s_0) = μ(s_0) + c(s_0)^T C_Z^{-1} (Z − μ_Z)

Spatial prediction comments
- Given the mean and covariance function, kriging and the Gaussian random field model give identical spatial predictions.
- There are stronger assumptions underlying the Gaussian model than are needed in kriging:
  - Kriging is the optimal linear predictor for any distribution (not only Gaussian).
  - For the Gaussian model the linear predictor is optimal among all predictors; for other distributions there might be better predictors.
- The Gaussian model is easier to extend using a hierarchical approach.
- In kriging and Gaussian random fields, the interpolating functions are determined by the data.
- In kriging it is common to use a plug-in estimate for the covariance function.
- The hierarchical approach is suited to include uncertainty about model parameters.
- The mean squared prediction error (MSPE) is used for both kriging and the hierarchical model (HM).


STK4150 - Intro 30
So far and next
So far: simple kriging
- Linear predictor
- Assumes parameters known
- Equal to the conditional expectation
Next: unknown parameters
- Ordinary kriging
- Plug-in estimates
- Bayesian approach
- Non-Gaussian models

STK4150 - Intro 31
Unknown parameters
So far we have assumed the parameters known; what if they are unknown?
- Direct approach: universal and ordinary kriging (for mean parameters)
- Plug-in estimates / empirical Bayes
- Bayesian approach

Kriging (cont.)
Recall Ŷ(s_0) = μ_Y(s_0) + c_Y(s_0)^T C_Z^{-1} (Z − μ_Z). Assuming
  E[Y(s)] = x(s)^T β
  Z(s_i) | Y(s_i), σ²_ε ~ ind. Gau(Y(s_i), σ²_ε)
then
  μ_Z = μ_Y = Xβ
  C_Z = Σ_Y + σ²_ε I
  c_Y(s_0) = (C_Y(s_0, s_1), ..., C_Y(s_0, s_m))^T
and
  Ŷ(s_0) = x(s_0)^T β + c_Y(s_0)^T [Σ_Y + σ²_ε I]^{-1} (Z − Xβ)



Kriging
Simple kriging, E[Y(s)] = x(s)^T β with β known:
  Ŷ(s_0) = x(s_0)^T β + c_Y(s_0)^T C_Z^{-1} (Z − Xβ)
Universal kriging, E[Y(s)] = x(s)^T β with β unknown:
  Ŷ(s_0) = x(s_0)^T β̂_gls + c_Y(s_0)^T C_Z^{-1} (Z − X β̂_gls)
  β̂_gls = [X^T C_Z^{-1} X]^{-1} X^T C_Z^{-1} Z
Bayesian kriging, E[Y(s)] = x(s)^T β with β ~ N(β_0, Σ_0):
  Ŷ(s_0) = x(s_0)^T β̂_B + c_Y(s_0)^T C_Z^{-1} (Z − X β̂_B)
  β̂_B = β_0 + Σ_0 X^T (X Σ_0 X^T + C_Z)^{-1} (Z − X β_0)
It is possible to show that SK and UK are the limiting cases of BK:
  Σ_0 → 0: BK → SK
  Σ_0 → ∞: BK → UK
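
The limit Σ_0 → ∞: BK → UK can be checked numerically: with a vague prior, the Bayesian mean estimate β̂_B approaches β̂_gls. A sketch with simulated data and illustrative choices throughout (design matrix, exponential covariance with nugget, prior N(0, 10⁴ I)):

```python
import numpy as np

rng = np.random.default_rng(4)
m = 30
s = rng.uniform(0, 10, size=(m, 2))
X = np.column_stack([np.ones(m), s[:, 0]])   # intercept + first coordinate
d = np.sqrt(((s[:, None, :] - s[None, :, :]) ** 2).sum(-1))
C_Z = np.exp(-d / 2.0) + 0.1 * np.eye(m)     # exponential cov + nugget

# simulate Z = X beta + correlated noise, beta_true = (1, 0.5)
Z = X @ np.array([1.0, 0.5]) + np.linalg.cholesky(C_Z) @ rng.normal(size=m)

# universal kriging mean: GLS estimate
Ci = np.linalg.inv(C_Z)
beta_gls = np.linalg.solve(X.T @ Ci @ X, X.T @ Ci @ Z)

# Bayesian kriging mean with a vague prior beta ~ N(0, 1e4 * I)
beta0, Sigma0 = np.zeros(2), 1e4 * np.eye(2)
beta_B = beta0 + Sigma0 @ X.T @ np.linalg.solve(X @ Sigma0 @ X.T + C_Z, Z - X @ beta0)
```

As Σ_0 shrinks towards 0 the same formula instead pins β̂_B at β_0, recovering simple kriging.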

STK4150 - Intro 36
Ordinary kriging
Ordinary kriging, E[Y(s)] = μ with μ unknown (special case of UK):
  Ŷ(s_0) = {c_Y(s_0) + 1 (1 − 1^T C_Z^{-1} c_Y(s_0)) / (1^T C_Z^{-1} 1)}^T C_Z^{-1} Z
         = μ̂_gls + c_Y(s_0)^T C_Z^{-1} (Z − 1 μ̂_gls)
  μ̂_gls = [1^T C_Z^{-1} 1]^{-1} 1^T C_Z^{-1} Z
To minimize the MSPE, we seek the unbiased estimate with minimum variance.
  MSPE(λ) = E(Y(s_0) − λ^T Z)²
Unbiasedness constraint: E[Y(s_0)] − E[λ^T Z] = 0
Prediction variance: PV(λ) = C_Y(s_0, s_0) − 2λ^T c_Y(s_0) + λ^T C_Z λ

Ordinary kriging equations
We have E[Y(s_0)] = μ and E[λ^T Z] = λ^T E[Z] = λ^T 1 μ, thus
  μ = λ^T 1 μ  ⇒  1 = λ^T 1
The problem then becomes
  min_λ PV(λ)  subject to  λ^T 1 = 1
Solved by a Lagrange multiplier, i.e. minimize
  C_Y(s_0, s_0) − 2λ^T c_Y(s_0) + λ^T C_Z λ − 2κ(λ^T 1 − 1)
Differentiation wrt λ^T gives (which is combined into the final result presented on the previous page):
  −2c_Y(s_0) + 2C_Z λ = 2κ 1
  λ^T 1 = 1
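
The stationarity conditions C_Z λ = c_Y(s_0) + κ1 and λ^T 1 = 1 form one linear system in (λ, κ), which is how ordinary kriging weights are typically computed. An illustrative NumPy sketch with simulated sites and an assumed exponential covariance plus nugget (the multiplier below, ν, carries the opposite sign of κ in the notes):

```python
import numpy as np

rng = np.random.default_rng(5)
m = 25
s = rng.uniform(0, 10, size=(m, 2))
s0 = np.array([4.0, 6.0])
d = np.sqrt(((s[:, None, :] - s[None, :, :]) ** 2).sum(-1))
C_Z = np.exp(-d / 2.0) + 0.1 * np.eye(m)               # exponential cov + nugget
c0 = np.exp(-np.sqrt(((s - s0) ** 2).sum(-1)) / 2.0)   # c_Y(s0)

# augmented system: [C_Z 1; 1^T 0] [lambda; nu] = [c0; 1]
A = np.block([[C_Z, np.ones((m, 1))],
              [np.ones((1, m)), np.zeros((1, 1))]])
b = np.append(c0, 1.0)
sol = np.linalg.solve(A, b)
lam, nu = sol[:m], sol[m]   # kriging weights and Lagrange multiplier
```

The solved weights satisfy both equations by construction: they sum to one (unbiasedness), and C_Z λ differs from c_Y(s_0) only by the multiplier times the vector of ones.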