Lecture 9: Introduction to Kriging

Math 586

Beginning remarks

Kriging is a commonly used method of interpolation (prediction) for spatial data. The data are a set of observations of some variable(s) of interest, with some spatial correlation present. Usually, the result of kriging is the expected value ("kriging mean") and the variance ("kriging variance") computed for every point within a region. Practically, this is done on a fine enough grid.

Illustration: suppose we observe some variable $Z$ along a 1-dimensional space ($X$), with 5 measurements made. We might ask ourselves: knowing the probabilistic behavior of the random field being observed, what are the possible trajectories (realizations) of the random field that agree with the data? This is answered by conditional simulation. In Fig. 1, you see two sets of 5 conditional simulations for the same data. The set on the left is for one value of $\sigma^2$, the variance of the random field; the set on the right is for another, higher, value.

[Figure 1: Conditional simulations, data given by dots. Left panel: 5 realizations; right panel: another 5 realizations, with higher kriging variance.]
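The idea behind Figure 1 can be sketched in a few lines of R. This is a minimal illustration, not code from the lecture: it assumes a Gaussian random field with an exponential covariance, and all locations, data values, and parameters are invented. Conditional realizations are drawn from the conditional multivariate normal distribution of the field on a grid, given the data (a constant mean, if present, could be subtracted first, as in the simple kriging discussion below).

# Minimal sketch of conditional simulation for a 1-D Gaussian random field.
# All locations, data values, and parameters below are illustrative.
set.seed(1)
cov_fun <- function(h, sigma2 = 4, range = 2) sigma2 * exp(-abs(h) / range)

x_obs  <- c(2, 3.5, 5, 6.5, 8)        # 5 measurement locations
z_obs  <- c(3, 6, 4, 2, 5)            # observed values (illustrative)
x_grid <- seq(1, 9, by = 0.05)        # simulation grid

C_oo <- cov_fun(outer(x_obs,  x_obs,  "-"))   # cov among observations
C_go <- cov_fun(outer(x_grid, x_obs,  "-"))   # cov grid vs observations
C_gg <- cov_fun(outer(x_grid, x_grid, "-"))   # cov among grid points

W       <- C_go %*% solve(C_oo)               # kriging-type weights
cond_mu <- as.vector(W %*% z_obs)             # conditional (kriging) mean
cond_C  <- C_gg - W %*% t(C_go)               # conditional covariance

# Draw 5 realizations from the conditional distribution:
R    <- chol(cond_C + 1e-8 * diag(length(x_grid)))  # jitter for stability
sims <- cond_mu + t(R) %*% matrix(rnorm(length(x_grid) * 5), ncol = 5)

matplot(x_grid, sims, type = "l", lty = 1, xlab = "X", ylab = "Z")
points(x_obs, z_obs, pch = 16)

Increasing sigma2 makes the drawn trajectories spread more widely between the data points, which is the qualitative difference between the two panels of Figure 1.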

All the realizations are faithful to the data, but also faithful to the statistical model for the random field (i.e. mean and variogram) that we selected. The kriging mean at every location can be thought of as the average of the whole ensemble of possible realizations, conditioned on the data. The kriging variance is the variance of that ensemble. (The ensemble is of course infinite; we only show 5 of its representatives.) You may see that the trajectories tend to diverge away from the observed data, that is, the kriging variance increases. Also, for the random field with higher variance (the one on the right in Fig. 1), the trajectories diverge more strongly. We will observe similar qualitative behavior frequently in the future.

Simple kriging (SK)

Let us observe some stationary (WSS) random field $V(x)$ at some points $x_j$, $j = 1, \ldots, n$. First, assume that the mean $m$ and the covariance function $C(\cdot)$ of this process are known. The case of prediction with the known mean is often called simple kriging.

In order to better understand what happens, and to devise an extendable approach, let's attack the question by directly trying to minimize the MSE. We will seek an estimate $\hat{V}_0$ of the value of $V$ at the point $x_0$. For simplicity, denote $V_j := V(x_j)$. Also, denote $C(i, j) = \mathrm{Cov}(V_i, V_j)$. We may assume that the mean $m = 0$: otherwise, subtract $m$ from all of the $V_j$ values, estimate $V_0$, then add the mean back.

We search for an estimate of the form
$$\hat{V}_0 = \sum_{j=1}^{n} \lambda_j V_j;$$
under the assumption of 0 mean, it is automatically unbiased. (Why?) We will find the kriging weights $\lambda_j$ that minimize the MSE:
$$\mathrm{MSE} = E\left[\Big(V_0 - \sum_j \lambda_j V_j\Big)^2\right] = \mathrm{Var}\Big(V_0 - \sum_j \lambda_j V_j\Big) \stackrel{\text{(Why?)}}{=} \mathrm{Var}(V_0) - 2\sum_j \lambda_j \mathrm{Cov}(V_0, V_j) + \sum_i \sum_j \lambda_i \lambda_j \mathrm{Cov}(V_i, V_j).$$
That is,
$$\mathrm{MSE} = C(0,0) - 2\sum_j \lambda_j C(0,j) + \sum_i \sum_j \lambda_i \lambda_j C(i,j) \qquad (1)$$

where $C(i,j) = C_V(x_i - x_j)$ might be interpreted as the elements of a covariance matrix $C$. Note that the values $C(0,j)$ depend on the location $x_0$. To minimize the MSE, we take the derivatives with respect to $\lambda_k$ and equate them to 0:
$$\frac{\partial\,\mathrm{MSE}}{\partial \lambda_k} = -2 C(0,k) + 2 \lambda_k C(k,k) + 2 \sum_{j \neq k} \lambda_j C(k,j) = 0, \quad k = 1, \ldots, n.$$
That is, solve the system of equations$^1$
$$\sum_{j=1}^{n} \lambda_j C(k,j) = C(0,k), \quad k = 1, \ldots, n. \qquad (2)$$
In matrix form, the above equations look like
$$C\lambda = b, \quad \text{same as} \quad \lambda = C^{-1} b. \qquad (3)$$
If we denote the covariance matrix of the vector $(V_0, V_1, \ldots, V_n)$ as $\Sigma$, then
$$\Sigma = \begin{bmatrix} \sigma_0^2 & b' \\ b & C \end{bmatrix}$$
and
$$\mathrm{MSE} = \sigma^2_{SK} = C(0,0) - 2\lambda' b + \lambda' C \lambda = C(0,0) - \lambda' b.$$
Now compare equation (3) with the formula for the conditional mean in Lecture 6, p. 4. It turns out that the above minimization argument is directly related to the BLUE theory for the multivariate Normal! To avoid ambiguity, we will sometimes denote the optimal kriging weights $\lambda^{SK}$. From equations (1) and (2), the optimal kriging variance is
$$\sigma^2_{SK} = C(0,0) - \sum_j \lambda^{SK}_j C(0,j). \qquad (4)$$
Compare this to the conditional variance formula from Lecture 6.

What happens when you predict at one of the existing points $x_j$? It is clear that the best choice is to just take $V_j$, in which case the kriging variance is 0! If we use a covariance function such that $C(h) \to C(0) = \sigma^2$ as $h \to 0$, that is, if we are dealing with a continuous random field (no nugget!), then we should obtain
$$\lim_{x_0 \to x_j} \hat{V}(x_0) = V_j \quad \text{and} \quad \sigma^2_{SK} \to 0.$$

$^1$ Note that we really did not use the assumption of WSS stationarity, since we allowed $C(i,j)$ to vary with the location $x_i$. We don't have to assume a constant mean $m$, either, and can replace it by varying values $m_i$, as long as these are known.
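Equations (2)-(4) translate directly into a few lines of R. Below is a minimal sketch, not the lecture's code: the exponential covariance model, the locations, and the data are all invented for illustration.

# Simple kriging sketch implementing equations (2)-(4).
# Covariance model, locations, and data are illustrative only.
cov_fun <- function(h, sigma2 = 1, range = 3) sigma2 * exp(-abs(h) / range)

x  <- c(2, 3.5, 5, 6.5, 8)           # observation locations x_1, ..., x_n
V  <- c(0.3, 1.1, -0.2, 0.5, 0.9)    # observations (mean already subtracted)
x0 <- 4.2                            # prediction location

C <- cov_fun(outer(x, x, "-"))       # n x n matrix of C(i, j)
b <- cov_fun(x0 - x)                 # vector of C(0, j)

lambda    <- solve(C, b)             # kriging weights, equation (3)
V0_hat    <- sum(lambda * V)         # simple kriging prediction
sigma2_SK <- cov_fun(0) - sum(lambda * b)   # kriging variance, equation (4)

# Exactness with no nugget: predicting at x_1 recovers V_1 (up to rounding)
lambda_1 <- solve(C, cov_fun(x[1] - x))
c(sum(lambda_1 * V), V[1])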

A simple case: suppose that $C(i,j) = 0$ for $i \neq j$; only $C(0,j) \neq 0$, $j = 1, \ldots, n$. In this case, the kriging equations (2) are solved by
$$\lambda_j = \frac{C(0,j)}{C(j,j)} = \rho_{0,j}\, \frac{\sigma_0}{\sigma_j},$$
where $\rho_{0,j}$ is the correlation coefficient. We obtain the prediction
$$\frac{\hat{V}_0}{\sigma_0} = \sum_j \rho_{0,j}\, \frac{V_j}{\sigma_j}.$$
That is, the weights reflect the strength of correlation. This reminds us of simple linear regression. Also,
$$\sigma^2_{SK} = \sigma_0^2 \Big(1 - \sum_j \rho_{0,j}^2\Big),$$
where we recognize the quantity $\sum_j \rho_{0,j}^2$ as a coefficient of determination $R^2$! (Question: why is $\sum_j \rho_{0,j}^2 \leq 1$?)

Summary:
- Kriging is an exact interpolator (it preserves the observations) when there is no nugget effect.
- The kriging weights $\lambda_j$ depend only on the locations and the covariance function $C$, not on the data.
- The kriging variance also depends only on the locations and $C$.
- Kriging is a smooth predictor (if two points are close, then their kriging estimates are close).
- In the MVN case, SK is the optimal predictor in the mean square sense (follows from MVN theory).
- Computationally, we only need to find the matrix $C^{-1}$ once; then, as we vary the location, we only vary the vector $b$ in (3), as the sketch below illustrates.
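To make the last point concrete, here is a continuation of the earlier sketch: factor $C$ once (via a Cholesky factorization rather than an explicit inverse) and reuse the factorization for every grid location. Again a hedged illustration, not the lecture's actual code.

# Reuses cov_fun, x, V, C from the previous sketch.
# Factor C once; only the vector b changes with the prediction location.
R      <- chol(C)                    # C = R'R, computed once
x_grid <- seq(1, 9, by = 0.1)

pred <- sapply(x_grid, function(x0) {
  b      <- cov_fun(x0 - x)
  lambda <- backsolve(R, forwardsolve(t(R), b))  # solves C lambda = b
  c(mean = sum(lambda * V),                      # kriging mean
    var  = cov_fun(0) - sum(lambda * b))         # kriging variance
})

plot(x_grid, pred["mean", ], type = "l", xlab = "X", ylab = "kriging mean")
points(x, V, pch = 16)               # the curve passes through the data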

Example: Northern NM precipitation, January log-model residuals

[Figure: kriging mean and kriging st. dev. maps for the January log-model residuals.]

The covariance function used in kriging is $C(h) = 0.1144\, e^{-h/10.4451}$, with a practical range of about 30. One look at the kriging st. dev. map tells us that we cannot reliably predict the precipitation (at least for January) in between the stations.
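As a quick check of the quoted practical range: for an exponential covariance, the practical range is conventionally the distance at which the correlation drops to about 5%, i.e. three times the range parameter, since $e^{-3} \approx 0.05$. Here that gives $3 \times 10.4451 \approx 31.3$, consistent with "about 30".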

Ordinary kriging (OK)

The next issue will be estimating the mean $m$, which we assumed known in simple kriging.

Option 1: first, estimate the mean from the observations, then apply SK.
Option 2: do both at the same time (ordinary kriging).

Option 1: We could try to estimate $m$ by simply averaging the observations, but this does not take into account the correlations between the $V_j$. For example, if two locations are close together, and there is spatial continuity, then we have effectively one observation instead of two. This will require a generalized least squares approach, discussed later.

Option 2: To keep the estimate unbiased, we need
$$m = E[\hat{V}_0] = \sum_j \lambda_j E[V_j] = \sum_j \lambda_j m.$$
Thus, we need $\sum_j \lambda_j = 1$.

The MSE is still expressed by (1), and we need to minimize it. We will apply the Lagrange multiplier method of constrained optimization: minimize the function
$$F(\lambda, \mu) = \mathrm{MSE} - 2\mu \Big( \sum_j \lambda_j - 1 \Big),$$
where $\mu$ is the Lagrange multiplier. Taking partials with respect to $\lambda_k$ and $\mu$ and equating them to 0, we get the system of equations
$$\begin{cases} \sum_j \lambda^{OK}_j C(k,j) - \mu = C(0,k), & k = 1, \ldots, n \\ \sum_j \lambda^{OK}_j = 1 \end{cases}$$
In matrix form:
$$\begin{bmatrix} C(1,1) & \cdots & C(1,n) & 1 \\ C(2,1) & \cdots & C(2,n) & 1 \\ \vdots & & \vdots & \vdots \\ C(n,1) & \cdots & C(n,n) & 1 \\ 1 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \\ -\mu \end{bmatrix} = \begin{bmatrix} C(0,1) \\ C(0,2) \\ \vdots \\ C(0,n) \\ 1 \end{bmatrix}$$
The kriging variance is, similarly to (4),
$$\sigma^2_{OK} = C(0,0) - \sum_j \lambda^{OK}_j C(0,j) + \mu.$$
Overall, $\sigma^2_{OK}$ is slightly higher than $\sigma^2_{SK}$ because of the higher uncertainty associated with estimating $m$.
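The augmented system is solved just like the SK system. A minimal sketch in the same illustrative setting, reusing cov_fun, x, V, and C from the SK sketch above:

# Ordinary kriging at x0: augment C with the unbiasedness constraint.
n   <- length(x)
A   <- rbind(cbind(C, 1), c(rep(1, n), 0))   # (n+1) x (n+1) OK matrix
x0  <- 4.2
b   <- cov_fun(x0 - x)
sol <- solve(A, c(b, 1))

lambda_OK <- sol[1:n]
mu        <- -sol[n + 1]                     # last unknown is -mu
V0_hat_OK <- sum(lambda_OK * V)
sigma2_OK <- cov_fun(0) - sum(lambda_OK * b) + mu

sum(lambda_OK)                               # check: weights sum to 1
sigma2_OK                                    # slightly above sigma2_SK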

Example continued

Let's zoom in on a portion of the NM precipitation map: the locations marked are the stations close to the point $(x_0, y_0) = (470, 4010)$.

[Figure: zoomed map showing the nearby stations and the prediction point.]

Comparing the kriging st. dev. at that point: $\sigma_{SK} = 0.2778$, $\sigma_{OK} = 0.2779$, and the maximum possible is $\sigma = \sqrt{C(0,0)} = 0.3382$. The kriging weights $\lambda_j$ were sorted according to their absolute values, and the first 10 are:

       lambda_sk  lambda_ok  point no.
 [1,]    0.3551     0.3569          8
 [2,]    0.3443     0.3455         14
 [3,]    0.0527     0.0544         64
 [4,]    0.0283     0.0306         45
 [5,]    0.0279     0.0295         77
 [6,]    0.0117     0.0135         15
 [7,]    0.0103     0.0126         63
 [8,]    0.0090     0.0109          5
 [9,]   -0.0036    -0.0020         25
[10,]   -0.0025    -0.0005         49

Generally, closer points get higher weights. Notice, though, that station 15 is masked (screened) by the presence of station 14, and some stations get negative weights!
