Basics of Point-Referenced Data Models

Chapter 2: Basics of Point-Referenced Data Models

Basic tool is a spatial process, $\{Y(s),\ s \in D\}$, where $D \subset \mathbb{R}^r$.

Note that time series follows this approach with $r = 1$; we will usually have $r = 2$ or $3$.

We begin with essentials of point-level data modeling, including stationarity, isotropy, and variograms: key elements of the "Matheron school".

Then we add the spatial (typically Gaussian) process modeling that enables likelihood (and Bayesian) inference in these settings.

[Figure: Scallops catch sites, NY/NJ coast, USA]

[Figure: Scallops sites with log-catch contours]

[Figure: Image plot with log-catch contours; axes: Longitude, Latitude]

Stationarity

Suppose our spatial process has a mean, $\mu(s) = E(Y(s))$, and that the variance of $Y(s)$ exists for all $s \in D$.

The process is said to be strictly stationary (also called strongly stationary) if, for any given $n \geq 1$, any set of $n$ sites $\{s_1, \ldots, s_n\}$ and any $h \in \mathbb{R}^r$, the distribution of $(Y(s_1), \ldots, Y(s_n))$ is the same as that of $(Y(s_1 + h), \ldots, Y(s_n + h))$.

A less restrictive condition is given by weak stationarity (also called second-order stationarity): a process is weakly stationary if $\mu(s) \equiv \mu$ and $\mathrm{Cov}(Y(s), Y(s + h)) = C(h)$ for all $h \in \mathbb{R}^r$ such that $s$ and $s + h$ both lie within $D$.

Notes on Stationarity

Weak stationarity implies that the covariance relationship between the values of the process at any two locations can be summarized by a covariance function $C(h)$ (sometimes called a covariogram), and this function depends only on the separation vector $h$.

Note that with all variances assumed to exist, strong stationarity implies weak stationarity.

The converse is not true in general, but it does hold for Gaussian processes.

Variograms

Suppose we assume $E[Y(s + h) - Y(s)] = 0$ and define

$$E[Y(s + h) - Y(s)]^2 = \mathrm{Var}(Y(s + h) - Y(s)) = 2\gamma(h).$$

This expression makes sense when the left-hand side depends only on $h$ (so that the right-hand side can be written at all), and not the particular choice of $s$. If this is the case, we say the process is intrinsically stationary.

The function $2\gamma(h)$ is then called the variogram, and $\gamma(h)$ is called the semivariogram.

Note that intrinsic stationarity requires only the first and second moments of the differences $Y(s + h) - Y(s)$. It says nothing about the joint distribution of a collection of variables $Y(s_1), \ldots, Y(s_n)$, and thus provides no likelihood.

Relationship between $C(h)$ and $\gamma(h)$

$$\begin{aligned}
2\gamma(h) &= \mathrm{Var}(Y(s + h) - Y(s)) \\
&= \mathrm{Var}(Y(s + h)) + \mathrm{Var}(Y(s)) - 2\,\mathrm{Cov}(Y(s + h), Y(s)) \\
&= C(0) + C(0) - 2C(h) \\
&= 2\,[C(0) - C(h)].
\end{aligned}$$

Thus, $\gamma(h) = C(0) - C(h)$.

So given $C$, we are able to determine $\gamma$. But what about the converse: in general, can we recover $C$ from $\gamma$?

Relationship between $C(h)$ and $\gamma(h)$ (cont'd)

In $\gamma(h) = C(0) - C(h)$, we can add or subtract a constant on the right side, so $C(h)$ is not identified.

If the spatial process is ergodic, then $C(h) \to 0$ as $\|h\| \to \infty$, where $\|h\|$ denotes the length of the $h$ vector.

Taking the limit of both sides of $\gamma(h) = C(0) - C(h)$ as $\|h\| \to \infty$, we then have that $\lim_{\|h\| \to \infty} \gamma(h) = C(0)$.

Thus, using the dummy variable $u$ to avoid confusion,

$$C(h) = C(0) - \gamma(h) = \lim_{\|u\| \to \infty} \gamma(u) - \gamma(h).$$

So if $\lim_{\|h\| \to \infty} \gamma(h)$ exists, $C(h)$ is well defined, and in that case intrinsic stationarity implies weak (second-order) stationarity.

In summary, weak stationarity always implies intrinsic stationarity, but the converse is not true in general.
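
As a rough numerical illustration of recovering $C$ from $\gamma$, the sketch below applies $C(h) = \lim_{\|u\| \to \infty} \gamma(u) - \gamma(h)$ to an exponential semivariogram. The function names, the parameter values, and the use of a large finite distance `far` in place of the limit are all assumptions made for illustration, not part of the slides.

```python
import math

def gamma_exponential(t, tau2=0.2, sigma2=1.0, phi=2.0):
    """Exponential semivariogram: 0 at t = 0, tau2 + sigma2*(1 - exp(-phi*t)) for t > 0."""
    if t <= 0:
        return 0.0
    return tau2 + sigma2 * (1.0 - math.exp(-phi * t))

def covariance_from_gamma(gamma, t, far=1e6):
    """Approximate C(t) = lim_{u -> inf} gamma(u) - gamma(t) by evaluating gamma at u = far."""
    return gamma(far) - gamma(t)

# C(0) should equal the sill tau2 + sigma2; C(1) should equal sigma2 * exp(-phi).
c_at_zero = covariance_from_gamma(gamma_exponential, 0.0)
c_at_one = covariance_from_gamma(gamma_exponential, 1.0)
```

The same recipe fails for a semivariogram with no finite limit (such as the linear model), which is exactly the case where $C$ does not exist.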

Isotropy

If the semivariogram $\gamma(h)$ depends upon the separation vector only through its length $\|h\|$, then we say that the process is isotropic; if not, we say it is anisotropic.

For an isotropic process, $\gamma(h)$ is a real-valued function of a univariate argument, and can be written as $\gamma(\|h\|)$.

If the process is intrinsically stationary and isotropic, it is also called homogeneous.

Isotropic processes are popular because of their simplicity and interpretability, and because a number of relatively simple parametric forms are available as candidates for $C$ (and $\gamma$). Denoting $\|h\|$ by $t$ for notational simplicity, the next two tables provide a few examples.

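Some common isotropic covariograms

Model and covariance function $C(t)$:

Linear: $C(t)$ does not exist.

Spherical:
$$C(t) = \begin{cases} 0 & \text{if } t \geq 1/\phi \\ \sigma^2\left[1 - \tfrac{3}{2}\phi t + \tfrac{1}{2}(\phi t)^3\right] & \text{if } 0 < t \leq 1/\phi \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$$

Exponential:
$$C(t) = \begin{cases} \sigma^2 \exp(-\phi t) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$$

Powered exponential:
$$C(t) = \begin{cases} \sigma^2 \exp(-|\phi t|^p) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$$

Matérn at $\nu = 3/2$:
$$C(t) = \begin{cases} \sigma^2 (1 + \phi t) \exp(-\phi t) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$$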

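Some common isotropic variograms

Model and variogram $\gamma(t)$:

Linear:
$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2 t & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$$

Spherical:
$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2 & \text{if } t \geq 1/\phi \\ \tau^2 + \sigma^2\left[\tfrac{3}{2}\phi t - \tfrac{1}{2}(\phi t)^3\right] & \text{if } 0 < t \leq 1/\phi \\ 0 & \text{otherwise} \end{cases}$$

Exponential:
$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2 (1 - \exp(-\phi t)) & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$$

Powered exponential:
$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2 (1 - \exp(-|\phi t|^p)) & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$$

Matérn at $\nu = 3/2$:
$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2\left[1 - (1 + \phi t) e^{-\phi t}\right] & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$$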

Example: Spherical semivariogram

$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2 & \text{if } t \geq 1/\phi \\ \tau^2 + \sigma^2\left[\tfrac{3}{2}\phi t - \tfrac{1}{2}(\phi t)^3\right] & \text{if } 0 < t \leq 1/\phi \\ 0 & \text{otherwise} \end{cases}$$

While $\gamma(0) = 0$ by definition, $\gamma(0^+) \equiv \lim_{t \to 0^+} \gamma(t) = \tau^2$; this quantity is called the nugget.

$\lim_{t \to \infty} \gamma(t) = \tau^2 + \sigma^2$; this asymptotic value of the semivariogram is called the sill. (The sill minus the nugget, $\sigma^2$ in this case, is called the partial sill.)

Finally, the value $t = 1/\phi$ at which $\gamma(t)$ first reaches its ultimate level (the sill) is called the range, $R \equiv 1/\phi$. (Both $R$ and $\phi$ are sometimes referred to as the "range," but $\phi$ is more naturally called the decay parameter.)
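
To make the nugget, sill, and range concrete, here is a minimal sketch that evaluates the spherical semivariogram above and reads the three quantities off numerically. The parameter values ($\tau^2 = 0.2$, $\sigma^2 = 1$, $\phi = 1$) and all names are illustrative assumptions.

```python
def gamma_spherical(t, tau2, sigma2, phi):
    """Spherical semivariogram with nugget tau2, partial sill sigma2, decay phi."""
    if t <= 0:
        return 0.0                      # gamma(0) = 0 by definition
    if t >= 1.0 / phi:
        return tau2 + sigma2            # at or beyond the range: the sill
    x = phi * t
    return tau2 + sigma2 * (1.5 * x - 0.5 * x ** 3)

tau2, sigma2, phi = 0.2, 1.0, 1.0       # illustrative values only

nugget = gamma_spherical(1e-12, tau2, sigma2, phi)      # approximates gamma(0+)
sill = gamma_spherical(10.0 / phi, tau2, sigma2, phi)   # any t beyond the range
range_r = 1.0 / phi                                     # R = 1/phi
partial_sill = sill - nugget
```

Evaluating just below `range_r` gives a value arbitrarily close to `sill`, which is the sense in which the spherical model reaches its sill at a finite distance.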

3 common semivariogram models

[Figure: three semivariogram panels. Panel titles: Linear (tau2 = 0.2); Spherical (tau2 = 0.2, sig2 = 1); Expo (tau2 = 0.2, sig2 = 1, phi = 2)]

Note that for the linear model (left panel), $\gamma(t) \to \infty$ as $t \to \infty$, and so this semivariogram does not correspond to a weakly stationary process (although it is intrinsically stationary).

The nugget is $\tau^2$, but the sill and range are both infinite.

Notes on the exponential model

The sill is only reached asymptotically, meaning that, strictly speaking, the range $R = 1/\phi$ is infinite.

To define the notion of an "effective range", note that for $t > 0$,

$$C(t) = \lim_{u \to \infty} \gamma(u) - \gamma(t) = \tau^2 + \sigma^2 - \left[\tau^2 + \sigma^2 (1 - \exp(-\phi t))\right] = \sigma^2 \exp(-\phi t).$$

However, with $\gamma(h) = C(0) - C(h)$, since $\gamma(0^+) = \tau^2$, by convention we (and other textbook authors) set $C(0) = \tau^2 + \sigma^2$. Note that $C(0^+) = \sigma^2$, so $C$ is discontinuous at the origin whenever $\tau^2 > 0$.

Notes on the exponential model (cont'd)

Thus we have

$$C(t) = \begin{cases} \tau^2 + \sigma^2 & \text{if } t = 0 \\ \sigma^2 \exp(-\phi t) & \text{if } t > 0. \end{cases}$$

Then the correlation between two points distance $t$ apart is $\exp(-\phi t)$; note that $\exp(-\phi t) = 1^-$ for $t = 0^+$ and $\exp(-\phi t) = 0$ for $t = \infty$, as expected.

We define the effective range, $t_0$, as the distance at which this correlation has dropped to only 0.05. Setting $\exp(-\phi t_0)$ equal to this value we obtain $t_0 \approx 3/\phi$, since $\log(0.05) \approx -3$.

Finally, the form of $C(t)$ shows why the nugget $\tau^2$ is often viewed as a nonspatial effect variance, and the partial sill ($\sigma^2$) is viewed as a spatial effect variance.
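
The effective-range calculation can be checked directly. The short sketch below solves $\exp(-\phi t_0) = 0.05$ exactly and compares it with the $3/\phi$ rule of thumb; the function name and the value $\phi = 2$ are assumptions chosen only for illustration.

```python
import math

def effective_range(phi, cutoff=0.05):
    """Distance t0 at which the exponential correlation exp(-phi * t) falls to `cutoff`."""
    return -math.log(cutoff) / phi

phi = 2.0                                # illustrative decay parameter
t0_exact = effective_range(phi)          # -log(0.05) / phi
t0_rule_of_thumb = 3.0 / phi             # the approximation quoted above
correlation_at_t0 = math.exp(-phi * t0_exact)
```

The two values differ only in the third decimal place here, because $-\log(0.05)$ is slightly less than 3.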

The Matérn Correlation Function

The Matérn is a very versatile family:

$$C(t) = \begin{cases} \dfrac{\sigma^2}{2^{\nu - 1}\Gamma(\nu)} \left(2\sqrt{\nu}\, t\phi\right)^{\nu} K_{\nu}\!\left(2\sqrt{\nu}\, t\phi\right) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{if } t = 0 \end{cases}$$

$K_\nu$ is the modified Bessel function of order $\nu$ (computationally tractable in C/C++ or geoR).

$\nu$ is a smoothness parameter:
- $\nu = 1/2$ gives the exponential; $\nu \to \infty$ gives the Gaussian; $\nu = 3/2$ gives a convenient closed form for $C(t)$ and $\gamma(t)$.
- In two dimensions, the greatest integer in $\nu$ indicates the number of times process realizations will be mean-square differentiable.
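
As a sanity check on the special cases, the sketch below evaluates the Matérn formula above for $\nu = 1/2$ and $\nu = 3/2$ only, using the standard closed forms $K_{1/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}$ and $K_{3/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}(1 + 1/x)$ rather than a general Bessel routine. Function names and parameter values are illustrative assumptions.

```python
import math

def bessel_k_half_integer(nu, x):
    """K_nu(x) for nu = 1/2 or 3/2 only, via the standard closed forms."""
    base = math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)
    if nu == 0.5:
        return base
    if nu == 1.5:
        return base * (1.0 + 1.0 / x)
    raise ValueError("this sketch only supports nu = 0.5 or nu = 1.5")

def matern_cov(t, sigma2, phi, nu, tau2=0.0):
    """Matern covariance in the parametrization with argument x = 2*sqrt(nu)*t*phi."""
    if t == 0:
        return tau2 + sigma2
    x = 2.0 * math.sqrt(nu) * t * phi
    scale = sigma2 / (2.0 ** (nu - 1.0) * math.gamma(nu))
    return scale * x ** nu * bessel_k_half_integer(nu, x)

t, sigma2, phi = 0.7, 1.0, 2.0           # illustrative values only
x_half = 2.0 * math.sqrt(0.5) * t * phi
x_three_halves = 2.0 * math.sqrt(1.5) * t * phi

exponential_form = sigma2 * math.exp(-x_half)                                # nu = 1/2
closed_form_32 = sigma2 * (1.0 + x_three_halves) * math.exp(-x_three_halves)  # nu = 3/2
```

In this parametrization the closed forms are functions of $x = 2\sqrt{\nu}\,t\phi$; a table entry written in terms of $\phi t$ alone has the same shape with a rescaled decay parameter. For general $\nu$ a numerical Bessel routine would be needed.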

Some covariance function theory

To be a valid covariance function, the function must be positive definite.

Whether a function is positive definite or not can depend upon dimension.

Bochner's Theorem: $C$ is a valid covariance function if and only if it is the characteristic function of a random variable that is symmetric about 0, i.e.,

$$C(h) = \int \cos(w^T h)\, G(dw).$$

Related notions: Fourier transform, spectral distribution, spectral density.

The inversion formula could be used to check if $C(h)$ is valid.

Constructing valid covariance functions

One can construct valid covariance functions by using properties of characteristic functions:
- multiplying valid covariance functions (corresponds to summing independent random variables)
- mixing covariance functions (corresponds to mixing distributions)
- convolving covariance functions (if $C_1$ and $C_2$ are valid then $C_{12}(s) = \int C_1(s - u)\, C_2(u)\, du$ is valid).
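
A numerical spot check of the product rule can be sketched as follows: build the covariance matrix implied by the product of an exponential and a Matérn ($\nu = 3/2$) correlation at a handful of sites, and test positive definiteness by attempting a Cholesky factorization. The sites, decay parameters, and function names are hypothetical, and a check at one set of sites illustrates validity but does not prove it.

```python
import math

def cholesky_succeeds(a, tol=1e-12):
    """Return True if the symmetric matrix `a` has a Cholesky factor with all pivots > tol."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False         # non-positive pivot: not positive definite
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True

def rho_exponential(d, phi=1.0):
    return math.exp(-phi * d)

def rho_matern32(d, phi=1.0):
    return (1.0 + phi * d) * math.exp(-phi * d)

sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0), (3.0, 1.0)]  # hypothetical sites

product_matrix = [
    [rho_exponential(math.dist(a, b)) * rho_matern32(math.dist(a, b)) for b in sites]
    for a in sites
]

product_is_pd = cholesky_succeeds(product_matrix)
not_pd_example = cholesky_succeeds([[1.0, 2.0], [2.0, 1.0]])  # not a valid covariance matrix
```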

Variogram model fitting

How does one choose a good parametric variogram model?

Historically, one plots the empirical semivariogram,

$$\hat{\gamma}(t) = \frac{1}{2|N(t)|} \sum_{(s_i, s_j) \in N(t)} [Y(s_i) - Y(s_j)]^2,$$

where $N(t)$ is the set of pairs such that $\|s_i - s_j\| = t$, and $|N(t)|$ is the number of pairs in this set.

Usually we need to grid up the $t$-space into intervals $I_1 = (0, t_1), \ldots, I_K = (t_{K-1}, t_K)$ for $0 < t_1 < \cdots < t_K$. Represent each interval by its midpoint, and redefine

$$N(t_k) = \{(s_i, s_j) : \|s_i - s_j\| \in I_k\}, \quad k = 1, \ldots, K.$$
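
The binned estimator can be sketched in a few lines. The version below is a minimal illustration under stated assumptions: each unordered pair is counted once, bins are treated as half-open intervals $(t_{k-1}, t_k]$ so that a pair falling exactly on an edge is not dropped, and the function name and toy data are hypothetical.

```python
import math

def empirical_semivariogram(coords, values, bin_edges):
    """Binned Matheron estimator.

    Returns a list of (bin midpoint, gamma_hat, number of pairs); gamma_hat is
    None for a bin that contains no pairs.
    """
    k = len(bin_edges) - 1
    sums = [0.0] * k
    counts = [0] * k
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            for b in range(k):
                if bin_edges[b] < d <= bin_edges[b + 1]:
                    sums[b] += (values[i] - values[j]) ** 2
                    counts[b] += 1
                    break
    result = []
    for b in range(k):
        midpoint = 0.5 * (bin_edges[b] + bin_edges[b + 1])
        gamma_hat = sums[b] / (2.0 * counts[b]) if counts[b] > 0 else None
        result.append((midpoint, gamma_hat, counts[b]))
    return result

# Toy data: three sites on a line with observed values 1, 3, 6.
toy_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
toy_values = [1.0, 3.0, 6.0]
toy_bins = empirical_semivariogram(toy_coords, toy_values, [0.0, 1.5, 2.5])
```

For the toy data the first bin holds the two pairs at distance 1 (squared differences 4 and 9) and the second holds the single pair at distance 2 (squared difference 25). The double loop is quadratic in the number of sites, which is fine for a sketch but not for large data sets.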

[Figure: Empirical variogram, scallops data; axes: distance, Gamma(d)]

Variogram model fitting (cont'd)

This method of moments estimate (analogue of the usual sample variance $s^2$) may not be so great:
- It will be sensitive to outliers.
- A sample average of squared differences can be badly behaved.
- It uses data differences, rather than the data itself.
- The components of the sum will be dependent within and across bins, and $|N(t_k)|$ will vary across bins.

Informally, one plots $\hat{\gamma}(t)$, and then an appropriately shaped theoretical variogram is fit by eye or by trial and error to choose the nugget, sill, and range.

Traditional but more formal fitting uses least squares, weighted least squares, or generalized least squares.

For us, a (likelihood or Bayesian) estimation problem!
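
For the weighted least squares option, a deliberately crude sketch is given below: it fits an exponential semivariogram to binned values by exhaustive search over a small parameter grid, weighting each bin by its pair count. The choice of weights, the grid, the function names, and the synthetic bin values are all assumptions for illustration; practical fits would use a numerical optimizer and possibly different weights.

```python
import math
from itertools import product

def gamma_exp(t, tau2, sigma2, phi):
    """Exponential semivariogram for t > 0."""
    return tau2 + sigma2 * (1.0 - math.exp(-phi * t))

def wls_grid_fit(bins, grid):
    """Weighted least squares over a parameter grid.

    bins: list of (t_k, gamma_hat_k, n_k); each squared residual is weighted by n_k.
    grid: dict with candidate lists under the keys 'tau2', 'sigma2', 'phi'.
    Returns ((tau2, sigma2, phi), loss) for the grid point with the smallest loss.
    """
    best_params, best_loss = None, float("inf")
    for tau2, sigma2, phi in product(grid["tau2"], grid["sigma2"], grid["phi"]):
        loss = sum(n * (g - gamma_exp(t, tau2, sigma2, phi)) ** 2 for t, g, n in bins)
        if loss < best_loss:
            best_params, best_loss = (tau2, sigma2, phi), loss
    return best_params, best_loss

# Synthetic binned values generated from known parameters (0.2, 1.0, 2.0).
demo_bins = [(t, gamma_exp(t, 0.2, 1.0, 2.0), n)
             for t, n in [(0.25, 10), (0.5, 20), (1.0, 30), (2.0, 15)]]
demo_grid = {"tau2": [0.0, 0.1, 0.2, 0.3], "sigma2": [0.5, 1.0, 1.5], "phi": [1.0, 2.0, 3.0]}
fitted_params, fitted_loss = wls_grid_fit(demo_bins, demo_grid)
```

Because the synthetic values are noise-free and the true parameters lie on the grid, the search recovers them with zero loss; with real binned estimates the criterion inherits all the drawbacks listed above.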

Anisotropy

Here association depends upon the separation vector between locations (i.e., direction and distance).

Example: geometric anisotropy, where

$$c(s - s') = \sigma^2 \rho\left((s - s')^T B (s - s')\right).$$

$B$ is positive definite, with $\rho$ a valid correlation function in $\mathbb{R}^r$ (e.g., exponential).

We omit the range/decay parameter $\phi$ since we now use $B$; when $r = 2$ this means three "range parameters" instead of one.

Contours of constant association arising from $c$ above are elliptical. In particular, the contour corresponding to $\rho = 0.05$ provides the effective range in each spatial direction.
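
A small sketch of geometric anisotropy in $\mathbb{R}^2$ follows. It assumes an exponential-type correlation applied to the quadratic form, $\rho(q) = \exp(-\sqrt{q})$, which is one common reading of the formula above rather than something the slides spell out; the matrix $B$ and all names are hypothetical.

```python
import math

def quadratic_form(h, B):
    """h^T B h for a 2-vector h and a 2x2 matrix B."""
    return (h[0] * (B[0][0] * h[0] + B[0][1] * h[1])
            + h[1] * (B[1][0] * h[0] + B[1][1] * h[1]))

def geometric_aniso_cov(s, s_prime, B, sigma2=1.0):
    """c(s - s') = sigma2 * rho((s - s')^T B (s - s')), with rho(q) = exp(-sqrt(q)) assumed."""
    h = (s[0] - s_prime[0], s[1] - s_prime[1])
    return sigma2 * math.exp(-math.sqrt(quadratic_form(h, B)))

def directional_effective_range(B, angle_deg, cutoff=0.05):
    """Distance along direction angle_deg at which the assumed correlation drops to cutoff."""
    u = (math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg)))
    return -math.log(cutoff) / math.sqrt(quadratic_form(u, B))

B = [[4.0, 0.0], [0.0, 1.0]]             # hypothetical positive definite matrix

cov_east = geometric_aniso_cov((0.0, 0.0), (1.0, 0.0), B)    # exp(-2)
cov_north = geometric_aniso_cov((0.0, 0.0), (0.0, 1.0), B)   # exp(-1)
range_east = directional_effective_range(B, 0.0)
range_north = directional_effective_range(B, 90.0)
```

With this $B$, correlation decays twice as fast in the east-west direction, so the 0.05 contour is an ellipse whose north-south axis is twice its east-west axis.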

Anisotropy (cont'd)

We can extend geometric anisotropy to product geometric anisotropy, e.g.

$$c(s - s') = \sigma^2 \rho_1\left((s - s')^T B_1 (s - s')\right) \rho_2\left((s - s')^T B_2 (s - s')\right),$$

a valid covariance function provided $\rho_1$ and $\rho_2$ are valid.

Both geometric anisotropy and product geometric anisotropy are special cases of range anisotropy (Zimmerman, 1993), which suggests we might also define:
- sill anisotropy: given a variogram $\gamma(h)$, $\lim_{c \to \infty} \gamma(ch/\|h\|)$ depends upon $h$;
- nugget anisotropy: given a variogram $\gamma(h)$, $\lim_{c \to 0} \gamma(ch/\|h\|)$ depends upon $h$.

Exploration of Spatial Data

First step in analyzing data.

"First Law of Geostatistics": Data = Mean + Error
- Mean: first-order behavior
- Error: second-order behavior (covariance function)

EDA tools examine both first- and second-order behavior.

Preliminary displays: from simple location maps to surface, image, and contour plots.

[Figure: First Law of Geostatistics, shown as data = mean + error]

[Figure: Arrangement of plots; axes: E-W UTM coordinates, N-S UTM coordinates]

[Figure: Drop-line scatter plot]

[Figure: Surface plot; axes: Longitude, Latitude, logsp]

[Figure: Image-contour plot; axes: Longitude, Latitude]

EDA for assessing anisotropy

Directional semivariogram: choose angle classes $\eta_i \pm \epsilon$, $i = 1, \ldots, L$, where $\epsilon$ is the halfwidth of the angle class and $L$ is the number of angle classes.

Example: take $\epsilon = 22.5^\circ$ and $L = 4$, so we obtain the four cardinal directions measured counterclockwise from the x-axis: $0^\circ$, $45^\circ$, $90^\circ$, and $135^\circ$.

For a given angle class, the Matheron empirical semivariogram provides a directional semivariogram for angle $\eta_i$.

Theoretically, all types of anisotropy can be assessed from these directional semivariograms; however, in practice determining whether the sill, nugget, and/or range varies with direction can be difficult.
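
The first step in computing a directional semivariogram is assigning each pair of sites to an angle class. A minimal sketch of that step is shown below, assuming directions are folded into $[0^\circ, 180^\circ)$ (a pair has no orientation) and that a pair falling exactly on a class boundary goes to the first matching class; the function name and these tie-breaking choices are assumptions.

```python
import math

def angle_class(s_i, s_j, centers=(0.0, 45.0, 90.0, 135.0), halfwidth=22.5):
    """Return the class center (degrees, counterclockwise from the x-axis) for the pair."""
    dx, dy = s_j[0] - s_i[0], s_j[1] - s_i[1]
    theta = math.degrees(math.atan2(dy, dx)) % 180.0   # fold direction into [0, 180)
    for center in centers:
        diff = abs(theta - center)
        diff = min(diff, 180.0 - diff)                 # wrap-around distance between angles
        if diff <= halfwidth:
            return center
    return None                                        # only if the classes leave a gap

origin = (0.0, 0.0)
class_east = angle_class(origin, (1.0, 0.0))
class_northeast = angle_class(origin, (1.0, 1.0))
class_north = angle_class(origin, (0.0, 1.0))
class_northwest = angle_class(origin, (-1.0, 1.0))
class_west = angle_class(origin, (-1.0, 0.0))          # same class as east after folding
class_southeast = angle_class(origin, (1.0, -1.0))     # same class as northwest
```

Once pairs are labeled this way, the binned Matheron estimator is applied separately within each class.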

EDA for assessing anisotropy

Rose diagram: created from the directional semivariograms to evaluate geometric anisotropy.

At an arbitrarily selected semivariogram value $\gamma$, and for a directional semivariogram at angle $\eta$, the distance $d$ at which that directional semivariogram attains $\gamma$ can be interpolated.

The rose diagram is a plot of angle $\eta$ and corresponding distance $d$ in polar coordinates.

If an elliptical contour describes the extremities of the rose diagram reasonably well, then the process exhibits geometric anisotropy; a circular contour suggests isotropy.
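As a rough sketch of the rose-diagram construction, the snippet below interpolates the distance at which one directional semivariogram first reaches a chosen level and converts (angle, distance) pairs to Cartesian points for plotting. The names `distance_at_level` and `rose_points` are hypothetical, and linear interpolation between bin centres is an assumption; the slides do not say which interpolation is used.

```python
import numpy as np

def distance_at_level(dist, gamma, level):
    """First distance at which the semivariogram reaches `level`,
    by linear interpolation; NaN if the level is never reached."""
    dist = np.asarray(dist, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    if gamma[0] >= level:
        return float(dist[0])
    for i in range(1, len(dist)):
        g0, g1 = gamma[i - 1], gamma[i]
        if g0 < level <= g1:
            t = (level - g0) / (g1 - g0)
            return float(dist[i - 1] + t * (dist[i] - dist[i - 1]))
    return float("nan")

def rose_points(angles_deg, distances):
    """Polar (angle, distance) pairs as Cartesian (x, y) points."""
    a = np.radians(np.asarray(angles_deg, dtype=float))
    d = np.asarray(distances, dtype=float)
    return np.column_stack([d * np.cos(a), d * np.sin(a)])
```

Applying `distance_at_level` to each of the $L$ directional semivariograms and passing the results to `rose_points` gives the points whose outline is then judged against an ellipse or a circle.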

EDA for assessing anisotropy

Panel A below shows directional semivariograms for the 1990 scallop data (the semivariogram points are connected only to aid comparison). Possible conclusions:

the variability in the 45° direction (parallel to the coastline) is significantly less than in the other three directions;

the variability perpendicular to the coastline (135°) is very erratic, possibly exhibiting sill anisotropy.

Panel B below shows the rose diagram for the 1990 scallop data, using the $\gamma$ contour of 4.5. It is approximately elliptical and oriented parallel to the coastline (45°), which suggests possible geometric anisotropy.

EDA for assessing anisotropy

[Figure: panel A, directional semivariograms (axes Distance (km) and Gamma (dist); legend Degrees); panel B, rose diagram (axes h, E-W and h, N-S)]

Warning: it is dangerous to read too much significance and interpretation into directional variograms. No sample sizes (and thus no assessments of variability) are attached to these pictures. Directional variograms from truly isotropic data often exhibit differences of the magnitudes seen above.

Empirical semivariogram contour plots

Also called contour maps of the grouped variogram values, or isarithmic plots of the semivariogram.

For each of the $\binom{N}{2}$ pairs of sites in $\mathbb{R}^2$, calculate $h_x$ and $h_y$, the separation distances along each axis.

Since the sign of $h_y$ depends upon the order of the sites, insist that $h_y \ge 0$. (Take $(-h_x, -h_y)$ when $h_y < 0$.)

Aggregate these pairs into rectangular bins $B_{ij}$ and compute the empirical semivariogram for $B_{ij}$ as
$$\gamma_{ij} = \frac{1}{2 N_{B_{ij}}} \sum_{\{(k,l) : (s_k - s_l) \in B_{ij}\}} \left( Y(s_k) - Y(s_l) \right)^2,$$
where $N_{B_{ij}}$ is the number of site pairs in bin $B_{ij}$.
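The binning step can be sketched with `numpy.histogram2d`, accumulating squared differences and pair counts on the same rectangular grid. The function name `esc_bins` and the bin edges are hypothetical choices for illustration; smoothing and contouring of the resulting surface are left out.

```python
import numpy as np

def esc_bins(coords, y, x_edges, y_edges):
    """Empirical semivariogram on rectangular bins of (h_x, h_y),
    with every separation vector oriented so that h_y >= 0."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    k, l = np.triu_indices(len(y), k=1)        # all N-choose-2 pairs
    h = coords[k] - coords[l]
    flip = h[:, 1] < 0
    h[flip] *= -1.0                            # use (-h_x, -h_y) when h_y < 0
    sqdiff = (y[k] - y[l]) ** 2
    sums, _, _ = np.histogram2d(h[:, 0], h[:, 1],
                                bins=[x_edges, y_edges], weights=sqdiff)
    counts, _, _ = np.histogram2d(h[:, 0], h[:, 1], bins=[x_edges, y_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        gamma = np.where(counts > 0, sums / (2.0 * counts), np.nan)
    return gamma, counts
```

The returned `counts` array is what would be consulted when deciding whether sparse extreme bins should be trimmed before smoothing.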

Empirical semivariogram contour plots

Labeling the center of the $(i,j)$th bin by $(x_i, y_j)$, a three-dimensional plot of $\gamma_{ij}$ versus $(x_i, y_j)$ yields an empirical semivariogram surface. A contour plot of a smoothed version of this surface produces the ESC plot.

The ESC plot can be used to assess departures from isotropy: isotropy is depicted by circular contours, while geometric anisotropy is implied by elliptical contours.

A rose diagram traces only one arbitrarily selected contour of this plot.

A possible drawback of the ESC plot is the occurrence of sparse counts in extreme bins. However, these bins may be trimmed before smoothing if desired.

ESC plot of the 1993 scallop data

[Figure: ESC plot; axes h, E-W and h, N-S]

The plot is overlaid on the bin centers and their counts.

It suggests geometric anisotropy.

The ESC values in the $h_y \approx 0$ row provide an alternative to the usual 0° directional semivariogram.

Classical spatial prediction (Kriging)

Named by Matheron (1963) in honor of D.G. Krige, a South African mining engineer whose seminal work on empirical methods for geostatistical data inspired the general approach.

Optimal spatial prediction: given observations of a random field $Y = (Y(s_1), \ldots, Y(s_n))$, predict the variable $Y$ at a site $s_0$ where it has not been observed.

Under squared error loss, the best linear predictor minimizes
$$E\left[ Y(s_0) - \left( \sum_i l_i Y(s_i) + \delta_0 \right) \right]^2$$
over $\delta_0$ and the $l_i$; the resulting conditions are the kriging equations.

Under intrinsic stationarity and adopting unbiasedness, $\delta_0$ drops out. So, with an estimate of the semivariogram $\gamma$, one immediately obtains the ordinary kriging estimate.

No distributional assumptions are required for the $Y(s_i)$.
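A minimal sketch of ordinary kriging from a semivariogram follows. It uses the standard formulation in which the weights $l_i$ are constrained to sum to one and a Lagrange multiplier is appended to the linear system; the slides do not write this system out, so the layout here is an assumption. The exponential semivariogram and the names `exp_semivariogram` and `ordinary_kriging` are likewise illustrative choices.

```python
import numpy as np

def exp_semivariogram(d, sigma2=1.0, phi=1.0):
    # Assumed model without nugget: gamma(d) = sigma2 * (1 - exp(-phi * d))
    return sigma2 * (1.0 - np.exp(-phi * np.asarray(d, dtype=float)))

def ordinary_kriging(coords, y, s0, semivariogram):
    """Predict Y(s0) as sum_i l_i * y_i with sum_i l_i = 1."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    s0 = np.asarray(s0, dtype=float)
    n = len(y)
    D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - s0, axis=1)
    # Augmented system: [Gamma 1; 1' 0] [l; m] = [gamma_0; 1]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = semivariogram(D)
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    b = np.append(semivariogram(d0), 1.0)
    weights = np.linalg.solve(A, b)[:n]        # drop the multiplier m
    return float(weights @ y), weights
```

With no nugget, predicting at an observed site returns the observed value, which is a quick sanity check on any implementation.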

Difficulties

A constant mean is limiting, so introduce a mean surface; this gives universal kriging.

But what if the mean surface is unknown? The variogram unknown? If we put estimates of both of these into the kriging equations, we fail to take into account the uncertainty in these estimates.

So, we turn to Gaussian processes and likelihood-based methods, and work with the covariance function directly. We now redo the calculations in this setting.

Kriging with Gaussian processes

Given covariate values $x(s_i)$, $i = 0, 1, \ldots, n$, suppose $Y = X\beta + \epsilon$, where $\epsilon \sim N(0, \Sigma)$.

For a spatial covariance structure having no nugget effect, we specify $\Sigma$ as
$$\Sigma = \sigma^2 H(\phi), \quad \text{where } (H(\phi))_{ij} = \rho(\phi; d_{ij}),$$
$d_{ij} = \|s_i - s_j\|$ is the distance between $s_i$ and $s_j$, and $\rho$ is a valid correlation function on $\mathbb{R}^r$.

For a model having a nugget effect, we instead set
$$\Sigma = \sigma^2 H(\phi) + \tau^2 I,$$
where $\tau^2$ is the nugget effect variance.
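To make the covariance specification concrete, here is a small sketch that builds $\Sigma = \sigma^2 H(\phi) + \tau^2 I$ from site coordinates. The exponential correlation $\rho(\phi; d) = \exp(-\phi d)$ is an assumed choice (the slides leave $\rho$ generic), and `build_sigma` is a hypothetical helper name.

```python
import numpy as np

def exp_correlation(d, phi):
    # Assumed correlation function: rho(phi; d) = exp(-phi * d)
    return np.exp(-phi * d)

def build_sigma(coords, sigma2, phi, tau2=0.0):
    """Sigma = sigma2 * H(phi) + tau2 * I; tau2 = 0 gives the no-nugget case."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances d_ij = ||s_i - s_j||
    D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    H = exp_correlation(D, phi)
    return sigma2 * H + tau2 * np.eye(len(coords))
```

The diagonal of the result equals $\sigma^2 + \tau^2$, the marginal variance at each site.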

Kriging with Gaussian processes

We seek the function $f(y)$ that minimizes the mean-squared prediction error,
$$E\left[ (Y(s_0) - f(y))^2 \mid y \right],$$
i.e., we work with the conditional distribution of $Y(s_0) \mid y$.

But this expression equals
$$E\left\{ (Y(s_0) - E[Y(s_0) \mid y])^2 \mid y \right\} + \left\{ E[Y(s_0) \mid y] - f(y) \right\}^2,$$
since the expectation of the cross-product term is 0.

Since the second term is nonnegative, we have
$$E\left[ (Y(s_0) - f(y))^2 \mid y \right] \ge E\left\{ (Y(s_0) - E[Y(s_0) \mid y])^2 \mid y \right\}$$
for any function $f(y)$.
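The step "the expectation of the cross-product term is 0" can be spelled out as follows. The shorthand $m(y) = E[Y(s_0) \mid y]$ is introduced here only for brevity and does not appear in the slides.

```latex
\begin{align*}
(Y(s_0) - f(y))^2
  &= (Y(s_0) - m(y))^2
   + 2\,(Y(s_0) - m(y))\,(m(y) - f(y))
   + (m(y) - f(y))^2 . \\
\intertext{Conditional on $y$, the factor $m(y) - f(y)$ is a constant, so}
E\left[ 2\,(Y(s_0) - m(y))\,(m(y) - f(y)) \mid y \right]
  &= 2\,(m(y) - f(y))\,\left( E[Y(s_0) \mid y] - m(y) \right) = 0 .
\end{align*}
```

Taking the conditional expectation of the first line therefore leaves only the two squared terms.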

Kriging with Gaussian processes

Equality holds above if and only if $f(y) = E[Y(s_0) \mid y]$, so the predictor $f(y)$ that minimizes the error is the conditional expectation of $Y(s_0)$ given the data.

This result is quite intuitive from a Bayesian point of view, since this $f(y)$ is just the posterior mean of $Y(s_0)$.

Now consider estimation of this best predictor, first in the wildly unrealistic situation in which all the population parameters ($\beta$, $\sigma^2$, $\phi$, and $\tau^2$) are known. Suppose
$$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{pmatrix} \right),$$
where $\Omega_{21} = \Omega_{12}^T$.

Kriging with Gaussian processes

Then $p(Y_1 \mid Y_2)$ is normal with mean and variance
$$E[Y_1 \mid Y_2] = \mu_1 + \Omega_{12} \Omega_{22}^{-1} (Y_2 - \mu_2) \quad \text{and} \quad Var[Y_1 \mid Y_2] = \Omega_{11} - \Omega_{12} \Omega_{22}^{-1} \Omega_{21}.$$

In our framework, $Y_1 = Y(s_0)$ and $Y_2 = y$, meaning that
$$\Omega_{11} = \sigma^2 + \tau^2, \quad \Omega_{12} = \gamma^T, \quad \Omega_{22} = \Sigma = \sigma^2 H(\phi) + \tau^2 I,$$
where $\gamma^T = \left( \sigma^2 \rho(\phi; d_{01}), \ldots, \sigma^2 \rho(\phi; d_{0n}) \right)$.

Substituting these values into the mean and variance formulae above, we obtain
$$E[Y(s_0) \mid y] = x_0^T \beta + \gamma^T \Sigma^{-1} (y - X\beta)$$
and
$$Var[Y(s_0) \mid y] = \sigma^2 + \tau^2 - \gamma^T \Sigma^{-1} \gamma.$$
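The two formulae above translate directly into code. The sketch below assumes all parameters are known, as in the text; the function name `krige_known`, the three-site layout, the parameter values, and the exponential correlation used to fill $\Sigma$ and $\gamma$ are made up for illustration.

```python
import numpy as np

def krige_known(x0, X, y, beta, Sigma, gamma, sigma2, tau2):
    """Conditional mean and variance of Y(s0) given y, parameters known."""
    w = np.linalg.solve(Sigma, gamma)          # Sigma^{-1} gamma
    mean = x0 @ beta + w @ (y - X @ beta)      # x0' beta + gamma' Sigma^{-1} (y - X beta)
    var = sigma2 + tau2 - gamma @ w            # sigma2 + tau2 - gamma' Sigma^{-1} gamma
    return mean, var

# Hypothetical example: three sites, intercept-only mean, rho(phi; d) = exp(-phi d)
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
s0 = np.array([0.5, 0.5])
sigma2, tau2, phi = 2.0, 0.5, 1.0
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
Sigma = sigma2 * np.exp(-phi * D) + tau2 * np.eye(3)
gamma = sigma2 * np.exp(-phi * np.linalg.norm(coords - s0, axis=1))
X = np.ones((3, 1))
x0 = np.ones(1)
beta = np.array([1.0])
y = np.array([1.2, 0.7, 1.5])
mean, var = krige_known(x0, X, y, beta, Sigma, gamma, sigma2, tau2)
```

Solving the linear system with `np.linalg.solve` avoids forming $\Sigma^{-1}$ explicitly, which is generally the more stable choice.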

Kriging with Gaussian processes

Next, consider how these answers are modified in the more realistic scenario where the model parameters are unknown. We modify $f(y)$ to
$$\hat{f}(y) = x_0^T \hat{\beta} + \hat{\gamma}^T \hat{\Sigma}^{-1} \left( y - X\hat{\beta} \right),$$
where $\hat{\gamma} = \left( \hat{\sigma}^2 \rho(\hat{\phi}; d_{01}), \ldots, \hat{\sigma}^2 \rho(\hat{\phi}; d_{0n}) \right)^T$, $\hat{\Sigma} = \hat{\sigma}^2 H(\hat{\phi})$, and
$$\hat{\beta} = \hat{\beta}_{WLS} = \left( X^T \hat{\Sigma}^{-1} X \right)^{-1} X^T \hat{\Sigma}^{-1} y.$$
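A sketch of the plug-in predictor: given estimated $\hat{\Sigma}$ and $\hat{\gamma}$ (however they were obtained, which this snippet does not address), compute $\hat{\beta}_{WLS}$ and substitute. The function names `wls_beta` and `plug_in_predictor` are hypothetical.

```python
import numpy as np

def wls_beta(X, y, Sigma_hat):
    """beta_hat = (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} y, via linear solves."""
    Si_X = np.linalg.solve(Sigma_hat, X)       # Sigma^{-1} X
    Si_y = np.linalg.solve(Sigma_hat, y)       # Sigma^{-1} y
    return np.linalg.solve(X.T @ Si_X, X.T @ Si_y)

def plug_in_predictor(x0, X, y, Sigma_hat, gamma_hat):
    """f_hat(y) = x0' beta_hat + gamma_hat' Sigma_hat^{-1} (y - X beta_hat)."""
    beta_hat = wls_beta(X, y, Sigma_hat)
    resid = y - X @ beta_hat
    pred = x0 @ beta_hat + gamma_hat @ np.linalg.solve(Sigma_hat, resid)
    return pred, beta_hat
```

Note that this predictor treats the estimates as if they were the true values, so it carries no allowance for the uncertainty in $\hat{\sigma}^2$, $\hat{\phi}$, or $\hat{\beta}$.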
