Multivariate spatial models and the multikrig class

1 Multivariate spatial models and the multikrig class
Stephan R. Sain, IMAGe, NCAR
ENAR Spring Meetings, March 15, 2009

2 Outline
- Overview of multivariate spatial regression models
- Case study: pedotransfer functions and soil water profiles
- The multikrig class
- Case study: NC temperature and precipitation

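3 A Spatial Regression Model
A spatial regression model:
$$Y = X\beta + h + \epsilon$$
with dimensions $(n \times 1) = (n \times q)(q \times 1) + (n \times 1) + (n \times 1)$, where
- $E[h] = 0$, $\mathrm{Var}[h] = \Sigma_h$
- $E[\epsilon] = 0$, $\mathrm{Var}[\epsilon] = \sigma^2 I$
- $h$ and $\epsilon$ are independent
Then $Y \sim N(X\beta, V)$ with $V = \Sigma_h + \sigma^2 I$, and
$$\hat\beta = (X' V^{-1} X)^{-1} X' V^{-1} Y, \qquad \hat h = \Sigma_h V^{-1} (Y - X\hat\beta)$$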
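The two estimators above can be sketched in base R. This is only an illustration on simulated data: the locations, the exponential form of $\Sigma_h$, its range of 0.3, and the value of $\sigma^2$ are all assumptions made for the example, not values from the talk.

```r
set.seed(1)
n <- 50
s <- cbind(runif(n), runif(n))            # hypothetical spatial locations
X <- cbind(1, s)                          # n x q design matrix (q = 3)
D <- as.matrix(dist(s))                   # pairwise distances
Sigma_h <- exp(-D / 0.3)                  # assumed exponential covariance for h
sigma2 <- 0.1                             # assumed nugget variance
V <- Sigma_h + sigma2 * diag(n)           # V = Sigma_h + sigma^2 I

# Simulate Y = X beta + h + eps
h <- drop(t(chol(Sigma_h)) %*% rnorm(n))
Y <- drop(X %*% c(1, 2, -1)) + h + rnorm(n, sd = sqrt(sigma2))

# GLS estimate of beta and the predictor of h
Vinv_X <- solve(V, X)                                   # V^{-1} X
beta_hat <- solve(crossprod(X, Vinv_X), crossprod(Vinv_X, Y))
h_hat <- Sigma_h %*% solve(V, Y - X %*% beta_hat)
```

Solving linear systems with `solve(V, ...)` avoids forming $V^{-1}$ explicitly, which is usually the more stable choice.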

4 Multivariate Regression
A multivariate, multiple regression model:
$$Y = X\beta + \epsilon$$
with dimensions $(n \times p) = (n \times q)(q \times p) + (n \times p)$, where
- Each of the $n$ rows of $Y$ represents a $p$-vector observation
- Each of the $p$ columns of $\beta$ represents the regression coefficients for one variable
- The rows of $\epsilon$ represent a collection of iid error vectors with zero mean and common covariance matrix $\Sigma$

5 Multivariate Regression
MLEs are straightforward to obtain:
$$\hat\beta = (X'X)^{-1} X'Y \quad (q \times p), \qquad \hat\Sigma = n^{-1}\, Y'PY \quad (p \times p)$$
where $P = I - X(X'X)^{-1}X'$.
Note that the columns of $\hat\beta$ can be obtained through $p$ univariate regressions.
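A small numerical check of the last remark, on simulated data (the dimensions and coefficient values below are arbitrary choices for illustration): the matrix formula for $\hat\beta$ agrees column by column with $p$ separate least-squares fits.

```r
set.seed(2)
n <- 100; q <- 3; p <- 2
X <- cbind(1, matrix(rnorm(n * (q - 1)), n))     # n x q design
B <- matrix(c(1, 2, 3, -1, 0, 1), q, p)          # "true" q x p coefficients
Y <- X %*% B + matrix(rnorm(n * p), n)           # n x p responses

# Multivariate MLEs
beta_hat <- solve(crossprod(X), crossprod(X, Y)) # (X'X)^{-1} X'Y
P <- diag(n) - X %*% solve(crossprod(X), t(X))   # residual projection
Sigma_hat <- crossprod(Y, P %*% Y) / n           # n^{-1} Y'PY

# The same coefficients from p univariate regressions
beta_cols <- sapply(seq_len(p), function(j) lm.fit(X, Y[, j])$coefficients)
```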

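6 Vec and Kronecker
The Kronecker product of an $m \times n$ matrix $A$ and an $r \times q$ matrix $B$ is an $mr \times nq$ matrix:
$$A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}$$
Some properties:
- $A \otimes (B + C) = A \otimes B + A \otimes C$
- $A \otimes (B \otimes C) = (A \otimes B) \otimes C$
- $(A \otimes B)(C \otimes D) = AC \otimes BD$
- $(A \otimes B)' = A' \otimes B'$
- $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$
- $|A \otimes B| = |A|^m |B|^n$ (for square $A$, $n \times n$, and square $B$, $m \times m$)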

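7 Vec and Kronecker
The vec-operator stacks the columns of a matrix:
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \mathrm{vec}(A) = \begin{bmatrix} a_{11} \\ a_{21} \\ a_{12} \\ a_{22} \end{bmatrix}$$
Some properties:
- $\mathrm{vec}(AXB) = (B' \otimes A)\,\mathrm{vec}(X)$
- $\mathrm{tr}(A'B) = \mathrm{vec}(A)'\,\mathrm{vec}(B)$
- $\mathrm{vec}(A + B) = \mathrm{vec}(A) + \mathrm{vec}(B)$
- $\mathrm{vec}(\alpha A) = \alpha\,\mathrm{vec}(A)$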
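Several of the vec and Kronecker properties can be checked numerically with base R's `kronecker()`. The matrices below are arbitrary random examples; `vec` is a small helper defined here, not a built-in.

```r
set.seed(7)
A <- matrix(rnorm(4), 2, 2); B <- matrix(rnorm(9), 3, 3)
C <- matrix(rnorm(4), 2, 2); D <- matrix(rnorm(9), 3, 3)
X <- matrix(rnorm(6), 2, 3)
vec <- function(M) c(M)                  # stack the columns of a matrix

# (A x B)(C x D) = AC x BD
lhs1 <- kronecker(A, B) %*% kronecker(C, D)
rhs1 <- kronecker(A %*% C, B %*% D)

# vec(AXB) = (B' x A) vec(X)
lhs2 <- vec(A %*% X %*% B)
rhs2 <- drop(kronecker(t(B), A) %*% vec(X))

# tr(A'C) = vec(A)' vec(C)
lhs3 <- sum(diag(t(A) %*% C))
rhs3 <- sum(vec(A) * vec(C))

# |A x B| = |A|^3 |B|^2 for A (2 x 2) and B (3 x 3)
det_lhs <- det(kronecker(A, B))
det_rhs <- det(A)^3 * det(B)^2
```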

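8 Multivariate Regression Revisited
Rewrite the multivariate, multiple regression model:
$$\mathrm{vec}(Y) = (I_p \otimes X)\,\mathrm{vec}(\beta) + \mathrm{vec}(\epsilon)$$
with dimensions $(np \times 1) = (np \times qp)(qp \times 1) + (np \times 1)$.
- What is $\mathrm{Var}[\mathrm{vec}(\epsilon)]$?
- What is the GLS estimator for $\mathrm{vec}(\beta)$?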
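One way to explore these two questions numerically is sketched below. It takes $\mathrm{Var}[\mathrm{vec}(\epsilon)] = \Sigma \otimes I_n$ (iid rows with covariance $\Sigma$) and compares the GLS estimator of the stacked model with column-wise OLS; the data and $\Sigma$ are arbitrary simulated values.

```r
set.seed(6)
n <- 20; p <- 3
X <- cbind(1, rnorm(n))                                 # n x q design (q = 2)
Y <- matrix(rnorm(n * p), n, p)                         # n x p responses
Sigma <- crossprod(matrix(rnorm(p * p), p)) + diag(p)   # any positive definite p x p

Tmat <- kronecker(diag(p), X)                # I_p x X
Vinv <- kronecker(solve(Sigma), diag(n))     # (Sigma x I_n)^{-1}

# GLS for the stacked model
gls <- solve(t(Tmat) %*% Vinv %*% Tmat, t(Tmat) %*% Vinv %*% c(Y))

# OLS applied to the matrix form
ols <- solve(crossprod(X), crossprod(X, Y))
```

In this setting the two agree: because every response shares the same design matrix, $\Sigma$ cancels out of the GLS estimator.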

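9 A Multivariate Spatial Model
Extend the multivariate, multiple regression model:
$$\mathrm{vec}(Y) = (I_p \otimes X)\,\mathrm{vec}(\beta) + \mathrm{vec}(h) + \mathrm{vec}(\epsilon)$$
with dimensions $(np \times 1) = (np \times qp)(qp \times 1) + (np \times 1) + (np \times 1)$, where
$$\mathrm{Var}[\mathrm{vec}(h)] = \Sigma_h = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1p} \\ \Sigma_{12}' & \Sigma_{22} & \cdots & \Sigma_{2p} \\ \vdots & \vdots & & \vdots \\ \Sigma_{1p}' & \Sigma_{2p}' & \cdots & \Sigma_{pp} \end{bmatrix}, \qquad \mathrm{Var}[\mathrm{vec}(\epsilon)] = \Sigma \otimes I_n$$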

10 A Multivariate Spatial Model
One simplification to the spatial covariance matrix is to use a Kronecker form:
$$\Sigma_h = \rho \otimes K$$
where
- $\rho$ is a $p \times p$ matrix of scale parameters
- $K$ is an $n \times n$ spatial covariance

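11 A Multivariate Spatial Model
Extend the multivariate, multiple regression model:
$$\mathrm{vec}(Y) = (I_p \otimes X)\,\mathrm{vec}(\beta) + \mathrm{vec}(h) + \mathrm{vec}(\epsilon)$$
with dimensions $(np \times 1) = (np \times qp)(qp \times 1) + (np \times 1) + (np \times 1)$, or, in stacked notation,
$$Y = X\beta + h + \epsilon$$
Now everything follows.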

12 Case Study: Pedotransfer Functions
- Soil characteristics such as composition (clay, silt, sand) are commonly measured and easily obtainable
- Unfortunately, crop models require water holding characteristics, such as the wilting point or lower limit (LL) and the drained upper limit (DUL), which are not so easy to obtain
- Often the LL and DUL are a function of depth: the soil water profile

13 Case Study: Pedotransfer Functions
- Pedotransfer functions are commonly used to estimate LL and DUL
  - Differential equations, regression, nearest neighbors, neural networks, etc.
  - Often specialized by soil type and/or region
- Develop a new type of pedotransfer function that can capture the entire soil water profile (LL and DUL as a function of depth)
- Characterize the variation!

14 Soil Water Profiles
[Figure: soil water profiles for soils SIL, SICL, CL, and S; axes: depth and proportion water]

15 The Big Picture
[Diagram: inputs feed a big, black box (the CERES crop model), which outputs yield]
- Soil: water holding characteristics, bulk density, etc.
- Weather (20 years): solar radiation, temperature max/min, precipitation
- Many, many other things

16 The Big Picture
- Given a complicated array of inputs, the CERES crop model will give the yields of, for example, maize
- The output is deterministic; variation in yields is also of interest
- Goals:
  - Establish a framework to study sources of variation in crop yields
  - Assess impacts of climate change on crop yields

17 Data
- $n = 272$ measurements on $N = 63$ soil samples
  - Gijsman et al. (2002)
  - Ratliff et al. (1983), Ritchie et al. (1987)
- Includes measurements of:
  - depth, soil composition and texture
  - percentages of clay, sand, and silt
  - bulk density, organic matter, and
  - field measured values of LL and DUL

18 Data
The soil texture measurements form a composition:
$$Z_{clay} + Z_{silt} + Z_{sand} = 1$$
where $Z_{clay}$, $Z_{silt}$, and $Z_{sand}$ are the proportions of each soil component, so these are not really three variables.
To remove the dependence, the additive log-ratio transformation (Aitchison, 1986) is applied, defining two new variables
$$X_1 = \log\frac{Z_{sand}}{Z_{clay}}, \qquad X_2 = \log\frac{Z_{silt}}{Z_{clay}}$$
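A minimal sketch of the additive log-ratio transformation in R. The function name `alr` and the two example compositions are hypothetical; the input is assumed to be a matrix of strictly positive proportions with columns named clay, silt, and sand.

```r
# Additive log-ratio transform with clay as the reference component
alr <- function(Z) {
  stopifnot(all(Z > 0))                                  # logs need positive parts
  stopifnot(isTRUE(all.equal(unname(rowSums(Z)), rep(1, nrow(Z)))))
  cbind(X1 = log(Z[, "sand"] / Z[, "clay"]),
        X2 = log(Z[, "silt"] / Z[, "clay"]))
}

# Two made-up soil compositions (rows sum to 1)
Z <- matrix(c(0.2, 0.5, 0.3,
              0.4, 0.4, 0.2),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("clay", "silt", "sand")))
X <- alr(Z)
```

The choice of clay as the denominator follows the slide; any of the three components could serve as the reference.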

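19 Data - Composition vs LL
[Figure: LL plotted against soil composition; axes: CLAY, SAND, SILT]

20 Data - Composition vs LL
[Figure: LL plotted against the transformed composition; axes: log(silt/clay) and log(sand/clay)]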

21 A Multi-objective Pedotransfer Function
The model for the multi-objective pedotransfer function for a particular soil is
$$Y_0 = T_0\beta + h(X_0) + \epsilon(D_0)$$
where
$$Y_0 = \log\,[\,LL_1 \;\cdots\; LL_d \;\; \Delta_1 \;\cdots\; \Delta_d\,]'$$
$d$ is the number of measurements (depths), and $\Delta_i = DUL_i - LL_i$.

22 A Multi-objective Pedotransfer Function
The model for the multi-objective pedotransfer function for a particular soil is
$$Y_0 = T_0\beta + h(X_0) + \epsilon(D_0)$$
where
$$T_0 = [\,1 \;\; X_0 \;\; Z_{LL,0} \;\; X_0 \;\; Z_{\Delta,0}\,]$$
and
- $X_0$ is the transformed soil composition information
- $Z_{LL}$ and $Z_{\Delta}$ are additional covariates for LL and $\Delta$
  - $Z_{LL}$ includes organic carbon
  - $Z_{\Delta}$ includes linear and quadratic terms for depth

23 A Multi-objective Pedotransfer Function
The model for the multi-objective pedotransfer function for a particular soil is
$$Y_0 = T_0\beta + h(X_0) + \epsilon(D_0)$$
where
- $h(X_0)$ is a two-dimensional spatial process that controls the smoothness of the contribution of $X$
- $\epsilon(D_0)$ is an error process that accounts for the dependence between LL and $\Delta$ at a particular depth and for the dependence across depths (a one-dimensional spatial process)

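24 A Multi-objective Pedotransfer Function
Letting
$$Y = \log\,[\,LL_{11} \cdots LL_{1d_1} \; LL_{21} \cdots LL_{Nd_N} \;\; \Delta_{11} \cdots \Delta_{1d_1} \; \Delta_{21} \cdots \Delta_{Nd_N}\,]'$$
then $Y$ is multivariate normal with
$$E[Y] = T\beta, \qquad \mathrm{Var}[Y] = \Sigma_h + \Sigma_\epsilon$$
$$\Sigma_h = \begin{bmatrix} \rho_1 & 0 \\ 0 & \rho_2 \end{bmatrix} \otimes K, \qquad \Sigma_\epsilon = S \otimes R$$
with
- $K_{ij} = k(X_i, X_j)$
- $S$ is the covariance of (LL, $\Delta$) at a fixed depth
- $R$ is the (spatial) covariance across depths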

25 Covariance Structures
The covariance function for $h$ is the Matern family
$$C(d) = \sigma^2\,\frac{2\,(\theta d/2)^{\nu}\,K_{\nu}(\theta d)}{\Gamma(\nu)}$$
where $\sigma^2$ is a scale parameter, $\theta$ represents the range, and $\nu$ controls the smoothness.
- $\sigma^2 = 1$ (the $\rho$ controls the variances), $\nu = 1$, and $\theta$ is taken to be approximately the range of the data
- These choices represent a covariance structure that is consistent with the thin-plate spline estimator (large range, more smoothness)
- Analogous to fixing the kernel and estimating the bandwidth with kernel estimators
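The Matern formula on this slide can be evaluated directly with base R's `besselK()`. The function name `matern_cov` is a hypothetical helper written for this illustration, and it follows the slide's parameterization in which $\theta$ multiplies the distance.

```r
# Matern covariance: sigma2 * 2 * (theta*d/2)^nu * K_nu(theta*d) / Gamma(nu)
matern_cov <- function(d, theta, nu, sigma2 = 1) {
  out <- rep(sigma2, length(d))          # limit as d -> 0 is sigma2
  pos <- d > 0
  u <- theta * d[pos]
  out[pos] <- sigma2 * 2 * (u / 2)^nu * besselK(u, nu) / gamma(nu)
  out
}

# The slide's choice of smoothness, nu = 1, at a few distances
d <- c(0, 0.1, 0.5, 1)
c_nu1 <- matern_cov(d, theta = 2, nu = 1)

# With nu = 1/2 the formula reduces to exp(-theta * d)
c_half <- matern_cov(d, theta = 2, nu = 0.5)
```

The `nu = 0.5` case is a convenient sanity check because the modified Bessel function has a closed form there.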

26 Covariance Structures
The covariance function across depths is exponential
$$C(d) = \sigma^2 \exp(-d/\theta)$$
where again $\sigma^2$ is a scale parameter and $\theta$ represents the range.
- The scale is fixed at $\sigma^2 = 1$ (the matrix $S$ controls the variances) and $\theta$ is estimated from the data
- The multiple realizations of the soils allow for improved ability to estimate both scale and range parameters

27 Covariance Structures
$$\Sigma_h = \begin{bmatrix} \rho_1 & 0 \\ 0 & \rho_2 \end{bmatrix} \otimes K, \qquad \Sigma_\epsilon = S \otimes R$$

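28 Spatial Smoothing
Write
$$\Sigma_h + \Sigma_\epsilon = \begin{bmatrix} \rho_1 & 0 \\ 0 & \rho_2 \end{bmatrix} \otimes K + \begin{bmatrix} s_{11} & s_{12} \\ s_{12} & s_{22} \end{bmatrix} \otimes R$$
$$= s_{11}\left( \begin{bmatrix} \eta_1 & 0 \\ 0 & \eta_2 \end{bmatrix} \otimes K + \begin{bmatrix} 1 & v_{12} \\ v_{12} & v_{22} \end{bmatrix} \otimes R \right) = s_{11}\,\Omega$$
so that $\eta_i = \rho_i / s_{11}$ and $v_{ij} = s_{ij} / s_{11}$.
- The amount of smoothing is due to the relative contributions of the variance components, i.e. $\eta_1$ and $\eta_2$
- Different degrees of smoothing are allowed for LL and $\Delta$
- Also, this construction allows for different degrees of variation in the error terms for the LL and $\Delta$ variables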

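29 The Estimator
The model suggests an estimator of the form
$$\hat Y_0 = T_0\hat\beta + K_0\hat\delta, \qquad K_0 = \begin{bmatrix} \eta_1 & 0 \\ 0 & \eta_2 \end{bmatrix} \otimes K$$
To fit the model, we must estimate:
- $\eta_1$, $\eta_2$ and $s_{11}$
- $\beta$, $\delta$
- $R$ and the other entries of $S$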

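30 REML
Take the QR decomposition of $T$ (here $R$ denotes the triangular factor, not the depth covariance):
$$T = [\,Q_1 \;\; Q_2\,] \begin{bmatrix} R \\ 0 \end{bmatrix}$$
- Then $Q_2'Y$ has zero mean and covariance matrix given by $Q_2'(\Sigma_h + \Sigma_\epsilon)Q_2$
- Maximize (numerically) the likelihood based on $Q_2'Y$, which is only a function of the covariance parameters
- Estimates of $\beta$ and $\delta$ follow directly:
$$\hat\beta = (T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1}Y, \qquad \hat\delta = \hat\Omega^{-1}(Y - T\hat\beta)$$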
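The REML recipe above can be sketched on a deliberately simplified problem. Everything here is an assumption for illustration: a single response, a known exponential kernel $K$ with unit scale, and just one unknown covariance parameter $\lambda$ in $\Omega = K + \lambda I$, found with `optimize()`. The soil model in the talk has more parameters and a Kronecker structure.

```r
set.seed(3)
n <- 40
x <- sort(runif(n))
Tmat <- cbind(1, x)                               # fixed-effects matrix T
K <- exp(-as.matrix(dist(x)) / 0.2)               # assumed known kernel
Y <- drop(Tmat %*% c(1, 2) + t(chol(K)) %*% rnorm(n) + rnorm(n, sd = 0.5))

# Full QR: the last n - q columns of Q span the space orthogonal to T
Q <- qr.Q(qr(Tmat), complete = TRUE)
Q2 <- Q[, (ncol(Tmat) + 1):n]
z <- crossprod(Q2, Y)                             # Q2'Y has zero mean

# Negative restricted log-likelihood (up to a constant) in log(lambda)
neg_reml <- function(log_lambda) {
  V <- K + exp(log_lambda) * diag(n)
  W <- crossprod(Q2, V %*% Q2)                    # Q2' V Q2
  logdet <- as.numeric(determinant(W, logarithm = TRUE)$modulus)
  0.5 * (logdet + sum(z * solve(W, z)))
}
fit <- optimize(neg_reml, interval = c(-6, 3))

# Plug the estimate in to get beta-hat and delta-hat
Omega <- K + exp(fit$minimum) * diag(n)
Oi_T <- solve(Omega, Tmat)
beta_hat <- solve(crossprod(Tmat, Oi_T), crossprod(Oi_T, Y))
delta_hat <- solve(Omega, Y - Tmat %*% beta_hat)
```

A useful by-product of these formulas is that $T'\hat\delta = 0$, which gives a quick check on the linear algebra.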

31 An Iterative Approach
0. Initialize: compute $K$ and set $S = I$ and $R = I$
1. Estimate $\eta_1$ and $\eta_2$ (and $s_{11}$) via a simplified type of REML (grid search)
2. Then
$$\hat\beta = (T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1}Y, \qquad \hat\delta = \hat\Omega^{-1}(Y - T\hat\beta)$$
3. Compute residuals and
   a. Update $S$ ($R$ fixed): closed form solution
   b. Update $R$ ($S$ fixed): grid search for $\theta$
4. Repeat items 1-3 until convergence

32 An Iterative Approach
Let $Y = \mu + h + \epsilon$, where $h$ and $\epsilon$ are independent Gaussian random variables; the conditional distribution of $Y - \mu - h$ given $h$ is a zero-mean Gaussian with covariance matrix $\Sigma_\epsilon$.
Thus, up to an additive constant, the log-likelihood associated with the residuals is given by
$$-\frac{n}{2}\log|S| - \log|R| - \frac{1}{2}\,\mathrm{vec}(U)'(S^{-1} \otimes R^{-1})\,\mathrm{vec}(U)$$
The quadratic form can be written as
$$\mathrm{tr}\Big(S^{-1}\sum_i \sum_j r^{ij}\,u_j u_i'\Big)$$
where $r^{ij}$ is the $ij$th element of $R^{-1}$ and $u_i$ is the bivariate, unstacked residual for the $i$th observation.

33 An Iterative Approach
An update for $S$ can be written as
$$\hat S = \frac{1}{n}\sum_i \sum_j r^{ij}\,u_j u_i' = \frac{1}{n}\,U'R^{-1}U$$
where $U$ is the $n \times 2$ matrix of unstacked residuals.
Again, a simple grid search for $\theta$ is used to obtain a new value for $R$.
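The closed-form update for $S$ can be checked numerically. In this sketch the residual matrix $U$ is random noise and $R$ is an exponential covariance over six made-up depths with an assumed range of 30; neither comes from the soil data.

```r
set.seed(4)
depth <- c(5, 15, 30, 45, 60, 90)            # hypothetical depths
n <- length(depth)
R <- exp(-as.matrix(dist(depth)) / 30)       # assumed exponential covariance
U <- matrix(rnorm(n * 2), n, 2)              # stand-in for unstacked residuals
Rinv <- solve(R)

# Matrix form of the update: U' R^{-1} U / n
S_hat <- crossprod(U, Rinv %*% U) / n

# Double-sum form: (1/n) sum_i sum_j r^{ij} u_j u_i'
S_sum <- matrix(0, 2, 2)
for (i in 1:n) for (j in 1:n) {
  S_sum <- S_sum + Rinv[i, j] * tcrossprod(U[j, ], U[i, ])
}
S_sum <- S_sum / n

# Quadratic form vec(U)'(S^{-1} x R^{-1})vec(U), evaluated at S = S_hat
quad <- sum(c(U) * (kronecker(solve(S_hat), Rinv) %*% c(U)))
```

At $S = \hat S$ the quadratic form equals $\mathrm{tr}(\hat S^{-1}\, n \hat S) = 2n$, which the test below uses as a consistency check.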

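34 Parameter Estimates
[Table: estimates of $\eta_1$, $\eta_2$, $S_{11}$, $S_{22}$, $S_{12}$, and $\theta$ from the REML and iterative approaches; the numerical values are missing from this copy]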

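35 Soil Composition and LL
[Figure: fitted surface for LL; axes: X1, X2, Y]

36 Soil Composition and $\Delta$
[Figure: fitted surface for $\Delta$; axes: X1, X2, Y]

37 Soil Composition and LL/$\Delta$
[Figure: two panels over the composition; axes: SAND, CLAY, SILT]

38 Organic Carbon and LL
[Figure: partial residuals against organic carbon (orgc)]

39 Depth and $\Delta$
[Figure: partial residuals against depth]

40 Residuals (Within Depth)
[Figure: LL residuals against Delta residuals]

41 Spatial Covariance Across Depth
[Figure: covariance against distance]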

42 Prediction Error
The real benefit of considering our estimator as a spatial process is in the interpretation with respect to the uncertainty of the estimator:
- The thin-plate spline is a biased estimator with uncorrelated error; it is not easy to quantify the bias (interpolation error and smoothing error)
- The spatial process estimator is unbiased, but with correlated error; the error structure is more complicated but conceptually straightforward to work with

43 Prediction Error
The estimator can be written as
$$\hat Y_0 = T_0\hat\beta + K_0\hat\delta = A_0 Y$$
where
$$A_0 = T_0(T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1} + K_0\big(\hat\Omega^{-1} - \hat\Omega^{-1}T(T'\hat\Omega^{-1}T)^{-1}T'\hat\Omega^{-1}\big)$$

44 Prediction Error
Hence,
$$\mathrm{Var}(Y_0 - \hat Y_0) = \mathrm{Var}(Y_0 - A_0 Y) = \mathrm{Var}(Y_0) + A_0\,\mathrm{Var}(Y)\,A_0' - 2A_0\,\mathrm{Cov}(Y, Y_0)$$
- $\mathrm{Var}(Y_0)$ and $\mathrm{Var}(Y)$ are computed by plugging in parameter estimates for $\Sigma_h$ and $\Sigma_\epsilon$
- The covariance between $Y_0$ and $Y$ comes from $h$ and is based on the distance between the transformed composition data
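A toy version of the prediction-error calculation, under assumptions that are not from the talk: one response, one-dimensional locations, an exponential kernel for $h$ (helper `kfun`), and a known nugget of 0.1 so that $\Omega$ is treated as given. The cross term is written in its symmetric matrix form, $A_0\,\mathrm{Cov}(Y, Y_0) + \mathrm{Cov}(Y_0, Y)A_0'$, which matches the slide's $2A_0\,\mathrm{Cov}(Y, Y_0)$ on the diagonal.

```r
set.seed(5)
n <- 30
x <- sort(runif(n)); x0 <- c(0.25, 0.75)          # observed and new sites
Tmat <- cbind(1, x); T0 <- cbind(1, x0)
kfun <- function(a, b) exp(-abs(outer(a, b, "-")) / 0.2)   # assumed kernel

Omega <- kfun(x, x) + 0.1 * diag(n)               # Var(Y), taken as known
K0 <- kfun(x0, x)                                 # Cov(Y0, Y): only h is shared
Y <- drop(Tmat %*% c(1, 2) + t(chol(Omega)) %*% rnorm(n))

Oi <- solve(Omega)
G <- solve(t(Tmat) %*% Oi %*% Tmat, t(Tmat) %*% Oi)   # (T'O^-1 T)^-1 T'O^-1
A0 <- T0 %*% G + K0 %*% (Oi - Oi %*% Tmat %*% G)

# The same prediction built from beta-hat and delta-hat
beta_hat <- G %*% Y
delta_hat <- Oi %*% (Y - Tmat %*% beta_hat)
Y0_hat <- T0 %*% beta_hat + K0 %*% delta_hat

# Prediction error covariance
Var_Y0 <- kfun(x0, x0) + 0.1 * diag(length(x0))
pred_var <- Var_Y0 + A0 %*% Omega %*% t(A0) - A0 %*% t(K0) - K0 %*% t(A0)
```

The identity $A_0 T = T_0$ holds by construction, which is the algebraic form of the unbiasedness mentioned two slides earlier.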

45 Generation of Soil Profiles
- Simulations of log LL and log $\Delta$ were generated from a multivariate normal with mean $A_0 Y$ and variance given by the prediction error
- We use an average soil composition profile computed from the data and assumed to be constant across all depths, $D = \{5, 15, 30, 45, 60, 90, 120, 150\}$
- Each simulation represents a draw from the posterior distribution of soil water profiles, based on the estimated mean and covariance structure and the uncertainty gleaned from the data

46 Generation of Soil Profiles
[Figure: simulated soil water profiles for soils SIL and S; axes: depth and proportion water]

47 Application: Crop Models
- Two soils (SIL, S)
- Given the soil composition, organic carbon, depth, etc., 100 soil profiles (LL, DUL) were generated
- Twenty years of weather (solar radiation, temperature min/max, and precipitation)
- Yield output generated from the CERES-Maize crop model

48 Crop Yields
[Figure: crop yields for SIL (red) and S (blue), with total annual precipitation (solid line)]

49 Crop Yields
[Figure: crop yields for SIL (red) and S (blue), with average annual temperature (solid line)]

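50 The multikrig Class
Create a multivariate spatial regression within the univariate Krig framework:
$$Y = T\beta + h + \epsilon$$
$$Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_p \end{bmatrix}, \qquad T = I_p \otimes X, \qquad h = \begin{bmatrix} h_1 \\ \vdots \\ h_p \end{bmatrix}, \qquad \epsilon = \begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_p \end{bmatrix}$$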

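51 The multikrig Class
Create a multivariate spatial regression within the univariate Krig framework:
$$Y = T\beta + h + \epsilon, \qquad E[Y] = T\beta$$
$$\mathrm{Var}[Y] = \Sigma_h + \Sigma_\epsilon = \begin{bmatrix} \rho_1 & & \\ & \ddots & \\ & & \rho_p \end{bmatrix} \otimes V(\theta) + \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ s_{21} & s_{22} & \cdots & s_{2p} \\ \vdots & \vdots & & \vdots \\ s_{p1} & s_{p2} & \cdots & s_{pp} \end{bmatrix} \otimes I_n$$
$$= \rho_1\, R \otimes V(\theta) + s_{11}\, S \otimes I_n = \rho_1\big(R \otimes V(\theta) + \lambda\, S \otimes I_n\big)$$
where, after factoring, $R = \mathrm{diag}(1, \rho_2/\rho_1, \ldots, \rho_p/\rho_1)$, $S$ denotes the error covariance scaled by $s_{11}$, and $\lambda = s_{11}/\rho_1$.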

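52 The multikrig Class
Create a multivariate spatial regression within the univariate Krig framework:
$$Y = T\beta + h + \epsilon, \qquad E[Y] = T\beta$$
$$\mathrm{Var}[Y] = \Sigma_h + \Sigma_\epsilon = \rho_1\big(R \otimes V(\theta) + \lambda\, S \otimes I_n\big)$$
Given $S$, $R$, and $\theta$, use Krig to estimate $\beta$, $\rho_1$ and $\lambda$.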
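The factorization of $\mathrm{Var}[Y]$ into the form $\rho_1(R \otimes V(\theta) + \lambda\, S \otimes I_n)$ can be verified numerically. The values of `rho`, `S_full`, and the stand-in for $V(\theta)$ below are made up for the check, and the scaled quantities are defined so that the two sides of the equation match.

```r
n <- 4; p <- 2
V <- exp(-as.matrix(dist(1:n)) / 2)             # stand-in for V(theta)
rho <- c(2, 0.5)                                # rho_1, rho_2
S_full <- matrix(c(1.5, 0.3, 0.3, 0.8), 2, 2)   # unscaled error covariance

# Scaled quantities implied by the factorization
R <- diag(rho / rho[1])                         # diag(1, rho_2 / rho_1)
S <- S_full / S_full[1, 1]                      # error covariance over s_11
lambda <- S_full[1, 1] / rho[1]                 # s_11 / rho_1

lhs <- kronecker(diag(rho), V) + kronecker(S_full, diag(n))
rhs <- rho[1] * (kronecker(R, V) + lambda * kronecker(S, diag(n)))
```

Written this way, the covariance has the "scale times (correlation plus lambda times weights)" shape that a univariate kriging routine expects.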

53 The multikrig Class
Issues:
- Specifying x, Y, and Z
- Mean function (null.function)
- Covariance function (cov.function)
- Error function (wght.function)
- Estimation ($S$, $R$, and $\theta$)

54 Krig Function
    Krig <- function(x, Y, Z,
                     null.function = "Krig.null.function",
                     cov.function = "stationary.cov",
                     wght.function = NULL,
                     null.args = NULL, cov.args = NULL, wght.args = NULL)
- x is an $n \times q$ matrix of spatial locations
- Y is an $n$-vector of observations
- Z is an $n \times q$ matrix of additional covariates

55 multikrig Function
    multikrig <- function(s, Y, Z,
                          cov.function = "multicov", cov.args = NULL,
                          wght.function = "multiwght", wght.args = NULL)
- s is an $n \times q$ matrix of spatial locations
- Y is an $n \times p$ matrix of observations
- Z is either:
  - an $n \times q$ matrix of additional covariates, or
  - a list of $n \times q_i$ matrices of additional covariates

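56 multikrig Function
    multikrig <- function(s, Y, Z,
                          cov.function = "multicov", cov.args = NULL,
                          wght.function = "multiwght", wght.args = NULL) {
      d <- ncol(Y)                             # number of response variables (p)
      n <- nrow(Y)                             # number of locations
      Y <- c(Y)                                # stack the columns of Y
      x <- expand.grid(1:n, 1:d)               # (location index, variable index)
      nZ <- kronecker(diag(d), cbind(s, Z))    # block-diagonal covariates
      obj <- Krig(x = x, Y = c(Y), Z = nZ, method = "REML",
                  null.function = "multinull",
                  cov.function = cov.function, cov.args = cov.args,
                  wght.function = wght.function, wght.args = wght.args)
    }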

57 multikrig Function: stacking the observations
    Y <- c(Y)
stacks the columns of the $n \times p$ observation matrix, so that
$$Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_p \end{bmatrix}$$

58 multikrig Function: index matrix
    x <- expand.grid(1:n, 1:d)
builds the matrix of (location, variable) index pairs in the same order as the stacked Y:
$$x = \begin{bmatrix} 1 & 1 \\ \vdots & \vdots \\ n & 1 \\ \vdots & \vdots \\ 1 & p \\ \vdots & \vdots \\ n & p \end{bmatrix}$$

59 multikrig Function: covariates
    nZ <- kronecker(diag(d), cbind(s, Z))
repeats the covariate block $[\,s \;\; Z\,]$ along the diagonal, once per variable:
$$Z = \begin{bmatrix} [\,s \;\; Z\,] & & \\ & \ddots & \\ & & [\,s \;\; Z\,] \end{bmatrix}$$
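The three data-preparation steps can be tried out in base R without the fields package. The locations, responses, and covariate below are tiny made-up values ($n = 3$ sites, $d = 2$ variables) chosen so the stacking is easy to see.

```r
n <- 3; d <- 2
s <- cbind(lon = c(0, 1, 2), lat = c(0, 1, 0))            # hypothetical locations
Y <- cbind(temp = c(10, 11, 12), precip = c(1, 2, 3))     # n x d responses
Z <- cbind(elev = c(100, 200, 300))                       # extra covariate

Yvec <- c(Y)                              # columns stacked: temp, then precip
x <- expand.grid(1:n, 1:d)                # column 1: site index, column 2: variable index
nZ <- kronecker(diag(d), cbind(s, Z))     # one [s Z] block per variable

# Row k of x tells which site and variable Yvec[k] belongs to
second_site_precip <- Yvec[x[, 1] == 2 & x[, 2] == 2]
```

Because `expand.grid()` varies its first argument fastest, the rows of `x` line up with the column-stacked `Yvec`, which is what lets the covariance and weight functions recover the site and variable from the index matrix.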

60 Mean Function
    multinull <- function(x, Z = NULL, drop.Z = FALSE) {
      # one indicator column per variable, from the variable index in x[, 2]
      data <- data.frame(a = as.factor(x[, 2]))
      X <- model.matrix(~ a, data = data, contrasts = list(a = "contr.treatment"))
      return(cbind(X, Z))
    }
The resulting $T$ combines the per-variable intercept columns in X with the block-diagonal covariate matrix built from the $[\,s \;\; Z\,]$ blocks. It is passed to Krig inside multikrig through null.function = "multinull".

61 Covariance Function
- Issue: x is now a matrix of indices
- Solution: pass the spatial locations as an argument to the covariance function

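62 Covariance Function
    multikrig <- function(s, Y, Z,
                          cov.function = "multicov", cov.args = NULL,
                          wght.function = "multiwght", wght.args = NULL) {
      # (x, Y, and nZ are built as on the earlier slides)
      cov.args$s <- s                          # hand the locations to multicov
      obj <- Krig(x = x, Y = c(Y), Z = nZ, method = "REML",
                  null.function = "multinull",
                  cov.function = cov.function, cov.args = cov.args,
                  wght.function = wght.function, wght.args = wght.args)
    }

    multicov <- function(x1, x2, marginal = FALSE, C = NA,
                         s, theta, rho, smoothness) {
      ind <- unique(x1[, 2])                   # variable indices present
      # covariance block for the first variable (scale 1)
      temp <- stationary.cov(s[x1[, 1][x1[, 2] == 1], ],
                             s[x2[, 1][x2[, 2] == 1], ],
                             Covariance = "Matern",
                             theta = theta, smoothness = smoothness)
      if (length(ind) > 1) {
        for (i in 2:length(ind)) {
          # block for variable i, scaled by rho[i - 1]
          temp2 <- rho[i - 1] * stationary.cov(s[x1[, 1][x1[, 2] == i], ],
                                               s[x2[, 1][x2[, 2] == i], ],
                                               Covariance = "Matern",
                                               theta = theta, smoothness = smoothness)
          d1 <- dim(temp)
          d2 <- dim(temp2)
          # append as a new diagonal block; the lower-left zero block is
          # d2[1] x d1[2] so the pieces also conform when there are more
          # than two variables
          temp <- rbind(cbind(temp, matrix(0, d1[1], d2[2])),
                        cbind(matrix(0, d2[1], d1[2]), temp2))
        }
      }
      return(temp)
    }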

63 Weight Function
The weight function is passed to Krig inside multikrig through wght.function = "multiwght" and wght.args:
    multiwght <- function(x, S) {
      # S is the p x p error covariance; the weight matrix is (S %x% I_n)^{-1}
      n <- length(unique(x[, 1]))              # number of locations
      return(kronecker(solve(S), diag(n)))
    }

64 Estimation
- Krig will estimate $\beta$, $\rho_1$ and $\lambda$
  - REML
  - GCV (not quite there)
- How to estimate $S$, $R$, and $\theta$?

65 Thanks!
www.image.ucar.edu/~ssain
Sain, S.R., Mearns, L., Shrikant, J., and Nychka, D. (2006), A multivariate spatial model for soil water profiles, Journal of Agricultural, Biological, and Environmental Statistics, 11,
