REGRESSION WITH SPATIALLY MISALIGNED DATA
Lisa Madsen, Oregon State University
David Ruppert, Cornell University
SPATIALLY MISALIGNED DATA
[Figure: example of spatially misaligned sampling locations on a 10 × 10 region; X's mark observation sites.]
OUTLINE
1. Introduction
2. The Krige and Regress (KR) Estimator
3. The Maximum Likelihood (ML) Estimator
4. Simulation Results
5. Conclusions
Notation
Y = vector of responses observed at locations s_1, ..., s_n
X = vector of unobserved predictors at locations s_1, ..., s_n
W = vector of observations from the predictor process at locations t_1, ..., t_m
ε = error vector at locations s_1, ..., s_n
Assumptions
1. $Y = \beta_0 1_{n\times 1} + \beta_1 X + \epsilon$.
2. X, W, and ε are generated by spatially autocorrelated stationary Gaussian processes.
3. X and W are generated by the same spatial process.
4. X and ε are independent of each other.
5. Spatial autocorrelations are each given by a parametric model.
Model
$$Y = \beta_0 1_{n\times 1} + \beta_1 X + \epsilon, \qquad \epsilon \sim N_n(0, \Sigma_\epsilon),$$
$$\begin{bmatrix} X \\ W \end{bmatrix} \sim N_{n+m}\left(\mu_X 1_{(n+m)\times 1},\; \begin{bmatrix} \Sigma_X & \Sigma_{XW} \\ \Sigma_{WX} & \Sigma_W \end{bmatrix}\right).$$
$\Sigma_X$, $\Sigma_W$, and $\Sigma_{XW}$ depend on $\theta_X = (\theta_{X,1}, \theta_{X,2}, \theta_{X,3})'$; $\Sigma_\epsilon$ depends on $\theta_\epsilon = (\theta_{\epsilon,1}, \theta_{\epsilon,2}, \theta_{\epsilon,3})'$.
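As a concrete illustration, here is a minimal Python sketch of drawing one data set from this model. It assumes the exponential covariance used later in the talk, no nuggets, and $\Sigma_\epsilon = \sigma_\epsilon^2 I$; all names (`exp_cov`, `sigma_eps2`, etc.) are ours, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def exp_cov(locs1, locs2, sill, decay):
    # Exponential covariance sill * exp(-decay * h) for all pairs of locations.
    h = np.linalg.norm(locs1[:, None, :] - locs2[None, :, :], axis=2)
    return sill * np.exp(-decay * h)

n, m = 50, 50
beta0, beta1, mu_x = 1.0, 1.0, 1.0
theta_x2, theta_x3 = 1.0, 0.5        # sill and decay of the shared X/W process
sigma_eps2 = 0.1                     # Sigma_eps = sigma_eps2 * I (no nugget)

s = rng.uniform(0, 10, (n, 2))       # response locations s_1, ..., s_n
t = rng.uniform(0, 10, (m, 2))       # predictor locations t_1, ..., t_m

# X and W are a single draw from the same Gaussian process at the two
# (misaligned) location sets, so they are correlated but never co-located.
locs = np.vstack([s, t])
joint = exp_cov(locs, locs, theta_x2, theta_x3) + 1e-10 * np.eye(n + m)
xw = mu_x + np.linalg.cholesky(joint) @ rng.standard_normal(n + m)
X, W = xw[:n], xw[n:]

Y = beta0 + beta1 * X + np.sqrt(sigma_eps2) * rng.standard_normal(n)
```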
KRIGE AND REGRESS
Kriging:
$$\hat X = \hat\mu_X 1_{n\times 1} + \Sigma_{XW} \Sigma_W^{-1} (W - \hat\mu_X 1_{m\times 1}),$$
where $\hat\mu_X$ is the best linear unbiased estimator of $\mu_X$:
$$\hat\mu_X = \frac{1_{1\times m} \Sigma_W^{-1} W}{1_{1\times m} \Sigma_W^{-1} 1_{m\times 1}},$$
so that $\hat X = \Lambda W$, where $\Lambda$ depends on $\Sigma_W$ and $\Sigma_{XW}$.
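Continuing the simulation sketch above, the kriging step with known covariance parameters might look like this; the explicit form of $\Lambda$ is assembled so that $\hat X = \Lambda W$ exactly as on the slide.

```python
# Kriging step (known covariance parameters).
Sigma_W = exp_cov(t, t, theta_x2, theta_x3)
Sigma_XW = exp_cov(s, t, theta_x2, theta_x3)
Sigma_W_inv = np.linalg.inv(Sigma_W)
ones_m, ones_n = np.ones(m), np.ones(n)

# BLUE of mu_X: (1' Sigma_W^{-1} 1)^{-1} 1' Sigma_W^{-1} W = a'W.
a = Sigma_W_inv @ ones_m / (ones_m @ Sigma_W_inv @ ones_m)
mu_x_hat = a @ W

# X_hat = mu_x_hat * 1 + Sigma_XW Sigma_W^{-1} (W - mu_x_hat * 1) = Lambda @ W.
Lam = np.outer(ones_n, a) + Sigma_XW @ Sigma_W_inv @ (np.eye(m) - np.outer(ones_m, a))
X_hat = Lam @ W
```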
KRIGE AND REGRESS
Regression:
$$\hat\beta_{KR} = (\hat{\mathbf{X}}' \Sigma_\epsilon^{-1} \hat{\mathbf{X}})^{-1} \hat{\mathbf{X}}' \Sigma_\epsilon^{-1} Y, \qquad \hat{\mathbf{X}} = [1_{n\times 1}\ \hat X].$$
We can write
$$\hat\beta_{1,KR} = (W'\Lambda' M \Lambda W)^{-1} W'\Lambda' M Y,$$
where
$$M = \Sigma_\epsilon^{-1} - \Sigma_\epsilon^{-1} 1_{n\times 1} (1_{1\times n}\Sigma_\epsilon^{-1} 1_{n\times 1})^{-1} 1_{1\times n} \Sigma_\epsilon^{-1}.$$
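Continuing the sketch, the regression step can be written either as generalized least squares on the kriged design matrix or in the slope-only form using $M$; both give the same $\hat\beta_{1,KR}$.

```python
# Regression step of krige and regress.
Sigma_eps_inv = np.eye(n) / sigma_eps2
Xd = np.column_stack([ones_n, X_hat])            # design matrix [1  X_hat]
beta_kr = np.linalg.solve(Xd.T @ Sigma_eps_inv @ Xd,
                          Xd.T @ Sigma_eps_inv @ Y)

# Equivalent slope-only form via M (Sigma_eps^{-1} with the intercept swept out):
M = Sigma_eps_inv - (Sigma_eps_inv @ np.outer(ones_n, ones_n) @ Sigma_eps_inv
                     / (ones_n @ Sigma_eps_inv @ ones_n))
beta1_kr = (W @ Lam.T @ M @ Y) / (W @ Lam.T @ M @ Lam @ W)   # equals beta_kr[1]
```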
Variance of $\hat\beta_{KR}$
Starting with the identity
$$\mathrm{var}(\hat\beta_{1,KR}) = E[\mathrm{var}(\hat\beta_{1,KR} \mid W)] + \mathrm{var}[E(\hat\beta_{1,KR} \mid W)],$$
we get
$$\mathrm{var}(\hat\beta_{1,KR}) = \beta_1^2\left[E(Q_3 Q_2^{-2}) + E(Q_1^2 Q_2^{-2}) - \left(E(Q_1 Q_2^{-1})\right)^2\right] + E(Q_2^{-1}),$$
where
$$Q_1 = W'\Lambda'\Sigma_\epsilon^{-1}\Sigma_{XW}\Sigma_W^{-1} W + W'\Lambda'\Sigma_\epsilon^{-1}(1_{n\times 1} - \Sigma_{XW}\Sigma_W^{-1} 1_{m\times 1})\mu_X,$$
$$Q_2 = W'\Lambda'\Sigma_\epsilon^{-1}\Lambda W,$$
$$Q_3 = W'\Lambda'\Sigma_\epsilon^{-1}(\Sigma_X - \Sigma_{XW}\Sigma_W^{-1}\Sigma_{WX})\Sigma_\epsilon^{-1}\Lambda W.$$
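Continuing the sketch, the three quadratic forms translate directly into code. This follows our reconstruction of the $Q_j$ above and uses the true $\mu_X$ and the known covariance blocks, so treat it as illustrative only.

```python
# Quadratic forms Q1, Q2, Q3 feeding the variance formula for beta_1_KR.
Sigma_X = exp_cov(s, s, theta_x2, theta_x3)
LW = Lam @ W
Q1 = (LW @ Sigma_eps_inv @ (Sigma_XW @ Sigma_W_inv @ W)
      + LW @ Sigma_eps_inv @ (ones_n - Sigma_XW @ Sigma_W_inv @ ones_m) * mu_x)
Q2 = LW @ Sigma_eps_inv @ LW
Q3 = (LW @ Sigma_eps_inv
      @ (Sigma_X - Sigma_XW @ Sigma_W_inv @ Sigma_XW.T)
      @ Sigma_eps_inv @ LW)
```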
Estimating $\theta_X$ from $W$
We are assuming $W \sim N(\mu_X 1_{m\times 1}, \Sigma_W)$ and, with $h_{ij}$ = distance between the sampling locations of $W_i$ and $W_j$,
$$(\Sigma_W)_{ij} = \begin{cases} \theta_{X,1} + \theta_{X,2}, & h_{ij} = 0, \\ \theta_{X,2}\exp(-\theta_{X,3} h_{ij}), & h_{ij} > 0. \end{cases}$$
Estimate the parameters $\theta_X = (\theta_{X,1}, \theta_{X,2}, \theta_{X,3})'$ by Restricted Maximum Likelihood (REML).
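Continuing the sketch, REML for $\theta_X$ can be set up as a numerical minimization of the standard REML criterion for a constant-mean Gaussian model; the log parameterization for positivity is our choice, not the paper's.

```python
# REML for theta_X from W alone, under the nugget-plus-exponential model above.
from scipy.optimize import minimize

def neg_reml(log_theta, W, t):
    nugget, sill, decay = np.exp(log_theta)
    m = len(W)
    Sigma = exp_cov(t, t, sill, decay) + nugget * np.eye(m)
    Si = np.linalg.inv(Sigma)
    one = np.ones(m)
    mu_hat = (one @ Si @ W) / (one @ Si @ one)   # profiled BLUE of mu_X
    r = W - mu_hat
    _, logdet = np.linalg.slogdet(Sigma)
    # REML criterion: log|Sigma| + log(1'Sigma^{-1}1) + r'Sigma^{-1}r
    return logdet + np.log(one @ Si @ one) + r @ Si @ r

fit = minimize(neg_reml, x0=np.log([0.01, 1.0, 1.0]), args=(W, t),
               method="Nelder-Mead")
theta_x_hat = np.exp(fit.x)   # estimates of (theta_X1, theta_X2, theta_X3)
```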
Estimating $\theta_\epsilon$
Given $Y \sim N(\mathbf{X}\beta, \Sigma_\epsilon)$, we can't use REML because $X$ is not known. It may work to estimate $\theta_\epsilon$ from the approximate residuals $\hat\epsilon = Y - \hat{\mathbf{X}}\hat\beta$, where $\hat\beta$ is the unweighted KR estimate of $\beta$. We take $\Sigma_\epsilon = \sigma_\epsilon^2 I$, where $\sigma_\epsilon^2$ is a scalar.
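Continuing the sketch, a residual-based estimate of the error variance might look as follows; the degrees-of-freedom correction `n - 2` is our choice for the two regression coefficients.

```python
# Error-variance estimate from approximate residuals: OLS of Y on [1, X_hat]
# gives the unweighted KR fit; its residuals stand in for the unobservable errors.
beta_ols, *_ = np.linalg.lstsq(np.column_stack([ones_n, X_hat]), Y, rcond=None)
resid = Y - np.column_stack([ones_n, X_hat]) @ beta_ols
sigma_eps2_hat = resid @ resid / (n - 2)     # Sigma_eps_hat = sigma_eps2_hat * I
```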
Approximate Estimators
$$\hat X = \hat\mu_X 1_{n\times 1} + \hat\Sigma_{XW}\hat\Sigma_W^{-1}(W - \hat\mu_X 1_{m\times 1}),$$
$$\hat\beta_{KR} = (\hat{\mathbf{X}}'\hat{\mathbf{X}})^{-1}\hat{\mathbf{X}}'Y,$$
$$\widehat{\mathrm{var}}(\hat\beta_{1,KR}) = \hat\beta_{1,KR}^2\left[E(\hat Q_3 \hat Q_2^{-2}) + E(\hat Q_1^2 \hat Q_2^{-2}) - \left(E(\hat Q_1 \hat Q_2^{-1})\right)^2\right] + E(\hat Q_2^{-1}),$$
where $\hat Q_1$, $\hat Q_2$, and $\hat Q_3$ are as before except with $\Sigma_W$ and $\Sigma_{XW}$ estimated.
Point Estimates $\hat\beta_{1,KR}$
[Figure: boxplots of KR estimates of $\beta_1$ (range roughly −5 to 40), for known vs. estimated covariance parameters.]
Variance Estimates of $\hat\beta_{1,KR}$
[Figure: boxplots of $0.5\log(\text{estimated variance}/\text{true variance})$ (range roughly 0 to 30), for known vs. estimated covariance parameters.]
Naive Variance Estimates of $\hat\beta_{1,KR}$: $\widehat{\mathrm{var}}_{\text{naive}} = (\hat{\mathbf{X}}'\Sigma_\epsilon^{-1}\hat{\mathbf{X}})^{-1}$
[Figure: boxplot of naive variance estimates (range roughly 0.2 to 1.6), known covariance parameters.]
Naive Variance Estimates of $\hat\beta_{1,KR}$
[Figure: boxplot of log(naive variance estimates) (range roughly −4 to 6), unknown covariance parameters.]
Nominal 95% Confidence Intervals
$$\hat\beta_{KR} \pm 1.96\sqrt{\widehat{\mathrm{var}}}$$
Covariance parameters unknown: coverage 89.46%, average width 1.33e+10.
We will show
Consistency of $\hat\beta_{KR}$:
$$\sqrt{N}(\hat\beta_{KR} - \beta) \xrightarrow{D} N(0, \Sigma_L).$$
Why bother? If we find the maximum likelihood estimate $\hat\beta_{ML}$ by a Newton-Raphson maximization of the likelihood function with consistent parameter estimates as starting values, then
$$\sqrt{N}(\hat\beta_{ML} - \beta) \xrightarrow{D} N(0, E[I]^{-1}),$$
where $E[I]$ is the information matrix.
Consistency of $\hat\beta_{KR}$: Notation
Suppose we have $N$ iid observations
$$\begin{bmatrix} Y_1 \\ W_1 \end{bmatrix}, \ldots, \begin{bmatrix} Y_N \\ W_N \end{bmatrix},$$
where each $Y_i$ is $n\times 1$ and each $W_i$ is $m\times 1$.
[Figure: N replicate copies of the 10 × 10 sampling region.]
Consistency of $\hat\beta_{KR}$: Assumptions
Assume
$$\begin{bmatrix} X_i \\ W_i \end{bmatrix} \overset{iid}{\sim} N\left(\mu_X 1_{(n+m)\times 1},\; \begin{bmatrix} \Sigma_X & \Sigma_{XW} \\ \Sigma_{WX} & \Sigma_W \end{bmatrix}\right),$$
$$\epsilon_i \overset{iid}{\sim} N(0, \sigma^2 I),$$
and $Y_i = \mathbf{X}_i\beta + \epsilon_i$, where $\mathbf{X}_i = [1_{n\times 1}\ X_i]$ and $\beta = [\beta_0\ \beta_1]'$.
Consistency of $\hat\beta_{KR}$: More Assumptions
If
$$\hat\mu_X = \frac{1}{N}\sum_{i=1}^N (1_{1\times m}\hat\Sigma_W^{-1} 1_{m\times 1})^{-1} 1_{1\times m}\hat\Sigma_W^{-1} W_i,$$
$\theta_X$ is estimated by REML, $E(\hat{\mathbf{X}}_1'\hat{\mathbf{X}}_1)$ is invertible, and $\Sigma_{XW} \neq 0$, then
$$\sqrt{N}(\hat\beta_{KR} - \beta) \xrightarrow{D} N(0, \Sigma_L).$$
$\sqrt{N}$-consistency of $\hat\beta_{KR}$
The asymptotic covariance matrix of $\hat\beta_{KR}$ is
Σ_L = (covariance when only β is unknown) + (loss of efficiency for estimating μ_X) + (loss of efficiency for estimating θ_X).
Maximum Likelihood
With $N$ iid observations $[Y_i'\ W_i']'$, the negative log-likelihood is
$$-2l = N\log|\Sigma| + \sum_{i=1}^N V_i'\Sigma^{-1}V_i,$$
where $\Sigma$ is a block-diagonal matrix with copies of
$$\Sigma_1 = \begin{bmatrix} \beta_1^2\Sigma_X + \Sigma_\epsilon & \beta_1\Sigma_{XW} \\ \beta_1\Sigma_{WX} & \Sigma_W \end{bmatrix}$$
along the diagonal, and
$$V_i = \begin{bmatrix} Y_i - (\beta_0 + \beta_1\mu_X)1_{n\times 1} \\ W_i - \mu_X 1_{m\times 1} \end{bmatrix}.$$
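Continuing the sketch, the $-2$ log-likelihood for a single replicate ($N = 1$) is a direct translation of $\Sigma_1$ and $V_i$. We fix the nuggets at 0, as in the simulations below, so $\phi$ collapses to six parameters; this simplification is ours.

```python
# -2 log-likelihood for one replicate, up to an additive constant.
# phi = (beta0, beta1, mu_X, theta_X2, theta_X3, sigma_eps^2), nuggets fixed at 0.
def neg2_loglik(phi, Y, W, s, t):
    b0, b1, mu, x_sill, x_decay, e_var = phi
    n, m = len(Y), len(W)
    Sx = exp_cov(s, s, x_sill, x_decay)
    Sw = exp_cov(t, t, x_sill, x_decay)
    Sxw = exp_cov(s, t, x_sill, x_decay)
    # Joint covariance Sigma_1 of (Y, W) and the centered vector V:
    Sigma1 = np.block([[b1**2 * Sx + e_var * np.eye(n), b1 * Sxw],
                       [b1 * Sxw.T,                     Sw      ]])
    V = np.concatenate([Y - (b0 + b1 * mu), W - mu])
    _, logdet = np.linalg.slogdet(Sigma1)
    return logdet + V @ np.linalg.solve(Sigma1, V)
```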
Efficiency of the Maximum Likelihood Estimator
Let $\phi = [\beta_0\ \beta_1\ \mu_X\ \theta_X'\ \theta_\epsilon']'$. If we compute $\hat\phi_{ML}$ by a Newton-Raphson minimization of $-2l$ with consistent parameter estimates as initial values, then
$$\sqrt{N}(\hat\phi_{ML} - \phi) \xrightarrow{D} N(0, E[I(\phi)]^{-1}),$$
where
$$\{E[I(\phi)]\}_{ij} = E\left(-\frac{\partial^2 l}{\partial\phi_i\,\partial\phi_j}\right),$$
provided some regularity conditions are met.
A Variance Estimator for $\hat\beta_{ML}$
We can use an information-based variance estimator:
$$\widehat{\mathrm{var}}_1(\hat\phi_{ML}) = I^{-1}(\hat\phi_{ML})$$
or
$$\widehat{\mathrm{var}}_2(\hat\phi_{ML}) = [E(I)]^{-1}\big|_{\phi = \hat\phi_{ML}}.$$
$\widehat{\mathrm{var}}_1$ often fails to be positive definite in simulations.
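Continuing the sketch, one way to obtain $\widehat{\mathrm{var}}_1$ is to differentiate $-2l$ numerically at the fitted $\phi$; since $I(\phi)$ is half the Hessian of $-2l$, a finite-difference Hessian suffices. The stand-in `phi_hat` below uses the true parameters for illustration; in practice it would be the ML fit.

```python
# Observed-information variance estimate var_1 via finite differences.
def hessian_fd(f, x, eps=1e-4):
    p = len(x)
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            ei, ej = np.eye(p)[i] * eps, np.eye(p)[j] * eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
    return H

phi_hat = np.array([beta0, beta1, mu_x, theta_x2, theta_x3, sigma_eps2])
I_hat = 0.5 * hessian_fd(lambda p: neg2_loglik(p, Y, W, s, t), phi_hat)
var1 = np.linalg.inv(I_hat)   # check positive definiteness before trusting it
```

A failure of positive definiteness in `I_hat` is exactly the numerical problem the slide mentions for $\widehat{\mathrm{var}}_1$.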
Simulation Study
16 treatments with 75 replications each.
Fixed parameters: $\beta_0 = \beta_1 = \mu_X = 1$, $\theta_{X,1} = \theta_{\epsilon,1} = 0$.
Remaining parameters varied to define treatments:
$\theta_{\epsilon,2} \in \{0.1, 1\}$
$\theta_{\epsilon,3} \in \{0.5, 3\}$
$\theta_{X,2} \in \{0.25, 1\}$
$\theta_{X,3} \in \{0.5, 1.5\}$
$n = m = 50$, but $N = 1$.
Simulation Sampling Locations
[Figure: map of the simulation sampling locations on a 10 × 10 region.]
How $\theta_3$ Affects Covariance
$$C(h) = \exp(-\theta_3 h)$$
[Figure: exponential covariogram $C(h)$ versus distance $h$ (0 to 10) for $\theta_3 \in \{0.5, 1, 2.5, 5\}$; larger $\theta_3$ gives faster decay.]
Simulation Study Treatments
The 16 treatments are the $2^4$ factorial combinations of the larger (+) and smaller (−) values of $\theta_{\epsilon,2}$, $\theta_{\epsilon,3}$, $\theta_{X,2}$, and $\theta_{X,3}$; each factor takes its larger value in 8 of the 16 treatments, as sketched below.
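A minimal sketch of generating this treatment grid, using the parameter values from the slides above; the dictionary keys are our names for the four varied parameters.

```python
# Full 2^4 factorial over the four varied covariance parameters.
from itertools import product

levels = {"theta_eps2": (0.1, 1), "theta_eps3": (0.5, 3),
          "theta_x2": (0.25, 1), "theta_x3": (0.5, 1.5)}
treatments = [dict(zip(levels, combo)) for combo in product(*levels.values())]
assert len(treatments) == 16
```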
Simulation Study Variance Estimates
[Figure: for each of the 16 experiments, sgn(Est. Var.)·log(Est. Var.) of the variance estimates (range roughly −40 to 40), with log(Monte Carlo MSE) marked and points with Est. $\theta_{X,2}$/Est. $\theta_{X,3}$ < 0.075 highlighted.]
Simulation Study: KR vs. ML
[Figure: scatterplot of ML versus KR estimates of $\beta_1$ (both axes roughly −2 to 5). MSE(ML) = 6.816, MSE(KR) = 35.98; Bias(ML) = −0.075, Bias(KR) = 0.173.]
Conclusions
We have no good variance estimator for $\hat\beta_{KR}$.
The information-based variance estimator for $\hat\beta_{ML}$ appears useful, even when $N = 1$, provided there are no numerical problems.
ML yields more precise point estimates than KR.