REGRESSION WITH SPATIALLY MISALIGNED DATA
Lisa Madsen, Oregon State University
David Ruppert, Cornell University
SPATIALLY MISALIGNED DATA
[Figure: example of spatially misaligned sampling locations on a 10 × 10 region; X's mark observation sites.]
OUTLINE
1. Introduction
2. The Krige and Regress (KR) Estimator
3. The Maximum Likelihood (ML) Estimator
4. Simulation Results
5. Conclusions
Notation
Y = vector of responses observed at locations s_1, ..., s_n
X = vector of unobserved predictors at locations s_1, ..., s_n
W = vector of observations from the predictor process at locations t_1, ..., t_m
ε = error vector at locations s_1, ..., s_n
Assumptions
1. $Y = \beta_0 1_{n\times 1} + \beta_1 X + \epsilon$.
2. X, W, and ε are generated by spatially autocorrelated stationary Gaussian processes.
3. X and W are generated by the same spatial process.
4. X and ε are independent of each other.
5. Spatial autocorrelations are each given by a parametric model.
Model
$$Y = \beta_0 1_{n\times 1} + \beta_1 X + \epsilon, \qquad \epsilon \sim N_n(0, \Sigma_\epsilon),$$
$$\begin{bmatrix} X \\ W \end{bmatrix} \sim N_{n+m}\left(\mu_X 1_{(n+m)\times 1},\; \begin{bmatrix} \Sigma_X & \Sigma_{XW} \\ \Sigma_{WX} & \Sigma_W \end{bmatrix}\right).$$
$\Sigma_X$, $\Sigma_W$, and $\Sigma_{XW}$ depend on $\theta_X = (\theta_{X,1}, \theta_{X,2}, \theta_{X,3})'$; $\Sigma_\epsilon$ depends on $\theta_\epsilon = (\theta_{\epsilon,1}, \theta_{\epsilon,2}, \theta_{\epsilon,3})'$.
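As a concrete illustration, here is a minimal Python sketch of drawing one data set from this model. It assumes the exponential covariance used later in the talk, no nuggets, and $\Sigma_\epsilon = \sigma_\epsilon^2 I$; all names (`exp_cov`, `sigma_eps2`, etc.) are ours, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def exp_cov(locs1, locs2, sill, decay):
    # Exponential covariance sill * exp(-decay * h) for all pairs of locations.
    h = np.linalg.norm(locs1[:, None, :] - locs2[None, :, :], axis=2)
    return sill * np.exp(-decay * h)

n, m = 50, 50
beta0, beta1, mu_x = 1.0, 1.0, 1.0
theta_x2, theta_x3 = 1.0, 0.5        # sill and decay of the shared X/W process
sigma_eps2 = 0.1                     # Sigma_eps = sigma_eps2 * I (no nugget)

s = rng.uniform(0, 10, (n, 2))       # response locations s_1, ..., s_n
t = rng.uniform(0, 10, (m, 2))       # predictor locations t_1, ..., t_m

# X and W are a single draw from the same Gaussian process at the two
# (misaligned) location sets, so they are correlated but never co-located.
locs = np.vstack([s, t])
joint = exp_cov(locs, locs, theta_x2, theta_x3) + 1e-10 * np.eye(n + m)
xw = mu_x + np.linalg.cholesky(joint) @ rng.standard_normal(n + m)
X, W = xw[:n], xw[n:]

Y = beta0 + beta1 * X + np.sqrt(sigma_eps2) * rng.standard_normal(n)
```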
KRIGE AND REGRESS
Kriging:
$$\hat X = \hat\mu_X 1_{n\times 1} + \Sigma_{XW} \Sigma_W^{-1} (W - \hat\mu_X 1_{m\times 1}),$$
where $\hat\mu_X$ is the best linear unbiased estimator of $\mu_X$:
$$\hat\mu_X = \frac{1_{1\times m} \Sigma_W^{-1} W}{1_{1\times m} \Sigma_W^{-1} 1_{m\times 1}},$$
so that $\hat X = \Lambda W$, where $\Lambda$ depends on $\Sigma_W$ and $\Sigma_{XW}$.
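Continuing the simulation sketch above, the kriging step with known covariance parameters might look like this; the explicit form of $\Lambda$ is assembled so that $\hat X = \Lambda W$ exactly as on the slide.

```python
# Kriging step (known covariance parameters).
Sigma_W = exp_cov(t, t, theta_x2, theta_x3)
Sigma_XW = exp_cov(s, t, theta_x2, theta_x3)
Sigma_W_inv = np.linalg.inv(Sigma_W)
ones_m, ones_n = np.ones(m), np.ones(n)

# BLUE of mu_X: (1' Sigma_W^{-1} 1)^{-1} 1' Sigma_W^{-1} W = a'W.
a = Sigma_W_inv @ ones_m / (ones_m @ Sigma_W_inv @ ones_m)
mu_x_hat = a @ W

# X_hat = mu_x_hat * 1 + Sigma_XW Sigma_W^{-1} (W - mu_x_hat * 1) = Lambda @ W.
Lam = np.outer(ones_n, a) + Sigma_XW @ Sigma_W_inv @ (np.eye(m) - np.outer(ones_m, a))
X_hat = Lam @ W
```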
KRIGE AND REGRESS
Regression:
$$\hat\beta_{KR} = (\hat{\mathbf{X}}' \Sigma_\epsilon^{-1} \hat{\mathbf{X}})^{-1} \hat{\mathbf{X}}' \Sigma_\epsilon^{-1} Y, \qquad \hat{\mathbf{X}} = [1_{n\times 1}\ \hat X].$$
We can write
$$\hat\beta_{1,KR} = (W'\Lambda' M \Lambda W)^{-1} W'\Lambda' M Y,$$
where
$$M = \Sigma_\epsilon^{-1} - \Sigma_\epsilon^{-1} 1_{n\times 1} (1_{1\times n}\Sigma_\epsilon^{-1} 1_{n\times 1})^{-1} 1_{1\times n} \Sigma_\epsilon^{-1}.$$
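Continuing the sketch, the regression step can be written either as generalized least squares on the kriged design matrix or in the slope-only form using $M$; both give the same $\hat\beta_{1,KR}$.

```python
# Regression step of krige and regress.
Sigma_eps_inv = np.eye(n) / sigma_eps2
Xd = np.column_stack([ones_n, X_hat])            # design matrix [1  X_hat]
beta_kr = np.linalg.solve(Xd.T @ Sigma_eps_inv @ Xd,
                          Xd.T @ Sigma_eps_inv @ Y)

# Equivalent slope-only form via M (Sigma_eps^{-1} with the intercept swept out):
M = Sigma_eps_inv - (Sigma_eps_inv @ np.outer(ones_n, ones_n) @ Sigma_eps_inv
                     / (ones_n @ Sigma_eps_inv @ ones_n))
beta1_kr = (W @ Lam.T @ M @ Y) / (W @ Lam.T @ M @ Lam @ W)   # equals beta_kr[1]
```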
Variance of $\hat\beta_{KR}$
Starting with the identity
$$\mathrm{var}(\hat\beta_{1,KR}) = E[\mathrm{var}(\hat\beta_{1,KR} \mid W)] + \mathrm{var}[E(\hat\beta_{1,KR} \mid W)],$$
we get
$$\mathrm{var}(\hat\beta_{1,KR}) = \beta_1^2\left[E(Q_3 Q_2^{-2}) + E(Q_1^2 Q_2^{-2}) - \left(E(Q_1 Q_2^{-1})\right)^2\right] + E(Q_2^{-1}),$$
where
$$Q_1 = W'\Lambda'\Sigma_\epsilon^{-1}\Sigma_{XW}\Sigma_W^{-1} W + W'\Lambda'\Sigma_\epsilon^{-1}(1_{n\times 1} - \Sigma_{XW}\Sigma_W^{-1} 1_{m\times 1})\mu_X,$$
$$Q_2 = W'\Lambda'\Sigma_\epsilon^{-1}\Lambda W,$$
$$Q_3 = W'\Lambda'\Sigma_\epsilon^{-1}(\Sigma_X - \Sigma_{XW}\Sigma_W^{-1}\Sigma_{WX})\Sigma_\epsilon^{-1}\Lambda W.$$
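Continuing the sketch, the three quadratic forms translate directly into code. This follows our reconstruction of the $Q_j$ above and uses the true $\mu_X$ and the known covariance blocks, so treat it as illustrative only.

```python
# Quadratic forms Q1, Q2, Q3 feeding the variance formula for beta_1_KR.
Sigma_X = exp_cov(s, s, theta_x2, theta_x3)
LW = Lam @ W
Q1 = (LW @ Sigma_eps_inv @ (Sigma_XW @ Sigma_W_inv @ W)
      + LW @ Sigma_eps_inv @ (ones_n - Sigma_XW @ Sigma_W_inv @ ones_m) * mu_x)
Q2 = LW @ Sigma_eps_inv @ LW
Q3 = (LW @ Sigma_eps_inv
      @ (Sigma_X - Sigma_XW @ Sigma_W_inv @ Sigma_XW.T)
      @ Sigma_eps_inv @ LW)
```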
Estimating $\theta_X$ from $W$
We are assuming $W \sim N(\mu_X 1_{m\times 1}, \Sigma_W)$ and, with $h_{ij}$ = distance between the sampling locations of $W_i$ and $W_j$,
$$(\Sigma_W)_{ij} = \begin{cases} \theta_{X,1} + \theta_{X,2}, & h_{ij} = 0, \\ \theta_{X,2}\exp(-\theta_{X,3} h_{ij}), & h_{ij} > 0. \end{cases}$$
Estimate the parameters $\theta_X = (\theta_{X,1}, \theta_{X,2}, \theta_{X,3})'$ by Restricted Maximum Likelihood (REML).
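Continuing the sketch, REML for $\theta_X$ can be set up as a numerical minimization of the standard REML criterion for a constant-mean Gaussian model; the log parameterization for positivity is our choice, not the paper's.

```python
# REML for theta_X from W alone, under the nugget-plus-exponential model above.
from scipy.optimize import minimize

def neg_reml(log_theta, W, t):
    nugget, sill, decay = np.exp(log_theta)
    m = len(W)
    Sigma = exp_cov(t, t, sill, decay) + nugget * np.eye(m)
    Si = np.linalg.inv(Sigma)
    one = np.ones(m)
    mu_hat = (one @ Si @ W) / (one @ Si @ one)   # profiled BLUE of mu_X
    r = W - mu_hat
    _, logdet = np.linalg.slogdet(Sigma)
    # REML criterion: log|Sigma| + log(1'Sigma^{-1}1) + r'Sigma^{-1}r
    return logdet + np.log(one @ Si @ one) + r @ Si @ r

fit = minimize(neg_reml, x0=np.log([0.01, 1.0, 1.0]), args=(W, t),
               method="Nelder-Mead")
theta_x_hat = np.exp(fit.x)   # estimates of (theta_X1, theta_X2, theta_X3)
```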
Estimating $\theta_\epsilon$
Given $Y \sim N(\mathbf{X}\beta, \Sigma_\epsilon)$, we can't use REML because $X$ is not known. It may work to estimate $\theta_\epsilon$ from the approximate residuals $\hat\epsilon = Y - \hat{\mathbf{X}}\hat\beta$, where $\hat\beta$ is the unweighted KR estimate of $\beta$. We take $\Sigma_\epsilon = \sigma_\epsilon^2 I$, where $\sigma_\epsilon^2$ is a scalar.
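Continuing the sketch, a residual-based estimate of the error variance might look as follows; the degrees-of-freedom correction `n - 2` is our choice for the two regression coefficients.

```python
# Error-variance estimate from approximate residuals: OLS of Y on [1, X_hat]
# gives the unweighted KR fit; its residuals stand in for the unobservable errors.
beta_ols, *_ = np.linalg.lstsq(np.column_stack([ones_n, X_hat]), Y, rcond=None)
resid = Y - np.column_stack([ones_n, X_hat]) @ beta_ols
sigma_eps2_hat = resid @ resid / (n - 2)     # Sigma_eps_hat = sigma_eps2_hat * I
```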
Approximate Estimators
$$\hat X = \hat\mu_X 1_{n\times 1} + \hat\Sigma_{XW}\hat\Sigma_W^{-1}(W - \hat\mu_X 1_{m\times 1}),$$
$$\hat\beta_{KR} = (\hat{\mathbf{X}}'\hat{\mathbf{X}})^{-1}\hat{\mathbf{X}}'Y,$$
$$\widehat{\mathrm{var}}(\hat\beta_{1,KR}) = \hat\beta_{1,KR}^2\left[E(\hat Q_3 \hat Q_2^{-2}) + E(\hat Q_1^2 \hat Q_2^{-2}) - \left(E(\hat Q_1 \hat Q_2^{-1})\right)^2\right] + E(\hat Q_2^{-1}),$$
where $\hat Q_1$, $\hat Q_2$, and $\hat Q_3$ are as before except with $\Sigma_W$ and $\Sigma_{XW}$ estimated.
Point Estimates $\hat\beta_{1,KR}$
[Figure: boxplots of KR estimates of $\beta_1$ (range roughly −5 to 40), for known vs. estimated covariance parameters.]
Variance Estimates of $\hat\beta_{1,KR}$
[Figure: boxplots of $0.5\log(\text{estimated variance}/\text{true variance})$ (range roughly 0 to 30), for known vs. estimated covariance parameters.]
Naive Variance Estimates of $\hat\beta_{1,KR}$: $\widehat{\mathrm{var}}_{\text{naive}} = (\hat{\mathbf{X}}'\Sigma_\epsilon^{-1}\hat{\mathbf{X}})^{-1}$
[Figure: boxplot of naive variance estimates (range roughly 0.2 to 1.6), known covariance parameters.]
Naive Variance Estimates of $\hat\beta_{1,KR}$
[Figure: boxplot of log(naive variance estimates) (range roughly −4 to 6), unknown covariance parameters.]
Nominal 95% Confidence Intervals
$$\hat\beta_{KR} \pm 1.96\sqrt{\widehat{\mathrm{var}}}$$
Covariance parameters unknown: coverage 89.46%, average width 1.33e+10.
We will show
Consistency of $\hat\beta_{KR}$:
$$\sqrt{N}(\hat\beta_{KR} - \beta) \xrightarrow{D} N(0, \Sigma_L).$$
Why bother? If we find the maximum likelihood estimate $\hat\beta_{ML}$ by a Newton-Raphson maximization of the likelihood function with consistent parameter estimates as starting values, then
$$\sqrt{N}(\hat\beta_{ML} - \beta) \xrightarrow{D} N(0, E[I]^{-1}),$$
where $E[I]$ is the information matrix.
Consistency of $\hat\beta_{KR}$: Notation
Suppose we have $N$ iid observations
$$\begin{bmatrix} Y_1 \\ W_1 \end{bmatrix}, \ldots, \begin{bmatrix} Y_N \\ W_N \end{bmatrix},$$
where each $Y_i$ is $n\times 1$ and each $W_i$ is $m\times 1$.
[Figure: N replicate copies of the 10 × 10 sampling region.]
Consistency of $\hat\beta_{KR}$: Assumptions
Assume
$$\begin{bmatrix} X_i \\ W_i \end{bmatrix} \overset{iid}{\sim} N\left(\mu_X 1_{(n+m)\times 1},\; \begin{bmatrix} \Sigma_X & \Sigma_{XW} \\ \Sigma_{WX} & \Sigma_W \end{bmatrix}\right),$$
$$\epsilon_i \overset{iid}{\sim} N(0, \sigma^2 I),$$
and $Y_i = \mathbf{X}_i\beta + \epsilon_i$, where $\mathbf{X}_i = [1_{n\times 1}\ X_i]$ and $\beta = [\beta_0\ \beta_1]'$.
Consistency of $\hat\beta_{KR}$: More Assumptions
If
$$\hat\mu_X = \frac{1}{N}\sum_{i=1}^N (1_{1\times m}\hat\Sigma_W^{-1} 1_{m\times 1})^{-1} 1_{1\times m}\hat\Sigma_W^{-1} W_i,$$
$\theta_X$ is estimated by REML, $E(\hat{\mathbf{X}}_1'\hat{\mathbf{X}}_1)$ is invertible, and $\Sigma_{XW} \neq 0$, then
$$\sqrt{N}(\hat\beta_{KR} - \beta) \xrightarrow{D} N(0, \Sigma_L).$$
$\sqrt{N}$-consistency of $\hat\beta_{KR}$
The asymptotic covariance matrix of $\hat\beta_{KR}$ is
Σ_L = (covariance when only β is unknown) + (loss of efficiency for estimating μ_X) + (loss of efficiency for estimating θ_X).
Maximum Likelihood
With $N$ iid observations $[Y_i'\ W_i']'$, the negative log-likelihood is
$$-2l = N\log|\Sigma| + \sum_{i=1}^N V_i'\Sigma^{-1}V_i,$$
where $\Sigma$ is a block-diagonal matrix with copies of
$$\Sigma_1 = \begin{bmatrix} \beta_1^2\Sigma_X + \Sigma_\epsilon & \beta_1\Sigma_{XW} \\ \beta_1\Sigma_{WX} & \Sigma_W \end{bmatrix}$$
along the diagonal, and
$$V_i = \begin{bmatrix} Y_i - (\beta_0 + \beta_1\mu_X)1_{n\times 1} \\ W_i - \mu_X 1_{m\times 1} \end{bmatrix}.$$
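Continuing the sketch, the $-2$ log-likelihood for a single replicate ($N = 1$) is a direct translation of $\Sigma_1$ and $V_i$. We fix the nuggets at 0, as in the simulations below, so $\phi$ collapses to six parameters; this simplification is ours.

```python
# -2 log-likelihood for one replicate, up to an additive constant.
# phi = (beta0, beta1, mu_X, theta_X2, theta_X3, sigma_eps^2), nuggets fixed at 0.
def neg2_loglik(phi, Y, W, s, t):
    b0, b1, mu, x_sill, x_decay, e_var = phi
    n, m = len(Y), len(W)
    Sx = exp_cov(s, s, x_sill, x_decay)
    Sw = exp_cov(t, t, x_sill, x_decay)
    Sxw = exp_cov(s, t, x_sill, x_decay)
    # Joint covariance Sigma_1 of (Y, W) and the centered vector V:
    Sigma1 = np.block([[b1**2 * Sx + e_var * np.eye(n), b1 * Sxw],
                       [b1 * Sxw.T,                     Sw      ]])
    V = np.concatenate([Y - (b0 + b1 * mu), W - mu])
    _, logdet = np.linalg.slogdet(Sigma1)
    return logdet + V @ np.linalg.solve(Sigma1, V)
```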
Efficiency of the Maximum Likelihood Estimator
Let $\phi = [\beta_0\ \beta_1\ \mu_X\ \theta_X'\ \theta_\epsilon']'$. If we compute $\hat\phi_{ML}$ by a Newton-Raphson minimization of $-2l$ with consistent parameter estimates as initial values, then
$$\sqrt{N}(\hat\phi_{ML} - \phi) \xrightarrow{D} N(0, E[I(\phi)]^{-1}),$$
where
$$\{E[I(\phi)]\}_{ij} = E\left(-\frac{\partial^2 l}{\partial\phi_i\,\partial\phi_j}\right),$$
provided some regularity conditions are met.
A Variance Estimator for $\hat\beta_{ML}$
We can use an information-based variance estimator:
$$\widehat{\mathrm{var}}_1(\hat\phi_{ML}) = I^{-1}(\hat\phi_{ML})$$
or
$$\widehat{\mathrm{var}}_2(\hat\phi_{ML}) = [E(I)]^{-1}\big|_{\phi = \hat\phi_{ML}}.$$
$\widehat{\mathrm{var}}_1$ often fails to be positive definite in simulations.
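Continuing the sketch, one way to obtain $\widehat{\mathrm{var}}_1$ is to differentiate $-2l$ numerically at the fitted $\phi$; since $I(\phi)$ is half the Hessian of $-2l$, a finite-difference Hessian suffices. The stand-in `phi_hat` below uses the true parameters for illustration; in practice it would be the ML fit.

```python
# Observed-information variance estimate var_1 via finite differences.
def hessian_fd(f, x, eps=1e-4):
    p = len(x)
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            ei, ej = np.eye(p)[i] * eps, np.eye(p)[j] * eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps**2)
    return H

phi_hat = np.array([beta0, beta1, mu_x, theta_x2, theta_x3, sigma_eps2])
I_hat = 0.5 * hessian_fd(lambda p: neg2_loglik(p, Y, W, s, t), phi_hat)
var1 = np.linalg.inv(I_hat)   # check positive definiteness before trusting it
```

A failure of positive definiteness in `I_hat` is exactly the numerical problem the slide mentions for $\widehat{\mathrm{var}}_1$.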
Simulation Study
16 treatments with 75 replications each.
Fixed parameters: $\beta_0 = \beta_1 = \mu_X = 1$, $\theta_{X,1} = \theta_{\epsilon,1} = 0$.
Remaining parameters varied to define treatments:
$\theta_{\epsilon,2} \in \{0.1, 1\}$
$\theta_{\epsilon,3} \in \{0.5, 3\}$
$\theta_{X,2} \in \{0.25, 1\}$
$\theta_{X,3} \in \{0.5, 1.5\}$
$n = m = 50$, but $N = 1$.
Simulation Sampling Locations
[Figure: map of the simulation sampling locations on a 10 × 10 region.]
How $\theta_3$ Affects Covariance
$$C(h) = \exp(-\theta_3 h)$$
[Figure: exponential covariogram $C(h)$ versus distance $h$ (0 to 10) for $\theta_3 \in \{0.5, 1, 2.5, 5\}$; larger $\theta_3$ gives faster decay.]
Simulation Study Treatments
The 16 treatments are the $2^4$ factorial combinations of the larger (+) and smaller (−) values of $\theta_{\epsilon,2}$, $\theta_{\epsilon,3}$, $\theta_{X,2}$, and $\theta_{X,3}$; each factor takes its larger value in 8 of the 16 treatments, as sketched below.
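A minimal sketch of generating this treatment grid, using the parameter values from the slides above; the dictionary keys are our names for the four varied parameters.

```python
# Full 2^4 factorial over the four varied covariance parameters.
from itertools import product

levels = {"theta_eps2": (0.1, 1), "theta_eps3": (0.5, 3),
          "theta_x2": (0.25, 1), "theta_x3": (0.5, 1.5)}
treatments = [dict(zip(levels, combo)) for combo in product(*levels.values())]
assert len(treatments) == 16
```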
Simulation Study Variance Estimates
[Figure: for each of the 16 experiments, sgn(Est. Var.)·log(Est. Var.) of the variance estimates (range roughly −40 to 40), with log(Monte Carlo MSE) marked and points with Est. $\theta_{X,2}$/Est. $\theta_{X,3}$ < 0.075 highlighted.]
Simulation Study: KR vs. ML
[Figure: scatterplot of ML versus KR estimates of $\beta_1$ (both axes roughly −2 to 5). MSE(ML) = 6.816, MSE(KR) = 35.98; Bias(ML) = −0.075, Bias(KR) = 0.173.]
Conclusions
We have no good variance estimator for $\hat\beta_{KR}$.
The information-based variance estimator for $\hat\beta_{ML}$ appears useful, even when $N = 1$, provided there are no numerical problems.
ML yields more precise point estimates than KR.