Spatial smoothing using Gaussian processes
Chris Paciorek
paciorek@hsph.harvard.edu
August 5, 2004
OUTLINE
  Spatial smoothing and Gaussian processes
  Covariance modelling
  Nonstationary covariance modelling
  Examples with simulated and real data
SPATIAL DATA
  [Figure: two maps on longitude/latitude grids (axes: longridvals, latgridvals). Panels: Colorado annual precipitation; monthly PM10]
SPATIAL DATA AND MODELS
  Features of spatial data
    highly interactive, complex surfaces
    high df:n; heterogeneous coverage
    possible nonstationarity
    spatial model is often a component of a larger model
      spatiotemporal model with covariates
    possibly replicated
  Some standard models
    Thin-plate splines
    Basis function models with penalized coefficients
    Gaussian processes (kriging or Bayesian)
GAUSSIAN PROCESS DISTRIBUTION
  Infinite-dimensional joint distribution for $f(x)$, $x \in \mathcal{X}$
    Example: $f(\cdot)$ a spatial process, $\mathcal{X} = \mathbb{R}^2$
    $f(\cdot) \sim \mathrm{GP}(\mu(\cdot), C(\cdot, \cdot))$
  Finite-dimensional marginals are normal
  Types of covariance functions, $C(f(x_i), f(x_j))$:
    stationary, isotropic
    stationary, anisotropic
    nonstationary
  [Figure: pairs of locations $x_i$, $x_j$ in $\mathcal{X}$]
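Because every finite-dimensional marginal of a GP is multivariate normal, a draw of the process at a finite set of locations only needs the covariance matrix at those locations. The sketch below is one minimal way to do this with the Python standard library. It assumes a stationary, isotropic squared exponential covariance; the function names, the 5 x 5 grid, the parameter values and the small diagonal jitter are all illustrative choices, not part of the slides.

```python
import math
import random


def sq_exp_cov(x1, x2, sigma2=1.0, rho=0.2):
    """Stationary, isotropic squared exponential covariance (assumed form)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return sigma2 * math.exp(-d2 / rho ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def sample_gp(points, mean=0.0, jitter=1e-8, seed=0):
    """Draw f at the given points: f = mean + L z, with L L^T = covariance."""
    rng = random.Random(seed)
    n = len(points)
    # Covariance matrix at the points; the jitter is only for numerical stability.
    cov = [[sq_exp_cov(points[i], points[j]) + (jitter if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [mean + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


# Hypothetical 5 x 5 grid of locations in the unit square.
grid = [(i / 4, j / 4) for i in range(5) for j in range(5)]
draw = sample_gp(grid)
```

Swapping `sq_exp_cov` for an anisotropic or nonstationary covariance changes the kind of surface drawn while the sampling step stays the same.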
CORRELATION MODELLING
  $R(f(x_i), f(x_j)) = R\left(\tau_{ij}^2 / \rho^2\right)$, where $\tau_{ij}^2 = (x_i - x_j)^T (x_i - x_j)$
  [Figure: correlation against distance for $\rho = 0.2$ and $\rho = 0.04$; panel label: Correlation scale]
  Nonstationarity: correlation a function of location
  [Figure: two maps illustrating location-dependent correlation]
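To make the role of the correlation scale concrete, the sketch below evaluates a correlation that depends only on the squared distance between two locations divided by $\rho^2$, for the two values $\rho = 0.2$ and $\rho = 0.04$. The squared exponential form $\exp(-\tau^2/\rho^2)$ is an assumption made for illustration, as are the function name and the distances chosen.

```python
import math


def sq_exp_corr(xi, xj, rho):
    """Squared exponential correlation exp(-tau^2 / rho^2) (assumed form)."""
    tau2 = sum((a - b) ** 2 for a, b in zip(xi, xj))  # (xi - xj)^T (xi - xj)
    return math.exp(-tau2 / rho ** 2)


origin = (0.0, 0.0)
for dist in (0.0, 0.1, 0.2, 0.4):
    point = (dist, 0.0)
    # The smaller rho gives a correlation that decays much faster with distance.
    print(dist, sq_exp_corr(origin, point, 0.2), sq_exp_corr(origin, point, 0.04))
```

A single fixed $\rho$ gives the same decay everywhere in the domain, which is the restriction the nonstationary models relax.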
NONSTATIONARY COVARIANCE
  spatial deformation (Sampson, Guttorp & co-workers)
  mixtures of Gaussian processes (Fuentes) or splines (Wood et al.)
  kernel convolution method (Higdon, Swall, and Kern)
    $R^{NS}(f(x_i), f(x_j)) = c_{ij} \int_{\mathbb{R}^P} k_{x_i}(u)\, k_{x_j}(u)\, du$
    Guaranteed positive definite
    Gaussian kernels: $k_{x_i}(u) \propto \exp\left(-(u - x_i)^T \Sigma_i^{-1} (u - x_i)\right)$
    $R^{NS}(f(x_i), f(x_j)) = c_{ij} \exp\left(-(x_i - x_j)^T \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1} (x_i - x_j)\right)$
    $f(\cdot) \sim \mathrm{GP}(\mu, \sigma^2 R^{NS}(\cdot, \cdot))$
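The closed form for Gaussian kernels can be evaluated directly once a kernel matrix is attached to each location. The sketch below does this in two dimensions. Two things in it are assumptions rather than content of the slide: the normalizing constant is taken to be $c_{ij} = |\Sigma_i|^{1/4} |\Sigma_j|^{1/4} |(\Sigma_i + \Sigma_j)/2|^{-1/2}$, chosen so that each location has correlation 1 with itself, and `kernel_matrix` is a hypothetical rule that makes the kernels wider toward the right of the unit square.

```python
import math


def kernel_matrix(x):
    """Hypothetical kernel matrix: isotropic, scale growing along the first axis."""
    s2 = (0.05 + 0.3 * x[0]) ** 2
    return [[s2, 0.0], [0.0, s2]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def quad_form_inv(m, d):
    """Quadratic form d^T m^{-1} d for a 2x2 matrix m."""
    det = det2(m)
    inv = [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]
    return sum(d[r] * inv[r][c] * d[c] for r in range(2) for c in range(2))


def ns_corr(xi, xj):
    """Nonstationary correlation c_ij * exp(-Q_ij) for Gaussian kernels."""
    si, sj = kernel_matrix(xi), kernel_matrix(xj)
    avg = [[(si[r][c] + sj[r][c]) / 2 for c in range(2)] for r in range(2)]
    # Assumed normalizing constant (not spelled out on the slide).
    c_ij = det2(si) ** 0.25 * det2(sj) ** 0.25 / math.sqrt(det2(avg))
    d = [xi[0] - xj[0], xi[1] - xj[1]]
    return c_ij * math.exp(-quad_form_inv(avg, d))


points = [(i / 3, j / 3) for i in range(4) for j in range(4)]
corr = [[ns_corr(p, q) for q in points] for p in points]

# Same separation, different locations: small kernels on the left, large on the right.
left_pair = ns_corr((0.0, 0.0), (0.0, 1 / 3))
right_pair = ns_corr((1.0, 0.0), (1.0, 1 / 3))
```

With this rule, `right_pair` is much larger than `left_pair` even though both pairs are the same distance apart, which is what correlation as a function of location means in practice.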
NONSTATIONARY GPS IN 2D
  [Figure: two panels, Kernel Structure and Sample Function]
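GENERALIZING THE HSK KERNEL METHOD
  Squared exponential form:
    $\exp\left(-\left(\frac{\tau_{ij}}{\rho}\right)^2\right) \Rightarrow c_{ij} \exp\left(-(x_i - x_j)^T \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1} (x_i - x_j)\right)$
  Theorem 1: if $R(\tau)$ is positive definite for $\mathbb{R}^P$, $P = 1, 2, \ldots$, then $R^{NS}(f(x_i), f(x_j)) = c_{ij} R\left(\sqrt{Q_{ij}}\right)$ is positive definite for $\mathbb{R}^P$, $P = 1, 2, \ldots$, where $Q_{ij} = (x_i - x_j)^T \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1} (x_i - x_j)$
  Theorem 2: smoothness properties of original stationary correlation retained
  Specific case of Matérn nonstationary covariance:
    $\frac{1}{\Gamma(\nu) 2^{\nu - 1}} \left(\frac{2 \sqrt{\nu}\, \tau_{ij}}{\rho}\right)^{\nu} K_{\nu}\left(\frac{2 \sqrt{\nu}\, \tau_{ij}}{\rho}\right) \Rightarrow c_{ij} \frac{1}{\Gamma(\nu) 2^{\nu - 1}} \left(2 \sqrt{\nu Q_{ij}}\right)^{\nu} K_{\nu}\left(2 \sqrt{\nu Q_{ij}}\right)$
  More flexible form, differentiability not constrained, possible asymptotic advantages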
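The substitution in Theorem 1 can be tried out on a Matérn correlation without a Bessel function routine by fixing $\nu = 3/2$, where the Matérn form above reduces to $(1 + z) e^{-z}$ with $z = 2\sqrt{\nu}\,\tau/\rho$. The sketch below works in one dimension, so each kernel matrix is a single variance. The choice $\nu = 3/2$, the form of $c_{ij}$ (picked so that a location has correlation 1 with itself) and the function names are assumptions made for illustration.

```python
import math


def matern32(z):
    """Matern correlation for nu = 3/2, written in terms of z = 2*sqrt(nu)*distance."""
    return (1.0 + z) * math.exp(-z)


def stationary_matern32(tau, rho):
    """Stationary case: scaled distance is tau / rho."""
    return matern32(2.0 * math.sqrt(1.5) * tau / rho)


def ns_matern32(xi, xj, kernel_var):
    """Nonstationary case in 1D: replace tau / rho by sqrt(Q_ij), multiply by c_ij."""
    vi, vj = kernel_var(xi), kernel_var(xj)
    avg = (vi + vj) / 2.0
    q_ij = (xi - xj) ** 2 / avg               # quadratic form Q_ij in one dimension
    c_ij = (vi * vj) ** 0.25 / math.sqrt(avg)  # assumed normalizing constant
    return c_ij * matern32(2.0 * math.sqrt(1.5 * q_ij))


def constant_var(x):
    """Constant kernel variance rho^2 with rho = 0.2: should recover the stationary case."""
    return 0.2 ** 2


def varying_var(x):
    """Hypothetical kernel variance that increases with location."""
    return (0.05 + 0.3 * x) ** 2


same = ns_matern32(0.1, 0.4, constant_var)
base = stationary_matern32(0.3, 0.2)
varied = ns_matern32(0.1, 0.4, varying_var)
```

With a constant kernel variance the nonstationary form collapses to the stationary one (`same` and `base` agree up to rounding), which is a quick sanity check on the substitution.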
A BASIC BAYESIAN SPATIAL MODEL
  Model:
    $Y_i \sim N(f(x_i), \eta^2)$, $x_i \in \mathbb{R}^2$
    $f(\cdot) \sim \mathrm{GP}(\mu, \sigma^2 R^{NS}(\cdot, \cdot; \nu, \Sigma_{(\cdot)}))$
  Let $R^{NS}$ be the nonstationary Matérn correlation
  Kernels ($\Sigma_x$) constructed based on stationary GP priors
  Define multiple kernel matrices, $\Sigma_x$, $x \in \mathcal{X}$
    Smoothly-varying (element-wise) in covariate space
    Positive definite
  MCMC sampling of all parameters including those determining $\Sigma_{(\cdot)}$
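Conditional on the hyperparameters, the model above is a normal likelihood with a normal prior on $f$ at the observed locations, so the posterior mean of $f$ there is $\mu + C (C + \eta^2 I)^{-1} (y - \mu)$ with $C = \sigma^2 R$. The sketch below computes that smoother for fixed hyperparameters. It does not implement the MCMC over hyperparameters or the kernel matrices, and it is simplified in two ways that are assumptions: the locations are one-dimensional, and a stationary squared exponential correlation stands in for the nonstationary Matérn. The data are simulated purely for illustration.

```python
import math
import random


def corr(xi, xj, rho=0.3):
    """Stand-in stationary correlation (the slides use a nonstationary Matern)."""
    return math.exp(-((xi - xj) / rho) ** 2)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def posterior_mean(xs, ys, mu=0.0, sigma2=1.0, eta2=0.1):
    """Posterior mean of f at the observed locations for fixed hyperparameters."""
    n = len(xs)
    c = [[sigma2 * corr(xs[i], xs[j]) for j in range(n)] for i in range(n)]
    c_noise = [[c[i][j] + (eta2 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    w = solve(c_noise, [y - mu for y in ys])  # (C + eta^2 I)^{-1} (y - mu)
    return [mu + sum(c[i][j] * w[j] for j in range(n)) for i in range(n)]


# Simulated data: a smooth signal plus noise with variance eta^2 = 0.1.
rng = random.Random(1)
xs = [i / 19 for i in range(20)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, math.sqrt(0.1)) for x in xs]
fitted = posterior_mean(xs, ys)
```

In a full Bayesian fit this calculation would sit inside each MCMC iteration, with the correlation matrix rebuilt whenever the hyperparameters or kernel matrices change.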
EMPIRICAL ASSESSMENT - SIMULATED FUNCTIONS
  [Figure: simulated test functions 1 and 2, each shown with test MSE by method: krige, tps, GAM, sgp, nsgp, fks, nn]
COLORADO PRECIPITATION CHARACTERIZATION
  [Figure: map of Colorado, longitude 109 to 102, latitude 37 to 41]
COLORADO PRECIPITATION SURFACES
  [Figure: fitted precipitation surfaces over Colorado. Panels: stationary form; nonstationary form]
EMPIRICAL ASSESSMENT - COLORADO PRECIPITATION
  [Figure: test MSE by method: krige, krige2, GAM, GAM2, sgp, nsgp, fks, nn]
  Comments
    surface is complex relative to number of obs; high df:n ratio
    bias-variance tradeoff
    simple two-region nonstationary models offer little improvement
FUTURE WORK
  Further implementation of nonstationary model:
    Approaches for constraining hyperparameters for better fits and mixing
    Computationally tractable approaches for multiple realizations
    Ad hoc approaches for fitting the nonstationary covariance structure
  Criteria for choosing a spatial smoother based on data spacing, df:n, signal to noise ratio
  Fourier basis representation for spatial processes for fast Bayesian estimation with non-normal data or complicated models
NONSTATIONARY GPS IN 1D
  [Figure: panels titled Kernel standard deviation, Some kernels, and Some sample functions]
SMOOTHLY-VARYING KERNEL MATRICES
  Spectral decomposition ($\mathbb{R}^P$): $\Sigma_x = \Gamma_x^T \Lambda_x \Gamma_x$
  in $\mathbb{R}^2$, stationary GP priors on unnormalized eigenvector coordinates ($a_x$, $b_x$) and on logarithm of second eigenvalue ($\lambda_{x,2}$)
  efficient parameterizations of $\Phi(\cdot) \in \{a(\cdot), b(\cdot), \lambda_2(\cdot)\}$ using basis function approximation to a stationary GP
  [Figure: diagram labeled $(a, b)$, $\lambda_1$, $\lambda_2$]
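The spectral parameterization guarantees a positive definite kernel matrix at every location, because the eigenvalues are positive by construction. The sketch below builds $\Sigma_x = \Gamma_x^T \Lambda_x \Gamma_x$ in two dimensions from an unnormalized eigenvector $(a, b)$ and the log of the second eigenvalue. Setting the first eigenvalue to the squared length $a^2 + b^2$ is an assumption made for illustration, since the slide does not say how $\lambda_1$ is determined. The smooth functions in `params_at` are hypothetical stand-ins for draws from the stationary GP priors.

```python
import math


def kernel_matrix(a, b, log_lambda2):
    """Build Sigma = Gamma^T Lambda Gamma from (a, b) and log(lambda_2)."""
    d = math.hypot(a, b)
    # Rows of gamma are the orthonormal eigenvectors (a, b)/d and (-b, a)/d.
    gamma = [[a / d, b / d], [-b / d, a / d]]
    # Assumption: first eigenvalue is the squared length of (a, b).
    lam = [d * d, math.exp(log_lambda2)]
    return [[sum(gamma[k][r] * lam[k] * gamma[k][c] for k in range(2))
             for c in range(2)] for r in range(2)]


def params_at(x):
    """Hypothetical smooth fields a(x), b(x), log lambda_2(x) on the unit square."""
    a = 0.2 + 0.3 * x[0]
    b = 0.1 * math.sin(2 * math.pi * x[1])
    log_lambda2 = -3.0 + x[0]
    return a, b, log_lambda2


locations = [(i / 4, j / 4) for i in range(5) for j in range(5)]
kernels = [kernel_matrix(*params_at(x)) for x in locations]

example = kernel_matrix(1.0, 1.0, 0.0)  # eigenvalues 2 and 1
```

Because `a`, `b` and `log_lambda2` vary smoothly with location, so does each element of the resulting matrix, which matches the element-wise smoothness the kernels are meant to have.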
DEGREE OF SMOOTHING
  [Figure: panels of y against x for $\rho = 0.2$ and $\rho = 0.04$]