Bayesian inference & process convolution models Dave Higdon, Statistical Sciences Group, LANL


2 MOVING AVERAGE SPATIAL MODELS

3 Kernel basis representation for spatial processes z(s)

Define m basis functions $k_1(s), \ldots, k_m(s)$. Here $k_j(s)$ is a normal density centered at spatial location $\omega_j$:
$$k_j(s) = \frac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(s - \omega_j)^2\}$$
Set $z(s) = \sum_{j=1}^m k_j(s)\,x_j$ where $x \sim N(0, I_m)$.
Can represent $z = (z(s_1), \ldots, z(s_n))^T$ as $z = Kx$, where $K_{ij} = k_j(s_i)$.

[Figure: the m basis functions over s.]
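To make the construction concrete, here is a minimal NumPy sketch of the basis matrix $K$ and one draw of $z = Kx$; the evaluation grid, knot locations, and kernel width below are illustrative choices, not the slides' exact settings.

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 50)          # locations where z(s) is evaluated (illustrative)
omega = np.linspace(-0.3, 1.2, 6)      # knot locations omega_j (illustrative)
sd = 0.3                               # kernel width

# K_ij = k_j(s_i): normal density centered at omega_j, evaluated at s_i
K = np.exp(-0.5 * ((s[:, None] - omega[None, :]) / sd) ** 2) / (np.sqrt(2 * np.pi) * sd)

x = rng.standard_normal(omega.size)    # x ~ N(0, I_m)
z = K @ x                              # one realization of z = (z(s_1), ..., z(s_n))^T
```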

4 x and k(s) determine the spatial process z(s)

[Figure: scaled basis functions $k_j(s)x_j$ and the resulting process $z(s)$.]

Continuous representation: $z(s) = \sum_{j=1}^m k_j(s)\,x_j$ where $x \sim N(0, I_m)$.
Discrete representation: for $z = (z(s_1), \ldots, z(s_n))^T$, $z = Kx$ where $K_{ij} = k_j(s_i)$.

5 Estimate z(s) by specifying k_j(s) and estimating x

[Figure: fitted basis functions $k_j(s)x_j$, data $y(s)$, and the fitted process $z(s)$.]

Discrete representation: for $z = (z(s_1), \ldots, z(s_n))^T$, $z = Kx$ where $K_{ij} = k_j(s_i)$.
This gives a standard regression model: $y = Kx + \epsilon$.

6 Formulation for the 1-d example

Data $y = (y(s_1), \ldots, y(s_n))^T$ observed at locations $s_1, \ldots, s_n$. Once the knot locations $\omega_j$, $j = 1, \ldots, m$, and the kernel choice $k(s)$ are specified, the remaining model formulation is trivial.

Likelihood:
$$L(y \mid x, \lambda_y) \propto \lambda_y^{n/2} \exp\{-\tfrac{1}{2}\lambda_y (y - Kx)^T (y - Kx)\}$$
where $K_{ij} = k(\omega_j - s_i)$.

Priors:
$$\pi(x \mid \lambda_x) \propto \lambda_x^{m/2} \exp\{-\tfrac{1}{2}\lambda_x x^T x\}, \qquad
\pi(\lambda_x) \propto \lambda_x^{a_x - 1} \exp\{-b_x \lambda_x\}, \qquad
\pi(\lambda_y) \propto \lambda_y^{a_y - 1} \exp\{-b_y \lambda_y\}$$

Posterior:
$$\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y[b_y + \tfrac12 (y - Kx)^T(y - Kx)]\}\;
\lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x[b_x + \tfrac12 x^T x]\}$$

7 Posterior and full conditionals

Posterior:
$$\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y[b_y + \tfrac12 (y - Kx)^T(y - Kx)]\}\;
\lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x[b_x + \tfrac12 x^T x]\}$$

Full conditionals:
$$\pi(x \mid \cdot) \propto \exp\{-\tfrac12[\lambda_y x^T K^T K x - 2\lambda_y x^T K^T y + \lambda_x x^T x]\}$$
$$\pi(\lambda_x \mid \cdot) \propto \lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x[b_x + \tfrac12 x^T x]\}, \qquad
\pi(\lambda_y \mid \cdot) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y[b_y + \tfrac12 (y - Kx)^T(y - Kx)]\}$$

Gibbs sampler implementation:
$$x \mid \cdot \sim N\big((\lambda_y K^TK + \lambda_x I_m)^{-1}\lambda_y K^T y,\ (\lambda_y K^TK + \lambda_x I_m)^{-1}\big)$$
$$\lambda_x \mid \cdot \sim \Gamma(a_x + \tfrac m2,\ b_x + \tfrac12 x^Tx), \qquad
\lambda_y \mid \cdot \sim \Gamma(a_y + \tfrac n2,\ b_y + \tfrac12 (y - Kx)^T(y - Kx))$$
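These three updates translate directly into code. A minimal Gibbs sampler sketch in Python/NumPy with synthetic data; the data values, knot layout, and the hyperparameter settings $a_x, b_x, a_y, b_y$ are assumptions chosen for illustration, not the slides' exact choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic 1-d data and kernel basis matrix K_ij = k(omega_j - s_i)
s = np.array([0.05, 0.25, 0.52, 0.65, 0.91])           # data locations
omega = np.linspace(-0.3, 1.2, 6)                      # knot locations (illustrative layout)
sd = 0.3                                               # kernel width
y = np.sin(2 * np.pi * s) + 0.2 * rng.standard_normal(s.size)
K = np.exp(-0.5 * ((s[:, None] - omega[None, :]) / sd) ** 2) / (np.sqrt(2 * np.pi) * sd)

n, m = K.shape
a_x, b_x = 1.0, 0.005                                  # vague prior on lambda_x (assumed values)
a_y, b_y = 10.0, 10.0 * 0.25**2                        # strong prior near lambda_y = 1/0.25^2
x, lam_x, lam_y = np.zeros(m), 1.0, 1.0

x_draws = []
for t in range(2000):
    # x | lambdas, y ~ N((lam_y K'K + lam_x I)^{-1} lam_y K'y, (lam_y K'K + lam_x I)^{-1})
    Sigma = np.linalg.inv(lam_y * K.T @ K + lam_x * np.eye(m))
    x = rng.multivariate_normal(Sigma @ (lam_y * K.T @ y), Sigma)
    # lambda_x | x ~ Gamma(a_x + m/2, b_x + x'x/2); numpy's gamma takes (shape, scale), scale = 1/rate
    lam_x = rng.gamma(a_x + m / 2, 1.0 / (b_x + 0.5 * x @ x))
    # lambda_y | x, y ~ Gamma(a_y + n/2, b_y + (y - Kx)'(y - Kx)/2)
    r = y - K @ x
    lam_y = rng.gamma(a_y + n / 2, 1.0 / (b_y + 0.5 * r @ r))
    x_draws.append(x)

z_hat = K @ np.mean(x_draws[100:], axis=0)             # posterior mean of z at the data sites
```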

8 Gibbs sampler: intuition

Gibbs sampler for a bivariate normal density
$$\pi(z) = \pi(z_1, z_2) \propto \exp\Big\{-\tfrac12\,(z_1\ z_2)\begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}^{-1}\begin{pmatrix}z_1\\ z_2\end{pmatrix}\Big\}$$

Full conditionals of $\pi(z)$:
$$z_1 \mid z_2 \sim N(\rho z_2,\ 1 - \rho^2), \qquad z_2 \mid z_1 \sim N(\rho z_1,\ 1 - \rho^2)$$

Initialize the chain with $z^0 \sim N\Big(\binom{0}{0}, \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\Big)$. Draw $z_1^1 \sim N(\rho z_2^0,\ 1 - \rho^2)$; now $(z_1^1, z_2^0)^T \sim \pi(z)$. Then draw $z_2^1 \sim N(\rho z_1^1,\ 1 - \rho^2)$, giving $(z_1^1, z_2^1) = z^1$.

9 Gibbs sampler: intuition (continued)

The Gibbs sampler gives $z^0, z^1, \ldots, z^T$, which can be treated as dependent draws from $\pi(z)$. If $z^0$ is not a draw from $\pi(z)$, then the initial realizations will not have the correct distribution; in practice the first 100 or 1000 realizations are discarded as burn-in.

The draws can be used to make inference about $\pi(z)$:
- posterior mean of $z$ estimated by $\hat\mu = \frac1T\sum_{k=1}^T \binom{z_1^k}{z_2^k}$
- posterior probabilities $P(z_1 > 1) \approx \frac1T\sum_{k=1}^T I[z_1^k > 1]$ and $P(z_1 > z_2) \approx \frac1T\sum_{k=1}^T I[z_1^k > z_2^k]$
- 90% interval: $(z_1^{[5\%]},\ z_1^{[95\%]})$
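A short sketch of this bivariate-normal Gibbs sampler and the posterior summaries it supports; the value of $\rho$ and the chain length are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, T, burn = 0.9, 5000, 100
sd = np.sqrt(1 - rho**2)

z = np.empty((T + 1, 2))
z[0] = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]])   # start from pi(z)
for t in range(1, T + 1):
    z1 = rng.normal(rho * z[t - 1, 1], sd)   # z1 | z2 ~ N(rho z2, 1 - rho^2)
    z2 = rng.normal(rho * z1, sd)            # z2 | z1 ~ N(rho z1, 1 - rho^2)
    z[t] = z1, z2

post = z[burn:]
print(post.mean(axis=0))                     # estimate of the posterior mean
print(np.mean(post[:, 0] > 1))               # P(z1 > 1)
print(np.mean(post[:, 0] > post[:, 1]))      # P(z1 > z2)
print(np.percentile(post[:, 0], [5, 95]))    # 90% interval for z1
```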

10 1-d example

$m = 6$ knots evenly spaced between 0.3 and 1.2; $n = 5$ data points at $s = 0.05, 0.25, 0.52, 0.65, 0.91$; $k(s)$ is $N(0, \mathrm{sd} = 0.3)$.
$a_y = 10$, $b_y = 10\,(0.25^2)$: a strong prior at $\lambda_y = 1/0.25^2$; $a_x = 1$, $b_x$ chosen small (a vague prior on $\lambda_x$).

[Figure: basis functions $k_j(s)$ and data $y(s)$; MCMC traces of the $x_j$ and of $\lambda_x, \lambda_y$ over iterations.]

11 1-d example

From posterior realizations of the knot weights $x$, one can construct posterior realizations of the smooth fitted function $z(s) = \sum_{j=1}^m k_j(s)\,x_j$. Note that the strong prior on $\lambda_y$ is required since $n$ is small.

[Figure: posterior realizations of z(s); posterior mean and pointwise 90% bounds, with the data y(s).]

12 1-d example

$m = 20$ knots evenly spaced between 2 and 12; $n = 18$ data points evenly spaced between 0 and 10; $k(s)$ is $N(0, \mathrm{sd} = 2)$.

[Figure: data; posterior mean and 80% CI for z(s); posterior distributions for lambda_y and lambda_x.]

13 1-d example

$m = 20$ knots evenly spaced between 2 and 12; $n = 18$ data points evenly spaced between 0 and 10; $k(s)$ is $N(0, \mathrm{sd} = 1)$.

[Figure: data; posterior mean and 80% CI for z(s); posterior distributions for lambda_y and lambda_x.]

14 Basis representations for spatial processes z(s)

Represent $z(s)$ at spatial locations $s_1, \ldots, s_n$: $z = (z(s_1), \ldots, z(s_n))^T \sim N(0, \Sigma_z)$. Recall $z = Kx$, where $KK^T = \Sigma_z$ and $x \sim N(0, I_n)$. This gives a discrete representation of $z(s)$ at the locations $s_1, \ldots, s_n$.

Discrete representation: $z(s_i) = \sum_{j=1}^n K_{ij}x_j$. Continuous representation: $z(s) = \sum_{j=1}^n k_j(s)\,x_j$.

[Figure: basis functions for the discrete and continuous representations.]

Columns of $K$ give basis functions. Can use a subset of these basis functions to reduce dimensionality.

15 Decompositions: basis functions and implied covariance

[Figure: basis functions over location s and the corresponding covariance, for the Cholesky (with pivoting), SVD, and kernel decompositions.]

16 How many basis kernels?

Define m basis functions $k_1(s), \ldots, k_m(s)$: is $m = 20$, $m = 10$, or $m = 6$ enough?

[Figure: basis functions over location s and the implied covariance for m = 20, 10, and 6.]

17 Moving average specifications for spatial models z(s)

- smoothing kernel $k(s)$
- white noise process $x = (x_1, \ldots, x_m)$ at spatial locations $\omega_1, \ldots, \omega_m$, with $x \sim N(0, \tfrac{1}{\lambda_x} I_m)$
- spatial process $z(s)$ constructed by convolving $x$ with the smoothing kernel $k(s)$:
$$z(s) = \sum_{j=1}^m x_j\,k(\omega_j - s)$$
- $z(s)$ is a Gaussian process with mean 0 and covariance
$$\mathrm{Cov}(z(s), z(s')) = \frac{1}{\lambda_x}\sum_{j=1}^m k(\omega_j - s)\,k(\omega_j - s')$$
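The induced covariance can be evaluated directly from the kernel and the knot locations. A small sketch; the knot layout, kernel width, and precision value are illustrative assumptions.

```python
import numpy as np

def k_normal(u, sd=1.0):
    """Normal smoothing kernel k(u)."""
    return np.exp(-0.5 * (u / sd) ** 2) / (np.sqrt(2 * np.pi) * sd)

omega = np.linspace(-2.0, 12.0, 29)    # knot locations (illustrative)
lam_x = 1.0                            # precision of the white-noise knot values

def cov_z(s1, s2, sd=1.0):
    """Cov(z(s1), z(s2)) = (1/lam_x) * sum_j k(omega_j - s1) k(omega_j - s2)."""
    return np.sum(k_normal(omega - s1, sd) * k_normal(omega - s2, sd)) / lam_x

s_grid = np.linspace(0.0, 10.0, 21)
C = np.array([[cov_z(a, b) for b in s_grid] for a in s_grid])   # covariance of z on the grid
```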


19 Example: constructing 1-d models for z(s)

$m = 20$ knot locations $\omega_1, \ldots, \omega_m$ equally spaced between 2 and 12; $x = (x_1, \ldots, x_m)^T \sim N(0, I_m)$;
$$z(s) = \sum_{k=1}^m k(\omega_k - s)\,x_k$$
$k(s)$ is a normal density with sd $= \tfrac14$, $\tfrac12$, and 1; the 4th frame uses $k(s) = 1.4\exp\{-1.4|s|\}$.

[Figure: four frames showing z(s) and x(s) over s for the different kernels.]

General points:
- smooth kernels are required
- knot spacing depends on kernel width: spacing of about 1 sd for a normal $k(s)$
- kernel width is equivalent to the scale parameter in GP models


21 MRF formulation for the 1-d example

Data $y = (y(s_1), \ldots, y(s_n))^T$ observed at locations $s_1, \ldots, s_n$. Once the knot locations $\omega_j$, $j = 1, \ldots, m$, and the kernel choice $k(s)$ are specified, the remaining model formulation is trivial.

Likelihood:
$$L(y \mid x, \lambda_y) \propto \lambda_y^{n/2} \exp\{-\tfrac12\lambda_y (y - Kx)^T (y - Kx)\}$$
where $K_{ij} = k(\omega_j - s_i)$.

Priors:
$$\pi(x \mid \lambda_x) \propto \lambda_x^{m/2} \exp\{-\tfrac12\lambda_x x^T W x\}, \qquad
\pi(\lambda_x) \propto \lambda_x^{a_x - 1} \exp\{-b_x \lambda_x\}, \qquad
\pi(\lambda_y) \propto \lambda_y^{a_y - 1} \exp\{-b_y \lambda_y\}$$

Posterior:
$$\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y[b_y + \tfrac12 (y - Kx)^T(y - Kx)]\}\;
\lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x[b_x + \tfrac12 x^T W x]\}$$

22 Posterior and full conditionals (MRF formulation)

Posterior:
$$\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y[b_y + \tfrac12 (y - Kx)^T(y - Kx)]\}\;
\lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x[b_x + \tfrac12 x^T W x]\}$$

Full conditionals:
$$\pi(x \mid \cdot) \propto \exp\{-\tfrac12[\lambda_y x^T K^T K x - 2\lambda_y x^T K^T y + \lambda_x x^T W x]\}$$
$$\pi(\lambda_x \mid \cdot) \propto \lambda_x^{a_x + m/2 - 1} \exp\{-\lambda_x[b_x + \tfrac12 x^T W x]\}, \qquad
\pi(\lambda_y \mid \cdot) \propto \lambda_y^{a_y + n/2 - 1} \exp\{-\lambda_y[b_y + \tfrac12 (y - Kx)^T(y - Kx)]\}$$

Gibbs sampler implementation:
$$x \mid \cdot \sim N\big((\lambda_y K^TK + \lambda_x W)^{-1}\lambda_y K^T y,\ (\lambda_y K^TK + \lambda_x W)^{-1}\big)$$
$$\lambda_x \mid \cdot \sim \Gamma(a_x + \tfrac m2,\ b_x + \tfrac12 x^T W x), \qquad
\lambda_y \mid \cdot \sim \Gamma(a_y + \tfrac n2,\ b_y + \tfrac12 (y - Kx)^T(y - Kx))$$
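Relative to the earlier sampler, only the $x$ and $\lambda_x$ updates change: the prior precision $\lambda_x I_m$ becomes $\lambda_x W$. A sketch of the modified draw; $W$ here stands for whatever MRF precision matrix is specified (for example the random-walk precision constructed later for the temporal case).

```python
import numpy as np

def draw_x_mrf(K, y, W, lam_x, lam_y, rng):
    """x | rest ~ N((lam_y K'K + lam_x W)^{-1} lam_y K'y, (lam_y K'K + lam_x W)^{-1})."""
    Sigma = np.linalg.inv(lam_y * K.T @ K + lam_x * W)
    return rng.multivariate_normal(Sigma @ (lam_y * K.T @ y), Sigma)

def draw_lam_x_mrf(x, W, a_x, b_x, rng):
    """lambda_x | x ~ Gamma(a_x + m/2, b_x + x'Wx/2); numpy's gamma takes (shape, scale)."""
    return rng.gamma(a_x + x.size / 2, 1.0 / (b_x + 0.5 * x @ W @ x))
```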

23 1-d example using MRF priors for x

[Figure: posterior mean and 80% CI for z(s), and posterior means for x and Kx, under (a) iid x's with a skinny k(s); (b) random-walk x's with a skinny k(s); (c) iid x's with a fat k(s); (d) a GP model.]

24 8 hour max for ozone on a summer day in the Eastern US

$n = 510$ ozone measurements; the $\omega_k$'s are laid out on a hexagonal lattice as shown. $k(s)$ is circular normal, with sd = lattice spacing.

Choice of width for $k(s)$:
- could look at the empirical variogram
- could estimate using ML or cross-validation
- could treat it as an additional parameter in the posterior

25 A multiresolution spatial model formulation

$z(s) = z_{\mathrm{coarse}}(s) + z_{\mathrm{fine}}(s)$

Coarse process: $m_c = 27$ locations $\omega_1^c, \ldots, \omega_{m_c}^c$ on a hexagonal grid; $x_c = (x_{c1}, \ldots, x_{cm_c})^T \sim N(0, \tfrac{1}{\lambda_c} I_{m_c})$; the coarse smoothing kernel $k_c(s)$ is normal with sd = coarse grid spacing.

Fine process: $m_f = 87$ locations $\omega_1^f, \ldots, \omega_{m_f}^f$ on a hexagonal grid; $x_f = (x_{f1}, \ldots, x_{fm_f})^T \sim N(0, \tfrac{1}{\lambda_f} I_{m_f})$; the fine smoothing kernel $k_f(s)$ is normal with sd = fine grid spacing.

Note: the coarse kernel width is twice the fine kernel width.

26 Multiresolution formulation and full conditionals

Model: $y = K_c x_c + K_f x_f + \epsilon = Kx + \epsilon$, where $K = (K_c\ \ K_f)$ and $x = \binom{x_c}{x_f}$.

Define
$$W_x = \begin{pmatrix}\lambda_c I_{m_c} & 0 \\ 0 & \lambda_f I_{m_f}\end{pmatrix}$$

The Gibbs sampler implementation then becomes
$$x \mid \cdot \sim N\big((\lambda_y K^TK + W_x)^{-1}\lambda_y K^T y,\ (\lambda_y K^TK + W_x)^{-1}\big)$$
$$\lambda_c \mid \cdot \sim \Gamma(a_x + \tfrac{m_c}{2},\ b_x + \tfrac12 x_c^T x_c), \qquad
\lambda_f \mid \cdot \sim \Gamma(a_x + \tfrac{m_f}{2},\ b_x + \tfrac12 x_f^T x_f)$$
$$\lambda_y \mid \cdot \sim \Gamma(a_y + \tfrac n2,\ b_y + \tfrac12 (y - Kx)^T(y - Kx))$$
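The coarse and fine pieces stack into a single conjugate update. A sketch, assuming the basis matrices K_c and K_f have already been built from the coarse and fine kernels; the function and argument names are illustrative.

```python
import numpy as np

def multires_x_update(K_c, K_f, y, lam_c, lam_f, lam_y, rng):
    """Draw the stacked x = (x_c, x_f) from its multivariate normal full conditional."""
    K = np.hstack([K_c, K_f])                                  # K = (K_c  K_f)
    m_c, m_f = K_c.shape[1], K_f.shape[1]
    W_x = np.diag(np.r_[np.full(m_c, lam_c), np.full(m_f, lam_f)])
    Sigma = np.linalg.inv(lam_y * K.T @ K + W_x)
    x = rng.multivariate_normal(Sigma @ (lam_y * K.T @ y), Sigma)
    return x[:m_c], x[m_c:]                                    # x_c, x_f
```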

27 Multiresolution model for 8 hour max ozone

[Figure: fitted ozone surfaces under the coarse formulation and under the coarse + fine formulation.]

28 Basic binary classification example

Binary spatial process $z^*(s)$: the spatial area is partitioned into two regions, $z^*(s) = 1$ and $z^*(s) = 0$. $n = 10$ measurements $y = (y_1, \ldots, y_n)^T$ are taken at spatial locations $s_1, \ldots, s_n$:
$$y_i = z^*(s_i) + \epsilon_i, \quad i = 1, \ldots, n; \qquad \epsilon_i \overset{iid}{\sim} N(0, 1)$$

[Figure: data and true field.]

29 Constructing a binary spatial process z*(s)

[Figure: knot values x, smoothing kernel k(s), latent surface z(s), and the clipped binary field z*(s).]

$x$ convolved with $k(s)$ gives $z(s)$: $x \sim N(0, I_m)$, where each $x_j$ is located at $\omega_j$, and $z(s) = \sum_{j=1}^m x_j\,k(s - \omega_j)$.

Now define the binary field $z^*(s)$ by clipping $z(s)$: $z^*(s) = I[z(s) > 0]$.

30 Model formulation

Define the $n = 10$ observations $y = (y(s_1), \ldots, y(s_n))^T$ and $z^* = (z^*(s_1), \ldots, z^*(s_n))^T$. Define $x = (x(\omega_1), \ldots, x(\omega_m))^T$ to be the $m = 25$-vector of white noise knot values at the spatial grid sites $\omega_1, \ldots, \omega_m$. Recall that $z^*(s)$, and hence the vector $z^*$, is determined by the knot values $x$.

Likelihood: $L(y \mid z^*) \propto \exp\{-\tfrac12(y - z^*)^T(y - z^*)\}$
Independent normal prior for $x$: $\pi(x) \propto \exp\{-\tfrac12 x^T x\}$
Posterior distribution: $\pi(x \mid y) \propto \exp\{-\tfrac12(y - z^*)^T(y - z^*) - \tfrac12 x^T x\}$

31 Sampling the posterior via Metropolis

Full conditional distributions:
$$\pi(x_j \mid x_{-j}, y) \propto \exp\{-\tfrac12(y - z^*)^T(y - z^*) - \tfrac12 x_j^2\}, \qquad j = 1, \ldots, m$$

Metropolis implementation for sampling from $\pi(x \mid y)$: initialize $x$ at $x = 0$, then cycle through the full conditionals, updating each $x_j$ according to the Metropolis rules:
- generate a proposal $x_j^* \sim U[x_j - r, x_j + r]$
- compute the acceptance probability $\alpha = \min\big\{1,\ \pi(x_j^* \mid x_{-j}, y)/\pi(x_j \mid x_{-j}, y)\big\}$
- update $x_j$ to its new value: $x_j^{\mathrm{new}} = x_j^*$ with probability $\alpha$, and $x_j^{\mathrm{new}} = x_j$ with probability $1 - \alpha$

Here we ran $T = 1000$ scans, giving realizations $x^1, \ldots, x^T$ from the posterior, and discarded the first 100 for burn-in. Note: the proposal width $r$ is tuned so that $x_j^*$ is accepted about half the time.
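A sketch of this Metropolis-within-Gibbs scan, assuming K is the matrix that maps knot values to z at the data sites and that the proposal width r is tuned by hand; the binary field enters only through z* = I[Kx > 0], so the target needs no gradient or closed-form full conditional.

```python
import numpy as np

def log_post(x, y, K):
    """log pi(x | y) up to a constant: N(z*, I) likelihood with z* = I[Kx > 0], N(0, I) prior on x."""
    z_star = (K @ x > 0).astype(float)
    return -0.5 * np.sum((y - z_star) ** 2) - 0.5 * np.sum(x ** 2)

def metropolis_scan(x, y, K, r, rng):
    """One scan over the knot values; each x_j gets a U[x_j - r, x_j + r] proposal."""
    for j in range(x.size):
        prop = x.copy()
        prop[j] += rng.uniform(-r, r)
        # accept with probability min{1, pi(prop)/pi(x)}
        if np.log(rng.uniform()) < log_post(prop, y, K) - log_post(x, y, K):
            x = prop
    return x

# usage sketch: run T scans, discard the first 100, and tune r for ~50% acceptance
# rng = np.random.default_rng(3); x = np.zeros(K.shape[1])
# for t in range(1000): x = metropolis_scan(x, y, K, r=1.0, rng=rng)
```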

32 Sampling from non-standard multivariate distributions

Nick Metropolis: computing pioneer at Los Alamos National Laboratory, inventor of the Monte Carlo method, and inventor of Markov chain Monte Carlo: Equation of State Calculations by Fast Computing Machines (1953) by N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller and E. Teller, Journal of Chemical Physics. Originally implemented on the MANIAC I computer at LANL.

The algorithm constructs a Markov chain whose realizations are draws from the target (posterior) distribution; it constructs steps that maintain detailed balance.

33 Gibbs sampling and Metropolis for a bivariate normal density

$$\pi(z_1, z_2) \propto \exp\Big\{-\tfrac12\,(z_1\ z_2)\begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}^{-1}\begin{pmatrix}z_1\\ z_2\end{pmatrix}\Big\}$$

Sampling from the full conditionals (also called the heat bath):
$$z_1 \mid z_2 \sim N(\rho z_2,\ 1 - \rho^2), \qquad z_2 \mid z_1 \sim N(\rho z_1,\ 1 - \rho^2)$$

Metropolis updating:
- generate $z_1^* \sim U[z_1 - r, z_1 + r]$
- calculate $\alpha = \min\Big\{1,\ \frac{\pi(z_1^*, z_2)}{\pi(z_1, z_2)}\Big\} = \min\Big\{1,\ \frac{\pi(z_1^* \mid z_2)}{\pi(z_1 \mid z_2)}\Big\}$
- set $z_1^{\mathrm{new}} = z_1^*$ with probability $\alpha$, and $z_1^{\mathrm{new}} = z_1$ with probability $1 - \alpha$
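For contrast with the heat-bath draws, a sketch that updates z1 by Metropolis and z2 from its full conditional; rho and the proposal width r are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
rho, r, T = 0.9, 1.0, 5000
Prec = np.linalg.inv(np.array([[1.0, rho], [rho, 1.0]]))
log_pi = lambda z: -0.5 * z @ Prec @ z        # log pi(z1, z2) up to a constant

z = np.zeros(2)
samples = np.empty((T, 2))
for t in range(T):
    # Metropolis update of z1: uniform random-walk proposal, accept with probability alpha
    prop = z.copy()
    prop[0] += rng.uniform(-r, r)
    if np.log(rng.uniform()) < log_pi(prop) - log_pi(z):
        z = prop
    # heat-bath (Gibbs) update of z2 | z1
    z[1] = rng.normal(rho * z[0], np.sqrt(1 - rho**2))
    samples[t] = z
```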

34 Posterior realizations of $z^*(s) = I[z(s) > 0]$

[Figure: posterior realizations of the binary field.]

35 Posterior mean of $z^*(s) = I[z(s) > 0]$

[Figure: data, true field, and posterior mean.]

36 Application: locating archeological sites

Data at locations over a 10 m$^2$ area. Assume $y(s_i) \mid z^*(s_i) \sim N(\mu(z^*(s_i)), 1)$ with $\mu(0) = 1$ and $\mu(1) = 2$. Recall $z(s) = \sum_{j=1}^m x_j\,k(s - \omega_j)$ and $z^*(s) = I[z(s) > 0]$. An $m = 8 \times 8$ lattice of knot locations $\omega_1, \ldots, \omega_m$; the sd of $k(s)$ equals the knot spacing.

[Figure: observed data over the survey area.]

37 Posterior mean and realizations for $z^*(s) = I[z(s) > 0]$

[Figure.]

38 Bayesian formulation

[Figure: observed data over time at wells NW, N, NE, W, E, SW, S, and SE.]

$$L(y \mid \eta(z)) \propto |\Sigma|^{-\frac12}\exp\{-\tfrac12(y - \eta(z))^T\Sigma^{-1}(y - \eta(z))\}$$
$$\pi(z \mid \lambda_z) \propto \lambda_z^{m/2}\exp\{-\tfrac12\lambda_z z^T W_z z\}$$
$$\pi(\lambda_z) \propto \lambda_z^{a_z - 1}\exp\{-b_z\lambda_z\}$$
$$\pi(z, \lambda_z \mid y) \propto L(y \mid \eta(z))\,\pi(z \mid \lambda_z)\,\pi(\lambda_z)$$

39 Posterior realizations of z under MRF and moving average priors

[Figure: well data; MRF realizations and the MRF posterior mean; GP realizations and the GP posterior mean.]

40 References

D. Higdon (1998) A process-convolution approach to modeling temperatures in the North Atlantic Ocean (with discussion), Environmental and Ecological Statistics, 5.

D. Higdon (2002) Space and space-time modeling using process convolutions, in Quantitative Methods for Current Environmental Issues (C. Anderson, V. Barnett, P. C. Chatwin and A. H. El-Shaarawi, eds).

D. Higdon, H. Lee and C. Holloman (2003) Markov chain Monte Carlo approaches for inference in computationally intensive inverse problems, in Bayesian Statistics 7, Proceedings of the Seventh Valencia International Meeting (Bernardo, Bayarri, Berger, Dawid, Heckerman, Smith and West, eds).

41 NON-STATIONARY SPATIAL CONVOLUTION MODELS

42 A convolution-based approach for building non-stationary spatial models

[Figure: white noise * smoothing kernel = spatial field.]

$$z(s) = \sum_{j=1}^m x_j\,k(\omega_j - s) = \sum_{j=1}^m x_j\,k_s(\omega_j)$$
$x \sim N(0, I_m)$, where each $x_j$ is located at $\omega_j$ over a regular 2-d grid.

43 Convolutions for constructing non-stationary spatial models

$$z(s) = \sum_{j=1}^m x_j\,k_s(\omega_j), \qquad x \sim N(0, \lambda_x^{-1} I_m) \text{ at regular 2-d lattice locations } \omega_1, \ldots, \omega_m$$
$$z(s) \sim GP(0, C(s_1, s_2)), \quad\text{where}\quad C(s_1, s_2) = \lambda_x^{-1}\sum_{j=1}^m k_{s_1}(\omega_j)\,k_{s_2}(\omega_j)$$
The smoothing kernel $k_s(\cdot)$ changes smoothly over spatial location $s$.

44 Defining smoothly varying kernels via basis kernels

[Figure: basis kernels $k_1(\cdot)$, $k_2(\cdot)$, $k_3(\cdot)$.]

Define $k_s(\cdot) = w_1(s)\,k_1(\cdot - s) + w_2(s)\,k_2(\cdot - s) + w_3(s)\,k_3(\cdot - s)$, so that $k_s(\cdot)$ is a weighted combination of kernels centered at $s$. Define weights that change smoothly over space.

45 Defining smoothly varying kernels via basis kernels

Define iid, mean-0 Gaussian processes $\phi_1(s)$, $\phi_2(s)$ and $\phi_3(s)$, and set
$$w_i(s) = \frac{\exp\{\phi_i(s)\}}{\exp\{\phi_1(s)\} + \exp\{\phi_2(s)\} + \exp\{\phi_3(s)\}}$$
Estimate the $\phi_i(s)$'s like any other parameter in the analysis.
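A small sketch of the weight construction and the resulting spatially varying kernel; the phi_i here are fixed smooth functions purely for illustration (in the model they are mean-zero Gaussian processes estimated in the posterior), and the basis kernel widths are assumptions.

```python
import numpy as np

def kernel_weights(phi_values):
    """w_i(s) = exp{phi_i(s)} / sum_k exp{phi_k(s)}, computed stably."""
    e = np.exp(phi_values - np.max(phi_values))
    return e / e.sum()

def k_s(u, s, basis_sds, phi_funcs):
    """Spatially varying kernel k_s(u) = sum_i w_i(s) k_i(u - s), with normal basis kernels."""
    w = kernel_weights(np.array([phi(s) for phi in phi_funcs]))
    k_vals = np.array([np.exp(-0.5 * ((u - s) / sd) ** 2) / (np.sqrt(2 * np.pi) * sd)
                       for sd in basis_sds])
    return w @ k_vals

# usage sketch with fixed (illustrative) phi_i
phis = [lambda s: 0.5 * s, lambda s: 0.0 * s, lambda s: -0.5 * s]
val = k_s(u=1.3, s=1.0, basis_sds=[0.25, 0.5, 1.0], phi_funcs=phis)
```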

46 An application to the Piazza Road Superfund site

[Figure.]

47 MOVING AVERAGE / BASIS SPACE-TIME MODELS

48 A convolution-based approach for building space-time models

[Figure: marked point process * smoothing kernel = space-time field.]

- Define the space-time domain $S \times T$
- Define a discrete knot process $x(s,t)$ on $\{(\omega_1, \tau_1), \ldots, (\omega_m, \tau_m)\}$ within $S \times T$
- Define a smoothing kernel $k(s,t)$
- Construct the space-time process $z(s,t)$:
$$z(s,t) = \sum_{j=1}^m k\big((s,t) - (\omega_j, \tau_j)\big)\,x(\omega_j, \tau_j)
\quad\text{or, with a varying kernel,}\quad
z(s,t) = \sum_{j=1}^m k_{st}(\omega_j, \tau_j)\,x_j$$

49 A space-time model for ocean temperatures

[Figure: ocean temperature (degrees C) at constant potential density over the North Atlantic, roughly 20N-50N and 30W-5W.]

Data: $y = (y_1, \ldots, y_n)^T$ at space-time locations $(s_1, t_1), \ldots, (s_n, t_n)$. Assume the data are centered ($\bar y = 0$).

50 Knot locations and kernels

[Figure: smoothing kernels and the latent grid over the study region.]

$m$ knot locations over a space-time grid $(\omega_1, \tau_1), \ldots, (\omega_m, \tau_m)$. The spatial knot locations are shown; the temporal knot spacing is 7 years. Kernels vary with spatial location, $k_{st}(\omega, \tau) = k_s(\omega, \tau)$, using a spatial mixture of normal basis kernels.

51 Formulation for the ocean example

Likelihood:
$$L(y \mid x, \lambda_y) \propto \lambda_y^{n/2}\exp\{-\tfrac12\lambda_y(y - Kx)^T(y - Kx)\}$$
where $K_{ij} = k_{s_i t_i}(\omega_j, \tau_j)$.

Priors:
$$\pi(x \mid \lambda_x) \propto \lambda_x^{m/2}\exp\{-\tfrac12\lambda_x x^T x\}, \qquad
\pi(\lambda_x) \propto \lambda_x^{a_x - 1}\exp\{-b_x\lambda_x\}, \qquad
\pi(\lambda_y) \propto \lambda_y^{a_y - 1}\exp\{-b_y\lambda_y\}$$

Posterior:
$$\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + n/2 - 1}\exp\{-\lambda_y[b_y + \tfrac12(y - Kx)^T(y - Kx)]\}\;
\lambda_x^{a_x + m/2 - 1}\exp\{-\lambda_x[b_x + \tfrac12 x^T x]\}$$

52 Full conditionals for the ocean formulation

$$\pi(x \mid \cdot) \propto \exp\{-\tfrac12[\lambda_y x^TK^TKx - 2\lambda_y x^TK^Ty + \lambda_x x^Tx]\}$$
$$\pi(\lambda_x \mid \cdot) \propto \lambda_x^{a_x + m/2 - 1}\exp\{-\lambda_x[b_x + \tfrac12 x^Tx]\}, \qquad
\pi(\lambda_y \mid \cdot) \propto \lambda_y^{a_y + n/2 - 1}\exp\{-\lambda_y[b_y + \tfrac12(y - Kx)^T(y - Kx)]\}$$

Gibbs sampler implementation:
$$x \mid \cdot \sim N\big((\lambda_y K^TK + \lambda_x I_m)^{-1}\lambda_y K^Ty,\ (\lambda_y K^TK + \lambda_x I_m)^{-1}\big)$$
or, one component at a time,
$$x_j \mid \cdot \sim N\Big(\frac{\lambda_y k_j^T r_j}{\lambda_y k_j^Tk_j + \lambda_x},\ \frac{1}{\lambda_y k_j^Tk_j + \lambda_x}\Big)$$
$$\lambda_x \mid \cdot \sim \Gamma(a_x + \tfrac m2,\ b_x + \tfrac12 x^Tx), \qquad
\lambda_y \mid \cdot \sim \Gamma(a_y + \tfrac n2,\ b_y + \tfrac12(y - Kx)^T(y - Kx))$$
where $k_j$ is the $j$th column of $K$ and $r_j = y - \sum_{j' \ne j} k_{j'}x_{j'}$.
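When K is large, the one-component-at-a-time draw avoids forming and inverting the full m x m precision matrix. A sketch of that single-site update (the function and argument names are illustrative):

```python
import numpy as np

def draw_xj(j, x, y, K, lam_x, lam_y, rng):
    """x_j | rest ~ N(lam_y k_j'r_j / (lam_y k_j'k_j + lam_x), 1 / (lam_y k_j'k_j + lam_x))."""
    k_j = K[:, j]
    r_j = y - K @ x + k_j * x[j]            # residual with the j-th term removed
    prec = lam_y * (k_j @ k_j) + lam_x
    mean = lam_y * (k_j @ r_j) / prec
    return rng.normal(mean, 1.0 / np.sqrt(prec))
```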

53 Posterior mean of the space-time temperature field

[Figure: maps of the posterior mean over the North Atlantic at a sequence of times.]

54 Deviations from the time-averaged mean temperature field

[Figure: maps of the deviations over the North Atlantic at a sequence of times.]

55 Posterior probabilities of differing from the time-averaged mean field

[Figure: maps of the posterior probabilities over the North Atlantic at a sequence of times.]

56 Alternative approaches for building space-time models

[Figure: latent process * spatial smoothing kernel = space-time field.]

- Define the space-time domain $S \times T$ with $T = \{1, \ldots, n_t\}$
- Discrete knot process $x(s,t)$ on $\{\omega_1, \ldots, \omega_{n_s}\} \times \{1, \ldots, n_t\}$, with $n_t n_s = m$
- Define a spatial smoothing kernel $k(s)$
- Construct the space-time process $z(s,t)$:
$$z(s,t) = \sum_{j=1}^{n_s} k(s - \omega_j)\,x(\omega_j, t)
\quad\text{or, with a varying kernel,}\quad
z(s,t) = \sum_{j=1}^{n_s} k_s(\omega_j)\,x_{jt}$$

57 8 hour max for ozone over summer days in the Eastern US

$n = 510$ ozone measurements on each of 30 days; the $\omega_k$'s are laid out on the same hexagonal lattice.

58 Temporally evolving latent x(s,t) process

Times: $t \in T = \{1, \ldots, n_t\}$, $n_t = 30$; spatial knot locations $W = \{\omega_1, \ldots, \omega_{n_s}\}$, $n_s = 27$; $x(s,t)$ is defined on $W \times T$.

The space-time process $z(s,t)$ is obtained by convolving $x(s,t)$ with $k(s)$:
$$z(s,t) = \sum_{j=1}^{n_s} k(\omega_j - s)\,x(\omega_j, t) = \sum_{j=1}^{n_s} k_s(\omega_j)\,x_{jt}$$

Specify locally linear MRF priors for each $x_j = (x_{j1}, \ldots, x_{jn_t})^T$:
$$\pi(x_j \mid \lambda_x) \propto \lambda_x^{n_t/2}\exp\{-\tfrac12\lambda_x x_j^T W x_j\}$$
where
$$W_{ij} = \begin{cases} -1 & \text{if } |i - j| = 1 \\ 1 & \text{if } i = j = 1 \text{ or } i = j = n_t \\ 2 & \text{if } 1 < i = j < n_t \\ 0 & \text{otherwise} \end{cases}$$
So for $x = (x_{11}, x_{21}, \ldots, x_{n_s 1}, x_{12}, \ldots, x_{n_s 2}, \ldots, x_{1 n_t}, \ldots, x_{n_s n_t})^T$,
$$\pi(x \mid \lambda_x) \propto \lambda_x^{n_t n_s/2}\exp\{-\tfrac12\lambda_x x^T (W \otimes I_{n_s}) x\}$$
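The precision matrix W and its Kronecker extension are straightforward to build; a sketch, using the slide's ordering of the stacked x (time varying slowest, space fastest).

```python
import numpy as np

def rw1_precision(n_t):
    """Tridiagonal W: 1 at the two boundary diagonal entries, 2 in the interior, -1 off-diagonal."""
    W = np.diag(np.r_[1.0, np.full(n_t - 2, 2.0), 1.0])
    idx = np.arange(n_t - 1)
    W[idx, idx + 1] = -1.0
    W[idx + 1, idx] = -1.0
    return W

n_t, n_s = 30, 27
W = rw1_precision(n_t)
W_full = np.kron(W, np.eye(n_s))    # precision (up to lambda_x) of the stacked knot vector x
```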

59 Formulation for temporally evolving z(s,t)

Data: at each time $t$, observe the $n$-vector $y_t = (y_{1t}, \ldots, y_{nt})^T$ at sites $s_1, \ldots, s_n$.

Likelihood for data observed at time $t$:
$$L(y_t \mid x_t, \lambda_y) \propto \lambda_y^{n/2}\exp\{-\tfrac12\lambda_y(y_t - K^t x_t)^T(y_t - K^t x_t)\}, \qquad K^t_{ij} = k(\omega_j - s_i)$$

Define the $n n_t$-vector $y = (y_1^T, \ldots, y_{n_t}^T)^T$. Likelihood for the entire data $y$:
$$L(y \mid x, \lambda_y) \propto \lambda_y^{n n_t/2}\exp\{-\tfrac12\lambda_y(y - Kx)^T(y - Kx)\}, \qquad K = \mathrm{diag}(K^1, \ldots, K^{n_t})$$

Priors:
$$\pi(x \mid \lambda_x) \propto \lambda_x^{m/2}\exp\{-\tfrac12\lambda_x x^T W x\}, \qquad
\pi(\lambda_x) \propto \lambda_x^{a_x - 1}\exp\{-b_x\lambda_x\}, \qquad
\pi(\lambda_y) \propto \lambda_y^{a_y - 1}\exp\{-b_y\lambda_y\}$$

60 Posterior and full conditionals

$$\pi(x, \lambda_x, \lambda_y \mid y) \propto \lambda_y^{a_y + nn_t/2 - 1}\exp\{-\lambda_y[b_y + \tfrac12(y - Kx)^T(y - Kx)]\}\;
\lambda_x^{a_x + n_sn_t/2 - 1}\exp\{-\lambda_x[b_x + \tfrac12 x^TWx]\}$$

Full conditionals:
$$\pi(x \mid \cdot) \propto \exp\{-\tfrac12[\lambda_y x^TK^TKx - 2\lambda_y x^TK^Ty + \lambda_x x^TWx]\}$$
$$\pi(\lambda_x \mid \cdot) \propto \lambda_x^{a_x + m/2 - 1}\exp\{-\lambda_x[b_x + \tfrac12 x^TWx]\}, \qquad
\pi(\lambda_y \mid \cdot) \propto \lambda_y^{a_y + n/2 - 1}\exp\{-\lambda_y[b_y + \tfrac12(y - Kx)^T(y - Kx)]\}$$

Gibbs sampler implementation:
$$x \mid \cdot \sim N\big((\lambda_yK^TK + \lambda_xW)^{-1}\lambda_yK^Ty,\ (\lambda_yK^TK + \lambda_xW)^{-1}\big)$$
or, one component at a time,
$$x_{jt} \mid \cdot \sim N\Big(\frac{\lambda_y r_{tj}^Tk_{tj} + n_j\lambda_x\bar x_{jt}}{\lambda_y k_{tj}^Tk_{tj} + n_j\lambda_x},\ \frac{1}{\lambda_y k_{tj}^Tk_{tj} + n_j\lambda_x}\Big)$$
$$\lambda_x \mid \cdot \sim \Gamma(a_x + \tfrac m2,\ b_x + \tfrac12 x^TWx), \qquad
\lambda_y \mid \cdot \sim \Gamma(a_y + \tfrac n2,\ b_y + \tfrac12(y - Kx)^T(y - Kx))$$
where $k_{tj}$ is the $j$th column of $K^t$, $r_{tj} = y_t - \sum_{j' \ne j} k_{tj'}x_{tj'}$, $n_j$ is the number of temporal neighbors of $x_{jt}$, and $\bar x_{jt}$ is the mean of the neighbors of $x_{jt}$.

61 DLM setup for the ozone example

Given the latent process $x_t = (x_{1,t}, \ldots, x_{27,t})^T$, $t = 1, \ldots, 30$, and $y_t = (y_{1t}, \ldots, y_{n_y t})^T$ at sites $s_{1t}, \ldots, s_{n_y t}$:
$$y_t = K^t x_t + \epsilon_t, \qquad x_t = x_{t-1} + \nu_t$$
where $K^t$ is the $n_y \times 27$ matrix with $K^t_{ij} = k(s_{it} - \omega_j)$, $t = 1, \ldots, 30$,
$$\epsilon_t \overset{iid}{\sim} N(0, \sigma_\epsilon^2), \qquad \nu_t \overset{iid}{\sim} N(0, \sigma_\nu^2), \qquad t = 1, \ldots, 30, \qquad x_1 \sim N(0, \sigma_x^2 I_{27}).$$

- can use dynamic linear model (DLM)/Kalman filter machinery
- single-site MCMC works too
- see Stroud et al. (2001) for an alternative model
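A sketch of the forward (Kalman filter) pass for this random-walk DLM, assuming the list of K_t matrices and the variance values are supplied; backward sampling or smoothing would use the same machinery. This is an illustration of the "Kalman filter machinery" mentioned above, not the slides' own implementation.

```python
import numpy as np

def kalman_filter(ys, Ks, sig_eps2, sig_nu2, sig_x2):
    """Forward filtering for y_t = K_t x_t + eps_t, x_t = x_{t-1} + nu_t, x_1 ~ N(0, sig_x2 I)."""
    m_dim = Ks[0].shape[1]
    m, P = np.zeros(m_dim), sig_x2 * np.eye(m_dim)    # prior mean and covariance for x_1
    means, covs = [], []
    for t, (y_t, K_t) in enumerate(zip(ys, Ks)):
        if t > 0:
            P = P + sig_nu2 * np.eye(m_dim)           # predict: random-walk evolution
        S = K_t @ P @ K_t.T + sig_eps2 * np.eye(K_t.shape[0])
        G = P @ K_t.T @ np.linalg.inv(S)              # Kalman gain
        m = m + G @ (y_t - K_t @ m)                   # update with y_t
        P = P - G @ K_t @ P
        means.append(m.copy())
        covs.append(P.copy())
    return means, covs
```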

62 Posterior mean for the first 9 days

[Figure.]

63 Posterior mean of selected $x_j$'s

[Figure: latent process values x(s,t) over t (days), under the independent-over-time prior and the random-walk-over-time prior.]

64 References

D. Higdon (1998) A process-convolution approach to modeling temperatures in the North Atlantic Ocean (with discussion), Environmental and Ecological Statistics, 5.

J. Stroud, P. Müller and B. Sansó (2001) Dynamic models for spatio-temporal data, Journal of the Royal Statistical Society, Series B, 63.

D. Higdon (2002) Space and space-time modeling using process convolutions, in Quantitative Methods for Current Environmental Issues (C. Anderson, V. Barnett, P. C. Chatwin and A. H. El-Shaarawi, eds).

M. West and J. Harrison (1997) Bayesian Forecasting and Dynamic Models (Second Edition), Springer-Verlag.
