Non-Gaussian spatiotemporal modeling

Non-Gaussian spatiotemporal modeling
Thais C. O. da Fonseca, joint work with Prof. Mark F. J. Steel
Department of Statistics, University of Warwick
December 2008

Outline
1. Introduction: Motivation
2. Spatiotemporal modeling: Nonseparable models; Heavy-tailed processes
3. Simulation results: Data 1; Data 2; Data 3
4. Temperature data: Non-Gaussian spatiotemporal modeling; Results

Motivation: Spatiotemporal data
Due to the proliferation of data sets indexed in both space and time, spatiotemporal models have received increased attention in the literature.
Maximum temperature data - Spanish Basque Country (67 stations).
[Figure: map of the region, with Madrid and Paris marked for reference.]

Motivation: Example
31 time points (July 2006).
[Figure: maximum temperature against time index for the monitoring stations.]

Motivation: Typical problem
Given: observations Z(s_i, t_j) at a finite number of locations s_i, i = 1, ..., I, and time points t_j, j = 1, ..., J.
Desired: the predictive distribution for the unknown value Z(s_0, t_0) at the space-time coordinate (s_0, t_0).
Focus: continuous space and continuous time, which allow prediction and interpolation at any location and any time:
Z(s, t), (s, t) ∈ D × T, where D ⊆ R^d and T ⊆ R.

Motivation: General modeling formulation
The uncertainty about the unobserved parts of the process can be expressed probabilistically by a random function in space and time: {Z(s, t); (s, t) ∈ D × T}.
We need to specify a valid covariance structure for the process:
C(s_1, s_2; t_1, t_2) = Cov(Z(s_1, t_1), Z(s_2, t_2)).

Motivation: Non-Gaussian spatiotemporal models
Building adequate models for these processes is not an easy task. With only one observation of the process, simplifying assumptions are made:
- Stationarity: Cov(Z(s_1, t_1), Z(s_2, t_2)) = C(s_1 - s_2, t_1 - t_2)
- Isotropy: Cov(Z(s_1, t_1), Z(s_2, t_2)) = C(||s_1 - s_2||, |t_1 - t_2|)
- Separability: Cov(Z(s_1, t_1), Z(s_2, t_2)) = C_s(s_1, s_2) C_t(t_1, t_2)
- Gaussianity: the process has finite-dimensional Gaussian distributions.
Models based on Gaussianity will not perform well (poor predictions) if
- the data are contaminated by outliers;
- there are regions with larger observational variance.

Motivation: Example
Maximum temperature data - Spanish Basque Country.
[Figure: empirical variance by station index and by time index.]
For this reason, we consider spatiotemporal processes with heavy-tailed finite-dimensional distributions.

We will consider processes that are
- stationary
- isotropic
- nonseparable
- non-Gaussian.

Nonseparable models: Continuous mixture
Idea: continuous mixture of separable covariance functions [Ma, 2002]. This takes advantage of the well-known theory developed for purely spatial and purely temporal processes.
Nonseparable model:
Z(s, t) = Z_1(s; U) Z_2(t; V),   (1)
where (U, V) is a bivariate random vector with correlation c.
Unconditional covariance:
C(s, t) = ∫∫ C_1(s; u) C_2(t; v) dF(u, v).   (2)
C(s, t) is valid and nonseparable.

Nonseparable models
In particular, if C_1(s; u) = σ_1 exp{-γ_1(s) u} and C_2(t; v) = σ_2 exp{-γ_2(t) v}, with U = X_0 + X_1 and V = X_0 + X_2, where X_i has finite moment generating function M_i, then:
Proposition.
C(s, t) = σ² M_0(-(γ_1(s) + γ_2(t))) M_1(-γ_1(s)) M_2(-γ_2(t)),  (s, t) ∈ D × T,   (3)
where γ_1(s) and γ_2(t) are spatial and temporal variograms, for instance γ_1(s) = (||s||/a)^α and γ_2(t) = (|t|/b)^β.
Notice that c = corr(U, V) measures separability and c ∈ [0, 1]. See [Fonseca and Steel, 2008] for more details.
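As a concreteness check, the sketch below (not the authors' code; the values of a, b, α, β, λ_i and σ² are purely illustrative) evaluates the closed form (3) under the gamma assumption X_i ~ Ga(λ_i, 1), for which M_i(x) = (1 - x)^{-λ_i}, and compares it with a Monte Carlo approximation of the mixture (2) built from U = X_0 + X_1 and V = X_0 + X_2.

```python
# Minimal sketch of the mixture construction (1)-(3), assuming X_i ~ Ga(lambda_i, 1).
import numpy as np

rng = np.random.default_rng(0)
a, b, alpha, beta = 1.0, 1.0, 1.0, 1.0          # illustrative range/smoothness parameters
lam0, lam1, lam2 = 1.0, 1.0, 1.0                # gamma shape parameters
sigma2 = 1.0                                    # sigma^2 = sigma_1 * sigma_2

def gamma1(s):   # spatial variogram gamma_1(s) = (||s|| / a)^alpha (1-d distance here)
    return (np.abs(s) / a) ** alpha

def gamma2(t):   # temporal variogram gamma_2(t) = (|t| / b)^beta
    return (np.abs(t) / b) ** beta

def C_closed(s, t):
    # closed form (3): sigma^2 M_0(-(g1+g2)) M_1(-g1) M_2(-g2) with gamma MGFs
    g1, g2 = gamma1(s), gamma2(t)
    return sigma2 * (1 + g1 + g2) ** (-lam0) * (1 + g1) ** (-lam1) * (1 + g2) ** (-lam2)

def C_montecarlo(s, t, n=200_000):
    # mixture form (2): average the separable covariance over draws of (U, V)
    x0 = rng.gamma(lam0, 1.0, n)
    u = x0 + rng.gamma(lam1, 1.0, n)
    v = x0 + rng.gamma(lam2, 1.0, n)
    return sigma2 * np.mean(np.exp(-gamma1(s) * u - gamma2(t) * v))

s, t = 0.7, 1.3
print(C_closed(s, t), C_montecarlo(s, t))       # the two values should agree closely
# nonseparability: C(s,t) C(0,0) differs from C(s,0) C(0,t) whenever lambda_0 > 0
print(C_closed(s, t) * C_closed(0, 0), C_closed(s, 0) * C_closed(0, t))
```

The last line illustrates the role of λ_0: separability would require C(s, t) C(0, 0) = C(s, 0) C(0, t), which only holds as λ_0 → 0.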

Heavy-tailed processes: Mixing in space and time
We consider the process
Z(s, t) = Z̃_1(s; U) Z̃_2(t; V).   (4)
Mixing in space:
Z̃_1(s; U) = √(1 - τ²) Z_1(s; U)/√λ_1(s) + τ ε(s)/√h(s).   (5)
Mixing in time:
Z̃_2(t; V) = Z_2(t; V)/√λ_2(t).   (6)

Heavy-tailed processes: Process λ_1(s)
Mixing in space: Z̃_1(s; U) = √(1 - τ²) Z_1(s; U)/√λ_1(s) + τ ε(s)/√h(s).
λ_1(s) accounts for regions in space with larger observational variance.
λ_1(s) needs to be correlated to induce mean-square continuity of Z̃_1(s; U); this is equivalent to E[λ_1(s_i)^{-1/2} λ_1(s_i')^{-1/2}] → E[λ_1(s_i)^{-1}] as s_i' → s_i.
This is satisfied by λ_1(s) = λ for all s (a Student-t process), but it does not account for regions with larger variance.
It is also satisfied by the GLG process, where {ln λ_1(s); s ∈ D} is a Gaussian process with mean -ν/2 and covariance structure ν C_1(·) [Palacios and Steel, 2006].
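A small simulation may help fix ideas about the GLG mixing; it is illustrative only (the locations, the exponential correlation used for C_1(·), and the values of n, ν and the range a are assumptions, not the paper's settings). A correlated Gaussian field Z_1(s) is divided by √λ_1(s), where ln λ_1(s) is itself a correlated Gaussian field with mean -ν/2 and covariance ν C_1(·), so regions where λ_1(s) is small show inflated variance.

```python
# Illustrative GLG mixing in space: Z1(s) / sqrt(lambda1(s)) with a log-Gaussian lambda1(s).
import numpy as np

rng = np.random.default_rng(1)
n, nu, a = 60, 3.0, 0.3                        # sites, GLG parameter, spatial range (made up)
s = rng.uniform(0, 1, (n, 2))                  # random locations in the unit square
d = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=-1)
C1 = np.exp(-d / a)                            # exponential correlation, used for both fields
L = np.linalg.cholesky(C1 + 1e-8 * np.eye(n))

z1 = L @ rng.standard_normal(n)                # Gaussian process Z1(s)
log_lam1 = -nu / 2 + np.sqrt(nu) * (L @ rng.standard_normal(n))   # GP with mean -nu/2, cov nu*C1
lam1 = np.exp(log_lam1)                        # GLG mixing field
z1_mixed = z1 / np.sqrt(lam1)                  # heavy tails, spatially varying variance

print("smallest lambda1(s):", lam1.min().round(3),
      "-> local sd inflation factor:", (1 / np.sqrt(lam1.min())).round(2))
```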

Heavy-tailed processes: Process h(s)
Mixing in space: Z̃_1(s; U) = √(1 - τ²) Z_1(s; U)/√λ_1(s) + τ ε(s)/√h(s).
h(s) accounts for traditional outliers (different nugget effects).
Outlier detection is carried out jointly with estimation: the variables h_i = h(s_i), i = 1, ..., I, are treated as latent variables, and their posterior distributions indicate outlying observations (h_i close to 0).
We consider either log(h_i) ~ N(-ν_h/2, ν_h) or h_i ~ Ga(1/ν_h, 1/ν_h).
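A quick sketch of the two priors for h_i is given below (ν_h = 3 is an arbitrary illustrative value, not one used in the paper). Both priors are centred so that E[h_i] = 1, and the amount of prior mass near zero indicates how readily each prior can flag an observation as an outlier.

```python
# Comparing the lognormal and gamma priors for the nugget mixing variables h_i.
import numpy as np

rng = np.random.default_rng(2)
nu_h, n = 3.0, 100_000

# lognormal prior: log(h_i) ~ N(-nu_h/2, nu_h)  (so E[h_i] = 1)
h_lognormal = np.exp(rng.normal(-nu_h / 2, np.sqrt(nu_h), n))
# gamma prior: h_i ~ Ga(shape 1/nu_h, rate 1/nu_h), i.e. scale nu_h in numpy (mean 1, var nu_h)
h_gamma = rng.gamma(1 / nu_h, nu_h, n)

for name, h in [("lognormal", h_lognormal), ("gamma", h_gamma)]:
    print(f"{name}: mean = {h.mean():.2f}, P(h_i < 0.1) = {(h < 0.1).mean():.3f}")
```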

Heavy-tailed processes: Process λ_2(t)
Mixing in time: Z̃_2(t; V) = Z_2(t; V)/√λ_2(t).
λ_2(t) accounts for periods in time with larger observational variance. This can be seen as a way to address the issue of volatility clustering, which is common in financial time series data.
We consider the log-Gaussian process where {ln λ_2(t); t ∈ T} is a Gaussian process with mean -ν_2/2 and covariance structure ν_2 C_2(·).
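The temporal mixing can be illustrated in the same way (again with made-up values for T, ν_2 and the range b of the temporal correlation C_2(·)): because ln λ_2(t) is correlated over time, small values of λ_2(t) arrive in runs, which is exactly the volatility-clustering pattern mentioned above.

```python
# Illustrative temporal mixing: Z2(t) / sqrt(lambda2(t)) with a log-Gaussian lambda2(t).
import numpy as np

rng = np.random.default_rng(3)
T, nu2, b = 200, 2.0, 10.0
t = np.arange(T, dtype=float)
C2 = np.exp(-np.abs(t[:, None] - t[None, :]) / b)      # temporal correlation C_2(.)
L = np.linalg.cholesky(C2 + 1e-8 * np.eye(T))

z2 = L @ rng.standard_normal(T)                        # Gaussian process Z2(t)
log_lam2 = -nu2 / 2 + np.sqrt(nu2) * (L @ rng.standard_normal(T))
z2_mixed = z2 / np.sqrt(np.exp(log_lam2))              # heavy tails, clustered volatility

# runs of large |z2_mixed| coincide with runs of small lambda2(t)
print("time points with strongly inflated variance:", np.where(np.exp(log_lam2) < 0.2)[0][:10])
```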

Heavy-tailed processes: Predictions
(λ_1i, h_i, λ_2j) are treated as latent variables and sampled in our MCMC sampler.
Given (λ_1i, h_i, λ_2j), the process is Gaussian and we can predict at unobserved locations and time points.
We compare predictive performance using proper scoring rules [Gneiting and Raftery, 2007]:
- LPS(p, x) = -log p(x)
- IS(q_1, q_2; x) = (q_2 - q_1) + (2/ξ)(q_1 - x) 1(x < q_1) + (2/ξ)(x - q_2) 1(x > q_2).
We use ξ = 0.05, resulting in a 95% credible interval.
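The two scoring rules are easy to code directly; the sketch below is a minimal transcription of the formulas above (lower is better for both), with a standard normal predictive distribution used purely as an illustration.

```python
# Log predictive score and interval score, as defined on the slide.
import numpy as np
from scipy.stats import norm

def lps(predictive_pdf, x):
    """Log predictive score: LPS(p, x) = -log p(x)."""
    return -np.log(predictive_pdf(x))

def interval_score(q1, q2, x, xi=0.05):
    """Interval score for the central (1 - xi) predictive interval [q1, q2]."""
    return (q2 - q1) \
        + (2 / xi) * np.maximum(q1 - x, 0.0) \
        + (2 / xi) * np.maximum(x - q2, 0.0)

# illustration: N(0, 1) predictive distribution, 95% interval, realisation outside the interval
q1, q2, x = norm.ppf(0.025), norm.ppf(0.975), 2.5
print(lps(norm.pdf, x), interval_score(q1, q2, x))
```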

Data
This data set has I = 30 locations and J = 30 time points, generated from a Gaussian model with no nugget effect (τ² = 0). The covariance model is nonseparable Cauchy (X_i ~ Ga(λ_i, 1), i = 0, 1, 2) in space and time, with c = 0.5.
We contaminated this data set with different kinds of outliers in order to assess the performance of the proposed models in each situation.
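The data-generating step can be reproduced in outline as follows. This is only a sketch: the range and smoothness values a, b, α, β and the unit-square spatial domain are guesses rather than the paper's settings, while λ_1 = λ_2 = 1 and λ_0 = 1 give c = λ_0/(1 + λ_0) = 0.5 as stated above.

```python
# Simulating a 30 x 30 Gaussian field (no nugget) with the nonseparable Cauchy covariance.
import numpy as np

rng = np.random.default_rng(4)
I, J = 30, 30
a, b, alpha, beta, sigma2 = 0.5, 10.0, 1.0, 1.0, 1.0    # illustrative values
lam0, lam1, lam2 = 1.0, 1.0, 1.0                        # c = lam0 / (1 + lam0) = 0.5

s = rng.uniform(0, 1, (I, 2))                           # spatial locations
t = np.arange(J, dtype=float)                           # time points

def cauchy_cov(hs, ht):
    """Nonseparable Cauchy covariance, i.e. (3) with X_i ~ Ga(lam_i, 1)."""
    g1 = (hs / a) ** alpha
    g2 = (ht / b) ** beta
    return sigma2 * (1 + g1) ** (-lam1) * (1 + g2) ** (-lam2) * (1 + g1 + g2) ** (-lam0)

hs = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=-1)   # I x I spatial distances
ht = np.abs(t[:, None] - t[None, :])                          # J x J temporal distances
Sigma = cauchy_cov(np.kron(np.ones((J, J)), hs),              # full (I*J) x (I*J) covariance,
                   np.kron(ht, np.ones((I, I))))              # ordered time-major

z = np.linalg.cholesky(Sigma + 1e-8 * np.eye(I * J)) @ rng.standard_normal(I * J)
Z = z.reshape(J, I)                                     # Z[j, i] = field at time t_j, location s_i
print(Z.shape)                                          # (30, 30); tau^2 = 0, so no nugget added
```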

Spatial domain
[Figure: simulated locations in the spatial domain.]
The proposal for λ_1i, h_i, i = 1, ..., I, in the MCMC sampler is constructed by dividing the observations into blocks defined by their position in the spatial domain.

Data 1: Description and BF
One location was selected at random (location 7), and a random increment of Unif(1.0, 1.5) times the standard deviation was added to each observation at this location for the first 20 time points.
[Table: log Bayes factors, using the Shifted-Gamma (λ = 0.98) estimator, for the models with nugget, h (lognormal), h (gamma), λ_1, and λ_1 & h (lognormal), versus the Gaussian model.]
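The Data 1 contamination is a one-line operation once the clean field is available. In the sketch below, Z is a stand-in J × I matrix (in practice it would be the simulated field above), and "the standard deviation" is taken to be the empirical standard deviation of the clean data, which is an assumption.

```python
# Contaminating location 7 for the first 20 time points, as in the Data 1 description.
import numpy as np

rng = np.random.default_rng(5)
Z = rng.standard_normal((30, 30))              # stand-in for the clean J x I simulated field
loc, n_times = 7 - 1, 20                       # 0-based column index for "location 7"

Z1 = Z.copy()
Z1[:n_times, loc] += rng.uniform(1.0, 1.5, n_times) * Z.std()   # add Unif(1, 1.5) * sd increments
```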

Data 1: Estimated correlation function, t_0 = 1
[Figure: correlation against distance in space for (a) the Gaussian model, (b) the non-Gaussian model with λ_1, (c) the non-Gaussian model with h and λ_1, and (d) the Gaussian model fitted to the uncontaminated data.]

Data 1: Non-Gaussian model with λ_1
[Figure: (a) variance for each location; (b) median of σ²_i against distance from observation 7.]

Data 1: Non-Gaussian model with h (lognormal)
[Figure: (a) variance for each location; (b) nugget τ²/h_i for each location.]

Data 1: Non-Gaussian model with λ_1 and h
[Figure: (a) variance for each location; (b) nugget τ²/h_i for each location; (c) λ_1i, i = 1, ..., 30; (d) h_i, i = 1, ..., 30.]

Data 2: Description and BF
A region of the spatial domain was selected, and an increment of Unif(0.5, 1.5) times the standard deviation was added to each observation in the region for the first 10 time points.
[Figure: selected region of the spatial domain.]
[Table: log Bayes factors, using the Shifted-Gamma (λ = 0.98) estimator, for the models with nugget, h (lognormal), h (gamma), λ_1, and λ_1 & h (lognormal), versus the Gaussian model.]

Data 2: Non-Gaussian model with λ_1 and h
[Figure: (a) variance for each location; (b) nugget τ²/h_i for each location; (c) λ_1i, i = 1, ..., 30; (d) h_i, i = 1, ..., 30.]

Data 2: Description and BF (continued)
The selected region comprises locations 4, 16, 21 and 27, and the same random increment of Unif(0.5, 1.5) times the standard deviation was added to each observation in the region for the first 10 time points.
[Table: log Bayes factors, using the Shifted-Gamma (λ = 0.98) estimator, for the models with nugget, h (lognormal), h (gamma), λ_1, and λ_1 & h (lognormal), versus the Gaussian model.]

Data 3: Description and BF
The observations at time points 11 to 15 were contaminated by adding a random increment of Unif(0.5, 1.5) times the standard deviation to each observation, at all spatial locations.
[Table: log Bayes factors, using the Shifted-Gamma (λ = 0.98) estimator, for the models with nugget, h (lognormal), λ_1, λ_2, λ_1 & λ_2, and λ_1 & λ_2 & h, versus the Gaussian model.]

Data 3: Non-Gaussian models
[Figure: variance for each location under (a) the model with lognormal h(s) and (b) the model with lognormal h(s) and λ_1(s).]

Data 3: Non-Gaussian model with λ_1 and λ_2
[Figure: (a)-(b) variance over time at locations 1 and 29; (c) λ_1i, i = 1, ..., I; (d) λ_2j, j = 1, ..., J.]

Non-Gaussian spatiotemporal modeling: Data
[Figure: (a) map of Spain and France, with Madrid and Paris marked; (b) Basque Country (zoom) with the station locations.]

Non-Gaussian spatiotemporal modeling: Model
Mean function: μ(s, t) = δ_0 + δ_1 s_1 + δ_2 s_2 + δ_3 h + δ_4 t + δ_5 t².
Cauchy covariance function (X_i ~ Ga(λ_i, 1)):
C(s, t) = (1 + (||s||/a)^α)^{-λ_1} (1 + (|t|/b)^β)^{-λ_2} (1 + (||s||/a)^α + (|t|/b)^β)^{-λ_0},
with λ_1 = λ_2 = 1 and c = λ_0/(1 + λ_0).

Non-Gaussian spatiotemporal modeling: Likelihood
In order to calculate the likelihood function we would need to invert a matrix of dimension 2077 × 2077 (67 stations × 31 time points). We approximate the likelihood using conditional distributions.
We consider a partition of Z into subvectors Z_1, ..., Z_31, where Z_j = (Z(s_1, t_j), ..., Z(s_67, t_j)), and we define Z_(j) = (Z_{j-L+1}, ..., Z_j). Then
p(z | φ) ≈ p(z_1 | φ) ∏_{j=2}^{31} p(z_j | z_(j-1), φ).   (7)
This means the distribution of Z_j depends only on the observations in space at the previous L time points. In this application we used L = 5 to make the MCMC feasible.
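A sketch of the approximation (7) is given below (not the authors' implementation): each block of spatial observations at time t_j is evaluated under its Gaussian conditional given the previous L time points, so the largest matrix handled is of order (L + 1)·I (here 6 · 67 = 402) rather than I·J. The data matrix Z (J × I), the locations s, the times t, the mean mu and the covariance function cov_fn(s_i, s_j, t_i, t_j) are all assumed inputs, and the double loop filling each block covariance is deliberately naive.

```python
# Approximate Gaussian log-likelihood: p(z_1) * prod_j p(z_j | z_{j-L}, ..., z_{j-1}).
import numpy as np
from scipy.stats import multivariate_normal

def approx_loglik(Z, s, t, cov_fn, mu=0.0, L=5):
    J, I = Z.shape
    loglik = 0.0
    for j in range(J):
        lo = max(0, j - L)
        idx_t = np.arange(lo, j + 1)                       # conditioning times plus current time
        coords_t = np.repeat(t[idx_t], I)                  # time of each block element
        coords_s = np.tile(np.arange(I), len(idx_t))       # location index of each block element
        n = len(coords_t)
        K = np.empty((n, n))                               # block covariance matrix
        for p in range(n):
            for q in range(n):
                K[p, q] = cov_fn(s[coords_s[p]], s[coords_s[q]], coords_t[p], coords_t[q])
        zb = (Z[idx_t] - mu).ravel()                       # centred block, time-major order
        if j == 0:
            loglik += multivariate_normal(mean=np.zeros(n), cov=K).logpdf(zb)
        else:
            m = n - I                                      # size of the conditioning vector z_(j-1)
            K11, K12, K22 = K[:m, :m], K[:m, m:], K[m:, m:]
            A = np.linalg.solve(K11, K12)                  # K11^{-1} K12
            cond_mean = A.T @ zb[:m]
            cond_cov = K22 - K12.T @ A
            loglik += multivariate_normal(mean=cond_mean, cov=cond_cov).logpdf(zb[m:])
    return loglik
```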

Results: Bayes factors
[Table: the natural logarithm of the Bayes factor in favor of the model in the column (h; λ_1; λ_1 & h; λ_2; λ_2 & h; λ_1 & λ_2; λ_1, h & λ_2) versus the Gaussian model, using the Shifted-Gamma (λ = 0.98) estimator for the predictive density of z.]

Results: Model with h and λ_2
[Figure: (a) σ²(1 - τ²)/λ_2; (b) σ² τ²/h.]

Results: Predicted temperature at the out-of-sample stations
[Figure: predicted against observed temperature at stations 1, 2 and 3, for (a)-(c) the Gaussian model and (d)-(f) the model with λ_2 & h.]

Results: Model comparison
[Table: average interval width, IS and LPS for the models Gaussian; h; λ_1; λ_1 & h; λ_2; λ_2 & h; λ_1 & λ_2; λ_1, h & λ_2.]

References
T. Fonseca and M. F. J. Steel. A New Class of Nonseparable Spatiotemporal Models. CRiSM Working Paper, 2008.
T. Gneiting and A. E. Raftery. Strictly proper scoring rules, prediction and estimation. JASA, 102, 2007.
C. Ma. Spatio-temporal covariance functions generated by mixtures. Mathematical Geology, 34, 2002.
M. B. Palacios and M. F. J. Steel. Non-Gaussian Bayesian Geostatistical Modeling. JASA, 101, 2006.
