Nonstationary cross-covariance models for multivariate processes on a globe


Mikyoung Jun¹

April 15, 2011

Abstract: In geophysical and environmental problems, it is common to have multiple variables of interest measured at the same location and time. These multiple variables typically have dependence over space (and/or time). As a consequence, there is a growing interest in developing models for multivariate spatial processes, in particular, the cross-covariance models. On the other hand, many data sets these days, such as satellite data, cover a large portion of the Earth and require valid covariance models on a globe. We present a class of parametric covariance models for multivariate processes on a globe. The covariance models are flexible in capturing nonstationarity in the data yet computationally feasible and require moderate numbers of parameters. We apply our covariance model to surface temperature and precipitation data from an NCAR climate model output. We compare our model to the multivariate version of the Matérn cross-covariance function and models based on coregionalization and demonstrate the superior performance of our model in terms of AIC (and/or maximum log-likelihood values) and predictive skill. We also present some challenges in modeling the cross-covariance structure of the temperature and precipitation data. Based on the fitted results using full data, we give the estimated cross-correlation structure between the two variables.

KEY WORDS: cross-covariance model, linear model of coregionalization, multivariate process, nonstationary process, process on a globe

¹ Mikyoung Jun is Assistant Professor, Department of Statistics, Texas A&M University, 3143 TAMU, College Station, TX (mjun@stat.tamu.edu).

1. INTRODUCTION

Geophysical or environmental problems routinely involve multiple variables measured at the same spatial location and time point. Often, the main interest is to study the relationships between the multiple variables, which would include an accounting of any spatial and/or temporal correlations. On the other hand, with the advance of science and technology, it is common to have data with global coverage. One good example is the study of the relationship between surface temperature and precipitation (Trenberth and Shea 2005, Tebaldi and Lobell 2008, Tebaldi and Sansó 2009). As stated in Tebaldi and Lobell (2008), from the climate impact research point of view, studying the joint distribution of surface temperature and precipitation is more interesting than studying each variable separately. Trenberth and Shea (2005) estimate the empirical (spatial) cross-correlation between surface temperature and precipitation using the numerical model outputs from the Community Climate System Model version 3 (CCSM3) developed by the National Center for Atmospheric Research (NCAR). However, their estimates are based on sample correlations and, as several authors including Bishop and Hodyss (2007) point out, sample correlations often give spurious correlations when the dimension of the system is much larger than the sample size. Therefore, it is essential to develop joint models for the variables that can account for not only the marginal but also the cross-covariance structures and that are valid over the whole globe.

A number of authors have developed cross-covariance models for multivariate spatial processes. One of the most traditional methods is the linear model of coregionalization (LMC) (Goulard and Voltz 1992, Wackernagel 2003), whose key idea is to represent each process as a linear combination of latent, independent, and stationary (often isotropic) processes.
Schmidt and Gelfand (2003) present a Bayesian stationary cross-covariance model based on the idea of the LMC. Gelfand, Schmidt, Banerjee, and Sirmans (2004) provide a good review of the history of the methods for multivariate processes and Schmidt and Gelfand (2003) extend the model by using a spatially varying LMC to account for nonstationarity. Majumdar and Gelfand (2007) present an approach to modeling stationary processes based on convolving covariance functions. This model is then extended to nonstationary processes in Majumdar, Paul, and Bautista (2010). A semiparametric approach to modeling multivariate spatial processes is proposed in Reich and Fuentes (2007). Their covariance model is nonstationary but has a separable structure; the cross-covariance is factored into a multivariate component and a spatial component, which may be limiting in some situations. Choi, Reich, Fuentes, and Davis (2009) use a spatio-temporal version of the LMC model to deal with speciated fine particles over the US with separable covariance functions.

In terms of developing parametric classes of covariance models for multivariate spatial processes, other than those based on LMC, there have been only a few papers. Apanasovich and Genton (2010) propose using latent dimensions to create a valid covariance model for multivariate processes from a covariance model for a univariate process. They present the model for spatio-temporal processes. Their method is convenient for producing valid covariance models for multivariate processes, but they assume stationarity (although in principle their method can be used for nonstationary processes). They introduce a concept of distance between different processes, which is different from the usual spatial distances or temporal lags, and it is not clear what this distance actually means and how it compares with physical distances in the space and time domains. Under their setting, one needs to estimate this distance along with the covariance parameters.

Gneiting, Kleiber, and Schlather (2010) present a Matérn type covariance model for multivariate processes. Their model is for isotropic processes only. One of the nice features of their model is that it allows different smoothness for different processes in the multivariate setting, which can be useful for some data. Note though that their cross-covariance model is symmetric. That is, if we consider a bivariate process, $(Z_1, Z_2)$, on locations $s_1$ and $s_2$, their model implies

$$\mathrm{Cov}\{Z_1(s_1), Z_2(s_2)\} = \mathrm{Cov}\{Z_2(s_1), Z_1(s_2)\}$$

for all $s_1$ and $s_2$, which may not be the case in many geophysical and environmental data sets. The co-located correlation parameter $\rho$, which controls the strength of cross-correlation between the two variables, is constant over the entire domain and this may be too restrictive for some data sets. As demonstrated in Section 4, for the data set that we consider in this paper, this limitation leads to an estimate of $\hat{\rho} \approx 0$, even though there is a clear dependence between the two variables.
Furthermore, the above covariance functions are designed for stationary (or isotropic) processes and none are for processes on a globe. To the best of our knowledge, there is no such flexible nonstationary cross-covariance function for spatial processes on a sphere. Our focus in this paper is to develop cross-covariance functions for multivariate processes on a globe. Moreover, the covariance model is flexible enough to capture nonstationarity in the data and other complex covariance patterns. Our covariance models require moderate numbers of covariance parameters and are thus computationally feasible. The remainder of the paper is organized as follows. In Section 2, we discuss some properties of nonstationary covariance structure. Section 3 presents the construction of our covariance model and discusses some computational issues. The application to the joint modeling of global surface temperature and precipitation data is presented in Section 4. Section 4 shows a comparison of our model with some models proposed in Gneiting et al. (2010) and the LMC models. We also show our estimated cross-correlation between the surface temperature and precipitation and we compare it with the result of Trenberth and Shea (2005). We

conclude the paper with some discussion in Section 5.

2. NONSTATIONARY COVARIANCE STRUCTURE

In this section, we explore some prevalent features of nonstationarity in the covariance structure of geophysical processes on a globe. We discuss those properties for both marginal and cross-covariances in parallel and show some empirical figures from our data (for details of the data, see Section 4.1). Throughout the section, we denote the variable of interest as $(Z_i(L, l),\ i = 1, 2, \ldots, N)$, a multivariate process on the surface of a globe $S^2$ (the surface of a sphere in $\mathbb{R}^3$ with radius $R$), and we illustrate for the case when $N = 2$. Note that $L$ and $l$ denote latitude and longitude, respectively.

2.1 Dependence on latitude

It is common for geophysical processes on a globe to have covariance structure depending on latitude (Stein 2007). In particular, the local variation of the process usually changes with latitude and in fact, Jun and Stein (2008) show that variances of several linear combinations of total column ozone level exhibit strong dependence on latitude. The surface temperature and precipitation data that we consider in this paper possess this kind of nonstationarity for both marginal and cross-covariances. Figures 1-3 in Trenberth and Shea (2005) show that the standard deviations for both variables as well as their cross-correlations have patterns depending on latitude. Figure 1 of this paper displays the standard deviations and cross-correlations of surface temperature and precipitation, averaged over November to March each year and then averaged over 1970 to 1999. Figures (a), (c), and (e) give each quantity with respect to latitude, that is, at each latitude, the standard deviation or cross-correlation of the data across all longitude values is calculated. Figures (b), (d), and (f) give the standard deviation or cross-correlation of the data calculated across all latitude values, at each longitude. See the Appendix for more details on how these values are calculated.
Notice that although we see some dependence of standard deviation and cross-correlation with respect to longitude, the dependence is more obvious with respect to latitude. This may suggest the processes are reasonably modeled as axially symmetric (Jones 1963) both marginally and jointly; the covariance structure is stationary with respect to longitude and nonstationary with respect to latitude.

2.2 Longitudinal reversibility

We say a univariate process $Z_1$ is longitudinally reversible if

$$\mathrm{Cov}\{Z_1(L_1, l_1), Z_1(L_2, l_2)\} = \mathrm{Cov}\{Z_1(L_1, l_2), Z_1(L_2, l_1)\}$$

for all $L_1, L_2, l_1, l_2$ (Stein 2007). Stein (2007) and Jun and Stein (2008) show that the total column ozone process is longitudinally irreversible; we find that it is also the case for both temperature and precipitation data marginally (not shown). This concept can be applied to cross-covariances as well. Call the cross-covariance of the two processes $Z_1$ and $Z_2$ longitudinally reversible if

$$\mathrm{Cov}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\} = \mathrm{Cov}\{Z_1(L_1, l_2), Z_2(L_2, l_1)\}$$

for all $L_1, L_2, l_1, l_2$. For some data sets, however, longitudinally irreversible cross-covariances may be hard to estimate. Figure 2 (a) shows the empirical estimate of the difference of the cross-correlations at a latitude band,

$$\mathrm{Cor}\{Z_1(L, l), Z_2(L, l + \Delta)\} - \mathrm{Cor}\{Z_1(L, l + \Delta), Z_2(L, l)\},$$

against latitude (x-axis) and longitude lag, $\Delta$ (y-axis). The empirical estimate for the above quantity is based on temporally averaged data, a 30 year average (see Section 4.1 for details on how we aggregate the data temporally). To get these empirical estimates, we first bin the latitude with a bin size of roughly 7°. Then within each bin, for each $\Delta$, we calculate the correlations between the two variables at the longitude lag of $\Delta$. To assess the uncertainty of the empirical longitudinal irreversibility, we also split the total 30 year period into 30 intervals of one year and, instead of calculating irreversibility using a 30 year average, we calculate the irreversibility based on 30 annual averages. The mean and the standard deviation of these irreversibility quantities based on these 30 data points (annual averages) are given in (b) and (c), respectively. Note that (a) and (b) give quite similar patterns although the range of irreversibility in (b) is narrower.
Although there seems to be some strong longitudinal irreversibility in the cross-correlation near the poles and in the mid-latitudes of the Southern Hemisphere at large longitudinal lags, the uncertainty associated with it (especially near the poles) is high. It might be hard to fit the pattern with any smooth function of latitude and longitude lags due to the complex nature of the empirical irreversibility surface. See Section 4 for more discussion on this issue.
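As a rough illustration of the empirical quantity plotted in Figure 2, the sketch below estimates the difference of lagged cross-correlations for a single latitude band. It is a simplification under stated assumptions, not the procedure used for the figure: the band is represented by one value per longitude on a regular, full-circle grid, the lag is an integer number of grid steps, correlations are computed across longitudes, and the toy values of `z1` and `z2` are made up. The function names `corr`, `shift`, and `irreversibility` are hypothetical.

```python
import math

def corr(x, y):
    """Sample correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def shift(x, k):
    """Circular shift by k grid steps: result[i] = x[(i + k) % q]."""
    q = len(x)
    return [x[(i + k) % q] for i in range(q)]

def irreversibility(z1, z2, k):
    """Cor{Z1(l), Z2(l + lag)} - Cor{Z1(l + lag), Z2(l)} at a lag of k grid steps."""
    return corr(z1, shift(z2, k)) - corr(shift(z1, k), z2)

# Toy band with q = 12 longitudes (made-up numbers, not the CCSM3 output).
q = 12
z1 = [math.sin(2 * math.pi * i / q) + 0.3 * math.cos(4 * math.pi * i / q)
      for i in range(q)]
z2 = [0.8 * z1[(i - 2) % q] + 0.1 * (i % 3) for i in range(q)]

for k in (1, 2, 3):
    print(k, irreversibility(z1, z2, k))
```

On a full circle the estimate is antisymmetric in the lag, so only positive lags need to be examined; a value near zero at every lag is what a longitudinally reversible cross-covariance would produce.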

2.3 Asymmetry

We now define a general concept of asymmetry for multivariate spatial processes. We call the cross-covariance of $Z_1$ and $Z_2$ symmetric if

$$\mathrm{Cov}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\} = \mathrm{Cov}\{Z_1(L_2, l_2), Z_2(L_1, l_1)\}$$

for all $L_1, L_2, l_1, l_2$. If the cross-covariance structure is asymmetric, then the cross-covariance matrix of some set of observations will generally be asymmetric. Note that the model proposed in Gneiting et al. (2010) is always symmetric. The model from the linear model of coregionalization is also symmetric unless the coefficients vary spatially. Apanasovich and Genton (2010) present cross-covariance models that are asymmetric in the space-time domain, which is somewhat different from the asymmetry discussed in this paper.

3. METHODOLOGY

In this section, we develop joint covariance models that can exhibit the nonstationary properties discussed in Section 2 for not only marginal but also cross-covariance structure. We also discuss computational methods that enable us to compute full likelihoods efficiently when we have large global data sets on a regular grid for multivariate processes.

3.1 Model

Jun and Stein (2007) proposed an approach to produce nonstationary covariance models for a univariate process on a globe to capture space-time asymmetry, which is commonly found in environmental data (Gneiting 2002; Jun and Stein 2004; Li, Genton, and Sherman 2008). The key idea is to apply differential operators with respect to latitude, longitude, and time to an isotropic spatio-temporal process, and Section 4 of Jun and Stein (2007) demonstrates the effectiveness of the approach in capturing such space-time asymmetry. Jun and Stein (2008) further explore the idea of applying differential operators with respect to latitude and longitude to an isotropic spatial process to represent various nonstationary properties of a univariate process on a globe.
They demonstrate that their model captures the small-scale variation of a univariate process well. The key to the model is its flexibility, resulting from the products of first order differential operators with respect to latitude and longitude, applied to an underlying process. We extend this idea to multivariate spatial processes on a globe. Now we show how the idea of applying differential operators to the processes can be applied to multivariate isotropic spatial processes to create nonstationary cross-covariance structure, in particular,

asymmetry and longitudinal irreversibility, which depends on latitude in a flexible way. Suppose we have a multivariate spatial process, $(Z_1(L, l), \ldots, Z_N(L, l))$, defined on a globe, $S^2$, and we are interested in modeling the joint distribution of the $Z_i$'s. We will focus on the case that $N = 2$ here; for the case that $N > 2$, the method extends in a natural way. Let us assume $(Z_1, Z_2)$ is a bivariate process with mean zero. We also assume that the $Z_i$'s are axially symmetric both marginally and jointly. Let us write $Y = G(\alpha, \beta, \nu)$ if the process $Y$ defined on $S^2$ has mean zero and its covariance is given by a Matérn covariance function:

$$\mathrm{Cov}\{Y(L_1, l_1), Y(L_2, l_2)\} = K(L_1, L_2, l_1 - l_2) = \alpha \left(\frac{d}{\beta}\right)^{\nu} K_{\nu}\!\left(\frac{d}{\beta}\right). \tag{1}$$

Here $K$ denotes the covariance function of $Y$, the $L_i$'s are latitude values and the $l_i$'s are longitude values ($i = 1, 2$). The parameters $\alpha, \beta, \nu > 0$ are the sill, spatial range, and smoothness parameters for a Matérn class, respectively, $K_{\nu}$ is the modified Bessel function, and

$$d = d(L_1, L_2, l_1 - l_2) = 2R \left\{ \sin^2\!\left(\frac{L_1 - L_2}{2}\right) + \cos L_1 \cos L_2 \sin^2\!\left(\frac{l_1 - l_2}{2}\right) \right\}^{1/2} \tag{2}$$

denotes the chordal distance between the two locations, $(L_1, l_1)$ and $(L_2, l_2)$. To ensure the positive definiteness of (1) on $S^2$, we need to use chordal distance as a spatial metric instead of a geodesic distance (see Jun and Stein (2007) for a detailed discussion). Let us first consider a simple setting:

$$\delta Z_i(L, l) = a_i \{Y(L + \delta, l) - Y(L, l)\} + b_i \{Y(L, l + \delta) - Y(L, l)\}, \quad i = 1, 2, \tag{3}$$

with $a_i$, $b_i$ constants and $\delta > 0$. When $\delta \to 0$, (3) is essentially equivalent to the model with differential operators with respect to latitude and longitude applied to the process $Y$ in the $L^2$ sense (instead of taking differences).
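A minimal sketch of (1) and (2) follows. To stay within the standard library it fixes the smoothness at $\nu = 5/2$, for which $x^{\nu} K_{\nu}(x) = \sqrt{\pi/2}\, e^{-x}(x^2 + 3x + 3)$ in closed form; a general $\nu$ would need a modified Bessel routine. The Earth radius of 6371 km, the default $\beta$ = 2000 km, and the function names are assumptions made for illustration.

```python
import math

R_EARTH_KM = 6371.0  # assumed value for the radius R, in km

def chordal_distance(lat1, lat2, dlon, radius=R_EARTH_KM):
    """Chordal distance (2); latitudes and dlon = l1 - l2 are in radians."""
    inner = (math.sin((lat1 - lat2) / 2.0) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2.0) ** 2)
    return 2.0 * radius * math.sqrt(inner)

def matern_52(d, alpha=1.0, beta=2000.0):
    """alpha * x**nu * K_nu(x) with x = d / beta and nu = 5/2 (closed form)."""
    x = d / beta
    return alpha * math.sqrt(math.pi / 2.0) * math.exp(-x) * (x * x + 3.0 * x + 3.0)

def cov_y(lat1, lon1, lat2, lon2, alpha=1.0, beta=2000.0):
    """Covariance (1) of Y between (lat1, lon1) and (lat2, lon2)."""
    return matern_52(chordal_distance(lat1, lat2, lon1 - lon2), alpha, beta)

# Covariance between two points 10 degrees of longitude apart at 45N.
print(cov_y(math.radians(45), 0.0, math.radians(45), math.radians(10)))
```

Note that in the parameterization of (1) the value at $d = 0$ is $\alpha\, 2^{\nu - 1} \Gamma(\nu)$ rather than $\alpha$; for $\nu = 5/2$ this is $3\sqrt{\pi/2}\,\alpha$.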
It is easy to see from (3) that when $\delta \to 0$, the cross-covariance of $Z_1$ and $Z_2$ can be written as

$$\mathrm{Cov}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\} = a_1 a_2 \frac{\partial^2}{\partial L_1 \partial L_2} K(L_1, L_2, l) - b_1 b_2 \frac{\partial^2}{\partial l^2} K(L_1, L_2, l) - a_1 b_2 \frac{\partial^2}{\partial L_1 \partial l} K(L_1, L_2, l) + b_1 a_2 \frac{\partial^2}{\partial L_2 \partial l} K(L_1, L_2, l), \tag{4}$$

with $l = l_1 - l_2$. Note that the second order partial derivatives of $K$ in (4) originate from the limits of the second order differences of the covariance $K$. For example,

$$\frac{\partial^2}{\partial L_1 \partial L_2} K(L_1, L_2, l) = \lim_{\delta \to 0} \frac{1}{\delta^2} \{K(L_1 + \delta, L_2 + \delta, l) - K(L_1 + \delta, L_2, l) - K(L_1, L_2 + \delta, l) + K(L_1, L_2, l)\}.$$

Furthermore, to have the limit properly defined, we need to have $\nu > 1$. For more details on this condition, see Section 2 of Stein (1999) and the result in Jun and Stein (2007).
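The limit above can be illustrated numerically. The sketch below evaluates the second order difference of $K$ at a coincident point ($L_1 = L_2$, $l = 0$) for decreasing $\delta$, under the same illustrative assumptions as before: $\alpha = 1$, $\nu = 5/2$ in closed form, $R$ = 6371 km, and $\beta$ = 2000 km. For this $\nu$, expanding (1) for small $d$ gives $K \approx \sqrt{\pi/2}\,\{3 - (d/\beta)^2/2\}$, so the limit at a coincident point is $(R/\beta)^2 \sqrt{\pi/2}$, which is the variance of the latitudinal derivative of $Y$.

```python
import math

R, BETA = 6371.0, 2000.0  # assumed radius and range, in km

def K(lat1, lat2, dlon):
    """Matern covariance (1) with alpha = 1, nu = 5/2, at chordal distance (2)."""
    inner = (math.sin((lat1 - lat2) / 2.0) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2.0) ** 2)
    x = 2.0 * R * math.sqrt(inner) / BETA
    return math.sqrt(math.pi / 2.0) * math.exp(-x) * (x * x + 3.0 * x + 3.0)

def second_difference(lat1, lat2, dlon, delta):
    """Second order difference of K in (L1, L2), divided by delta**2."""
    return (K(lat1 + delta, lat2 + delta, dlon) - K(lat1 + delta, lat2, dlon)
            - K(lat1, lat2 + delta, dlon) + K(lat1, lat2, dlon)) / delta ** 2

# Analytic limit at a coincident point for nu = 5/2 and alpha = 1.
limit = (R / BETA) ** 2 * math.sqrt(math.pi / 2.0)

for delta in (1e-1, 1e-2, 1e-3):
    print(delta, second_difference(0.5, 0.5, 0.0, delta), limit)
```

The printed values approach the analytic limit as $\delta$ shrinks, which is the sense in which the differences in (3) turn into derivatives.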

Now let us consider the longitudinal irreversibility and the asymmetry discussed in Sections 2.2 and 2.3. It is straightforward from (4) that when $\delta \to 0$, the longitudinal irreversibility is given by

$$\mathrm{Cov}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\} - \mathrm{Cov}\{Z_1(L_1, l_2), Z_2(L_2, l_1)\} = -2 a_1 b_2 \frac{\partial^2}{\partial L_1 \partial l} K(L_1, L_2, l) + 2 b_1 a_2 \frac{\partial^2}{\partial L_2 \partial l} K(L_1, L_2, l), \tag{5}$$

and the asymmetry is given by

$$\mathrm{Cov}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\} - \mathrm{Cov}\{Z_1(L_2, l_2), Z_2(L_1, l_1)\} = (-a_1 b_2 + b_1 a_2) \left\{ \frac{\partial^2}{\partial L_1 \partial l} K(L_1, L_2, l) + \frac{\partial^2}{\partial L_2 \partial l} K(L_1, L_2, l) \right\}. \tag{6}$$

From (5) and (6), it is clear that the types of nonstationarity in the cross-covariance discussed in Sections 2.2 and 2.3 are achieved mainly from the interactions between the first and second order differences of $Y$ in (3). If $a_1 = a_2$ and $b_1 = b_2$, the asymmetry in (6) reduces to zero, although the longitudinal irreversibility in (5) may not be zero.

We show some plots of the cross-covariance structure for various values of the $a_i$'s and $b_i$'s to further demonstrate the behavior of the proposed covariance model in a simple setting. We work with the correlation scale instead of the covariance scale. We fix $\alpha = 1$, $a_1 = 1$, and $\beta = 2000$ (km). We vary $a_2$, $b_1$, $b_2$, and $\nu$ to explore the nonstationarities of the covariance model of (3) when $\delta \to 0$. Figure 3 gives the longitudinal irreversibility in the cross-correlation structure given in (5). The irreversibility in the covariance scale should have the same shape except for the scale. Notice that different $\nu$ values give different shapes of the irreversibility. From (5), the irreversibility depends on the $a_i$'s and $b_i$'s through $a_1 b_2$ and $b_1 a_2$. If the signs of $a_1 b_2$ and $b_1 a_2$ change together, the sign of the irreversibility should also change. Therefore, we see that several sets of the $a_i$'s and $b_i$'s give either the same irreversibility or the same magnitude of irreversibility with different signs.
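Identities (5) and (6) can be checked numerically by approximating the partial derivatives in (4) with central differences. The sketch below does this at one arbitrary pair of latitudes and one longitude lag. The choices $\alpha = 1$, $\nu = 5/2$, $R$ = 6371 km, $\beta$ = 2000 km, the step size, the coefficient values, and all function names are illustrative assumptions, not values taken from the figures.

```python
import math

R, BETA, EPS = 6371.0, 2000.0, 1e-3  # assumed radius, range (km), and step (rad)

def K(lat1, lat2, dlon):
    """Matern covariance (1) with alpha = 1, nu = 5/2, at chordal distance (2)."""
    inner = (math.sin((lat1 - lat2) / 2.0) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2.0) ** 2)
    x = 2.0 * R * math.sqrt(inner) / BETA
    return math.sqrt(math.pi / 2.0) * math.exp(-x) * (x * x + 3.0 * x + 3.0)

def mixed(p, i, j):
    """Central-difference estimate of d^2 K / dx_i dx_j at p = (L1, L2, l)."""
    def at(si, sj):
        q = list(p)
        q[i] += si * EPS
        q[j] += sj * EPS
        return K(*q)
    return (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4.0 * EPS * EPS)

def cross_cov(coef, lat1, lat2, dlon):
    """Equation (4): Cov{Z1(L1, l1), Z2(L2, l2)} with dlon = l1 - l2."""
    a1, b1, a2, b2 = coef
    p = (lat1, lat2, dlon)
    return (a1 * a2 * mixed(p, 0, 1) - b1 * b2 * mixed(p, 2, 2)
            - a1 * b2 * mixed(p, 0, 2) + b1 * a2 * mixed(p, 1, 2))

def irreversibility(coef, lat1, lat2, dlon):
    """Left-hand side of (5): swap the two longitudes."""
    return cross_cov(coef, lat1, lat2, dlon) - cross_cov(coef, lat1, lat2, -dlon)

def asymmetry(coef, lat1, lat2, dlon):
    """Left-hand side of (6): swap the two locations."""
    return cross_cov(coef, lat1, lat2, dlon) - cross_cov(coef, lat2, lat1, -dlon)

coef = (1.0, 0.5, 0.1, -0.3)   # (a1, b1, a2, b2), arbitrary illustrative values
point = (0.5, 0.3, 0.4)        # (L1, L2, l1 - l2) in radians
a1, b1, a2, b2 = coef
rhs5 = -2 * a1 * b2 * mixed(point, 0, 2) + 2 * b1 * a2 * mixed(point, 1, 2)
rhs6 = (-a1 * b2 + b1 * a2) * (mixed(point, 0, 2) + mixed(point, 1, 2))
print(irreversibility(coef, *point), rhs5)
print(asymmetry(coef, *point), rhs6)
```

Setting $a_1 = a_2$ and $b_1 = b_2$ in `coef` makes the asymmetry vanish while leaving the irreversibility free to be nonzero, as stated after (6).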
For example, when $a_2 = 0$, the pairs, (a) and (b), (c) and (d), (e) and (f), and (g) and (h), give the same irreversibility. Moreover, when $a_2 = 0$, the irreversibilities in (a) and (b) and those in (c) and (d) have the same magnitudes but different signs. When $a_2 \neq 0$, the pairs, (a) and (d), (b) and (c), (e) and (h), and (f) and (g), give the same magnitude of the irreversibility with different signs. It may then appear that there are some identifiability problems in the $a_i$'s and $b_i$'s since some sets of these coefficients give the same irreversibility curves. However, this is not the case for the covariance structure of the bivariate process. As long as we fix the sign of only one of the four coefficients ($a_i$'s and $b_i$'s), we can avoid the identifiability problem (see (4)). Figure 4 gives the asymmetry against longitudinal lags in the cross-correlation structure given in (6). Unlike the longitudinal irreversibility, the asymmetry in the covariance scale may have a different

shape than the asymmetry in the correlation scale. This is because the asymmetry in the correlation scale is

$$\mathrm{Cor}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\} - \mathrm{Cor}\{Z_1(L_2, l_2), Z_2(L_1, l_1)\} = \frac{\mathrm{Cov}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\}}{\sqrt{\mathrm{Var}\{Z_1(L_1, l_1)\}\mathrm{Var}\{Z_2(L_2, l_2)\}}} - \frac{\mathrm{Cov}\{Z_1(L_2, l_2), Z_2(L_1, l_1)\}}{\sqrt{\mathrm{Var}\{Z_1(L_2, l_2)\}\mathrm{Var}\{Z_2(L_1, l_1)\}}}, \tag{7}$$

and the terms in each denominator are in general not the same except in cases such as $b_2 = a_2 b_1$ ($a_2 \neq 0$), for any $L_1, L_2, l_1, l_2$. Therefore, the asymmetries in the correlation scale in general do not simply depend on the coefficients through $a_1 b_2 - b_1 a_2$ but in a more complex manner. For example, when $a_2 = 0$, even if (a) and (b) have the same $b_2$ value, their asymmetries are different. When $a_2 = \pm 0.1$ and $b_2 = a_2 b_1$, the asymmetry is zero. When $a_2 = \pm 0.1$ and $b_2 = -a_2 b_1$, then we may get the same magnitude of the irreversibility but with different signs. For example, the pairs, (b) and (c) with $a_2 = 0.1$ and (a) and (d) with $a_2 = -0.1$, give the same magnitude of the irreversibility with different signs. It is interesting to note that even if $l_1 - l_2 = 0$, the asymmetry for some combinations of the coefficients is not zero (because $L_1 \neq L_2$). Figure 5 (a)-(d) display the asymmetry against latitude. Note the asymmetries are zero when $L_2 = 0$ and the pairs, (a) and (d) or (b) and (c), give symmetric asymmetry values around $L_2 = 0$. The fact that the asymmetries are zero when $L_2 = 0$ can also be easily explained by (11) (we will discuss this further when we introduce (11) later). Note that the asymmetries are not necessarily symmetric about the equator, which is realistic for most real data sets.

We now generalize the model in (3) in the sense that the coefficients are functions of latitude values. That is, we write:

$$Z_i(L, l) = \sum_{k=1}^{n} \left\{ A_{i,k}(L) \frac{\partial}{\partial L} + B_{i,k}(L) \frac{\partial}{\partial l} \right\} Y_k(L, l) + C_i(L) Y_0(L, l). \tag{8}$$

Here, the partial derivatives are defined in the $L^2$ sense and $Y_k = G(\alpha_k, \beta_k, \nu_k)$ ($\alpha_k, \beta_k > 0$, $k = 0, \ldots, n$, $\nu_0 > 0$, and $\nu_k > 1$, $k = 1, \ldots, n$). If $\nu_k \le 1$ for $k = 1, \ldots, n$, then the mean square derivatives of $Y_k$ are not properly defined. We assume the $Y_k$'s ($k = 0, \ldots, n$) are independent of each other. A useful simplification is to assume the $Y_k$ share the same covariance parameters for $k = 1, \ldots, n$, but we generally let $Y_0$ have different covariance parameters than $Y_1, \ldots, Y_n$ to allow sufficient flexibility in the local behavior of the model. Note that it is not necessary to include $Y_0$ in (8). We may then let $C_i = 0$ for parsimony. The functions $A_{i,k}$, $B_{i,k}$, and $C_i$ in (8) are nonrandom functions and we model these functions as

linear combinations of Legendre polynomials. For instance, we let

$$A_{i,k}(L) = \sum_{j=0}^{m} a_{ikj} P_j(\sin L), \tag{9}$$

where $P_j$ denotes the Legendre polynomial of order $j$. Then the $a_{ikj} \in \mathbb{R}$ are additional covariance parameters to be estimated along with the other covariance parameters. The maximum order of Legendre polynomials used here, $m$, is chosen arbitrarily and we expect that a modest value of $m$ should be able to produce flexible covariance functions. The values of $m$ for $A_{i,k}$, $B_{i,k}$, and $C_i$ may be different. Larger $m$ will obviously give more flexibility to the covariance structure and we may let $m = 0$ for a parsimonious model. We compare the different possibilities discussed here in Section 4.

Although the $Y_k$'s are independent of each other, the $Z_i$'s have nonzero cross-covariance and we can get explicit expressions for the cross-covariance of the $Z_i$'s. For instance, suppose $n = 1$, $Y_1 = G(1, \beta, \nu)$ ($\nu > 1$), and $C_i = 0$ ($i = 1, 2$). Set $h = h(L_1, L_2, l_1 - l_2) = (d/\beta)^2$ for $d$ defined in (2), $h_p = h_p(L_1, L_2, l_1 - l_2) = \partial h / \partial x_p$ and $h_{pq} = h_{pq}(L_1, L_2, l_1 - l_2) = \partial^2 h / \partial x_p \partial x_q$, where $x_1 = L_1$, $x_2 = L_2$, and $x_3 = l_1 - l_2$ (see Appendix A of Jun and Stein (2007) for explicit expressions for $h_p$ and $h_{pq}$). Also let $M_{\nu}(x) = x^{\nu} K_{\nu}(x)$. Then the cross-covariance function of $Z_1$ and $Z_2$ is given by

$$\mathrm{Cov}\{Z_1(L_1, l_1), Z_2(L_2, l_2)\} = \Gamma_1 M_{\nu - 2}(\sqrt{h}) + \Gamma_2 M_{\nu - 1}(\sqrt{h}), \tag{10}$$

where $\Gamma_1$ and $\Gamma_2$ are

$$\Gamma_1 = \frac{1}{4} \{ A_{1,1}(L_1) A_{2,1}(L_2) h_1 h_2 - B_{1,1}(L_1) B_{2,1}(L_2) h_3^2 - A_{1,1}(L_1) B_{2,1}(L_2) h_1 h_3 + B_{1,1}(L_1) A_{2,1}(L_2) h_2 h_3 \},$$

and

$$\Gamma_2 = -\frac{1}{2} \{ A_{1,1}(L_1) A_{2,1}(L_2) h_{12} - B_{1,1}(L_1) B_{2,1}(L_2) h_{33} - A_{1,1}(L_1) B_{2,1}(L_2) h_{13} + B_{1,1}(L_1) A_{2,1}(L_2) h_{23} \}.$$
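The coefficient functions in (9) are straightforward to evaluate. The sketch below computes $P_j$ with the standard three-term (Bonnet) recurrence and then forms the linear combination in $\sin L$. The function names and the example coefficient list are hypothetical; the list `[5.0, 0.0, 5.0]` simply encodes a function of the form $5 + 5 P_2(\sin L)$.

```python
import math

def legendre(j, x):
    """Legendre polynomial P_j(x) via the recurrence
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    if j == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, j):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def coefficient(lat, coeffs):
    """Equation (9): sum_j coeffs[j] * P_j(sin L), with lat in radians."""
    x = math.sin(lat)
    return sum(a * legendre(j, x) for j, a in enumerate(coeffs))

# A latitude-dependent coefficient of order m = 2, tabulated every 30 degrees.
for deg in (-90, -60, -30, 0, 30, 60, 90):
    print(deg, coefficient(math.radians(deg), [5.0, 0.0, 5.0]))
```

With $m = 0$ the function reduces to a constant, which is the parsimonious choice mentioned above; each additional order adds one parameter per coefficient function.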
The cross product terms of $A_{1,k}$ or $B_{1,k}$ and $A_{2,k}$ or $B_{2,k}$ in $\Gamma_1$ and $\Gamma_2$ come from the covariance of processes with $\partial/\partial L$ and $\partial/\partial l$ applied in (8), and through the linear combination terms for $A_{i,k}$ and $B_{i,k}$ as in (9), the resulting cross-covariance model in (10) achieves great flexibility and can capture complex nonstationary structure in the data. In particular, it can be easily shown that for any latitude $L$, longitude $l$, and longitudinal lag $\Delta$,

$$\mathrm{Cov}\{Z_1(L, l), Z_2(L, l + \Delta)\} - \mathrm{Cov}\{Z_1(L, l + \Delta), Z_2(L, l)\} = \{B_{1,1}(L) A_{2,1}(L) - A_{1,1}(L) B_{2,1}(L)\}\, 4\beta^{-2} R^2 \sin L \cos L \sin\!\left(\frac{\Delta}{2}\right) \cos\!\left(\frac{\Delta}{2}\right) \left\{ \frac{1}{2} M_{\nu - 2}(\sqrt{h})\, 4\beta^{-2} R^2 \cos^2 L \sin^2\!\left(\frac{\Delta}{2}\right) - M_{\nu - 1}(\sqrt{h}) \right\}, \tag{11}$$

where $h = 4\beta^{-2} R^2 \cos^2 L \sin^2(\Delta/2)$. Therefore, by letting the $A_{i,k}$ and $B_{i,k}$ functions depend on the latitude, $L$, the proposed covariance model can produce flexible longitudinal irreversibility. (11) can also be used to prove the fact that the asymmetries are zero when $L_2 = 0$ in Figure 5 (note that when $L_1 = L_2$, the asymmetry reduces to the longitudinal irreversibility). In fact, under the current covariance model, longitudinal irreversibility at the equator is always zero due to the term $\sin L$ in (11). Figure 5 (e)-(f) display the asymmetry when $A_{1,1}(L) = 1$, $A_{2,1}(L) = a_2$, $B_{1,1}(L) = b_1$, $B_{2,1}(L) = b_2(L) = b_{20}(5 + 5P_2(\sin L))$. These plots demonstrate that by allowing the coefficients $A_{i,k}$ and $B_{i,k}$ to depend on the latitude, we get more flexibility in the resulting covariance structure.

It is shown in Jun and Stein (2008) that the marginal correlation from the model in (8) (with $C_i = 0$) can be as small as $-1$. For cross-correlations, we also achieve the range of $-1$ to $1$. For an extreme example, in (8), suppose $n = 1$, $A_{1,1} = A_{2,1} = 1$, $B_{1,1} = B_{2,1} = 0$, and $C_1 = C_2 = 0$. Then it is easy to see that the cross-correlation between $Z_1$ and $Z_2$ is 1 everywhere.

The proposed method has a similar spirit to the LMC model in the sense that each process $Z_i$ is modeled as a linear combination of latent processes. However, the model proposed in this paper has several significantly different aspects compared to those of LMC. In the model, the cross-covariance structure is characterized by the first term of (8) and even when $n = 1$ with $C_i = 0$, we do achieve fairly flexible cross-covariance models that we cannot achieve with LMC models that have more covariance parameters (see Section 4.3). The expression in the summation of (8) may appear to be a linear combination of latent processes, but in fact the differential operators are defined in the $L^2$ sense. The variations of LMC models that give nonstationary covariance models such as in Gelfand et al.
(2004) achieve the nonstationarity quite differently from the way the models in (8) achieve it. One of the fundamental differences between the approach proposed in this paper and the LMC model is that the differential operators with respect to latitude and longitude are applied to the same process ($Y_k$ in (8)). Suppose we have $n = 1$ in (8). The LMC models consider linear combinations of independent processes, but the differential operators in (8) effectively evaluate covariances of differences of the same process, $Y_1$, with small latitudinal or longitudinal lags (see (3)). Hence, the nonstationarity with respect to latitude comes not only from the coefficients of the partial differential operators, $A_{i,k}$ and $B_{i,k}$, but also from the differential operators, $\partial/\partial L$ and $\partial/\partial l$, and the covariance between the processes with each differential operator applied.

We choose to use Legendre polynomials in modeling the coefficients of the differential operators. It is not clear how we could get empirical estimates of these coefficients from the data and thus we instead model these coefficients through some orthogonal polynomials of the latitude. Legendre polynomials in that sense are a natural choice since they are orthogonal over the interval $[-1, 1]$ and

thus the $P_j(\sin L)$'s for $-90° \le L \le 90°$ are orthogonal.

There are possible limitations of the model in (8). The first limitation is that the $Z_i$'s may all have the same spatial range and smoothness parameter since the $Z_i$'s consist of the same processes (the $Y_k$'s). One easy fix for this problem is either to let $Y_0$ have different covariance parameters than the $Y_k$'s ($k > 0$) or to add more terms (processes) in (8) and let these terms have different covariance parameters. We will explore this issue further for the climatological application in Section 4.

3.2 Computational Issues

It is common to estimate the covariance parameters as well as the mean parameters using maximum likelihood estimation and for that purpose, we from now on assume that the process is multivariate Gaussian. Many spatial data sets these days are of large dimension and often it can be quite challenging to efficiently compute the full likelihood. For the case of regularly spaced data, which is usually the case for satellite data and numerical model outputs, however, the computation of the exact likelihood can be quite efficient. Jun and Stein (2008) demonstrate such a method using the discrete Fourier transform (DFT) for a univariate spatial process. The key idea is the following: since the covariance model is axially symmetric and we have regularly spaced longitude values covering the full range, the resulting covariance matrix can be written in a block circulant form. Then using the fact that a block circulant matrix can be diagonalized by applying the DFT, we can calculate the inverse and the determinant of the covariance matrix efficiently (see Jun and Stein (2008) for more details on how this works). The same idea can be applied for the cross-covariance matrix.
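The key idea above can be sketched in a deliberately small setting. The example assumes a bivariate process on a single latitude band ($p = 1$) with $q = 4$ equally spaced longitudes, so that each marginal and cross-covariance block is a plain circulant matrix and each frequency contributes one 2 x 2 Hermitian block. The first rows `c1`, `c2`, `c12` are made-up numbers chosen only so that the marginal blocks are symmetric; they are not fitted values. The determinant obtained frequency by frequency is compared against a direct Gaussian-elimination determinant of the full 2q x 2q matrix.

```python
import cmath

def dft(c):
    """Eigenvalues of the circulant matrix whose first row is c."""
    q = len(c)
    return [sum(c[r] * cmath.exp(-2j * cmath.pi * m * r / q) for r in range(q))
            for m in range(q)]

def circulant(c):
    """Circulant matrix with entry (j, k) equal to c[(k - j) mod q]."""
    q = len(c)
    return [[c[(k - j) % q] for k in range(q)] for j in range(q)]

def det_gauss(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, det = len(a), 1.0
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[piv][i] == 0.0:
            return 0.0
        if piv != i:
            a[i], a[piv] = a[piv], a[i]
            det = -det
        det *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return det

def det_bivariate_dft(c1, c2, c12):
    """det of [[C1, C12], [C12^T, C2]] as a product over frequencies of
    (d1 - d12 * conj(d12) / d2) * d2, the per-frequency Schur complement form."""
    d1, d2, d12 = dft(c1), dft(c2), dft(c12)
    det = 1.0
    for m in range(len(c1)):
        schur = d1[m] - d12[m] * d12[m].conjugate() / d2[m]
        det *= (schur * d2[m]).real
    return det

c1 = [3.0, 1.0, 0.2, 1.0]     # symmetric first row: marginal block of Z1
c2 = [2.0, 0.5, 0.1, 0.5]     # symmetric first row: marginal block of Z2
c12 = [0.5, 0.3, 0.0, -0.1]   # cross block; need not be symmetric

C1, C2, C12 = circulant(c1), circulant(c2), circulant(c12)
q = len(c1)
C12T = [[C12[k][j] for k in range(q)] for j in range(q)]
full = [C1[j] + C12[j] for j in range(q)] + [C12T[j] + C2[j] for j in range(q)]

print(det_bivariate_dft(c1, c2, c12), det_gauss(full))
```

With $p > 1$ latitudes the scalars per frequency become $p \times p$ blocks, and in practice an FFT routine would replace the naive `dft` loop; the structure of the computation is otherwise the same.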
As long as the multiple spatial processes are on the same longitudinal grids, cover the full longitude range, and the cross-covariance structure is axially symmetric (note the model in (8) does give an axially symmetric cross-covariance structure), both the marginal covariance matrix for each process and the cross-covariance matrix can be diagonalized by applying the DFT. Note that Chan and Wood (1999) consider a multivariate stationary Gaussian random field defined on a rectangular grid in $\mathbb{R}^d$ and they apply circulant embedding of a block Toeplitz matrix (the Toeplitz structure comes from the stationarity of the random field) to create a block circulant covariance matrix. Then they perform the DFT to block diagonalize the covariance matrix. Since in our domain the block circulant structure of the covariance matrix is naturally given through regularly spaced longitudinal points with 360° coverage, we do not need the step of circulant embedding.

Suppose we consider a bivariate process $(Z_1, Z_2)$ observed on a regular grid with $p$ latitude points and $q$ longitude points (longitudinal points must be equally spaced over the full longitude range). We denote $Z_i(L_j) = \{Z_i(L_j, l_1), \ldots, Z_i(L_j, l_q)\}^T$ for $j = 1, \ldots, p$ and $F Z_i(L_j)$ is the DFT (with

respect to longitude) of $Z_i(L_j)$. Then it is well known that the corresponding covariance matrix of the complex normal vector, $F Z_i(L_j)$, is a diagonal matrix. Although $F Z_i(L)$ is a complex normal random variable, the likelihood of it can be obtained simply by calculating as if it is a real normal random variable with the appropriate covariance matrix (Wooding 1956). Therefore, if we denote $\tilde{Z}_i = \{\tilde{Z}_{i,1,1}, \ldots, \tilde{Z}_{i,p,1}, \tilde{Z}_{i,1,2}, \ldots, \tilde{Z}_{i,p,2}, \ldots, \tilde{Z}_{i,p,q}\}^T$ where $\tilde{Z}_{i,j,k}$ is the $k$th element of $F Z_i(L_j)$, then the covariance matrix of $\{\tilde{Z}_1^T, \tilde{Z}_2^T\}^T$ can be written as

$$\Sigma = \begin{pmatrix} D_1 & D_{12} \\ D_{12}^* & D_2 \end{pmatrix},$$

where $D_1$, $D_2$, and $D_{12}$ are complex block diagonal matrices with $p \times p$ block diagonals and $D_{12}^*$ is the conjugate transpose of $D_{12}$. The determinant of the matrix $\Sigma$ can be calculated using

$$\det(\Sigma) = \det(D_1 - D_{12} D_2^{-1} D_{12}^*) \det(D_2)$$

and the quadratic form in the likelihood can be efficiently calculated using the fact that

$$\Sigma^{-1} = \begin{pmatrix} (D_1 - D_{12} D_2^{-1} D_{12}^*)^{-1} & -D_1^{-1} D_{12} (D_2 - D_{12}^* D_1^{-1} D_{12})^{-1} \\ -D_2^{-1} D_{12}^* (D_1 - D_{12} D_2^{-1} D_{12}^*)^{-1} & (D_2 - D_{12}^* D_1^{-1} D_{12})^{-1} \end{pmatrix}.$$

Note that the lower off diagonal matrix is the conjugate transpose of the upper off diagonal matrix and the inverses of $D_1$ and $D_2$ can be calculated efficiently since they are block diagonal matrices with block size $p \times p$. In our application, we have p =

4. APPLICATION

4.1 Data

As noted in Section 1, the relationship between precipitation and surface temperature has received a lot of attention from scientists and it is important in the climate impact research area. We apply our covariance functions developed here to build a joint model between surface temperature and precipitation; the data originate from one of the numerical model outputs used in Trenberth and Shea (2005), the NCAR CCSM3. The NCAR CCSM3 is one of the climate models developed by NCAR. Jun, Knutti, and Nychka (2008) give a more detailed background on this and other climate models.
We look at the five-month average for Northern winter (November to March) and take the average of these over 1970 to 1999 (we call it NDJFM from now on). The temperature output from this model, CCSM3, has also been analyzed by Jun et al. (2008), but they consider the differences between the observations and numerical model outputs and only look at the latitude range of 50°S to 50°N on a coarser grid resolution (of 5° × 5°). We use the numerical model output only (no observations) for the entire globe (full longitude and latitude ranges) in its original resolution in both longitude and latitude. It is common in climate studies to use numerical model outputs rather than observations, since observations usually have a large fraction of missing values (especially near the poles). For example, Trenberth and Shea (2005) used numerical model outputs only to study the relationship between temperature and precipitation. Note that the unit for temperature is K and the unit for precipitation is kg/(m² s) (kilogram per square meter per second). Tebaldi and Sansó (2009) deal with multiple numerical model outputs along with observations to build a joint model for surface temperature and precipitation, but their approach is relatively simple in terms of modeling the cross-covariance structure of the two variables. In their approach, cross-correlations between the two variables come only from the mean of the processes; that is, they let the mean of the precipitation be a linear function of the surface temperature. They consider spatio-temporal processes, but in this work we focus on the spatial component of the process.

4.2 Model

4.2.1 Model for the mean

The first row of Figure 6 gives the NDJFM average of the temperature and precipitation data. Since the order of the precipitation data is $10^{-5}$, from now on we multiply the precipitation data by $10^5$ to make it comparable to the surface temperature data. For temperature, it is clear that the mean structure of the field mainly depends on latitude. For precipitation, this dependence is not as strong as for temperature, and there are places with a large amount of precipitation around the equator. We first filter out the spatial mean structure using spherical harmonics and work with the residuals. Specifically, we use spherical harmonics up to order 12 and regress each variable (surface temperature and precipitation) separately on the spherical harmonics $\{Y_r^s(\sin L, l) : r = 0, 1, 2, \dots, 12;\ s = -r, \dots, r\}$.
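The mean-filtering step can be sketched as an ordinary least-squares regression on a real spherical harmonic basis. This is a minimal illustration, not the code used in the paper: the harmonics are left unnormalized, the toy field and the coarse grid are synthetic, and the function names (`assoc_legendre`, `harmonic_design`, `remove_mean`) are assumptions introduced here.

```python
import numpy as np

def assoc_legendre(order, x):
    """Associated Legendre values P[(r, s)] for 0 <= s <= r <= order (standard recurrences)."""
    x = np.asarray(x, dtype=float)
    root = np.sqrt(1.0 - x ** 2)
    P = {(0, 0): np.ones_like(x)}
    for s in range(order + 1):
        if s > 0:
            P[(s, s)] = -(2 * s - 1) * root * P[(s - 1, s - 1)]
        if s + 1 <= order:
            P[(s + 1, s)] = (2 * s + 1) * x * P[(s, s)]
        for r in range(s + 2, order + 1):
            P[(r, s)] = ((2 * r - 1) * x * P[(r - 1, s)]
                         - (r + s - 1) * P[(r - 2, s)]) / (r - s)
    return P

def harmonic_design(lat_deg, lon_deg, order):
    """Design matrix whose columns are unnormalized real spherical harmonics."""
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    P = assoc_legendre(order, np.sin(lat))
    cols = []
    for r in range(order + 1):
        for s in range(-r, r + 1):
            # cosine terms for s >= 0, sine terms for s < 0
            angular = np.cos(s * lon) if s >= 0 else np.sin(-s * lon)
            cols.append(P[(r, abs(s))] * angular)
    return np.column_stack(cols)

def remove_mean(values, lat_deg, lon_deg, order):
    """Least-squares fit on the harmonics; returns (fitted mean, residuals)."""
    X = harmonic_design(lat_deg, lon_deg, order)
    coef, *_ = np.linalg.lstsq(X, values, rcond=None)
    fitted = X @ coef
    return fitted, values - fitted

# Synthetic field on a coarse grid that lies exactly in the span of the low-order harmonics.
lat_grid, lon_grid = np.meshgrid(np.linspace(-85, 85, 18), np.arange(0, 360, 10), indexing="ij")
lat, lon = lat_grid.ravel(), lon_grid.ravel()
toy = 280 + 30 * np.sin(np.radians(lat)) + 5 * np.cos(np.radians(lat)) * np.cos(np.radians(lon))
fitted, resid = remove_mean(toy, lat, lon, order=3)

# Number of regressors when harmonics up to order 12 are used: (12 + 1)^2.
n_regressors_order_12 = harmonic_design(lat, lon, 12).shape[1]
```

With harmonics up to order 12 the regression has 169 columns per variable. The unnormalized basis is adequate for a low-order toy fit like this one, but at order 12 the columns differ by many orders of magnitude, so a normalized basis would likely be preferable in practice.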
The second row of Figure 6 gives the estimated mean structure and the third row gives the residuals. Overall, the estimated mean field removes most of the large-scale spatial patterns in the data.

4.2.2 Model for the covariance

Since our main interest in this paper is estimating the covariance structure, we focus on fitting the covariance models using the residuals. We fit several covariance models to the data: the Matérn model of Gneiting et al. (2010), a version of the LMC model, and a couple of variations of our covariance model developed in Section 3.1. Here, $Z_1$ denotes the surface temperature process and $Z_2$ denotes the precipitation process. Note that these processes are the residuals after filtering out the mean as explained in Section 4.2.1.

1. Matérn model (MAT): we use the parsimonious bivariate Matérn model in Gneiting et al. (2010). In particular, we let $Z_i = G(\alpha_i, \beta, \nu_i)$, $i = 1, 2$. The parameter $\nu_3$ gives the smoothness for the cross-covariance and, by construction, $\nu_3 = (\nu_1 + \nu_2)/2$. We also have the co-located correlation coefficient $\rho$.

2. LMC model (LMC): we use a version of the LMC model, that is, we set $Z_i(L, l) = a_i W_1(L, l) + b_i W_2(L, l) + c_i U_i(L, l)$, where $a_i$, $b_i$, and $c_i$ are constants, $W_j = G(1, \beta, \nu_j)$ ($j = 1, 2$), and $U_i = G(1, \beta, \omega_i)$. We also assume that the $W_j$'s are independent, the $U_i$'s are independent, and the $W_j$'s and $U_i$'s are independent of each other. Therefore $U_i$ does not contribute to the cross-covariance structure of the $Z_i$'s.

3. Our covariance model (Nonstationary Multivariate Global model):

(a) NMG1: we set $Z_i(L, l) = \{a_i \frac{\partial}{\partial L} + b_i \frac{\partial}{\partial l}\} Y(L, l) + c_i U_i(L, l)$. Here, $Y = G(1, \beta, \nu)$, $U_i = G(1, \beta, \omega_i)$, and we assume the $U_i$'s are independent of $Y$. Note that $\nu > 1$.

All of the above models have the property that the processes $Z_1$ and $Z_2$ have the same spatial range parameter. The properties of each model discussed in Section 2 are summarized in Table 1. These models are intentionally kept relatively simple since they will be used in Section 4.3 with the data over a subregion. We use the following more complex model to fit the full data in Section 4.4.

(b) NMG2: we let $Z_i(L, l) = \{A_i(L) \frac{\partial}{\partial L} + B_i(L) \frac{\partial}{\partial l}\} Y(L, l) + C_i(L) W(L, l) + d_i U_i(L, l)$, with $A_i$, $B_i$, and $C_i$ defined as in (9) (for instance, $A_i(L) = \sum_{j=0}^{m} a_{ij} P_j(\sin L)$). We let $Y = G(1, \beta_1, \nu_1)$ ($\nu_1 > 1$), $W = G(1, \beta_2, \nu_2)$, and $U_1 = G(1, \beta_3, \nu_3)$ (note $\nu_2, \nu_3 > 0$). For the choices of $m$ and $d_i$, see Section 4.4.
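To make the structure of the LMC model in item 2 concrete, the sketch below computes the 2 × 2 covariance between the two variables at a given separation. It rests on several assumptions: the two closed-form Matérn correlations (smoothness 1/2 and 3/2) stand in for the fitted smoothness values, distance is treated as a plain scalar, the coefficients are arbitrary numbers, and the function names are hypothetical.

```python
import numpy as np

def exponential_corr(d, beta):
    """Matern correlation with smoothness 1/2 (one common parameterization)."""
    return np.exp(-d / beta)

def matern32_corr(d, beta):
    """Matern correlation with smoothness 3/2 (one common parameterization)."""
    return (1.0 + d / beta) * np.exp(-d / beta)

def lmc_covariance(d, loadings, latent_corrs, beta):
    """2 x 2 matrix with entries Cov{Z_i(s), Z_j(s')} for points a distance d apart.

    loadings has one row per variable and one column per latent process, ordered
    (W1, W2, U1, U2); the latent processes are independent with unit variance.
    """
    rho = np.array([corr(d, beta) for corr in latent_corrs])
    return loadings @ np.diag(rho) @ loadings.T

# Hypothetical coefficients (a_1, a_2), (b_1, b_2), (c_1, c_2).
a, b, c = (2.0, 0.3), (0.4, 1.5), (0.8, 0.6)
loadings = np.array([[a[0], b[0], c[0], 0.0],
                     [a[1], b[1], 0.0, c[1]]])
latent = [exponential_corr, matern32_corr, exponential_corr, matern32_corr]

C0 = lmc_covariance(0.0, loadings, latent, beta=1.0)
colocated_corr = C0[0, 1] / np.sqrt(C0[0, 0] * C0[1, 1])
```

Because each $U_i$ loads on only one variable, the off-diagonal entry depends only on $a_i$ and $b_i$, which is the sense in which $U_i$ does not contribute to the cross-covariance.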
For the models LMC, NMG1, and NMG2, we may have an identifiability problem if we let all of the coefficients $a_i$, $b_i$, $c_i$ and the coefficients of the linear combinations in $A_i$, $B_i$, and $C_i$ vary in $\mathbb{R}$. To avoid the problem, for the LMC model we take $a_1$, $b_1$, $c_1$, and $c_2$ to be positive. For the NMG1 model, we take $a_1$, $c_1$, and $c_2$ to be positive. For the NMG2 model, we take $b_{10}$, $c_{20}$, and $d_1$ to be positive.

4.3 Fit over North America

Even with the DFT-based computational technique described in Section 3.2, using the full data set for both variables to estimate the covariance structure takes quite some time (the total data size is 65,536). Therefore, we first choose a subregion over some parts of North America and fit several covariance models for a quick comparison in terms of likelihood and prediction accuracy. We perform the prediction on a region over North America that is disjoint from the estimation sites. Figure 7 shows the locations of the estimation sites and the prediction sites. There are 774 estimation sites and 172 prediction sites. For the fit in this section, since the data size is manageable, we do not use the DFT technique. In fact, it could not be used here, since it requires the data to cover the entire longitude range. We compare the covariance models listed in Section 4.2.2 (1, 2, and 3(a)) and estimate each covariance parameter by maximum likelihood. We used numerical optimization (the nlm and optim functions in R; for optim, we use the default Nelder-Mead algorithm) and tried several starting points. The optimization procedures reached the same maximum for all of the starting points that were tried.

Table 2 gives the estimated covariance parameter values along with their asymptotic standard errors. Asymptotic standard errors are obtained from the inverse of the Hessian matrix. The maximized loglikelihood values for each model and the corresponding AIC values are also given. The first thing to note is that the NMG1 model gives a significantly larger loglikelihood value than the other models with a comparable number of covariance parameters (the LMC model has the most covariance parameters). The AIC value for the NMG1 model is the smallest among the three.
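The two summary quantities reported for each fit, AIC and Hessian-based asymptotic standard errors, can be illustrated on a toy likelihood. The sketch below is an assumption-laden stand-in: it uses an i.i.d. normal model with a closed-form MLE instead of the spatial covariance models, a finite-difference Hessian in place of whatever the R routines return, and hypothetical function names.

```python
import numpy as np

def neg_loglik(theta, x):
    """Negative loglikelihood of i.i.d. N(mu, sigma^2) data, theta = (mu, log sigma)."""
    mu, log_sigma = theta
    sigma2 = np.exp(2.0 * log_sigma)
    n = x.size
    return 0.5 * n * np.log(2.0 * np.pi) + n * log_sigma + np.sum((x - mu) ** 2) / (2.0 * sigma2)

def numerical_hessian(f, theta, h=1e-3):
    """Central finite-difference Hessian of f at theta."""
    theta = np.asarray(theta, dtype=float)
    k = theta.size
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ei[i] = h
            ej = np.zeros(k)
            ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4.0 * h * h)
    return H

def aic(neg_loglik_value, n_params):
    """AIC = 2 * (number of parameters) - 2 * (maximized loglikelihood)."""
    return 2.0 * n_params + 2.0 * neg_loglik_value

rng = np.random.default_rng(1)
x = rng.normal(loc=1.0, scale=2.0, size=400)          # synthetic data
theta_hat = np.array([x.mean(), np.log(x.std())])     # closed-form MLE for this toy model

# Standard errors: square roots of the diagonal of the inverse Hessian
# of the negative loglikelihood, evaluated at the MLE.
H = numerical_hessian(lambda t: neg_loglik(t, x), theta_hat)
std_errors = np.sqrt(np.diag(np.linalg.inv(H)))
aic_value = aic(neg_loglik(theta_hat, x), n_params=2)
```

For this toy model the standard error of the mean should agree with the familiar value of the estimated standard deviation divided by the square root of the sample size, which gives a quick sanity check on the Hessian computation.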
The fact that the LMC model, despite having the most covariance parameters, gives a much smaller loglikelihood value than NMG1 may be a sign that the nonstationarity in the data (in particular, the dependence of the covariance structure on latitude, the longitudinal irreversibility, and the asymmetry in the marginal and/or cross-covariance structure) is rather strong, and that the differential operator term in (8) helps to explain these properties better.

In terms of different smoothness in the two variables, it seems that the precipitation process is smoother than the temperature process. In the LMC model, the smallest smoothness parameter value, $\nu_1$, is shared by the two processes, temperature and precipitation, but the coefficient $a_1$ is much larger in magnitude than the coefficient $a_2$, and $b_2$ is larger in magnitude than $b_1$. Therefore, the roughest process, $W_1$, mostly contributes to the temperature process and the smoother process, $W_2$, mostly contributes to the precipitation process. In the NMG1 model, first of all, $\omega_1$ is smaller than $\omega_2$. Note that the effective smoothness of the process $Y$ is $\nu - 1 = 1.07$, and thus $Y$ has a similar amount of smoothness to the process $U_1$. In that sense it is not clear whether the precipitation process is smoother than the temperature process or not. Nevertheless, the result in Tebaldi and Sansó (2009) shows that the precipitation process is smoother than the temperature process. For the LMC model, the estimate of $\omega_1$ came close to the upper boundary of the parameter space and thus we could not obtain its asymptotic standard error. It is common practice to set the range for the smoothness parameter to be (0, 2.5), since the covariance model is not valid for zero or negative values and large values often lead to numerical instability. This poor fit may imply that the data do not provide enough information on this parameter for the LMC model. The model NMG1 does not have this problem. The estimate of the co-located correlation parameter of the MAT model is almost zero ($\hat{\rho}$ = 5.5e-06), while the empirical cross-covariance estimate is around 0.3.

Figure 8 shows the prediction errors for the covariance models MAT, LMC, and NMG1 at the prediction sites. We display the difference between the true and the predicted values at the prediction sites against latitude. The three models do not show much difference in prediction accuracy for the temperature variable, although we see a significant difference for precipitation. The superiority of the NMG1 model is apparent for the prediction of precipitation. For the precipitation variable, the predictive skills of the MAT and LMC models are similar.

To make a fair comparison of the predictive performances of the three models, we now repeat the above procedure over 24 disjoint subdomains, $S_1, \dots, S_{24}$, that together cover most of the globe. That is, $S_{2n}$ is a subdomain in the Northern Hemisphere with latitude range 0° to 60°N and $S_{2n-1}$ is a subdomain in the Southern Hemisphere with latitude range 0° to 60°S, for $n = 1, \dots, 12$. For each $n$, $S_{2n}$ and $S_{2n-1}$ cover the longitude range $30(n-1)$° to $30n$°, and within each of them we set aside the data in the longitude range $\{30(n-1) + 10\}$° to $\{30(n-1) + 15\}$° for the validation of prediction. Table 3 shows the maximum loglikelihood values for the three models from the fits over the 24 subdomains.
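The layout of the 24 subdomains and their validation strips can be written down directly from the description above. The sketch below is illustrative only: the treatment of boundary points (closed latitude ranges, half-open longitude ranges, closed validation strips) is an assumption, since the text does not specify it, and the function names are hypothetical.

```python
def subdomain_bounds(k):
    """Bounds in degrees of S_k, k = 1..24 (south latitudes are negative)."""
    if not 1 <= k <= 24:
        raise ValueError("k must be between 1 and 24")
    n = (k + 1) // 2                  # S_{2n-1} and S_{2n} share the index n
    northern = (k % 2 == 0)           # even index: Northern Hemisphere
    lon_min = 30 * (n - 1)
    return {
        "lat": (0, 60) if northern else (-60, 0),
        "lon": (lon_min, 30 * n),
        "validation_lon": (lon_min + 10, lon_min + 15),
    }

def role(k, lat, lon):
    """Return 'validation', 'estimation', or None if (lat, lon) lies outside S_k."""
    b = subdomain_bounds(k)
    in_lat = b["lat"][0] <= lat <= b["lat"][1]
    in_lon = b["lon"][0] <= lon < b["lon"][1]      # assumed half-open in longitude
    if not (in_lat and in_lon):
        return None
    lo, hi = b["validation_lon"]
    return "validation" if lo <= lon <= hi else "estimation"
```

For example, `subdomain_bounds(24)` describes the Northern Hemisphere subdomain covering longitudes 330° to 360°, with the strip from 340° to 345° held out for validation.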
Except for $S_{12}$, $S_{16}$, and $S_{24}$, NMG1 achieves the largest maximum loglikelihood values, and the differences between the loglikelihood values of NMG1 and the other two models are large in most of the subdomains. Table 4 summarizes the prediction performance of the three models over the 24 subdomains. It provides the median, mean, and maximum values of the Mean Squared Errors (MSEs) from the prediction over the 24 subdomains. Except for the mean for temperature and the maximum for precipitation, NMG1 gives the smallest MSE values for all the summary statistics. Along with the result in Figure 8, this demonstrates that the NMG1 model indeed outperforms the other two models not only in terms of the maximum loglikelihood values (and AIC) but also in predictive performance.

4.4 Fit over full domain

We now fit the full data set using the computational technique described in Section 3.2. From the fitted results in Section 4.3, it is clear that there is strong nonstationarity in the data and that the covariance models listed in Section 4.2.2 (except NMG2) are not flexible enough to capture it. On the other hand, from Tables 2-4 and Figure 8, it is clear that the NMG1 model outperforms the MAT and LMC models given a comparable number of covariance parameters. Hence, we fit the full data to estimate the cross-correlation between the temperature and precipitation data using NMG2, an extended version of NMG1 described in Section 4.2.2. We set $d_1 \in (0, \infty)$ and $d_2 = 0$. This is an attempt to capture the difference in smoothness between the two variables. We keep $d_1$ positive to avoid the identifiability problem. We could let $d_1 = 0$ instead of $d_2$, but since the parameter estimates in Table 2 and the study by Tebaldi and Sansó (2009) suggest that surface temperature data is less smooth than precipitation data, by adding the term $d_1 U_1$ we hope to capture the roughness in the temperature data. We also fitted the model with $d_1, d_2 \in (0, \infty)$, but the improvement over NMG2 was not significant (that is, the loglikelihood values do not increase significantly and the fitted values do not change noticeably). We also let $Y$, $W$, and $U_1$ have different spatial range parameters. That is, we set $Y = G(1, \beta_1, \nu_1)$, $W = G(1, \beta_2, \nu_2)$, and $U_1 = G(1, \beta_3, \nu_3)$. With a data size of over 30,000 for each variable, we have sufficient information to let these parameters differ.

The estimated covariance parameter values along with their asymptotic standard errors are given in Table 5. It is interesting to note that all three spatial range parameters are different, although for both variables the maximum spatial range parameter estimate is given by the process $W$ ($\hat{\beta}_2$). The smoothness parameter estimates for $Y$ and $U_1$ are comparable to the corresponding estimates for the fit in Section 4.3 in Table 2. Also, the smallest estimate of the spatial range parameters, $\hat{\beta}_1$, is similar to the estimate of the corresponding parameter, $\beta$, in Table 2.
The estimates for the remaining parameters differ significantly from those in Table 2, as expected given the difference between the two models, NMG1 and NMG2.

As explained in Section 2.1, Figure 1 gives a comparison between the empirical and fitted variances and cross-correlations for the temperature and precipitation variables (NDJFM). For panels (a), (c), and (e), the solid line gives the fitted values with the parameters in Table 5. For panels (b), (d), and (f), it is not obvious how to display the corresponding fitted values since, as described in the Appendix, the empirical quantities are calculated through the sum across latitudes; thus at each longitude value the corresponding fitted values do not come as one number, but rather one gets different fitted values for different combinations of latitudes. Panels (a)-(d) show the standard deviation for the univariate processes, and (e) and (f) show cross-correlations. Overall, the fitted values do a reasonable job of capturing the pattern of the empirical values. The fitted variance for temperature is rather flat with respect to latitude, since $d_1$ has a relatively large estimate and the estimates of $A_{1,1}$ and $B_{1,1}$ in NMG2 for the temperature process received smaller weight. On the other hand, the fitted variance for precipitation captures the pattern in the data well. It may be interesting to see if increasing $m$ for the temperature process would improve the fit. Fitted values for the cross-covariance structure are problematic in some places, but this may be partly due to the complex nature of the cross-covariance structure of the data.

Figure 2 shows the comparison of the (a)-(b) empirical, (d) fitted using OLS, and (e) fitted using MLE values of the longitudinal irreversibility, that is, $r(l, L, \Delta) = \mathrm{Cor}\{Z_1(L, l), Z_2(L, l + \Delta)\} - \mathrm{Cor}\{Z_1(L, l + \Delta), Z_2(L, l)\}$, for $L$ (x-axis) and $\Delta$ (y-axis) in degrees. As explained in Section 2.2, we bin the latitude with a bin size of roughly 7°. For the OLS, we obtain another set of parameter estimates by minimizing the sum of the squared differences between the empirical irreversibility and the model-fitted irreversibility across the latitude bins (fitted values are evaluated at the center of the latitude bins) and longitude lags, $\Delta$, up to 180°. We also tried weighted least squares using the reciprocals of the number of data points at each latitude bin and longitude lag as weights, but the results were quite similar to the OLS fit. For (e), we use the covariance parameter estimates in Table 5. The OLS fit captures the empirical pattern better than the MLE fit, although irreversibility is overestimated near the North Pole. The MLE fit captures the positive irreversibility values near the North Pole, but overall the estimates are much smaller in magnitude than the empirical values. The fact that the OLS fit captures the empirical pattern quite well suggests that the model in (8) is indeed flexible. However, the fitted irreversibility from MLE differs from the empirical values by a factor of 10. The misfit of the MLE estimates may be somewhat disappointing at first sight, but considering that there are not many covariance models that can produce such irreversibility, it is encouraging to develop covariance models in this direction.
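A sample version of the longitudinal irreversibility along a single latitude band can be sketched as follows. This is not the estimator from the Appendix, which pools over latitude bins; here the correlation is simply taken over all longitudes of one band, the lag `k` is counted in grid steps rather than degrees, the data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def lagged_cross_corr(z1, z2, k):
    """Sample Cor{Z1(l), Z2(l + k)} over q equally spaced longitudes on a full circle."""
    a = z1 - z1.mean()
    b = np.roll(z2, -k) - z2.mean()       # entry i holds z2 at longitude index i + k
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def irreversibility(z1, z2, k):
    """Cor{Z1(l), Z2(l + k)} - Cor{Z1(l + k), Z2(l)} at longitude lag k (in grid steps)."""
    return lagged_cross_corr(z1, z2, k) - lagged_cross_corr(z2, z1, k)

rng = np.random.default_rng(2)
z1 = rng.normal(size=72)                  # synthetic values along one latitude band
z2 = np.roll(z1, 3)                       # z2 is z1 displaced by 3 grid steps
r3 = irreversibility(z1, z2, 3)
```

When the second variable is a displaced copy of the first, the lagged correlation in one direction is essentially 1 while the other is not, so `r3` is clearly positive; for identical series the irreversibility is zero at every lag, and swapping the two variables flips its sign.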
Also note that, as shown in panel (c), the uncertainty in the empirical longitudinal irreversibility is quite large. We also calculated the standard errors for the fitted irreversibility in (e) using the asymptotic standard errors of the fitted covariance parameters in Table 5, but the magnitude of the standard error is almost the same as the magnitude of the fitted irreversibility in (e).

Now let us compare our estimated cross-correlation (in Figures 1 (e) and (f)) to the one in Trenberth and Shea (2005). Note that due to the axial symmetry assumption, our fitted cross-covariance is identical along each latitude band. The high correlation level in the high-latitude area of the Northern Hemisphere matches well with the result in Trenberth and Shea (2005). Our estimated correlation values are close to the average correlation levels across longitude at each latitude in Trenberth and Shea (2005), except in the South Pole area; in this region, our estimated levels are slightly negative whereas Trenberth and Shea (2005) give high correlation levels. The estimated cross-correlation in Trenberth and Shea (2005) shows a clear distinction between land and sea. Unlike in Trenberth and


More information

BAYESIAN KRIGING AND BAYESIAN NETWORK DESIGN

BAYESIAN KRIGING AND BAYESIAN NETWORK DESIGN BAYESIAN KRIGING AND BAYESIAN NETWORK DESIGN Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C., U.S.A. J. Stuart Hunter Lecture TIES 2004

More information

Statistícal Methods for Spatial Data Analysis

Statistícal Methods for Spatial Data Analysis Texts in Statistícal Science Statistícal Methods for Spatial Data Analysis V- Oliver Schabenberger Carol A. Gotway PCT CHAPMAN & K Contents Preface xv 1 Introduction 1 1.1 The Need for Spatial Analysis

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

arxiv: v4 [stat.me] 14 Sep 2015

arxiv: v4 [stat.me] 14 Sep 2015 Does non-stationary spatial data always require non-stationary random fields? Geir-Arne Fuglstad 1, Daniel Simpson 1, Finn Lindgren 2, and Håvard Rue 1 1 Department of Mathematical Sciences, NTNU, Norway

More information

Recent Developments in Numerical Methods for 4d-Var

Recent Developments in Numerical Methods for 4d-Var Recent Developments in Numerical Methods for 4d-Var Mike Fisher Slide 1 Recent Developments Numerical Methods 4d-Var Slide 2 Outline Non-orthogonal wavelets on the sphere: - Motivation: Covariance Modelling

More information

Multivariate Gaussian Random Fields with SPDEs

Multivariate Gaussian Random Fields with SPDEs Multivariate Gaussian Random Fields with SPDEs Xiangping Hu Daniel Simpson, Finn Lindgren and Håvard Rue Department of Mathematics, University of Oslo PASI, 214 Outline The Matérn covariance function and

More information

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise

More information

Numerical Investigation on Spherical Harmonic Synthesis and Analysis

Numerical Investigation on Spherical Harmonic Synthesis and Analysis Numerical Investigation on Spherical Harmonic Synthesis and Analysis Johnny Bärlund Master of Science Thesis in Geodesy No. 3137 TRITA-GIT EX 15-006 School of Architecture and the Built Environment Royal

More information

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the

More information

The minimisation gives a set of linear equations for optimal weights w:

The minimisation gives a set of linear equations for optimal weights w: 4. Interpolation onto a regular grid 4.1 Optimal interpolation method The optimal interpolation method was used to compute climatological property distributions of the selected standard levels on a regular

More information

Introduction to Geostatistics

Introduction to Geostatistics Introduction to Geostatistics Abhi Datta 1, Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore,

More information

Multi-resolution models for large data sets

Multi-resolution models for large data sets Multi-resolution models for large data sets Douglas Nychka, National Center for Atmospheric Research National Science Foundation NORDSTAT, Umeå, June, 2012 Credits Steve Sain, NCAR Tia LeRud, UC Davis

More information

Parametric Inference on Strong Dependence

Parametric Inference on Strong Dependence Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP The IsoMAP uses the multiple linear regression and geostatistical methods to analyze isotope data Suppose the response variable

More information

Gaussian predictive process models for large spatial data sets.

Gaussian predictive process models for large spatial data sets. Gaussian predictive process models for large spatial data sets. Sudipto Banerjee, Alan E. Gelfand, Andrew O. Finley, and Huiyan Sang Presenters: Halley Brantley and Chris Krut September 28, 2015 Overview

More information

Independent Component (IC) Models: New Extensions of the Multinormal Model

Independent Component (IC) Models: New Extensions of the Multinormal Model Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008 My research

More information

ROBUST MEASUREMENT OF THE DURATION OF

ROBUST MEASUREMENT OF THE DURATION OF ROBUST MEASUREMENT OF THE DURATION OF THE GLOBAL WARMING HIATUS Ross McKitrick Department of Economics University of Guelph Revised version, July 3, 2014 Abstract: The IPCC has drawn attention to an apparent

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

Computer model calibration with large non-stationary spatial outputs: application to the calibration of a climate model

Computer model calibration with large non-stationary spatial outputs: application to the calibration of a climate model Computer model calibration with large non-stationary spatial outputs: application to the calibration of a climate model Kai-Lan Chang and Serge Guillas University College London, Gower Street, London WC1E

More information

arxiv: v1 [stat.me] 3 Nov 2018

arxiv: v1 [stat.me] 3 Nov 2018 Nonparametric Spectral Methdods for Multivariate Spatial and Spatial-Temporal Data Joseph Guinness Cornell University, Department of Statistical Science Abstract arxiv:1811.01280v1 [stat.me] 3 Nov 2018

More information

Hierarchical Modelling for Univariate Spatial Data

Hierarchical Modelling for Univariate Spatial Data Hierarchical Modelling for Univariate Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department

More information

Half-Spectral Space-Time Covariance Models

Half-Spectral Space-Time Covariance Models Half-Spectral Space-Time Covariance Models Michael T. Horrell & Michael L. Stein University of Chicago arxiv:1505.01243v1 [stat.me] 6 May 2015 September 18, 2018 Abstract We develop two new classes of

More information

Atmospheric Predictability experiments with a large numerical model (E. N. Lorenz, 1982)

Atmospheric Predictability experiments with a large numerical model (E. N. Lorenz, 1982) Atmospheric Predictability experiments with a large numerical model (E. N. Lorenz, 1982) Imran Nadeem University of Natural Resources and Applied Life Sciences (BOKU), Vienna, Austria OUTLINE OF TALK Introduction

More information

Multivariate spatial modeling

Multivariate spatial modeling Multivariate spatial modeling Point-referenced spatial data often come as multivariate measurements at each location Chapter 7: Multivariate Spatial Modeling p. 1/21 Multivariate spatial modeling Point-referenced

More information

Overview of Spatial Statistics with Applications to fmri

Overview of Spatial Statistics with Applications to fmri with Applications to fmri School of Mathematics & Statistics Newcastle University April 8 th, 2016 Outline Why spatial statistics? Basic results Nonstationary models Inference for large data sets An example

More information

MODEL TYPE (Adapted from COMET online NWP modules) 1. Introduction

MODEL TYPE (Adapted from COMET online NWP modules) 1. Introduction MODEL TYPE (Adapted from COMET online NWP modules) 1. Introduction Grid point and spectral models are based on the same set of primitive equations. However, each type formulates and solves the equations

More information

Bayesian spatial quantile regression

Bayesian spatial quantile regression Brian J. Reich and Montserrat Fuentes North Carolina State University and David B. Dunson Duke University E-mail:reich@stat.ncsu.edu Tropospheric ozone Tropospheric ozone has been linked with several adverse

More information

New Classes of Asymmetric Spatial-Temporal Covariance Models. Man Sik Park and Montserrat Fuentes 1

New Classes of Asymmetric Spatial-Temporal Covariance Models. Man Sik Park and Montserrat Fuentes 1 New Classes of Asymmetric Spatial-Temporal Covariance Models Man Sik Park and Montserrat Fuentes 1 Institute of Statistics Mimeo Series# 2584 SUMMARY Environmental spatial data often show complex spatial-temporal

More information

Chapter 3: Regression Methods for Trends

Chapter 3: Regression Methods for Trends Chapter 3: Regression Methods for Trends Time series exhibiting trends over time have a mean function that is some simple function (not necessarily constant) of time. The example random walk graph from

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

Spatial Analysis to Quantify Numerical Model Bias and Dependence: How Many Climate Models Are There?

Spatial Analysis to Quantify Numerical Model Bias and Dependence: How Many Climate Models Are There? JASA jasa v.2007/01/31 Prn:6/11/2007; 14:44 F:jasaap06624r1.tex; (Diana) p. 1 Spatial Analysis to Quantify Numerical Model Bias and Dependence: How Many Climate Models Are There? Mikyoung JUN, RetoKNUTTI,

More information

A Generalized Convolution Model for Multivariate Nonstationary Spatial Processes

A Generalized Convolution Model for Multivariate Nonstationary Spatial Processes A Generalized Convolution Model for Multivariate Nonstationary Spatial Processes Anandamayee Majumdar, Debashis Paul and Dianne Bautista Department of Mathematics and Statistics, Arizona State University,

More information

Multilevel Analysis, with Extensions

Multilevel Analysis, with Extensions May 26, 2010 We start by reviewing the research on multilevel analysis that has been done in psychometrics and educational statistics, roughly since 1985. The canonical reference (at least I hope so) is

More information

CBMS Lecture 1. Alan E. Gelfand Duke University

CBMS Lecture 1. Alan E. Gelfand Duke University CBMS Lecture 1 Alan E. Gelfand Duke University Introduction to spatial data and models Researchers in diverse areas such as climatology, ecology, environmental exposure, public health, and real estate

More information

Multivariate Time Series: VAR(p) Processes and Models

Multivariate Time Series: VAR(p) Processes and Models Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are k-vectors, Φ 1,..., Φ p are k k matrices, with

More information

Spatial inference. Spatial inference. Accounting for spatial correlation. Multivariate normal distributions

Spatial inference. Spatial inference. Accounting for spatial correlation. Multivariate normal distributions Spatial inference I will start with a simple model, using species diversity data Strong spatial dependence, Î = 0.79 what is the mean diversity? How precise is our estimate? Sampling discussion: The 64

More information

Coregionalization by Linear Combination of Nonorthogonal Components 1

Coregionalization by Linear Combination of Nonorthogonal Components 1 Mathematical Geology, Vol 34, No 4, May 2002 ( C 2002) Coregionalization by Linear Combination of Nonorthogonal Components 1 J A Vargas-Guzmán, 2,3 A W Warrick, 3 and D E Myers 4 This paper applies the

More information

STA414/2104 Statistical Methods for Machine Learning II

STA414/2104 Statistical Methods for Machine Learning II STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements

More information

A STATISTICAL TECHNIQUE FOR MODELLING NON-STATIONARY SPATIAL PROCESSES

A STATISTICAL TECHNIQUE FOR MODELLING NON-STATIONARY SPATIAL PROCESSES A STATISTICAL TECHNIQUE FOR MODELLING NON-STATIONARY SPATIAL PROCESSES JOHN STEPHENSON 1, CHRIS HOLMES, KERRY GALLAGHER 1 and ALEXANDRE PINTORE 1 Dept. Earth Science and Engineering, Imperial College,

More information

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modeling for Univariate Spatial Data Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This

More information

Hierarchical Modelling for Univariate Spatial Data

Hierarchical Modelling for Univariate Spatial Data Spatial omain Hierarchical Modelling for Univariate Spatial ata Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.

More information

P -spline ANOVA-type interaction models for spatio-temporal smoothing

P -spline ANOVA-type interaction models for spatio-temporal smoothing P -spline ANOVA-type interaction models for spatio-temporal smoothing Dae-Jin Lee 1 and María Durbán 1 1 Department of Statistics, Universidad Carlos III de Madrid, SPAIN. e-mail: dae-jin.lee@uc3m.es and

More information

Modelling the Covariance

Modelling the Covariance Modelling the Covariance Jamie Monogan Washington University in St Louis February 9, 2010 Jamie Monogan (WUStL) Modelling the Covariance February 9, 2010 1 / 13 Objectives By the end of this meeting, participants

More information

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y

More information

of the 7 stations. In case the number of daily ozone maxima in a month is less than 15, the corresponding monthly mean was not computed, being treated

of the 7 stations. In case the number of daily ozone maxima in a month is less than 15, the corresponding monthly mean was not computed, being treated Spatial Trends and Spatial Extremes in South Korean Ozone Seokhoon Yun University of Suwon, Department of Applied Statistics Suwon, Kyonggi-do 445-74 South Korea syun@mail.suwon.ac.kr Richard L. Smith

More information

Statistical Models for Monitoring and Regulating Ground-level Ozone. Abstract

Statistical Models for Monitoring and Regulating Ground-level Ozone. Abstract Statistical Models for Monitoring and Regulating Ground-level Ozone Eric Gilleland 1 and Douglas Nychka 2 Abstract The application of statistical techniques to environmental problems often involves a tradeoff

More information

Multi-resolution models for large data sets

Multi-resolution models for large data sets Multi-resolution models for large data sets Douglas Nychka, National Center for Atmospheric Research National Science Foundation Iowa State March, 2013 Credits Steve Sain, Tamra Greasby, NCAR Tia LeRud,

More information

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Working Paper 2013:9 Department of Statistics Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Ronnie Pingel Working Paper 2013:9 June

More information

COPING WITH NONSTATIONARITY IN CATEGORICAL TIME SERIES

COPING WITH NONSTATIONARITY IN CATEGORICAL TIME SERIES COPING WITH NONSTATIONARITY IN CATEGORICAL TIME SERIES Monnie McGee and Ian R. Harris Department of Statistical Science Southern Methodist University Dallas, TX 75275 mmcgee@smu.edu & iharris@smu.edu Key

More information

Next is material on matrix rank. Please see the handout

Next is material on matrix rank. Please see the handout B90.330 / C.005 NOTES for Wednesday 0.APR.7 Suppose that the model is β + ε, but ε does not have the desired variance matrix. Say that ε is normal, but Var(ε) σ W. The form of W is W w 0 0 0 0 0 0 w 0

More information

Chapter 4 - Fundamentals of spatial processes Lecture notes

Chapter 4 - Fundamentals of spatial processes Lecture notes Chapter 4 - Fundamentals of spatial processes Lecture notes Geir Storvik January 21, 2013 STK4150 - Intro 2 Spatial processes Typically correlation between nearby sites Mostly positive correlation Negative

More information

Nonparametric Bayesian Methods (Gaussian Processes)

Nonparametric Bayesian Methods (Gaussian Processes) [70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent

More information

Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling

Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling Monte Carlo Methods Appl, Vol 6, No 3 (2000), pp 205 210 c VSP 2000 Factorization of Seperable and Patterned Covariance Matrices for Gibbs Sampling Daniel B Rowe H & SS, 228-77 California Institute of

More information

Introduction. Chapter 1

Introduction. Chapter 1 Chapter 1 Introduction In this book we will be concerned with supervised learning, which is the problem of learning input-output mappings from empirical data (the training dataset). Depending on the characteristics

More information

Applications of Tail Dependence II: Investigating the Pineapple Express. Dan Cooley Grant Weller Department of Statistics Colorado State University

Applications of Tail Dependence II: Investigating the Pineapple Express. Dan Cooley Grant Weller Department of Statistics Colorado State University Applications of Tail Dependence II: Investigating the Pineapple Express Dan Cooley Grant Weller Department of Statistics Colorado State University Joint work with: Steve Sain, Melissa Bukovsky, Linda Mearns,

More information

On Gaussian Process Models for High-Dimensional Geostatistical Datasets

On Gaussian Process Models for High-Dimensional Geostatistical Datasets On Gaussian Process Models for High-Dimensional Geostatistical Datasets Sudipto Banerjee Joint work with Abhirup Datta, Andrew O. Finley and Alan E. Gelfand University of California, Los Angeles, USA May

More information