Bayesian hierarchical models for spatially misaligned data in R
Methods in Ecology and Evolution 2014, 5, doi: 10.1111/2041-210X.12189

APPLICATION

Andrew O. Finley 1*, Sudipto Banerjee 2 and Bruce D. Cook 3

1 Department of Forestry, Michigan State University, 126 Natural Resources Building, East Lansing, MI, USA; 2 Division of Biostatistics, School of Public Health, University of Minnesota, A460 Mayo Building, MMC 303, 420 Delaware Street S.E., Minneapolis, MN 55455, USA; and 3 Biospheric Sciences Laboratory, National Aeronautics and Space Administration, Goddard Space Flight Center, Code 618, Greenbelt, MD 20771, USA

Summary

1. Spatial misalignment occurs when at least one of multiple outcome variables is missing at an observed location. For spatial data, prediction of these missing observations should be informed by within-location association among outcomes and by proximate locations where measurements were recorded.

2. This study details and illustrates a Bayesian regression framework for modelling spatially misaligned multivariate data. Particular attention is paid to developing valid probability models capable of estimating parameter posterior distributions and propagating uncertainty through to the outcomes' predictive distributions at locations where some or all of the outcomes are not observed.

3. Models and associated software are presented for both Gaussian and non-Gaussian outcomes. Model parameter and predictive inference within the proposed framework is illustrated using a synthetic data set and a forest inventory data set.

4. The proposed Markov chain Monte Carlo samplers were written in C++ and leverage R's Foreign Language Interface to call FORTRAN BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra Package) libraries for efficient matrix computations.
The models are implemented in the spMisalignLM and spMisalignGLM functions within the spBayes R package available via the Comprehensive R Archive Network (CRAN).

Key-words: multivariate, misalignment, missingness, Gaussian spatial process, linear model of coregionalization, Markov chain Monte Carlo

*Correspondence author. E-mail: finleya@msu.edu

Introduction

Investment in long-term monitoring networks and advancement in sensor technologies are creating data-rich environments that provide extraordinary opportunities to understand the complexity of large and spatially indexed ecological data. Building such understanding often requires the analysis of spatially indexed data sets with multiple variables measured at each location. In such settings, it is commonly posited that there is association between the measurements at a given location as well as association among measurements across locations. In ecological analysis, we often seek inference about the association among these multiple variables or wish to predict their values at new locations. For example, consider the analysis of (i) species co-occurrence where species presence/absence or abundance is recorded at each location, for example Ovaskainen, Hottola & Siitonen (2010); (ii) soil nutrient impact on local tree growth and competition where soil nutrient measurements coincide with tree inventory locations, for example Baribault, Kobe & Finley (2012); or (iii) relationship between multiple environmental stressors and measures of focal species fitness, for example Swope & Parker (2012). In each case, development of a statistical model typically requires the full set of outcomes, for example species presence/absence, and covariates, for example soil nutrients or environmental stressors, at a set of locations.

Given such multivariate settings, it is common that different subsets of the outcome variables, or covariates, are available at different locations.
In the statistical literature, this situation is referred to as spatial misalignment. Following the examples above, say observers record the presence/absence of different subsets of species at different locations, or, for a subset of locations, only some of the soil nutrients or plant stressors were measured, perhaps because of different sampling protocols or because data were drawn from different databases. In such cases, it is necessary to impute or predict the value of the missing observations. Note that if there is misalignment among the covariates, then we might view them as outcomes in a model used to predict their missing observations. Regardless of where the misalignment occurs, these predictions should be informed by the within-location association among variables and by proximate locations where measurements were recorded.

© 2014 The Authors. Methods in Ecology and Evolution © 2014 British Ecological Society
Further, it is common to seek prediction for the entire set of outcomes at new locations where no measurements were recorded. In both cases, an assessment of prediction uncertainty is also often desired.

Here, we consider point-point misalignment to distinguish it from point-areal misalignment, which refers to the situation where some variables may be referenced by their points, while others may have been aggregated over spatial regions. Although the term point-areal misalignment is used in the literature, following Gotway & Young (2002), we prefer to classify this as a change-of-support problem. See also Mugglin, Carlin & Gelfand (2000), Gelfand, Zhu & Carlin (2001), Zhu, Carlin & Gelfand (2003) and the references therein for methods for change-of-support problems.

A salient feature of our data is that every location generates, at most, only one replicate of the multiple outcomes. For empirical estimation of the association among these outcomes using sampling-based multivariate analysis methods, one must consider the observations at different spatial locations as independent replicates. This will, however, preclude estimation of the spatial associations. Under this setting, inference on associations should deploy fully model-based approaches using the flexibility of spatial stochastic processes.

Existing model-based methods for handling spatial point-point misalignment primarily aim to align disparate variables by accounting for additional uncertainty when kriging or other smoothing methods are used to align the spatially referenced data (Madsen, Ruppert & Altman 2008; Buonaccorsi 2009; Gryparis et al. 2009; Paciorek et al. 2009; Szpiro et al. 2011; Lopiano, Young & Gotway 2011, 2013). These approaches build conditional regression-like models where the marginal distribution of the first outcome is specified, followed by the conditional distribution of the second outcome given the first and so on.
This approach is easily interpretable and ensures the validity of the resulting joint distributions from the process realizations. However, the approach is more suitable when the number of outcomes is small and there is a natural ordering that would suggest the sequence for constructing the conditional distributions. Settings such as ours lack such information on ordering, so joint modelling of the outcomes is preferable to avoid the explosion in models emerging from alternate ordering schemes. Joint models attempt to directly construct cross-covariance functions that describe the covariances between different outcomes at two, possibly different, locations. A model-based Bayesian approach for point-point misalignment was presented in Banerjee & Gelfand (2002). More recently, joint modelling of point patterns and misaligned covariates was considered in Illian, Sorbye & Rue (2012), while Ren & Banerjee (2013) considered modelling spatial misalignment using a class of spatial latent factor models.

While the problem of spatial misalignment is ubiquitous, software to implement model-based analysis of such data is absent. Our current work focuses upon point-point misalignment and extends and integrates some of the aforementioned methodological work into a Bayesian hierarchical modelling framework. In addition, we demonstrate how this is implemented in our spBayes package for the R statistical programming language and environment.

Multivariate spatial regression with misalignment

Let $S_1, S_2, \ldots, S_m$ denote sets comprising $n_1, n_2, \ldots, n_m$ locations where $m$ outcomes have been observed. We collect all observations for the first outcome into an $n_1 \times 1$ column vector $y_1$, those for the second outcome into an $n_2 \times 1$ column vector $y_2$, and so on until we collect observations corresponding to the $m$-th outcome into an $n_m \times 1$ column vector $y_m$. Each of these is stacked into an $N \times 1$ column vector $y$, where $N = \sum_{i=1}^{m} n_i$.
The covariates corresponding to the $i$-th outcome $y_i$ are collected into an $n_i \times p_i$ matrix $X_i$, and we let $\beta_i$ denote the $p_i \times 1$ regression slope vector associated with $X_i$. The other key ingredient in the multivariate spatial regression model is the vector of unobserved spatial random effects. For any location $s$, indexed by some coordinate frame, we have a spatial random effect $w_i(s)$ associated with the $i$-th outcome $y_i(s)$ for $i = 1, 2, \ldots, m$. We collect the random effects corresponding to the $i$-th outcome into an $n_i \times 1$ vector $w_i$ so that it corresponds to $y_i$. The multivariate spatial linear regression model is given by

$$y_i = X_i \beta_i + w_i + \epsilon_i, \qquad i = 1, 2, \ldots, m, \tag{1}$$

where $\epsilon_i$ is an $n_i \times 1$ column vector of zero-centred residual random errors corresponding to the $i$-th outcome such that the covariance between an element in $\epsilon_i$ and an element in $\epsilon_j$ is zero whenever $i$ and $j$ correspond to different outcomes. Two elements within $\epsilon_i$ represent random errors associated with the $i$-th outcome measured at two different locations. The covariances between any two such elements and the variances of each element in $\epsilon_i$ are placed as off-diagonal and diagonal entries in an $n_i \times n_i$ matrix $\Psi_i$, which is the variance-covariance matrix of $\epsilon_i$. Each of the $\epsilon_i$'s is assumed to be normally distributed, independent of the others, with mean zero and variance-covariance matrix $\Psi_i$.

Model (1) can be extended to accommodate non-Gaussian outcomes such as (i) binary data modelled using logit or probit regression, and (ii) count data modelled using Poisson regression. Diggle, Tawn & Moyeed (1998) unify the use of generalized linear models in spatial data contexts. See also Lin et al. (2000), Kamman & Wand (2003), and Banerjee, Carlin & Gelfand (2004). Essentially we replace model (1) with the assumption that $E[y_i(s)]$ is linear on a transformed scale, that is, $g(E[y_i(s)]) = x_i(s)^\top \beta_i + w_i(s)$, where $g(\cdot)$ is a suitable link function and $x_i(s)$ is the $p_i \times 1$ vector that includes outcome- and location-specific covariates.
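To make the link-function extension concrete, the following Python sketch evaluates the linear predictor $x_i(s)^\top \beta_i + w_i(s)$ for a single outcome at one location and maps it to the mean of a binary outcome through the inverse logit link and to the mean of a count outcome through the inverse log link. Python is used only for illustration; this is not the spBayes implementation, and the function names and all numeric values are hypothetical.

```python
import math

def linear_predictor(x, beta, w):
    """eta(s) = x(s)' beta + w(s) for one outcome at one location."""
    return sum(xj * bj for xj, bj in zip(x, beta)) + w

def inv_logit(eta):
    """Inverse logit link: maps the linear predictor to a probability."""
    # Branching on the sign keeps exp() from overflowing for large |eta|.
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

def inv_log(eta):
    """Inverse log link: maps the linear predictor to a Poisson mean."""
    return math.exp(eta)

# Hypothetical values: an intercept plus one covariate, and a spatial effect.
x = [1.0, 0.5]
beta = [-1.0, 2.0]
w = 0.3

eta = linear_predictor(x, beta, w)
p_presence = inv_logit(eta)  # E[y(s)] for a binary (presence/absence) outcome
mu_count = inv_log(eta)      # E[y(s)] for a count outcome
print(eta, p_presence, mu_count)
```

The spatial effect enters on the transformed scale, so the same value of $w_i(s)$ shifts a probability and a count mean by different amounts on the outcome scale.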
Spatial association is captured by the spatial effects, that is, the $w_i$'s in (1). Any two entries in $w_i$ correspond to the spatial random effects for outcome $i$ from two different locations. These are assumed to be associated or correlated based upon a function of the separation or distance between the two locations. The essence of multivariate spatial modelling is to prescribe these covariances in such a way that the joint distribution of the $w_i$'s, for $i = 1, 2, \ldots, m$, in (1) is a multivariate normal distribution. The key modelling ingredient here is a multivariate spatial process, see, for example, Chiles & Delfiner (1999), Cressie & Wikle (2011), and Banerjee, Carlin & Gelfand (2004). In our
context, the multivariate spatial process is an infinite collection of $m \times 1$ vectors $w(s)$ indexed by spatial coordinates $s$ residing in two- or three-dimensional Euclidean space. The spatial random effects arise as a finite subset of this set indexed by the locations where the outcomes have been observed. A spatial process is well-defined whenever any finite collection of random effects has a legitimate probability distribution. When these distributions always belong to a multivariate normal family, we say that the spatial process is a Gaussian process. In (1), each $w_i$ is an $n_i \times 1$ vector of spatial random effects collected over the locations where outcome $i$ has been observed.

The covariance among the outcomes' spatial random effects provides learning about missing observations. The details on constructing and estimating the covariance among spatial random effects are given in Appendix S1. In brief, say we wish to model the covariance between spatial random effects corresponding to two different outcomes at two different locations. That is, for outcomes $i$ and $j$, and locations $s_k$ and $s_l$, we must specify $\mathrm{cov}\{w_i(s_k), w_j(s_l)\}$ in a manner that will ensure a legitimate probability distribution for the joint distribution of $\{w_i : i = 1, 2, \ldots, m\}$. This covariance is specified using a spatial cross-covariance function that is constructed using outcome-specific spatial correlation functions, which include parameters to control the random effects' spatial dependence, for example the rate of spatial decay. Given parameter estimates, the cross-covariance functions provide inference about how outcomes covary in space, after accounting for covariates, and inform prediction.

We adopt the Bayesian paradigm for inference, see, for example, Gelman et al. (2004), and build hierarchical models by modelling the parameters using probability distributions.
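One common way to build a valid cross-covariance function is the linear model of coregionalization listed in the key-words. The Python sketch below assumes that construction with exponential correlation functions, $C(h) = A\,\mathrm{diag}\{\exp(-\phi_k h)\}\,A^\top$, so that $C(0) = A A^\top$ is the covariance among outcomes at a single location. The matrix `A`, the decay parameters and the function names are hypothetical, and the exact specification used in the article is the one in Appendix S1, which may differ in detail.

```python
import math

def exp_corr(h, phi):
    """Exponential spatial correlation at distance h with decay phi."""
    return math.exp(-phi * h)

def cross_cov(h, A, phis):
    """m x m cross-covariance matrix C(h) = A diag(rho_k(h)) A'."""
    m = len(A)
    rho = [exp_corr(h, phi) for phi in phis]
    return [[sum(A[i][k] * A[j][k] * rho[k] for k in range(m))
             for j in range(m)]
            for i in range(m)]

def cross_corr_at_zero(A, phis):
    """Cross-correlations among outcomes' random effects at one location."""
    K = cross_cov(0.0, A, phis)
    m = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(m)]
            for i in range(m)]

# Hypothetical two-outcome example: lower-triangular A, one decay per process.
A = [[1.0, 0.0],
     [0.8, 0.6]]
phis = [6.0, 3.0]

K = cross_cov(0.0, A, phis)      # within-location covariance, A A'
R = cross_corr_at_zero(A, phis)  # off-diagonal entry is the cross-correlation
C = cross_cov(0.25, A, phis)     # covariance between locations 0.25 apart
print(K, R, C)
```

Because every entry of $C(h)$ is built from valid correlation functions through the same matrix `A`, any finite collection of random effects has a legitimate joint covariance matrix, which is the requirement stated above.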
Inference about the regression slopes, the spatial random effects, and the variances and covariances is based on Markov chain Monte Carlo (MCMC) sampling from posterior distributions.

As noted in the Introduction, a primary aim of our analysis is interpolation and prediction. Following terminology used in Banerjee & Gelfand (2002), when we estimate the value of an outcome at a location where some of the other outcomes have been observed, we call it interpolation. When we seek to estimate the value of an outcome at a new location, where none of the outcomes have been observed, we call it prediction. In sampling-based Bayesian inference, we draw samples from the posterior predictive distributions of the outcome variable at unobserved locations given the observed data. The posterior predictive distribution is in fact the posterior distribution of $y_i(s_0)$ given $y$, where $s_0$ is the location we want to interpolate or predict. Additional details can be found in Appendix S1.

Software implementation

The models described in the preceding section are available in the spBayes (version .3-) R package through the spMisalignLM and spMisalignGLM functions for Gaussian and non-Gaussian outcomes, respectively. These functions are written in C++ and leverage R's Foreign Language Interface to call FORTRAN BLAS (Basic Linear Algebra Subprograms, see Blackford et al. 2002) and LAPACK (Linear Algebra Package, see Anderson et al. 1999) libraries for efficient matrix computations. A heavy reliance on BLAS and LAPACK functions allows the software to leverage multiprocessor/core machines via threaded implementations of BLAS and LAPACK, for example Intel's Math Kernel Library (MKL; en-us/intel-mkl). Use of MKL, or similar threaded libraries, can dramatically reduce sampler run-times. For example, the illustrative analyses offered in subsequent sections were conducted using R, and hence spBayes, compiled with MKL on an Intel Ivy Bridge i7 quad-core processor with hyperthreading.
The use of these parallel matrix operations results in a near-linear speedup in the MCMC sampler's run-time with the number of CPUs. In addition to Appendix S1, Finley, Banerjee & Gelfand (2013) provide specifics on efficient implementation of the multivariate Gaussian process parameter estimation.

Illustrative analyses

SYNTHETIC DATA

We consider a synthetic data set comprising three outcome variables observed over unique and common locations within a unit square domain. The analysis of these data demonstrates how the strength of correlation between the outcomes' spatial random effects and the range of spatial dependence influence the accuracy and precision of prediction and interpolation. The R code to reproduce this and subsequent analyses is available in Finley, Banerjee & Cook (2014).

Table 1. Parameter values used to generate the synthetic data, in the column labelled True, along with spMisalignLM estimated parameter posterior distribution 50 (2.5, 97.5) percentiles. The $\beta_{i,0}$ correspond to the outcomes' regression intercepts, $\rho$ is the cross-correlation between the outcomes' spatial random effects, $\phi$ is the spatial cross-correlation decay parameter, and $\Psi$ is the non-spatial residual variance associated with each outcome. Subscripts indicate the associated outcome variable.

Parameter        True / Estimate
$\beta_{1,0}$    (67, 57)
$\beta_{2,0}$    (333, 639)
$\beta_{3,0}$    9 (99, 29)
$\rho_{1,2}$     6 ( 97, 66)
$\rho_{1,3}$     9 7 (63, 97)
$\rho_{2,3}$     ( 7, 7)
$\phi_1$         6 73 (439, 42)
$\phi_2$         (472, 2947)
$\phi_3$         (534, 252)
$\Psi_1$         5 (2, 2)
$\Psi_2$         4 (, 2)
$\Psi_3$         5 (2, 27)

Following model (1) and using the true parameter values given in the first column of Table 1, we generated outcomes at all locations in Fig. 1(a). These outcomes are shown in Fig. 1(b-d). Outcome observations were then subsampled to create misalignment following the design in Fig. 1(a). Here, each circle contains those locations where the outcome identified by the circle's number is observed. Regions where
the circles overlap identify those locations where multiple outcomes were observed. The true spatial cross-covariances used to generate the data can be converted to cross-correlations to facilitate interpretation. These correlations are provided in Table 1 and also displayed in their respective regions of overlap in Fig. 1(a).

Fig. 1. (a) Locations of observed and unobserved outcome variables. Data associated with each outcome are observed within its respective circle, indicated by numbers 1, 2 and 3. Intersecting regions contain locations where two or more outcomes are observed. Surfaces for outcomes 1, 2 and 3 are given in (b), (c) and (d), respectively.

Given spatially misaligned data, the spMisalignLM function, as called in the supplemental analysis code (Finley, Banerjee & Cook 2014), generates posterior samples from the parameters of the posited model. This function takes each outcome's symbolic regression model and the locations where data are observed. Additionally, parameter starting values, prior distributions, MCMC Metropolis algorithm proposal distribution variances, the spatial correlation function and the number of desired MCMC samples are also passed to the spMisalignLM function. A full explanation of argument syntax and output is available in the function's manual available via CRAN.
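The role of the proposal distribution variance can be illustrated with a generic random-walk Metropolis sampler. The Python sketch below targets a standard normal density as a stand-in for a parameter's posterior. It is not the sampler implemented in spBayes, and the function names, seed and proposal standard deviations are assumptions chosen for illustration.

```python
import math
import random

def log_target(theta):
    """Log density, up to a constant, of a standard normal stand-in posterior."""
    return -0.5 * theta * theta

def metropolis(n_samples, proposal_sd, start=0.0, seed=1):
    """Random-walk Metropolis; returns the chain and its acceptance rate."""
    rng = random.Random(seed)
    theta = start
    chain = []
    accepted = 0
    for _ in range(n_samples):
        candidate = theta + rng.gauss(0.0, proposal_sd)
        log_ratio = log_target(candidate) - log_target(theta)
        # Accept with probability min(1, target(candidate) / target(theta)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            theta = candidate
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_samples

# Three hypothetical proposal standard deviations: too small, moderate, too large.
chain_small, rate_small = metropolis(5000, proposal_sd=0.1)
chain_mid, rate_mid = metropolis(5000, proposal_sd=2.4)
chain_large, rate_large = metropolis(5000, proposal_sd=25.0)
print(rate_small, rate_mid, rate_large)
```

A very small proposal variance accepts almost every move but explores slowly, while a very large one rejects most moves. Tuning aims for an acceptance rate between these extremes.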
The resulting MCMC samples were summarized using functions in the coda package and are displayed in Table 1. Here, we can see that the parameters' estimated 95% credible intervals include the true parameter values. As we will see in the subsequent analysis of the Penobscot Experimental Forest LiDAR and biomass data, the parameter estimates associated with the spatial random effects' cross-correlations can be used to explore hypotheses about association after accounting for the impact of covariates. Also, one can look to the spatial cross-correlation decay parameters to make inference about the geographical range of dependence among observations.

Given the spMisalignLM object m.miss and spatial coordinates with associated covariates, one can interpolate and predict using the spPredict function. In the supplemental analysis code (Finley, Banerjee & Cook 2014), spPredict is used to generate posterior predictive samples for all three outcomes at all locations in Fig. 1(a). Figures 2 and 3 show the median and dispersion of the resulting posterior predictive distributions. The interpolated and predicted outcomes shown in Fig. 2(a-c) closely approximate the observed data in Fig. 1(b-d). We summarize the prediction uncertainty using the width between the lower and upper 95% posterior predictive credible intervals, given in Fig. 3(a), (b) and (c) for outcomes 1, 2 and 3, respectively. These surfaces show that stronger cross-correlation between outcomes results in more precise interpolation. For example, the spatial random effects associated with outcome 1 are strongly correlated with those of outcomes 2 and 3, that is, estimated cross-correlations of 6 and 7, respectively. As a result, Fig. 3(a) shows greater precision in interpolation of outcomes 2 and 3 when outcome 1 is observed (notice the lighter colours in circles 2 and 3). In contrast, when the cross-correlation is weak, there is less information available to inform interpolation. For example, a cross-correlation of 62 between outcomes 2 and 3 results in only marginal narrowing of the interpolation precision in either Fig. 3(b) or (c).

Fig. 2. Misalignment model posterior predictive distribution median surfaces for outcomes 1, 2 and 3 in (a), (b) and (c), respectively.

Fig. 3. Misalignment model posterior predictive distribution uncertainty surfaces for outcomes 1, 2 and 3 in (a), (b) and (c), respectively.

Table 2. Univariate spatial regression model prediction and misalignment model interpolation performance. Performance metrics are (i) root mean squared error (RMSE) between the observed and predicted or interpolated outcomes; (ii) mean width between the lower and upper 95% posterior predictive distribution credible intervals (CI width); and (iii) the percentage of observations covered by their respective 95% credible interval (CI cover). Column groups: Univariate outcome, Misalignment outcome. Rows: RMSE, CI width, CI cover.

Fig. 4. Penobscot Experimental Forest LiDAR and sample plot data extent and locations (a). Misalignment model posterior distribution median (black point symbol) and 95% credible intervals for Gp95 and BIO in (b) and (c), respectively. Map legend in (a): PEF study area; LVIS Lp25 and Lp95 extent; G-LiHT Gp95 observations; sample plot BIO observations. Axes in (b) and (c): observed versus fitted Gp95 and BIO.
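The three performance metrics defined in the Table 2 caption can be computed directly from posterior predictive samples. The Python sketch below is one way to do so for a set of held-out locations. It assumes the posterior predictive median is the point prediction used for RMSE, which the text does not state, and the function names and toy numbers are hypothetical rather than values from the article.

```python
import math

def percentile(sorted_vals, q):
    """Percentile q (0 to 100) of a sorted list, by linear interpolation."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac

def summarize(observed, pred_samples):
    """RMSE, mean 95% interval width and coverage (%) over held-out locations."""
    sq_err, width, covered = 0.0, 0.0, 0
    for y, draws in zip(observed, pred_samples):
        s = sorted(draws)
        median = percentile(s, 50.0)  # assumed point prediction
        lower, upper = percentile(s, 2.5), percentile(s, 97.5)
        sq_err += (y - median) ** 2
        width += upper - lower
        covered += lower <= y <= upper
    n = len(observed)
    return {"rmse": math.sqrt(sq_err / n),
            "ci_width": width / n,
            "ci_cover": 100.0 * covered / n}

# Toy data: three held-out locations, five posterior predictive draws each.
observed = [1.0, 2.0, 10.0]
pred_samples = [
    [0.0, 0.5, 1.0, 1.5, 2.0],
    [1.0, 1.5, 2.0, 2.5, 3.0],
    [4.0, 4.5, 5.0, 5.5, 6.0],
]
metrics = summarize(observed, pred_samples)
print(metrics)
```

In practice each location would have thousands of draws, one per retained MCMC sample, rather than five.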
We are in a prediction setting when none of the outcomes are observed at a given location. In Fig. 1(a), prediction occurs for all locations outside of the three circles. In the absence of covariates, prediction is only informed by proximate observed locations. The stronger the spatial dependence, the more information for prediction is gleaned from observed locations. For example, the spatial decay point estimate for outcome 1 is 73, which corresponds to an effective spatial range of 3 domain distance units (where we define effective spatial range as the distance at which the spatial correlation drops to 0.05). The result of this relatively long spatial range is that predictions made just outside of circle 1 show more precise posterior predictive intervals (notice the halo around the circle).

To assess the usefulness of estimating the covariance among the outcomes' spatial random effects for interpolation, we compare the misalignment model results to predictions generated by outcome-specific univariate spatial regressions. The univariate models are equivalent to model (1) but assume there is no covariance among the outcomes' random effects. These univariate models can be fit using the spLM function in spBayes.

Table 3. Estimated parameter posterior distribution 50 (2.5, 97.5) percentiles for the Penobscot Experimental Forest misalignment model. The $\beta_{\cdot,0}$ correspond to the outcomes' regression intercepts, $\rho_{Gp95,BIO}$ is the cross-correlation between the outcomes' spatial random effects, and $\Psi$ is the non-spatial residual variance associated with each outcome. Estimates for $\phi_{Gp95}$ and $\phi_{BIO}$ have been transformed to their respective effective spatial ranges in km.

Parameter               Estimate
$\beta_{Gp95,0}$        6 (245, 99)
$\beta_{Gp95,Lp25}$     45 (2, 6)
$\beta_{Gp95,Lp95}$     4 ( 3, 5)
$\beta_{BIO,0}$         93 (95, 53)
$\Psi_{Gp95}$           75 (2, 9)
$\Psi_{BIO}$            4 (26, 9)
$\rho_{Gp95,BIO}$       36 (6, 9)
Gp95 eff. range (km)    299 (3, 46)
BIO eff. range (km)     49 (7, 54)
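For a single exponential correlation function, $\rho(d) = \exp(-\phi d)$, the effective spatial range defined above (the distance at which the correlation drops to 0.05) has a closed form, $d_0 = -\ln(0.05)/\phi \approx 3/\phi$. The Python sketch below computes it. The exponential form is an assumption made here for illustration, the decay value is hypothetical, and under the multivariate model the outcome-specific ranges also depend on the cross-covariance parameters, so this closed form is only a first approximation.

```python
import math

def effective_range(phi, threshold=0.05):
    """Distance at which exp(-phi * d) falls to the given threshold."""
    return -math.log(threshold) / phi

phi = 6.0                  # hypothetical decay parameter
d0 = effective_range(phi)  # roughly 3 / phi
print(d0, math.exp(-phi * d0))
```

Larger decay parameters give shorter effective ranges, so fewer neighbouring observations inform prediction at an unobserved location.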
Summaries of prediction performance are given in Table 2 and show the misalignment model improves prediction accuracy and precision for each outcome, as reflected by lower RMSE and narrower 95% credible intervals compared to those of the univariate model.

PENOBSCOT EXPERIMENTAL FOREST LIDAR AND BIOMASS DATA

This illustrative analysis considers data from a 6-ha area within the US Forest Service Penobscot Experimental Forest (PEF; ME, USA). The PEF has been studied extensively beginning in the 1950s and is under active forest management as part of several long-term silvicultural experiments. A variety of forest variables are recorded on over 6 permanent georeferenced sample plots across the PEF. Light Detection and Ranging (LiDAR) data from the National Aeronautics and Space Administration (NASA) airborne Laser Vegetation Imaging Sensor (LVIS) and LiDAR, hyperspectral and thermal (G-LiHT; Cook et al. 2013) sensors are also available for the PEF.

Fig. 5. Penobscot Experimental Forest misalignment model posterior predictive distribution summary surfaces. Posterior median for Gp95 and BIO given in (a) and (b), respectively. Range between the lower and upper 95% credible intervals for Gp95 and BIO given in (c) and (d), respectively. Map axes are in km.

The objectives of this illustrative analysis are to produce predictive maps, with associated uncertainty, of (i) forest canopy height metrics from sparsely sampled LiDAR, for example, G-LiHT, and (ii) forest variables measured at forest sample plots. For brevity, we consider only a subset of the available PEF data. The location and extent of these data are shown in Fig. 4(a) and include:
- forest canopy height 25th and 95th percentiles, labelled Lp25 and Lp95, respectively, measured in 2003 using LVIS at a 25-m-diameter footprint across the extent of the study area;
- forest canopy height 95th percentile, labelled Gp95, measured in 2012 using the G-LiHT sensor at a 25-m-diameter footprint along a single transect across the study area;
- metric tons of live above-ground tree biomass per ha, BIO, estimated at each of the 7 permanent sample plots between 2 and 22.

Here, we are interested in predicting both Gp95 and BIO at a fine spatial resolution across the study area. We expect a positive relationship between Gp95, which is a proxy for canopy height, and BIO. Further, although the forest structure has changed since 2003 due to timber harvesting, the complete-coverage LVIS Lp25 and Lp95 variables might explain some variability in the more current G-LiHT Gp95, and therefore, we use these metrics as covariates in the subsequent regression.

This model is specified in the supplemental analysis code (Finley, Banerjee & Cook 2014), along with parameter starting values, prior distributions, MCMC algorithm specifics and the spatial correlation function. Although not shown, variogram analysis of univariate non-spatial model residuals and other exploratory data analysis tools can help guide the choice of prior distributions and associated hyperparameters for the spatial and non-spatial covariances. Again, a full explanation of argument syntax and output is available in the function's manual available via CRAN.

The resulting MCMC samples were summarized using functions in the coda package and are displayed in Table 3. Here, we see the LVIS Lp25 covariate explains a substantial portion of variability in G-LiHT Gp95, that is, the 95% credible interval of $\beta_{Gp95,Lp25}$ does not include zero.
Given timber harvesting activity in the study area over the 9 years between the LiDAR measurements, the lack of relationship between the sensors' 95th canopy height percentiles is not too surprising. The long effective spatial ranges estimated for Gp95 and BIO suggest there is substantial spatial structure among the residuals. The effective spatial ranges are calculated using the cross-covariance and spatial correlation functions' parameter estimates, see Finley, Banerjee & Cook (2014) and Gelfand et al. (2004, p. 292). Further, the spatial random effects of Gp95 and BIO are moderately correlated ($\rho_{Gp95,BIO}$ 36). Estimating this cross-correlation is useful for exploring hypotheses about strength and direction of association among the outcomes' residual spatial structure, perhaps after accounting for some covariates. In this analysis, we could say there is a positive and significant correlation (that is, the credible interval does not include zero) between the residual spatial structure of Gp95 and BIO.

Given the spMisalignLM object m.miss and spatial coordinates with associated covariates, one can interpolate and predict using the spPredict function. In the supplemental analysis code (Finley, Banerjee & Cook 2014), spPredict is used to generate posterior predictive samples for Gp95 and BIO at all 226 locations where Lp25 and Lp95 were observed. Surfaces of the resulting posterior predictive distributions' median and width between the lower and upper 95% credible intervals are given for Gp95 and BIO in Fig. 5. The posterior predictive medians shown in Fig. 5(a) and (b) closely approximate the observed data, see, for example, model fitted versus observed values in Fig. 4(b) and (c). However, more pertinent to this illustration, Fig. 5(c) and (d)
show narrowing of the posterior predictive distribution at and near locations of interpolation for the respective outcome. For example, the narrowing of the posterior predictive distributions for predicted Gp95 at and near observed BIO locations is clearly seen in Fig. 5(c). Similarly, Fig. 5(d) shows the posterior predictive distributions for BIO narrow within and adjacent to the G-LiHT transect where Gp95 is observed.

Discussion and summary

The multivariate model should yield improved predictive inference, over univariate models, in settings where there is moderate-to-strong covariance among the outcomes' spatial random effects and where the spatial range of dependence is sufficiently long as to allow observations to contribute information across locations.

The development in the section Multivariate spatial regression with misalignment, and subsequent analyses, assumes a constant covariance among outcomes over the domain. This assumption might be reasonable in many settings. However, a more flexible model would pursue a non-stationary formulation of the cross-covariance matrix, see, for example, Guhaniyogi et al. (2013). Such non-stationary cross-covariance models could improve inference about changing patterns in the strength and direction of the correlation between outcomes at broad spatial scales.

In addition to improving prediction and interpolation in some settings, the multivariate misalignment model could be useful in designing efficient monitoring efforts. For example, if one had an a priori estimate of the covariance among outcomes, or could learn about this covariance through an initial sampling effort, then resources could be used for an appropriate level of sampling of outcome subsets. This represents a very active area of work that builds upon a rich literature on sampling designs for spatiotemporal environmental data, see, for example, Mateu & Müller (2013).
Further development of the multivariate misalignment model for inference about spatiotemporal processes is a logical next step and would likely find application in exploring complex and dynamic ecological processes.

Acknowledgements

This work was supported by National Science Foundation Grants DMS- 669, EF-3739, EF-2474 and EF , as well as NASA Carbon Monitoring System grants.

Data accessibility

Data deposited in the Dryad repository: 56dryad.3g9s2

References

Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., et al. (1999) LAPACK Users' Guide, 3rd edn. Society for Industrial and Applied Mathematics, Philadelphia, PA. ISBN
Banerjee, S. & Gelfand, A.E. (2002) Prediction, interpolation and regression for spatially misaligned data sets. Sankhya Series A, 64,
Banerjee, S., Carlin, B.P. & Gelfand, A.E. (2004) Hierarchical Modeling and Analysis for Spatial Data. Chapman and Hall/CRC Press, Boca Raton, FL.
Baribault, T., Kobe, R.K. & Finley, A.O. (2012) Tropical tree growth is correlated with soil phosphorus, potassium, and calcium, though not for legumes. Ecological Monographs, 2,
Blackford, S.L., Demmel, J., Dongarra, J., Duff, I., Hammarling, S., Henry, G., et al. (2002) An Updated Set of Basic Linear Algebra Subprograms (BLAS). Transactions on Mathematical Software, 2,
Buonaccorsi, J.P. (2009) Measurement Error: Models, Methods and Applications. Chapman & Hall/CRC, Boca Raton, FL.
Chiles, J.P. & Delfiner, P. (1999) Geostatistics: Modelling Spatial Uncertainty. Wiley, New York.
Cook, B.D., Corp, L.W., Nelson, R.F., Middleton, E.M., Morton, D.C., McCorkel, J.T., et al. (2013) NASA Goddard's Lidar, Hyperspectral and Thermal (G-LiHT) airborne imager. Remote Sensing, 5,
Cressie, N.A.C. & Wikle, C.K. (2011) Statistics for Spatio-Temporal Data. Wiley, New York.
Diggle, P.J., Tawn, J.A. & Moyeed, R.A. (1998) Model-based geostatistics (with discussion). Journal of the Royal Statistical Society, Series C (Applied Statistics), 47,
Finley, A.O., Banerjee, S. & Gelfand, A.E. (2013) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. arxiv:3. 92[stat.CO].
Finley, A.O., Banerjee, S. & Cook, B.D. (2014) Data from: Bayesian hierarchical models for spatially misaligned data in R. Methods in Ecology and Evolution. doi:.56/dryad.3g9s2
Gelfand, A.E., Zhu, L. & Carlin, B.P. (2001) On the change of support problem for spatio-temporal data. Biostatistics, 2,
Gelfand, A.E., Schmidt, A.M., Banerjee, S. & Sirmans, C.F. (2004) Nonstationary multivariate process modelling through spatially varying coregionalization (with discussion). TEST, 3,
Gelman, A., Carlin, J.B., Stern, H.S. & Rubin, D.B. (2004) Bayesian Data Analysis, 2nd edn. Chapman and Hall/CRC Press, Boca Raton, FL.
Gotway, C.A. & Young, L.J. (2002) Combining incompatible spatial data. Journal of the American Statistical Association, 97,
Gryparis, A., Paciorek, C.J., Zeka, A., Schwartz, J. & Coull, B.A. (2009) Measurement error caused by spatial misalignment in environmental epidemiology. Biostatistics,
Guhaniyogi, R., Finley, A.O., Banerjee, S. & Kobe, R.K. (2013) Modeling complex spatial dependencies: low-rank spatially-varying cross-covariances with application to soil nutrient data. Journal of Agricultural, Biological, and Environmental Statistics,
Illian, J.B., Sorbye, S.H. & Rue, H. (2012) A toolbox for fitting complex spatial point process models using integrated nested Laplace approximation (INLA). The Annals of Applied Statistics, 6,
Kamman, E.E. & Wand, M.P. (2003) Geoadditive models. Applied Statistics, 52,
Lin, X., Wahba, G., Xiang, D., Gao, F., Klein, R. & Klein, B. (2000) Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV. Annals of Statistics, 2,
Lopiano, K.K., Young, L.J. & Gotway, C.A. (2011) A comparison of errors in variables methods for use in regression models with spatially misaligned data. Statistical Methods in Medical Research, 2,
Lopiano, K.K., Young, L.J. & Gotway, C.A. (2013) Estimated generalized least squares in spatially misaligned regression models with Berkson error. Biostatistics, 4,
Madsen, L., Ruppert, D. & Altman, N.S. (2008) Regression with spatially misaligned data. Environmetrics, 9,
Mateu, J. & Müller, W.G. (2013) Spatio-Temporal Design: Advances in Efficient Data Acquisition. John Wiley & Sons, Ltd., West Sussex.
Mugglin, A.S., Carlin, B.P. & Gelfand, A.E. (2000) Fully model-based approaches for spatially misaligned data. Journal of the American Statistical Association, 95,
Ovaskainen, O., Hottola, J. & Siitonen, J. (2010) Modeling species co-occurrence by multivariate logistic regression generates new hypotheses on fungal interactions. Ecology, 2,
Paciorek, C.J., Yanosky, J.D., Puett, R.C., Laden, F. & Suh, H.H. (2009) Practical large-scale spatio-temporal modeling of particulate matter concentrations. The Annals of Applied Statistics, 3,
Ren, Q. & Banerjee, S. (2013) Hierarchical factor models for large spatially misaligned data: a low-rank predictive process approach. Biometrics, 69, 9 3.
Swope, S.M. & Parker, I.M. (2012) Complex interactions among biocontrol agents, pollinators, and an invasive weed: a structural equation modeling approach. Ecology, 22,
Szpiro, A.A., Sheppard, L. & Lumley, T. (2011) Efficient measurement error correction with spatially misaligned data. Biostatistics, 2,
Zhu, L., Carlin, B.P. & Gelfand, A.E. (2003) Hierarchical regression with misaligned spatial data: relating ambient ozone and pediatric asthma ER visits in Atlanta. Environmetrics, 4,

Received 5 November 2013; accepted 26 February 2014
Handling Editor: Bob O'Hara

Supporting Information

Additional Supporting Information may be found in the online version of this article.

Appendix S. Misalignment model specification.