Hierarchical Modeling for Univariate Spatial Data



Sudipto Banerjee (1) and Andrew O. Finley (2)
(1) Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.
(2) Department of Forestry & Department of Geography, Michigan State University, Lansing, Michigan, U.S.A.
July 25, 2014
Graduate Workshop on Environmental Data Analytics 2014

Univariate spatial models: Spatial domain
[Figure: spatial domain $D$ with coordinate axes $x$ and $y$.]

Univariate spatial models: Algorithmic modeling
Spatial surface observed at a finite set of locations $S = \{s_1, s_2, \ldots, s_n\}$.
Tessellate the spatial domain (usually with data locations as vertices).
Fit an interpolating polynomial: $f(s) = \sum_i w_i(S; s) f(s_i)$.
Interpolate by reading off $f(s_0)$.
Includes: triangulation, weighted averages, geographically weighted regression (GWR).
Issues: sensitivity to tessellations; choices of multivariate interpolators; numerical error analysis.

Univariate spatial models: What is a spatial process?
[Figure: observations $Y(s_1), Y(s_2), \ldots, Y(s_n)$ marked at locations in the domain.]

Univariate spatial models: Simple linear model
$y(s) = \mu(s) + \epsilon(s)$
Response: $y(s)$ at location $s$.
Mean: $\mu(s) = x(s)^\top \beta$.
Error: $\epsilon(s) \overset{iid}{\sim} N(0, \tau^2)$.
Assumptions regarding $\epsilon(s)$: $\epsilon(s) \overset{iid}{\sim} N(0, \tau^2)$, so $\epsilon(s_i)$ and $\epsilon(s_j)$ are uncorrelated for all $i \neq j$.
[Figure: domain $D$ showing $y(s)$, $x(s)$ and the errors $\epsilon(s_i)$, $\epsilon(s_j)$ at two locations.]

Univariate spatial models: Sources of variation
Spatial Gaussian processes (GP): say $w(s) \sim GP(0, \sigma^2 \rho(\cdot))$ and $\mathrm{Cov}(w(s_1), w(s_2)) = \sigma^2 \rho(\phi; \|s_1 - s_2\|)$.
Let $w = [w(s_i)]_{i=1}^n$; then $w \sim N(0, \sigma^2 R(\phi))$, where $R(\phi) = [\rho(\phi; \|s_i - s_j\|)]_{i,j=1}^n$.
[Figure: domain $D$ with $w(s_i)$ and $w(s_j)$ at two locations.]

Realization of a Gaussian process
Changing $\phi$ and holding $\sigma^2 = 1$: $w \sim N(0, \sigma^2 R(\phi))$, where $R(\phi) = [\rho(\phi; \|s_i - s_j\|)]_{i,j=1}^n$.
Correlation model for $R(\phi)$: e.g., exponential decay $\rho(\phi; t) = \exp(-\phi t)$ if $t > 0$.
Other valid models: e.g., Gaussian, Spherical, Matérn.
Effective range: $t_0 = -\ln(0.05)/\phi \approx 3/\phi$.

Univariate spatial models: Sources of variation
$w \sim N(0, \sigma_w^2 R(\phi))$ provides complex spatial dependence through simple structured dependence.
E.g., anisotropic Matérn correlation function:
$\rho(s_i, s_j; \phi) = \frac{1}{\Gamma(\nu) 2^{\nu - 1}} \left(2\sqrt{\nu d_{ij}}\right)^{\nu} \kappa_\nu\left(2\sqrt{\nu d_{ij}}\right)$, where $d_{ij} = (s_i - s_j)^\top \Sigma^{-1} (s_i - s_j)$ and $\Sigma = G(\psi) \Lambda^2 G(\psi)^\top$. Thus $\phi = (\nu, \psi, \Lambda)$.
[Figure: simulated and predicted surfaces.]

Univariate spatial models: Univariate spatial regression
Simple linear model + random spatial effects: $y(s) = \mu(s) + w(s) + \epsilon(s)$
Response: $y(s)$ at some site.
Mean: $\mu(s) = x(s)^\top \beta$.
Spatial random effects: $w(s) \sim GP(0, \sigma^2 \rho(\phi; \|s_1 - s_2\|))$.
Non-spatial variance: $\epsilon(s) \overset{iid}{\sim} N(0, \tau^2)$. Interpretation as pure error, measurement error, replication error, microscale error.
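The exponential correlation model and the effective range on this page can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the 50 random locations on the unit square, the value `phi = 6.0`, and the small diagonal jitter are hypothetical choices, not taken from the slides.

```python
import numpy as np

def exp_corr(coords, phi):
    """Exponential correlation matrix R(phi), entries exp(-phi * ||s_i - s_j||)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-phi * dist)

def draw_gp(coords, sigma2, phi, rng):
    """One realization w ~ N(0, sigma2 * R(phi)) using the Cholesky factor."""
    n = len(coords)
    cov = sigma2 * exp_corr(coords, phi)
    # The tiny jitter is an assumption added for numerical stability only.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(n))
    return chol @ rng.standard_normal(n)

rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(50, 2))   # hypothetical locations
phi = 6.0                                      # hypothetical decay parameter
w = draw_gp(coords, sigma2=1.0, phi=phi, rng=rng)

# Distance at which the exponential correlation has dropped to 0.05.
effective_range = -np.log(0.05) / phi
```

Re-running `draw_gp` with a larger `phi` gives a rougher-looking surface, since the correlation then decays over a shorter distance.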
Univariate spatial models: Hierarchical modeling
First stage: $y \mid \beta, w, \tau^2 \sim \prod_{i=1}^n N(y(s_i) \mid x(s_i)^\top \beta + w(s_i), \tau^2)$
Second stage: $w \mid \sigma^2, \phi \sim N(0, \sigma^2 R(\phi))$
Third stage: priors on $\Omega = (\beta, \tau^2, \sigma^2, \phi)$
Marginalized likelihood: $y \mid \Omega \sim N(X\beta, \sigma^2 R(\phi) + \tau^2 I)$
Note: the spatial process parametrizes $\Sigma$: $y = X\beta + \epsilon$, $\epsilon \sim N(0, \Sigma)$, $\Sigma = \sigma^2 R(\phi) + \tau^2 I$.

Bayesian computations
Choice: fit $[y \mid \Omega] \times [\Omega]$ or $[y \mid \beta, w, \tau^2] \times [w \mid \sigma^2, \phi] \times [\Omega]$.
Conditional model: conjugate full conditionals for $\beta$, $\sigma^2$, $\tau^2$ and $w$; easier to program.
Marginalized model: needs Metropolis or slice sampling for $\sigma^2$, $\tau^2$ and $\phi$; harder to program. But the reduced parameter space gives faster convergence.
$\sigma^2 R(\phi) + \tau^2 I$ is more stable than $\sigma^2 R(\phi)$.
But what about $R^{-1}(\phi)$? EXPENSIVE!
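As a rough sketch of what evaluating the marginalized likelihood involves, the function below computes the log density of $N(X\beta, \sigma^2 R(\phi) + \tau^2 I)$ with one Cholesky factorization, which is where the cost noted above comes from. The exponential correlation function, the simulated inputs, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def marginal_loglik(y, X, coords, beta, sigma2, tau2, phi):
    """Log density of y ~ N(X beta, sigma2 * R(phi) + tau2 * I).

    R(phi) is taken to be the exponential correlation matrix (an assumption).
    """
    n = len(y)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    Sigma = sigma2 * np.exp(-phi * dist) + tau2 * np.eye(n)
    L = np.linalg.cholesky(Sigma)            # the O(n^3) step
    resid = y - X @ beta
    z = np.linalg.solve(L, resid)            # z'z = resid' Sigma^{-1} resid
    logdet = 2.0 * np.log(np.diag(L)).sum()  # log det(Sigma)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + z @ z)

rng = np.random.default_rng(42)
n = 40
coords = rng.uniform(size=(n, 2))                 # hypothetical locations
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta = np.array([1.0, -0.5])                      # hypothetical coefficients
y = X @ beta + rng.standard_normal(n)             # placeholder response, not real data
ll = marginal_loglik(y, X, coords, beta, sigma2=1.0, tau2=0.1, phi=3.0)
```

Inside a Metropolis or slice sampler this function would be called at every proposal of $(\sigma^2, \tau^2, \phi)$, so the factorization dominates the run time for large $n$.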

Univariate spatial models: Univariate spatial regression
Where are the $w$'s?
Interest often lies in the spatial surface $w \mid y$.
They are recovered from $[w \mid y, X] = \int [w \mid \Omega, y, X] \times [\Omega \mid y, X] \, d\Omega$ using posterior samples:
Obtain $\Omega^{(1)}, \ldots, \Omega^{(G)} \sim [\Omega \mid y, X]$.
For each $\Omega^{(g)}$, draw $w^{(g)} \sim [w \mid \Omega^{(g)}, y, X]$.
NOTE: With Gaussian likelihoods $[w \mid \Omega, y, X]$ is also Gaussian. With other likelihoods this may not be a standard distribution; a conditional updating scheme is preferred.

Univariate spatial models: Spatial prediction
Often we need to predict $y(s)$ at a new set of locations $\{s_0, \ldots, s_m\}$ with associated predictor matrix $\tilde{X}$.
Sample from the predictive distribution:
$[\tilde{y} \mid y, X, \tilde{X}] = \int [\tilde{y}, \Omega \mid y, X, \tilde{X}] \, d\Omega = \int [\tilde{y} \mid y, \Omega, X, \tilde{X}] \times [\Omega \mid y, X] \, d\Omega$
$[\tilde{y} \mid y, \Omega, X, \tilde{X}]$ is multivariate normal.
Sampling scheme:
Obtain $\Omega^{(1)}, \ldots, \Omega^{(G)} \sim [\Omega \mid y, X]$.
For each $\Omega^{(g)}$, draw $\tilde{y}^{(g)} \sim [\tilde{y} \mid y, \Omega^{(g)}, X, \tilde{X}]$.

Illustration: Colorado data
Modeling temperature: 50 locations in Colorado.
Simple spatial regression model: $y(s) = x(s)^\top \beta + w(s) + \epsilon(s)$, with $w(s) \sim GP(0, \sigma^2 \rho(\cdot; \phi, \nu))$ and $\epsilon(s) \overset{iid}{\sim} N(0, \tau^2)$.

Parameter       50%       (2.5%, 97.5%)
Intercept                 (2.3, 3.866)
[Elevation]               (-0.527, -0.333)
Precipitation             (0.002, 0.072)
σ²              0.34      (0.05, .245)
φ               7.39E-3   (4.7E-3, 5.2E-3)
Range                     (38.8, 476.3)
τ²              0.05      (0.022, 0.092)

[Figure: Elevation map; axes Longitude and Latitude.]
[Figure: Temperature residual map; axes Easting and Northing.]
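The composition sampling scheme for spatial prediction described above can be sketched as follows. This is a minimal illustration, assuming an exponential correlation function; the simulated data and the two entries of `posterior_draws` are hypothetical stand-ins for real MCMC output.

```python
import numpy as np

def cross_dist(a, b):
    """Pairwise Euclidean distances between rows of a and rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def predictive_moments(y, X, coords, X0, coords0, beta, sigma2, tau2, phi):
    """Mean and covariance of the Gaussian [y0 | y, Omega, X, X0]."""
    Sigma = sigma2 * np.exp(-phi * cross_dist(coords, coords)) + tau2 * np.eye(len(y))
    C0 = sigma2 * np.exp(-phi * cross_dist(coords, coords0))            # n x m
    Sigma00 = (sigma2 * np.exp(-phi * cross_dist(coords0, coords0))
               + tau2 * np.eye(len(coords0)))
    A = np.linalg.solve(Sigma, C0)                                      # Sigma^{-1} C0
    mean = X0 @ beta + A.T @ (y - X @ beta)
    cov = Sigma00 - C0.T @ A
    return mean, 0.5 * (cov + cov.T)                                    # symmetrize

rng = np.random.default_rng(7)
n, m = 30, 5
coords, coords0 = rng.uniform(size=(n, 2)), rng.uniform(size=(m, 2))
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
X0 = np.column_stack([np.ones(m), rng.standard_normal(m)])
y = X @ np.array([1.0, -0.5]) + rng.standard_normal(n)   # placeholder data

posterior_draws = [  # hypothetical stand-ins for Omega^(1), ..., Omega^(G)
    dict(beta=np.array([1.0, -0.5]), sigma2=1.0, tau2=0.10, phi=3.0),
    dict(beta=np.array([1.1, -0.4]), sigma2=0.9, tau2=0.12, phi=3.5),
]
y0_draws = []
for omega in posterior_draws:
    mean, cov = predictive_moments(y, X, coords, X0, coords0, **omega)
    y0_draws.append(rng.multivariate_normal(mean, cov))   # one draw per Omega^(g)
y0_draws = np.array(y0_draws)
```

Summaries of the columns of `y0_draws` (medians, quantiles) then give pointwise predictions and uncertainty at the new locations.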

[Figure: Residual map with elevation as covariate; axes Easting and Northing.]

Computing environments
Brief notes on setting up semi-high performance computing environments
July 25, 2014

We have two different computing environments for fitting demanding models to large space and/or time data sets.
1. A distributed system consists of multiple autonomous computers (nodes) that communicate through a computer network. A computer program that runs in a distributed system is called a distributed program. Message Passing Interface (MPI) is a specification for an Application Programming Interface (API) that allows many computers to communicate with one another (implementations in C, C++, and Fortran).
2. A shared memory multiprocessing system consists of a single computer with memory that may be simultaneously accessed by one or more programs running on multiple central processing units (CPUs). OpenMP (Open Multi-Processing) is an API that supports shared memory multiprocessing programming (implementations in C, C++, and Fortran).

Recent work focuses on fitting geostatistical (specifically point-referenced) models using MCMC methods. This necessitates iterative evaluation of a likelihood, which requires operations on large matrices.
A specific hurdle is the factorization needed to compute the determinant and inverse of large dense covariance matrices.
We try to model our way out and use tools from computer science to overcome the computational complexity (e.g., covariance tapering, Kaufman et al. 2008; low-rank methods, Cressie and Johannesson 2008; Banerjee et al. 2008, etc.).
Because of slow network communication and the transport of submatrices among nodes, distributed systems are not ideal for these types of iterative large matrix operations.
Computing environments
My lab currently favors shared memory multiprocessing systems.
We buy rack mounted units (e.g., Sun Fire X470 Server with 2 quad-core Intel Xeon Processor 5500 Series and 48 GB of RAM, 0-5k) running the Linux operating system.
Software includes OpenMP coupled with the Intel Math Kernel Library (MKL) (non-commercial-software-development). MKL is a library of highly optimized, extensively threaded math routines (e.g., BLAS, LAPACK, ScaLAPACK, Sparse Solvers, Fast Fourier Transforms, and vector RNGs).

So what kind of speedup can we expect from threaded BLAS and LAPACK libraries?
[Figure: mean computing times of dpotrf.]
Some simple examples of C++ with MKL and Rmath libraries, along with associated Makefile files, are available (I'll add more examples shortly and upon request).

Computing environments
Many core and contributed packages (including spBayes) call Basic Linear Algebra Subprograms (BLAS) and LAPACK (Linear Algebra PACKage) Fortran libraries.
Substantial computing gains come from:
processor-specific threaded BLAS/LAPACK implementations (e.g., MKL or AMD's Core Math Library (ACML));
processor-specific compilers (e.g., Intel's icc/ifort).

Compiling R to call MKL's BLAS and LAPACK libraries (rather than the stock serial versions):

    # Location of the MKL shared libraries on this machine
    MKL_LIB_PATH="/opt/intel/composer_xe_20_sp.0.39/mkl/lib/intel64"
    export LD_LIBRARY_PATH=$MKL_LIB_PATH
    # Link line for the threaded MKL BLAS/LAPACK
    MKL="-L${MKL_LIB_PATH} -lmkl_intel_lp64 -lmkl_intel_thread \
         -lmkl_core -liomp5 -lpthread -lm"
    # Shell variables are case-sensitive, so pass $MKL (not $mkl) to configure
    ./configure --with-blas="$MKL" --with-lapack

For many BLAS and LAPACK function calls from R, expect near linear speedup...

Hierarchical Modeling for non-Gaussian Spatial Data
Sudipto Banerjee and Andrew O. Finley

Spatial Generalized Linear Models
Often data sets preclude Gaussian modeling: $y(s)$ may not even be continuous.
Example: $y(s)$ is a binary or count variable:
species presence or absence at location $s$;
species abundance from count at location $s$;
continuous forest variable is high or low at location $s$.
Replace the Gaussian likelihood by an exponential family member.
Spatial GLM: Diggle, Tawn and Moyeed (1998).

Spatial Generalized Linear Models
First stage: the $y(s_i)$ are conditionally independent given $\beta$ and $w(s_i)$, so $f(y(s_i) \mid \beta, w(s_i), \gamma)$ equals
$h(y(s_i), \gamma) \exp\left(\gamma [y(s_i) \eta(s_i) - \psi(\eta(s_i))]\right)$,
where $g(E(y(s_i))) = \eta(s_i) = x(s_i)^\top \beta + w(s_i)$ (canonical link function) and $\gamma$ is a dispersion parameter.
Second stage: model $w(s)$ as a Gaussian process: $w \sim N(0, \sigma^2 R(\phi))$.
Third stage: priors and hyperpriors.

Comments
No process for $y(s)$, only a valid joint distribution.
Not sensible to add a pure error term $\epsilon(s)$.
We are modeling with spatial random effects.
Introducing these in the transformed mean encourages means of spatial variables at proximate locations to be close to each other.
Marginal spatial dependence is induced between, say, $y(s)$ and $y(s')$, but the observed $y(s)$ and $y(s')$ need not be close to each other.
Second stage spatial modeling is attractive for spatial explanation in the mean.
First stage spatial modeling is more appropriate to encourage proximate observations to be similar.

Illustration: Binary spatial regression, forest/non-forest
Illustration from: Finley, A.O., S. Banerjee, and R.E. McRoberts. (2008) A Bayesian approach to quantifying uncertainty in multi-source forest area estimates. Environmental and Ecological Statistics, 15.
We illustrate a non-Gaussian model for point-referenced spatial data.
Objective: make pixel-level prediction of forest/non-forest across the domain.
Data: observations are from 500 georeferenced USDA Forest Service Forest Inventory and Analysis (FIA) inventory plots within a 32 km radius circle in MN, USA.
The response $y(s)$ is a binary variable: $y(s) = 1$ if the inventory plot is forested, and $y(s) = 0$ if the inventory plot is not forested.
Observed covariates include the coinciding pixel values for 3 dates of 30 m resolution Landsat imagery.

Illustration: Binary spatial regression, forest/non-forest
We fit a generalized linear model where $y(s_i) \sim \mathrm{Bernoulli}(p(s_i))$ and $\mathrm{logit}(p(s_i)) = x(s_i)^\top \beta + w(s_i)$.
Assume a vague flat prior for $\beta$, a Uniform(3/32, 3/0.5) prior for $\phi$, and an inverse-gamma prior with shape 2 for $\sigma^2$.
Parameters are updated with a Metropolis algorithm using the target log density
$\ln p(\Omega \mid y) \propto -\left(\sigma_a + 1 + \frac{n}{2}\right) \ln \sigma^2 - \frac{\sigma_b}{\sigma^2} - \frac{1}{2} \ln |R(\phi)| - \frac{1}{2\sigma^2} w^\top R(\phi)^{-1} w + \sum_{i=1}^n y(s_i) \left(x(s_i)^\top \beta + w(s_i)\right) - \sum_{i=1}^n \ln\left(1 + \exp(x(s_i)^\top \beta + w(s_i))\right) + \ln(\sigma^2) + \ln(\phi - \phi_a) + \ln(\phi_b - \phi)$.

Illustration: Posterior parameter estimates
Parameter estimates (posterior medians and upper and lower 2.5 percentiles):

Parameter                  50%     (2.5%, 97.5%)
Intercept (θ0)                     (49.56, 20.46)
AprilTC1 (θ1)                      (-0.45, -0.)
AprilTC2 (θ2)              0.7     (0.07, 0.29)
AprilTC3 (θ3)                      (-0.43, -0.08)
JulyTC1 (θ4)                       (-0.25, 0.7)
JulyTC2 (θ5)               0.09    (-0.0, 0.9)
JulyTC3 (θ6)               0.0     (-0.5, 0.6)
OctTC1 (θ7)                        (-0.68, -0.22)
OctTC2 (θ8)                        (-0.9, 0.4)
OctTC3 (θ9)                        (-0.46, -0.07)
σ²                         .358    (0.39, 2.42)
φ                                  ( , )
-log(0.05)/φ (meters)              (932.33, )

Covariates and proximity to observed FIA plots will contribute to increased precision of prediction.

[Figure: Median of posterior predictive distributions.]
[Figure: 97.5%-2.5% range of posterior predictive distributions.]
[Figure: CDF of holdout areas' posterior predictive distributions, $F[P(Y(A) = 1)]$ against $P(Y(A) = 1)$, with cut point. Classification of pixel areas (based on visual inspection of imagery) into non-forest, moderately forest, and forest (no marker).]
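To make the structure of the Metropolis target for the binary spatial regression concrete, here is a minimal sketch of the log density, written term by term as: Bernoulli-logit likelihood, Gaussian process prior on $w$, inverse-gamma prior on $\sigma^2$, and the Jacobian terms for sampling $\sigma^2$ and $\phi$ on transformed scales. The exponential correlation function, the value of `sigma_b`, and the toy data are assumptions for illustration; only the prior bounds 3/32 and 3/0.5 and the shape 2 come from the text.

```python
import numpy as np

def log_target(beta, w, sigma2, phi, y, X, dist,
               sigma_a, sigma_b, phi_a, phi_b):
    """Unnormalized log posterior for the spatial logistic model."""
    n = len(y)
    R = np.exp(-phi * dist)                   # exponential correlation (assumed)
    L = np.linalg.cholesky(R)
    logdet_R = 2.0 * np.log(np.diag(L)).sum()
    z = np.linalg.solve(L, w)                 # z'z = w' R^{-1} w
    eta = X @ beta + w
    # sum_i y_i * eta_i - log(1 + exp(eta_i)), computed stably
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    log_prior_w = -0.5 * n * np.log(sigma2) - 0.5 * logdet_R - 0.5 * (z @ z) / sigma2
    log_prior_sigma2 = -(sigma_a + 1.0) * np.log(sigma2) - sigma_b / sigma2
    log_jacobian = np.log(sigma2) + np.log(phi - phi_a) + np.log(phi_b - phi)
    return loglik + log_prior_w + log_prior_sigma2 + log_jacobian

rng = np.random.default_rng(3)
n = 6
coords = rng.uniform(0.0, 32.0, size=(n, 2))            # hypothetical plot coordinates (km)
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = rng.integers(0, 2, size=n).astype(float)            # placeholder 0/1 responses
beta, w = np.array([0.2, -0.3]), 0.1 * rng.standard_normal(n)
sigma2, phi = 1.5, 1.0
sigma_a, sigma_b = 2.0, 1.0                             # sigma_b is a hypothetical scale
phi_a, phi_b = 3.0 / 32.0, 3.0 / 0.5
val = log_target(beta, w, sigma2, phi, y, X, dist, sigma_a, sigma_b, phi_a, phi_b)
```

A Metropolis step would evaluate `log_target` at the current and proposed values and accept with probability given by the exponentiated difference.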

Hierarchical Modeling for Spatiotemporal Data
Sudipto Banerjee and Andrew O. Finley

Spatio-temporal models: Building simple spatiotemporal models
Modeling: $Y_t(s) = \mu_t(s) + w_t(s) + \epsilon_t(s)$, or perhaps $g(E(Y_t(s))) = \mu_t(s) + w_t(s)$.
For $\epsilon_t(s)$: independent $N(0, \tau_t^2)$.
For $w_t(s)$:
$w_t(s) = \alpha_t + w(s)$;
$w_t(s)$ independent for each $t$;
$w_t(s) = w_{t-1}(s) + \eta_t(s)$, with independent spatial process innovations.

Spatio-temporal models: Univariate dynamic spatiotemporal models
Measurement equation:
$Y(s, t) = \mu(s, t) + \epsilon(s, t)$; $\epsilon(s, t) \overset{ind}{\sim} N(0, \sigma_\epsilon^2)$.
$\mu(s, t) = x(s, t)^\top \tilde{\beta}(s, t)$.
$\tilde{\beta}(s, t) = \beta_t + \beta(s, t)$.
Transition equation:
$\beta_t = \beta_{t-1} + \eta_t$, $\eta_t \overset{ind}{\sim} N_p(0, \Sigma_\eta)$.
$\beta(s, t) = \beta(s, t-1) + w(s, t)$.
$w(s, t) = A v(s, t)$, with $v(s, t) = (v_1(s, t), \ldots, v_p(s, t))^\top$.
The $v_l(s, t)$ are replications of Gaussian processes with unit variance and correlation function $\rho_l(\phi_l, \cdot)$.
Connect to the linear Kalman filter.

Spatio-temporal models: Dynamic predictive process models (dynamic models for large spatiotemporal datasets)
The measurement and transition equations are as above, except that
$w(s, t) = A \tilde{v}(s, t)$, where $\tilde{v}(s, t) = E[v(s, t) \mid v^*]$.

Modeling Multivariate Spatial Data
Sudipto Banerjee and Andrew O. Finley

Multivariate spatial modeling
Point-referenced spatial data often come as multivariate measurements at each location. Examples:
Environmental monitoring: stations yield measurements on ozone, NO, CO, and PM 2.5.
Community ecology: assemblages of plant species due to water availability, temperature, and light requirements.
Forestry: measurements of stand characteristics such as age, total biomass, and average tree diameter.
Atmospheric modeling: at a given site we observe surface temperature, precipitation and wind speed.
We anticipate dependence between measurements at a particular location, and across locations.

Bivariate linear spatial regression
A single covariate $X(s)$ and a univariate response $Y(s)$.
At any arbitrary point in the domain, we conceive a linear spatial relationship: $E[Y(s) \mid X(s)] = \beta_0 + \beta_1 X(s)$, where $X(s)$ and $Y(s)$ are spatial processes.
Regression on uncountable sets: regress $\{Y(s) : s \in D\}$ on $\{X(s) : s \in D\}$.
Inference: estimate $\beta_0$ and $\beta_1$; estimate the spatial surface $\{X(s) : s \in D\}$; estimate the spatial surface $\{Y(s) : s \in D\}$.

Bivariate spatial process
A bivariate distribution $[Y, X]$ will yield the regression $[Y \mid X]$. So why not start with a bivariate process?
$Z(s) = \begin{bmatrix} X(s) \\ Y(s) \end{bmatrix} \sim GP_2\left( \begin{bmatrix} \mu_X(s) \\ \mu_Y(s) \end{bmatrix}, \begin{bmatrix} C_{XX}(\cdot; \theta_Z) & C_{XY}(\cdot; \theta_Z) \\ C_{YX}(\cdot; \theta_Z) & C_{YY}(\cdot; \theta_Z) \end{bmatrix} \right)$
The cross-covariance function:
$C_Z(s, t; \theta_Z) = \begin{bmatrix} C_{XX}(s, t; \theta_Z) & C_{XY}(s, t; \theta_Z) \\ C_{YX}(s, t; \theta_Z) & C_{YY}(s, t; \theta_Z) \end{bmatrix}$,
where $C_{XY}(s, t) = \mathrm{cov}(X(s), Y(t))$ and so on.
Multivariate spatial modeling
Cross-covariance functions satisfy certain properties:
$C_{XY}(s, t) = \mathrm{cov}(X(s), Y(t)) = \mathrm{cov}(Y(t), X(s)) = C_{YX}(t, s)$.
Caution: $C_{XY}(s, t) \neq C_{XY}(t, s)$ and $C_{XY}(s, t) \neq C_{YX}(s, t)$ in general.
In matrix terms, $C_Z(s, t; \theta_Z)^\top = C_Z(t, s; \theta_Z)$.
Positive-definiteness for any finite collection of points: $\sum_{i=1}^n \sum_{j=1}^n a_i^\top C_Z(s_i, s_j; \theta_Z) a_j > 0$ for all $a_i \in \mathbb{R}^2 \setminus \{0\}$.

Bivariate spatial regression from a separable process
To ensure $E[Y(s) \mid X(s)] = \beta_0 + \beta_1 X(s)$, we assume
$Z(s) = \begin{bmatrix} X(s) \\ Y(s) \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} T_{11} & T_{12} \\ T_{12} & T_{22} \end{bmatrix} \right)$ for every $s \in D$.
Simplifying assumption: $C_Z(s, t) = \rho(s, t) T \implies \Sigma_Z = \{\rho(s_i, s_j) T\} = R(\phi) \otimes T$.

Multivariate spatial modeling
Then $p(Y(s) \mid X(s)) = N(Y(s) \mid \beta_0 + \beta_1 X(s), \sigma^2)$, where
$\beta_0 = \mu_2 - \frac{T_{12}}{T_{11}} \mu_1$, $\beta_1 = \frac{T_{12}}{T_{11}}$, $\sigma^2 = T_{22} - \frac{T_{12}^2}{T_{11}}$.
Regression coefficients are functions of process parameters.
Estimate $\{\mu_1, \mu_2, T_{11}, T_{12}, T_{22}\}$ by sampling from
$p(\phi) \times N(\mu \mid \delta, V_\mu) \times IW(T \mid r, S) \times N(Z \mid \mu, R(\phi) \otimes T)$.
Immediately obtain posterior samples of $\{\beta_0, \beta_1, \sigma^2\}$.

Bivariate spatial regression with misalignment
Rearranging the components of $Z$ to $\tilde{Z} = (X(s_1), X(s_2), \ldots, X(s_n), Y(s_1), Y(s_2), \ldots, Y(s_n))^\top$ yields
$\begin{bmatrix} X \\ Y \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_1 \mathbf{1} \\ \mu_2 \mathbf{1} \end{bmatrix}, T \otimes R(\phi) \right)$.
Priors: Wishart for $T^{-1}$, normal (perhaps flat) for $(\mu_1, \mu_2)$, discrete prior for $\phi$ or perhaps a uniform on (0, 0.5 max dist).
Estimation: Markov chain Monte Carlo (Gibbs, Metropolis, Slice, HMC/NUTS); Integrated Nested Laplace Approximation (INLA).

Hierarchical approach (Royle and Berliner, 1999; Cressie and Wikle, 2011)
$Y(s)$ and $X(s)$ are observed over a finite set of locations $S = \{s_1, s_2, \ldots, s_n\}$.
$Y$ and $X$ are $n \times 1$ vectors of the observed $Y(s_i)$'s and $X(s_i)$'s, respectively.
How do we model $Y \mid X$?
There is no conditional process: it is meaningless to talk about the joint distribution of $Y(s_i) \mid X(s_i)$ and $Y(s_j) \mid X(s_j)$ for two distinct locations $s_i$ and $s_j$.
We can model using $[X] \times [Y \mid X]$, but can we interpolate/predict at arbitrary locations?

Hierarchical approach (contd.)
$X(s) \sim GP(\mu_X(s), C_X(\cdot; \theta_X))$. Therefore, $X \sim N(\mu_X, C_X(\theta_X))$.
$C_X(\theta_X)$ is $n \times n$ with entries $C_X(s_i, s_j; \theta_X)$.
$e(s) \sim GP(0, C_e(\cdot; \theta_e))$; $C_e$ is analogous to $C_X$.
$Y(s_i) = \beta_0 + \beta_1 X(s_i) + e(s_i)$, for $i = 1, 2, \ldots, n$.
Joint distribution of $Y$ and $X$:
$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}, \begin{bmatrix} C_X(\theta_X) & \beta_1 C_X(\theta_X) \\ \beta_1 C_X(\theta_X) & C_e(\theta_e) + \beta_1^2 C_X(\theta_X) \end{bmatrix} \right)$,
where $\mu_Y = \beta_0 \mathbf{1} + \beta_1 \mu_X$.
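The statement that the regression coefficients are functions of the process parameters in the separable model can be checked numerically. The sketch below maps $(\mu, T)$ to $(\beta_0, \beta_1, \sigma^2)$; the numerical values of `mu` and `T` are hypothetical and chosen only so the arithmetic is easy to follow. Applied to each posterior draw of $(\mu, T)$, the same function would give posterior draws of the regression parameters.

```python
import numpy as np

def regression_from_process(mu, T):
    """Map separable-process parameters (mu, T) to (beta0, beta1, sigma2)."""
    beta1 = T[0, 1] / T[0, 0]
    beta0 = mu[1] - beta1 * mu[0]
    sigma2 = T[1, 1] - T[0, 1] ** 2 / T[0, 0]   # conditional variance of Y given X
    return beta0, beta1, sigma2

mu = np.array([1.0, 2.0])                 # hypothetical (mu_1, mu_2)
T = np.array([[2.0, 0.8],
              [0.8, 1.0]])                # hypothetical within-site covariance
beta0, beta1, sigma2 = regression_from_process(mu, T)

# Separable covariance for n sites: R(phi) kron T when stacking site by site,
# T kron R(phi) when stacking all X's first and then all Y's.
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])                # hypothetical 2-site correlation matrix
Sigma_by_site = np.kron(R, T)
Sigma_by_variable = np.kron(T, R)
```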
Multivariate spatial modeling
This joint distribution arises from a bivariate spatial process:
$W(s) = \begin{bmatrix} X(s) \\ Y(s) \end{bmatrix}$ and $E[W(s)] = \mu_W(s) = \begin{bmatrix} \mu_X(s) \\ \beta_0 + \beta_1 \mu_X(s) \end{bmatrix}$,
with cross-covariance
$C_W(s, s') = \begin{bmatrix} C_X(s, s') & \beta_1 C_X(s, s') \\ \beta_1 C_X(s, s') & \beta_1^2 C_X(s, s') + C_e(s, s') \end{bmatrix}$,
where we have suppressed the dependence of $C_X(s, s')$ and $C_e(s, s')$ on $\theta_X$ and $\theta_e$ respectively.
This implies that $E[Y(s) \mid X(s)] = \beta_0 + \beta_1 X(s)$ for any arbitrary location $s$, thereby specifying a well-defined spatial regression model for an arbitrary $s$.

Coregionalization (Wackernagel)
Separable models assume one spatial range for both $X(s)$ and $Y(s)$.
Coregionalization helps to introduce a second range parameter.
Introduce two latent independent GPs, each having its own parameters: $v_1(s) \sim GP(0, \rho_1(\cdot; \phi_1))$ and $v_2(s) \sim GP(0, \rho_2(\cdot; \phi_2))$.
Construct a bivariate process as the linear transformation:
$w_1(s) = a_{11} v_1(s)$
$w_2(s) = a_{21} v_1(s) + a_{22} v_2(s)$

Multivariate spatial modeling: Coregionalization
Short form:
$w(s) = \begin{bmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} v_1(s) \\ v_2(s) \end{bmatrix} = A v(s)$
Cross-covariance of $v(s)$: $C_v(s, t) = \begin{bmatrix} \rho_1(s, t; \phi_1) & 0 \\ 0 & \rho_2(s, t; \phi_2) \end{bmatrix}$
Cross-covariance of $w(s)$: $C_w(s, t) = A C_v(s, t) A^\top$.
It is a valid cross-covariance function (by construction).
If $s = t$, then $C_w(s, s) = A A^\top$. There is no loss of generality in specifying $A$ as (lower) triangular.
If $v_1(s)$ and $v_2(s)$ have identical correlation functions, then $\rho_1(s, t) = \rho_2(s, t)$ and $C_w(s, t) = \rho(s, t; \phi) A A^\top$, which is the separable model.

Coregionalized spatial linear model
$\begin{bmatrix} X(s) \\ Y(s) \end{bmatrix} = \begin{bmatrix} \mu_X(s) \\ \mu_Y(s) \end{bmatrix} + \begin{bmatrix} w_1(s) \\ w_2(s) \end{bmatrix} + \begin{bmatrix} e_X(s) \\ e_Y(s) \end{bmatrix}$,
where $e_X(s)$ and $e_Y(s)$ are independent white-noise processes:
$\begin{bmatrix} e_X(s) \\ e_Y(s) \end{bmatrix} \sim N_2\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_X^2 & 0 \\ 0 & \tau_Y^2 \end{bmatrix} \right)$ for every $s \in D$.

Generalizations
Each location contains $m$ spatial regressions: $Y_k(s) = \mu_k(s) + w_k(s) + \epsilon_k(s)$, $k = 1, \ldots, m$.
Let $v_k(s) \sim GP(0, \rho_k(s, s'))$, for $k = 1, \ldots, m$, be $m$ independent GPs with unit variance.
Assume $w(s) = A(s) v(s)$ arises as a space-varying linear transformation of $v(s)$. Then $C_w(s, t) = A(s) C_v(s, t) A^\top(t)$ is a valid cross-covariance function.
$A(s)$ is unknown! Should we first model $A(s)$ to obtain $C_w(s, s)$? Or should we model $C_w(s, t)$ first and derive $A(s)$?
$A(s)$ is completely determined from within-site associations.

Other approaches for cross-covariance models
Convolutions of processes and covariance functions: Gaspari and Cohn (Quart. J. Roy. Met. Soc., 1999); Majumdar and Gelfand (Math. Geo., 2007).
Latent dimension approach: Apanasovich and Genton (Biometrika, 2010); Apanasovich et al. (JASA, 2012).
Multivariate Matérn family: Gneiting et al. (JASA, 2010).
Nonstationary variants of coregionalization:
Space-varying: Gelfand et al. (Test, 2010).
Dimension-reducing (over space): Guhaniyogi et al. (JABES, 2012).
Dimension-reducing (over outcomes): Ren and Banerjee (Biometrics, 2013).
Variogram modeling: De Iaco et al. (Math. Geo., 2003).
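The coregionalization construction $C_w(s, t) = A C_v(s, t) A^\top$ described on this page is easy to check numerically. The following sketch assumes exponential correlation functions for the two latent processes; the entries of `A` and the decay parameters are hypothetical.

```python
import numpy as np

def lmc_cross_cov(s, t, A, phis):
    """Cross-covariance C_w(s, t) = A C_v(s, t) A' for independent latent GPs.

    C_v(s, t) is diagonal with exponential correlations exp(-phi_k * ||s - t||)
    (the exponential form is an assumption).
    """
    d = np.linalg.norm(np.asarray(s, dtype=float) - np.asarray(t, dtype=float))
    C_v = np.diag(np.exp(-np.asarray(phis, dtype=float) * d))
    return A @ C_v @ A.T

A = np.array([[1.0, 0.0],
              [0.5, 0.8]])       # hypothetical lower-triangular A
s, t = [0.1, 0.2], [0.4, 0.6]    # two locations; their distance is 0.5

C_two_ranges = lmc_cross_cov(s, t, A, phis=[3.0, 10.0])  # distinct ranges
C_same_site = lmc_cross_cov(s, s, A, phis=[3.0, 10.0])   # equals A A'
C_separable = lmc_cross_cov(s, t, A, phis=[3.0, 3.0])    # equals rho * A A'
```

With distinct decay parameters the two components of $w(s)$ have different spatial ranges, which a separable model cannot represent.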

Modeling Large Spatial Datasets
Sudipto Banerjee and Andrew O. Finley

Hierarchical spatial process models: Hierarchical spatial model
$p(\theta) \times p(\Psi) \times N(\beta \mid \mu_\beta, \Sigma_\beta) \times N(w \mid 0, C_w(\theta)) \times \prod_{i=1}^n N_m(y(s_i) \mid X(s_i)^\top \beta + w(s_i), D(\Psi))$
$\beta$: regression slopes.
$w$: spatial random effects from a Gaussian process.
$\Psi$: nonspatial variability (nugget).
$\theta$: spatial process parameters (spatial variance, range, smoothness).

Computational issues
We need to evaluate $-\frac{1}{2} \log \det(C_w(\theta)) - \frac{1}{2} w^\top C_w(\theta)^{-1} w$.
What if $n$ is LARGE? How do we tackle $C_w(\theta)^{-1}$ and $\det(C_w(\theta))$?

Dimension reduction: Some approaches
Covariance tapering (Furrer et al. 2006; Zhang and Du, 2007; Du et al. 2009; Kaufman et al., 2009).
Spectral domain (Fuentes 2007; Paciorek, 2007).
Approximation using GMRFs: INLA (Rue et al. 2009; Lindgren et al., 2011).
Nearest-neighbor models (processes) (Vecchia 1988; Stein et al. 2004; Datta et al., 2014).
Low-rank approaches (Wahba, 1990; Higdon, 2002; Lin et al., 2000; Kamman & Wand, 2003; Paciorek, 2007; Rasmussen & Williams, 2006; Stein 2007, 2008; Cressie & Johannesson, 2008; Banerjee et al., 2008; 2010; Sang et al., 2011).

Dimension reduction: Some approaches
Higdon (2002) proposed kernel convolution approximations.
$S^* = \{s_1^*, s_2^*, \ldots, s_{n^*}^*\}$: a set of knots.
$w(s) \approx w_{KC}(s) = \sum_{j=1}^{n^*} k(s - s_j^*, \theta) u_j$, with $u_j \overset{iid}{\sim} N(0, 1)$.
Smoothing causes loss in variability:
$w(s) - w_{KC}(s) = \int k(s - v, \theta) \, du(v) - \sum_{j=1}^{n^*} k(s - s_j^*, \theta) u_j$, which corresponds to the discarded terms $\sum_{j=n^*+1}^{\infty} k(s - s_j^*, \theta) u_j$ of an infinite discretization.
No easy way to quantify this difference with kernel convolutions.
Low rank Gaussian process
Call $w(s) \sim GP_m(0, C_\theta(\cdot))$ the parent process.
For $S^* = \{s_1^*, s_2^*, \ldots, s_{n^*}^*\}$, let $C_w^*(\theta) = \{C_\theta(s_i^*, s_j^*)\}$:
$w^* = (w(s_1^*)^\top, w(s_2^*)^\top, \ldots, w(s_{n^*}^*)^\top)^\top \sim N(0, C_w^*(\theta))$.
The predictive process derived from $w(s)$ is
$\tilde{w}(s) = E[w(s) \mid w^*] = \mathrm{cov}\{w(s), w^*\} \, \mathrm{var}\{w^*\}^{-1} w^*$.
$\tilde{w}(s)$ is a degenerate Gaussian process delivering dimension reduction.

Hierarchical predictive process models
Low rank interpolation: $\tilde{w}(s) = z(s, \theta)^\top w^*$, where $w^* = (w(s_1^*), \ldots, w(s_{n^*}^*))^\top$.
[Figure: knots; parent process surface; predictive process surface.]

Hierarchical predictive process models
$p(\theta) \times p(\Psi) \times N(\beta \mid \mu_\beta, \Sigma_\beta) \times N(w^* \mid 0, C_w^*(\theta)) \times \prod_{i=1}^n N_m(y(s_i) \mid X(s_i)^\top \beta + \tilde{w}(s_i), D(\Psi))$.

Systematic under-estimation
$\mathrm{var}\{w(s)\} = \mathrm{var}\{E[w(s) \mid w^*]\} + E\{\mathrm{var}[w(s) \mid w^*]\} \geq \mathrm{var}\{E[w(s) \mid w^*]\} = \mathrm{var}\{\tilde{w}(s)\}$.
Orthogonal decomposition: $\mathrm{var}\{w(s)\} = \mathrm{var}\{\tilde{w}(s)\} + \mathrm{var}\{w(s) - \tilde{w}(s)\}$.
$\tilde{\epsilon}(s) = w(s) - \tilde{w}(s) \sim GP(0, C_{\tilde{\epsilon}}(s_1, s_2; \theta))$, with
$C_{\tilde{\epsilon}}(s_1, s_2; \theta) = C(s_1, s_2; \theta) - c(s_1; \theta)^\top C^*(\theta)^{-1} c(s_2; \theta)$.
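The predictive process and its systematic under-estimation of variance can be illustrated numerically. In this minimal sketch the parent process is assumed to be univariate with an exponential covariance; the site and knot locations and the parameter values are hypothetical.

```python
import numpy as np

def exp_cov(a, b, sigma2, phi):
    """Exponential covariance between rows of a and rows of b (assumed form)."""
    diff = a[:, None, :] - b[None, :, :]
    return sigma2 * np.exp(-phi * np.sqrt((diff ** 2).sum(axis=-1)))

rng = np.random.default_rng(1)
sites = rng.uniform(size=(200, 2))    # hypothetical observation locations
knots = rng.uniform(size=(25, 2))     # hypothetical knots S*
sigma2, phi = 1.0, 4.0

C_star = exp_cov(knots, knots, sigma2, phi)   # var{w*}, n* x n*
c = exp_cov(sites, knots, sigma2, phi)        # cov{w(s), w*}, n x n*
Z = np.linalg.solve(C_star, c.T).T            # rows are z(s)' = c(s)' C*^{-1}

# Predictive process variance c(s)' C*^{-1} c(s) and the residual variance
# var{w(s) - w_tilde(s)} from the orthogonal decomposition.
var_pp = np.einsum("ij,ij->i", Z, c)
var_resid = sigma2 - var_pp

# One realization: draw w* at the knots, then interpolate to all sites.
w_star = np.linalg.cholesky(C_star) @ rng.standard_normal(len(knots))
w_tilde = Z @ w_star
```

Only the $25 \times 25$ matrix `C_star` is factorized, rather than a $200 \times 200$ matrix; `var_resid` shows how much variance is lost at each site, and it shrinks toward zero near knots.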

Introduction to Spatial Data and Models
Sudipto Banerjee and Andrew O. Finley

Researchers in diverse areas such as climatology, ecology, environmental health, and real estate marketing are increasingly faced with the task of analyzing data that are:
highly multivariate, with many important predictors and response variables;
geographically referenced, and often presented as maps; and
temporally correlated, as in longitudinal or other time series structures.
This motivates hierarchical modeling and data analysis for complex spatial (and spatiotemporal) data sets.

Types of spatial data
Point-referenced data, where $y(s)$ is a random vector at a location $s \in \mathbb{R}^r$, where $s$ varies continuously over $D$, a fixed subset of $\mathbb{R}^r$ that contains an $r$-dimensional rectangle of positive volume.
Areal data, where $D$ is again a fixed subset (of regular or irregular shape), but now partitioned into a finite number of areal units with well-defined boundaries.
Point pattern data, where now $D$ is itself random; its index set gives the locations of random events that are the spatial point pattern. $y(s)$ itself can simply equal 1 for all $s \in D$ (indicating occurrence of the event), or possibly give some additional covariate information (producing a marked point pattern process).
Exploration of spatial data
First step in analyzing data.
First Law of Geography: Mean + Error.
Mean: first-order behavior.
Error: second-order behavior (covariance function).
EDA tools examine both first and second order behavior.
Preliminary displays: simple locations to surface displays.

Exploration of spatial data: First Law of Geography
[Figure: data = mean + error.]

Exploration of spatial data: Scallops sites
[Figure: scallops sampling sites.]

Deterministic surface interpolation
Spatial surface observed at a finite set of locations $S = \{s_1, s_2, \ldots, s_n\}$.
Tessellate the spatial domain (usually with data locations as vertices).
Fit an interpolating polynomial: $f(s) = \sum_i w_i(S; s) f(s_i)$.
Interpolate by reading off $f(s_0)$.
Issues: sensitivity to tessellations; choices of multivariate interpolators; numerical error analysis.

Scallops data: image and contour plots
[Figure: image and contour plots; axes Longitude and Latitude.]
[Figure: drop-line scatter plot.]
[Figure: surface plot of logsp; axes Longitude and Latitude.]
[Figure: image contour plot; axes Longitude and Latitude.]
[Figure: locations form patterns; axes Eastings and Northings.]
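The interpolation rule $f(s_0) = \sum_i w_i(S; s_0) f(s_i)$ covers many weighting schemes. As one simple instance, the sketch below uses inverse-distance weights; this particular choice of weights (and the `power` parameter) is an illustrative assumption, not a method named in the slides, which mention triangulation and weighted averages in general.

```python
import numpy as np

def idw_interpolate(coords, values, s0, power=2.0):
    """Weighted-average interpolation f(s0) = sum_i w_i f(s_i).

    Weights are proportional to distance^(-power) and normalized to sum to one.
    """
    d = np.linalg.norm(coords - s0, axis=1)
    if np.any(d == 0.0):                       # s0 coincides with a data location
        return float(values[np.argmin(d)])
    weights = d ** (-power)
    weights /= weights.sum()
    return float(weights @ values)

coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # hypothetical sites
values = np.array([1.0, 2.0, 3.0, 4.0])                              # hypothetical f(s_i)
f_center = idw_interpolate(coords, values, np.array([0.5, 0.5]))
f_at_site = idw_interpolate(coords, values, np.array([1.0, 0.0]))
```

At the center of the square all four sites are equidistant, so the interpolated value is the plain average; different choices of weights give different surfaces, which is the sensitivity issue listed above.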

[Figure: surface features; shrub density over Eastings and Northings.]
[Figure: interesting plot arrangements; N-S and E-W UTM coordinates.]

Elements of point-level modeling
Point-level modeling refers to modeling of spatial data collected at locations referenced by coordinates (e.g., lat-long, Easting-Northing).
Fundamental concept: data come from a spatial process $\{y(s) : s \in D\}$, where $D$ is a fixed subset in Euclidean space.
Example: $y(s)$ is a pollutant level at site $s$.
Conceptually: the pollutant level exists at all possible sites.
Practically: the data will be a partial realization of a spatial process, observed at $\{s_1, \ldots, s_n\}$.
Statistical objectives: inference about the process $y(s)$; prediction at new locations.

Stationary Gaussian processes
Suppose our spatial process has a mean, $\mu(s) = E(y(s))$, and that the variance of $y(s)$ exists for all $s \in D$.
Strong stationarity: for any given set of sites, and any displacement $h$, the distribution of $(y(s_1), \ldots, y(s_n))$ is the same as that of $(y(s_1 + h), \ldots, y(s_n + h))$.
Weak stationarity: constant mean $\mu(s) = \mu$, and $\mathrm{Cov}(y(s), y(s + h)) = C(h)$: the covariance depends only upon the displacement (or separation) vector.
Strong stationarity implies weak stationarity.
The process is Gaussian if $y = (y(s_1), \ldots, y(s_n))^\top$ has a multivariate normal distribution.
For Gaussian processes, strong and weak stationarity are equivalent.

Variograms
Suppose we assume $E[y(s + h) - y(s)] = 0$ and define
$E[y(s + h) - y(s)]^2 = \mathrm{Var}(y(s + h) - y(s)) = 2\gamma(h)$.
This is sensible if the left hand side depends only upon $h$. Then we say the process is intrinsically stationary.
$\gamma(h)$ is called the semivariogram and $2\gamma(h)$ is called the variogram.
Note that intrinsic stationarity defines only the first and second moments of the differences $y(s + h) - y(s)$. It says nothing about the joint distribution of a collection of variables $y(s_1), \ldots, y(s_n)$, and thus provides no likelihood.

Intrinsic stationarity and ergodicity

Relationship between $\gamma(h)$ and $C(h)$:
$$2\gamma(h) = \mathrm{Var}(y(s + h)) + \mathrm{Var}(y(s)) - 2\,\mathrm{Cov}(y(s + h), y(s)) = C(0) + C(0) - 2C(h) = 2[C(0) - C(h)].$$
It is easy to recover $\gamma$ from $C$. The converse needs the additional assumption of ergodicity: $\lim_{u \to \infty} C(u) = 0$. So $\lim_{u \to \infty} \gamma(u) = C(0)$, and we can recover $C$ from $\gamma$ as long as this limit exists:
$$C(h) = \lim_{u \to \infty} \gamma(u) - \gamma(h).$$
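The two-way relationship between the semivariogram and the covariance function can be checked numerically. The sketch below is illustrative only: it assumes an isotropic exponential covariance without a nugget, hypothetical parameter values (sigma2 = 1, phi = 2), and it approximates the limit of γ(u) by evaluating γ at a large distance `far`. None of these function names come from the slides.

```python
import math

def exp_cov(t, sigma2=1.0, phi=2.0):
    """Assumed covariance C(t) = sigma2 * exp(-phi * t), t >= 0 (no nugget)."""
    return sigma2 * math.exp(-phi * t)

def semivariogram_from_cov(cov, t):
    """gamma(t) = C(0) - C(t)."""
    return cov(0.0) - cov(t)

def cov_from_semivariogram(gamma, t, far=1e3):
    """C(t) = lim_{u -> inf} gamma(u) - gamma(t); the limit is approximated by gamma(far)."""
    return gamma(far) - gamma(t)

def gamma_exp(t):
    """Semivariogram implied by the assumed exponential covariance."""
    return semivariogram_from_cov(exp_cov, t)

# Compare the covariance recovered from gamma with the covariance we started from.
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, gamma_exp(t), cov_from_semivariogram(gamma_exp, t), exp_cov(t))
```

The recovery step only works because the assumed covariance decays to zero; for a variogram with no finite limit (such as the linear model) `gamma(far)` would keep growing with `far` and no covariance function is recovered.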

Isotropy

When $\gamma(h)$ or $C(h)$ depends upon the separation vector only through the distance $\|h\|$, we say that the process is isotropic. In that case, we write $\gamma(\|h\|)$ or $C(\|h\|)$. Otherwise we say that the process is anisotropic.
If the process is intrinsically stationary and isotropic, it is also called homogeneous.
Isotropic processes are popular because of their simplicity, interpretability, and because a number of relatively simple parametric forms are available as candidates for $C$ (and $\gamma$). Denoting $\|h\|$ by $t$ for notational simplicity, the next two tables provide a few examples.

Some common isotropic variograms

Model: Variogram, $\gamma(t)$
Linear: $\gamma(t) = \begin{cases} \tau^2 + \sigma^2 t & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$
Spherical: $\gamma(t) = \begin{cases} \tau^2 + \sigma^2 & \text{if } t \ge 1/\phi \\ \tau^2 + \sigma^2\left[\frac{3}{2}\phi t - \frac{1}{2}(\phi t)^3\right] & \text{if } 0 < t \le 1/\phi \\ 0 & \text{otherwise} \end{cases}$
Exponential: $\gamma(t) = \begin{cases} \tau^2 + \sigma^2(1 - \exp(-\phi t)) & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$
Powered exponential: $\gamma(t) = \begin{cases} \tau^2 + \sigma^2(1 - \exp(-|\phi t|^p)) & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$
Matérn at $\nu = 3/2$: $\gamma(t) = \begin{cases} \tau^2 + \sigma^2\left[1 - (1 + \phi t)e^{-\phi t}\right] & \text{if } t > 0 \\ 0 & \text{otherwise} \end{cases}$

Examples: Spherical Variogram

$$\gamma(t) = \begin{cases} \tau^2 + \sigma^2 & \text{if } t \ge 1/\phi \\ \tau^2 + \sigma^2\left[\frac{3}{2}\phi t - \frac{1}{2}(\phi t)^3\right] & \text{if } 0 < t \le 1/\phi \\ 0 & \text{if } t = 0. \end{cases}$$
While $\gamma(0) = 0$ by definition, $\gamma(0^+) \equiv \lim_{t \to 0^+} \gamma(t) = \tau^2$; this quantity is the nugget.
$\lim_{t \to \infty} \gamma(t) = \tau^2 + \sigma^2$; this asymptotic value of the semivariogram is called the sill. (The sill minus the nugget, $\sigma^2$ in this case, is called the partial sill.)
Finally, the value $t = 1/\phi$ at which $\gamma(t)$ first reaches its ultimate level (the sill) is called the range, $R \equiv 1/\phi$.

[Figure: panel b) spherical semivariogram, a0 = 0.2]

Some common isotropic covariograms

Model: Covariance function, $C(t)$
Linear: $C(t)$ does not exist
Spherical: $C(t) = \begin{cases} 0 & \text{if } t \ge 1/\phi \\ \sigma^2\left[1 - \frac{3}{2}\phi t + \frac{1}{2}(\phi t)^3\right] & \text{if } 0 < t \le 1/\phi \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$
Exponential: $C(t) = \begin{cases} \sigma^2 \exp(-\phi t) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$
Powered exponential: $C(t) = \begin{cases} \sigma^2 \exp(-|\phi t|^p) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$
Matérn at $\nu = 3/2$: $C(t) = \begin{cases} \sigma^2 (1 + \phi t) \exp(-\phi t) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{otherwise} \end{cases}$

Notes on exponential model

$$C(t) = \begin{cases} \tau^2 + \sigma^2 & \text{if } t = 0 \\ \sigma^2 \exp(-\phi t) & \text{if } t > 0 \end{cases}$$
We define the effective range, $t_0$, as the distance at which the spatial correlation $\exp(-\phi t)$ has dropped to only 0.05. Setting $\exp(-\phi t_0)$ equal to this value we obtain $t_0 \approx 3/\phi$, since $\log(0.05) \approx -3$.
Finally, the form of $C(t)$ shows why the nugget $\tau^2$ is often viewed as a nonspatial effect variance, and the partial sill ($\sigma^2$) is viewed as a spatial effect variance.
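The nugget, sill, range and effective range can be read off small functions. The following is a minimal sketch under assumed parameter values (tau2 = 0.2, sigma2 = 1.0 and the phi values used in the print calls are hypothetical); the function names are illustrative and do not come from the slides.

```python
import math

def spherical_semivariogram(t, tau2, sigma2, phi):
    """Spherical semivariogram with nugget tau2, partial sill sigma2, range 1/phi."""
    if t == 0:
        return 0.0                        # gamma(0) = 0 by definition
    if t >= 1.0 / phi:
        return tau2 + sigma2              # the sill, reached at the range
    return tau2 + sigma2 * (1.5 * phi * t - 0.5 * (phi * t) ** 3)

def effective_range(phi, level=0.05):
    """Distance t0 at which the exponential correlation exp(-phi * t0) equals `level`."""
    return -math.log(level) / phi

# Nugget: value just above t = 0. Sill: value at and beyond the range 1/phi.
print(spherical_semivariogram(1e-9, tau2=0.2, sigma2=1.0, phi=1.0))
print(spherical_semivariogram(1.0, tau2=0.2, sigma2=1.0, phi=1.0))

# Effective range of the exponential model, close to the rule of thumb 3/phi.
print(effective_range(1.5), 3.0 / 1.5)
```

The exact effective range is $-\log(0.05)/\phi$; the $3/\phi$ rule differs from it by well under one percent, which is why the approximation is used freely.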

The Matérn Correlation Function

Much of statistical modelling is carried out through correlation functions rather than variograms.
The Matérn is a very versatile family:
$$C(t) = \begin{cases} \dfrac{\sigma^2}{2^{\nu - 1}\Gamma(\nu)} \left(2\sqrt{\nu}\, t\phi\right)^{\nu} K_{\nu}\left(2\sqrt{\nu}\, t\phi\right) & \text{if } t > 0 \\ \tau^2 + \sigma^2 & \text{if } t = 0 \end{cases}$$
$K_{\nu}$ is the modified Bessel function of order $\nu$ (computationally tractable).
$\nu$ is a smoothness parameter (a fractal) controlling process smoothness.

Variogram model fitting

How do we select a variogram? Can the data really distinguish between variograms?
Empirical variogram:
$$\hat{\gamma}(t) = \frac{1}{2|N(t)|} \sum_{(s_i, s_j) \in N(t)} \left(y(s_i) - y(s_j)\right)^2,$$
where $N(t)$ is the set of pairs of points such that $\|s_i - s_j\| = t$ and $|N(t)|$ is the number of pairs in $N(t)$.
Grid up the $t$ space into intervals $I_1 = (0, t_1)$, $I_2 = (t_1, t_2)$, and so forth, up to $I_K = (t_{K-1}, t_K)$. Representing the $t$ values in each interval by its midpoint, we define:
$$N(t_k) = \{(s_i, s_j) : \|s_i - s_j\| \in I_k\}, \quad k = 1, \ldots, K.$$

Empirical variogram: scallops data

[Figure: empirical variogram for the scallops data; axes: distance, Gamma(d)]
[Figure: parametric semivariograms (Exponential, Gaussian, Cauchy, Spherical, Bessel-J); axes: distance, gamma(d)]
[Figure: Bessel mixtures with random weights (two, three, four, five components); axes: distance, gamma(d)]
[Figure: Bessel mixtures with random phi's (two, three, four, five components); axes: distance, gamma(d)]
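The binned empirical semivariogram described above can be sketched in a few lines. This is an illustration, not the workshop's implementation: the function name, the toy coordinates and values, and the bin edges are all made up, and the bins are treated as half-open intervals (lo, hi] so that a distance falling exactly on an edge is not dropped.

```python
import math
from itertools import combinations

def empirical_semivariogram(coords, values, bin_edges):
    """Return one (bin midpoint, gamma_hat, number of pairs) tuple per distance bin."""
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins      # running sum of squared differences per bin
    counts = [0] * n_bins      # |N(t_k)|: number of pairs per bin
    for (i, si), (j, sj) in combinations(enumerate(coords), 2):
        d = math.dist(si, sj)  # Euclidean distance between the two sites
        for k in range(n_bins):
            if bin_edges[k] < d <= bin_edges[k + 1]:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
                break
    result = []
    for k in range(n_bins):
        mid = 0.5 * (bin_edges[k] + bin_edges[k + 1])
        gamma_hat = sums[k] / (2 * counts[k]) if counts[k] else None  # None: empty bin
        result.append((mid, gamma_hat, counts[k]))
    return result

# Toy data: three sites on a line (hypothetical values, not the scallops data).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [1.0, 2.0, 4.0]
result = empirical_semivariogram(coords, values, [0.0, 1.5, 2.5])
print(result)
```

The double loop over pairs costs on the order of n squared distance evaluations, which is fine for a few thousand sites; bins with few pairs give noisy estimates, so the pair count is returned alongside each value.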

Principles of Bayesian Inference

Sudipto Banerjee (1) and Andrew O. Finley (2)
(1) Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.
(2) Department of Forestry & Department of Geography, Michigan State University, Lansing, Michigan, U.S.A.
July 25, 2014

Bayesian principles

Classical statistics: model parameters are fixed and unknown.
A Bayesian thinks of parameters as random, and thus having distributions (just like the data).
A Bayesian writes down a prior guess for the parameter(s) $\theta$, say $p(\theta)$, and then combines this with the information provided by the observed data $y$ to obtain the posterior distribution of $\theta$, which we denote by $p(\theta \mid y)$.
All statistical inferences (point and interval estimates, hypothesis tests) then follow from posterior summaries. For example, the posterior means/medians/modes offer point estimates of $\theta$, while the quantiles yield credible intervals.

The key to Bayesian inference is learning or updating of prior beliefs. Thus, posterior information $\ge$ prior information.
Is the classical approach wrong? That may be a controversial statement, but it certainly is fair to say that the classical approach is limited in scope.
The Bayesian approach expands the class of models and easily handles: repeated measures; unbalanced or missing data; nonhomogenous variances; multivariate data; and many other settings that are precluded (or much more complicated) in classical settings.

Basics of Bayesian inference

We start with a model (likelihood) $f(y \mid \theta)$ for the observed data $y = (y_1, \ldots, y_n)'$ given unknown parameters $\theta$ (perhaps a collection of several parameters).
Add a prior distribution $p(\theta \mid \lambda)$, where $\lambda$ is a vector of hyper-parameters.
The posterior distribution of $\theta$ is given by:
$$p(\theta \mid y, \lambda) = \frac{p(\theta \mid \lambda) \times f(y \mid \theta)}{p(y \mid \lambda)} = \frac{p(\theta \mid \lambda) \times f(y \mid \theta)}{\int f(y \mid \theta)\, p(\theta \mid \lambda)\, d\theta}.$$
We refer to this formula as Bayes' Theorem.
Calculations (numerical and algebraic) are usually required only up to a proportionality constant. We, therefore, write the posterior as:
$$p(\theta \mid y, \lambda) \propto p(\theta \mid \lambda) \times f(y \mid \theta).$$
If $\lambda$ are known/fixed, then the above represents the desired posterior. If, however, $\lambda$ are unknown, we assign a prior, $p(\lambda)$, and seek:
$$p(\theta, \lambda \mid y) \propto p(\lambda)\, p(\theta \mid \lambda)\, f(y \mid \theta).$$
The proportionality constant does not depend upon $\theta$ or $\lambda$:
$$\frac{1}{p(y)} = \frac{1}{\int\!\!\int p(\lambda)\, p(\theta \mid \lambda)\, f(y \mid \theta)\, d\lambda\, d\theta}.$$
The above represents a joint posterior from a hierarchical model. The marginal posterior distribution for $\theta$ is:
$$p(\theta \mid y) \propto \int p(\lambda)\, p(\theta \mid \lambda)\, f(y \mid \theta)\, d\lambda.$$

Bayesian inference: point estimation

Point estimation is easy: simply choose an appropriate distribution summary, such as the posterior mean, median or mode.
Mode: sometimes easy to compute (no integration, simply optimization), but it often misrepresents the middle of the distribution, especially for one-tailed distributions.
Mean: easy to compute. It has the opposite effect of the mode: it chases tails.
Median: probably the best compromise in being robust to tail behaviour, although it may be awkward to compute as it needs to solve:
$$\int_{-\infty}^{\theta_{\text{median}}} p(\theta \mid y)\, d\theta = \frac{1}{2}.$$
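The contrast between the three point estimates is easy to see on a skewed density. The sketch below is an illustration under assumptions: it takes a hypothetical unnormalized posterior proportional to θ exp(-θ) for θ > 0 (a Gamma shape), normalizes it on an equally spaced grid, and reads off the mean, median and mode. The function names and grid settings are made up for this example.

```python
import math

def grid_summaries(log_unnorm, lo, hi, n=40001):
    """Approximate posterior mean, median and mode from an unnormalized log density."""
    step = (hi - lo) / (n - 1)
    grid = [lo + k * step for k in range(n)]
    logw = [log_unnorm(t) for t in grid]
    top = max(logw)
    weights = [math.exp(v - top) for v in logw]   # subtract the max for stability
    total = sum(weights)
    probs = [w / total for w in weights]          # normalizing constant never needed
    mean = sum(t * p for t, p in zip(grid, probs))
    mode = grid[max(range(n), key=probs.__getitem__)]
    cumulative, median = 0.0, grid[-1]
    for t, p in zip(grid, probs):                 # first grid point with CDF >= 1/2
        cumulative += p
        if cumulative >= 0.5:
            median = t
            break
    return mean, median, mode

def log_post(theta):
    """Hypothetical unnormalized log posterior: log(theta * exp(-theta)) for theta > 0."""
    return math.log(theta) - theta if theta > 0 else -math.inf

mean, median, mode = grid_summaries(log_post, 0.0, 40.0)
print(mean, median, mode)
```

For this density the mode sits at 1, the median near 1.68 and the mean at 2, so the ordering mode < median < mean shows the mean being pulled toward the long right tail. Note that only the unnormalized density was supplied, in line with the remark that calculations are needed only up to a proportionality constant.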

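Bayesian inference: interval estimation

The most popular method of inference in practical Bayesian modelling is interval estimation using credible sets. A $100(1 - \alpha)\%$ credible set $C$ for $\theta$ is a set that satisfies:
$$P(\theta \in C \mid y) = \int_C p(\theta \mid y)\, d\theta \ge 1 - \alpha.$$
The most popular credible set is the simple equal-tail interval estimate $(q_L, q_U)$ such that:
$$\int_{-\infty}^{q_L} p(\theta \mid y)\, d\theta = \frac{\alpha}{2} = \int_{q_U}^{\infty} p(\theta \mid y)\, d\theta.$$
Then clearly $P(\theta \in (q_L, q_U) \mid y) = 1 - \alpha$.
This interval is relatively easy to compute and has a direct interpretation: the probability that $\theta$ lies between $q_L$ and $q_U$ is $1 - \alpha$. The frequentist interpretation is extremely convoluted.

A simple example: Normal data and normal priors

Example: Consider a single data point $y$ from a Normal distribution: $y \sim N(\theta, \sigma^2)$; assume $\sigma$ is known.
$$f(y \mid \theta) = N(y \mid \theta, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2\sigma^2}(y - \theta)^2\right)$$
Now set the prior $\theta \sim N(\mu, \tau^2)$, i.e. $p(\theta) = N(\theta \mid \mu, \tau^2)$; $\mu$, $\tau^2$ are known.
Posterior distribution of $\theta$:
$$p(\theta \mid y) \propto N(\theta \mid \mu, \tau^2) \times N(y \mid \theta, \sigma^2) = N\left(\theta \,\Big|\, \frac{\frac{1}{\tau^2}}{\frac{1}{\sigma^2} + \frac{1}{\tau^2}}\mu + \frac{\frac{1}{\sigma^2}}{\frac{1}{\sigma^2} + \frac{1}{\tau^2}} y,\; \frac{1}{\frac{1}{\sigma^2} + \frac{1}{\tau^2}}\right) = N\left(\theta \,\Big|\, \frac{\sigma^2}{\sigma^2 + \tau^2}\mu + \frac{\tau^2}{\sigma^2 + \tau^2} y,\; \frac{\sigma^2\tau^2}{\sigma^2 + \tau^2}\right).$$
Interpret: the posterior mean is a weighted mean of the prior mean and the data point. The direct estimate is shrunk towards the prior.

What if you had $n$ observations instead of one in the earlier set up? Say $y = (y_1, \ldots, y_n)'$ iid, where $y_i \sim N(\theta, \sigma^2)$.
$\bar{y}$ is a sufficient statistic for $\theta$; $\bar{y} \sim N\left(\theta, \frac{\sigma^2}{n}\right)$.
Posterior distribution of $\theta$:
$$p(\theta \mid y) \propto N(\theta \mid \mu, \tau^2) \times N\left(\bar{y} \,\Big|\, \theta, \frac{\sigma^2}{n}\right) = N\left(\theta \,\Big|\, \frac{\frac{1}{\tau^2}}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}}\mu + \frac{\frac{n}{\sigma^2}}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}} \bar{y},\; \frac{1}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}}\right) = N\left(\theta \,\Big|\, \frac{\sigma^2}{\sigma^2 + n\tau^2}\mu + \frac{n\tau^2}{\sigma^2 + n\tau^2} \bar{y},\; \frac{\sigma^2\tau^2}{\sigma^2 + n\tau^2}\right).$$

Another simple example: The Beta-Binomial model

Example: Let $Y$ be the number of successes in $n$ independent trials.
$$P(Y = y \mid \theta) = f(y \mid \theta) = \binom{n}{y} \theta^y (1 - \theta)^{n - y}$$
Prior: $p(\theta) = \mathrm{Beta}(\theta \mid a, b)$: $p(\theta) \propto \theta^{a - 1}(1 - \theta)^{b - 1}$.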
Prior mean: $\mu = a/(a + b)$; variance: $ab/((a + b)^2 (a + b + 1))$.
Posterior distribution of $\theta$: $p(\theta \mid y) = \mathrm{Beta}(\theta \mid a + y, b + n - y)$.

Sampling-based inference

In practice, we will compute the posterior distribution $p(\theta \mid y)$ by drawing samples from it. This replaces numerical integration (quadrature) by Monte Carlo integration.
One important advantage: we only need to know $p(\theta \mid y)$ up to the proportionality constant.
Suppose $\theta = (\theta_1, \theta_2)$ and we know how to sample from the marginal posterior distribution $p(\theta_2 \mid y)$ and the conditional distribution $p(\theta_1 \mid \theta_2, y)$.
How do we draw samples from the joint distribution $p(\theta_1, \theta_2 \mid y)$?

We do this in two stages using composition sampling:
First draw $\theta_2^{(j)} \sim p(\theta_2 \mid y)$, $j = 1, \ldots, M$.
Next draw $\theta_1^{(j)} \sim p\left(\theta_1 \mid \theta_2^{(j)}, y\right)$.
This sampling scheme produces exact samples, $\{\theta_1^{(j)}, \theta_2^{(j)}\}_{j=1}^{M}$, from the posterior distribution $p(\theta_1, \theta_2 \mid y)$.
Gelfand and Smith (JASA, 1990) demonstrated automatic marginalization: $\{\theta_1^{(j)}\}_{j=1}^{M}$ are samples from $p(\theta_1 \mid y)$ and (of course!) $\{\theta_2^{(j)}\}_{j=1}^{M}$ are samples from $p(\theta_2 \mid y)$.
In effect, composition sampling has performed the following "integration":
$$p(\theta_1 \mid y) = \int p(\theta_1 \mid \theta_2, y)\, p(\theta_2 \mid y)\, d\theta_2.$$
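The two-stage scheme can be sketched generically. The example below is an assumption-laden illustration: the "marginal posterior" of θ2 is taken to be a Gamma distribution with shape a = 3 and rate b = 2, and the "conditional posterior" of θ1 given θ2 is taken to be normal with mean m = 0.5 and variance 1/θ2. These stand-in distributions and all names are hypothetical; they are chosen only because both are easy to sample with the Python standard library.

```python
import random
import statistics

def composition_sample(draw_theta2, draw_theta1_given, M):
    """Draw M pairs (theta1, theta2): theta2 from its marginal, then theta1 | theta2."""
    samples = []
    for _ in range(M):
        theta2 = draw_theta2()
        theta1 = draw_theta1_given(theta2)
        samples.append((theta1, theta2))
    return samples

rng = random.Random(2014)          # fixed seed so the run is reproducible
a, b, m = 3.0, 2.0, 0.5            # hypothetical shape, rate and conditional mean

draws = composition_sample(
    lambda: rng.gammavariate(a, 1.0 / b),          # gammavariate takes a scale, so pass 1/rate
    lambda t2: rng.gauss(m, (1.0 / t2) ** 0.5),    # N(m, 1/theta2), standard deviation given
    20000,
)

# Automatic marginalization: each coordinate alone is a sample from its marginal.
theta1 = [t1 for t1, _ in draws]
theta2 = [t2 for _, t2 in draws]
print(statistics.fmean(theta1), statistics.variance(theta1), statistics.fmean(theta2))
```

Under these assumptions the marginal of θ1 has mean m and variance b/(a - 1) = 1, and θ2 has mean a/b = 1.5, so the printed summaries should land close to 0.5, 1 and 1.5. No burn-in is involved because every pair is an independent exact draw.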

Bayesian predictions

Suppose we want to predict new observations, say $\tilde{y}$, based upon the observed data $y$.
Bayesian predictions follow from the posterior predictive distribution, which averages out $\theta$ from the conditional predictive distribution with respect to the posterior:
$$p(\tilde{y} \mid y) = \int p(\tilde{y} \mid y, \theta)\, p(\theta \mid y)\, d\theta.$$
This can be evaluated using composition sampling:
First obtain $\theta^{(j)} \sim p(\theta \mid y)$, $j = 1, \ldots, M$.
For $j = 1, \ldots, M$ sample $\tilde{y}^{(j)} \sim p(\tilde{y} \mid y, \theta^{(j)})$.
The $\{\tilde{y}^{(j)}\}_{j=1}^{M}$ are samples from the posterior predictive distribution $p(\tilde{y} \mid y)$.

Some remarks on sampling-based inference

Direct Monte Carlo: Some algorithms (e.g. composition sampling) can generate independent samples exactly from the posterior distribution. In these situations there are NO convergence problems or issues. Sampling is called exact.
Markov chain Monte Carlo (MCMC): In general, exact sampling may not be possible/feasible. MCMC is a far more versatile set of algorithms that can be invoked to fit more general models.
Note: anywhere direct Monte Carlo applies, MCMC will provide excellent results too.

Convergence issues: There is no free lunch! The power of MCMC comes at a cost. The initial samples do not necessarily come from the desired posterior distribution. Rather, they need to converge to the true posterior distribution. Therefore, one needs to assess convergence, discard output before the convergence and retain only post-convergence samples. The time of convergence is called burn-in.
Diagnosing convergence: Usually a few parallel chains are run from rather different starting points. The sample values are plotted (called trace-plots) for each of the chains. The time for the chains to mix together is taken as the time for convergence.
Good news! Many modeling frameworks are automated in freely available software.
So, as users, we only need to work out how to specify good Bayesian models and implement them in the available software.

Find a wide variety of R packages dealing with Bayesian inference here: http://cran.r-project.org/web/views/bayesian.html

Here's a nice rant on "Why I love R": ac.uk/~ajrs/talks/why_i_love_r.pdf
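Posterior predictive sampling and equal-tail credible intervals can be tried out on the normal model with known variance, where the posterior is available in closed form. The sketch below is a hedged illustration: it assumes a new observation is conditionally independent of the data given θ, so that p(ỹ | y, θ) = N(θ, σ²), and the data summary (ybar = 2, n = 4) and prior settings (mu = 0, tau2 = 1, sigma2 = 1) are made-up numbers. The function names are not from the slides.

```python
import random
import statistics

def normal_posterior(ybar, n, sigma2, mu, tau2):
    """Posterior mean and variance of theta for y_i ~ N(theta, sigma2), theta ~ N(mu, tau2)."""
    denom = sigma2 + n * tau2
    mean = (sigma2 * mu + n * tau2 * ybar) / denom
    var = sigma2 * tau2 / denom
    return mean, var

def posterior_predictive_draws(ybar, n, sigma2, mu, tau2, M, rng):
    """Composition sampling: theta from its posterior, then a new y given theta."""
    mean, var = normal_posterior(ybar, n, sigma2, mu, tau2)
    draws = []
    for _ in range(M):
        theta = rng.gauss(mean, var ** 0.5)
        draws.append(rng.gauss(theta, sigma2 ** 0.5))
    return draws

def equal_tail_interval(samples, alpha=0.05):
    """Empirical alpha/2 and 1 - alpha/2 quantiles of the samples."""
    ordered = sorted(samples)
    lower = ordered[int((alpha / 2) * (len(ordered) - 1))]
    upper = ordered[int((1 - alpha / 2) * (len(ordered) - 1))]
    return lower, upper

rng = random.Random(7)
post_mean, post_var = normal_posterior(ybar=2.0, n=4, sigma2=1.0, mu=0.0, tau2=1.0)
y_new = posterior_predictive_draws(2.0, 4, 1.0, 0.0, 1.0, M=20000, rng=rng)
q_low, q_high = equal_tail_interval(y_new)
print(post_mean, post_var, statistics.fmean(y_new), statistics.variance(y_new), q_low, q_high)
```

With these numbers the posterior mean is 1.6 and the posterior variance 0.2, so the predictive draws should centre near 1.6 with variance near 0.2 + 1 = 1.2: the predictive spread combines uncertainty about θ with the sampling variance of a new observation.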

Bayesian Linear Regression

Sudipto Banerjee (1) and Andrew O. Finley (2)
(1) Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.
(2) Department of Forestry & Department of Geography, Michigan State University, Lansing, Michigan, U.S.A.
July 25, 2014

Linear regression models: a Bayesian perspective

Linear regression is, perhaps, the most widely used statistical modelling tool.
It addresses the following question: how does a quantity of primary interest, $y$, vary as (depend upon) another quantity, or set of quantities, $x$?
The quantity $y$ is called the response or outcome variable. Some people simply refer to it as the dependent variable.
The variable(s) $x$ are called explanatory variables, covariates or simply independent variables.
In general, we are interested in the conditional distribution of $y$, given $x$, parametrized as $p(y \mid \theta, x)$.

Typically, we have a set of units or experimental subjects $i = 1, 2, \ldots, n$.
For each of these units we have measured an outcome $y_i$ and a set of explanatory variables $x_i' = (1, x_{i1}, x_{i2}, \ldots, x_{ip})$.
The first element of $x_i'$ is often taken as 1 to signify the presence of an intercept.
We collect the outcome and explanatory variables into an $n \times 1$ vector and an $n \times (p + 1)$ matrix:
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}; \quad X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix} = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{pmatrix}.$$

The linear model is the most fundamental of all serious statistical models, underpinning:
ANOVA: $y_i$ is continuous, $x_{ij}$'s are all categorical.
REGRESSION: $y_i$ is continuous, $x_{ij}$'s are continuous.
ANCOVA: $y_i$ is continuous, $x_{ij}$'s are continuous for some $j$ and categorical for others.
Linear regression models: a Bayesian perspective

The Bayesian or hierarchical linear model is given by:
$$y_i \mid \mu_i, \sigma^2, X \overset{ind}{\sim} N(\mu_i, \sigma^2); \quad i = 1, 2, \ldots, n;$$
$$\mu_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} = x_i'\beta; \quad \beta = (\beta_0, \beta_1, \ldots, \beta_p)';$$
$$\beta, \sigma^2 \mid X \sim p(\beta, \sigma^2 \mid X).$$
Unknown parameters include the regression parameters and the variance, i.e. $\theta = \{\beta, \sigma^2\}$.
$p(\beta, \sigma^2 \mid X) \equiv p(\theta \mid X)$ is the joint prior on the parameters.
We assume $X$ is observed without error and all inference is conditional on $X$. We suppress dependence on $X$ in subsequent notation.
Specifying $p(\beta, \sigma^2)$ completes the hierarchical model. All inference proceeds from $p(\beta, \sigma^2 \mid y)$.

Bayesian regression with flat priors

With no prior information, we specify
$$p(\beta, \sigma^2) \propto \frac{1}{\sigma^2},$$
or equivalently $p(\beta) \propto 1$; $p(\log(\sigma^2)) \propto 1$.
The above are NOT probability densities (they do not integrate to any finite number). So why are we even discussing them?
Even if the priors are improper, as long as the resulting posterior distributions are valid we can still conduct legitimate statistical inference on them.
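Under the flat prior, the joint posterior follows by multiplying the prior by the normal likelihood. The display below sketches that step together with the standard factorization it leads to. The factorization is a textbook result that is not stated in this excerpt; it assumes $X$ has full column rank and $n > p + 1$, uses the shape-scale form of the inverse-gamma distribution $IG(a, b)$ with density proportional to $x^{-(a+1)} e^{-b/x}$, and the symbols $\hat{\beta}$ and $s^2$ are introduced here for illustration.

```latex
\begin{align*}
p(\beta, \sigma^2 \mid y)
  &\propto \frac{1}{\sigma^2} \times (\sigma^2)^{-n/2}
     \exp\left\{-\frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)\right\}, \\
\beta \mid \sigma^2, y
  &\sim N\left(\hat{\beta},\; \sigma^2 (X'X)^{-1}\right),
  \qquad \hat{\beta} = (X'X)^{-1}X'y, \\
\sigma^2 \mid y
  &\sim IG\left(\frac{n - p - 1}{2},\; \frac{(n - p - 1)\,s^2}{2}\right),
  \qquad s^2 = \frac{(y - X\hat{\beta})'(y - X\hat{\beta})}{n - p - 1}.
\end{align*}
```

Drawing $\sigma^2$ from the last line and then $\beta$ from the line above it is an instance of composition sampling, so samples obtained this way are exact draws from $p(\beta, \sigma^2 \mid y)$ even though the prior is improper.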


More information

Bayesian data analysis in practice: Three simple examples

Bayesian data analysis in practice: Three simple examples Bayesian data analysis in practice: Three simple examples Martin P. Tingley Introduction These notes cover three examples I presented at Climatea on 5 October 0. Matlab code is available by request to

More information

Handbook of Spatial Statistics Chapter 2: Continuous Parameter Stochastic Process Theory by Gneiting and Guttorp

Handbook of Spatial Statistics Chapter 2: Continuous Parameter Stochastic Process Theory by Gneiting and Guttorp Handbook of Spatial Statistics Chapter 2: Continuous Parameter Stochastic Process Theory by Gneiting and Guttorp Marcela Alfaro Córdoba August 25, 2016 NCSU Department of Statistics Continuous Parameter

More information

Gaussian Processes 1. Schedule

Gaussian Processes 1. Schedule 1 Schedule 17 Jan: Gaussian processes (Jo Eidsvik) 24 Jan: Hands-on project on Gaussian processes (Team effort, work in groups) 31 Jan: Latent Gaussian models and INLA (Jo Eidsvik) 7 Feb: Hands-on project

More information

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Environmentrics 00, 1 12 DOI: 10.1002/env.XXXX Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Regina Wu a and Cari G. Kaufman a Summary: Fitting a Bayesian model to spatial

More information

Journal of Statistical Software

Journal of Statistical Software JSS Journal of Statistical Software April 2007, Volume 19, Issue 4. http://www.jstatsoft.org/ spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models Andrew O.

More information

Chapter 4 - Fundamentals of spatial processes Lecture notes

Chapter 4 - Fundamentals of spatial processes Lecture notes TK4150 - Intro 1 Chapter 4 - Fundamentals of spatial processes Lecture notes Odd Kolbjørnsen and Geir Storvik January 30, 2017 STK4150 - Intro 2 Spatial processes Typically correlation between nearby sites

More information

Practicum : Spatial Regression

Practicum : Spatial Regression : Alexandra M. Schmidt Instituto de Matemática UFRJ - www.dme.ufrj.br/ alex 2014 Búzios, RJ, www.dme.ufrj.br Exploratory (Spatial) Data Analysis 1. Non-spatial summaries Numerical summaries: Mean, median,

More information

Introduction to Bayes and non-bayes spatial statistics

Introduction to Bayes and non-bayes spatial statistics Introduction to Bayes and non-bayes spatial statistics Gabriel Huerta Department of Mathematics and Statistics University of New Mexico http://www.stat.unm.edu/ ghuerta/georcode.txt General Concepts Spatial

More information

A Note on the comparison of Nearest Neighbor Gaussian Process (NNGP) based models

A Note on the comparison of Nearest Neighbor Gaussian Process (NNGP) based models A Note on the comparison of Nearest Neighbor Gaussian Process (NNGP) based models arxiv:1811.03735v1 [math.st] 9 Nov 2018 Lu Zhang UCLA Department of Biostatistics Lu.Zhang@ucla.edu Sudipto Banerjee UCLA

More information



More information

Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields

Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields 1 Introduction Jo Eidsvik Department of Mathematical Sciences, NTNU, Norway. (joeid@math.ntnu.no) February

More information

A short introduction to INLA and R-INLA

A short introduction to INLA and R-INLA A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk

More information

Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University

Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University this presentation derived from that presented at the Pan-American Advanced

More information

Introduction. Spatial Processes & Spatial Patterns

Introduction. Spatial Processes & Spatial Patterns Introduction Spatial data: set of geo-referenced attribute measurements: each measurement is associated with a location (point) or an entity (area/region/object) in geographical (or other) space; the domain

More information

Geostatistical Modeling for Large Data Sets: Low-rank methods

Geostatistical Modeling for Large Data Sets: Low-rank methods Geostatistical Modeling for Large Data Sets: Low-rank methods Whitney Huang, Kelly-Ann Dixon Hamil, and Zizhuang Wu Department of Statistics Purdue University February 22, 2016 Outline Motivation Low-rank

More information

Statistícal Methods for Spatial Data Analysis

Statistícal Methods for Spatial Data Analysis Texts in Statistícal Science Statistícal Methods for Spatial Data Analysis V- Oliver Schabenberger Carol A. Gotway PCT CHAPMAN & K Contents Preface xv 1 Introduction 1 1.1 The Need for Spatial Analysis

More information

Hierarchical Modeling for Spatial Data

Hierarchical Modeling for Spatial Data Bayesian Spatial Modelling Spatial model specifications: P(y X, θ). Prior specifications: P(θ). Posterior inference of model parameters: P(θ y). Predictions at new locations: P(y 0 y). Model comparisons.

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information



More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

The Bayesian approach to inverse problems

The Bayesian approach to inverse problems The Bayesian approach to inverse problems Youssef Marzouk Department of Aeronautics and Astronautics Center for Computational Engineering Massachusetts Institute of Technology ymarz@mit.edu, http://uqgroup.mit.edu

More information

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis Summarizing a posterior Given the data and prior the posterior is determined Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests Most of these computations are integrals

More information

The linear model is the most fundamental of all serious statistical models encompassing:

The linear model is the most fundamental of all serious statistical models encompassing: Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x

More information

Part 8: GLMs and Hierarchical LMs and GLMs

Part 8: GLMs and Hierarchical LMs and GLMs Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

Analysing geoadditive regression data: a mixed model approach

Analysing geoadditive regression data: a mixed model approach Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

More information

Bayesian linear regression

Bayesian linear regression Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding

More information

Technical Vignette 5: Understanding intrinsic Gaussian Markov random field spatial models, including intrinsic conditional autoregressive models

Technical Vignette 5: Understanding intrinsic Gaussian Markov random field spatial models, including intrinsic conditional autoregressive models Technical Vignette 5: Understanding intrinsic Gaussian Markov random field spatial models, including intrinsic conditional autoregressive models Christopher Paciorek, Department of Statistics, University

More information

An Additive Gaussian Process Approximation for Large Spatio-Temporal Data

An Additive Gaussian Process Approximation for Large Spatio-Temporal Data An Additive Gaussian Process Approximation for Large Spatio-Temporal Data arxiv:1801.00319v2 [stat.me] 31 Oct 2018 Pulong Ma Statistical and Applied Mathematical Sciences Institute and Duke University

More information

Extreme Value Analysis and Spatial Extremes

Extreme Value Analysis and Spatial Extremes Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models

More information

Nonparametric Bayesian Methods (Gaussian Processes)

Nonparametric Bayesian Methods (Gaussian Processes) [70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent

More information

Multivariate Gaussian Random Fields with SPDEs

Multivariate Gaussian Random Fields with SPDEs Multivariate Gaussian Random Fields with SPDEs Xiangping Hu Daniel Simpson, Finn Lindgren and Håvard Rue Department of Mathematics, University of Oslo PASI, 214 Outline The Matérn covariance function and

More information

Spatial Statistics with Image Analysis. Lecture L02. Computer exercise 0 Daily Temperature. Lecture 2. Johan Lindström.

Spatial Statistics with Image Analysis. Lecture L02. Computer exercise 0 Daily Temperature. Lecture 2. Johan Lindström. C Stochastic fields Covariance Spatial Statistics with Image Analysis Lecture 2 Johan Lindström November 4, 26 Lecture L2 Johan Lindström - johanl@maths.lth.se FMSN2/MASM2 L /2 C Stochastic fields Covariance

More information

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January

More information

Overview of Spatial Statistics with Applications to fmri

Overview of Spatial Statistics with Applications to fmri with Applications to fmri School of Mathematics & Statistics Newcastle University April 8 th, 2016 Outline Why spatial statistics? Basic results Nonstationary models Inference for large data sets An example

More information

Fusing point and areal level space-time data. data with application to wet deposition

Fusing point and areal level space-time data. data with application to wet deposition Fusing point and areal level space-time data with application to wet deposition Alan Gelfand Duke University Joint work with Sujit Sahu and David Holland Chemical Deposition Combustion of fossil fuel produces

More information

Summary STK 4150/9150

Summary STK 4150/9150 STK4150 - Intro 1 Summary STK 4150/9150 Odd Kolbjørnsen May 22 2017 Scope You are expected to know and be able to use basic concepts introduced in the book. You knowledge is expected to be larger than

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Empirical Bayes methods for the transformed Gaussian random field model with additive measurement errors

Empirical Bayes methods for the transformed Gaussian random field model with additive measurement errors 1 Empirical Bayes methods for the transformed Gaussian random field model with additive measurement errors Vivekananda Roy Evangelos Evangelou Zhengyuan Zhu CONTENTS 1.1 Introduction......................................................

More information

A Divide-and-Conquer Bayesian Approach to Large-Scale Kriging

A Divide-and-Conquer Bayesian Approach to Large-Scale Kriging A Divide-and-Conquer Bayesian Approach to Large-Scale Kriging Cheng Li DSAP, National University of Singapore Joint work with Rajarshi Guhaniyogi (UC Santa Cruz), Terrance D. Savitsky (US Bureau of Labor

More information

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP The IsoMAP uses the multiple linear regression and geostatistical methods to analyze isotope data Suppose the response variable

More information

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model UNIVERSITY OF TEXAS AT SAN ANTONIO Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model Liang Jing April 2010 1 1 ABSTRACT In this paper, common MCMC algorithms are introduced

More information

Nonparametric Bayes Uncertainty Quantification

Nonparametric Bayes Uncertainty Quantification Nonparametric Bayes Uncertainty Quantification David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & ONR Review of Bayes Intro to Nonparametric Bayes

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Advanced Statistics and Data Mining Summer School

More information