Bayesian Modeling and Inference for High-Dimensional Spatiotemporal Datasets
1 Bayesian Modeling and Inference for High-Dimensional Spatiotemporal Datasets. Sudipto Banerjee, University of California, Los Angeles, USA
2 Based upon projects involving:
- Abhirup Datta (Johns Hopkins University)
- Andrew O. Finley (Michigan State University)
- Nicholas A.S. Hamm (University of Twente)
- Martijn Schaap (TNO Built Environment and Geosciences)
3 Example 1: U.S. forest biomass data
Figure: Observed biomass (left) and NDVI (right).
- Forest biomass data collected over 114,371 plots.
- Normalized Difference Vegetation Index (NDVI) is a measure of greenness.
- Forest biomass regression model: $\text{Biomass}(l) = \beta_0(l) + \beta_1(l)\,\text{NDVI}(l) + \text{error}$
4 Example 2: European Particulate Matter (PM$_{10}$) data
Figure: (a) PM$_{10}$ levels in March 2009; (b) PM$_{10}$ levels in June 2009. Axes: Easting (km), Northing (km).
- Significant variation across space and time.
- Daily observations at 308 stations for 2 years, i.e., $n = 308 \times 730 = 224{,}840$.
5 Example 2: European PM$_{10}$ data
- Computer models such as the Chemistry Transport Model (CTM) consistently underestimate PM$_{10}$ levels.
- CTM outputs are used as covariates to improve fits: $\log \text{PM}_{10}(l) = \beta_0(l) + \beta_1(l)\,\text{CTM}(l) + \epsilon(l)$
6 Example 3: Tanana Valley (Alaska) forest canopy height analysis
Figure: Tanana Valley, Alaska, study region. (a) G-LiHT flight lines where canopy height was measured at locations over the percent forest canopy covariate. (b) Occurrence of forest fire in the past 20 years and areas of interest for prediction illustration.
7 Spatiotemporal regression models
- $Y(l) = \beta_0(l) + X(l)\beta_1(l) + e(l)$
- Produce maps for intercept and slope: $\{\beta_0(l) : l \in \mathcal{L}\}$ and $\{\beta_1(l) : l \in \mathcal{L}\}$.
- $\mathcal{L}$ is a spatial domain (e.g., $\mathcal{D} \subseteq \mathbb{R}^d$) or a spatiotemporal domain (e.g., $\mathcal{D} \subseteq \mathbb{R}^d \times \mathbb{R}^+$).
- Potentially very rich: understand the spatially and/or temporally varying impact of predictors on the outcome.
- Model-based predictions: $Y(l_0) \mid \{y(l_1), y(l_2), \ldots, y(l_n)\}$.
8 Gaussian spatiotemporal process
- $\{w(l) : l \in \mathcal{L}\} \sim GP(0, K_\theta(\cdot,\cdot))$ implies $w = (w(l_1), w(l_2), \ldots, w(l_n))^\top \sim N(0, K_\theta)$ for every finite set of points $l_1, l_2, \ldots, l_n$.
- $K_\theta = \{K_\theta(l_i, l_j)\}$ is a spatial variance-covariance matrix.
- Stationary: $K_\theta(l, l') = K_\theta(l - l')$. Isotropy: $K_\theta(l, l') = K_\theta(\|l - l'\|)$.
- With nugget (esp. when modeling data): $K_\theta = C_{(\sigma,\phi)} + D_\tau$, where $\theta = \{\sigma, \phi, \tau\}$.
- No nugget (esp. when modeling random effects): $K_\theta = C_{(\sigma,\phi)}$, where $\theta = \{\sigma, \phi\}$.
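9 Likelihood from (full rank) GP models
- $\mathcal{L} = \{l_1, l_2, \ldots, l_n\}$ are the locations where data are observed.
- $y(l_i)$ is the outcome at the $i$th location, $y = (y(l_1), y(l_2), \ldots, y(l_n))^\top$.
- Model: $y \sim N(X\beta, K_\theta)$.
- Estimating process parameters from the log-likelihood (up to an additive constant): $-\frac{1}{2}\log\det(K_\theta) - \frac{1}{2}(y - X\beta)^\top K_\theta^{-1}(y - X\beta)$
- Bayesian inference: priors on $\{\beta, \theta\}$.
- Challenges: storage ($O(n^2)$ for $K_\theta$) and computing $\mathrm{chol}(K_\theta) = LDL^\top$ ($O(n^3)$ flops).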
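To make the cost of the full-rank likelihood concrete, here is a minimal R sketch of the log-likelihood evaluated through a Cholesky factor. The helper names `exp_cov()` and `gp_loglik()`, the exponential covariance, and all parameter values are assumptions for illustration; they are not taken from the talk.

```r
# Sketch: Gaussian log-likelihood for y ~ N(X beta, K) via a Cholesky factor.
# exp_cov() and gp_loglik() are hypothetical helpers, not from the talk.

# Assumed covariance: sigma2 * exp(-phi * distance) plus a nugget tau2 on the diagonal
exp_cov <- function(coords, sigma2, phi, tau2) {
  d <- as.matrix(dist(coords))            # n x n Euclidean distances: O(n^2) storage
  sigma2 * exp(-phi * d) + tau2 * diag(nrow(coords))
}

gp_loglik <- function(y, X, beta, K) {
  U <- chol(K)                            # upper triangular, K = t(U) %*% U: O(n^3) flops
  r <- y - X %*% beta                     # residual vector
  z <- backsolve(U, r, transpose = TRUE)  # solves t(U) z = r
  # log det(K) = 2 * sum(log(diag(U))), and r' K^{-1} r = sum(z^2)
  -0.5 * length(y) * log(2 * pi) - sum(log(diag(U))) - 0.5 * sum(z^2)
}

set.seed(1)
n <- 50
coords <- cbind(runif(n), runif(n))       # made-up locations in the unit square
X <- cbind(1, rnorm(n))
beta <- c(1, 2)
K <- exp_cov(coords, sigma2 = 1, phi = 3, tau2 = 0.1)
y <- as.vector(X %*% beta + t(chol(K)) %*% rnorm(n))   # one draw from the model
ll <- gp_loglik(y, X, beta, K)
```

Both the distance matrix and the factorization grow with n, which is the storage and computation bottleneck the slide refers to.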
10 Burgeoning literature on spatial big data
- Low-rank models (Wahba, 1990; Higdon, 2002; Kammann & Wand, 2003; Paciorek, 2007; Rasmussen & Williams, 2006; Stein, 2007, 2008; Cressie & Johannesson, 2008; Banerjee et al., 2008, 2010; Gramacy & Lee, 2008; Sang et al., 2011, 2012; Lemos et al., 2011; Guhaniyogi et al., 2011, 2013; Salazar et al., 2013; Katzfuss, 2016)
- Spectral approximations and composite likelihoods (Fuentes, 2007; Paciorek, 2007; Eidsvik et al., 2016)
- Multi-resolution approaches (Nychka, 2002; Johannesson et al., 2007; Matsuo et al., 2010; Tzeng & Huang, 2015; Katzfuss, 2016)
- Sparsity (solve $Ax = b$ by (i) sparse $A$, or (ii) sparse $A^{-1}$):
  1. Covariance tapering (Furrer et al., 2006; Du et al., 2009; Kaufman et al., 2009; Shaby and Ruppert, 2013)
  2. GMRFs to GPs: INLA (Rue et al., 2009; Lindgren et al., 2011)
  3. LAGP (Gramacy et al., 2014; Gramacy and Apley, 2015)
  4. Nearest-neighbor models (Vecchia, 1988; Stein et al., 2004; Stroud et al., 2014; Datta et al., 2016)
11 Reduced (low) rank models
- $K_\theta \approx B_\theta K^{*}_\theta B_\theta^\top + D_\theta$
- $B_\theta$ is an $n \times r$ matrix of spatial basis functions, $r \ll n$.
- $K^{*}_\theta$ is an $r \times r$ spatial covariance matrix.
- $D_\theta$ is either diagonal or sparse.
- Examples: kernel projections, splines, predictive process, FRK, spectral basis, ...
- Computations exploit the above structure: roughly $O(nr^2) \ll O(n^3)$ flops.
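The $O(nr^2)$ figure can be traced to two standard matrix identities (Sherman-Morrison-Woodbury and the matrix determinant lemma). The statement below assumes that $D$ and the $r \times r$ matrix, written $K^{*}$ here, are invertible; subscripts $\theta$ are dropped for readability.

$$
\begin{aligned}
\left(D + B K^{*} B^\top\right)^{-1} &= D^{-1} - D^{-1} B \left(K^{*-1} + B^\top D^{-1} B\right)^{-1} B^\top D^{-1}, \\
\det\left(D + B K^{*} B^\top\right) &= \det\left(K^{*-1} + B^\top D^{-1} B\right)\,\det\left(K^{*}\right)\,\det(D).
\end{aligned}
$$

With $D$ diagonal, forming $B^\top D^{-1} B$ costs $O(nr^2)$ flops and the only dense factorizations are of $r \times r$ matrices, so no $n \times n$ matrix is ever factorized.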
12 Oversmoothing due to reduced-rank models
Figure: Comparing full GP vs. low-rank GP with 2500 locations. Panels: (a) True w; (b) Full GP; (c) PPGP, 64 knots. Panel (c) exhibits oversmoothing by a low-rank process (predictive process with 64 knots).
13 Simple method of introducing sparsity (e.g. graphical models)
Full dependency graph:
$p(y_1)\,p(y_2 \mid y_1)\,p(y_3 \mid y_1, y_2)\,p(y_4 \mid y_1, y_2, y_3)\,p(y_5 \mid y_1, y_2, y_3, y_4)\,p(y_6 \mid y_1, y_2, \ldots, y_5)\,p(y_7 \mid y_1, y_2, \ldots, y_6)$
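14 Simple method of introducing sparsity (e.g. graphical models)
3-nearest-neighbor dependency graph: each conditioning set is reduced to at most 3 of the preceding variables, so the first four factors are unchanged:
$p(y_1)\,p(y_2 \mid y_1)\,p(y_3 \mid y_1, y_2)\,p(y_4 \mid y_1, y_2, y_3)\,p(y_5 \mid y_{N[5]})\,p(y_6 \mid y_{N[6]})\,p(y_7 \mid y_{N[7]})$,
where $N[i] \subset \{1, \ldots, i-1\}$ indexes the 3 nearest neighbors of node $i$ among the earlier nodes.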
15 Gaussian graphical models: linearity
Write a joint density $p(w) = p(w_1, w_2, \ldots, w_n)$ as:
$p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_1, w_2) \cdots p(w_n \mid w_1, w_2, \ldots, w_{n-1})$
Example: for the Gaussian distribution $N(w \mid 0, K_\theta)$, we have a linear model
$w_1 = 0 + \eta_1$;
$w_2 = a_{21} w_1 + \eta_2$;
$w_3 = a_{31} w_1 + a_{32} w_2 + \eta_3$;
$w_i = a_{i1} w_1 + a_{i2} w_2 + \cdots + a_{i,i-1} w_{i-1} + \eta_i$, $i = 4, \ldots, n$.
More compactly: $w = Aw + \eta$; $\eta \sim N(0, D)$.
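16 Simple method of introducing sparsity (e.g. graphical models)
- For the Gaussian distribution $N(w \mid 0, K_\theta)$: $K_\theta = (I - A)^{-1} D (I - A)^{-\top}$, with $D = \mathrm{diag}(\mathrm{var}\{w_i \mid w_{\{j<i\}}\})$.
- If $L$ is from $\mathrm{chol}(K_\theta) = LDL^\top$, then $L^{-1} = I - A$.
- The $a_{ij}$'s are obtained from $n - 1$ linear systems implied by $\sum_{j<i} a_{ij} w_j = E[w_i \mid w_{\{j<i\}}]$, $i = 2, \ldots, n$.
- Example (R, with K the n x n covariance matrix and a initialized to an n x n zero matrix):
  for (i in 2:n) { a[i, 1:(i-1)] = solve(K[1:(i-1), 1:(i-1)], K[1:(i-1), i]) }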
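The identities on this slide can be checked numerically on a small example. The R sketch below uses a made-up 6-point configuration and an assumed exponential covariance; it recovers A and D by sequential regressions, rebuilds the covariance, and compares I - A with the inverse of the unit lower-triangular Cholesky factor.

```r
# Sketch: recover A and D from K by sequential regressions, then check
# K = (I - A)^{-1} D (I - A)^{-T} and L^{-1} = I - A. All inputs are made up.
set.seed(42)
n <- 6
coords <- cbind(runif(n), runif(n))          # hypothetical locations
K <- exp(-2 * as.matrix(dist(coords)))       # assumed exponential covariance, no nugget

A <- matrix(0, n, n)                         # strictly lower triangular
D <- numeric(n)
D[1] <- K[1, 1]                              # var(w_1)
for (i in 2:n) {
  past <- 1:(i - 1)
  a <- solve(K[past, past, drop = FALSE], K[past, i])  # regress w_i on w_1, ..., w_{i-1}
  A[i, past] <- a
  D[i] <- K[i, i] - sum(K[i, past] * a)      # conditional variance var(w_i | past)
}

IA <- diag(n) - A
K_rebuilt <- solve(IA) %*% diag(D) %*% t(solve(IA))
err_K <- max(abs(K_rebuilt - K))             # should be at rounding-error level

# Unit lower-triangular factor from R's chol(): K = L_unit diag(d) t(L_unit)
U <- chol(K)                                 # K = t(U) %*% U
L_unit <- t(U) %*% diag(1 / diag(U))         # rescale columns so the diagonal is 1
err_L <- max(abs(solve(L_unit) - IA))        # should be at rounding-error level
```

The loop solves systems whose size grows with i, which is why the dense version offers no savings over a direct Cholesky factorization.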
17 Let $a_{ij} = 0$ for all but the $m$ nearest neighbors of node $i$. This implies solving
$\sum_{j \in N[i]} a_{ij} w_j = E[w_i \mid w_{\{j \in N[i]\}}]$, $i = 2, \ldots, n$,
where $N[i] = \{j < i : j \sim i\}$ are the indices of the neighbors of $i$ from its past.
- Example (R, with N a list whose i-th element holds the neighbor indices of node i):
  for (i in 2:n) { a[i, N[[i]]] = solve(K[N[[i]], N[[i]]], K[N[[i]], i]) }
- We need to solve $n - 1$ linear systems of size at most $m \times m$.
- We effectively model a (sparse) Cholesky factor instead of computing it.
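A minimal R sketch of this construction follows. The helper `nn_factor()`, the use of Euclidean distance to pick neighbors, the exponential covariance, and the values of n and m are all assumptions for illustration. Dense matrices are used only to keep the example short; a practical implementation would store just the neighbor indices and the small m x m pieces.

```r
# Sketch: sparse A and D for a nearest-neighbor approximation.
# nn_factor() is a hypothetical helper, not from the talk.
nn_factor <- function(coords, K, m) {
  n <- nrow(coords)
  dmat <- as.matrix(dist(coords))
  A <- matrix(0, n, n)
  D <- numeric(n)
  D[1] <- K[1, 1]
  for (i in 2:n) {
    past <- 1:(i - 1)
    # at most m nearest points among those ordered before i
    Ni <- past[order(dmat[i, past])][seq_len(min(m, i - 1))]
    a <- solve(K[Ni, Ni, drop = FALSE], K[Ni, i])   # system of size at most m x m
    A[i, Ni] <- a
    D[i] <- K[i, i] - sum(K[i, Ni] * a)             # var(w_i | w_N[i])
  }
  list(A = A, D = D)
}

set.seed(7)
n <- 400; m <- 5
coords <- cbind(runif(n), runif(n))                 # made-up locations
K <- exp(-3 * as.matrix(dist(coords)))              # assumed exponential covariance
f <- nn_factor(coords, K, m)

IA <- diag(n) - f$A
Q <- t(IA) %*% diag(1 / f$D) %*% IA                 # approximate precision matrix
frac_nonzero <- mean(Q != 0)                        # share of non-zero entries in Q
logdet_Q <- sum(log(1 / f$D))                       # log-determinant from D alone
```

Each row of A has at most m non-zero entries, so the log-determinant and quadratic forms needed for the likelihood come from n small solves rather than one n x n factorization.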
18-21 Sparse precision matrices
- $N(w_{\mathcal{R}} \mid 0, K_\theta) \approx N(w_{\mathcal{R}} \mid 0, \tilde{K}_\theta)$; $\tilde{K}_\theta^{-1} = (I - A)^\top D^{-1} (I - A)$
- Figure panels: (a) $I - A$; (b) $D^{-1}$; (c) $\tilde{K}_\theta^{-1}$.
- $\det(\tilde{K}_\theta^{-1}) = \prod_{i=1}^{n} D_{ii}^{-1}$; $\tilde{K}_\theta^{-1}$ is sparse with $O(nm^2)$ entries.
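Substituting the factorized precision matrix into the Gaussian density gives a log-density that is a sum of $n$ univariate terms. The expression below is a direct consequence of the two identities on this slide, written with the neighbor sets $N[i]$ and coefficients $a_{ij}$ of the sparse linear model, and taking $N[1]$ to be empty:

$$
\log N(w \mid 0, \tilde{K}_\theta) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^{n} \log D_{ii} - \frac{1}{2}\sum_{i=1}^{n} \frac{\left(w_i - \sum_{j \in N[i]} a_{ij} w_j\right)^2}{D_{ii}}.
$$

Each term involves at most $m$ neighbors, so the density is evaluated without forming any $n \times n$ matrix.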
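22 Sparse likelihood approximations (Vecchia, 1988; Stein et al., 2004)
- Let $\mathcal{R} = \{l_1, l_2, \ldots, l_r\}$.
- With $w(l) \sim GP(0, K_\theta(\cdot,\cdot))$, write the joint density $p(w_{\mathcal{R}})$ as:
  $N(w_{\mathcal{R}} \mid 0, K_\theta) = \prod_{i=1}^{r} p(w(l_i) \mid w_{H(l_i)}) \approx \prod_{i=1}^{r} p(w(l_i) \mid w_{N(l_i)}) = N(w_{\mathcal{R}} \mid 0, \tilde{K}_\theta)$,
  where $H(l_i) = \{l_1, \ldots, l_{i-1}\}$ is the history of $l_i$ and $N(l_i) \subseteq H(l_i)$.
- Shrinkage: choose $N(l_i)$ as the set of $m$ nearest neighbors among $H(l_i)$.
- Theory: screening effect of kriging.
- $\tilde{K}_\theta^{-1}$ depends on $K_\theta$, but is sparser, with at most $nm^2$ non-zero entries.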
23 Extension to a GP (Datta et al., JASA, 2016)
- Fix a reference set $\mathcal{R} = \{l_1, l_2, \ldots, l_r\}$ (e.g. observed points).
- $N(l_i)$ is the set of at most $m$ nearest neighbors of $l_i$ among $\{l_1, l_2, \ldots, l_{i-1}\}$.
- First piece: model $w_{\mathcal{R}} \sim N(0, \tilde{K}_\theta)$ ("Vecchia prior").
- Second piece: if $l \notin \mathcal{R}$, then $N(l)$ is the set of $m$ nearest neighbors of $l$ in $\mathcal{R}$.
- Third piece: $w(l) = \sum_{i=1}^{r} a_i(l)\,w(l_i) + \eta(l)$ with $a_i(l) = 0$ if $l_i \notin N(l)$.
- The non-zero $a_i(l)$'s are obtained by solving an $m \times m$ system: $E[w(l) \mid w_{N(l)}] = \sum_{i : l_i \in N(l)} a_i(l)\,w(l_i)$
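The second and third pieces amount to local kriging from the m nearest reference points. The R sketch below illustrates this for a single new location; `cov_fn()` and `nn_predict()` are hypothetical helpers, and the exponential covariance, its parameters, and the simulated reference values are assumptions for illustration.

```r
# Sketch: interpolate w at a new location l0 from its m nearest reference points.
# cov_fn() and nn_predict() are hypothetical helpers, not from the talk.
cov_fn <- function(d, sigma2 = 1, phi = 3) sigma2 * exp(-phi * d)

nn_predict <- function(l0, ref_coords, w_ref, m) {
  d0 <- sqrt(rowSums(sweep(ref_coords, 2, l0)^2))   # distances from l0 to the reference set
  N0 <- order(d0)[seq_len(m)]                       # indices of the m nearest neighbors
  K_NN <- cov_fn(as.matrix(dist(ref_coords[N0, , drop = FALSE])))
  k_0N <- cov_fn(d0[N0])
  a <- solve(K_NN, k_0N)                            # non-zero weights a_i(l0): m x m system
  list(mean = sum(a * w_ref[N0]),                   # E[w(l0) | w_N(l0)]
       var = cov_fn(0) - sum(k_0N * a),             # variance of eta(l0)
       neighbors = N0)
}

set.seed(3)
r <- 300
ref_coords <- cbind(runif(r), runif(r))             # made-up reference set
K_ref <- cov_fn(as.matrix(dist(ref_coords)))
w_ref <- as.vector(t(chol(K_ref)) %*% rnorm(r))     # simulated values at the reference points
pred <- nn_predict(c(0.5, 0.5), ref_coords, w_ref, m = 10)
```

Only the m x m block for the selected neighbors is ever formed at prediction time, so the cost per new location does not depend on the size of the reference set beyond the neighbor search.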
24 Neighbors in space and time
- No universal definition of distance in a space-time domain.
- Use $K_\theta(\cdot,\cdot)$ as a proxy for distance.
- Datta et al. (2016, AoAS): efficient algorithm requiring $O(4nm)$ flops to do this.
25 Example 1: Hierarchical NNGP model
- Start with a desired full GP specification: $GP(0, K_\theta(\cdot,\cdot))$.
- Derive the NNGP: $NNGP(0, \tilde{K}_\theta(\cdot,\cdot))$.
- $Y(l) \overset{ind}{\sim} P_\theta$ (exponential family); $g(E[Y(l)]) = \beta_0(l) + X(l)\beta_1(l)$
- $(\beta_0(l), \beta_1(l)) \sim NNGP(\beta_0 + X(l)\beta_1, \tilde{K}_\theta(\cdot,\cdot))$
- $(\beta_0, \beta_1) \sim N(0, V_\beta)$; $\theta \sim p(\theta)$
- Posterior predictive inference for $\beta_0(l_0)$, $\beta_1(l_0)$ and $Y(l_0)$.
26 Example 2: Hierarchical NNGP model
- Start with a desired full GP specification for $Y(l)$: $Y(l) \sim GP(x^\top(l)\beta, K_\theta(\cdot,\cdot))$.
- Derive the NNGP: $Y(l) \sim NNGP(x^\top(l)\beta, \tilde{K}_\theta(\cdot,\cdot))$.
- $Y \sim N(X\beta, \tilde{K}_\theta)$; $\beta \sim N(0, V_\beta)$; $\theta \sim p(\theta)$
- No need for Cholesky: it is modeled.
- Easy posterior predictive inference for $Y(l_0)$ at a new $l_0$.
- But no latent spatial-temporal process.
27 Figure panels: (a) True w; (b) Full GP; (c) PPGP, 64 knots; (d) NNGP, m = 10; (e) NNGP, m = 20
28 Figure: Choice of m in NNGP models: out-of-sample Root Mean Squared Prediction Error (RMSPE) and mean width between the upper and lower 95% posterior predictive credible intervals for a range of m for the univariate synthetic data analysis. Plotted series: NNGP RMSPE, NNGP mean 95% CI width, Full GP RMSPE, Full GP mean 95% CI width; horizontal axis: m.
29 Back to European PM$_{10}$ data
Figure: (a) PM$_{10}$ levels in March 2009; (b) PM$_{10}$ levels in June 2009. Axes: Easting (km), Northing (km).
- Interest in estimating short- and long-term temporal (and spatial) decay (to improve the CTMs).
- $\log \text{PM}_{10}(s, t) = \beta_0 + \beta_1\,\text{CTM}(s, t) + w(s, t) + \epsilon(s, t)$
- $w(s, t) \sim DNNGP(0, \tilde{K}_\theta(\cdot,\cdot))$
30 European PM$_{10}$ Dataset
- Significantly improved fit: RMSPE compared for OLS and DNNGP.
- Total time: 24 hrs.
31 European PM$_{10}$ Dataset
Figure: (a) PM$_{10}$ map, legend bins from [0,20) to [100,120] plus Missing; (b) Pr(PM$_{10}$ > 50 µg m$^{-3}$), legend bins from [0,0.1) to [0.9,1]. Axes: Easting (km), Northing (km).
32 European PM$_{10}$ Dataset
Figure: (a) PM$_{10}$ map, legend bins from [0,10) to [80,90] plus Missing; (b) Pr(PM$_{10}$ > 50 µg m$^{-3}$), legend bins from [0,0.1) to [0.9,1]. Axes: Easting (km), Northing (km).
33 Concluding remarks: Storage and computation
- Algorithms: Gibbs, RWM, HMC, VB, INLA; NNGP/HMC especially promising.
- Model-based solution for spatial BIG DATA.
- Never needs to store an $n \times n$ distance matrix; stores $n$ small $m \times m$ matrices.
- Total flop count per iteration is $O(nm^3)$, i.e., linear in $n$.
- Scalable to massive datasets because $m$ is small: you never need more than a few neighbors.
- Compare with reduced-rank models: $O(nm^3) \ll O(nr^2)$.
- New R package spNNGP (on CRAN soon).
34 Concluding remarks: Comparisons
- Are low-rank spatial models well and truly beaten?
  - Certainly do not seem to scale as nicely as NNGP.
  - Have somewhat greater theoretical tractability (e.g. Bayesian asymptotics).
  - Can be used to flexibly model smoothness.
  - Can be constructed for other processes, e.g., Spatial Dirichlet Predictive Process.
- Compare with scalable multi-resolution frameworks (Katzfuss, 2016).
- Highly scalable meta-kriging frameworks (Guhaniyogi, 2016).
- Future work: high-dimensional multivariate spatial-temporal variable selection.
35 Thank You!
More informationTheory and Computation for Gaussian Processes
University of Chicago IPAM, February 2015 Funders & Collaborators US Department of Energy, US National Science Foundation (STATMOS) Mihai Anitescu, Jie Chen, Ying Sun Gaussian processes A process Z on
More informationNonparametric Bayes tensor factorizations for big data
Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional
More informationBayesian and Maximum Likelihood Estimation for Gaussian Processes on an Incomplete Lattice
Bayesian and Maximum Likelihood Estimation for Gaussian Processes on an Incomplete Lattice Jonathan R. Stroud, Michael L. Stein and Shaun Lysen Georgetown University, University of Chicago, and Google,
More informationFusing point and areal level space-time data. data with application to wet deposition
Fusing point and areal level space-time data with application to wet deposition Alan Gelfand Duke University Joint work with Sujit Sahu and David Holland Chemical Deposition Combustion of fossil fuel produces
More informationrandom fields on a fine grid
Spatial models for point and areal data using Markov random fields on a fine grid arxiv:1204.6087v1 [stat.me] 26 Apr 2012 Christopher J. Paciorek Department of Biostatistics, Harvard School of Public Health
More informationThe Bayesian approach to inverse problems
The Bayesian approach to inverse problems Youssef Marzouk Department of Aeronautics and Astronautics Center for Computational Engineering Massachusetts Institute of Technology ymarz@mit.edu, http://uqgroup.mit.edu
More informationOverview of Spatial Statistics with Applications to fmri
with Applications to fmri School of Mathematics & Statistics Newcastle University April 8 th, 2016 Outline Why spatial statistics? Basic results Nonstationary models Inference for large data sets An example
More informationBeyond MCMC in fitting complex Bayesian models: The INLA method
Beyond MCMC in fitting complex Bayesian models: The INLA method Valeska Andreozzi Centre of Statistics and Applications of Lisbon University (valeska.andreozzi at fc.ul.pt) European Congress of Epidemiology
More informationLearning Multiple Tasks with a Sparse Matrix-Normal Penalty
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1
More informationGaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008
Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:
More informationConsider the following example of a linear system:
LINEAR SYSTEMS Consider the following example of a linear system: Its unique solution is x + 2x 2 + 3x 3 = 5 x + x 3 = 3 3x + x 2 + 3x 3 = 3 x =, x 2 = 0, x 3 = 2 In general we want to solve n equations
More informationBayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling
Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation
More informationA general mixed model approach for spatio-temporal regression data
A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression
More informationInterpolation of daily mean air temperature data via spatial and non-spatial copulas
Interpolation of daily mean air temperature data via spatial and non-spatial copulas F. Alidoost, A. Stein f.alidoost@utwente.nl 6 July 2017 Research problem 2 Assessing near-real time crop and irrigation
More informationModelling geoadditive survival data
Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model
More informationMCMC algorithms for fitting Bayesian models
MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationarxiv: v2 [cs.lg] 16 Nov 2017
Journal of Machine Learning Research XX (17) 1- Submitted 1/17; Published XX/XX Patchwork Kriging for Large-scale Gaussian Process Regression arxiv:171.66v [cs.lg] 16 Nov 17 Chiwoo Park Department of Industrial
More informationPartial factor modeling: predictor-dependent shrinkage for linear regression
modeling: predictor-dependent shrinkage for linear Richard Hahn, Carlos Carvalho and Sayan Mukherjee JASA 2013 Review by Esther Salazar Duke University December, 2013 Factor framework The factor framework
More information