Nonstationary Covariance Functions for Gaussian Process Regression
1 Nonstationary Covariance Functions for Gaussian Process Regression. Christopher J. Paciorek, Department of Biostatistics, Harvard School of Public Health. Mark J. Schervish, Department of Statistics, Carnegie Mellon University. Neural Information Processing Systems, December 9, 2003.
2 ABSTRACT We introduce a class of nonstationary covariance functions for Gaussian process (GP) regression. Nonstationary covariance functions allow the model to adapt to functions whose smoothness varies with the inputs. The class includes a nonstationary version of the Matérn stationary covariance, in which the differentiability of the regression function is controlled by a parameter, freeing one from fixing the differentiability in advance. In experiments, the nonstationary GP regression model performs well when the input space is two or three dimensions, performing comparably to a Bayesian neural network and outperforming an optimized neural network model and Bayesian free-knot spline models. In one dimension, it is outperformed by a state-of-the-art Bayesian free-knot spline model and by the Bayesian neural network model. The model readily generalizes to non-Gaussian data. Use of computational methods for speeding GP fitting allows for implementation of the method on somewhat larger datasets.
3 THE GENERAL PROBLEM OF VARIABLE SMOOTHNESS For functions whose smoothness varies with the input values, most nonparametric methods, including Gaussian process (GP) regression with a stationary covariance function, will oversmooth in some regions and undersmooth in others. Below I show the posterior mean regression function from a GP regression implementation with two different fixed values (blue and red lines) of the correlation scale hyperparameter. The top panels are for 100 data points with Gaussian error around the true function (black line), while the bottom panels are for 500 data points. [Figure: posterior mean fits for two fixed values of κ, including κ = 0.2.]
4 CURRENT APPROACHES Many other approaches to this problem have been presented, but good approaches for multivariate features and comparisons among the approaches are still needed. Some current approaches are:
- Free-knot regression spline models: the number and location of knots are optimized, with more knots in locations where the function varies more quickly
  - Bayesian Adaptive Regression Splines (BARS) (DiMatteo, Genovese, and Kass 2002) (single feature only)
  - Bayesian Multivariate Adaptive Regression Splines (BMARS) (Denison, Mallick, and Smith 1998) and Bayesian Multivariate Linear Splines (BMLS) (Holmes and Mallick 2001) (multiple features)
- Adaptive-penalty smoothing spline models (MacKay and Takeuchi 1995)
- Neural network models (Neal, etc.)
- Mixtures of stationary Gaussian processes (Tresp 2001; Rasmussen and Ghahramani 2002)
- Stationary Gaussian process regression in a deformed feature space (Damian, Sampson, and Guttorp 2001; Schmidt and O'Hagan 2000) (used for spatial features)
In this poster and the accompanying paper, we describe an approach to the variable smoothness problem using Gaussian process regression with nonstationary covariance functions.
5 STATIONARY CORRELATION FUNCTIONS Here are two stationary correlation functions, with examples of the correlation function (left) and sample functions drawn from a Gaussian process parameterized by the correlation function (right). The squared exponential correlation function (top) gives sample functions with infinitely many derivatives, while the Matérn correlation function (bottom) gives sample functions whose number of derivatives varies with ν (⌈ν⌉ − 1 derivatives). As ν → ∞, one recovers the squared exponential form.

Squared exponential:
$$R(\tau) = \exp\left(-\left(\frac{\tau}{\kappa}\right)^2\right)$$

Matérn form:
$$R(\tau) = \frac{1}{\Gamma(\nu)2^{\nu-1}} \left(\frac{2\sqrt{\nu}\,\tau}{\kappa}\right)^{\nu} K_\nu\!\left(\frac{2\sqrt{\nu}\,\tau}{\kappa}\right)$$

[Figures: correlation functions for κ = 0.2 and κ = 0.04 (squared exponential) and for ν = 30 and ν = 2 (Matérn), with sample functions f(x) plotted against distance.]
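As a concrete check on the two forms above, here is a minimal NumPy/SciPy sketch of both correlation functions (the function names are our own, not from the poster):

```python
import numpy as np
from scipy.special import gamma, kv

def sq_exp_corr(tau, kappa):
    """Squared exponential correlation: R(tau) = exp(-(tau/kappa)^2)."""
    tau = np.asarray(tau, dtype=float)
    return np.exp(-((tau / kappa) ** 2))

def matern_corr(tau, kappa, nu):
    """Matern correlation in the parameterization used here:
    R(tau) = (1 / (Gamma(nu) 2^(nu-1))) (2 sqrt(nu) tau / kappa)^nu
             K_nu(2 sqrt(nu) tau / kappa), with R(0) = 1 by continuity."""
    tau = np.atleast_1d(np.asarray(tau, dtype=float))
    arg = 2.0 * np.sqrt(nu) * tau / kappa
    r = np.ones_like(tau)                 # limit at tau = 0 is 1
    nz = arg > 0
    r[nz] = (arg[nz] ** nu) * kv(nu, arg[nz]) / (gamma(nu) * 2 ** (nu - 1))
    return r
```

For ν = 1/2 the Matérn form reduces to the exponential correlation exp(−√2 τ/κ), and as ν grows it approaches the squared exponential, matching the derivative behavior described above.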
6 A NONSTATIONARY COVARIANCE FUNCTION Higdon, Swall, and Kern (1999) (HSK) introduced the following nonstationary correlation function, where $c_{ij}$ is a normalizing term and $k_{x_i}(u)$ is a function (called the kernel function) centered at $x_i$:
$$R^{NS}(x_i, x_j) = c_{ij} \int_{\Re^P} k_{x_i}(u)\, k_{x_j}(u)\, du$$
This form is guaranteed positive definite. Using Gaussian kernels,
$$k_{x_i}(u) \propto \exp\left(-(u - x_i)^T \Sigma_i^{-1} (u - x_i)\right),$$
one gets a closed form for the correlation:
$$R^{NS}(x_i, x_j) = c_{ij} \exp\left(-(x_i - x_j)^T \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1} (x_i - x_j)\right)$$
Gibbs (1997) gave a special case with diagonal $\Sigma_i$, $\Sigma_j$. Then $f(\cdot) \sim GP(\mu, \sigma^2 R^{NS}(\cdot, \cdot; \Sigma(\cdot)))$ is a nonstationary Gaussian process.
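A minimal sketch of the Gaussian-kernel closed form (our own function name; here $c_{ij}$ is taken as the determinant normalizer that makes the correlation equal 1 when $x_i = x_j$ and $\Sigma_i = \Sigma_j$):

```python
import numpy as np

def hsk_corr(x_i, x_j, Sigma_i, Sigma_j):
    """Closed-form HSK nonstationary correlation with Gaussian kernels.
    c_ij = |Sigma_i|^(1/4) |Sigma_j|^(1/4) / |(Sigma_i + Sigma_j)/2|^(1/2)."""
    S = 0.5 * (Sigma_i + Sigma_j)
    c_ij = (np.linalg.det(Sigma_i) ** 0.25 * np.linalg.det(Sigma_j) ** 0.25
            / np.sqrt(np.linalg.det(S)))
    d = np.asarray(x_i, float) - np.asarray(x_j, float)
    # quadratic form (x_i - x_j)^T S^{-1} (x_i - x_j) via a linear solve
    return c_ij * np.exp(-d @ np.linalg.solve(S, d))
```

When $\Sigma_i = \Sigma_j = \Sigma$, the normalizer is 1 and this reduces to a stationary anisotropic squared exponential correlation.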
7 NONSTATIONARY GPS IN 1D Here are four sample functions (right) drawn from a nonstationary Gaussian process distribution whose Gaussian kernels are defined by their standard deviation (left). Note that the sample functions are more wiggly in the left part of the input space, where the kernels are less broad. [Figure: kernel standard deviation (left) and sample functions (right).]
8 NONSTATIONARY GPS IN 2D Here (right) is one sample function (of two inputs) from a nonstationary Gaussian process distribution whose Gaussian kernels are depicted using ellipses of constant density (left). Note the smoothness of the function where the kernels are large and the directionality of the smoothness where the kernels have strong directionality. [Figure: kernel structure (left) and sample function (right).]
9 GENERALIZING THE HSK KERNEL METHOD The stationary squared exponential form (left) has the same form as the HSK nonstationary covariance (right):
$$\exp\left(-\left(\frac{\tau}{\kappa}\right)^2\right) \qquad \text{vs.} \qquad c_{ij} \exp\left(-(x_i - x_j)^T \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1} (x_i - x_j)\right)$$
The HSK nonstationary covariance merely replaces squared Euclidean distance with a quadratic form in the exponential. Gaussian process distributions with this nonstationary covariance have infinitely-differentiable sample paths if $\Sigma(\cdot)$ varies smoothly. Consider the following distance measures (the nonstationary one does not satisfy the triangle inequality):
- isotropy: $\tau^2_{ij} = (x_i - x_j)^T (x_i - x_j)$
- anisotropy: $\tau^2_{ij} = (x_i - x_j)^T \Sigma^{-1} (x_i - x_j)$
- nonstationarity: $Q_{ij} = (x_i - x_j)^T \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1} (x_i - x_j)$
Can we replace $\tau^2_{ij}$ with $Q_{ij}$ in other stationary correlation functions and still retain positive definiteness?
10 GENERALIZED NONSTATIONARY COVARIANCE
Theorem 1 (Paciorek 2003): If a stationary correlation function, $R(\tau)$, is positive definite on $\Re^P$ for $P = 1, 2, \ldots$, then
$$R^{NS}(x_i, x_j) = |\Sigma_i|^{\frac{1}{4}} |\Sigma_j|^{\frac{1}{4}} \left|\frac{\Sigma_i + \Sigma_j}{2}\right|^{-\frac{1}{2}} R\left(\sqrt{Q_{ij}}\right)$$
is positive definite on $\Re^P$ for $P = 1, 2, \ldots$
Theorem 2 (Paciorek 2003): The smoothness (differentiability) properties of the original stationary correlation $R(\tau)$ are retained if the elements of $\Sigma(\cdot)$ vary smoothly in the feature space.
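Theorem 1's construction can be sketched generically, with the stationary correlation passed in as a callable (a sketch with our own names, not the authors' code):

```python
import numpy as np

def nonstat_corr(x_i, x_j, Sigma_i, Sigma_j, R):
    """Theorem-1 nonstationary correlation built from a stationary R:
    |Si|^(1/4) |Sj|^(1/4) |(Si+Sj)/2|^(-1/2) R(sqrt(Q_ij))."""
    S = 0.5 * (Sigma_i + Sigma_j)
    d = np.asarray(x_i, float) - np.asarray(x_j, float)
    Q = d @ np.linalg.solve(S, d)          # quadratic-form "distance" Q_ij
    prefac = (np.linalg.det(Sigma_i) ** 0.25 * np.linalg.det(Sigma_j) ** 0.25
              / np.sqrt(np.linalg.det(S)))
    return prefac * R(np.sqrt(Q))
```

With $R(\tau) = \exp(-\tau^2)$ this reproduces the HSK Gaussian-kernel closed form of slide 6, since $R(\sqrt{Q_{ij}}) = \exp(-Q_{ij})$.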
11 PROOF OF THEOREM 1 (SKETCH) If $R(\tau)$ is positive definite on $\Re^P$ for $P = 1, 2, \ldots$, then
$$R(\tau) = \int_0^\infty \exp(-\tau^2 w)\, h(w)\, dw \quad \text{(Schoenberg 1938)}$$
Re-express the proposed nonstationary correlation function:
$$R^{NS}(x_i, x_j) = 2^{\frac{P}{2}} |\Sigma_i|^{\frac{1}{4}} |\Sigma_j|^{\frac{1}{4}} |\Sigma_i + \Sigma_j|^{-\frac{1}{2}} \int_0^\infty \exp(-Q_{ij} w)\, h(w)\, dw$$
$$= 2^{\frac{P}{2}} |\Sigma_i|^{\frac{1}{4}} |\Sigma_j|^{\frac{1}{4}} |\Sigma_i + \Sigma_j|^{-\frac{1}{2}} \int_0^\infty \exp\left(-(x_i - x_j)^T \left(\frac{\Sigma_i + \Sigma_j}{2w}\right)^{-1} (x_i - x_j)\right) h(w)\, dw$$
$$= \int_0^\infty \int_{\Re^P} k_{x_i,w}(u)\, k_{x_j,w}(u)\, du\, h(w)\, dw$$
12 Now check the definition of positive definiteness directly:
$$\sum_{i=1}^n \sum_{j=1}^n a_i a_j C(x_i, x_j) = \int_0^\infty \int_{\Re^P} \sum_{i=1}^n a_i k_{x_i,w}(u) \sum_{j=1}^n a_j k_{x_j,w}(u)\, du\, h(w)\, dw$$
$$= \int_0^\infty \int_{\Re^P} \left( \sum_{i=1}^n a_i k_{x_i,w}(u) \right)^2 du\, h(w)\, dw \ge 0.$$
The key is that the covariance must depend only on location-specific kernels.
13 NONSTATIONARY MATÉRN COVARIANCE A particular nonstationary covariance of interest is the Matérn form, which satisfies the conditions of Theorem 1.
Stationary form:
$$R(\tau) = \frac{1}{\Gamma(\nu)2^{\nu-1}} \left(\frac{2\sqrt{\nu}\,\tau}{\kappa}\right)^{\nu} K_\nu\!\left(\frac{2\sqrt{\nu}\,\tau}{\kappa}\right)$$
Nonstationary form:
$$R^{NS}(x_i, x_j) \propto \frac{1}{\Gamma(\nu)2^{\nu-1}} \left(2\sqrt{\nu Q_{ij}}\right)^{\nu} K_\nu\!\left(2\sqrt{\nu Q_{ij}}\right)$$
Provided $\Sigma(\cdot)$ varies smoothly, by Theorem 2 this form gives sample functions whose differentiability varies with ν. By not constraining the differentiability, this gives a more flexible form of the correlation function than the original HSK nonstationary correlation, and it has asymptotic advantages (Stein 1999).
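Putting the pieces together, here is a self-contained sketch of the full nonstationary Matérn correlation, i.e. the Theorem-1 determinant prefactor times the Matérn form evaluated at $\sqrt{Q_{ij}}$ (names are ours):

```python
import numpy as np
from scipy.special import gamma, kv

def nonstat_matern(x_i, x_j, Sigma_i, Sigma_j, nu):
    """Nonstationary Matern correlation:
    |Si|^(1/4)|Sj|^(1/4)|(Si+Sj)/2|^(-1/2) * Matern(sqrt(Q_ij); nu)."""
    S = 0.5 * (Sigma_i + Sigma_j)
    d = np.asarray(x_i, float) - np.asarray(x_j, float)
    Q = d @ np.linalg.solve(S, d)
    prefac = (np.linalg.det(Sigma_i) ** 0.25 * np.linalg.det(Sigma_j) ** 0.25
              / np.sqrt(np.linalg.det(S)))
    if Q == 0.0:
        return prefac                      # the Matern factor -> 1 as Q -> 0
    arg = 2.0 * np.sqrt(nu * Q)
    matern = (arg ** nu) * kv(nu, arg) / (gamma(nu) * 2 ** (nu - 1))
    return prefac * matern
```

For ν = 1/2 and equal kernels the Matérn factor reduces to $\exp(-\sqrt{2 Q_{ij} \cdot \nu \cdot 2})$-style exponential decay, while large ν approaches the HSK squared exponential form.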
14 A BAYESIAN NONSTATIONARY GP REGRESSION MODEL Bayesian model:
$$Y_i \sim N(f(x_i), \eta^2), \quad x_i \in \Re^P$$
$$f(\cdot) \sim GP(\mu, \sigma^2 R^{NS}(\cdot, \cdot; \Sigma(\cdot), \nu))$$
Let $R^{NS}$ be the nonstationary Matérn correlation. The kernels, $\Sigma(\cdot)$, are constructed based on stationary GP priors, as described next.
15 SMOOTHLY-VARYING KERNEL MATRICES Goals:
- Define multiple kernel matrices, $\Sigma_i$, one for each training observation and each test/prediction observation
- Matrix elements should be smoothly varying in the input space
- Matrices must be positive definite
Approach: use the spectral decomposition ($\Sigma_i = \Gamma_i^T \Lambda_i \Gamma_i$)
- The eigenvector matrix, $\Gamma_i$, is parameterized as the first eigenvector plus successive orthogonal vectors in reduced-dimension subspaces, with stationary GP priors on the unnormalized eigenvector coordinates ($(a_i, b_i)$ in the 2-d cartoon)
- The eigenvalue matrix, $\Lambda_i$, is parameterized by its diagonal elements, with stationary GP priors on the logarithms of the diagonal elements ($\log \lambda_{1,i}, \log \lambda_{2,i}$ in the 2-d cartoon)
- This gets unwieldy and highly parameterized for large P
[Figure: 2-d cartoon of a kernel ellipse with axes $\lambda_{1,i}$, $\lambda_{2,i}$ and eigenvector coordinates $(a_i, b_i)$.]
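In two dimensions the parameterization above can be sketched as follows. This is a minimal illustration, assuming the eigenvector coordinates and log eigenvalues have already been drawn from their GP priors; we write the decomposition as $\Gamma \Lambda \Gamma^T$:

```python
import numpy as np

def kernel_matrix_2d(a, b, log_lam1, log_lam2):
    """Build a 2x2 positive definite kernel matrix Sigma_i from the spectral
    parameterization: first eigenvector from unnormalized coordinates (a, b),
    eigenvalues on the log scale so positivity is automatic."""
    v1 = np.array([a, b]) / np.hypot(a, b)   # normalized first eigenvector
    v2 = np.array([-v1[1], v1[0]])           # orthogonal second eigenvector
    Gamma = np.column_stack([v1, v2])
    Lam = np.diag([np.exp(log_lam1), np.exp(log_lam2)])
    return Gamma @ Lam @ Gamma.T             # symmetric, positive definite
```

Because the GP priors on $(a_i, b_i)$ and $(\log \lambda_{1,i}, \log \lambda_{2,i})$ vary smoothly over the input space, the resulting $\Sigma_i$ vary smoothly as well, as Theorem 2 requires.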
16 EFFICIENT REPRESENTATIONS OF STATIONARY GPS Let Φ be a vector of process values.
1. Matérn basis functions (Kammann and Wand 2003)
$$\Phi = \mu + \sigma Z \Omega^{-\frac{1}{2}} w$$
$$Z = (C(x_i - \kappa_k)),\ 1 \le i \le n,\ 1 \le k \le K$$
$$\Omega = (C(\kappa_j - \kappa_k)),\ 1 \le j \le K,\ 1 \le k \le K$$
where $C(\cdot)$ is a stationary covariance function.
- Matrix operations are based on K knots, $\{\kappa_k\}$, so this is more efficient
- Motivation: if $\{\kappa_k\} = \{x_i\}$, then $\text{Cov}(\Phi) = C(\cdot)$
- When sampling hyperparameters in MCMC, sample the process as well: $\Phi = \mu + \sigma Z \Omega^{-\frac{1}{2}} w$
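A 1-d sketch of this knot-based representation (our own helper; the correlation function and knot choice are illustrative, and a small jitter keeps $\Omega$ numerically positive definite):

```python
import numpy as np

def low_rank_gp(x, knots, corr, mu, sigma, w, jitter=1e-8):
    """Knot-based low-rank GP: Phi = mu + sigma * Z @ Omega^(-1/2) @ w,
    with Z = C(x - knots) and Omega = C(knots - knots)."""
    Z = corr(np.abs(x[:, None] - knots[None, :]))
    Omega = corr(np.abs(knots[:, None] - knots[None, :])) + jitter * np.eye(len(knots))
    # symmetric inverse square root of Omega via its eigendecomposition
    vals, vecs = np.linalg.eigh(Omega)
    Omega_inv_half = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return mu + sigma * Z @ Omega_inv_half @ w
```

With $w$ drawn as K independent standard normals, $\text{Cov}(\Phi) = \sigma^2 Z \Omega^{-1} Z^T$, which matches the full covariance when the knots coincide with the data locations; fewer knots trade fidelity for speed.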
17 2. Fourier basis functions (Wikle 2002)
$$\Phi_{dat} = \mu + \sigma A \Phi_{grid}$$ ($\Phi_{dat}$ is the process values at the data locations)
$$\Phi_{grid} = \Psi w$$ ($\Phi_{grid}$ is the discretized process on a grid)
- The elements of w are independent, complex-valued random variables, and their variance is based on the spectral density of the stationary $C(\cdot)$
- $\Psi w$ is the inverse FFT ($\Psi$ are Fourier basis vectors)
- Propose blocks of values of w, with a focus on the low-frequency coefficients
- Motivation: $\text{Cov}(\Phi) \to C(\cdot)$ asymptotically as the grid gets finer
- When sampling hyperparameters in MCMC, sample the process as well: $\Phi_{dat} = \mu + \sigma A \Psi w$
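The spectral idea can be illustrated with a toy 1-d sampler on a periodic grid (this is a simplified sketch of FFT-based GP simulation, not Wikle's exact blocking scheme; the function name is ours):

```python
import numpy as np

def fourier_gp_draw(cov_row, rng):
    """Draw a stationary, periodic GP on a 1-d grid via the FFT.
    cov_row: first row of the circulant covariance matrix (covariance of
    grid point 0 with every grid point); its FFT gives spectral variances."""
    n = len(cov_row)
    lam = np.fft.fft(cov_row).real          # spectral variances (real, >= 0 if valid)
    lam = np.clip(lam, 0.0, None)           # guard tiny negative round-off
    # independent complex-normal Fourier coefficients, scaled by the spectrum
    zeta = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    y = np.fft.ifft(np.sqrt(lam) * zeta) * np.sqrt(n)
    return y.real                           # real part has the target covariance
```

The inverse FFT plays the role of $\Psi$, so a length-n draw costs O(n log n) rather than the O(n^3) of a dense Cholesky factorization.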
18 NONPARAMETRIC REGRESSION COMPARISON - 1D Compare to:
- stationary GP
- free-knot regression spline model (BARS) (DiMatteo et al. 2002)
- neural network (with the number of hidden units that gives the best result) (R nnet library)
- Bayesian neural network (R. Neal's software as implemented by A. Vehtari)
Simulate 50 datasets and fit each model to each dataset. Compare results based on mean squared error (relative to the true function).
19 REGRESSION RESULTS - 1D [Figure: the three test functions.]

Method | Function 1 | Function 2 | Function 3
BARS | .0081 (.0071, .0092) | .012 (.011, .013) | .0050 (.0043, .0056)
Bayesian neural network | .0082 (.0072, .0093) | .011 (.010, .014) | .015 (.014, .016)
Nonstat. GP | .0083 (.0073, .0093) | .015 (.013, .016) | .026 (.021, .030)
Stat. GP | .0083 (.0073, .0093) | .026 (.024, .029) | .071 (.067, .074)
neural network | .0108 (.0095, .012) | .013 (.012, .015) | .0095 (.0086, .010)

The free-knot spline (BARS) performs best, followed by the Bayesian neural network and the nonstationary GP. Most methods are similar on the smoothly varying Function 1.
20 NONPARAMETRIC REGRESSION COMPARISON - 2/3D Compare to:
- stationary GP
- free-knot regression spline with tensor products of univariate splines (BMARS) (Denison et al. 1998)
- free-knot regression spline with multivariate linear splines (BMLS) (Holmes and Mallick 2001)
- neural network (optimal number of hidden units) (R nnet library)
- Bayesian neural network (R. Neal's software as implemented by A. Vehtari)
21 EXAMPLES
1.) Simulated dataset with 2 inputs: P = 2, n = 225; simulate 50 datasets and compare using MSE (relative to the true function). [Figure: true function and NSGP estimate.]
2.) Real dataset of December mean temperatures in the Americas, n = 109; P = 2: longitude, latitude; MSE based on 50-fold cross-validation (MSE relative to test data).
3.) Real dataset of daily ozone in New York, n = 111; P = 3: radiation, temperature, wind speed; MSE based on 50-fold cross-validation (MSE relative to test data).
22 MEAN SQUARED ERROR

Method | Function with 2 inputs | Temp. data | Ozone data
Bayesian neural network | .020 (.019, .022) | |
Nonstat. GP | .023 (.020, .026) | |
Stat. GP | .024 (.021, .026) | |
BMARS | .076 (.065, .087) | |
BMLS | .033 (.029, .038) | |
neural network | .040* (.033, .047) | |

* Holmes and Mallick (2001) report a value of .07 for a neural network.
The Bayesian neural network and the nonstationary Gaussian process appear to outperform the other approaches.
23 GENERALIZED NONPARAMETRIC REGRESSION Model:
$$Y_i \sim D(g(f(x_i)))$$
$$f(\cdot) \sim GP(\mu, \sigma^2 R^{NS}(\cdot, \cdot; \Sigma(\cdot), \nu))$$
D is a probability distribution and $g(\cdot)$ is a link function. Examples:
- count data ($D$ = Poisson, $g^{-1}$ = log)
- binary data ($D$ = Bernoulli, $g^{-1}$ = logit)
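In this notation $g$ is the inverse of the usual link ($g^{-1} = \log$ or $\text{logit}$), so the distribution's mean parameter is $g(f)$. A tiny sketch of the two examples (our own function names):

```python
import numpy as np

def poisson_mean(f):
    """Count data: g^{-1} = log, so the Poisson mean is g(f) = exp(f)."""
    return np.exp(f)

def bernoulli_prob(f):
    """Binary data: g^{-1} = logit, so the success probability is logistic(f)."""
    return 1.0 / (1.0 + np.exp(-f))

def bernoulli_loglik(y, f):
    """Bernoulli log-likelihood of binary responses y given latent GP values f."""
    p = bernoulli_prob(np.asarray(f, float))
    y = np.asarray(y, float)
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log1p(-p)))
```

Because the likelihood is no longer Gaussian, the latent $f$ cannot be integrated out in closed form and must be sampled along with the hyperparameters in the MCMC.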
24 TOKYO RAINFALL DATA EXAMPLE The data are presence/absence of rainfall for the calendar days ($Y_i \in \{0, 1, 2\}$). The top panel shows the posterior mean probability of rainfall as a function of calendar day, with pointwise 95% uncertainty intervals, and the bottom panel shows the standard deviation of the Gaussian kernels as a function of calendar day. The probability of rainfall appears to be more variable toward the end of the year. [Figure: probability of rainfall (top) and kernel size (bottom) against calendar day.]
25 CONCLUSIONS We introduce a class of nonstationary covariance functions that can be used in Gaussian process regression (and classification) models and allow the model to adapt to variable smoothness in the unknown function. The nonstationary GPs improve on stationary GP models on several test datasets. On test functions in one-dimensional spaces, a state-of-the-art free-knot spline model and a Bayesian neural network outperform the nonstationary GP, but in higher dimensions the nonstationary GP outperforms two free-knot spline approaches and a non-Bayesian neural network while being comparable to a Bayesian neural network. The nonstationary GP may be of particular interest for data indexed by spatial coordinates. Unfortunately, the nonstationary GP requires many more parameters than a stationary GP, particularly as the dimension grows, losing the attractive simplicity of the stationary GP model. Use of stationary GP priors in the hierarchy of the model to parameterize the nonstationary covariance results in slow computation. More efficient representations of these stationary GPs improve efficiency but still limit the feasible sample size n. The slow part of the computation for Gaussian data is that calculating the marginal likelihood requires the Cholesky decomposition of an n by n matrix. Our approach provides a general modelling framework; other low-rank approximations to the covariance matrix (e.g., Smola and Bartlett 2001; Seeger and Williams 2003) may further speed fitting.
26 References
Damian, D., Sampson, P., and Guttorp, P. (2001), "Bayesian estimation of semi-parametric non-stationary spatial covariance structure," Environmetrics, 12.
Denison, D., Mallick, B., and Smith, A. (1998), "Bayesian MARS," Statistics and Computing, 8.
DiMatteo, I., Genovese, C., and Kass, R. (2002), "Bayesian curve-fitting with free-knot splines," Biometrika, 88.
Gibbs, M. (1997), Bayesian Gaussian Processes for Classification and Regression, unpublished Ph.D. dissertation, University of Cambridge.
Higdon, D., Swall, J., and Kern, J. (1999), "Non-stationary spatial modeling," in Bayesian Statistics 6, eds. J. Bernardo, J. Berger, A. Dawid, and A. Smith, Oxford, U.K.: Oxford University Press.
Holmes, C. and Mallick, B. (2001), "Bayesian regression with multivariate linear splines," Journal of the Royal Statistical Society, Series B, 63.
Kammann, E. and Wand, M. (2003), "Geoadditive models," Applied Statistics, 52.
MacKay, D. and Takeuchi, R. (1995), "Interpolation models with multiple hyperparameters."
Paciorek, C. (2003), Nonstationary Gaussian Processes for Regression and Spatial Modelling, unpublished Ph.D. dissertation, Carnegie Mellon University, Department of Statistics.
Rasmussen, C. and Ghahramani, Z. (2002), "Infinite mixtures of Gaussian process experts," in Advances in Neural Information Processing Systems 14, eds. T. G. Dietterich, S. Becker, and Z. Ghahramani, Cambridge, Massachusetts: MIT Press.
Schmidt, A. and O'Hagan, A. (2000), "Bayesian Inference for Nonstationary Spatial Covariance Structure via Spatial Deformations," Technical Report 498/00, University of Sheffield, Department of Probability and Statistics.
Schoenberg, I. (1938), "Metric spaces and completely monotone functions," Annals of Mathematics, 39.
Seeger, M. and Williams, C. (2003), "Fast forward selection to speed up sparse Gaussian process regression," in Workshop on AI and Statistics 9.
Smola, A. and Bartlett, P. (2001), "Sparse greedy Gaussian process approximation," in Advances in Neural Information Processing Systems 13, eds. T. Leen, T. Dietterich, and V. Tresp, Cambridge, Massachusetts: MIT Press.
Tresp, V. (2001), "Mixtures of Gaussian Processes," in Advances in Neural Information Processing Systems 13, eds. T. K. Leen, T. G. Dietterich, and V. Tresp, MIT Press.
Wikle, C. (2002), "Spatial modeling of count data: A case study in modelling breeding bird survey data on large spatial domains," in Spatial Cluster Modelling, eds. A. Lawson and D. Denison, Chapman & Hall.
More informationIntroduction to Spatial Data and Models
Introduction to Spatial Data and Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry
More informationVariational Principal Components
Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings
More informationA Practitioner s Guide to Generalized Linear Models
A Practitioners Guide to Generalized Linear Models Background The classical linear models and most of the minimum bias procedures are special cases of generalized linear models (GLMs). GLMs are more technically
More informationWhat s for today. Continue to discuss about nonstationary models Moving windows Convolution model Weighted stationary model
What s for today Continue to discuss about nonstationary models Moving windows Convolution model Weighted stationary model c Mikyoung Jun (Texas A&M) Stat647 Lecture 11 October 2, 2012 1 / 23 Nonstationary
More informationSlice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method
Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Madeleine B. Thompson Radford M. Neal Abstract The shrinking rank method is a variation of slice sampling that is efficient at
More informationLow-rank methods and predictive processes for spatial models
Low-rank methods and predictive processes for spatial models Sam Bussman, Linchao Chen, John Lewis, Mark Risser with Sebastian Kurtek, Vince Vu, Ying Sun February 27, 2014 Outline Introduction and general
More informationEfficient Posterior Inference and Prediction of Space-Time Processes Using Dynamic Process Convolutions
Efficient Posterior Inference and Prediction of Space-Time Processes Using Dynamic Process Convolutions Catherine A. Calder Department of Statistics The Ohio State University 1958 Neil Avenue Columbus,
More informationImplementing an anisotropic and spatially varying Matérn model covariance with smoothing filters
CWP-815 Implementing an anisotropic and spatially varying Matérn model covariance with smoothing filters Dave Hale Center for Wave Phenomena, Colorado School of Mines, Golden CO 80401, USA a) b) c) Figure
More informationA short introduction to INLA and R-INLA
A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk
More informationspbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models
spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models Andrew O. Finley 1, Sudipto Banerjee 2, and Bradley P. Carlin 2 1 Michigan State University, Departments
More informationL20: MLPs, RBFs and SPR Bayes discriminants and MLPs The role of MLP hidden units Bayes discriminants and RBFs Comparison between MLPs and RBFs
L0: MLPs, RBFs and SPR Bayes discriminants and MLPs The role of MLP hidden units Bayes discriminants and RBFs Comparison between MLPs and RBFs CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna CSE@TAMU
More informationINF Introduction to classifiction Anne Solberg Based on Chapter 2 ( ) in Duda and Hart: Pattern Classification
INF 4300 151014 Introduction to classifiction Anne Solberg anne@ifiuiono Based on Chapter 1-6 in Duda and Hart: Pattern Classification 151014 INF 4300 1 Introduction to classification One of the most challenging
More informationA Fused Lasso Approach to Nonstationary Spatial Covariance Estimation
Supplementary materials for this article are available at 10.1007/s13253-016-0251-8. A Fused Lasso Approach to Nonstationary Spatial Covariance Estimation Ryan J. Parker, Brian J. Reich,andJoEidsvik Spatial
More informationEstimation of non-stationary spatial covariance structure
Estimation of non-stationary spatial covariance structure DAVID J NOTT Department of Statistics, University of New South Wales, Sydney 2052, Australia djn@mathsunsweduau WILLIAM T M DUNSMUIR Division of
More informationCross-validation for detecting and preventing overfitting
Cross-validation for detecting and preventing overfitting A Regression Problem = f() + noise Can we learn f from this data? Note to other teachers and users of these slides. Andrew would be delighted if
More informationA Generalized Convolution Model for Multivariate Nonstationary Spatial Processes
A Generalized Convolution Model for Multivariate Nonstationary Spatial Processes Anandamayee Majumdar, Debashis Paul and Dianne Bautista Department of Mathematics and Statistics, Arizona State University,
More informationSTA 414/2104, Spring 2014, Practice Problem Set #1
STA 44/4, Spring 4, Practice Problem Set # Note: these problems are not for credit, and not to be handed in Question : Consider a classification problem in which there are two real-valued inputs, and,
More informationOverview of Spatial Statistics with Applications to fmri
with Applications to fmri School of Mathematics & Statistics Newcastle University April 8 th, 2016 Outline Why spatial statistics? Basic results Nonstationary models Inference for large data sets An example
More informationarxiv: v2 [stat.ml] 9 Oct 2015
Generalized Spectral Kernels Yves-Laurent Kom Samo ylks@robots.ox.ac.uk University of Oxford Stephen J. Roberts sjrob@robots.ox.ac.uk University of Oxford arxiv:506.036v [stat.ml] 9 Oct 05 Abstract In
More informationA Framework for Daily Spatio-Temporal Stochastic Weather Simulation
A Framework for Daily Spatio-Temporal Stochastic Weather Simulation, Rick Katz, Balaji Rajagopalan Geophysical Statistics Project Institute for Mathematics Applied to Geosciences National Center for Atmospheric
More informationUncertainty and Parameter Space Analysis in Visualization -
Uncertaint and Parameter Space Analsis in Visualiation - Session 4: Structural Uncertaint Analing the effect of uncertaint on the appearance of structures in scalar fields Rüdiger Westermann and Tobias
More informationNonparametric M-quantile Regression via Penalized Splines
Nonparametric M-quantile Regression via Penalized Splines Monica Pratesi 1, M. Giovanna Ranalli 2, Nicola Salvati 1 Dipartimento di Statistica e Matematica Applicata all Economia, Università di Pisa, Ital
More informationCreating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach. Radford M. Neal, 28 February 2005
Creating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach Radford M. Neal, 28 February 2005 A Very Brief Review of Gaussian Processes A Gaussian process is a distribution over
More informationSTA414/2104 Statistical Methods for Machine Learning II
STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements
More informationDeep Neural Networks as Gaussian Processes
Deep Neural Networks as Gaussian Processes Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein Google Brain {jaehlee, yasamanb, romann, schsam, jpennin,
More informationLanguage and Statistics II
Language and Statistics II Lecture 19: EM for Models of Structure Noah Smith Epectation-Maimization E step: i,, q i # p r $ t = p r i % ' $ t i, p r $ t i,' soft assignment or voting M step: r t +1 # argma
More informationMulti-Kernel Gaussian Processes
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Arman Melkumyan Australian Centre for Field Robotics School of Aerospace, Mechanical and Mechatronic Engineering
More informationTutorial on Gaussian Processes and the Gaussian Process Latent Variable Model
Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model (& discussion on the GPLVM tech. report by Prof. N. Lawrence, 06) Andreas Damianou Department of Neuro- and Computer Science,
More informationNonparmeteric Bayes & Gaussian Processes. Baback Moghaddam Machine Learning Group
Nonparmeteric Bayes & Gaussian Processes Baback Moghaddam baback@jpl.nasa.gov Machine Learning Group Outline Bayesian Inference Hierarchical Models Model Selection Parametric vs. Nonparametric Gaussian
More informationHierarchical Nearest-Neighbor Gaussian Process Models for Large Geo-statistical Datasets
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geo-statistical Datasets Abhirup Datta 1 Sudipto Banerjee 1 Andrew O. Finley 2 Alan E. Gelfand 3 1 University of Minnesota, Minneapolis,
More informationHilbert Space Methods for Reduced-Rank Gaussian Process Regression
Hilbert Space Methods for Reduced-Rank Gaussian Process Regression Arno Solin and Simo Särkkä Aalto University, Finland Workshop on Gaussian Process Approximation Copenhagen, Denmark, May 2015 Solin &
More informationarxiv: v1 [stat.ml] 2 Aug 2012
Ancestral Inference from Functional Data: Statistical Methods and Numerical Eamples Pantelis Z. Hadjipantelis, Nick, S. Jones, John Moriart, David Springate, Christopher G. Knight arxiv:1208.0628v1 [stat.ml]
More informationGaussian Process Vine Copulas for Multivariate Dependence
Gaussian Process Vine Copulas for Multivariate Dependence José Miguel Hernández Lobato 1,2, David López Paz 3,2 and Zoubin Ghahramani 1 June 27, 2013 1 University of Cambridge 2 Equal Contributor 3 Ma-Planck-Institute
More informationLecture 3: Latent Variables Models and Learning with the EM Algorithm. Sam Roweis. Tuesday July25, 2006 Machine Learning Summer School, Taiwan
Lecture 3: Latent Variables Models and Learning with the EM Algorithm Sam Roweis Tuesday July25, 2006 Machine Learning Summer School, Taiwan Latent Variable Models What to do when a variable z is always
More informationA Brief Overview of Nonparametric Bayesian Models
A Brief Overview of Nonparametric Bayesian Models Eurandom Zoubin Ghahramani Department of Engineering University of Cambridge, UK zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin Also at Machine
More informationFixed-domain Asymptotics of Covariance Matrices and Preconditioning
Fixed-domain Asymptotics of Covariance Matrices and Preconditioning Jie Chen IBM Thomas J. Watson Research Center Presented at Preconditioning Conference, August 1, 2017 Jie Chen (IBM Research) Covariance
More informationIntroduction to Spatial Data and Models
Introduction to Spatial Data and Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics,
More informationA Program for Data Transformations and Kernel Density Estimation
A Program for Data Transformations and Kernel Density Estimation John G. Manchuk and Clayton V. Deutsch Modeling applications in geostatistics often involve multiple variables that are not multivariate
More information