Computer Emulation With Density Estimation
Jake Coleman, Robert Wolpert
May 8, 2017
Computer Emulation: Motivation
Expensive experiments
Motivation & Literature Review: Physics Data [G. Aad et al., 2010]

Outputs are frequency histograms rather than just multivariate vectors with unknown correlation structure. We want to predict the underlying density given physics input parameters, which suggests Bayesian density estimation and regression.
Motivation & Literature Review: Density Estimation Literature

We aim to measure the smoothness of the density with a Gaussian process. Some prior work in this area:

- Logistic GP prior ([Lenk, 1991], [Lenk, 2003], [Tokdar, 2007], [Tokdar et al., 2010], [Riihimäki and Vehtari, 2010])
- Latent factor models ([Kundu and Dunson, 2014])
- Exact sampling of a transformed GP ([Adams et al., 2009])

Complication: we don't have access to draws, only counts within bins.
Single-Histogram Model: Likelihood

Let Y_j be the count in bin j, which has edges [\alpha_{j-1}, \alpha_j). The counts are marginally Poisson and, conditional on the total, jointly multinomial:

p(\vec{Y}) \propto \prod_{j=1}^{J} p_j^{Y_j}, \qquad p_j = \int_{\alpha_{j-1}}^{\alpha_j} f(t)\, dt.

We aim to model the unknown density f(t) nonparametrically with a smooth, continuous function over [0, 1].
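The bin probabilities and multinomial likelihood above are easy to check numerically. A minimal sketch with SciPy, using the talk's toy Beta(3, 7) density as the candidate f and made-up counts (the `counts` vector and the bin layout are illustrative, not data from the talk):

```python
import numpy as np
from scipy import integrate, stats

def bin_probs(f, edges):
    # p_j = integral of f over [edges[j-1], edges[j])
    return np.array([integrate.quad(f, a, b)[0]
                     for a, b in zip(edges[:-1], edges[1:])])

# 6 evenly-spaced bins on [0, 0.6], as in the toy example
edges = np.linspace(0.0, 0.6, 7)
f = stats.beta(3, 7).pdf
p = bin_probs(f, edges)
p /= p.sum()  # renormalize over the observed range

# hypothetical histogram counts summing to 10,000
counts = np.array([1200, 2900, 2800, 1900, 900, 300])
loglik = stats.multinomial.logpmf(counts, n=counts.sum(), p=p)
```

Since p(\vec{Y}) only needs the p_j, these integrals can indeed be pre-computed once per candidate density.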
Single-Histogram Model: Main Idea

f(t) = \sum_{n=1}^{\infty} \sqrt{2}\, [a_n Z_n \cos(2\pi n t) + b_n W_n \sin(2\pi n t)] + 1,

where \sum_n (a_n^2 + b_n^2) < \infty and \{Z_n\}, \{W_n\} iid N(0, 1). Then f is a GP with covariance function c(t, t') = \sum_n 2 a_n^2 \cos(2\pi n [t - t']) if a_n = b_n.

- \int_0^1 f(t)\, dt = 1
- p_j = \int_{\alpha_{j-1}}^{\alpha_j} f(t)\, dt can be easily found and pre-computed

Downside: f is not positive a.s. Our hope is that P(f(t) < 0) is very small in the region of interest; positive quantities (heights, rainfall, etc.) are often modeled with normal RVs when they sit far enough from zero.
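A truncated draw from this random Fourier series can be simulated directly, which also verifies the unit-integral property numerically. This sketch assumes geometrically decaying coefficients, a_n = b_n = c r^n; that specific decay is a guess on my part, chosen so that the c and r parameters of the toy example have a role to play:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_density(N=50, c=0.3, r=0.7, grid=None):
    """Draw one f(t) from the truncated Fourier series, assuming
    a_n = b_n = c * r**n (a hypothetical coefficient choice)."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 1001)
    n = np.arange(1, N + 1)
    a = c * r ** n
    Z = rng.standard_normal(N)
    W = rng.standard_normal(N)
    f = np.sqrt(2) * (np.cos(2 * np.pi * np.outer(grid, n)) @ (a * Z)
                      + np.sin(2 * np.pi * np.outer(grid, n)) @ (a * W)) + 1.0
    return grid, f

t, f = sample_density()
# trapezoid rule: every cos/sin term integrates to 0 over [0, 1],
# so the area comes entirely from the constant 1
area = ((f[:-1] + f[1:]) / 2 * np.diff(t)).sum()
```

Because each sinusoid completes whole periods on [0, 1], the area is 1 regardless of the Gaussian draws, exactly the property that makes f a (signed) density.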
Single-Histogram Model: Toy Example

We let c = 0.3 and r = 0.7, while looking to estimate 10,000 draws from a Beta(3, 7) distribution in 6 evenly-spaced bins in [0, 0.6].

[Figure: GP density estimate (bins = 6, N_x = 5), showing the posterior mean, 95% credible band, and true density.]

Decent enough!
Multiple-Histogram Model: Extending the Model

Now we assume that we have an input d upon which we condition our estimate:

f(t | d) = \sum_{n=1}^{N} \sqrt{2}\, a_n [Z_n(d) \cos(2\pi n t / T) + W_n(d) \sin(2\pi n t / T)] + \gamma / T,

where \{Z_n(\cdot)\}, \{W_n(\cdot)\} \sim GP(0, c_M(\cdot, \cdot)). Thus, each component of the Karhunen-Loève representation is itself a GP.
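Simulating from this prior means drawing each Fourier coefficient process Z_n(·), W_n(·) as a GP over the input space. A small sketch, assuming a squared-exponential c_M on a scalar input d (the length-scale and grid here are illustrative, not values from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

def sq_exp_kernel(d1, d2, length=0.5, var=1.0):
    # squared-exponential kernel on scalar inputs
    diff = d1[:, None] - d2[None, :]
    return var * np.exp(-0.5 * (diff / length) ** 2)

# one GP path per Fourier component, jointly over a grid of inputs d
d = np.linspace(0.0, 1.0, 20)
K = sq_exp_kernel(d, d) + 1e-6 * np.eye(len(d))  # jitter for stability
L = np.linalg.cholesky(K)
N_components = 4
Z = L @ rng.standard_normal((len(d), N_components))  # column n is Z_n(.) over d
```

Because the components vary smoothly in d, densities at nearby inputs share coefficients, which is exactly what lets the model borrow strength across histograms.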
Multiple-Histogram Model: Initial Results

I chose c = 1, r = 0.5, and a squared-exponential kernel.

[Figures: predicted bin probabilities (predicted vs. truth by bin), and the GP density estimate (posterior mean, 95% interval, truth).]

Bin probability prediction is good; density prediction less so.
Multiple-Histogram Model: Strawman

The naïve emulation strategy treats the histogram counts as multivariate normal and rotates them via PCA to apply independent GPs.

- Adjusts for within-histogram correlation through PCA
- No density estimation

[Figure: strawman predicted bin probabilities (predicted vs. truth by bin).]
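The strawman pipeline (center, rotate via PCA, fit one GP per principal-component score, rotate back) can be sketched in a few lines. Everything below is synthetic: the histograms Y, the kernel hyperparameters, and the two-component truncation are stand-ins, not the talk's actual setup:

```python
import numpy as np

rng = np.random.default_rng(2)

def gp_predict(d_train, y, d_test, length=0.3, var=1.0, noise=1e-4):
    """Posterior mean of a 1-D GP regression with a squared-exponential kernel."""
    k = lambda a, b: var * np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)
    K = k(d_train, d_train) + noise * np.eye(len(d_train))
    return k(d_test, d_train) @ np.linalg.solve(K, y)

# synthetic histograms: rows are inputs d, columns are bin fractions
d = np.linspace(0.0, 1.0, 15)
Y = np.column_stack([np.sin(2 * np.pi * d + j) + rng.normal(0, 0.05, 15)
                     for j in range(6)])

mean = Y.mean(axis=0)
U, S, Vt = np.linalg.svd(Y - mean, full_matrices=False)
scores = U[:, :2] * S[:2]  # keep 2 principal components
pred_scores = np.column_stack([gp_predict(d, scores[:, i], d) for i in range(2)])
Y_hat = mean + pred_scores @ Vt[:2]  # rotate back to bin space
```

The rotation decorrelates the bins so the per-component GPs can be fit independently, which is the appeal; the cost, as noted above, is that no density is ever estimated.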
Thoughts and Future Directions

- Improvement over the strawman will have to come in full density estimation.
- Increasing N (with higher r) could provide more flexibility to avoid strange tail behavior.
- A different (or learned) a_n could lead to other processes.

Future directions:
- Incorporate calibration
- Improve density estimation
- Show some form of posterior consistency as the counts and bins go to infinity
Thank you! Questions?
Works Cited

Adams, R., Murray, I., and MacKay, D. (2009). Nonparametric Bayesian density modeling with Gaussian processes.
G. Aad et al. (2010). Observation of a centrality-dependent dijet asymmetry in lead-lead collisions at √s_NN = 2.76 TeV with the ATLAS detector at the LHC. Physical Review Letters, 105(25).
Higdon, D., Gattiker, J., Williams, B., and Rightley, M. (2008). Computer model calibration using high dimensional output. Journal of the American Statistical Association, 103(482).
Higdon, D., Kennedy, M., Cavendish, J. C., Cafeo, J. A., and Ryne, R. D. (2004). Combining field data and computer simulations for calibration and prediction. SIAM Journal on Scientific Computing, 26(2).
Kundu, S. and Dunson, D. B. (2014). Latent factor models for density estimation. Biometrika, 101(3).
Lenk, P. (1991). Towards a practicable Bayesian nonparametric density estimator. Biometrika, 78(3).
Lenk, P. (2003). Bayesian semiparametric density estimation and model verification using a logistic-Gaussian process. Journal of Computational and Graphical Statistics, 12(3).
Riihimäki, J. and Vehtari, A. (2010). Laplace approximation for logistic Gaussian process density estimation and regression. Bayesian Analysis, 9(2).
Tokdar, S. T., Zhu, Y. M., and Ghosh, J. K. (2010). Bayesian density regression with logistic Gaussian process and subspace projection. Bayesian Analysis, 5(2).
Tokdar, S. T. (2007). Towards a faster implementation of density estimation with logistic Gaussian process priors. Journal of Computational and Graphical Statistics, 16(3).
Appendix: Computer Emulation Covariance Functions

The covariance function c(\cdot, \cdot) is often of the form c(x, x') = \lambda^{-1} r(x - x' | \theta). Examples of r(\cdot | \theta):

Power exponential: r(h | \alpha, l) = e^{-(|h|/l)^\alpha}, where \alpha \in (0, 2].
- Usually we learn l and fix \alpha. Setting \alpha = 2 makes the function infinitely differentiable, which may be undesirable; \alpha = 1.9 is sometimes used for computational stability.

Matérn: r(h | \nu, l) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{|h|}{l}\right)^\nu K_\nu\!\left(\frac{|h|}{l}\right), where K_\nu is the modified Bessel function of the second kind.
- For \nu = n/2 with n \in \mathbb{N}, this has closed form. Most common are \nu = 3/2 and \nu = 5/2:
  \nu = 3/2: r(h | l) = e^{-|h|/l} (1 + |h|/l)
  \nu = 5/2: r(h | l) = e^{-|h|/l} (1 + |h|/l + h^2/(3l^2))

We usually assume a separable covariance function: if x has J dimensions, then r(x - x' | \theta) = \prod_{j=1}^{J} r_j(x_j - x'_j | \theta).
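Both correlation functions are one-liners in SciPy, and the closed forms above can be checked against the general Bessel-function expression. A sketch (the `1e-12` floor is only there to keep K_\nu away from its singularity at zero):

```python
import numpy as np
from scipy.special import gamma, kv

def power_exponential(h, alpha=1.9, l=1.0):
    # r(h | alpha, l) = exp(-(|h|/l)^alpha), alpha in (0, 2]
    return np.exp(-(np.abs(h) / l) ** alpha)

def matern(h, nu=1.5, l=1.0):
    # r(h | nu, l) = 2^(1-nu)/Gamma(nu) * (|h|/l)^nu * K_nu(|h|/l)
    x = np.maximum(np.abs(h) / l, 1e-12)   # avoid K_nu's singularity at 0
    r = 2 ** (1 - nu) / gamma(nu) * x ** nu * kv(nu, x)
    return np.where(np.abs(h) == 0, 1.0, r)

# closed forms at half-integer nu (same l-scaling as the slide, no sqrt(2*nu) factor)
x = 0.7
closed_32 = np.exp(-x) * (1 + x)
closed_52 = np.exp(-x) * (1 + x + x ** 2 / 3)
```

Note this uses the slide's parameterization directly; some libraries instead rescale the argument by \sqrt{2\nu}/l, which changes the closed forms by constants inside the exponential.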
Appendix: Density Estimation Predictive Posterior

The change to a GP prior on the components allows us to predict bin probabilities given a new input d^*. Let Y(\vec{d}) and Y^*(d^*) be the histogram counts for in-sample and out-of-sample inputs, respectively (similarly for X and P). We want [Y^*(d^*) | d^*, Y(\vec{d})]; note that P is a linear transformation of X.

[Y^*(d^*) | d^*, Y(\vec{d})] = \int_X [Y^*(d^*) | X^*(d^*), d^*, Y(\vec{d})] \, [X^*(d^*) | d^*, Y(\vec{d})] \, dX^*

[X^*(d^*) | d^*, Y(\vec{d})] = \int_\Theta \int_X [X^*(d^*) | d^*, Y(\vec{d}), X(\vec{d}), \theta] \, [X(\vec{d}), \theta | d^*, Y(\vec{d})] \, dX \, d\theta

We evaluate these by Monte Carlo integration; note that [X^*(d^*) | d^*, Y(\vec{d}), X(\vec{d}), \theta] is simply a conditional normal, from the GP.
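The Monte Carlo step above amounts to: for each posterior draw of (X(\vec{d}), \theta), form the GP's conditional normal at d^* and sample from it. A sketch of that inner loop, with synthetic stand-ins for the posterior draws (the sine curve, the uniform draw for the length-scale, and all sizes are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)

def conditional_normal(K, idx_obs, idx_new, x_obs):
    """Mean/cov of x_new given x_obs for a zero-mean joint normal with cov K."""
    Koo = K[np.ix_(idx_obs, idx_obs)]
    Kno = K[np.ix_(idx_new, idx_obs)]
    Knn = K[np.ix_(idx_new, idx_new)]
    A = np.linalg.solve(Koo, Kno.T).T
    return A @ x_obs, Knn - A @ Kno.T

d = np.linspace(0.0, 1.0, 11)        # last grid point plays the role of d*
obs, new = np.arange(10), np.array([10])

draws = []
for _ in range(200):
    length = rng.uniform(0.2, 0.6)   # stand-in for a posterior draw of theta
    K = np.exp(-0.5 * ((d[:, None] - d[None, :]) / length) ** 2) + 1e-6 * np.eye(11)
    x_obs = np.sin(2 * np.pi * d[obs]) + rng.normal(0, 0.05, 10)  # stand-in for X(d)
    m, S = conditional_normal(K, obs, new, x_obs)
    draws.append(rng.normal(m, np.sqrt(np.maximum(np.diag(S), 0.0))))
pred = float(np.mean(draws))
```

Averaging the sampled X^*(d^*) values over the posterior draws performs the double integral; pushing each draw through the linear map to P and the count likelihood would complete the prediction of Y^*(d^*).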
Appendix: Density Estimation Emulation Predictions

[Figure: two panels, "Data Across Input" (fraction of bin, with the out-of-sample histogram marked) and "Emulated Values Across Input" (probability of bin, with the out-of-sample prediction). The left plot depicts the bin probability data points, denoting the holdout set, while the right plot depicts emulator predictions.]
More informationNon-Gaussian likelihoods for Gaussian Processes
Non-Gaussian likelihoods for Gaussian Processes Alan Saul University of Sheffield Outline Motivation Laplace approximation KL method Expectation Propagation Comparing approximations GP regression Model
More informationA short introduction to INLA and R-INLA
A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk
More informationBayesian Support Vector Machines for Feature Ranking and Selection
Bayesian Support Vector Machines for Feature Ranking and Selection written by Chu, Keerthi, Ong, Ghahramani Patrick Pletscher pat@student.ethz.ch ETH Zurich, Switzerland 12th January 2006 Overview 1 Introduction
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationEfficient Likelihood-Free Inference
Efficient Likelihood-Free Inference Michael Gutmann http://homepages.inf.ed.ac.uk/mgutmann Institute for Adaptive and Neural Computation School of Informatics, University of Edinburgh 8th November 2017
More informationMark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.
More information12 - Nonparametric Density Estimation
ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6
More informationCurve Fitting Re-visited, Bishop1.2.5
Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the
More informationDesign of Text Mining Experiments. Matt Taddy, University of Chicago Booth School of Business faculty.chicagobooth.edu/matt.
Design of Text Mining Experiments Matt Taddy, University of Chicago Booth School of Business faculty.chicagobooth.edu/matt.taddy/research Active Learning: a flavor of design of experiments Optimal : consider
More informationComputer Vision Group Prof. Daniel Cremers. 4. Gaussian Processes - Regression
Group Prof. Daniel Cremers 4. Gaussian Processes - Regression Definition (Rep.) Definition: A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution.
More informationBayesian Nonparametrics
Bayesian Nonparametrics Peter Orbanz Columbia University PARAMETERS AND PATTERNS Parameters P(X θ) = Probability[data pattern] 3 2 1 0 1 2 3 5 0 5 Inference idea data = underlying pattern + independent
More informationExpectation Propagation for Approximate Bayesian Inference
Expectation Propagation for Approximate Bayesian Inference José Miguel Hernández Lobato Universidad Autónoma de Madrid, Computer Science Department February 5, 2007 1/ 24 Bayesian Inference Inference Given
More informationFactor Analysis and Kalman Filtering (11/2/04)
CS281A/Stat241A: Statistical Learning Theory Factor Analysis and Kalman Filtering (11/2/04) Lecturer: Michael I. Jordan Scribes: Byung-Gon Chun and Sunghoon Kim 1 Factor Analysis Factor analysis is used
More informationState Space Representation of Gaussian Processes
State Space Representation of Gaussian Processes Simo Särkkä Department of Biomedical Engineering and Computational Science (BECS) Aalto University, Espoo, Finland June 12th, 2013 Simo Särkkä (Aalto University)
More informationBayesian Models in Machine Learning
Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of
More informationCSci 8980: Advanced Topics in Graphical Models Gaussian Processes
CSci 8980: Advanced Topics in Graphical Models Gaussian Processes Instructor: Arindam Banerjee November 15, 2007 Gaussian Processes Outline Gaussian Processes Outline Parametric Bayesian Regression Gaussian
More informationHierarchical Modelling for Univariate Spatial Data
Spatial omain Hierarchical Modelling for Univariate Spatial ata Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.
More informationStat 5421 Lecture Notes Proper Conjugate Priors for Exponential Families Charles J. Geyer March 28, 2016
Stat 5421 Lecture Notes Proper Conjugate Priors for Exponential Families Charles J. Geyer March 28, 2016 1 Theory This section explains the theory of conjugate priors for exponential families of distributions,
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining MLE and MAP Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due tonight. Assignment 5: Will be released
More informationCS 7140: Advanced Machine Learning
Instructor CS 714: Advanced Machine Learning Lecture 3: Gaussian Processes (17 Jan, 218) Jan-Willem van de Meent (j.vandemeent@northeastern.edu) Scribes Mo Han (han.m@husky.neu.edu) Guillem Reus Muns (reusmuns.g@husky.neu.edu)
More informationIntroduction. Chapter 1
Chapter 1 Introduction In this book we will be concerned with supervised learning, which is the problem of learning input-output mappings from empirical data (the training dataset). Depending on the characteristics
More informationNew Insights into History Matching via Sequential Monte Carlo
New Insights into History Matching via Sequential Monte Carlo Associate Professor Chris Drovandi School of Mathematical Sciences ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
More informationIntroduction to Geostatistics
Introduction to Geostatistics Abhi Datta 1, Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore,
More informationContents. Part I: Fundamentals of Bayesian Inference 1
Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian
More informationRecent Advances in Bayesian Inference Techniques
Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian
More informationBayesian Modeling of Conditional Distributions
Bayesian Modeling of Conditional Distributions John Geweke University of Iowa Indiana University Department of Economics February 27, 2007 Outline Motivation Model description Methods of inference Earnings
More informationStatistical Approaches to Learning and Discovery
Statistical Approaches to Learning and Discovery Bayesian Model Selection Zoubin Ghahramani & Teddy Seidenfeld zoubin@cs.cmu.edu & teddy@stat.cmu.edu CALD / CS / Statistics / Philosophy Carnegie Mellon
More informationCreating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach. Radford M. Neal, 28 February 2005
Creating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach Radford M. Neal, 28 February 2005 A Very Brief Review of Gaussian Processes A Gaussian process is a distribution over
More information