Efficient Likelihood-Free Inference


1 Efficient Likelihood-Free Inference Michael Gutmann Institute for Adaptive and Neural Computation School of Informatics, University of Edinburgh 8th November 2017

2 Problem statement
Likelihood-free inference: perform statistical inference for models where
1. the likelihood function is too costly to evaluate
2. sampling (simulating) data from the model is possible

3 Importance
Such models and inference problems occur widely:
Neuroscience: simulating neural circuits
Evolutionary biology: simulating evolution
Computer vision: simulating naturalistic scenes
Health science: simulating the spread of an infectious disease
[Figure: strains colonising sampled individuals over time]

4 Program
Background
Previous work
Our approach

5 Program
Background
Previous work
Our approach

6 Assumptions on the models
Only assumption: sampling (simulating) data from the model is possible.
Models specified by a data generating mechanism,
e.g. stochastic nonlinear dynamical systems
e.g. computer models / simulators of some complex physical or biological process
Different communities use different names:
Simulator-based models
Stochastic simulation models
Implicit models
Generative (latent-variable) models
Probabilistic programs

7 Definition of simulator-based models (SBMs)
Let (Ω, F, P) be a probability space. A simulator-based model is a collection of (measurable) functions g(·, θ) parametrised by θ:
ω ∈ Ω ↦ x_θ = g(ω, θ) ∈ X   (1)
For any fixed θ, x_θ = g(·, θ) is a random variable (simulation / sampling).
g(·, θ) is typically not available in closed form.
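
In code, the definition above is just a function that consumes randomness and a parameter value. A minimal Python sketch, using a toy moving-average simulator as a hypothetical stand-in for g (the model and all names are illustrative, not from the talk):

```python
import numpy as np

def g(omega, theta, n=100):
    """Toy simulator: MA(1) process x_t = w_t + theta * w_{t-1}.

    omega (a seeded random generator) plays the role of the random outcome,
    so for fixed theta, x_theta = g(omega, theta) is a draw of a random variable.
    """
    w = omega.normal(size=n + 1)      # the latent randomness
    return w[1:] + theta * w[:-1]     # the observable output x_theta

rng = np.random.default_rng(0)
x_theta = g(rng, theta=0.6)           # sampling is easy...
# ...but for complex simulators, p(x | theta) has no closed form.
```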

8 Strengths of SBMs
Direct implementation of hypotheses of how the observed data were generated.
Neat interface with scientific models (e.g. from physics or biology).
Modelling by replicating the mechanisms of nature that produced the observed/measured data ("analysis by synthesis").
Possibility to perform experiments in silico.

9 Weaknesses of SBMs
Generally elude analytical treatment.
Can easily be made more complicated than necessary.
Statistical inference is difficult.

10 Weaknesses of SBMs
Generally elude analytical treatment.
Can easily be made more complicated than necessary.
Statistical inference is difficult.
Main reason: the likelihood function is too expensive to evaluate.

11 The likelihood function L(θ)
Well defined but generally intractable for SBMs.
Probability that the model generates data like x_o when using parameter value θ.
[Diagram: a data source with unknown properties produces the observation x_o; the model M(θ) generates data in the same data space]

12 Three foundational issues
1. How should we assess whether x_θ ≈ x_o?
2. How should we compute the probability of the event x_θ ≈ x_o?
3. For which values of θ should we compute it?
Likelihood: probability that the model generates data like x_o for parameter value θ.
[Diagram as on the previous slide]

13 Program
Background
Previous work
Our approach

14 Approximate Bayesian computation
For a recent review, see Lintusaari et al (2017), Fundamentals and recent developments in approximate Bayesian computation, Systematic Biology.
1. How should we assess whether x_θ ≈ x_o? → Check whether ||T(x_θ) − T(x_o)|| ≤ ε
2. How should we compute the probability of the event x_θ ≈ x_o? → By counting
3. For which values of θ should we compute it? → Sample from the prior (or other proposal distributions)

15 Approximate Bayesian computation
For a recent review, see Lintusaari et al (2017), Fundamentals and recent developments in approximate Bayesian computation, Systematic Biology.
1. How should we assess whether x_θ ≈ x_o? → Check whether ||T(x_θ) − T(x_o)|| ≤ ε
2. How should we compute the probability of the event x_θ ≈ x_o? → By counting
3. For which values of θ should we compute it? → Sample from the prior (or other proposal distributions)
Difficulties:
Choice of T(·) and ε
Typically high computational cost
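
For concreteness, a minimal sketch of the rejection-ABC recipe above, assuming a hypothetical Gaussian toy simulator, the sample mean as summary statistic T, and a uniform prior (all illustrative choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, n=50):
    return rng.normal(theta, 1.0, size=n)    # stand-in simulator

T = lambda x: x.mean()                       # summary statistic
x_obs = simulate(2.0)                        # pretend observed data
eps = 0.05                                   # tolerance

posterior = []
for _ in range(100_000):
    theta = rng.uniform(-10.0, 10.0)         # 1. sample theta from the prior
    x = simulate(theta)                      # 2. simulate data at theta
    if abs(T(x) - T(x_obs)) <= eps:          # 3. keep theta if summaries match
        posterior.append(theta)

# accepted thetas are approximate posterior samples; most draws are rejected,
# which is exactly the computational-cost difficulty noted above
print(len(posterior), np.mean(posterior))
```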

16 Implicit likelihood approximation
Likelihood: probability to generate data like x_o for parameter value θ.
The likelihood L(θ) is approximated by the proportion of simulated outcomes falling within an ε-ball around x_o:
L(θ) ≈ P_N(d(x_θ, x_o) ≤ ε) = (1/N) Σ_{i=1}^N 1(d(x_θ^(i), x_o) ≤ ε)
[Figure: simulated data sets x_θ^(1), ..., x_θ^(6) in the data space; the "green" outcomes are those inside the ε-ball around x_o]
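
The estimate in the formula above can be written directly; a sketch at a single θ, with the same hypothetical toy simulator and a summary-based distance d (illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
simulate = lambda theta, n=50: rng.normal(theta, 1.0, size=n)
d = lambda x, y: abs(x.mean() - y.mean())    # distance between data sets

x_obs = simulate(2.0)

def L_hat(theta, eps=0.1, N=2000):
    """Fraction of N simulated data sets within eps of x_obs (the 'green' outcomes)."""
    return np.mean([d(simulate(theta), x_obs) <= eps for _ in range(N)])

print(L_hat(2.0), L_hat(3.0))   # larger near the data-generating value
```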

17 Synthetic likelihood (Simon Wood, Nature, 2010)
1. How should we assess whether x_θ ≈ x_o?
2. How should we compute the probability of the event x_θ ≈ x_o?
→ Compute summary statistics t_θ = T(x_θ), model their distribution as a Gaussian, and compute the likelihood function with T(x_o) as the observed data.
3. For which values of θ should we compute it?
→ Use the obtained synthetic likelihood function as part of a Monte Carlo method.

18 Synthetic likelihood (Simon Wood, Nature, 2010)
1. How should we assess whether x_θ ≈ x_o?
2. How should we compute the probability of the event x_θ ≈ x_o?
→ Compute summary statistics t_θ = T(x_θ), model their distribution as a Gaussian, and compute the likelihood function with T(x_o) as the observed data.
3. For which values of θ should we compute it?
→ Use the obtained synthetic likelihood function as part of a Monte Carlo method.
Difficulties:
Choice of T(·)
The Gaussianity assumption may not hold
Typically high computational cost
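
A minimal sketch of the synthetic-likelihood computation at one θ, again with a hypothetical toy simulator and hand-picked summaries (Wood's applications use problem-specific statistics):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)

def simulate(theta, n=100):
    return rng.normal(theta[0], np.exp(theta[1]), size=n)   # stand-in simulator

def T(x):
    return np.array([x.mean(), np.log(x.std())])            # summary statistics

x_obs = simulate(np.array([1.0, 0.0]))

def synthetic_loglik(theta, N=200):
    """Gaussian fit to simulated summaries, evaluated at the observed summaries."""
    t = np.array([T(simulate(theta)) for _ in range(N)])    # t_theta = T(x_theta)
    mu, Sigma = t.mean(axis=0), np.cov(t, rowvar=False)
    return multivariate_normal(mu, Sigma).logpdf(T(x_obs))

# plug synthetic_loglik into e.g. MCMC over theta (point 3 above)
print(synthetic_loglik(np.array([1.0, 0.0])))
```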

19 Program
Background
Previous work
Our approach

20 Overview of some of my work
1. How should we assess whether x_θ ≈ x_o? → Use classification (Gutmann et al, 2014, 2017)
2. How should we compute the probability of the event x_θ ≈ x_o? → Use density ratio estimation (Dutta et al, 2016)
3. For which values of θ should we compute it? → Use Bayesian optimisation (Gutmann and Corander, JMLR, 2016)

21 Overview of some of my work
1. How should we assess whether x_θ ≈ x_o? → Use classification (Gutmann et al, 2014, 2017)
Basic idea: classification accuracy (discriminability) serves as a distance measure. Value of 1: far; value of 1/2: close.
[Figure: two scatter plots of observed vs simulated data; with simulated mean (6,0) the classes are well separated (accuracy near 1), with simulated mean (1/2,0) they overlap (accuracy near 1/2)]
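
A minimal sketch of the classification idea behind the two panels described above, using a k-nearest-neighbour classifier with a hold-out split (the classifier choice here is illustrative; the papers study several):

```python
import numpy as np

rng = np.random.default_rng(4)

def discriminability(x_obs, x_sim, k=5):
    """Hold-out accuracy of a k-NN classifier separating observed from simulated.

    Near 1: the data sets are far apart; near 1/2: indistinguishable (close).
    """
    X = np.concatenate([x_obs, x_sim])
    y = np.concatenate([np.zeros(len(x_obs)), np.ones(len(x_sim))])
    idx = rng.permutation(len(X))
    tr, te = idx[:len(X) // 2], idx[len(X) // 2:]
    correct = 0
    for i in te:
        dists = np.linalg.norm(X[tr] - X[i], axis=1)      # distances to training points
        vote = y[tr][np.argsort(dists)[:k]].mean() > 0.5  # majority vote of k neighbours
        correct += (vote == y[i])
    return correct / len(te)

x_obs = rng.normal([0.0, 0.0], 1.0, size=(200, 2))
print(discriminability(x_obs, rng.normal([6.0, 0.0], 1.0, size=(200, 2))),   # ~1
      discriminability(x_obs, rng.normal([0.5, 0.0], 1.0, size=(200, 2))))   # ~1/2
```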

22 Overview of some of my work
2. How should we compute the probability of the event x_θ ≈ x_o? → Use density ratio estimation (Dutta et al, 2016)
Basic idea: frame posterior estimation as a ratio estimation problem,
p(θ|x) = p(x|θ) p(θ) / p(x) = r(x, θ) p(θ)   (2)
An estimate r̂(x, θ) yields estimates of the likelihood function and the posterior:
L̂(θ) ∝ r̂(x_o, θ),   p̂(θ|x_o) = r̂(x_o, θ) p(θ)   (3)
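
A minimal sketch of the ratio-estimation idea: a probabilistic classifier trained to separate data simulated at θ (class 1) from data simulated from the marginal, i.e. at prior draws of θ (class 0), has log r(x, θ) as its optimal logit. The toy simulator, features and prior below are illustrative; this mirrors the logistic-regression approach of the paper, though the details (features, penalisation) differ:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
prior = lambda m: rng.uniform(-5.0, 5.0, size=m)
simulate = lambda th: rng.normal(th, 1.0, size=50)
feats = lambda x: np.array([x.mean(), x.std()])     # features of one data set

x_obs = simulate(2.0)

def log_r(theta, m=500):
    """Estimate log r(x_obs, theta): logit of a classifier separating
    summaries simulated at theta (1) from summaries under the marginal (0)."""
    t1 = np.array([feats(simulate(theta)) for _ in range(m)])
    t0 = np.array([feats(simulate(th)) for th in prior(m)])
    clf = LogisticRegression().fit(np.vstack([t1, t0]),
                                   np.r_[np.ones(m), np.zeros(m)])
    return clf.decision_function(feats(x_obs)[None, :])[0]

# with a flat prior, the posterior (3) is proportional to r(x_obs, theta)
grid = np.linspace(-5.0, 5.0, 21)
log_post = [log_r(th) for th in grid]
print(grid[int(np.argmax(log_post))])               # peaks near theta = 2
```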

23 Overview of some of my work
1. How should we assess whether x_θ ≈ x_o? → Use classification (Gutmann et al, 2014, 2017)
2. How should we compute the probability of the event x_θ ≈ x_o? → Use density ratio estimation (Dutta et al, 2016)
3. For which values of θ should we compute it? → Use Bayesian optimisation (Gutmann and Corander, JMLR, 2016)

24 Overview of some of my work
1. How should we assess whether x_θ ≈ x_o? → Use classification (Gutmann et al, 2014, 2017)
2. How should we compute the probability of the event x_θ ≈ x_o? → Use density ratio estimation (Dutta et al, 2016)
3. For which values of θ should we compute it? → Use Bayesian optimisation (Gutmann and Corander, JMLR, 2016)

25 Why is the ABC algorithm so expensive?
1. It rejects most samples when ε is small
2. It does not make assumptions about the shape of L(θ)
3. It does not use all the information available
4. It aims at equal accuracy for all parameters
[Figure and estimate as on slide 16: L(θ) ≈ P_N(d(x_θ, x_o) ≤ ε) = (1/N) Σ_{i=1}^N 1(d(x_θ^(i), x_o) ≤ ε)]

26 Proposed solution (Gutmann and Corander, JMLR, 2016)
1. It rejects most samples when ε is small → Don't reject samples, learn from them
2. It does not make assumptions about the shape of L(θ) → Model the distances, assuming the average distance is smooth
3. It does not use all the information available → Use Bayes' theorem to update the model
4. It aims at equal accuracy for all parameters → Prioritise parameter regions with small distances
An equivalent strategy applies to inference with the synthetic likelihood.

27 Modelling (points 1 & 2)
Data are tuples (θ_i, d_i), where d_i = d(x_θ^(i), x_o).
Model the conditional distribution of d given θ.
The estimated model yields an approximation L̂(θ) for any choice of ε:
L̂(θ) ∝ P̂(d ≤ ε | θ)
where P̂ is the probability under the estimated model.
Here: use a (log) Gaussian process with squared exponential covariance function as the model.
The approach is not restricted to this model or to Gaussian processes (for a comparison of different GP models, see Järvenpää et al, 2016).
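
A minimal sketch of this modelling step: GP regression on (θ_i, log d_i) with a squared exponential kernel and fixed hyperparameters (hyperparameters would normally be learned), then L̂(θ) ∝ P̂(d ≤ ε | θ) via the Gaussian CDF of log d. The toy distance data are illustrative:

```python
import numpy as np
from scipy.stats import norm

def sqexp(a, b, ell=1.0, sf=1.0):
    """Squared exponential covariance between 1-D parameter values."""
    return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def gp_posterior(th_tr, y_tr, th_te, sn=0.1):
    """Posterior mean/variance of a zero-mean GP with fixed hyperparameters."""
    K = sqexp(th_tr, th_tr) + sn**2 * np.eye(len(th_tr))
    Ks = sqexp(th_tr, th_te)
    mu = Ks.T @ np.linalg.solve(K, y_tr)
    v = sqexp(th_te, th_te).diagonal() - np.einsum('ij,ij->j', Ks, np.linalg.solve(K, Ks))
    return mu, v

rng = np.random.default_rng(6)
th_tr = rng.uniform(-5.0, 5.0, size=30)                       # acquired thetas
logd_tr = np.log(np.abs(th_tr - 2.0) + 0.5) + 0.1 * rng.normal(size=30)

grid = np.linspace(-5.0, 5.0, 200)
mu, v = gp_posterior(th_tr, logd_tr, grid)
L_hat = norm.cdf((np.log(0.8) - mu) / np.sqrt(v + 0.1**2))    # P(d <= eps | theta)
print(grid[int(np.argmax(L_hat))])                            # near theta = 2
```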

28 Data acquisition (points 3 & 4)
Samples of θ could be obtained by sampling from the prior or some adaptively constructed proposal distribution.
Instead, give priority to regions in the parameter space where the distance d tends to be small; use Bayesian optimization to find such regions.
Here: use the lower confidence bound acquisition function (e.g. Cox and John, 1992; Srinivas et al, 2012),
A_t(θ) = μ_t(θ) − √(η_t² v_t(θ))   (4)
where μ_t is the posterior mean, v_t the posterior variance, η_t² a weight, and t the number of samples acquired so far.
The approach is not restricted to this acquisition function (for a new acquisition function, see Järvenpää et al, 2017).
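
Given the GP posterior mean μ_t and variance v_t (e.g. from a fit like the sketch above), the acquisition rule (4) is one line; the η_t² schedule below follows the GP-UCB literature and is one of several reasonable choices (illustrative values throughout):

```python
import numpy as np

def next_theta(grid, mu_t, v_t, t, d=1, delta=0.1):
    """Lower confidence bound rule (4): acquire where the distance is plausibly small."""
    eta2 = 2.0 * np.log(t**(d / 2 + 2) * np.pi**2 / (3.0 * delta))   # GP-UCB style weight
    A = mu_t - np.sqrt(eta2 * v_t)        # small mean and/or large variance wins
    return grid[int(np.argmin(A))]

# hypothetical posterior of the distance model on a grid of thetas
grid = np.linspace(-5.0, 5.0, 200)
mu_t = np.abs(grid - 2.0)                 # pretend posterior mean of the distance
v_t = 0.5 + 0.5 * np.cos(grid)**2         # pretend posterior variance
print(next_theta(grid, mu_t, v_t, t=10))  # simulate here next, then update the GP
```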

29 Bayesian optimization for likelihood-free inference
[Figure: a GP model of the distance as a function of the competition parameter (posterior mean and 5% to 95% bands), updated via Bayes' theorem as data points are acquired (2, 3, 4 points); the acquisition function trades off exploration against exploitation to choose the next parameter to try]

30 Example: bacterial infections in child care centers
The likelihood is intractable for cross-sectional data, but generating data from the model is possible (Numminen et al, 2013).
Parameters of interest:
- rate of infections within a center
- rate of infections from outside
- competition between the strains
[Figure: strains colonising sampled individuals over time]

31 Example: bacterial infections in child care centers
Comparison of the proposed approach with a standard population Monte Carlo ABC approach: roughly equal results using 1000 times fewer simulations (4.5 days with 200 cores vs 90 minutes with seven cores).
[Figure: posterior for the competition parameter vs computational cost (log10), developed fast method vs standard method; posterior means shown as solid lines, credibility intervals as shaded areas or dashed lines]
(Gutmann and Corander, JMLR, 2016)

32 Example: bacterial infections in child care centers
Comparison of the proposed approach with a standard population Monte Carlo ABC approach: roughly equal results using 1000 times fewer simulations.
[Figure: posteriors for the internal and external infection parameters vs computational cost (log10), developed fast method vs standard method; posterior means shown as solid lines, credibility intervals as shaded areas or dashed lines]

33 Benefits
The proposed method makes the inference more efficient:
Allowed us to perform a far more comprehensive data analysis than the standard approach (Numminen et al, 2016)
Enables inference for models which were out of reach until now, e.g. a model of evolution where simulating a single data set took hours (Marttinen et al, 2015)
Enables easier assessment of parameter identifiability for complex models, e.g. a model of the transmission dynamics of tuberculosis (Lintusaari et al, 2016)

34 Open questions
Model: how to best model the distance between simulated and observed data?
Acquisition function: can we find strategies which are optimal for parameter inference?
Efficient high-dimensional inference: can we use the approach to infer the joint distribution of 1000 variables?
For a discussion and first answers, see Gutmann and Corander, JMLR, 2016.

35 Summary
Topic: inference for models where the likelihood is intractable but sampling is possible
Inference principle: find parameter values for which the distance between simulated and observed data is small
Problem considered: computational cost
Proposed approach: combine statistical modeling of the distance with decision making under uncertainty (Bayesian optimization)
Outcome: the approach increases the efficiency of the inference by several orders of magnitude
