Bayesian Evidence and Model Selection: A Tutorial in Two Acts


1 Bayesian Evidence and Model Selection: A Tutorial in Two Acts
Kevin H. Knuth
Depts. of Physics and Informatics, University at Albany, Albany NY USA
Based on the paper: Knuth K.H., Habeck M., Malakar N.K., Mubeen A.M., Placek B. 2015. Bayesian evidence and model selection. In press at Digital Signal Processing. doi:10.1016/j.dsp.2015.06.012
DOWNLOAD TALK NOW: Google "knuthlab", click "Talks"

2 This tutorial follows the paper: Knuth K.H., Habeck M., Malakar N.K., Mubeen A.M., Placek B. 2015. Bayesian evidence and model selection. In press at Digital Signal Processing. doi:10.1016/j.dsp.2015.06.012
References are not provided in the talk slides; please consult the paper.
Equations in the talk are numbered in accordance with the paper.
When referencing anything from Act 1 of this talk, please reference the paper.
When referencing anything from Act 2 of this talk, please reference the slides.

3 Outline
Bayesian Evidence
  Odds Ratios
  Evidence, Model Order and Priors
Numerical Techniques
  Laplace Approximation
  Importance Sampling
  Annealed Importance Sampling
  Variational Bayes
  Nested Sampling
Applications
  Signal Detection : Brain Computer Interface / Neuroscience
  Sensor Characterization : Robotics / Signal Processing
  Exoplanet Characterization : Astronomy / Astrophysics
Examples
  Nested Sampling Demo
  Nested Sampling and Phase Transitions


5 Bayesian Evidence

6 Bayesian Evidence : Odds Ratios
Bayes' Theorem:
    P(m | d, M, I) = P(m | M, I) P(d | m, M, I) / P(d | M, I)
where P(m | M, I) is the prior probability, P(d | m, M, I) is the likelihood, P(m | d, M, I) is the posterior probability, and P(d | M, I) is the evidence, or marginal likelihood.
M represents a class of models described by a set of model parameters.
m represents a particular model, defined by a set of particular model parameter values.
d represents the acquired data.

7 Bayesian Evidence : Odds Ratios
Bayesian Evidence
The Bayesian evidence can be found by marginalizing the joint distribution P(m, d | M, I) over all model parameter values:
    P(d | M, I) = ∫ dm P(m | M, I) P(d | m, M, I)
M represents a class of models described by a set of model parameters.
m represents a particular model, defined by a set of particular model parameter values.
d represents the acquired data.
I represents the dependence on any relevant prior information.
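
To make the marginalization concrete, here is a minimal numerical sketch (not from the talk; the data, the noise level σ = 0.5, and the uniform prior support are illustrative assumptions). It computes the evidence of a one-parameter model with a Gaussian likelihood by quadrature over the parameter:

    import numpy as np

    # One-parameter model: d_i ~ N(x, sigma), uniform prior on x.
    d = np.array([1.2, 0.9, 1.1, 1.4, 0.8])       # hypothetical data
    sigma = 0.5                                    # assumed known noise level
    x_min, x_max = -5.0, 5.0                       # prior support
    x = np.linspace(x_min, x_max, 10001)

    prior = np.ones_like(x) / (x_max - x_min)      # uniform prior density

    # Likelihood L(x) = prod_i N(d_i | x, sigma), evaluated on the grid
    log_like = np.sum(-0.5 * ((d[:, None] - x[None, :]) / sigma) ** 2
                      - np.log(sigma * np.sqrt(2 * np.pi)), axis=0)

    # Evidence Z = integral dx P(x | M, I) P(d | x, M, I), by quadrature
    Z = np.trapz(prior * np.exp(log_like), x)
    print("evidence Z =", Z, " ln Z =", np.log(Z))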

8 Bayesian Evidence : Odds Ratios
Model Comparison
The ratio of probabilities of two models given the data follows from Bayes' Theorem:
    P(M1 | d, I) / P(M2 | d, I) = [P(M1 | I) / P(M2 | I)] × [P(d | M1, I) / P(d | M2, I)]
If the prior probabilities of the models are equal, this reduces to the ratio of evidences.

9 Bayesian Evidence : Odds Ratios
Odds Ratio or Bayes Factor
The ratio of probabilities of the models given the data is the odds ratio:
    OR = P(M1 | d, I) / P(M2 | d, I) = [P(M1 | I) / P(M2 | I)] × [P(d | M1, I) / P(d | M2, I)]
The ratio of evidences on the right is known as the Bayes factor; with equal model priors the odds ratio equals the Bayes factor.

10 Bayesian Evidence : Evidence, Model Order and Priors
Evidence: Model Order and Priors
It is instructive to see how the evidence depends on both the model order and the prior probabilities.
Consider a model with a single parameter x ∈ [x_min, x_max], with a prior width of Δx = x_max − x_min.
Define the effective width of the likelihood,
    δx = (1 / L_max) ∫ dx L(x),
where L(x) = P(d | x, M, I) and L_max is the maximum likelihood value.

11 Bayesian Evidence : Evidence, Model Order and Priors
Model with a Single Parameter
Consider a model with a single parameter x ∈ [x_min, x_max], with a prior width of Δx = x_max − x_min and a uniform prior P(x | M, I) = 1/Δx.
Given the effective width δx, the evidence is
    Z = P(d | M, I) = (1/Δx) ∫ dx L(x) = L_max (δx / Δx).

12 Bayesian Evidence : Evidence, Model Order and Priors
Occam Factor
The evidence is proportional to the ratio of the effective width of the likelihood to the width of the prior:
    Z = L_max (δx / Δx).
The ratio δx/Δx is called the Occam factor, after Occam's Razor:
"Non sunt multiplicanda entia sine necessitate"
("Entities must not be multiplied beyond necessity") - William of Ockham

13 Bayesian Evidence : Evidence, Model Order and Priors
Model Order
For models with multiple parameters this generalizes to the ratio of the volume of models compatible with both the data and the prior to the prior volume.
If we assume that each of the K parameters has prior width Δx and effective width δx, then the Occam factor scales as (δx/Δx)^K.
As model parameters are added, eventually one fits the data asymptotically well, so that δx attains a maximum value and further model parameters can only decrease the Occam factor.
Thus if we increase the flexibility of our model by introducing more model parameters, we reduce the Occam factor.

14 Bayesian Evidence : Evidence, Model Order and Priors
Odds Ratios and Occam Factors
We compute the odds ratio for a model M0 without model parameters against a model M1 with a single model parameter:
    OR = [P(M0 | I) / P(M1 | I)] × [L(0) / L_max(1)] × (δx/Δx)^(-1),
the product of the prior ratio, the maximum-likelihood ratio, and the inverse Occam factor.
The likelihood ratio is a classical statistic in frequentist model selection. If we only consider the likelihood ratio in model comparison problems, we fail to acknowledge the importance of Occam factors.
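
The following sketch (an illustration, not the talk's code) makes the Occam factor tangible: it compares a zero-parameter model M0 (x fixed at 0) against a one-parameter model M1 using Z1 = L_max (δx/Δx); the data and prior range are hypothetical:

    import numpy as np

    # Compare M0 (no free parameters: x fixed at 0) with M1 (x free),
    # using the single-parameter evidence Z1 = L_max * (dx_eff / dx_prior).
    d = np.array([1.2, 0.9, 1.1, 1.4, 0.8])       # hypothetical data
    sigma = 0.5

    def log_like(x):
        return np.sum(-0.5 * ((d - x) / sigma) ** 2
                      - np.log(sigma * np.sqrt(2 * np.pi)))

    # M0: the evidence is just the likelihood at the fixed parameter value.
    log_Z0 = log_like(0.0)

    # M1: x free, uniform prior on [-5, 5].
    x = np.linspace(-5.0, 5.0, 10001)
    ll = np.array([log_like(xi) for xi in x])
    dx_eff = np.trapz(np.exp(ll - ll.max()), x)    # effective width: (1/L_max) int L dx
    dx_prior = x[-1] - x[0]
    log_Z1 = ll.max() + np.log(dx_eff / dx_prior)  # ln Z1 = ln L_max + ln(Occam factor)

    print("Occam factor        =", dx_eff / dx_prior)
    print("ln OR (M0 over M1)  =", log_Z0 - log_Z1)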

15 Numerical Techniques

16 Numerical Techniques
There are a wide variety of techniques that can be used to estimate the Bayesian evidence:
  Laplace Approximation
  Importance Sampling
  Path Sampling
  Thermodynamic Integration
  Simulated Annealing
  Annealed Importance Sampling
  Variational Bayes (Ensemble Learning)
  Nested Sampling

17 Numerical Techniques : Laplace Approximation
The Laplace Approximation is a simple and useful method for approximating a unimodal probability density function with a Gaussian.
Consider a function p(x) with a peak at x = x0.
We write a Taylor series expansion of ln p(x) about x = x0:
    ln p(x) ≈ ln p(x0) + (d/dx ln p(x))|x0 (x − x0) + (1/2) (d²/dx² ln p(x))|x0 (x − x0)² + ...
Since the first derivative vanishes at the peak, this simplifies to
    ln p(x) ≈ ln p(x0) + (1/2) (d²/dx² ln p(x))|x0 (x − x0)².

18 Numerical Techniques : Laplace Approximation
Previously, we had
    ln p(x) ≈ ln p(x0) + (1/2) (d²/dx² ln p(x))|x0 (x − x0)².
By defining
    σ² = −[(d²/dx² ln p(x))|x0]^(-1),
we can write
    ln p(x) ≈ ln p(x0) − (x − x0)² / (2σ²).

19 Numerical Techniques : Laplace Approximation
Previously, we had
    ln p(x) ≈ ln p(x0) − (x − x0)² / (2σ²).
By taking the exponential we can approximate the density by
    p(x) ≈ p(x0) exp(−(x − x0)² / (2σ²)),
with an integral (evidence) of
    Z ≈ p(x0) √(2πσ²).

20 Numerical Techniques : Laplace Approximation
In the case of a multidimensional posterior we have
    p(m) ≈ p(m0) exp(−(1/2) (m − m0)ᵀ Σ⁻¹ (m − m0)),
where
    Σ⁻¹ = −∇∇ ln p(m)|m0
is the negative Hessian of ln p evaluated at the peak. The evidence is then
    Z ≈ p(m0) (2π)^(K/2) |Σ|^(1/2).
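
As a worked illustration (assumed model and numbers, not from the talk), the sketch below forms the Laplace estimate for the same one-parameter setup: locate the peak, estimate the curvature by finite differences, and integrate the matched Gaussian analytically:

    import numpy as np

    # Laplace estimate: find the posterior peak, measure its curvature, and
    # integrate the matched Gaussian analytically.
    d = np.array([1.2, 0.9, 1.1, 1.4, 0.8])       # hypothetical data
    sigma = 0.5
    x_min, x_max = -5.0, 5.0                       # uniform prior support

    def log_post(x):
        """ln[prior * likelihood] (unnormalized posterior)."""
        log_prior = -np.log(x_max - x_min)
        log_like = np.sum(-0.5 * ((d - x) / sigma) ** 2
                          - np.log(sigma * np.sqrt(2 * np.pi)))
        return log_prior + log_like

    # Locate the peak on a grid (a simple stand-in for a proper optimizer).
    xs = np.linspace(x_min, x_max, 20001)
    lp = np.array([log_post(x) for x in xs])
    x0 = xs[lp.argmax()]

    # sigma^2 = -[d^2 ln p / dx^2]^{-1} at the peak, by central differences.
    h = 1e-4
    d2 = (log_post(x0 + h) - 2 * log_post(x0) + log_post(x0 - h)) / h ** 2
    s2 = -1.0 / d2

    # Z ~ p(x0) * sqrt(2 pi sigma^2)
    print("Laplace log Z =", log_post(x0) + 0.5 * np.log(2 * np.pi * s2))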

21 Numerical Techniques : Importance Sampling
Importance Sampling allows one to find expectation values with respect to one distribution p(x) by computing expectation values with respect to a second distribution q(x) that is easier to sample from.
The expectation value of f(x) with respect to p(x) is given by
    ⟨f⟩_p = ∫ dx f(x) p(x).
One can write p(x) as [p(x)/q(x)] q(x), as long as q(x) is non-zero wherever p(x) is non-zero.

22 Numerical Techniques : Importance Sampling
Writing p(x) as [p(x)/q(x)] q(x), we have:
    ⟨f⟩_p = ∫ dx f(x) [p(x)/q(x)] q(x).
As long as the ratio p(x)/q(x) does not attain extreme values, we can estimate this with samples {x_i} drawn from q(x) by
    ⟨f⟩_p ≈ (1/N) Σ_i f(x_i) p(x_i)/q(x_i).
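
A minimal sketch of this estimator, assuming a standard normal target p and a wider normal proposal q (both choices are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    # Target p(x): standard normal. Proposal q(x): a wider normal, easy to sample.
    def p(x):
        return np.exp(-0.5 * x ** 2) / np.sqrt(2 * np.pi)

    def q(x):
        s = 3.0
        return np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2 * np.pi))

    N = 100_000
    x = rng.normal(0.0, 3.0, size=N)               # samples from q
    w = p(x) / q(x)                                # importance weights p/q

    # Estimate <x^2>_p; the true value is 1 for a standard normal.
    print("importance-sampling estimate of <x^2>_p:", np.mean(x ** 2 * w))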

23 Numerical Techniques : Importance Sampling
Importance sampling can be used to compute ratios of evidence values in a similar fashion. Writing the unnormalized densities as p*(x) and q*(x), with evidences Z_p = ∫ dx p*(x) and Z_q = ∫ dx q*(x),
    Z_p / Z_q = (1/Z_q) ∫ dx [p*(x)/q*(x)] q*(x) = ⟨p*/q*⟩_q.

24 Numerical Techniques : Importance Sampling
The evidence ratio can be estimated by sampling from q(x),
    Z_p / Z_q ≈ (1/N) Σ_i p*(x_i)/q*(x_i),
as long as p(x) is sufficiently close to q(x) to avoid extreme ratios p(x)/q(x).

25 Numerical Techniques : Variational Bayes
Variational Bayes, also known as ensemble learning, relies on approximating the posterior P(m | d, M, I) with another distribution Q(m).
By defining the negative free energy
    F[Q] = ∫ dm Q(m) ln[P(d, m | M, I) / Q(m)]
and the Kullback-Leibler (KL) divergence
    KL[Q ‖ P] = ∫ dm Q(m) ln[Q(m) / P(m | d, M, I)],
we can write
    ln P(d | M, I) = F[Q] + KL[Q ‖ P].

26 Numerical Techniques : Variational Bayes
With this expression in hand,
    ln P(d | M, I) = F[Q] + KL[Q ‖ P],
and since KL[Q ‖ P] ≥ 0, we can show that the negative free energy is a lower bound to the log evidence:
    F[Q] ≤ ln P(d | M, I).
By maximizing the negative free energy (equivalently, minimizing the KL divergence), we can approximate the evidence.

27 Numerical Techniques : Variational Bayes
By choosing a distribution Q(m) that factorizes into
    Q(m) = Q0(m0) Q1(m1),
where the set of parameters m0 is disjoint from m1, we can maximize the negative free energy and estimate the evidence by iteratively choosing
    Q0(m0) ∝ exp(⟨ln P(d, m | M, I)⟩_Q1),
where ⟨ · ⟩_Q1 denotes the expectation over m1 with respect to Q1 (and similarly for Q1).
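
A minimal sketch (illustrative, not the talk's method) that checks the bound F[Q] ≤ ln P(d | M, I) by Monte Carlo for a one-parameter conjugate Gaussian model, where a quadrature reference for ln Z is easy to compute; the model and the trial distributions Q are assumptions:

    import numpy as np

    rng = np.random.default_rng(1)

    # Conjugate toy model: prior x ~ N(0, tau^2), data d_i ~ N(x, sigma^2).
    d = np.array([1.2, 0.9, 1.1, 1.4, 0.8])       # hypothetical data
    sigma, tau = 0.5, 2.0

    def log_joint(x):
        """ln P(d, x) = ln prior + ln likelihood, for an array of x values."""
        x = np.atleast_1d(x)
        lp = -0.5 * (x / tau) ** 2 - np.log(tau * np.sqrt(2 * np.pi))
        ll = np.sum(-0.5 * ((d[:, None] - x[None, :]) / sigma) ** 2
                    - np.log(sigma * np.sqrt(2 * np.pi)), axis=0)
        return lp + ll

    # Quadrature reference for ln Z, available because the model is 1-D.
    xs = np.linspace(-10, 10, 20001)
    log_Z = np.log(np.trapz(np.exp(log_joint(xs)), xs))

    # Trial distribution Q(x) = N(mu, s^2); F[Q] = E_Q[ ln P(d, x) - ln Q(x) ].
    def neg_free_energy(mu, s, n=100_000):
        x = rng.normal(mu, s, size=n)
        log_q = -0.5 * ((x - mu) / s) ** 2 - np.log(s * np.sqrt(2 * np.pi))
        return np.mean(log_joint(x) - log_q)

    print("reference ln Z         :", log_Z)
    print("F[Q], crude Q          :", neg_free_energy(0.0, 1.0))          # well below ln Z
    print("F[Q], Q near posterior :", neg_free_energy(np.mean(d), 0.22))  # close to ln Z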

28 Numerical Techniques : Nested Sampling
Nested Sampling was developed by John Skilling to stochastically integrate the posterior probability to obtain the evidence. Posterior estimates are used to obtain model parameter estimates.
Nested sampling aims to estimate the cumulative distribution function of the density of states (DOS), which is the prior probability mass enclosed within a likelihood boundary.

29 Numerical Techniques : Nested Sampling
Given a likelihood value L, one can find the enclosed prior mass
    X(L) = ∫_{L(m) > L} dm P(m | M, I),
the prior mass of those states whose likelihood is greater than L.
(Figure: nested likelihood contours in parameter space.)

30 Numerical Techniques : Nested Sampling
One can then estimate the evidence via stochastic integration, using samples distributed according to the prior, as the likelihood integrated over the prior mass:
    Z = ∫₀¹ L(X) dX.

31 Numerical Techniques : Nested Sampling
One begins with a set of N samples drawn from the prior.
Use the sample with the lowest likelihood to define an implicit likelihood boundary; discarding it results in an average fractional decrease of the prior volume of 1/N.
Sample from the prior (uniformly is easiest) from within the implicit likelihood boundary to maintain N samples.
To estimate the evidence Z, keep track of the accumulated sum
    Z ≈ Σ_i L_i (X_{i−1} − X_i),
where X_i is the prior volume remaining after iteration i.
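
A minimal sketch of this loop for the one-parameter Gaussian-likelihood model used earlier (all settings are assumptions); the constrained prior draw is done by simple rejection, which is fine in one dimension but hopeless in high dimensions, and that difficulty is exactly the point of the variants discussed on a later slide:

    import numpy as np

    rng = np.random.default_rng(2)

    # Minimal nested-sampling loop for a one-parameter model with a uniform prior.
    d = np.array([1.2, 0.9, 1.1, 1.4, 0.8])       # hypothetical data
    sigma = 0.5
    x_min, x_max = -5.0, 5.0

    def log_like(x):
        return np.sum(-0.5 * ((d - x) / sigma) ** 2
                      - np.log(sigma * np.sqrt(2 * np.pi)))

    N = 100                                        # number of live samples
    live_x = rng.uniform(x_min, x_max, N)          # initial samples from the prior
    live_ll = np.array([log_like(x) for x in live_x])

    log_Z = -np.inf
    log_X = 0.0                                    # log remaining prior volume, X_0 = 1
    for i in range(600):
        worst = live_ll.argmin()                   # lowest-likelihood live sample
        log_X_new = log_X - 1.0 / N                # volume shrinks by a factor exp(-1/N)
        log_w = np.log(np.exp(log_X) - np.exp(log_X_new))    # shell width X_{i-1} - X_i
        log_Z = np.logaddexp(log_Z, live_ll[worst] + log_w)  # Z += L_i * (X_{i-1} - X_i)
        log_X = log_X_new
        # Replace the worst sample with a prior draw inside the likelihood
        # boundary (simple rejection; only viable in low dimensions).
        L_star = live_ll[worst]
        while True:
            x_new = rng.uniform(x_min, x_max)
            ll_new = log_like(x_new)
            if ll_new > L_star:
                break
        live_x[worst], live_ll[worst] = x_new, ll_new

    # Termination: credit the remaining live points with the leftover volume.
    log_Z = np.logaddexp(log_Z, log_X + live_ll.max()
                         + np.log(np.mean(np.exp(live_ll - live_ll.max()))))
    print("nested-sampling log Z ~", log_Z)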

32 Numerical Techniques : Nested Sampling
Note how the prior volume contracts by a fraction 1/N each time.
Early steps contribute little to the integral Z, since the likelihood is very low.
Later steps contribute little to Z, since the prior volume change is very small.
The steps that contribute most are in the middle of the sequence.

33 Numerical Techniques : Nested Sampling
Since nested sampling contracts along the prior volume, it is relatively unaffected by local maxima in the evidence (phase transitions). (See Figure A.)
Methods based on tempering, such as simulated annealing, follow the slope of the log L curve and, as such, get stuck at phase transitions. (See Figure B.)

34 Numerical Techniques : Nested Sampling
The great challenge is sampling uniformly (from the prior) within the implicit likelihood boundaries.
Several versions of Nested Sampling now exist:
  MultiNest (developed by Feroz and Hobson): clusters samples (K-means) and fits the clusters with ellipsoids, then samples uniformly from within those ellipsoids. Very fast, with excellent performance for multi-modal distributions. Clustering limits this to tens of parameters, and the ellipsoids may not cover the high-likelihood regions.
  Galilean Monte Carlo (developed by Feroz and Skilling): moves a new sample with momentum, reflecting off of log L boundaries. Excellent at handling ridges, both angled and curved.
  Constrained Hamiltonian Monte Carlo (developed by M. Betancourt): similar to Galilean Monte Carlo.
  Diffusive Nested Sampling (developed by Brewer): allows samples to diffuse to lower-likelihood nested levels and takes a weighted average.
  Nested Sampling with Demons (developed by M. Habeck): utilizes demon variables that smooth the constraint boundary and push the samples away from it.


36 Applications : Signal Detection
Signal Detection
Brain Computer Interface / Neuroscience

37 Applications : Signal Detection
Signal Detection (Mubeen and Knuth)
We consider a practical signal detection problem where the log odds-ratio can be analytically derived. The specific application was originally the detection of evoked brain responses.
The signal-absent case models the recording x in channel m as noise:
    x_m(t) = n_m(t).
The signal-present case models the recording x in channel m as signal plus noise:
    x_m(t) = α C_m s(t) + n_m(t),
where the signal s(t) has an amplitude parameter α and can be coupled differently to different detectors (via the coupling coefficients C).

38 Applications : Signal Detection
Considering the Evidence
The odds ratio can be written as the ratio of the signal-present and signal-absent evidences:
    OR = P(d | M_signal, I) / P(d | M_noise, I).
For the noise-only case, the evidence is simply the likelihood (Gaussian), since there are no model parameters to marginalize over.

39 Applications : Signal Detection
Considering the Evidence
In the signal-plus-noise case, the evidence requires marginalizing over the amplitude α:
    P(d | M_signal, I) = ∫ dα P(α | M_signal, I) P(d | α, M_signal, I),
assigning a Gaussian likelihood and a Gaussian prior for α.

40 Applications : Signal Detection
Considering the Evidence
With a Gaussian likelihood and a Gaussian prior for α, the marginalization is analytic, and the evidence can be written in closed form in terms of a quantity E (Eq. (86) of the paper) that collects the data-dependent terms; see the paper for the full expressions.

41 Applications : Signal Detection
Considering the Evidence
If the signal amplitude must be positive, α ∈ (0, +∞), one obtains one form of the log odds-ratio detection filter; if the amplitude can be positive or negative, α ∈ (−∞, +∞), one obtains another. (The two closed-form filters are given in the paper.)

42 Applications : Signal Detection
Considering the Evidence
The expression E (Eq. (86)) contains the cross-correlation term, which is what is typically used for the detection of a target signal in ongoing recordings.
The log-OR detection filters incorporate more information, which leads to extra terms that serve to aid in target signal detection.
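
To ground the comparison, here is a minimal sliding cross-correlation detector (the baseline statistic that the log-OR filters extend), applied to a synthetic recording; the template, noise level, and target placement are all assumptions:

    import numpy as np

    rng = np.random.default_rng(3)

    # Baseline detector: sliding cross-correlation of a known template s
    # against an ongoing recording x. (The log-OR filters add terms to this.)
    T, L = 2000, 100
    s = np.hanning(L) * np.sin(2 * np.pi * np.arange(L) / L)   # hypothetical template
    x = rng.normal(0.0, 1.0, T)                                # noise background
    x[500:500 + L] += 0.8 * s                                  # embedded target at t = 500

    # Cross-correlation statistic at each offset t: sum_k x[t + k] s[k]
    cc = np.correlate(x, s, mode="valid")
    print("peak statistic at offset:", int(np.argmax(cc)))     # expect ~500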

43 Applications : Signal Detection
Detecting Signals
(Figure: A. The P300 template target signal. B. An example of three channels (Cz, Pz, Fz) of synthetic ongoing EEG with two P300 target-signal events (indicated by the arrows) at an SNR of 5 dB.)

44 Applications : Signal Detection
Signal Detection Performance
Detection performance is measured by the area under the ROC curve as a function of SNR. Both OR techniques outperform cross-correlation!

45 Applications : Sensor Characterization
Sensor Characterization
Robotics / Signal Processing

46 Applications : Sensor Characterization
Modeling a Robotic Sensor (Malakar, Gladkov, Knuth)
In this project, we aim to model the spatial sensitivity function of a LEGO light sensor for use on a robotic system.
Here the application is to develop a robotic arm that can characterize the white circle by measuring light intensities at various locations. By modeling the light sensor, we aim to increase the robot's performance.

47 Applications : Sensor Characterization
Modeling a Robotic Sensor
The LEGO light sensor was slowly moved over a black-and-white albedo pattern on the surface of a table to obtain calibration data. The sensor orientation was varied as well.

48 Applications : Sensor Characterization
Modeling a Robotic Sensor
Mixture-of-Gaussians (MoG) models were used. Four model orders were tested using Nested Sampling. The 1-MoG model was slightly favored.
Note the increasing uncertainty as the model becomes more complex. This suggests that the permutation space was not fully explored.

49 Applications : Sensor Characterization
Examining the Sensor Model Performance
(Figure: comparison between the 1-MoG model and the data.)

50 Applications : Star System Characterization
Star System Characterization
Astronomy / Astrophysics

51 Applications : Star System Characterization
Star System Characterization (Placek and Knuth)
In our DSP paper, we give an example of Bayesian model testing applied to exoplanet characterization. Ben Placek also has a paper and poster here at MaxEnt 2015 on the topic.
Here I will apply these model-testing concepts to determining the orbital configuration of a triple star system.
(Image: Digital Sky Survey (DSS))

52 Applications : Star System Characterization
KIC : Two Periods
This star exhibits oscillations of two commensurate periods in its light curve: 6.45 days and 64.5 days (a rare 10:1 resonance!).
Photometric data were obtained from the Kepler mission.
(Figure: (A) Quarter 13 light curve folded on the P1 = 6.45 day period; (B) Quarter 13 light curve folded on the P2 = 64.5 day period; (C) the entire Q13 light curve. Image: Digital Sky Survey (DSS).)

53 Applications : Star System Characterization
KIC : Radial Velocity Measurements
Eleven radial velocity measurements were taken over the span of a week. The 6.45 day period is visible, but not the 64.5 day period.
Courtesy of Geoff Marcy and Howard Isaacson

54 Applications : Star System Characterization
KIC : Models
Two possible models of the system. The main star is a G-star (like our Sun); at least one of the other companions (C1) is an M-dwarf.
(A) A hierarchical arrangement: C1 and C2 orbit G with the 6.45 day period, and orbit one another with the 64.5 day period.
(B) A planetary arrangement: C1 orbits with the 6.45 day period, and C2 orbits with the 64.5 day period.

55 Applications : Star System Characterization
KIC : Results
Testing the Hierarchical Model against the Planetary Model using the radial velocity data, the Circular Hierarchical Model has the greatest evidence (by a factor of exp(...)).

56 Applications : Star System Characterization
KIC
This system is a hierarchical triple consisting of a G-star with two co-orbiting M-dwarfs in a 1:10 resonance (P1 = 6.45 days, P2 = 64.5 days).

57 Applications : Star System Characterization
KIC
(Figure of the system.)

58 Demonstrations : Nested Sampling : Lighthouse Problem
Nested Sampling Demo (sans model testing)

59 Demonstrations : Nested Sampling : Lighthouse Problem
Nested Sampling Demo: The Lighthouse Problem (Gull)
Consider a lighthouse located just off of a straight shore that extends a great distance. Imagine that the lighthouse has a laser beam that it fires at random times as it rotates with uniform speed. Along the shore are light detectors that detect laser beam hits. Based on these data, where is the lighthouse?

60 Demonstrations : Nested Sampling : Lighthouse Problem
The Likelihood Function
It is a useful exercise to derive the likelihood via a change of variables from the flash angle θ to the shore position x = β tan θ + α:
    p(x | α, β, I) = β / (π [β² + (α − x)²]).
We assign a uniform prior for the location parameters α and β.
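
A minimal sketch of the lighthouse setup (assumed true location and grid, not the demo's actual code): generate Cauchy-distributed flash positions and evaluate the log likelihood on a grid over the prior support:

    import numpy as np

    rng = np.random.default_rng(4)

    # Flashes at uniform-random angles theta map to Cauchy-distributed shore
    # positions x = beta * tan(theta) + alpha.
    alpha_true, beta_true = 0.5, 0.5               # assumed location, not the demo's
    D = 64                                         # number of recorded flashes
    theta = rng.uniform(-np.pi / 2, np.pi / 2, D)
    x = beta_true * np.tan(theta) + alpha_true

    def log_like(alpha, beta):
        """Sum of ln p(x | alpha, beta) = ln[beta / (pi (beta^2 + (alpha - x)^2))]."""
        return np.sum(np.log(beta / (np.pi * (beta ** 2 + (alpha - x) ** 2))))

    # Evaluate over the uniform prior support and report the maximum.
    a = np.linspace(-2.0, 2.0, 201)
    b = np.linspace(0.01, 2.0, 200)
    ll = np.array([[log_like(ai, bj) for bj in b] for ai in a])
    i, j = np.unravel_index(ll.argmax(), ll.shape)
    print("grid maximum-likelihood estimate: alpha =", a[i], " beta =", b[j])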

61 Demonstrations : Nested Sampling : Lighthouse Problem
Nested Sampling Run
Run using D = 64 data points (recorded flashes) and N = 100 samples. Iteration is halted when Δ log Z < 10⁻⁷.
Results: # iterations = 1193; mean(x) = 0.48 ± 0.26; mean(y) = 0.51 ± 0.28.
(Figure: live samples (o) and used samples (+) plotted by x and y position, with the location of the lighthouse marked and the reported log Z.)

62 Demonstrations : Nested Sampling : Lighthouse Problem
Nested Sampling Run
(Figure: the relationship between log L and log prior volume for the lighthouse run.)

63 Demonstrations : Nested Sampling : Lighthouse Problem
Nested Sampling Run with a Gaussian Likelihood
(Figure: the relationship between log L and log prior volume for a Gaussian likelihood.)

64 Applications : Nested Sampling and Phase Transitions
Nested Sampling and Phase Transitions

65 Applications : Nested Sampling and Phase Transitions
Peaks on Peaks
Here is a Gaussian likelihood with a taller peak on its side.

66 Applications : Nested Sampling and Phase Transitions
Nested Sampling with Phase Transitions
Phase transitions represent local peaks in the evidence.
(Figure: log L versus log prior volume, with the phase transition marked.)

67 Applications : Nested Sampling and Phase Transitions
Acoustic Source Localization: One Detector
Consider an acoustic source localization problem using a single detector. There is a low (red) and a high (blue) frequency source. Note how the high-frequency source is found first, inducing a phase transition.

68 Applications : Nested Sampling and Phase Transitions
Acoustic Source Localization: Two Detectors
In this example, we have two detectors, which allow us to localize the sources to rings. Again, the low-frequency source is found first.

69 Acknowledgements
Michael Habeck
Nabin Malakar
Asim Mubeen
Ben Placek
