Tutorial on Approximate Bayesian Computation


1 Tutorial on Approximate Bayesian Computation. Michael Gutmann, University of Helsinki, Aalto University, and Helsinki Institute for Information Technology. 16 May 2016

2 Content Two parts: 1. The basics of approximate Bayesian computation (ABC). 2. Computational and statistical efficiency. What is ABC? A set of methods for approximate Bayesian inference which can be used whenever sampling from the model is possible.

3 Part I Basic ABC

4 Program Preliminaries: statistical inference, simulator-based models, the likelihood function. Inference for simulator-based models: exact inference, approximate inference, the rejection ABC algorithm.


6 Big picture of statistical inference Given data y^o, draw conclusions about properties of its source; if available, take prior information into account. [Figure: a data source with unknown properties produces, via observation, the data y^o in the data space; inference works back from y^o and the prior information to the unknown properties.]

7 General approach Set up a model with potential properties θ (parameters) and see which θ are reasonable given the observed data. [Figure: as before, with a model M(θ) standing in for the data source; inference compares the model with the observed data y^o, taking prior information into account.]

8 Likelihood function Measures the agreement between θ and the observed data y^o: the probability to see data y like y^o if property θ holds. [Figure: the model M(θ) generates data y in the data space, to be compared with the observed data y^o.]

9 Likelihood function Measures the agreement between θ and the observed data y^o: the probability to see data y like y^o if property θ holds. [Figure: as before, with an ε-neighbourhood around y^o indicating what counts as data like y^o.]

10 Likelihood function For discrete random variables: L(θ) = Pr(y = y^o | θ). (1) For continuous random variables: L(θ) = lim_{ε→0} Pr(y ∈ B_ε(y^o) | θ) / Vol(B_ε(y^o)). (2)

11 Performing statistical inference If L(θ) is known, inference boils down to solving an optimization or sampling problem. Maximum likelihood estimation: ˆθ = argmax_θ L(θ). Bayesian inference: p(θ | y^o) ∝ p(θ) L(θ), i.e. posterior ∝ prior × likelihood.

12 Textbook case The model is a family of probability density/mass functions p(y | θ), and the likelihood function is L(θ) = p(y^o | θ). Closed-form solutions are possible. [Figure: prior pdf, likelihood, and posterior pdf, each with its mean indicated.]

13 Simulator-based models Not all models are specified as a family of pdfs p(y | θ). Here: simulator-based models, i.e. models which are specified via a mechanism (rule) for generating data.

14 Toy example Let y | θ ~ N(θ, 1). Family of pdfs as model: p(y | θ) = 1/√(2π) exp(−(y − θ)²/2). (3) Simulator-based model: y = z + θ, z ~ N(0, 1), (4) or y = z + θ, z = √(−2 log ω) cos(2πν), (5) where ω and ν are independent random variables uniformly distributed on (0, 1).
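As an illustration, the two specifications (4) and (5) can be turned directly into code; a minimal sketch assuming NumPy (function names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_via_normal(theta, rng):
    # Specification (4): y = z + theta with z ~ N(0, 1)
    z = rng.standard_normal()
    return z + theta

def simulate_via_uniforms(theta, rng):
    # Specification (5): z built from two uniform random variables (Box-Muller)
    omega, nu = rng.uniform(0.0, 1.0, size=2)
    z = np.sqrt(-2.0 * np.log(omega)) * np.cos(2.0 * np.pi * nu)
    return z + theta

# Both simulators implicitly define the same model: samples from N(theta, 1)
samples = np.array([simulate_via_uniforms(2.0, rng) for _ in range(10_000)])
print(samples.mean(), samples.var())   # roughly 2.0 and 1.0
```

Both functions define the same implicit model; only the data-generating mechanism is written down, not the pdf.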

15 Formal definition of a simulator-based model Let (Ω, F, P) be a probability space. A simulator-based model is a collection of (measurable) functions g(·, θ) parametrized by θ: y = g(ω, θ), with ω ∈ Ω and y ∈ Y. (6) The functions g(·, θ) are typically not available in closed form; the model is accessed through simulation (sampling).

16 Other names for simulator-based models Models specified via a data generating mechanism occur in multiple and diverse scientific fields. Different communities use different names for simulator-based models: generative models, implicit models, stochastic simulation models, probabilistic programs.

17 Examples Astrophysics: simulating the formation of galaxies, stars, or planets. Evolutionary biology: simulating evolution. Neuroscience: simulating neural circuits. Ecology: simulating species migration. Health science: simulating the spread of an infectious disease. [Figure: simulated neural activity in rat somatosensory cortex.]

18 Example (health science) Simulating bacterial transmissions in child day care centers (Numminen et al., 2013). [Figure: presence of bacterial strains over time for the individuals attending a day care center.]

19 Advantages of simulator-based models Direct implementation of hypotheses of how the observed data were generated. Neat interface with physical or biological models of data. Modeling by replicating the mechanisms of nature which produced the observed/measured data ("analysis by synthesis"). Possibility to perform experiments in silico.

20 Disadvantages of simulator-based models Generally elude analytical treatment. Can easily be made more complicated than necessary. Statistical inference is difficult... but possible!

21 Family of pdfs induced by the simulator For any fixed θ, the output of the simulator y_θ = g(·, θ) is a random variable. No closed-form formulae are available for p(y | θ): the simulator defines the model pdfs p(y | θ) implicitly.

22 Implicit definition of the model pdfs [Figure: for a set A in the data space, the probability Pr(y_θ ∈ A | θ) equals the probability P of the set of ω for which g(ω, θ) falls in A.]

23 Implicit definition of the likelihood function For discrete random variables: L(θ) = Pr(y = y^o | θ), the probability that the simulator generates data equal to the observed data (cf. Equation 1).

24 Implicit definition of the likelihood function For continuous random variables: L(θ) = lim_{ε→0} L_ε(θ), with L_ε(θ) = Pr(y ∈ B_ε(y^o) | θ) / Vol(B_ε(y^o)) as in Equation (2).

25 Implicit definition of the likelihood function To compute the likelihood function, we need the probability that the simulator generates data close to y^o: Pr(y = y^o | θ) or Pr(y ∈ B_ε(y^o) | θ). No analytical expression is available, but we can empirically test whether simulated data equal y^o or lie in B_ε(y^o). This property is exploited to perform inference for simulator-based models.

26 Program Preliminaries: statistical inference, simulator-based models, the likelihood function. Inference for simulator-based models: exact inference, approximate inference, the rejection ABC algorithm.

27 Exact inference for discrete random variables For discrete random variables, we can perform exact Bayesian inference without knowing the likelihood function. By definition, the posterior is obtained by conditioning p(θ, y) on the event y = y^o: p(θ | y^o) = p(θ, y^o) / p(y^o) = p(θ, y = y^o) / Pr(y = y^o). (7)

28 Exact inference for discrete random variables Generate tuples (θ_i, y_i):
1. θ_i ~ p_θ (iid from the prior)
2. ω_i ~ P (by running the simulator)
3. y_i = g(ω_i, θ_i) (by running the simulator)
Condition on y = y^o: retain only the tuples with y_i = y^o. The θ_i from the retained tuples are samples from the posterior p(θ | y^o).

29 Example Posterior inference of the success probability θ in a Bernoulli trial. Data: y^o = 1. Prior: p_θ = 1 on (0, 1). Generate tuples (θ_i, y_i):
1. θ_i ~ p_θ
2. ω_i ~ U(0, 1)
3. y_i = 1 if ω_i < θ_i, and y_i = 0 otherwise
Retain those θ_i for which y_i = y^o.
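A minimal sketch of this exact-inference example, assuming NumPy (variable names are illustrative). With y^o = 1 and a uniform prior, the true posterior is p(θ | y^o) = 2θ on (0, 1), so the retained samples should have mean close to 2/3:

```python
import numpy as np

rng = np.random.default_rng(0)
y_obs = 1           # observed Bernoulli outcome y^o
N = 100_000         # number of simulated tuples

theta = rng.uniform(0.0, 1.0, size=N)   # 1. theta_i ~ uniform prior on (0, 1)
omega = rng.uniform(0.0, 1.0, size=N)   # 2. omega_i ~ U(0, 1)
y = (omega < theta).astype(int)         # 3. y_i = 1 if omega_i < theta_i, else 0

posterior_samples = theta[y == y_obs]   # retain theta_i with y_i = y^o
print(posterior_samples.mean())         # close to 2/3, the mean of p(theta | y^o) = 2*theta
```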

30 Example The method produces samples from the posterior. There is Monte Carlo error when summarizing the samples as an empirical distribution or computing expectations via sample averages. [Figure: histograms of the retained θ_i (estimated posterior pdf) together with the true posterior pdf and the prior pdf, for increasing numbers N of simulated tuples (θ_i, y_i) up to N = 100,000.]

31 Limitations Only applicable to discrete random variables. And even for discrete random variables, it is computationally not feasible in higher dimensions. Reason: the probability of the event y_θ = y^o becomes smaller and smaller as the dimension of the data increases. Out of N simulated tuples, only a small fraction will be accepted, and the small number of accepted samples does not represent the posterior well (large Monte Carlo errors).

32 Approximations to make inference feasible Settle for approximate yet computationally feasible inference. Introduce two types of approximations: 1. Instead of working with the whole data, work with lower-dimensional summary statistics t_θ and t^o, t_θ = T(y_θ), t^o = T(y^o). (8) 2. Instead of checking t_θ = t^o, check whether ∆_θ = d(t^o, t_θ) is less than ε (d may or may not be a metric).

33 Approximation of the likelihood function Recall L(θ) = lim_{ε→0} L_ε(θ) with L_ε(θ) = Pr(y ∈ B_ε(y^o) | θ) / Vol(B_ε(y^o)). The approximations are equivalent to: 1. replacing Pr(y ∈ B_ε(y^o) | θ) with Pr(∆_θ ≤ ε | θ), and 2. not taking the limit ε → 0. This defines an approximate likelihood function L_ε(θ), L_ε(θ) ∝ Pr(∆_θ ≤ ε | θ). (9) The discrepancy ∆_θ is a (non-negative) random variable, ∆_θ = d(t^o, t_θ) = d(T(y^o), T(y_θ)).
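For a fixed θ, the approximate likelihood (9) can be estimated by simple Monte Carlo: run the simulator repeatedly and record how often the discrepancy falls below ε. A sketch, where all arguments are illustrative placeholders for the model-specific pieces:

```python
def approx_likelihood(theta, simulate, summary, distance, y_obs, eps, n_sim, rng):
    """Monte Carlo estimate of L_eps(theta), proportional to Pr(Delta_theta <= eps)."""
    t_obs = summary(y_obs)
    hits = 0
    for _ in range(n_sim):
        y_sim = simulate(theta, rng)                  # y = g(omega, theta)
        if distance(t_obs, summary(y_sim)) <= eps:    # is Delta_theta <= eps?
            hits += 1
    return hits / n_sim
```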

34 Rejection ABC algorithm The two approximations made yield the rejection algorithm for approximate Bayesian computation (ABC):
1. Sample θ_i ~ p_θ.
2. Simulate a data set y_i by running the simulator with θ_i (y_i = g(ω_i, θ_i)).
3. Compute the discrepancy ∆_i = d(T(y^o), T(y_i)).
4. Retain θ_i if ∆_i ≤ ε.
This is the basic ABC algorithm.
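A compact sketch of the four steps, assuming NumPy and using the Gaussian toy model from Equation (4) with the sample mean as summary statistic and the squared difference of means as discrepancy; the uniform prior on (-5, 5) is an illustrative choice:

```python
import numpy as np

def rejection_abc(y_obs, prior_sample, simulate, summary, distance, eps, n_draws, rng):
    t_obs = summary(y_obs)
    accepted = []
    for _ in range(n_draws):
        theta_i = prior_sample(rng)                 # 1. theta_i ~ p_theta
        y_i = simulate(theta_i, rng)                # 2. simulate a data set
        delta_i = distance(t_obs, summary(y_i))     # 3. compute the discrepancy
        if delta_i <= eps:                          # 4. retain if small enough
            accepted.append(theta_i)
    return np.asarray(accepted)

# Gaussian toy model: y_i ~ N(theta, 1), summary = sample mean
rng = np.random.default_rng(0)
y_obs = rng.normal(2.0, 1.0, size=50)
samples = rejection_abc(
    y_obs,
    prior_sample=lambda r: r.uniform(-5, 5),
    simulate=lambda th, r: r.normal(th, 1.0, size=50),
    summary=np.mean,
    distance=lambda a, b: (a - b) ** 2,
    eps=0.01, n_draws=20_000, rng=rng,
)
print(samples.mean(), samples.std())   # approximate posterior mean and std
```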

35 Properties The rejection ABC algorithm produces samples θ ~ p_ε(θ | y^o), where p_ε(θ | y^o) ∝ p_θ(θ) L_ε(θ), (10) L_ε(θ) ∝ Pr(d(T(y^o), T(y)) ≤ ε | θ) = Pr(∆_θ ≤ ε | θ). (11) Inference is approximate due to the summary statistics T and distance d, the threshold ε > 0, and the finite number of samples (Monte Carlo error).

36 Part II Computational and statistical efficiency

37 Brief recap Simulator-based models: models which are specified by a data generating mechanism. By construction, we can sample from simulator-based models, but the likelihood function can generally not be written down. Rejection ABC: a trial-and-error scheme to find parameter values which produce simulated data resembling the observed data; simulated data resemble the observed data if some discrepancy measure is small.

38 Efficiency of ABC 1. Computational efficiency: how to efficiently find the parameter values which yield a small discrepancy? 2. Statistical efficiency: how to measure the discrepancy between the simulated and observed data?

39 Program Computational efficiency: difficulties, solutions, recent work. Statistical efficiency: difficulties, solutions, recent work.


41 Example Inference of the mean θ of a Gaussian of variance one. Pr(y = y^o | θ) = 0, so we work with a discrepancy: ∆_θ = (ˆµ^o − ˆµ_θ)², where ˆµ^o = (1/n) ∑_{i=1}^n y_i^o and ˆµ_θ = (1/n) ∑_{i=1}^n y_i with y_i ~ N(θ, 1). The discrepancy ∆_θ is a random variable. [Figure: realizations of ∆_θ as a function of θ, together with its mean and quantiles, and realizations at a fixed θ.]

42 Example The probability that ∆_θ is below some threshold ε approximates the likelihood function. [Figure: mean and quantiles of ∆_θ with the threshold (0.1), and the resulting approximate likelihood (rescaled) compared with the true likelihood (rescaled), as functions of θ.]

43 Example Here, T(y) = (1/n) ∑_{i=1}^n y_i is a sufficient statistic for inference of the mean θ, so the only approximation is ε > 0. In general, the summary statistics will not be sufficient.

44 Example In the Gaussian example, the probability that ∆_θ ≤ ε can be computed in closed form. With ∆_θ = (ˆµ^o − ˆµ_θ)²: Pr(∆_θ ≤ ε) = Φ(√n (ˆµ^o − θ) + √(nε)) − Φ(√n (ˆµ^o − θ) − √(nε)), where Φ(x) = ∫_{−∞}^x (1/√(2π)) exp(−u²/2) du. For √(nε) small, Pr(∆_θ ≤ ε) is approximately proportional to √ε L(θ), so L_ε(θ) ∝ Pr(∆_θ ≤ ε) provides a good approximation of the likelihood function. But for small ε, Pr(∆_θ ≤ ε) → 0: very few samples will be accepted. [Figure: mean and quantiles of ∆_θ with threshold 0.1, and the approximate likelihood (rescaled) compared with the true likelihood (rescaled).]
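A quick numerical check of the closed-form probability (as reconstructed above) against an empirical acceptance rate; Φ is evaluated via the error function, and the parameter values below are illustrative:

```python
import math
import numpy as np

def Phi(x):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = np.random.default_rng(0)
n, theta, eps = 50, 1.5, 0.01
y_obs = rng.normal(2.0, 1.0, size=n)
mu_obs = y_obs.mean()

# Closed form: Pr(Delta_theta <= eps) with Delta_theta = (mu_obs - mu_theta)^2
p_closed = (Phi(math.sqrt(n) * (mu_obs - theta) + math.sqrt(n * eps))
            - Phi(math.sqrt(n) * (mu_obs - theta) - math.sqrt(n * eps)))

# Monte Carlo: simulate mu_theta ~ N(theta, 1/n) and count small discrepancies
mu_theta = rng.normal(theta, 1.0 / math.sqrt(n), size=200_000)
p_mc = np.mean((mu_obs - mu_theta) ** 2 <= eps)

print(p_closed, p_mc)   # the two values should agree closely
```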

45 Two widely used algorithms Two widely used algorithms improve computationally upon rejection ABC: 1. regression ABC (Beaumont et al., 2002) and 2. sequential Monte Carlo ABC (Sisson et al., 2007). Both use rejection ABC as a building block. Sequential Monte Carlo (SMC) ABC is also known as population Monte Carlo (PMC) ABC.

46 Two widely used algorithms Regression ABC consists in running rejection ABC with a relatively large ε and then adjusting the obtained samples so that they are closer to samples from the true posterior. Sequential Monte Carlo ABC consists in sampling θ from an adaptively constructed proposal distribution φ(θ) rather than from the prior, in order to avoid simulating many data sets which are not accepted.

47 Basic idea of regression ABC The summary statistics t_θ = T(y_θ) and θ have a joint distribution. Let t_i be the summary statistics for the simulated data y_i = g(ω_i, θ_i). We can learn a regression model between the summary statistics (covariates) and the parameters (response variables), θ_i = f(t_i) + ξ_i, (12) where ξ_i is the error term (a zero-mean random variable). The training data for the regression are typically tuples (θ_i, t_i) produced by rejection ABC with some sufficiently large ε.

48 Basic idea of regression ABC Fitting the regression model to the training data (θ_i, t_i) yields an estimated regression function ˆf and the residuals ˆξ_i, ˆξ_i = θ_i − ˆf(t_i). (13) Regression ABC consists in replacing each θ_i with an adjusted value θ*_i, θ*_i = ˆf(t^o) + ˆξ_i = ˆf(t^o) + θ_i − ˆf(t_i). (14) If the relation between t and θ is learned correctly, the θ*_i correspond to samples from an approximation with ε = 0.
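A minimal sketch of the adjustment (13)-(14) for a scalar parameter using plain least squares, assuming NumPy; Beaumont et al. (2002) use a weighted local-linear regression, which is omitted here for brevity:

```python
import numpy as np

def regression_adjust(theta, t_sim, t_obs):
    """Adjust rejection-ABC samples theta via a linear regression of theta
    on the summary statistics (plain least squares, no local weighting)."""
    theta = np.asarray(theta, dtype=float)           # shape (m,)
    t_sim = np.asarray(t_sim, dtype=float)           # shape (m, k)
    X = np.column_stack([np.ones(len(t_sim)), t_sim])
    coef, *_ = np.linalg.lstsq(X, theta, rcond=None)     # fit theta_i = f(t_i) + xi_i
    f_sim = X @ coef                                     # f_hat(t_i)
    f_obs = np.concatenate(([1.0], np.atleast_1d(t_obs))) @ coef   # f_hat(t_obs)
    return f_obs + (theta - f_sim)                       # theta*_i = f_hat(t_obs) + residuals
```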

49 Basic idea of sequential Monte Carlo ABC We may modify the rejection ABC algorithm and use a proposal distribution φ(θ) instead of the prior p_θ:
1. Sample θ_i ~ φ(θ).
2. Simulate a data set y_i by running the simulator with θ_i (y_i = g(ω_i, θ_i)).
3. Compute the discrepancy ∆_i = d(T(y^o), T(y_i)).
4. Retain θ_i if ∆_i ≤ ε.
The retained samples follow a distribution proportional to φ(θ) L_ε(θ).

50 Basic idea of sequential Monte Carlo ABC Parameters θ_i weighted with w_i = p_θ(θ_i) / φ(θ_i) (15) follow a distribution proportional to p_θ(θ) L_ε(θ). This can be used to iteratively morph the prior into the posterior: use a sequence of shrinking thresholds ε_t; run rejection ABC with the first threshold ε_0; at iteration t, define φ_t based on the weighted samples from the previous iteration (e.g. a Gaussian mixture with means equal to the θ_i from the previous iteration); see the sketch below.
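A rough sketch of this scheme for a scalar parameter, assuming NumPy and SciPy; the uniform prior, the fixed Gaussian kernel width, and the threshold schedule are illustrative choices rather than the tuning rules of Sisson et al. (2007):

```python
import numpy as np
from scipy.stats import norm

def pmc_abc(y_obs, simulate, summary, distance, eps_schedule, n_particles, rng,
            prior_low=-5.0, prior_high=5.0, kernel_sd=0.5):
    """Sketch of PMC/SMC ABC for a scalar parameter with a uniform prior."""
    t_obs = summary(y_obs)
    prior_pdf = lambda th: (1.0 / (prior_high - prior_low)
                            if prior_low < th < prior_high else 0.0)

    # First round: rejection ABC from the prior with threshold eps_schedule[0]
    thetas = []
    while len(thetas) < n_particles:
        th = rng.uniform(prior_low, prior_high)
        if distance(t_obs, summary(simulate(th, rng))) <= eps_schedule[0]:
            thetas.append(th)
    thetas = np.array(thetas)
    weights = np.full(n_particles, 1.0 / n_particles)

    # Subsequent rounds: propose from a Gaussian mixture around the particles
    for eps in eps_schedule[1:]:
        new_thetas = []
        while len(new_thetas) < n_particles:
            j = rng.choice(n_particles, p=weights)
            th = rng.normal(thetas[j], kernel_sd)
            if prior_pdf(th) > 0 and distance(t_obs, summary(simulate(th, rng))) <= eps:
                new_thetas.append(th)
        new_thetas = np.array(new_thetas)
        phi = np.array([np.sum(weights * norm.pdf(th, thetas, kernel_sd))
                        for th in new_thetas])                        # proposal density
        weights = np.array([prior_pdf(th) for th in new_thetas]) / phi   # w_i = p(th)/phi(th)
        weights /= weights.sum()
        thetas = new_thetas

    return thetas, weights

# Example usage with the Gaussian toy model and sample-mean summary:
# thetas, weights = pmc_abc(y_obs, lambda th, r: r.normal(th, 1.0, size=50),
#                           np.mean, lambda a, b: (a - b) ** 2,
#                           eps_schedule=[1.0, 0.3, 0.1], n_particles=500,
#                           rng=np.random.default_rng(0))
```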

51 Basic idea of sequential Monte Carlo ABC [Figure: construction of the proposal distribution; starting from the prior at t = 1 with shrinking thresholds ε_1 > ε_2 > ε_3, the proposals at t = 2 and t = 3 concentrate around the true posterior, and each proposal is fed to the simulator M(θ, ·).]

52 Learning a model of the discrepancy Recall L_ε(θ) ∝ Pr(∆_θ ≤ ε | θ): the approximate likelihood function L_ε(θ) is determined by the distribution of the discrepancy ∆_θ. If we knew the distribution of ∆_θ, we could compute L_ε(θ). We proposed to learn a model of ∆_θ and to approximate L_ε(θ) by ˆL_ε(θ), obtained by evaluating Pr(∆_θ ≤ ε | θ) under the learned model. (16) The model is learned more accurately in regions where ∆_θ tends to be small, to make further computational savings. (Gutmann and Corander, Journal of Machine Learning Research, in press)
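As a toy sketch of the idea on the Gaussian-mean example: fit a regression model (here a Gaussian process, assuming scikit-learn) to simulated discrepancies and evaluate Pr(∆_θ ≤ ε) under the fitted model. This only illustrates the modelling step; the paper additionally chooses where to simulate via optimization, which is not shown here:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n = 50
y_obs = rng.normal(2.0, 1.0, size=n)

# Run the simulator at a modest number of parameter values, record discrepancies
thetas = rng.uniform(-5, 5, size=200)
deltas = np.array([(y_obs.mean() - rng.normal(th, 1.0, size=n).mean()) ** 2
                   for th in thetas])

# Regression model of the discrepancy Delta_theta as a function of theta
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(thetas.reshape(-1, 1), deltas)

def approx_likelihood(theta_grid, eps=0.05):
    """L_eps(theta) approximated by Pr(Delta_theta <= eps) under the learned
    model (Gaussian approximation from the GP's predictive mean and std)."""
    mean, std = gp.predict(theta_grid.reshape(-1, 1), return_std=True)
    return norm.cdf((eps - mean) / std)

grid = np.linspace(-5, 5, 101)
print(grid[np.argmax(approx_likelihood(grid))])   # typically close to the data mean, about 2.0
```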

53 Example: bacterial infections in child care centers Likelihood intractable for cross-sectional data, but generating data from the model is possible. Parameters of interest: the rate of infections within a center, the rate of infections from outside, and the competition between the strains (Numminen et al., 2013). [Figure: presence of bacterial strains over time for the individuals attending a day care center.]

54 Example: bacterial infections in child care centers Comparison of the proposed approach with a standard population Monte Carlo ABC approach: roughly equal results using 1000 times fewer simulations (4.5 days with 200 cores versus 90 minutes with seven cores). [Figure: posterior means (solid lines) and credibility intervals (shaded areas or dashed lines) for the competition parameter as a function of computational cost (log10), for the developed fast method and the standard method.] (Gutmann and Corander, 2015)

55 Example: bacterial infections in child care centers Comparison of the proposed approach with a standard population Monte Carlo ABC approach: roughly equal results using 1000 times fewer simulations. [Figure: posterior means (solid lines) and credibility intervals (shaded areas or dashed lines) for the internal and external infection parameters as functions of computational cost (log10), for the developed fast method and the standard method.]

56 Program Computational efficiency: difficulties, solutions, recent work. Statistical efficiency: difficulties, solutions, recent work.

57 Discrepancy measure affects the accuracy of the estimates A bad discrepancy yields an estimated posterior equal to the prior, or a vanishingly small acceptance probability. A good discrepancy achieves a good trade-off between loss of information and increase in acceptance probability.

58 How to choose the discrepancy measure? Manually: use expert knowledge about y^o to define summary statistics T and use the Euclidean distance for d, ∆_θ = ||T(y^o) − T(y_θ)||. Semi-automatically: simulate pairs (θ_i, y_i); define a large number of candidate summary statistics; define T as a smaller number of (linear) combinations of them, automatically learned from the simulated pairs (see the sketch below); again use the Euclidean distance for d. The combinations are typically determined via regression with the candidate statistics of y_θ as covariates and the parameters θ as response variables (e.g. Nunes and Balding, 2010; Fearnhead and Prangle, 2012; Aeschbacher et al., 2012; Blum et al., 2013).
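A sketch of the semi-automatic construction in the spirit of Fearnhead and Prangle (2012), assuming NumPy; the pool of candidate statistics and the toy Gaussian simulator are illustrative, and refinements such as a pilot run are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

def candidate_stats(y):
    # A (possibly redundant) pool of candidate summary statistics
    return np.array([y.mean(), np.median(y), y.std(), np.mean(y ** 2)])

# Simulate pairs (theta_i, y_i) and compute their candidate statistics
thetas = rng.uniform(-5, 5, size=2000)
S = np.array([candidate_stats(rng.normal(th, 1.0, size=n)) for th in thetas])

# Linear regression of theta on the candidate statistics
X = np.column_stack([np.ones(len(S)), S])
coef, *_ = np.linalg.lstsq(X, thetas, rcond=None)

def summary(y):
    """Learned one-dimensional summary statistic T(y)."""
    return np.concatenate(([1.0], candidate_stats(y))) @ coef

y_obs = rng.normal(2.0, 1.0, size=n)
print(summary(y_obs))   # roughly a point estimate of the theta behind y_obs (about 2.0)
```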

59 Discrepancy measurement via classification (Gutmann et al., 2014) Use classification accuracy (discriminability) between observed and simulated data sets as the discrepancy measure ∆_θ. A discriminability of 100% indicates maximally different data sets; 50% indicates similar data sets. [Figure: observed data plotted against simulated data with mean (6, 0), which are easy to discriminate, and against simulated data with mean (1/2, 0), which are hard to discriminate.]
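A toy sketch of the classifier-based discrepancy, assuming scikit-learn: pool the observed and simulated data points, label them by origin, and use cross-validated classification accuracy as ∆_θ (logistic regression is an illustrative choice of classifier; the means (6, 0) and (1/2, 0) mirror the figure):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def classifier_discrepancy(y_obs, y_sim):
    """Cross-validated accuracy for discriminating observed from simulated data:
    about 0.5 for similar data sets, near 1.0 for very different ones."""
    X = np.vstack([y_obs, y_sim])
    labels = np.concatenate([np.zeros(len(y_obs)), np.ones(len(y_sim))])
    return cross_val_score(LogisticRegression(), X, labels, cv=5).mean()

y_obs = rng.normal([0.0, 0.0], 1.0, size=(200, 2))    # observed 2-d data
y_far = rng.normal([6.0, 0.0], 1.0, size=(200, 2))     # simulated data, mean (6, 0)
y_near = rng.normal([0.5, 0.0], 1.0, size=(200, 2))    # simulated data, mean (1/2, 0)
print(classifier_discrepancy(y_obs, y_far))    # close to 1.0 (very different)
print(classifier_discrepancy(y_obs, y_near))   # much closer to 0.5 (hard to discriminate)
```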

60 Example: bacterial infections in child care centers Likelihood intractable for cross-sectional data, but generating data from the model is possible. Parameters of interest: the rate of infections within a center, the rate of infections from outside, and the competition between the strains (Numminen et al., 2013). [Figure: presence of bacterial strains over time for the individuals attending a day care center.]

61 Example: bacterial infections in child care centers (Gutmann et al., 2014) Our classification-based distance measure does not use domain/expert knowledge, yet it performs as well as a distance measure based on domain knowledge (Numminen et al., 2013). [Figure: posterior pdfs for β, Λ, and θ obtained with the expert-defined and the classifier-based distance.]

62 Example: bacterial infections in child care centers (Gutmann et al., 2014) Robustness is a concern when relying on expert knowledge; the classification-based distance can automatically compensate for errors in the expert input. [Figure: posterior probability densities of the external infection parameter for the developed robust method and the standard method, relative to a reference, illustrating deterioration of the expert input and its compensation.]


63 Summary The topic was Bayesian inference for models specified via a simulator (implicit / generative models). Introduced approximate Bayesian computation (ABC). Principle of ABC: find parameter values which yield simulated data resembling the observed data. Covered three classical algorithms: 1. rejection ABC, 2. regression ABC, 3. sequential Monte Carlo ABC. Discussed the choice of the discrepancy measure between simulated and observed data. Recent work of mine: combining modeling of the discrepancy with optimization to increase computational efficiency, and using classification to measure the discrepancy.

67 References
My papers: Numminen et al. Estimating the Transmission Dynamics of Streptococcus pneumoniae from Strain Prevalence Data. Biometrics, 2013.
Beaumont et al. Approximate Bayesian Computation in Population Genetics. Genetics, 2002.
Sisson et al. Sequential Monte Carlo without likelihoods. PNAS, 2007.
Nunes and Balding. On optimal selection of summary statistics for approximate Bayesian computation. Statistical Applications in Genetics and Molecular Biology, 2010.
Fearnhead and Prangle. Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2012.
Aeschbacher et al. A novel approach for choosing summary statistics in approximate Bayesian computation. Genetics, 2012.
Blum et al. A Comparative Review of Dimension Reduction Methods in Approximate Bayesian Computation. Statistical Science, 2013.

68 Review papers
Beaumont. Approximate Bayesian Computation in Evolution and Ecology. Annual Review of Ecology, Evolution, and Systematics.
Hartig et al. Statistical inference for stochastic simulation models - theory and application. Ecology Letters.
Marin et al. Approximate Bayesian computational methods. Statistics and Computing.
Sunnaker et al. Approximate Bayesian Computation. PLoS Computational Biology.
