Tutorial on Approximate Bayesian Computation


1 Tutorial on Approximate Bayesian Computation. Michael Gutmann, University of Helsinki, Aalto University, and Helsinki Institute for Information Technology. 16 May 2016

2 Content Two parts: 1. The basics of approximate Bayesian computation (ABC). 2. Computational and statistical efficiency. What is ABC? A set of methods for approximate Bayesian inference which can be used whenever sampling from the model is possible.

3 Part I Basic ABC

4 Program Preliminaries: statistical inference, simulator-based models, the likelihood function. Inference for simulator-based models: exact inference, approximate inference, the rejection ABC algorithm.


6 Big picture of statistical inference Given data y^o, draw conclusions about properties of its source; if available, take prior information into account. [Figure: a data source with unknown properties produces, via observation, the data y^o in the data space; inference works back from y^o and the prior information to the unknown properties.]

7 General approach Set up a model with potential properties θ (parameters) and see which θ are reasonable given the observed data. [Figure: as before, with a model M(θ) standing in for the data source; inference compares the model with the observed data y^o, taking prior information into account.]

8 Likelihood function Measures the agreement between θ and the observed data y^o: the probability to see data y like y^o if property θ holds. [Figure: the model M(θ) generates data y in the data space, to be compared with the observed data y^o.]

9 Likelihood function Measures the agreement between θ and the observed data y^o: the probability to see data y like y^o if property θ holds. [Figure: as before, with an ε-neighbourhood around y^o indicating what counts as data like y^o.]

10 Likelihood function For discrete random variables: L(θ) = Pr(y = y^o | θ). (1) For continuous random variables: L(θ) = lim_{ε→0} Pr(y ∈ B_ε(y^o) | θ) / Vol(B_ε(y^o)). (2)

11 Performing statistical inference If L(θ) is known, inference boils down to solving an optimization or sampling problem. Maximum likelihood estimation: ˆθ = argmax_θ L(θ). Bayesian inference: p(θ | y^o) ∝ p(θ) L(θ), i.e. posterior ∝ prior × likelihood.

12 Textbook case The model is a family of probability density/mass functions p(y | θ), and the likelihood function is L(θ) = p(y^o | θ). Closed-form solutions are possible. [Figure: prior pdf, likelihood, and posterior pdf, each with its mean indicated.]

13 Simulator-based models Not all models are specified as a family of pdfs p(y | θ). Here: simulator-based models, i.e. models which are specified via a mechanism (rule) for generating data.

14 Toy example Let y | θ ~ N(θ, 1). Family of pdfs as model: p(y | θ) = 1/√(2π) exp(−(y − θ)²/2). (3) Simulator-based model: y = z + θ, z ~ N(0, 1), (4) or y = z + θ, z = √(−2 log ω) cos(2πν), (5) where ω and ν are independent random variables uniformly distributed on (0, 1).
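As an illustration, the two specifications (4) and (5) can be turned directly into code; a minimal sketch assuming NumPy (function names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_via_normal(theta, rng):
    # Specification (4): y = z + theta with z ~ N(0, 1)
    z = rng.standard_normal()
    return z + theta

def simulate_via_uniforms(theta, rng):
    # Specification (5): z built from two uniform random variables (Box-Muller)
    omega, nu = rng.uniform(0.0, 1.0, size=2)
    z = np.sqrt(-2.0 * np.log(omega)) * np.cos(2.0 * np.pi * nu)
    return z + theta

# Both simulators implicitly define the same model: samples from N(theta, 1)
samples = np.array([simulate_via_uniforms(2.0, rng) for _ in range(10_000)])
print(samples.mean(), samples.var())   # roughly 2.0 and 1.0
```

Both functions define the same implicit model; only the data-generating mechanism is written down, not the pdf.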

15 Formal definition of a simulator-based model Let (Ω, F, P) be a probability space. A simulator-based model is a collection of (measurable) functions g(·, θ) parametrized by θ: y = g(ω, θ), with ω ∈ Ω and y ∈ Y. (6) The functions g(·, θ) are typically not available in closed form; the model is accessed through simulation (sampling).

16 Other names for simulator-based models Models specified via a data generating mechanism occur in multiple and diverse scientific fields. Different communities use different names for simulator-based models: generative models, implicit models, stochastic simulation models, probabilistic programs.

17 Examples Astrophysics: simulating the formation of galaxies, stars, or planets. Evolutionary biology: simulating evolution. Neuroscience: simulating neural circuits. Ecology: simulating species migration. Health science: simulating the spread of an infectious disease. [Figure: simulated neural activity in rat somatosensory cortex.]

18 Example (health science) Simulating bacterial transmissions in child day care centers (Numminen et al., 2013). [Figure: presence of bacterial strains over time for the individuals attending a day care center.]

19 Advantages of simulator-based models Direct implementation of hypotheses of how the observed data were generated. Neat interface with physical or biological models of data. Modeling by replicating the mechanisms of nature which produced the observed/measured data ("analysis by synthesis"). Possibility to perform experiments in silico.

20 Disadvantages of simulator-based models Generally elude analytical treatment. Can easily be made more complicated than necessary. Statistical inference is difficult... but possible!

21 Family of pdfs induced by the simulator For any fixed θ, the output of the simulator y_θ = g(·, θ) is a random variable. No closed-form formulae are available for p(y | θ): the simulator defines the model pdfs p(y | θ) implicitly.

22 Implicit definition of the model pdfs [Figure: for a set A in the data space, the probability Pr(y_θ ∈ A | θ) equals the probability P of the set of ω for which g(ω, θ) falls in A.]

23 Implicit definition of the likelihood function For discrete random variables: L(θ) = Pr(y = y^o | θ), the probability that the simulator generates data equal to the observed data (cf. Equation 1).

24 Implicit definition of the likelihood function For continuous random variables: L(θ) = lim_{ε→0} L_ε(θ), with L_ε(θ) = Pr(y ∈ B_ε(y^o) | θ) / Vol(B_ε(y^o)) as in Equation (2).

25 Implicit definition of the likelihood function To compute the likelihood function, we need the probability that the simulator generates data close to y^o: Pr(y = y^o | θ) or Pr(y ∈ B_ε(y^o) | θ). No analytical expression is available, but we can empirically test whether simulated data equal y^o or lie in B_ε(y^o). This property is exploited to perform inference for simulator-based models.

26 Program Preliminaries: statistical inference, simulator-based models, the likelihood function. Inference for simulator-based models: exact inference, approximate inference, the rejection ABC algorithm.

27 Exact inference for discrete random variables For discrete random variables, we can perform exact Bayesian inference without knowing the likelihood function. By definition, the posterior is obtained by conditioning p(θ, y) on the event y = y^o: p(θ | y^o) = p(θ, y^o) / p(y^o) = p(θ, y = y^o) / Pr(y = y^o). (7)

28 Exact inference for discrete random variables Generate tuples (θ_i, y_i):
1. θ_i ~ p_θ (iid from the prior)
2. ω_i ~ P (by running the simulator)
3. y_i = g(ω_i, θ_i) (by running the simulator)
Condition on y = y^o: retain only the tuples with y_i = y^o. The θ_i from the retained tuples are samples from the posterior p(θ | y^o).

29 Example Posterior inference of the success probability θ in a Bernoulli trial. Data: y^o = 1. Prior: p_θ = 1 on (0, 1). Generate tuples (θ_i, y_i):
1. θ_i ~ p_θ
2. ω_i ~ U(0, 1)
3. y_i = 1 if ω_i < θ_i, and y_i = 0 otherwise
Retain those θ_i for which y_i = y^o.
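A minimal sketch of this exact-inference example, assuming NumPy (variable names are illustrative). With y^o = 1 and a uniform prior, the true posterior is p(θ | y^o) = 2θ on (0, 1), so the retained samples should have mean close to 2/3:

```python
import numpy as np

rng = np.random.default_rng(0)
y_obs = 1           # observed Bernoulli outcome y^o
N = 100_000         # number of simulated tuples

theta = rng.uniform(0.0, 1.0, size=N)   # 1. theta_i ~ uniform prior on (0, 1)
omega = rng.uniform(0.0, 1.0, size=N)   # 2. omega_i ~ U(0, 1)
y = (omega < theta).astype(int)         # 3. y_i = 1 if omega_i < theta_i, else 0

posterior_samples = theta[y == y_obs]   # retain theta_i with y_i = y^o
print(posterior_samples.mean())         # close to 2/3, the mean of p(theta | y^o) = 2*theta
```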

30 Example The method produces samples from the posterior. There is Monte Carlo error when summarizing the samples as an empirical distribution or computing expectations via sample averages. [Figure: histograms of the retained θ_i (estimated posterior pdf) together with the true posterior pdf and the prior pdf, for increasing numbers N of simulated tuples (θ_i, y_i) up to N = 100,000.]

31 Limitations Only applicable to discrete random variables. And even for discrete random variables, it is computationally not feasible in higher dimensions. Reason: the probability of the event y_θ = y^o becomes smaller and smaller as the dimension of the data increases. Out of N simulated tuples, only a small fraction will be accepted, and the small number of accepted samples does not represent the posterior well (large Monte Carlo errors).

32 Approximations to make inference feasible Settle for approximate yet computationally feasible inference. Introduce two types of approximations: 1. Instead of working with the whole data, work with lower-dimensional summary statistics t_θ and t^o, t_θ = T(y_θ), t^o = T(y^o). (8) 2. Instead of checking t_θ = t^o, check whether ∆_θ = d(t^o, t_θ) is less than ε (d may or may not be a metric).

33 Approximation of the likelihood function Recall L(θ) = lim_{ε→0} L_ε(θ) with L_ε(θ) = Pr(y ∈ B_ε(y^o) | θ) / Vol(B_ε(y^o)). The approximations are equivalent to: 1. replacing Pr(y ∈ B_ε(y^o) | θ) with Pr(∆_θ ≤ ε | θ), and 2. not taking the limit ε → 0. This defines an approximate likelihood function L_ε(θ), L_ε(θ) ∝ Pr(∆_θ ≤ ε | θ). (9) The discrepancy ∆_θ is a (non-negative) random variable, ∆_θ = d(t^o, t_θ) = d(T(y^o), T(y_θ)).
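For a fixed θ, the approximate likelihood (9) can be estimated by simple Monte Carlo: run the simulator repeatedly and record how often the discrepancy falls below ε. A sketch, where all arguments are illustrative placeholders for the model-specific pieces:

```python
def approx_likelihood(theta, simulate, summary, distance, y_obs, eps, n_sim, rng):
    """Monte Carlo estimate of L_eps(theta), proportional to Pr(Delta_theta <= eps)."""
    t_obs = summary(y_obs)
    hits = 0
    for _ in range(n_sim):
        y_sim = simulate(theta, rng)                  # y = g(omega, theta)
        if distance(t_obs, summary(y_sim)) <= eps:    # is Delta_theta <= eps?
            hits += 1
    return hits / n_sim
```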

34 Rejection ABC algorithm The two approximations made yield the rejection algorithm for approximate Bayesian computation (ABC):
1. Sample θ_i ~ p_θ.
2. Simulate a data set y_i by running the simulator with θ_i (y_i = g(ω_i, θ_i)).
3. Compute the discrepancy ∆_i = d(T(y^o), T(y_i)).
4. Retain θ_i if ∆_i ≤ ε.
This is the basic ABC algorithm.
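A compact sketch of the four steps, assuming NumPy and using the Gaussian toy model from Equation (4) with the sample mean as summary statistic and the squared difference of means as discrepancy; the uniform prior on (-5, 5) is an illustrative choice:

```python
import numpy as np

def rejection_abc(y_obs, prior_sample, simulate, summary, distance, eps, n_draws, rng):
    t_obs = summary(y_obs)
    accepted = []
    for _ in range(n_draws):
        theta_i = prior_sample(rng)                 # 1. theta_i ~ p_theta
        y_i = simulate(theta_i, rng)                # 2. simulate a data set
        delta_i = distance(t_obs, summary(y_i))     # 3. compute the discrepancy
        if delta_i <= eps:                          # 4. retain if small enough
            accepted.append(theta_i)
    return np.asarray(accepted)

# Gaussian toy model: y_i ~ N(theta, 1), summary = sample mean
rng = np.random.default_rng(0)
y_obs = rng.normal(2.0, 1.0, size=50)
samples = rejection_abc(
    y_obs,
    prior_sample=lambda r: r.uniform(-5, 5),
    simulate=lambda th, r: r.normal(th, 1.0, size=50),
    summary=np.mean,
    distance=lambda a, b: (a - b) ** 2,
    eps=0.01, n_draws=20_000, rng=rng,
)
print(samples.mean(), samples.std())   # approximate posterior mean and std
```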

35 Properties The rejection ABC algorithm produces samples θ ~ p_ε(θ | y^o), where p_ε(θ | y^o) ∝ p_θ(θ) L_ε(θ), (10) L_ε(θ) ∝ Pr(d(T(y^o), T(y)) ≤ ε | θ) = Pr(∆_θ ≤ ε | θ). (11) Inference is approximate due to the summary statistics T and distance d, the threshold ε > 0, and the finite number of samples (Monte Carlo error).

36 Part II Computational and statistical efficiency

37 Brief recap Simulator-based models: models which are specified by a data generating mechanism. By construction, we can sample from simulator-based models, but the likelihood function can generally not be written down. Rejection ABC: a trial-and-error scheme to find parameter values which produce simulated data resembling the observed data; simulated data resemble the observed data if some discrepancy measure is small.

38 Efficiency of ABC 1. Computational efficiency: how to efficiently find the parameter values which yield a small discrepancy? 2. Statistical efficiency: how to measure the discrepancy between the simulated and observed data?

39 Program Computational efficiency: difficulties, solutions, recent work. Statistical efficiency: difficulties, solutions, recent work.


41 Example Inference of the mean θ of a Gaussian of variance one. Pr(y = y^o | θ) = 0, so we work with a discrepancy: ∆_θ = (ˆµ^o − ˆµ_θ)², where ˆµ^o = (1/n) ∑_{i=1}^n y_i^o and ˆµ_θ = (1/n) ∑_{i=1}^n y_i with y_i ~ N(θ, 1). The discrepancy ∆_θ is a random variable. [Figure: realizations of ∆_θ as a function of θ, together with its mean and quantiles, and realizations at a fixed θ.]

42 Example The probability that ∆_θ is below some threshold ε approximates the likelihood function. [Figure: mean and quantiles of ∆_θ with the threshold (0.1), and the resulting approximate likelihood (rescaled) compared with the true likelihood (rescaled), as functions of θ.]

43 Example Here, T(y) = (1/n) ∑_{i=1}^n y_i is a sufficient statistic for inference of the mean θ, so the only approximation is ε > 0. In general, the summary statistics will not be sufficient.

44 Example In the Gaussian example, the probability that ∆_θ ≤ ε can be computed in closed form. With ∆_θ = (ˆµ^o − ˆµ_θ)²: Pr(∆_θ ≤ ε) = Φ(√n (ˆµ^o − θ) + √(nε)) − Φ(√n (ˆµ^o − θ) − √(nε)), where Φ(x) = ∫_{−∞}^x (1/√(2π)) exp(−u²/2) du. For √(nε) small, Pr(∆_θ ≤ ε) is approximately proportional to √ε L(θ), so L_ε(θ) ∝ Pr(∆_θ ≤ ε) provides a good approximation of the likelihood function. But for small ε, Pr(∆_θ ≤ ε) → 0: very few samples will be accepted. [Figure: mean and quantiles of ∆_θ with threshold 0.1, and the approximate likelihood (rescaled) compared with the true likelihood (rescaled).]
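A quick numerical check of the closed-form probability (as reconstructed above) against an empirical acceptance rate; Φ is evaluated via the error function, and the parameter values below are illustrative:

```python
import math
import numpy as np

def Phi(x):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = np.random.default_rng(0)
n, theta, eps = 50, 1.5, 0.01
y_obs = rng.normal(2.0, 1.0, size=n)
mu_obs = y_obs.mean()

# Closed form: Pr(Delta_theta <= eps) with Delta_theta = (mu_obs - mu_theta)^2
p_closed = (Phi(math.sqrt(n) * (mu_obs - theta) + math.sqrt(n * eps))
            - Phi(math.sqrt(n) * (mu_obs - theta) - math.sqrt(n * eps)))

# Monte Carlo: simulate mu_theta ~ N(theta, 1/n) and count small discrepancies
mu_theta = rng.normal(theta, 1.0 / math.sqrt(n), size=200_000)
p_mc = np.mean((mu_obs - mu_theta) ** 2 <= eps)

print(p_closed, p_mc)   # the two values should agree closely
```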

45 Two widely used algorithms Two widely used algorithms improve computationally upon rejection ABC: 1. regression ABC (Beaumont et al., 2002) and 2. sequential Monte Carlo ABC (Sisson et al., 2007). Both use rejection ABC as a building block. Sequential Monte Carlo (SMC) ABC is also known as population Monte Carlo (PMC) ABC.

46 Two widely used algorithms Regression ABC consists in running rejection ABC with a relatively large ε and then adjusting the obtained samples so that they are closer to samples from the true posterior. Sequential Monte Carlo ABC consists in sampling θ from an adaptively constructed proposal distribution φ(θ) rather than from the prior, in order to avoid simulating many data sets which are not accepted.

47 Basic idea of regression ABC The summary statistics t_θ = T(y_θ) and θ have a joint distribution. Let t_i be the summary statistics for the simulated data y_i = g(ω_i, θ_i). We can learn a regression model between the summary statistics (covariates) and the parameters (response variables), θ_i = f(t_i) + ξ_i, (12) where ξ_i is the error term (a zero-mean random variable). The training data for the regression are typically tuples (θ_i, t_i) produced by rejection ABC with some sufficiently large ε.

48 Basic idea of regression ABC Fitting the regression model to the training data (θ_i, t_i) yields an estimated regression function ˆf and the residuals ˆξ_i, ˆξ_i = θ_i − ˆf(t_i). (13) Regression ABC consists in replacing each θ_i with an adjusted value θ*_i, θ*_i = ˆf(t^o) + ˆξ_i = ˆf(t^o) + θ_i − ˆf(t_i). (14) If the relation between t and θ is learned correctly, the θ*_i correspond to samples from an approximation with ε = 0.
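A minimal sketch of the adjustment (13)-(14) for a scalar parameter using plain least squares, assuming NumPy; Beaumont et al. (2002) use a weighted local-linear regression, which is omitted here for brevity:

```python
import numpy as np

def regression_adjust(theta, t_sim, t_obs):
    """Adjust rejection-ABC samples theta via a linear regression of theta
    on the summary statistics (plain least squares, no local weighting)."""
    theta = np.asarray(theta, dtype=float)           # shape (m,)
    t_sim = np.asarray(t_sim, dtype=float)           # shape (m, k)
    X = np.column_stack([np.ones(len(t_sim)), t_sim])
    coef, *_ = np.linalg.lstsq(X, theta, rcond=None)     # fit theta_i = f(t_i) + xi_i
    f_sim = X @ coef                                     # f_hat(t_i)
    f_obs = np.concatenate(([1.0], np.atleast_1d(t_obs))) @ coef   # f_hat(t_obs)
    return f_obs + (theta - f_sim)                       # theta*_i = f_hat(t_obs) + residuals
```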

49 Basic idea of sequential Monte Carlo ABC We may modify the rejection ABC algorithm and use a proposal distribution φ(θ) instead of the prior p_θ:
1. Sample θ_i ~ φ(θ).
2. Simulate a data set y_i by running the simulator with θ_i (y_i = g(ω_i, θ_i)).
3. Compute the discrepancy ∆_i = d(T(y^o), T(y_i)).
4. Retain θ_i if ∆_i ≤ ε.
The retained samples follow a distribution proportional to φ(θ) L_ε(θ).

50 Basic idea of sequential Monte Carlo ABC Parameters θ_i weighted with w_i = p_θ(θ_i) / φ(θ_i) (15) follow a distribution proportional to p_θ(θ) L_ε(θ). This can be used to iteratively morph the prior into the posterior: use a sequence of shrinking thresholds ε_t; run rejection ABC with the first threshold ε_0; at iteration t, define φ_t based on the weighted samples from the previous iteration (e.g. a Gaussian mixture with means equal to the θ_i from the previous iteration); see the sketch below.
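A rough sketch of this scheme for a scalar parameter, assuming NumPy and SciPy; the uniform prior, the fixed Gaussian kernel width, and the threshold schedule are illustrative choices rather than the tuning rules of Sisson et al. (2007):

```python
import numpy as np
from scipy.stats import norm

def pmc_abc(y_obs, simulate, summary, distance, eps_schedule, n_particles, rng,
            prior_low=-5.0, prior_high=5.0, kernel_sd=0.5):
    """Sketch of PMC/SMC ABC for a scalar parameter with a uniform prior."""
    t_obs = summary(y_obs)
    prior_pdf = lambda th: (1.0 / (prior_high - prior_low)
                            if prior_low < th < prior_high else 0.0)

    # First round: rejection ABC from the prior with threshold eps_schedule[0]
    thetas = []
    while len(thetas) < n_particles:
        th = rng.uniform(prior_low, prior_high)
        if distance(t_obs, summary(simulate(th, rng))) <= eps_schedule[0]:
            thetas.append(th)
    thetas = np.array(thetas)
    weights = np.full(n_particles, 1.0 / n_particles)

    # Subsequent rounds: propose from a Gaussian mixture around the particles
    for eps in eps_schedule[1:]:
        new_thetas = []
        while len(new_thetas) < n_particles:
            j = rng.choice(n_particles, p=weights)
            th = rng.normal(thetas[j], kernel_sd)
            if prior_pdf(th) > 0 and distance(t_obs, summary(simulate(th, rng))) <= eps:
                new_thetas.append(th)
        new_thetas = np.array(new_thetas)
        phi = np.array([np.sum(weights * norm.pdf(th, thetas, kernel_sd))
                        for th in new_thetas])                        # proposal density
        weights = np.array([prior_pdf(th) for th in new_thetas]) / phi   # w_i = p(th)/phi(th)
        weights /= weights.sum()
        thetas = new_thetas

    return thetas, weights

# Example usage with the Gaussian toy model and sample-mean summary:
# thetas, weights = pmc_abc(y_obs, lambda th, r: r.normal(th, 1.0, size=50),
#                           np.mean, lambda a, b: (a - b) ** 2,
#                           eps_schedule=[1.0, 0.3, 0.1], n_particles=500,
#                           rng=np.random.default_rng(0))
```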

51 Basic idea of sequential Monte Carlo ABC [Figure: construction of the proposal distribution; starting from the prior at t = 1 with shrinking thresholds ε_1 > ε_2 > ε_3, the proposals at t = 2 and t = 3 concentrate around the true posterior, and each proposal is fed to the simulator M(θ, ·).]

52 Learning a model of the discrepancy Recall L_ε(θ) ∝ Pr(∆_θ ≤ ε | θ): the approximate likelihood function L_ε(θ) is determined by the distribution of the discrepancy ∆_θ. If we knew the distribution of ∆_θ, we could compute L_ε(θ). We proposed to learn a model of ∆_θ and to approximate L_ε(θ) by ˆL_ε(θ), obtained by evaluating Pr(∆_θ ≤ ε | θ) under the learned model. (16) The model is learned more accurately in regions where ∆_θ tends to be small, to make further computational savings. (Gutmann and Corander, Journal of Machine Learning Research, in press)
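As a toy sketch of the idea on the Gaussian-mean example: fit a regression model (here a Gaussian process, assuming scikit-learn) to simulated discrepancies and evaluate Pr(∆_θ ≤ ε) under the fitted model. This only illustrates the modelling step; the paper additionally chooses where to simulate via optimization, which is not shown here:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n = 50
y_obs = rng.normal(2.0, 1.0, size=n)

# Run the simulator at a modest number of parameter values, record discrepancies
thetas = rng.uniform(-5, 5, size=200)
deltas = np.array([(y_obs.mean() - rng.normal(th, 1.0, size=n).mean()) ** 2
                   for th in thetas])

# Regression model of the discrepancy Delta_theta as a function of theta
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(thetas.reshape(-1, 1), deltas)

def approx_likelihood(theta_grid, eps=0.05):
    """L_eps(theta) approximated by Pr(Delta_theta <= eps) under the learned
    model (Gaussian approximation from the GP's predictive mean and std)."""
    mean, std = gp.predict(theta_grid.reshape(-1, 1), return_std=True)
    return norm.cdf((eps - mean) / std)

grid = np.linspace(-5, 5, 101)
print(grid[np.argmax(approx_likelihood(grid))])   # typically close to the data mean, about 2.0
```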

53 Example: bacterial infections in child care centers Likelihood intractable for cross-sectional data, but generating data from the model is possible. Parameters of interest: the rate of infections within a center, the rate of infections from outside, and the competition between the strains (Numminen et al., 2013). [Figure: presence of bacterial strains over time for the individuals attending a day care center.]

54 Example: bacterial infections in child care centers Comparison of the proposed approach with a standard population Monte Carlo ABC approach: roughly equal results using 1000 times fewer simulations (4.5 days with 200 cores versus 90 minutes with seven cores). [Figure: posterior means (solid lines) and credibility intervals (shaded areas or dashed lines) for the competition parameter as a function of computational cost (log10), for the developed fast method and the standard method.] (Gutmann and Corander, 2015)

55 Example: bacterial infections in child care centers Comparison of the proposed approach with a standard population Monte Carlo ABC approach: roughly equal results using 1000 times fewer simulations. [Figure: posterior means (solid lines) and credibility intervals (shaded areas or dashed lines) for the internal and external infection parameters as functions of computational cost (log10), for the developed fast method and the standard method.]

56 Program Computational efficiency: difficulties, solutions, recent work. Statistical efficiency: difficulties, solutions, recent work.

57 Discrepancy measure affects the accuracy of the estimates A bad discrepancy yields an estimated posterior equal to the prior, or a vanishingly small acceptance probability. A good discrepancy achieves a good trade-off between loss of information and increase in acceptance probability.

58 How to choose the discrepancy measure? Manually: use expert knowledge about y^o to define summary statistics T and use the Euclidean distance for d, ∆_θ = ||T(y^o) − T(y_θ)||. Semi-automatically: simulate pairs (θ_i, y_i); define a large number of candidate summary statistics; define T as a smaller number of (linear) combinations of them, automatically learned from the simulated pairs (see the sketch below); again use the Euclidean distance for d. The combinations are typically determined via regression with the candidate statistics of y_θ as covariates and the parameters θ as response variables (e.g. Nunes and Balding, 2010; Fearnhead and Prangle, 2012; Aeschbacher et al., 2012; Blum et al., 2013).
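A sketch of the semi-automatic construction in the spirit of Fearnhead and Prangle (2012), assuming NumPy; the pool of candidate statistics and the toy Gaussian simulator are illustrative, and refinements such as a pilot run are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

def candidate_stats(y):
    # A (possibly redundant) pool of candidate summary statistics
    return np.array([y.mean(), np.median(y), y.std(), np.mean(y ** 2)])

# Simulate pairs (theta_i, y_i) and compute their candidate statistics
thetas = rng.uniform(-5, 5, size=2000)
S = np.array([candidate_stats(rng.normal(th, 1.0, size=n)) for th in thetas])

# Linear regression of theta on the candidate statistics
X = np.column_stack([np.ones(len(S)), S])
coef, *_ = np.linalg.lstsq(X, thetas, rcond=None)

def summary(y):
    """Learned one-dimensional summary statistic T(y)."""
    return np.concatenate(([1.0], candidate_stats(y))) @ coef

y_obs = rng.normal(2.0, 1.0, size=n)
print(summary(y_obs))   # roughly a point estimate of the theta behind y_obs (about 2.0)
```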

59 Discrepancy measurement via classification (Gutmann et al., 2014) Use classification accuracy (discriminability) between observed and simulated data sets as the discrepancy measure ∆_θ. A discriminability of 100% indicates maximally different data sets; 50% indicates similar data sets. [Figure: observed data plotted against simulated data with mean (6, 0), which are easy to discriminate, and against simulated data with mean (1/2, 0), which are hard to discriminate.]
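A toy sketch of the classifier-based discrepancy, assuming scikit-learn: pool the observed and simulated data points, label them by origin, and use cross-validated classification accuracy as ∆_θ (logistic regression is an illustrative choice of classifier; the means (6, 0) and (1/2, 0) mirror the figure):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def classifier_discrepancy(y_obs, y_sim):
    """Cross-validated accuracy for discriminating observed from simulated data:
    about 0.5 for similar data sets, near 1.0 for very different ones."""
    X = np.vstack([y_obs, y_sim])
    labels = np.concatenate([np.zeros(len(y_obs)), np.ones(len(y_sim))])
    return cross_val_score(LogisticRegression(), X, labels, cv=5).mean()

y_obs = rng.normal([0.0, 0.0], 1.0, size=(200, 2))    # observed 2-d data
y_far = rng.normal([6.0, 0.0], 1.0, size=(200, 2))     # simulated data, mean (6, 0)
y_near = rng.normal([0.5, 0.0], 1.0, size=(200, 2))    # simulated data, mean (1/2, 0)
print(classifier_discrepancy(y_obs, y_far))    # close to 1.0 (very different)
print(classifier_discrepancy(y_obs, y_near))   # much closer to 0.5 (hard to discriminate)
```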

60 Example: bacterial infections in child care centers Likelihood intractable for cross-sectional data, but generating data from the model is possible. Parameters of interest: the rate of infections within a center, the rate of infections from outside, and the competition between the strains (Numminen et al., 2013). [Figure: presence of bacterial strains over time for the individuals attending a day care center.]

61 Example: bacterial infections in child care centers (Gutmann et al., 2014) Our classification-based distance measure does not use domain/expert knowledge, yet it performs as well as a distance measure based on domain knowledge (Numminen et al., 2013). [Figure: posterior pdfs for β, Λ, and θ obtained with the expert-defined and the classifier-based distance.]

62 Example: bacterial infections in child care centers (Gutmann et al., 2014) Robustness is a concern when relying on expert knowledge; the classification-based distance can automatically compensate for errors in the expert input. [Figure: posterior probability densities of the external infection parameter for the developed robust method and the standard method, relative to a reference, illustrating deterioration of the expert input and its compensation.]


63 Summary The topic was Bayesian inference for models specified via a simulator (implicit / generative models). Introduced approximate Bayesian computation (ABC). Principle of ABC: find parameter values which yield simulated data resembling the observed data. Covered three classical algorithms: 1. rejection ABC, 2. regression ABC, 3. sequential Monte Carlo ABC. Discussed the choice of the discrepancy measure between simulated and observed data. Recent work of mine: combining modeling of the discrepancy with optimization to increase computational efficiency, and using classification to measure the discrepancy.

67 References
My papers: Numminen et al. Estimating the Transmission Dynamics of Streptococcus pneumoniae from Strain Prevalence Data. Biometrics, 2013.
Beaumont et al. Approximate Bayesian Computation in Population Genetics. Genetics, 2002.
Sisson et al. Sequential Monte Carlo without likelihoods. PNAS, 2007.
Nunes and Balding. On optimal selection of summary statistics for approximate Bayesian computation. Statistical Applications in Genetics and Molecular Biology, 2010.
Fearnhead and Prangle. Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2012.
Aeschbacher et al. A novel approach for choosing summary statistics in approximate Bayesian computation. Genetics, 2012.
Blum et al. A Comparative Review of Dimension Reduction Methods in Approximate Bayesian Computation. Statistical Science, 2013.

68 Review papers
Beaumont. Approximate Bayesian Computation in Evolution and Ecology. Annual Review of Ecology, Evolution, and Systematics.
Hartig et al. Statistical inference for stochastic simulation models - theory and application. Ecology Letters.
Marin et al. Approximate Bayesian computational methods. Statistics and Computing.
Sunnaker et al. Approximate Bayesian Computation. PLoS Computational Biology.
