Introduction to Bayesian thinking
1 Introduction to Bayesian thinking Statistics seminar Rodrigo Díaz Geneva Observatory, April 11th, 2016
2 Agenda (I) Part I. Basics. General concepts & history of Bayesian statistics. Bayesian inference. Part II. Likelihoods. Constructing likelihoods. Part III. Priors. General concepts and problems. Subjective vs objective priors. Assigning priors.
3 Agenda (II) Part IV. Posterior. Conjugate priors. The right prior for a given likelihood. MCMC. Part V. Model comparison. Model evidence and Bayes factor. The philosophy of integrating over parameter space. Estimating the model evidence. Dos and Don'ts. Point estimate: BIC. Importance sampling and a family of methods. Nested sampling.
4 Part I basics
5 Statistical inference. [Fig. adapted from Gregory (2005): a testable hypothesis (theory) leads, by deductive inference, to predictions; observations (data) feed back, by statistical inference, into hypothesis testing and parameter estimation.]
6 Statistical inference requires a probability theory: frequentist or Bayesian.
7 Thomas Bayes (1701-1761). First appearance of the product rule (the basis for Bayes' theorem) in An Essay towards solving a Problem in the Doctrine of Chances:
p(H_i | D, I) = p(D | H_i, I) p(H_i | I) / p(D | I)
Pierre-Simon Laplace (1749-1827). Wide application of Bayes' rule. Principle of insufficient reason (non-informative priors). Primitive version of the Bernstein-von Mises theorem. Laplace's "inverse probability" is largely rejected for ~100 years: the reign of frequentist probability (Fisher, Pearson, etc.).
8 Harold Jeffreys (1891-1989). Objective Bayesian probability revived. Jeffreys' rule for priors. (1940s onwards) R. T. Cox, George Pólya, E. T. Jaynes: plausible reasoning; reasoning with uncertainty. Probability theory as an extension of Aristotelian logic. The product and sum rules deduced from basic principles. MAXENT priors. See E. T. Jaynes, Probability Theory: The Logic of Science.
9 Statistical inference. [Fig. adapted from Gregory (2005), shown again: a testable hypothesis (theory) leads, by deductive inference, to predictions; observations (data) feed back, by statistical inference, into hypothesis testing and parameter estimation.]
10 The three desiderata. Rules of deductive reasoning: strong syllogisms from Aristotelian logic. The brain works using plausible inference (the woman-in-the-wood-cabin story). The rules for plausible reasoning are deduced from three simple desiderata on how the mind of a thinking robot should work (Cox-Pólya-Jaynes). I. Degrees of plausibility are represented by real numbers. II. Qualitative correspondence with common sense. III. If a conclusion can be reasoned out in more than one way, then every possible way must lead to the same result.
11 "...if degrees of plausibility are represented by real numbers, then there is a uniquely determined set of quantitative rules for conducting inference. That is, any other rules whose results conflict with them will necessarily violate an elementary (and nearly inescapable) desideratum of rationality or consistency. But the final result was just the standard rules of probability theory, given already by Bernoulli and Laplace, so why all the fuss? The important new feature was that these rules were now seen as uniquely valid principles of logic in general, making no reference to 'chance' or 'random variables'; so their range of application is vastly greater than had been supposed in the conventional probability theory that was developed in the early 20th century. As a result, the imaginary distinction between 'probability theory' and 'statistical inference' disappears, and the field achieves not only logical unity and simplicity, but far greater technical power and flexibility in applications." (E. T. Jaynes)
12 The basic rules of probability theory. Sum rule: p(A + B) = p(A) + p(B) - p(AB). Product rule: p(AB) = p(A|B) p(B) = p(B|A) p(A). Once these rules are recognised as general ways of manipulating any proposition, a world of applications opens up. "Aristotelian deductive logic is the limiting form of our rules for plausible reasoning." (E. T. Jaynes)
13 "...if degrees of plausibility are represented by real numbers, then there is a uniquely determined set of quantitative rules for conducting inference. That is, any other rules whose results conflict with them will necessarily violate an elementary (and nearly inescapable) desideratum of rationality or consistency. But the final result was just the standard rules of probability theory, given already by Bernoulli and Laplace, so why all the fuss? The important new feature was that these rules were now seen as uniquely valid principles of logic in general, making no reference to 'chance' or 'random variables'; so their range of application is vastly greater than had been supposed in the conventional probability theory that was developed in the early 20th century. As a result, the imaginary distinction between 'probability theory' and 'statistical inference' disappears, and the field achieves not only logical unity and simplicity, but far greater technical power and flexibility in applications." (E. T. Jaynes)
14 The Bayesian recipe. Sum rule: p(A + B) = p(A) + p(B) - p(AB). Product rule: p(AB) = p(A|B) p(B) = p(B|A) p(A). Bayes' theorem: p(A|B) = p(B|A) p(A) / p(B).
15 The Bayesian recipe. Law of total probability: p(A|I) = Σ_i p(A|B_i, I) p(B_i|I), for {B_i} exclusive and exhaustive. [Venn diagram: A overlapping a partition B_1 ... B_6.]
16 The Bayesian recipe. Sum rule: p(A + B) = p(A) + p(B) - p(AB). Product rule: p(AB) = p(A|B) p(B) = p(B|A) p(A). Bayes' theorem: p(A|B) = p(B|A) p(A) / p(B). Law of total probability: p(A|I) = Σ_i p(A|B_i, I) p(B_i|I), for {B_i} exclusive and exhaustive.
17 The Bayesian recipe. Sum rule: p(A + B) = p(A) + p(B) - p(AB). Product rule: p(AB) = p(A|B) p(B) = p(B|A) p(A). Bayes' theorem: p(A|B) = p(B|A) p(A) / p(B). Law of total probability: p(A|I) = Σ_i p(A|B_i, I) p(B_i|I), for {B_i} exclusive and exhaustive. Normalisation: Σ_i p(B_i|I) = 1, for {B_i} exclusive and exhaustive.
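These rules can be verified numerically on any discrete joint distribution. A minimal sketch (the joint table below is invented for illustration):

```python
import numpy as np

# Invented joint distribution p(A, B) over A in {0, 1} and B in {0, 1, 2}.
p_ab = np.array([[0.10, 0.25, 0.15],
                 [0.20, 0.05, 0.25]])

p_a = p_ab.sum(axis=1)                 # marginal p(A)
p_b = p_ab.sum(axis=0)                 # marginal p(B)

# Product rule: p(A, B) = p(A|B) p(B)
p_a_given_b = p_ab / p_b
assert np.allclose(p_a_given_b * p_b, p_ab)

# Bayes' theorem: p(A|B) = p(B|A) p(A) / p(B)
p_b_given_a = p_ab / p_a[:, None]
bayes = p_b_given_a * p_a[:, None] / p_b
assert np.allclose(bayes, p_a_given_b)

# Law of total probability: p(A) = sum_b p(A|B=b) p(B=b)
assert np.allclose((p_a_given_b * p_b).sum(axis=1), p_a)

# Normalisation: sum_b p(B=b) = 1
assert np.isclose(p_b.sum(), 1.0)
print("all rules hold")
```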
18 "Bayes", "Bayesian", MCMC
19
20 Two basic tasks of statistical inference Learning process (parameter estimation) Decision making (model comparison)
21 Learning process. Bayesian probability represents a state of knowledge. Notation: θ, parameter vector; H_i, hypothesis; I, information; D, data. Continuous (parameter) space: prior p(θ | H_i, I), posterior p(θ | D, H_i, I). Discrete space (hypothesis space): prior p(H_i | I), posterior p(H_i | I, D).
22 Learning process. θ: parameter vector; H_i: hypothesis; I: information; D: data. Enter the likelihood function:
p(θ | D, H_i, I) = p(D | θ, H_i, I) p(θ | H_i, I) / p(D | H_i, I)
i.e. posterior ∝ likelihood × prior: p(θ | D, H_i, I) ∝ L(θ; H_i) p(θ | H_i, I). The proportionality constant has many names: marginal likelihood, global likelihood, model evidence, prior predictive. Hard to compute. Important.
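On a grid, the recipe "posterior ∝ likelihood × prior" is just multiply and normalise. A sketch for a single parameter, with invented data (the mean of Gaussian measurements with known error bar, under a flat prior):

```python
import numpy as np

# Invented data: ten measurements with known sigma = 1.
data = np.array([4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.4, 5.0, 4.9])
sigma = 1.0

mu = np.linspace(0.0, 10.0, 2001)      # parameter grid
dmu = mu[1] - mu[0]
prior = np.ones_like(mu)               # flat prior (unnormalised)

# log-likelihood on the grid: sum_i log N(y_i | mu, sigma^2), up to a constant
loglike = -0.5 * np.sum((data[:, None] - mu[None, :])**2, axis=0) / sigma**2

posterior = np.exp(loglike - loglike.max()) * prior
posterior /= posterior.sum() * dmu     # the normalising constant is the (rescaled) evidence

post_mean = np.sum(mu * posterior) * dmu
print(f"posterior mean = {post_mean:.3f}")   # equals the sample mean, 5.03
```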
23 Optimising the learning process. The likelihood needs to be selective for the learning process to be effective. [Fig. from P. Gregory (2005), two panels: prior p(H_0|M_1,I), likelihood p(D|H_0,M_1,I) and posterior p(H_0|D,M_1,I) vs. parameter H_0; left panel dominated by prior information, right panel dominated by data.]
24 Decision making. Bayes' theorem is also the basis for model comparison:
p(H_i | D, I) = p(D | H_i, I) p(H_i | I) / p(D | I)
but now p(D | H_i, I) = ∫ p(D | θ, H_i, I) p(θ | H_i, I) d^nθ. Computation of the evidence cannot be escaped.
25 Decision making. Model comparison consists in computing the ratio of the posteriors (odds ratio) of two competing hypotheses:
p(H_i | D, I) / p(H_j | D, I) = [p(D | H_i, I) / p(D | H_j, I)] × [p(H_i | I) / p(H_j | I)] = Bayes factor × prior odds.
26 Part II likelihoods
27 The likelihood function. θ: parameter vector; H_i: hypothesis; I: information; D: data.
p(θ | D, H_i, I) = p(D | θ, H_i, I) p(θ | H_i, I) / p(D | H_i, I), i.e. posterior ∝ L(θ; H_i) × prior.
The likelihood is the probability of obtaining the data D, for a given prior information I and a set of parameters θ. Remember: the likelihood is not a probability distribution for the parameter vector θ (for that you need the prior).
28 Optimising the learning process. The likelihood needs to be selective for the learning process to be effective. [Fig. from P. Gregory (2005), shown again, two panels: prior, likelihood and posterior vs. parameter H_0; left panel dominated by prior information, right panel dominated by data.]
29 Likelihood function. θ: parameter vector; H_i: hypothesis; I: information; D: data. Ingredients: physical model (analytic model, simulations); statistical (non-deterministic) model (unknown errors or "jitter", instrument systematics, complex physics such as activity); error statistics (covariances, non-Gaussianity). For independent Gaussian errors: p(D | θ, H_i, I) = L(θ; H_i) ∝ exp(-χ²/2).
30 Constructing the likelihood. The data: D = D_1 D_2 ... D_n = {D_i}, where D_i is the statement "the i-th measurement is in the infinitesimal range y_i to y_i + dy_i". The errors: E_i is the statement "the i-th error is in the infinitesimal range e_i to e_i + de_i", with probability distribution p(E_i | θ, H, I) = f_E(e_i). Most used f_E: f_E(e_i) = N(0, σ_i²). The model: M_i is the statement "the i-th model value is in the infinitesimal range m_i to m_i + dm_i", with probability distribution p(M_i | θ, H, I) = f_M(m_i).
31 Constructing the likelihood. The data: D = D_1 D_2 ... D_n = {D_i}. We want to build the probability distribution p(D | θ, H, I) = p(D_1, D_2, ..., D_n | θ, H, I). Write y_i = m_i + e_i. It can be shown that
p(D_i | θ, H, I) = ∫ dm_i f_M(m_i) f_E(y_i - m_i)
(the convolution equation).
32 Constructing the likelihood. p(D_i | θ, H, I) = ∫ dm_i f_M(m_i) f_E(y_i - m_i). But for a deterministic model, m_i is obtained from a (usually analytic) function f without any uncertainty (say, a Keplerian curve for RV measurements): m_i = f(x_i). Then f_M(m_i) = δ(m_i - f(x_i)), and
p(D_i | θ, H, I) = ∫ dm_i δ(m_i - f(x_i)) f_E(y_i - m_i) = f_E(y_i - f(x_i)) = p(E_i | θ, H, I),
so p(D | θ, H, I) = p(D_1, D_2, ..., D_n | θ, H, I) = p(E_1, E_2, ..., E_n | θ, H, I).
33 Constructing the likelihood. p(D | θ, H, I) = p(D_1, D_2, ..., D_n | θ, H, I) = p(E_1, E_2, ..., E_n | θ, H, I). Now, for independent errors: p(D | θ, H, I) = p(E_1 | θ, H, I) ... p(E_n | θ, H, I) = Π_{i=1}^{n} p(E_i | θ, H, I). And if, in addition, the errors are normally distributed: p(D | θ, H, I) ∝ exp(-χ²/2), with χ² = Σ_i (y_i - f(x_i))² / σ_i².
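In practice one works with the log-likelihood to avoid numerical underflow. A sketch under the independent-Gaussian assumption above; the straight-line model f(x) = a x + b and the data are invented for illustration:

```python
import numpy as np

def log_likelihood(theta, x, y, sigma):
    """ln L(theta) for independent Gaussian errors: -chi^2/2 plus the
    normalisation term (which matters for model comparison)."""
    a, b = theta                        # illustrative straight-line model f(x) = a*x + b
    model = a * x + b
    chi2 = np.sum((y - model)**2 / sigma**2)
    norm = np.sum(np.log(2 * np.pi * sigma**2))
    return -0.5 * (chi2 + norm)

# Invented, noiseless data so chi^2 = 0 at the true parameters.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
sigma = np.full_like(x, 0.1)
print(log_likelihood((2.0, 1.0), x, y, sigma))
```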
34 Constructing the likelihood. Back to the convolution equation: p(D_i | θ, H, I) = ∫ dm_i f_M(m_i) f_E(y_i - m_i). For a non-deterministic model, M_i is distributed: M_i is the statement "the i-th model value is in the infinitesimal range m_i to m_i + dm_i", with probability distribution p(M_i | θ, H, I) = f_M(m_i). E.g. adding instrumental error / resolution: f_M(m_i) = N(f(x_i), σ_inst²).
35 Part III priors
36 xkcd.com
37 Prior probabilities p(H_i | I). H_i: hypothesis (can be continuous); I: information. Prior information I is always present: the term "prior" does not necessarily mean earlier in time. Philosophical controversy on how to assign priors: subjective vs. objective views. No single universal rule, but a few accepted methods. Informative priors: usually based on the output of previous observations (but what was the prior of the first analysis?). Ignorance priors: required as a starting point for the theory.
38 Assigning ignorance priors. 1. Principle of indifference. Given n mutually exclusive, exhaustive hypotheses {H_i}, with i = 1, ..., n, the PoI states: p(H_i | I) = 1/n.
39 Assigning ignorance priors. 2. Transformation groups: location and scale parameters. For certain types of parameters (location and scale), total ignorance can be represented as invariance under a certain (group of) transformation(s). Location (e.g. the position of the highest tree along a river): the problem must be invariant under a translation X' = X + c. Then p(X|I) dX = p(X'|I) dX' = p(X'|I) d(X + c) = p(X'|I) dX, so p(X|I) = constant: the uniform prior.
40 Assigning ignorance priors. 2. Transformation groups: location and scale parameters. Scale (e.g. the lifetime of a new bacterium, or a Poisson rate): the problem must be invariant under a scaling X' = aX. Then p(X|I) dX = p(X'|I) dX' = p(X'|I) d(aX) = a p(X'|I) dX, so p(X|I) = constant / X: the Jeffreys prior.
41 Assigning ignorance priors. 3. Jeffreys' rule. Besides location and scale parameters, little more can be said using transformation invariance. Jeffreys priors use the Fisher information; they are parameterisation invariant, but behave strangely in many dimensions. Observed Fisher information: I_D(θ) = -d² log L_D(θ) / dθ². But D is not known when we have to define a prior, so use the expectation value over D: I(θ) = -E_D[d² log L_D(θ) / dθ²].
42 Assigning ignorance priors. 3. Jeffreys' rule says: p(θ | I) ∝ √I(θ). Examples: mean μ of a Normal distribution with known variance σ²: p(μ | σ², I) ∝ constant. Rate λ of a Poisson distribution: p(λ | I) ∝ 1/√λ. Exercise: scale of a Normal with known mean value?
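The Poisson example can be derived in a few lines from the definitions above, for a single count k with rate λ (the exercise for the Normal scale proceeds the same way):

```latex
\log L(\lambda) = -\lambda + k\log\lambda - \log k!
\qquad\Longrightarrow\qquad
\frac{d^2\log L}{d\lambda^2} = -\frac{k}{\lambda^2}

I(\lambda) = -E_k\!\left[\frac{d^2\log L}{d\lambda^2}\right]
           = \frac{E[k]}{\lambda^2}
           = \frac{\lambda}{\lambda^2}
           = \frac{1}{\lambda}
\qquad\Longrightarrow\qquad
p(\lambda \mid I) \propto \sqrt{I(\lambda)} = \lambda^{-1/2}
```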
43 Assigning ignorance priors. 3. Jeffreys' rule: p(θ | I) ∝ √I(θ), with I(θ) = -E_D[d² log L_D(θ) / dθ²]. It favours the parts of parameter space where the data provide more information, and it is invariant under reparametrisation, but it works well only in one dimension. See more examples here: en.wikipedia.org/wiki/Jeffreys_prior
44 Assigning ignorance priors. 4. Maximum entropy. Uses information theory to define a measure of the information in a distribution: S = -Σ_i p_i log(p_i). Maximise the entropy subject to constraints (use Lagrange multipliers). Defined strictly for discrete distributions.
45 Assigning ignorance priors. 5. Reference priors. A recent development in the field (Bernardo; Bernardo, Berger, & Sun). Similar to MAXENT, but maximise the Kullback-Leibler divergence between prior and posterior: priors which maximise the expected difference with the posterior. The rigorous definition is complicated and subtle. In 1-D, reference priors are the Jeffreys priors; in multi-D, reference priors behave better than Jeffreys priors.
46 Agenda (II) Part IV. Posterior. Conjugate priors. The right prior for a given likelihood. MCMC. Part V. Model comparison. Model evidence and Bayes factor. The philosophy of integrating over parameter space. Estimating the model evidence. Dos and Don'ts. BIC, Chib & Jeliazkov, etc. Importance sampling and a family of methods.
47 Part IV sampling the posterior / MCMC
48 Sampling from the posterior. p(θ | D, H_i, I) ∝ L(θ; H_i) p(θ | H_i, I). The posterior distribution is proportional to the likelihood times the prior. The normalising constant (called model evidence, marginal likelihood, etc.) is important when comparing different models. The posterior contains all the information on a given model a Bayesian statistician can get for a given set of priors and data. The posterior is analytical only in a few cases: conjugate priors. Other methods are needed to sample from the posterior. Remember: most Bayesian computations can be reduced to expectation values with respect to the posterior.
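Conjugacy means the posterior is available in closed form, with no sampling at all. The standard Beta-Binomial example (all numbers invented):

```python
from scipy import stats

# Beta(a, b) prior on a success probability, then observe k successes in n trials.
a, b = 2.0, 2.0
k, n = 7, 10

# Conjugate update: the posterior is again a Beta, with updated parameters.
posterior = stats.beta(a + k, b + n - k)

print(posterior.mean())             # (a + k) / (a + b + n) = 9/14 ~ 0.643
lo, hi = posterior.interval(0.95)   # 95% credible interval, also in closed form
print(lo, hi)
```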
49-51 Markov Chain Monte Carlo. θ: parameter vector; H_i: hypothesis; I: information; D: data. p(θ | D, H_i, I) ∝ L(θ; H_i) p(θ | H_i, I). Metropolis-Hastings: 1. Evaluate L(θ_0) p(θ_0 | I) at the current point θ_0. 2. Create a proposal point θ* by drawing from the proposal density q(θ* | θ_0). 3. Evaluate L(θ*) p(θ* | I). 4. Compute the ratio r = L(θ*) p(θ* | I) / [L(θ_0) p(θ_0 | I)] (for a symmetric proposal q). 5. Accept the proposal with probability min(1, r); otherwise stay at θ_0.
52 Markov Chain Monte Carlo. p(θ | D, H_i, I) ∝ L(θ; H_i) p(θ | H_i, I). Algorithms: Metropolis-Hastings, Gibbs sampling, slice sampling, Hybrid Monte Carlo. Codes: pymc, emcee, kombine, cobmcmc.
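The five Metropolis-Hastings steps fit in a dozen lines. A minimal random-walk sketch; the target (a standard normal standing in for likelihood × prior), the step size and the chain length are all illustrative, and a real application would need tuning and convergence checks:

```python
import numpy as np

def log_post(theta):
    # Stand-in for log L(theta) + log p(theta | I): a standard normal target.
    return -0.5 * theta**2

rng = np.random.default_rng(42)
theta = 0.0
chain = []
for _ in range(20000):
    proposal = theta + rng.normal(scale=1.0)       # symmetric q(theta* | theta)
    log_r = log_post(proposal) - log_post(theta)   # log of the Metropolis ratio
    if np.log(rng.uniform()) < log_r:              # accept with probability min(1, r)
        theta = proposal
    chain.append(theta)                            # on rejection, repeat the old point

samples = np.array(chain[2000:])                   # discard burn-in
print(samples.mean(), samples.std())               # ~0 and ~1 for this target
```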
53 5 minutes to LEARN a lifetime to MASTER
54 But beware: non-convergence, bad mixing. The dark side of MCMC, they are. Problems with correlations and multi-modal distributions.
55 The problem with correlations. If parameters exhibit correlations, the step size must be small to reach the demanded fraction of accepted jumps. [Fig.: rejected vs. accepted proposals on a correlated 2-D posterior.] A very long chain is then needed to explore the entire posterior. Or, more relevantly, the entire posterior will not be explored thoroughly (i.e. artificially reduced error bars!).
56 MCMC: the Good, the Bad, and the Ugly. Visual inspection of traces. [Trace plots: good mixing; marginal mixing; no convergence.]
57 MCMC: the Good, the Bad, and the Ugly. Evaluation of the chain auto-correlations. [Image credit: E. Ford]
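Mixing can be quantified through the chain autocorrelation and the effective sample size it implies. A rough sketch (a production analysis would rather use the autocorrelation tools shipped with samplers such as emcee); the AR(1) test chain is invented so that the answer is known in advance:

```python
import numpy as np

def autocorr(chain, max_lag=200):
    """Normalised autocorrelation function of a 1-D chain."""
    x = chain - chain.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    return acf[:max_lag] / acf[0]

def effective_sample_size(chain, max_lag=200):
    rho = autocorr(chain, max_lag)
    tau = 1.0 + 2.0 * rho[1:].sum()    # crude integrated autocorrelation time
    return len(chain) / tau

# AR(1) chain with coefficient 0.9: tau should be ~ (1 + 0.9) / (1 - 0.9) = 19.
rng = np.random.default_rng(0)
x = np.zeros(50000)
for i in range(1, len(x)):
    x[i] = 0.9 * x[i - 1] + rng.normal()

print(effective_sample_size(x))        # roughly 50000 / 19, i.e. ~2600
```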
58 Multi-modal posteriors. Difficult for the sampler to move across modes if they are separated by a low-probability barrier. [Image credit: E. Ford]
59 Multi-modal posteriors. Run as many chains as possible starting from significantly different places in the prior space. Be paranoid! You can always be missing modes.
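The advice to run many chains can be made quantitative with the Gelman-Rubin statistic, which compares between-chain and within-chain variance and should be close to 1 at convergence. A minimal sketch on invented chains:

```python
import numpy as np

def gelman_rubin(chains):
    """R-hat for a set of equal-length 1-D chains (one chain per row)."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
good = rng.normal(size=(4, 5000))                     # four chains on the same mode
stuck = good + np.array([[0.0], [0.0], [0.0], [5.0]]) # one chain trapped in another mode

print(gelman_rubin(good))    # ~1.0: converged
print(gelman_rubin(stuck))   # >> 1: the modes have not mixed
```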
60
61 Part V model comparison
62 Decision making. Bayes' theorem is also the basis for model comparison:
p(H_i | D, I) = p(D | H_i, I) p(H_i | I) / p(D | I)
but now p(D | H_i, I) = ∫ p(D | θ, H_i, I) p(θ | H_i, I) d^nθ. Computation of the model evidence (a.k.a. marginal likelihood, global likelihood, prior predictive, ...) cannot be escaped.
63 Decision making. Model comparison consists in computing the ratio of the posteriors (odds ratio) of two competing hypotheses:
p(H_i | D, I) / p(H_j | D, I) = [p(D | H_i, I) / p(D | H_j, I)] × [p(H_i | I) / p(H_j | I)] = Bayes factor × prior odds.
64 A built-in Occam's razor. The Bayes factor naturally punishes models with more parameters: p(D | H_i, I) = ∫ p(D | θ, H_i, I) p(θ | H_i, I) dθ. E.g. model M_0 without free parameters vs. model M_1 with one parameter θ, where M_0 = M_1(θ = θ_0), so p(D | M_0, I) = p(D | θ_0, M_1, I). For a flat prior of width Δθ, p(θ | M_1, I) = 1/Δθ, and a likelihood L(θ) = p(D | θ, M_1, I) of characteristic width δθ peaking at θ̂:
p(D | M_1, I) = ∫ p(D | θ, M_1, I) p(θ | M_1, I) dθ = (1/Δθ) ∫ p(D | θ, M_1, I) dθ ≈ p(D | θ̂, M_1, I) δθ/Δθ.
Hence B_10 ≈ [p(D | θ̂, M_1, I) / p(D | θ_0, M_1, I)] × (δθ/Δθ). The factor δθ/Δθ < 1 is the Occam factor: one per parameter. [Fig. from Gregory (2005).]
65 Estimating the marginal likelihood p(d H i, I) = Z p(d, H i, I)p( H i, I) d n Marginal likelihood is a k-dimensional integral over parameter space. Accounting for the size of parameter space bring many good things to Bayesian statistics, but the integral is intractable Large number of techniques to estimate the value of the integral: Asymptotic estimates (Laplace approximation, BIC). Importance sampling. Chib & Jeliazkov. Nested sampling.
66 Estimating the marginal likelihood: point estimates. BIC = -2 ln L_max + k ln N (Bayesian Information Criterion). Some nice properties are lost, such as reasonable parameter penalisation (see Liddle 2007). The error term of the BIC is generally O(1): even for large samples, it does not produce the right value.
67 Importance Sampling (I). Used to obtain moments of distributions using samples from another distribution. The evidence is the expectation value of the likelihood over the prior: p(D | H_i, I) = ∫ p(D | θ, H_i, I) p(θ | H_i, I) d^nθ. The simplest estimate of the evidence is therefore to take samples θ^(i) from the prior and compute the average of p(D | θ^(i), H_i, I):
p̂(D | H_i, I) = (1/S) Σ_{i=1}^{S} p(D | θ^(i), H_i, I).
But this estimator is extremely inefficient if the likelihood is concentrated relative to the prior: a large number of samples is needed, which is computationally expensive.
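The prior-sampling estimator is two lines of code, and its behaviour is easy to probe on a toy problem where the evidence is known analytically (a single Gaussian datum with a Gaussian prior on its mean; all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(3)
y, sigma = 1.0, 1.0       # one invented datum with unit error bar
prior_sigma = 10.0        # diffuse N(0, 10^2) prior on the mean mu

def likelihood(mu):
    return np.exp(-0.5 * (y - mu)**2 / sigma**2) / np.sqrt(2 * np.pi * sigma**2)

# Prior-mean estimator: average the likelihood over draws from the prior.
S = 100_000
mu_samples = rng.normal(0.0, prior_sigma, size=S)
Z_hat = likelihood(mu_samples).mean()

# Analytic evidence for this Gaussian-Gaussian toy model: N(y | 0, sigma^2 + prior_sigma^2).
Z_true = np.exp(-0.5 * y**2 / (sigma**2 + prior_sigma**2)) \
         / np.sqrt(2 * np.pi * (sigma**2 + prior_sigma**2))
print(Z_hat, Z_true)   # agree here, but most prior draws miss the likelihood entirely,
                       # and the variance grows as the prior gets wider
```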
68 Importance Sampling (II). More generally, we can choose a distribution g(θ) (from which we can sample) and express the integral as an expectation value over that distribution (dropping the conditioning on I for notational simplicity):
∫ p(D | θ, H_i) p(θ | H_i) dθ = ∫ [p(θ | H_i) / g(θ)] p(D | θ, H_i) g(θ) dθ = E_g[w_g(θ) p(D | θ, H_i)] ≈ (1/S) Σ_{i=1}^{S} w_g(θ^(i)) p(D | θ^(i), H_i),
where w_g(θ) = p(θ | H_i) / g(θ) and θ^(i) is sampled from g(θ), the importance sampling function.
69 Importance Sampling (III). Different choices of g(θ) give different estimators. g(θ) = p(θ | H_i, I) gives the prior-mean estimate: p̂(D | H_i, I) = (1/S) Σ_i p(D | θ^(i), H_i, I). g(θ) = p(θ | D, H_i, I) gives the harmonic mean: p̂(D | H_i, I) = S [Σ_i p(D | θ^(i), H_i, I)^{-1}]^{-1}. The harmonic mean does not satisfy a Gaussian central limit theorem (Kass & Raftery 1995): the sum is dominated by the occasional terms with small likelihood, its variance can be infinite, and it is insensitive to diffuse priors (i.e. priors much larger than the likelihood).
70 Importance Sampling (IV). Perrakis et al. (2014) use the product of the posterior marginal distributions: g(θ) = Π_{i=1}^{k} p(θ_i | D, H, I), which leads to the estimator
p̂(D | H, I) = (1/S) Σ_{i=1}^{S} L(θ^(i)) p(θ^(i) | H, I) / Π_{j=1}^{k} p(θ_j^(i) | D, H, I).
It behaves much better than the harmonic mean, but requires estimating the marginal densities in the denominator; different techniques are available (moment-matching, kernel estimation, etc.). Samples from the marginal distributions are readily obtained from MCMC samples.
71 Estimating the marginal likelihood: estimators based on importance sampling (IS).

ISF                | Leads to...                          | But...
-------------------|--------------------------------------|------------------------------------------------------------
Prior              | Mean estimate                        | Very inefficient.
Posterior          | Harmonic mean estimate               | Dominated by points with low likelihood. Kass & Raftery (1995)
Posterior & prior  | Mixture estimate                     | Requires drawing from both posterior and prior. Kass & Raftery (1995)
Posterior x 2      | Truncated-posterior mixture estimate | Inconsistent, or as good as HM. Tuomi & Jones (2012)
72 Nested sampling (Skilling 2007) p(d H i, I) = Z p(d, H i, I)p( H i, I) d n A non-trivial change of variables. Define prior volume X dx = p( I, H)d n X( )= R L( )> p( I, H)dn Then, the integral is a simple 1-D integral over prior volume p(d H, I) = Z 1 0 L(X)dX
73 Nested sampling (Skilling 2007) p(d H, I) = Z 1 0 L(X)dX
74 Nested sampling (Skilling 2007). Algorithm to compute Z = p(D | H, I) = ∫_0^1 L(X) dX. Start with N points θ_1, ..., θ_N drawn from the prior; initialise Z = 0, X_0 = 1. Repeat for i = 1, 2, ..., j: record the lowest of the current likelihood values as L_i; set X_i = exp(-i/N) (crude) or sample it to get the uncertainty; set w_i = X_{i-1} - X_i (simple) or (X_{i-1} - X_{i+1})/2 (trapezoidal); increment Z by L_i w_i; then replace the point of lowest likelihood by a new one drawn from within L(θ) > L_i, in proportion to the prior π(θ). Finally, increment Z by N^{-1} [L(θ_1) + ... + L(θ_N)] X_j. Two keys: assigning X_i, and drawing with the condition L > L_i.
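The algorithm can be demonstrated end-to-end on a 1-D toy problem with known evidence. A deliberately naive sketch: the constrained draw is done by brute-force rejection from the prior, which is exactly the step that ellipsoidal methods replace in higher dimensions (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(7)

def loglike(theta):
    # Narrow Gaussian likelihood centred inside a U(0, 1) prior: true evidence ~ 1.
    return -0.5 * ((theta - 0.5) / 0.05)**2 - np.log(0.05 * np.sqrt(2 * np.pi))

N = 100                                    # live points
live = rng.uniform(0.0, 1.0, N)
live_logL = loglike(live)

Z, X_prev = 0.0, 1.0
for i in range(1, 701):
    worst = int(np.argmin(live_logL))
    L_i = np.exp(live_logL[worst])
    X_i = np.exp(-i / N)                   # crude deterministic shrinkage of prior volume
    Z += L_i * (X_prev - X_i)              # simple (rectangle) weight
    X_prev = X_i
    # Replace the worst point: rejection-sample the prior until L > L_i.
    while True:
        cand = rng.uniform(0.0, 1.0)
        if loglike(cand) > live_logL[worst]:
            live[worst] = cand
            live_logL[worst] = loglike(cand)
            break

Z += X_prev * np.exp(live_logL).mean()     # contribution of the remaining live points
print(Z)   # true evidence is ~1; expect agreement at the 10-20% level for N = 100
```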
75 Nested sampling (Skilling 2007). Drawing a new point with the condition L > L_i. Ellipsoidal nested sampling (Mukherjee et al. 2006) uses ellipsoidal contours around the active points, but becomes inefficient at high dimensions or with multi-modal posteriors. The MultiNest algorithm (Feroz, Hobson, & Bridges 2009): ellipsoids break up as needed to improve efficiency. Still problems at high dimensions (the curse goes on).
76 Nested sampling (Skilling 2007). Assigning X_i in p(D | H, I) = ∫_0^1 L(X) dX: use the statistical properties of randomly distributed points. X_0 = 1, X_i = t_i X_{i-1}, with p(t_i) = N t_i^{N-1} and E[log t] = -1/N. Either take the mean value, t_i = exp(-1/N) (so X_i = exp(-i/N)), or draw t_i from its distribution to account for the uncertainty.
77
Development of Stochastic Artificial Neural Networks for Hydrological Prediction G. B. Kingston, M. F. Lambert and H. R. Maier Centre for Applied Modelling in Water Engineering, School of Civil and Environmental
More informationBayesian Phylogenetics
Bayesian Phylogenetics Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison October 6, 2011 Bayesian Phylogenetics 1 / 27 Who was Bayes? The Reverand Thomas Bayes was born
More informationPenalized Loss functions for Bayesian Model Choice
Penalized Loss functions for Bayesian Model Choice Martyn International Agency for Research on Cancer Lyon, France 13 November 2009 The pure approach For a Bayesian purist, all uncertainty is represented
More informationFoundations of Statistical Inference
Foundations of Statistical Inference Julien Berestycki Department of Statistics University of Oxford MT 2016 Julien Berestycki (University of Oxford) SB2a MT 2016 1 / 32 Lecture 14 : Variational Bayes
More informationThe Jeffreys Prior. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) The Jeffreys Prior MATH / 13
The Jeffreys Prior Yingbo Li Clemson University MATH 9810 Yingbo Li (Clemson) The Jeffreys Prior MATH 9810 1 / 13 Sir Harold Jeffreys English mathematician, statistician, geophysicist, and astronomer His
More informationBayesian inference: what it means and why we care
Bayesian inference: what it means and why we care Robin J. Ryder Centre de Recherche en Mathématiques de la Décision Université Paris-Dauphine 6 November 2017 Mathematical Coffees Robin Ryder (Dauphine)
More informationBayesian RL Seminar. Chris Mansley September 9, 2008
Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in
More information17 : Markov Chain Monte Carlo
10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo
More informationBAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA
BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci
More informationLecture 6: Model Checking and Selection
Lecture 6: Model Checking and Selection Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de May 27, 2014 Model selection We often have multiple modeling choices that are equally sensible: M 1,, M T. Which
More information7. Estimation and hypothesis testing. Objective. Recommended reading
7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing
More informationIntroduction to Machine Learning
Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB
More informationPreliminary statistics
1 Preliminary statistics The solution of a geophysical inverse problem can be obtained by a combination of information from observed data, the theoretical relation between data and earth parameters (models),
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationDavid Giles Bayesian Econometrics
9. Model Selection - Theory David Giles Bayesian Econometrics One nice feature of the Bayesian analysis is that we can apply it to drawing inferences about entire models, not just parameters. Can't do
More informationApril 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning
for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions
More informationBayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida
Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov
More informationTopics in Statistical Data Analysis for HEP Lecture 1: Bayesian Methods CERN-JINR European School of High Energy Physics Bautzen, June 2009
Topics in Statistical Data Analysis for HEP Lecture 1: Bayesian Methods CERN-JINR European School of High Energy Physics Bautzen, 14 27 June 2009 Glen Cowan Physics Department Royal Holloway, University
More informationPART I INTRODUCTION The meaning of probability Basic definitions for frequentist statistics and Bayesian inference Bayesian inference Combinatorics
Table of Preface page xi PART I INTRODUCTION 1 1 The meaning of probability 3 1.1 Classical definition of probability 3 1.2 Statistical definition of probability 9 1.3 Bayesian understanding of probability
More informationStrong Lens Modeling (II): Statistical Methods
Strong Lens Modeling (II): Statistical Methods Chuck Keeton Rutgers, the State University of New Jersey Probability theory multiple random variables, a and b joint distribution p(a, b) conditional distribution
More informationComputer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo
Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative
More informationBayesian Inference. Chris Mathys Wellcome Trust Centre for Neuroimaging UCL. London SPM Course
Bayesian Inference Chris Mathys Wellcome Trust Centre for Neuroimaging UCL London SPM Course Thanks to Jean Daunizeau and Jérémie Mattout for previous versions of this talk A spectacular piece of information
More informationAST 418/518 Instrumentation and Statistics
AST 418/518 Instrumentation and Statistics Class Website: http://ircamera.as.arizona.edu/astr_518 Class Texts: Practical Statistics for Astronomers, J.V. Wall, and C.R. Jenkins Measuring the Universe,
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More informationProbabilistic Graphical Models Lecture 17: Markov chain Monte Carlo
Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo Andrew Gordon Wilson www.cs.cmu.edu/~andrewgw Carnegie Mellon University March 18, 2015 1 / 45 Resources and Attribution Image credits,
More informationThe Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition.
Christian P. Robert The Bayesian Choice From Decision-Theoretic Foundations to Computational Implementation Second Edition With 23 Illustrations ^Springer" Contents Preface to the Second Edition Preface
More informationAdvanced Statistical Methods. Lecture 6
Advanced Statistical Methods Lecture 6 Convergence distribution of M.-H. MCMC We denote the PDF estimated by the MCMC as. It has the property Convergence distribution After some time, the distribution
More informationE. Santovetti lesson 4 Maximum likelihood Interval estimation
E. Santovetti lesson 4 Maximum likelihood Interval estimation 1 Extended Maximum Likelihood Sometimes the number of total events measurements of the experiment n is not fixed, but, for example, is a Poisson
More informationVariational Methods in Bayesian Deconvolution
PHYSTAT, SLAC, Stanford, California, September 8-, Variational Methods in Bayesian Deconvolution K. Zarb Adami Cavendish Laboratory, University of Cambridge, UK This paper gives an introduction to the
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationIntroduc)on to Bayesian Methods
Introduc)on to Bayesian Methods Bayes Rule py x)px) = px! y) = px y)py) py x) = px y)py) px) px) =! px! y) = px y)py) y py x) = py x) =! y "! y px y)py) px y)py) px y)py) px y)py)dy Bayes Rule py x) =
More informationCSC 2541: Bayesian Methods for Machine Learning
CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 4 Problem: Density Estimation We have observed data, y 1,..., y n, drawn independently from some unknown
More informationRecent advances in cosmological Bayesian model comparison
Recent advances in cosmological Bayesian model comparison Astrophysics, Imperial College London www.robertotrotta.com 1. What is model comparison? 2. The Bayesian model comparison framework 3. Cosmological
More informationBayesian analysis in nuclear physics
Bayesian analysis in nuclear physics Ken Hanson T-16, Nuclear Physics; Theoretical Division Los Alamos National Laboratory Tutorials presented at LANSCE Los Alamos Neutron Scattering Center July 25 August
More informationStat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2
Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate
More informationBayesian inference J. Daunizeau
Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty
More informationContents. Part I: Fundamentals of Bayesian Inference 1
Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian
More information@R_Trotta Bayesian inference: Principles and applications
@R_Trotta Bayesian inference: Principles and applications - www.robertotrotta.com Copenhagen PhD School Oct 6th-10th 2014 astro.ic.ac.uk/icic Bayesian methods on the rise Bayes Theorem Bayes' Theorem follows
More informationStat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet.
Stat 535 C - Statistical Computing & Monte Carlo Methods Arnaud Doucet Email: arnaud@cs.ubc.ca 1 Suggested Projects: www.cs.ubc.ca/~arnaud/projects.html First assignement on the web: capture/recapture.
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationLikelihood-free MCMC
Bayesian inference for stable distributions with applications in finance Department of Mathematics University of Leicester September 2, 2011 MSc project final presentation Outline 1 2 3 4 Classical Monte
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationLecture 6: Markov Chain Monte Carlo
Lecture 6: Markov Chain Monte Carlo D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Outline
More informationSAMPLING ALGORITHMS. In general. Inference in Bayesian models
SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be
More informationBayesian rules of probability as principles of logic [Cox] Notation: pr(x I) is the probability (or pdf) of x being true given information I
Bayesian rules of probability as principles of logic [Cox] Notation: pr(x I) is the probability (or pdf) of x being true given information I 1 Sum rule: If set {x i } is exhaustive and exclusive, pr(x
More informationA Bayesian Approach to Phylogenetics
A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte
More informationMarkov Chain Monte Carlo methods
Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As
More informationBayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007
Bayesian inference Fredrik Ronquist and Peter Beerli October 3, 2007 1 Introduction The last few decades has seen a growing interest in Bayesian inference, an alternative approach to statistical inference.
More informationStat 535 C - Statistical Computing & Monte Carlo Methods. Lecture February Arnaud Doucet
Stat 535 C - Statistical Computing & Monte Carlo Methods Lecture 13-28 February 2006 Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Limitations of Gibbs sampling. Metropolis-Hastings algorithm. Proof
More informationVariational Scoring of Graphical Model Structures
Variational Scoring of Graphical Model Structures Matthew J. Beal Work with Zoubin Ghahramani & Carl Rasmussen, Toronto. 15th September 2003 Overview Bayesian model selection Approximations using Variational
More informationBayesian Computation
Bayesian Computation CAS Centennial Celebration and Annual Meeting New York, NY November 10, 2014 Brian M. Hartman, PhD ASA Assistant Professor of Actuarial Science University of Connecticut CAS Antitrust
More informationA note on Reversible Jump Markov Chain Monte Carlo
A note on Reversible Jump Markov Chain Monte Carlo Hedibert Freitas Lopes Graduate School of Business The University of Chicago 5807 South Woodlawn Avenue Chicago, Illinois 60637 February, 1st 2006 1 Introduction
More informationBayesian phylogenetics. the one true tree? Bayesian phylogenetics
Bayesian phylogenetics the one true tree? the methods we ve learned so far try to get a single tree that best describes the data however, they admit that they don t search everywhere, and that it is difficult
More informationAssessing Regime Uncertainty Through Reversible Jump McMC
Assessing Regime Uncertainty Through Reversible Jump McMC August 14, 2008 1 Introduction Background Research Question 2 The RJMcMC Method McMC RJMcMC Algorithm Dependent Proposals Independent Proposals
More informationIntroduction into Bayesian statistics
Introduction into Bayesian statistics Maxim Kochurov EF MSU November 15, 2016 Maxim Kochurov Introduction into Bayesian statistics EF MSU 1 / 7 Content 1 Framework Notations 2 Difference Bayesians vs Frequentists
More informationA Probabilistic Framework for solving Inverse Problems. Lambros S. Katafygiotis, Ph.D.
A Probabilistic Framework for solving Inverse Problems Lambros S. Katafygiotis, Ph.D. OUTLINE Introduction to basic concepts of Bayesian Statistics Inverse Problems in Civil Engineering Probabilistic Model
More informationMODULE -4 BAYEIAN LEARNING
MODULE -4 BAYEIAN LEARNING CONTENT Introduction Bayes theorem Bayes theorem and concept learning Maximum likelihood and Least Squared Error Hypothesis Maximum likelihood Hypotheses for predicting probabilities
More information