Bayesian philosophy, Bayesian computation, Bayesian software. Bayesian Statistics. Petter Mostad. Chalmers. April 6, 2017
1 Bayesian Statistics. Petter Mostad, Chalmers. April 6, 2017
2 Bayesian philosophy
3 Bayesian philosophy

Bayesian statistics versus classical statistics: war or co-existence?

Classical statistics: Models have variables and parameters; these are conceptually different. Variables represent (potential) data. Parameters are assumed to have a FIXED but UNKNOWN value. Thus, models for unrepeatable events are meaningless. Procedures to compute the parameters from data are judged by their properties when applied to potential new data. Models with estimated parameters inserted yield predictions.

Bayesian statistics: Models have only variables; their distribution represents (some person's) KNOWLEDGE about some part of the world. Models for unrepeatable events are meaningful. Models give predictions of (relative) probabilities for data, even before data is observed. Predictions for new observations are made from models conditional on old data.
4 Example

A sequence of independent and identically distributed trials is performed, each resulting in success (1) or failure (0). The following data is observed: 0, 1, 0, 0, 1, 0, 0, 1.

Classical analysis: A possible model is a Binomial distribution, with probability of success p and x out of 8 trials observed as successes. A possible estimator for p is p̂ = x/8. One can show this estimator is unbiased, i.e., E[p̂] = p. With our data, we get p̂ = 3/8. Plugging into the model, we compute the probability that 4 of the next 5 trials will be successes.

Another possible model is a negative Binomial distribution, where y is the number of trials needed to observe 3 successes. A possible estimator for p is p̂ = 3/y. This estimator for p has a different distribution. For example, it is biased, i.e., E[p̂] ≠ p. One might instead use the minimum variance unbiased estimator for p, and get p̂ = (3 − 1)/(8 − 1) = 2/7. But this would yield a different probability that 4 of the next 5 trials will be successes.
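A minimal sketch in R (the software referred to later in these slides) of the two plug-in predictions; the numbers follow directly from the estimates p̂ = 3/8 and p̂ = 2/7 above:

    # Plug-in predictions under the two classical analyses of the same data.
    p_hat_binom  <- 3 / 8              # unbiased estimator under the Binomial model
    p_hat_negbin <- (3 - 1) / (8 - 1)  # minimum variance unbiased estimator, = 2/7

    # Probability that 4 of the next 5 trials are successes, under each estimate:
    dbinom(4, size = 5, prob = p_hat_binom)   # approx 0.062
    dbinom(4, size = 5, prob = p_hat_negbin)  # approx 0.024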
5 Example, continued

Assume we want to do a hypothesis test where H0: p ≥ 0.6, while H1: p < 0.6. What will the p-value be? The answer depends on which test statistic we use. Recall that the p-value is the probability, assuming H0 and generating new data, of observing something equally or more extreme than the given test statistic, in terms of rejecting H0.

One possibility is the test statistic x, the number of successes in 8 trials. The probability of observing 0, 1, 2, or 3 successes when p = 0.6 is 0.174, so the p-value is 0.174.

Another possibility is the test statistic y, the number of trials needed to observe 3 successes. The probability of needing 8 or more trials when p = 0.6 is 0.096, so the p-value is 0.096.
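Both p-values can be checked in R; note that needing 8 or more trials for the 3rd success is the same event as seeing at most 2 successes in the first 7 trials:

    # p-values for H0: p >= 0.6 with the two test statistics:
    pbinom(3, size = 8, prob = 0.6)  # test statistic x: P(x <= 3) approx 0.174
    pbinom(2, size = 7, prob = 0.6)  # test statistic y: P(y >= 8) approx 0.096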
6 Example, continued

In the classical analysis, answers depend on the choice of estimator or test statistic. However, they do not depend on the context. Consider the following contexts: 8 tosses of a coin give 3 heads; 8 tests of a new medical procedure lead to 3 fatalities; controlling 8 items produced in a factory uncovers 3 faulty items. In real life, the predicted probability of 4 successes in the next 5 trials would be different in these three contexts.

In a Bayesian analysis, the different contexts would be taken into account by formulating a prior probability distribution for p, indicating the prior knowledge: For the coin example, we might use p ~ Beta(20, 20). For the medical example, studies of similar medical procedures might yield p ~ Beta(2, 6). For the factory example, knowledge gained in similar testing might be formulated as p ~ Beta(1, 10). (The three priors are compared in the sketch below.)
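A quick way to see how different these three priors are is to plot their densities; a minimal R sketch:

    # Compare the three context-dependent priors for p:
    curve(dbeta(x, 20, 20), from = 0, to = 1, ylab = "prior density")  # coin
    curve(dbeta(x, 2, 6),  add = TRUE, lty = 2)   # medical procedure
    curve(dbeta(x, 1, 10), add = TRUE, lty = 3)   # factory control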
7 Digression: The Beta distribution

θ has a Beta distribution on [0, 1], with parameters α and β, if its density has the form

    π(θ | α, β) = θ^(α−1) (1 − θ)^(β−1) / B(α, β)

where B(α, β) is the Beta function, defined by

    B(α, β) = Γ(α)Γ(β) / Γ(α + β)

and Γ(t) is the Gamma function, defined by

    Γ(t) = ∫_0^∞ x^(t−1) e^(−x) dx

Recall that for positive integers, Γ(n) = (n − 1)! = 1 · 2 · · · (n − 1). See for example Wikipedia for more properties of the Beta distribution, and the Beta and Gamma functions. We write π(θ | α, β) = Beta(θ; α, β) for the Beta density.
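As a small check, the density formula can be compared with R's built-in dbeta, using R's beta function for B(α, β):

    # The Beta density formula versus R's dbeta, at one point:
    alpha <- 2; beta_par <- 6; theta <- 0.3
    theta^(alpha - 1) * (1 - theta)^(beta_par - 1) / beta(alpha, beta_par)
    dbeta(theta, alpha, beta_par)  # same value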
8 Example, continued

The Bayesian model consists of the appropriate prior for p and, conditionally on each such p, a model for the data: it could be either a Binomial model for x or a negative Binomial model for y. Thus, the Bayesian model is a bivariate probability distribution; there is no conceptual difference between the variable for the observed data (be it x or y) and p.

Before the data is observed, the probability of observing just this data can be computed using the marginal distribution of x (or y), computed from the bivariate distribution representing the model.
9 Example, continued

The knowledge about p after considering the data can be computed as the conditional distribution when we fix the data. This is called the posterior distribution for p. Note that there is no need to make a subjective choice of an estimator for p. Crucially, the posterior will be the same whether we use a Binomial model or a negative Binomial model for the data.

Let z be the number of successes in 5 new trials. Given the posterior distribution for p, we get a bivariate model for p and z by multiplying with a Binomial distribution for z, with 5 trials and probability of success p. The distribution of z can be computed as the marginal distribution over this model. Note that predictions about z will not depend on whether we used a Binomial or negative Binomial distribution for the data.
10 Bayesian inference, simplest example

Bayesian inferences can in fact always be performed by multiplying probability densities (or probability functions), and taking conditional or marginal distributions.

Example: An archeological item could be from either of three areas, A, B, or C. Based on visual inspection, it is judged to be from A, B, or C with probabilities 0.2, 0.5, and 0.3, respectively. Now a chemical analysis is done to detect two trace elements, X and Y. We know that the probabilities of detecting combinations of these trace elements, given the item's origin, are given in the table below:

         Both X and Y   X only   Y only   None
    A         .            .        .       .
    B         .            .        .       .
    C         .            .        .       .

    [the numerical entries of this table were lost in transcription]
11 Bayesian inference, simplest example, continued

How can we answer, for example, a question like: What is the probability that the item is from A, given that X only is detected? Multiplying each row of the table above by the corresponding prior probability (0.2, 0.5, 0.3) gives a table, with the same rows and columns, that represents the joint distribution.

All questions can be answered by computing conditional or marginal distributions from that joint table. For example, with the joint probability of "A and X only" equal to 0.14 and the marginal probability of "X only" equal to 0.22,

    Pr(A | X only) = 0.14 / 0.22 ≈ 0.636

(A worked numerical version, with assumed values for the lost table entries, follows below.)
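Since the slide's table entries were lost, here is a hedged R reconstruction of the computation: the likelihoods for B and C below are hypothetical, chosen only so that they reproduce the two surviving numbers (the joint probability 0.14 and the marginal 0.22 for "X only"):

    prior <- c(A = 0.2, B = 0.5, C = 0.3)
    # Pr(X only | origin): A's value 0.7 is implied by 0.14 / 0.2;
    # the values for B and C are assumptions for illustration.
    lik_x_only <- c(A = 0.7, B = 0.1, C = 0.1)

    joint <- prior * lik_x_only   # the "X only" column of the joint table
    joint["A"] / sum(joint)       # Pr(A | X only) = 0.14 / 0.22 approx 0.636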
12 Computations in the Beta-Binomial example

Let's choose one of the priors: Assume p ~ Beta(2, 6). The probability density becomes

    π(p) = p^(2−1) (1 − p)^(6−1) / B(2, 6).

If we use that x has a Binomial distribution with 8 trials and parameter p, we get the probability function

    π(x | p) = C(8, x) p^x (1 − p)^(8−x).

The joint model becomes

    π(x, p) = π(x | p) π(p) = C(8, x) p^x (1 − p)^(8−x) · p^1 (1 − p)^5 / B(2, 6).

We would like to compute π(p | x) = π(x, p) / π(x) with x fixed to the value 3. Note that, as a function of p, this must be proportional to p^4 (1 − p)^10. Note also that the Beta distribution with parameters 5 and 11 has a density proportional to p^4 (1 − p)^10. Thus these two densities must be identical! We get

    π(p | x = 3) = Beta(p; 5, 11).
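The proportionality argument can be verified numerically in R: normalizing p^4 (1 − p)^10 on a grid recovers the Beta(5, 11) density:

    p <- seq(0.001, 0.999, by = 0.001)
    unnorm <- p^4 * (1 - p)^10              # posterior known only up to a constant
    post <- unnorm / (sum(unnorm) * 0.001)  # normalize numerically on the grid
    max(abs(post - dbeta(p, 5, 11)))        # close to zero: the two densities agree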
13 Computations in the previous example, continued

More generally, if we had used the prior Beta(p; α, β) for p, we would get the posterior Beta(p; α + 3, β + 5). Note that, if we had instead chosen to use data y with a negative Binomial distribution, we would have

    π(y | p) = C(y − 1, 2) p^3 (1 − p)^(y−3),

and one can check that the posterior for p would become the same.

The possible new data z has a Binomial distribution with 5 trials and parameter p. Multiplying this probability function with the posterior density found above, we get the joint distribution for z and p given the data. We can now compute

    π(z) = ∫_0^1 π(z | p) π(p | x = 3) dp = C(5, z) B(5 + z, 16 − z) / B(5, 11).

Thus we get that the probability of 4 successes in 5 new trials is π(z = 4) ≈ 0.0497.
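The whole posterior predictive distribution for z can be computed in R with the built-in beta function:

    z <- 0:5
    pred <- choose(5, z) * beta(5 + z, 16 - z) / beta(5, 11)
    sum(pred)      # the six probabilities sum to 1
    pred[z == 4]   # approx 0.0497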
14 More advanced Bayesian inference

More generally, let x be a vector representing the data, and let θ be a vector representing the variables of interest. Assume we can write down the probability (density) function π(x | θ), and the prior π(θ). The posterior for the parameter θ is then given by Bayes' formula

    π(θ | x) = π(x | θ) π(θ) / π(x) = π(x | θ) π(θ) / ∫ π(x | θ) π(θ) dθ ∝_θ π(x | θ) π(θ),

where π(x) is the marginal probability (density) for x. Note the notation ∝_θ: if we only know the posterior π(θ | x) up to a factor not depending on θ, it can be reconstructed by requiring the sum (or integral) to be 1. Thus, in order to do inference, i.e., compute the posterior distribution of θ, we only need to compute the distribution for θ whose density is proportional to π(x | θ) π(θ).
15 Computational methods for the posterior

When all variables are finite-valued, there are algorithms for exact, efficient computations, even when the distributions π(θ) and π(x | θ) are expressed in terms of a network of dependent variables. The computations in the second example above work out (fairly) easily because we chose as the prior for p a distribution that is conjugate to the Binomial distribution (or negative Binomial) for the data. With enough conjugacies, one can also obtain exact posteriors.

In all other cases, one can only compute approximations of the posterior. The group of methods called Markov chain Monte Carlo (MCMC) is by far the most general and popular family of approximation methods. There are some other approximate algorithms, for example INLA (Integrated Nested Laplace Approximation), but they can be applied only to more limited sets of models.
16 Markov chain Monte Carlo

The idea is to generate an (approximate) sample from the posterior. Then, inference can be done based on this sample. The sample is produced using a Markov chain. The chain is produced by
* starting at some fairly arbitrary value θ_0,
* for each step, generating a new proposed value from the old one, using some algorithm, and
* accepting or rejecting the proposed value based on an acceptance criterion.

The acceptance criterion depends on the posterior distribution π(θ | x), but it needs to be known only up to a constant. This fits our situation perfectly. The distribution of the chain converges to the correct distribution, but the convergence may be slow. The chain may also have autocorrelation. (A minimal sampler of this kind is sketched below.)
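A minimal random-walk Metropolis sampler in R for the Beta-Binomial example above, where the posterior is known only up to a constant; this is a sketch, not the exact algorithm of any particular software:

    # Target: pi(p | x = 3) proportional to p^4 (1 - p)^10.
    log_target <- function(p) {
      if (p <= 0 || p >= 1) return(-Inf)  # outside the support of p
      4 * log(p) + 10 * log(1 - p)
    }

    set.seed(1)
    n_iter <- 10000
    chain <- numeric(n_iter)
    p <- 0.5  # fairly arbitrary starting value
    for (i in 1:n_iter) {
      proposal <- p + rnorm(1, sd = 0.1)  # propose a new value from the old one
      # Accept with probability min(1, target ratio); otherwise keep the old value:
      if (log(runif(1)) < log_target(proposal) - log_target(p)) p <- proposal
      chain[i] <- p
    }

    mean(chain[-(1:1000)])  # approx 5/16 = 0.3125, the Beta(5, 11) mean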
17 Checking convergence

The simplest method is to monitor the series of values of a variable: Does the pattern seem to stabilize?

A slightly more advanced method is to use several parallel Markov chains with independent starting points. If convergence is reached, the range of values spanned by all chains together should be the same as the range of values spanned by each chain; otherwise it is larger. This is measured by a quantity called R, and estimated by R̂. If R̂ goes down towards 1, this indicates convergence.

High autocorrelation means that the chain moves very slowly. This also indicates slow convergence.
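A simplified sketch of how R̂ can be computed from parallel chains (the basic Gelman-Rubin form, without refinements such as chain splitting):

    # `chains` is an n-by-m matrix: n iterations, one column per chain.
    rhat <- function(chains) {
      n <- nrow(chains)
      W <- mean(apply(chains, 2, var))     # average within-chain variance
      B <- n * var(colMeans(chains))       # between-chain variance
      var_plus <- (n - 1) / n * W + B / n  # pooled variance estimate
      sqrt(var_plus / W)                   # goes down towards 1 at convergence
    }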
18 Improving convergence

A popular type of MCMC is Gibbs sampling. Each proposal changes only one of the variables in the variable vector, and the proposal is drawn from the conditional distribution of this variable given all the others. Gibbs sampling often works great, and is easy to implement. However, for highly correlated variables, convergence can be too slow (illustrated below).

General methods to improve convergence speed exist. But often, the most efficient approach is to look carefully at the shape of your distribution, and choose a proposal function adapted to it.
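A sketch of why high correlation hurts Gibbs sampling, using a standard bivariate normal with correlation rho, whose full conditionals are known normals:

    set.seed(1)
    rho <- 0.9; n_iter <- 5000
    x <- y <- numeric(n_iter)
    for (i in 2:n_iter) {
      # Each step updates one coordinate from its conditional given the other:
      x[i] <- rnorm(1, mean = rho * y[i - 1], sd = sqrt(1 - rho^2))
      y[i] <- rnorm(1, mean = rho * x[i],     sd = sqrt(1 - rho^2))
    }

    acf(x)  # with rho near 1, successive draws are highly autocorrelated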
19 Using the sample for inference

Given a sample from a distribution, all properties of the distribution can in fact be estimated from this sample. For example, given a sample of size 10000 of a variable, you can estimate a 95% credibility interval (i.e., an interval that covers 95% of the probability density) by finding the 250th and the 9750th values in the ordered sample. In R, use quantile. To estimate the expectation of any function f of the variable θ, simply compute f(θ_1), f(θ_2), ..., f(θ_10000) and take their average.
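For example, with a sample from the Beta(5, 11) posterior computed earlier:

    theta <- rbeta(10000, 5, 11)              # a posterior sample of size 10000
    quantile(theta, probs = c(0.025, 0.975))  # 95% credibility interval
    mean(theta^2)                             # posterior expectation of f(theta) = theta^2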
20 Bayesian software

Most statisticians use both frequentist and Bayesian methods, so a large proportion of the software available, including R packages, uses some Bayesian ideas.

When models are Bayesian networks with finite-valued variables (or only normally distributed variables), algorithms for exact inference are available in programs like Hugin (commercial) or GeNIe (free).

There are a few general-purpose programs for models formulated as a Bayesian network. The most famous and oldest is BUGS, which exists in a number of incarnations (WinBUGS, OpenBUGS). It basically implements Gibbs sampling. It can be accessed from R via a number of different R packages, e.g., R2OpenBUGS, BRugs, etc. Some more modern general-purpose programs exist, most notably JAGS and Stan. They implement improvements to the algorithms of BUGS that in general increase convergence speed.
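As an illustration, the Beta-Binomial example written in the BUGS/JAGS model language and run from R; a sketch assuming JAGS and the rjags package are installed:

    library(rjags)

    model_string <- "
    model {
      x ~ dbin(p, 8)    # Binomial likelihood: x successes in 8 trials
      p ~ dbeta(2, 6)   # the Beta(2, 6) prior
    }"

    jm <- jags.model(textConnection(model_string), data = list(x = 3))
    samples <- coda.samples(jm, variable.names = "p", n.iter = 10000)
    summary(samples)  # the posterior for p should match Beta(5, 11)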