Beta statistics

Tommy Norberg, tommy@chalmers.se
Mathematical Sciences, Chalmers University of Technology, Gothenburg, SWEDEN
February 22, 2010

Keywords

Bayes's formula, prior density, likelihood, posterior density, conjugate priors, beta and gamma distributions, updating proportions and rates, credibility intervals, predictive density, non-informative or reference prior, reference analysis.

Bayes rule

We first go from the elementary formula

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A')\,P(B \mid A')}$$

to the advanced rule of Bayes,

$$f(y \mid x) = \frac{f(y)\,f(x \mid y)}{\int f(y)\,f(x \mid y)\, dy}.$$

Bayes theorem

$$\pi(\theta \mid x) \propto \pi(\theta)\, f(x \mid \theta)$$

Here θ is a parameter, the value of which we are uncertain about, and π(θ) is the pdf modelling that uncertainty, f(x | θ) is the statistical model for our observation x, and π(θ | x) is the pdf modelling our uncertainty after observing x.
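As a minimal numerical sketch of the theorem (not part of the original slides), the posterior can be computed on a grid by multiplying prior and likelihood and normalizing; the uniform prior and the binomial likelihood below are illustrative assumptions.

```python
# Sketch of pi(theta | x) ∝ pi(theta) f(x | theta) on a grid.
# The flat prior and binomial likelihood are illustrative assumptions.
import numpy as np
from scipy import stats

theta = np.linspace(0.001, 0.999, 999)      # grid over the parameter theta
prior = np.ones_like(theta)                  # flat prior pi(theta)
likelihood = stats.binom.pmf(4, 5, theta)    # f(x | theta): 4 successes in n = 5 trials
unnormalized = prior * likelihood
posterior = unnormalized / np.trapz(unnormalized, theta)  # normalize numerically

print("posterior mean:", np.trapz(theta * posterior, theta))
```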

Terminology

$$\pi(\theta \mid x) \propto \pi(\theta)\, f(x \mid \theta)$$

π(θ) is called the prior density for θ, f(x | θ) is called the likelihood for x given θ, and π(θ | x) is called the posterior density for θ.

The word prior stems from the Latin a priori, meaning beforehand or in advance, and posterior stems from the Latin a posteriori, meaning after.

Proportions or probabilities

Proportions or probabilities are often modelled by beta(α, β)-densities. The beta(α, β)-pdf is given by

$$\pi(\theta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, \theta^{\alpha-1}(1-\theta)^{\beta-1} \quad \text{for } 0 < \theta < 1.$$

The mean, variance and mode are

$$\mu_\theta = \frac{\alpha}{\alpha+\beta}, \qquad \sigma_\theta^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}, \qquad \hat\theta = \frac{\alpha-1}{\alpha+\beta-2}$$

(the mode only for α, β > 1).

Updating a proportion

If x | θ ∼ bin(n, θ) where θ ∼ beta(α, β), then

$$\theta \mid x \sim \mathrm{beta}(\alpha + x,\; \beta + n - x).$$

Thus, the beta prior is conjugate to the binomial likelihood.

Exercise

7. Assume that you are working with remediation of contaminated land. At a particular site the object of interest is the proportion θ of the area that is contaminated. Assume that θ is modelled by a beta-density with mean μ_θ = 0.6 and standard deviation σ_θ = 0.2. Then 5 soil samples are taken at randomly and independently chosen locations, and contamination is found in 4 of the 5 samples. Calculate the posterior mean and standard deviation for θ.
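A short sketch of the mechanics of this conjugate update, using the prior mean 0.6 and standard deviation 0.2 from Exercise 7 purely as an illustration; the moment-matching step that converts a mean and standard deviation into (α, β) is not spelled out on the slides but follows from the formulas above.

```python
# Sketch of the beta-binomial update: theta | x ~ beta(alpha + x, beta + n - x).
# Prior mean/sd taken from Exercise 7 as an illustration only.
from scipy import stats

mu, sd = 0.6, 0.2                        # elicited prior mean and standard deviation
s = mu * (1 - mu) / sd**2 - 1            # moment matching: s = alpha + beta
alpha, beta = mu * s, (1 - mu) * s       # prior beta(alpha, beta) parameters

n, x = 5, 4                              # contamination found in 4 of 5 samples
a_post, b_post = alpha + x, beta + n - x

print("prior parameters:  ", alpha, beta)
print("posterior mean:    ", stats.beta.mean(a_post, b_post))
print("posterior std dev: ", stats.beta.std(a_post, b_post))
```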

Rates

Rates are often modelled by gamma(α, σ)-densities. The gamma(α, σ)-pdf is given by

$$\pi(\lambda) = \frac{(\lambda/\sigma)^{\alpha-1}\, e^{-\lambda/\sigma}}{\sigma\,\Gamma(\alpha)} \quad \text{for } \lambda > 0.$$

The mean, variance and mode are

$$\mu_\lambda = \alpha\sigma, \qquad \sigma_\lambda^2 = \alpha\sigma^2, \qquad \hat\lambda = (\alpha-1)\sigma$$

(the mode only for α > 1).

Updating an exponential rate

If x = x_1, ..., x_n are i.i.d. exp(λ) where λ ∼ gamma(α, σ), then

$$\lambda \mid x \sim \mathrm{gamma}\!\left(\alpha + n,\; \frac{\sigma}{1 + \sigma \sum_i x_i}\right).$$

Thus, the gamma density is conjugate to the exponential likelihood.

Exercise

8. Let λ be the failure rate of a critical component in a computer installation, so that the time until failure is exp(λ). Suppose that 3 installations are tested until failure and that the observed failure times are 0.166, 0.117, 1.500 (in some conveniently chosen time unit). If λ, prior to observing the data, is modelled by a gamma-density with mean 2 and standard deviation 2, what is the posterior density for λ given the data? Calculate its mean and standard deviation.

Updating the rate of a Poisson process

If x ∼ Poi(λt) where λ ∼ gamma(α, σ), then

$$\lambda \mid x \sim \mathrm{gamma}\!\left(\alpha + x,\; \frac{\sigma}{1 + \sigma t}\right).$$

Thus, the gamma prior is also conjugate to the Poisson likelihood.
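A sketch of both gamma updates above, under the slides' gamma(α, σ) parameterization (shape α, scale σ); the prior parameters and the data below are illustrative assumptions, not taken from the exercises.

```python
# Sketch of the conjugate gamma updates for an exponential rate and a Poisson rate.
# gamma(alpha, sigma) = shape alpha, scale sigma, matching the slides.
# All numbers are illustrative assumptions.
import numpy as np
from scipy import stats

alpha, sigma = 2.0, 0.5            # prior: lambda ~ gamma(alpha, sigma), mean alpha*sigma = 1.0

# exponential data: x_1, ..., x_n i.i.d. exp(lambda)
x = np.array([0.8, 1.3, 0.4])
a_exp = alpha + len(x)
s_exp = sigma / (1 + sigma * x.sum())
print("exponential update, mean/sd:",
      stats.gamma.mean(a_exp, scale=s_exp), stats.gamma.std(a_exp, scale=s_exp))

# Poisson data: x ~ Poi(lambda * t), counts over exposure t
counts, t = 6, 2.0
a_poi = alpha + counts
s_poi = sigma / (1 + sigma * t)
print("Poisson update, mean/sd:    ",
      stats.gamma.mean(a_poi, scale=s_poi), stats.gamma.std(a_poi, scale=s_poi))
```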

Exercise

9. Road accidents often occur according to a Poisson process. Assume that the mean number of accidents per year, λ say, is modelled by a gamma(α, σ)-density with α = 0.5 and σ = ∞. (This is the so-called reference prior. Note that it is non-proper.) Two years pass, during which 6 accidents occur. What are the parameters of the posterior gamma-density? Calculate its mean and standard deviation.

Conjugate priors

Whenever the posterior is of the same distributional family as the prior, the latter is said to be conjugate to the likelihood. The beta and gamma priors are conjugate to the binomial and exponential (or Poisson) likelihoods, respectively.

Credibility intervals

Assume that we have modelled the uncertainty in a parameter θ with a density π(θ). Let θ_{0.05} and θ_{0.95} be the 5th and 95th percentiles of π(θ). Then (θ_{0.05}, θ_{0.95}) is a 90% (symmetric) credibility or uncertainty interval for θ. Intervals with other credibility levels are defined analogously. Credibility intervals can of course be one-sided; in such cases one may talk about credibility or uncertainty bounds.

Expert elicitation

Expert opinion is often given in terms of an expected value and a symmetric credibility interval. Suppose, for instance, that λ = 0.25 ± 0.10 is an expert's stated 90% credibility interval for a rate λ. This may be interpreted as follows:

$$\mu_\lambda = 0.25, \qquad 0.10 \approx 1.645\,\sigma_\lambda.$$

The latter equation is based on a normal approximation of the prior density. It is reasonably correct if σ_λ is small relative to μ_λ.
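A sketch of reading off a symmetric 90% credibility interval and of the normal-approximation elicitation above; the beta(7, 3) density used for the interval, and the choice of a gamma prior for the elicited rate, are illustrative assumptions.

```python
# Sketch: 90% credibility interval from percentiles, and elicitation via
# mu = 0.25, 1.645 * sigma = 0.10 (normal approximation, as on the slides).
# The beta(7, 3) density and the gamma prior choice are illustrative assumptions.
from scipy import stats

# 90% symmetric credibility interval: 5th and 95th percentiles of pi(theta)
a, b = 7, 3
lower, upper = stats.beta.ppf([0.05, 0.95], a, b)
print("90% credibility interval:", (lower, upper))

# expert elicitation: lambda = 0.25 +/- 0.10 with 90% certainty
mu_lam = 0.25
sd_lam = 0.10 / 1.645
alpha = (mu_lam / sd_lam) ** 2     # moment matching to gamma(alpha, sigma):
sigma = sd_lam ** 2 / mu_lam       # mean = alpha*sigma, variance = alpha*sigma**2
print("elicited gamma prior (alpha, sigma):", alpha, sigma)
```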

Exercises

10. Assume that you are going to simulate a probability θ and that the available experts assert that θ = 0.4 ± 0.25 with 90% certainty. Suggest a suitable density for θ and calculate approximations of its parameters.

11. Assume that you are going to simulate an exponential rate λ, and that your expert asserts that λ = 0.25 ± 0.10 with 90% certainty. Suggest a suitable density for λ and calculate approximations of its parameters.

The predictive density

Suppose that we have updated a prior π(θ) with data x following a law defined by a likelihood f(x | θ). We have then calculated the posterior density π(θ | x) with Bayes's theorem, stating

$$\pi(\theta \mid x) \propto \pi(\theta)\, f(x \mid \theta).$$

Suppose next that we are about to make a new observation y, independent of the already observed x. We may then calculate the predictive density for y, given x, as follows:

$$\pi(y \mid x) = \int f(y \mid \theta)\, \pi(\theta \mid x)\, d\theta.$$

This is a highlight of the Bayesian theory.

Reference priors

If nothing is known about the parameter, use beta(1, 1) or beta(0.5, 0.5) as prior in the case of an unknown proportion θ, and gamma(1, ∞) or gamma(1/2, ∞) in the case of an exponential or Poissonian rate λ.

beta(0.5, 0.5) is a so-called reference prior. This is a prior that has the least influence on the posterior in an information-theoretic sense. gamma(1/2, ∞) is the reference prior for an exponential rate λ; I don't know whether gamma(1/2, ∞) is the reference prior also for a Poissonian rate λ. I guess it is.

Bayesian reference analysis

In a Bayesian reference analysis, the reference density is used as prior. We saw an example of this in Exercise 9 above. Another example was demonstrated in our first lecture on statistical inference.
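The predictive-density integral above can be checked numerically. In this sketch the beta(7, 3) posterior and the Bernoulli model for the new observation y are illustrative assumptions; the numerical integral should agree with the closed-form posterior mean.

```python
# Sketch of pi(y | x) = ∫ f(y | theta) pi(theta | x) dtheta by numerical integration,
# for a new Bernoulli observation y and an assumed beta(7, 3) posterior.
import numpy as np
from scipy import stats

a_post, b_post = 7, 3
theta = np.linspace(0.001, 0.999, 999)
posterior = stats.beta.pdf(theta, a_post, b_post)   # pi(theta | x)

# predictive probability of y = 1: integrate theta * pi(theta | x) over theta
p_y1 = np.trapz(theta * posterior, theta)
print("P(y = 1 | x), numerical:  ", p_y1)
print("P(y = 1 | x), closed form:", a_post / (a_post + b_post))  # the posterior mean
```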

Exercises

12. In an attempt to verify that a parameter H, say, is non-negative, 10 unbiased i.i.d. observations of it were made. Assume that the observations are normally distributed and that the observed mean and standard deviation are 1.85 and 2.193, respectively. Calculate the posterior probability that H is non-negative, given the data. If you cannot calculate an exact value, bound it as much as possible instead.

13. Of ten independent and randomly positioned soil samples at a contaminated site, seven were contaminated. Suggest a prior density for the contaminated proportion. Then calculate the mean and standard deviation of the posterior density.

14. Seven automobiles are each run over a 30 000 km test schedule. The testing produced a total of 19 failures. Assuming that the number of failures is x ∼ Poi(λt), where the mean number of failures per km, λ, is gamma distributed with α = 3 and 1/σ = 30 000, what are the prior and posterior mean and standard deviation of λ? Compare with the posterior mean and standard deviation if nothing is known about λ beforehand.

15. On a major highway the number of accidents with fatalities was 0, 0, 2, 0, 1, respectively, during the last five years. If you were asked to predict the number of such accidents during the next year, how would you go about it? Describe, but do not carry out, the mathematics. Instead solve the simpler problem of predicting whether or not there will be at least one such accident during the next year.

Reference

Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B.: Bayesian Data Analysis, 2nd ed., Chapman & Hall/CRC, 2003.