
Beta statistics

Tommy Norberg
tommy@chalmers.se
Mathematical Sciences
Chalmers University of Technology
Gothenburg, SWEDEN
February 22, 2010

Keywords

Bayes's formula
Prior density
Likelihood
Posterior density
Conjugate priors
Beta and Gamma distributions
Up-dating proportions and rates
Credibility intervals
Predictive density
Non-informative or reference prior
Reference analysis

Bayes rule

We first go from the elementary formula,

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A^c)\,P(B \mid A^c)},$$

to the advanced rule of Bayes,

$$f(y \mid x) = \frac{f(y)\,f(x \mid y)}{\int f(y)\,f(x \mid y)\,dy}.$$

Bayes theorem

$$\pi(\theta \mid x) \propto \pi(\theta)\,f(x \mid \theta)$$

$\theta$ is a parameter, the value of which we are uncertain about, and $\pi(\theta)$ is the pdf modelling the uncertainty,
$f(x \mid \theta)$ is the statistical model for our observation $x$,
$\pi(\theta \mid x)$ is the pdf modelling our uncertainty after observing $x$.
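As a rough illustration of the two forms of Bayes rule above, the following Python sketch evaluates the elementary event formula and then approximates $\pi(\theta \mid x) \propto \pi(\theta) f(x \mid \theta)$ on a grid. The function names, the grid size and all numbers are assumptions made for this example only; they do not come from the slides.

```python
def bayes_event(prior_a, p_b_given_a, p_b_given_not_a):
    """Elementary Bayes formula: P(A | B) for an event A and its complement."""
    numerator = prior_a * p_b_given_a
    denominator = numerator + (1.0 - prior_a) * p_b_given_not_a
    return numerator / denominator


def grid_posterior(prior, likelihood, n_points=1000):
    """Approximate pi(theta | x) on a midpoint grid over (0, 1).

    Returns the grid and the normalised posterior weights (they sum to 1).
    """
    grid = [(i + 0.5) / n_points for i in range(n_points)]
    unnormalised = [prior(t) * likelihood(t) for t in grid]
    total = sum(unnormalised)  # stands in for the integral in the denominator
    weights = [u / total for u in unnormalised]
    return grid, weights


# Hypothetical numbers: P(A) = 0.01, P(B | A) = 0.95, P(B | not A) = 0.05.
p_a_given_b = bayes_event(0.01, 0.95, 0.05)

# Hypothetical data: 4 successes in 5 binomial trials, flat prior on theta.
grid, weights = grid_posterior(
    prior=lambda t: 1.0,
    likelihood=lambda t: t**4 * (1.0 - t),  # the binomial coefficient cancels
)
posterior_mean = sum(t * w for t, w in zip(grid, weights))
print(p_a_given_b, posterior_mean)
```

The grid version only needs the prior and the likelihood up to a constant, which is exactly what the proportionality sign in Bayes theorem expresses.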

Terminology

$$\pi(\theta \mid x) \propto \pi(\theta)\,f(x \mid \theta)$$

$\pi(\theta)$ is called the prior density for $\theta$,
$f(x \mid \theta)$ is called the likelihood for $x$ given $\theta$,
$\pi(\theta \mid x)$ is called the posterior density for $\theta$.

The word prior stems from the Latin a priori, meaning beforehand or in advance, and posterior stems from the Latin a posteriori, meaning after.

Proportions or probabilities

Proportions or probabilities are often modelled by beta($\alpha$, $\beta$)-densities. The beta($\alpha$, $\beta$)-pdf is given by

$$\pi(\theta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,\theta^{\alpha - 1}(1 - \theta)^{\beta - 1} \quad \text{for } 0 < \theta < 1.$$

The mean, variance and mode are

$$\mu_\theta = \frac{\alpha}{\alpha + \beta}, \qquad \sigma_\theta^2 = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}, \qquad \hat\theta = \frac{\alpha - 1}{\alpha + \beta - 2}$$

(the mode only for $\alpha, \beta > 1$).

Up-dating a proportion

If $x \mid \theta \sim \mathrm{bin}(n, \theta)$ where $\theta \sim \mathrm{beta}(\alpha, \beta)$, then

$$\theta \mid x \sim \mathrm{beta}(\alpha + x,\ \beta + n - x).$$

Thus, the beta prior is conjugate to the binomial likelihood.

Exercise

7. Assume that you are working with remediation of contaminated land. At a particular site the object of interest is the proportion $\theta$ of the area that is contaminated. Assume that $\theta$ is modelled by a beta-density with mean $\mu_\theta = 0.6$ and standard deviation $\sigma_\theta = 0.2$. Then 5 soil samples are taken at randomly and independently chosen locations, and contamination is found in 4 of the 5 samples. Calculate the posterior mean and standard deviation for $\theta$.
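The beta formulas above (mean, variance, and the conjugate update $\theta \mid x \sim \mathrm{beta}(\alpha + x, \beta + n - x)$) can be sketched in a few lines of Python. The helper names are hypothetical, and the numbers are illustrative and deliberately differ from those in Exercise 7. Moment matching solves the mean and variance formulas for $\alpha$ and $\beta$.

```python
import math


def beta_from_mean_sd(mean, sd):
    """Moment matching: find (alpha, beta) with the given mean and sd."""
    nu = mean * (1.0 - mean) / sd**2 - 1.0  # nu = alpha + beta
    if nu <= 0:
        raise ValueError("sd is too large for a beta density with this mean")
    return mean * nu, (1.0 - mean) * nu


def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a beta(alpha, beta) density."""
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
    return mean, math.sqrt(var)


def beta_update(alpha, beta, x, n):
    """Conjugate update after observing x successes in n binomial trials."""
    return alpha + x, beta + n - x


# Illustrative prior: mean 0.5, sd 0.2 (assumed numbers, not from the slides).
prior = beta_from_mean_sd(0.5, 0.2)
# Illustrative data: 3 successes in 10 trials.
posterior = beta_update(*prior, x=3, n=10)
post_mean, post_sd = beta_mean_sd(*posterior)
print(prior, posterior, post_mean, post_sd)
```

The sum $\alpha + \beta$ behaves like a prior sample size: the update simply adds the observed successes and failures to the prior pseudo-counts.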

Rates

Rates are often modelled by gamma($\alpha$, $\sigma$)-densities. The gamma($\alpha$, $\sigma$)-pdf is given by

$$\pi(\lambda) = \frac{(\lambda/\sigma)^{\alpha - 1}\,e^{-\lambda/\sigma}}{\sigma\,\Gamma(\alpha)} \quad \text{for } \lambda > 0.$$

The mean, variance and mode are

$$\mu_\lambda = \alpha\sigma, \qquad \sigma_\lambda^2 = \alpha\sigma^2, \qquad \hat\lambda = (\alpha - 1)\sigma$$

(the mode only for $\alpha > 1$).

Up-dating an exponential rate

If $x = x_1, \ldots, x_n$ are i.i.d. $\exp(\lambda)$ where $\lambda \sim \mathrm{gamma}(\alpha, \sigma)$, then

$$\lambda \mid x \sim \mathrm{gamma}\!\left(\alpha + n,\ \frac{\sigma}{1 + \sigma \sum_i x_i}\right).$$

Thus, the gamma density is conjugate to the exponential likelihood.

Exercise

8. Let $\lambda$ be the failure rate of a critical component in a computer installation, so that the time until failure is $\exp(\lambda)$-distributed. Suppose that 3 installations are tested until failure and that the observed failure times are 0.166, 0.117, 1.500 (in some conveniently chosen time unit). If $\lambda$, prior to observing the data, is modelled by a gamma-density with mean 2 and standard deviation 2, what is the posterior density for $\lambda$ given the data? Calculate its mean and standard deviation.

Up-dating the rate of a Poisson process

If $x \sim \mathrm{Poi}(\lambda t)$ where $\lambda \sim \mathrm{gamma}(\alpha, \sigma)$, then

$$\lambda \mid x \sim \mathrm{gamma}\!\left(\alpha + x,\ \frac{\sigma}{1 + \sigma t}\right).$$

Thus, the gamma prior is also conjugate to the Poisson likelihood.
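The two gamma updates above differ only in what is added to the shape and what enters the scale. A minimal Python sketch, using the shape and scale parametrisation gamma($\alpha$, $\sigma$) of these slides, might look as follows. The function names and all numbers are assumptions for illustration and are not taken from Exercise 8.

```python
import math


def gamma_from_mean_sd(mean, sd):
    """Moment matching: mean = alpha * sigma, variance = alpha * sigma**2."""
    return (mean / sd) ** 2, sd**2 / mean


def gamma_mean_sd(alpha, sigma):
    """Mean and standard deviation of a gamma(alpha, sigma) density."""
    return alpha * sigma, math.sqrt(alpha) * sigma


def update_exponential(alpha, sigma, observations):
    """Posterior (alpha, sigma) after i.i.d. exp(lambda) observations."""
    n = len(observations)
    return alpha + n, sigma / (1.0 + sigma * sum(observations))


def update_poisson(alpha, sigma, count, exposure):
    """Posterior (alpha, sigma) after a Poi(lambda * exposure) count."""
    return alpha + count, sigma / (1.0 + sigma * exposure)


# Illustrative prior: mean 1.0, sd 0.5, giving alpha = 4 and sigma = 0.25.
prior = gamma_from_mean_sd(1.0, 0.5)

# Illustrative exponential data: three waiting times with sum 4.0.
post_exp = update_exponential(*prior, observations=[0.5, 1.5, 2.0])

# Illustrative Poisson data: 3 events during an exposure time of 2.
post_poi = update_poisson(*prior, count=3, exposure=2.0)

print(post_exp, gamma_mean_sd(*post_exp))
print(post_poi, gamma_mean_sd(*post_poi))
```

In both cases $1/\sigma$ accumulates the total observation time (the sum of the waiting times, or the exposure $t$), while $\alpha$ accumulates the number of events.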

Exercise

9. Road accidents often occur according to a Poisson process. Assume that the mean number of accidents per year, $\lambda$ say, is modelled by a gamma($\alpha$, $\sigma$)-density with $\alpha = 0.5$ and $\sigma = \infty$. (This is the so-called reference prior. Note that it is improper.) Two years pass, during which 6 accidents occur. What are the parameters of the posterior gamma-density? Calculate its mean and standard deviation.

Conjugate priors

Whenever the posterior is of the same distributional family as the prior, the latter is said to be conjugate to the likelihood.

The beta and gamma priors are conjugate to the binomial and exponential (or Poisson) likelihoods, respectively.

Credibility intervals

Assume that we have modelled the uncertainty in a parameter $\theta$ with a density $\pi(\theta)$. Let $\theta_{0.05}$ and $\theta_{0.95}$ be the 5th and 95th percentiles of $\pi(\theta)$. Then $(\theta_{0.05}, \theta_{0.95})$ is a 90% (symmetric) credibility or uncertainty interval for $\theta$.

Intervals with other credibility levels are defined analogously.

Credibility intervals can of course be one-sided. In such cases one may talk about credibility or uncertainty bounds.

Expert elicitation

Often expert opinions are given in terms of an expected value and a symmetric credibility interval.

Suppose, for instance, that $\lambda = 0.25 \pm 0.10$ is an expert's stated 90% credibility interval for a rate $\lambda$. This may be interpreted as follows:

$$\mu_\lambda = 0.25, \qquad 0.10 = 1.645\,\sigma_\lambda.$$

The latter equation is based on a normal approximation of the prior density. It is reasonably correct if $\sigma_\lambda$ is small relative to $\mu_\lambda$.
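The elicitation recipe above can be checked numerically. The sketch below turns the statement $\lambda = 0.25 \pm 0.10$ (90%) into a mean and standard deviation via the normal approximation, moment-matches a gamma density, and then estimates that density's 5th and 95th percentiles. The Monte Carlo step is an assumption of this example, used as a stand-in for an exact quantile function; the function names are hypothetical.

```python
import random
from statistics import NormalDist


def elicit_mean_sd(centre, half_width, level=0.90):
    """Read 'centre +/- half_width' at the given level via a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.645 for level = 0.90
    return centre, half_width / z


def gamma_from_mean_sd(mean, sd):
    """Moment matching: mean = alpha * sigma, variance = alpha * sigma**2."""
    return (mean / sd) ** 2, sd**2 / mean


def gamma_credibility_interval(alpha, sigma, level=0.90, n_draws=100_000, seed=0):
    """Monte Carlo estimate of the symmetric credibility interval."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), matching gamma(alpha, sigma) here.
    draws = sorted(rng.gammavariate(alpha, sigma) for _ in range(n_draws))
    tail = (1.0 - level) / 2.0
    lower = draws[int(tail * n_draws)]
    upper = draws[int((1.0 - tail) * n_draws) - 1]
    return lower, upper


mean, sd = elicit_mean_sd(0.25, 0.10)
alpha, sigma = gamma_from_mean_sd(mean, sd)
lower, upper = gamma_credibility_interval(alpha, sigma)
print(alpha, sigma, lower, upper)
```

Because the gamma density is skewed to the right, the estimated interval will not coincide exactly with the stated $0.25 \pm 0.10$; the size of the discrepancy gives a feel for how good the normal approximation is when $\sigma_\lambda$ is not very small relative to $\mu_\lambda$.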

Exercises

10. Assume that you are going to simulate a probability $\theta$ and that the available experts assert that $\theta = 0.4 \pm 0.25$ with 90% certainty. Suggest a suitable density for $\theta$ and calculate approximate values of its parameters.

11. Assume that you are going to simulate an exponential rate $\lambda$, and that your expert asserts that $\lambda = 0.25 \pm 0.10$ with 90% certainty. Suggest a suitable density for $\lambda$ and calculate approximate values of its parameters.

The predictive density

Suppose that we have up-dated a prior $\pi(\theta)$ with data $x$ following a law defined by a likelihood $f(x \mid \theta)$. We have then calculated the posterior density $\pi(\theta \mid x)$ with Bayes's theorem, stating

$$\pi(\theta \mid x) \propto \pi(\theta)\,f(x \mid \theta).$$

Suppose next that we are about to make a new observation $y$, independent of the already observed $x$. We may then calculate the predictive density for $y$, given $x$, as follows:

$$\pi(y \mid x) = \int f(y \mid \theta)\,\pi(\theta \mid x)\,d\theta.$$

This is a highlight of the Bayesian theory.

Reference priors

If nothing is known about the parameter, use beta(1, 1) or beta(0.5, 0.5) as prior in case of an unknown proportion $\theta$, and gamma(1, $\infty$) or gamma(1/2, $\infty$) in case of an exponential or Poissonian rate $\lambda$.

beta(0.5, 0.5) is a so-called reference prior. This is a prior that has the least influence on the posterior in an information theoretic sense.

gamma(1/2, $\infty$), that is $\pi(\lambda) \propto \lambda^{-1/2}$, is the reference prior for a Poissonian rate $\lambda$. For an exponential rate $\lambda$ the reference prior is $\pi(\lambda) \propto 1/\lambda$, which formally corresponds to gamma(0, $\infty$).

Bayesian reference analysis

In a Bayesian reference analysis, the reference density is used as prior.

We saw an example of this in Exercise 9 above. Another example was demonstrated in our first lecture on statistical inference.
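A small reference analysis followed by a prediction ties the last three slides together. In the beta-binomial case the predictive integral $\pi(y \mid x) = \int f(y \mid \theta)\,\pi(\theta \mid x)\,d\theta$ has a closed form, namely $\binom{m}{y} B(\alpha + y, \beta + m - y) / B(\alpha, \beta)$ for $y$ successes in $m$ new trials, where $B$ is the beta function and ($\alpha$, $\beta$) are the posterior parameters. The Python sketch below assumes the beta(0.5, 0.5) reference prior and made-up data (3 successes in 10 trials); the function names are hypothetical.

```python
import math


def log_beta_fn(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(y, m, alpha, beta):
    """Predictive probability of y successes in m new trials, theta ~ beta(alpha, beta)."""
    log_p = (
        math.log(math.comb(m, y))
        + log_beta_fn(alpha + y, beta + m - y)
        - log_beta_fn(alpha, beta)
    )
    return math.exp(log_p)


# Reference prior beta(0.5, 0.5), updated with assumed data: 3 successes in 10 trials.
post_alpha, post_beta = 0.5 + 3, 0.5 + (10 - 3)

# Predictive distribution for the number of successes in m = 5 further trials.
m = 5
predictive = [beta_binomial_pmf(y, m, post_alpha, post_beta) for y in range(m + 1)]
predictive_mean = sum(y * p for y, p in enumerate(predictive))
print(predictive, predictive_mean)
```

For a single new trial ($m = 1$) the predictive probability of a success reduces to the posterior mean $\alpha / (\alpha + \beta)$, which is a convenient sanity check on the formula.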

Exercises

12. In an attempt to verify that a parameter $H$, say, is non-negative, 10 unbiased i.i.d. observations of it were made. Assume that the observations are normally distributed and that the observed mean and standard deviation are 1.85 and 2.193, respectively. Calculate the posterior probability that $H$ is non-negative, given the data. If you cannot calculate an exact value, bound it as much as possible instead.

13. Of ten independent and randomly positioned soil samples at a contaminated site, seven were contaminated. Suggest a prior density for the contaminated proportion. Then calculate the mean and standard deviation of the posterior density.

14. Seven automobiles are each run over a 30 000 km test schedule. The testing produced a total of 19 failures. Assuming that the number of failures is $x \sim \mathrm{Poi}(\lambda t)$, where the mean number of failures per km, $\lambda$, is gamma distributed with $\alpha = 3$ and $1/\sigma = 30\,000$, what are the prior and posterior mean and standard deviation of $\lambda$? Compare with the posterior mean and standard deviation obtained if nothing is known about $\lambda$ beforehand.

15. On a major highway the numbers of accidents with fatalities were 0, 0, 2, 0, 1, respectively, during the last five years. If you were asked to predict the number of such accidents during the next year, how would you proceed? Describe the approach, but do not carry out the mathematics. Instead solve the simpler problem of predicting whether or not there will be at least one such accident during the next year.

Reference

Gelman, Carlin: Bayesian data analysis, 2nd ed, 2003.