Likelihood and Bayesian Inference for Proportions
September 9, 2009
Readings: Hoff, Chapter 3
Giardia
In a New Zealand research program on human health risks from recreational contact with water contaminated with pathogenic microbiological material, the National Institute of Water and Atmosphere conducted a water quality study at a variety of catchment types. They found that out of $n = 87$ one-liter samples from municipal catchments, $y = 6$ sites contained Giardia cysts.
Inference about $\theta$, the proportion of sites with Giardia.
Probability model for the data?
Binomial Model
Independent Bernoulli trials $X_i$, $i = 1, \dots, n$
Success probability $\theta$: $p(X_i = x_i \mid \theta) = \theta^{x_i}(1-\theta)^{1-x_i}$
$Y$ = number of successes = $\sum_{i=1}^n X_i$
$Y \sim \mathrm{Bin}(n, \theta)$: $p(y \mid \theta) = \binom{n}{y}\,\theta^y (1-\theta)^{n-y}$ for $y = 0, 1, \dots, n$
$E(Y) = n\theta$, $V(Y) = n\theta(1-\theta)$
R functions: dbinom, pbinom, qbinom, rbinom (see the sketch below)
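As a quick illustration with the Giardia sample size ($n = 87$, $y = 6$), the four R functions can be used as follows; the value theta = 0.07 is just an illustrative choice, not an estimate from the slides.

    n <- 87
    theta <- 0.07                          # illustrative value of theta
    dbinom(6, size = n, prob = theta)      # P(Y = 6 | theta)
    pbinom(6, size = n, prob = theta)      # P(Y <= 6 | theta)
    qbinom(0.975, size = n, prob = theta)  # 97.5% point of Y
    rbinom(5, size = n, prob = theta)      # five simulated counts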
Sampling Distribution
Assumption: $Y \sim \mathrm{Bin}(n, \theta)$
Inference for $\theta$: based on the observed proportion $\hat{p} = y/n$ (and $n$, of course)
Common point estimate: observed proportion $\hat{p} = y/n$
$E(\hat{p}) = \theta$ (unbiased estimator)
$V(\hat{p}) = \theta(1-\theta)/n$: precision increases for large $n$ and for $\theta$ near 0 or 1
CLT for confidence intervals and testing
Likelihood Function
Likelihood function: $L(\theta) \propto p(y \mid \theta)$
For FIXED $y$, look at how the probability of the data changes as $\theta$ varies over the parameter space:
$L(\theta) = \binom{87}{6}\,\theta^6 (1-\theta)^{81} \propto \theta^6 (1-\theta)^{81}$
For each value of $\theta$, the likelihood says how well that value of $\theta$ explains the observed data.
Calculations are easier with the log likelihood function:
$\log L(\theta) = y \log(\theta) + (n-y)\log(1-\theta) + \text{constant}$
Maximum Likelihood Estimate
What is the most likely value of $\theta$ for these data? Find the value of $\theta$ that maximizes the likelihood.
Maximum likelihood estimate of $\theta$: $\hat\theta = \hat{p} = y/n$
[Figure: binomial likelihood $L(\theta)$ and log likelihood, $n = 87$, $y = 6$, plotted over $\theta \in (0, 1)$]
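The maximum can also be checked numerically. A minimal sketch using optimize on the log likelihood (the closed form is simply $y/n$; the search interval is my choice):

    n <- 87; y <- 6
    loglik <- function(theta) y * log(theta) + (n - y) * log(1 - theta)
    optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)$maximum
    y / n   # closed-form MLE, about 0.069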
Adding Greek and Math Symbols to Plots
In the plot command I used xlab=expression(theta), ylab=expression(L(theta)).
If the text argument to one of the text-drawing functions (text, mtext, axis, titles, x- and y-axis labels) in R is an expression, the argument is interpreted as a mathematical expression and the output is formatted according to TeX-like rules.
For lots of examples see: demo(plotmath)
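A minimal sketch of how the likelihood plot on the previous slide could be drawn with such labels; the grid length and title text are my choices, not taken from the slides.

    n <- 87; y <- 6
    theta <- seq(0, 1, length.out = 200)
    plot(theta, dbinom(y, n, theta), type = "l",
         xlab = expression(theta), ylab = expression(L(theta)),
         main = "Binomial Likelihood, n = 87, y = 6")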
Functions of Parameters: Odds
Odds: $o = o(\theta) = \theta/(1-\theta)$; inverse $\theta(o) = o/(1+o)$
Likelihood is the same under a 1-1 transformation: $p(y \mid o) = p(y \mid \theta(o))$
MLE of $g(\theta)$ is $g(\hat\theta)$
Estimated probability that a sample will contain Giardia cysts: $6/87 = 0.069$
Estimated odds that a sample will contain Giardia cysts: 0.074 to 1
Estimated odds that a sample will not contain Giardia cysts: 13.5 to 1
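A one-line check of the invariance calculation, using the numbers from the slide:

    theta.hat <- 6 / 87
    theta.hat / (1 - theta.hat)   # estimated odds of cysts, about 0.074
    (1 - theta.hat) / theta.hat   # estimated odds of no cysts, 81/6 = 13.5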
Likelihood Ratios
Likelihood ratios: compare two values of $\theta$
Likelihood is defined up to a multiplicative (positive) constant
Standardized (or relative) likelihood: relative to the value at the MLE
$r(\theta) = \dfrac{p(y \mid \theta)}{p(y \mid \hat\theta)}$
Same answers (from the likelihood viewpoint) from binomial data ($y$ successes out of $n$) and from observed Bernoulli data (the list of successes/failures in order)
Likelihood Intervals
[Figure: relative likelihood $r(\theta)$ for $\theta \in (0, 0.25)$]
Interval of $\theta$ values such that $r(\theta) > r$, with $r = 0.1$: the interval is (0.026, 0.142), with $\hat\theta = 0.069$
The interval is not symmetric around $\hat\theta$
Probability that the interval covers the true $\theta$?
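One way to find these endpoints numerically is a sketch with uniroot on the relative likelihood; the bracketing intervals are my choices.

    n <- 87; y <- 6; theta.hat <- y / n
    rel.lik <- function(theta) dbinom(y, n, theta) / dbinom(y, n, theta.hat)
    f <- function(theta) rel.lik(theta) - 0.1
    lower <- uniroot(f, c(1e-6, theta.hat))$root   # about 0.026
    upper <- uniroot(f, c(theta.hat, 0.5))$root    # about 0.142
    c(lower, upper)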
Large Sample Approximations
For large $n$, by the CLT: $\hat{p} = y/n \;\approx\; N\!\left(\theta,\ \theta(1-\theta)/n\right)$
Asymptotic approximation of the likelihood and distribution theory leads to the interval
$\hat\theta \pm \sqrt{-2\log(r)}\,\sqrt{\hat{p}(1-\hat{p})/n}$
Choose $r$ to have asymptotic 95% coverage: 95% CI for $\theta$: (0.016, 0.122)
The probability that the interval covers $\theta$ (prior to seeing the data) is 0.95.
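A minimal sketch of this large-sample interval for the Giardia data, using qnorm(0.975) in place of $\sqrt{-2\log r}$ for the 95% choice of $r$:

    n <- 87; y <- 6
    p <- y / n
    se <- sqrt(p * (1 - p) / n)
    p + c(-1, 1) * qnorm(0.975) * se   # roughly (0.016, 0.122)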
Bayes Theorem
Conditional on the observed outcome $y$ (and $n$), the posterior distribution of $\theta$ is
$p(\theta \mid y) = \dfrac{p(\theta)\,p(y \mid \theta)}{p(y)}, \quad 0 < \theta < 1$
where $p(y) = \int_0^1 p(y \mid \theta)\,p(\theta)\,d\theta$ is the marginal density of the data,
OR
$p(\theta \mid y) \propto p(\theta)\,p(y \mid \theta)$, subject to normalization to unit integral (Bayes theorem in proportional form)
Bayes Inference about $\theta$
Initial prior uncertainty about $\theta$ is described by a prior distribution $p(\theta)$.
The uniform density is a common non-informative choice:
$p(\theta) = 1, \quad 0 < \theta < 1$
Flat density: each point equally weighted; uninformative about the true value.
Results with Uniform Prior
Posterior: $p(\theta \mid y) \propto \theta^y(1-\theta)^{n-y} = \theta^{(y+1)-1}(1-\theta)^{(n-y+1)-1}$
Recognize that this is the kernel of a Beta($y+1$, $n-y+1$) density:
$p(\theta \mid y) = \dfrac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,\theta^{a-1}(1-\theta)^{b-1}$ for $0 < \theta < 1$, with $a = y+1$, $b = n-y+1$
Posterior mean $(y+1)/(n+2)$ and mode (usually) $y/n$
In R: dbeta, pbeta, qbeta, rbeta
Quantiles (percentiles, percentage points): qbeta(c(0.025,0.5,0.975), y+1, n-y+1)
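For the Giardia data under the uniform prior, a quick sketch of these summaries:

    n <- 87; y <- 6
    qbeta(c(0.025, 0.5, 0.975), y + 1, n - y + 1)   # posterior 2.5%, 50%, 97.5% points
    (y + 1) / (n + 2)                               # posterior mean, about 0.079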
Posterior Distribution for Water Samples
Under the uniform prior, the posterior distribution for $\theta$ is Beta(7, 82):
theta <- seq(0, 0.25, length = 500)   # grid of theta values (added so the code runs)
plot(theta, dbeta(theta, 7, 82), t = "l", lty = 1)
lines(theta, dbeta(theta, 1, 1), lty = 2)
[Figure: posterior density (solid) and uniform prior (dashed) on (0, 0.25)]
Beta Distributions
$p(\theta) = \dfrac{1}{B(a,b)}\,\theta^{a-1}(1-\theta)^{b-1}$ for $0 < \theta < 1$, where $B(a,b) = \dfrac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$
Beta($a$, $b$), or Beta($Mm$, $M(1-m)$) where $m = a/M$, $M = a + b$
$E(\theta) = a/(a+b) = m$
$V(\theta) = m(1-m)/(M+1)$: more concentrated, or precise, for larger $M$
Unique mode at $(a-1)/(a+b-2)$ if $a, b > 1$
Mode at 0 if $a < 1$, and/or one at 1 if $b < 1$
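A quick simulation check of the mean and variance formulas, using Beta(4, 2) as an arbitrary example:

    a <- 4; b <- 2
    m <- a / (a + b); M <- a + b
    c(mean = m, var = m * (1 - m) / (M + 1))   # formula values
    draws <- rbeta(1e5, a, b)
    c(mean = mean(draws), var = var(draws))    # simulation values, close to the above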
Shapes of Beta Priors
[Figure: 3 x 3 grid of Beta densities on (0, 1): Beta(0.5,0.5), Beta(0.5,1), Beta(0.5,2); Beta(1,0.5), Beta(1,1), Beta(1,2); Beta(4,0.5), Beta(4,1), Beta(4,2)]
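A sketch that reproduces a grid of this kind; the layout choices are mine.

    theta <- seq(0.001, 0.999, length.out = 300)
    par(mfrow = c(3, 3))
    for (a in c(0.5, 1, 4)) {
      for (b in c(0.5, 1, 2)) {
        plot(theta, dbeta(theta, a, b), type = "l",
             xlab = expression(theta), ylab = "density",
             main = paste0("Beta(", a, ",", b, ")"))
      }
    }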
Beta-Binomial Model
Prior: $\theta \sim$ Beta($a$, $b$), $p(\theta) \propto \theta^{a-1}(1-\theta)^{b-1}$
Likelihood: $L(\theta) = \theta^y(1-\theta)^{n-y}$
Posterior: $p(\theta \mid Y) \propto L(\theta)\,p(\theta) \propto \theta^{y+a-1}(1-\theta)^{n-y+b-1}$, so $\theta \mid Y \sim$ Beta($y+a$, $n-y+b$)
Posterior mean:
$\dfrac{y+a}{n+a+b} = \dfrac{n}{n+a+b}\left(\dfrac{y}{n}\right) + \dfrac{a+b}{n+a+b}\left(\dfrac{a}{a+b}\right)$
a weighted average of the MLE and the prior mean
$a$ is the prior number of 1's and $a + b$ is the prior sample size
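A sketch checking the weighted-average form for the Giardia data with the uniform prior (a = b = 1):

    n <- 87; y <- 6; a <- 1; b <- 1
    w <- n / (n + a + b)                    # weight on the data
    w * (y / n) + (1 - w) * a / (a + b)     # weighted average
    (y + a) / (n + a + b)                   # direct formula, same value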
Conjugate Prior Distributions
Consider a class of prior distributions, $p(\theta) \in \mathcal{P}$. We say that the class is conjugate for a sampling model $p(y \mid \theta)$ if $p(\theta) \in \mathcal{P}$ implies that $p(\theta \mid Y) \in \mathcal{P}$, for all $p(\theta) \in \mathcal{P}$ and data $y$.
If $Y$ has a binomial distribution, then the class of Beta prior distributions is conjugate.
We will see that sampling models based on exponential families all have conjugate priors.
Posterior Intervals
Summary estimates of $\theta$
Names: credible intervals, (Bayesian) confidence intervals, posterior intervals
Central intervals (equal tails): e.g., 95% interval qbeta(c(0.025,0.975), a, b)
Alternatives: highest posterior density (HPD) intervals, one-sided intervals, etc.
HPD Regions
$p(\theta \in (0.03, 0.14) \mid y) = 0.95$
[Figure: posterior density on (0, 0.25) with the 95% HPD region shaded]
solve.hpd.beta(6, 87, h=.14, xlim=c(0,0.25))
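solve.hpd.beta appears to be a course-provided function rather than part of base R. As a rough sketch, a similar 95% HPD interval for the Beta(7, 82) posterior can be found by minimizing the interval length over intervals with 95% posterior probability:

    a <- 7; b <- 82; level <- 0.95
    width <- function(p.lo) qbeta(p.lo + level, a, b) - qbeta(p.lo, a, b)
    p.lo <- optimize(width, interval = c(0, 1 - level))$minimum
    qbeta(c(p.lo, p.lo + level), a, b)   # roughly (0.03, 0.14)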