The beta density, Bayes, Laplace, and Pólya

Saad Mneimneh

1 The beta density as a conjugate form

Suppose that k is a binomial random variable with index n and parameter p, i.e.

P(k | p) = \binom{n}{k} p^k (1-p)^{n-k}

Applying Bayes's rule, we have:

f(p | k) ∝ p^k (1-p)^{n-k} f(p)

Therefore, a prior of the form

f(p) ∝ p^{α-1} (1-p)^{β-1}

is a conjugate prior, since the posterior will have the form:

f(p | k) ∝ p^{k+α-1} (1-p)^{n-k+β-1}

It is not hard to show that

\int_0^1 p^{α-1} (1-p)^{β-1} dp = \frac{Γ(α) Γ(β)}{Γ(α+β)}

Let's denote the above by B(α, β). Therefore, f(p) = Be(α, β), where Be(α, β) is called the beta density with parameters α > 0 and β > 0, and is given by:

f(p) = \frac{1}{B(α, β)} p^{α-1} (1-p)^{β-1}

Note that the beta density can also be viewed as the posterior for p after observing α-1 successes and β-1 failures, given a uniform prior on p (here both α and β are integers):

f(p | α-1 successes, β-1 failures) ∝ p^{α-1} (1-p)^{β-1}
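As a quick numerical sanity check of the conjugate update above, here is a minimal sketch (not part of the original notes; the numbers n, k, α, β are arbitrary, and it assumes NumPy and SciPy are available). It normalizes likelihood × prior on a grid and compares the result with the closed-form Beta(k+α, n-k+β) posterior.

```python
import numpy as np
from scipy import stats

# Hypothetical numbers chosen only for illustration.
n, k = 10, 7              # trials and observed successes
alpha, beta = 2.0, 3.0    # Beta prior parameters

p = np.linspace(1e-6, 1 - 1e-6, 2001)

# Unnormalized posterior: binomial likelihood times Beta(alpha, beta) prior.
unnorm = stats.binom.pmf(k, n, p) * stats.beta.pdf(p, alpha, beta)
posterior_grid = unnorm / np.trapz(unnorm, p)

# Conjugacy says the posterior is Beta(k + alpha, n - k + beta).
posterior_closed = stats.beta.pdf(p, k + alpha, n - k + beta)

print("max abs difference:", np.max(np.abs(posterior_grid - posterior_closed)))
```

The printed difference is essentially grid-integration error, which is the point: the posterior keeps the beta form with updated parameters.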

Example: Consider an urn containing red and black balls. The probability of a red ball is p, but p is unknown. The prior on p is uniform between 0 and 1 (no specific knowledge). We repeatedly draw balls with replacement. What is the posterior density for p after observing α-1 red balls and β-1 black balls?

f(p | α-1 red, β-1 black) ∝ \binom{α+β-2}{α-1} p^{α-1} (1-p)^{β-1}

Therefore, f(p) = Be(α, β). Note that both α and β need to be equal to at least 1. For instance, after drawing one red ball only (α = 2, β = 1), the posterior will be f(p) = 2p. Here's a table listing some possible observations:

observation        posterior
α = 1, β = 1       f(p) = 1
α = 2, β = 1       f(p) = 2p
α = 2, β = 2       f(p) = 6p(1-p)
α = 3, β = 1       f(p) = 3p^2
α = 3, β = 2       f(p) = 12p^2(1-p)
α = 3, β = 3       f(p) = 30p^2(1-p)^2
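Each entry in the table is just the normalized beta density (1/B(α, β)) p^{α-1} (1-p)^{β-1} with B(α, β) = Γ(α)Γ(β)/Γ(α+β). The following small sketch (illustrative only, not from the notes; it assumes SciPy) regenerates the constants 1, 2, 6, 3, 12, 30 appearing in the table.

```python
from scipy.special import gamma

def beta_normalizer(a, b):
    """Return 1 / B(a, b), the constant multiplying p^(a-1) (1-p)^(b-1)."""
    return gamma(a + b) / (gamma(a) * gamma(b))

# (alpha, beta) pairs from the table above.
for a, b in [(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3)]:
    c = beta_normalizer(a, b)
    print(f"alpha={a}, beta={b}: f(p) = {c:g} * p^{a - 1} * (1-p)^{b - 1}")
```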

2 Laplace's rule of succession

In 1774, Laplace claimed that an event which has occurred n times, and has not failed thus far, will occur again with probability (n+1)/(n+2). This is known as Laplace's rule of succession. Laplace applied this result to the sunrise problem: What is the probability that the sun will rise tomorrow?

Let X_1, X_2, ... be a sequence of independent Bernoulli trials with parameter p. Note that this notion of independence is conditional on p. More precisely:

P(X_1 = b_1, X_2 = b_2, ..., X_n = b_n | p) = \prod_{i=1}^{n} P(X_i = b_i | p)

In fact, X_i and X_j are not independent, because by observing X_i one could say something about p, and hence about X_j. This is a consequence of the Bayesian approach, which treats p itself as a random variable (unknown).

Let S_n = \sum_{i=1}^{n} X_i. We would like to find the following probability:

P(X_{n+1} = 1 | S_n = k)

Observe that:

P(X_{n+1} = 1 | S_n = k) = \int_0^1 P(X_{n+1} = 1 | p, S_n = k) f(p | S_n = k) dp
                         = \int_0^1 P(X_{n+1} = 1 | p) f(p | S_n = k) dp
                         = \int_0^1 p f(p | S_n = k) dp

Therefore, we need to find the posterior density of p. Assuming we know nothing about p initially, we will adopt the uniform prior f(p) = 1 between 0 and 1. Applying Bayes' rule:

f(p | S_n = k) ∝ P(S_n = k | p) f(p) ∝ p^k (1-p)^{n-k}

We conclude that:

f(p | S_n = k) = \frac{1}{B(k+1, n-k+1)} p^{(k+1)-1} (1-p)^{(n-k+1)-1}

Finally,

P(X_{n+1} = 1 | S_n = k) = \int_0^1 p f(p | S_n = k) dp = \frac{k+1}{n+2}

We obtain Laplace's result by setting k = n.
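As an illustrative cross-check (not from the notes; the values of n and k are arbitrary, and it assumes NumPy/SciPy), the posterior-mean integral \int_0^1 p f(p | S_n = k) dp can be evaluated numerically and compared with (k+1)/(n+2).

```python
import numpy as np
from scipy import stats

n, k = 20, 20   # hypothetical: n trials, all successes (Laplace's case k = n)

p = np.linspace(0.0, 1.0, 200001)
# Posterior under a uniform prior is Beta(k+1, n-k+1).
posterior = stats.beta.pdf(p, k + 1, n - k + 1)

numeric = np.trapz(p * posterior, p)   # E[p | S_n = k] by quadrature
closed = (k + 1) / (n + 2)

print(f"numeric = {numeric:.6f}, (k+1)/(n+2) = {closed:.6f}")
```

With k = n = 20 both values are 21/22 ≈ 0.9545, matching Laplace's rule.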

3 Generalization

Consider a coin toss that can result in head, tail, or edge. We denote by p the probability of head, and by q the probability of tail; thus the probability of edge is 1-p-q. Observe that p, q ∈ [0, 1] and p + q ≤ 1. In n coin tosses, the probability of observing k_1 heads and k_2 tails (and thus n - k_1 - k_2 edges) is given by the multinomial probability mass function (this generalizes the binomial):

P(k_1, k_2) = \binom{n}{k_1} \binom{n-k_1}{k_2} p^{k_1} q^{k_2} (1-p-q)^{n-k_1-k_2}

The Dirichlet density is a generalization of beta and is conjugate to the multinomial. For instance:

f(p, q) = \frac{Γ(α+β+γ)}{Γ(α) Γ(β) Γ(γ)} p^{α-1} q^{β-1} (1-p-q)^{γ-1}

4 Pólya's urn

Pólya's urn represents a generalization of a Binomial random variable. Consider the following scheme: An urn contains b black and r red balls. The ball drawn is always replaced, and, in addition, c balls of the color drawn are added to the urn. When c = 0, n drawings are equivalent to n independent Bernoulli processes with p = b/(b+r). However, with c ≠ 0, the Bernoulli processes are dependent, each with a parameter that depends on the sequence of previous drawings. For instance, if the first ball is black, the (conditional) probability of a black ball at the second drawing is (b+c)/(b+c+r). The probability of the sequence black, black is, therefore,

\frac{b}{b+r} \cdot \frac{b+c}{b+c+r}

Let X_n be a random variable denoting the number of black balls drawn in n trials. What is P(X_n = k)? It is easy to show that all sequences with k black balls have the same probability p_k and, therefore,

P(X_n = k) = \binom{n}{k} p_k

We now compute p_k as:

p_k = \frac{\prod_{i=1}^{k} [b + (i-1)c] \cdot \prod_{i=1}^{n-k} [r + (i-1)c]}{\prod_{i=1}^{n} [b + r + (i-1)c]}

Rewriting in terms of the Gamma function (assuming c > 0), we have:

p_k = \frac{\prod_{i=1}^{k} [b/c + i - 1] \cdot \prod_{i=1}^{n-k} [r/c + i - 1]}{\prod_{i=1}^{n} [(b+r)/c + i - 1]}
    = \frac{Γ(b/c + k) Γ(r/c + n - k)}{Γ(b/c) Γ(r/c)} \cdot \frac{Γ((b+r)/c)}{Γ((b+r)/c + n)}
    = \frac{Γ(b/c + k) Γ(r/c + n - k)}{Γ(b/c + r/c + n)} \cdot \frac{Γ(b/c + r/c)}{Γ(b/c) Γ(r/c)}
    = \frac{B(b/c + k, r/c + n - k)}{B(b/c, r/c)}

Therefore, the important parameters are b/c and r/c. Note that we can rewrite the above as (verify it):

p_k = \int_0^1 p^k (1-p)^{n-k} \, Be(b/c, r/c) \, dp

where Be(b/c, r/c) denotes the beta density in p. So,

P(X_n = k) = \binom{n}{k} \int_0^1 p^k (1-p)^{n-k} \, Be(b/c, r/c) \, dp
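To make the P(X_n = k) formula concrete, here is a small simulation sketch (illustrative only; the urn parameters b, r, c, n are arbitrary choices, and it assumes NumPy/SciPy). It compares the empirical distribution of the number of black draws with the beta-binomial expression \binom{n}{k} B(b/c + k, r/c + n - k)/B(b/c, r/c).

```python
import numpy as np
from scipy.special import comb, betaln

rng = np.random.default_rng(0)

b, r, c, n = 2, 3, 1, 10   # hypothetical urn: 2 black, 3 red, add c = 1 ball per draw
trials = 200_000

def polya_black_count(b, r, c, n, rng):
    """Simulate one run of Pólya's urn; return the number of black balls drawn in n draws."""
    black, red, count = b, r, 0
    for _ in range(n):
        if rng.random() < black / (black + red):
            black += c
            count += 1
        else:
            red += c
    return count

counts = np.bincount([polya_black_count(b, r, c, n, rng) for _ in range(trials)],
                     minlength=n + 1)
empirical = counts / trials

k = np.arange(n + 1)
exact = comb(n, k) * np.exp(betaln(b / c + k, r / c + n - k) - betaln(b / c, r / c))

for ki in k:
    print(f"k={ki:2d}  empirical={empirical[ki]:.4f}  formula={exact[ki]:.4f}")
```

The two columns agree up to Monte Carlo noise, which is the content of the identity P(X_n = k) = \binom{n}{k} p_k above.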

5 Pólya's urn generates beta

We now show that Pólya's urn generates a beta distribution in the limit. For this, we will consider lim_{n→∞} X_n/n. First note that we can write P(X_n = k) as follows:

P(X_n = k) = \frac{Γ(b/c + r/c)}{Γ(b/c) Γ(r/c)} \cdot \frac{Γ(k + b/c)}{Γ(k + 1)} \cdot \frac{Γ(n - k + r/c)}{Γ(n - k + 1)} \cdot \frac{Γ(n + 1)}{Γ(n + b/c + r/c)}

Using Stirling's approximation Γ(x+1) ≈ \sqrt{2πx} (x/e)^x as x goes to infinity, we can conclude that when x goes to infinity,

\frac{Γ(x + a)}{Γ(x + b)} ≈ x^{a-b}

Therefore, when n → ∞ (with k = xn for some 0 < x < 1),

P(X_n = k) ≈ \frac{1}{B(b/c, r/c)} k^{b/c - 1} (n - k)^{r/c - 1} n^{1 - b/c - r/c}

Now,

P(X_n/n ≤ x) = P(X_n = 0) + P(X_n = 1) + ... + P(X_n = ⌊xn⌋) ≈ \int_0^x n P(X_n = un) du

since the sum is a Riemann sum with spacing 1/n. As n goes to infinity, 1/n goes to zero; therefore:

P(lim_{n→∞} X_n/n ≤ x) = lim_{n→∞} [P(X_n = 0) + P(X_n = 1) + ... + P(X_n = ⌊xn⌋)] = \int_0^x lim_{n→∞} n P(X_n = un) du

And since k = un, we can replace k by un in the limiting expression we obtained for P(X_n = k) to get:

P(lim_{n→∞} X_n/n ≤ x) = \int_0^x \frac{1}{B(b/c, r/c)} u^{b/c - 1} (1 - u)^{r/c - 1} du

It is rather interesting that this limiting property of Pólya's urn depends on the initial condition. Even more interesting is that if Y = lim_{n→∞} X_n/n, then conditioned on Y = p we have independent Bernoulli trials with parameter p (stated without proof):

P(X_n = k | Y = p) = \binom{n}{k} p^k (1-p)^{n-k}
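The limiting statement can be illustrated empirically. The sketch below (not part of the notes; the urn parameters and run counts are arbitrary, and it assumes NumPy/SciPy) simulates X_n/n for a fairly large n and compares its empirical CDF with the Beta(b/c, r/c) CDF at a few points.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

b, r, c, n = 2, 3, 1, 1000   # hypothetical initial urn and a large number of draws
runs = 2000

def polya_fraction(b, r, c, n, rng):
    """Fraction of black balls drawn in n Pólya-urn draws."""
    black, red, count = b, r, 0
    for _ in range(n):
        if rng.random() < black / (black + red):
            black += c
            count += 1
        else:
            red += c
    return count / n

fractions = np.array([polya_fraction(b, r, c, n, rng) for _ in range(runs)])

# Compare the empirical CDF of X_n / n with the Beta(b/c, r/c) CDF.
for x in (0.2, 0.4, 0.6, 0.8):
    empirical = np.mean(fractions <= x)
    limit = stats.beta.cdf(x, b / c, r / c)
    print(f"x={x:.1f}  empirical={empirical:.3f}  Beta(b/c, r/c) cdf={limit:.3f}")
```

Changing the initial counts b and r changes the limiting beta, which is exactly the dependence on the initial condition noted above.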