Biostatistics 602 - Statistical Inference
Lecture 16: Evaluation of Bayes Estimators
Hyun Min Kang
March 14th, 2013

Last Lecture
- What is a Bayes estimator?
- Is a Bayes estimator the best unbiased estimator?
- Compared to other estimators, what are the advantages of a Bayes estimator?
- What is a conjugate family?
- What are the conjugate families of the Binomial, Poisson, and Normal distributions?

Bayes Estimator
- $\theta$ : parameter
- $\pi(\theta)$ : prior distribution
- $X \mid \theta \sim f(x \mid \theta)$ : sampling distribution
- Posterior distribution of $\theta$ given $x$ (Bayes' rule):
  $$\pi(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{m(x)}, \qquad m(x) = \int f(x \mid \theta)\,\pi(\theta)\,d\theta \ \text{(marginal)}$$
- The Bayes estimator of $\theta$ is $E(\theta \mid x) = \int \theta\,\pi(\theta \mid x)\,d\theta$.

Example
- $X_1, \ldots, X_n \overset{iid}{\sim} \mathrm{Bernoulli}(p)$, with prior $\pi(p) \sim \mathrm{Beta}(\alpha, \beta)$.
- Prior guess: $\hat{p} = \frac{\alpha}{\alpha+\beta}$.
- Posterior distribution: $\pi(p \mid x) \sim \mathrm{Beta}\left(\sum x_i + \alpha,\; n - \sum x_i + \beta\right)$.
- Bayes estimator:
  $$\hat{p} = \frac{\alpha + \sum x_i}{\alpha + \beta + n} = \frac{\sum x_i}{n}\cdot\frac{n}{\alpha+\beta+n} + \frac{\alpha}{\alpha+\beta}\cdot\frac{\alpha+\beta}{\alpha+\beta+n},$$
  a weighted average of the sample mean and the prior mean.
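The Beta-Bernoulli update above can be checked numerically. This is a minimal sketch (the helper name `bayes_estimate` and the chosen prior are mine, not from the slides) verifying that the posterior mean equals the weighted average of sample mean and prior mean, term by term:

```python
from random import random, seed

def bayes_estimate(xs, alpha, beta):
    """Posterior mean (alpha + sum x_i) / (alpha + beta + n) for Bernoulli data
    with a Beta(alpha, beta) prior."""
    n, s = len(xs), sum(xs)
    return (alpha + s) / (alpha + beta + n)

seed(1)
p_true, alpha, beta = 0.3, 2.0, 2.0
xs = [1 if random() < p_true else 0 for _ in range(100)]
n, s = len(xs), sum(xs)

post_mean = bayes_estimate(xs, alpha, beta)
# Weighted-average identity: (sample mean) * n/(a+b+n) + (prior mean) * (a+b)/(a+b+n)
weighted = (s / n) * (n / (alpha + beta + n)) \
         + (alpha / (alpha + beta)) * ((alpha + beta) / (alpha + beta + n))
assert abs(post_mean - weighted) < 1e-12
print(round(post_mean, 4))
```

As $n$ grows, the weight on the sample mean tends to 1, so the estimate is pulled away from the prior guess toward the data.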
Loss Function Optimality

Loss Function
- Let $L(\theta, \hat{\theta})$ be a function of $\theta$ and $\hat{\theta}$, and let $\hat{\theta}$ be an estimator.
- If $\hat{\theta} = \theta$, the estimator makes a correct decision and the loss is 0.
- If $\hat{\theta} \ne \theta$, it makes a mistake and the loss is not 0.

- Squared error loss: $L(\theta, \hat{\theta}) = (\hat{\theta} - \theta)^2$. The mean squared error, $\mathrm{MSE}(\hat{\theta}) = E(\hat{\theta} - \theta)^2$, is the average loss $E\,L(\theta, \hat{\theta})$ under this choice, i.e. the expectation of the loss if $\hat{\theta}$ is used to estimate $\theta$.
- Absolute error loss: $L(\theta, \hat{\theta}) = |\hat{\theta} - \theta|$.
- A loss that penalizes overestimation more than underestimation:
  $$L(\theta, \hat{\theta}) = (\hat{\theta} - \theta)^2\, I(\hat{\theta} < \theta) + 10\,(\hat{\theta} - \theta)^2\, I(\hat{\theta} \ge \theta)$$

Risk Function - Average Loss
- $R(\theta, \hat{\theta}) = E\left[L(\theta, \hat{\theta}(X)) \mid \theta\right]$
- If $L(\theta, \hat{\theta}) = (\hat{\theta} - \theta)^2$, then $R(\theta, \hat{\theta})$ is the MSE.
- An estimator with smaller $R(\theta, \hat{\theta})$ is preferred.
- Definition: the Bayes risk is the average risk across all values of $\theta$, given the prior $\pi(\theta)$:
  $$\int R(\theta, \hat{\theta})\,\pi(\theta)\,d\theta$$
- The Bayes rule with respect to a prior $\pi$ is the estimator that is optimal with respect to the Bayes risk, i.e. the one that minimizes it.

Alternative definition of the Bayes risk
$$\begin{aligned}
\int R(\theta, \hat{\theta})\,\pi(\theta)\,d\theta
&= \int E\left[L(\theta, \hat{\theta}(X)) \mid \theta\right]\pi(\theta)\,d\theta \\
&= \int \left[\int f(x \mid \theta)\, L(\theta, \hat{\theta}(x))\,dx\right] \pi(\theta)\,d\theta \\
&= \int\!\!\int f(x \mid \theta)\,\pi(\theta)\, L(\theta, \hat{\theta}(x))\,dx\,d\theta \\
&= \int\!\!\int \pi(\theta \mid x)\,m(x)\, L(\theta, \hat{\theta}(x))\,dx\,d\theta \\
&= \int \left[\int L(\theta, \hat{\theta}(x))\,\pi(\theta \mid x)\,d\theta\right] m(x)\,dx
\end{aligned}$$
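The risk function $R(\theta, \hat{\theta})$ can be approximated by Monte Carlo. This sketch (function names and the Beta(2,2) prior are my own illustrative choices) estimates the squared-error risk of the sample mean versus the Bayes estimator of a Bernoulli $p$ at fixed values of $\theta = p$:

```python
import random

def risk(p, estimator, n=20, reps=20000, seed=0):
    """Monte Carlo estimate of R(p, estimator) = E[(estimator - p)^2 | p]
    for n Bernoulli(p) trials summarized by their success count s."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        s = sum(1 for _ in range(n) if rng.random() < p)
        total += (estimator(s, n) - p) ** 2
    return total / reps

mle = lambda s, n: s / n                  # sample mean
bayes = lambda s, n: (s + 2) / (n + 4)    # posterior mean under a Beta(2, 2) prior

for p in (0.1, 0.5):
    print(p, round(risk(p, mle), 5), round(risk(p, bayes), 5))
```

Near $p = 0.5$ the Bayes estimator (which shrinks toward the prior mean 0.5) has smaller risk, while near the boundary the sample mean wins: neither estimator dominates for every $\theta$, which is exactly why the Bayes risk averages over $\theta$ using the prior.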
Posterior Expected Loss
- The posterior expected loss is defined as $\int \pi(\theta \mid x)\, L(\theta, \hat{\theta}(x))\,d\theta$.
- For squared error loss $L(\theta, \hat{\theta}) = (\hat{\theta} - \theta)^2$, the posterior expected loss is
  $$\int (\theta - \hat{\theta})^2\, \pi(\theta \mid x)\,d\theta = E\left[(\theta - \hat{\theta})^2 \mid x\right]$$
  so the goal is to minimize $E\left[(\theta - \hat{\theta})^2 \mid x\right]$.
- An alternative definition of the Bayes rule estimator is the estimator that minimizes the posterior expected loss.

Bayes Estimator based on squared error loss
$$\begin{aligned}
E\left[(\theta - \hat{\theta})^2 \mid x\right]
&= E\left[\left(\theta - E(\theta \mid x) + E(\theta \mid x) - \hat{\theta}\right)^2 \mid x\right] \\
&= E\left[\left(\theta - E(\theta \mid x)\right)^2 \mid x\right] + \left(E(\theta \mid x) - \hat{\theta}\right)^2
\end{aligned}$$
(the cross term vanishes because $E\left[\theta - E(\theta \mid x) \mid x\right] = 0$), which is minimized when $\hat{\theta} = E(\theta \mid x)$.

So far
- Loss function: $L(\theta, \hat{\theta})$, e.g. $(\hat{\theta} - \theta)^2$ or $|\hat{\theta} - \theta|$.
- Risk function: $R(\theta, \hat{\theta})$ is the average of $L(\theta, \hat{\theta})$ across all $x$. For squared error loss, the risk function is the same as the MSE.
- Bayes risk: the average risk across all $\theta$, based on the prior of $\theta$; equivalently, the average posterior expected loss across all $x$.
- Bayes estimator: $\hat{\theta} = E(\theta \mid x)$. Under squared error loss, minimizing the Bayes risk is equivalent to minimizing the posterior expected loss.

Bayes Estimator based on absolute error loss
Suppose that $L(\theta, \hat{\theta}) = |\theta - \hat{\theta}|$. The posterior expected loss is
$$E\left[L(\theta, \hat{\theta}(x))\right] = \int |\theta - \hat{\theta}(x)|\,\pi(\theta \mid x)\,d\theta
= E\left[|\theta - \hat{\theta}| \mid x\right]
= \int_{-\infty}^{\hat{\theta}} (\hat{\theta} - \theta)\,\pi(\theta \mid x)\,d\theta + \int_{\hat{\theta}}^{\infty} (\theta - \hat{\theta})\,\pi(\theta \mid x)\,d\theta$$
Setting $\frac{\partial}{\partial \hat{\theta}}\, E\,L(\theta, \hat{\theta}(x)) = 0$ shows that $\hat{\theta}$ is the posterior median.
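Both minimization claims can be verified numerically on a grid approximation of a posterior. This sketch (the grid, the Beta(3,5)-shaped "posterior", and all names are my own illustration) checks that the posterior mean minimizes expected squared loss and the posterior median minimizes expected absolute loss:

```python
# Unnormalized Beta(3, 5)-shaped posterior on a grid of theta values.
grid = [i / 500 for i in range(1, 500)]
w = [t ** 2 * (1 - t) ** 4 for t in grid]
Z = sum(w)
post = [wi / Z for wi in w]

def exp_loss(that, loss):
    """Posterior expected loss of the candidate estimate `that` on the grid."""
    return sum(p * loss(t, that) for t, p in zip(grid, post))

sq = lambda t, th: (t - th) ** 2
ab = lambda t, th: abs(t - th)

post_mean = sum(t * p for t, p in zip(grid, post))
# Posterior median: smallest grid point with CDF >= 1/2.
cdf, post_median = 0.0, None
for t, p in zip(grid, post):
    cdf += p
    if cdf >= 0.5:
        post_median = t
        break

best_sq = min(grid, key=lambda th: exp_loss(th, sq))   # minimizer under squared loss
best_ab = min(grid, key=lambda th: exp_loss(th, ab))   # minimizer under absolute loss
assert abs(best_sq - post_mean) < 5e-3
assert abs(best_ab - post_median) < 5e-3
print(round(post_mean, 3), round(post_median, 3))
```

The brute-force minimizers agree with the posterior mean and median up to the grid resolution, matching the algebraic argument on the slide.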
Two Bayes Rules
Consider a point estimation problem for a real-valued parameter $\theta$.
- For squared error loss, the posterior expected loss is
  $$\int (\theta - \hat{\theta})^2\, \pi(\theta \mid x)\,d\theta = E\left[(\theta - \hat{\theta})^2 \mid x\right]$$
  This expected value is minimized by $\hat{\theta} = E(\theta \mid x)$, so the Bayes rule estimator is the mean of the posterior distribution.
- For absolute error loss, the posterior expected loss is $E\left(|\theta - \hat{\theta}| \mid x\right)$. As shown previously, this is minimized by choosing $\hat{\theta}$ as the median of $\pi(\theta \mid x)$.

Example
- $X_1, \ldots, X_n \overset{iid}{\sim} \mathrm{Bernoulli}(p)$, with prior $\pi(p) \sim \mathrm{Beta}(\alpha, \beta)$.
- The posterior distribution follows $\mathrm{Beta}\left(\sum x_i + \alpha,\; n - \sum x_i + \beta\right)$.
- The Bayes estimator that minimizes the posterior expected squared error loss is the posterior mean
  $$\hat{p} = \frac{\sum x_i + \alpha}{\alpha + \beta + n}$$
- The Bayes estimator that minimizes the posterior expected absolute error loss is the posterior median $\hat{\theta}$, which satisfies
  $$\int_0^{\hat{\theta}} \frac{\Gamma(\alpha + \beta + n)}{\Gamma\!\left(\sum x_i + \alpha\right)\Gamma\!\left(n - \sum x_i + \beta\right)}\, p^{\sum x_i + \alpha - 1} (1-p)^{n - \sum x_i + \beta - 1}\,dp = \frac{1}{2}$$

Asymptotic Evaluation of Point Estimators
- When the sample size $n$ approaches infinity, the behaviors of an estimator are known as its asymptotic properties.
- Definition: Let $W_n = W_n(X_1, \ldots, X_n)$ be a sequence of estimators for $\tau(\theta)$. We say $W_n$ is consistent for estimating $\tau(\theta)$ if $W_n \overset{P}{\to} \tau(\theta)$ under $\theta$ for every $\theta$.
- $W_n \overset{P}{\to} \tau(\theta)$ (converges in probability to $\tau(\theta)$) means that, given any $\epsilon > 0$,
  $$\lim_{n \to \infty} P\left(|W_n - \tau(\theta)| \ge \epsilon\right) = 0 \quad \Longleftrightarrow \quad \lim_{n \to \infty} P\left(|W_n - \tau(\theta)| < \epsilon\right) = 1$$
- Since $|W_n - \tau(\theta)| < \epsilon$ can be read as "$W_n$ is close to $\tau(\theta)$", consistency means that the probability that $W_n$ is close to $\tau(\theta)$ approaches 1 as $n$ goes to $\infty$.

Tools for proving consistency
- Use the definition directly (complicated).
- Chebychev's Inequality:
  $$P\left(|W_n - \tau(\theta)| \ge \epsilon\right) = P\left((W_n - \tau(\theta))^2 \ge \epsilon^2\right) \le \frac{E\left(W_n - \tau(\theta)\right)^2}{\epsilon^2} = \frac{\mathrm{MSE}(W_n)}{\epsilon^2} = \frac{\mathrm{Bias}^2(W_n) + \mathrm{Var}(W_n)}{\epsilon^2}$$
  so it suffices to show that both $\mathrm{Bias}(W_n)$ and $\mathrm{Var}(W_n)$ converge to zero.
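The posterior-median equation above has no closed form, but it can be solved numerically. This is a hedged sketch (the quadrature, bisection, and the example data of 4 successes in 10 trials with a Beta(2, 2) prior are my own choices) that finds the $\hat{\theta}$ satisfying $\int_0^{\hat{\theta}} \mathrm{Beta}(a, b)\text{ density}\,dp = 1/2$ with $a = \sum x_i + \alpha$, $b = n - \sum x_i + \beta$:

```python
import math

def beta_cdf(x, a, b, steps=2000):
    """Trapezoidal-rule CDF of Beta(a, b) on (0, x); assumes a, b > 1 so the
    density vanishes at the endpoints of (0, 1)."""
    c = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    pdf = lambda p: c * p ** (a - 1) * (1 - p) ** (b - 1)
    h = x / steps
    return h * (0.5 * pdf(x) + sum(pdf(i * h) for i in range(1, steps)))

def beta_median(a, b, iters=50):
    """Bisection on the CDF for the median of Beta(a, b)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 4 successes in n = 10 trials with a Beta(2, 2) prior -> Beta(6, 8) posterior.
m = beta_median(6, 8)
print(round(m, 4))
```

For this right-skewed posterior the median lies slightly below the posterior mean $6/14 \approx 0.4286$, so the two Bayes rules give close but distinct estimates.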
Theorem for consistency
Theorem 10.1.3: If $W_n$ is a sequence of estimators of $\tau(\theta)$ satisfying, for all $\theta$,
- $\lim_{n \to \infty} \mathrm{Bias}(W_n) = 0$
- $\lim_{n \to \infty} \mathrm{Var}(W_n) = 0$
then $W_n$ is consistent for $\tau(\theta)$.

Weak Law of Large Numbers
Theorem 5.5.2: Let $X_1, \ldots, X_n$ be iid random variables with $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2 < \infty$. Then $\bar{X}_n$ converges in probability to $\mu$, i.e. $\bar{X}_n \overset{P}{\to} \mu$.

Consistent sequence of estimators
Theorem 10.1.5: Let $W_n$ be a consistent sequence of estimators of $\tau(\theta)$, and let $a_n$, $b_n$ be sequences of constants satisfying
1. $\lim_{n \to \infty} a_n = 1$
2. $\lim_{n \to \infty} b_n = 0$
Then $U_n = a_n W_n + b_n$ is also a consistent sequence of estimators of $\tau(\theta)$.

Continuous Mapping Theorem: If $W_n$ is consistent for $\theta$ and $g$ is a continuous function, then $g(W_n)$ is consistent for $g(\theta)$.

Example
Problem: $X_1, \ldots, X_n$ are iid samples from a distribution with mean $\mu$ and variance $\sigma^2 < \infty$.
1. Show that $\bar{X}$ is consistent for $\mu$.
2. Show that $\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is consistent for $\sigma^2$.
3. Show that $\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is consistent for $\sigma^2$.
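The definition of consistency and the Chebychev route can both be seen in a simulation. This is my own illustration (function name and parameters are assumptions, not from the slides): it estimates $P(|\bar{X}_n - \mu| \ge \epsilon)$ by Monte Carlo and compares it with the Chebychev bound $\sigma^2 / (n\epsilon^2)$:

```python
import random

def tail_prob(n, mu=0.0, eps=0.2, reps=2000, seed=7):
    """Monte Carlo estimate of P(|Xbar_n - mu| >= eps) for iid N(mu, 1) samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
        if abs(xbar - mu) >= eps:
            hits += 1
    return hits / reps

for n in (25, 100, 400):
    chebyshev = 1.0 / (n * 0.2 ** 2)   # sigma^2 / (n * eps^2) with sigma^2 = 1
    print(n, tail_prob(n), round(chebyshev, 4))
```

The simulated tail probability falls toward 0 as $n$ grows (the WLLN), and it always sits below the Chebychev bound, though the bound is loose.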
Example - Solution
Proof that $\bar{X}$ is consistent for $\mu$:
- By the law of large numbers, $\bar{X}$ is consistent for $\mu$. Alternatively:
- $\mathrm{Bias}(\bar{X}) = E(\bar{X}) - \mu = \mu - \mu = 0$
- $\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{\sum_{i=1}^{n} X_i}{n}\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n}$, and $\lim_{n\to\infty} \frac{\sigma^2}{n} = 0$
- By Theorem 10.1.3, $\bar{X}$ is consistent for $\mu$.

Solution - consistency for $\sigma^2$
$$\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n}\sum_{i=1}^{n} \left(X_i^2 - 2 X_i \bar{X} + \bar{X}^2\right) = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2$$
- By the law of large numbers, $\frac{1}{n}\sum_{i=1}^{n} X_i^2 \overset{P}{\to} E(X^2) = \mu^2 + \sigma^2$.
- Note that $\bar{X}^2$ is a function of $\bar{X}$. Define $g(x) = x^2$, which is a continuous function. Then $g(\bar{X}) = \bar{X}^2$ is consistent for $g(\mu) = \mu^2$.
- Therefore,
  $$\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 \overset{P}{\to} (\mu^2 + \sigma^2) - \mu^2 = \sigma^2$$

Solution - consistency for $\sigma^2$ (cont'd)
From the previous slide, $\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is consistent for $\sigma^2$. Define
$$S_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{n}{n-1}\cdot \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$$
Because $\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$ was shown to be consistent for $\sigma^2$, and $a_n = \frac{n}{n-1} \to 1$ as $n \to \infty$, by Theorem 10.1.5, $S_n^2$ is also consistent for $\sigma^2$.

Example - Exponential Family
Problem: Suppose $X_1, \ldots, X_n \overset{iid}{\sim} \mathrm{Exponential}(\beta)$.
1. Propose a consistent estimator of the median.
2. Propose a consistent estimator of $P(X \le c)$, where $c$ is a constant.
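The worked solution can be sanity-checked by simulation. A minimal sketch (the `estimates` helper and the chosen $\mu = 3$, $\sigma^2 = 4$ are my own assumptions): as $n$ grows, the sample mean and both variance estimators approach the truth, and the $1/(n-1)$ version differs from the $1/n$ version only by the factor $a_n = n/(n-1)$:

```python
import random

def estimates(n, mu=3.0, sigma=2.0, seed=0):
    """Return (sample mean, 1/n variance estimate, 1/(n-1) variance estimate)
    for n iid N(mu, sigma^2) draws."""
    rng = random.Random(seed)
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    return xbar, ss / n, ss / (n - 1)

for n in (50, 500, 50000):
    xbar, v_n, v_n1 = estimates(n)
    print(n, round(xbar, 3), round(v_n, 3), round(v_n1, 3))
```

Both variance estimators converge to $\sigma^2 = 4$, illustrating that the biased $1/n$ estimator is nevertheless consistent.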
Consistent estimator for the median
First, we need to express the median $m$ in terms of the parameter $\beta$:
$$\int_0^m \frac{1}{\beta} e^{-x/\beta}\,dx = \left[-e^{-x/\beta}\right]_0^m = 1 - e^{-m/\beta} = \frac{1}{2} \quad \Longrightarrow \quad \text{median } m = \beta \log 2$$
- By the law of large numbers, $\bar{X}$ is consistent for $E(X) = \beta$.
- Applying the continuous mapping theorem to $g(x) = x \log 2$, $g(\bar{X}) = \bar{X} \log 2$ is consistent for $g(\beta) = \beta \log 2$ (the median).

Consistent estimator of $P(X \le c)$
$$P(X \le c) = \int_0^c \frac{1}{\beta} e^{-x/\beta}\,dx = 1 - e^{-c/\beta}$$
- $\bar{X}$ is consistent for $\beta$, and $1 - e^{-c/\beta}$ is a continuous function of $\beta$.
- By the continuous mapping theorem, $g(\bar{X}) = 1 - e^{-c/\bar{X}}$ is consistent for $g(\beta) = 1 - e^{-c/\beta} = P(X \le c)$.

Consistent estimator of $P(X \le c)$ - Alternative Method
Define $Y_i = I(X_i \le c)$. Then $Y_i \overset{iid}{\sim} \mathrm{Bernoulli}(p)$ where $p = P(X \le c)$, and
$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le c)$$
is consistent for $p$ by the law of large numbers.

Today
- Loss Functions
- Law of Large Numbers

Next Lecture
- Central Limit Theorem
- Slutsky Theorem
- Delta Method
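The exponential-family estimators above can be sketched in a few lines. This illustration (the values $\beta = 2$, $c = 1.5$, $n = 50000$ are my own assumptions; $\beta$ is the mean, matching the lecture's parameterization) computes the median estimator $\bar{X}\log 2$ and both estimators of $P(X \le c)$:

```python
import math
import random

rng = random.Random(42)
beta, c, n = 2.0, 1.5, 50000
# random.expovariate takes the rate, which is 1/beta in this parameterization.
xs = [rng.expovariate(1.0 / beta) for _ in range(n)]

xbar = sum(xs) / n
median_hat = xbar * math.log(2)                  # consistent for the median beta * log 2
plug_in = 1 - math.exp(-c / xbar)                # continuous-mapping estimator g(xbar)
empirical = sum(1 for x in xs if x <= c) / n     # mean of Y_i = I(X_i <= c)

print(round(median_hat, 3), round(plug_in, 3), round(empirical, 3))
```

Both estimators of $P(X \le c)$ land close to the true value $1 - e^{-c/\beta}$; the plug-in version exploits the exponential model, while the empirical version only needs the iid assumption.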