STAT-36700 Homework 7 - Solutions
Fall 2018
October 28, 2018

This contains solutions for Homework 7. Please note that we have included several additional comments and approaches to the problems to give you better insight.

Problem 1. Let $X_1, \dots, X_n \sim N(\mu, \sigma^2)$.

(a) Assume that $\sigma^2$ is known. Invert the likelihood ratio test to construct an exact $1-\alpha$ confidence interval for $\mu$.

(b) Again, assume that $\sigma^2$ is known. Invert the asymptotic likelihood ratio test to construct an approximate $1-\alpha$ confidence interval for $\mu$.

(c) Now assume that $\sigma^2$ is unknown. Invert the asymptotic likelihood ratio test to construct an approximate $1-\alpha$ confidence set for $(\mu, \sigma)$.

Solution 1. We derive each of the results below:

(a) For each $\mu_0$, we can construct an $\alpha$-level test of $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$ as follows:

$$
\begin{aligned}
\lambda(X_1, \dots, X_n)
&= \frac{\sup_{\mu \in \Theta_0} L(\mu)}{\sup_{\mu \in \Theta} L(\mu)}
 = \frac{L(\mu_0)}{L(\hat\mu)}
 = \frac{\exp\left(-\sum_i (X_i - \mu_0)^2 / (2\sigma^2)\right)}{\exp\left(-\sum_i (X_i - \bar X)^2 / (2\sigma^2)\right)} \\
&= \frac{\exp\left(-\sum_i (X_i - \bar X + \bar X - \mu_0)^2 / (2\sigma^2)\right)}{\exp\left(-\sum_i (X_i - \bar X)^2 / (2\sigma^2)\right)} \\
&= \frac{\exp\left(-\sum_i \left[(X_i - \bar X)^2 + (\bar X - \mu_0)^2 + 2(X_i - \bar X)(\bar X - \mu_0)\right] / (2\sigma^2)\right)}{\exp\left(-\sum_i (X_i - \bar X)^2 / (2\sigma^2)\right)} \\
&= \frac{\exp\left(-\sum_i (X_i - \bar X)^2 / (2\sigma^2)\right)\exp\left(-n(\bar X - \mu_0)^2 / (2\sigma^2)\right)}{\exp\left(-\sum_i (X_i - \bar X)^2 / (2\sigma^2)\right)} \\
&= \exp\left(-n(\bar X - \mu_0)^2 / (2\sigma^2)\right),
\end{aligned}
$$

where $\hat\mu = \hat\mu_{MLE} = \bar X$ and the cross term vanishes since $\sum_i (X_i - \bar X) = 0$. Hence, we reject $H_0$ if $\lambda(X_1, \dots, X_n) \leq c$, i.e. if $\left(\frac{\bar X - \mu_0}{\sigma/\sqrt n}\right)^2 \geq c'$ for some $c'$. Then $\left(\frac{\bar X - \mu_0}{\sigma/\sqrt n}\right)^2 \geq c'$ becomes an $\alpha$-level test if

$$P_{\mu_0}\left(\left(\frac{\bar X - \mu_0}{\sigma/\sqrt n}\right)^2 \geq c'\right) = \alpha,$$
which implies that $c' = \chi^2_{1,\alpha}$, since $\left(\frac{\bar X - \mu_0}{\sigma/\sqrt n}\right)^2 \sim \chi^2_1$ under $H_0$. Then, by inverting the test, we have that

$$C = \left\{\mu : \left(\frac{\bar X - \mu}{\sigma/\sqrt n}\right)^2 < \chi^2_{1,\alpha}\right\} = \left[\bar X - \sqrt{\frac{\chi^2_{1,\alpha}}{n}}\,\sigma,\; \bar X + \sqrt{\frac{\chi^2_{1,\alpha}}{n}}\,\sigma\right]$$

is a $1-\alpha$ confidence interval for $\mu$.

(b) The asymptotic likelihood ratio test claims that

$$-2\log\lambda(X_1, \dots, X_n) \rightsquigarrow \chi^2_1, \quad \text{since } df = \dim(\Theta) - \dim(\Theta_0) = 1 - 0 = 1,$$

which, by the calculations from part (a), implies that $\left(\frac{\bar X - \mu_0}{\sigma/\sqrt n}\right)^2 \rightsquigarrow \chi^2_1$ and

$$P_{\mu_0}\left(\left(\frac{\bar X - \mu_0}{\sigma/\sqrt n}\right)^2 \geq \chi^2_{1,\alpha}\right) \to \alpha.$$

The confidence interval for $\mu$ obtained by inverting the test is the same as in part (a):

$$C = \left[\bar X - \sqrt{\frac{\chi^2_{1,\alpha}}{n}}\,\sigma,\; \bar X + \sqrt{\frac{\chi^2_{1,\alpha}}{n}}\,\sigma\right].$$

(c) Now

$$\lambda(X_1, \dots, X_n) = \frac{\sup_{(\mu,\sigma) \in \Theta_0} L(\mu, \sigma^2)}{\sup_{(\mu,\sigma) \in \Theta} L(\mu, \sigma^2)} = \frac{L(\mu_0, \hat\sigma_0^2)}{L(\hat\mu, \hat\sigma^2)},$$

where $\hat\mu = \hat\mu_{MLE} = \bar X$, $\hat\sigma^2 = \hat\sigma^2_{MLE} = \frac{1}{n}\sum_i (X_i - \bar X)^2$, and $\hat\sigma_0^2 = \frac{1}{n}\sum_i (X_i - \mu_0)^2$. Then

$$
\begin{aligned}
\lambda(X_1, \dots, X_n)
&= \left(\frac{\hat\sigma}{\hat\sigma_0}\right)^n \exp\left(-\frac{1}{2\hat\sigma_0^2}\sum_i (X_i - \mu_0)^2 + \frac{1}{2\hat\sigma^2}\sum_i (X_i - \bar X)^2\right) \\
&= \left(\frac{\hat\sigma}{\hat\sigma_0}\right)^n \exp\left(-\frac{n}{2} + \frac{n}{2}\right) \\
&= \left(\frac{\hat\sigma^2}{\hat\sigma_0^2}\right)^{n/2}.
\end{aligned}
$$
Hence, we reject at level $\alpha$ if

$$-2\log\lambda(X_1, \dots, X_n) > \chi^2_{1,\alpha} \iff -n\log\left(\sum_i (X_i - \bar X)^2\right) + n\log\left(\sum_i (X_i - \mu_0)^2\right) > \chi^2_{1,\alpha}$$

(again with one degree of freedom, since $df = \dim(\Theta) - \dim(\Theta_0) = 2 - 1 = 1$), and thus a $1-\alpha$ confidence set for $\mu$ is

$$C = \left\{\mu : -n\log\left(\sum_i (X_i - \bar X)^2\right) + n\log\left(\sum_i (X_i - \mu)^2\right) \leq \chi^2_{1,\alpha}\right\}$$

by inverting the test.
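As a quick numerical sanity check (not part of the original assignment; it assumes NumPy/SciPy are available), we can verify that $\chi^2_{1,\alpha} = z_{\alpha/2}^2$, so the interval from parts (a) and (b) is exactly the familiar z-interval, and compute it on simulated data:

```python
import numpy as np
from scipy.stats import chi2, norm

# The chi-squared(1) upper-alpha quantile equals the squared
# standard-normal upper-(alpha/2) quantile, so the inverted LRT
# interval coincides with the usual z-interval.
alpha = 0.05
assert np.isclose(chi2.ppf(1 - alpha, df=1), norm.ppf(1 - alpha / 2) ** 2)

def lrt_interval(x, sigma, alpha=0.05):
    """1 - alpha interval for mu from inverting the exact LRT (sigma known)."""
    n = len(x)
    half_width = sigma * np.sqrt(chi2.ppf(1 - alpha, df=1) / n)
    xbar = float(np.mean(x))
    return xbar - half_width, xbar + half_width

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=100)  # illustrative data, known sigma = 2
lo, hi = lrt_interval(x, sigma=2.0)
print(lo, hi)
```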
Problem 2. Let $X_1, \dots, X_n \sim \text{Poisson}(\lambda)$.

(a) By inverting the LRT, construct an approximate $1-\alpha$ confidence set for $\lambda$.

(b) Let $\lambda = 10$. Fix a value of $n$. Simulate $n$ observations from the Poisson($\lambda$) distribution. Construct your confidence set from part (a) and see if it includes the true value. Repeat this experiment 1000 times to estimate the coverage probability. Plot the coverage as a function of $n$. Include your code as an appendix to the assignment.

Solution 2. We derive each of the results below:

(a) We first derive the MLE. For the Poisson distribution we have $P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}$, so

$$L(\lambda \mid x_1, x_2, \dots, x_n) = \prod_{i=1}^n \frac{e^{-\lambda}\lambda^{x_i}}{x_i!}$$

$$\implies \ell(\lambda \mid x_1, x_2, \dots, x_n) = -n\lambda + \sum_{i=1}^n x_i \log(\lambda) - \sum_{i=1}^n \log(x_i!)$$

$$\implies \frac{\partial \ell(\lambda \mid x_1, x_2, \dots, x_n)}{\partial \lambda} = -n + \frac{\sum_i x_i}{\lambda}.$$

Setting this to 0 we have that $\lambda = \frac{1}{n}\sum_i x_i = \bar x$. Differentiating the log-likelihood again, we have that

$$\frac{\partial^2 \ell(\lambda \mid x_1, x_2, \dots, x_n)}{\partial \lambda^2} = -\frac{\sum_i x_i}{\lambda^2} < 0,$$

so $\hat\lambda_{MLE} = \bar X$ is indeed the MLE.

Now we have that $df = \dim(\Theta) - \dim(\Theta_0) = 1 - 0 = 1$. So we do not reject with the LRT (controlling Type I error at $\alpha$) if

$$
\begin{aligned}
&-2\log\left(\frac{L(\lambda_0)}{L(\hat\lambda_{MLE})}\right) \leq \chi^2_{1,\alpha} \\
\iff\ & -2\log\left[\left(\frac{\lambda_0}{\hat\lambda_{MLE}}\right)^{n\hat\lambda_{MLE}} \exp\left(-n(\lambda_0 - \hat\lambda_{MLE})\right)\right] \leq \chi^2_{1,\alpha} \\
\iff\ & 2n\left[(\lambda_0 - \hat\lambda_{MLE}) - \hat\lambda_{MLE}\log\left(\frac{\lambda_0}{\hat\lambda_{MLE}}\right)\right] \leq \chi^2_{1,\alpha} \\
\iff\ & \lambda_0 - \hat\lambda_{MLE}\log(\lambda_0) \leq \hat\lambda_{MLE} - \hat\lambda_{MLE}\log(\hat\lambda_{MLE}) + \frac{\chi^2_{1,\alpha}}{2n},
\end{aligned}
$$

using $\sum_i X_i = n\hat\lambda_{MLE}$ in the likelihood ratio (the factorial terms cancel).
We can invert this expression in $\lambda_0$ to give us the $1-\alpha$ confidence set for $\lambda$ from the LRT as follows:

$$C = \left\{\lambda : \lambda - \hat\lambda_{MLE}\log(\lambda) \leq \hat\lambda_{MLE} - \hat\lambda_{MLE}\log(\hat\lambda_{MLE}) + \frac{\chi^2_{1,\alpha}}{2n}\right\}.$$

(b) We can use R to perform the simulation as follows:

```r
# Clean up
rm(list = ls())
cat("\014")

# Libraries
# install.packages("tidyverse")
library(tidyverse)

# Set seed for reproducibility
base::set.seed(7456445)

# Setup parameters
lambd_0 <- 10
num_exp_repl <- 1000
alph <- 0.05
num_samps_low <- 10
num_samps_high <- 5000
num_samps_by <- 500

# Helper functions
# Check that our true lambda_0 satisfies our coverage CI at the alpha level
# for the specified number of simulated samples
poiss_lrt_cov <- function(lambd_0, lambd_MLE, alph, n) {
  return(lambd_0 - lambd_MLE * log(lambd_0) <=
           lambd_MLE - lambd_MLE * log(lambd_MLE) +
           stats::qchisq(p = 1 - alph, df = 1) / (2 * n))
}

# Single experiment
single_poiss_expt <- function(n, lambd_0, alph) {
  draw_poiss <- stats::rpois(n = n, lambda = lambd_0)
  lambd_MLE <- base::mean(draw_poiss)
  conf_set_cov <- poiss_lrt_cov(lambd_0 = lambd_0, lambd_MLE = lambd_MLE,
                                alph = alph, n = n)
  return(conf_set_cov)
}

# Create sequence for n i.e. number of poisson samples to draw for
# each replication
num_samps <- base::seq.int(from = num_samps_low, to = num_samps_high,
                           by = num_samps_by)

# Run the experiments; for each n, we replicate the experiment
# (num_exp_repl) 1000 times
out_exp <- purrr::map(.x = num_samps,
                      ~ replicate(n = num_exp_repl,
                                  expr = single_poiss_expt(n = .x,
                                                           lambd_0 = lambd_0,
                                                           alph = alph)))

# For each n, we measure coverage probability as a mean of all
# times coverage was satisfied in the replications
out_exp_covg <- purrr::map_dbl(.x = out_exp, mean)

# We plot the coverage probability as a function of n
covg_df <- tibble::tibble(n = num_samps, covg_prob = out_exp_covg)
covg_plot <- covg_df %>%
  ggplot2::ggplot(data = ., aes(x = n, y = covg_prob)) +
  ggplot2::geom_point() +
  ggplot2::geom_line() +
  # Add the 1 - alpha line
  ggplot2::geom_hline(yintercept = 1 - alph, linetype = "dashed",
                      color = "blue", size = 1) +
  ggplot2::ylim(0.6, 1) +
  ggplot2::labs(title = "Coverage probability of lambda (= 10) vs n (1000 replications)",
                x = "Number of samples n",
                y = "Coverage probability")
covg_plot
```

We observe that as $n$ increases the coverage probability becomes more stable around $1 - \alpha = 0.95$ for $\alpha = 0.05$.
Plot: coverage probability of $\lambda_0 = 10$ vs $n$ (1000 replications).
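As an additional check (not part of the original assignment; it assumes NumPy/SciPy and is separate from the R simulation above), the confidence set from part (a) can be computed numerically for a single sample. Since $g(\lambda) = \lambda - \hat\lambda_{MLE}\log\lambda$ is convex with its minimum at $\lambda = \hat\lambda_{MLE}$, the set is an interval whose endpoints can be found by root-finding on each side of the MLE:

```python
import numpy as np
from scipy.stats import chi2
from scipy.optimize import brentq

def poisson_lrt_set(x, alpha=0.05):
    """Endpoints of {lam : lam - mle*log(lam) <= mle - mle*log(mle) + chi2_{1,alpha}/(2n)}.

    g(lam) = lam - mle*log(lam) is convex, minimized at lam = mle, so the
    set is an interval; locate where g crosses the cutoff on each side.
    """
    n = len(x)
    mle = float(np.mean(x))
    cut = mle - mle * np.log(mle) + chi2.ppf(1 - alpha, df=1) / (2 * n)
    g = lambda lam: lam - mle * np.log(lam) - cut
    lo = brentq(g, 1e-12, mle)           # crossing below the MLE
    hi = brentq(g, mle, 10 * mle + 100)  # crossing above the MLE
    return lo, hi

rng = np.random.default_rng(1)
x = rng.poisson(lam=10, size=200)  # one simulated sample with lambda_0 = 10
lo, hi = poisson_lrt_set(x)
print(lo, hi)
```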
Problem 3. Suppose we are given independent p-values $P_1, \dots, P_N$.

(a) Find the distribution of $\min_i P_i$ when all the null hypotheses are true.

(b) Suppose we reject all null hypotheses such that $P_i < t$. Find the probability of at least one false rejection when all the null hypotheses are true. Find $t$ that makes this probability exactly $\alpha$. How does this compare to the Bonferroni rule?

Solution 3. We derive each of the results below:

(a) We claim that $F_{\min_{i \in [N]} P_i}(\gamma) = 1 - [1-\gamma]^N$. We note from Homework 6, problem 3(a), that when the null hypothesis is true the p-value is distributed as $\text{Unif}[0,1]$. Using this fact, under the global null all $P_i \sim \text{Unif}[0,1]$ (independent and identically distributed). Now let $P_{(1)} := \min_{i \in [N]} P_i$.

Proof.

$$
\begin{aligned}
F_{P_{(1)}}(\gamma) &= P\left(P_{(1)} \leq \gamma\right) \\
&= 1 - P\left(P_{(1)} > \gamma\right) && \text{(using complementary events)} \\
&= 1 - P\left(\bigcap_{i \in [N]} \{P_i > \gamma\}\right) && \text{(the min exceeds } \gamma \text{ iff all } P_i \text{ do)} \\
&= 1 - \prod_{i \in [N]} P(P_i > \gamma) && \text{(using independence of the } P_i\text{'s)} \\
&= 1 - \prod_{i \in [N]} \left[1 - P(P_i \leq \gamma)\right] && \text{(using complementary events)} \\
&= 1 - \prod_{i \in [N]} \left[1 - F_{P_i}(\gamma)\right] \\
&= 1 - \left[1 - F_{P_1}(\gamma)\right]^N && \text{(the } P_i\text{'s are identically distributed)} \\
&= 1 - (1-\gamma)^N && \text{(since } P_i \sim \text{Unif}[0,1] \ \forall i \in [N]\text{)}
\end{aligned}
$$

(b) We claim that $t = 1 - (1-\alpha)^{1/N}$. Suppose we reject all null hypotheses such that $P_i < t$. Let $I = \{i : H_{0,i} \text{ is true}\}$. Then, given the independence of the tests, we proceed as follows:
Proof.

$$
\begin{aligned}
P(\text{making at least one false rejection}) &= P(P_i < t \text{ for some } i \in I) \\
&= P\left(\bigcup_{i \in I} \{P_i < t\}\right) \\
&= 1 - P\left(\bigcap_{i \in I} \{P_i \geq t\}\right) && \text{(using complementary events)} \\
&= 1 - P\left(\min_{i \in I} P_i \geq t\right) && \text{(re-expressing using the setup from part (a))} \\
&= P\left(\min_{i \in I} P_i < t\right) && \text{(using complementary events)} \\
&= 1 - (1-t)^{|I|} && \text{(using the CDF from part (a))} \\
&= 1 - (1-t)^{N} && \text{(all nulls are true, so } I = [N]\text{)}
\end{aligned}
$$

(in general $|I| \leq N$ and $t \in (0,1)$, so this probability is at most $1 - (1-t)^N$). Setting $1 - (1-t)^N = \alpha$, we have that $t = 1 - (1-\alpha)^{1/N}$.

Comment. We note in general, by a Taylor expansion (equivalently Bernoulli's inequality), that $(1-\alpha)^{1/N} \leq 1 - \alpha/N$, so $1 - (1-\alpha)^{1/N} \geq \alpha/N$; as such, assuming the tests are independent, the Bonferroni correction is more conservative than the Šidák correction proposed here. If the tests are not independent then we cannot use this correction with guarantees, and thus it is not directly comparable to the Bonferroni correction.
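The comparison above can be checked by simulation (a sketch, not part of the original assignment; it assumes NumPy). Under the global null the Šidák threshold gives familywise error exactly $\alpha$, while the smaller Bonferroni threshold $\alpha/N$ is slightly conservative:

```python
import numpy as np

# Compare the Sidak threshold t = 1 - (1 - alpha)^(1/N) with Bonferroni alpha/N
N, alpha = 20, 0.05
t_sidak = 1 - (1 - alpha) ** (1 / N)
t_bonf = alpha / N
assert t_sidak > t_bonf  # Sidak rejects on a larger region

# Monte Carlo familywise error rate under the global null:
# all N p-values are iid Uniform(0,1)
rng = np.random.default_rng(2)
p = rng.uniform(size=(200_000, N))
fwer_sidak = float(np.mean(p.min(axis=1) < t_sidak))  # should be near alpha
fwer_bonf = float(np.mean(p.min(axis=1) < t_bonf))    # slightly below alpha
print(fwer_sidak, fwer_bonf)
```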
Problem 4. Suppose we observe iid p-values $P_1, \dots, P_N$. Suppose that the distribution of $P_i$ is $\pi U + (1-\pi)G$, where $0 \leq \pi \leq 1$, $U$ denotes a Uniform(0,1) distribution, and $G$ is some other distribution on (0,1). In other words, there is a probability $\pi$ that $H_0$ is true for each p-value.

(a) Suppose $\pi$ is known. Suppose we use the rejection threshold defined by the largest $i$ such that $P_{(i)} < i\alpha/(N\pi)$. Show that this controls the false discovery rate at level $\alpha$. (Hint: use the proof in Lecture Notes 8.) Explain why this rule has higher power than the Benjamini-Hochberg threshold.

(b) In practice $\pi$ is not known. But we can estimate it as follows. When the null is false, we expect the p-value to be near 0. To capture this idea, suppose that $G$ is a distribution that puts all of its probability less than 1/2. In other words, $P(P_i < 1/2) = 1$. Let $\hat\pi = (1/N) \sum_i I(P_i > 1/2)$. Show that $2\hat\pi$ is a consistent estimator of $\pi$.

Solution 4. (a) Following the same logic as the proof in Lecture 8, let $F$ be the distribution of the $P_i$, and let $\hat F(t) = \frac{1}{N}\sum_i I(P_i \leq t)$ denote the empirical CDF. For a rejection threshold $t$, the true nulls (the set $I$, with $|I| \approx N\pi$) have uniform p-values, so

$$E[\text{FDP}] \approx \frac{\sum_{i \in I} E[I(P_i \leq t)]}{N \hat F(t)} = \frac{|I|\, t}{N \hat F(t)} \leq \frac{\pi t}{\hat F(t)}.$$

Let $t = P_{(i)} < \frac{i\alpha}{N\pi}$; then $\hat F(t) = i/N$, and

$$E[\text{FDP}] \leq \pi \cdot \frac{i\alpha/(N\pi)}{i/N} = \alpha.$$

This rule has higher power than the Benjamini-Hochberg threshold because $\frac{i\alpha}{N\pi} \geq \frac{i\alpha}{N}$ (as $\pi \leq 1$), and thus

$$\max\left\{j : P_{(j)} < \frac{j\alpha}{N\pi}\right\} \geq \max\left\{j : P_{(j)} < \frac{j\alpha}{N}\right\}.$$

This test has a bigger rejection region.

(b) First, notice that

$$E[I(P_i > 1/2)] = P(P_i > 1/2) = 1 - P(P_i \leq 1/2) = 1 - \left[\pi U(1/2) + (1-\pi)G(1/2)\right] = 1 - \frac{\pi}{2} - (1 - \pi) = \frac{\pi}{2},$$

since $G(1/2) = 1$.
Then for all $\varepsilon > 0$, by Chebyshev's inequality,

$$
\begin{aligned}
P\left(|2\hat\pi - \pi| > \varepsilon\right)
&= P\left(\left|\frac{1}{N}\sum_{i=1}^N I(P_i > 1/2) - \frac{\pi}{2}\right| > \frac{\varepsilon}{2}\right) \\
&\leq \frac{\text{Var}\left(\frac{1}{N}\sum_{i=1}^N I(P_i > 1/2)\right)}{\varepsilon^2/4} \\
&= \frac{\frac{1}{N}\left(E\left[I(P_i > 1/2)^2\right] - E\left[I(P_i > 1/2)\right]^2\right)}{\varepsilon^2/4} \\
&= \frac{\frac{1}{N}\left(\frac{\pi}{2} - \frac{\pi^2}{4}\right)}{\varepsilon^2/4} \to 0
\end{aligned}
$$

as $N \to \infty$, so $2\hat\pi$ is a consistent estimator of $\pi$.
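This consistency argument can be illustrated by simulation (a sketch, not part of the original assignment; the choices $\pi = 0.7$ and $G = \text{Unif}(0, 1/2)$ are illustrative, with $G$ putting all of its mass below $1/2$ as the problem assumes):

```python
import numpy as np

rng = np.random.default_rng(3)
pi0, N = 0.7, 100_000

# Each p-value is null (Uniform(0,1)) with probability pi0,
# otherwise drawn from G = Uniform(0, 0.5), which has G(1/2) = 1.
is_null = rng.uniform(size=N) < pi0
p = np.where(is_null, rng.uniform(size=N), rng.uniform(0.0, 0.5, size=N))

# 2 * pi_hat = (2/N) * sum I(P_i > 1/2) should concentrate around pi0
pi_hat = 2 * float(np.mean(p > 0.5))
print(pi_hat)  # close to pi0 = 0.7
```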
Problem 5. In this question, we explore the frequentist properties of the Bayesian posterior. Let $X \sim N(\mu, 1)$. Let $\mu \sim N(0, 1)$ be the prior for $\mu$.

(a) Find the posterior $p(\mu \mid X)$.

(b) Find $a(X)$ and $b(X)$ such that $\int_{a(X)}^{b(X)} p(\mu \mid X)\, d\mu = 0.95$. In other words, $C = [a(X), b(X)]$ is the 95 percent Bayesian posterior interval.

(c) Now compute the frequentist coverage of $C$ as a function of $\mu$. In other words, fix the true value of the parameter $\mu$. Now, treating $\mu$ as fixed and $X \sim N(\mu, 1)$ as random, compute $P_\mu(a(X) < \mu < b(X))$. Plot this probability as a function of $\mu$. Is the coverage probability equal to 0.95?

Solution 5. We note the solutions for each part as follows:

(a) See Lec Notes 7, example 2: the posterior distribution of $\mu$ is

$$\mu \mid X \sim N\left(\frac{X}{2}, \frac{1}{2}\right).$$

(b) With $\mu \mid X \sim N\left(\frac{X}{2}, \frac{1}{2}\right)$, we want $P[a(X) \leq \mu \leq b(X) \mid X] = 0.95$, i.e.

$$P\left[\frac{a(X) - X/2}{1/\sqrt 2} \leq \frac{\mu - X/2}{1/\sqrt 2} \leq \frac{b(X) - X/2}{1/\sqrt 2}\right] = 0.95.$$

Thus

$$a(X) = \frac{X}{2} - \frac{1}{\sqrt 2} z_{0.025}, \qquad b(X) = \frac{X}{2} + \frac{1}{\sqrt 2} z_{0.025}.$$

(c)

$$
\begin{aligned}
P_\mu &= P\left[\frac{X}{2} - \frac{1}{\sqrt 2} z_{0.025} < \mu < \frac{X}{2} + \frac{1}{\sqrt 2} z_{0.025}\right] \\
&= P\left[\mu - \sqrt 2\, z_{0.025} < X - \mu < \mu + \sqrt 2\, z_{0.025}\right] \\
&= \Phi\left(\mu + \sqrt 2\, z_{0.025}\right) - \Phi\left(\mu - \sqrt 2\, z_{0.025}\right).
\end{aligned}
$$

The coverage probability is not 0.95. See the figure below.
Plot: frequentist coverage probability $P_\mu$ as a function of $\mu$.

Code in Python:

```python
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt

def p_mu(mu):
    upper = mu + 2 ** 0.5 * norm.ppf(0.975, 0, 1)
    lower = mu - 2 ** 0.5 * norm.ppf(0.975, 0, 1)
    return norm.cdf(upper) - norm.cdf(lower)

mu_lst = np.arange(-2.0, 2.0, 0.1)
p_lst = []
for mu in mu_lst:
    p_lst.append(p_mu(mu))

fig, ax = plt.subplots(1, 1)
ax.plot(mu_lst, p_lst)
ax.set_xlabel("mu")
ax.set_ylabel("P_mu")
plt.show()
```
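Evaluating the coverage formula from part (c) at a couple of values of $\mu$ makes the conclusion concrete (a quick check, not part of the original assignment; it assumes SciPy): the posterior interval over-covers near $\mu = 0$, where the prior is centered, and under-covers once $|\mu|$ is large.

```python
from scipy.stats import norm

# Coverage formula from part (c): Phi(mu + sqrt(2) z) - Phi(mu - sqrt(2) z)
def coverage(mu):
    z = norm.ppf(0.975)
    return norm.cdf(mu + 2 ** 0.5 * z) - norm.cdf(mu - 2 ** 0.5 * z)

print(round(coverage(0.0), 3))  # about 0.994: over-covers at mu = 0
print(round(coverage(2.0), 3))  # about 0.780: under-covers for large |mu|
```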