STAT-36700 Homework 7 - Solutions


Fall 2018, October 28, 2018

This contains solutions for Homework 7. Please note that we have included several additional comments and approaches to the problems to give you better insight.

Problem 1. Let $X_1, \dots, X_n \sim N(\mu, \sigma^2)$.

(a) Assume that $\sigma^2$ is known. Invert the likelihood ratio test to construct an exact $1 - \alpha$ confidence interval for $\mu$.

(b) Again, assume that $\sigma^2$ is known. Invert the asymptotic likelihood ratio test to construct an approximate $1 - \alpha$ confidence interval for $\mu$.

(c) Now assume that $\sigma^2$ is unknown. Invert the asymptotic likelihood ratio test to construct an approximate $1 - \alpha$ confidence set for $(\mu, \sigma)$.

Solution 1. We derive each of the results below:

(a) For each $\mu_0$, we can construct an $\alpha$-level test of $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$ as follows:

$$\lambda(X_1, \dots, X_n) = \frac{\sup_{\mu \in \Theta_0} L(\mu)}{\sup_{\mu \in \Theta} L(\mu)} = \frac{L(\mu_0)}{L(\hat{\mu})} = \frac{\exp\left(-\sum_i (X_i - \mu_0)^2 / (2\sigma^2)\right)}{\exp\left(-\sum_i (X_i - \bar{X})^2 / (2\sigma^2)\right)},$$

where $\hat{\mu} = \hat{\mu}_{\mathrm{MLE}} = \bar{X}$. Writing $X_i - \mu_0 = (X_i - \bar{X}) + (\bar{X} - \mu_0)$ and expanding the square, the cross term $2\sum_i (X_i - \bar{X})(\bar{X} - \mu_0)$ vanishes since $\sum_i (X_i - \bar{X}) = 0$, so

$$\lambda(X_1, \dots, X_n) = \exp\left(-n(\bar{X} - \mu_0)^2 / (2\sigma^2)\right).$$

Hence, we reject $H_0$ if $\lambda(X_1, \dots, X_n) \leq c$, equivalently if $\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right)^2 \geq c'$ for some $c'$. This is an $\alpha$-level test if

$$P_{\mu_0}\left(\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right)^2 \geq c'\right) = \alpha,$$

which implies that $c' = \chi^2_{1,\alpha}$, since $\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right)^2 \sim \chi^2_1$ under $H_0$. Then, by inverting the test, we have that

$$C = \left\{\mu : \left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)^2 < \chi^2_{1,\alpha}\right\} = \left[\bar{X} - \sqrt{\chi^2_{1,\alpha}}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + \sqrt{\chi^2_{1,\alpha}}\,\frac{\sigma}{\sqrt{n}}\right]$$

is a $1 - \alpha$ confidence interval for $\mu$.

(b) The asymptotic likelihood ratio test claims that

$$-2 \log \lambda(X_1, \dots, X_n) \rightsquigarrow \chi^2_1, \quad \text{since } df = \dim(\Theta) - \dim(\Theta_0) = 1 - 0 = 1,$$

which implies, by the calculations from part (a), that $\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right)^2 \rightsquigarrow \chi^2_1$ and

$$P_{\mu_0}\left(\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right)^2 \geq \chi^2_{1,\alpha}\right) \to \alpha,$$

and the confidence interval for $\mu$, obtained by inverting the test, is the same as in part (a):

$$C = \left[\bar{X} - \sqrt{\chi^2_{1,\alpha}}\,\frac{\sigma}{\sqrt{n}},\ \bar{X} + \sqrt{\chi^2_{1,\alpha}}\,\frac{\sigma}{\sqrt{n}}\right].$$

(c) Now

$$\lambda(X_1, \dots, X_n) = \frac{\sup_{(\mu, \sigma) \in \Theta_0} L(\mu, \sigma^2)}{\sup_{(\mu, \sigma) \in \Theta} L(\mu, \sigma^2)} = \frac{L(\mu_0, \hat{\sigma}_0^2)}{L(\hat{\mu}, \hat{\sigma}^2)},$$

where $\hat{\mu} = \hat{\mu}_{\mathrm{MLE}} = \bar{X}$, $\hat{\sigma}^2 = \hat{\sigma}^2_{\mathrm{MLE}} = \frac{1}{n}\sum_i (X_i - \bar{X})^2$, and $\hat{\sigma}_0^2 = \frac{1}{n}\sum_i (X_i - \mu_0)^2$. Then

$$\lambda(X_1, \dots, X_n) = \left(\frac{\hat{\sigma}}{\hat{\sigma}_0}\right)^n \exp\left(-\frac{\sum_i (X_i - \mu_0)^2}{2\hat{\sigma}_0^2} + \frac{\sum_i (X_i - \bar{X})^2}{2\hat{\sigma}^2}\right) = \left(\frac{\hat{\sigma}}{\hat{\sigma}_0}\right)^n \exp\left(-\frac{n}{2} + \frac{n}{2}\right) = \left(\frac{\hat{\sigma}^2}{\hat{\sigma}_0^2}\right)^{n/2} = \left(\frac{\sum_i (X_i - \bar{X})^2}{\sum_i (X_i - \mu_0)^2}\right)^{n/2}.$$

Hence, we reject at level $\alpha$ if

$$-2 \log \lambda(X_1, \dots, X_n) > \chi^2_{1,\alpha} \iff -n \log\left(\sum_i (X_i - \bar{X})^2\right) + n \log\left(\sum_i (X_i - \mu_0)^2\right) > \chi^2_{1,\alpha}$$

(again with one degree of freedom, since $df = \dim(\Theta) - \dim(\Theta_0) = 2 - 1 = 1$), and thus the approximate $1 - \alpha$ confidence set for $\mu$, obtained by inverting the test, is

$$C = \left\{\mu : -n \log\left(\sum_i (X_i - \bar{X})^2\right) + n \log\left(\sum_i (X_i - \mu)^2\right) \leq \chi^2_{1,\alpha}\right\}.$$
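As a quick numerical sanity check (not part of the original solution), the exact interval from part (a) coincides with the familiar z-interval, since $\sqrt{\chi^2_{1,\alpha}} = z_{\alpha/2}$. A minimal Python sketch, assuming scipy is available:

```python
import math
from scipy.stats import chi2, norm

alpha = 0.05

# Upper-alpha chi-square quantile with 1 df, and the two-sided normal quantile
chi2_crit = chi2.ppf(1 - alpha, df=1)  # chi^2_{1, alpha}
z_crit = norm.ppf(1 - alpha / 2)       # z_{alpha/2}

# sqrt(chi^2_{1,alpha}) equals z_{alpha/2}, so the inverted-LRT interval
# [Xbar +/- sqrt(chi2_crit) * sigma / sqrt(n)] is the usual z-interval
print(math.sqrt(chi2_crit), z_crit)  # both approximately 1.96
```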

Problem 2. Let $X_1, \dots, X_n \sim \mathrm{Poisson}(\lambda)$.

(a) By inverting the LRT, construct an approximate $1 - \alpha$ confidence set for $\lambda$.

(b) Let $\lambda = 10$. Fix a value of $n$. Simulate $n$ observations from the $\mathrm{Poisson}(\lambda)$ distribution. Construct your confidence set from part (a) and see if it includes the true value. Repeat this experiment 1000 times to estimate the coverage probability. Plot the coverage as a function of $n$. Include your code as an appendix to the assignment.

Solution 2. We derive each of the results below:

(a) We first derive the MLE of $\lambda$.

Proof. For the Poisson distribution we have $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, so

$$L(\lambda \mid x_1, x_2, \dots, x_n) = \prod_{i=1}^n \frac{e^{-\lambda} \lambda^{x_i}}{x_i!}$$

$$\implies \ell(\lambda \mid x_1, x_2, \dots, x_n) = -n\lambda + \sum_{i=1}^n x_i \log(\lambda) - \sum_{i=1}^n \log(x_i!)$$

$$\implies \frac{\partial \ell}{\partial \lambda} = -n + \frac{\sum_{i=1}^n x_i}{\lambda}.$$

Setting this to 0, we have $\lambda = \bar{x}$. Differentiating the log-likelihood again, we have that

$$\frac{\partial^2 \ell}{\partial \lambda^2} = -\frac{\sum_{i=1}^n x_i}{\lambda^2} < 0,$$

so $\hat{\lambda}_{\mathrm{MLE}} = \bar{X}$ is indeed the MLE.

Now we have that $df = \dim(\Theta) - \dim(\Theta_0) = 1 - 0 = 1$, so we do not reject the LRT (controlling the Type I error at $\alpha$) if

$$-2 \log\left(\frac{L(\lambda_0)}{L(\hat{\lambda}_{\mathrm{MLE}})}\right) \leq \chi^2_{1,\alpha}$$

$$\iff -2 \log\left[\left(\frac{\lambda_0}{\hat{\lambda}_{\mathrm{MLE}}}\right)^{n \hat{\lambda}_{\mathrm{MLE}}} \exp\left(-n(\lambda_0 - \hat{\lambda}_{\mathrm{MLE}})\right)\right] \leq \chi^2_{1,\alpha}$$

$$\iff 2n\left[(\lambda_0 - \hat{\lambda}_{\mathrm{MLE}}) - \hat{\lambda}_{\mathrm{MLE}} \log\left(\frac{\lambda_0}{\hat{\lambda}_{\mathrm{MLE}}}\right)\right] \leq \chi^2_{1,\alpha}$$

$$\iff \lambda_0 - \hat{\lambda}_{\mathrm{MLE}} \log(\lambda_0) \leq \hat{\lambda}_{\mathrm{MLE}} - \hat{\lambda}_{\mathrm{MLE}} \log(\hat{\lambda}_{\mathrm{MLE}}) + \frac{\chi^2_{1,\alpha}}{2n}.$$

We can invert this expression to give the $1 - \alpha$ confidence set for $\lambda$ from the LRT as follows:

$$C = \left\{\lambda : \lambda - \hat{\lambda}_{\mathrm{MLE}} \log(\lambda) \leq \hat{\lambda}_{\mathrm{MLE}} - \hat{\lambda}_{\mathrm{MLE}} \log(\hat{\lambda}_{\mathrm{MLE}}) + \frac{\chi^2_{1,\alpha}}{2n}\right\}.$$
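The set in part (a) has no closed form, but membership is easy to check numerically. A minimal Python sketch (an illustration, not part of the original solution; the simulated sample, seed, and grid range are hypothetical choices):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
n, lam_true = 200, 10.0
x = rng.poisson(lam_true, size=n)
lam_mle = x.mean()

def in_conf_set(lam, alpha=0.05):
    # lambda - mle*log(lambda) <= mle - mle*log(mle) + chi2_{1,alpha} / (2n)
    rhs = lam_mle - lam_mle * np.log(lam_mle) + chi2.ppf(1 - alpha, df=1) / (2 * n)
    return lam - lam_mle * np.log(lam) <= rhs

# The MLE itself always lies in the set, and the set is an interval around it
mask = np.array([in_conf_set(l) for l in np.linspace(8.0, 12.0, 401)])
inside = np.linspace(8.0, 12.0, 401)[mask]
print(inside.min(), inside.max())  # approximate endpoints of the interval
```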

(b) We can use R to perform the simulation as follows:

# Clean up
rm(list = ls())
cat("\014")

# Libraries
# install.packages("tidyverse")
library(tidyverse)

# Set seed for reproducibility
base::set.seed(7456445)

# Setup parameters
lambd_0 <- 10
num_exp_repl <- 1000
alph <- 0.05
num_samps_low <- 10
num_samps_high <- 5000
num_samps_by <- 500

# Helper functions
# Check that our true lambda_0 satisfies our coverage CI at the alpha level
# for the specified number of simulated samples
poiss_lrt_cov <- function(lambd_0, lambd_MLE, alph, n) {
  base::return(lambd_0 - lambd_MLE * log(lambd_0) <=
                 lambd_MLE - lambd_MLE * log(lambd_MLE) +
                 stats::qchisq(p = 1 - alph, df = 1) / (2 * n))
}

# Single experiment
single_poiss_expt <- function(n, lambd_0, alph) {
  draw_poiss <- stats::rpois(n = n, lambda = lambd_0)
  lambd_MLE <- base::mean(draw_poiss)
  conf_set_cov <- poiss_lrt_cov(lambd_0 = lambd_0, lambd_MLE = lambd_MLE,
                                alph = alph, n = n)
  base::return(conf_set_cov)
}

# Create sequence for n, i.e. number of poisson samples to draw for
# each replication
num_samps <- base::seq.int(from = num_samps_low, to = num_samps_high,
                           by = num_samps_by)

# Run the experiments; for each n, we replicate the experiment
# (num_exp_repl) 1000 times
out_exp <- purrr::map(.x = num_samps,
                      ~ replicate(n = num_exp_repl,
                                  expr = single_poiss_expt(n = .x,
                                                           lambd_0 = lambd_0,
                                                           alph = alph)))

# For each n, we measure coverage probability as the mean of all
# times coverage was satisfied in the replications
out_exp_covg <- purrr::map_dbl(.x = out_exp, mean)

# We plot the coverage probability as a function of n
covg_df <- tibble::tibble(n = num_samps, covg_prob = out_exp_covg)
covg_plot <- covg_df %>%
  ggplot2::ggplot(data = ., aes(x = n, y = covg_prob)) +
  ggplot2::geom_point() +
  ggplot2::geom_line() +
  # Add the 1 - alpha line
  ggplot2::geom_hline(yintercept = 1 - alph, linetype = "dashed",
                      color = "blue", size = 1) +
  ggplot2::ylim(0.6, 1) +
  ggplot2::labs(title = "Coverage probability of lambda (= 10) vs n (1000 replications)",
                x = "Number of samples n",
                y = "Coverage probability")
covg_plot

We observe that as $n$ increases, the coverage probability becomes more stable around $1 - \alpha = 0.95$ for $\alpha = 0.05$.

Plot: coverage probability of the LRT confidence set for $\lambda$ as a function of $n$ (figure not reproduced here).

Problem 3. Suppose we are given independent p-values $P_1, \dots, P_N$.

(a) Find the distribution of $\min_i P_i$ when all the null hypotheses are true.

(b) Suppose we reject all null hypotheses such that $P_i < t$. Find the probability of at least one false rejection when all the null hypotheses are true. Find $t$ that makes this probability exactly $\alpha$. How does this compare to the Bonferroni rule?

Solution 3. We derive each of the results below:

(a) We claim that $F_{\min_{i \in [N]} P_i}(\gamma) = 1 - [1 - \gamma]^N$. We note from Homework 6, Problem 3(a), that when the null hypothesis is true, the p-value is distributed as $\mathrm{Unif}[0, 1]$. Using this fact, under the global null all $P_i \sim \mathrm{Unif}[0, 1]$ (independent and identically distributed). Now let $P_{(1)} := \min_{i \in [N]} P_i$.

Proof.

$$F_{P_{(1)}}(\gamma) = P\left(P_{(1)} \leq \gamma\right)$$
$$= 1 - P\left(P_{(1)} > \gamma\right) \quad \text{(using complementary events)}$$
$$= 1 - P\left(\bigcap_{i \in [N]} \{P_i > \gamma\}\right) \quad \text{(the min exceeds } \gamma \text{ iff all } P_i \text{ do)}$$
$$= 1 - \prod_{i \in [N]} P(P_i > \gamma) \quad \text{(using independence of the } P_i\text{'s)}$$
$$= 1 - \prod_{i \in [N]} \left[1 - P(P_i \leq \gamma)\right] \quad \text{(using complementary events)}$$
$$= 1 - \prod_{i \in [N]} \left[1 - F_{P_i}(\gamma)\right]$$
$$= 1 - \left[1 - F_{P_1}(\gamma)\right]^N \quad \text{(since the } P_i\text{'s are identically distributed)}$$
$$= 1 - (1 - \gamma)^N \quad \text{(since } P_i \sim \mathrm{Unif}[0, 1] \text{ for all } i \in [N]\text{)}.$$

(b) We claim that $t = 1 - (1 - \alpha)^{1/N}$. Suppose we reject all null hypotheses such that $P_i < t$. Let $I = \{i : H_{0,i} \text{ is true}\}$. Then, given the independence of the tests, we proceed as follows:

Proof.

$$P(\text{making at least one false rejection}) = P(P_i < t \text{ for some } i \in I)$$
$$= P\left(\bigcup_{i \in I} \{P_i < t\}\right)$$
$$= P\left(\min_{i \in I} P_i < t\right) \quad \text{(re-expressing using the setup from part (a))}$$
$$= 1 - (1 - t)^{|I|} \quad \text{(using the CDF from part (a))}$$
$$\leq 1 - (1 - t)^N \quad \text{(given } |I| \leq N \text{ and } t \in (0, 1)\text{)},$$

with equality when all the null hypotheses are true, so that $|I| = N$. Setting $1 - (1 - t)^N = \alpha$, we have that $t = 1 - (1 - \alpha)^{1/N}$.

Comment. We note in general, by Taylor expansion, that $1 - (1 - \alpha)^{1/N} \geq \alpha/N$, and as such, assuming the tests are independent, the Bonferroni correction is more conservative than the Šidák correction proposed here. If the tests are not independent, then we cannot use this correction with a guarantee, and thus it is not directly comparable to the Bonferroni correction.
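Both claims above are easy to verify by simulation. A minimal Python sketch (illustration only; the values of N and the replication count are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
N, alpha, reps = 20, 0.05, 20000

# Sidak threshold from part (b) vs the Bonferroni threshold
t_sidak = 1 - (1 - alpha) ** (1 / N)
t_bonf = alpha / N

# Under the global null, P(min_i P_i < t_sidak) should be exactly alpha
p = rng.uniform(size=(reps, N))
fwer = (p.min(axis=1) < t_sidak).mean()
print(t_sidak, t_bonf, fwer)  # fwer approximately 0.05
```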

Problem 4. Suppose we observe iid p-values $P_1, \dots, P_N$. Suppose that the distribution of $P_i$ is $\pi U + (1 - \pi) G$, where $0 \leq \pi \leq 1$, $U$ denotes a Uniform(0, 1) distribution, and $G$ is some other distribution on (0, 1). In other words, there is a probability $\pi$ that $H_0$ is true for each p-value.

(a) Suppose $\pi$ is known. Suppose we use the rejection threshold defined by the largest $i$ such that $P_{(i)} < i\alpha/(N\pi)$. Show that this controls the false discovery rate at level $\alpha$. (Hint: use the proof in Lecture Notes 18.) Explain why this rule has higher power than the Benjamini-Hochberg threshold.

(b) In practice $\pi$ is not known. But we can estimate it as follows. When the null is false, we expect the p-value to be near 0. To capture this idea, suppose that $G$ is a distribution that puts all of its probability below 1/2. In other words, $P(P_i < 1/2) = 1$ under $G$. Let $\hat{\pi} = (1/N) \sum_i I(P_i > 1/2)$. Show that $2\hat{\pi}$ is a consistent estimator of $\pi$.

Solution 4. (a) Following the same logic as the proof in Lecture Notes 18, let $F$ be the distribution of $P_i$ and $I$ the set of true nulls. For a threshold $t$, we have

$$E[\mathrm{FDP}] \approx \frac{\frac{1}{N} \sum_{i \in I} E[I(P_i \leq t)]}{\frac{1}{N} \sum_{i=1}^N E[I(P_i \leq t)]} = \frac{t\,|I|/N}{F(t)} \leq \frac{t\pi}{\hat{F}(t)},$$

where in the last step we bound $|I|/N$ by $\pi$ and estimate $F(t)$ by the empirical CDF $\hat{F}(t)$. Let $t = P_{(i)} < \frac{i\alpha}{N\pi}$; then $\hat{F}(t) = i/N$, and

$$E[\mathrm{FDP}] \leq \frac{\frac{i\alpha}{N\pi} \cdot \pi}{i/N} = \alpha.$$

This rule has higher power than the Benjamini-Hochberg threshold because $\frac{i\alpha}{N\pi} \geq \frac{i\alpha}{N}$ (as $\pi \leq 1$), and thus

$$\max\left\{j : P_{(j)} < \frac{j\alpha}{N\pi}\right\} \geq \max\left\{j : P_{(j)} < \frac{j\alpha}{N}\right\}.$$

This test has a bigger rejection region.

(b) First, notice that

$$E[I(P_i > 1/2)] = P(P_i > 1/2) = 1 - P(P_i \leq 1/2)$$
$$= 1 - \left[\pi\, U(1/2) + (1 - \pi)\, G(1/2)\right]$$
$$= 1 - \left[\pi/2 + (1 - \pi)\right] \quad \text{(since } G(1/2) = 1\text{)}$$
$$= \pi/2.$$

Then, for all $\varepsilon > 0$,

$$P\left(\left|2\hat{\pi} - \pi\right| > \varepsilon\right) = P\left(\left|\frac{1}{N} \sum_{i=1}^N I(P_i > 1/2) - \frac{\pi}{2}\right| > \frac{\varepsilon}{2}\right)$$
$$\leq \frac{\mathrm{Var}\left(\frac{1}{N} \sum_{i=1}^N I(P_i > 1/2)\right)}{\varepsilon^2/4} \quad \text{(by Chebyshev's inequality)}$$
$$= \frac{\frac{1}{N}\left(E\left[I(P_i > 1/2)^2\right] - E\left[I(P_i > 1/2)\right]^2\right)}{\varepsilon^2/4}$$
$$= \frac{\frac{1}{N}\left(\frac{\pi}{2} - \frac{\pi^2}{4}\right)}{\varepsilon^2/4} \to 0 \quad \text{as } N \to \infty,$$

so $2\hat{\pi}$ is a consistent estimator of $\pi$.
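The consistency claim in part (b) can also be checked by simulation. A minimal Python sketch, where the particular choice of $G$ (Uniform(0, 1/2), which puts all of its mass below 1/2) and the values of $\pi$ and $N$ are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
N, pi = 200_000, 0.3

# Mixture: with prob pi the p-value is Unif(0,1) (null); otherwise drawn from
# G = Unif(0, 1/2), which places all of its probability below 1/2
is_null = rng.uniform(size=N) < pi
p = np.where(is_null, rng.uniform(size=N), rng.uniform(0, 0.5, size=N))

# pi_hat = (1/N) sum I(P_i > 1/2); then 2 * pi_hat estimates pi
pi_hat = np.mean(p > 0.5)
print(2 * pi_hat)  # approximately 0.3
```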

Problem 5. In this question, we explore the frequentist properties of the Bayesian posterior. Let $X \sim N(\mu, 1)$. Let $\mu \sim N(0, 1)$ be the prior for $\mu$.

(a) Find the posterior $p(\mu \mid X)$.

(b) Find $a(X)$ and $b(X)$ such that $\int_{a(X)}^{b(X)} p(\mu \mid X)\, d\mu = 0.95$. In other words, $C = [a(X), b(X)]$ is the 95 percent Bayesian posterior interval.

(c) Now compute the frequentist coverage of $C$ as a function of $\mu$. In other words, fix the true value of the parameter $\mu$. Now, treating $\mu$ as fixed and $X \sim N(\mu, 1)$ as random, compute $P_\mu(a(X) < \mu < b(X))$. Plot this probability as a function of $\mu$. Is the coverage probability equal to 0.95?

Solution 5. We note the solutions for each part as follows:

(a) See Lecture Notes 17, Example 2: the posterior distribution of $\mu$ is

$$\mu \mid X \sim N\left(\frac{X}{2}, \frac{1}{2}\right).$$

(b) Given $\mu \mid X \sim N\left(\frac{X}{2}, \frac{1}{2}\right)$, we want

$$P[a(X) \leq \mu \leq b(X)] = 0.95 \iff P\left[\frac{a(X) - X/2}{\sqrt{1/2}} \leq \frac{\mu - X/2}{\sqrt{1/2}} \leq \frac{b(X) - X/2}{\sqrt{1/2}}\right] = 0.95.$$

Thus

$$a(X) = \frac{X}{2} - \frac{1}{\sqrt{2}}\, z_{0.025}, \qquad b(X) = \frac{X}{2} + \frac{1}{\sqrt{2}}\, z_{0.025}.$$

(c)

$$P_\mu = P\left[\frac{X}{2} - \frac{1}{\sqrt{2}}\, z_{0.025} < \mu < \frac{X}{2} + \frac{1}{\sqrt{2}}\, z_{0.025}\right]$$
$$= P\left[\mu - \sqrt{2}\, z_{0.025} < X - \mu < \mu + \sqrt{2}\, z_{0.025}\right]$$
$$= \Phi\left(\mu + \sqrt{2}\, z_{0.025}\right) - \Phi\left(\mu - \sqrt{2}\, z_{0.025}\right).$$

The coverage probability is not 0.95. See the figure below.

Plot: frequentist coverage of the 95 percent posterior interval as a function of $\mu$ (figure not reproduced here).

Code in Python:

from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt

def p_mu(mu):
    upper = mu + 2 ** 0.5 * norm.ppf(0.975, 0, 1)
    lower = mu - 2 ** 0.5 * norm.ppf(0.975, 0, 1)
    return norm.cdf(upper) - norm.cdf(lower)

mu_lst = np.arange(-2.0, 2.0, 0.1)
p_lst = []
for mu in mu_lst:
    p_lst.append(p_mu(mu))

fig, ax = plt.subplots(1, 1)
ax.plot(mu_lst, p_lst)
ax.set_xlabel("mu")
ax.set_ylabel("P_mu")
plt.show()