Math 494: Mathematical Statistics
Instructor: Jimin Ding (jmding@wustl.edu)
Department of Mathematics, Washington University in St. Louis
Class materials are available on the course website (www.math.wustl.edu/ jmding/math494/).
Spring 2018
Jimin Ding, Math WUSTL, Math 494, Spring 2018
Optimal Estimations and Tests
Motivation
We have learned how to find an estimator, how to construct a confidence interval (CI), and how to construct a test. But when there are many possible estimators or different ways of testing, how do we evaluate and compare different estimators, different CIs, and different tests? For example:
Both the sample mean and the sample median can be used to estimate the population mean. Which one is better?
In the first exam, for the confidence interval for the intensity of Poisson random variables, someone suggested using $\bar{X} \pm 1.96\,S/\sqrt{n}$, while the posted solution says X 1±1.96/ n. Which one is better?
When the data are normally distributed with known variance $\sigma^2$, both the t-test and the z-test can be used to test the mean. Which one is better?
Outline
Optimal Estimations and Tests
  UMVUE
  UMP Tests
UMVUE
Minimum Variance Unbiased Estimator (MVUE)
If $\hat\theta_1$ and $\hat\theta_2$ are both unbiased estimators of $\theta$, we prefer the one with the smaller variance, since it is more precise and leads to a narrower confidence interval.
Definition. An unbiased estimator $\hat\theta$ of $\theta$ is called a minimum variance unbiased estimator (MVUE) if it has the smallest variance among all unbiased estimators. That is, $\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\tilde\theta)$, where $\tilde\theta$ is any estimator of $\theta$ with $E(\tilde\theta) = \theta$.
Uniform Minimum Variance Unbiased Estimator (UMVUE)
Note that the variance of an estimator may depend on the true parameter $\theta$. For example, let $X_1, \dots, X_n \overset{iid}{\sim} \mathrm{Ber}(\theta)$, $\theta \in (0, 1)$, and consider $\hat\theta = \bar{X}$ as an estimator of $\theta$. Here $\mathrm{Var}(\hat\theta) = \theta(1-\theta)/n$, which depends on $\theta$. Since we usually do not know $\theta$, we want an unbiased estimator that has the smallest variance no matter what the value of $\theta$ is.
Definition. An unbiased estimator $\hat\theta$ of $\theta$ is called a uniform minimum variance unbiased estimator (UMVUE) if it has the smallest variance among all unbiased estimators for all $\theta \in \Theta$. That is, $\mathrm{Var}(\hat\theta) \le \mathrm{Var}(\tilde\theta)$ for all $\theta \in \Theta$, where $\tilde\theta$ is any estimator of $\theta$ with $E(\tilde\theta) = \theta$.
How to Find the MVUE or UMVUE
Theorem (Rao-Blackwell, Theorem 7.3.1). Let $\hat\theta_1$ be an unbiased estimator of $\theta$ and let $T$ be a sufficient statistic for $\theta$. Define $\hat\theta_2 := E(\hat\theta_1 \mid T)$, which is a function of $T$. Then $\hat\theta_2$ is also unbiased for $\theta$ and $\mathrm{Var}(\hat\theta_2) \le \mathrm{Var}(\hat\theta_1)$ for all $\theta \in \Theta$.
Remark: So $\hat\theta_2$ is an improved estimator of $\theta$.
Theorem (Lehmann & Scheffé, Theorem 7.4.1). Let $T$ be a complete and sufficient statistic for $\theta$. If $E(g(T)) = \theta$, then $g(T)$ is the unique UMVUE of $\theta$.
Remark: If an unbiased estimator $\hat\theta$ is a function of a complete and sufficient statistic, then it is the unique UMVUE.
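The variance reduction in the Rao-Blackwell theorem can be seen from the tower property and the law of total variance. The block below is the standard textbook argument written out as a sketch, not a quotation from the course text; it assumes $\hat\theta_1$ has finite variance.

```latex
\begin{align*}
E(\hat\theta_2) &= E\bigl[E(\hat\theta_1 \mid T)\bigr] = E(\hat\theta_1) = \theta, \\
\mathrm{Var}(\hat\theta_1)
  &= E\bigl[\mathrm{Var}(\hat\theta_1 \mid T)\bigr]
   + \mathrm{Var}\bigl(E(\hat\theta_1 \mid T)\bigr) \\
  &= E\bigl[\mathrm{Var}(\hat\theta_1 \mid T)\bigr] + \mathrm{Var}(\hat\theta_2)
   \;\ge\; \mathrm{Var}(\hat\theta_2).
\end{align*}
```

Sufficiency of $T$ is what makes $E(\hat\theta_1 \mid T)$ free of $\theta$, so that $\hat\theta_2$ can actually be computed from the data.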
Examples
1. Ex. 7.5.11: Let $X_1, \dots, X_n \overset{iid}{\sim} \mathrm{Ber}(\theta)$, $\theta \in (0, 1)$.
$T = \sum_{i=1}^{n} X_i$ is complete and sufficient (exponential family).
Use the unbiased estimator $\hat\theta_1 = (X_1 + X_2)/2$ to construct $\hat\theta_2 = E(\hat\theta_1 \mid T) = E(X_1 \mid T) = T/n$.
$E(\hat\theta_2) = \theta$ and $\mathrm{Var}(\hat\theta_2) = \theta(1-\theta)/n < \theta(1-\theta)/2 = \mathrm{Var}(\hat\theta_1)$ for $n > 2$.
$\hat\theta_2$ is the unique UMVUE by the Lehmann & Scheffé theorem.
2. Ex. 7.3.2 & Example 7.4.2: Let $X_1, \dots, X_n \overset{iid}{\sim} U(0, \theta)$, $\theta > 0$.
$T = X_{(n)}$ is sufficient and is the MLE of $\theta$. It is also complete.
Let $\hat\theta_1 = T$. Since $E(\hat\theta_1) = \frac{n}{n+1}\theta < \theta$, $\hat\theta_1$ is biased.
Consider $\hat\theta_2 = \frac{n+1}{n} T$. It is unbiased. By the Lehmann & Scheffé theorem, $\hat\theta_2$ is the unique UMVUE.
Remark: this is a non-regular case, and the family does not belong to the exponential family.
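The Bernoulli example can be checked exactly for a small sample by enumerating all $2^n$ outcomes. The sketch below is an illustration only: the function names and the values $n = 5$, $\theta = 3/10$ are assumptions chosen for the demonstration, not part of the exercise. It computes $E(\hat\theta_1 \mid T = t)$ and the two variances with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def outcome_prob(x, theta):
    """Probability of the 0/1 sequence x under iid Ber(theta)."""
    t = sum(x)
    return theta**t * (1 - theta) ** (len(x) - t)


def conditional_mean_given_t(estimator, n, theta):
    """Return {t: E[estimator(X) | T = t]} by enumerating all 2**n outcomes."""
    num, den = {}, {}
    for x in product((0, 1), repeat=n):
        t, p = sum(x), outcome_prob(x, theta)
        num[t] = num.get(t, 0) + estimator(x) * p   # E[estimator; T = t]
        den[t] = den.get(t, 0) + p                  # P(T = t)
    return {t: num[t] / den[t] for t in den}


def exact_variance(estimator, n, theta):
    """Exact variance of estimator(X) under iid Ber(theta)."""
    mean = second = Fraction(0)
    for x in product((0, 1), repeat=n):
        p, v = outcome_prob(x, theta), estimator(x)
        mean += v * p
        second += v * v * p
    return second - mean**2


def theta_hat_1(x):
    return Fraction(x[0] + x[1], 2)      # (X1 + X2) / 2


def theta_hat_2(x):
    return Fraction(sum(x), len(x))      # T / n


n, theta = 5, Fraction(3, 10)            # illustrative values (assumption)
rao_blackwell = conditional_mean_given_t(theta_hat_1, n, theta)
var_1 = exact_variance(theta_hat_1, n, theta)
var_2 = exact_variance(theta_hat_2, n, theta)
print(rao_blackwell)                     # E[theta_hat_1 | T = t] for each t
print(var_1, var_2)                      # Var(theta_hat_1), Var(theta_hat_2)
```

Rerunning `conditional_mean_given_t` with a different `theta` gives the same table, which is the practical meaning of sufficiency here: the conditional expectation does not involve the unknown parameter.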
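More Examples: Not Efficient (1)
A UMVUE may not always reach the Cramér-Rao (CR) lower bound, in which case it is not efficient.
3. Let $X_1, \dots, X_n \overset{iid}{\sim} f(x; \theta) = \theta e^{-\theta x}$, $x > 0$, $\theta > 0$.
$T = \sum_{i=1}^{n} X_i$ is complete and sufficient for $\theta$ (exponential family).
$T \sim \mathrm{Gamma}(n, 1/\theta)$ since $X_i \sim \mathrm{Gamma}(1, 1/\theta)$.
The MLE is $\hat\theta_1 = 1/\bar{X}$, but $E(\hat\theta_1) = \frac{n}{n-1}\theta$, so it is biased.
Consider $\hat\theta_2 = \frac{n-1}{n}\hat\theta_1 = \frac{n-1}{n} \cdot \frac{1}{\bar{X}} = \frac{n-1}{T}$, a function of $T$ that is unbiased. So $\hat\theta_2$ is the unique UMVUE.
But the variance of this UMVUE does not reach the CR lower bound:
$I(\theta) = \mathrm{Var}\left(\frac{\partial}{\partial\theta} \log f(X; \theta)\right) = \mathrm{Var}\left(\frac{1}{\theta} - X\right) = \theta^{-2}$,
$\mathrm{Var}(\hat\theta_2) = \theta^2/(n-2) > \theta^2/n = (nI(\theta))^{-1}$ for $n > 2$.
So here the UMVUE is not efficient.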
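A quick Monte Carlo check of the exponential example, using only the Python standard library. This is a rough sketch under stated assumptions: the values $\theta = 2$, $n = 10$, the number of replications, and the seed are arbitrary illustrative choices, and the function name `simulate_umvue` is hypothetical.

```python
import random


def simulate_umvue(theta, n, reps, seed=0):
    """Simulate the UMVUE (n - 1) / T for iid exponential data with rate theta.

    Returns the Monte Carlo mean and sample variance of the estimator.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        # random.expovariate takes the rate, so each X_i has density theta * exp(-theta * x)
        t = sum(rng.expovariate(theta) for _ in range(n))   # T = sum of the X_i
        draws.append((n - 1) / t)
    mean = sum(draws) / reps
    var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
    return mean, var


theta, n = 2.0, 10                       # illustrative values (assumption)
mc_mean, mc_var = simulate_umvue(theta, n, reps=50_000)
exact_var = theta**2 / (n - 2)           # variance of the UMVUE from the slide
cr_bound = theta**2 / n                  # Cramér-Rao lower bound (n I(theta))^{-1}
print(f"mean {mc_mean:.3f} vs theta {theta}")
print(f"variance {mc_var:.3f} vs exact {exact_var:.3f} vs CR bound {cr_bound:.3f}")
```

The simulated variance should sit near $\theta^2/(n-2)$ and visibly above $\theta^2/n$; the gap shrinks as $n$ grows.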
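More Examples: Not Efficient (2)
A UMVUE may not always reach the CR lower bound, in which case it is not efficient.
4. Let $X_1, \dots, X_n \overset{iid}{\sim} N(\mu, \sigma^2)$, $\mu \in \mathbb{R}$, $\sigma^2 \in \mathbb{R}^+$.
$T = \left(\sum X_i, \sum X_i^2\right)$ is complete and sufficient for $(\mu, \sigma^2)$.
The MLEs are $\hat\mu = \bar{X}$ and $\hat\sigma_1^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$, both functions of $T$. But $\hat\sigma_1^2$ is biased: $E(\hat\mu) = \mu$, $E(\hat\sigma_1^2) = \frac{n-1}{n}\sigma^2$.
Consider $\hat\sigma_2^2 = \frac{n}{n-1}\hat\sigma_1^2 = S^2$. It is still a function of $T$ and is now unbiased. So $\hat\theta_2 = (\hat\mu, \hat\sigma_2^2)$ is the unique UMVUE.
But the UMVUE is not efficient, since its variance does not reach the CR lower bound:
$\mathrm{Var}(\hat\theta_2) = \begin{pmatrix} \sigma^2/n & 0 \\ 0 & 2\sigma^4/(n-1) \end{pmatrix} \succeq \begin{pmatrix} \sigma^2/n & 0 \\ 0 & 2\sigma^4/n \end{pmatrix} = (nI(\theta))^{-1}$,
with strict inequality in the $\sigma^2$ entry: $2\sigma^4/(n-1) > 2\sigma^4/n$.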
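For reference, the two diagonal entries for $\sigma^2$ can be recovered from standard facts about the normal model. The block below is a sketch of that calculation, assuming the usual results that $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$ and that the per-observation Fisher information matrix for $(\mu, \sigma^2)$ is diagonal.

```latex
\begin{gather*}
\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}
\quad\Longrightarrow\quad
\mathrm{Var}(S^2) = \frac{\sigma^4}{(n-1)^2}\cdot 2(n-1) = \frac{2\sigma^4}{n-1}, \\
I(\mu, \sigma^2) =
\begin{pmatrix} 1/\sigma^2 & 0 \\ 0 & 1/(2\sigma^4) \end{pmatrix}
\quad\Longrightarrow\quad
\bigl(nI(\theta)\bigr)^{-1} =
\begin{pmatrix} \sigma^2/n & 0 \\ 0 & 2\sigma^4/n \end{pmatrix}.
\end{gather*}
```

The ratio of the bound to the actual variance of $S^2$ is $(n-1)/n$, so the UMVUE is asymptotically efficient even though it misses the bound for every finite $n$.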
UMP Tests
Review
Let $X_1, \dots, X_n \overset{iid}{\sim} f(x; \theta)$, $\theta \in \Theta$.
Hypotheses: $H_0: \theta \in \Theta_0$ vs. $H_1: \theta \in \Theta_1 = \Theta \setminus \Theta_0$ (each may be simple or composite).
Rejection rule: Reject $H_0$ if $(X_1, \dots, X_n) \in C$ (rejection region). Fail to reject $H_0$ if $(X_1, \dots, X_n) \in C^c$ (acceptance region).
Probability of Type I error (false rejection): $\alpha = \max_{\theta \in \Theta_0} P_\theta((X_1, \dots, X_n) \in C) = \max P(\text{reject } H_0 \mid H_0 \text{ is true})$.
Power function: $\gamma_C(\theta) = P_\theta((X_1, \dots, X_n) \in C)$, $\theta \in \Theta$.
Power: $1 - \beta = P_\theta((X_1, \dots, X_n) \in C)$, $\theta \in \Theta_1$.
Probability of Type II error (false acceptance): $\beta$.
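To make the definitions concrete, here is a small sketch of the power function for a one-sided z-test of $H_0: \mu \ge \mu_0$ vs. $H_1: \mu < \mu_0$ with known $\sigma$, where the rejection region is $\{\bar{X} \le c\}$. All numbers ($\mu_0 = 3500$, $\mu_1 = 3400$, $\sigma = 500$, $n = 25$, $\alpha = 0.05$) and the function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(mu0, sigma, n, alpha):
    """Cutoff c such that P(Xbar <= c) = alpha when mu = mu0."""
    return mu0 + NormalDist().inv_cdf(alpha) * sigma / sqrt(n)


def power_function(mu, mu0, sigma, n, alpha):
    """gamma_C(mu) = P_mu(Xbar <= c), with Xbar ~ N(mu, sigma^2 / n)."""
    c = critical_value(mu0, sigma, n, alpha)
    return NormalDist(mu, sigma / sqrt(n)).cdf(c)


mu0, mu1, sigma, n, alpha = 3500.0, 3400.0, 500.0, 25, 0.05   # assumptions
size = power_function(mu0, mu0, sigma, n, alpha)    # Type I error at the boundary of H0
power = power_function(mu1, mu0, sigma, n, alpha)   # 1 - beta at mu1
beta = 1 - power                                    # Type II error at mu1
print(f"size={size:.4f}  power={power:.4f}  beta={beta:.4f}")
```

Because this power function is decreasing in $\mu$, the maximum over $\Theta_0 = \{\mu \ge \mu_0\}$ is attained at $\mu_0$, which is why the size is computed there.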
Visualizing α, β, and Power in a Test
[Figure: density curves for $\mu_1 = 3400$ and $\mu_0 = 3500$, with the regions labeled α, β, and Power, and the decision regions "Reject Null" and "Fail to Reject Null".]
Comparison of Tests
Definition. If two rejection regions $C_1$ and $C_2$ have the same Type I error probability, i.e. $\alpha = \max_{\theta \in \Theta_0} P_\theta((X_1, \dots, X_n) \in C_1) = \max_{\theta \in \Theta_0} P_\theta((X_1, \dots, X_n) \in C_2)$, and $\gamma_{C_1}(\theta) > \gamma_{C_2}(\theta)$ for all $\theta \in \Theta_1$, then we say $C_1$ is better than $C_2$ (that is, the rejection rule based on $C_1$ is better than the rejection rule based on $C_2$).
Q: Is there a best rejection region? If so, how do we find it?
First control α, then maximize power.
Most Powerful Test & Uniform Most Powerful Test
Consider testing a simple null against a simple alternative, $H_0: \theta = \theta_0$ vs. $H_1: \theta = \theta_1$. We say $C$ is a best rejection region of size α for the test if
$P((X_1, \dots, X_n) \in C \mid \theta = \theta_0) = \alpha$
and
$P((X_1, \dots, X_n) \in C \mid \theta = \theta_1) \ge P((X_1, \dots, X_n) \in C' \mid \theta = \theta_1)$
for any $C'$ such that $P((X_1, \dots, X_n) \in C' \mid \theta = \theta_0) = \alpha$. The test based on $C$ is called a most powerful test.
Next consider testing against a composite alternative, $H_0: \theta = \theta_0$ vs. $H_1: \theta \in \Theta_1$. We say $C$ is a uniform most powerful (UMP) rejection region of size α for the test if $C$ is a best rejection region of size α for $H_0: \theta = \theta_0$ vs. $H_1: \theta = \theta_1$, for all $\theta_1 \in \Theta_1$. The test based on $C$ is then called a UMP test.
Neyman-Pearson Theorem
Recall that the likelihood ratio test is based on the ratio of the likelihood functions evaluated under the null and alternative hypotheses:
$\Lambda = \dfrac{L(\theta_0; X_1, \dots, X_n)}{L(\theta_1; X_1, \dots, X_n)}$.
We naturally reject $H_0: \theta = \theta_0$ in favor of $H_1: \theta = \theta_1$ if $\Lambda \le k$ for some small $k$. The critical value $k$ should be chosen so that the test has size α (the bound on the Type I error probability).
Theorem (Neyman-Pearson, Theorem 8.1.1). The likelihood ratio test (LRT) is a most powerful test and provides a best rejection region of size α for testing $H_0: \theta = \theta_0$ vs. $H_1: \theta = \theta_1$.
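The theorem can be verified by brute force in a tiny discrete problem. The sketch below is an illustration under stated assumptions: it takes $n = 5$ Bernoulli trials, $\theta_0 = 1/2$, $\theta_1 = 3/4$ and $k = 1/2$ (all arbitrary choices), works with the sufficient statistic $T = \sum X_i \sim \mathrm{Bin}(n, \theta)$, and only compares rejection regions that are sets of values of $T$. It checks that no such region with size at most that of the likelihood ratio region has larger power.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def binom_pmf(n, theta):
    """Exact Bin(n, theta) probabilities for t = 0..n."""
    return [comb(n, t) * theta**t * (1 - theta) ** (n - t) for t in range(n + 1)]


def prob(region, pmf):
    """Probability that T falls in the given set of values."""
    return sum((pmf[t] for t in region), Fraction(0))


def lrt_region(p0, p1, k):
    """Values of t where the likelihood ratio L(theta0) / L(theta1) is at most k."""
    return {t for t in range(len(p0)) if p0[t] / p1[t] <= k}


def best_power_by_enumeration(p0, p1, alpha):
    """Largest power over all regions (subsets of 0..n) with size <= alpha."""
    best = Fraction(0)
    for r in range(len(p0) + 1):
        for region in combinations(range(len(p0)), r):
            if prob(region, p0) <= alpha:
                best = max(best, prob(region, p1))
    return best


n = 5                                                   # illustrative values (assumption)
theta0, theta1 = Fraction(1, 2), Fraction(3, 4)
p0, p1 = binom_pmf(n, theta0), binom_pmf(n, theta1)
region = lrt_region(p0, p1, k=Fraction(1, 2))
size = prob(region, p0)                                 # Type I error of the LRT region
power = prob(region, p1)                                # power of the LRT region at theta1
best = best_power_by_enumeration(p0, p1, size)
print(sorted(region), size, power, best)
```

Exact fractions are used so that the comparison between the LRT power and the enumerated maximum is not affected by floating-point rounding.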
Example 8.1.2
Example 8.2.3
Unbiased Tests
Definition. If a rejection region $C$ of size α satisfies $\gamma_C(\theta) \ge \alpha$ for all $\theta \in \Theta_1$, we say it is unbiased (or the test is unbiased).
Theorem. A most powerful test for simple hypotheses is always unbiased.
Proof. Consider a randomized test in which we ignore the data and simply reject $H_0$ with probability α. For example, we can generate $U \sim \mathrm{Ber}(\alpha)$ independently of the data and reject $H_0$ if $U = 1$. Although this is a silly test, its level is α and its power function is constant, $\gamma(\theta) = \alpha$. A most powerful test has power no smaller than the power of this silly test, so $\gamma_C(\theta) \ge \alpha$; hence it is unbiased.
Uniform Most Powerful Unbiased Test
Although the likelihood ratio test is a most powerful test, its rejection region may depend on the alternative hypothesis. Hence a uniform most powerful test does not always exist. See Example 8.1.2.
Recall that in estimation we did not just look for the estimator with minimal variance (a silly constant estimator has the smallest variance, 0). Instead, we first restricted ourselves to unbiased estimators and then minimized the variance among all unbiased estimators. Can we do this for tests?
Definition. If a test is a UMP test among all unbiased tests, then it is called a uniform most powerful unbiased (UMPU) test.
Remark: Unfortunately, a UMPU test may not exist either. But for exponential family distributions with $p(\theta) = \theta$, a UMPU test (of size α) exists and is based on $T = \sum_i K(X_i)$.
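A small numerical sketch of why a UMP test can fail to exist for a two-sided alternative. It is not taken from Example 8.1.2; it assumes the simplest setting, $X_i \sim N(\mu, 1)$ with $H_0: \mu = 0$, and the values $n = 16$, $\alpha = 0.05$ are arbitrary. The right-tail test is most powerful against $\mu > 0$ but has power below α against $\mu < 0$ (so it is biased there), and the reverse holds for the left-tail test; the equal-tailed two-sided test is unbiased but is beaten on each side by the matching one-sided test.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()   # standard normal


def power_right(mu, n, alpha):
    """Power of the test that rejects when sqrt(n) * Xbar >= z_{1 - alpha}."""
    return 1 - Z.cdf(Z.inv_cdf(1 - alpha) - sqrt(n) * mu)


def power_left(mu, n, alpha):
    """Power of the test that rejects when sqrt(n) * Xbar <= z_{alpha}."""
    return Z.cdf(Z.inv_cdf(alpha) - sqrt(n) * mu)


def power_two_sided(mu, n, alpha):
    """Power of the test that rejects when |sqrt(n) * Xbar| >= z_{1 - alpha / 2}."""
    z = Z.inv_cdf(1 - alpha / 2)
    shift = sqrt(n) * mu
    return 1 - Z.cdf(z - shift) + Z.cdf(-z - shift)


n, alpha = 16, 0.05                      # illustrative values (assumption)
for mu in (-0.5, 0.0, 0.5):
    print(mu,
          round(power_left(mu, n, alpha), 4),
          round(power_two_sided(mu, n, alpha), 4),
          round(power_right(mu, n, alpha), 4))
```

All three tests have size α at $\mu = 0$, yet no single one dominates on both sides of the null, which is the motivation for restricting attention to unbiased tests.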