STA 260: Statistics and Probability II


Al Nosedal. University of Toronto. Winter 2017

Properties of Point Estimators and Methods of Estimation

"If you can't explain it simply, you don't understand it well enough." (Albert Einstein)

Definition 9.1 Given two unbiased estimators $\hat{\theta}_1$ and $\hat{\theta}_2$ of a parameter $\theta$, with variances $V(\hat{\theta}_1)$ and $V(\hat{\theta}_2)$, respectively, the efficiency of $\hat{\theta}_1$ relative to $\hat{\theta}_2$, denoted $\mathrm{eff}(\hat{\theta}_1, \hat{\theta}_2)$, is defined to be the ratio
$$\mathrm{eff}(\hat{\theta}_1, \hat{\theta}_2) = \frac{V(\hat{\theta}_2)}{V(\hat{\theta}_1)}.$$

Exercise 9.1 In Exercise 8.8, we considered a random sample of size 3 from an exponential distribution with density function
$$f(y) = \begin{cases} (1/\theta)e^{-y/\theta} & y > 0 \\ 0 & \text{elsewhere} \end{cases}$$
and determined that $\hat{\theta}_1 = Y_1$, $\hat{\theta}_2 = (Y_1 + Y_2)/2$, $\hat{\theta}_3 = (Y_1 + 2Y_2)/3$, and $\hat{\theta}_5 = \bar{Y}$ are all unbiased estimators for $\theta$. Find the efficiency of $\hat{\theta}_1$ relative to $\hat{\theta}_5$, of $\hat{\theta}_2$ relative to $\hat{\theta}_5$, and of $\hat{\theta}_3$ relative to $\hat{\theta}_5$.

Solution
$V(\hat{\theta}_1) = V(Y_1) = \theta^2$ (from Table).
$V(\hat{\theta}_2) = V\left(\frac{Y_1+Y_2}{2}\right) = \frac{2\theta^2}{4} = \frac{\theta^2}{2}$.
$V(\hat{\theta}_3) = V\left(\frac{Y_1+2Y_2}{3}\right) = \frac{\theta^2 + 4\theta^2}{9} = \frac{5\theta^2}{9}$.
$V(\hat{\theta}_5) = V(\bar{Y}) = \frac{\theta^2}{3}$ (since $n = 3$).

Solution
$\mathrm{eff}(\hat{\theta}_1, \hat{\theta}_5) = \frac{V(\hat{\theta}_5)}{V(\hat{\theta}_1)} = \frac{\theta^2/3}{\theta^2} = \frac{1}{3}$
$\mathrm{eff}(\hat{\theta}_2, \hat{\theta}_5) = \frac{V(\hat{\theta}_5)}{V(\hat{\theta}_2)} = \frac{\theta^2/3}{\theta^2/2} = \frac{2}{3}$
$\mathrm{eff}(\hat{\theta}_3, \hat{\theta}_5) = \frac{V(\hat{\theta}_5)}{V(\hat{\theta}_3)} = \frac{\theta^2/3}{5\theta^2/9} = \frac{3}{5}$
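As a numerical aside (not part of the textbook exercise), the relative efficiencies above can be sanity-checked by simulation. The sketch below assumes NumPy is available; the value $\theta = 2$ and the number of replications are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, reps = 2.0, 200_000
y = rng.exponential(scale=theta, size=(reps, 3))   # many samples of size n = 3

theta1 = y[:, 0]                       # theta_hat_1 = Y1
theta2 = (y[:, 0] + y[:, 1]) / 2       # theta_hat_2 = (Y1 + Y2)/2
theta3 = (y[:, 0] + 2 * y[:, 1]) / 3   # theta_hat_3 = (Y1 + 2 Y2)/3
theta5 = y.mean(axis=1)                # theta_hat_5 = sample mean

v5 = theta5.var()
print("eff(theta1, theta5) ~", v5 / theta1.var())   # close to 1/3
print("eff(theta2, theta5) ~", v5 / theta2.var())   # close to 2/3
print("eff(theta3, theta5) ~", v5 / theta3.var())   # close to 3/5
```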

Exercise 9.3 Let $Y_1, Y_2, \ldots, Y_n$ denote a random sample from the uniform distribution on the interval $(\theta, \theta+1)$. Let $\hat{\theta}_1 = \bar{Y} - \frac{1}{2}$ and $\hat{\theta}_2 = Y_{(n)} - \frac{n}{n+1}$. a. Show that both $\hat{\theta}_1$ and $\hat{\theta}_2$ are unbiased estimators of $\theta$. b. Find the efficiency of $\hat{\theta}_1$ relative to $\hat{\theta}_2$.

Solution a. $E(\hat{\theta}_1) = E(\bar{Y} - \frac{1}{2}) = E(\bar{Y}) - \frac{1}{2} = E\left(\frac{Y_1+Y_2+\cdots+Y_n}{n}\right) - \frac{1}{2} = \frac{2\theta+1}{2} - \frac{1}{2} = \theta$. Since $Y_i$ has a Uniform distribution on the interval $(\theta, \theta+1)$, $V(Y_i) = \frac{1}{12}$ (check Table). $V(\hat{\theta}_1) = V(\bar{Y} - \frac{1}{2}) = V(\bar{Y}) = V\left(\frac{Y_1+Y_2+\cdots+Y_n}{n}\right) = \frac{1}{12n}$.

Solution Let $W = Y_{(n)} = \max\{Y_1, \ldots, Y_n\}$. Then
$$F_W(w) = P[W \le w] = [F_Y(w)]^n, \qquad f_W(w) = \frac{d}{dw}F_W(w) = n[F_Y(w)]^{n-1}f_Y(w).$$
In our case, $f_W(w) = n(w-\theta)^{n-1}$, $\theta < w < \theta+1$. Now that we have the pdf of $W$, we can find its expected value and variance.

Solution
$E(W) = \int_{\theta}^{\theta+1} n w (w-\theta)^{n-1}\,dw$ (integrating by parts)
$E(W) = \left[w(w-\theta)^n\right]_{\theta}^{\theta+1} - \int_{\theta}^{\theta+1} (w-\theta)^n\,dw$
$E(W) = (\theta+1) - \left[\frac{(w-\theta)^{n+1}}{n+1}\right]_{\theta}^{\theta+1} = (\theta+1) - \frac{1}{n+1}$
$E(W) = \theta + \frac{n}{n+1}$

Solution
$E(W^2) = \int_{\theta}^{\theta+1} n w^2 (w-\theta)^{n-1}\,dw$ (integrating by parts)
$E(W^2) = \left[w^2(w-\theta)^n\right]_{\theta}^{\theta+1} - \int_{\theta}^{\theta+1} 2w(w-\theta)^n\,dw$
$E(W^2) = (\theta+1)^2 - \frac{2}{n+1}\int_{\theta}^{\theta+1} (n+1)w(w-\theta)^n\,dw$
$E(W^2) = (\theta+1)^2 - \frac{2}{n+1}\left(\theta + \frac{n+1}{n+2}\right)$
$E(W^2) = (\theta+1)^2 - \frac{2\theta}{n+1} - \frac{2}{n+2}$

Solution
$V(W) = E(W^2) - [E(W)]^2$
$V(W) = \theta^2 + 2\theta + 1 - \frac{2\theta}{n+1} - \frac{2}{n+2} - \left[\theta + \frac{n}{n+1}\right]^2$
(after doing a bit of algebra...)
$V(W) = \frac{n}{(n+2)(n+1)^2}$

Solution
$E(\hat{\theta}_2) = E\left(W - \frac{n}{n+1}\right) = E(W) - \frac{n}{n+1} = \theta + \frac{n}{n+1} - \frac{n}{n+1} = \theta$.
$V(\hat{\theta}_2) = V\left(W - \frac{n}{n+1}\right) = V(W) = \frac{n}{(n+2)(n+1)^2}$

Solution
$$\mathrm{eff}(\hat{\theta}_1, \hat{\theta}_2) = \frac{V(\hat{\theta}_2)}{V(\hat{\theta}_1)} = \frac{n/[(n+2)(n+1)^2]}{1/(12n)} = \frac{12n^2}{(n+2)(n+1)^2}$$
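A similar simulation (again an illustrative sketch, not part of the exercise) can be used to check the formula just derived; the values $\theta = 5$ and $n = 10$ below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 5.0, 10, 200_000
y = rng.uniform(theta, theta + 1, size=(reps, n))

theta1 = y.mean(axis=1) - 0.5            # Ybar - 1/2
theta2 = y.max(axis=1) - n / (n + 1)     # Y(n) - n/(n+1)

print("simulated eff(theta1, theta2)  :", theta2.var() / theta1.var())
print("formula 12 n^2 / [(n+2)(n+1)^2]:", 12 * n**2 / ((n + 2) * (n + 1) ** 2))
```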

Definition 9.2 The estimator $\hat{\theta}_n$ is said to be a consistent estimator of $\theta$ if, for any positive number $\epsilon$,
$$\lim_{n\to\infty} P(|\hat{\theta}_n - \theta| \le \epsilon) = 1$$
or, equivalently,
$$\lim_{n\to\infty} P(|\hat{\theta}_n - \theta| > \epsilon) = 0.$$

Theorem 9.1 An unbiased estimator $\hat{\theta}_n$ for $\theta$ is a consistent estimator of $\theta$ if
$$\lim_{n\to\infty} V(\hat{\theta}_n) = 0.$$

Exercise 9.15 Refer to Exercise 9.3. Show that both ˆθ 1 and ˆθ 2 are consistent estimators for θ.

Solution Recall that we have already shown that $\hat{\theta}_1$ and $\hat{\theta}_2$ are unbiased estimators of $\theta$. Thus, if we show that $\lim_{n\to\infty} V(\hat{\theta}_1) = 0$ and $\lim_{n\to\infty} V(\hat{\theta}_2) = 0$, we are done. Clearly,
$$0 \le V(\hat{\theta}_1) = \frac{1}{12n},$$
which implies that $\lim_{n\to\infty} V(\hat{\theta}_1) = 0$.

Solution Clearly,
$$0 \le V(\hat{\theta}_2) = \frac{n}{(n+2)(n+1)^2} \le \frac{n+2}{(n+2)(n+1)^2} = \frac{1}{(n+1)^2},$$
which implies that
$$\lim_{n\to\infty} V(\hat{\theta}_2) \le \lim_{n\to\infty} \frac{1}{(n+1)^2} = 0.$$
Therefore, $\hat{\theta}_1$ and $\hat{\theta}_2$ are consistent estimators for $\theta$.
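Consistency can also be seen empirically: the probability that $\hat{\theta}_2$ misses $\theta$ by more than $\epsilon$ shrinks as $n$ grows. The sketch below is not from the slides; $\theta = 5$ and $\epsilon = 0.05$ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
theta, eps, reps = 5.0, 0.05, 20_000
for n in (5, 20, 100, 500):
    y = rng.uniform(theta, theta + 1, size=(reps, n))
    theta2 = y.max(axis=1) - n / (n + 1)
    # estimated P(|theta2_hat - theta| > eps); should decrease toward 0
    print(n, np.mean(np.abs(theta2 - theta) > eps))
```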

Definition 9.3 Let Y 1, Y 2,..., Y n denote a random sample from a probability distribution with unknown parameter θ. Then the statistic U = g(y 1, Y 2,..., Y n ) is said to be sufficient for θ if the conditional distribution of Y 1, Y 2,..., Y n, given U, does not depend on θ.

Example Let $X_1, X_2, X_3$ be a sample of size 3 from the Bernoulli distribution with success probability $p$. Consider $U = g(X_1, X_2, X_3) = X_1 + X_2 + X_3$. We will show that $g(X_1, X_2, X_3)$ is sufficient.

Solution

(x1, x2, x3)   Value of U   f_{X1,X2,X3 | U}
(0,0,0)        0            1
(0,0,1)        1            1/3
(0,1,0)        1            1/3
(1,0,0)        1            1/3
(0,1,1)        2            1/3
(1,0,1)        2            1/3
(1,1,0)        2            1/3
(1,1,1)        3            1

Example The conditional densities given in the last column are routinely calculated. For instance,
$$f_{X_1,X_2,X_3|U=1}(0,1,0 \mid 1) = P[X_1=0, X_2=1, X_3=0 \mid U=1] = \frac{P[X_1=0 \text{ and } X_2=1 \text{ and } X_3=0 \text{ and } U=1]}{P[U=1]} = \frac{(1-p)(p)(1-p)}{\binom{3}{1}p(1-p)^2} = \frac{1}{3}.$$
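The same conclusion can be illustrated by brute-force enumeration: the conditional probabilities in the table come out the same no matter what value of $p$ is used. The short sketch below is illustrative only; the values of $p$ are arbitrary.

```python
from itertools import product

def conditional_given_u(p):
    """P[(X1,X2,X3) = x | U = sum(x)] for all eight outcomes of three Bernoulli(p) trials."""
    outcomes = list(product([0, 1], repeat=3))
    joint = {x: p ** sum(x) * (1 - p) ** (3 - sum(x)) for x in outcomes}
    p_u = {u: sum(q for x, q in joint.items() if sum(x) == u) for u in range(4)}
    return {x: joint[x] / p_u[sum(x)] for x in outcomes}

for p in (0.2, 0.5, 0.9):
    print(p, conditional_given_u(p)[(0, 1, 0)])   # 1/3 for every p
```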

Definition 9.4 Let $y_1, y_2, \ldots, y_n$ be sample observations taken on corresponding random variables $Y_1, Y_2, \ldots, Y_n$ whose distribution depends on a parameter $\theta$. Then, if $Y_1, Y_2, \ldots, Y_n$ are discrete random variables, the likelihood of the sample, $L(y_1, y_2, \ldots, y_n \mid \theta)$, is defined to be the joint probability of $y_1, y_2, \ldots, y_n$. If $Y_1, Y_2, \ldots, Y_n$ are continuous random variables, the likelihood $L(y_1, y_2, \ldots, y_n \mid \theta)$ is defined to be the joint density of $y_1, y_2, \ldots, y_n$.

Theorem 9.4 Let $U$ be a statistic based on the random sample $Y_1, Y_2, \ldots, Y_n$. Then $U$ is a sufficient statistic for the estimation of a parameter $\theta$ if and only if the likelihood $L(\theta) = L(y_1, y_2, \ldots, y_n \mid \theta)$ can be factored into two nonnegative functions,
$$L(y_1, y_2, \ldots, y_n \mid \theta) = g(u, \theta)\, h(y_1, y_2, \ldots, y_n),$$
where $g(u, \theta)$ is a function only of $u$ and $\theta$ and $h(y_1, y_2, \ldots, y_n)$ is not a function of $\theta$.

Exercise 9.37 Let $X_1, X_2, \ldots, X_n$ denote $n$ independent and identically distributed Bernoulli random variables such that $P(X_i = 1) = \theta$ and $P(X_i = 0) = 1 - \theta$, for each $i = 1, 2, \ldots, n$. Show that $\sum_{i=1}^n X_i$ is sufficient for $\theta$ by using the factorization criterion given in Theorem 9.4.

Solution
$$L(x_1, x_2, \ldots, x_n \mid \theta) = p(x_1 \mid \theta)p(x_2 \mid \theta)\cdots p(x_n \mid \theta) = \theta^{x_1}(1-\theta)^{1-x_1}\,\theta^{x_2}(1-\theta)^{1-x_2}\cdots\theta^{x_n}(1-\theta)^{1-x_n} = \theta^{\sum_{i=1}^n x_i}(1-\theta)^{n - \sum_{i=1}^n x_i}$$
By Theorem 9.4, $\sum_{i=1}^n x_i$ is sufficient for $\theta$ with
$$g\left(\textstyle\sum_{i=1}^n x_i, \theta\right) = \theta^{\sum_{i=1}^n x_i}(1-\theta)^{n - \sum_{i=1}^n x_i} \quad \text{and} \quad h(x_1, x_2, \ldots, x_n) = 1.$$

Indicator Function For a < b, I (a,b) (y) = { 1 if a < y < b 0 otherwise.

Exercise 9.49 Let Y 1, Y 2,..., Y n denote a random sample from the Uniform distribution over the interval (0, θ). Show that Y (n) = max(y 1, Y 2,..., Y n ) is sufficient for θ.

Solution
$$L(y_1, y_2, \ldots, y_n \mid \theta) = f(y_1 \mid \theta)f(y_2 \mid \theta)\cdots f(y_n \mid \theta) = \frac{1}{\theta}I_{(0,\theta)}(y_1)\,\frac{1}{\theta}I_{(0,\theta)}(y_2)\cdots\frac{1}{\theta}I_{(0,\theta)}(y_n) = \frac{1}{\theta^n}I_{(0,\theta)}(y_1)I_{(0,\theta)}(y_2)\cdots I_{(0,\theta)}(y_n) = \frac{1}{\theta^n}I_{(0,\theta)}(y_{(n)})$$
Therefore, Theorem 9.4 is satisfied with
$$g(y_{(n)}, \theta) = \frac{1}{\theta^n}I_{(0,\theta)}(y_{(n)}) \quad \text{and} \quad h(y_1, y_2, \ldots, y_n) = 1.$$

Exercise 9.51 Let $Y_1, Y_2, \ldots, Y_n$ denote a random sample from the probability density function
$$f(y \mid \theta) = \begin{cases} e^{-(y-\theta)} & y \ge \theta \\ 0 & \text{elsewhere} \end{cases}$$
Show that $Y_{(1)} = \min(Y_1, Y_2, \ldots, Y_n)$ is sufficient for $\theta$.

Solution
$$L(y_1, y_2, \ldots, y_n \mid \theta) = f(y_1 \mid \theta)f(y_2 \mid \theta)\cdots f(y_n \mid \theta) = e^{-(y_1-\theta)}I_{[\theta,\infty)}(y_1)\, e^{-(y_2-\theta)}I_{[\theta,\infty)}(y_2)\cdots e^{-(y_n-\theta)}I_{[\theta,\infty)}(y_n)$$
$$= e^{n\theta}e^{-\sum_{i=1}^n y_i}\,I_{[\theta,\infty)}(y_1)I_{[\theta,\infty)}(y_2)\cdots I_{[\theta,\infty)}(y_n) = e^{n\theta}e^{-\sum_{i=1}^n y_i}\,I_{[\theta,\infty)}(y_{(1)}) = e^{n\theta}I_{[\theta,\infty)}(y_{(1)})\, e^{-\sum_{i=1}^n y_i}$$

Solution Therefore, Theorem 9.4 is satisfied with
$$g(y_{(1)}, \theta) = e^{n\theta}I_{[\theta,\infty)}(y_{(1)}) \quad \text{and} \quad h(y_1, y_2, \ldots, y_n) = e^{-\sum_{i=1}^n y_i},$$
and $Y_{(1)}$ is sufficient for $\theta$.

Theorem 9.5 The Rao-Blackwell Theorem. Let $\hat{\theta}$ be an unbiased estimator for $\theta$ such that $V(\hat{\theta}) < \infty$. If $U$ is a sufficient statistic for $\theta$, define $\hat{\theta}^* = E(\hat{\theta} \mid U)$. Then, for all $\theta$, $E(\hat{\theta}^*) = \theta$ and $V(\hat{\theta}^*) \le V(\hat{\theta})$.

Proof Check page 465. (It is almost identical to what we did in class, Remember?)

Exercise 9.61 Refer to Exercise 9.49. Use Y (n) to find an MVUE of θ.

Exercise 9.62 Refer to Exercise 9.51. Find a function of Y (1) that is an MVUE for θ.

Solution Please see Review 2.

The Method of Moments. The method of moments is a very simple procedure for finding an estimator for one or more population parameters. Recall that the $k$th moment of a random variable, taken about the origin, is $\mu_k = E(Y^k)$. The corresponding $k$th sample moment is the average $m_k = \frac{1}{n}\sum_{i=1}^n Y_i^k$.

The Method of Moments. Choose as estimates those values of the parameters that are solutions of the equations $\mu_k = m_k$, for $k = 1, 2, \ldots, t$, where $t$ is the number of parameters to be estimated.

Exercise 9.69 Let $Y_1, Y_2, \ldots, Y_n$ denote a random sample from the probability density function
$$f(y \mid \theta) = \begin{cases} (\theta+1)y^{\theta} & 0 < y < 1;\ \theta > -1 \\ 0 & \text{elsewhere} \end{cases}$$
Find an estimator for $\theta$ by the method of moments.

Solution Note that $Y$ is a random variable with a Beta distribution where $\alpha = \theta + 1$ and $\beta = 1$. Therefore,
$$\mu_1 = E(Y) = \frac{\alpha}{\alpha+\beta} = \frac{\theta+1}{\theta+2}$$
(we can find this formula on our Table).

Solution The corresponding first sample moment is
$$m_1 = \frac{1}{n}\sum_{i=1}^n Y_i = \bar{Y}.$$

Solution Equating the corresponding population and sample moments, we obtain (solving for $\theta$)
$$\frac{\theta+1}{\theta+2} = \bar{Y} \quad \Longrightarrow \quad \hat{\theta}_{MOM} = \frac{2\bar{Y}-1}{1-\bar{Y}} = \frac{1-2\bar{Y}}{\bar{Y}-1}.$$
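As a quick check (not part of the exercise), one can simulate from the model and confirm that $\hat{\theta}_{MOM}$ recovers the true parameter; $\theta = 3$ below is an arbitrary choice, and the data are generated using the fact noted above that $f(y) = (\theta+1)y^{\theta}$ on $(0,1)$ is the Beta$(\theta+1, 1)$ density.

```python
import numpy as np

rng = np.random.default_rng(3)
theta = 3.0
y = rng.beta(theta + 1, 1, size=50_000)   # f(y) = (theta+1) y^theta on (0, 1)

ybar = y.mean()
theta_mom = (2 * ybar - 1) / (1 - ybar)
print("theta_hat_MOM =", theta_mom)       # should be close to 3.0
```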

Exercise 9.75 Let $Y_1, Y_2, \ldots, Y_n$ be a random sample from the probability density function given by
$$f(y \mid \theta) = \begin{cases} \frac{\Gamma(2\theta)}{[\Gamma(\theta)]^2} y^{\theta-1}(1-y)^{\theta-1} & 0 < y < 1;\ \theta > 0 \\ 0 & \text{elsewhere} \end{cases}$$
Find the method of moments estimator for $\theta$.

Solution Note that $Y$ is a random variable with a Beta distribution where $\alpha = \theta$ and $\beta = \theta$. Therefore,
$$\mu_1 = E(Y) = \frac{\alpha}{\alpha+\beta} = \frac{\theta}{2\theta} = \frac{1}{2}.$$

Solution The corresponding first sample moment is
$$m_1 = \frac{1}{n}\sum_{i=1}^n Y_i = \bar{Y}.$$

Solution Equating the corresponding population and sample moment, we obtain $\frac{1}{2} = \bar{Y}$ (since we can't solve for $\theta$, we have to repeat the process using second moments).

Solution Recalling that $V(Y) = E(Y^2) - [E(Y)]^2$ and solving for $E(Y^2)$, we have that (we can easily get $V(Y)$ from our Table, right?)
$$\mu_2 = E(Y^2) = \frac{\theta^2}{(2\theta)^2(2\theta+1)} + \frac{1}{4} = \frac{1}{4(2\theta+1)} + \frac{1}{4}$$
(after a little bit of algebra...)
$$E(Y^2) = \frac{\theta+1}{4\theta+2}.$$

Solution The corresponding second sample moment is
$$m_2 = \frac{1}{n}\sum_{i=1}^n Y_i^2.$$

Solution Solving for $\theta$,
$$\hat{\theta}_{MOM} = \frac{1 - 2m_2}{4m_2 - 1}.$$
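Again as an illustrative check (not from the slides): simulating from a symmetric Beta$(\theta, \theta)$ with an arbitrary $\theta = 2.5$ and plugging the second sample moment into the formula recovers $\theta$.

```python
import numpy as np

rng = np.random.default_rng(4)
theta = 2.5
y = rng.beta(theta, theta, size=100_000)   # symmetric Beta(theta, theta) sample

m2 = np.mean(y ** 2)
print("theta_hat_MOM =", (1 - 2 * m2) / (4 * m2 - 1))   # should be close to 2.5
```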

Suppose that the likelihood function depends on $k$ parameters $\theta_1, \theta_2, \ldots, \theta_k$. Choose as estimates those values of the parameters that maximize the likelihood $L(y_1, y_2, \ldots, y_n \mid \theta_1, \theta_2, \ldots, \theta_k)$.

Problem Given a random sample $Y_1, Y_2, \ldots, Y_n$ from a population with pdf $f(y \mid \theta)$, show that maximizing the likelihood function $L(y_1, y_2, \ldots, y_n \mid \theta)$ as a function of $\theta$ is equivalent to maximizing $\ln L(y_1, y_2, \ldots, y_n \mid \theta)$.

Proof Let $\hat{\theta}_{MLE}$ be the value that maximizes the likelihood, which implies that $L(y_1, y_2, \ldots, y_n \mid \theta) \le L(y_1, y_2, \ldots, y_n \mid \hat{\theta}_{MLE})$ for all $\theta$. We know that $g(t) = \ln(t)$ is a monotonically increasing function, so for $t_1 \le t_2$ we have $\ln(t_1) \le \ln(t_2)$. Therefore $\ln L(y_1, y_2, \ldots, y_n \mid \theta) \le \ln L(y_1, y_2, \ldots, y_n \mid \hat{\theta}_{MLE})$ for all $\theta$. We have shown that $\ln L(y_1, y_2, \ldots, y_n \mid \theta)$ attains its maximum at $\hat{\theta}_{MLE}$.
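The equivalence is easy to visualize numerically: on any grid of candidate values, $L$ and $\ln L$ are largest at the same $\theta$. The sketch below is illustrative only; it uses a simulated Bernoulli sample with an arbitrary true value of 0.3.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.binomial(1, 0.3, size=40)        # Bernoulli sample
grid = np.linspace(0.01, 0.99, 981)      # candidate values of theta
s = x.sum()

L = grid ** s * (1 - grid) ** (len(x) - s)                   # likelihood
logL = s * np.log(grid) + (len(x) - s) * np.log(1 - grid)    # log-likelihood
print(grid[np.argmax(L)], grid[np.argmax(logL)])             # same maximizer
```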

Example 9.16 Let $Y_1, Y_2, \ldots, Y_n$ be a random sample of observations from a uniform distribution with probability density function $f(y_i \mid \theta) = 1/\theta$, for $0 \le y_i \le \theta$ and $i = 1, 2, \ldots, n$. Find the MLE of $\theta$.
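A numerical hint for this example (the sketch below is not from the slides): for $\theta \ge \max(y_i)$ the likelihood equals $\theta^{-n}$, which is strictly decreasing in $\theta$, and it is 0 for $\theta < \max(y_i)$, so the likelihood is maximized at the largest observation. The code just evaluates $L(\theta)$ at a few points of an arbitrary simulated sample to make this visible.

```python
import numpy as np

rng = np.random.default_rng(6)
y = rng.uniform(0, 3.0, size=20)         # simulated sample, true theta = 3.0

def likelihood(theta):
    # L(theta) = theta^(-n) when theta >= max(y), and 0 otherwise
    return 0.0 if theta < y.max() else theta ** (-len(y))

for t in y.max() + np.linspace(0.0, 1.0, 5):
    print(round(t, 3), likelihood(t))    # decreasing in theta beyond max(y)
print("maximizer: max(y) =", y.max())
```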

MLEs have some additional properties that make this method of estimation particularly attractive. Generally, if $\theta$ is the parameter associated with a distribution, we are sometimes interested in estimating some function of $\theta$, say $t(\theta)$, rather than $\theta$ itself. In Exercise 9.94, you will prove that if $t(\theta)$ is a one-to-one function of $\theta$ and if $\hat{\theta}$ is the MLE for $\theta$, then the MLE of $t(\theta)$ is given by $\widehat{t(\theta)} = t(\hat{\theta})$. This result, sometimes referred to as the invariance property of MLEs, also holds for any function of a parameter of interest (not just one-to-one functions).

Exercise 9.81 Suppose that $Y_1, Y_2, \ldots, Y_n$ denote a random sample from an exponentially distributed population with mean $\theta$. Find the MLE of the population variance $\theta^2$.

Exercise 9.85 Let $Y_1, Y_2, \ldots, Y_n$ be a random sample from the probability density function given by
$$f(y \mid \alpha, \theta) = \begin{cases} \frac{1}{\Gamma(\alpha)\theta^{\alpha}} y^{\alpha-1} e^{-y/\theta} & y > 0 \\ 0 & \text{elsewhere,} \end{cases}$$
where $\alpha > 0$ is known.

Exercise 9.85 a. Find the MLE ˆθ of θ. b. Find the expected value and variance of ˆθ MLE. c. Show that ˆθ MLE is consistent for θ. d. What is the best sufficient statistic for θ in this problem?

Solution a)
$$L(\theta) = \frac{1}{[\Gamma(\alpha)\theta^{\alpha}]^n}\,\Pi_{i=1}^n y_i^{\alpha-1}\, e^{-\sum_{i=1}^n y_i/\theta}$$

Solution a)
$$\ln L(\theta) = (\alpha - 1)\sum_{i=1}^n \ln(y_i) - \frac{\sum_{i=1}^n y_i}{\theta} - n\ln\Gamma(\alpha) - n\alpha\ln(\theta)$$

Solution a) dlnl(θ) dθ n i=1 = y i nαθ ˆθ MLE = ȳ α θ 2

Solution a) (Let us check that we actually have a maximum...)
$$\frac{d^2\ln L(\theta)}{d\theta^2} = \frac{-2\sum_{i=1}^n y_i + n\alpha\theta}{\theta^3}, \qquad \left.\frac{d^2\ln L(\theta)}{d\theta^2}\right|_{\theta = \hat{\theta}_{MLE}} = -\frac{\alpha^3 n}{\bar{y}^2} < 0
$$

Solution b)
$$E(\hat{\theta}_{MLE}) = E\left(\frac{\bar{Y}}{\alpha}\right) = \frac{\alpha\theta}{\alpha} = \theta, \qquad V(\hat{\theta}_{MLE}) = V\left(\frac{\bar{Y}}{\alpha}\right) = \frac{\alpha\theta^2}{n\alpha^2} = \frac{\theta^2}{\alpha n}.$$
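These two facts can be verified by Monte Carlo (a sketch, not part of the solution); the values of $\alpha$, $\theta$, and $n$ below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, theta, n, reps = 3.0, 2.0, 25, 100_000
y = rng.gamma(shape=alpha, scale=theta, size=(reps, n))

theta_hat = y.mean(axis=1) / alpha        # theta_hat_MLE = ybar / alpha
print("mean(theta_hat):", theta_hat.mean(), "  theory:", theta)
print("var(theta_hat) :", theta_hat.var(), "  theory:", theta ** 2 / (alpha * n))
```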

Solution c) Since $\hat{\theta}_{MLE}$ is unbiased, we only need to show that
$$\lim_{n\to\infty} V(\hat{\theta}_{MLE}) = \lim_{n\to\infty} \frac{\theta^2}{\alpha n} = 0.$$

Solution d) By the Factorization Theorem,
$$g(u, \theta) = g\left(\textstyle\sum_{i=1}^n y_i, \theta\right) = \frac{e^{-\sum_{i=1}^n y_i/\theta}}{\theta^{\alpha n}} \quad \text{and} \quad h(y_1, y_2, \ldots, y_n) = \frac{\Pi_{i=1}^n y_i^{\alpha-1}}{[\Gamma(\alpha)]^n},$$
so $\sum_{i=1}^n Y_i$ (equivalently $\bar{Y}$) is sufficient for $\theta$.

Example 9.15 Let Y 1, Y 2,..., Y n be a random sample from a normal distribution with mean µ and variance σ 2. Find the MLEs of µ and σ 2.
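For Example 9.15, the standard result is that the MLEs are $\hat{\mu} = \bar{y}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (y_i - \bar{y})^2$. The sketch below (not from the slides, and assuming SciPy is available) confirms this by maximizing the log-likelihood numerically for an arbitrary simulated sample.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
y = rng.normal(loc=1.5, scale=2.0, size=200)

def neg_loglik(params):
    mu, log_sigma2 = params
    sigma2 = np.exp(log_sigma2)           # keep sigma^2 positive during the search
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + (y - mu) ** 2 / sigma2)

res = minimize(neg_loglik, x0=np.array([0.0, 0.0]))
print("numerical MLE:", res.x[0], np.exp(res.x[1]))
print("closed form  :", y.mean(), y.var())   # np.var divides by n, matching the MLE
```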