Mathematical Statistics


Mathematical Statistics
Chen, L.-A.

Chapter 4. Distribution of Functions of Random Variables

Sample space $S$: the set of possible outcomes in an experiment.
Probability set function $P$: (1) $P(A) \ge 0$ for $A \subset S$; (2) $P(S) = 1$; (3) $P(\cup_i A_i) = \sum_i P(A_i)$ if $A_i \cap A_j = \emptyset$, $i \ne j$.
Random variable $X$: $X : S \to R$. Given $B \subset R$, $P(X \in B) = P(\{s \in S : X(s) \in B\}) = P(X^{-1}(B))$, where $X^{-1}(B) \subset S$.
$X$ is a discrete random variable if its range $X(S) = \{x \in R : X(s) = x \text{ for some } s \in S\}$ is countable. The probability density/mass function (p.d.f.) of $X$ is defined as $f(x) = P(X = x)$, $x \in R$.
Distribution function $F$: $F(x) = P(X \le x)$, $x \in R$.
A r.v. is called a continuous r.v. if there exists $f(x) \ge 0$ such that $F(x) = \int_{-\infty}^x f(t)\,dt$, $x \in R$, where $f$ is the p.d.f. of the continuous r.v. $X$.

Let $X$ be a r.v. with p.d.f. $f(x)$ and let $g : R \to R$. Q: What is the p.d.f. of $g(X)$? And is $g(X)$ a r.v.? (Yes.)
Answer:
(a) Distribution-function method: Suppose that $X$ is a continuous r.v. and let $Y = g(X)$. The d.f. (distribution function) of $Y$ is $G(y) = P(Y \le y) = P(g(X) \le y)$. If $G$ is differentiable, then the p.d.f. of $Y = g(X)$ is $g(y) = G'(y)$.
(b) m.g.f. method (moment generating function): $E[e^{tX}] = \sum_x e^{tx} f(x)$ (discrete) or $\int e^{tx} f(x)\,dx$ (continuous).
Thm. The m.g.f. $M_X(t)$ and the distribution (p.d.f. or d.f.) are in one-to-one correspondence.
ex: $M_Y(t) = e^{t^2/2} = M_{N(0,1)}(t) \Rightarrow Y \sim N(0,1)$.
Let $X_1, \dots, X_n$ be random variables. If they are discrete, the joint p.d.f. of $X_1, \dots, X_n$ is $f(x_1, \dots, x_n) = P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)$. If $X_1, \dots, X_n$ are continuous r.v.'s, there exists $f$ such that $F(x_1, \dots, x_n) = \int_{-\infty}^{x_n} \cdots \int_{-\infty}^{x_1} f(t_1, \dots, t_n)\,dt_1 \cdots dt_n$ for $(x_1, \dots, x_n) \in R^n$. We call $f$ the joint p.d.f. of $X_1, \dots, X_n$.
If $X$ is continuous, then $F(x) = \int_{-\infty}^x f(t)\,dt$ and $P(X = x) = \int_x^x f(t)\,dt = 0$, $x \in R$.

Marginal p.d.f.'s:
Discrete: $f_{X_i}(x) = P(X_i = x) = \sum_{x_1} \cdots \sum_{x_{i-1}} \sum_{x_{i+1}} \cdots \sum_{x_n} f(x_1, \dots, x_{i-1}, x, x_{i+1}, \dots, x_n)$.
Continuous: $f_{X_i}(x) = \int \cdots \int f(x_1, \dots, x_{i-1}, x, x_{i+1}, \dots, x_n)\,dx_1 \cdots dx_{i-1}\,dx_{i+1} \cdots dx_n$.
Events $A$ and $B$ are independent if $P(A \cap B) = P(A)P(B)$.
Q: If $A \cap B = \emptyset$, are $A$ and $B$ independent? A: In general, they are not.
Let $X$ and $Y$ be r.v.'s with joint p.d.f. $f(x, y)$ and marginal p.d.f.'s $f_X(x)$ and $f_Y(y)$. We say that $X$ and $Y$ are independent if $f(x, y) = f_X(x) f_Y(y)$ for all $(x, y) \in R^2$.
Random variables $X$ and $Y$ are identically distributed (i.d.) if the marginal p.d.f.'s $f$ and $g$ satisfy $f = g$, or the d.f.'s $F$ and $G$ satisfy $F = G$.
We say that $X$ and $Y$ are iid random variables if they are independent and identically distributed.
Transformation of r.v.'s (discrete case):
Univariate: $Y = g(X)$; the p.d.f. of $Y$ is $g(y) = P(Y = y) = P(g(X) = y) = P(\{x \in \text{Range of } X : g(x) = y\}) = \sum_{\{x : g(x) = y\}} f(x)$.
For random variables $X_1, \dots, X_n$ with joint p.d.f. $f(x_1, \dots, x_n)$, define transformations $Y_1 = g_1(X_1, \dots, X_n), \dots, Y_m = g_m(X_1, \dots, X_n)$. The joint p.d.f. of $Y_1, \dots, Y_m$ is

$g(y_1, \dots, y_m) = P(Y_1 = y_1, \dots, Y_m = y_m) = P(\{(x_1, \dots, x_n) : g_1(x_1, \dots, x_n) = y_1, \dots, g_m(x_1, \dots, x_n) = y_m\}) = \sum_{\{(x_1, \dots, x_n) : g_1 = y_1, \dots, g_m = y_m\}} f(x_1, \dots, x_n)$.
Example: $X_1, X_2, X_3$ have a joint p.d.f. $f(x_1, x_2, x_3)$ assigning probabilities (values of $1/8$ and $3/8$) to the six points $(0,0,0), (0,0,1), (0,1,1), (1,0,1), (1,1,0), (1,1,1)$. Let $Y_1 = X_1 + X_2 + X_3$ and $Y_2 = X_3 - X_2$. The space of $(Y_1, Y_2)$ is the image of these six points, $\{(0,0), (1,1), (2,0), (2,1), (2,-1), (3,0)\}$, and the joint p.d.f. $g(y_1, y_2)$ of $Y_1$ and $Y_2$ is obtained by summing $f(x_1, x_2, x_3)$ over the sample points mapped to each $(y_1, y_2)$.
Continuous one-to-one transformations: Let $X$ be a continuous r.v. with p.d.f. $f(x)$ and range $A = X(S)$. Consider $Y = g(X)$, a differentiable function. We want the p.d.f. of $Y$.
Thm. If $g$ is a 1-1 transformation, then the p.d.f. of $Y$ is $f_Y(y) = f_X(g^{-1}(y))\left| \frac{dg^{-1}(y)}{dy} \right|$ for $y \in g(A)$, and $0$ otherwise.
Proof. The d.f. of $Y$ is $F_Y(y) = P(Y \le y) = P(g(X) \le y)$.
(a) If $g$ is increasing, $g^{-1}$ is also increasing ($\frac{dg^{-1}}{dy} > 0$): $F_Y(y) = P(X \le g^{-1}(y)) = \int_{-\infty}^{g^{-1}(y)} f_X(x)\,dx$.

The p.d.f. of $Y$ is $f_Y(y) = D_y \int_{-\infty}^{g^{-1}(y)} f_X(x)\,dx = f_X(g^{-1}(y))\frac{dg^{-1}(y)}{dy} = f_X(g^{-1}(y))\left| \frac{dg^{-1}(y)}{dy} \right|$.
(b) If $g$ is decreasing, $g^{-1}$ is also decreasing ($\frac{dg^{-1}}{dy} < 0$): $F_Y(y) = P(X \ge g^{-1}(y)) = \int_{g^{-1}(y)}^{\infty} f_X(x)\,dx = 1 - \int_{-\infty}^{g^{-1}(y)} f_X(x)\,dx$.
The p.d.f. of $Y$ is $f_Y(y) = D_y\left( 1 - \int_{-\infty}^{g^{-1}(y)} f_X(x)\,dx \right) = -f_X(g^{-1}(y))\frac{dg^{-1}(y)}{dy} = f_X(g^{-1}(y))\left| \frac{dg^{-1}(y)}{dy} \right|$.
Example: $X \sim U(0, 1)$, $Y = -2\ln(X) = g(X)$.
sol: The p.d.f. of $X$ is $f_X(x) = 1$ if $0 < x < 1$, and $0$ elsewhere. $A = (0, 1)$, $g(A) = (0, \infty)$. With $x = e^{-y/2} = g^{-1}(y)$ and $\frac{dx}{dy} = -\frac{1}{2}e^{-y/2}$, the p.d.f. of $Y$ is $f_Y(y) = f_X(g^{-1}(y))\left| \frac{dx}{dy} \right| = \frac{1}{2}e^{-y/2}$, $y > 0$.
($X \sim U(a, b)$ if $f_X(x) = \frac{1}{b-a}$ for $a < x < b$, and $0$ elsewhere.)

$Y \sim \chi^2(2)$. ($X \sim \chi^2(r)$ if $f_X(x) = \frac{1}{\Gamma(r/2)\,2^{r/2}}\, x^{r/2 - 1} e^{-x/2}$, $x > 0$.)
Continuous $n$-r.v.-to-$m$-r.v. case, $n > m$:
$Y_1 = g_1(X_1, \dots, X_n), \dots, Y_m = g_m(X_1, \dots, X_n)$, i.e. $(g_1, \dots, g_m) : R^n \to R^m$.
Q: What are the marginal p.d.f.'s of $Y_1, \dots, Y_m$?
A: We need to define $Y_{m+1} = g_{m+1}(X_1, \dots, X_n), \dots, Y_n = g_n(X_1, \dots, X_n)$ such that $(g_1, \dots, g_n)$ is 1-1 from $R^n$ to $R^n$.
Theory for change of variables: $P((X_1, \dots, X_n) \in A) = \int \cdots \int_A f_{X_1, \dots, X_n}(x_1, \dots, x_n)\,dx_1 \cdots dx_n$.
Let $y_1 = g_1(x_1, \dots, x_n), \dots, y_n = g_n(x_1, \dots, x_n)$ be a 1-1 function with inverse $x_1 = w_1(y_1, \dots, y_n), \dots, x_n = w_n(y_1, \dots, y_n)$ and Jacobian
$J = \det \begin{pmatrix} \partial x_1/\partial y_1 & \cdots & \partial x_1/\partial y_n \\ \vdots & & \vdots \\ \partial x_n/\partial y_1 & \cdots & \partial x_n/\partial y_n \end{pmatrix}.$
Then $f_{X_1, \dots, X_n}(x_1, \dots, x_n)\,dx_1 \cdots dx_n = f_{X_1, \dots, X_n}(w_1(y_1, \dots, y_n), \dots, w_n(y_1, \dots, y_n))\,|J|\,dy_1 \cdots dy_n$.
Hence the joint p.d.f. of $Y_1, \dots, Y_n$ is $f_{Y_1, \dots, Y_n}(y_1, \dots, y_n) = f_{X_1, \dots, X_n}(w_1, \dots, w_n)\,|J|$.

Thm. Suppose that $X_1$ and $X_2$ are two r.v.'s with continuous joint p.d.f. $f_{X_1, X_2}$ and sample space $A$. If $Y_1 = g_1(X_1, X_2)$, $Y_2 = g_2(X_1, X_2)$ forms a 1-1 transformation with inverse functions $X_1 = w_1(Y_1, Y_2)$, $X_2 = w_2(Y_1, Y_2)$ and Jacobian $J = \det \begin{pmatrix} \partial x_1/\partial y_1 & \partial x_1/\partial y_2 \\ \partial x_2/\partial y_1 & \partial x_2/\partial y_2 \end{pmatrix}$, then the joint p.d.f. of $Y_1, Y_2$ is $f_{Y_1, Y_2}(y_1, y_2) = f_{X_1, X_2}(w_1(y_1, y_2), w_2(y_1, y_2))\,|J|$, $(y_1, y_2) \in B = g(A)$.
Steps: (a) joint p.d.f. of $X_1, X_2$ and space $A$; (b) check that it is a 1-1 transformation and find the inverse functions $X_1 = w_1(Y_1, Y_2)$, $X_2 = w_2(Y_1, Y_2)$; (c) range of $(Y_1, Y_2)$, i.e. $B = g(A)$; (d) the Jacobian.
Example: For $X_1, X_2$ iid $U(0, 1)$, let $Y_1 = X_1 + X_2$, $Y_2 = X_1 - X_2$. We want the marginal p.d.f.'s of $Y_1, Y_2$.
Sol: The joint p.d.f. of $X_1, X_2$ is $f_{X_1, X_2}(x_1, x_2) = 1$ if $0 < x_1 < 1$, $0 < x_2 < 1$, and $0$ elsewhere; $A = \{(x_1, x_2) : 0 < x_1 < 1, 0 < x_2 < 1\}$.
Given $y_1, y_2$, solve $y_1 = x_1 + x_2$, $y_2 = x_1 - x_2$: $x_1 = \frac{y_1 + y_2}{2} = w_1(y_1, y_2)$, $x_2 = \frac{y_1 - y_2}{2} = w_2(y_1, y_2)$ (a 1-1 transformation).
The Jacobian is $J = \det \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix} = -\frac{1}{4} - \frac{1}{4} = -\frac{1}{2}$.
The joint p.d.f. of $Y_1, Y_2$ is $f_{Y_1, Y_2}(y_1, y_2) = f_{X_1, X_2}(w_1, w_2)\,|J| = \frac{1}{2}$, $(y_1, y_2) \in B = g(A)$.
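Before computing the marginals, here is a quick Monte Carlo check of this change-of-variables result. It is only an illustrative sketch in Python/NumPy (the sample size and the box around $(1, 0)$ are arbitrary choices, not from the notes): simulating $(X_1, X_2)$ and transforming should reproduce the constant joint density $1/2$ on $B$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)
y1, y2 = x1 + x2, x1 - x2          # Y1 = X1 + X2, Y2 = X1 - X2

# The joint density of (Y1, Y2) should be the constant 1/2 on B = g(A).
# Estimate it on a small box centered at (1, 0), which lies inside B.
in_box = (np.abs(y1 - 1.0) < 0.1) & (np.abs(y2) < 0.1)
area = 0.2 * 0.2
print("estimated joint density near (1, 0):", in_box.mean() / area)  # approx 0.5
```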

The marginal p.d.f.'s of $Y_1, Y_2$ are
$f_{Y_1}(y_1) = \int_{-y_1}^{y_1} \frac{1}{2}\,dy_2 = y_1$ for $0 < y_1 \le 1$, $\int_{y_1 - 2}^{2 - y_1} \frac{1}{2}\,dy_2 = 2 - y_1$ for $1 < y_1 < 2$, and $0$ elsewhere;
$f_{Y_2}(y_2) = \int_{-y_2}^{2 + y_2} \frac{1}{2}\,dy_1 = y_2 + 1$ for $-1 < y_2 \le 0$, $\int_{y_2}^{2 - y_2} \frac{1}{2}\,dy_1 = 1 - y_2$ for $0 < y_2 < 1$, and $0$ elsewhere.
Def. If a sequence of r.v.'s $X_1, \dots, X_n$ are independent and identically distributed (i.i.d.), then they are called a random sample. If $X_1, \dots, X_n$ is a random sample from a distribution with p.d.f. $f_0$, then the joint p.d.f. of $X_1, \dots, X_n$ is $f(x_1, \dots, x_n) = \prod_{i=1}^n f_0(x_i)$, $(x_1, \dots, x_n) \in R^n$.
Def. Any function $g(X_1, \dots, X_n)$ of a random sample $X_1, \dots, X_n$ which does not depend on a parameter $\theta$ is called a statistic.
Note: If $X$ is a random sample with p.d.f. $f(x, \theta)$, where $\theta$ is an unknown constant, then $\theta$ is called a parameter. For example, $N(\mu, \sigma^2)$: $\mu, \sigma^2$ are parameters; Poisson($\lambda$): $\lambda$ is a parameter.
Example of statistics: for $X_1, \dots, X_n$ iid r.v.'s, $\bar X$ and $S^2$ are statistics.
Note: If $X_1, \dots, X_n$ are r.v.'s, the m.g.f. of $X_1, \dots, X_n$ is $M_{X_1, \dots, X_n}(t_1, \dots, t_n) = E(e^{t_1 X_1 + \cdots + t_n X_n})$.
m.g.f.: $M_X(t) = E(e^{tX}) = \int e^{tx} f(x)\,dx$, and $D_t M_X(t) = D_t \int e^{tx} f(x)\,dx = \int D_t\, e^{tx} f(x)\,dx$.

Lemma. $X_1$ and $X_2$ are independent if and only if $M_{X_1, X_2}(t_1, t_2) = M_{X_1}(t_1) M_{X_2}(t_2)$ for all $t_1, t_2$.
Proof. ($\Rightarrow$) If $X_1, X_2$ are independent, $M_{X_1, X_2}(t_1, t_2) = E(e^{t_1 X_1 + t_2 X_2}) = \int\!\int e^{t_1 x_1 + t_2 x_2} f(x_1, x_2)\,dx_1\,dx_2 = \int e^{t_1 x_1} f_{X_1}(x_1)\,dx_1 \int e^{t_2 x_2} f_{X_2}(x_2)\,dx_2 = E(e^{t_1 X_1}) E(e^{t_2 X_2}) = M_{X_1}(t_1) M_{X_2}(t_2)$.
($\Leftarrow$) $M_{X_1, X_2}(t_1, t_2) = E(e^{t_1 X_1 + t_2 X_2}) = \int\!\int e^{t_1 x_1 + t_2 x_2} f(x_1, x_2)\,dx_1\,dx_2$, while $M_{X_1}(t_1) M_{X_2}(t_2) = E(e^{t_1 X_1}) E(e^{t_2 X_2}) = \int\!\int e^{t_1 x_1 + t_2 x_2} f_{X_1}(x_1) f_{X_2}(x_2)\,dx_1\,dx_2$. With the 1-1 correspondence between m.g.f. and p.d.f., $f(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2)$ for all $x_1, x_2$, so $X_1$ and $X_2$ are independent.
If $X$ and $Y$ are independent, we denote this by $X \perp Y$.
$X \sim N(\mu, \sigma^2)$: $M_X(t) = e^{\mu t + \sigma^2 t^2/2}$, $t \in R$.
$X \sim \text{Gamma}(\alpha, \beta)$: $M_X(t) = (1 - \beta t)^{-\alpha}$, $t < 1/\beta$.
$X \sim b(n, p)$: $M_X(t) = (1 - p + p e^t)^n$, $t \in R$.
$X \sim \text{Poisson}(\lambda)$: $M_X(t) = e^{\lambda(e^t - 1)}$, $t \in R$.
Note:

(a) If $(X_1, \dots, X_n)$ and $(Y_1, \dots, Y_m)$ are independent, then $g(X_1, \dots, X_n)$ and $h(Y_1, \dots, Y_m)$ are also independent.
(b) If $X, Y$ are independent, then $E[g(X) h(Y)] = E[g(X)] E[h(Y)]$.
Thm. If $(X_1, \dots, X_n)$ is a random sample from $N(\mu, \sigma^2)$, then
(a) $\bar X \sim N(\mu, \sigma^2/n)$;
(b) $\bar X$ and $S^2$ are independent;
(c) $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$.
Proof. (a) The m.g.f. of $\bar X$ is $M_{\bar X}(t) = E(e^{t\bar X}) = E(e^{\frac{t}{n}\sum_i X_i}) = E(e^{\frac{t}{n}X_1}) E(e^{\frac{t}{n}X_2}) \cdots E(e^{\frac{t}{n}X_n}) = \left( M_X(\tfrac{t}{n}) \right)^n = \left( e^{\mu\frac{t}{n} + \frac{\sigma^2}{2}(\frac{t}{n})^2} \right)^n = e^{\mu t + \frac{\sigma^2/n}{2}t^2}$, so $\bar X \sim N(\mu, \sigma^2/n)$.
(b) First we want to show that $\bar X$ and $(X_1 - \bar X, X_2 - \bar X, \dots, X_n - \bar X)$ are independent.

The joint m.g.f. of $\bar X$ and $(X_1 - \bar X, \dots, X_n - \bar X)$ is
$M_{\bar X, X_1 - \bar X, \dots, X_n - \bar X}(t, t_1, \dots, t_n) = E[e^{t\bar X + t_1(X_1 - \bar X) + \cdots + t_n(X_n - \bar X)}] = E[e^{\sum_i ((t_i - \bar t) + \frac{t}{n}) X_i}]$, where $\bar t = \frac{1}{n}\sum_i t_i$,
$= \prod_i E[e^{((t_i - \bar t) + \frac{t}{n}) X_i}] = \prod_i e^{\mu((t_i - \bar t) + \frac{t}{n}) + \frac{\sigma^2}{2}((t_i - \bar t) + \frac{t}{n})^2} = e^{\mu t + \frac{\sigma^2/n}{2}t^2}\, e^{\frac{\sigma^2}{2}\sum_i (t_i - \bar t)^2}$
(the cross terms vanish because $\sum_i (t_i - \bar t) = 0$)
$= M_{\bar X}(t)\, M_{(X_1 - \bar X, \dots, X_n - \bar X)}(t_1, \dots, t_n)$.
Hence $\bar X$ and $(X_1 - \bar X, \dots, X_n - \bar X)$ are independent, and therefore $\bar X$ and $S^2 = \frac{1}{n-1}\sum_i (X_i - \bar X)^2$ are independent.
(c) (1) $Z \sim N(0, 1) \Rightarrow Z^2 \sim \chi^2(1)$.
(2) If $X \sim \chi^2(r_1)$ and $Y \sim \chi^2(r_2)$ are independent, then $X + Y \sim \chi^2(r_1 + r_2)$.
Proof: the m.g.f. of $X + Y$ is $M_{X+Y}(t) = E(e^{t(X+Y)}) = E(e^{tX}) E(e^{tY}) = M_X(t) M_Y(t) = (1-2t)^{-r_1/2}(1-2t)^{-r_2/2} = (1-2t)^{-(r_1+r_2)/2}$, so $X + Y \sim \chi^2(r_1 + r_2)$.
(3) For $(X_1, \dots, X_n)$ iid $N(\mu, \sigma^2)$, $\frac{X_1 - \mu}{\sigma}, \dots, \frac{X_n - \mu}{\sigma}$ are iid $N(0, 1)$, so

$\left(\frac{X_1 - \mu}{\sigma}\right)^2, \dots, \left(\frac{X_n - \mu}{\sigma}\right)^2$ are iid $\chi^2(1)$, and hence $\sum_{i=1}^n \frac{(X_i - \mu)^2}{\sigma^2} \sim \chi^2(n)$.
Writing $\sum_i \frac{(X_i - \mu)^2}{\sigma^2} = \frac{\sum_i (X_i - \bar X)^2 + n(\bar X - \mu)^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar X - \mu)^2}{\sigma^2/n}$ and using the independence of $\bar X$ and $S^2$:
$(1-2t)^{-n/2} = M_{\sum_i (X_i - \mu)^2/\sigma^2}(t) = E\!\left( e^{t\left[ \frac{(n-1)S^2}{\sigma^2} + \frac{(\bar X - \mu)^2}{\sigma^2/n} \right]} \right) = E\!\left( e^{t\frac{(n-1)S^2}{\sigma^2}} \right) E\!\left( e^{t\frac{(\bar X - \mu)^2}{\sigma^2/n}} \right) = M_{\frac{(n-1)S^2}{\sigma^2}}(t)\,(1-2t)^{-1/2}$,
so $M_{\frac{(n-1)S^2}{\sigma^2}}(t) = (1-2t)^{-(n-1)/2}$ and $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$.
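A short simulation sketch of this theorem (Python/NumPy/SciPy; the values $\mu = 3$, $\sigma = 2$, $n = 5$ and the replication count are arbitrary choices): it checks that $\bar X$ behaves like $N(\mu, \sigma^2/n)$, that $(n-1)S^2/\sigma^2$ matches the $\chi^2(n-1)$ distribution, and that $\bar X$ and $S^2$ are empirically uncorrelated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, reps = 3.0, 2.0, 5, 100_000
x = rng.normal(mu, sigma, size=(reps, n))

xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)              # sample variance S^2 (divisor n-1)
q = (n - 1) * s2 / sigma**2             # should follow chi-square(n-1)

print("mean, var of xbar:", xbar.mean(), xbar.var(), "| theory:", mu, sigma**2 / n)
print("KS p-value vs chi2(n-1):", stats.kstest(q, "chi2", args=(n - 1,)).pvalue)
print("corr(xbar, S^2):", np.corrcoef(xbar, s2)[0, 1])  # near 0, reflecting independence
```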

Chapter 3. Statistical Inference

Point Estimation

The problem in statistics: a random variable $X$ has a p.d.f. of the form $f(x, \theta)$ where the function $f$ is known but the parameter $\theta$ is unknown. We want to gain knowledge about $\theta$.
What we have for inference: a random sample $X_1, \dots, X_n$ from $f(x, \theta)$.
Statistical inferences:
Point estimation: $\hat\theta = \hat\theta(X_1, \dots, X_n)$.
Interval estimation: find statistics $T_1 = t_1(X_1, \dots, X_n)$, $T_2 = t_2(X_1, \dots, X_n)$ such that $1 - \alpha = P(T_1 \le \theta \le T_2)$.
Hypothesis testing: $H_0 : \theta = \theta_0$ or $H_0 : \theta \ge \theta_0$ (or $\le \theta_0$). We want to find a rule to decide whether we accept or reject $H_0$.
Def. We call a statistic $\hat\theta = \hat\theta(X_1, \dots, X_n)$ an estimator of the parameter $\theta$ if it is used to estimate $\theta$. If $X_1 = x_1, \dots, X_n = x_n$ are observed, then $\hat\theta = \hat\theta(x_1, \dots, x_n)$ is called an estimate of $\theta$.
Two problems are of concern in the estimation of $\theta$:
(a) How can we evaluate an estimator $\hat\theta$ for its use in estimating $\theta$? We need a criterion for this evaluation.
(b) Are there general rules for deriving estimators? We will introduce two methods for deriving estimators of $\theta$.
Def. We call an estimator $\hat\theta$ unbiased for $\theta$ if it satisfies $E_\theta(\hat\theta(X_1, \dots, X_n)) = \theta$ for all $\theta$, where $E_\theta(\hat\theta(X_1, \dots, X_n)) = \int \cdots \int \hat\theta(x_1, \dots, x_n) f(x_1, \dots, x_n, \theta)\,dx_1 \cdots dx_n = \int \theta'\, f_{\hat\theta}(\theta')\,d\theta'$, with $f_{\hat\theta}(\theta')$ the p.d.f. of the r.v. $\hat\theta = \hat\theta(X_1, \dots, X_n)$.
Def. If $E_\theta(\hat\theta(X_1, \dots, X_n)) \ne \theta$ for some $\theta$, we say that $\hat\theta$ is a biased estimator.

Example: $X_1, \dots, X_n$ iid $N(\mu, \sigma^2)$; suppose our interest is $\mu$.
$X_1$: $E_\mu(X_1) = \mu$, so $X_1$ is unbiased for $\mu$.
$\frac{1}{2}(X_1 + X_2)$: $E(\frac{X_1 + X_2}{2}) = \mu$, unbiased for $\mu$.
$\bar X$: $E_\mu(\bar X) = \mu$, unbiased for $\mu$.
For a sequence of numbers, $a_n \to a$ if, for every $\epsilon > 0$, there exists $N > 0$ such that $|a_n - a| < \epsilon$ whenever $n \ge N$. If $\{X_n\}$ is a sequence of r.v.'s, how can we define $X_n \to X$ as $n \to \infty$?
Def. We say that $X_n$ converges to $X$, a r.v. or a constant, in probability if for every $\epsilon > 0$, $P(|X_n - X| > \epsilon) \to 0$ as $n \to \infty$. In this case we write $X_n \xrightarrow{P} X$.
Thm. If $E(X_n) = a$ or $E(X_n) \to a$, and $\mathrm{Var}(X_n) \to 0$, then $X_n \xrightarrow{P} a$.
Proof. $E[(X_n - a)^2] = E[(X_n - E(X_n) + E(X_n) - a)^2] = E[(X_n - E(X_n))^2] + (E(X_n) - a)^2 + 2E[(X_n - E(X_n))(E(X_n) - a)] = \mathrm{Var}(X_n) + (E(X_n) - a)^2$.
Chebyshev's inequality: for $\epsilon > 0$, $P(|X - \mu| \ge \epsilon) \le \frac{E(X - \mu)^2}{\epsilon^2}$, or $P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$.
Then $P(|X_n - a| > \epsilon) = P((X_n - a)^2 > \epsilon^2) \le \frac{E(X_n - a)^2}{\epsilon^2} = \frac{\mathrm{Var}(X_n) + (E(X_n) - a)^2}{\epsilon^2} \to 0$ as $n \to \infty$, so $P(|X_n - a| > \epsilon) \to 0$, i.e. $X_n \xrightarrow{P} a$.
Thm (Weak Law of Large Numbers, WLLN). If $X_1, \dots, X_n$ is a random sample with mean $\mu$ and finite variance $\sigma^2$, then $\bar X \xrightarrow{P} \mu$.

Proof. $E(\bar X) = \mu$ and $\mathrm{Var}(\bar X) = \frac{\sigma^2}{n} \to 0$ as $n \to \infty$, so $\bar X \xrightarrow{P} \mu$.
Def. We say that $\hat\theta$ is a consistent estimator of $\theta$ if $\hat\theta \xrightarrow{P} \theta$.
Example: $X_1, \dots, X_n$ is a random sample with mean $\mu$ and finite variance $\sigma^2$. Is $X_1$ a consistent estimator of $\mu$?
$E(X_1) = \mu$, so $X_1$ is unbiased for $\mu$. Let $\epsilon > 0$. Then $P(|X_1 - \mu| > \epsilon) = 1 - P(|X_1 - \mu| \le \epsilon) = 1 - P(\mu - \epsilon \le X_1 \le \mu + \epsilon) = 1 - \int_{\mu - \epsilon}^{\mu + \epsilon} f_X(x)\,dx > 0$, which does not depend on $n$ and hence does not tend to $0$ as $n \to \infty$: $X_1$ is not a consistent estimator of $\mu$.
On the other hand, $E(\bar X) = \mu$ and $\mathrm{Var}(\bar X) = \frac{\sigma^2}{n} \to 0$ as $n \to \infty$, so $\bar X \xrightarrow{P} \mu$: $\bar X$ is a consistent estimator of $\mu$.
Unbiasedness and consistency are two basic conditions for a good estimator.
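The contrast between the inconsistent estimator $X_1$ and the consistent $\bar X$ is easy to see numerically. A minimal sketch (Python/NumPy; the $N(\mu = 1, \sigma = 2)$ model, $\epsilon = 0.5$ and the grid of sample sizes are arbitrary choices) estimates $P(|\text{estimator} - \mu| > \epsilon)$ as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, eps, reps = 1.0, 2.0, 0.5, 20_000
for n in (10, 100, 1000):
    x = rng.normal(mu, sigma, size=(reps, n))
    p_first = np.mean(np.abs(x[:, 0] - mu) > eps)         # X_1: probability does not shrink
    p_mean = np.mean(np.abs(x.mean(axis=1) - mu) > eps)    # X-bar: tends to 0 (WLLN)
    print(f"n={n:5d}  P(|X1-mu|>eps)={p_first:.3f}  P(|Xbar-mu|>eps)={p_mean:.3f}")
```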

Moments: Let $X$ be a random variable having p.d.f. $f(x, \theta)$. The population $k$-th moment is defined by $E_\theta(X^k) = \sum_{\text{all } x} x^k f(x, \theta)$ (discrete) or $\int x^k f(x, \theta)\,dx$ (continuous). The sample $k$-th moment is defined by $\frac{1}{n}\sum_{i=1}^n X_i^k$.
Note: $E(\frac{1}{n}\sum_i X_i^k) = \frac{1}{n}\sum_i E(X_i^k) = E_\theta(X^k)$, so the sample $k$-th moment is unbiased for the population $k$-th moment. Moreover, $\mathrm{Var}(\frac{1}{n}\sum_i X_i^k) = \frac{1}{n^2}\sum_i \mathrm{Var}(X_i^k) = \frac{\mathrm{Var}(X^k)}{n} \to 0$ as $n \to \infty$, so $\frac{1}{n}\sum_i X_i^k \xrightarrow{P} E_\theta(X^k)$: the sample $k$-th moment is a consistent estimator of $E_\theta(X^k)$.
Let $X_1, \dots, X_n$ be a random sample with mean $\mu$ and variance $\sigma^2$. The sample variance is defined by $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$. We want to show that $S^2$ is unbiased for $\sigma^2$.
Since $\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] = E(X^2) - \mu^2$, we have $E(X^2) = \mathrm{Var}(X) + \mu^2 = \mathrm{Var}(X) + (E(X))^2$; also $E(\bar X) = \mu$ and $\mathrm{Var}(\bar X) = \frac{\sigma^2}{n}$.
$E(S^2) = E\!\left( \frac{\sum_i (X_i - \bar X)^2}{n-1} \right) = \frac{1}{n-1} E\!\left( \sum_i X_i^2 - n\bar X^2 \right) = \frac{1}{n-1}\left[ \sum_i E(X_i^2) - nE(\bar X^2) \right] = \frac{1}{n-1}\left[ n(\sigma^2 + \mu^2) - n\left( \frac{\sigma^2}{n} + \mu^2 \right) \right] = \frac{(n-1)\sigma^2}{n-1} = \sigma^2$,
so $S^2 = \frac{1}{n-1}\sum_i (X_i - \bar X)^2$ is unbiased for $\sigma^2$.
Consistency: $S^2 = \frac{n}{n-1}\left[ \frac{1}{n}\sum_i X_i^2 - \bar X^2 \right] \xrightarrow{P} E(X^2) - \mu^2 = \sigma^2 + \mu^2 - \mu^2 = \sigma^2$, since $X_1, \dots, X_n$ iid with mean $\mu$ and variance $\sigma^2$ implies that $X_1^2, \dots, X_n^2$ are iid r.v.'s with mean $E(X^2) = \mu^2 + \sigma^2$, and by the WLLN $\frac{1}{n}\sum_i X_i^2 \xrightarrow{P} E(X^2) = \mu^2 + \sigma^2$.
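A small simulation sketch of the unbiasedness claim (Python/NumPy; the $N(\mu = 2, \sigma^2 = 9)$ model, $n = 10$ and the replication count are arbitrary choices): the average of $S^2$ over many samples should be close to $\sigma^2$, while the divisor-$n$ variant $\frac{1}{n}\sum_i (X_i - \bar X)^2$ averages to about $\frac{n-1}{n}\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma2, n, reps = 2.0, 9.0, 10, 200_000
x = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))

s2 = x.var(axis=1, ddof=1)           # S^2 with divisor n-1
v_n = x.var(axis=1, ddof=0)          # divisor-n version (1/n) sum (X_i - Xbar)^2

print("E(S^2) ~", s2.mean(), "  (sigma^2 =", sigma2, ")")
print("E(v_n) ~", v_n.mean(), "  ((n-1)/n * sigma^2 =", (n - 1) / n * sigma2, ")")
```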

Def. Let $X_1, \dots, X_n$ be a random sample from a distribution with p.d.f. $f(x, \theta)$.
(a) If $\theta$ is univariate, the method of moments estimator $\hat\theta$ solves $\bar X = E_\theta(X)$ for $\theta$.
(b) If $\theta = (\theta_1, \theta_2)$ is bivariate, the method of moments estimator $(\hat\theta_1, \hat\theta_2)$ solves $\bar X = E_{\theta_1, \theta_2}(X)$ and $\frac{1}{n}\sum_i X_i^2 = E_{\theta_1, \theta_2}(X^2)$ for $(\theta_1, \theta_2)$.
(c) If $\theta = (\theta_1, \dots, \theta_k)$ is $k$-variate, the method of moments estimator $(\hat\theta_1, \dots, \hat\theta_k)$ solves $\frac{1}{n}\sum_i X_i^j = E_{\theta_1, \dots, \theta_k}(X^j)$, $j = 1, \dots, k$, for $\theta_1, \dots, \theta_k$.
Example:
(a) $X_1, \dots, X_n$ iid Bernoulli($p$). Setting $\bar X = E_p(X) = p$, the method of moments estimator of $p$ is $\hat p = \bar X$. By the WLLN, $\hat p = \bar X \xrightarrow{P} E_p(X) = p$, so $\hat p$ is consistent for $p$. Also $E(\hat p) = E(\bar X) = E(X) = p$, so $\hat p$ is unbiased for $p$.
(b) Let $X_1, \dots, X_n$ be a random sample from Poisson($\lambda$). Setting $\bar X = E_\lambda(X) = \lambda$, the method of moments estimator of $\lambda$ is $\hat\lambda = \bar X$. $E(\hat\lambda) = E(\bar X) = \lambda$, so $\hat\lambda$ is unbiased for $\lambda$; $\hat\lambda = \bar X \xrightarrow{P} E(X) = \lambda$, so $\hat\lambda$ is consistent for $\lambda$.
(c) Let $X_1, \dots, X_n$ be a random sample with mean $\mu$ and variance $\sigma^2$; $\theta = (\mu, \sigma^2)$. Setting $\bar X = E_{\mu, \sigma^2}(X) = \mu$ and $\frac{1}{n}\sum_i X_i^2 = E_{\mu, \sigma^2}(X^2) = \sigma^2 + \mu^2$, the method of moments estimators are $\hat\mu = \bar X$ and $\hat\sigma^2 = \frac{1}{n}\sum_i X_i^2 - \bar X^2 = \frac{1}{n}\sum_i (X_i - \bar X)^2$.

$\bar X$ is an unbiased and consistent estimator for $\mu$. However, $E(\hat\sigma^2) = E(\frac{1}{n}\sum_i (X_i - \bar X)^2) = \frac{n-1}{n} E(\frac{1}{n-1}\sum_i (X_i - \bar X)^2) = \frac{n-1}{n}\sigma^2 \ne \sigma^2$, so $\hat\sigma^2$ is not unbiased for $\sigma^2$. Still, $\hat\sigma^2 = \frac{1}{n}\sum_i X_i^2 - \bar X^2 \xrightarrow{P} E(X^2) - \mu^2 = \sigma^2$, so $\hat\sigma^2$ is consistent for $\sigma^2$.
Maximum Likelihood Estimator: Let $X_1, \dots, X_n$ be a random sample with p.d.f. $f(x, \theta)$. The joint p.d.f. of $X_1, \dots, X_n$ is $f(x_1, \dots, x_n, \theta) = \prod_{i=1}^n f(x_i, \theta)$, $x_i \in R$, $i = 1, \dots, n$. Let $\Theta$ be the space of possible values of $\theta$; we call $\Theta$ the parameter space.
Def. The likelihood function of a random sample is its joint p.d.f. viewed as a function of $\theta$: $L(\theta) = L(\theta, x_1, \dots, x_n) = f(x_1, \dots, x_n, \theta)$, $\theta \in \Theta$. For $(x_1, \dots, x_n)$ fixed, the value $L(\theta, x_1, \dots, x_n)$ is called the likelihood at $\theta$.
Given observations $x_1, \dots, x_n$, the likelihood $L(\theta, x_1, \dots, x_n)$ is interpreted as the probability that $X_1 = x_1, \dots, X_n = x_n$ occurs when $\theta$ is true.
Def. Let $\hat\theta = \hat\theta(x_1, \dots, x_n)$ be any value of $\theta$ that maximizes $L(\theta, x_1, \dots, x_n)$. Then we call $\hat\theta = \hat\theta(X_1, \dots, X_n)$ the maximum likelihood estimator (m.l.e.) of $\theta$. When $X_1 = x_1, \dots, X_n = x_n$ is observed, we call $\hat\theta = \hat\theta(x_1, \dots, x_n)$ the maximum likelihood estimate of $\theta$.
Note: (a) Why the m.l.e.? When $L(\theta_1, x_1, \dots, x_n) \ge L(\theta_2, x_1, \dots, x_n)$, we are more confident in believing $\theta = \theta_1$ than in believing $\theta = \theta_2$.

(b) How to derive the m.l.e.? Since $\frac{d}{dx}\ln x = \frac{1}{x} > 0$, $\ln x$ is increasing in $x$; hence if $L(\theta_1) \ge L(\theta_2)$ then $\ln L(\theta_1) \ge \ln L(\theta_2)$. If $\hat\theta$ is the m.l.e., then $L(\hat\theta, x_1, \dots, x_n) = \max_{\theta \in \Theta} L(\theta, x_1, \dots, x_n)$ and $\ln L(\hat\theta, x_1, \dots, x_n) = \max_{\theta \in \Theta} \ln L(\theta, x_1, \dots, x_n)$.
Two ways to solve for the m.l.e.: (b.1) solve $\frac{\partial \ln L(\theta)}{\partial \theta} = 0$; (b.2) when $L(\theta)$ is monotone, solve $\max_{\theta \in \Theta} L(\theta, x_1, \dots, x_n)$ directly from the monotonicity.
Order statistics: Let $(X_1, \dots, X_n)$ be a random sample with d.f. $F$ and p.d.f. $f$. Let $(Y_1, \dots, Y_n)$ be the permutation of $(X_1, \dots, X_n)$ such that $Y_1 \le Y_2 \le \cdots \le Y_n$. Then we call $(Y_1, \dots, Y_n)$ the order statistic of $(X_1, \dots, X_n)$, where $Y_1$ is the first (smallest) order statistic, $Y_2$ is the second order statistic, ..., and $Y_n$ is the largest order statistic.
If $(X_1, \dots, X_n)$ are independent, then $P(X_1 \in A_1, X_2 \in A_2, \dots, X_n \in A_n) = \int_{A_1} \cdots \int_{A_n} f(x_1, \dots, x_n)\,dx_1 \cdots dx_n = \int_{A_1} f_1(x_1)\,dx_1 \cdots \int_{A_n} f_n(x_n)\,dx_n = P(X_1 \in A_1) \cdots P(X_n \in A_n)$.
Thm. Let $(X_1, \dots, X_n)$ be a random sample from a continuous distribution with p.d.f. $f(x)$ and d.f. $F(x)$. Then the p.d.f. of $Y_n = \max\{X_1, \dots, X_n\}$ is $g_n(y) = n(F(y))^{n-1} f(y)$ and the p.d.f. of $Y_1 = \min\{X_1, \dots, X_n\}$ is $g_1(y) = n(1 - F(y))^{n-1} f(y)$.
Proof. (This is an $R^n \to R$ transformation.) The distribution function of $Y_n$ is $G_n(y) = P(Y_n \le y) = P(\max\{X_1, \dots, X_n\} \le y) = P(X_1 \le y, \dots, X_n \le y) = P(X_1 \le y) \cdots P(X_n \le y) = (F(y))^n$.

The p.d.f. of $Y_n$ is $g_n(y) = D_y (F(y))^n = n(F(y))^{n-1} f(y)$.
The distribution function of $Y_1$ is $G_1(y) = P(Y_1 \le y) = 1 - P(Y_1 > y) = 1 - P(X_1 > y, X_2 > y, \dots, X_n > y) = 1 - P(X_1 > y) \cdots P(X_n > y) = 1 - (1 - F(y))^n$, so the p.d.f. of $Y_1$ is $g_1(y) = D_y(1 - (1 - F(y))^n) = n(1 - F(y))^{n-1} f(y)$.
Example: Let $(X_1, \dots, X_n)$ be a random sample from $U(0, \theta)$. Find the m.l.e. of $\theta$. Is it unbiased and consistent?
sol: The p.d.f. of $X$ is $f(x, \theta) = \frac{1}{\theta}$ if $0 \le x \le \theta$, and $0$ elsewhere. Consider the indicator function $I_{[a,b]}(x) = 1$ if $a \le x \le b$, $0$ elsewhere; then $f(x, \theta) = \frac{1}{\theta} I_{[0,\theta]}(x)$. The likelihood function is $L(\theta) = \prod_{i=1}^n f(x_i, \theta) = \frac{1}{\theta^n}\prod_{i=1}^n I_{[0,\theta]}(x_i)$.
Let $y_n = \max\{x_1, \dots, x_n\}$. Then $\prod_i I_{[0,\theta]}(x_i) = 1 \iff 0 \le x_i \le \theta$ for all $i = 1, \dots, n \iff 0 \le y_n \le \theta$. We then have $L(\theta) = \frac{1}{\theta^n} I_{[0,\theta]}(y_n) = \frac{1}{\theta^n} I_{[y_n, \infty)}(\theta) = \frac{1}{\theta^n}$ if $\theta \ge y_n$, and $0$ if $\theta < y_n$. $L(\theta)$ is maximized when $\theta = y_n$, so the m.l.e. of $\theta$ is $\hat\theta = Y_n = \max\{X_1, \dots, X_n\}$.
The d.f. of $X$ is $F(x) = P(X \le x) = \int_0^x \frac{1}{\theta}\,dt = \frac{x}{\theta}$, $0 \le x \le \theta$.

The p.d.f. of $Y_n$ is $g_n(y) = n\left( \frac{y}{\theta} \right)^{n-1}\frac{1}{\theta} = \frac{n y^{n-1}}{\theta^n}$, $0 \le y \le \theta$.
$E(Y_n) = \int_0^\theta y\,\frac{n y^{n-1}}{\theta^n}\,dy = \frac{n}{n+1}\theta \ne \theta$, so the m.l.e. $\hat\theta = Y_n$ is not unbiased. However, $E(Y_n) = \frac{n}{n+1}\theta \to \theta$ as $n \to \infty$, so the m.l.e. $\hat\theta$ is asymptotically unbiased.
$E(Y_n^2) = \int_0^\theta y^2\,\frac{n y^{n-1}}{\theta^n}\,dy = \frac{n}{n+2}\theta^2$, so $\mathrm{Var}(Y_n) = E(Y_n^2) - (E Y_n)^2 = \frac{n}{n+2}\theta^2 - \left( \frac{n}{n+1} \right)^2\theta^2 \to 0$ as $n \to \infty$. Hence $Y_n \xrightarrow{P} \theta$: the m.l.e. $\hat\theta = Y_n$ is consistent for $\theta$.
Is there an unbiased estimator for $\theta$? $E\!\left( \frac{n+1}{n} Y_n \right) = \frac{n+1}{n} E(Y_n) = \frac{n+1}{n}\cdot\frac{n}{n+1}\theta = \theta$, so $\frac{n+1}{n} Y_n$ is unbiased for $\theta$.
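A simulation sketch of this $U(0, \theta)$ example (Python/NumPy; $\theta = 5$, the sample sizes and the replication count are arbitrary choices): it shows the downward bias $E(Y_n) = \frac{n}{n+1}\theta$ of the m.l.e. and the unbiasedness of $\frac{n+1}{n}Y_n$.

```python
import numpy as np

rng = np.random.default_rng(4)
theta, reps = 5.0, 100_000
for n in (5, 20, 100):
    y = rng.uniform(0, theta, size=(reps, n)).max(axis=1)       # m.l.e. = sample maximum
    print(f"n={n:3d}  E(Y_n) ~ {y.mean():.3f}  (theory {n/(n+1)*theta:.3f})  "
          f"E((n+1)Y_n/n) ~ {((n + 1) / n * y).mean():.3f}  (theta = {theta})")
```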

Example:
(a) $Y \sim b(n, p)$. The likelihood function is $L(p) = f_Y(y, p) = \binom{n}{y} p^y (1-p)^{n-y}$, so $\ln L(p) = \ln\binom{n}{y} + y\ln p + (n - y)\ln(1-p)$ and $\frac{\partial \ln L(p)}{\partial p} = \frac{y}{p} - \frac{n-y}{1-p} = 0 \Rightarrow y(1-p) = p(n-y) \Rightarrow y = np \Rightarrow \hat p = \frac{y}{n}$.
The m.l.e. is $\hat p = \frac{Y}{n}$. $E(\hat p) = \frac{E(Y)}{n} = p$, so the m.l.e. $\hat p = \frac{Y}{n}$ is unbiased; $\mathrm{Var}(\hat p) = \frac{\mathrm{Var}(Y)}{n^2} = \frac{p(1-p)}{n} \to 0$ as $n \to \infty$, so the m.l.e. $\hat p = \frac{Y}{n}$ is consistent for $p$.
(b) $X_1, \dots, X_n$ is a random sample from $N(\mu, \sigma^2)$. We want the m.l.e.'s of $\mu$ and $\sigma^2$. The likelihood function is $L(\mu, \sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} = (2\pi)^{-n/2}(\sigma^2)^{-n/2} e^{-\frac{\sum_i (x_i - \mu)^2}{2\sigma^2}}$, so
$\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{\sum_i (x_i - \mu)^2}{2\sigma^2}$.
$\frac{\partial \ln L(\mu, \sigma^2)}{\partial \mu} = \frac{\sum_i (x_i - \mu)}{\sigma^2} = 0 \Rightarrow \hat\mu = \bar x$.
$\frac{\partial \ln L(\hat\mu, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum_i (x_i - \bar x)^2}{2\sigma^4} = 0 \Rightarrow \hat\sigma^2 = \frac{1}{n}\sum_i (x_i - \bar x)^2$.
$E(\hat\mu) = E(\bar X) = \mu$ (unbiased), and $\mathrm{Var}(\hat\mu) = \mathrm{Var}(\bar X) = \frac{\sigma^2}{n} \to 0$ as $n \to \infty$, so the m.l.e. $\hat\mu$ is consistent for $\mu$.
$E(\hat\sigma^2) = E\!\left( \frac{1}{n}\sum_i (X_i - \bar X)^2 \right) = \frac{n-1}{n}\sigma^2 \ne \sigma^2$ (biased), but $E(\hat\sigma^2) \to \sigma^2$ as $n \to \infty$, so $\hat\sigma^2$ is asymptotically unbiased.
$\mathrm{Var}(\hat\sigma^2) = \mathrm{Var}\!\left( \frac{1}{n}\sum_i (X_i - \bar X)^2 \right) = \frac{\sigma^4}{n^2}\mathrm{Var}\!\left( \frac{(n-1)S^2}{\sigma^2} \right) = \frac{\sigma^4}{n^2}\cdot 2(n-1) \to 0$ as $n \to \infty$, so the m.l.e. $\hat\sigma^2$ is consistent for $\sigma^2$.
Suppose that we have the m.l.e. $\hat\theta = \hat\theta(x_1, \dots, x_n)$ for the parameter $\theta$ and our interest is a new parameter $\tau(\theta)$, a function of $\theta$. What is the m.l.e. of $\tau(\theta)$? The space of $\tau(\theta)$ is $T = \{\tau : \tau = \tau(\theta) \text{ for some } \theta \in \Theta\}$.
Thm. If $\hat\theta = \hat\theta(x_1, \dots, x_n)$ is the m.l.e. of $\theta$ and $\tau(\theta)$ is a 1-1 function of $\theta$, then the m.l.e. of $\tau(\theta)$ is $\tau(\hat\theta)$.
Proof. The likelihood function for $\theta$ is $L(\theta, x_1, \dots, x_n)$. The likelihood function for $\tau(\theta)$ can then be written $L(\theta, x_1, \dots, x_n) = L(\tau^{-1}(\tau(\theta)), x_1, \dots, x_n) = M(\tau(\theta), x_1, \dots, x_n) = M(\tau, x_1, \dots, x_n)$, $\tau \in T$.

Then $M(\tau(\hat\theta), x_1, \dots, x_n) = L(\tau^{-1}(\tau(\hat\theta)), x_1, \dots, x_n) = L(\hat\theta, x_1, \dots, x_n) \ge L(\theta, x_1, \dots, x_n) = L(\tau^{-1}(\tau(\theta)), x_1, \dots, x_n) = M(\tau(\theta), x_1, \dots, x_n) = M(\tau, x_1, \dots, x_n)$ for all $\theta \in \Theta$, i.e. for all $\tau \in T$. Hence $\tau(\hat\theta)$ is the m.l.e. of $\tau(\theta)$. This is the invariance property of the m.l.e.
Example:
(1) If $Y \sim b(n, p)$, the m.l.e. of $p$ is $\hat p = \frac{Y}{n}$. For a 1-1 function $\tau(p)$, the m.l.e. of $\tau(p)$ is $\tau(\hat p)$; for instance, the m.l.e. of $e^p$ is $e^{\hat p} = e^{Y/n}$. Note that $\tau(p) = p(1-p)$ is not a 1-1 function of $p$.
(2) $X_1, \dots, X_n$ iid $N(\mu, \sigma^2)$: the m.l.e.'s of $(\mu, \sigma^2)$ are $(\bar X, \frac{1}{n}\sum_i (X_i - \bar X)^2)$, so the m.l.e.'s of $(\mu, \sigma)$ are $(\bar X, \sqrt{\frac{1}{n}\sum_i (X_i - \bar X)^2})$ (for $\sigma \in (0, \infty)$, the map $\sigma \mapsto \sigma^2$ is 1-1). You can also solve $\frac{\partial}{\partial \mu}\ln L(\mu, \sigma, x_1, \dots, x_n) = 0$ and $\frac{\partial}{\partial \sigma}\ln L(\mu, \sigma, x_1, \dots, x_n) = 0$ for $\mu, \sigma$. (By contrast, for $\mu \in (-\infty, \infty)$ the map $\mu \mapsto \mu^2$ is not 1-1.)
Best estimator:
Def. An unbiased estimator $\hat\theta = \hat\theta(X_1, \dots, X_n)$ is called a uniformly minimum variance unbiased estimator (UMVUE), or best estimator, if for any unbiased estimator $\hat\theta_1$ we have $\mathrm{Var}_\theta(\hat\theta) \le \mathrm{Var}_\theta(\hat\theta_1)$ for all $\theta \in \Theta$ ($\hat\theta$ is uniformly better than $\hat\theta_1$ in variance).

There are several ways of deriving the UMVUE of $\theta$.
Cramér-Rao lower bound for the variance of an unbiased estimator:
Regularity conditions:
(a) The parameter space $\Theta$ is an open interval $(a, \infty)$, $(a, b)$, or $(-\infty, b)$, where $a, b$ are constants not depending on $\theta$.
(b) The set $\{x : f(x, \theta) = 0\}$ is independent of $\theta$.
(c) $\int \frac{\partial f(x, \theta)}{\partial \theta}\,dx = \frac{\partial}{\partial \theta}\int f(x, \theta)\,dx = 0$.
(d) If $T = t(X_1, \dots, X_n)$ is an unbiased estimator, then $\frac{\partial}{\partial \theta}\int t \prod_i f(x_i, \theta)\,dx = \int t\,\frac{\partial}{\partial \theta}\prod_i f(x_i, \theta)\,dx$.
Thm (Cramér-Rao). Suppose that the regularity conditions hold. If $\widehat{\tau(\theta)} = t(X_1, \dots, X_n)$ is unbiased for $\tau(\theta)$, then
$\mathrm{Var}_\theta(\widehat{\tau(\theta)}) \ge \frac{(\tau'(\theta))^2}{n E_\theta\!\left[ \left( \frac{\partial \ln f(X, \theta)}{\partial \theta} \right)^2 \right]} = \frac{(\tau'(\theta))^2}{-n E_\theta\!\left[ \frac{\partial^2 \ln f(X, \theta)}{\partial \theta^2} \right]}$ for $\theta \in \Theta$.
Proof. Consider only the continuous case.
$E\!\left[ \frac{\partial \ln f(X, \theta)}{\partial \theta} \right] = \int \frac{\partial \ln f(x, \theta)}{\partial \theta} f(x, \theta)\,dx = \int \frac{\partial f(x, \theta)}{\partial \theta}\,dx = \frac{\partial}{\partial \theta}\int f(x, \theta)\,dx = 0$.
$\tau(\theta) = E_\theta(\widehat{\tau(\theta)}) = E_\theta(t(X_1, \dots, X_n)) = \int \cdots \int t(x_1, \dots, x_n)\prod_i f(x_i, \theta)\,dx_1 \cdots dx_n$.
Taking derivatives of both sides,
$\tau'(\theta) = \int \cdots \int t(x_1, \dots, x_n)\frac{\partial}{\partial \theta}\prod_i f(x_i, \theta)\,dx_1 \cdots dx_n - \tau(\theta)\frac{\partial}{\partial \theta}\int \cdots \int \prod_i f(x_i, \theta)\,dx_1 \cdots dx_n = \int \cdots \int (t(x_1, \dots, x_n) - \tau(\theta))\left( \frac{\partial}{\partial \theta}\prod_i f(x_i, \theta) \right)dx_1 \cdots dx_n$,
where the subtracted term is $0$ because $\int \cdots \int \prod_i f(x_i, \theta)\,dx_1 \cdots dx_n = 1$.

Now, $\frac{\partial}{\partial \theta}\prod_{i=1}^n f(x_i, \theta) = \sum_{j=1}^n \frac{\partial f(x_j, \theta)}{\partial \theta}\prod_{i \ne j} f(x_i, \theta) = \sum_{j=1}^n \frac{\partial \ln f(x_j, \theta)}{\partial \theta} f(x_j, \theta)\prod_{i \ne j} f(x_i, \theta) = \left( \sum_{j=1}^n \frac{\partial \ln f(x_j, \theta)}{\partial \theta} \right)\prod_{i=1}^n f(x_i, \theta)$.
Then $\tau'(\theta) = E\!\left[ (t(X_1, \dots, X_n) - \tau(\theta))\sum_{j=1}^n \frac{\partial \ln f(X_j, \theta)}{\partial \theta} \right]$.
By the Cauchy-Schwarz inequality $[E(XY)]^2 \le E(X^2) E(Y^2)$,
$(\tau'(\theta))^2 \le E[(t(X_1, \dots, X_n) - \tau(\theta))^2]\, E\!\left[ \left( \sum_{j=1}^n \frac{\partial \ln f(X_j, \theta)}{\partial \theta} \right)^2 \right]$,
so $\mathrm{Var}(\widehat{\tau(\theta)}) \ge \frac{(\tau'(\theta))^2}{E\!\left[ \left( \sum_{j=1}^n \frac{\partial \ln f(X_j, \theta)}{\partial \theta} \right)^2 \right]}$.
Since the $X_j$ are independent and $E\!\left( \frac{\partial \ln f(X_j, \theta)}{\partial \theta} \right) = 0$,
$E\!\left[ \left( \sum_{j=1}^n \frac{\partial \ln f(X_j, \theta)}{\partial \theta} \right)^2 \right] = \sum_{j=1}^n E\!\left[ \left( \frac{\partial \ln f(X_j, \theta)}{\partial \theta} \right)^2 \right] + \sum_{i \ne j} E\!\left( \frac{\partial \ln f(X_i, \theta)}{\partial \theta} \right) E\!\left( \frac{\partial \ln f(X_j, \theta)}{\partial \theta} \right) = n E\!\left[ \left( \frac{\partial \ln f(X, \theta)}{\partial \theta} \right)^2 \right].$

Then we have $\mathrm{Var}_\theta(\widehat{\tau(\theta)}) \ge \frac{(\tau'(\theta))^2}{n E_\theta\!\left[ \left( \frac{\partial \ln f(X, \theta)}{\partial \theta} \right)^2 \right]}$. You may further check that $E_\theta\!\left[ \left( \frac{\partial \ln f(X, \theta)}{\partial \theta} \right)^2 \right] = -E_\theta\!\left( \frac{\partial^2 \ln f(X, \theta)}{\partial \theta^2} \right)$.
Thm. If there is an unbiased estimator $\widehat{\tau(\theta)}$ with variance achieving the Cramér-Rao lower bound, then $\widehat{\tau(\theta)}$ is a UMVUE of $\tau(\theta)$.
Note: If $\tau(\theta) = \theta$, then any unbiased estimator $\hat\theta$ satisfies $\mathrm{Var}_\theta(\hat\theta) \ge \frac{1}{n E_\theta\!\left[ \left( \frac{\partial \ln f(X, \theta)}{\partial \theta} \right)^2 \right]}$.
Example:
(a) $X_1, \dots, X_n$ iid Poisson($\lambda$): $E(X) = \lambda$, $\mathrm{Var}(X) = \lambda$. The MLE is $\hat\lambda = \bar X$ with $E(\hat\lambda) = \lambda$ and $\mathrm{Var}(\hat\lambda) = \frac{\lambda}{n}$.
p.d.f. $f(x, \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$, $x = 0, 1, \dots$; $\ln f(x, \lambda) = x\ln\lambda - \lambda - \ln x!$; $\frac{\partial}{\partial \lambda}\ln f(x, \lambda) = \frac{x}{\lambda} - 1$; $\frac{\partial^2}{\partial \lambda^2}\ln f(x, \lambda) = -\frac{x}{\lambda^2}$; $-E\!\left( \frac{\partial^2}{\partial \lambda^2}\ln f(X, \lambda) \right) = E\!\left( \frac{X}{\lambda^2} \right) = \frac{E(X)}{\lambda^2} = \frac{1}{\lambda}$.
The Cramér-Rao lower bound is $\frac{1}{n\cdot\frac{1}{\lambda}} = \frac{\lambda}{n} = \mathrm{Var}(\hat\lambda)$, so the MLE $\hat\lambda = \bar X$ is the UMVUE of $\lambda$.
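A quick numerical check of the Poisson example (Python/NumPy; $\lambda = 3$, $n = 20$ and the replication count are arbitrary choices): the simulated variance of $\hat\lambda = \bar X$ should be close to the Cramér-Rao bound $\lambda/n$.

```python
import numpy as np

rng = np.random.default_rng(5)
lam, n, reps = 3.0, 20, 200_000
x = rng.poisson(lam, size=(reps, n))
lam_hat = x.mean(axis=1)                 # m.l.e. of lambda
print("Var(lam_hat) ~", lam_hat.var())
print("Cramer-Rao bound lambda/n =", lam / n)
```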

(b)x,..., X iid Beroulli(p), E(X) = p, Var(X) = p( p). Wat UMVUE of p. p.d.f f(x, p) = p x ( p) x l f(x, p) = x l p + ( x) l( p) p l f(x, p) = x p x p p l f(x, p) = x p + x ( p) E( p l f(x, p)) = E( X p + X ( p) ) = p + p = p( p) C-R lower boud for p is p( p) ( = ) p( p) m.l.e. of p is ˆp = X E(ˆp) = E(X) = p, Var(ˆp) = Var(X) = p( p) MLE ˆp is the UMVUE of p. = C-R lower boud. 7

Chapter 4. Continuation of Point Estimation: UMVUE

Sufficient Statistic: Let $A, B$ be two events. The conditional probability of $A$ given $B$ is $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, $A \subset S$. $P(\cdot \mid B)$ is a probability set function with domain the subsets of the sample space $S$.
Let $X, Y$ be two r.v.'s with joint p.d.f. $f(x, y)$ and marginal p.d.f.'s $f_X(x)$ and $f_Y(y)$. The conditional p.d.f. of $Y$ given $X = x$ is $f(y \mid x) = \frac{f(x, y)}{f_X(x)}$, $y \in R$. The function $f(y \mid x)$ is a p.d.f. satisfying $\int f(y \mid x)\,dy = 1$.
In estimating a parameter $\theta$, we have a random sample $X_1, \dots, X_n$ from p.d.f. $f(x, \theta)$. The information we have about $\theta$ is contained in $X_1, \dots, X_n$. Let $U = u(X_1, \dots, X_n)$ be a statistic having p.d.f. $f_U(u, \theta)$. The conditional p.d.f. of $X_1, \dots, X_n$ given $U = u$ is $f(x_1, \dots, x_n \mid u) = \frac{f(x_1, \dots, x_n, \theta)}{f_U(u, \theta)}$ on $\{(x_1, \dots, x_n) : u(x_1, \dots, x_n) = u\}$. The function $f(x_1, \dots, x_n \mid u)$ is a joint p.d.f. with $\int_{u(x_1, \dots, x_n) = u} f(x_1, \dots, x_n \mid u)\,dx_1 \cdots dx_n = 1$.
For a single r.v. $X$ and $U = u(X)$: $f(x \mid U = u) = \frac{f(x, u)}{f_U(u)} = \frac{f_X(x)}{f_U(u)}$ if $u(x) = u$, and $= \frac{0}{f_U(u)} = 0$ if $u(x) \ne u$.
If, for any $u$, the conditional p.d.f. $f(x_1, \dots, x_n, \theta \mid u)$ is unrelated to the parameter $\theta$, then the random sample $X_1, \dots, X_n$ contains no further information about $\theta$ once $U = u$ is observed. This says that $U$ contains exactly the same amount of information about $\theta$ as $X_1, \dots, X_n$.
Def. Let $X_1, \dots, X_n$ be a random sample from a distribution with p.d.f. $f(x, \theta)$, $\theta \in \Theta$. We call a statistic $U = u(X_1, \dots, X_n)$ a sufficient statistic if, for any value $U = u$, the conditional p.d.f. $f(x_1, \dots, x_n \mid u)$ and its domain do not depend on the parameter $\theta$.

Let $U = (X_1, \dots, X_n)$. Then $f(x_1, \dots, x_n, \theta \mid u = (x_1', \dots, x_n')) = \frac{f(x_1, \dots, x_n, \theta)}{f(x_1', \dots, x_n', \theta)} = 1$ if $x_1 = x_1', \dots, x_n = x_n'$, and $0$ if $x_i \ne x_i'$ for some $i$. So $(X_1, \dots, X_n)$ itself is a sufficient statistic for $\theta$.
Q: Why sufficiency? A: We want a statistic with dimension as small as possible that contains the same amount of information about $\theta$ as $X_1, \dots, X_n$ does.
Def. If $U = u(X_1, \dots, X_n)$ is a sufficient statistic with the smallest dimension, it is called the minimal sufficient statistic.
Example:
(a) Let $(X_1, \dots, X_n)$ be a random sample from a continuous distribution with p.d.f. $f(x, \theta)$. Consider the order statistics $Y_1 = \min\{X_1, \dots, X_n\}, \dots, Y_n = \max\{X_1, \dots, X_n\}$. If $Y_1 = y_1, \dots, Y_n = y_n$ are observed, the sample $X_1, \dots, X_n$ has equal chance of taking any value in $\{(x_1, \dots, x_n) : (x_1, \dots, x_n) \text{ is a permutation of } (y_1, \dots, y_n)\}$. Then the conditional joint p.d.f. of $X_1, \dots, X_n$ given $Y_1 = y_1, \dots, Y_n = y_n$ is $f(x_1, \dots, x_n, \theta \mid y_1, \dots, y_n) = \frac{1}{n!}$ if $(x_1, \dots, x_n)$ is a permutation of $(y_1, \dots, y_n)$, and $0$ otherwise. So the order statistic $(Y_1, \dots, Y_n)$ is also a sufficient statistic for $\theta$. The order statistic is not a good sufficient statistic, however, since it has dimension $n$.
(b) Let $X_1, \dots, X_n$ be a random sample from the Bernoulli distribution. The joint p.d.f. of $X_1, \dots, X_n$ is $f(x_1, \dots, x_n, p) = \prod_{i=1}^n p^{x_i}(1-p)^{1-x_i} = p^{\sum_i x_i}(1-p)^{n - \sum_i x_i}$, $x_i = 0, 1$, $i = 1, \dots, n$. Consider the statistic $Y = \sum_{i=1}^n X_i$, which has the binomial distribution $b(n, p)$ with p.d.f. $f_Y(y, p) = \binom{n}{y} p^y(1-p)^{n-y}$, $y = 0, 1, \dots, n$.

If $Y = y$, the space of $(X_1, \dots, X_n)$ is $\{(x_1, \dots, x_n) : \sum_i x_i = y\}$. The conditional p.d.f. of $X_1, \dots, X_n$ given $Y = y$ is $f(x_1, \dots, x_n, p \mid y) = \frac{p^{\sum_i x_i}(1-p)^{n - \sum_i x_i}}{\binom{n}{y} p^y(1-p)^{n-y}} = \frac{1}{\binom{n}{y}}$ if $\sum_i x_i = y$, and $0$ if $\sum_i x_i \ne y$, which is independent of $p$. Hence $Y = \sum_i X_i$ is a sufficient statistic for $p$, and it is a minimal sufficient statistic.
(c) Let $X_1, \dots, X_n$ be a random sample from the uniform distribution $U(0, \theta)$. We want to show that the largest order statistic $Y_n = \max\{X_1, \dots, X_n\}$ is a sufficient statistic. The joint p.d.f. of $X_1, \dots, X_n$ is $f(x_1, \dots, x_n, \theta) = \prod_{i=1}^n \frac{1}{\theta} I(0 < x_i < \theta) = \frac{1}{\theta^n}\prod_{i=1}^n I(0 < x_i < \theta) = \frac{1}{\theta^n}$ if $0 < x_i < \theta$ for $i = 1, \dots, n$, and $0$ otherwise.
The p.d.f. of $Y_n$ is $f_{Y_n}(y, \theta) = n\left( \frac{y}{\theta} \right)^{n-1}\frac{1}{\theta} = \frac{n y^{n-1}}{\theta^n}$, $0 < y < \theta$. When $Y_n = y$ is given, $X_1, \dots, X_n$ take values with $0 < x_i \le y$, $i = 1, \dots, n$, and the conditional p.d.f. of $X_1, \dots, X_n$ given $Y_n = y$ is $f(x_1, \dots, x_n \mid y) = \frac{f(x_1, \dots, x_n, \theta)}{f_{Y_n}(y, \theta)} = \frac{1/\theta^n}{n y^{n-1}/\theta^n} = \frac{1}{n y^{n-1}}$ for $0 < x_i \le y$, $i = 1, \dots, n$, and $0$ otherwise, which is independent of $\theta$. So $Y_n = \max\{X_1, \dots, X_n\}$ is a sufficient statistic for $\theta$.
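A numerical illustration of the sufficiency in the Bernoulli example (b) (Python/NumPy; $n = 5$, $y = 2$ and the two values of $p$ are arbitrary choices): conditionally on $\sum_i X_i = y$, the probability of any particular 0/1 arrangement is $1/\binom{n}{y}$ regardless of $p$.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(6)
n, y, reps = 5, 2, 400_000
target = np.array([1, 1, 0, 0, 0])                 # one arrangement with sum y = 2
for p in (0.3, 0.7):
    x = (rng.uniform(size=(reps, n)) < p).astype(int)
    cond = x[x.sum(axis=1) == y]                   # condition on the sufficient statistic
    frac = np.mean(np.all(cond == target, axis=1))
    print(f"p={p}:  P(X=(1,1,0,0,0) | sum={y}) ~ {frac:.4f}   1/C(n,y) = {1/comb(n, y):.4f}")
```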

Q:
(a) If $U$ is a sufficient statistic, are $U + 5$, $U^2$, $\cos(U)$ all sufficient for $\theta$?
(b) Is there an easier way of finding a sufficient statistic?
$T = t(X_1, \dots, X_n)$ is sufficient for $\theta$ if the conditional p.d.f. $f(x_1, \dots, x_n, \theta \mid t)$ is independent of $\theta$. Independence here means: 1. the function $f(x_1, \dots, x_n, \theta \mid t)$ does not depend on $\theta$; 2. the domain of $X_1, \dots, X_n$ does not depend on $\theta$.
Thm (Factorization Theorem). Let $X_1, \dots, X_n$ be a random sample from a distribution with p.d.f. $f(x, \theta)$. A statistic $U = u(X_1, \dots, X_n)$ is sufficient for $\theta$ iff there exist functions $K_1, K_2 \ge 0$ such that the joint p.d.f. of $X_1, \dots, X_n$ can be written $f(x_1, \dots, x_n, \theta) = K_1(u(x_1, \dots, x_n), \theta)\,K_2(x_1, \dots, x_n)$, where $K_2$ is not a function of $\theta$.
Proof. Consider only the continuous case.
($\Rightarrow$) If $U$ is sufficient for $\theta$, then $f(x_1, \dots, x_n, \theta \mid u) = \frac{f(x_1, \dots, x_n, \theta)}{f_U(u, \theta)}$ is not a function of $\theta$, so $f(x_1, \dots, x_n, \theta) = f_U(u(x_1, \dots, x_n), \theta)\,f(x_1, \dots, x_n \mid u) = K_1(u(x_1, \dots, x_n), \theta)\,K_2(x_1, \dots, x_n)$.
($\Leftarrow$) Suppose that $f(x_1, \dots, x_n, \theta) = K_1(u(x_1, \dots, x_n), \theta)\,K_2(x_1, \dots, x_n)$. Let $Y_1 = u(X_1, \dots, X_n), Y_2 = u_2(X_1, \dots, X_n), \dots, Y_n = u_n(X_1, \dots, X_n)$ be a 1-1 function with inverse functions $x_1 = w_1(y_1, \dots, y_n), \dots, x_n = w_n(y_1, \dots, y_n)$ and Jacobian $J$ (not depending on $\theta$). The joint p.d.f. of $Y_1, \dots, Y_n$ is $f_{Y_1, \dots, Y_n}(y_1, \dots, y_n, \theta) = f(w_1(y_1, \dots, y_n), \dots, w_n(y_1, \dots, y_n), \theta)\,|J| = K_1(y_1, \theta)\,K_2(w_1(y_1, \dots, y_n), \dots, w_n(y_1, \dots, y_n))\,|J|$.

The marginal p.d.f. of $U = Y_1$ is $f_U(y_1, \theta) = K_1(y_1, \theta)\int \cdots \int K_2(w_1(y_1, \dots, y_n), \dots, w_n(y_1, \dots, y_n))\,|J|\,dy_2 \cdots dy_n$, where the integral does not depend on $\theta$. Then the conditional p.d.f. of $X_1, \dots, X_n$ given $U = u$ is
$f(x_1, \dots, x_n, \theta \mid u) = \frac{f(x_1, \dots, x_n, \theta)}{f_U(u, \theta)} = \frac{K_2(x_1, \dots, x_n)}{\int \cdots \int K_2(w_1, \dots, w_n)\,|J|\,dy_2 \cdots dy_n}$,
which is independent of $\theta$. This indicates that $U$ is sufficient for $\theta$.
Example:
(a) $X_1, \dots, X_n$ is a random sample from Poisson($\lambda$); we want a sufficient statistic for $\lambda$. The joint p.d.f. of $X_1, \dots, X_n$ is $f(x_1, \dots, x_n, \lambda) = \prod_{i=1}^n \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} = \lambda^{\sum_i x_i} e^{-n\lambda}\frac{1}{\prod_i x_i!} = K_1\!\left( \sum_i x_i, \lambda \right)K_2(x_1, \dots, x_n)$, so $\sum_i X_i$ is sufficient for $\lambda$. We also have $f(x_1, \dots, x_n, \lambda) = \lambda^{n\bar x} e^{-n\lambda}\frac{1}{\prod_i x_i!} = K_1(\bar x, \lambda)K_2(x_1, \dots, x_n)$, so $\bar X$ is also sufficient for $\lambda$.

(b)let X,..., X be a radom sample from N(µ, σ ).Wat sufficiet statistic for (µ, σ ). Joit p.d.f of X,..., X is f(x,..., x, µ, σ ) = (x i µ) = e (x i µ) πσ (x i x+x µ) = f(x,..., x, µ, σ ) = (s = (π) (σ ) (X, s ) is sufficiet for (µ, σ ). σ = (π) (σ ) (x i µ) e σ (x i x) +(x µ) = ( )s +(x µ) (x i x) ) e ( )s +(x µ) σ = K (x, s, µ, σ )K (x,..., x ) What is useful with a sufficiet statistic for poit estimatio? Review : X, Y r.v. s with joi p.d.f f(x, y). Coditioal p.d.f f(x, y) f(y x) = f X (x) f(x, y) = f(y x)f X(x) f(x, y) f(x y) = f Y (y) f(x, y) = f(x y)f Y (y) Coditioal expectatio of Y give X = x is E(Y x) = yf(y x)dy The radom coditioal expectatio E(Y X) is fuctio E(Y x) with x replaced by X. Coditioal variace of Y give X = x is Var(Y x) = E[(Y E(Y x)) x] = E(Y x) (E(Y x)) The coditioal variace Var(Y X) is Var(Y x) replacig x by X. Thm. Let Y ad X be two r.v. s. (a) E[E(Y x)] = E(Y ) (b) Var(Y ) = E(Var(Y x)) + Var(E(Y x) 33

Proof. (a) $E[E(Y \mid X)] = \int E(Y \mid x) f_X(x)\,dx = \int\!\left( \int y\,f(y \mid x)\,dy \right)f_X(x)\,dx = \int\!\int y\,f(x, y)\,dx\,dy = \int y\,f_Y(y)\,dy = E(Y)$.
(b) $\mathrm{Var}(Y \mid x) = E(Y^2 \mid x) - (E(Y \mid x))^2$, so $E(\mathrm{Var}(Y \mid X)) = E[E(Y^2 \mid X)] - E[(E(Y \mid X))^2] = E(Y^2) - E[(E(Y \mid X))^2]$. Also, $\mathrm{Var}(E(Y \mid X)) = E[(E(Y \mid X))^2] - (E[E(Y \mid X)])^2 = E[(E(Y \mid X))^2] - (E(Y))^2$. Adding, $E(\mathrm{Var}(Y \mid X)) + \mathrm{Var}(E(Y \mid X)) = E(Y^2) - (E(Y))^2 = \mathrm{Var}(Y)$.
Now we come back to the estimation of a parameter function $\tau(\theta)$. We have a random sample $X_1, \dots, X_n$ from $f(x, \theta)$.
Lemma. Let $\hat\tau(X_1, \dots, X_n)$ be an unbiased estimator of $\tau(\theta)$ and let $U = u(X_1, \dots, X_n)$ be a statistic. Then
(a) $E_\theta[\hat\tau(X_1, \dots, X_n) \mid U]$ is unbiased for $\tau(\theta)$;
(b) $\mathrm{Var}_\theta(E[\hat\tau(X_1, \dots, X_n) \mid U]) \le \mathrm{Var}_\theta(\hat\tau(X_1, \dots, X_n))$.
Proof. (a) $E_\theta[E(\hat\tau(X_1, \dots, X_n) \mid U)] = E_\theta(\hat\tau(X_1, \dots, X_n)) = \tau(\theta)$, $\theta \in \Theta$, so $E_\theta[\hat\tau(X_1, \dots, X_n) \mid U]$ is unbiased for $\tau(\theta)$.
(b) $\mathrm{Var}_\theta(\hat\tau(X_1, \dots, X_n)) = E_\theta[\mathrm{Var}_\theta(\hat\tau(X_1, \dots, X_n) \mid U)] + \mathrm{Var}_\theta[E_\theta(\hat\tau(X_1, \dots, X_n) \mid U)] \ge \mathrm{Var}_\theta[E_\theta(\hat\tau(X_1, \dots, X_n) \mid U)]$, $\theta \in \Theta$.

Conclusions:
(a) For any estimator $\hat\tau(X_1, \dots, X_n)$ which is unbiased for $\tau(\theta)$, and any statistic $U$, $E_\theta[\hat\tau(X_1, \dots, X_n) \mid U]$ is unbiased for $\tau(\theta)$ and has variance smaller than or equal to that of $\hat\tau(X_1, \dots, X_n)$.
(b) However, $E_\theta[\hat\tau(X_1, \dots, X_n) \mid U]$ may not be a statistic; if it is not, it cannot be an estimator of $\tau(\theta)$.
(c) If $U$ is a sufficient statistic, $f(x_1, \dots, x_n, \theta \mid u)$ is independent of $\theta$, so $E_\theta[\hat\tau(X_1, \dots, X_n) \mid u]$ is independent of $\theta$ and $E[\hat\tau(X_1, \dots, X_n) \mid U]$ is an unbiased estimator. If $U$ is not a sufficient statistic, $f(x_1, \dots, x_n, \theta \mid u)$ is a function not only of $u$ but also of $\theta$; then $E_\theta[\hat\tau(X_1, \dots, X_n) \mid u]$ is a function of $u$ and $\theta$, and $E_\theta[\hat\tau(X_1, \dots, X_n) \mid u]$ is not a statistic.
Thm (Rao-Blackwell). If $\hat\tau(X_1, \dots, X_n)$ is unbiased for $\tau(\theta)$ and $U$ is a sufficient statistic, then
(a) $E_\theta[\hat\tau(X_1, \dots, X_n) \mid U]$ is a statistic;
(b) $E_\theta[\hat\tau(X_1, \dots, X_n) \mid U]$ is unbiased for $\tau(\theta)$;
(c) $\mathrm{Var}_\theta(E[\hat\tau(X_1, \dots, X_n) \mid U]) \le \mathrm{Var}_\theta(\hat\tau(X_1, \dots, X_n))$, $\theta \in \Theta$.
If $\hat\tau(\theta)$ is an unbiased estimator for $\tau(\theta)$ and $U_1, U_2, \dots$ are sufficient statistics, then we can improve $\hat\tau(\theta)$ with the following fact:
$\mathrm{Var}_\theta(E[\hat\tau(\theta) \mid U_1]) \le \mathrm{Var}_\theta(\hat\tau(\theta))$,
$\mathrm{Var}_\theta(E(E(\hat\tau(\theta) \mid U_1) \mid U_2)) \le \mathrm{Var}_\theta(E(\hat\tau(\theta) \mid U_1))$,
$\mathrm{Var}_\theta(E[E(E(\hat\tau(\theta) \mid U_1) \mid U_2) \mid U_3]) \le \mathrm{Var}_\theta(E(E(\hat\tau(\theta) \mid U_1) \mid U_2))$, ...
Will this process end at the Cramér-Rao lower bound? This question can be resolved with the complete statistic.
Note: Let $U$ be a statistic and $h$ a function.
(a) If $h(U) = 0$ then $E_\theta(h(U)) = E_\theta(0) = 0$, $\theta \in \Theta$.

(b) If $P_\theta(h(U) = 0) = 1$, $\theta \in \Theta$, then $h(U)$ has p.d.f. $f_{h(U)}(h) = 1$ if $h = 0$ and $0$ otherwise, so $E_\theta(h(U)) = \sum_{\text{all } h} h\,f_{h(U)}(h) = 0$.
Def. $X_1, \dots, X_n$ is a random sample from $f(x, \theta)$. A statistic $U = u(X_1, \dots, X_n)$ is a complete statistic if for any function $h(U)$ such that $E_\theta(h(U)) = 0$ for all $\theta \in \Theta$, we have $P_\theta(h(U) = 0) = 1$ for all $\theta \in \Theta$.
Q: For a given statistic $U$, how can we verify whether it is complete or not complete?
A: (1) To prove completeness, you need to show that for any function $h(U)$ with $0 = E_\theta(h(U))$ for all $\theta \in \Theta$, it follows that $1 = P_\theta(h(U) = 0)$ for all $\theta \in \Theta$. (2) To prove incompleteness, you need only find one function $h(U)$ that satisfies $E_\theta(h(U)) = 0$ for all $\theta \in \Theta$ and $P_\theta(h(U) = 0) < 1$ for some $\theta \in \Theta$.
Examples:
(a) $X_1, \dots, X_n$ iid Bernoulli($p$). Find a complete statistic and an incomplete statistic.
sol: (a.1) We show that $Y = \sum_i X_i$ is a complete statistic. $Y \sim b(n, p)$. Suppose that a function $h(Y)$ satisfies $0 = E_p h(Y)$ for $0 < p < 1$. Now
$0 = E_p h(Y) = \sum_{y=0}^n h(y)\binom{n}{y} p^y(1-p)^{n-y} = (1-p)^n\sum_{y=0}^n h(y)\binom{n}{y}\left( \frac{p}{1-p} \right)^y$, $0 < p < 1$,
so $0 = \sum_{y=0}^n h(y)\binom{n}{y}\left( \frac{p}{1-p} \right)^y$, $0 < p < 1$. (Let $\theta = \frac{p}{1-p}$; $0 < p < 1 \iff 0 < \theta < \infty$.) Then $0 = \sum_{y=0}^n h(y)\binom{n}{y}\theta^y$, $0 < \theta < \infty$.

A polynomial equation of degree at most $n$ cannot have infinitely many roots unless all of its coefficients are zero. Hence $h(y)\binom{n}{y} = 0$, i.e. $h(y) = 0$, for $y = 0, \dots, n$ and every $0 < p < 1$. Then $P_p(h(Y) = 0) \ge P_p(Y \in \{0, \dots, n\}) = 1$, so $Y = \sum_i X_i$ is complete.
(a.2) We show that $Z = X_1 - X_2$ is not complete. $E_p Z = E_p(X_1 - X_2) = E_p X_1 - E_p X_2 = p - p = 0$ for $0 < p < 1$, but
$P_p(Z = 0) = P_p(X_1 - X_2 = 0) = P_p(X_1 = X_2 = 0 \text{ or } X_1 = X_2 = 1) = P_p(X_1 = X_2 = 0) + P_p(X_1 = X_2 = 1) = (1-p)^2 + p^2 < 1$ for $0 < p < 1$.
So $Z = X_1 - X_2$ is not complete.
(b) Let $(X_1, \dots, X_n)$ be a random sample from $U(0, \theta)$. We have shown that $Y_n = \max\{X_1, \dots, X_n\}$ is a sufficient statistic; here we use the Factorization Theorem to prove it again: $f(x_1, \dots, x_n, \theta) = \prod_i \frac{1}{\theta} I(0 < x_i < \theta) = \frac{1}{\theta^n} I(0 < x_i < \theta,\ i = 1, \dots, n) = \frac{1}{\theta^n} I(0 < y_n < \theta)$, so $Y_n$ is sufficient for $\theta$.
Now we prove that it is complete. The p.d.f. of $Y_n$ is $f_{Y_n}(y) = n\left( \frac{y}{\theta} \right)^{n-1}\frac{1}{\theta} = \frac{n}{\theta^n}y^{n-1}$, $0 < y < \theta$.

Suppose that h(y ) satisfies 0 = E θ h(y ), 0 < θ < 0 = E θ h(y ) = 0 = θ 0 θ 0 h(y) θ y dy = θ h(y)y dy θ h(y)y dy, θ > 0 Takig differetiatio both sides with θ. 0 = h(θ)θ, θ > 0 0 = h(y), 0 < y < θ, θ > 0 P θ (h(y ) = 0) = P θ (0 < Y < θ) =, θ > 0 Y = max{x,..., X } is complete. Def. If the p.d.f of r.v. X ca be formulated as f(x, θ) = e a(x)b(θ)+c(θ)+d(x), l < x < q where l ad q do ot deped o θ, the we say that f belogs to a expoetial family. Thm. Let X,..., X be a radom sample from f(x, θ) which belogs to a expoetial family as f(x, θ) = e a(x)b(θ)+c(θ)+d(x), l < x < q The a(x i ) is a complete ad sufficiet statistic. Note: We say that X = Y if P (X = Y ) =. Thm. Lehma-Scheffe Let X,..., X be a radom sample from f(x, θ). Suppose that U = u(x,..., X ) is a complete ad sufficiet statistic. If ˆτ = t(u) is ubiased for τ(θ), the ˆτ is the uique fuctio of U ubiased for τ(θ)ad is a UMVUE of τ(θ). (Ubiased fuctio of complete ad sufficiet statistic is UMVUE.) Proof. If ˆτ = t (U) is also ubiased for τ(θ), the E θ (ˆτ ˆτ ) = E θ (ˆτ) E θ (ˆτ ) = τ(θ) τ(θ) = 0, θ Θ. = P θ (ˆτ ˆτ = 0) = P (ˆτ = ˆτ ), θ Θ. ˆτ = ˆτ, ubiased fuctio of U is uique. If T is ay ubiased estimator of τ(θ) the Rao-Blackwell theorem gives: 38 0

(a) $E(T \mid U)$ is an unbiased estimator of $\tau(\theta)$; by uniqueness, $E(T \mid U) = \hat\tau$ with probability $1$.
(b) $\mathrm{Var}_\theta(\hat\tau) = \mathrm{Var}_\theta(E(T \mid U)) \le \mathrm{Var}_\theta(T)$, $\theta \in \Theta$.
This holds for every unbiased estimator $T$, so $\hat\tau$ is a UMVUE of $\tau(\theta)$.
Two ways of constructing a UMVUE based on a complete and sufficient statistic $U$:
(a) If $T$ is unbiased for $\tau(\theta)$, then $E(T \mid U)$ is the UMVUE of $\tau(\theta)$. This is easy to define but often difficult to put in a simple form.
(b) If there is a constant $c$ such that $E(U) = c\theta$, then $T = \frac{U}{c}$ is the UMVUE of $\theta$.
Example:
(a) Let $X_1, \dots, X_n$ be a random sample from $U(0, \theta)$; we want the UMVUE of $\theta$.
sol: $Y_n = \max\{X_1, \dots, X_n\}$ is a complete and sufficient statistic. The p.d.f. of $Y_n$ is $f_{Y_n}(y, \theta) = n\left( \frac{y}{\theta} \right)^{n-1}\frac{1}{\theta} = \frac{n y^{n-1}}{\theta^n}$, $0 < y < \theta$, and $E(Y_n) = \int_0^\theta y\,\frac{n y^{n-1}}{\theta^n}\,dy = \frac{n}{n+1}\theta$. We then have $E\!\left( \frac{n+1}{n}Y_n \right) = \frac{n+1}{n}E(Y_n) = \theta$, so $\frac{n+1}{n}Y_n$ is the UMVUE of $\theta$.
(b) Let $X_1, \dots, X_n$ be a random sample from Bernoulli($p$); we want the UMVUE of $p$.
sol: The p.d.f. is $f(x, p) = p^x(1-p)^{1-x} = (1-p)\left( \frac{p}{1-p} \right)^x = e^{x\ln\left( \frac{p}{1-p} \right) + \ln(1-p)}$, so $\sum_i X_i$ is complete and sufficient. $E(\sum_i X_i) = \sum_i E(X_i) = np$, so $\hat p = \frac{1}{n}\sum_i X_i = \bar X$ is the UMVUE of $p$.

(c)x,..., X iid N(µ, ). Wat UMVUE of µ. sol: The p.d.f of X is f(x, µ) = e (x µ) = e (x µx+µ ) x µx = e µ l π π π X i is complete ad sufficiet. E( X i ) = E(X i ) = µ ˆµ = X i = X is UMVUE of µ. Sice X is ubiased, we see that E(X X i ) = X (d)x,..., X iid Possio(λ). Wat UMVUE of e λ. sol: The p.d.f of X is f(x, λ) = x! λx e λ x l λ λ l x! = e X i is complete ad sufficiet. E(I(X = 0)) = P (X = 0) = f(0, λ) = e λ where I(X = 0) is a idicator fuctio. I(X = 0) is ubiased for e λ E(I(X = 0) X i ) is UMVUE of e λ. 40

Chapter 5. Confidence Interval

Let $Z$ be a r.v. with the standard normal distribution $N(0, 1)$. We can find $z_{\alpha/2}$ and $-z_{\alpha/2}$ satisfying $\frac{\alpha}{2} = P(Z \ge z_{\alpha/2}) = P(Z \le -z_{\alpha/2})$ and $1 - \alpha = P(-z_{\alpha/2} \le Z \le z_{\alpha/2})$. A table of $z_{\alpha/2}$ values:
for $1-\alpha = 0.8$: $z_{\alpha/2} = z_{0.1} = 1.28$
for $1-\alpha = 0.9$: $z_{\alpha/2} = z_{0.05} = 1.645$
for $1-\alpha = 0.95$: $z_{\alpha/2} = z_{0.025} = 1.96$
for $1-\alpha = 0.99$: $z_{\alpha/2} = z_{0.005} = 2.58$
for $1-\alpha = 0.9973$: $z_{\alpha/2} = z_{0.00135} = 3$
Def. Suppose that we have a random sample from $f(x, \theta)$. For $0 < \alpha < 1$, if there exist two statistics $T_1 = t_1(X_1, \dots, X_n)$ and $T_2 = t_2(X_1, \dots, X_n)$ satisfying $1 - \alpha = P(T_1 \le \theta \le T_2)$, we call the random interval $(T_1, T_2)$ a $100(1-\alpha)\%$ confidence interval for the parameter $\theta$. If $X_1 = x_1, \dots, X_n = x_n$ is observed, we also call $(t_1(x_1, \dots, x_n), t_2(x_1, \dots, x_n))$ a $100(1-\alpha)\%$ confidence interval (C.I.) for $\theta$.
Constructing a C.I. by a pivotal quantity:
Def. A function of the random sample and the parameter, $Q = q(X_1, \dots, X_n, \theta)$, is called a pivotal quantity if its distribution is independent of $\theta$.
With a pivotal quantity $q(X_1, \dots, X_n, \theta)$, there exist $a, b$ such that $1 - \alpha = P(a \le q(X_1, \dots, X_n, \theta) \le b)$, $\theta \in \Theta$. The point of a pivotal quantity is that there exist statistics $T_1 = t_1(X_1, \dots, X_n)$ and $T_2 = t_2(X_1, \dots, X_n)$ with the 1-1 correspondence $a \le q(X_1, \dots, X_n, \theta) \le b$ iff $T_1 \le \theta \le T_2$. Then $1 - \alpha = P(T_1 \le \theta \le T_2)$ and $(T_1, T_2)$ is a $100(1-\alpha)\%$ C.I. for $\theta$.
Confidence interval for the normal mean: Let $X_1, \dots, X_n$ be a random sample from $N(\mu, \sigma^2)$. We consider a C.I. for the parameter $\mu$.

(I) $\sigma^2 = \sigma_0^2$ is known. Then $\bar X \sim N(\mu, \sigma_0^2/n)$, so $\frac{\bar X - \mu}{\sigma_0/\sqrt{n}} \sim N(0, 1)$, and
$1 - \alpha = P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = P\!\left( -z_{\alpha/2} \le \frac{\bar X - \mu}{\sigma_0/\sqrt{n}} \le z_{\alpha/2} \right) = P\!\left( -z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}} \le \bar X - \mu \le z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}} \right) = P\!\left( \bar X - z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}} \le \mu \le \bar X + z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}} \right)$.
So $\left( \bar X - z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}},\ \bar X + z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}} \right)$ is a $100(1-\alpha)\%$ C.I. for $\mu$.
ex: $n = 40$, $\sigma_0^2 = 10$, $\bar x = 7.164$ ($X_1, \dots, X_{40}$ iid $N(\mu, 10)$). We want an 80% C.I. for $\mu$.
sol: An 80% C.I. for $\mu$ is $\left( \bar x - z_{0.1}\frac{\sigma_0}{\sqrt{n}},\ \bar x + z_{0.1}\frac{\sigma_0}{\sqrt{n}} \right) = \left( 7.164 - 1.28\sqrt{\tfrac{10}{40}},\ 7.164 + 1.28\sqrt{\tfrac{10}{40}} \right) \approx (6.52,\ 7.80)$.
Note that $P\!\left( \bar X - z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}} \le \mu \le \bar X + z_{\alpha/2}\frac{\sigma_0}{\sqrt{n}} \right) = 1 - \alpha = 0.8$, whereas $P(6.52 \le \mu \le 7.80) = 1$ or $0$: once the data are observed, the fixed interval either covers $\mu$ or it does not.
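The known-$\sigma$ interval is straightforward to compute. A minimal sketch (Python/SciPy), using the $n = 40$, $\sigma_0^2 = 10$, $\bar x = 7.164$ inputs of the worked example above:

```python
import numpy as np
from scipy import stats

n, sigma0_sq, xbar, alpha = 40, 10.0, 7.164, 0.20
z = stats.norm.ppf(1 - alpha / 2)            # z_{alpha/2}, about 1.28
half = z * np.sqrt(sigma0_sq / n)
print(f"80% CI for mu: ({xbar - half:.3f}, {xbar + half:.3f})")   # roughly (6.52, 7.80)
```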

(II) $\sigma^2$ is unknown.
Def. If $Z \sim N(0, 1)$ and $W \sim \chi^2(r)$ are independent, we call the distribution of the r.v. $T = \frac{Z}{\sqrt{W/r}}$ a t-distribution with $r$ degrees of freedom. The p.d.f. of the t-distribution is $f_T(t) = \frac{\Gamma(\frac{r+1}{2})}{\Gamma(\frac{r}{2})\sqrt{r\pi}}\left( 1 + \frac{t^2}{r} \right)^{-\frac{r+1}{2}}$, $-\infty < t < \infty$.
Since $f_T(-t) = f_T(t)$, the t-distribution is symmetric about $0$.
Now let $X_1, \dots, X_n$ be iid $N(\mu, \sigma^2)$. We have $\frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ and $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$, and they are independent. Hence
$T = \frac{\frac{\bar X - \mu}{\sigma/\sqrt{n}}}{\sqrt{\frac{(n-1)S^2}{\sigma^2}\Big/(n-1)}} = \frac{\bar X - \mu}{S/\sqrt{n}} \sim t(n-1)$.
Let $t_{\alpha/2}$ satisfy $1 - \alpha = P\!\left( -t_{\alpha/2} \le \frac{\bar X - \mu}{S/\sqrt{n}} \le t_{\alpha/2} \right) = P\!\left( -t_{\alpha/2}\frac{S}{\sqrt{n}} \le \bar X - \mu \le t_{\alpha/2}\frac{S}{\sqrt{n}} \right) = P\!\left( \bar X - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar X + t_{\alpha/2}\frac{S}{\sqrt{n}} \right)$.
So $\left( \bar X - t_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar X + t_{\alpha/2}\frac{S}{\sqrt{n}} \right)$ is a $100(1-\alpha)\%$ C.I. for $\mu$.
ex: Suppose that we have $n = 10$, $\bar x = 3.2$, and $s = 1.7$, and that $t_{0.025}(9) = 2.262$. We want a 95% C.I. for $\mu$.
sol: A 95% C.I. for $\mu$ is $\left( 3.2 - 2.262\,\frac{1.7}{\sqrt{10}},\ 3.2 + 2.262\,\frac{1.7}{\sqrt{10}} \right) \approx (1.98,\ 4.42)$.
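A matching sketch for the unknown-$\sigma$ case (Python/SciPy; the data vector here is made up purely for illustration and is not from the notes): compute $\bar x$, $S$, the $t_{\alpha/2}(n-1)$ quantile, and the interval.

```python
import numpy as np
from scipy import stats

data = np.array([2.1, 3.4, 2.8, 3.9, 3.0, 2.5, 3.3, 3.7, 2.9, 3.6])   # hypothetical sample
n, alpha = len(data), 0.05
xbar, s = data.mean(), data.std(ddof=1)
t_quant = stats.t.ppf(1 - alpha / 2, df=n - 1)     # t_{0.025}(9), about 2.262
half = t_quant * s / np.sqrt(n)
print(f"95% CI for mu: ({xbar - half:.3f}, {xbar + half:.3f})")
```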