March 10, 2017

THE EXPONENTIAL CLASS OF DISTRIBUTIONS

Abstract. We will introduce a class of distributions that contains many of the discrete and continuous distributions we are familiar with. This class will help to explain why the sample sum is often a sufficient and complete statistic.

1. Introduction

Let $\Theta \subseteq \mathbb{R}$ be an open interval, possibly infinite. We say that a family of pdfs $\{f_\theta\}_{\theta \in \Theta}$ is of exponential class if

\[ f(x; \theta) = h(x) \exp\big(\eta(\theta) k(x) - A(\theta)\big) \]

for some functions $\eta$ and $A$ that depend only on $\theta$, and some functions $k$ and $h \ge 0$ that depend only on $x$ and not on $\theta$. Thus we are assuming that the support of $f_\theta$ does not depend on $\theta$, so the family has a common support $S$.

Notice that since $f$ is a pdf, if it is a density for a continuous random variable, then right away we know that

\[ A(\theta) = \log\left( \int h(x) \exp\big(\eta(\theta) k(x)\big)\, dx \right), \]

and similarly in the discrete case

\[ A(\theta) = \log\left( \sum_{x \in S} h(x) \exp\big(\eta(\theta) k(x)\big) \right). \]

By taking

\[ s(x) := \log(h(x)) \quad \text{and} \quad q(\theta) := -A(\theta), \]

we can also write, as some other texts do,

\[ f(x; \theta) = \exp\big( \eta(\theta) k(x) + s(x) + q(\theta) \big)\, \mathbf{1}[x \in S]. \]

Furthermore, we say that an exponential family is regular if $\eta$ is a non-constant continuous function of $\theta$; in the continuous case, we also require that $k'(x)$ is not identically zero and that $h$ is a continuous function; in the discrete case, we require that $k(x)$ is not a constant function. Let us remark that if $k$ is a constant function, then normalization forces $\exp(\eta(\theta)k(x) - A(\theta))$ to be the same constant for every $\theta$, and there is really only one pdf in the family. Many familiar families of pdfs, both discrete and continuous, are of (regular) exponential class.
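As a quick numerical sanity check of the discrete-case formula for $A(\theta)$, here is a minimal Python sketch using the Poisson family as an assumed example (it is not one of the families treated below): $f(x; \lambda) = \frac{1}{x!}\exp(x\log\lambda - \lambda)$, so $h(x) = 1/x!$, $\eta(\lambda) = \log\lambda$, $k(x) = x$, and the normalizer should come out to $A(\lambda) = \lambda$.

```python
import math

# Check A(lam) = log(sum_x h(x) exp(eta(lam) k(x))) for the Poisson family:
# h(x) = 1/x!, eta(lam) = log(lam), k(x) = x, so the answer should be lam.
lam = 2.5
term, total = 1.0, 0.0
for x in range(100):
    total += term            # term = h(x) * exp(eta(lam) * k(x)) = lam**x / x!
    term *= lam / (x + 1)
print(math.log(total), lam)  # both approximately 2.5
```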

Exercise 1. Fix an integer $n \ge 1$ and let $p \in (0,1)$. Show that the Binomial family given by

\[ f_p(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x \in \{0, 1, \ldots, n\}, \]

is of exponential class.

Solution. Write

\[ f_p(x) = \binom{n}{x} \exp\big( x\log(p) + (n-x)\log(1-p) \big) = \binom{n}{x} \exp\left( \log\Big(\tfrac{p}{1-p}\Big) x + n\log(1-p) \right). \]

Take $h(x) = \binom{n}{x}$, $\eta(p) = \log\big(\tfrac{p}{1-p}\big)$, $k(x) = x$, and $A(p) = -n\log(1-p)$.

Recall that a continuous random variable $X$ is said to have a gamma distribution with parameters $\alpha > 0$ and $\beta > 0$ if it has pdf given by

\[ f(x; \alpha, \beta) = \begin{cases} \frac{1}{\beta^\alpha \Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta} & \text{if } x > 0, \\ 0 & \text{otherwise.} \end{cases} \]

Exercise 2. Fix $\alpha > 0$. Show that the family of gamma distributions given by $\{f(\cdot\,; \alpha, \beta)\}_{\beta > 0}$ is of exponential class.

Solution. Set $h(x) = \frac{1}{\Gamma(\alpha)} x^{\alpha-1} \mathbf{1}[x>0]$, $\eta(\beta) = -1/\beta$, $k(x) = x$, and $A(\beta) = \alpha \log(\beta)$.

Theorem 3. Let $X = (X_1, \ldots, X_n)$ be a random sample from a family of regular exponential class, where

\[ f(x_1; \theta) = h(x_1) \exp\big(\eta(\theta) k(x_1) - A(\theta)\big). \]

Then the sum given by

\[ T = t(X) := \sum_{i=1}^n k(X_i) \]

is a sufficient and complete statistic; in particular, the family of pdfs corresponding to $T$ is also of regular exponential class.

Proof of Theorem 3 (sufficiency). Observe that

\[ L(x; \theta) = \exp\big( \eta(\theta) t(x) - nA(\theta) \big) \prod_{i=1}^n h(x_i). \]

Hence set

\[ g(t(x); \theta) := \exp\big( \eta(\theta) t(x) - nA(\theta) \big) \quad \text{and} \quad H(x) := \prod_{i=1}^n h(x_i), \]

and the result follows from the Neyman factorization theorem.
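The two factorizations in Exercises 1 and 2 can be confirmed numerically; here is a minimal sketch (assuming numpy and scipy are available, with variable names of our choosing) checking that $h(x)\exp(\eta k(x) - A)$ reproduces the standard pmf/pdf.

```python
import numpy as np
from scipy.special import comb, gammaln
from scipy.stats import binom, gamma

# Binomial: h(x) = C(n, x), eta(p) = log(p/(1-p)), k(x) = x, A(p) = -n log(1-p).
n, p = 10, 0.3
x = np.arange(n + 1)
factored = comb(n, x) * np.exp(x * np.log(p / (1 - p)) + n * np.log(1 - p))
assert np.allclose(factored, binom.pmf(x, n, p))

# Gamma with alpha fixed: h(x) = x**(alpha-1)/Gamma(alpha), eta(beta) = -1/beta,
# k(x) = x, A(beta) = alpha log(beta).
alpha, beta = 2.0, 1.5
x = np.linspace(0.1, 10.0, 50)
factored = np.exp((alpha - 1) * np.log(x) - gammaln(alpha) - x / beta - alpha * np.log(beta))
assert np.allclose(factored, gamma.pdf(x, alpha, scale=beta))
print("both factorizations match")
```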

Proposition 4. Let $X$ be a random variable with pdf from a regular exponential class given by $f(x;\theta) = h(x)\exp(\eta(\theta)k(x) - A(\theta))$. Then, provided all the derivatives exist,

\[ E_\theta k(X) = A'(\theta)/\eta'(\theta) \]

and

\[ \mathrm{Var}_\theta(k(X)) = \frac{1}{\eta'(\theta)^3}\big( \eta'(\theta) A''(\theta) - A'(\theta)\eta''(\theta) \big). \]

Proof of Proposition 4 (continuous case). We use the same trick as we did with Fisher information: differentiate, with respect to $\theta$, the identity

\[ 1 = \int f(x;\theta)\,dx. \tag{1} \]

We can bring the derivative inside the integral since the support of $f$ is independent of $\theta$. This gives

\[ 0 = \int f(x;\theta)\big[ \eta'(\theta)k(x) - A'(\theta) \big]\, dx. \tag{2} \]

Note that $E_\theta k(X) = \int k(x)f(x;\theta)\,dx$. Some rearranging, using (1), gives the desired result for the expectation.

For the variance, we differentiate the identity (2) with respect to $\theta$ and obtain

\[ 0 = \int f(x;\theta)\big[ \eta''(\theta)k(x) - A''(\theta) \big] + f(x;\theta)\big[ \eta'(\theta)k(x) - A'(\theta) \big]^2\, dx. \]

We recognize from (2) that $E_\theta\big[\eta'(\theta)k(X) - A'(\theta)\big] = 0$, so that

\[ 0 = \int f(x;\theta)\big[ \eta''(\theta)k(x) - A''(\theta) \big]\, dx + \mathrm{Var}_\theta\big[\eta'(\theta)k(X) - A'(\theta)\big] = \int f(x;\theta)\big[ \eta''(\theta)k(x) - A''(\theta) \big]\, dx + \eta'(\theta)^2\, \mathrm{Var}_\theta k(X). \]

Some algebra and the previous identity give

\[ \eta'(\theta)^2\, \mathrm{Var}_\theta k(X) = A''(\theta) - \eta''(\theta)\, E_\theta k(X) = A''(\theta) - \eta''(\theta)\, \frac{A'(\theta)}{\eta'(\theta)}, \]

from which the desired result follows.
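Proposition 4 can be checked symbolically on the gamma family of Exercise 2, where $\eta(\beta) = -1/\beta$ and $A(\beta) = \alpha\log\beta$; the formulas should recover the familiar mean $\alpha\beta$ and variance $\alpha\beta^2$ of $\Gamma(\alpha,\beta)$. A minimal sketch assuming sympy is available:

```python
import sympy as sp

# Proposition 4 applied to Gamma(alpha, beta) with beta as the parameter:
# eta(beta) = -1/beta, A(beta) = alpha*log(beta).
alpha, beta = sp.symbols('alpha beta', positive=True)
eta = -1 / beta
A = alpha * sp.log(beta)
eta1, A1 = sp.diff(eta, beta), sp.diff(A, beta)      # first derivatives
eta2, A2 = sp.diff(eta1, beta), sp.diff(A1, beta)    # second derivatives
mean = sp.simplify(A1 / eta1)
var = sp.simplify((eta1 * A2 - A1 * eta2) / eta1**3)
print(mean, var)  # alpha*beta and alpha*beta**2
```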

Proposition 5 (Additive property). Let $X = (X_1,\ldots,X_n)$ be a random sample from a family of regular exponential class, where

\[ f(x_1;\theta) = h(x_1)\exp\big(\eta(\theta)k(x_1) - A(\theta)\big). \]

Let

\[ T = t(X) := \sum_{i=1}^n k(X_i). \]

Then the pdf of $T$ has the form

\[ g(t;\theta) = r(t)\exp\big( \eta(\theta)t - nA(\theta) \big), \]

where $r(t)$ does not depend on $\theta$, so that in particular, the family of pdfs corresponding to $T$ is of regular exponential class.

Proof of Proposition 5 (discrete case). Suppose $x$ is such that $t(x) = t$; then

\[ P(X = x) = \prod_{i=1}^n f(x_i;\theta) = \exp\big[\eta(\theta)t - nA(\theta)\big] \prod_{i=1}^n h(x_i). \]

For each $t$, let $S_t := \{x : t(x) = t\}$. Set

\[ r(t) := \sum_{x \in S_t}\, \prod_{i=1}^n h(x_i). \]

We have that

\[ P(T = t) = r(t)\exp\big[\eta(\theta)t - nA(\theta)\big]. \]

The proof of Proposition 5 in the continuous case is a bit harder.

Proposition 6. A family of pdfs that is of regular exponential class of the form

\[ f(x;\theta) = h(x)\exp\big(\eta(\theta)x - A(\theta)\big) \]

is complete. Thus in Proposition 6 we require that $k(x) = x$.
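Here is a small numerical illustration of Proposition 5 in the discrete case, using the Poisson family as an assumed example (it has $k(x) = x$): the pmf of $T = X_1 + \cdots + X_n$, computed by brute-force convolution, should match the Poisson($n\lambda$) pmf, which is again of exponential class.

```python
import numpy as np
from scipy.stats import poisson

# Additive property for Poisson(lam): T = X_1 + ... + X_n ~ Poisson(n*lam).
lam, n = 1.3, 4
grid = np.arange(80)
pmf = poisson.pmf(grid, lam)
pmf_T = pmf.copy()
for _ in range(n - 1):
    pmf_T = np.convolve(pmf_T, pmf)[:len(grid)]  # pmf of a sum = convolution
assert np.allclose(pmf_T, poisson.pmf(grid, n * lam))
print("additive property verified for Poisson")
```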

Sketch proof of Proposition 6 (continuous case). Suppose the family is given by $f(x;\theta) = h(x)\exp(\eta(\theta)x - A(\theta))$, where $\theta \in \Theta$. Let $u$ be such that

\[ \int u(x)\, h(x) \exp\big(\eta(\theta)x - A(\theta)\big)\,dx = 0 \quad \text{for all } \theta \in \Theta. \]

Let $v(x) = u(x)h(x)$; then we have that

\[ \int v(x)\exp\big(\eta(\theta)x\big)\,dx = 0 \quad \text{for all } \theta \in \Theta. \]

Recall that $\eta$ is a continuous function, and as a consequence of the intermediate value theorem, a continuous function maps intervals to intervals. Since $\eta$ is not constant, its image contains a nondegenerate interval, so that

\[ \hat{v}(s) := \int v(x)\exp(sx)\,dx = 0 \quad \text{for all } s \in [a,b], \]

where $a < b$. Here, we appeal to the theory of Laplace transforms to obtain that $v = 0$; since $h$ is non-zero on the support of $f$, we can conclude that $u = 0$.

Proof of Theorem 3. We already proved the sufficiency of $T$; the rest follows from Propositions 5 and 6, and the fact that the pdf of $T$ has the form required by Proposition 6.

2. An example from the Beta family

Exercise 7 (Beta(θ, 1)). Let $X = (X_1,\ldots,X_n)$ be a random sample, where $X_1$ has pdf

\[ f(x_1;\theta) := \mathbf{1}[x_1 \in (0,1)]\, \theta x_1^{\theta-1}, \]

where $\theta > 0$. Show that $G := (X_1 X_2 \cdots X_n)^{1/n}$ is a complete and sufficient statistic for $\theta$.

Solution. We can rewrite $f$ as

\[ f(x_1;\theta) = \mathbf{1}[x_1 \in (0,1)] \exp\big[(\theta - 1)\log(x_1) + \log(\theta)\big]. \]

Thus we have $\eta(\theta) = \theta - 1$ and $k(x_1) = \log(x_1)$, and by Theorem 3, we know that $T = \sum_{i=1}^n \log(X_i)$ is sufficient and complete. Recall that any one-to-one function of a sufficient statistic is again sufficient, so $G = \exp(T/n)$ is also sufficient. It is also clear that a one-to-one function of a complete statistic must also be complete.
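A Monte Carlo companion to Exercise 7 (and a preview of Exercise 9 below): for Beta($\theta$, 1) we have $F(x) = x^\theta$ on $(0,1)$, so $X = U^{1/\theta}$ is an inverse-CDF sample, and $-T = -\sum\log X_i$ should be distributed as $\Gamma(n, 1/\theta)$. A minimal sketch assuming numpy and scipy, with an arbitrary seed:

```python
import numpy as np
from scipy.stats import kstest

# Check that -sum(log X_i) ~ Gamma(n, 1/theta) for Beta(theta, 1) samples.
rng = np.random.default_rng(0)
theta, n, reps = 2.0, 5, 20000
X = rng.uniform(size=(reps, n)) ** (1 / theta)  # reps samples of size n
negT = -np.log(X).sum(axis=1)
print(kstest(negT, 'gamma', args=(n, 0, 1 / theta)))  # expect a non-small p-value
```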

Exercise 8. Referring to Exercise 7, show that the mle for $\theta$ is given by

\[ Z := -\frac{n}{\sum_{i=1}^n \log(X_i)}. \]

Exercise 9. Referring to Exercise 8, let $Y_i := -\log(X_i)$. Show that $Y_i \sim \Gamma(1, 1/\theta)$.

Exercise 10. Show that $EZ = \theta\left(\frac{n}{n-1}\right)$.

Exercise 11. Referring to Exercise 7, find the MVUE for $\theta$.

Exercise 12. Let $W \sim \Gamma(\alpha, \beta)$. Let $a > 0$. Assume that $\alpha > a$. Show that

\[ E W^{-a} = \frac{\Gamma(\alpha - a)}{\beta^a\, \Gamma(\alpha)}. \]

Exercise 13. Show that the variance of the MVUE in Exercise 11 is $\theta^2/(n-2)$.

Exercise 14. Referring to Exercise 7, show that the Fisher information of $X_1$ is given by $I(\theta) = \theta^{-2}$. Thus the MVUE is not efficient: its variance $\theta^2/(n-2)$ exceeds the Cramer-Rao lower bound $\theta^2/n$.
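The claims in Exercises 10, 13 and 14 are easy to check by simulation; the sketch below (assuming numpy, with an arbitrary seed and our own variable names) should show $EZ \approx \theta n/(n-1)$ and the MVUE variance near $\theta^2/(n-2)$, strictly above the Cramer-Rao bound $\theta^2/n$.

```python
import numpy as np

# Simulation check of Exercises 10, 13, 14 for the Beta(theta, 1) family.
rng = np.random.default_rng(1)
theta, n, reps = 2.0, 10, 200000
X = rng.uniform(size=(reps, n)) ** (1 / theta)  # Beta(theta, 1) samples
Z = -n / np.log(X).sum(axis=1)                  # mle from Exercise 8
mvue = (n - 1) / n * Z                          # MVUE from Exercise 11
print(Z.mean(), theta * n / (n - 1))            # ~2.22
print(mvue.var(), theta**2 / (n - 2))           # ~0.50
print(theta**2 / n)                             # CRLB: 0.40
```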

3. Another example

Exercise 15. Let $X = (X_1,\ldots,X_n)$ be a random sample from the distribution

\[ f_\theta(x_1) = \frac{\theta}{(1+x_1)^{1+\theta}}\, \mathbf{1}[x_1 > 0]. \]

(a) Find the Cramer-Rao lower bound for an unbiased estimator of $\theta$.
(b) Find the Cramer-Rao lower bound for an unbiased estimator of $1/\theta$.
(c) Find the mle for $\theta$.
(d) Show that there is an efficient estimator of $1/\theta$.
(e) Show that there does not exist an efficient estimator for $\theta$.

Solution. (a) We have that for $x_1 > 0$,

\[ \ell(x_1;\theta) = \log(\theta) - (1+\theta)\log(1+x_1); \quad \ell'(x_1;\theta) = 1/\theta - \log(1+x_1); \quad \ell''(x_1;\theta) = -1/\theta^2. \]

Hence $I(\theta) = 1/\theta^2$, and if $Y$ is an unbiased estimator for $\theta$, then

\[ \mathrm{Var}_\theta(Y) \ge \frac{\theta^2}{n}. \]

(b) If $Z$ is an unbiased estimator for $g(\theta) = 1/\theta$, then we have that

\[ \mathrm{Var}_\theta(Z) \ge \frac{(g'(\theta))^2}{n I(\theta)} = \frac{1}{n\theta^2}. \]

(c) If $x \in (0,\infty)^n$, we have that

\[ \ell(x;\theta) = n\log(\theta) - (1+\theta)\sum_{i=1}^n \log(1+x_i); \quad \ell'(x;\theta) = n/\theta - \sum_{i=1}^n \log(1+x_i); \]

setting this to $0$ and solving for $\theta$, we obtain that the mle is given by

\[ n\left( \sum_{i=1}^n \log(1+X_i) \right)^{-1}. \]

(d) Let

\[ W := \sum_{i=1}^n \log(1+X_i). \]

We want to compute the distribution of $W$. Let $Y_i := \log(1+X_i)$. We have that

\[ P(Y_1 \le z) = P(X_1 \le e^z - 1) = \int_0^{e^z - 1} \frac{\theta}{(1+x_1)^{1+\theta}}\, dx_1. \]

The fundamental theorem of calculus and the chain rule give that the pdf of $Y_1$ is

\[ f_{Y_1}(z) = \frac{\theta}{(e^z)^{1+\theta}}\, e^z = \theta e^{-z\theta}; \]

thus the $Y_i$ are independent exponential random variables with mean $1/\theta$, and we know that $W \sim \Gamma(n, 1/\theta)$. In particular, we have that $E(W) = n/\theta$ and $\mathrm{Var}(W) = n/\theta^2$. Hence, from the previous calculations, we know that $W/n$ is an efficient estimator for $1/\theta$.

Another way to proceed would be to note that

\[ f_\theta(x_1) = \frac{\theta}{(1+x_1)^{1+\theta}}\, \mathbf{1}[x_1 > 0] = \mathbf{1}[x_1 > 0]\exp\big[-(1+\theta)\log(1+x_1) + \log(\theta)\big]; \]

hence we are dealing with a family of exponential class with $\eta(\theta) = -(1+\theta)$, $k(x_1) = \log(1+x_1)$, and $A(\theta) = -\log(\theta)$. By our theory, we know that $W$ is a complete and sufficient statistic; in particular, we can compute the mean and variance of $Y_1$ using Proposition 4.
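A simulation check of part (d), assuming numpy and an arbitrary seed: sampling $X$ by inverse CDF from $F(x) = 1 - (1+x)^{-\theta}$, the estimator $W/n$ should be unbiased for $1/\theta$ with variance close to $1/(n\theta^2)$, exactly the bound from part (b).

```python
import numpy as np

# W/n should be unbiased for 1/theta and attain the Cramer-Rao bound.
rng = np.random.default_rng(2)
theta, n, reps = 1.5, 8, 200000
U = rng.uniform(size=(reps, n))
X = (1 - U) ** (-1 / theta) - 1   # inverse-CDF sample: F(x) = 1 - (1+x)**(-theta)
West = np.log1p(X).mean(axis=1)   # W/n for each sample
print(West.mean(), 1 / theta)           # ~0.667
print(West.var(), 1 / (n * theta**2))   # ~0.0556
```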

(e) By the Lehmann-Scheffe theorem, it suffices to find an unbiased estimator for $\theta$ that is a function of $W$, since our theory on the exponential family gives that $W$ is both sufficient and complete. By a previous calculation, in Section 2, we already know what to do: just consider

\[ Z := \frac{n-1}{W}; \]

we know that $Z$ is unbiased, and

\[ \mathrm{Var}_\theta(Z) = \frac{\theta^2}{n-2} > \frac{\theta^2}{n}, \]

so the MVUE does not attain the Cramer-Rao lower bound, and no efficient estimator for $\theta$ exists.
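A final simulation sketch for part (e), assuming numpy and an arbitrary seed, and drawing $W \sim \Gamma(n, 1/\theta)$ directly as derived in part (d): $Z = (n-1)/W$ should be unbiased for $\theta$ with variance near $\theta^2/(n-2)$, strictly above the bound $\theta^2/n$ from part (a).

```python
import numpy as np

# Z = (n-1)/W is unbiased for theta but does not attain the Cramer-Rao bound.
rng = np.random.default_rng(3)
theta, n, reps = 1.5, 8, 200000
W = rng.gamma(shape=n, scale=1 / theta, size=reps)  # W ~ Gamma(n, 1/theta)
Z = (n - 1) / W
print(Z.mean(), theta)                  # ~1.50
print(Z.var(), theta**2 / (n - 2))      # ~0.375
print(theta**2 / n)                     # CRLB: 0.28125
```

End of Midterm 2 coverage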