Lecture 5: Moment generating functions

Definition 2.3.6. The moment generating function (mgf) of a random variable $X$ is
$$M_X(t) = E(e^{tX}) = \begin{cases} \sum_x e^{tx} f_X(x) & \text{if } X \text{ has a pmf,} \\ \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx & \text{if } X \text{ has a pdf,} \end{cases}$$
provided that $E(e^{tX})$ exists. (Note that $M_X(0) = E(e^{0\cdot X}) = 1$ always exists.) Otherwise, we say that the mgf $M_X(t)$ does not exist at $t$.

Theorem 2.3.15. For any constants $a$ and $b$, the mgf of the random variable $aX+b$ is
$$M_{aX+b}(t) = e^{bt} M_X(at).$$

Proof. By definition,
$$M_{aX+b}(t) = E(e^{t(aX+b)}) = E(e^{taX} e^{bt}) = e^{bt} E(e^{(ta)X}) = e^{bt} M_X(at).$$
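As a quick numerical illustration (not part of the original slides), the following Python sketch estimates $M_X(t)$ by Monte Carlo for an Exponential(1) random variable, whose mgf $1/(1-t)$, $t < 1$, is a special case of the gamma mgf derived below, and checks Theorem 2.3.15; the sample size and the constants $a$, $b$, $t$ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=10**6)    # X ~ Exponential(1) = Gamma(1, 1)

def mgf_estimate(sample, t):
    """Monte Carlo estimate of M(t) = E[exp(t * sample)]."""
    return np.mean(np.exp(t * sample))

a, b, t = 0.5, 2.0, 0.3                       # arbitrary constants with a*t < 1
lhs = mgf_estimate(a * x + b, t)              # M_{aX+b}(t), estimated directly
rhs = np.exp(b * t) * mgf_estimate(x, a * t)  # e^{bt} M_X(at), via Theorem 2.3.15
exact = np.exp(b * t) / (1 - a * t)           # exact value e^{bt} / (1 - at)
print(lhs, rhs, exact)                        # all three should be close
```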

The main use of the mgf: it can be used to generate moments, and it helps to characterize a distribution.

Theorem 2.3.7. If $M_X(t)$ exists at $\pm t$ for some $t > 0$, then $E(X^n)$ exists for every positive integer $n$ and
$$E(X^n) = M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t)\Big|_{t=0},$$
i.e., the $n$th moment is the $n$th derivative of $M_X(t)$ evaluated at $t = 0$.

Proof. Assuming that we can exchange the differentiation and the integration (which will be justified later),
$$\frac{d^n}{dt^n} M_X(t) = \frac{d^n}{dt^n} \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx = \int_{-\infty}^{\infty} \frac{d^n}{dt^n} e^{tx} f_X(x)\,dx = \int_{-\infty}^{\infty} x^n e^{tx} f_X(x)\,dx.$$
Hence
$$\frac{d^n}{dt^n} M_X(t)\Big|_{t=0} = \int_{-\infty}^{\infty} x^n e^{0\cdot x} f_X(x)\,dx = \int_{-\infty}^{\infty} x^n f_X(x)\,dx = E(X^n).$$
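Theorem 2.3.7 can be checked symbolically (an illustration added here, not from the slides): differentiate a known mgf with SymPy. Below the Exponential(1) mgf $1/(1-t)$ is used, so the $n$th derivative at $t = 0$ should equal $E(X^n) = n!$.

```python
import sympy as sp

t = sp.symbols('t')
M = 1 / (1 - t)                            # mgf of Exponential(1), valid for t < 1

for n in range(1, 5):
    moment = sp.diff(M, t, n).subs(t, 0)   # nth derivative of the mgf at t = 0
    print(n, moment, sp.factorial(n))      # matches E(X^n) = n! for Exponential(1)
```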

The condition that $M_X(t)$ exists at $\pm t$ is in fact used to ensure the validity of the exchange of differentiation and integration. Besides this, we also need to argue that $E|X|^n < \infty$ for any positive integer $n$ under the condition that $M_X(t)$ exists at $\pm t$.

We first show that, if $M_X(t)$ exists at $\pm t$, then for any $s \in (-t, t)$, $M_X(s)$ exists:
$$M_X(s) = E(e^{sX}) = E[e^{sX} I(X > 0)] + E[e^{sX} I(X \le 0)] \le E[e^{tX} I(X > 0)] + E[e^{-tX} I(X \le 0)] \le E(e^{tX}) + E(e^{-tX}) = M_X(t) + M_X(-t) < \infty,$$
where $I(X > 0)$ is the indicator of $X > 0$.

Next, we show that, for any $p > 0$, $E|X|^p < \infty$ under the condition that $M_X(t)$ exists at $\pm t$. For a given $p > 0$, choose $s$ such that $0 < ps < t$. Because $s|X| \le e^{s|X|}$, we have
$$s^p |X|^p \le e^{ps|X|} \le e^{psX} + e^{-psX}$$
and
$$E|X|^p \le s^{-p} E(e^{psX} + e^{-psX}) = s^{-p}\left[M_X(ps) + M_X(-ps)\right] < \infty.$$

In fact, the condition that $M_X(t)$ exists at $\pm t$ ensures that $M_X(s)$ has the power series expansion
$$M_X(s) = \sum_{k=0}^{\infty} \frac{E(X^k) s^k}{k!}, \qquad -t < s < t.$$
If the distribution of $X$ is symmetric (about 0), i.e., $X$ and $-X$ have the same distribution, then
$$M_X(t) = E(e^{tX}) = E(e^{t(-X)}) = E(e^{-tX}) = M_X(-t),$$
i.e., $M_X(t)$ is an even function, and "$M_X(t)$ exists at $\pm t$" is the same as "$M_X(t)$ exists at a $t > 0$".

Example 2.3.8 (Gamma mgf). Let $X$ have the gamma pdf
$$f_X(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} x^{\alpha-1} e^{-x/\beta}, \qquad x > 0,$$
where $\alpha > 0$ and $\beta > 0$ are two constants and
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx, \qquad \alpha > 0,$$
is the so-called gamma function.

If $t < 1/\beta$,
$$M_X(t) = E(e^{tX}) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} e^{tx} x^{\alpha-1} e^{-x/\beta}\,dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} x^{\alpha-1} e^{-x/\left(\frac{\beta}{1-\beta t}\right)}\,dx$$
$$= \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \left(\frac{\beta}{1-\beta t}\right)^{\alpha} \int_0^{\infty} s^{\alpha-1} e^{-s}\,ds = \frac{1}{(1-\beta t)^{\alpha}}.$$
If $t \ge 1/\beta$, then $E(e^{tX}) = \infty$.

We can obtain
$$E(X) = \frac{d}{dt} M_X(t)\Big|_{t=0} = \frac{\alpha\beta}{(1-\beta t)^{\alpha+1}}\Big|_{t=0} = \alpha\beta.$$
For any integer $n > 1$,
$$E(X^n) = \frac{d^n}{dt^n} M_X(t)\Big|_{t=0} = \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)\beta^n}{(1-\beta t)^{\alpha+n}}\Big|_{t=0} = \alpha(\alpha+1)\cdots(\alpha+n-1)\beta^n.$$
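The gamma moment formula can be verified symbolically (an added illustration): differentiate the mgf $(1-\beta t)^{-\alpha}$ with SymPy and compare with $\alpha(\alpha+1)\cdots(\alpha+n-1)\beta^n$, written below as a rising factorial.

```python
import sympy as sp

t, alpha, beta = sp.symbols('t alpha beta', positive=True)
M = (1 - beta * t) ** (-alpha)               # gamma mgf, valid for t < 1/beta

for n in range(1, 4):
    deriv = sp.diff(M, t, n).subs(t, 0)      # nth derivative at t = 0
    expected = sp.rf(alpha, n) * beta ** n   # rising factorial alpha(alpha+1)...(alpha+n-1) times beta^n
    print(n, sp.simplify(deriv - expected))  # prints 0 for each n
```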

Can the moments determine a distribution? Can two random variables with different distributions have the same moments of every order?

Example 2.3.10. $X_1$ has pdf
$$f_1(x) = \frac{1}{\sqrt{2\pi}\,x} e^{-(\log x)^2/2}, \qquad x \ge 0;$$
$X_2$ has pdf
$$f_2(x) = f_1(x)\left[1 + \sin(2\pi \log x)\right], \qquad x \ge 0.$$
For any positive integer $n$,
$$E(X_1^n) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} x^{n-1} e^{-(\log x)^2/2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ny - y^2/2}\,dy \qquad (y = \log x)$$
$$= \frac{e^{n^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(y-n)^2/2}\,dy = e^{n^2/2},$$
using the property of a normal distribution (a normal pdf integrates to 1).

$$E(X_2^n) = \int_0^{\infty} x^n f_1(x)\left[1 + \sin(2\pi \log x)\right]dx = E(X_1^n) + \frac{1}{\sqrt{2\pi}} \int_0^{\infty} x^{n-1} e^{-(\log x)^2/2} \sin(2\pi \log x)\,dx$$
$$= E(X_1^n) + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ny} e^{-y^2/2} \sin(2\pi y)\,dy$$
$$= E(X_1^n) + \frac{e^{n^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(y-n)^2/2} \sin(2\pi y)\,dy$$
$$= E(X_1^n) + \frac{e^{n^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-s^2/2} \sin(2\pi(s+n))\,ds$$
$$= E(X_1^n) + \frac{e^{n^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-s^2/2} \sin(2\pi s)\,ds = E(X_1^n),$$
since $\sin(2\pi(s+n)) = \sin(2\pi s)$ for integer $n$ and $e^{-s^2/2}\sin(2\pi s)$ is an odd function. This shows that $X_1$ and $X_2$ have the same moments of order $n = 1, 2, \ldots$, but they have different distributions.
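A numerical sanity check of Example 2.3.10 (added here, not in the slides): after the substitution $y = \log x$ used above, the moments of $X_1$ and $X_2$ are integrals against the standard normal pdf and can be computed by quadrature; both should agree with $e^{n^2/2}$ up to integration error.

```python
import numpy as np
from scipy.integrate import quad

phi = lambda y: np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)   # standard normal pdf

for n in (1, 2, 3):
    # E(X1^n) = integral of e^{ny} phi(y) dy; E(X2^n) adds the factor [1 + sin(2*pi*y)]
    m1, _ = quad(lambda y: np.exp(n * y) * phi(y), -np.inf, np.inf)
    m2, _ = quad(lambda y: np.exp(n * y) * phi(y) * (1 + np.sin(2 * np.pi * y)),
                 -np.inf, np.inf)
    print(n, m1, m2, np.exp(n**2 / 2))                    # the three columns should agree
```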

In some cases, moments do determine the distribution. The mgf, if it exists, determines a distribution.

Theorem 2.3.11. Let $X$ and $Y$ be random variables with cdfs $F_X$ and $F_Y$, respectively.
a. If $X$ and $Y$ are bounded, then $F_X(u) = F_Y(u)$ for all $u$ iff $E(X^r) = E(Y^r)$ for all $r = 1, 2, \ldots$
b. If the mgf's exist in a neighborhood of 0 and $M_X(t) = M_Y(t)$ for all $t$ in that neighborhood, then $F_X(u) = F_Y(u)$ for all $u$.

The key idea of the proof can be explained as follows. Note that $M_X(t) = \int e^{tx} f_X(x)\,dx$ is the Laplace transform of $f_X(x)$. By the uniqueness of the Laplace transform, there is a one-to-one correspondence between the mgf and the pdf. We will give a proof of this result in Chapter 4 for the multivariate case, after we introduce characteristic functions.

From the power series result in the last lecture, if the mgf of $X$ exists in a neighborhood of 0, then it has a power series expansion that is determined by the moments $E(X^n)$, $n = 1, 2, \ldots$. Therefore, knowing the mgf and knowing the moments of all orders are the same, but this is under the condition that the mgf exists in a neighborhood of 0.

Once we establish part (b), the proof of part (a) is easy: if $X$ and $Y$ are bounded, then their mgf's exist for all $t$, and thus their cdf's are the same iff their moments are the same for every order.

The condition that the mgf exists in a neighborhood of 0 is important. There are random variables with finite moments of every order whose mgf's do not exist.

Example. The pdf
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,x} e^{-(\log x)^2/2}, \qquad x \ge 0,$$
is called the log-normal distribution (or density), because if $X$ has pdf $f_X$, then $\log X$ has a normal pdf.

In Example 2.3.10, we have shown that the log-normal distribution has finite moments of every order. For $t > 0$,
$$M_X(t) = \int_0^{\infty} \frac{e^{tx}}{\sqrt{2\pi}\,x} e^{-(\log x)^2/2}\,dx = \infty,$$
because, when $t > 0$,
$$\lim_{x\to\infty} \frac{e^{tx}}{\sqrt{2\pi}\,x} e^{-(\log x)^2/2} = \infty.$$
When $t < 0$,
$$M_X(t) = \int_0^{\infty} \frac{e^{tx}}{\sqrt{2\pi}\,x} e^{-(\log x)^2/2}\,dx \le \int_0^{\infty} \frac{1}{\sqrt{2\pi}\,x} e^{-(\log x)^2/2}\,dx = 1,$$
and, hence, $M_X(t)$ exists for all $t < 0$.

How do we find a distribution for which all moments exist and the mgf does not exist for any $t \ne 0$? Consider the pdf
$$f_Y(x) = \begin{cases} f_X(x)/2 & x > 0, \\ f_X(-x)/2 & x < 0. \end{cases}$$
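The divergence for $t > 0$ can also be seen numerically (an added illustration): evaluating the logarithm of the integrand $e^{tx} f_X(x)$ at increasing $x$ shows it growing without bound for $t > 0$ and decreasing rapidly for $t < 0$, which mirrors the limit argument above.

```python
import numpy as np

def log_integrand(x, t):
    # log of e^{tx} f_X(x) for the log-normal pdf, computed on the log scale to avoid overflow
    return t * x - (np.log(x))**2 / 2 - np.log(x) - 0.5 * np.log(2 * np.pi)

for x in (10.0, 100.0, 1000.0, 10000.0):
    print(x, log_integrand(x, t=0.1), log_integrand(x, t=-0.1))
# For t = 0.1 the log-integrand increases without bound, so the integral (the mgf) is infinite;
# for t = -0.1 it decreases rapidly, consistent with a finite mgf for t < 0.
```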

For this pdf,
$$E(|Y|^n) = \int_0^{\infty} x^n \frac{f_X(x)}{2}\,dx + \int_{-\infty}^{0} (-x)^n \frac{f_X(-x)}{2}\,dx = \int_0^{\infty} x^n f_X(x)\,dx = E(X^n),$$
which has been derived for any $n = 1, 2, \ldots$. On the other hand,
$$E(e^{tY}) = \int_0^{\infty} e^{tx} \frac{f_X(x)}{2}\,dx + \int_{-\infty}^{0} e^{tx} \frac{f_X(-x)}{2}\,dx = \int_0^{\infty} e^{tx} \frac{f_X(x)}{2}\,dx + \int_0^{\infty} e^{-tx} \frac{f_X(x)}{2}\,dx,$$
and we have shown that one of these integrals is $\infty$ (depending on whether $t > 0$ or $t < 0$).

Theorem. If a random variable $X$ has finite moments $a_n = E(X^n)$ for every $n = 1, 2, \ldots$, and the series
$$\sum_{n=0}^{\infty} \frac{a_n t^n}{n!} < \infty \qquad \text{for some } t > 0,$$
then the cdf of $X$ is determined by $a_n$, $n = 1, 2, \ldots$.

Example. Suppose that $a_n = n!$ is the $n$th moment of a random variable $X$. Since
$$\sum_{n=0}^{\infty} \frac{a_n t^n}{n!} = \sum_{n=0}^{\infty} t^n = \frac{1}{1-t}, \qquad |t| < 1,$$
and this function is the mgf of Gamma(1,1) at $t$, we conclude that $X \sim$ Gamma(1,1).

Suppose that the $n$th moment of a random variable $Y$ is
$$a_n = \begin{cases} n!/(n/2)! & \text{if } n \text{ is even,} \\ 0 & \text{if } n \text{ is odd.} \end{cases}$$
Then
$$\sum_{n=0}^{\infty} \frac{a_n t^n}{n!} = \sum_{n \text{ even}} \frac{n!\,(t^2)^{n/2}}{n!\,(n/2)!} = \sum_{k=0}^{\infty} \frac{t^{2k}}{k!} = e^{t^2}, \qquad t \in \mathbb{R}.$$
Later, we show that this is the mgf of $N(0,2)$; hence $Y \sim N(0,2)$.

For the log-normally distributed random variable $X$ discussed earlier in this lecture, $E(X^n) = e^{n^2/2}$ and $\sum_{n=0}^{\infty} e^{n^2/2}\, t^n/n! = \infty$ for any $t > 0$; hence, the theorem is not applicable.
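The two series identities can be checked by expanding the candidate mgf's with SymPy (an added illustration): the coefficient of $t^n$ in $1/(1-t)$ should equal $a_n/n! = 1$, and the coefficient of $t^{2k}$ in $e^{t^2}$ should equal $a_{2k}/(2k)! = 1/k!$.

```python
import sympy as sp

t = sp.symbols('t')

# a_n = n!: the coefficient of t^n in the Gamma(1,1) mgf 1/(1-t) is 1 = a_n / n!.
print(sp.series(1 / (1 - t), t, 0, 6))    # 1 + t + t**2 + t**3 + t**4 + t**5 + O(t**6)

# a_{2k} = (2k)!/k!: the coefficient of t^{2k} in exp(t^2), the mgf of N(0, 2), is 1/k!.
print(sp.series(sp.exp(t**2), t, 0, 7))   # 1 + t**2 + t**4/2 + t**6/6 + O(t**7)
```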

In applications we often need to approximate a cdf by a sequence of cdf's. The next theorem gives a sufficient condition for the convergence of cdf's and moments of random variables in terms of the convergence of mgf's.

Theorem 2.3.12. Suppose that $X_1, X_2, \ldots$ is a sequence of random variables with mgf's $M_{X_n}(t)$, and
$$\lim_{n\to\infty} M_{X_n}(t) = M_X(t) < \infty \qquad \text{for all } t \text{ in a neighborhood of } 0,$$
where $M_X(t)$ is the mgf of a random variable $X$. Then, for all $x$ at which $F_X(x)$ is continuous,
$$\lim_{n\to\infty} F_{X_n}(x) = F_X(x).$$
Furthermore, for any $p > 0$, we have
$$\lim_{n\to\infty} E|X_n|^p = E|X|^p \qquad \text{and} \qquad \lim_{n\to\infty} E|X_n - X|^p = 0.$$

Example 2.3.13 (Poisson approximation). The cdf of the binomial pmf,
$$f_X(x) = \binom{n}{x} p^x (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n,$$
where $n$ is a positive integer and $0 < p < 1$, may not be easy to calculate when $n$ is very large. It is often approximated by the cdf of the Poisson pmf,
$$f_Y(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \qquad x = 0, 1, 2, \ldots,$$
where $\lambda > 0$ is a constant. For this purpose, we first compute the mgf's of the binomial and Poisson distributions.

Example 2.3.9 (binomial mgf). Using the binomial formula
$$\sum_{x=0}^{n} \binom{n}{x} u^x v^{n-x} = (u+v)^n,$$

we obtain
$$M_X(t) = E(e^{tX}) = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x (1-p)^{n-x} = \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x (1-p)^{n-x} = (pe^t + 1 - p)^n.$$
Note that
$$E(X) = \frac{d}{dt} M_X(t)\Big|_{t=0} = n(pe^t + 1 - p)^{n-1} pe^t \Big|_{t=0} = np,$$
$$E(X^2) = \frac{d^2}{dt^2} M_X(t)\Big|_{t=0} = \left[ n(n-1)(pe^t + 1 - p)^{n-2}(pe^t)^2 + n(pe^t + 1 - p)^{n-1} pe^t \right]\Big|_{t=0} = n(n-1)p^2 + np.$$
We obtained the same results previously, but the calculation here is simpler.
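A short symbolic check of the binomial moment calculation (added here, not in the slides): differentiate $(pe^t + 1 - p)^n$ at $t = 0$ with SymPy.

```python
import sympy as sp

t, p, n = sp.symbols('t p n', positive=True)
M = (p * sp.exp(t) + 1 - p) ** n                        # binomial mgf

EX = sp.simplify(sp.diff(M, t, 1).subs(t, 0))
EX2 = sp.simplify(sp.diff(M, t, 2).subs(t, 0))
print(EX)                                               # n*p
print(sp.simplify(EX2 - (n * (n - 1) * p**2 + n * p)))  # 0, i.e. E(X^2) = n(n-1)p^2 + np
```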

Example 2.3.13 (continued). If $Y$ has the Poisson pmf, then
$$M_Y(t) = \sum_{x=0}^{\infty} e^{tx} \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(e^t\lambda)^x}{x!} = e^{-\lambda} e^{e^t\lambda} = e^{\lambda(e^t - 1)},$$
which is finite for any $t \in \mathbb{R}$.

Let $X_n$ have the binomial distribution with $n$ and $p$. Suppose that $\lim_{n\to\infty} np = \lambda > 0$ (which means that $p$ also depends on $n$ and $p \to 0$ as $n \to \infty$). Then, for any $t$, as $n \to \infty$,
$$M_{X_n}(t) = (pe^t + 1 - p)^n = \left[ 1 + \frac{(np)(e^t - 1)}{n} \right]^n \to e^{\lambda(e^t - 1)} = M_Y(t),$$
using the fact that, for any sequence of numbers $a_n$ converging to $a$,
$$\lim_{n\to\infty} \left(1 + \frac{a_n}{n}\right)^n = e^a.$$
With this result and Theorem 2.3.12, we can approximate $P(X_n \le u)$ by $P(Y \le u)$ when $n$ is large and $np$ is approximately a constant.
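The quality of the approximation can be inspected numerically (an added illustration using SciPy); the choices $n = 1000$ and $p = 0.003$, so that $\lambda = np = 3$, are arbitrary.

```python
import numpy as np
from scipy import stats

n, p = 1000, 0.003
lam = n * p                                   # lambda = np = 3

u = np.arange(0, 11)
binom_cdf = stats.binom.cdf(u, n, p)          # P(X_n <= u)
pois_cdf = stats.poisson.cdf(u, lam)          # P(Y <= u)

for ui, b, q in zip(u, binom_cdf, pois_cdf):
    print(ui, round(b, 5), round(q, 5))       # the two cdf columns are close for large n and small p
```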