
4 Moment Generating Functions*

This section is optional. The moment generating function $g : \mathbb{R} \to \mathbb{R}$ of a random variable $X$ is defined as $g(t) = E[e^{tX}]$.

Proposition 1. We have $g^{(n)}(0) = E[X^n]$ for $n = 1, 2, \ldots$

Proof. We have
$$g^{(1)}(t) = \frac{dE[e^{tX}]}{dt} = E\left[\frac{de^{tX}}{dt}\right] = E[Xe^{tX}].$$
Therefore $g^{(1)}(0) = E[X]$. Similarly,
$$g^{(2)}(t) = \frac{d^2E[e^{tX}]}{dt^2} = E\left[\frac{d^2e^{tX}}{dt^2}\right] = E[X^2e^{tX}].$$
Therefore $g^{(2)}(0) = E[X^2]$. Inductively, we have $g^{(n)}(0) = E[X^n]$ for $n = 1, 2, \ldots$
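The derivative property can be checked numerically. The following sketch (an illustration of mine, not part of the notes) applies central finite differences at $t = 0$ to the MGF of an exponential distribution with rate $\lambda = 2$, whose moments $E[X] = 1/\lambda$ and $E[X^2] = 2/\lambda^2$ are known in closed form:

```python
# Sketch: check g'(0) = E[X] and g''(0) = E[X^2] by finite differences
# for X ~ Exponential(lam), whose MGF is g(t) = lam / (lam - t) for t < lam.
def mgf(t, lam=2.0):
    return lam / (lam - t)

h = 1e-4
# central finite differences for the first two derivatives at t = 0
g1 = (mgf(h) - mgf(-h)) / (2 * h)               # ~ E[X]   = 1/lam  = 0.5
g2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2   # ~ E[X^2] = 2/lam^2 = 0.5

print(round(g1, 3))  # ≈ 0.5
print(round(g2, 3))  # ≈ 0.5
```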

Example 1. In this example, we will find the moment generating function of $X$ which follows the geometric distribution with the following p.m.f.: $p(1-p)^{x-1}$, $x = 1, 2, \ldots$, and $0 < p < 1$. By definition, for $(1-p)e^t < 1$, we have
$$g(t) = E[e^{tX}] = \frac{p}{1-p}\sum_{x=1}^{\infty} \left((1-p)e^t\right)^x = \frac{pe^t}{1-(1-p)e^t}.$$

Example 2. In this example, we will find the moment generating function of $X$ which follows the exponential distribution with the following p.d.f.: $\lambda e^{-\lambda x}$, $x \geq 0$ and $\lambda > 0$. By definition, for $t < \lambda$, we have
$$g(t) = E[e^{tX}] = \int_0^\infty e^{tx}\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t}.$$

Remark 1. The moment generating function uniquely determines the distribution. In other words, if two random variables $X$ and $Y$ have the same moment generating function, then $X$ and $Y$ have the same distribution.
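As a sanity check on Example 1 (my own sketch, with arbitrarily chosen $p = 0.3$ and $t = 0.1$ satisfying $(1-p)e^t < 1$), the closed-form geometric MGF can be compared against a truncation of the defining series:

```python
# Sketch: compare the closed-form geometric MGF  p e^t / (1 - (1-p) e^t)
# with the truncated defining sum  sum_{x>=1} e^{t x} p (1-p)^{x-1}.
import math

p, t = 0.3, 0.1  # chosen so that (1 - p) * e^t < 1, ensuring convergence
closed = p * math.exp(t) / (1 - (1 - p) * math.exp(t))
series = sum(math.exp(t * x) * p * (1 - p) ** (x - 1) for x in range(1, 200))
print(abs(closed - series) < 1e-9)  # True: the series matches the formula
```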

5 Some Fundamental Results

The entire section, except the central limit theorem, is optional.

The following result is known as the Markov inequality.

Proposition 2. If a random variable $X$ takes on only non-negative values, then for any $a > 0$, we have
$$P(X \geq a) \leq \frac{E[X]}{a}.$$

Proof. We give a proof for the case when $X$ is a continuous random variable. The case for a discrete random variable is similar and therefore omitted. Let $f(x)$ be the probability density function of $X$. Then
$$E[X] = \int_0^\infty xf(x)\,dx = \int_0^a xf(x)\,dx + \int_a^\infty xf(x)\,dx \geq \int_a^\infty xf(x)\,dx \geq \int_a^\infty af(x)\,dx = aP(X \geq a),$$
where the last inequality holds because $xf(x) \geq af(x)$ for $x \geq a$.
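A quick Monte Carlo sketch (mine, not from the notes) of the Markov inequality, drawing Exponential(1) samples with Python's standard library RNG and checking the bound $P(X \geq a) \leq E[X]/a$ at a few values of $a$:

```python
# Sketch: empirical check of Markov's inequality P(X >= a) <= E[X]/a
# for nonnegative X ~ Exponential(1), where the true E[X] = 1.
import random

random.seed(0)
n = 100_000
xs = [random.expovariate(1.0) for _ in range(n)]
for a in (1.0, 2.0, 4.0):
    p_emp = sum(x >= a for x in xs) / n   # empirical P(X >= a), ~ e^{-a}
    bound = (sum(xs) / n) / a             # E[X]/a estimated from the sample
    print(a, p_emp <= bound)              # the bound holds comfortably
```

Note that the bound is loose: at $a = 1$ the true probability is $e^{-1} \approx 0.37$ while the bound is $1$.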

Using the Markov inequality, one can also prove Chebyshev's inequality.

Proposition 3. Let $X$ be a random variable (which may take negative values). Then, for any $a > 0$, we have
$$P(|X - E[X]| \geq a) \leq \frac{\mathrm{Var}(X)}{a^2}.$$

Proof. Note that $(X - E[X])^2$ is a non-negative random variable with mean $E\left[(X - E[X])^2\right] = \mathrm{Var}(X)$. Now, applying the Markov inequality, we deduce that, for any $a > 0$,
$$P(|X - E[X]| \geq a) = P\left((X - E[X])^2 \geq a^2\right) \leq \frac{\mathrm{Var}(X)}{a^2}.$$
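An analogous empirical sketch (again my own illustration) for Chebyshev's inequality, using Uniform(0, 1) samples, for which $E[X] = 1/2$ and $\mathrm{Var}(X) = 1/12$:

```python
# Sketch: empirical check of Chebyshev's inequality
# P(|X - E[X]| >= a) <= Var(X)/a^2 for X ~ Uniform(0, 1).
import random

random.seed(1)
n = 100_000
xs = [random.random() for _ in range(n)]
mean = sum(xs) / n                           # ~ 1/2
var = sum((x - mean) ** 2 for x in xs) / n   # ~ 1/12
for a in (0.3, 0.4, 0.5):
    p_emp = sum(abs(x - mean) >= a for x in xs) / n
    print(a, p_emp <= var / a ** 2)          # the bound holds each time
```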

The following result is the famous weak law of large numbers.

Proposition 4. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables having mean $\mu$ and finite variance $\sigma^2$. Then for any $\varepsilon > 0$ we have
$$\lim_{n\to\infty} P\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right| > \varepsilon\right) = 0.$$

Proof. Let $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$. We then have
$$E[\bar{X}] = \frac{E[X_1] + E[X_2] + \cdots + E[X_n]}{n} = \frac{\mu + \mu + \cdots + \mu}{n} = \mu$$
and
$$\mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)}{n^2} = \frac{\sigma^2 + \sigma^2 + \cdots + \sigma^2}{n^2} = \frac{\sigma^2}{n}.$$
By Chebyshev's inequality, we have
$$P(|\bar{X} - \mu| \geq \varepsilon) \leq \frac{\mathrm{Var}(\bar{X})}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2}.$$
Therefore for any positive $\varepsilon$ we have
$$\lim_{n\to\infty} P(|\bar{X} - \mu| > \varepsilon) = 0.$$
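The concentration the weak law predicts is easy to see in simulation. A small sketch of mine, averaging Uniform(0, 1) draws ($\mu = 1/2$) for increasing $n$ and printing the deviation $|\bar{X} - \mu|$:

```python
# Sketch: the sample mean of i.i.d. Uniform(0,1) draws (mu = 0.5)
# concentrates around mu as n grows, as the weak law of large numbers predicts.
import random

random.seed(2)
mu = 0.5
devs = {}
for n in (100, 10_000, 1_000_000):
    xbar = sum(random.random() for _ in range(n)) / n
    devs[n] = abs(xbar - mu)
    print(n, round(devs[n], 4))  # the deviation shrinks as n grows
```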

Here let us also state without proof the strong law of large numbers.

Proposition 5. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent and identically distributed random variables having mean $\mu$ and finite variance $\sigma^2$. Then we have
$$P\left(\lim_{n\to\infty} \frac{X_1 + X_2 + \cdots + X_n}{n} = \mu\right) = 1.$$

The strong law of large numbers is a stronger version of the weak law of large numbers; it says that, with probability 1, the long-run average of a sequence of independent and identically distributed random variables converges to its mean.

The following result is the famous central limit theorem (C.L.T.), which is of central importance in probability theory and statistics. The statement of the central limit theorem is required (but its proof is optional) in this course and will be reviewed in greater detail later in the lecture notes.

Proposition 6. Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of independent and identically distributed random variables with mean $\mu$ and variance $\sigma^2$. Then the cumulative distribution function of the following random variable tends to that of the normal random variable with mean 0 and variance 1 as $n \to \infty$:
$$\frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}.$$

Proof. Suppose first that each $X_i$ has mean 0 and variance 1. Then the moment generating function of $(X_1 + X_2 + \cdots + X_n)/\sqrt{n}$ is
$$E\left[\exp\left\{t\left(\frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}\right)\right\}\right] = E\left[e^{tX_1/\sqrt{n}}\,e^{tX_2/\sqrt{n}}\cdots e^{tX_n/\sqrt{n}}\right] = \left(E\left[e^{tX_1/\sqrt{n}}\right]\right)^n.$$

Now, for $n$ large enough, we obtain from the Taylor series expansion of the function $e^x$ that
$$e^{tX_1/\sqrt{n}} \approx 1 + \frac{tX_1}{\sqrt{n}} + \frac{t^2X_1^2}{2n}.$$
Taking expectations shows that when $n$ is large,
$$E\left[e^{tX_1/\sqrt{n}}\right] \approx 1 + \frac{tE[X_1]}{\sqrt{n}} + \frac{t^2E[X_1^2]}{2n} = 1 + \frac{t^2}{2n},$$
since $E[X_1] = 0$ and $E[X_1^2] = 1$. Therefore, we obtain that when $n$ is large,
$$E\left[\exp\left\{t\left(\frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}\right)\right\}\right] = \left(1 + \frac{t^2}{2n}\right)^n \approx e^{t^2/2}.$$
In other words, the moment generating function of $(X_1 + \cdots + X_n)/\sqrt{n}$ converges to $e^{t^2/2}$, the moment generating function of a normal random variable with mean 0 and variance 1 (a standard normal random variable). Consequently, its distribution function also converges to that of a standard normal random variable. When each $X_i$ has mean $\mu$ and variance $\sigma^2$, the random variable $(X_i - \mu)/\sigma$ has mean 0 and variance 1. Applying the previous argument to $(X_i - \mu)/\sigma$, we deduce that the distribution of $(X_1 - \mu + X_2 - \mu + \cdots + X_n - \mu)/(\sigma\sqrt{n})$ converges to that of a standard normal random variable, which completes the proof.
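The conclusion can be illustrated by simulation. In this sketch of mine, sums of $n = 30$ Uniform(0, 1) draws ($\mu = 1/2$, $\sigma^2 = 1/12$) are standardized as in Proposition 6; if the result is roughly standard normal, about 95% of the draws should land in $[-1.96, 1.96]$:

```python
# Sketch: Monte Carlo check of the CLT. For i.i.d. Uniform(0,1)
# (mu = 1/2, sigma^2 = 1/12), the standardized sum
# (X_1 + ... + X_n - n*mu) / (sigma * sqrt(n)) is roughly N(0, 1),
# so about 95% of draws should fall in [-1.96, 1.96].
import math
import random

random.seed(3)
n, trials = 30, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)
hits = 0
for _ in range(trials):
    z = (sum(random.random() for _ in range(n)) - n * mu) / (sigma * math.sqrt(n))
    hits += -1.96 <= z <= 1.96
print(round(hits / trials, 2))  # close to 0.95
```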

Summary

1. Combination
2. Permutation
3. Sample space
4. Random variable
5. Conditional probability and independent events
6. Probability distribution
7. Uniform distribution
8. Exponential distribution
9. Normal distribution
10. Expected value $E[X]$
11. Variance $\mathrm{Var}(X)$
12. $\mathrm{Var}(X) = E[X^2] - E^2[X]$