Review of Probability Theory II
January 9-23, 2008

1 Expectation

If the sample space Ω = {ω_1, ω_2, ...} is countable and g is a real-valued function, then we define the expected value, or expectation, of a function g of X by

    Eg(X) = Σ_i g(X(ω_i)) P{ω_i}.

To create a formula for discrete random variables, write R(x) for the set of ω so that X(ω) = x; then the sum above can be written

    Eg(X) = Σ_x Σ_{ω ∈ R(x)} g(X(ω)) P{ω} = Σ_x Σ_{ω ∈ R(x)} g(x) P{ω}
          = Σ_x g(x) Σ_{ω ∈ R(x)} P{ω} = Σ_x g(x) P{ω; X(ω) = x}
          = Σ_x g(x) P{X = x} = Σ_x g(x) f_X(x),

provided that the sum converges absolutely. Here f_X is the mass function for X.

For a continuous random variable with distribution function F and density f, choose a small positive value Δx, and let X̃ be the random variable obtained by rounding the value of X down to the nearest integer multiple of Δx. Then

    Eg(X̃) = Σ_x̃ g(x̃) P{X̃ = x̃} = Σ_x̃ g(x̃) P{x̃ ≤ X < x̃ + Δx}
           = Σ_x̃ g(x̃) (F(x̃ + Δx) − F(x̃)) ≈ Σ_x̃ g(x̃) f(x̃) Δx ≈ ∫ g(x) f(x) dx.

Provided that the integral converges absolutely, these approximations become an equality in the limit as Δx → 0.

Exercise 1. Let X_1 and X_2 be random variables on a countable sample space Ω having a common state space. Let g_1 and g_2 be two real-valued functions on the state space and let c_1 and c_2 be two numbers. Then

    E[c_1 g_1(X_1) + c_2 g_2(X_2)] = c_1 Eg_1(X_1) + c_2 Eg_2(X_2).
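The discrete formula Eg(X) = Σ_x g(x) f_X(x), the Riemann-sum approximation for continuous random variables, and the linearity property of Exercise 1 can all be checked numerically. A minimal Python sketch, using a hypothetical uniform mass function and the exponential density purely for illustration:

```python
import math

def expect(pmf, g):
    """E g(X) = sum over x of g(x) * f_X(x), for a mass function pmf: x -> P{X = x}."""
    return sum(g(x) * p for x, p in pmf.items())

# A hypothetical mass function for illustration: X uniform on {1, 2, 3, 4}.
pmf = {x: 0.25 for x in (1, 2, 3, 4)}

mean = expect(pmf, lambda x: x)           # EX
second = expect(pmf, lambda x: x ** 2)    # EX^2

# Linearity (Exercise 1): E[c1*g1(X) + c2*g2(X)] = c1*Eg1(X) + c2*Eg2(X).
lhs = expect(pmf, lambda x: 3 * x + 2 * x ** 2)
rhs = 3 * mean + 2 * second
assert math.isclose(lhs, rhs)

# Riemann-sum approximation of EX for the exponential(1) density f(x) = e^{-x},
# mirroring the sum over multiples of a small Δx in the text:
dx = 0.001
approx = sum(x * math.exp(-x) * dx for x in (i * dx for i in range(20000)))
# approx is close to the exact mean, 1
```

The dictionary-based `expect` helper and the grid cutoff at x = 20 are arbitrary illustrative choices, not part of the notes.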
Several choices for g have special names.

1. If g(x) = x, then µ = EX is called, variously, the mean and the first moment.
2. If g(x) = x^k, then EX^k is called the k-th moment.
3. If g(x) = (x)_k, where (x)_k = x(x − 1) ··· (x − k + 1), then E(X)_k is called the k-th factorial moment.
4. If g(x) = (x − µ)^k, then E(X − µ)^k is called the k-th central moment.
5. The second central moment σ²_X = E(X − µ)² is called the variance. Note that

       Var(X) = E(X − µ)² = EX² − 2µEX + µ² = EX² − 2µ² + µ² = EX² − µ².

6. If X is R^d-valued and g(x) = e^{i⟨θ, x⟩}, where ⟨·, ·⟩ is the standard inner product, then φ(θ) = Ee^{i⟨θ, X⟩} is called the Fourier transform or the characteristic function.
7. Similarly, if X is R^d-valued and g(x) = e^{⟨θ, x⟩}, then m(θ) = Ee^{⟨θ, X⟩} is called the Laplace transform or the moment generating function.
8. If X is Z⁺-valued and g(x) = z^x, then ρ(z) = Ez^X = Σ_{x=0}^∞ z^x P{X = x} is called the (probability) generating function.

Table of Discrete Random Variables

random variable     parameters   mean        variance                       generating function
Bernoulli           p            p           p(1 − p)                       (1 − p) + pz
binomial            n, p         np          np(1 − p)                      ((1 − p) + pz)^n
hypergeometric      N, n, k      nk/N        n (k/N)(1 − k/N)(N − n)/(N − 1)
geometric           p            (1 − p)/p   (1 − p)/p²                     p/(1 − (1 − p)z)
negative binomial   a, p         a(1 − p)/p  a(1 − p)/p²                    (p/(1 − (1 − p)z))^a
Poisson             λ            λ           λ                              exp(−λ(1 − z))
uniform             a, b         (a + b)/2   ((b − a + 1)² − 1)/12          (z^a − z^{b+1})/((b − a + 1)(1 − z))
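The table entries can be spot-checked by direct summation against a mass function. A sketch for the binomial row, with parameters n = 10 and p = 0.3 chosen arbitrarily for illustration:

```python
import math

# Binomial(n, p) mass function via the binomial coefficient.
n, p = 10, 0.3
pmf = {x: math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# Mean and variance by direct summation; compare with the table entries.
mean = sum(x * q for x, q in pmf.items())
var = sum((x - mean)**2 * q for x, q in pmf.items())
assert math.isclose(mean, n * p)              # table: np
assert math.isclose(var, n * p * (1 - p))     # table: np(1-p)

# Generating function rho(z) = E z^X should equal ((1-p) + p z)^n.
z = 0.7
rho = sum(z**x * q for x, q in pmf.items())
assert math.isclose(rho, ((1 - p) + p * z)**n)
```

The same direct-summation check works for any other row of the table once its mass function is written down.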
Table of Continuous Random Variables

random variable   parameters   mean              variance                              characteristic function
beta              α, β         α/(α + β)         αβ/((α + β)²(α + β + 1))              1F1(α; α + β; iθ)
Cauchy            µ, σ         none              none                                  exp(iµθ − σ|θ|)
chi-squared       a            a                 2a                                    (1 − 2iθ)^{−a/2}
exponential       λ            1/λ               1/λ²                                  iλ/(θ + iλ)
F                 q, a         a/(a − 2), a > 2  2a²(q + a − 2)/(q(a − 4)(a − 2)²), a > 4
gamma             α, β         α/β               α/β²                                  (iβ/(θ + iβ))^α
Laplace           µ, σ         µ                 2σ²                                   exp(iµθ)/(1 + σ²θ²)
normal            µ, σ²        µ                 σ²                                    exp(iµθ − σ²θ²/2)
Pareto            c, α         cα/(α − 1), α > 1 c²α/((α − 1)²(α − 2)), α > 2
t                 a, µ, σ²     µ, a > 1          σ²a/(a − 2), a > 2
uniform           a, b         (a + b)/2         (b − a)²/12                           (exp(iθb) − exp(iθa))/(iθ(b − a))

2 Joint Distributions and Conditioning

A pair of random variables X_1 and X_2 is called independent if for every pair of events A_1, A_2,

    P{X_1 ∈ A_1, X_2 ∈ A_2} = P{X_1 ∈ A_1} P{X_2 ∈ A_2}.    (8)

For their distribution functions F_{X_1} and F_{X_2}, (8) is equivalent to the factoring of the joint distribution function,

    F(x_1, x_2) = F_{X_1}(x_1) F_{X_2}(x_2),

to the factoring of the joint density for continuous random variables,

    f(x_1, x_2) = f_{X_1}(x_1) f_{X_2}(x_2),

to the factoring of the joint mass function for discrete random variables, and, finally, to the factoring of expectations,

    Eg_1(X_1)g_2(X_2) = Eg_1(X_1) Eg_2(X_2).

Definition 2. For a pair of random variables X_1 and X_2 with means µ_1 and µ_2, the covariance is defined by

    Cov(X_1, X_2) = E(X_1 − µ_1)(X_2 − µ_2) = EX_1X_2 − µ_1µ_2.

In particular, if X_1 and X_2 are independent, then Cov(X_1, X_2) = 0. The correlation is

    ρ(X_1, X_2) = Cov(X_1, X_2) / √(Var(X_1) Var(X_2)).
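The factoring of expectations for independent random variables, and the resulting Cov(X_1, X_2) = 0, can be verified exactly on a small example by building the product (joint) mass function. A sketch with two hypothetical mass functions chosen for illustration:

```python
import math

# Two hypothetical marginal mass functions.
pmf1 = {0: 0.5, 1: 0.5}          # Bernoulli(1/2)
pmf2 = {1: 0.2, 2: 0.3, 3: 0.5}

# Independence: the joint mass function is the product of the marginals.
joint = {(x1, x2): q1 * q2 for x1, q1 in pmf1.items() for x2, q2 in pmf2.items()}

mu1 = sum(x * q for x, q in pmf1.items())
mu2 = sum(x * q for x, q in pmf2.items())
exy = sum(x1 * x2 * q for (x1, x2), q in joint.items())

# Cov(X1, X2) = E X1 X2 - mu1 mu2 vanishes for independent variables.
cov = exy - mu1 * mu2
assert math.isclose(cov, 0, abs_tol=1e-12)
```

Running the same computation on a joint mass function that does not factor gives a nonzero covariance, which makes the contrast concrete.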
Exercise 3. Var(X_1 + ··· + X_n) = Σ_{i=1}^n Σ_{j=1}^n Cov(X_i, X_j).

For a pair of jointly continuous random variables, the marginal density of X is

    f_X(x) = ∫ f(x, y) dy.

The conditional density of Y given X is

    f_{Y|X}(y|x) = f(x, y) / f_X(x).

The conditional expectation is the expectation using the conditional density:

    E[g(Y) | X = x] = ∫ g(y) f_{Y|X}(y|x) dy.

Similar expressions for the marginal mass function and the conditional mass function, replacing integrals by sums, exist for discrete random variables. The conditional mass function of Y given X is

    f_{Y|X}(y|x) = f(x, y) / f_X(x).

The conditional expectation is the expectation using the conditional mass function:

    E[g(Y) | X = x] = Σ_y g(y) f_{Y|X}(y|x).

3 Law of Large Numbers

The law of large numbers states that the long-term empirical average of independent random variables X_1, X_2, ... having a common distribution function F possessing a mean µ converges to that mean. In symbols, we have, with probability 1,

    X̄_n = (1/n)(X_1 + X_2 + ··· + X_n) = (1/n) S_n → µ  as n → ∞.

We can define the empirical distribution function

    F_n(x) = (1/n) #(observations from X_1, X_2, ..., X_n that are less than or equal to x)
           = (1/n) Σ_{i=1}^n I_{(−∞, x]}(X_i).

Then, by the strong law, we have, with probability 1,

    F_n(x) → F(x)  as n → ∞.

The Glivenko-Cantelli theorem states that this convergence is uniform in x.
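Both the strong law and the Glivenko-Cantelli statement can be illustrated by simulation. A sketch using uniform(0, 1) observations, for which F(x) = x; the sample size and seed are arbitrary choices made for reproducibility:

```python
import random

random.seed(1)
xs = [random.random() for _ in range(100_000)]

# Law of large numbers: the sample mean approaches mu = 1/2.
xbar = sum(xs) / len(xs)
assert abs(xbar - 0.5) < 0.01

def F_n(x, data):
    """Empirical distribution function: fraction of observations <= x."""
    return sum(1 for v in data if v <= x) / len(data)

# Glivenko-Cantelli in miniature: F_n is close to F(x) = x at several points.
for x in (0.1, 0.5, 0.9):
    assert abs(F_n(x, xs) - x) < 0.01
```

Checking three fixed points is of course weaker than the uniform convergence the theorem asserts; computing max_x |F_n(x) − F(x)| over the sorted sample would illustrate the full statement.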
4 Central Limit Theorem

For the situation above, we have that X̄_n − µ → 0 as n → ∞ with probability 1. The central limit theorem states that if we magnify the difference by a factor of √n, then we see convergence of the distributions to a normal random variable.

Definition 4. A sequence of distribution functions {F_n; n ≥ 1} is said to converge in distribution to the distribution function F if

    lim_{n→∞} F_n(x) = F(x)

whenever x is a continuity point for F.

Theorem 5 (Central Limit Theorem). If the sequence {X_n; n ≥ 1} introduced above has common variance σ², then

    lim_{n→∞} P{ (√n/σ)(X̄_n − µ) ≤ z } = Φ(z),

where Φ is the distribution function of a standard normal random variable. We often write

    (√n/σ)(X̄_n − µ) = (S_n − nµ)/(σ√n).
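The theorem can be illustrated by simulating many standardized sample means and comparing their empirical distribution with Φ. A sketch with uniform(0, 1) summands, for which µ = 1/2 and σ² = 1/12; the sample sizes and seed are arbitrary illustrative choices:

```python
import math
import random

random.seed(2)
n, reps = 50, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and standard deviation of uniform(0,1)

# Standardized sample means (sqrt(n)/sigma)(X-bar_n - mu), one per replicate.
zs = []
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    zs.append(math.sqrt(n) * (xbar - mu) / sigma)

def Phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# The empirical fraction below z should be close to Phi(z).
for z in (-1.0, 0.0, 1.0):
    frac = sum(1 for v in zs if v <= z) / reps
    assert abs(frac - Phi(z)) < 0.02
```

Replacing the uniform summands by any other distribution with finite variance, after adjusting µ and σ, gives the same agreement with Φ, which is the content of the theorem.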