Chapter: Discrete Distributions

Objectives
- Basic Concepts & Expectations
- Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions
- Introduction to Maximum Likelihood Estimation
- Basic Bivariate Distributions: joint, marginal & conditional pdf

Discrete Distributions

The theoretical (population) mean, population variance, and population sd (standard deviation) are:

    µ = (1/N) Σ_{i=1}^{N} x_i = Σ_x x f(x),    σ² = Σ_x (x − µ)² f(x),    σ = √σ²

The sample mean, sample variance, and sample sd from a dataset:

    x̄ = (1/n) Σ_{i=1}^{n} x_i,    s² = (1/(n−1)) Σ_{i=1}^{n} (x_i − x̄)²,    s = √s²

or equivalently, from a frequency table with k distinct values:

    x̄ = (1/n) Σ_{i=1}^{k} f_i x_i,    s² = (1/(n−1)) Σ_{i=1}^{k} f_i (x_i − x̄)²,    s = √s²

Note also:

    σ² = Σ_x (x − µ)² f(x) = Σ_x (x² − 2µx + µ²) f(x)
       = Σ_x x² f(x) − 2µ Σ_x x f(x) + µ² Σ_x f(x)
       = Σ_x x² f(x) − µ²    (Why?)

Σ_x x² f(x) is called the 2nd moment about the origin.

E1.
1. Find the mean and sd of the following observations: …, 4, 6, 5.
2. Let f(x) = x/6, x = 1, 2, 3, be the pmf of X. Find the mean (µ) and the sd (σ) of X.
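The pmf example above (f(x) = x/6 on x = 1, 2, 3) can be checked numerically; this is a small sketch using exact rational arithmetic, so the answers µ = 7/3 and σ² = 5/9 come out exactly:

```python
from fractions import Fraction as F

# pmf f(x) = x/6 for x = 1, 2, 3 (the second part of the exercise above)
pmf = {x: F(x, 6) for x in (1, 2, 3)}

mu = sum(x * p for x, p in pmf.items())        # mean = sum of x f(x)
m2 = sum(x * x * p for x, p in pmf.items())    # 2nd moment about the origin
var = m2 - mu**2                               # sigma^2 = E(X^2) - mu^2

print(mu, var)   # 7/3 5/9
```

The sd is then σ = √(5/9) = √5/3; `fractions.Fraction` is used only so the check is free of floating-point noise.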
E2. Let the random variable X = the number of rolls of a regular die until the first 5 or 6. The probability of rolling a 5 or 6 is 1/3, thus the pmf of X is written as:

    f(x) = P(X = x) = (2/3)^(x−1) (1/3),    x = 1, 2, 3, ...

Find (a) the mean (µ) and (b) the sd (σ) of X.

Answer:

    µ = Σ_x x f(x) = 1(1/3) + 2(2/3)(1/3) + 3(2/3)²(1/3) + ⋯

You will see quite a few similar infinite sums of this kind. Here is how to find the answer easily. Write one more line like the one above, multiplied by 2/3:

    (2/3)µ = 1(2/3)(1/3) + 2(2/3)²(1/3) + ⋯

Subtract; the surviving terms form a geometric series:

    (1/3)µ = (1/3) + (2/3)(1/3) + (2/3)²(1/3) + ⋯ = (1/3)/(1 − 2/3) = 1,

so µ = 3.

Remark: The textbook introduces another way of handling this kind of infinite sum. First recall the Taylor series expansion:

    f(x) = f(a) + f′(a)(x − a)/1! + f″(a)(x − a)²/2! + ⋯

Next, apply the Taylor series expansion to f(x) = (1 − x)⁻² around 0; we get

    (1 − x)⁻² = 1 + 2x + 3x² + ⋯

Now, notice that we have µ = (1/3){1 + 2(2/3) + 3(2/3)² + ⋯}. The quantity in braces is (1 − x)⁻², where x = 2/3. That is, µ = (1/3)(1 − 2/3)⁻² = 3.

For the variance we need the second moment:

    E(X²) = Σ_x x² f(x) = 1²(1/3) + 2²(2/3)(1/3) + 3²(2/3)²(1/3) + ⋯
Rewrite as

    E(X²) = 1(1/3) + 4(2/3)(1/3) + 9(2/3)²(1/3) + ⋯

Write one more line like before, multiplied by 2/3, and subtract the last two equations; the differences of consecutive squares, x² − (x − 1)² = 2x − 1, appear as coefficients:

    (1/3) E(X²) = 1(1/3) + 3(2/3)(1/3) + 5(2/3)²(1/3) + 7(2/3)³(1/3) + ⋯

Write one more line again like before (multiply by 2/3) and subtract the last equation from the one above; we get

    (1/9) E(X²) = (1/3){1 + 2(2/3) + 2(2/3)² + ⋯} = (1/3){1 + 2 · (2/3)/(1 − 2/3)} = 5/3,

so E(X²) = 15, and

    σ² = E(X²) − µ² = 15 − 9 = 6.

Basel Problem:

    Σ_{n=1}^{∞} 1/n² = π²/6

We begin with the Taylor series expansion of sin(x):

    sin(x) = x − x³/3! + x⁵/5! − x⁷/7! + ⋯

Divide both sides by x:

    sin(x)/x = 1 − x²/3! + x⁴/5! − x⁶/7! + ⋯
Notice that the roots of sin(x)/x are x = ±nπ, n = ±1, ±2, ... (i.e., sin(±nπ) = 0). The above expression must have factors like

    sin(x)/x = (1 − x/π)(1 + x/π)(1 − x/(2π))(1 + x/(2π))(1 − x/(3π))(1 + x/(3π)) ⋯
             = (1 − x²/π²)(1 − x²/(4π²))(1 − x²/(9π²)) ⋯

Comparing the coefficients of the x² term on both sides, we get

    −1/3! = −(1/π² + 1/(4π²) + 1/(9π²) + ⋯)    ⟹    Σ_{n=1}^{∞} 1/n² = π²/6

E3. Find a constant c so that f(x) = c/x², x = 1, 2, 3, ..., is a pmf.

Definition 1. (Mathematical Expectation)

Univariate case: X ~ f(x), a pdf for a continuous rv or a pmf for a discrete rv:

    E(X) = ∫ x f(x) dx    or    E(X) = Σ_x x f(x)

Multivariate case: X = (X1, X2, ..., Xn) ~ f(x1, x2, ..., xn), a pdf or pmf:

    E{u(X1, X2, ..., Xn)} = ∫⋯∫ u(x1, ..., xn) f(x1, ..., xn) dx1 ⋯ dxn    or    Σ u(x1, ..., xn) f(x1, ..., xn)

Properties 1. If k is a constant, E(k) = k.

Properties 2. If k is a constant and v(x) is a function, then E{k v(X)} = k E{v(X)}. This can be extended to

    E{Σ_{i=1}^{m} k_i v_i(X)} = Σ_{i=1}^{m} k_i E{v_i(X)}.

Proof.
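As a numerical aside (separate from the proof left above): the Basel series used in the constant-c exercise can be sanity-checked by partial sums, which also confirms the normalizing constant c = 6/π² for f(x) = c/x². A minimal sketch; the truncation point 200000 is an arbitrary choice:

```python
import math

# Partial sums of 1/1^2 + 1/2^2 + ... approach pi^2/6 (Basel problem),
# so c = 6/pi^2 makes f(x) = c/x**2 a pmf on x = 1, 2, 3, ...
partial = sum(1 / n**2 for n in range(1, 200_000))
c = 6 / math.pi**2

print(partial, math.pi**2 / 6)   # the two numbers agree to ~5 decimals
```

The tail of the series beyond N is of order 1/N, which is why the partial sum agrees with π²/6 only to about five decimal places here.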
E4. Let the random variable X have the pmf f(x) = (1/2)^x, x = 1, 2, 3, .... Find µ and σ².

Answer: You can use the same methods shown before, or use the Taylor series expansions of:

    (1 − x)⁻¹ = 1 + x + x² + ⋯    and    (1 − x)⁻² = 1 + 2x + 3x² + 4x³ + 5x⁴ + ⋯

    µ = E(X) = Σ_x x f(x) = (1/2){1 + 2(1/2) + 3(1/2)² + ⋯} = (1/2)(1 − 1/2)⁻² = 2

For the second moment, use (1 − x)⁻³ = 1 + 3x + 6x² + 10x³ + ⋯ (differentiate the previous series and divide by 2), so that Σ_x x(x + 1)x^(x−1)-type sums close up:

    E{X(X + 1)} = Σ_x x(x + 1)(1/2)^x = (1/2) · 2{1 + 3(1/2) + 6(1/2)² + 10(1/2)³ + ⋯}
                = (1/2) · 2(1 − 1/2)⁻³ = 8

So E(X²) = E{X(X + 1)} − E(X) = 8 − 2 = 6 (Why?), and

    σ² = E(X²) − {E(X)}² = 6 − 4 = 2.

Some special mathematical expectations
- Mean of X:    µ = E(X) = ∫ x f(x) dx    or    Σ_x x f(x)
- Variance of X:    σ² = E{(X − µ)²} = ∫ (x − µ)² f(x) dx    or    Σ_x (x − µ)² f(x)
- Moment generating function (mgf) of X:    M(t) = E(e^{tX}) = ∫ e^{tx} f(x) dx    or    Σ_x e^{tx} f(x)

Related facts
- σ² = E(X²) − {E(X)}²
- sd: σ = √σ²
- Not every distribution has an mgf. Suppose M_X(t) = M_Y(t) for |t| < h and some h > 0; then P_X = P_Y, i.e., F_X(z) = F_Y(z) for all z. This is called the uniqueness of the mgf: the mgf uniquely determines the distribution.
- M′(0) = E(X), M″(0) = E(X²), ..., M^(k)(0) = E(X^k). The last part is because

    d/dt M(t) = d/dt E(e^{tX}) = E(X e^{tX}),  which equals E(X) at t = 0, and so on.

E5. {Cauchy distribution} X has pdf f(x) = 1/{π(1 + x²)}, −∞ < x < ∞. Then both E(X) and M_X(t) do NOT exist. (Why?)

    ∫_{−∞}^{∞} x/{π(1 + x²)} dx
        = lim_{b→∞} ∫_0^b x/{π(1 + x²)} dx + lim_{a→−∞} ∫_a^0 x/{π(1 + x²)} dx    (Does it exist?)
        = lim_{b→∞} (1/(2π)){log(1 + x²)}|_0^b − lim_{a→−∞} (1/(2π)){log(1 + x²)}|_a^0
        = ∞ − ∞,

which is undefined, so the mean does not exist (and the mgf integral likewise diverges for t ≠ 0).

Binomial distribution

Definition 2. X ~ binomial(n, p) ⟺ X has a binomial distribution with parameters n & p (n = 1, 2, ...; 0 ≤ p ≤ 1):

    f(x) = C(n, x) p^x (1 − p)^{n−x},    x = 0, 1, ..., n

X ~ binomial(1, p) is particularly called a Bernoulli random variable, i.e., P(X = 1) = p = 1 − P(X = 0).

Properties 3. X ~ binomial(n, p) ⟹ M_X(t) = (pe^t + q)^n, q = 1 − p.

Proof.

    M_X(t) = E(e^{tX}) = Σ_{x=0}^{n} C(n, x) (pe^t)^x (1 − p)^{n−x} = (pe^t + q)^n    (binomial theorem)
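The binomial mgf just derived can be verified numerically: computing E(e^{tX}) directly from the pmf should match the closed form (pe^t + q)^n. The parameter values below are arbitrary demo choices:

```python
import math

# Verify the binomial mgf: E(e^{tX}) from the pmf vs. (p e^t + q)^n.
n, p, t = 8, 0.65, 0.4          # arbitrary demo values
q = 1 - p
pmf = [math.comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

mgf_direct = sum(math.exp(t * x) * pmf[x] for x in range(n + 1))
mgf_closed = (p * math.exp(t) + q) ** n
print(mgf_direct, mgf_closed)   # agree up to floating-point noise
```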
Properties 4. Representational definition of binomial(n, p):

    X ~ binomial(n, p)  ⟺  X = Σ_{i=1}^{n} Z_i,  Z_1, ..., Z_n iid binomial(1, p).

Proof. Begin with the mgf of Σ Z_i:

    M_{ΣZ_i}(t) = E(e^{t Σ Z_i}) = E(e^{tZ_1} e^{tZ_2} ⋯ e^{tZ_n})
                = Π_{i=1}^{n} E(e^{tZ_i})    (Why? independence)
                = Π_{i=1}^{n} M_{Z_i}(t)    (Why? each Z_i is Bernoulli, with mgf pe^t + q)
                = (pe^t + q)^n,

which is the mgf of binomial(n, p); hence

    X ~ binomial(n, p)  ⟺  X = Σ_{i=1}^{n} Z_i,  Z_1, ..., Z_n iid binomial(1, p).

The last part is by the uniqueness of the mgf.

Properties 5. Mean & variance of binomial(n, p): X ~ binomial(n, p) ⟹ E(X) = np, Var(X) = npq.

Proof.
- Easiest way: use the representational definition.
- Proof by mgf: try on your own, using M′(0) = E(X), M″(0) = E(X²).
- Proof by pmf:

    E(X) = Σ_{x=0}^{n} x · n!/{x!(n−x)!} p^x (1−p)^{n−x}
         = Σ_{x=1}^{n} n!/{(x−1)!(n−x)!} p^x (1−p)^{n−x}    (let k = x − 1)
         = np Σ_{k=0}^{n−1} (n−1)!/{k!(n−1−k)!} p^k (1−p)^{n−1−k}
         = np    (Why? the sum is a binomial(n − 1, p) pmf summed over its whole support)
    E{X(X−1)} = Σ_{x=0}^{n} x(x−1) · n!/{x!(n−x)!} p^x (1−p)^{n−x}
              = Σ_{x=2}^{n} n!/{(x−2)!(n−x)!} p^x (1−p)^{n−x}    (let k = x − 2)
              = n(n−1)p² Σ_{k=0}^{n−2} (n−2)!/{k!(n−2−k)!} p^k (1−p)^{n−2−k}
              = n(n−1)p²    (Why?)

    σ² = E{X(X−1)} + E(X) − {E(X)}² = n(n−1)p² + np − (np)² = np(1 − p)

E6. {WLLN: Weak Law of Large Numbers} (Binomial case)

Chebyshev's inequality: P(|X − µ| ≥ kσ) ≤ 1/k².

Chebyshev's inequality comes from the following:

    σ² = E{(X − µ)²} = Σ_S (x − µ)² f(x) ≥ Σ_A (x − µ)² f(x),

where A = {x : |x − µ| ≥ kσ} for a positive constant k. This leads to

    σ² ≥ Σ_A (x − µ)² f(x) ≥ Σ_A k²σ² f(x) = k²σ² Σ_A f(x) = k²σ² P(X ∈ A).

Now, let X1, X2, ..., Xn be iid binomial(1, p), and let p̂ = (1/n) Σ X_i = the sample success ratio. Consider P(|p̂ − p| ≥ ε). Note that E(p̂) = p and σ²(p̂) = p(1 − p)/n, so plugging into Chebyshev's inequality (with kσ = ε) we get

    P(|p̂ − p| ≥ ε) ≤ p(1 − p)/(ε² n)    ⟹    lim_{n→∞} P(|p̂ − p| ≥ ε) ≤ lim_{n→∞} p(1 − p)/(ε² n) = 0.

This means that the probability that the sample success ratio p̂ is more than ε away from p goes to zero as n goes to ∞, and we say p̂ converges in probability to p.
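The convergence in probability above can be illustrated by simulation: the fraction of experiments in which |p̂ − p| ≥ ε shrinks as n grows. A sketch; p = 0.3, ε = 0.05, and the sample sizes are arbitrary demo choices:

```python
import random

# Illustrate the WLLN: P(|p-hat - p| >= eps) decreases as n grows.
random.seed(1)
p, eps = 0.3, 0.05              # arbitrary demo values

def miss_rate(n, reps=1000):
    """Fraction of experiments whose p-hat lands at least eps away from p."""
    miss = 0
    for _ in range(reps):
        phat = sum(random.random() < p for _ in range(n)) / n
        miss += abs(phat - p) >= eps
    return miss / reps

rates = [miss_rate(n) for n in (50, 500, 5000)]
print(rates)   # roughly decreasing toward 0
```

Chebyshev's bound p(1 − p)/(ε²n) is conservative; the simulated miss rates fall well below it.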
Definition 3. Cumulative distribution function (cdf), univariate case: F(x) = P(X ≤ x).

Properties 6. cdf
1. (monotonicity) a ≤ b ⟹ F(a) ≤ F(b).
2. lim_{x→−∞} F(x) = 0,  lim_{x→∞} F(x) = 1.
3. (right continuity) lim_{h→0+} F(x + h) = F(x), i.e., F(x+) = F(x).

A random variable X may not have a pdf or mgf, BUT it always has a cdf.

Definition 4. Relationship with the pdf (or pmf), when it exists:

    F(x) = ∫_{−∞}^{x} f(t) dt    (continuous case)        F(x) = Σ_{t ≤ x} f(t)    (discrete case)

    f(x) = F′(x) for x where f is continuous    (continuous case)
    f(x) = F(x) − F(x−)    (discrete case)

E7.
1. Find the cdf F(x) of
       f(x) = x/6 for x = 1, 2, 3;  0 otherwise.
2. Find the cdf F(x) of
       f(x) = 2x for 0 < x < 1;  0 otherwise.

Answer:

    F(x) = 0 for x < 1;  1/6 for 1 ≤ x < 2;  3/6 for 2 ≤ x < 3;  1 for x ≥ 3.
    F(x) = ∫_0^x 2t dt = x² for 0 < x < 1;  F(x) = 0 for x ≤ 0, and F(x) = 1 for x ≥ 1.

E8. Let X ~ binomial(8, 0.65). Find (a) P(X ≤ 5), (b) P(X = 5), (c) P(X = 5) = P(X ≤ 5) − P(X ≤ 4).

Answer by R:

    > pbinom(5,8,0.65)
    [1] 0.5721863
    > dbinom(5,8,0.65)
    [1] 0.2785858
    > pbinom(5,8,0.65)-pbinom(4,8,0.65)
    [1] 0.2785858

Poisson distribution

Poisson approximation to the binomial distribution:

    C(n, x) p^x q^{n−x} → e^{−λ} λ^x / x!,    as n → ∞ with np = λ (fixed!)

There are different ways to show this. Here is an easy way from the textbook. We begin with

    P(X = x) = n!/{x!(n−x)!} (λ/n)^x (1 − λ/n)^{n−x}
             = {n(n−1)⋯(n−x+1)/n^x} (λ^x/x!) (1 − λ/n)^n (1 − λ/n)^{−x}

Next, let n → ∞:

    lim_{n→∞} P(X = x) = 1 · (λ^x/x!) · e^{−λ} · 1 = e^{−λ} λ^x / x!    (Why?)

which is the Poisson pmf with parameter λ.

Definition 5. X ~ Poisson(λ) ⟺ X has a Poisson distribution with parameter λ (> 0):

    f(x) = e^{−λ} λ^x / x!,    x = 0, 1, 2, ...

Properties 7. X ~ Poisson(λ) ⟹ M_X(t) = exp{λ(e^t − 1)}.
Proof. M X (t E ( e tx 0 e t e λ λ! ( λe e λ t! 0 e λ e (λet e {λ(e t } Properties 8. Mean & variance of Poisson (λ X Poisson (λ E(X λ, Var(X λ. Properties 9. (Reproductive property X,..., X k independent & X i Poisson (λ i, i,..., k, then X i Poisson ( λ i Proof. Begin with the mgf of Z i M X i (t E (e t X i E ( e tx e tx e txn E ( e tx i M Xi (t e λ i(e t e λ i(e t ( mgf of Poisson λi (Why? (Why? X i Poisson ( λ i by the uniqueness of mgf. E 9. Let X Poisson (λ. Find (a P (X, (b P (X, (c P (X P (X Answer by R: > ppois(, [] 0.996986 > dpois(, [] 0.8997 > ppois(,-ppois(, [] 0.8997 Geometric distribution Definition 6. Y Geometric (p f(y pq y, y 0,,... ; q p Chapter, page
If X1, X2, ... are iid binomial(1, p) trials, then Y = # of failures before the first success has this distribution.

Properties 10. Y ~ Geometric(p) ⟹

    M_Y(t) = p(1 − qe^t)^{−1},  for qe^t < 1;    E(Y) = q/p,    Var(Y) = q/p².

Proof.

    M_Y(t) = E(e^{tY}) = Σ_{y=0}^{∞} e^{ty} p q^y = p Σ_{y=0}^{∞} (qe^t)^y = p(1 − qe^t)^{−1},    qe^t < 1.

One note: The textbook (page 64) uses a slightly different definition. There, X = the trial number on which the 1st success occurs, related by X = Y + 1 (x = 1, 2, 3, ...). According to this definition, we have

    f(x) = p q^{x−1} (x = 1, 2, 3, ...);    E(X) = 1/p,    Var(X) = q/p²;    M_X(t) = pe^t(1 − qe^t)^{−1}.

Negative Binomial distribution

Definition 7. Y ~ Negative Binomial(r, p) ⟺

    f(y) = C(y + r − 1, r − 1) p^r q^y,    y = 0, 1, 2, ...;  q = 1 − p.

If X1, X2, ... are iid binomial(1, p) trials, then Y = # of failures before the r-th success (r ≥ 1) has this distribution.

Properties 11. Y ~ NB(r, p) ⟹

    M_Y(t) = p^r(1 − qe^t)^{−r},  qe^t < 1;    E(Y) = rq/p,    Var(Y) = rq/p²;
    Y = Σ_{i=1}^{r} Z_i,  Z_1, ..., Z_r iid geometric(p).

Proof. Note first that Σ_{y=0}^{∞} C(y + r − 1, r − 1) q^y = (1 − q)^{−r} (Why?). This is known as the negative binomial expansion, and it is the Taylor series expansion of f(q) = (1 − q)^{−r}, as shown below:

    (1 − q)^{−r} = f(0) + f′(0)q/1! + f″(0)q²/2! + ⋯ = 1 + rq + r(r + 1)q²/2! + ⋯

Making use of this, we have

    M_Y(t) = E(e^{tY}) = Σ_{y=0}^{∞} C(y + r − 1, r − 1) p^r (qe^t)^y = p^r(1 − qe^t)^{−r},    qe^t < 1.
Proof. (mgf of the representational definition)

    M_{ΣZ_i}(t) = E(e^{t Σ Z_i}) = Π_{i=1}^{r} E(e^{tZ_i})    (Why? independence)
                = Π_{i=1}^{r} M_{Z_i}(t) = {p(1 − qe^t)^{−1}}^r    (Why? each Z_i is geometric(p))
                = p^r(1 − qe^t)^{−r},    the mgf of NB(r, p).

So Σ_{i=1}^{r} Z_i ~ NB(r, p) by the uniqueness of the mgf.

Another note: The textbook (page 64) uses a slightly different definition. There, X = the trial number on which the r-th success occurs, related by X = Y + r (x = r, r + 1, ...). The textbook says X has a translated negative binomial distribution. According to this definition, we have

    f(x) = C(x − 1, r − 1) p^r q^{x−r} (x = r, r + 1, ...);    µ = r/p,    σ² = rq/p²;
    M_X(t) = (pe^t)^r(1 − qe^t)^{−r}.

Hypergeometric distribution

Definition 8. X ~ Hypergeometric(N1, N2, n) ⟺

    f(x) = C(N1, x) C(N2, n − x) / C(N, n),    x ≤ n,    N = N1 + N2.

In a box there are N1 red balls and N2 blue balls; draw n balls without replacement and let X = # of red balls drawn.

Properties 12. X ~ Hypergeom(N1, N2, n) ⟹

    µ = np,    σ² = np(1 − p)(N − n)/(N − 1),    p = N1/N.

MLE: Maximum Likelihood Estimate

Definition 9. Suppose X1, X2, ..., Xn are random samples from the same underlying distribution (i.e., iid) with pdf f(x_i; θ). Then Π_{i=1}^{n} f(x_i; θ) is called the jpdf (joint pdf) or the likelihood
function. Furthermore, the value of the parameter, θ̂, that maximizes the likelihood is called the mle (maximum likelihood estimate) of θ.

E10. X1, X2, ..., Xn: iid from binomial(1, p). Find the mle of p.

Answer:

    L(p) = Π_{i=1}^{n} f(x_i; p) = Π_{i=1}^{n} p^{x_i}(1 − p)^{1−x_i},    0 < p < 1,  x_i = 0 or 1
         = p^{Σx_i}(1 − p)^{n − Σx_i}

    ln L(p) = (Σ x_i) ln p + (n − Σ x_i) ln(1 − p)    (log-likelihood function)

Now, to find the p that maximizes the log-likelihood, differentiate with respect to p and set the result equal to zero:

    ∂/∂p ln L(p) = Σx_i/p − (n − Σx_i)/(1 − p) = 0
    ⟹  (1 − p) Σx_i − p(n − Σx_i) = 0  ⟹  p̂ = Σx_i/n = x̄

To make sure that this indeed makes the log-likelihood a maximum, we can do the second derivative test. Here we have

    ∂²/∂p² ln L(p) = −Σx_i/p² − (n − Σx_i)/(1 − p)²

This is always < 0 regardless of p, which means our solution p̂ is indeed the mle.

One note: X̄ is called the maximum likelihood estimator, and x̄ is the maximum likelihood estimate.

E11. X1, X2, ..., Xn: iid from Poisson(λ). Find the mle of λ.

Answer:

    L(λ) = Π_{i=1}^{n} f(x_i; λ) = Π_{i=1}^{n} λ^{x_i} e^{−λ}/x_i! = λ^{Σx_i} e^{−nλ}/(x_1! x_2! ⋯ x_n!)

    ln L(λ) = −nλ + (Σ x_i) ln λ − ln(x_1! ⋯ x_n!)

    ∂/∂λ ln L(λ) = −n + Σx_i/λ = 0    ⟹    λ̂ = Σx_i/n = x̄

Now, the second derivative is ∂²/∂λ² ln L(λ) = −Σx_i/λ², and this is always < 0 regardless of λ, which means λ̂ is the mle.
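The Bernoulli mle derivation above can be checked numerically: maximizing the log-likelihood over a fine grid of p values should land on p̂ = Σx_i/n. A sketch with made-up data:

```python
import math

# Grid-search check of the Bernoulli mle: the maximizer of ln L(p) is x-bar.
data = [1, 0, 1, 1, 0, 1, 0, 1]          # made-up 0/1 sample
n, s = len(data), sum(data)

def loglik(p):
    """Bernoulli log-likelihood: s*ln(p) + (n-s)*ln(1-p)."""
    return s * math.log(p) + (n - s) * math.log(1 - p)

grid = [i / 1000 for i in range(1, 1000)]
p_best = max(grid, key=loglik)
print(p_best, s / n)   # both 0.625
```

The log-likelihood is concave (its second derivative is negative everywhere on (0, 1)), so the grid maximizer sits at the analytic solution x̄.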
E12. X1, X2, ..., Xn: iid from the discrete uniform distribution f(x; θ) = 1/θ, x = 1, 2, ..., θ. Find the mle of θ.

Answer:

    L(θ) = Π_{i=1}^{n} f(x_i; θ) = Π_{i=1}^{n} (1/θ) = (1/θ)^n,    x_i ∈ {1, ..., θ}

    ln L(θ) = −n ln θ,    ∂/∂θ ln L(θ) = −n/θ < 0,

so the likelihood decreases in θ and is maximized by taking θ as small as the data allow:

    θ̂ = max(x_1, ..., x_n)

This agrees with our intuition, because in n observations of a discrete uniform random variable the largest observed value is the smallest upper bound θ consistent with the data.

Expected Values of Linear Functions of Independent Random Variables

E13. X1, X2: independent random samples from Poisson distributions with λ1 = 2, λ2 = 3, respectively. Find (a) P(X1 = 3, X2 = 4), (b) P(X1 + X2 = 2).

Answer:

    P(X1 = 3, X2 = 4) = (2³e^{−2}/3!)(3⁴e^{−3}/4!) = 4.5 e^{−5} ≈ 0.0303

    P(X1 + X2 = 2) = P(0, 2) + P(1, 1) + P(2, 0)
        = (2⁰e^{−2}/0!)(3²e^{−3}/2!) + (2¹e^{−2}/1!)(3¹e^{−3}/1!) + (2²e^{−2}/2!)(3⁰e^{−3}/0!)
        = (9/2 + 6 + 2) e^{−5} = 12.5 e^{−5} ≈ 0.0842

Let u(X1, X2) be a new random variable created as a function of two independent random variables X1 and X2, where X1 and X2 have pdfs f1(x1), f2(x2), respectively. Then the expected value of u(X1, X2) can be found by

    E{u(X1, X2)} = Σ_{x1} Σ_{x2} u(x1, x2) f(x1, x2) = Σ_{x1} Σ_{x2} u(x1, x2) f1(x1) f2(x2)

The last part, where the joint pdf is written as a product of the marginal pdfs, is due to the independence of X1 and X2.

Now, consider Y = a1 X1 + a2 X2, where a1, a2 are constants. We have

    E(Y) = µ_Y = E(a1 X1 + a2 X2) = a1 E(X1) + a2 E(X2) = a1 µ1 + a2 µ2

    Var(Y) = σ²_Y = E{(a1 X1 + a2 X2 − a1 µ1 − a2 µ2)²}
           = E[{a1(X1 − µ1) + a2(X2 − µ2)}²]
           = a1² E{(X1 − µ1)²} + a2² E{(X2 − µ2)²} + 2 a1 a2 E{(X1 − µ1)(X2 − µ2)}
           = a1² Var(X1) + a2² Var(X2) + 2 a1 a2 E(X1 − µ1) E(X2 − µ2)    (independence)
           = a1² σ1² + a2² σ2²

In general, let Y = Σ_{i=1}^{n} a_i X_i, where the a_i are constants and the X_i are independent random
samples with mean µ_i and variance σ_i². Then

    E(Y) = µ_Y = E(Σ a_i X_i) = Σ a_i E(X_i) = Σ a_i µ_i

    Var(Y) = σ²_Y = Var(Σ a_i X_i) = Σ a_i² Var(X_i) = Σ a_i² σ_i²

One note: In case X1, ..., Xn are not independent, we have

    σ²_Y = Σ_i a_i² σ_i² + 2 Σ_{i<j} a_i a_j σ_ij,    where σ_ij = Cov(X_i, X_j).

E14. Let X1 ~ binomial(n1 = 100, p1 = 1/2) and X2 ~ binomial(n2 = 48, p2 = 1/4), and let them be independent. Find the expected value and the variance of Y = X1 − X2.

Answer:

    E(Y) = E(X1 − X2) = E(X1) − E(X2) = 50 − 12 = 38
    Var(Y) = Var(X1 − X2) = (1)² Var(X1) + (−1)² Var(X2) = 25 + 9 = 34

Definition 10. f(x1, x2) = joint pdf of X1 and X2.

    f1(x1) = marginal pdf of X1 = Σ_{x2} f(x1, x2)  or  ∫ f(x1, x2) dx2.
    f(x2 | x1) = f(x1, x2)/f1(x1) = conditional pdf of X2 given X1 = x1.

E15. Consider the following (discrete) joint pmf of X1 and X2.

                 x2 = 1    x2 = 2    marginal
    x1 = 1        4/10      2/10       6/10
    x1 = 2        3/10      1/10       4/10
    marginal      7/10      3/10        1

Find (a) f(1, 2), (b) the marginal pmfs f1(x1) and f2(x2), (c) the pmf of Y = X1 + X2, (d) E(Y), and (e) E(X1) + E(X2).
Answer:

    f(1, 2) = 2/10;    f1(1) = 6/10, f1(2) = 4/10;  f2(1) = 7/10, f2(2) = 3/10;

    pmf of Y:  P(Y = 2) = 4/10,  P(Y = 3) = 5/10,  P(Y = 4) = 1/10

    E(Y) = 2(4/10) + 3(5/10) + 4(1/10) = 27/10

    E(X1) + E(X2) = {1(6/10) + 2(4/10)} + {1(7/10) + 2(3/10)} = 14/10 + 13/10 = 27/10

Definition 11. Cov(X1, X2) = σ12 = E{(X1 − µ1)(X2 − µ2)} = E(X1 X2) − µ1 µ2;

    ρ = Cor(X1, X2) = Cov(X1, X2)/(σ1 σ2) = σ12/(σ1 σ2).

Definition 12. Conditional mean & conditional variance:

    E(X2 | x1) = µ_{X2|x1} = ∫ x2 f(x2 | x1) dx2    or    Σ_{x2} x2 f(x2 | x1)

    Var(X2 | x1) = σ²_{X2|x1} = E[{X2 − E(X2 | x1)}² | x1]
                 = ∫ {x2 − E(X2 | x1)}² f(x2 | x1) dx2    or    Σ_{x2} {x2 − E(X2 | x1)}² f(x2 | x1)

The conditional variance can also be written as Var(X2 | x1) = E(X2² | x1) − {E(X2 | x1)}².

E16. Let X1 and X2 have the joint pmf

    f(x1, x2) = (x1 + x2)/21,    x1 = 1, 2;  x2 = 1, 2, 3.

Here is the probability table for your information.

                 x2 = 1    x2 = 2    x2 = 3    marginal
    x1 = 1        2/21      3/21      4/21       9/21
    x1 = 2        3/21      4/21      5/21      12/21
    marginal      5/21      7/21      9/21        1

Find (a) the marginal pmfs f1(x1), f2(x2); (b) the conditional pmfs f(x2 | x1), f(x1 | x2); (c) the conditional expectation E(X2 | x1); (d) the conditional variance Var(X2 | x1); and (e) P(X1 = 1 | X2 = 3), E(X1 | x2 = 3), Var(X1 | x2 = 3).

Answer: Shown below are the conditional probabilities for your information. Check your calculations below against these values.
    f(x2 | x1):        x2 = 1    x2 = 2    x2 = 3
    x1 = 1               2/9       3/9       4/9
    x1 = 2               3/12      4/12      5/12

    f(x1 | x2):        x2 = 1    x2 = 2    x2 = 3
    x1 = 1               2/5       3/7       4/9
    x1 = 2               3/5       4/7       5/9

(a) Marginals:

    f1(x1) = Σ_{x2=1}^{3} (x1 + x2)/21 = (3x1 + 6)/21,    x1 = 1, 2
    f2(x2) = Σ_{x1=1}^{2} (x1 + x2)/21 = (2x2 + 3)/21,    x2 = 1, 2, 3

(b) Conditionals:

    f(x2 | x1) = f(x1, x2)/f1(x1) = (x1 + x2)/(3x1 + 6),    x2 = 1, 2, 3, for each x1 = 1, 2
    f(x1 | x2) = f(x1, x2)/f2(x2) = (x1 + x2)/(2x2 + 3),    x1 = 1, 2, for each x2 = 1, 2, 3

(c) E(X2 | x1) = Σ_{x2=1}^{3} x2 (x1 + x2)/(3x1 + 6) = (6x1 + 14)/(3x1 + 6),    x1 = 1, 2.

(d) With E(X2² | x1) = Σ_{x2=1}^{3} x2² (x1 + x2)/(3x1 + 6) = (14x1 + 36)/(3x1 + 6),

    Var(X2 | x1) = E(X2² | x1) − {E(X2 | x1)}²,    x1 = 1, 2.

(e) P(X1 = 1 | X2 = 3) = f(1 | 3) = 4/9;

    E(X1 | x2) = Σ_{x1=1}^{2} x1 (x1 + x2)/(2x2 + 3) = (3x2 + 5)/(2x2 + 3),  so  E(X1 | x2 = 3) = 14/9;

    E(X1² | x2) = {1(x2 + 1) + 4(x2 + 2)}/(2x2 + 3) = (5x2 + 9)/(2x2 + 3),  so  E(X1² | x2 = 3) = 24/9;

    Var(X1 | x2 = 3) = E(X1² | x2 = 3) − {E(X1 | x2 = 3)}² = 24/9 − (14/9)² = 20/81.
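The conditional quantities in part (e) can be recomputed exactly from the joint pmf f(x1, x2) = (x1 + x2)/21; a sketch with rational arithmetic so the fractions match:

```python
from fractions import Fraction as F

# Joint pmf f(x1, x2) = (x1 + x2)/21 on x1 in {1,2}, x2 in {1,2,3}.
joint = {(x1, x2): F(x1 + x2, 21) for x1 in (1, 2) for x2 in (1, 2, 3)}

f2_3 = sum(p for (x1, x2), p in joint.items() if x2 == 3)   # marginal f2(3)
cond = {x1: joint[(x1, 3)] / f2_3 for x1 in (1, 2)}         # f(x1 | x2 = 3)

e = sum(x1 * p for x1, p in cond.items())                   # E(X1 | x2 = 3)
v = sum(x1 * x1 * p for x1, p in cond.items()) - e**2       # Var(X1 | x2 = 3)
print(cond[1], e, v)   # 4/9 14/9 20/81
```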
One note: Check that E{E(X1 | X2)} = E(X1):

    E{E(X1 | X2)} = E{(3X2 + 5)/(2X2 + 3)} = Σ_{x2=1}^{3} {(3x2 + 5)/(2x2 + 3)} · (2x2 + 3)/21
                  = (8 + 11 + 14)/21 = 33/21 = 11/7

    E(X1) = Σ_{x1=1}^{2} x1 (3x1 + 6)/21 = (9 + 24)/21 = 33/21 = 11/7  ✓

Definition 13. The conditional expectation of a function of random variables X1, X2:

    E{u(X1, X2) | x1} = ∫ u(x1, x2) f(x2 | x1) dx2    or    Σ_{x2} u(x1, x2) f(x2 | x1)

Properties 13. Conditional expectation:
1. E(aX2 + b | x1) = a E(X2 | x1) + b
2. E(X2 + X3 | x1) = E(X2 | x1) + E(X3 | x1)
3. X2 ≥ 0 ⟹ E(X2 | x1) ≥ 0
4. E{E(X2 | X1)} = E(X2)
5. E{g(X1) X2 | x1} = g(x1) E(X2 | x1)

Proof.

    E(aX2 + b | x1) = ∫ (a x2 + b) f(x2 | x1) dx2 = a ∫ x2 f(x2 | x1) dx2 + b ∫ f(x2 | x1) dx2
                    = a E(X2 | x1) + b    (property of the integral)

    E(X2 + X3 | x1) = E(X2 | x1) + E(X3 | x1)    (similarly)

    X2 ≥ 0 ⟹ E(X2 | x1) = ∫ x2 f(x2 | x1) dx2 ≥ 0    (Definition)

    E{E(X2 | X1)} = ∫ E(X2 | x1) f1(x1) dx1 = ∫ {∫ x2 f(x2 | x1) dx2} f1(x1) dx1
                  = ∫∫ x2 f(x1, x2) dx2 dx1    (definition of the conditional pdf)
                  = E(X2)

    E{g(X1) X2 | x1} = ∫ g(x1) x2 f(x2 | x1) dx2 = g(x1) ∫ x2 f(x2 | x1) dx2 = g(x1) E(X2 | x1)
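The tower property E{E(X2 | X1)} = E(X2) (item 4 above) can be verified exactly on the joint pmf f(x1, x2) = (x1 + x2)/21 from the earlier example:

```python
from fractions import Fraction as F

# Verify E{E(X2|X1)} = E(X2) on the joint pmf f(x1, x2) = (x1 + x2)/21.
joint = {(x1, x2): F(x1 + x2, 21) for x1 in (1, 2) for x2 in (1, 2, 3)}

f1 = {x1: sum(p for (a, b), p in joint.items() if a == x1) for x1 in (1, 2)}
cond_mean = {x1: sum(x2 * joint[(x1, x2)] / f1[x1] for x2 in (1, 2, 3))
             for x1 in (1, 2)}

lhs = sum(cond_mean[x1] * f1[x1] for x1 in (1, 2))   # E{E(X2|X1)}
rhs = sum(x2 * p for (a, x2), p in joint.items())    # E(X2)
print(lhs, rhs)   # 46/21 46/21
```

Averaging the conditional means E(X2 | x1) = 20/9 and 13/6 with the marginal weights 9/21 and 12/21 reproduces the unconditional mean, exactly as the proof above integrates out x1.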