Stat 134 Fall 2011: Notes on generating functions

Michael Lugo

October 2011

1 Definitions

Given a random variable X which always takes on a nonnegative integer value, we define the probability generating function

    f_X(z) = P(X = 0) + P(X = 1)z + P(X = 2)z^2 + ... = \sum_{k \ge 0} P(X = k) z^k.

This infinite sum converges at least in some neighborhood of the origin z = 0; in particular

    f_X(1) = \sum_{k \ge 0} P(X = k) = 1.

But we won't usually worry about questions of convergence.

The nice thing about generating functions is that they let us take a whole sequence of numbers and consolidate it into a single function; as Herb Wilf puts it in his book generatingfunctionology, "A generating function is a clothesline on which we hang up a sequence of numbers for display." Then we can use the tools of calculus on that function. Since we know calculus, this is useful.

In particular, if we want to know E(X), E(X^2), ..., we can find them by taking successive derivatives of f_X. Differentiating term by term gives

    d/dz f_X(z) = d/dz [ \sum_k P(X = k) z^k ] = \sum_k P(X = k) (d/dz) z^k = \sum_k P(X = k) k z^{k-1}

and so f_X'(1) = \sum_k P(X = k) k = E(X).

To find E(X^2) is a bit harder. The obvious thing to do is differentiate twice, but this gives

    d^2/dz^2 f_X(z) = d^2/dz^2 [ \sum_k P(X = k) z^k ] = \sum_k P(X = k) (d^2/dz^2) z^k = \sum_k P(X = k) k(k-1) z^{k-2}.

Therefore letting z = 1 gives

    f_X''(1) = \sum_k P(X = k) k(k-1) = E(X(X-1)).

But we need not despair! We have E(X^2) = E(X(X-1) + X) = E(X(X-1)) + E(X) = f_X''(1) + f_X'(1), and if we recall that Var(X) = E(X^2) - E(X)^2, we can derive the formula

    Var(X) = f_X''(1) + f_X'(1) - f_X'(1)^2.
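
These formulas are easy to check by machine. Here is a minimal sympy sketch (the distribution on {0, 1, 2, 3} is made up purely for illustration) that computes E(X) and Var(X) from the generating function and compares them with the direct computation from the probabilities:

    # Check E(X) = f'(1) and Var(X) = f''(1) + f'(1) - f'(1)^2 on a toy PMF.
    import sympy as sp

    z = sp.symbols('z')
    # Illustrative PMF: P(X=0)=1/8, P(X=1)=1/2, P(X=2)=1/4, P(X=3)=1/8
    pmf = {0: sp.Rational(1, 8), 1: sp.Rational(1, 2), 2: sp.Rational(1, 4), 3: sp.Rational(1, 8)}

    # Probability generating function f_X(z) = sum_k P(X=k) z^k
    f = sum(p * z**k for k, p in pmf.items())

    mean_via_pgf = sp.diff(f, z).subs(z, 1)
    var_via_pgf = sp.diff(f, z, 2).subs(z, 1) + mean_via_pgf - mean_via_pgf**2

    # Direct computation from the PMF, for comparison
    mean_direct = sum(k * p for k, p in pmf.items())
    var_direct = sum(k**2 * p for k, p in pmf.items()) - mean_direct**2

    print(mean_via_pgf, mean_direct)   # 11/8 11/8
    print(var_via_pgf, var_direct)     # 47/64 47/64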

We won't often have use for higher moments, but let's consider how we could find E(X^3). Differentiating three times gives f_X^{(3)}(1) = E(X(X-1)(X-2)). It turns out that x^3 = x(x-1)(x-2) + 3x(x-1) + x, as you can easily verify. Thus we get the formula

    E(X^3) = f_X^{(3)}(1) + 3 f_X''(1) + f_X'(1).

This continues. In general

    \sum_{k=0}^{n} S(n, k) (x)_k = x^n,

where (x)_k = x(x-1)...(x-k+1) is the falling factorial and S(n, k) denotes a "Stirling number of the second kind"; this is the number of ways to partition a set of n objects into k nonempty subsets, although for our purposes we can just think of these as the numbers that make this identity hold. Therefore

    E(X^n) = \sum_{k=1}^{n} S(n, k) f_X^{(k)}(1).

Incidentally, you might also write f_X(z) = E(z^X), where the exponent is a random variable; this means that a lot of what we say here about discrete random variables can carry over to continuous random variables.
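
Before moving on, here is a quick symbolic sanity check of these higher-moment formulas, again on a made-up distribution; it uses sympy's built-in Stirling numbers of the second kind:

    # Check E(X^3) = f'''(1) + 3 f''(1) + f'(1) and the general formula
    # E(X^n) = sum_{k=1}^n S(n,k) f^(k)(1) for a few small n, on a toy PMF.
    import sympy as sp
    from sympy.functions.combinatorial.numbers import stirling  # S(n, k), second kind by default

    z = sp.symbols('z')
    pmf = {0: sp.Rational(1, 8), 1: sp.Rational(1, 2), 2: sp.Rational(1, 4), 3: sp.Rational(1, 8)}
    f = sum(p * z**k for k, p in pmf.items())

    def moment_direct(n):
        return sum(k**n * p for k, p in pmf.items())

    def moment_via_pgf(n):
        return sum(stirling(n, k) * sp.diff(f, z, k).subs(z, 1) for k in range(1, n + 1))

    third = sp.diff(f, z, 3).subs(z, 1) + 3 * sp.diff(f, z, 2).subs(z, 1) + sp.diff(f, z).subs(z, 1)
    print(moment_direct(3), third)          # 47/8 47/8
    for n in range(1, 6):
        assert moment_direct(n) == moment_via_pgf(n)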

2 Finding some generating functions

We know some distributions, so let's find their generating functions and then use these to derive moments.

The Bernoulli distribution. Recall that the Bernoulli distribution with parameter p has P(X = 0) = 1 - p, P(X = 1) = p. Therefore the generating function is (1-p)z^0 + pz^1, or pz + q (writing q = 1 - p). This gives f_X'(z) = p, f_X''(z) = 0, and so we can use the formulas E(X) = f_X'(1), Var(X) = f_X''(1) + f_X'(1) - f_X'(1)^2 to get E(X) = p, Var(X) = 0 + p - p^2 = p(1-p) = pq. But we already knew these.

The geometric distribution. This is the time until the first head if we flip a coin with probability p of heads. This distribution has P(X = k) = q^{k-1} p for k \ge 1, and so

    f_X(z) = \sum_{k \ge 1} q^{k-1} p z^k = \sum_{k \ge 1} (qz)^{k-1} pz.

By the usual formula for the sum of a geometric series, we get f_X(z) = pz/(1 - qz). Differentiating once gives

    f_X'(z) = [(1 - qz)p - pz(-q)] / (1 - qz)^2 = p(1 - qz)^{-2},    f_X'(1) = p(1 - q)^{-2} = p/p^2 = 1/p

and so E(X) = 1/p. Differentiating again gives f_X''(z) = 2pq(1 - qz)^{-3}, f_X''(1) = 2pq(1 - q)^{-3} = 2pq/p^3 = 2q/p^2. We can put this all together to get

    Var(X) = f_X''(1) + f_X'(1) - f_X'(1)^2 = 2q/p^2 + 1/p - 1/p^2 = q/p^2.

The Poisson distribution. The Poisson random variable has P(X = k) = e^{-\lambda} \lambda^k / k!. Therefore we have

    f_X(z) = \sum_{k \ge 0} e^{-\lambda} (\lambda^k / k!) z^k = e^{-\lambda} \sum_{k \ge 0} (\lambda z)^k / k!.

Recognizing the sum as the Taylor series for e^{\lambda z}, we get f_X(z) = e^{-\lambda} e^{\lambda z} = e^{\lambda(z-1)}. Therefore f_X'(z) = \lambda e^{\lambda(z-1)} and f_X''(z) = \lambda^2 e^{\lambda(z-1)}. Thus E(X) = \lambda, E(X^2 - X) = \lambda^2, E(X^2) = \lambda^2 + \lambda, and Var(X) = E(X^2) - E(X)^2 = \lambda. The mean and variance of the Poisson are \lambda, which we already knew.
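
As a symbolic check (not needed for the derivation), the following sympy sketch confirms that e^{\lambda(z-1)} has the Poisson probabilities as its Taylor coefficients and that the derivative formulas give mean and variance \lambda; the cutoff at six terms is arbitrary:

    # Confirm that exp(lambda*(z-1)) has Taylor coefficients exp(-lambda)*lambda^k/k!
    # (the Poisson PMF), and that the PGF gives mean = variance = lambda.
    import sympy as sp

    z = sp.symbols('z')
    lam = sp.symbols('lambda', positive=True)

    f = sp.exp(lam * (z - 1))                    # the Poisson generating function
    expansion = sp.series(f, z, 0, 6).removeO()
    for k in range(6):
        coeff = expansion.coeff(z, k)
        pmf = sp.exp(-lam) * lam**k / sp.factorial(k)
        assert sp.simplify(coeff - pmf) == 0

    mean = sp.diff(f, z).subs(z, 1)
    var = sp.simplify(sp.diff(f, z, 2).subs(z, 1) + mean - mean**2)
    print(mean, var)                             # lambda lambda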

The uniform distribution. Somewhat surprisingly, it's difficult to use generating functions to get facts about the uniform distribution. The generating function of the uniform distribution on 0, 1, 2, ..., n-1 is

    f_X(z) = \sum_{k=0}^{n-1} (1/n) z^k = (1/n)(1 + z + ... + z^{n-1}) = (1 - z^n) / (n(1 - z)).

If you differentiate this you get

    f_X'(z) = (1/n) [(n-1)z^n - n z^{n-1} + 1] / (1 - z)^2.

But how can we evaluate this at z = 1? It appears to be 0/0. We take the limit as z \to 1:

    E(X) = (1/n) \lim_{z \to 1} [(n-1)z^n - n z^{n-1} + 1] / (1 - z)^2.

We then apply l'Hopital's rule twice to get

    E(X) = (1/n) \lim_{z \to 1} [(n-1)n(n-1)z^{n-2} - n(n-1)(n-2)z^{n-3}] / 2.

Plug in z = 1 and simplify to get E(X) = (n-1)/2. The variance can be found in the same way: we have

    f_X''(z) = [(n-1)(n-2)z^n - 2n(n-2)z^{n-1} + n(n-1)z^{n-2} - 2] / (n(z - 1)^3)

and we take the limit as z \to 1. We have to apply l'Hopital's rule three times to get

    f_X''(1) = [(n-1)(n-2)n(n-1)(n-2) - 2n(n-2)(n-1)(n-2)(n-3) + n(n-1)(n-2)(n-3)(n-4)] / (6n)

and after much simplification this is (n-1)(n-2)/3. Finally

    Var(X) = (n-1)(n-2)/3 + (n-1)/2 - ((n-1)/2)^2 = (n^2 - 1)/12.

But there are easier ways to get this; see for example Pitman, Exercise 3.30.

The convolution formula. One nice thing about generating functions is that they play well with respect to multiplication. In particular, if X and Y are independent, and we have S = X + Y, then f_S(z) = f_X(z) f_Y(z). That is, the generating function of the sum is the product of the generating functions.

To prove this fact, we show that the coefficient of z^k in f_S(z) is the same as that in f_X(z) f_Y(z). The coefficient of z^k in f_X(z) f_Y(z) is \sum_{j=0}^{k} P(X = j) P(Y = k - j). But since X and Y are independent, this is \sum_{j=0}^{k} P(X = j, Y = k - j). Finally, the event S = k can be broken down into disjoint events:

    {S = k} = {X = 0, Y = k} \cup {X = 1, Y = k-1} \cup {X = 2, Y = k-2} \cup ... \cup {X = k, Y = 0}

and so P(S = k) is the sum of the probabilities of these subevents. So the coefficients of z^k in f_S(z) and f_X(z) f_Y(z) are the same for all k; thus the functions are the same.

Of course this can be extended to a sum of any finite number of random variables. In particular, if S = X_1 + ... + X_n where X_1, ..., X_n each have the distribution of X and the X_i are independent, then f_S(z) = f_X(z)^n.

Binomial distribution. A Binomial(n, p) random variable is the sum of n Bernoulli(p) random variables; therefore its generating function is f_X(z) = (pz + q)^n, the nth power of that of the Bernoulli. So we have f_X'(z) = np(pz + q)^{n-1}, f_X''(z) = n(n-1)p^2 (pz + q)^{n-2}, and in particular f_X'(1) = np, f_X''(1) = n(n-1)p^2. Thus E(X) = np and

    Var(X) = n(n-1)p^2 + np - (np)^2 = n^2 p^2 - np^2 + np - n^2 p^2 = np - np^2 = np(1 - p) = npq.
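
Here is a small sympy illustration of the convolution formula and the binomial generating function together: multiplying n copies of the Bernoulli generating function pz + q and reading off coefficients recovers the Binomial(n, p) probabilities. The choice n = 5 is arbitrary:

    # The PGF of a sum of n independent Bernoulli(p) variables is (pz+q)^n,
    # and its z^k coefficient is the Binomial(n, p) probability of k successes.
    import sympy as sp

    z, p = sp.symbols('z p')
    q = 1 - p
    n = 5                                       # illustrative value

    f_sum = sp.expand((p * z + q)**n)           # product of n Bernoulli PGFs
    for k in range(n + 1):
        coeff = f_sum.coeff(z, k)               # coefficient of z^k, i.e. P(S = k)
        pmf = sp.binomial(n, k) * p**k * q**(n - k)
        assert sp.simplify(coeff - pmf) == 0
    print(sp.factor(f_sum))                     # factors back into (p*z + 1 - p)**5 (up to term ordering)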

Negative binomial distribution. Consider the waiting time until the rth success in a series of independent trials, each of which succeeds with probability p. The overall waiting time is the sum of r waiting times, each of which is geometric with parameter p. The generating function of this waiting time T is therefore

    f_T(z) = ( pz / (1 - qz) )^r.

We can find the mean and the variance of T from this. A useful trick is to note that (d/dz) \log f_T(z) = f_T'(z)/f_T(z); this is known as logarithmic differentiation. Therefore we have

    \log f_T(z) = r( \log pz - \log(1 - qz) )

and differentiating gives

    f_T'(z)/f_T(z) = r( 1/z + q/(1 - qz) ).

Letting z = 1 gives f_T'(1)/f_T(1) = r(1 + q/(1 - q)); simplifying gives f_T'(1)/f_T(1) = r/p. Since f_T(1) = 1 we have f_T'(1) = r/p. Finding the variance of T is left as an exercise.

3 Alternative proofs of some facts

When is the sum of two binomials a binomial? I claimed in class that the sum of two independent binomials is only a binomial if they have the same success probability. We can prove this using generating functions. Let X \sim Bin(n_1, p_1) and Y \sim Bin(n_2, p_2). Then they have generating functions (p_1 z + q_1)^{n_1} and (p_2 z + q_2)^{n_2}, respectively. The sum S = X + Y has generating function

    f_S(z) = (p_1 z + q_1)^{n_1} (p_2 z + q_2)^{n_2},

which is of the form (pz + q)^{n_1 + n_2} only if p_1 = p_2. One way to see this is to note that f_S(z) = (p_1 z + q_1)^{n_1} (p_2 z + q_2)^{n_2} has real zeros at z = -q_1/p_1 and z = -q_2/p_2. If p_1 \ne p_2 then this means that f_S has two distinct real zeros, while (pz + q)^n only has one.
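
The zeros argument is easy to see concretely; in the sympy sketch below the values p_1 = 1/3, p_2 = 2/3, n_1 = 2, n_2 = 3 are arbitrary illustrative choices:

    # With p1 != p2, the PGF of X + Y has two distinct real roots, while any
    # Binomial(n, p) PGF (pz+q)^n has the single root -q/p. So X + Y is not binomial.
    import sympy as sp

    z = sp.symbols('z')
    p1, p2 = sp.Rational(1, 3), sp.Rational(2, 3)
    n1, n2 = 2, 3

    f_S = (p1 * z + 1 - p1)**n1 * (p2 * z + 1 - p2)**n2
    print(sp.roots(f_S, z))    # roots -2 (multiplicity 2) and -1/2 (multiplicity 3)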

But since f X, f Y are probability generating functions, f X () = f Y () = and so we have f S () = f X () + f Y () In terms of expectations this is just E(S) = E(X) + E(Y ) Similarly, we have and at z = this becomes S(z) = X(z)f Y (z) + f X(z)f Y (z) + f X (z) Y (z) S() = X() + f X()f Y () + Y () Combining this with the known expression for f S () we get V ar(s) = X() + f X()f Y () + Y () f X() f Y () (f X() + f Y ()) After some rearrangement this becomes X() + f X() f X() + f Y () + f Y () f Y () and this is clearly V ar(x) + V ar(y ) Another proof of the square root law If S = X + + X n, and all the X i are independent and have the distribution of X, then f S (z) = f X (z) n Differentiating both sides of this identity and substituting gives f S() = nf X () n f X() = nf X() which has the probabilistic interpretation E(S) = ne(x) Differentiating twice gives and letting z = gives Therefore the variance of S is S(z) = n(n )f X (z) n f X(z) + nf X (z) n X(z) S() = n(n )f X() + n X() V ar(s) = S() + f S() f S() = n(n )f X() + n X() + nf X() n f X() After simplifying we get V ar(s) = n( X() + f X() f X() ) = nv ar(x) and taking square roots gives SD(S) = nsd(x), the square root law 6

4 A couple frivolous results

A power series identity from the negative binomial. Recall that for the waiting time T_r until the rth success in independent trials with success probability p, we have

    P(T_r = t) = \binom{t-1}{r-1} p^r q^{t-r}.

But we also know that the generating function of T_r is (pz/(1 - qz))^r. Therefore, the coefficient of z^t in the Taylor expansion of (pz/(1 - qz))^r is \binom{t-1}{r-1} p^r q^{t-r}. We write this fact as

    [z^t] ( pz/(1 - qz) )^r = \binom{t-1}{r-1} p^r q^{t-r},

where we use [z^k] f(z) to stand for the coefficient of z^k in the Taylor expansion of f(z). Some simple manipulation gives

    [z^t] p^r z^r / (1 - qz)^r = \binom{t-1}{r-1} p^r q^{t-r}
    [z^{t-r}] 1/(1 - qz)^r = \binom{t-1}{r-1} q^{t-r}

and if we let t = r + k, we get

    [z^k] 1/(1 - qz)^r = \binom{k+r-1}{r-1} q^k.

Finally, summing over k, we get

    1/(1 - qz)^r = \sum_{k \ge 0} \binom{k+r-1}{r-1} q^k z^k.

For example, we have

    1/(1 - qz)^3 = \binom{2}{2} + \binom{3}{2}(qz) + \binom{4}{2}(qz)^2 + \binom{5}{2}(qz)^3 + ... = 1 + 3(qz) + 6(qz)^2 + 10(qz)^3 + ...
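
A quick sympy check of this identity for r = 3 (an arbitrary choice), comparing the Taylor coefficients of 1/(1-qz)^3 with \binom{k+2}{2} q^k, reproduces the coefficients 1, 3, 6, 10, ... above:

    # Compare Taylor coefficients of 1/(1-qz)^r with binomial(k+r-1, r-1) q^k for r = 3.
    import sympy as sp

    z, q = sp.symbols('z q')
    r = 3
    expansion = sp.series(1 / (1 - q * z)**r, z, 0, 6).removeO()
    for k in range(6):
        lhs = expansion.coeff(z, k)
        rhs = sp.binomial(k + r - 1, r - 1) * q**k
        assert sp.simplify(lhs - rhs) == 0
        print(rhs)    # 1, 3*q, 6*q**2, 10*q**3, 15*q**4, 21*q**5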

Remainders when flipping coins. You flip a coin, which comes up heads with probability p, n times. The probability that the number of heads obtained is even is (f_X(1) + f_X(-1))/2, where X is a binomial random variable; this was a homework problem. Since f_X(z) = (pz + q)^n, this works out to

    [ (p + q)^n + (-p + q)^n ] / 2 = [ 1 + (1 - 2p)^n ] / 2.

So for a large number of coin flips (large n), this will be close to 1/2; if p = 1/2 it will be exactly 1/2. But what if we want to know the probability that we obtain a number of heads which is a multiple of 4? This is not, in general, 1/4. For example, if we flip four fair coins the probability of getting a multiple of 4 heads is (\binom{4}{0} + \binom{4}{4})/2^4 = 2/16; if we flip five fair coins it's (\binom{5}{0} + \binom{5}{4})/2^5 = 6/32. We can evaluate f_X(z) at each of the fourth roots of unity to get

    f_X(1)  = p_0 + p_1 + p_2 + p_3 + p_4 + p_5 + ...
    f_X(i)  = p_0 + i p_1 - p_2 - i p_3 + p_4 + i p_5 + ...
    f_X(-1) = p_0 - p_1 + p_2 - p_3 + p_4 - p_5 + ...
    f_X(-i) = p_0 - i p_1 - p_2 + i p_3 + p_4 - i p_5 + ...

where p_j = P(X = j). Adding all four of these together we get

    f_X(1) + f_X(i) + f_X(-1) + f_X(-i) = 4( p_0 + p_4 + p_8 + ... )

and all the other p_j cancel out. Therefore we have a formula good for any random variable,

    P(X is divisible by 4) = [ f_X(1) + f_X(i) + f_X(-1) + f_X(-i) ] / 4,

and in the case where X \sim Bin(n, 1/2) this is

    P(X is divisible by 4) = [ 1 + ((1+i)/2)^n + 0^n + ((1-i)/2)^n ] / 4.

Since |(1+i)/2| < 1, the second and fourth terms in the numerator go away; if we flip a large number of coins, the probability that the number of heads is divisible by 4 goes to 1/4 as n \to \infty.
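
To close, here is a sympy sketch of the roots-of-unity filter for X \sim Bin(n, 1/2), checked against direct summation of the binomial probabilities; the values of n are arbitrary, and n = 4 and n = 5 reproduce 2/16 = 1/8 and 6/32 = 3/16 from above:

    # P(X divisible by 4) via the roots-of-unity filter, versus direct summation.
    import sympy as sp

    z = sp.symbols('z')
    for n in (4, 5, 20, 100):
        f = ((z + 1) / 2)**n                    # PGF of Bin(n, 1/2)
        filtered = sum(f.subs(z, w) for w in (1, sp.I, -1, -sp.I)) / 4
        filtered = sp.simplify(sp.expand(filtered))
        direct = sum(sp.binomial(n, k) for k in range(0, n + 1, 4)) / sp.Integer(2)**n
        print(n, direct, sp.simplify(filtered - direct) == 0)
    # As n grows, the probability approaches 1/4.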