Candidate code: glt090

How are Generating Functions used in finding the closed form of sequences involving recurrence relations and in the analysis of probability distributions?

Mathematics Extended Essay
Word count: 3865
Abstract word count: 209

"The fascination one can have with mathematics is similar to the fascination a boy can have with a girl." Hiroshi Yuki (2012)

Contents

Important Equations and Identities
Abstract
Introduction
Generating Functions in Finding the Closed Form of Sequences Involving Recurrence Relations
    A simple sequence {2, 4, 8, 16, ...}
    The Fibonacci Sequence
    Derivation of the sum of an infinite geometric series
    The Catalan Numbers
Probability Generating Functions
    Uses of the PGF
    The Binomial Distribution
    The Poisson Distribution
    The Maclaurin Series
Conclusion
Works Cited

Equations and Identities used in this extended essay

Generating function for finding the closed form of a sequence:

$$G(x) = \sum_{n=0}^{\infty} u_n x^n = u_0 x^0 + u_1 x^1 + u_2 x^2 + u_3 x^3 + \dots + u_n x^n + \dots$$

Probability Generating Functions:

$$G_X(t) = \sum_{x=0}^{\infty} t^x P(X = x)$$

Sum of an infinite power series, for $|x| < 1$ and as $n \to \infty$:

$$1 + x + x^2 + \dots = \frac{1}{1 - x}$$

Maclaurin series:

$$f(x) = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(0)\, x^n$$

Abstract

The research question of this Extended Essay is "How are Generating Functions used in finding the closed form of sequences involving recurrence relations and in the analysis of probability distributions?" Generating functions are incredibly useful as they represent an infinite sequence of numbers as a single function (Graham, Knuth and Patashnik, 1994). They are therefore a powerful tool in analysing probability distributions and finding the closed form of a sequence. In investigating this research question, I began by applying generating functions to finding the closed form of the Fibonacci Sequence and the Catalan Numbers. The techniques I used in finding these closed forms helped me to approach finding the probability generating functions of two probability distributions: the Binomial Distribution and the Poisson Distribution. The investigation included approaches using the Maclaurin Series and the Binomial Theorem. The conclusion I came to was that the generating function of a sequence or probability distribution is highly useful, allowing one to analyse the sequence or probability distribution much more quickly and efficiently. The research showed how generating functions are used to analyse probability distributions and sequences involving recurrence relations, and pointed to their applications in real life. This essay concludes with a summary of findings and areas of further exploration on Generating Functions.

(Word count: 209)

Introduction

"The most powerful way to deal with sequences of numbers, as far as anybody knows, is to manipulate infinite series that generate those sequences." (Graham, Knuth, and Patashnik, 1994, p. 320)

Generating functions are one of the most powerful ways of analysing a sequence of numbers: the terms of the sequence are expressed as the coefficients of a series expansion, and the closed form of that series expansion is then found. The concept of a generating function was introduced by Abraham de Moivre in 1730 (Knuth, 1973), and has been used since then to analyse sequences, to solve problems in combinatorics and computer programming, and to analyse probability distributions.

A generating function expresses an infinite sequence as the coefficients of a series expansion. Below is the general form of a generating function:

$$G(x) = \sum_{n=0}^{\infty} u_n x^n$$

Here, $u_n$ is the $n$th term of the sequence, $n$ is a non-negative whole number, and $x$ is an indeterminate which is not solved for; it is simply a symbol or placeholder. A generating function is not a true function which maps a domain to a range, but rather a series which obeys the rules of a function: it has derivatives and integrals and can be manipulated like a regular function.

By finding the function which produces this infinite series, one can generalise the sequence and find its closed form. This may involve calculus, algebra, and the use of the Maclaurin Series, falling factorials or other methods. In this essay, a closed form is defined as an expression which gives the term $u_n$ directly from $n$, without reference to the earlier terms.

The subject of sequences and generating functions was the first one to interest me, especially because of the Fibonacci Sequence, which appears in nature, art, and poetry. It is defined by a recurrence relation, making it difficult to generalise, but generating functions can find the closed form of this sequence and of others which are also defined by recurrence relations. The Catalan Numbers and the Fibonacci sequence have important roles in combinatorics, and finding the general form of these sequences is therefore important.

Statistical analysis is another area in which Generating Functions play an important role, as they simplify finding the mean, variance, and probabilities of a probability distribution. As we shall see, the product of the probability generating functions of independent variables corresponds to their sum, and this greatly increases the efficiency of computations relating to probability distributions.

This extended essay relates to the topics of Statistics and Probability and of Functions and Equations in the Math HL course. It was written before we began studying Statistics and Probability, which helped me greatly when we began the topic.

The research question, "How are Generating Functions used in finding the closed form of sequences involving recurrence relations and in the analysis of probability distributions?", is the focus of this extended essay. I will show the use of generating functions in finding the closed form of a sequence and in the analysis of probability distributions.

Generating Functions in Finding the Closed Form of Sequences Involving Recurrence Relations

A simple sequence: {2, 4, 8, 16, ...}, or $u_n = u_1 r^{n-1}$

I used a simple sequence first to illustrate how generating functions work. In this case, the generating function of the sequence shown above would be

$$G(x) = 2x + 4x^2 + 8x^3 + 16x^4 + \dots$$

(refer back to the general form given earlier). Normally, the general form of this sequence would be relatively easy to figure out by inspection. However, for the sake of demonstrating the process of using a generating function, I will show how to find the general form of the sequence {2, 4, 8, 16, ...} using the generating function method.

Since we know the first few terms of the sequence, we make an infinite sum using the sequence as the coefficients of a power series:

$$G(x) = 2x + 4x^2 + 8x^3 + 16x^4 + \dots$$

From this we can generalise $u_n$ (the $n$th term of the sequence):

$$G(x) = \sum_{n=1}^{\infty} 2^n x^n$$

where $n$ is a positive whole number. And since the coefficient of $x^n$ in the generating function is $u_n$,

$$u_n = 2^n$$

This gives us the general form of the sequence {2, 4, 8, 16, ...}.

Now that I have demonstrated the use of generating functions in finding the closed forms of simple sequences, I will move on to more complicated ones, such as sequences defined by recurrence relations. One of the most famous examples of this is the Fibonacci sequence.

The Fibonacci Sequence

For arithmetic and geometric sequences, such as the one in the above example, the general formula is quite easy to find, as they have common differences or ratios which can be found using subtraction or division. However, there are other sequences, such as the Fibonacci sequence, which are defined by a recurrence relation. The Fibonacci sequence can be expressed as the sequence

{0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...}

where the $n$th term is the sum of the $(n-1)$th and $(n-2)$th terms, such that

$$u_0 = 0, \quad u_1 = 1, \quad u_n = u_{n-1} + u_{n-2} \text{ for } n \geq 2$$

The generation of the $n$th term as a function of the previous terms is what makes the Fibonacci sequence a recurrence relation. The general form of a recurrence relation is generally harder to find, as there is no common ratio or common difference with which to express the sequence. Although there is a pattern, following it term by term is not practical for large $n$, so generating functions are used to find the general forms of sequences which depend on recurrence relations.

The generating function of the Fibonacci sequence is as follows (where $G(x)$ is a power series whose coefficients are the terms of the Fibonacci sequence):

$$G(x) = 0x^0 + 1x^1 + 1x^2 + 2x^3 + 3x^4 + \dots$$
$$G(x) = x + x^2 + 2x^3 + 3x^4 + \dots$$

Or, in a more general form,

$$G(x) = u_0 x^0 + u_1 x^1 + u_2 x^2 + u_3 x^3 + \dots$$

Generalized as

$$G(x) = \dots + u_{n-2} x^{n-2} + u_{n-1} x^{n-1} + u_n x^n + \dots$$

However, the terms $u_{n-2}$, $u_{n-1}$ and $u_n$ cannot be combined in this form, as they are attached to different powers of $x$. I noticed that the recurrence relation of the Fibonacci sequence allowed me to equalize the powers of $x$ by multiplying $G(x)$ by powers of $x$:

$$u_{n-2} x^{n-2} \cdot x^2 = u_{n-2} x^n$$
$$u_{n-1} x^{n-1} \cdot x^1 = u_{n-1} x^n$$

I multiplied $G(x)$ by $x^0$, $x^1$, and $x^2$ in order to obtain the following equations:

$$G(x) \cdot x^0 = u_0 x^0 + u_1 x^1 + u_2 x^2 + u_3 x^3 + \dots \quad (1)$$
$$G(x) \cdot x^1 = u_0 x^1 + u_1 x^2 + u_2 x^3 + u_3 x^4 + \dots \quad (2)$$

$$G(x) \cdot x^2 = u_0 x^2 + u_1 x^3 + u_2 x^4 + u_3 x^5 + \dots \quad (3)$$

Subtracting equation (1) from the sum of equations (3) and (2):

Left-hand side:

$$G(x) x^2 + G(x) x^1 - G(x) x^0 = G(x)(x^2 + x - 1)$$

Right-hand side:

$$u_0 x^1 - u_0 x^0 - u_1 x^1 + (u_0 + u_1 - u_2)x^2 + (u_1 + u_2 - u_3)x^3 + (u_2 + u_3 - u_4)x^4 + \dots + (u_{n-2} + u_{n-1} - u_n)x^n + \dots$$

From the general form of this combined equation I saw that

$$(u_{n-2} + u_{n-1} - u_n)x^n = [u_{n-2} + u_{n-1} - (u_{n-2} + u_{n-1})]x^n = (u_{n-2} + u_{n-1} - u_{n-2} - u_{n-1})x^n = 0$$

Everything on the right-hand side cancels out except for the first three terms, leaving the equation

$$G(x)(x^2 + x - 1) = u_0 x^1 - u_0 x^0 - u_1 x^1$$
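The cancellation above can be checked numerically. The following is a minimal sketch, assuming the series is truncated after a hypothetical cut-off of 12 terms and stored as a list of coefficients (index = power of $x$); the helper names are illustrative and not part of the essay. Multiplying the truncated series by $x^2 + x - 1$ should leave only the low-order terms $-u_0 + (u_0 - u_1)x$.

```python
def fibonacci_terms(count):
    """Return the first `count` Fibonacci numbers u_0, u_1, ..."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def multiply_polynomials(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    product = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            product[i + j] += a_i * b_j
    return product


TRUNCATION = 12                              # hypothetical cut-off, chosen for illustration
g_truncated = fibonacci_terms(TRUNCATION)    # coefficients u_0 .. u_11 of G(x)
factor = [-1, 1, 1]                          # the polynomial -1 + x + x^2

product = multiply_polynomials(g_truncated, factor)

# Only coefficients below the cut-off are meaningful; the last two entries
# of `product` are artefacts of truncating the infinite series.
low_order = product[:TRUNCATION]
print(low_order)
```

With $u_0 = 0$ and $u_1 = 1$, the only surviving coefficient is $-1$ at $x^1$, which agrees with $G(x)(x^2 + x - 1) = -x$.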

And since we know that $u_0$ and $u_1$ are 0 and 1 respectively,

$$G(x)(x^2 + x - 1) = -x$$
$$G(x) = \frac{x}{1 - x - x^2}$$

This is the generating function of the Fibonacci Sequence: the sum of the infinite series whose coefficients are the Fibonacci numbers. However, this generating function needs to be converted back into series form before the $n$th term of the sequence can be read off. Currently, we know

$$G(x) = \frac{x}{1 - x - x^2}$$

I realised that in order to convert this generating function into a general term, I would need to make use of the sum of an infinite geometric series (Yuki, 2012).

Derivation of the sum of an infinite geometric series

Consider:

$$(1)(1 - x) = 1 - x^1$$
$$(1 + x)(1 - x) = 1 - x^2$$
$$(1 + x + x^2)(1 - x) = 1 - x^3$$
$$(1 + x + x^2 + x^3)(1 - x) = 1 - x^4$$

We can generalise this:

$$(1 + x + x^2 + \dots + x^n)(1 - x) = 1 - x^{n+1}$$

Creating the equation

$$1 + x + x^2 + \dots + x^n = \frac{1 - x^{n+1}}{1 - x}$$

And when $|x| < 1$, we can see that

$$\lim_{n \to \infty} x^{n+1} = 0$$

Therefore, for $|x| < 1$ and as $n \to \infty$,

$$1 + x + x^2 + \dots = \frac{1}{1 - x}$$

For the function $f(x) = 1 + x + x^2 + x^3 + x^4 + \dots$, we could also express this as $f(x) = 1x^0 + 1x^1 + 1x^2 + 1x^3 + 1x^4 + \dots$, where the coefficients of the powers of $x$ form the sequence {1, 1, 1, 1, 1, 1, ...}. The general form of a generating function $G(x)$ is as follows:

$$G(x) = u_0 x^0 + u_1 x^1 + u_2 x^2 + u_3 x^3 + \dots = \sum_{n=0}^{\infty} u_n x^n$$

Now, back to the Fibonacci Sequence. The geometric series helped me in expressing the function $G(x)$ as an infinite series. I began by factorizing the denominator of the fraction in $G(x)$, taking into account the fact that $x$ is an indeterminate. Consider:

$$\frac{x}{1 - x - x^2} = \frac{R}{1 - rx} + \frac{S}{1 - sx} = \frac{R(1 - sx) + S(1 - rx)}{(1 - rx)(1 - sx)}$$

$$= \frac{R - Rsx + S - Srx}{1 - sx - rx + rsx^2} = \frac{(R + S) - (Rs + Sr)x}{1 - (s + r)x + rsx^2}$$

By comparing coefficients with $\frac{x}{1 - x - x^2}$, we can see that

1. $R + S = 0$
2. $Rs + Sr = -1$
3. $s + r = 1$
4. $rs = -1$

We can express $R$ and $S$ in terms of $r$ and $s$:

$$R = -S$$
$$-Ss + Sr = -1 \quad \text{(by substituting equation 1 into equation 2)}$$
$$S(r - s) = -1$$
$$S = -\frac{1}{r - s}$$

Similarly, we can find that

$$R = \frac{1}{r - s}$$

And, using this, we can therefore form the expression

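$$G(x) = \frac{x}{1 - x - x^2} = \frac{R}{1 - rx} + \frac{S}{1 - sx} = \frac{1}{r - s} \cdot \frac{1}{1 - rx} - \frac{1}{r - s} \cdot \frac{1}{1 - sx} = \frac{1}{r - s}\left(\frac{1}{1 - rx} - \frac{1}{1 - sx}\right)$$

We can see here how $\frac{1}{1 - rx}$ and $\frac{1}{1 - sx}$ correspond to the generating function $\frac{1}{1 - x}$. We replace $\frac{1}{1 - rx}$ and $\frac{1}{1 - sx}$ with their infinite sums, referring back to the sum of an infinite geometric series derived earlier.

$$G(x) = \frac{1}{r - s}\left((1 + rx + r^2x^2 + r^3x^3 + \dots) - (1 + sx + s^2x^2 + s^3x^3 + \dots)\right)$$
$$= \frac{1}{r - s}\left((r - s)x + (r^2 - s^2)x^2 + (r^3 - s^3)x^3 + \dots\right)$$
$$= \frac{r - s}{r - s}x + \frac{r^2 - s^2}{r - s}x^2 + \frac{r^3 - s^3}{r - s}x^3 + \dots$$

Which can be expressed in the general form

$$G(x) = \sum_{n=0}^{\infty} \frac{r^n - s^n}{r - s} x^n$$

And since we know that

$$G(x) = \sum_{n=0}^{\infty} u_n x^n$$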

We can therefore deduce that

$$u_n = \frac{r^n - s^n}{r - s}$$

where $u_n$ is the $n$th term of the sequence. Now, from the comparison of coefficients above, we also know that

$$s + r = 1 \quad \text{and} \quad rs = -1$$

And since $r$ and $s$ are the roots of the equation

$$x^2 - (s + r)x + rs = 0$$
$$x^2 - x - 1 = 0$$

we can use the quadratic formula to solve for the roots of the equation:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
$$x = \frac{1 \pm \sqrt{1 + 4}}{2}$$
$$x = \frac{1 \pm \sqrt{5}}{2}$$

Assuming that $r > s$,

$$r = \frac{1 + \sqrt{5}}{2} \quad \text{and} \quad s = \frac{1 - \sqrt{5}}{2}$$
$$r - s = \frac{1 + \sqrt{5}}{2} - \frac{1 - \sqrt{5}}{2} = \frac{1 - 1 + 2\sqrt{5}}{2} = \sqrt{5}$$

This shows that

$$u_n = \frac{1}{\sqrt{5}}\left(\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right)$$

Which is the general form of the Fibonacci sequence. We can test this out:

$$u_0 = \frac{1}{\sqrt{5}}\left(\left(\frac{1 + \sqrt{5}}{2}\right)^0 - \left(\frac{1 - \sqrt{5}}{2}\right)^0\right) = \frac{1}{\sqrt{5}}(1 - 1) = 0$$
$$u_1 = \frac{1}{\sqrt{5}}\left(\left(\frac{1 + \sqrt{5}}{2}\right)^1 - \left(\frac{1 - \sqrt{5}}{2}\right)^1\right) = \frac{1}{\sqrt{5}} \cdot \sqrt{5} = 1$$

From this process, we can see that using a generating function to find the general form of a sequence follows these steps:

1. Express the generating function of the sequence in the form of an infinite power series, using the formula $G(x) = \sum_{n=0}^{\infty} u_n x^n$
2. Find the sum of the infinite power series, that is, the closed form of $G(x)$
3. Re-expand $G(x)$ in the form $\sum_{n=0}^{\infty} u_n x^n$, in which $u_n$ is an explicit expression in $n$
4. Extract $u_n$ from step 3, finding the closed form of the sequence

Now, I will demonstrate this method of using generating functions to find the general form, $u_n$, of a sequence defined by a recurrence relation using another sequence: the Catalan numbers.

The Catalan Numbers

The Catalan numbers are another famous mathematical sequence which had stumped mathematicians since their publication in the work of Eugène Charles Catalan as a solution to a combinatorics problem. Like the Fibonacci numbers, they had been discovered multiple times but were made famous by one mathematician. The Catalan numbers appear in many situations, one of which is a round table around which $2n$ people sit. The Catalan number $u_n$ (the $n$th term of the sequence) is then the number of ways those $2n$ people can shake hands in $n$ pairs without their arms crossing. The Catalan numbers consist of the following sequence (from $n = 1$):

Table 1: The Catalan Numbers

n     1   2   3   4    5    6     7     8
u_n   1   2   5   14   42   132   429   1430

Another combinatorics problem which leads to the Catalan numbers is the number of different ways to arrange brackets around a sum of $n + 1$ numbers, where $A$ is any arbitrary number. We are not actually adding any numbers but rather using them as placeholders, so that we can rearrange the brackets in order to illustrate the Catalan numbers.

$u_0 = 1$.

For $n = 1$, $A + A$:
    $(A + A)$
= 1 way to arrange the brackets

For $n = 2$, $A + A + A$:
    $((A + A) + A)$, counted by $u_1 u_0$
    $(A + (A + A))$, counted by $u_0 u_1$
= 2 ways to arrange the brackets

For $n = 3$, $A + A + A + A$:
    $(((A + A) + A) + A)$
    $((A + (A + A)) + A)$
    $((A + A) + (A + A))$, which includes two arrangements of $n = 1$
    $(A + ((A + A) + A))$
    $(A + (A + (A + A)))$
= 5 ways to arrange the brackets

There is a pattern emerging in these examples: one can see that the groupings of $n = 2$ reappear inside $n = 3$. Therefore we can see that the earlier terms affect the value of $u_n$. Each $u_n$ is a sum of products of the previous terms:

$$u_2 = u_0 u_1 + u_1 u_0$$
$$u_3 = u_0 u_2 + u_1 u_1 + u_2 u_0$$
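The bracket counts listed above can be reproduced by brute force. The sketch below is an illustration only (the function name `bracketings` is hypothetical): it builds every full bracketing of a sum of placeholder terms by choosing where the outermost split falls, then bracketing the left and right groups independently.

```python
def bracketings(terms):
    """Return every full bracketing of `terms` copies of A joined by '+'."""
    if terms == 1:
        return ["A"]
    results = []
    # The outermost '+' splits the sum into a left group and a right group.
    for left in range(1, terms):
        for left_expr in bracketings(left):
            for right_expr in bracketings(terms - left):
                results.append(f"({left_expr} + {right_expr})")
    return results


# n + 1 terms correspond to the index n used in the essay.
for n in range(1, 5):
    found = bracketings(n + 1)
    print(n, len(found))

print(bracketings(4))
```

For $n = 1, 2, 3, 4$ this prints the counts 1, 2, 5 and 14, and the final line lists the five arrangements for $n = 3$. The nested loops over the left and right groups are where the products such as $u_0 u_2 + u_1 u_1 + u_2 u_0$ come from.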

From this, we can generalise that

$$u_0 = 1, \quad u_{n+1} = u_0 u_{n-0} + u_1 u_{n-1} + u_2 u_{n-2} + \dots + u_k u_{n-k} + \dots + u_{n-0} u_0$$

Or,

$$u_0 = 1, \quad u_{n+1} = \sum_{k=0}^{n} u_k u_{n-k}$$

This is the recurrence relation of the Catalan Numbers. Now for the generating function of the Catalan numbers:

$$G(x) = u_0 x^0 + u_1 x^1 + u_2 x^2 + u_3 x^3 + \dots + u_n x^n + \dots$$
$$G(x) = u_0 + u_1 x + u_2 x^2 + u_3 x^3 + \dots + u_n x^n + \dots$$
$$G(x) = \sum_{n=0}^{\infty} u_n x^n$$

In order to find the closed form of $G(x)$, I tried a few methods such as differentiation and subtraction (the technique I used for the Fibonacci sequence) but eventually found that squaring $G(x)$ was the most effective way to find the closed form of the generating function.

$$G(x)^2 = u_0 u_0 + (u_0 u_1 + u_1 u_0)x + (u_0 u_2 + u_1 u_1 + u_2 u_0)x^2 + \dots + \left(\sum_{k=0}^{n} u_k u_{n-k}\right)x^n + \dots$$

Showing that the coefficient of $x^n$ is $\sum_{k=0}^{n} u_k u_{n-k}$.

$$G(x)^2 = \sum_{n=0}^{\infty}\left(\sum_{k=0}^{n} u_k u_{n-k}\right)x^n$$

And since we know that

$$u_{n+1} = \sum_{k=0}^{n} u_k u_{n-k}$$
$$G(x)^2 = \sum_{n=0}^{\infty} u_{n+1} x^n$$

In a similar fashion to the Fibonacci sequence, we can manipulate this equation by multiplying it by $x$. Doing so will change the power of $x$ in the current equation and allow us to find the closed form of the generating function of the Catalan numbers.

$$x\,G(x)^2 = x\sum_{n=0}^{\infty} u_{n+1} x^n$$
$$x\,G(x)^2 = \sum_{n=0}^{\infty} u_{n+1} x^{n+1}$$
$$x\,G(x)^2 = \sum_{n+1=1}^{\infty} u_{n+1} x^{n+1}$$

I added 1 to both sides of the lower limit so that the index $n + 1$ would match the subscript of $u$ and the power of $x$. This can be further generalised to

$$x\,G(x)^2 = \sum_{n=1}^{\infty} u_n x^n$$
$$x\,G(x)^2 = \sum_{n=0}^{\infty} u_n x^n - u_0$$

Knowing that $G(x) = \sum_{n=0}^{\infty} u_n x^n$,

$$x\,G(x)^2 = G(x) - u_0$$

And since we know that $u_0 = 1$,

$$x\,G(x)^2 - G(x) + 1 = 0$$

Using the quadratic formula,

$$G(x) = \frac{1 \pm \sqrt{1 - 4x}}{2x}$$

However, the $\pm$ complicates things. I realised that I couldn't have two closed forms:

$$2x\,G(x) = 1 + \sqrt{1 - 4x} \quad \text{or} \quad 2x\,G(x) = 1 - \sqrt{1 - 4x}$$

For $x = 0$,

$$2(0)\,G(0) = 1 + \sqrt{1 - 4(0)} \quad \text{or} \quad 2(0)\,G(0) = 1 - \sqrt{1 - 4(0)}$$
$$0 \neq 1 + 1 \quad \text{or} \quad 0 = 1 - 1 = 0$$

Only the second equation holds. Therefore, we know that the correct closed form of $G(x)$ is

$$G(x) = \frac{1 - \sqrt{1 - 4x}}{2x} \quad (*)$$

In order to find the closed form of the sequence, we'll attempt to find the power series which corresponds to the generating function $G(x)$, as we did for the Fibonacci sequence.
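Before expanding $G(x)$, the quadratic $x\,G(x)^2 - G(x) + 1 = 0$ can be sanity-checked on a truncated series. This is a minimal sketch, assuming the Catalan numbers are generated from the recurrence $u_{n+1} = \sum_{k=0}^{n} u_k u_{n-k}$ and that the series is cut off at a hypothetical order of 10; the helper names are illustrative.

```python
def catalan_by_recurrence(count):
    """Return u_0 .. u_{count-1} using u_0 = 1, u_{n+1} = sum_k u_k * u_{n-k}."""
    u = [1]
    for n in range(count - 1):
        u.append(sum(u[k] * u[n - k] for k in range(n + 1)))
    return u


def square_truncated(series):
    """Coefficients of series(x)^2, kept only up to the original length."""
    size = len(series)
    return [sum(series[k] * series[n - k] for k in range(n + 1)) for n in range(size)]


ORDER = 10  # hypothetical truncation order, chosen for illustration
u = catalan_by_recurrence(ORDER)
g_squared = square_truncated(u)

# Multiplying by x shifts every coefficient up by one power of x.
x_g_squared = [0] + g_squared[:-1]

# Coefficients of x*G(x)^2 - G(x) + 1, which should all vanish.
residual = [x_g_squared[n] - u[n] + (1 if n == 0 else 0) for n in range(ORDER)]
print(u)
print(residual)
```

Every entry of `residual` is zero up to the truncation order, which is consistent with the quadratic above. The check says nothing about which sign of the square root to take; that still relies on the argument at $x = 0$.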

$G(x)$ is one power series, $\sum_{k=0}^{\infty} u_k x^k$, and the expansion of $\sqrt{1 - 4x}$ can also be expressed as a power series, $\sum_{k=0}^{\infty} P_k x^k$. I will call this second generating function $P(x)$ so that it will not be confused with $G(x)$, and its coefficients will be $P_k$ to avoid confusion with the Catalan Numbers. This means that the equation for $G(x)$, from (*), would now be:

$$2x\sum_{k=0}^{\infty} u_k x^k = 1 - \sum_{k=0}^{\infty} P_k x^k$$
$$2\sum_{k=0}^{\infty} u_k x^{k+1} = 1 - \left(P_0 + \sum_{k=1}^{\infty} P_k x^k\right)$$
$$2\sum_{k=1}^{\infty} u_{k-1} x^k = 1 - P_0 - \sum_{k=1}^{\infty} P_k x^k$$
$$2\sum_{k=1}^{\infty} u_{k-1} x^k + \sum_{k=1}^{\infty} P_k x^k = 1 - P_0$$
$$\sum_{k=1}^{\infty} (2u_{k-1} + P_k)x^k = 1 - P_0$$

By comparing the coefficients of the terms, I found that

$$0 = 1 - P_0$$
$$2u_0 + P_1 = 0$$
$$2u_1 + P_2 = 0$$
$$2u_2 + P_3 = 0$$

$$2u_n + P_{n+1} = 0$$

We can generalise this as

$$P_0 = 1, \quad u_n = -\frac{P_{n+1}}{2}$$

Therefore, if we find the general form of $P_{n+1}$, we will be able to find the closed form of the Catalan numbers. We know that

$$P(x) = \sqrt{1 - 4x}$$
$$P(x) = P_0 x^0 + P_1 x^1 + P_2 x^2 + P_3 x^3 + \dots$$

Applying the rules we use for regular functions to $P(x)$ and differentiating, we get

$$P'(x) = 1P_1 + 2P_2 x + 3P_3 x^2 + 4P_4 x^3 + \dots$$
$$P''(x) = 2 \cdot 1 P_2 + 3 \cdot 2 P_3 x + 4 \cdot 3 P_4 x^2 + \dots$$
$$P'''(x) = 3 \cdot 2 \cdot 1 P_3 + 4 \cdot 3 \cdot 2 P_4 x + \dots$$

Thus

$$P'(0) = 1P_1$$
$$P''(0) = 2P_2$$
$$P'''(0) = 3 \cdot 2 \cdot 1 P_3$$

Or generalised as

$$P^{(n)}(0) = n!\,P_n$$
$$P_n = \frac{P^{(n)}(0)}{n!}$$

After I arrived at this, I realised that I needed to link the equation $P(x) = \sqrt{1 - 4x}$ to the identity for $P_n$ by finding the $n$th derivative of $P(x)$.

$$P(x) = (1 - 4x)^{\frac{1}{2}}$$
$$P'(x) = -2(1 - 4x)^{-\frac{1}{2}}$$
$$P''(x) = -2 \cdot 2(1 - 4x)^{-\frac{3}{2}}$$
$$P^{(3)}(x) = -2 \cdot 4 \cdot 3(1 - 4x)^{-\frac{5}{2}}$$
$$P^{(4)}(x) = -2 \cdot 6 \cdot 5 \cdot 4(1 - 4x)^{-\frac{7}{2}}$$
$$P^{(5)}(x) = -2 \cdot 8 \cdot 7 \cdot 6 \cdot 5(1 - 4x)^{-\frac{9}{2}}$$

Here I realised that I needed some sort of notation for a factorial which ends on a number other than one, for example $8 \cdot 7 \cdot 6 \cdot 5$, without using fractions of factorials, as these would complicate the calculations further on. After researching, I found that the most effective method was the falling factorial, $(x)_n$, which is the product of $n$ descending factors starting from $x$. For example:

$(x)_x = x(x - 1)(x - 2)(x - 3)(x - 4)\dots(3)(2)(1) = x!$ ($x$ goes through the factorial $x$ times)
$(x)_2 = x(x - 1)$ ($x$ goes through the factorial twice)
$(x)_5 = x(x - 1)(x - 2)(x - 3)(x - 4)$ ($x$ goes through the factorial five times)

Generalised as

$$(x)_n = \prod_{k=0}^{n-1} (x - k)$$

where $\prod$ is similar to the $\sum$ sign but multiplies instead of adding. I used this notation in writing the generalisation of $P^{(n)}(x)$, which is below.

$$P^{(n)}(x) = -2(2n - 2)_{n-1}(1 - 4x)^{-\frac{2n-1}{2}}$$
$$P^{(n+1)}(x) = -2(2n)_n(1 - 4x)^{-\frac{2n+1}{2}}$$
$$P^{(n+1)}(0) = -2(2n)_n$$

And from the equation $P_n = \frac{P^{(n)}(0)}{n!}$:

$$P_{n+1} = \frac{P^{(n+1)}(0)}{(n+1)!} = \frac{-2(2n)_n}{(n+1)_{n+1}}$$

We can substitute this into the equation $u_n = -\frac{P_{n+1}}{2}$:

$$u_n = -\frac{P_{n+1}}{2} \quad \text{(from the comparison of coefficients earlier)}$$
$$= \frac{(2n)_n}{(n+1)_{n+1}} \quad \text{(substituting what we know of } P_{n+1}\text{)}$$
$$= \frac{(2n)_n}{(n+1)(n)_n} \quad \text{(separating the falling factorials)}$$

From here, we know that

$$(2n)_n = 2n(2n - 1)(2n - 2)\dots(n + 1)$$
$$(2n)_n (n)_n = 2n(2n - 1)(2n - 2)\dots(n + 1)(n)(n - 1)(n - 2)\dots(3)(2)(1) = (2n)!$$
$$(2n)_n = \frac{(2n)!}{n!}$$

Then we can put that back into the previous equation where we separated the falling factorials, using $(n)_n = n!$:

$$u_n = \frac{1}{n+1} \cdot \frac{(2n)!}{n!\,n!} = \frac{1}{n+1} \cdot \frac{(2n)!}{n!\,(2n - n)!} = \frac{1}{n+1}\binom{2n}{n}$$

This is the general form of the $n$th term of the Catalan numbers.

After exploring the use of generating functions in finding the general form of a sequence defined by recurrence relations, I decided to look at a more practical approach: the applications of generating functions in statistics.
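Looking back at the Catalan result, the closed form can be compared against the recurrence it was derived from. The sketch below is illustrative only: the function names are hypothetical, and the range of $n$ checked is an arbitrary choice. It also checks the falling-factorial identity $(2n)_n \, n! = (2n)!$ used in the last step.

```python
from math import comb, factorial


def falling_factorial(x, n):
    """(x)_n = x * (x - 1) * ... * (x - n + 1), with (x)_0 = 1."""
    result = 1
    for k in range(n):
        result *= x - k
    return result


def catalan_closed_form(n):
    """u_n = C(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


def catalan_by_recurrence(count):
    """Return u_0 .. u_{count-1} using u_0 = 1, u_{n+1} = sum_k u_k * u_{n-k}."""
    u = [1]
    for n in range(count - 1):
        u.append(sum(u[k] * u[n - k] for k in range(n + 1)))
    return u


COUNT = 12  # arbitrary number of terms to compare
from_recurrence = catalan_by_recurrence(COUNT)
from_closed_form = [catalan_closed_form(n) for n in range(COUNT)]
print(from_recurrence == from_closed_form)

# The identity used to remove the falling factorial: (2n)_n * n! = (2n)!
identity_holds = all(
    falling_factorial(2 * n, n) * factorial(n) == factorial(2 * n)
    for n in range(COUNT)
)
print(identity_holds)
```

Agreement over a finite range is a check rather than a proof, but it does catch sign or index slips in the derivation.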

Probability Generating Functions

Generating functions are used in probability and statistics to encode the probability of each outcome $x$ as the coefficient of $t^x$, where $t$ is an indeterminate (it does not have a value but can be treated like a variable in a regular function). For a probability distribution $X$,

$$G_X(t) = E(t^X) = \sum_{x=0}^{\infty} t^x P(X = x)$$

where $G_X(t)$ is the probability generating function (PGF) of the probability distribution $X$ (the number of successes or occurrences, as defined in each different distribution), and $P(X = x)$ is the probability that $X$ takes the value $x$.

The PGF allows one to find a lot of information about a probability distribution much more simply than other methods. It has the following uses (Chapter 4: Generating Functions, n.d.):

1. Finding $P(X = x)$ for any value of $x$ as the coefficient of $t^x$
2. Finding the mean of a probability distribution
3. Finding the variance of a probability distribution
4. Finding the distribution of the sum of independent, discrete random variables

In this section, I will explore two different distributions and the derivations of their PGFs: the Binomial Distribution and the Poisson Distribution.

Uses of the PGF

Firstly, the derivatives of the PGF can be used to generate the probabilities of the probability distribution the PGF corresponds to (6 - Probability Generating Functions, n.d.).

$$G_X(t) = \sum_{x=0}^{\infty} t^x P(X = x) = P(0) + tP(1) + t^2P(2) + t^3P(3) + \dots$$
$$G_X'(t) = 1P(1) + 2tP(2) + 3t^2P(3) + 4t^3P(4) + \dots$$
$$G_X''(t) = (2 \cdot 1)P(2) + (3 \cdot 2)tP(3) + (4 \cdot 3)t^2P(4) + \dots$$
$$G_X'''(t) = (3 \cdot 2 \cdot 1)P(3) + (4 \cdot 3 \cdot 2)tP(4) + \dots$$

I used this pattern to generalise the probabilities of any probability distribution by evaluating the derivatives of the PGF at $t = 0$:

$$P(X = 0) = G_X(0)$$
$$P(X = 1) = G_X'(0)$$
$$P(X = 2) = \frac{1}{2 \cdot 1} G_X''(0)$$
$$P(X = 3) = \frac{1}{3 \cdot 2 \cdot 1} G_X'''(0)$$
$$P(X = 4) = \frac{1}{4 \cdot 3 \cdot 2 \cdot 1} G_X^{(4)}(0)$$

And can be generalised to:

$$P(X = n) = \frac{1}{n!} G_X^{(n)}(0)$$

The mean and variance can also be found from the PGF of any probability distribution.
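The formula $P(X = n) = \frac{1}{n!} G_X^{(n)}(0)$ can be illustrated on a small PGF. The sketch below assumes a hypothetical example that is not in the essay: the number of heads in two fair coin tosses, whose PGF is $\frac{1}{4} + \frac{1}{2}t + \frac{1}{4}t^2$. The PGF is stored as a list of coefficients and differentiated term by term.

```python
from fractions import Fraction
from math import factorial


def differentiate(coeffs):
    """Differentiate a polynomial stored as coefficients (index = power of t)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, t):
    """Evaluate the polynomial at t."""
    return sum(c * t**k for k, c in enumerate(coeffs))


def probability_from_pgf(coeffs, n):
    """P(X = n) = G^(n)(0) / n!, found by differentiating n times."""
    derivative = list(coeffs)
    for _ in range(n):
        derivative = differentiate(derivative)
    return evaluate(derivative, 0) / factorial(n)


# Hypothetical example: number of heads in two fair coin tosses.
pgf = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

recovered = [probability_from_pgf(pgf, n) for n in range(3)]
print(recovered)
```

For a polynomial PGF this simply returns the stored coefficients, which is the point: differentiating $n$ times and setting $t = 0$ isolates the coefficient of $t^n$.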

For the derivation of the mean, consider the PGF of the probability distribution $X$:

$$G_X(t) = \sum_{x=0}^{\infty} t^x P(X = x)$$
$$G_X'(t) = 1P(X = 1) + 2tP(X = 2) + 3t^2P(X = 3) + 4t^3P(X = 4) + \dots$$

where $G_X'(t) = \frac{d}{dt} G_X(t)$.

There is a pattern in this: the series looks like the expectation of the probability distribution. The only thing out of place is $t$, which can be replaced by 1 in order to find the expectation of the probability distribution.

$$G_X'(1) = 1P(X = 1) + 2P(X = 2) + 3P(X = 3) + 4P(X = 4) + \dots = \sum_{x=1}^{\infty} x\,P(X = x) = E(X)$$

This shows us that the mean of the probability distribution can be found from the PGF of a probability distribution, rather than having to sum over the probability distribution in the usual way. Recall:

$$E(X) = \sum x\,P(X = x)$$

For large probability distributions one would have to sum the products of all the values of $x$ and $P(X = x)$ in order to find the expectation, and therefore the PGF makes it much more efficient to find the expectation of the probability distribution.

For the variance, I differentiated the PGF twice.

$$G_X(t) = \sum_{x=0}^{\infty} t^x P(X = x)$$
$$G_X'(t) = \sum_{x=0}^{\infty} x\,t^{x-1} P(X = x)$$
$$G_X''(t) = \sum_{x=0}^{\infty} x(x - 1)\,t^{x-2} P(X = x)$$

Where $t = 1$,

$$G_X''(1) = \sum_{x=0}^{\infty} x(x - 1)P(X = x) = E(X(X - 1)) = E(X^2 - X) = E(X^2) - E(X)$$

For variance,

$$Var(X) = E(X^2) - [E(X)]^2 = E(X^2) - [G_X'(1)]^2$$

It seemed as if we were stuck here. However, we know that $G_X''(1) = E(X^2) - E(X)$, so we can manipulate this equation:

$$Var(X) = E(X^2) - E(X) + E(X) - [G_X'(1)]^2$$
$$= G_X''(1) + E(X) - [G_X'(1)]^2$$
$$= G_X''(1) + G_X'(1) - [G_X'(1)]^2$$

Finally, the PGFs of two independent random variables can be used to find the PGF of the sum of those variables. For example, if $X = Z + Y$, and $Z$ and $Y$ are independent, we can label their generating functions as $G_Z(t)$ and $G_Y(t)$ respectively. If we want to find the generating function of the sum of the two distributions, we can do so using the generating functions of $Z$ and $Y$:

$$G_X(t) = E(t^X) = E(t^{Z+Y}) = E(t^Z t^Y) = E(t^Z)\,E(t^Y) = G_Z(t)\,G_Y(t)$$

Hence, we can see that a PGF is immensely useful in finding the probabilities, mean, and variance of a probability distribution, and the distribution of sums of independent variables. In the following section, I will consider well-known probability distributions, their PGFs, and how effectively information about the probability distributions can be found using the PGFs.

Binomial Distribution

The Binomial Distribution is a probability distribution limited by the following rules:

1. The trials are independent of one another
2. There are only two possible outcomes: success and failure
3. The probability of success is the same in every trial

The Binomial Distribution is built from the Bernoulli Distribution, which is shown in Table 2, repeated $n$ times.

Table 2: The Bernoulli Distribution

x          0       1
P(X = x)   1 - p   p

Where $p$ is defined as the probability of success and $1 - p$ is the probability of failure, $X$ is the random variable denoting the number of successes, and $P(X = x)$ denotes the probability that $X$ assumes the value $x$, where $x = 1$ is a success and $x = 0$ is a failure.

The Binomial Distribution occurs when an experiment is repeated $n$ times, the probability of success is $p$, and $X$ is the number of successes in these trials. The probability density function (PDF) of the Binomial Distribution is:

$$X \sim B(n, p): \quad P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}, \quad x = 0, 1, \dots, n$$

The PGF of this Binomial Distribution, $G_X(t)$, would therefore be

$$G_X(t) = t^0 P(X = 0) + t^1 P(X = 1) + t^2 P(X = 2) + t^3 P(X = 3) + \dots$$
$$= t^0\binom{n}{0}p^0(1 - p)^n + t^1\binom{n}{1}p^1(1 - p)^{n-1} + t^2\binom{n}{2}p^2(1 - p)^{n-2} + t^3\binom{n}{3}p^3(1 - p)^{n-3} + \dots$$
$$= \binom{n}{0}(tp)^0(1 - p)^n + \binom{n}{1}(tp)^1(1 - p)^{n-1} + \binom{n}{2}(tp)^2(1 - p)^{n-2} + \binom{n}{3}(tp)^3(1 - p)^{n-3} + \dots$$

Let $q = 1 - p$,

$$G_X(t) = \binom{n}{0}(tp)^0 q^n + \binom{n}{1}(tp)^1 q^{n-1} + \binom{n}{2}(tp)^2 q^{n-2} + \binom{n}{3}(tp)^3 q^{n-3} + \dots$$

Recall that

$$(a + b)^n = \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^r \quad \text{(the Binomial Theorem)}$$

Therefore,

$$G_X(t) = (q + tp)^n = (1 - p + tp)^n$$

This is the PGF of the Binomial Distribution.
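The result $G_X(t) = (q + tp)^n$ can also be read as the product of $n$ Bernoulli PGFs $q + tp$, in line with the rule that the PGF of a sum of independent variables is the product of their PGFs. The sketch below checks this numerically; the values $n = 5$ and $p = 3/10$ are arbitrary choices for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def multiply_polynomials(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of t)."""
    product = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            product[i + j] += a_i * b_j
    return product


def binomial_pgf_coefficients(n, p):
    """Expand (q + p*t)^n by multiplying n Bernoulli PGFs together."""
    bernoulli_pgf = [1 - p, p]  # q + p*t
    pgf = [Fraction(1)]
    for _ in range(n):
        pgf = multiply_polynomials(pgf, bernoulli_pgf)
    return pgf


n, p = 5, Fraction(3, 10)  # arbitrary illustrative values
from_product = binomial_pgf_coefficients(n, p)
from_formula = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

print(from_product == from_formula)
print(sum(from_product))
```

Exact fractions are used so that the two lists can be compared with `==`; with floating-point values a tolerance would be needed instead.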

One of the uses of the PGF is that the expectation and variance of the distribution can be found very easily. The expectation of the Binomial distribution can be found using the following method:

$$G_X'(t) = pn(q + tp)^{n-1}$$
$$E(X) = G_X'(1) = pn(q + p)^{n-1} = np$$

since $q + p = 1$. Similarly, the variance of a distribution can be found using the PGF of that distribution. Earlier, the variance of a distribution was shown to be:

$$Var(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2$$

This can be used to find the variance of a distribution very quickly:

$$Var(X) = p^2n(n - 1)(q + p)^{n-2} + np - (np)^2$$
$$= np^2(n - 1) + np - (np)^2$$
$$= p(n^2p - np + n - n^2p)$$
$$= np(1 - p)$$
$$= npq$$

Next I asked myself the question: how can I work backwards from a PGF to find the original probability density function? We can use this PGF in a similar way to how we used the generating function to find the $n$th term of the Fibonacci sequence, by re-writing the PGF as a series, allowing us to extract the probability density function.

$$G_X(t) = (q + tp)^n$$
$$G_X'(t) = pn(q + tp)^{n-1}$$

$$G_X''(t) = p^2 n(n - 1)(q + tp)^{n-2}$$
$$G_X'''(t) = p^3 n(n - 1)(n - 2)(q + tp)^{n-3}$$

Generalised as:

$$G_X^{(y)}(t) = p^y\, n(n - 1)(n - 2)(n - 3)\dots(n - y + 1)(q + tp)^{n-y}$$
$$G_X^{(y)}(t) = p^y\, \frac{n!}{(n - y)!}(q + tp)^{n-y}$$
$$G_X^{(y)}(t) = p^y\, \frac{n!\,y!}{y!\,(n - y)!}(q + tp)^{n-y}$$
$$G_X^{(y)}(t) = p^y\, y!\binom{n}{y}(q + tp)^{n-y}$$

And since we know that

$$P(X = y) = \frac{1}{y!} G_X^{(y)}(0)$$
$$P(X = y) = \binom{n}{y} p^y q^{n-y}$$
$$P(X = y) = \binom{n}{y} p^y (1 - p)^{n-y}$$

Which follows a Binomial Distribution. Using this method, we can prove that a generating function belongs to a certain probability distribution, or find the probability density function of the distribution.

One other distribution which I looked at was the Poisson Distribution.
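The expectation and variance obtained for the Binomial Distribution can be checked against the PGF identities $E(X) = G_X'(1)$ and $Var(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2$. In the sketch below the derivatives at $t = 1$ are computed directly from the probabilities, as $\sum x\,P(X = x)$ and $\sum x(x - 1)\,P(X = x)$; the values $n = 8$ and $p = 1/4$ are arbitrary, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Probabilities P(X = x) for x = 0 .. n."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def first_derivative_at_one(pmf):
    """G'(1) = sum of x * P(X = x)."""
    return sum(x * prob for x, prob in enumerate(pmf))


def second_derivative_at_one(pmf):
    """G''(1) = sum of x * (x - 1) * P(X = x)."""
    return sum(x * (x - 1) * prob for x, prob in enumerate(pmf))


n, p = 8, Fraction(1, 4)  # arbitrary illustrative values
pmf = binomial_pmf(n, p)

mean = first_derivative_at_one(pmf)
variance = second_derivative_at_one(pmf) + mean - mean**2

print(mean, n * p)
print(variance, n * p * (1 - p))
```

Both lines print matching pairs, $2$ and $3/2$ for these values, in agreement with $np$ and $npq$.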

Poisson Distribution

The PDF of the Poisson Distribution is defined as follows:

$$X \sim Po(\lambda): \quad P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, 3, \dots$$

where $\lambda$ is the mean of the distribution and $x$ is the number of occurrences. The Poisson Distribution is used to predict the number of independent events over a certain period of time. It gives the probability that $x$ events will happen, given that $\lambda$ events happen on average over that period of time. This distribution is used in the prediction of crime rates in a certain area, the number of deaths in war, the number of hungry people entering a restaurant in a given period of time (Letkowski, 2012), and many other applications which have independent events happening over a certain period of time. The Poisson Distribution can also be applied spatially, such as to the number of errors on a page of writing, or the number of defects on a certain product.

Finding the PGF of the Poisson Distribution is therefore a very valuable tool which can be used in industry, criminology, and a multitude of other applications. From the definition of the Poisson Distribution, we can see that the PGF will be

$$G_X(t) = \sum_{x=0}^{\infty} t^x P(X = x) = \sum_{x=0}^{\infty} t^x\, \frac{e^{-\lambda}\lambda^x}{x!}$$

$$= e^{-\lambda}\sum_{x=0}^{\infty} \frac{(t\lambda)^x}{x!}$$

From this point it may seem almost impossible to find a closed form for the PGF of the Poisson Distribution. However, we can make use of the Maclaurin series to simplify the result.

The Maclaurin series

The Maclaurin series is a Taylor Series centred on zero. The Maclaurin Series makes use of an infinite series of polynomial terms to represent a function. This series is derived from the derivatives of the original function (Khan, 2011). A function $f(x)$ could therefore be expressed as a power series.

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \dots$$
$$f'(x) = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \dots$$
$$f''(x) = 2a_2 + 3 \cdot 2\,a_3 x + 4 \cdot 3\,a_4 x^2 + 5 \cdot 4\,a_5 x^3 + \dots$$
$$f'''(x) = 3 \cdot 2 \cdot 1\,a_3 + 4 \cdot 3 \cdot 2\,a_4 x + 5 \cdot 4 \cdot 3\,a_5 x^2 + \dots$$

Substituting 0 for $x$, we can see that

$$f(0) = a_0$$
$$f'(0) = a_1$$
$$f''(0) = 2a_2, \quad a_2 = \frac{f''(0)}{2!}$$
$$f'''(0) = 3 \cdot 2\,a_3, \quad a_3 = \frac{f'''(0)}{3!}$$

$$f^{(n)}(0) = n!\,a_n, \quad a_n = \frac{f^{(n)}(0)}{n!}$$

So we can generalise that

$$f(x) = f(0) + f'(0)x + \frac{1}{2}f''(0)x^2 + \frac{1}{3 \cdot 2}f'''(0)x^3 + \frac{1}{4 \cdot 3 \cdot 2}f^{(4)}(0)x^4 + \dots + \frac{1}{n!}f^{(n)}(0)x^n + \dots$$

While researching, I found that the Maclaurin series for $e^x$ was important to finding the PGF of the Poisson Distribution. This led me to finding the Maclaurin Series for $f(x) = e^x$. The function $e^x$ has the handy quality that $\frac{d}{dx}e^x = e^x$, which makes its Maclaurin Series very simple to find.

$$e^x = e^0 + e^0 x + \frac{e^0 x^2}{2!} + \frac{e^0 x^3}{3!} + \frac{e^0 x^4}{4!} + \dots + \frac{e^0 x^n}{n!} + \dots$$
$$= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dots + \frac{x^n}{n!} + \dots$$
$$= \sum_{n=0}^{\infty} \frac{x^n}{n!}$$

After I found the Maclaurin Series of $e^x$, I realised why it was important to finding the PGF of the Poisson Distribution. What had seemed like a series that couldn't be simplified into a closed form became manageable when I compared the Maclaurin Series of $e^x$ with the PGF of the Poisson Distribution.

$$G_X(t) = e^{-\lambda}\sum_{x=0}^{\infty} \frac{(t\lambda)^x}{x!}$$

Since $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, it follows that $e^{t\lambda} = \sum_{x=0}^{\infty} \frac{(t\lambda)^x}{x!}$, and we can substitute this value into the equation to get:

$$G_X(t) = e^{-\lambda} e^{t\lambda}$$
$$G_X(t) = e^{\lambda(t - 1)}$$
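The closed form $G_X(t) = e^{\lambda(t - 1)}$ can be compared with a partial sum of the defining series $\sum t^x e^{-\lambda}\lambda^x / x!$. The sketch below uses arbitrary illustrative values ($\lambda = 3$, $t = 0.5$) and a cut-off of 40 terms, which is an assumption chosen so that the neglected tail is negligible for these values.

```python
from math import exp, factorial, isclose


def poisson_pgf_partial_sum(t, lam, terms):
    """Sum of t^x * e^(-lam) * lam^x / x! for x = 0 .. terms - 1."""
    return sum(t**x * exp(-lam) * lam**x / factorial(x) for x in range(terms))


def poisson_pgf_closed_form(t, lam):
    """The closed form e^(lam * (t - 1))."""
    return exp(lam * (t - 1))


lam, t = 3.0, 0.5   # arbitrary illustrative values
TERMS = 40          # assumed cut-off for the infinite sum

series_value = poisson_pgf_partial_sum(t, lam, TERMS)
closed_value = poisson_pgf_closed_form(t, lam)

print(series_value, closed_value)
print(isclose(series_value, closed_value, rel_tol=1e-9))
```

Setting $t = 1$ in either function gives 1 (up to rounding), as it must for any PGF, since the probabilities sum to 1.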

As with the Binomial distribution, how can we prove that this generating function represents the Poisson Distribution, or work backwards to find the original probability density function? This can be achieved in the following steps:

$$G_X(t) = e^{\lambda(t - 1)}$$
$$G_X'(t) = \lambda e^{\lambda(t - 1)}$$
$$G_X''(t) = \lambda^2 e^{\lambda(t - 1)}$$
$$G_X'''(t) = \lambda^3 e^{\lambda(t - 1)}$$

Generalised as:

$$G_X^{(n)}(t) = \lambda^n e^{\lambda(t - 1)}$$

Which can be substituted into the equation

$$P(X = n) = \frac{1}{n!} G_X^{(n)}(0)$$
$$P(X = n) = \frac{\lambda^n e^{-\lambda}}{n!}$$

Which is a Poisson Distribution. Again, this shows us how to find the PDF of a distribution from its PGF.

Other than this, one can use the PGF to find the expectation of the distribution much more efficiently than the usual method. In the introductory section on PGFs, I showed how one can use the PGF to find the expectation. The expectation of a probability distribution is:

$$E(X) = \sum_{x=0}^{\infty} x\,P(X = x) \quad \text{or} \quad E(X) = G_X'(1)$$

With the first equation, for the Poisson Distribution, we would have to find the expectation by evaluating this sum:

$$E(X) = \sum_{x=0}^{\infty} x\,\frac{e^{-\lambda}\lambda^x}{x!}$$

Solving for the expectation of the Poisson Distribution is very simple if this equation is used instead:

$$E(X) = G_X'(1) = \lambda e^{\lambda(1-1)} = \lambda$$

Using the PGF of the Poisson Distribution is very efficient, as one simply has to differentiate it once and then set $t = 1$. Similarly, the variance of the Poisson Distribution can be found using the following method:

$$\mathrm{Var}(X) = G_X''(1) + G_X'(1) - [G_X'(1)]^2 = \lambda^2 e^{\lambda(1-1)} + \lambda e^{\lambda(1-1)} - \lambda^2 = \lambda$$

This shows clearly the efficiency of using the PGF in the analysis of probability distributions.
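The way the PGF yields the individual probabilities, the expectation, and the variance can be illustrated with a short Python sketch. It assumes the derivative formula $G_X^{(n)}(t) = \lambda^n e^{\lambda(t-1)}$ for the Poisson PGF; the function names and the value of `lam` are hypothetical and chosen only for illustration.

```python
import math


def pgf_derivative(n, t, lam):
    """n-th derivative of the Poisson PGF: lam^n * e^(lam * (t - 1))."""
    return lam**n * math.exp(lam * (t - 1))


def pmf_from_pgf(n, lam):
    """P(X = n) = G^(n)(0) / n!, i.e. the n-th derivative evaluated at t = 0."""
    return pgf_derivative(n, 0.0, lam) / math.factorial(n)


def mean_from_pgf(lam):
    """E(X) = G'(1): differentiate once and set t = 1."""
    return pgf_derivative(1, 1.0, lam)


def variance_from_pgf(lam):
    """Var(X) = G''(1) + G'(1) - [G'(1)]^2."""
    g1 = pgf_derivative(1, 1.0, lam)
    g2 = pgf_derivative(2, 1.0, lam)
    return g2 + g1 - g1**2


lam = 2.5  # illustrative rate parameter (an assumption)

# For comparison: the expectation computed term by term from the definition,
# truncated at 80 terms.
direct_mean = sum(x * pmf_from_pgf(x, lam) for x in range(80))

print(pmf_from_pgf(3, lam))
print(mean_from_pgf(lam), direct_mean)
print(variance_from_pgf(lam))
```

The PGF route needs only one or two evaluations of a derivative at $t = 1$, whereas the direct route has to sum a long series term by term, which is the efficiency gain described above.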

Conclusion

Through exploring this topic in the Extended Essay, I found that generating functions can be used to find the closed form of sequences involving recurrence relations. Two of the sequences for which I found the closed form were the Fibonacci Sequence and the Catalan Numbers. This shows the powerful computational potential of generating functions, and how their properties allow the methods used for manipulating functions to be applied to a sequence.

I also explored the use of PGFs in the analysis of probability distributions. Using the PGF of a distribution, I found the individual probabilities as well as the expectation and variance. The PGF can also be used to analyse sums of independent random variables using their individual PGFs. The PGF of a distribution greatly reduces the computational time needed to find information about the probability distribution. By applying the various methods used on regular functions, one can manipulate a probability distribution to find information about it much more efficiently than if one did not use a PGF.

The scope of applications of generating functions is huge, ranging from the sieve method in combinatorics and the Snake Oil method for evaluating combinatorial sums to many other combinatorial problems. Computer algorithms also use generating functions, and they can be used to prove combinatorial identities. Sequences of functions can also be examined by generating functions, and the Legendre polynomial generating function is important in the area of electrodynamics and vectors (Sahanggamu, 2006). These, especially the use of generating functions in combinatorics, could be topics for further investigation.

Works Cited

6 - Probability Generating Functions. (n.d.). 1st ed. [ebook] Cambridge: University of Cambridge. Available at: https://www.cl.cam.ac.uk/teaching/0708/probabilty/prob06.pdf [Accessed 16 Dec. 2016].

Chapter 4: Generating Functions. (n.d.). 1st ed. [ebook] The University of Auckland. Available at: https://www.stat.auckland.ac.nz/~fewster/35/notes/ch4.pdf [Accessed 5 Feb. 2017].

Graham, R., Knuth, D. and Patashnik, O. (1994). Concrete mathematics. 2nd ed. Reading: Addison-Wesley, p.30.

Khan, S. (2011). Taylor & Maclaurin polynomials intro (part 1). [video] Available at: https://www.khanacademy.org/math/calculus-home/series-calc/taylor-series-calc/v/maclauren-and-taylor-series-intuition [Accessed 8 Dec. 2016].

Knuth, D. (1973). The art of computer programming. 1st ed. Reading: Addison-Wesley.

Letkowski, J. (2012). Applications of the Poisson probability distribution. 1st ed. [ebook] Springfield: Western New England University. Available at: http://www.aabri.com/sa1manuscripts/sa1083.pdf [Accessed 6 Dec. 2016].

Sahanggamu, A. (2006). Generating Functions and Their Applications. Undergraduate. MIT Mathematics Department.

Yūki, H. and Gonzalez, T. (2012). Math Girls. 2nd ed. Austin: Bento Books, Inc.