Notes on generating functions in automata theory

Benjamin Steinberg
December 5, 2009

Contents
1 Introduction: Calculus can count
2 Formal power series
3 Rational power series
  3.1 Rational power series and linear recurrences
  3.2 Newton's identities
4 Regular languages and generating functions
  4.1 Unambiguous regular expressions
    4.1.1 Unambiguous regular expressions and rationality
  4.2 A linear algebraic approach

1 Introduction: Calculus can count

Let $L = \{0,1\}^* \setminus \{0,1\}^* 11 \{0,1\}^*$. This is a regular language. Suppose you would like to know how many words of length $n$ belong to this language. It turns out that Taylor series from Calculus can help us. Let's first try and use bare hands methods to count this. Let $a_n$ be the number of words of length $n$ in $L$. Evidently $a_0 = 1$ since the empty word belongs to $L$. Also $0, 1 \in L$, so $a_1 = 2$. How about length 2? Well, $00, 01, 10 \in L$ but $11 \notin L$, so $a_2 = 3$. Next look at length 3. We have $000, 001, 010, 100, 101$. So $a_3 = 5$. We can't go on like this forever, so let's try and be smart. Any word $w \in L$ of length at least 2 must either end in $0$ or end in $01$. More precisely, $w$ must be of the form $u0$ or $v01$ with

$u, v \in L$. Now there are $a_{n-1}$ words of length $n$ ending in $0$ and $a_{n-2}$ words of length $n$ ending in $01$. Therefore, we have

$$a_n = a_{n-1} + a_{n-2}, \quad n \ge 2, \qquad a_0 = 1,\ a_1 = 2. \tag{1.1}$$

This is essentially the Fibonacci sequence, except the Fibonacci sequence is given by $1, 1, 2, 3, 5, 8, \ldots$, so our sequence starts from the second element of the Fibonacci sequence. It turns out to be more convenient to calculate a formula for the Fibonacci sequence. We define the Fibonacci sequence $\{f_n\}$ formally by

$$f_n = f_{n-1} + f_{n-2}, \quad n \ge 2, \qquad f_0 = 1,\ f_1 = 1. \tag{1.2}$$

So $a_n = f_{n+1}$. Therefore, to obtain a formula for $a_n$, we just need to get a formula for $f_n$.

How can we get an explicit formula for $f_n$? The extremely clever idea (essentially going back to the 1700s or 1800s) is to encode the sequence via a Taylor series (or as mathematicians prefer to call it, a power series). So let $g(x) = \sum_{n=0}^{\infty} f_n x^n$. This is called the generating function of the sequence $\{f_n\}$. Elementary calculus says that

$$f_n = \frac{g^{(n)}(0)}{n!}$$

so if we can identify $g$, we may be able to use derivatives to calculate the $f_n$. Actually, in most cases we can identify $g$ with a function whose power series we know well. The most typical example is the geometric series

$$\frac{1}{1 - ax} = 1 + ax + (ax)^2 + \cdots = \sum_{n=0}^{\infty} (ax)^n. \tag{1.3}$$

We can get more examples by differentiating or integrating.

OK, back to our Fibonacci sequence $\{f_n\}$. Consider its generating function

$$g(x) = \sum_{n=0}^{\infty} f_n x^n = 1 + x + 2x^2 + 3x^3 + 5x^4 + 8x^5 + \cdots.$$

For the heck of it, let's compute $g(x)(1 - x - x^2)$. Since

$$x g(x) = f_0 x + f_1 x^2 + f_2 x^3 + \cdots = \sum_{n=0}^{\infty} f_n x^{n+1} = \sum_{n=1}^{\infty} f_{n-1} x^n \tag{1.4}$$

$$x^2 g(x) = f_0 x^2 + f_1 x^3 + \cdots = \sum_{n=0}^{\infty} f_n x^{n+2} = \sum_{n=2}^{\infty} f_{n-2} x^n \tag{1.5}$$

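we get the equality

$$g(x)(1 - x - x^2) = \sum_{n=0}^{\infty} f_n x^n - \sum_{n=1}^{\infty} f_{n-1} x^n - \sum_{n=2}^{\infty} f_{n-2} x^n = f_0 + f_1 x - f_0 x + \sum_{n=2}^{\infty} (f_n - f_{n-1} - f_{n-2}) x^n. \tag{1.6}$$

Using the recursive formula (1.2) and $f_0 = 1 = f_1$, (1.6) becomes $g(x)(1 - x - x^2) = 1$, yielding the formula

$$g(x) = \frac{1}{1 - x - x^2}. \tag{1.7}$$

Now you can see that the choice $1 - x - x^2$ was not at all random. According to (1.4), multiplying $g(x)$ by $x$ has the effect of lowering the indices by 1, while (1.5) shows multiplying by $x^2$ lowers the indices by two. Since our recursion expresses the coefficients of $g(x)$ in terms of the previous two indices, our polynomial $1 - x - x^2$ does exactly the job of killing off all but the initial terms.

Now, let us find a partial fraction decomposition of $\frac{1}{1 - x - x^2}$. The roots of $1 - x - x^2$ are $\frac{-1 \pm \sqrt{5}}{2}$. Let $\alpha = \frac{-1 + \sqrt{5}}{2}$ and $\beta = \frac{-1 - \sqrt{5}}{2}$. Then

$$\frac{1}{1 - x - x^2} = \frac{1}{(x - \alpha)(\beta - x)} = \frac{-1}{(x - \alpha)(x - \beta)} = \frac{A}{x - \alpha} + \frac{B}{x - \beta}.$$

This gives us the equations

$$A + B = 0, \qquad A\beta + B\alpha = 1.$$

So $A = -B$ and $B(-\beta + \alpha) = 1$. But $-\beta + \alpha = \sqrt{5}$, so $B = \frac{1}{\sqrt{5}}$ and $A = -\frac{1}{\sqrt{5}}$. Therefore, we obtain

$$\frac{1}{1 - x - x^2} = \frac{1}{\sqrt{5}}\left(\frac{-1}{x - \alpha} + \frac{1}{x - \beta}\right) = \frac{1}{\sqrt{5}}\left(\frac{\alpha^{-1}}{1 - \alpha^{-1}x} - \frac{\beta^{-1}}{1 - \beta^{-1}x}\right)$$

$$= \frac{1}{\sqrt{5}}\left(\sum_{n=0}^{\infty} (\alpha^{-1})^{n+1} x^n - \sum_{n=0}^{\infty} (\beta^{-1})^{n+1} x^n\right) = \sum_{n=0}^{\infty} \frac{1}{\sqrt{5}}\left[(\alpha^{-1})^{n+1} - (\beta^{-1})^{n+1}\right] x^n.$$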

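To obtain our final answer, we first need to compute $\alpha^{-1}$, $\beta^{-1}$. In fact,

$$\alpha^{-1} = \frac{2}{-1 + \sqrt{5}} = \frac{2(1 + \sqrt{5})}{(-1 + \sqrt{5})(1 + \sqrt{5})} = \frac{1 + \sqrt{5}}{2}$$

$$\beta^{-1} = \frac{2}{-1 - \sqrt{5}} = \frac{2(1 - \sqrt{5})}{(-1 - \sqrt{5})(1 - \sqrt{5})} = \frac{1 - \sqrt{5}}{2}$$

Putting it all together, we obtain

$$g(x) = \sum_{n=0}^{\infty} \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} - \left(\frac{1 - \sqrt{5}}{2}\right)^{n+1}\right] x^n.$$

The formula for the $n$th Fibonacci number is then given by

$$f_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} - \left(\frac{1 - \sqrt{5}}{2}\right)^{n+1}\right].$$

The amazing thing about this formula is that despite all the $\sqrt{5}$'s, the answer is always an integer! The number $\varphi = \frac{1 + \sqrt{5}}{2}$ is called the Golden Mean (look it up on Google!). This number fascinated ancient Greeks, as well as Leonardo da Vinci (it even appears in the da Vinci code!). Our formula says

$$f_n = \frac{\varphi^{n+1} - (1 - \varphi)^{n+1}}{\sqrt{5}}. \tag{1.8}$$

In fact the ratio of the Fibonacci numbers converges to the Golden Mean.

Theorem 1.1. The ratio of the Fibonacci numbers converges to the Golden Mean. That is,

$$\lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \varphi.$$

Proof. First observe that $|1 - \varphi| < 1$. Therefore, by (1.8), we have

$$\lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \lim_{n \to \infty} \frac{\varphi^{n+2} - (1 - \varphi)^{n+2}}{\varphi^{n+1} - (1 - \varphi)^{n+1}} = \lim_{n \to \infty} \frac{\varphi^{n+2}}{\varphi^{n+1}} \quad (\text{since } |1 - \varphi| < 1) \quad = \varphi$$

as required.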
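The closed formula and the original counting problem can be checked against each other numerically. The following is a minimal sketch, not part of the notes: the function names are made up for illustration, the closed form is evaluated in floating point and rounded (so it is only trustworthy for moderate $n$), and the language is counted by brute force.

```python
from itertools import product
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # the Golden Mean


def fib(n):
    """f_n from the recurrence (1.2): f_0 = f_1 = 1, f_n = f_{n-1} + f_{n-2}."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """Formula (1.8), rounded to the nearest integer to absorb floating-point error."""
    return round((PHI ** (n + 1) - (1 - PHI) ** (n + 1)) / sqrt(5))


def count_no_11(n):
    """Brute-force count of binary words of length n with no factor 11."""
    return sum(1 for w in product("01", repeat=n) if "11" not in "".join(w))


if __name__ == "__main__":
    for n in range(10):
        # Columns: n, f_n by recurrence, f_n by (1.8), a_n by brute force
        print(n, fib(n), fib_closed_form(n), count_no_11(n))
```

The brute-force column should agree with $f_{n+1}$, matching $a_n = f_{n+1}$, and `fib(n + 1) / fib(n)` approaches `PHI` as in Theorem 1.1.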

The plan for the rest of these notes is as follows. First we develop the general theory of power series and generating functions. In particular we focus on the class of rational generating functions. Then we show that the generating function of a regular language is a rational generating function.

2 Formal power series

We begin by defining properly a power series. In these notes we won't be concerned about the convergence of these series, although the radius of convergence does give you important information about the growth of the coefficients.

Definition 2.1 (Formal power series). A formal power series is a formal sum $f(x) = \sum_{n=0}^{\infty} a_n x^n$ where the $a_n$ are real numbers. Two power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $g(x) = \sum_{n=0}^{\infty} b_n x^n$ are said to be equal if their coefficients agree, that is, $a_n = b_n$ for all $n \ge 0$.

Since we don't consider convergence, it doesn't make sense to evaluate $f$ at a real number, with the exception of the point $x = 0$. The number $f(0) = a_0$ is clearly well defined. A polynomial is a formal power series with only finitely many non-zero terms. We often identify constant polynomials with real numbers. In particular, the power series $0$ is the power series with all coefficients $0$, whereas the power series $1$ is the power series with constant term $1$ and all other terms $0$.

One can define the derivative of a formal power series in a clear way: $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$. Of course $f^{(n)}(x)$, the $n$th derivative of $f$, is defined by taking $n$ derivatives. It is then a formal calculation to verify that Taylor's formula holds.

Theorem 2.2 (Taylor's Formula). If $f(x) = \sum_{n=0}^{\infty} a_n x^n$, then

$$a_n = \frac{f^{(n)}(0)}{n!}.$$

This formula should not be confused with Taylor's theorem from Calculus, which gives a good bound on the error of approximating a function by a Taylor polynomial.

One can add power series in the usual way. If $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots$, then

$$f(x) + g(x) = a_0 + b_0 + (a_1 + b_1)x + (a_2 + b_2)x^2 + \cdots.$$

In formulas, we have

$$f(x) + g(x) = \sum_{n=0}^{\infty} (a_n + b_n) x^n.$$

The negative of a power series is obtained by negating all the terms: $-f(x) = -a_0 - a_1 x - a_2 x^2 - \cdots$.

Multiplication of power series is a bit more complicated. If $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots$, then

$$f(x)g(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots$$

This boils down to the formula

$$f(x)g(x) = \sum_{n=0}^{\infty} \left(\sum_{m=0}^{n} a_m b_{n-m}\right) x^n. \tag{2.1}$$

What this formula says is that to get the coefficient of $x^n$ you look at all pairs of numbers $k, l$ with $k + l = n$ and add up the corresponding products $a_k b_l$.

As an example, consider $(1 - x)(1 + x + x^2 + \cdots)$. Playing with the product symbolically, we obtain $1 + x - x + x^2 - x^2 + \cdots = 1$. Let's try to do this rigorously using (2.1). Here we have $a_0 = 1$, $a_1 = -1$, $a_n = 0$ for $n \ge 2$, and all $b_n = 1$. The coefficient of $x^0$ is just $a_0 b_0 = 1$. For $n \ge 1$, the coefficient of $x^n$ reduces to $a_0 b_n + a_1 b_{n-1} = b_n - b_{n-1} = 1 - 1 = 0$. Therefore, $f(x)g(x) = 1$. This shows the power series $1 - x$ is invertible, or more precisely

$$\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n.$$

Definition 2.3 (Invertible power series). We say that a power series $f(x)$ is invertible if there is a power series $g(x)$ such that $f(x)g(x) = 1$.

Suppose that $f(0) = 0$. Then $f(x)g(x)$ evaluated at $0$ is $f(0)g(0) = 0$. Therefore, $f(x)g(x) \ne 1$. The upshot is that we have just shown that the constant term of an invertible power series must be non-zero. It turns out that a power series $f(x)$ is invertible precisely when $f(0) \ne 0$. To prove this we would like to show that if $f$ is a power series, then

$$\sum_{n=0}^{\infty} f^n = \frac{1}{1 - f}.$$

But let's not be too hasty. For instance, if $f(x) = 1 - x$ then

$$\sum_{n=0}^{\infty} f^n = 1 + (1 - x) + (1 - 2x + x^2) + (1 - 3x + 3x^2 - x^3) + \cdots$$

and so the constant term is a sum of infinitely many 1s, an impossibility. The problem here is that $f(x)$ has a non-zero constant term. Suppose that $f(0) = 0$, so $f(x) = a_1 x + a_2 x^2 + \cdots$. Then $f(x)^n = a_1^n x^n + \cdots$ where all the other terms have higher order than $n$. So if you try and compute $1 + f + f^2 + \cdots$ you will never have to add up infinitely many real numbers, and so the power series $\sum_{n=0}^{\infty} f^n$ makes sense. In fact, the coefficient of $x^n$ in $1 + f + f^2 + \cdots$ agrees with the coefficient of $x^n$ in $1 + f + \cdots + f^n$ because $f^{n+1}, f^{n+2}$, etc., only contribute terms of higher order than $n$.

So assume that $f(0) = 0$ and let us compute $(1 - f)(1 + f + f^2 + \cdots)$. The constant term is clearly $1$ (since $f(0) = 0$). Formally, we have

$$1 - f + f - f^2 + f^2 - \cdots = 1.$$

More rigorously, if we want to show that the coefficient of $x^n$ is $0$ in this product for $n \ge 1$, it suffices to compute the coefficient of $x^n$ in $(1 - f)(1 + f + \cdots + f^n)$ since $f^{n+1}$, etc., only contribute terms of higher order. But a telescoping argument yields

$$(1 - f)(1 + f + \cdots + f^n) = 1 - f + f - f^2 + \cdots - f^n + f^n - f^{n+1} = 1 - f^{n+1}$$

and since all terms of $f^{n+1}$ are at least order $n + 1$, we see $(1 - f)(1 + f + \cdots + f^n)$ has $0$ as the coefficient of $x^n$. This allows us to rigorously conclude that $1/(1 - f) = 1 + f + f^2 + \cdots$. We record this as a proposition.

Proposition 2.4. Suppose $f$ is a power series with $f(0) = 0$. Then

$$\frac{1}{1 - f} = \sum_{n=0}^{\infty} f^n.$$

Now we are ready to complete our characterization of invertible power series.

Theorem 2.5. A power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is invertible if and only if $a_0 \ne 0$, i.e., $f(0) \ne 0$.

Proof. We already saw that if $f(0) = 0$, then $f$ is not invertible. Conversely, suppose $a_0 = f(0) \ne 0$. Clearly $f$ is invertible if and only if $f/a_0$ is invertible, so we may assume without loss of generality that $f(0) = 1$. Let $g(x) = 1 - f(x)$. Notice that $1 - g = f$. Since $g(0) = 0$, Proposition 2.4 shows

$$\sum_{n=0}^{\infty} g^n = \frac{1}{1 - g} = \frac{1}{f}.$$

This completes the proof.

Now we can formally define a generating function.

Definition 2.6 (Generating function). If $\{a_n\}$ is a sequence of numbers, the generating function for the sequence is the power series

$$f(x) = \sum_{n=0}^{\infty} a_n x^n.$$

Exercise 1. Verify the following properties of power series.
1. $f + g = g + f$
2. $(f + g) + h = f + (g + h)$
3. $f + 0 = f$
4. $f - f = 0$
5. $1 \cdot f = f$
6. $(fg)h = f(gh)$
7. $f(g + h) = fg + fh$

Exercise 2. Prove Taylor's formula.

Exercise 3. Show that every formal power series is a generating function.
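The operations of this section are easy to experiment with on truncated coefficient lists. The sketch below is an illustration under assumptions, not the notes' construction: a power series is represented by its first few coefficients (lowest degree first), `multiply` and `inverse` are hypothetical helper names, and the inverse is computed by comparing coefficients in $f(x)g(x) = 1$ using (2.1) rather than by summing the geometric series of Proposition 2.4.

```python
from fractions import Fraction


def multiply(a, b, N):
    """First N coefficients of f(x)g(x), using the product formula (2.1).

    a and b are coefficient lists (lowest degree first); missing entries count as 0.
    """
    return [
        sum(a[m] * b[n - m] for m in range(n + 1) if m < len(a) and n - m < len(b))
        for n in range(N)
    ]


def inverse(a, N):
    """First N coefficients of 1/f(x); requires a[0] != 0 (Theorem 2.5).

    Comparing coefficients of x^n in f(x)g(x) = 1 gives, for n >= 1,
    a_0 b_n + a_1 b_{n-1} + ... + a_n b_0 = 0, which is solved for b_n.
    """
    if a[0] == 0:
        raise ValueError("a power series with zero constant term is not invertible")
    b = [Fraction(1) / a[0]]
    for n in range(1, N):
        s = sum(a[m] * b[n - m] for m in range(1, n + 1) if m < len(a))
        b.append(Fraction(-s) / a[0])
    return b


if __name__ == "__main__":
    print(inverse([1, -1], 6))      # coefficients of 1/(1 - x)
    print(inverse([1, -1, -1], 6))  # coefficients of 1/(1 - x - x^2), see (1.7)
```

Exact `Fraction` arithmetic is used so that no rounding interferes when the constant term is not $\pm 1$.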

3 Rational power series

The simplest type of power series is a polynomial. Just as quotients of integers are called rational numbers, quotients of polynomials are called rational functions.

Definition 3.1 (Rational power series). A power series $f(x)$ is rational if there are polynomials $p(x), q(x)$ with $q(0) \ne 0$ such that

$$f(x) = \frac{p(x)}{q(x)}.$$

The condition $q(0) \ne 0$ is to guarantee that we can divide by $q(x)$. For example, the geometric series $\sum_{n=0}^{\infty} x^n$ is rational. So is the generating function of the Fibonacci sequence. In the exercises, you will be asked to verify that sums, products and inverses of rational power series are again rational.

Given a rational power series $f(x) = \frac{p(x)}{q(x)}$, you can use the method of long division and partial fractions to find the associated power series.

Example 3.2. Let's find the power series for $f(x) = \frac{x + 8}{x^2 + x - 6}$. Well,

$$f(x) = \frac{x + 8}{(x - 2)(x + 3)} = \frac{A}{x - 2} + \frac{B}{x + 3}.$$

So $x + 8 = A(x + 3) + B(x - 2)$. Here's a neat trick: subbing in $x = 2$ gives $10 = 5A$ so $A = 2$; subbing in $x = -3$ gives $5 = -5B$ so $B = -1$. Therefore,

$$f(x) = \frac{2}{x - 2} - \frac{1}{x + 3}.$$

We now do some algebraic rearrangement to make things look like a geometric sum; in the first summand multiply top and bottom by $-\frac{1}{2}$ and in the second multiply top and bottom by $\frac{1}{3}$. We obtain

$$f(x) = -\frac{1}{1 - \frac{x}{2}} - \frac{1}{3} \cdot \frac{1}{1 - \left(-\frac{x}{3}\right)} = -\sum_{n=0}^{\infty} \frac{1}{2^n} x^n - \frac{1}{3} \sum_{n=0}^{\infty} \frac{(-1)^n}{3^n} x^n = \sum_{n=0}^{\infty} \left[\left(-\frac{1}{3}\right)^{n+1} - \frac{1}{2^n}\right] x^n.$$
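The expansion in Example 3.2 can be double-checked by long division of the coefficient lists. This is a small sketch under assumptions: `series_of_quotient` is a hypothetical helper (not from the notes) that solves $f(x)q(x) = p(x)$ coefficient by coefficient, with polynomials stored lowest degree first.

```python
from fractions import Fraction


def series_of_quotient(p, q, N):
    """First N coefficients of p(x)/q(x) as a power series; requires q[0] != 0.

    Comparing coefficients of x^n in f(x) q(x) = p(x) gives
    q_0 c_n = p_n - (q_1 c_{n-1} + q_2 c_{n-2} + ...).
    """
    if q[0] == 0:
        raise ValueError("q(0) must be non-zero")
    coeffs = []
    for n in range(N):
        p_n = p[n] if n < len(p) else 0
        s = sum(q[m] * coeffs[n - m] for m in range(1, min(n, len(q) - 1) + 1))
        coeffs.append(Fraction(p_n - s) / q[0])
    return coeffs


if __name__ == "__main__":
    # f(x) = (x + 8) / (x^2 + x - 6), i.e. p = 8 + x and q = -6 + x + x^2
    computed = series_of_quotient([8, 1], [-6, 1, 1], 8)
    claimed = [Fraction(-1, 3) ** (n + 1) - Fraction(1, 2 ** n) for n in range(8)]
    print(computed == claimed)
```

The same helper applies to any rational power series in the sense of Definition 3.1, since only $q(0) \ne 0$ is needed.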

Example 3.3. Let's write $f(x) = \frac{1}{x^2 + 2x + 1}$ as a power series. Notice

$$f(x) = \frac{1}{(x + 1)^2} = \frac{d}{dx}\left(\frac{-1}{1 - (-x)}\right) = \frac{d}{dx} \sum_{n=0}^{\infty} (-1)^{n+1} x^n = \sum_{n=1}^{\infty} (-1)^{n+1} n x^{n-1} = \sum_{n=0}^{\infty} (-1)^{n+2} (n + 1) x^n.$$

Exercise 4. Prove that if $f(x), g(x)$ are rational power series, then $f(x) + g(x)$ and $f(x)g(x)$ are rational power series. If $g(0) \ne 0$, show that $\frac{f(x)}{g(x)}$ is a rational power series.

Exercise 5. Write the following rational functions as power series.
1. x 2
2. $\frac{1}{(1 - x)^3}$
3. $\frac{x^2 + 2x + 3}{(1 - x)(1 - 3x)}$

3.1 Rational power series and linear recurrences

Rational power series are closely related to linear recurrences (also called linear difference equations). The rule defining the Fibonacci sequence is a linear recurrence. More formally:

Definition 3.4 (Linear recurrence). A sequence $\{a_n\}$ satisfies a linear recurrence of order $r > 0$ if there exists an integer $k \ge 0$ so that for $n \ge k$

$$a_{n+r} = c_{r-1} a_{n+r-1} + c_{r-2} a_{n+r-2} + \cdots + c_0 a_n \tag{3.1}$$

where $c_0, \ldots, c_{r-1}$ are real numbers.

Notice that if a sequence satisfies the recurrence (3.1), then it is uniquely determined by the terms $a_0, \ldots, a_{k+r-1}$. For instance, the Fibonacci sequence satisfies the second order recurrence $f_{n+2} = f_{n+1} + f_n$ for $n \ge 0$. Our goal is to imitate what we did for the Fibonacci numbers to show that the generating function of a sequence with a linear recurrence is rational.

So let $\{a_n\}$ be a sequence satisfying the linear recurrence (3.1) for $n \ge k$ and let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the generating function. We consider the polynomial

$$q(x) = 1 - c_{r-1} x - c_{r-2} x^2 - \cdots - c_0 x^r.$$

Notice that $q(x)$ has degree $r$, the order of the linear recurrence. For the Fibonacci sequence, this boils down to the polynomial $1 - x - x^2$ we considered earlier. If $n \ge k$, then the coefficient of $x^{n+r}$ in

$$f(x)q(x) = (a_0 + a_1 x + \cdots + a_{n+r-2} x^{n+r-2} + a_{n+r-1} x^{n+r-1} + a_{n+r} x^{n+r} + \cdots)(1 - c_{r-1} x - c_{r-2} x^2 - \cdots - c_0 x^r)$$

is given by

$$a_{n+r} - c_{r-1} a_{n+r-1} - c_{r-2} a_{n+r-2} - \cdots - c_0 a_n = 0$$

where the last equality uses (3.1). Therefore, $f(x)q(x)$ is a polynomial $p(x)$ of degree at most $k + r - 1$ and so $f(x) = \frac{p(x)}{q(x)}$.

Suppose on the other hand $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is a rational power series and $f(x) = \frac{p(x)}{q(x)}$ with $q(x)$ a polynomial of degree $r$. By multiplying top and bottom by a scalar, we may assume $q(x) = 1 - c_{r-1} x - \cdots - c_0 x^r$ for certain constants $c_0, \ldots, c_{r-1}$. Then $f(x)q(x) = p(x)$. If $n + r$ is greater than the degree of $p(x)$, then the coefficient of $x^{n+r}$ in $f(x)q(x)$ is $0$. This coefficient is $a_{n+r} - c_{r-1} a_{n+r-1} - \cdots - c_0 a_n$ by the same computations as above. Therefore, the sequence $\{a_n\}$ satisfies the order $r$ recurrence (3.1) for $n \ge \deg(p(x)) - r + 1$.

We summarize this discussion in a theorem.

Theorem 3.5. A sequence satisfies a linear recurrence if and only if its generating function is rational. More precisely, a sequence $\{a_n\}$ with generating function $f(x)$ satisfies a linear recurrence (3.1) of order $r$ if and only if $f(x) = \frac{p(x)}{q(x)}$ where $q(x)$ has degree $r$. Moreover, the recurrence (3.1) holds for all $n \ge \deg(p(x)) - r + 1$.

Example 3.6. Let's count the number $a_n$ of words of length at most $n$ over the two-letter alphabet $\{0, 1\}$ using a second order linear recurrence. Clearly

$a_0 = 1$, $a_1 = 3$. Now there are $a_{n+1} - a_n$ words of length $n + 1$. Since a word of length $n + 2$ is obtained from a word of length $n + 1$ by appending either a $0$ or a $1$ to the end, we have $a_{n+2} = 2(a_{n+1} - a_n) + a_{n+1} = 3a_{n+1} - 2a_n$. This is a linear recurrence of order 2 starting from $k = 0$. Then $q(x) = 1 - 3x + 2x^2$ and

$$f(x)q(x) = (1 + 3x + a_2 x^2 + \cdots)(1 - 3x + 2x^2) = 1 + 3x - 3x = 1$$

since the above discussion shows that the coefficient of $x^{n+2}$ in $f(x)q(x)$ is zero for $n \ge 0$, as the recurrence has order 2 and starts from $k = 0$. So

$$f(x) = \frac{1}{1 - 3x + 2x^2} = \frac{1}{(1 - x)(1 - 2x)} = \frac{-1}{1 - x} + \frac{2}{1 - 2x}.$$

Therefore,

$$f(x) = \sum_{n=0}^{\infty} (2^{n+1} - 1) x^n$$

and so $a_n = 2^{n+1} - 1$.

Exercise 6. Suppose that the sequence $\{a_n\}$ is given by a 0 =, $a_1 = 5$ and the second order linear recurrence $a_{n+2} = 4a_{n+1} - 3a_n$ for $n \ge 0$. Use generating functions to find an explicit formula for $a_n$.

Exercise 7. Give a formula for the number of words of length at most $n$ over a $k$-letter alphabet using a second order linear recurrence.

Exercise 8. Use a simple geometric sum to count the number of words of length at most $n$ over a $k$-letter alphabet.

3.2 Newton's identities

Let $f(x) = x^m + a_{m-1} x^{m-1} + \cdots + a_0$ be a polynomial with complex roots $r_1, \ldots, r_m$ (with multiplicities). Define a sequence $p_n$ of complex numbers, for $n \ge 1$, by

$$p_n = r_1^n + r_2^n + \cdots + r_m^n.$$

Newton gave a linear recursion for $\{p_n\}_{n=1}^{\infty}$ in terms of the coefficients of $f$. Let's derive it. Let $p(x) = \sum_{n=1}^{\infty} p_n x^n$ be the generating function. Consider the polynomial

$$g(x) = x^m f\left(\frac{1}{x}\right) = 1 + a_{m-1} x + \cdots + a_0 x^m.$$

Since

$$f(x) = \prod_{i=1}^{m} (x - r_i) \tag{3.2}$$

we have

$$g(x) = x^m \prod_{i=1}^{m} \left(\frac{1}{x} - r_i\right) = \prod_{i=1}^{m} (1 - r_i x).$$

Taking logarithms gives $\log g(x) = \sum_{i=1}^{m} \log(1 - r_i x)$. So taking derivatives:

$$\frac{g'(x)}{g(x)} = \frac{d}{dx} \log g(x) = -\sum_{i=1}^{m} \frac{r_i}{1 - r_i x} = -\sum_{n=0}^{\infty} \left(\sum_{i=1}^{m} r_i^{n+1}\right) x^n = -\sum_{n=1}^{\infty} (r_1^n + \cdots + r_m^n) x^{n-1} = -\frac{p(x)}{x}.$$

Therefore, $p(x) = -\frac{x g'(x)}{g(x)}$ is a rational function. Since

$$g'(x) = a_{m-1} + 2a_{m-2} x + \cdots + m a_0 x^{m-1}$$

we obtain:

Theorem 3.7 (Newton). Let $f(x) = x^m + a_{m-1} x^{m-1} + \cdots + a_0$ be a polynomial and let $p(x)$ be the generating function for the sequence $\{p_n\}_{n=1}^{\infty}$ where $p_n$ is the sum of the $n$th powers of the roots of $f(x)$ (with multiplicity). Then

$$p(x) = -\frac{a_{m-1} x + 2a_{m-2} x^2 + \cdots + m a_0 x^m}{1 + a_{m-1} x + \cdots + a_0 x^m}.$$

Consequently, $\{p_n\}_{n=1}^{\infty}$ satisfies the linear recurrence of order $m$:

$$p_{n+m} = -a_{m-1} p_{n+m-1} - a_{m-2} p_{n+m-2} - \cdots - a_0 p_n$$

for $n \ge 1$.

One can in fact use Theorem 3.7 to compute recursively all the $p_n$ from the coefficients of $f(x)$.

Exercise 9. Use the formula from Theorem 3.7 to determine formulas for $p_2$ and $p_3$ in terms of the coefficients of $f$.

Exercise 10. Show that if $p_n = 0$ for $n \ge 1$, then $f(x) = x^m$.

Exercise 11. Show that if $A$ is an $m \times m$ matrix such that $\mathrm{Trace}(A^n) = 0$ for all $n \ge 1$, then $A^m = 0$. Hint: Use the previous exercise and the fact that if $f(x)$ is the characteristic polynomial of $A$, then $f(A) = 0$.
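To see the recursive computation of the $p_n$ in action, one can compare coefficients in $p(x)\,g(x) = -x\,g'(x)$, which is just Theorem 3.7 with the denominator cleared. The sketch below does that; the function name `power_sums` and the input convention (a list `[a_0, ..., a_{m-1}]` of the non-leading coefficients of the monic polynomial $f$) are assumptions made for this illustration.

```python
def power_sums(a, N):
    """Return [p_1, ..., p_N] for the monic polynomial
    f(x) = x^m + a[m-1] x^(m-1) + ... + a[0], where p_n is the sum of the
    n-th powers of the roots of f.

    The coefficient of x^n in p(x) g(x) = -x g'(x) reads
    p_n + a_{m-1} p_{n-1} + ... = -n a_{m-n}  (right-hand side 0 when n > m).
    """
    m = len(a)
    p = [0]  # placeholder for index 0; p(x) has no constant term
    for n in range(1, N + 1):
        rhs = -n * a[m - n] if n <= m else 0
        s = sum(a[m - j] * p[n - j] for j in range(1, min(n - 1, m) + 1))
        p.append(rhs - s)
    return p[1:]


if __name__ == "__main__":
    # f(x) = (x - 1)(x - 2) = x^2 - 3x + 2, so p_n = 1 + 2^n
    print(power_sums([2, -3], 5))
    # f(x) = x^2 - x - 1, whose roots are the Golden Mean and its conjugate
    print(power_sums([-1, -1], 5))
```

For $n > m$ the loop reduces to the order $m$ recurrence stated in Theorem 3.7.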

4 Regular languages and generating functions

Often it is interesting to count the number of words of each length in a language $L$. For instance, $C = \{0, 10\}$ is a prefix code. How many words of length $n$ are there in $C^*$? We shall compute this with generating functions.

Definition 4.1 (Generating function of a language). Let $L \subseteq A^*$ be a language. Then the generating function for $L$ is the power series

$$f_L(x) = \sum_{n=0}^{\infty} a_n x^n$$

where $a_n = |L \cap A^n|$, i.e., the number of words of length $n$ in $L$.

For instance, if $|A| = m$, then there are $m^n$ words of length $n$ and so

$$f_{A^*} = \sum_{n=0}^{\infty} (mx)^n = \frac{1}{1 - mx}.$$

In particular, the generating function is rational. This will always be the case for regular languages. We give two approaches.

4.1 Unambiguous regular expressions

Our first approach is via unambiguous regular expressions.

Definition 4.2 (Unambiguous regular expression). Let $L_1, L_2 \subseteq A^*$.
1. The union $L_1 + L_2$ is called unambiguous if $L_1$ and $L_2$ are disjoint.
2. The product $L_1 L_2$ is called unambiguous if each $w \in L_1 L_2$ can be uniquely written as a product $w = w_1 w_2$ with $w_i \in L_i$, $i = 1, 2$.
3. The Kleene star $L^*$ is called unambiguous if $L$ is a code. One says $L$ is a code if each product $L^n$ is unambiguous ($n \ge 0$) and the union $L^0 + L^1 + \cdots$ is a disjoint (that is, unambiguous) union.
4. A language is called unambiguously regular if it can be built from the base regular languages by finitely many applications of unambiguous union, unambiguous product and unambiguous star.

The advantage of unambiguous regular operations is that the effect of the operation on generating functions is easy to determine.

Proposition 4.3. Let $L_1, L_2 \subseteq A^*$ have respective generating functions $f_{L_1}(x)$ and $f_{L_2}(x)$. Then:
1. If $L_1 + L_2$ is an unambiguous union, then $f_{L_1 + L_2}(x) = f_{L_1}(x) + f_{L_2}(x)$.
2. If $L_1 L_2$ is an unambiguous product, then $f_{L_1 L_2}(x) = f_{L_1}(x) f_{L_2}(x)$.
3. If $L$ is a code, then $f_{L^*}(x) = \frac{1}{1 - f_L(x)}$.

Proof. Let $a_n = |L_1 \cap A^n|$ and $b_n = |L_2 \cap A^n|$.
1. A word of length $n$ in $L_1 + L_2$ comes from either $L_1$ or $L_2$, but not both. So $|(L_1 + L_2) \cap A^n| = a_n + b_n$. Therefore, $f_{L_1 + L_2}(x) = f_{L_1}(x) + f_{L_2}(x)$.
2. A word of length $n$ in $L_1 L_2$ can be uniquely written as a product of a word of length $m$ from $L_1$ with a word of length $n - m$ from $L_2$. So $|L_1 L_2 \cap A^n| = \sum_{m=0}^{n} a_m b_{n-m}$. Thus (2.1) implies
$$f_{L_1 L_2}(x) = \sum_{n=0}^{\infty} \left(\sum_{m=0}^{n} a_m b_{n-m}\right) x^n = f_{L_1}(x) f_{L_2}(x),$$
as required.
3. First note that $L$ a code implies $\varepsilon \notin L$. Therefore the constant term of $f_L(x)$ is $0$ and so $1/(1 - f_L(x))$ makes sense. If $w \in L^*$ has length $m$, then $w \notin L^n$ for $n > m$. Also the smallest degree term of $f_L^n$ has degree at least $n$. So we just need to make sure that $f_{L^*}(x)$ agrees with $1 + f_L(x) + \cdots + f_L(x)^m$ for all terms of degree up to $m$, for each $m \ge 0$. But this follows from the previous two parts since $L^0 + L + \cdots + L^m$ is an unambiguous union of unambiguous products.

Example 4.4. Let $C = \{0, 10\}$. Then $C$ is a prefix code. Clearly $f_C(x) = x + x^2$, so

$$f_{C^*} = \frac{1}{1 - x - x^2}.$$

We recognize this from (1.7) as the generating function for the Fibonacci numbers and so we know that the number of words of length $n$ in $C^*$ is the $n$th Fibonacci number $f_n$. In particular, (1.8) gives an explicit formula for the number of words of length $n$. Notice that $C^*(\varepsilon + 1)$ is the language of all words that do not contain a factor $11$. Indeed, $C^*$ contains all words ending in $0$ with no factor $11$, and the product then breaks things up into those words ending in $0$ and those words ending in $1$. This is an unambiguous regular expression and so

$$f_{C^*(\varepsilon + 1)} = \frac{1 + x}{1 - x - x^2}.$$

Example 4.5. A composition of a natural number $n > 0$ is an ordered sequence of positive numbers $(m_1, \ldots, m_k)$ such that $m_1 + \cdots + m_k = n$. Let's compute a formula for the number of compositions of $n$. Consider the infinite prefix code $C = \{a^k b \mid k \ge 0\} = a^* b$. For $n > 0$ there is a bijection between words of length $n$ in $C^*$ and compositions of $n$ that corresponds the composition $(m_1, \ldots, m_k)$ of $n$ to the word $(a^{m_1 - 1} b)(a^{m_2 - 1} b) \cdots (a^{m_k - 1} b)$ of length $n$ (what is the inverse?). The regular expression $a^* b$ is unambiguous so the generating function for $C$ is

$$f_C = \frac{x}{1 - x}.$$

Thus we have

$$f_{C^*} = \frac{1}{1 - f_C} = \frac{1}{1 - \frac{x}{1 - x}} = \frac{1 - x}{1 - 2x} = \frac{1 - 2x + x}{1 - 2x} = 1 + \frac{x}{1 - 2x} = 1 + \sum_{n=0}^{\infty} 2^n x^{n+1} = 1 + \sum_{n=1}^{\infty} 2^{n-1} x^n.$$

Therefore, there are $2^{n-1}$ compositions of $n$.

Exercise 12. Find the generating function $f_L(x)$ and a formula for the number of words of length $n$ in L for L = {0, 0, }.

Exercise 13. Find a formula for the number of words of length $n$ in the regular language 0. Make sure to justify that you are only using unambiguous products and stars.

4.1.1 Unambiguous regular expressions and rationality

Let us observe that the generating functions for the base regular languages are polynomials. $f_{\emptyset}(x) = 0$.

$f_{\{\varepsilon\}} = 1$. $f_{\{a\}} = x$, $a \in A$.

It now follows from Proposition 4.3 and Exercise 4 that any regular language that is unambiguously regular has a rational generating function. Our next theorem, which is an improvement on Kleene's theorem, says that each regular language is in fact unambiguously regular. The argument is an alternative proof of Kleene's theorem.

Theorem 4.6. Any regular language is unambiguously regular.

Proof. Let $\mathcal{A} = (S, A, \iota, \delta, T)$ be a deterministic finite state automaton recognizing $L$. For $p, q \in S$, let $L_{p,q}$ be the set of non-empty words recognized by the automaton $\mathcal{A}_{p,q} = (S, A, p, \delta, \{q\})$. Then

$$L = \sum_{t \in T} L_{\iota,t} + \Delta_{\iota,T} \quad \text{where} \quad \Delta_{\iota,T} = \begin{cases} \{\varepsilon\} & \text{if } \iota \in T \\ \emptyset & \text{else.} \end{cases}$$

Moreover, this union is unambiguous since $\mathcal{A}$ is deterministic and so a word can bring the initial state to at most one terminal state. So it suffices to show that each $L_{p,q}$ with $p, q \in S$ is unambiguously regular.

For $Q \subseteq S$ and $p, q \in S$, define $L_{p,Q,q}$ to be the set of all non-empty words that label paths from $p$ to $q$ which only pass through states in $Q$, except perhaps the $p$ at the beginning and the $q$ at the end. Then $L_{p,q} = L_{p,S,q}$. We prove that $L_{p,Q,q}$ is unambiguously regular for each $Q \subseteq S$, $p, q \in S$, by induction on $|Q|$. If $Q = \emptyset$, then $L_{p,Q,q}$ is just the set of labels of edges from $p$ to $q$, and so is a subset of $A$ and hence unambiguously regular. Assume the result is true for $|Q| = n$ and now suppose $|Q| = n + 1$. Then $Q = P + \{r\}$ for some state $r \notin P$. The idea is now similar to our old proof of Kleene's theorem. We break paths up according to whether they go through $r$ or not. Then

$$L_{p,Q,q} = L_{p,P,q} + L_{p,P,r} L_{r,P,r}^* L_{r,P,q}. \tag{4.1}$$

By induction, $L_{p,P,q}$, $L_{p,P,r}$, $L_{r,P,r}$ and $L_{r,P,q}$ are unambiguously regular. The union in (4.1) is unambiguous since words in $L_{p,P,q}$ do not pass through $r$ when going from $p$ to $q$, while all words in $L_{p,P,r} L_{r,P,r}^* L_{r,P,q}$ do. The language $L_{r,P,r}$ is a prefix code since it does not contain the empty word and $r \notin P$ implies no proper prefix of an element of $L_{r,P,r}$ belongs to the language.
So $L_{r,P,r}^*$ is an unambiguous star. Finally, the product $L_{p,P,r} L_{r,P,r}^* L_{r,P,q}$ is unambiguous since if $w$ goes from $p$ to $q$ through $r$, it has a unique prefix $x$ that visits $r$ for the first time, a unique suffix $z$ that visits $r$ for the last time, and $w = xyz$ where $y$ reads from $r$ to $r$ going through $Q = P + \{r\}$. It follows $x \in L_{p,P,r}$, $y \in L_{r,P,r}^*$ and $z \in L_{r,P,q}$, and this is the unique factorization of this sort. This completes the induction and the proof of the theorem.

Corollary 4.7. The generating function of any regular language is a rational function. In particular, the number of words of length $n$ in a regular language must satisfy a linear recurrence.

Because of the close relationship between regular languages and rational generating functions, some books call regular languages rational languages. However, there are languages with rational generating function that are not regular. For instance $L = \{0^n 1^n \mid n \ge 0\}$ is not regular. This language has exactly one word of every even length and no words of odd length. So its generating function

$$f_L(x) = 1 + x^2 + x^4 + \cdots = \frac{1}{1 - x^2}$$

is rational. In fact this language has the same generating function as $(0^2)^*$. To obtain the proper relationship between regular languages and rational functions, one has to consider generating functions in several non-commuting variables, which is beyond the scope of this course.

Example 4.8. Let's compute a formula for the number of words of length $n$ in the language $L = \{0, 1\}^* 01 \{0, 1\}^*$. First we need an unambiguous regular expression. A deterministic automaton accepting this language is

[Figure: state diagram of a three-state deterministic automaton]

from which we obtain the unambiguous regular expression $1^* 0 0^* 1 (0 + 1)^*$. Therefore, the generating function for $L$ is given by

$$f_L = \frac{1}{1 - x} \cdot x \cdot \frac{1}{1 - x} \cdot x \cdot \frac{1}{1 - 2x} = \frac{x^2}{(1 - x)^2 (1 - 2x)}.$$

Using the method of partial fractions, one computes

$$f_L(x) = \frac{-1}{(1 - x)^2} + \frac{1}{1 - 2x} = -\frac{d}{dx}\left(\frac{1}{1 - x}\right) + \frac{1}{1 - 2x} = -\sum_{n=1}^{\infty} n x^{n-1} + \sum_{n=0}^{\infty} 2^n x^n = \sum_{n=0}^{\infty} \left(-(n + 1) + 2^n\right) x^n.$$

Thus

$$|L \cap A^n| = 2^n - n - 1.$$

Exercise 14. Let $G$ be a finite group with identity $e$. Let $f : A^* \to G$ be an onto homomorphism. Show that the generating function for the word problem $L = f^{-1}(\{e\})$ is rational. This result is used in probability theory: from it they deduce that the Green's function of a random walk on a finite group is rational.

4.2 A linear algebraic approach

An alternate approach, which works quite well for automata with small numbers of states, is via linear algebra. Let $\mathcal{A} = (S, A, q_1, \delta, T)$ be a deterministic finite state automaton accepting a language $L$. Let $S = \{q_1, \ldots, q_m\}$ where $q_1$ is the initial state. Then, for each $i$, we define $L_{q_i}$ to be the language of the automaton $(S, A, q_i, \delta, T)$; so $L_{q_i}$ consists of all words $w \in A^*$ such that $q_i w \in T$. In particular, $L_{q_1} = L$. Let $f_i = f_{L_{q_i}}$ be the generating function of $L_{q_i}$; so $f_1 = f_L$. The generating functions $f_1, \ldots, f_m$ are closely related, as we shall see momentarily.

Let us first observe that if $q_i$ is a fail state, then no word labels a path from $q_i$ to a terminal state and so $L_{q_i} = \emptyset$, whence $f_i = 0$. Thus we can omit the fail states in what follows (i.e., work with partial deterministic automata).

If $f$ is a generating function, let us write $\langle f, x^n \rangle$ to denote the coefficient of $x^n$ in $f(x)$. So if $f(x) = \sum_{n=0}^{\infty} a_n x^n$, then $\langle f, x^n \rangle = a_n$. Then notice that

$$\langle f_i, x^0 \rangle = \begin{cases} 1 & q_i \in T \\ 0 & \text{else} \end{cases} \tag{4.2}$$

since $\varepsilon \in L_{q_i}$ if and only if $q_i \in T$.

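On the other hand, if $n \ge 0$ and $w \in L_{q_i}$ is a word of length $n + 1$, then $w = au$ with $a \in A$ and $|u| = n$. Since we are dealing with a deterministic automaton, this means that $u \in L_{q_i a}$. Conversely, if $w = au$ with $a \in A$, $|u| = n$ and $u \in L_{q_i a}$, then $q_i w = q_i a u \in T$ and so $w \in L_{q_i}$. Again $q_i a$ is uniquely determined because $\mathcal{A}$ is deterministic. From this, we conclude

$$|L_{q_i} \cap A^{n+1}| = \sum_{a \in A} |L_{q_i a} \cap A^n| = \sum_{j=1}^{m} a_{ij} |L_{q_j} \cap A^n|$$

where

$$a_{ij} = |\{a \in A \mid q_i a = q_j\}|.$$

In other words, for $n \ge 0$,

$$\langle f_i, x^{n+1} \rangle = \sum_{j=1}^{m} a_{ij} \langle f_j, x^n \rangle. \tag{4.3}$$

The matrix $A = (a_{ij})$ is called the adjacency matrix of $\mathcal{A}$. For example, if we consider the automaton $\mathcal{A}$ from Example 4.8, then the adjacency matrix of $\mathcal{A}$ is given by

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} \tag{4.4}$$

where we order the states from left to right.

Let us define

$$\delta_i = \begin{cases} 1 & q_i \in T \\ 0 & q_i \notin T. \end{cases}$$

I.e., $\delta_i = \langle f_i, x^0 \rangle$ by (4.2). Then equations (4.2) and (4.3) can be translated into the following linear system of equations in unknowns $f_1, \ldots, f_m$ and with coefficients polynomials over $\mathbb{R}$:

$$\begin{aligned} f_1 &= \delta_1 + x(a_{11} f_1 + a_{12} f_2 + \cdots + a_{1m} f_m) \\ &\;\;\vdots \\ f_m &= \delta_m + x(a_{m1} f_1 + a_{m2} f_2 + \cdots + a_{mm} f_m) \end{aligned}$$

or in matrix form $F = \Delta + xAF$ where $F = (f_1, \ldots, f_m)^T$ and $\Delta = (\delta_1, \ldots, \delta_m)^T$. Equivalently, we have the system of equations

$$(I - xA)F = \Delta \tag{4.5}$$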

where $I$ is the $m \times m$ identity matrix. Notice that $\det(I - xA)$ is a polynomial in $x$ of degree at most $m$ with constant term $\det(I - 0A) = \det I = 1$. Thus $\det(I - xA)$ is an invertible power series and we can now apply Cramer's rule (which works over any ring provided the determinant is invertible over the ring) to conclude that

$$f_1 = \frac{\det((I - xA)_1)}{\det(I - xA)} \tag{4.6}$$

where $(I - xA)_1$ is the matrix obtained from $I - xA$ by replacing the first column with $\Delta$. Notice that the numerator of (4.6) is also a polynomial in $x$ of degree at most $m$, so this gives another proof that the generating function of a regular language is a rational function.

Example 4.9. Let us revisit Example 4.8. Using (4.4) we obtain $\Delta = (0, 0, 1)^T$,

$$I - xA = \begin{bmatrix} 1 - x & -x & 0 \\ 0 & 1 - x & -x \\ 0 & 0 & 1 - 2x \end{bmatrix}, \qquad (I - xA)_1 = \begin{bmatrix} 0 & -x & 0 \\ 0 & 1 - x & -x \\ 1 & 0 & 1 - 2x \end{bmatrix}$$

and so $\det((I - xA)_1) = x^2$, $\det(I - xA) = (1 - x)^2 (1 - 2x)$, and so we recover

$$f_L(x) = \frac{x^2}{(1 - x)^2 (1 - 2x)}.$$

Example 4.10. This time we return to the example

$$L = \{0, 1\}^* \setminus \left[\{0, 1\}^* 11 \{0, 1\}^*\right]$$

which is recognized by the automaton

$\mathcal{A}$ = [Figure: state diagram of a three-state deterministic automaton]

The last state is a fail state and so does not contribute to the generating function computation. Thus we may remove it and work with the partial

deterministic automaton

$\mathcal{B}$ = [Figure: state diagram of a two-state partial deterministic automaton]

Ordering the states from left to right, we obtain the adjacency matrix

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$$

and $\Delta = (1, 1)^T$. Thus

$$I - xA = \begin{bmatrix} 1 - x & -x \\ -x & 1 \end{bmatrix}, \qquad (I - xA)_1 = \begin{bmatrix} 1 & -x \\ 1 & 1 \end{bmatrix}$$

and so $\det((I - xA)_1) = 1 + x$ and $\det(I - xA) = 1 - x - x^2$. Thus we recover the generating function

$$f_L(x) = \frac{1 + x}{1 - x - x^2}.$$

Unfortunately, this linear algebra method becomes exceedingly more difficult to apply as the number of states increases. An alternative approach to Cramer's rule is to observe that (4.5) can be solved using Gaussian elimination, but when performing row reductions you can only divide by power series (in particular, polynomials) with non-zero constant term.
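When only the numbers of words are wanted, rather than the rational function itself, the relations (4.2) and (4.3) can be iterated directly: the vector of coefficients of $x^{n+1}$ is the adjacency matrix times the vector of coefficients of $x^n$, starting from $\Delta$. The following is a minimal sketch of that idea; it does not carry out Cramer's rule or Gaussian elimination, and the function name `count_words` is an assumption for illustration.

```python
def count_words(adj, delta, N):
    """Return [<f_1, x^n> for n = 0, ..., N-1], the number of words of each
    length accepted from the initial (first) state.

    adj is the adjacency matrix (a_ij) as a list of rows, and delta is the
    0/1 vector marking terminal states, as in (4.2) and (4.3).
    """
    counts = []
    vec = list(delta)  # coefficients of x^0, by (4.2)
    for _ in range(N):
        counts.append(vec[0])
        # coefficients of x^(n+1) from those of x^n, by (4.3)
        vec = [sum(row[j] * vec[j] for j in range(len(vec))) for row in adj]
    return counts


if __name__ == "__main__":
    # Example 4.9: adjacency matrix (4.4), only the last state is terminal
    print(count_words([[1, 1, 0], [0, 1, 1], [0, 0, 2]], [0, 0, 1], 8))
    # Example 4.10: partial automaton with both states terminal
    print(count_words([[1, 1], [1, 0]], [1, 1], 8))
```

The first list should match $2^n - n - 1$ from Example 4.8, and the second should reproduce the counts $1, 2, 3, 5, 8, \ldots$ from the introduction.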