Notes on generating functions in automata theory


Benjamin Steinberg

December 5, 2009

Contents

1 Introduction: Calculus can count
2 Formal power series
3 Rational power series
3.1 Rational power series and linear recurrences
3.2 Newton's identities
4 Regular languages and generating functions
4.1 Unambiguous regular expressions
4.1.1 Unambiguous regular expressions and rationality
4.2 A linear algebraic approach

1 Introduction: Calculus can count

Let $L = \{0,1\}^* \setminus \{0,1\}^* 11 \{0,1\}^*$, the set of binary words with no factor $11$. This is a regular language. Suppose you would like to know how many words of length $n$ belong to this language. It turns out that Taylor series from Calculus can help us.

Let's first try to count this with bare-hands methods. Let $a_n$ be the number of words of length $n$ in $L$. Evidently $a_0 = 1$ since the empty word belongs to $L$. Also $0, 1 \in L$, so $a_1 = 2$. How about length 2? Well, $00, 01, 10 \in L$ but $11 \notin L$, so $a_2 = 3$. Next look at length 3. We have $000, 001, 010, 100, 101$. So $a_3 = 5$.

We can't go on like this forever, so let's try to be smart. Any word $w \in L$ of length at least 2 must either end in $0$ or end in $01$. More precisely, $w$ must be of the form $u0$ or $v01$ with

$u, v \in L$. Now there are $a_{n-1}$ words of length $n$ ending in $0$ and $a_{n-2}$ words of length $n$ ending in $01$. Therefore, we have
$$a_n = a_{n-1} + a_{n-2}, \quad n \geq 2, \qquad a_0 = 1, \; a_1 = 2. \tag{1.1}$$
This is essentially the Fibonacci sequence, except the Fibonacci sequence is given by $1, 1, 2, 3, 5, 8, \ldots$, so our sequence starts from the second element of the Fibonacci sequence. It turns out to be more convenient to calculate a formula for the Fibonacci sequence. We define the Fibonacci sequence $\{f_n\}$ formally by
$$f_n = f_{n-1} + f_{n-2}, \quad n \geq 2, \qquad f_0 = 1, \; f_1 = 1. \tag{1.2}$$
So $a_n = f_{n+1}$. Therefore, to obtain a formula for $a_n$, we just need to get a formula for $f_n$.

How can we get an explicit formula for $f_n$? The extremely clever idea (essentially going back to the 1700s or 1800s) is to encode the sequence via a Taylor series (or, as mathematicians prefer to call it, a power series). So let
$$g(x) = \sum_{n=0}^{\infty} f_n x^n.$$
This is called the generating function of the sequence $\{f_n\}$. Elementary calculus says that
$$f_n = \frac{g^{(n)}(0)}{n!},$$
so if we can identify $g$, we may be able to use derivatives to calculate the $f_n$. Actually, in most cases we can identify $g$ with a function whose power series we know well. The most typical example is the geometric series
$$\frac{1}{1-ax} = 1 + ax + (ax)^2 + \cdots = \sum_{n=0}^{\infty} (ax)^n. \tag{1.3}$$
We can get more examples by differentiating or integrating.

OK, back to our Fibonacci sequence $\{f_n\}$. Consider its generating function
$$g(x) = \sum_{n=0}^{\infty} f_n x^n = 1 + x + 2x^2 + 3x^3 + 5x^4 + 8x^5 + \cdots.$$
For the heck of it, let's compute $g(x)(1 - x - x^2)$. Since
$$x\,g(x) = f_0 x + f_1 x^2 + f_2 x^3 + \cdots = \sum_{n=0}^{\infty} f_n x^{n+1} = \sum_{n=1}^{\infty} f_{n-1} x^n \tag{1.4}$$
$$x^2 g(x) = f_0 x^2 + f_1 x^3 + \cdots = \sum_{n=0}^{\infty} f_n x^{n+2} = \sum_{n=2}^{\infty} f_{n-2} x^n \tag{1.5}$$

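we get the equality
$$g(x)(1 - x - x^2) = \sum_{n=0}^{\infty} f_n x^n - \sum_{n=1}^{\infty} f_{n-1} x^n - \sum_{n=2}^{\infty} f_{n-2} x^n = f_0 + f_1 x - f_0 x + \sum_{n=2}^{\infty} (f_n - f_{n-1} - f_{n-2}) x^n. \tag{1.6}$$
Using the recursive formula (1.2) and $f_0 = 1 = f_1$, (1.6) becomes $g(x)(1 - x - x^2) = 1$, yielding the formula
$$g(x) = \frac{1}{1 - x - x^2}. \tag{1.7}$$
Now you can see that the choice $1 - x - x^2$ was not at all random. According to (1.4), multiplying $g(x)$ by $x$ has the effect of lowering the indices by 1, while (1.5) shows multiplying by $x^2$ lowers the indices by two. Since our recursion expresses the coefficients of $g(x)$ in terms of the previous two indices, our polynomial $1 - x - x^2$ does exactly the job of killing off all but the initial terms.

Now, let us find a partial fraction decomposition of $\frac{1}{1-x-x^2}$. The roots of $1 - x - x^2$ are $\frac{-1 \pm \sqrt{5}}{2}$. Let $\alpha = \frac{-1+\sqrt{5}}{2}$ and $\beta = \frac{-1-\sqrt{5}}{2}$. Then
$$\frac{1}{1-x-x^2} = \frac{1}{(x-\alpha)(\beta-x)} = \frac{-1}{(x-\alpha)(x-\beta)} = \frac{A}{x-\alpha} + \frac{B}{x-\beta}.$$
This gives us the equations
$$A + B = 0, \qquad A\beta + B\alpha = 1.$$
So $A = -B$ and $B(-\beta + \alpha) = 1$. But $-\beta + \alpha = \sqrt{5}$, so $B = \frac{1}{\sqrt{5}}$ and $A = -\frac{1}{\sqrt{5}}$. Therefore, we obtain
$$\frac{1}{1-x-x^2} = \frac{1}{\sqrt{5}}\left(\frac{-1}{x-\alpha} + \frac{1}{x-\beta}\right) = \frac{1}{\sqrt{5}}\left(\frac{\alpha^{-1}}{1-\alpha^{-1}x} - \frac{\beta^{-1}}{1-\beta^{-1}x}\right)$$
$$= \frac{1}{\sqrt{5}}\left(\sum_{n=0}^{\infty} (\alpha^{-1})^{n+1} x^n - \sum_{n=0}^{\infty} (\beta^{-1})^{n+1} x^n\right) = \frac{1}{\sqrt{5}}\sum_{n=0}^{\infty} \left[(\alpha^{-1})^{n+1} - (\beta^{-1})^{n+1}\right] x^n.$$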

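To obtain our final answer, we first need to compute $\alpha^{-1}$ and $\beta^{-1}$. In fact,
$$\alpha^{-1} = \frac{2}{-1+\sqrt{5}} = \frac{2(1+\sqrt{5})}{(\sqrt{5}-1)(\sqrt{5}+1)} = \frac{1+\sqrt{5}}{2}, \qquad \beta^{-1} = \frac{-2}{1+\sqrt{5}} = \frac{-2(\sqrt{5}-1)}{(\sqrt{5}+1)(\sqrt{5}-1)} = \frac{1-\sqrt{5}}{2}.$$
Putting it all together, we obtain
$$g(x) = \frac{1}{\sqrt{5}}\sum_{n=0}^{\infty}\left[\left(\frac{1+\sqrt{5}}{2}\right)^{n+1} - \left(\frac{1-\sqrt{5}}{2}\right)^{n+1}\right] x^n.$$
The formula for the $n$th Fibonacci number is then given by
$$f_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^{n+1} - \left(\frac{1-\sqrt{5}}{2}\right)^{n+1}\right].$$
The amazing thing about this formula is that despite all the $\sqrt{5}$'s, the answer is always an integer! The number $\varphi = \frac{1+\sqrt{5}}{2}$ is called the Golden Mean (look it up on Google!). This number fascinated the ancient Greeks, as well as Leonardo da Vinci (it even appears in The Da Vinci Code!). Our formula says
$$f_n = \frac{\varphi^{n+1} - (1-\varphi)^{n+1}}{\sqrt{5}}. \tag{1.8}$$
In fact the ratio of consecutive Fibonacci numbers converges to the Golden Mean.

Theorem 1.1. The ratio of the Fibonacci numbers converges to the Golden Mean. That is,
$$\lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \varphi.$$

Proof. First observe that $|1 - \varphi| < 1$. Therefore, by (1.8), we have
$$\lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \lim_{n \to \infty} \frac{\varphi^{n+2} - (1-\varphi)^{n+2}}{\varphi^{n+1} - (1-\varphi)^{n+1}} = \lim_{n \to \infty} \frac{\varphi^{n+2}}{\varphi^{n+1}} \quad (\text{since } |1-\varphi| < 1) \quad = \varphi,$$
as required.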
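As a sanity check on the introduction, the following minimal Python sketch compares three things: the Fibonacci recurrence with $f_0 = f_1 = 1$, the closed formula in terms of the Golden Mean, and a brute-force count of binary words with no factor 11. The function names are illustrative choices, not part of the notes, and the closed formula is evaluated in floating point, so it is only trusted here for small $n$.

```python
from itertools import product
from math import sqrt

def fib(n):
    """f_n with f_0 = f_1 = 1, computed from the recurrence f_n = f_{n-1} + f_{n-2}."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed(n):
    """f_n from the closed formula; rounding absorbs floating-point error (small n only)."""
    phi = (1 + sqrt(5)) / 2
    return round((phi ** (n + 1) - (1 - phi) ** (n + 1)) / sqrt(5))

def count_no_11(n):
    """Brute-force count of binary words of length n that do not contain the factor 11."""
    return sum("11" not in "".join(w) for w in product("01", repeat=n))

for n in range(12):
    assert fib(n) == fib_closed(n)        # closed formula agrees with the recurrence
    assert count_no_11(n) == fib(n + 1)   # a_n = f_{n+1}
```

The brute-force count is exponential in $n$ and is only meant to confirm the first few values $1, 2, 3, 5, \ldots$ computed by hand above.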

The plan for the rest of these notes is as follows. First we develop the general theory of power series and generating functions. In particular we focus on the class of rational generating functions. Then we show that the generating function of a regular language is a rational generating function.

2 Formal power series

We begin by properly defining a power series. In these notes we won't be concerned about the convergence of these series, although the radius of convergence does give you important information about the growth of the coefficients.

Definition 2.1 (Formal power series). A formal power series is a formal sum $f(x) = \sum_{n=0}^{\infty} a_n x^n$ where the $a_n$ are real numbers. Two power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $g(x) = \sum_{n=0}^{\infty} b_n x^n$ are said to be equal if their coefficients agree, that is, $a_n = b_n$ for all $n \geq 0$.

Since we don't consider convergence, it doesn't make sense to evaluate $f$ at a real number, with the exception of the point $x = 0$. The number $f(0) = a_0$ is clearly well defined.

A polynomial is a formal power series with only finitely many non-zero terms. We often identify constant polynomials with real numbers. In particular, the power series $0$ is the power series with all coefficients 0, whereas the power series $1$ is the power series with constant term 1 and all other terms 0.

One can define the derivative of a formal power series in the obvious way:
$$f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}.$$
Of course $f^{(n)}(x)$, the $n$th derivative of $f$, is defined by taking $n$ derivatives. It is then a formal calculation to verify that Taylor's formula holds.

Theorem 2.2 (Taylor's Formula). If $f(x) = \sum_{n=0}^{\infty} a_n x^n$, then
$$a_n = \frac{f^{(n)}(0)}{n!}.$$

This formula should not be confused with Taylor's theorem from Calculus, which gives a good bound on the error of approximating a function by a Taylor polynomial.

One can add power series in the usual way. If $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots$, then
$$f(x) + g(x) = a_0 + b_0 + (a_1 + b_1)x + (a_2 + b_2)x^2 + \cdots.$$

In formulas, we have
$$f(x) + g(x) = \sum_{n=0}^{\infty} (a_n + b_n) x^n.$$
The negative of a power series is obtained by negating all the terms: $-f(x) = -a_0 - a_1 x - a_2 x^2 - \cdots$.

Multiplication of power series is a bit more complicated. If $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ and $g(x) = b_0 + b_1 x + b_2 x^2 + \cdots$, then
$$f(x)g(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots.$$
This boils down to the formula
$$f(x)g(x) = \sum_{n=0}^{\infty} \left(\sum_{m=0}^{n} a_m b_{n-m}\right) x^n. \tag{2.1}$$
What this formula says is that to get the coefficient of $x^n$ you look at all pairs of numbers $k, l$ with $k + l = n$ and add up the corresponding products $a_k b_l$.

As an example, consider $(1 - x)(1 + x + x^2 + \cdots)$. Playing with the product symbolically, we obtain $1 + x - x + x^2 - x^2 + \cdots = 1$. Let's try to do this rigorously using (2.1). Here we have $a_0 = 1$, $a_1 = -1$ (with all other $a_n = 0$) and all $b_n = 1$. The coefficient of $x^0$ is just $a_0 b_0 = 1$. For $n \geq 1$, the coefficient of $x^n$ reduces to $a_0 b_n + a_1 b_{n-1} = b_n - b_{n-1} = 1 - 1 = 0$. Therefore, $f(x)g(x) = 1$. This shows the power series $1 - x$ is invertible, or more precisely
$$\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n.$$

Definition 2.3 (Invertible power series). We say that a power series $f(x)$ is invertible if there is a power series $g(x)$ such that $f(x)g(x) = 1$.

Suppose that $f(0) = 0$. Then $f(x)g(x)$ evaluated at 0 is $f(0)g(0) = 0$. Therefore, $f(x)g(x) \neq 1$. The upshot is that we have just shown that the constant term of an invertible power series must be non-zero. It turns out that a power series $f(x)$ is invertible precisely when $f(0) \neq 0$. To prove this we would like to show that if $f$ is a power series, then
$$\frac{1}{1-f} = \sum_{n=0}^{\infty} f^n.$$

But let's not be too hasty. For instance, if $f(x) = 1 - x$ then
$$\sum_{n=0}^{\infty} f^n = 1 + (1 - x) + (1 - 2x + x^2) + (1 - 3x + 3x^2 - x^3) + \cdots$$
and so the constant term is a sum of infinitely many 1s, an impossibility. The problem here is that $f(x)$ has a non-zero constant term.

Suppose that $f(0) = 0$, so $f(x) = a_1 x + a_2 x^2 + \cdots$. Then $f(x)^n = a_1^n x^n + \cdots$ where all the other terms have higher order than $n$. So if you try to compute $1 + f + f^2 + \cdots$ you will never have to add up infinitely many real numbers, and so the power series $\sum_{n=0}^{\infty} f^n$ makes sense. In fact, the coefficient of $x^n$ in $1 + f + f^2 + \cdots$ agrees with the coefficient of $x^n$ in $1 + f + \cdots + f^n$ because $f^{n+1}$, $f^{n+2}$, etc., only contribute terms of higher order than $n$.

So assume that $f(0) = 0$ and let us compute $(1 - f)(1 + f + f^2 + \cdots)$. The constant term is clearly 1 (since $f(0) = 0$). Formally, we have
$$1 - f + f - f^2 + f^2 - \cdots = 1.$$
More rigorously, if we want to show that the coefficient of $x^n$ is 0 in this product for $n \geq 1$, it suffices to compute the coefficient of $x^n$ in $(1 - f)(1 + f + \cdots + f^n)$, since $f^{n+1}$, etc., only contribute terms of higher order. But a telescoping argument yields
$$(1 - f)(1 + f + \cdots + f^n) = 1 - f + f - f^2 + \cdots - f^n + f^n - f^{n+1} = 1 - f^{n+1}$$
and since all terms of $f^{n+1}$ are of order at least $n + 1$, we see that $(1 - f)(1 + f + \cdots + f^n)$ has 0 as the coefficient of $x^n$. This allows us to rigorously conclude that $1/(1 - f) = 1 + f + f^2 + \cdots$. We record this as a proposition.

Proposition 2.4. Suppose $f$ is a power series with $f(0) = 0$. Then
$$\frac{1}{1-f} = \sum_{n=0}^{\infty} f^n.$$

Now we are ready to complete our characterization of invertible power series.

Theorem 2.5. A power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is invertible if and only if $a_0 \neq 0$, i.e., $f(0) \neq 0$.

Proof. We already saw that if $f(0) = 0$, then $f$ is not invertible. Conversely, suppose $a_0 = f(0) \neq 0$. Clearly $f$ is invertible if and only if $f/a_0$ is invertible, so we may assume without loss of generality that $f(0) = 1$. Let $g(x) = 1 - f(x)$. Notice that $1 - g = f$. Since $g(0) = 0$, Proposition 2.4 shows
$$\sum_{n=0}^{\infty} g^n = \frac{1}{1-g} = \frac{1}{f}.$$
This completes the proof.

Now we can formally define a generating function.

Definition 2.6 (Generating function). If $\{a_n\}$ is a sequence of numbers, the generating function for the sequence is the power series
$$f(x) = \sum_{n=0}^{\infty} a_n x^n.$$

Exercise 1. Verify the following properties of power series.
1. $f + g = g + f$
2. $(f + g) + h = f + (g + h)$
3. $f + 0 = f$
4. $f - f = 0$
5. $1 \cdot f = f$
6. $(fg)h = f(gh)$
7. $f(g + h) = fg + fh$

Exercise 2. Prove Taylor's formula.

Exercise 3. Show that every formal power series is a generating function.
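The arithmetic of this section can be tried out on truncated series. The sketch below stores a power series as a list of its first few coefficients, multiplies with the convolution formula for $f(x)g(x)$, and inverts a series with non-zero constant term as in Theorem 2.5. Note one substitution: the inverse is found by comparing coefficients in $f g = 1$ one degree at a time, not by summing the geometric series used in the proof. The names `mul` and `inverse` are illustrative assumptions.

```python
from fractions import Fraction

def mul(f, g):
    """Product of two truncated series given as equal-length coefficient lists."""
    return [sum(f[m] * g[n - m] for m in range(n + 1)) for n in range(len(f))]

def inverse(f):
    """First len(f) coefficients of 1/f; requires f[0] != 0 (Theorem 2.5)."""
    if f[0] == 0:
        raise ValueError("constant term is zero, so the series is not invertible")
    g = [Fraction(1) / f[0]]
    for n in range(1, len(f)):
        # The coefficient of x^n in f*g must vanish:
        # f[0]*g[n] + sum_{m=1}^{n} f[m]*g[n-m] = 0.
        s = sum(f[m] * g[n - m] for m in range(1, n + 1))
        g.append(-s / f[0])
    return g

one_minus_x = [Fraction(1), Fraction(-1)] + [Fraction(0)] * 6
geometric = inverse(one_minus_x)                                          # 1/(1 - x)
fib_series = inverse([Fraction(c) for c in (1, -1, -1, 0, 0, 0, 0, 0)])   # 1/(1 - x - x^2)
```

Exact rational arithmetic via `Fraction` keeps the coefficients free of rounding error, which matters when the constant term is not 1.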

3 Rational power series

The simplest type of power series is a polynomial. Just as quotients of integers are called rational numbers, quotients of polynomials are called rational functions.

Definition 3.1 (Rational power series). A power series $f(x)$ is rational if there are polynomials $p(x)$, $q(x)$ with $q(0) \neq 0$ such that
$$f(x) = \frac{p(x)}{q(x)}.$$

The condition $q(0) \neq 0$ is to guarantee that we can divide by $q(x)$. For example, the geometric series $\sum_{n=0}^{\infty} x^n$ is rational. So is the generating function of the Fibonacci sequence. In the exercises, you will be asked to verify that sums, products and inverses of rational power series are again rational.

Given a rational power series $f(x) = \frac{p(x)}{q(x)}$, you can use the method of long division and partial fractions to find the associated power series.

Example 3.2. Let's find the power series for $f(x) = \frac{x+8}{x^2+x-6}$. Well,
$$f(x) = \frac{x+8}{(x-2)(x+3)} = \frac{A}{x-2} + \frac{B}{x+3}.$$
So $x + 8 = A(x + 3) + B(x - 2)$. Here's a neat trick: subbing in $x = 2$ gives $10 = 5A$ so $A = 2$; subbing in $x = -3$ gives $5 = -5B$ so $B = -1$. Therefore,
$$f(x) = \frac{2}{x-2} - \frac{1}{x+3}.$$
We now do some algebraic rearrangement to make things look like a geometric sum; in the first fraction multiply top and bottom by $-\frac{1}{2}$ and in the second multiply top and bottom by $\frac{1}{3}$. We obtain
$$f(x) = -\frac{1}{1 - \frac{x}{2}} - \frac{1}{3} \cdot \frac{1}{1 - \left(-\frac{x}{3}\right)} = -\sum_{n=0}^{\infty} \frac{1}{2^n} x^n - \frac{1}{3}\sum_{n=0}^{\infty} \left(-\frac{1}{3}\right)^n x^n = \sum_{n=0}^{\infty} \left[\left(-\frac{1}{3}\right)^{n+1} - \frac{1}{2^n}\right] x^n.$$
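The coefficient formula of Example 3.2 can be double-checked by expanding $p(x)/q(x)$ directly. The sketch below is one way to do the "long division": it solves $f(x)q(x) = p(x)$ for the coefficients of $f$ one degree at a time. The helper name `series` and the low-degree-first list convention are assumptions made for this illustration.

```python
from fractions import Fraction

def series(p, q, n_terms):
    """First n_terms coefficients of p(x)/q(x), where q[0] != 0.

    Polynomials are coefficient lists with the constant term first.
    """
    coeffs = []
    for n in range(n_terms):
        p_n = p[n] if n < len(p) else 0
        # Coefficient of x^n in f*q equals p_n; solve for the new coefficient of f.
        s = sum(q[m] * coeffs[n - m] for m in range(1, min(n, len(q) - 1) + 1))
        coeffs.append(Fraction(p_n - s) / q[0])
    return coeffs

def closed(n):
    """Coefficient of x^n from the partial fraction computation in Example 3.2."""
    return Fraction(-1, 3) ** (n + 1) - Fraction(1, 2 ** n)

expansion = series([8, 1], [-6, 1, 1], 10)   # (x + 8) / (x^2 + x - 6)
assert expansion == [closed(n) for n in range(10)]
```

The first coefficient is $-4/3$, which matches $f(0) = 8/(-6)$.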

Example 3.3. Let's write $f(x) = \frac{1}{x^2+2x+1}$ as a power series. Notice
$$f(x) = \frac{1}{(x+1)^2} = \frac{d}{dx}\left(\frac{-1}{1+x}\right) = \frac{d}{dx}\left(\frac{-1}{1-(-x)}\right) = \frac{d}{dx}\sum_{n=0}^{\infty} (-1)^{n+1} x^n = \sum_{n=1}^{\infty} (-1)^{n+1} n x^{n-1} = \sum_{n=0}^{\infty} (-1)^{n+2} (n+1) x^n.$$

Exercise 4. Prove that if $f(x)$, $g(x)$ are rational power series, then $f(x) + g(x)$ and $f(x)g(x)$ are rational power series. If $g(0) \neq 0$, show that $\frac{f(x)}{g(x)}$ is a rational power series.

Exercise 5. Write the following rational functions as power series.
1. x 2
2. $\frac{1}{(1-x)^3}$
3. $\frac{x^2+2x+3}{(1-x)(1-3x)}$

3.1 Rational power series and linear recurrences

Rational power series are closely related to linear recurrences (also called linear difference equations). The rule defining the Fibonacci sequence is a linear recurrence. More formally:

Definition 3.4 (Linear recurrence). A sequence $\{a_n\}$ satisfies a linear recurrence of order $r > 0$ if there exists an integer $k \geq 0$ so that for $n \geq k$
$$a_{n+r} = c_{r-1} a_{n+r-1} + c_{r-2} a_{n+r-2} + \cdots + c_0 a_n \tag{3.1}$$
where $c_0, \ldots, c_{r-1}$ are real numbers.

Notice that if a sequence satisfies the recurrence (3.1), then it is uniquely determined by the terms $a_0, \ldots, a_{k+r-1}$. For instance, the Fibonacci sequence satisfies the second order recurrence $f_{n+2} = f_{n+1} + f_n$ for $n \geq 0$.

Our goal is to imitate what we did for the Fibonacci numbers to show that the generating function of a sequence with a linear recurrence is rational.

So let $\{a_n\}$ be a sequence satisfying the linear recurrence (3.1) for $n \geq k$ and let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the generating function. We consider the polynomial
$$q(x) = 1 - c_{r-1} x - c_{r-2} x^2 - \cdots - c_0 x^r.$$
Notice that $q(x)$ has degree $r$, the order of the linear recurrence. For the Fibonacci sequence, this boils down to the polynomial $1 - x - x^2$ we considered earlier. If $n \geq k$, then the coefficient of $x^{n+r}$ in
$$f(x)q(x) = \left(a_0 + a_1 x + \cdots + a_{r+n-2} x^{n+r-2} + a_{r+n-1} x^{n+r-1} + a_{r+n} x^{n+r} + \cdots\right)\left(1 - c_{r-1} x - c_{r-2} x^2 - \cdots - c_0 x^r\right)$$
is given by
$$a_{n+r} - c_{r-1} a_{n+r-1} - c_{r-2} a_{n+r-2} - \cdots - c_0 a_n = 0$$
where the last equality uses (3.1). Therefore, $f(x)q(x)$ is a polynomial $p(x)$ of degree at most $k + r - 1$ and so $f(x) = \frac{p(x)}{q(x)}$.

Suppose on the other hand that $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is a rational power series and $f(x) = \frac{p(x)}{q(x)}$ with $q(x)$ a polynomial of degree $r$. By multiplying top and bottom by a scalar, we may assume $q(x) = 1 - c_{r-1} x - \cdots - c_0 x^r$ for certain constants $c_0, \ldots, c_{r-1}$. Then $f(x)q(x) = p(x)$. If $n + r$ is greater than the degree of $p(x)$, then the coefficient of $x^{n+r}$ in $f(x)q(x)$ is 0. This coefficient is $a_{n+r} - c_{r-1} a_{n+r-1} - \cdots - c_0 a_n$ by the same computations as above. Therefore, the sequence $\{a_n\}$ satisfies the order $r$ recurrence (3.1) for $n \geq \deg(p(x)) - r + 1$. We summarize this discussion in a theorem.

Theorem 3.5. A sequence satisfies a linear recurrence if and only if its generating function is rational. More precisely, a sequence $\{a_n\}$ with generating function $f(x)$ satisfies a linear recurrence (3.1) of order $r$ if and only if $f(x) = \frac{p(x)}{q(x)}$ where $q(x)$ has degree $r$. Moreover, the recurrence (3.1) holds for all $n \geq \deg(p(x)) - r + 1$.

Example 3.6. Let's count the number $a_n$ of words of length at most $n$ over the two-letter alphabet $\{0, 1\}$ using a second order linear recurrence. Clearly

$a_0 = 1$, $a_1 = 3$. Now there are $a_{n+1} - a_n$ words of length $n + 1$. Since a word of length $n + 2$ is obtained from a word of length $n + 1$ by appending either a 0 or a 1 to the end, we have $a_{n+2} = 2(a_{n+1} - a_n) + a_{n+1} = 3a_{n+1} - 2a_n$. This is a linear recurrence of order 2 starting from $k = 0$. Then $q(x) = 1 - 3x + 2x^2$ and
$$f(x)q(x) = (1 + 3x + a_2 x^2 + \cdots)(1 - 3x + 2x^2) = 1 + 3x - 3x = 1$$
since the above discussion shows that the coefficient of $x^{n+2}$ in $f(x)q(x)$ is zero for $n \geq 0$, as the recurrence has order 2 and starts from $k = 0$. So
$$f(x) = \frac{1}{1 - 3x + 2x^2} = \frac{1}{(1-x)(1-2x)} = \frac{-1}{1-x} + \frac{2}{1-2x}.$$
Therefore,
$$f(x) = \sum_{n=0}^{\infty} (2^{n+1} - 1) x^n$$
and so $a_n = 2^{n+1} - 1$.

Exercise 6. Suppose that the sequence $\{a_n\}$ is given by $a_0 = 1$, $a_1 = 5$ and the second order linear recurrence $a_{n+2} = 4a_{n+1} - 3a_n$ for $n \geq 0$. Use generating functions to find an explicit formula for $a_n$.

Exercise 7. Give a formula for the number of words of length at most $n$ over a $k$-letter alphabet using a second order linear recurrence.

Exercise 8. Use a simple geometric sum to count the number of words of length at most $n$ over a $k$-letter alphabet.

3.2 Newton's identities

Let $f(x) = x^m + a_{m-1} x^{m-1} + \cdots + a_0$ be a polynomial with complex roots $r_1, \ldots, r_m$ (with multiplicities). Define a sequence $p_n$ of complex numbers, for $n \geq 1$, by
$$p_n = r_1^n + r_2^n + \cdots + r_m^n.$$
Newton gave a linear recursion for $\{p_n\}_{n=1}^{\infty}$ in terms of the coefficients of $f$. Let's derive it.

Let $p(x) = \sum_{n=1}^{\infty} p_n x^n$ be the generating function. Consider the polynomial
$$g(x) = x^m f\!\left(\tfrac{1}{x}\right) = 1 + a_{m-1} x + \cdots + a_0 x^m.$$
Since
$$f(x) = \prod_{i=1}^{m} (x - r_i) \tag{3.2}$$

we have
$$g(x) = x^m \prod_{i=1}^{m} \left(\frac{1}{x} - r_i\right) = \prod_{i=1}^{m} (1 - r_i x).$$
Taking logarithms gives $\log g(x) = \sum_{i=1}^{m} \log(1 - r_i x)$. So taking derivatives:
$$\frac{g'(x)}{g(x)} = \frac{d}{dx} \log g(x) = \sum_{i=1}^{m} \frac{-r_i}{1 - r_i x} = -\sum_{n=0}^{\infty} \left(\sum_{i=1}^{m} r_i^{n+1}\right) x^n = -\sum_{n=0}^{\infty} (r_1^{n+1} + \cdots + r_m^{n+1}) x^n = -\frac{p(x)}{x}.$$
Therefore, $p(x) = -\frac{x g'(x)}{g(x)}$ is a rational function. Since
$$g'(x) = a_{m-1} + 2a_{m-2} x + \cdots + m a_0 x^{m-1}$$
we obtain:

Theorem 3.7 (Newton). Let $f(x) = x^m + a_{m-1} x^{m-1} + \cdots + a_0$ be a polynomial and let $p(x)$ be the generating function for the sequence $\{p_n\}_{n=1}^{\infty}$ where $p_n$ is the sum of the $n$th powers of the roots of $f(x)$ (with multiplicity). Then
$$p(x) = -\frac{a_{m-1} x + 2a_{m-2} x^2 + \cdots + m a_0 x^m}{1 + a_{m-1} x + \cdots + a_0 x^m}.$$
Consequently, $\{p_n\}_{n=1}^{\infty}$ satisfies the linear recurrence of order $m$:
$$p_{n+m} = -a_{m-1} p_{n+m-1} - a_{m-2} p_{n+m-2} - \cdots - a_0 p_n$$
for $n \geq 1$.

One can in fact use Theorem 3.7 to compute recursively all the $p_n$ from the coefficients of $f(x)$.

Exercise 9. Use the formula from Theorem 3.7 to determine formulas for $p_2$ and $p_3$ in terms of the coefficients of $f$.

Exercise 10. Show that if $p_n = 0$ for all $n \geq 1$, then $f(x) = x^m$.

Exercise 11. Show that if $A$ is an $m \times m$ matrix such that $\mathrm{Trace}(A^n) = 0$ for all $n \geq 1$, then $A^m = 0$. Hint: Use the previous exercise and the fact that if $f(x)$ is the characteristic polynomial of $A$, then $f(A) = 0$.
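The remark that Theorem 3.7 lets one compute the $p_n$ recursively can be made concrete. Comparing coefficients of $x^n$ on both sides of $p(x)g(x) = -x g'(x)$ gives each $p_n$ in terms of earlier ones; the minimal sketch below does exactly that. The function name `power_sums` and the convention that `a[i]` holds $a_i$ are assumptions made for this illustration.

```python
def power_sums(a, n_max):
    """Return [p_1, ..., p_n_max] for the monic polynomial
    f(x) = x^m + a[m-1] x^(m-1) + ... + a[0], without finding its roots."""
    m = len(a)
    g = [1] + a[::-1]      # g(x) = x^m f(1/x) = 1 + a[m-1] x + ... + a[0] x^m
    p = [0]                # placeholder so that p[n] holds p_n
    for n in range(1, n_max + 1):
        rhs = -n * g[n] if n <= m else 0      # coefficient of x^n in -x g'(x)
        # Coefficient of x^n in p(x) g(x) is p_n + sum_{j>=1} g_j p_{n-j}.
        s = sum(g[j] * p[n - j] for j in range(1, min(n - 1, m) + 1))
        p.append(rhs - s)
    return p[1:]

# f(x) = x^2 - 3x + 2 = (x - 1)(x - 2), so p_n = 1 + 2^n.
assert power_sums([2, -3], 5) == [1 + 2 ** n for n in range(1, 6)]
```

For $n > m$ the right-hand side vanishes and the loop reduces to the order $m$ recurrence stated in the theorem.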

4 Regular languages and generating functions

Often it is interesting to count the number of words of each length in a language $L$. For instance, $C = \{0, 10\}$ is a prefix code. How many words of length $n$ are there in $C^*$? We shall compute this with generating functions.

Definition 4.1 (Generating function of a language). Let $L \subseteq A^*$ be a language. Then the generating function for $L$ is the power series
$$f_L(x) = \sum_{n=0}^{\infty} a_n x^n$$
where $a_n = |L \cap A^n|$, i.e., the number of words of length $n$ in $L$.

For instance, if $|A| = m$, then there are $m^n$ words of length $n$ and so
$$f_{A^*} = \sum_{n=0}^{\infty} (mx)^n = \frac{1}{1 - mx}.$$
In particular, the generating function is rational. This will always be the case for regular languages. We give two approaches.

4.1 Unambiguous regular expressions

Our first approach is via unambiguous regular expressions.

Definition 4.2 (Unambiguous regular expression). Let $L_1, L_2 \subseteq A^*$.
1. The union $L_1 + L_2$ is called unambiguous if $L_1$ and $L_2$ are disjoint.
2. The product $L_1 L_2$ is called unambiguous if each $w \in L_1 L_2$ can be uniquely written as a product $w = w_1 w_2$ with $w_i \in L_i$, $i = 1, 2$.
3. The Kleene star $L^*$ is called unambiguous if $L$ is a code. One says $L$ is a code if each product $L^n$ is unambiguous ($n \geq 0$) and the union $L^0 + L^1 + \cdots$ is a disjoint (that is, unambiguous) union.
4. A language is called unambiguously regular if it can be built from the base regular languages by finitely many applications of unambiguous union, unambiguous product and unambiguous star.

The advantage of unambiguous regular operations is that the effect of the operation on generating functions is easy to determine.

Proposition 4.3. Let $L_1, L_2 \subseteq A^*$ have respective generating functions $f_{L_1}(x)$ and $f_{L_2}(x)$. Then:
1. If $L_1 + L_2$ is an unambiguous union, then $f_{L_1 + L_2}(x) = f_{L_1}(x) + f_{L_2}(x)$.
2. If $L_1 L_2$ is an unambiguous product, then $f_{L_1 L_2}(x) = f_{L_1}(x) f_{L_2}(x)$.
3. If $L$ is a code, then $f_{L^*}(x) = \frac{1}{1 - f_L(x)}$.

Proof. Let $a_n = |L_1 \cap A^n|$ and $b_n = |L_2 \cap A^n|$.

1. A word of length $n$ in $L_1 + L_2$ comes from either $L_1$ or $L_2$, but not both. So $|(L_1 + L_2) \cap A^n| = a_n + b_n$. Therefore, $f_{L_1 + L_2}(x) = f_{L_1}(x) + f_{L_2}(x)$.

2. A word of length $n$ in $L_1 L_2$ can be uniquely written as a product of a word of length $m$ from $L_1$ with a word of length $n - m$ from $L_2$. So $|L_1 L_2 \cap A^n| = \sum_{m=0}^{n} a_m b_{n-m}$. Thus (2.1) implies
$$f_{L_1 L_2}(x) = \sum_{n=0}^{\infty} \left(\sum_{m=0}^{n} a_m b_{n-m}\right) x^n = f_{L_1}(x) f_{L_2}(x),$$
as required.

3. First note that $L$ being a code implies $\varepsilon \notin L$. Therefore the constant term of $f_L(x)$ is 0 and so $1/(1 - f_L(x))$ makes sense. If $w \in L^*$ has length $m$, then $w \notin L^n$ for $n > m$. Also the smallest degree term of $f_L^n$ has degree at least $n$. So we just need to make sure that $f_{L^*}(x)$ agrees with $1 + f_L(x) + \cdots + f_L(x)^m$ for all terms of degree up to $m$, for each $m \geq 0$. But this follows from the previous two parts since $L^0 + L^1 + \cdots + L^m$ is an unambiguous union of unambiguous products.

Example 4.4. Let $C = \{0, 10\}$. Then $C$ is a prefix code. Clearly $f_C(x) = x + x^2$, so
$$f_{C^*} = \frac{1}{1 - x - x^2}.$$

We recognize this from (1.7) as the generating function for the Fibonacci numbers and so we know that the number of words of length $n$ in $C^*$ is the $n$th Fibonacci number $f_n$. In particular, (1.8) gives an explicit formula for the number of words of length $n$.

Notice that $C^*(\varepsilon + 1)$ is the language of all words that do not contain a factor $11$. Indeed, $C^*$ contains the empty word and all words ending in 0 with no factor $11$, and the product then breaks things up into those words ending in 0 and those words ending in 1. This is an unambiguous regular expression and so
$$f_{C^*(\varepsilon + 1)} = \frac{1 + x}{1 - x - x^2}.$$

Example 4.5. A composition of a natural number $n > 0$ is an ordered sequence of positive numbers $(m_1, \ldots, m_k)$ such that $m_1 + \cdots + m_k = n$. Let's compute a formula for the number of compositions of $n$. Consider the infinite prefix code $C = \{a^k b \mid k \geq 0\} = a^* b$. For $n > 0$ there is a bijection between words of length $n$ in $C^*$ and compositions of $n$ that sends the composition $(m_1, \ldots, m_k)$ of $n$ to the word $(a^{m_1 - 1} b)(a^{m_2 - 1} b) \cdots (a^{m_k - 1} b)$ of length $n$ (what is the inverse?). The regular expression $a^* b$ is unambiguous so the generating function for $C$ is
$$f_C = \frac{x}{1 - x}.$$
Thus we have
$$f_{C^*} = \frac{1}{1 - f_C} = \frac{1}{1 - \frac{x}{1-x}} = \frac{1 - x}{1 - 2x} = 1 + \frac{x}{1 - 2x} = 1 + \sum_{n=0}^{\infty} 2^n x^{n+1} = 1 + \sum_{n=1}^{\infty} 2^{n-1} x^n.$$
Therefore, there are $2^{n-1}$ compositions of $n$.

Exercise 12. Find the generating function $f_L(x)$ and a formula for the number of words of length $n$ in $L$ for L = {0, 0, }.

Exercise 13. Find a formula for the number of words of length $n$ in the regular language 0. Make sure to justify that you are only using unambiguous products and stars.

4.1.1 Unambiguous regular expressions and rationality

Let us observe that the generating functions for the base regular languages are polynomials.

$f_{\emptyset}(x) = 0$.

$f_{\{\varepsilon\}} = 1$.

$f_{\{a\}} = x$, $a \in A$.

It now follows from Proposition 4.3 and Exercise 4 that any regular language that is unambiguously regular has a rational generating function. Our next theorem, which is an improvement on Kleene's theorem, says that each regular language is in fact unambiguously regular. The argument is an alternative proof of Kleene's theorem.

Theorem 4.6. Any regular language is unambiguously regular.

Proof. Let $\mathcal{A} = (S, A, \iota, \delta, T)$ be a deterministic finite state automaton recognizing $L$. For $p, q \in S$, let $L_{p,q}$ be the set of non-empty words recognized by the automaton $\mathcal{A}_{p,q} = (S, A, p, \delta, \{q\})$. Then
$$L = \sum_{t \in T} L_{\iota,t} + \Delta_{\iota,T} \quad \text{where} \quad \Delta_{\iota,T} = \begin{cases} \{\varepsilon\} & \text{if } \iota \in T \\ \emptyset & \text{else.} \end{cases}$$
Moreover, this union is unambiguous since $\mathcal{A}$ is deterministic and so a word can bring the initial state to at most one terminal state. So it suffices to show that each $L_{p,q}$ with $p, q \in S$ is unambiguously regular.

For $Q \subseteq S$ and $p, q \in S$, define $L_{p,Q,q}$ to be the set of all non-empty words that label paths from $p$ to $q$ which only pass through states in $Q$, except perhaps the $p$ at the beginning and the $q$ at the end. Then $L_{p,q} = L_{p,S,q}$. We prove that $L_{p,Q,q}$ is unambiguously regular for each $Q \subseteq S$, $p, q \in S$, by induction on $|Q|$. If $Q = \emptyset$, then $L_{p,Q,q}$ is just the set of labels of edges from $p$ to $q$, and so is a subset of $A$ and hence unambiguously regular.

Assume the result is true for $|Q| = n$ and now suppose $|Q| = n + 1$. Then $Q = P + \{r\}$ for some state $r \notin P$. The idea is now similar to our old proof of Kleene's theorem. We break paths up according to whether they go through $r$ or not. Then
$$L_{p,Q,q} = L_{p,P,q} + L_{p,P,r} L_{r,P,r}^* L_{r,P,q}. \tag{4.1}$$
By induction, $L_{p,P,q}$, $L_{p,P,r}$, $L_{r,P,r}$ and $L_{r,P,q}$ are unambiguously regular. The union in (4.1) is unambiguous since words in $L_{p,P,q}$ do not pass through $r$ when going from $p$ to $q$, while all words in $L_{p,P,r} L_{r,P,r}^* L_{r,P,q}$ do. The language $L_{r,P,r}$ is a prefix code since it does not contain the empty word and $r \notin P$ implies no proper prefix of an element of $L_{r,P,r}$ belongs to the language. So $L_{r,P,r}^*$ is an

unambiguous star. Finally, the product $L_{p,P,r} L_{r,P,r}^* L_{r,P,q}$ is unambiguous since if $w$ goes from $p$ to $q$ through $r$, it has a unique prefix $x$ that visits $r$ for the first time, a unique suffix $z$ that visits $r$ for the last time, and $w = xyz$ where $y$ reads from $r$ to $r$ going through $Q = P + \{r\}$. It follows that $x \in L_{p,P,r}$, $y \in L_{r,P,r}^*$ and $z \in L_{r,P,q}$, and this is the unique factorization of this sort. This completes the induction and the proof of the theorem.

Corollary 4.7. The generating function of any regular language is a rational function. In particular, the number of words of length $n$ in a regular language must satisfy a linear recurrence.

Because of the close relationship between regular languages and rational generating functions, some books call regular languages rational languages. However, there are languages with rational generating function that are not regular. For instance, $L = \{0^n 1^n \mid n \geq 0\}$ is not regular. This language has exactly one word of every even length and no words of odd length. So its generating function
$$f_L(x) = 1 + x^2 + x^4 + \cdots = \frac{1}{1 - x^2}$$
is rational. In fact this language has the same generating function as $(0^2)^*$. To obtain the proper relationship between regular languages and rational functions, one has to consider generating functions in several non-commuting variables, which is beyond the scope of this course.

Example 4.8. Let's compute a formula for the number of words of length $n$ in the language $L = \{0,1\}^* 01 \{0,1\}^*$. First we need an unambiguous regular expression. A deterministic automaton accepting this language is

[Figure: a three-state deterministic automaton. The initial state loops on 1 and moves on 0 to the second state; the second state loops on 0 and moves on 1 to the third state; the third state is terminal and loops on 0 and 1.]

from which we obtain the unambiguous regular expression $1^* 0 0^* 1 (0 + 1)^*$. Therefore, the generating function for $L$ is given by
$$f_L = \frac{1}{1-x} \cdot x \cdot \frac{1}{1-x} \cdot x \cdot \frac{1}{1-2x} = \frac{x^2}{(1-x)^2 (1-2x)}.$$

Using the method of partial fractions, one computes
\[ f_L(x) = -\frac{1}{(1 - x)^2} + \frac{1}{1 - 2x} = -\frac{d}{dx}\left(\frac{1}{1 - x}\right) + \frac{1}{1 - 2x} = -\sum_{n=1}^{\infty} n x^{n-1} + \sum_{n=0}^{\infty} 2^n x^n = \sum_{n=0}^{\infty} \left(-(n + 1) + 2^n\right) x^n. \]
Thus
\[ |L \cap A^n| = 2^n - n - 1. \]

Exercise 4. Let $G$ be a finite group with identity $e$. Let $f : A^* \to G$ be an onto homomorphism. Show that the generating function for the word problem $L = f^{-1}(\{e\})$ is rational. This result is used in probability theory: from it one deduces that the Green's function of a random walk on a finite group is rational.

4.2 A linear algebraic approach

An alternate approach, which works quite well for automata with small numbers of states, is via linear algebra. Let $\mathcal{A} = (S, A, q_1, \delta, T)$ be a deterministic finite state automaton accepting a language $L$. Let $S = \{q_1, \ldots, q_m\}$ where $q_1$ is the initial state. Then, for each $i$, we define $L_{q_i}$ to be the language of the automaton $(S, A, q_i, \delta, T)$; so $L_{q_i}$ consists of all words $w \in A^*$ such that $q_i w \in T$. In particular, $L_{q_1} = L$. Let $f_i = f_{L_{q_i}}$ be the generating function of $L_{q_i}$; so $f_1 = f_L$. The generating functions $f_1, \ldots, f_m$ are closely related, as we shall see momentarily.

Let us first observe that if $q_i$ is a fail state, then no word labels a path from $q_i$ to a terminal state and so $L_{q_i} = \emptyset$, whence $f_i = 0$. Thus we can omit the fail states in what follows (i.e., work with partial deterministic automata).

If $f$ is a generating function, let us write $\langle f, x^n \rangle$ to denote the coefficient of $x^n$ in $f(x)$. So if $f(x) = \sum a_n x^n$, then $\langle f, x^n \rangle = a_n$. Then notice that
\[ \langle f_i, x^0 \rangle = \begin{cases} 1 & q_i \in T \\ 0 & \text{else} \end{cases} \tag{4.2} \]
since $\varepsilon \in L_{q_i}$ if and only if $q_i \in T$.

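On the other hand, if $n \geq 0$ and $w \in L_{q_i}$ is a word of length $n + 1$, then $w = au$ with $a \in A$ and $|u| = n$. Since we are dealing with a deterministic automaton, this means that $u \in L_{q_i a}$. Conversely, if $w = au$ with $a \in A$, $|u| = n$ and $u \in L_{q_i a}$, then $q_i w = q_i a u \in T$ and so $w \in L_{q_i}$. Again $q_i a$ is uniquely determined because $\mathcal{A}$ is deterministic. From this, we conclude
\[ |L_{q_i} \cap A^{n+1}| = \sum_{a \in A} |L_{q_i a} \cap A^n| = \sum_{j=1}^{m} a_{ij} |L_{q_j} \cap A^n| \]
where
\[ a_{ij} = |\{a \in A \mid q_i a = q_j\}|. \]
In other words, for $n \geq 0$,
\[ \langle f_i, x^{n+1} \rangle = \sum_{j=1}^{m} a_{ij} \langle f_j, x^n \rangle. \tag{4.3} \]
The matrix $A = (a_{ij})$ is called the adjacency matrix of $\mathcal{A}$. For example, if we consider the automaton $\mathcal{A}$ from Example 4.8, then the adjacency matrix of $\mathcal{A}$ is given by
\[ A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix} \tag{4.4} \]
where we order the states from left to right.

Let us define
\[ \delta_i = \begin{cases} 1 & q_i \in T \\ 0 & q_i \notin T. \end{cases} \]
That is, $\delta_i = \langle f_i, x^0 \rangle$ by (4.2). Then equations (4.2) and (4.3) can be translated into the following linear system of equations in the unknowns $f_1, \ldots, f_m$, with coefficients polynomials over $\mathbb{R}$:
\[ \begin{aligned} f_1 &= \delta_1 + x(a_{11} f_1 + a_{12} f_2 + \cdots + a_{1m} f_m) \\ &\;\;\vdots \\ f_m &= \delta_m + x(a_{m1} f_1 + a_{m2} f_2 + \cdots + a_{mm} f_m) \end{aligned} \]
or in matrix form $F = \Delta + xAF$, where $F = (f_1, \ldots, f_m)^T$ and $\Delta = (\delta_1, \ldots, \delta_m)^T$. Equivalently, we have the system of equations
\[ (I - xA)F = \Delta \tag{4.5} \]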

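where $I$ is the $m \times m$ identity matrix. Notice that $\det(I - xA)$ is a polynomial in $x$ of degree at most $m$ with constant term $\det(I - 0A) = \det I = 1$. Thus $\det(I - xA)$ is an invertible power series and we can now apply Cramer's rule (which works over any commutative ring provided the determinant is invertible over the ring) to conclude that
\[ f_1 = \frac{\det((I - xA)_1)}{\det(I - xA)} \tag{4.6} \]
where $(I - xA)_1$ is the matrix obtained from $I - xA$ by replacing the first column with $\Delta$. Notice that the numerator of (4.6) is also a polynomial in $x$ of degree at most $m$, so this gives another proof that the generating function of a regular language is a rational function.

Example 4.9. Let us revisit Example 4.8. Using (4.4) we obtain $\Delta = (0, 0, 1)^T$,
\[ I - xA = \begin{pmatrix} 1 - x & -x & 0 \\ 0 & 1 - x & -x \\ 0 & 0 & 1 - 2x \end{pmatrix}, \qquad (I - xA)_1 = \begin{pmatrix} 0 & -x & 0 \\ 0 & 1 - x & -x \\ 1 & 0 & 1 - 2x \end{pmatrix} \]
and so $\det((I - xA)_1) = x^2$ and $\det(I - xA) = (1 - x)^2 (1 - 2x)$. Hence we recover
\[ f_L(x) = \frac{x^2}{(1 - x)^2 (1 - 2x)}. \]

Example 4.10. This time we return to the example
\[ L = \{0,1\}^* \setminus [\{0,1\}^* 11 \{0,1\}^*] \]
which is recognized by the automaton $\mathcal{A}$:

[Figure: automaton with three states in a row, the leftmost initial. The first state carries a loop labeled 0 and an edge labeled 1 to the second state; the second state has an edge labeled 0 back to the first and an edge labeled 1 to the third; the third state carries a loop labeled 0, 1. The first two states are terminal.]

The last state is a fail state and so does not contribute to the generating function computation. Thus we may remove it and work with the partial deterministic automaton $\mathcal{B}$:

[Figure: the automaton above with the third state and its incoming edge removed.]

Ordering the states from left to right, we obtain the adjacency matrix
\[ A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \]
and $\Delta = (1, 1)^T$. Thus
\[ I - xA = \begin{pmatrix} 1 - x & -x \\ -x & 1 \end{pmatrix}, \qquad (I - xA)_1 = \begin{pmatrix} 1 & -x \\ 1 & 1 \end{pmatrix} \]
and so $\det((I - xA)_1) = 1 + x$ and $\det(I - xA) = 1 - x - x^2$. Thus we recover the generating function
\[ f_L(x) = \frac{1 + x}{1 - x - x^2}. \]

Unfortunately, this linear algebra method becomes increasingly difficult to apply by hand as the number of states grows. An alternative to Cramer's rule is to observe that (4.5) can be solved using Gaussian elimination, but when performing row reductions you can only divide by power series (in particular, polynomials) with non-zero constant term.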
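When only the word counts are wanted, and not the closed form of the generating function, the recurrence (4.3) can be iterated directly with the starting values from (4.2). The sketch below does this in plain Python. The function name `word_counts` and the 0-based state indexing (state 0 standing for $q_1$) are assumptions made for illustration; the matrices are the ones from Examples 4.9 and 4.10.

```python
def word_counts(adjacency, terminal, terms):
    """Return the numbers of words of length 0, 1, ..., terms - 1 accepted from state 0.

    adjacency[i][j] is the number of letters taking state i to state j, and
    terminal[i] is 1 if state i is terminal and 0 otherwise (the vector Delta).
    """
    m = len(adjacency)
    counts = list(terminal)  # coefficients of x^0, as in equation (4.2)
    result = []
    for _ in range(terms):
        result.append(counts[0])  # state 0 plays the role of the initial state
        # Equation (4.3): the x^(n+1) coefficients are A times the x^n coefficients.
        counts = [
            sum(adjacency[i][j] * counts[j] for j in range(m))
            for i in range(m)
        ]
    return result


if __name__ == "__main__":
    # Example 4.9: three states, only the last one terminal.
    print(word_counts([[1, 1, 0], [0, 1, 1], [0, 0, 2]], [0, 0, 1], 8))
    # Example 4.10 with the fail state removed: both states terminal.
    print(word_counts([[1, 1], [1, 0]], [1, 1], 8))
```

Each step costs on the order of $m^2$ integer operations, so this scales to automata where expanding the determinants in (4.6) by hand would be impractical, at the price of giving numbers rather than a formula.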