Finite Automata
Informatics 2A: Lecture 3
Mary Cryan
School of Informatics, University of Edinburgh
mcryan@inf.ed.ac.uk
21 September 2018

Languages and Automata
  What is a 'language'?
  Finite automata: recap
Some formal definitions
  Finite automaton
  Regular language
  DFAs and NFAs
Determinization
  Execution of NFAs
  The subset construction

Languages and alphabets

Throughout this course, languages will consist of finite sequences of symbols drawn from some given alphabet. An alphabet Σ is simply some finite set of letters or symbols which we treat as 'primitive'. These might be:

- English letters: Σ = {a, b, ..., z}
- Decimal digits: Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
- ASCII characters: Σ = {0, 1, ..., a, b, ..., ?, !, ...}
- Programming language 'tokens': Σ = {if, while, x, ==, ...}
- Words in (some fragment of) a natural language.
- 'Primitive' actions performable by a machine or system, e.g. Σ = {insert50p, pressbutton1, ...}

In toy examples, we'll use simple alphabets like {0, 1} or {a, b, c}.

What is a 'language'?

A language over an alphabet Σ will consist of finite sequences (strings) of elements of Σ. E.g. the following are strings over the alphabet Σ = {a, b, c}:

  a   b   ab   cab   bacca   cccccccc

There's also the empty string, which we usually write as ε.

A language over Σ is simply a (finite or infinite) set of strings over Σ. A string s is legal in the language L if and only if s ∈ L.

We write Σ* for the set of all possible strings over Σ. So a language L is simply a subset of Σ* (L ⊆ Σ*).

(N.B. This is just a technical definition: any real language is obviously much more than this!)

Ways to define a language

There are many ways in which we might formally define a language:

- Direct mathematical definition, e.g.
    L1 = {a, aa, ab, bbc}
    L2 = {axb | x ∈ Σ*}
    L3 = {a^n b^n | n ≥ 0}
- Regular expressions (see Lecture 5): e.g. a(a + b)*b.
- Formal grammars (see Lecture 9 onwards): e.g. S → ε | aSb.
- Specify some machine for testing whether a string is legal or not.

The more complex the language, the more complex the machine might need to be. As we shall see, each level in the Chomsky hierarchy is correlated with a certain class of machines.

Finite automata (a.k.a. finite state machines)

[Diagram: two states, even and odd. Each state has a self-loop labelled 1; transitions labelled 0 lead from even to odd and from odd to even. even has the in-arrow and a double circle.]

This is an example of a finite automaton over Σ = {0, 1}.

- At any moment, the machine is in one of 2 states.
- From any state, each symbol in Σ determines a 'destination' state we can jump to.
- The state marked with the in-arrow is picked out as the starting state. So any string in Σ* gives rise to a sequence of states.
- Certain states (with double circles) are designated as accepting. We call a string 'legal' if it takes us from the start state to some accepting state.

In this way, the machine defines a language L ⊆ Σ*: the language L is the set of all legal strings.

Quick test question

[Diagram: the two-state automaton with states even and odd, as on the previous slide.]

For the finite state machine shown here, which of the following strings are legal (i.e. accepted)?

  1. ε
  2. 11
  3. 1010
  4. 1101

Answer: 1, 2 and 3 are legal; 4 isn't.

More generally, for any current state and any symbol, there may be zero, one or many new states we can jump to.

[Diagram: states q0 to q5 in a row. q0 has a self-loop labelled 0,1 and a transition labelled 1 to q1; each of q1, q2, q3 and q4 has a transition labelled 0,1 to the next state. q0 is the starting state and q5 is accepting.]

Here there are two transitions for '1' from q0, and none from q5.

The language associated with the machine is defined to consist of all strings that are accepted under some possible execution run.

The language associated with the example machine above is

  {x ∈ Σ* | the symbol fifth from the end of x is 1}

Formal definition of finite automaton

Formally, a finite automaton with alphabet Σ consists of:

- A finite set Q of states,
- A transition relation Δ ⊆ Q × Σ × Q,
- A set S ⊆ Q of possible starting states,
- A set F ⊆ Q of accepting states.

Example formal definition

[Diagram: the six-state automaton q0 to q5 from the earlier slide.]

  Q = {q0, q1, q2, q3, q4, q5}
  Δ = { (q0, 0, q0), (q0, 1, q0), (q0, 1, q1),
        (q1, 0, q2), (q1, 1, q2),
        (q2, 0, q3), (q2, 1, q3),
        (q3, 0, q4), (q3, 1, q4),
        (q4, 0, q5), (q4, 1, q5) }
  S = {q0}
  F = {q5}

Regular language

Suppose M = (Q, Δ, S, F) is a finite automaton with alphabet Σ.

We say that a string x ∈ Σ* is accepted if there exists a path through the set of states Q, starting at some state s ∈ S, ending at some state f ∈ F, with each step taken from the relation Δ, and with the path as a whole spelling out the string x.

This enables us to define the language accepted by M:

  L(M) = {x ∈ Σ* | x is accepted by M}

We call a language L ⊆ Σ* regular if L = L(M) for some finite automaton M.

Regular languages are the subject of lectures 4 to 8 of the course.

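
The path-based definition of acceptance can be turned into code almost word for word. The following is a minimal sketch, assuming the six-state example automaton from the earlier slide encoded as Python sets; the names DELTA, START, ACCEPTING, has_accepting_path and accepts are illustrative and not part of the lecture. It searches for a path by naive backtracking, which mirrors the definition but can take exponential time on other automata.

```python
# Transition relation of the six-state example, as (source, symbol, target) triples.
DELTA = {("q0", "0", "q0"), ("q0", "1", "q0"), ("q0", "1", "q1")} | {
    (f"q{i}", sym, f"q{i + 1}") for i in range(1, 5) for sym in "01"
}
START = {"q0"}       # the set S of starting states
ACCEPTING = {"q5"}   # the set F of accepting states


def has_accepting_path(state, remaining):
    """True if some path from `state` spells out `remaining` and ends in F."""
    if not remaining:
        return state in ACCEPTING
    # Try every transition out of `state` that is labelled with the next symbol.
    return any(
        has_accepting_path(target, remaining[1:])
        for (source, symbol, target) in DELTA
        if source == state and symbol == remaining[0]
    )


def accepts(x):
    """x is accepted if an accepting path exists from at least one start state."""
    return any(has_accepting_path(s, x) for s in START)


print(accepts("10000"))  # fifth symbol from the end is 1
print(accepts("00000"))  # fifth symbol from the end is 0
```
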
DFAs and NFAs

A finite automaton with alphabet Σ is deterministic if:

- It has exactly one starting state.
- For every state q ∈ Q and symbol a ∈ Σ there is exactly one state q′ for which there exists a transition (q, a, q′) in Δ. (In some texts/definitions, this is relaxed to 'at most one state'.)

The first condition says that S is a singleton set. The second condition says that Δ specifies a function Q × Σ → Q.

Deterministic finite automata are usually abbreviated DFAs. General finite automata are usually called nondeterministic, by way of contrast, and abbreviated NFAs. Note that every DFA is an NFA.

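
The two determinism conditions can be checked mechanically. The sketch below assumes an automaton given as Python sets of states, symbols, transition triples and starting states; the function name is_deterministic and the variable names are hypothetical. It tests the strict 'exactly one' condition, not the relaxed 'at most one' variant.

```python
def is_deterministic(states, alphabet, delta, starts):
    """Check both DFA conditions for an automaton given as plain sets."""
    # Condition 1: S is a singleton set.
    if len(starts) != 1:
        return False
    # Condition 2: every (state, symbol) pair has exactly one target state.
    for q in states:
        for a in alphabet:
            targets = {t for (s, sym, t) in delta if s == q and sym == a}
            if len(targets) != 1:
                return False
    return True


# The two-state even/odd machine from the earlier slides.
parity_delta = {
    ("even", "0", "odd"), ("even", "1", "even"),
    ("odd", "0", "even"), ("odd", "1", "odd"),
}

# The six-state 'fifth from the end is 1' machine.
nfa_states = {f"q{i}" for i in range(6)}
nfa_delta = {("q0", "0", "q0"), ("q0", "1", "q0"), ("q0", "1", "q1")} | {
    (f"q{i}", sym, f"q{i + 1}") for i in range(1, 5) for sym in "01"
}

print(is_deterministic({"even", "odd"}, {"0", "1"}, parity_delta, {"even"}))
print(is_deterministic(nfa_states, {"0", "1"}, nfa_delta, {"q0"}))
```

The second call fails the check twice over: q0 has two transitions for 1, and q5 has none at all.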
Example

[Diagram: the two-state automaton with states even and odd.]

This is a DFA (and hence an NFA).

[Diagram: the six-state automaton q0 to q5.]

This is an NFA but not a DFA.

Challenge question

Consider the following NFA over {a, b, c}:

[Diagram: NFA whose transitions are labelled a, b and c.]

What is the minimum number of states of an equivalent DFA?

(Well, first we should ask: *is* there an equivalent DFA?)

Solution

An equivalent DFA must have at least 5 states!

[Diagram: DFA with transitions labelled a, b and c, plus a 'garbage state' with a self-loop labelled a,b,c.]

Specifying a DFA

Clearly, a DFA with alphabet Σ can equivalently be given by:

- A finite set Q of states,
- A transition function δ : Q × Σ → Q,
- A single designated starting state s ∈ Q,
- A set F ⊆ Q of accepting states.

Example:

  Q = {even, odd}

  δ :        0      1
     even   odd    even
     odd    even   odd

  s = even
  F = {even}

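
The example specification translates directly into a lookup table plus a loop. This is a minimal sketch, assuming the transition function is stored as a nested Python dict (one row per state, one column per symbol); the names DELTA, START, ACCEPTING and dfa_accepts are illustrative.

```python
# delta as a table: DELTA[state][symbol] is the unique next state.
DELTA = {
    "even": {"0": "odd", "1": "even"},
    "odd": {"0": "even", "1": "odd"},
}
START = "even"        # the single starting state s
ACCEPTING = {"even"}  # the set F of accepting states


def dfa_accepts(x):
    """Follow the unique path determined by x and report whether it ends in F."""
    state = START
    for symbol in x:
        state = DELTA[state][symbol]
    return state in ACCEPTING


# The four strings from the quick test question.
for s in ["", "11", "1010", "1101"]:
    print(repr(s), dfa_accepts(s))
```

Each input symbol costs one table lookup, so the running time is linear in the length of the input string.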
Running a finite automaton

DFAs are dead easy to implement and efficient to run. We don't need much more than a two-dimensional array for the transition function δ. Given an input string x it is easy to follow the unique path determined by x and so determine whether or not the DFA accepts x.

It is by no means so obvious how to run an NFA over an input string x. How do we prevent ourselves from making incorrect nondeterministic choices?

Solution: At each stage in processing the string, keep track of all the states the machine might possibly be in.

Executing an NFA: example

Given an NFA N over Σ and a string x ∈ Σ*, how can we in practice decide whether x ∈ L(N)?

We illustrate with the running example below.

[Diagram: NFA with states q0, q1 and q2, with transitions labelled by a and b.]

String to process: aba

Stage 0: initial state

At the start, the NFA can only be in the initial state q0.

[Diagram: the same NFA, with q0 coloured.]

String to process: aba
Processed so far: ε
Next symbol: a

Stage 1: after processing 'a'

The NFA could now be in either q0 or q1.

[Diagram: the same NFA, with q0 and q1 coloured.]

String to process: aba
Processed so far: a
Next symbol: b

Stage 2: after processing 'b'

The NFA could now be in either q1 or q2.

[Diagram: the same NFA, with q1 and q2 coloured.]

String to process: aba
Processed so far: ab
Next symbol: a

Stage 3: final state

The NFA could now be in q2 or q0. (It could have got to q2 in two different ways, though we don't need to keep track of this.)

[Diagram: the same NFA, with q2 and q0 coloured.]

String to process: aba
Processed so far: aba

Since we've reached the end of the input string, and the set of possible states includes the accepting state q0, we can say that the string aba is accepted by this NFA.

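
The stage-by-stage procedure above amounts to updating a set of possible states once per input symbol. The sketch below is one way to write it, with illustrative names (DELTA, START, ACCEPTING, step, nfa_accepts). Because the running example's diagram is not reproduced here, the code is an assumption-laden stand-in that uses the six-state 'fifth from the end is 1' automaton, whose transition relation is listed in full on the 'Example formal definition' slide.

```python
# Transition relation of the six-state example, as (source, symbol, target) triples.
DELTA = {("q0", "0", "q0"), ("q0", "1", "q0"), ("q0", "1", "q1")} | {
    (f"q{i}", sym, f"q{i + 1}") for i in range(1, 5) for sym in "01"
}
START = {"q0"}
ACCEPTING = {"q5"}


def step(current, symbol):
    """All states reachable from some state in `current` by one `symbol` transition."""
    return {t for (s, sym, t) in DELTA if s in current and sym == symbol}


def nfa_accepts(x):
    """Track every state the NFA might be in; accept if one of them is accepting."""
    current = set(START)
    for symbol in x:
        current = step(current, symbol)
    return bool(current & ACCEPTING)


print(sorted(step({"q0"}, "1")))  # the two possible states after reading a 1
print(nfa_accepts("10000"))
print(nfa_accepts("00000"))
```

No backtracking is needed: a wrong nondeterministic choice simply drops out of the set when it has no transition for the next symbol.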
The key insight

The process we've just described is a completely deterministic process! Given any current set of 'coloured' states, and any input symbol in Σ, there's only one right answer to the question: 'What should the new set of coloured states be?'

What's more, it's a finite state process. A 'state' is simply a choice of 'coloured' states in the original NFA N. If N has n states, there are 2^n such choices.

This suggests how an NFA with n states can be converted into an equivalent DFA with 2^n states.

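
The conversion suggested here can be sketched as follows, treating each set of 'coloured' states as one DFA state. This is a hedged illustration rather than the lecture's own construction: the names determinize and dfa_accepts are hypothetical, and as a design choice the sketch builds only the subsets reachable from the start set instead of all 2^n of them. The input is again the six-state 'fifth from the end is 1' automaton.

```python
from collections import deque


def determinize(alphabet, delta, starts, accepting):
    """Build the reachable part of the subset DFA for an NFA given as sets."""
    start = frozenset(starts)
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for a in alphabet:
            # The new set of coloured states after reading symbol a.
            target = frozenset(
                t for (s, sym, t) in delta if s in current and sym == a
            )
            dfa_delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_accepting = {subset for subset in seen if subset & accepting}
    return seen, dfa_delta, start, dfa_accepting


def dfa_accepts(x, dfa_delta, start, dfa_accepting):
    """Run the subset DFA on x by plain table lookup."""
    state = start
    for symbol in x:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting


DELTA = {("q0", "0", "q0"), ("q0", "1", "q0"), ("q0", "1", "q1")} | {
    (f"q{i}", sym, f"q{i + 1}") for i in range(1, 5) for sym in "01"
}
states, dfa_delta, start, dfa_accepting = determinize("01", DELTA, {"q0"}, {"q5"})
print(len(states))
print(dfa_accepts("10000", dfa_delta, start, dfa_accepting))
```

For this NFA every reachable subset contains q0, so only 32 of the 2^6 = 64 possible subsets are ever built.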
Reference material

- Kozen chapters 3, 5 and 6.
- Jurafsky & Martin section 2.2 (rather brief).