CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University
First Progrmming Project Instruction Scheduling Project hs een posted Due dtes: code Ferury 25, report Mrch 4 - strt erly - prolem is not totlly defined; you my run into situtions where you will need time to redesign your solution - the report is 40% of the grde, code is 60% - need to follow clling conventions since we will use utomtic testing frmework - we minly provide help for C/C++ implementtions - not group project! 2
Constructing Scnner Review source code specifictions Scnner Scnner Genertor prts of speech & words Tles nd code The scnner is the first stge in the front end Specifictions cn e expressed using regulr expressions Build tles nd code from DFA 3
Review: Gol We show how to construct finite stte utomton to recognize ny RE Overview: Direct construction of nondeterministic finite utomton (NFA) to recognize given RE Requires -trnsitions to comine regulr suexpressions Construct deterministic finite utomton (DFA) to simulte the NFA Use set-of-sttes construction Minimize the numer of sttes Hopcroft stte minimiztion lgorithm Generte the scnner code Additionl specifictions needed for detils 4
Non-deterministic Finite Automt Ech RE corresponds to deterministic finite utomton (DFA) My e hrd to directly construct the right DFA Wht out n RE such s ( ) *? S 0 S 1 S 2 S 3 S 4 This is little different S 0 hs trnsition on S 1 hs two trnsitions on This is non-deterministic finite utomton (NFA) 5
Non-deterministic Finite Automt An NFA ccepts string x iff pth though the trnsition grph from s 0 to finl stte such tht the edge lels spell x Trnsitions on consume no input To run the NFA, strt in s 0 nd guess the right trnsition t ech step Alwys guess correctly If some sequence of correct guesses ccepts x then ccept Why study NFAs? They re the key to utomte the RE DFA construction We cn pste together NFAs with -trnsitions NFA NFA ecomes n NFA 6
Reltionship etween NFAs nd DFAs DFA is specil cse of n NFA DFA hs no trnsitions DFA s trnsition function is single-vlued Sme rules will work DFA cn e simulted with n NFA Oviously NFA cn e simulted with DFA Simulte sets of possile sttes Possile exponentil lowup in the stte spce Still, one stte per chrcter in the input strem (less ovious) 7
Automting Scnner Construction To convert specifiction into code: 1 Write down the RE for the input lnguge 2 Build ig NFA 3 Build the DFA tht simultes the NFA 4 Systemticlly shrink the DFA 5 Turn it into code Scnner genertors Lex nd Flex work long these lines Algorithms re well-known nd well-understood Key issue is interfce to prser You could uild one in weekend! (define ll prts of speech) 8
Review: Automting Scnner Construction RE NFA (Thompson s construction) Build n NFA for ech term Comine them with -moves NFA DFA (suset construction) Build the simultion DFA Miniml DFA Hopcroft s lgorithm The Cycle of Constructions RE NFA DFA miniml DFA DFA RE (Not prt of the scnner construction) All pirs, ll pths prolem Tke the union of ll pths from s 0 to n ccepting stte 9
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for 10
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for 11
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 1 S 2 S 3 S 4 NFA for 12
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 1 S 2 S 0 S 3 S 4 S 5 NFA for 13
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 1 S 2 S 0 S 5 S 3 S 4 NFA for 14
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 0 S 1 S 2 S 3 S 4 NFA for S 5 S 1 S 3 NFA for * 15
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 0 S 1 S 2 S 3 S 4 NFA for S 5 S 1 S 3 NFA for * 16
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 0 S 1 S 2 S 3 S 4 NFA for S 5 S 3 S 4 NFA for * 17
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 0 S 1 S 2 S 3 S 4 NFA for S 5 NFA for * S 3 S 4 18
RE NFA using Thompson s Construction Key ide NFA pttern for ech symol nd ech opertor Ech NFA hs single strt nd ccept stte Join them with moves in precedence order S 3 S 4 NFA for NFA for S 0 S 1 S 2 S 3 S 4 NFA for S 5 NFA for * S 3 S 4 Ken Thompson, CACM, 1968 19
Exmple of Thompson s Construction Let s try ( c ) * 1.,, & c c 2. c S 1 S 2 S 0 S 5 c S 3 S 4 3. ( c ) * S 2 S 3 S 6 S 7 c S 4 S 5 20
Exmple of Thompson s Construction Let s try ( c ) * 1.,, & c c 2. c S 1 S 2 S 0 S 5 c S 3 S 4 3. ( c ) * S 2 S 3 S 6 S 7 S 4 c S 5 21
Exmple of Thompson s Construction (con t) 4. ( c ) * S 4 S 5 S 2 S 3 S 8 S 9 S 6 c S 7 Of course, humn would design something simpler... c But, we cn utomte production of the more complex one... 22
Review: Automting Scnner Construction RE NFA (Thompson s construction) Build n NFA for ech term Comine them with -moves NFA DFA (suset construction) Build the simultion DFA Miniml DFA Hopcroft s lgorithm The Cycle of Constructions RE NFA DFA miniml DFA DFA RE (Not prt of the scnner construction) All pirs, ll pths prolem Tke the union of ll pths from s 0 to n ccepting stte 23
NFA DFA with Suset Construction Need to uild simultion of the NFA Two key functions Move(s i, ) is set of sttes rechle from s i y -closure(s i ) is set of sttes rechle from s i y The lgorithm: Strt stte derived from s 0 of the NFA Tke its -closure S 0 = -closure(s 0 ) Tke the imge of S 0, move(s 0, ) for ech Σ, nd tke its -closure, dd it to the stte set S For ech stte S, compute move(s, ) for ech Σ, nd tke its -closure Iterte until no more sttes re dded Sounds more complex thn it is 24
NFA DFA with Suset Construction The lgorithm: s 0 -closure(q 0 ) dd s 0 to S while ( S is still chnging ) for ech s i S for ech Σ s? -closure(move(s i,)) if ( s? S ) then dd s? to S s s j T[ s i,] s j else T[ s i,] s? Let s think out why this works 25
NFA DFA with Suset Construction The lgorithm: s 0 -closure(q 0 ) dd s 0 to S while ( S is still chnging ) for ech s i S for ech Σ s? -closure(move(s i,)) if ( s? S ) then dd s? to S s s j T[ s i,] s j else T[ s i,] s? The lgorithm hlts: 1. S contins no duplictes (test efore dding) 2. 2 Q is finite 3. while loop dds to S, ut does not remove from S (monotone) the loop hlts S contins ll the rechle NFA sttes It tries ech symol in ech s i. It uilds every possile NFA configurtion. S nd T form the DFA Let s think out why this works 26
NFA DFA with Suset Construction Exmple of fixed-point computtion Monotone construction of some finite set Hlts when it stops dding to the set Proofs of hlting & correctness re similr These computtions rise in mny contexts Other fixed-point computtions Cnonicl construction of sets of LR(1) items Quite similr to the suset construction Clssic dt-flow nlysis Solving sets of simultneous set equtions DFA minimiztion lgorithm (coming up!) We will see mny more fixed-point computtions 27
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1 28
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, q 4, q 6, q 9 none none 29
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, q 4, q 6, q 9 s 1 q 1, q 2, q 3, q 4, q 6, q 9 none none 30
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, q 4, q 6, q 9 s 1 q 1, q 2, q 3, none q 4, q 6, q 9 none none 31
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, q 4, q 6, q 9 none none s 1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5 q 7 32
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, q 4, q 6, q 9 none s 1 q 1, q 2, q 3, none q 5, q 8, q 9, q 4, q 6, q 9 q 3, q 4, q 6 none q 7, q 8, q 9, q 3, q 4, q 6 33
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, q 4, q 6, q 9 none s 1 q 1, q 2, q 3, none q 5, q 8, q 9, q 4, q 6, q 9 q 3, q 4, q 6 s 2 q 5, q 8, q 9, q 3, q 4, q 6 none q 7, q 8, q 9, q 3, q 4, q 6 34
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, none none q 4, q 6, q 9 s 1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s 2 q 5, q 8, q 9, q 3, q 4, q 6 none s 2 s 3 35
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, none none q 4, q 6, q 9 s 1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s 2 q 5, q 8, q 9, none s 2 s 3 q 3, q 4, q 6 s 3 q 7, q 8, q 9, q 3, q 4, q 6 36
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, none none q 4, q 6, q 9 s 1 q 1, q 2, q 3, none q 5, q 8, q 9, q 7, q 8, q 9, q 4, q 6, q 9 q 3, q 4, q 6 q 3, q 4, q 6 s 2 q 5, q 8, q 9, none s 2 s 3 q 3, q 4, q 6 s 3 q 7, q 8, q 9, none s 2 s 3 q 3, q 4, q 6 37
NFA DFA with Suset Construction ( c )* : q 0 q 1 q 4 q 5 q q 3 q 2 8 q 9 q 6 c q 7 Applying the suset construction: -closure(move(s,*)) NFA sttes c s 0 q 0 q 1, q 2, q 3, none none q 4, q 6, q 9 s 1 q 1, q 2, q 3, none q 5, q 8, q 9, q 7, q 8, q 9, q 4, q 6, q 9 q 3, q 4, q 6 q 3, q 4, q 6 s 2 q 5, q 8, q 9, none s 2 s 3 q 3, q 4, q 6 s 3 q 7, q 8, q 9, none s 2 s 3 q 3, q 4, q 6 Finl sttes 38
NFA DFA with Suset Construction The DFA for ( c ) * s 0 s 1 c s 2 s 3 c c δ c s 0 s 1 - - s 1 - s 2 s 3 s 2 - s 2 s 3 s 3 - s 2 s 3 Ends up smller thn the NFA All trnsitions re deterministic 39
Automting Scnner Construction RE NFA (Thompson s construction) Build n NFA for ech term The Cycle of Constructions Comine them with -moves NFA DFA (suset construction) Build the simultion DFA Miniml DFA Hopcroft s lgorithm RE NFA DFA miniml DFA DFA RE (not relly prt of scnner construction) All pirs, ll pths prolem Union together pths from s 0 to finl stte 40
DFA Minimiztion The Big Picture Discover sets of equivlent sttes Represent ech such set with just one stte 41
DFA Minimiztion The Big Picture Discover sets of equivlent sttes Represent ech such set with just one stte Two sttes re equivlent if nd only if: Σ, trnsitions on led to equivlent sttes (DFA) -trnsitions to distinct sets sttes must e in distinct sets 42
DFA Minimiztion The Big Picture Discover sets of equivlent sttes Represent ech such set with just one stte Two sttes re equivlent if nd only if: Σ, trnsitions on led to equivlent sttes (DFA) -trnsitions to distinct sets sttes must e in distinct sets A prtition P of S Ech stte s S is in exctly one set p i P The lgorithm itertively prtitions the DFA s sttes 43
DFA Minimiztion Detils of the lgorithm Group sttes into mximl size sets, optimisticlly Itertively sudivide those sets, s needed Sttes tht remin grouped together re equivlent Initil prtition, P 0, hs two sets: {F} & {Q-F} (D =(Q,Σ,δ,q 0,F)) Splitting set ( prtitioning set y ) Assume q, & q s, nd δ(q,) = q x, & δ(q,) = q y If q x & q y re not in the sme set, then s must e split q hs trnsition on, q does not splits s 44
DFA Minimiztion The lgorithm P { F, {Q-F}} while ( P is still chnging) T { } for ech set S P T T split(s) P T split(s): for ech Σ if splits S into S 1, S 2, then return {S 1, S 2, } else return S Why does this work? Prtition P 2 Q Strt off with 2 susets of Q {F} nd {Q-F} While loop tkes P i P i+1 y splitting 1 or more sets P i+1 is t lest one step closer to the prtition with Q sets Mximum of Q splits Note tht Prtitions re never comined 45
DFA Minimiztion The lgorithm P { F, {Q-F}} while ( P is still chnging) T { } for ech set S P T T split(s) P T split(s): for ech Σ if splits S into S 1, S 2, then return {S 1, S 2, } else return S Why does this work? Prtition P 2 Q Strt off with 2 susets of Q {F} nd {Q-F} While loop tkes P i P i+1 y splitting 1 or more sets P i+1 is t lest one step closer to the prtition with Q sets Mximum of Q splits Note tht Prtitions re never comined This is fixed-point lgorithm! 46
Bck to our DFA Minimiztion exmple Then, pply the minimiztion lgorithm Split on Current Prtition c P 0 { s 1, s 2, s 3 } {s 0 } none none none s 0 s 1 c s 2 c finl sttes s 3 c To produce the miniml DFA s 0 s 1 c We oserved tht humn would design simpler utomton thn Thompson s construction & the suset construction did. Minimizing tht DFA produces the one tht humn would design! 47
Arevited Register Specifiction Strt with regulr expression r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 The Cycle of Constructions RE NFA DFA miniml DFA 48
Arevited Register Specifiction Thompson s construction produces r 0 r 1 s 0 r 2 r 8 s f r 9 The Cycle of Constructions RE NFA DFA miniml DFA 49
Arevited Register Specifiction The suset construction uilds s 0 r 0 s f0 1 9 8 2 s f 1 s f8 s f2 s f9 This is DFA, ut it hs lot of sttes The Cycle of Constructions RE NFA DFA miniml DFA 50
Arevited Register Specifiction The DFA minimiztion lgorithm uilds s 0 r 0,1,2,3,4, 5,6,7,8,9 s f This looks like wht skilled compiler writer would do! The Cycle of Constructions RE NFA DFA miniml DFA 51
Homework nd Next clss Syntx Anlysis Red EC: 3.1 3.3 52