CSE 401 Compilers
Lecture 3: Regular Expressions & Scanning, continued
Michael Ringenburg

Today's Agenda
  Last time we reviewed languages and grammars, and briefly started discussing regular expressions.
  Today I'll restart the regular expression discussion, since it felt a bit rushed.
  I'll then describe how to build finite automata that recognize regular expressions.
  On Monday, I'll discuss how scanners are implemented.

Announcements
  Homework 1 will be out later today.
    I'll post it on the course website and send email.
    Due next Friday (January 18).
  First part of the project (the scanner) will be assigned early next week.
  Office hours selected, starting next week:
    Laure: Mondays (except 1/21 & 2/18), 4-5, CSE 218
    Mike: Wednesdays, 2:30-3:30, CSE 212, or by appointment on Tuesdays
    Zach: Fridays, 1:30-2:30, CSE 218

Regular Expressions
  Defined over some alphabet Σ
    For programming languages, the alphabet is usually ASCII or Unicode
  If re is a regular expression, L(re) is the language (set of strings) generated by re

Fundamental REs

  re    L(re)    Notes
  a     {a}      Singleton set, for each symbol a in the alphabet Σ
  ε     {ε}      Empty string
  ∅     { }      Empty language

  These are the basic building blocks that other regular expressions are built from.

Operations on REs

  re     L(re)          Notes
  rs     L(r)L(s)       Concatenation: string from r followed by string from s
  r|s    L(r) ∪ L(s)    Combination (union): string from either r or s
  r*     L(r)*          Kleene closure: sequence of 0 or more strings from r

  Precedence: * (highest), concatenation, | (lowest)
  Parentheses can be used to group REs as needed

Examples

  re             Meaning
  +              single + character
  !              single ! character
  !=             2 character sequence !=
  xyzzy          5 character sequence xyzzy
  (1|0)*         Zero or more binary digits
  (1|0)(1|0)*    Binary constant (possible leading 0s)
  0|1(1|0)*      Binary constant without extra leading 0s, i.e., 0 or starts with 1
                 (| has lowest precedence)

Abbreviations
  The basic operations generate all possible regular expressions, but there are common abbreviations used for convenience. Some examples:

  Abbr.      Meaning        Notes
  r+         (rr*)          1 or more occurrences
  r?         (r|ε)          0 or 1 occurrence
  [a-z]      (a|b|...|z)    1 character in given range
  [abxyz]    (a|b|x|y|z)    1 of the given characters
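
The tables above can be checked with any regular expression library. The following is a minimal sketch using Python's standard re module, which is an assumption of this example and not a tool named in the lecture; the helper name matches is hypothetical. It tests whole-string membership in L(re), and shows that each abbreviation accepts the same strings as its expansion on a few samples.

```python
import re

def matches(pattern, text):
    """Return True if the whole string is in the language of the pattern."""
    return re.fullmatch(pattern, text) is not None

# Binary constant without extra leading 0s: 0 | 1(1|0)*
binary = r"0|1(1|0)*"
for s in ["0", "1", "101", "0101", ""]:
    print(repr(s), matches(binary, s))

# r+ abbreviates rr*; r? abbreviates (r|ε).
# In Python's syntax an empty alternative plays the role of ε.
samples = ["", "b", "ab", "abab", "aab"]
plus_same = all(matches(r"(ab)+", s) == matches(r"(ab)(ab)*", s) for s in samples)
opt_same = all(matches(r"a?b", s) == matches(r"(a|)b", s) for s in samples)
print(plus_same, opt_same)
```

Note that fullmatch is used so that the pattern must cover the entire string, which corresponds to asking whether the string is in L(re).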

Exercise: What do these represent?

  re                        Meaning
  [abc]+                    Sequence of one or more a's, b's, and c's
  [abc]*                    Zero or more a's, b's, and c's
  [0-9]+                    Non-negative integer (possibly with leading 0s)
  [1-9][0-9]*               Positive integer (no leading 0s)
  [a-zA-Z][a-zA-Z0-9_]*     One or more letters, digits, or underscores; must start with a letter

Abbreviations
  Many systems allow abbreviations to make writing and reading definitions or specifications easier
    name ::= re
  Restriction: abbreviations may not be circular (recursive), either directly or indirectly (else the result would not be a regular language)
    digit ::= [0-9]            is okay
    number ::= digit number    is not

Example
  Possible syntax for numeric constants
    digit ::= [0-9]
    digits ::= digit+
    number ::= digits ( . digits )? ( [eE] (+ | -)? digits )?
  Notice that this allows (unnecessary) leading 0s, e.g., 00045.6. (The 0s in 0 or 0.14 would be necessary.)
  How would you prevent that?

Example
  Possible syntax for numeric constants
    digit ::= [0-9]
    nonzero_digit ::= [1-9]
    digits ::= digit+
    number ::= (0 | nonzero_digit digits?) ( . digits )? ( [eE] (+ | -)? digits )?

Recognizing REs
  Finite automata can be used to recognize languages generated by regular expressions
  Can build by hand or automatically
    Reasonably straightforward, and can be done systematically
    Tools like Lex, Flex (for compilers written in C/C++), and JFlex (for compilers written in Java) do this automatically, given a set of REs.
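
The two numeric-constant definitions can be tried out by expanding the abbreviations textually, which works precisely because they are not recursive. This is a rough sketch using Python's re module (an assumption of this example; the variable names simply mirror the abbreviation names on the slides).

```python
import re

# Each abbreviation is expanded by plain string substitution.
digit = r"[0-9]"
nonzero_digit = r"[1-9]"
digits = rf"{digit}+"

# First version: allows unnecessary leading 0s.
loose = rf"{digits}(\.{digits})?([eE](\+|-)?{digits})?"

# Second version: the integer part is either 0 or starts with a nonzero digit.
strict = rf"(0|{nonzero_digit}({digits})?)(\.{digits})?([eE](\+|-)?{digits})?"

def is_number(pattern, text):
    """True if the whole string matches the given number pattern."""
    return re.fullmatch(pattern, text) is not None

for s in ["0", "0.14", "45.6", "6.02E+23", "00045.6", "012"]:
    print(s, is_number(loose, s), is_number(strict, s))
```

Only the last two sample strings are treated differently: the first definition accepts them and the second rejects them.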

Finite State Automaton
  Review from your CS theory class
  A finite set of states
    One marked as initial state
    One or more marked as final states
    States sometimes labeled or numbered
  A set of transitions from state to state
    Each labeled with a symbol from Σ (the alphabet), or ε
    The symbols correspond to characters in the input stream.
  [Figure: example automaton with states numbered 1-4]

Finite State Automaton
  Operate by reading input symbols (usually characters)
    Transition can be taken if labeled with current symbol
    ε-transition can be taken at any time
  Accept when final state reached and no more input
    Slightly different in a scanner, where the FSA is used as a subroutine to find the longest input string that matches a token RE.
  Reject if no transition possible, or no more input and not in final state (DFA)
  [Figure: example automaton with states numbered 1-4]
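
The accept and reject rules above can be sketched as a short table-driven loop for the deterministic case. The representation here (a dictionary keyed by state and symbol, integer state numbers, the function name run_dfa) is an assumption made for illustration; the sample machine recognizes only the word pig.

```python
def run_dfa(transitions, start, finals, text):
    """Simulate a DFA without ε-transitions on the whole input string."""
    state = start
    for ch in text:
        key = (state, ch)
        if key not in transitions:
            return False          # reject: no transition possible
        state = transitions[key]
    return state in finals        # accept only if input ends in a final state

# Hypothetical state numbering: 0 -p-> 1 -i-> 2 -g-> 3, with 3 final.
PIG = {(0, "p"): 1, (1, "i"): 2, (2, "g"): 3}

print(run_dfa(PIG, 0, {3}, "pig"))  # True
print(run_dfa(PIG, 0, {3}, "pit"))  # False: no transition on t
print(run_dfa(PIG, 0, {3}, "pi"))   # False: out of input, not in a final state
```

A scanner would differ from this sketch in that it does not require the input to end; it remembers the last final state seen and returns the longest match.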

Example: FSA for pig
  [Figure: automaton with transitions labeled p, i, g]

Example: FSA for pig
  Input 1: pig
  Status: Executing

Example: FSA for pig
  Input 1: pig
  Status: Accept! (In final state, and no more input.)

Example: FSA for pig
  Input 2: pit
  Status: Executing

Example: FSA for pig
  Input 2: pit
  Status: Reject! (No legal transitions on t.)

DFA vs NFA
  Deterministic Finite Automata (DFA)
    No choice of which transition to take
  Non-deterministic Finite Automata (NFA)
    Choice of transition in at least one case
    ε-transitions (arcs): If the current state has any outgoing ε arcs, we can follow any of them without consuming any input
    Accept if there is some way to reach a final state on the given input
    Reject if there is no possible way to reach a final state
  Modeling choice
    Option 1: guess a path, backtrack if it rejects
    Option 2: clone at each choice point, accept if any clone accepts
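
Option 2 (cloning) can be modeled by tracking the set of states that any clone could currently occupy. The sketch below assumes a dictionary representation in which the key (state, symbol) maps to a set of next states and None stands for ε; the small NFA at the end is a hypothetical machine invented for this example, not one from the slides.

```python
EPS = None  # label used for ε-transitions in this sketch

def eps_closure(nfa, states):
    """All states reachable from `states` via zero or more ε-transitions."""
    stack = list(states)
    closure = set(states)
    while stack:
        q = stack.pop()
        for nxt in nfa.get((q, EPS), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure

def nfa_accepts(nfa, start, finals, text):
    """Accept if at least one clone ends in a final state with no input left."""
    current = eps_closure(nfa, {start})
    for ch in text:
        moved = set()
        for q in current:                 # every live clone tries to move on ch
            moved |= nfa.get((q, ch), set())
        current = eps_closure(nfa, moved)
        if not current:                   # every clone has died: reject
            return False
    return bool(current & finals)

# Hypothetical NFA for ab | ac: state 0 has two transitions on a (a real choice).
NFA = {
    (0, "a"): {1, 3},
    (1, "b"): {2},
    (3, "c"): {4},
    (2, EPS): {5},
    (4, EPS): {5},
}

print(nfa_accepts(NFA, 0, {5}, "ab"))  # True
print(nfa_accepts(NFA, 0, {5}, "ac"))  # True
print(nfa_accepts(NFA, 0, {5}, "ad"))  # False
```

Tracking a set of states avoids backtracking entirely, at the cost of doing work proportional to the number of live states for each input character.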

Example NFA
  Input 1: GOSEAHAWKS
  [Figure: NFA diagram; transition labels H A W K / G O S / S E A T T L E]
  Status: Executing

Example NFA
  Input 1: GOSEAHAWKS
  Status: Accept!

FAs in Scanners
  Want DFA for speed (no backtracking or cloning)
  But conversion from regular expressions to NFA is easier
  Luckily, there is a well-defined procedure for converting an NFA to an equivalent DFA

From RE to NFA: base cases
  These correspond to the Fundamental REs shown earlier.
    NFA for symbol a
    NFA for empty string (ε)
    NFA for empty set (∅)

Concatenation: r s
  [Figure: an NFA that accepts the regular expression r, an ε-transition from every final state of the r machine to the start state of the s machine, and an NFA for the RE s]
  The idea: When we find a string that matches the regular expression r, we start trying to match the regular expression s. Since this is an NFA, it's okay if we guess wrong: we will make an ε-transition from every prefix of the input that matches r, and thus check all possible matches.

Union/Combination: r | s
  [Figure: union machine with the r and s sub-machines in parallel]
  The idea: Non-deterministically check if the input matches either r or s. If either sub-machine reaches a final state, jump to the union machine's final state. If the entire input has been consumed at this point (i.e., the entire string matches r or s), the union machine will accept.

Kleene star: r*
  [Figure: the r machine with a start node N1]
  The idea: At the start node (N1), we attempt to match either the empty string (to account for the possibility of zero occurrences of r) or a single match of r. Every time the r machine finds a potential match, it non-deterministically jumps back to N1 and repeats the process. Since this is an NFA, it's okay if we guess the wrong match of r: we'll try all of them.
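
The constructions for symbols, concatenation, union, and Kleene star can be sketched as functions that wire ε-transitions between machine fragments. Everything about the representation below is an assumption of this example: the Builder class, fragments as (start, final) pairs with a single final state, and the helper word. The star case follows the description above, with a new node N1 that can either enter r or skip to a new final state.

```python
from collections import defaultdict

EPS = None  # label used for ε-transitions in this sketch

class Builder:
    """Builds NFA fragments; each fragment is a (start, final) pair of states."""

    def __init__(self):
        self.edges = defaultdict(set)   # (state, label) -> set of next states
        self.count = 0

    def new_state(self):
        self.count += 1
        return self.count

    def symbol(self, ch):
        s, f = self.new_state(), self.new_state()
        self.edges[(s, ch)].add(f)
        return s, f

    def concat(self, r, s):
        self.edges[(r[1], EPS)].add(s[0])        # final of r -> start of s
        return r[0], s[1]

    def union(self, r, s):
        start, final = self.new_state(), self.new_state()
        self.edges[(start, EPS)] |= {r[0], s[0]}  # try r or s
        self.edges[(r[1], EPS)].add(final)
        self.edges[(s[1], EPS)].add(final)
        return start, final

    def star(self, r):
        n1, final = self.new_state(), self.new_state()
        self.edges[(n1, EPS)] |= {r[0], final}    # match r once more, or stop
        self.edges[(r[1], EPS)].add(n1)           # after a match, back to N1
        return n1, final

    def closure(self, states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for nxt in self.edges.get((q, EPS), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def accepts(self, frag, text):
        current = self.closure({frag[0]})
        for ch in text:
            moved = set()
            for q in current:
                moved |= self.edges.get((q, ch), set())
            current = self.closure(moved)
        return frag[1] in current

def word(b, text):
    """Concatenate the single-symbol machines for each character of text."""
    frag = b.symbol(text[0])
    for ch in text[1:]:
        frag = b.concat(frag, b.symbol(ch))
    return frag

b = Builder()
# b(at | ag) | bug
expr = b.union(b.concat(b.symbol("b"), b.union(word(b, "at"), word(b, "ag"))),
               word(b, "bug"))
print([b.accepts(expr, s) for s in ["bat", "bag", "bug", "bit"]])

ab_star = b.star(word(b, "ab"))
print([b.accepts(ab_star, s) for s in ["", "abab", "aba"]])
```

Each fragment is consumed by the operation that uses it, so a fragment should not be reused in two places; a fuller implementation would copy fragments or build them from a parse tree.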

Example
  Draw the NFA for ( ):
  [Figure: NFA diagram, built up over several slides]

Example
  Draw the NFA for ( ):
  (If a state has a single outgoing ε-transition, and no other outgoing transitions, you can merge it into the target.)

Exercise
  Draw the NFA for: b(at | ag) | bug

Exercise
  Draw the NFA for: b(at | ag) | bug
  [Figure: completed NFA for the expression]

From NFA to DFA
  Subset construction: construct a DFA from an NFA. Each DFA state represents a set of NFA states.
  Key idea: The state of the DFA after reading some input is the set of all states that the NFA could have reached after reading the same input.
  Algorithm (example of a fixed-point computation):
    Find the ε-closure (all states reachable via 0 or more ε-transitions) of the start state. Create a DFA state corresponding to this set. Add it to the unvisited list.
    While there exist unvisited DFA states, select one (call it d):
      For each symbol s in the alphabet, determine the NFA states reachable on s from any NFA state in the set corresponding to d.
      Determine the ε-closure of these states.
      Create a transition from d on symbol s to the DFA state corresponding to this closure set.
      If this state is new, add it to the unvisited list.

Convert NFA to DFA: Example
  [Figure: NFA with states numbered 1-7]
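
The algorithm above can be sketched as a worklist loop over sets of NFA states. The NFA used here has seven states with ε-closure {1,2,5} at the start, like the example figure, but its transition symbols a, b, and c are assumptions of this sketch, since the labels in the figure are not reproduced in the text. The function and variable names are likewise hypothetical.

```python
from collections import deque

EPS = None  # label used for ε-transitions in this sketch

def eps_closure(nfa, states):
    """Set of states reachable via zero or more ε-transitions, as a frozenset."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for nxt in nfa.get((q, EPS), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)

def subset_construction(nfa, start, finals, alphabet):
    start_set = eps_closure(nfa, {start})
    dfa = {}                        # (DFA state, symbol) -> DFA state
    seen = {start_set}
    unvisited = deque([start_set])
    while unvisited:
        d = unvisited.popleft()
        for s in alphabet:
            moved = set()
            for q in d:
                moved |= nfa.get((q, s), set())
            if not moved:
                continue            # no transition; the dead state is left implicit
            target = eps_closure(nfa, moved)
            dfa[(d, s)] = target
            if target not in seen:
                seen.add(target)
                unvisited.append(target)
    # A DFA state is final if it contains at least one final NFA state.
    dfa_finals = {d for d in seen if d & finals}
    return start_set, dfa, dfa_finals

# Assumed labels: this NFA accepts ab | c.
NFA = {
    (1, EPS): {2, 5},
    (2, "a"): {3},
    (3, "b"): {4},
    (4, EPS): {7},
    (5, "c"): {6},
    (6, EPS): {7},
}

start_set, dfa, dfa_finals = subset_construction(NFA, 1, {7}, "abc")
for (d, s), target in dfa.items():
    print(sorted(d), s, sorted(target))
```

The loop terminates because there are only finitely many subsets of NFA states and each is added to the unvisited list at most once, which is what makes this a fixed-point computation.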

Convert NFA to DFA: Example
  DFA states so far: {1,2,5}
  {1,2,5} is the ε-closure of the start state.

Convert NFA to DFA: Example
  DFA states so far: {1,2,5}, {3}
  Visit {1,2,5}: a transition on one input symbol leads to {3}. No ε-transitions from 3.

Convert NFA to DFA: Example
  DFA states so far: {1,2,5}, {3}, {6}
  Visit {1,2,5}: a transition on another input symbol leads to {6}.

Convert NFA to DFA: Example
  DFA states so far: {1,2,5}, {3}, {6,7}
  {6,7} is the ε-closure of {6}.

Convert NFA to DFA: Example
  DFA states so far: {1,2,5}, {3}, {6,7}
  Done with {1,2,5}.

Convert NFA to DFA: Example
  DFA states so far: {1,2,5}, {3}, {4,7}, {6,7}
  Visit {3}: Just one transition. Do the ε-closure of the new state. Mark {3} as visited.

Convert NFA to DFA: Example
  DFA states: {1,2,5}, {3}, {4,7}, {6,7}
  The last two states have no transitions, but contain a final state, so mark them as final.

Next Time
  Implementing a scanner
    By hand
    Via automated tools
  Enjoy your weekend
    Go Hawks!