Previously on GLT

Generic Language Technology: Basic technologies
Prof.dr. Mark van den Brand
/ Faculteit Wiskunde en Informatica 9-12-2009

- ASF+SDF
  - syntax descriptions
  - semantic descriptions: type checking, analysis, transformations
- Concepts of programming languages
  - syntactic issues
  - semantic issues
- Meta modeling and model transformations

Basic technologies
- Syntactical analysis
- Parser generators
- Rewrite engines

- Tasks and organization of a lexical analyzer
- Specification of lexical tokens via regular expressions
- Implementation of regular expressions
  - (non-)deterministic finite automata
  - translation of a regular expression to an automaton
[Figure: program text (characters) → lexical analyzer → token → parser (syntactical analysis) → parse tree; the parser asks the lexical analyzer to "get next token"; a symbol table is shown alongside]

Tasks of the lexical analyzer:
- reading the input and production of tokens
- elimination of layout and comments
- keeping track of position information

A regular expression (r.e.) r over an alphabet Σ corresponds to the language L(r)
1. ε is a r.e. and corresponds to {ε}
2. a ∈ Σ is a r.e. and corresponds to {a}
3. Suppose r and s are r.e.s corresponding to the languages L(r) and L(s).
   a. alternative: (r) | (s) is a r.e. for L(r) ∪ L(s)
   b. concatenation: (r)(s) is a r.e. for L(r)L(s)
   c. Kleene closure: (r)* is a r.e. for (L(r))*
   d. brackets: (r) is a r.e. for L(r)
Operators are left-associative and the priorities are * > concatenation > |

A regular definition over alphabet Σ has the form:
  re_1 -> d_1
  re_2 -> d_2
  ...
  re_n -> d_n
where the d_i are different names and each re_i is a r.e. over the alphabet Σ ∪ {d_1, d_2, ..., d_(i-1)}.
Thus, in re_i only names occur which are already defined.
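The rule that a regular definition may only use names defined earlier can be illustrated with a short Python sketch. It is not part of the slides: the `{name}` placeholder syntax, the function name `expand_definitions`, and the use of Python's `re` syntax for the base expressions are assumptions made for the example.

```python
import re

def expand_definitions(definitions):
    """Expand an ordered list of (name, pattern) regular definitions.

    Assumption of this sketch: a pattern refers to an earlier name d_j
    by writing {d_j}; braces are reserved for that purpose.
    """
    expanded = {}
    for name, pattern in definitions:
        def substitute(match, current=name):
            ref = match.group(1)
            if ref not in expanded:
                # re_i may only use d_1 .. d_(i-1)
                raise ValueError(f"{current!r} uses {ref!r} before it is defined")
            return "(?:" + expanded[ref] + ")"
        expanded[name] = re.sub(r"\{(\w+)\}", substitute, pattern)
    return expanded

# Illustrative definitions: letter, digit, and identifiers built from them.
DEFS = expand_definitions([
    ("letter", "[A-Za-z]"),
    ("digit", "[0-9]"),
    ("id", "{letter}({letter}|{digit})*"),
])
```

With these definitions, `re.fullmatch(DEFS["id"], "x42")` matches, while a definition list that mentions `{id}` before `id` is defined is rejected with a `ValueError`.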
A regular expression can be compiled into a finite automaton (FA = finite automaton) which is a recognizer for the corresponding regular language.
A finite automaton is non-deterministic if several different transitions are possible for one input symbol in a state (NFA).
Otherwise the finite automaton is deterministic (DFA).

There are two possible ways of transforming a r.e. into a deterministic finite automaton:
1. r.e. → NFA → DFA
2. r.e. → DFA
The generated DFA has to be optimized.

A non-deterministic finite automaton consists of:
1. A set of states S
2. An input alphabet Σ
3. A transition function which takes a state/symbol pair and yields a set of new states
4. The start state s_0 ∈ S
5. A set F ⊆ S of accepting states

Example:
  S = {0, 1, 2, 3}
  Σ = {a, b}
  s_0 = 0
  F = {3}
This automaton accepts: (a|b)*abb

Transition function:
  State   a       b
  0       {0,1}   {0}
  1               {2}
  2               {3}
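The example automaton can be simulated by tracking the set of states it may be in after each input symbol. The following Python sketch assumes the two input symbols are written a and b; the names `TRANSITIONS` and `nfa_accepts` are illustrative and do not come from the slides.

```python
# Transition function of the example NFA: (state, symbol) -> set of next states.
# Missing entries mean "no transition".
TRANSITIONS = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPTING = {3}

def nfa_accepts(word):
    """Return True if some path labelled `word` leads from START to an accepting state."""
    current = {START}
    for symbol in word:
        # Follow every possible transition from every state we might be in.
        current = {t for s in current for t in TRANSITIONS.get((s, symbol), set())}
    return bool(current & ACCEPTING)
```

For instance, `nfa_accepts("aabb")` holds, whereas `nfa_accepts("abba")` does not, because the word must end in abb.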
Transitions may also be labelled with ε
[Figure: transition diagram of an NFA with states 0, 1, 2, 3, 4 that uses ε-transitions]

An NFA accepts a string x iff there exists a path in the transition diagram from the start state to a final state such that the concatenation of the labels on the path equals x.
For example, aaa is accepted by the previous NFA:
  Path:   0 1 2 2 2
  Labels: ε a a a = aaa
The regular expression is: aa* | bb*

Regular expression → NFA
Input: regular expression r over alphabet Σ
Output: NFA N which accepts L(r)
1. r = ε [Figure: NFA for ε]
2. r = a where a ∈ Σ [Figure: NFA for a]
3. Suppose N(s) and N(t) are NFAs for the r.e.s s and t.
   a. r = s | t [Figure: N(s) and N(t) combined in parallel]
   b. r = st [Figure: N(s) followed by N(t)]
      This operation is only possible if the final state of N(s) has no outgoing transitions and the start state of N(t) has no incoming transitions.
   c. r = s* [Figure: N(s) with ε-transitions added for repetition]
   d. r = (s), then N(r) = N(s)

Conversion NFA → DFA
- Regular expressions can be transformed into NFAs
- DFAs can be simulated/implemented efficiently
- Transformation of an NFA into a DFA:
  - construct a DFA where each state represents a subset of the states of the NFA
  - after reading the input a_1 a_2 ... a_n the NFA is in a set of states T, which corresponds to one state of the DFA

Auxiliary functions:
- ε-closure(s) yields the set of NFA states reachable from state s in the NFA via ε-transitions only
- ε-closure(T) yields the set of NFA states reachable from some state s in T via ε-transitions only
- move(T, a) yields the set of NFA states reachable from some state s in T via input a

Initially, ε-closure(s_0) is the only state in Dstates and it is unmarked

  while there is an unmarked state T in Dstates do
    mark T;
    for each input symbol a do
      U := ε-closure(move(T, a))
      if U is not in Dstates then
        add U as an unmarked state to Dstates;
      end
      Dtrans[T, a] := U;
    end
  end
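The marking loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the slides' own code: ε-transitions are stored under the label `None`, DFA states are `frozenset`s of NFA states, and the small test automaton (intended to accept aa* | bb*) is an assumed transition table written for this example.

```python
EPSILON = None  # label used for ε-transitions (a convention of this sketch)

def epsilon_closure(nfa, states):
    """All NFA states reachable from `states` via ε-transitions only."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in nfa.get((state, EPSILON), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)

def move(nfa, states, symbol):
    """All NFA states reachable from some state in `states` via `symbol`."""
    return {t for s in states for t in nfa.get((s, symbol), ())}

def subset_construction(nfa, start, alphabet):
    """Return (DFA start state, list of DFA states, DFA transition table)."""
    start_state = epsilon_closure(nfa, {start})
    dstates = [start_state]    # all DFA states found so far
    unmarked = [start_state]   # DFA states whose transitions are not yet computed
    dtrans = {}
    while unmarked:
        T = unmarked.pop()     # "mark T"
        for a in alphabet:
            U = epsilon_closure(nfa, move(nfa, T, a))
            if U not in dstates:
                dstates.append(U)
                unmarked.append(U)
            dtrans[(T, a)] = U
    return start_state, dstates, dtrans

# Assumed example NFA for aa* | bb*: 0 -ε-> 1 -a-> 2 (loop on a), 0 -ε-> 3 -b-> 4 (loop on b).
EXAMPLE_NFA = {
    (0, EPSILON): {1, 3},
    (1, "a"): {2}, (2, "a"): {2},
    (3, "b"): {4}, (4, "b"): {4},
}
```

As in the pseudocode, the empty set is added to Dstates like any other set; it acts as a dead state from which no accepting state can be reached.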
Subset construction (Rabin & Scott)

NFA = (Q, V, γ, q_0, F)
Equivalent DFA = (P(Q), V, δ, {q_0}, F')
  δ ∈ P(Q) × V → P(Q)
  δ(qq, a) = (∪ q : q ∈ qq : γ(q, a))
  F' = {qq ∈ P(Q) | qq ∩ F ≠ ∅}
δ({q_0}, w) = set of all states in which the original NFA can be after processing string w

NFA N for (a|b)*abb
[Figure: transition diagram of N with states 0 to 10]

A = {0, 1, 2, 4, 7}          (= ε-closure(0))
B = {1, 2, 3, 4, 6, 7, 8}    (= ε-closure(move(A, a)))
C = {1, 2, 4, 5, 6, 7}       (= ε-closure(move(A, b)))
D = {1, 2, 4, 5, 6, 7, 9}    (= ε-closure(move(B, b)))
E = {1, 2, 4, 5, 6, 7, 10}   (= ε-closure(move(D, b)))

Resulting DFA
[Figure: transition diagram of the DFA with states A, B, C, D, E]

Transition table:
  state   a   b
  A       B   C
  B       B   D
  C       B   C
  D       B   E
  E       B   C
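Once the transition table is available, recognizing a string takes one table lookup per input symbol. A minimal Python sketch, assuming the input symbols are a and b and that E is the only accepting DFA state (it is the only subset containing NFA state 10, taken here to be the NFA's final state); `DTRAN` and `dfa_accepts` are illustrative names.

```python
# DFA transition table from the subset construction: state -> symbol -> next state.
DTRAN = {
    "A": {"a": "B", "b": "C"},
    "B": {"a": "B", "b": "D"},
    "C": {"a": "B", "b": "C"},
    "D": {"a": "B", "b": "E"},
    "E": {"a": "B", "b": "C"},
}
START = "A"
ACCEPTING = {"E"}  # assumption: E contains the NFA's final state 10

def dfa_accepts(word):
    """Run the DFA over `word`; exactly one state is active at any time."""
    state = START
    for symbol in word:
        state = DTRAN[state][symbol]  # raises KeyError for symbols outside {a, b}
    return state in ACCEPTING
```

Unlike the NFA, no set of current states has to be maintained, which is why the DFA is the form that is simulated in practice.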
LEX is a scanner generator which transforms regular expressions into a finite automaton:
- r.e. → NFA
    re_0 {action_0}
    re_1 {action_1}
    ...
    re_k {action_k}
  [Figure: a start state leading to the NFAs for re_0 ... re_k, with final states f_0 ... f_k]
  F = {f_0, ..., f_k}
- NFA → DFA
  accepting states have the form {..., f_a, ..., f_b, ..., f_c, ...} with corresponding action: action_min(a,b,c)

Resolution of ambiguities
- Longest match is preferred
- If two alternatives recognize the same sequence of characters, the alternative occurring first in the specification is chosen

  BEGIN                     [sym := beginsym]
  IF                        [sym := ifsym]
  letter.(letter | digit)*  [sym := idsym]
  digit.(digit)*            [sym := intrepsym]
  :=                        [sym := becomessym]

Syntactical analysis
- Context-free grammars
  - Derivations
  - Parse trees
  - Left-recursive grammars
- Top-down parsing
  - non-recursive predictive parsers
  - construction of parse tables
- Bottom-up parsing
  - shift/reduce parsers
  - LR parsers
  - GLR parsers
  - SGLR parsers

A context-free grammar is a 4-tuple G = (N, Σ, P, S)
1. N is a set of non-terminals
2. Σ is a set of terminals (disjoint from N)
3. P is a subset of (N ∪ Σ)* × N
   An element (α, A) ∈ P is called a production: A ::= α or α → A
4. S ∈ N is the start symbol
The sets N, Σ, P are finite
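The 4-tuple definition maps directly onto a small data structure. The Python sketch below is only an illustration: the class `Grammar`, the function `is_well_formed`, and the example grammar `G` (productions a S b → S and ε → S) are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    productions: frozenset   # P: pairs (alpha, A) with alpha a tuple of symbols
    start: str               # S

def is_well_formed(g):
    """Check the side conditions of the definition: N and Σ disjoint,
    S ∈ N, and every production in (N ∪ Σ)* × N."""
    symbols = g.nonterminals | g.terminals
    return (
        not (g.nonterminals & g.terminals)
        and g.start in g.nonterminals
        and all(
            head in g.nonterminals and all(sym in symbols for sym in alpha)
            for alpha, head in g.productions
        )
    )

# Hypothetical example grammar: a S b → S and ε → S (the empty tuple stands for ε).
G = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=frozenset({(("a", "S", "b"), "S"), ((), "S")}),
    start="S",
)
```

Storing each production as the pair (α, A) follows the order used in the definition, where the right-hand side α comes first.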
A context-free grammar can be considered as a simple rewrite system:
  βAγ ⇒ βαγ  if  A ::= α ∈ P   (α, β, γ ∈ (N ∪ Σ)*, A ∈ N)

Example
  N = {E}, Σ = {+, *, (, ), -, a}, S = E,
  P = { E + E → E
        E * E → E
        ( E ) → E
        - E → E
        a → E }
Derivation: E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)

The language L(G) generated by the context-free grammar G = (N, Σ, P, S) is:
  L(G) = {w ∈ Σ* | S ⇒+ w}
A sentence w ∈ L(G) contains only terminals.
A sentential form α is a string of terminals and non-terminals which can be derived from S:
  S ⇒* α with α ∈ (N ∪ Σ)*
A sentence in L(G) is a sentential form in which no non-terminals occur.

Left/right derivations
There are choices to be made for each derivation step:
- which non-terminal must be replaced?
- which alternative of the selected non-terminal must be applied?
Always selecting the leftmost non-terminal in the sentential form gives a leftmost derivation: ⇒lm
There also exists a rightmost derivation: ⇒rm

Consider the context-free grammar for expressions:
Leftmost derivation for -(a+a):
  E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)
Rightmost derivation for -(a+a):
  E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+a) ⇒ -(a+a)

A parse tree for a context-free grammar G = (N, Σ, P, S) is a tree:
1. The root is labeled with S (the start non-terminal)
2. Each leaf is labeled with a terminal (∈ Σ) or ε
3. All other nodes are labeled with a non-terminal
If A is the label of a node and X_1, ..., X_n are the labels of its children (from left to right), then X_1 ... X_n → A must be a production rule in G (with each X_i either a terminal or a non-terminal).
Special case: ε → A, a node with label A which has exactly one child with label ε.
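The leftmost and rightmost derivations of the expression grammar can be replayed mechanically. The Python sketch below assumes the grammar's single operand terminal is written a; the names `rewrite` and `derive` are illustrative, and sentential forms are represented as tuples of one-character symbols.

```python
NONTERMINALS = {"E"}
# Productions as pairs (alpha, A), assuming the operand terminal is "a".
PRODUCTIONS = {
    (("E", "+", "E"), "E"),
    (("E", "*", "E"), "E"),
    (("(", "E", ")"), "E"),
    (("-", "E"), "E"),
    (("a",), "E"),
}

def rewrite(form, alpha, rightmost=False):
    """One derivation step: replace the leftmost (or rightmost) non-terminal by alpha."""
    positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
    if not positions:
        raise ValueError("no non-terminal left to replace")
    i = positions[-1] if rightmost else positions[0]
    if (alpha, form[i]) not in PRODUCTIONS:
        raise ValueError("not a production of the grammar")
    return form[:i] + alpha + form[i + 1:]

def derive(bodies, rightmost=False):
    """Apply the chosen alternatives in order and return all sentential forms."""
    forms = [("E",)]
    for alpha in bodies:
        forms.append(rewrite(forms[-1], alpha, rightmost))
    return ["".join(form) for form in forms]

# The alternatives chosen in the derivation of -(a+a).
STEPS = [("-", "E"), ("(", "E", ")"), ("E", "+", "E"), ("a",), ("a",)]
```

Calling `derive(STEPS)` and `derive(STEPS, rightmost=True)` yields two sequences of sentential forms that differ only in the fifth form, `-(a+E)` versus `-(E+a)`.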
Example: E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)
[Figure: the parse tree for -(a+a) growing step by step along this derivation]
The parse tree abstracts from the derivation order.

Acceptor and parser
For each grammar G there exists a decision procedure (acceptor) AG for L(G):
  AG: STRING → {true, false}
such that AG(w) = true ⇔ w ∈ L(G)
A parser is an acceptor which constructs a parse tree as well.
- A top-down parser constructs the tree starting from the root
- A bottom-up parser constructs the tree starting from the leaves
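To make the notion of an acceptor concrete, here is a small Python sketch of AG for the expression grammar. It is not one of the top-down or bottom-up techniques named in the outline: it is a memoized exhaustive search over substrings, chosen only because it is short. It assumes the operand terminal is written a, that every token is a single character, and that the grammar has no ε-productions and no unit productions, which guarantees termination for this grammar.

```python
from functools import lru_cache

# Alternatives per non-terminal, assuming the operand terminal is "a".
PRODUCTIONS = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"), ("-", "E"), ("a",)],
}

def accepts(word, start="E"):
    """Acceptor sketch: True iff `word` is in L(G) for the grammar above."""

    @lru_cache(maxsize=None)
    def derives(symbols, i, j):
        """True if the symbol sequence `symbols` derives word[i:j]."""
        if not symbols:
            return i == j
        first, rest = symbols[0], symbols[1:]
        if first not in PRODUCTIONS:  # terminal: must match the next character
            return i < j and word[i] == first and derives(rest, i + 1, j)
        # Non-terminal: try every split point. Each remaining symbol needs at
        # least one character because there are no ε-productions.
        for k in range(i + 1, j - len(rest) + 1):
            if any(derives(alpha, i, k) for alpha in PRODUCTIONS[first]) and derives(rest, k, j):
                return True
        return False

    return derives((start,), 0, len(word))
```

For example, `accepts("-(a+a)")` is true and `accepts("a+")` is false. The function only answers yes or no; a parser would additionally record which alternative and split point succeeded in order to build the parse tree.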