Syntax analysis. Context-free syntax is specified with a context-free grammar.


The role of the parser

source code -> scanner -> tokens -> parser -> intermediate representation (errors reported along the way)

The parser:
- performs context-free syntax analysis
- guides context-sensitive analysis
- constructs an intermediate representation
- produces meaningful error messages
- attempts error correction

For the next few weeks we will look at parser construction.

Copyright c by Antony Hosking. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee, provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from hosking@cs.purdue.edu.

Syntax analysis

Context-free syntax is specified with a context-free grammar. Formally, a CFG G is a tuple (Vt, Vn, S, P) where:
- Vt is the set of terminal symbols in the grammar. For our purposes, Vt is the set of tokens returned by the scanner.
- Vn, the nonterminals, is a set of syntactic variables that denote sets of (sub)strings occurring in the language. These are used to impose a structure on the grammar.
- S is a distinguished nonterminal (S in Vn) denoting the entire set of strings in L(G). This is sometimes called a goal symbol.
- P is a finite set of productions specifying how terminals and nonterminals can be combined to form strings in the language. Each production must have a single nonterminal on its left-hand side.
The set V = Vt ∪ Vn is called the vocabulary of G.

Notation and terminology

a, b, c, ... ∈ Vt; A, B, C, ... ∈ Vn; U, V, W, ... ∈ V; α, β, γ, ... ∈ V*; u, v, w, ... ∈ Vt*.
If A → γ is a production, then αAβ ⇒ αγβ is a single-step derivation using A → γ. Similarly, ⇒* and ⇒+ denote derivations of zero or more and one or more steps. If S ⇒* β, then β is said to be a sentential form of G.
L(G) = { w ∈ Vt* | S ⇒+ w }; a string w ∈ L(G) is called a sentence of G.
Note: L(G) = { β ∈ V* | S ⇒* β } ∩ Vt*.
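To make the tuple (Vt, Vn, S, P) concrete, a grammar can be written down directly as data. The sketch below (in Python, a representation chosen for these notes' examples rather than anything prescribed by the slides) encodes the four components for the expression grammar used throughout, with symbols as plain strings and each production as a left-hand side paired with a right-hand-side tuple.

```python
# An illustrative encoding of a CFG G = (Vt, Vn, S, P).
# Symbols are plain strings; a production is (lhs, rhs) with rhs a tuple of symbols.

Vt = {"+", "-", "*", "/", "num", "id"}      # terminals: the tokens returned by the scanner
Vn = {"goal", "expr", "term", "factor"}     # nonterminals (syntactic variables)
S  = "goal"                                 # the distinguished goal symbol
P  = [                                      # productions
    ("goal",   ("expr",)),
    ("expr",   ("expr", "+", "term")),
    ("expr",   ("expr", "-", "term")),
    ("expr",   ("term",)),
    ("term",   ("term", "*", "factor")),
    ("term",   ("term", "/", "factor")),
    ("term",   ("factor",)),
    ("factor", ("num",)),
    ("factor", ("id",)),
]

V = Vt | Vn     # the vocabulary of G

# Well-formedness: every production has a single nonterminal on its LHS
# and a RHS drawn from the vocabulary.
assert S in Vn
assert all(lhs in Vn and all(sym in V for sym in rhs) for lhs, rhs in P)
```

Later sketches use the same idea, usually as a dict mapping each nonterminal to a list of right-hand-side tuples, with the empty tuple standing for an ε-production.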

Backus-Naur Form for grammars

Example: simple expressions over numbers and identifiers:

  <goal> ::= <expr>
  <expr> ::= <expr> <op> <expr> | num | id
  <op>   ::= + | - | * | /

In a BNF for a grammar, we represent:
- nonterminals with angle brackets or capital letters
- terminals with typewriter font or underlining
- productions as in the example

Derivations

We can view the productions of a CFG as rewriting rules. Using our example CFG:

  <goal> ⇒ <expr>
         ⇒ <expr> <op> <expr>
         ⇒ <id,x> <op> <expr>
         ⇒ <id,x> + <expr>
         ⇒ <id,x> + <expr> <op> <expr>
         ⇒ <id,x> + <num,2> <op> <expr>
         ⇒ <id,x> + <num,2> * <expr>
         ⇒ <id,x> + <num,2> * <id,y>

We have derived the sentence x + 2 * y. We denote this: <goal> ⇒* id + num * id. Such a sequence of rewrites is a derivation, or a parse. The process of discovering a derivation is called parsing.

Scanning vs. parsing

Where do we draw the line?
- Regular expressions are used to classify identifiers, numbers and keywords. REs are more concise and easier to understand for tokens than a grammar, and more efficient scanners (DFAs) can be built from REs than from arbitrary grammars.
- Context-free grammars are used to count brackets (nested constructs) and to impart structure (e.g., expressions).
- Syntactic analysis is complicated enough: the grammar for a real programming language runs to a large number of productions. Factoring out lexical analysis as a separate phase makes the compiler more manageable.

Derivations

At each step, we chose a nonterminal to replace. This choice can lead to different derivations. Two are of particular interest:
- leftmost derivation: the leftmost nonterminal is replaced at each step
- rightmost derivation: the rightmost nonterminal is replaced at each step
The previous example was a leftmost derivation.

Rightmost derivation

For the string x + 2 * y:

  <goal> ⇒ <expr>
         ⇒ <expr> <op> <expr>
         ⇒ <expr> <op> <id,y>
         ⇒ <expr> * <id,y>
         ⇒ <expr> <op> <expr> * <id,y>
         ⇒ <expr> <op> <num,2> * <id,y>
         ⇒ <expr> + <num,2> * <id,y>
         ⇒ <id,x> + <num,2> * <id,y>

Again, <goal> ⇒* id + num * id.

Precedence

These two derivations point out a problem with the grammar: it has no notion of precedence, or implied order of evaluation. A treewalk evaluation of the tree built by the rightmost derivation computes the wrong answer, (x + 2) * y; it should be x + (2 * y).

To add precedence takes additional machinery:

  <goal>   ::= <expr>
  <expr>   ::= <expr> + <term> | <expr> - <term> | <term>
  <term>   ::= <term> * <factor> | <term> / <factor> | <factor>
  <factor> ::= num | id

This grammar enforces a precedence on the derivation: terms must derive from expressions, forcing the correct tree.

Precedence

With the new grammar, a rightmost derivation for x + 2 * y:

  <goal> ⇒ <expr>
         ⇒ <expr> + <term>
         ⇒ <expr> + <term> * <factor>
         ⇒ <expr> + <term> * <id,y>
         ⇒ <expr> + <factor> * <id,y>
         ⇒ <expr> + <num,2> * <id,y>
         ⇒ <term> + <num,2> * <id,y>
         ⇒ <factor> + <num,2> * <id,y>
         ⇒ <id,x> + <num,2> * <id,y>

Again, <goal> ⇒* id + num * id, but this time we build the desired tree: a treewalk evaluation now computes x + (2 * y).

Ambiguity

If a grammar has more than one derivation for a single sentential form, then it is ambiguous. Example:

  <stmt> ::= if <expr> then <stmt>
          |  if <expr> then <stmt> else <stmt>
          |  other

Consider deriving a sentential form of the shape if ... then if ... then ... else ...: it has two derivations, because the else can attach to either if. This ambiguity is purely grammatical; it is a context-free ambiguity.

Ambiguity

We may be able to eliminate ambiguities by rearranging the grammar:

  <stmt>      ::= <matched> | <unmatched>
  <matched>   ::= if <expr> then <matched> else <matched> | other
  <unmatched> ::= if <expr> then <stmt>
               |  if <expr> then <matched> else <unmatched>

This generates the same language as the ambiguous grammar, but applies the common-sense rule: match each else with the closest unmatched then. This is most likely the language designer's intent.

Ambiguity

Ambiguity is often due to confusion in the context-free specification. Context-sensitive confusions can arise from overloading. In many Algol-like languages, for example, the same notation can denote either a function call or a subscripted (array) variable. Disambiguating such a statement requires context: we need the values of declarations; it isn't context-free; it is really an issue of type. Rather than complicate parsing, we will handle this separately.

Parsing: the big picture

grammar -> parser generator -> parser (tables and code); source code -> parser -> intermediate representation, with errors reported along the way. Our goal is a flexible parser generator system.

Top-down versus bottom-up

Top-down parsers:
- start at the root of the derivation tree and fill in
- pick a production and try to match the input
- may require backtracking
- some grammars are backtrack-free (predictive)

Bottom-up parsers:
- start at the leaves and fill in
- start in a state valid for legal first tokens
- as input is consumed, change state to encode possibilities (recognize valid prefixes)
- use a stack to store both state and sentential forms

Top-down parsing

A top-down parser starts with the root of the parse tree, labelled with the start or goal symbol of the grammar. To build a parse, it repeats the following steps until the fringe of the parse tree matches the input string:
1. At a node labelled A, select a production A ::= α and construct the appropriate child for each symbol of α.
2. When a terminal is added to the fringe that doesn't match the input string, backtrack.
3. Find the next node to be expanded (it must have a label in Vn).
The key is selecting the right production in step 1: that choice should be guided by the input string.

Simple expression grammar

Recall our grammar for simple expressions:

  1  <goal>   ::= <expr>
  2  <expr>   ::= <expr> + <term>
  3            |  <expr> - <term>
  4            |  <term>
  5  <term>   ::= <term> * <factor>
  6            |  <term> / <factor>
  7            |  <factor>
  8  <factor> ::= num
  9            |  id

Consider an input string such as x - 2 * y.

Example

A top-down parse expands the goal symbol and matches the fringe against the input, recording at each step the production applied, the current sentential form, and how much of the input has been consumed (the "Prod'n / Sentential form / Input" trace on the original slide). With this grammar the parser must guess among the alternatives for <expr>; a wrong guess is only discovered later and forces backtracking.

Example

Another possible parse for the same input: if the parser makes the wrong choices, the expansion doesn't terminate. For instance, repeatedly expanding <expr> by production 2 grows the fringe forever without consuming any input. This isn't a good property for a parser to have. Parsers should terminate!

Left-recursion

Top-down parsers cannot handle left-recursion in a grammar. Formally, a grammar is left-recursive if there exists A ∈ Vn such that A ⇒+ Aα for some string α. Our simple expression grammar is left-recursive.

Eliminating left-recursion

To remove left-recursion, we can transform the grammar. Consider the grammar fragment

  <foo> ::= <foo> α | β

where α and β do not start with <foo>. We can rewrite this as

  <foo> ::= β <bar>
  <bar> ::= α <bar> | ε

where <bar> is a new nonterminal. This fragment contains no left-recursion. (A code sketch of this transformation follows below.)

Example

Our expression grammar contains two cases of left-recursion. Applying the transformation yields:

  <expr>  ::= <term> <expr'>
  <expr'> ::= + <term> <expr'> | - <term> <expr'> | ε
  <term>  ::= <factor> <term'>
  <term'> ::= * <factor> <term'> | / <factor> <term'> | ε

With this new grammar, a top-down parser will terminate, though it may still backtrack on some inputs.

Example

This cleaner grammar defines the same language:

  <goal>   ::= <expr>
  <expr>   ::= <term> + <expr> | <term> - <expr> | <term>
  <term>   ::= <factor> * <term> | <factor> / <term> | <factor>
  <factor> ::= num | id

It is right-recursive and free of ε-productions. Unfortunately, it generates a different associativity: same syntax, different meaning.
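The rewriting rule above is mechanical, so it is easy to automate for immediate left recursion. The following sketch (an illustration over the dict-of-productions representation mentioned earlier, not code from the slides) turns every nonterminal A with productions A ::= Aα | β into A ::= βA' and A' ::= αA' | ε.

```python
# Eliminate immediate left recursion.  A grammar is a dict mapping each
# nonterminal to a list of RHS tuples; the empty tuple () denotes epsilon.
#
#   A -> A a1 | ... | A am | b1 | ... | bn
# becomes
#   A  -> b1 A' | ... | bn A'
#   A' -> a1 A' | ... | am A' | epsilon

def eliminate_left_recursion(grammar):
    new_grammar = {}
    for A, alts in grammar.items():
        recursive = [rhs[1:] for rhs in alts if rhs and rhs[0] == A]    # the alpha parts
        other     = [rhs for rhs in alts if not rhs or rhs[0] != A]     # the beta parts
        if not recursive:
            new_grammar[A] = list(alts)
            continue
        A_prime = A + "'"                          # a fresh nonterminal
        new_grammar[A] = [beta + (A_prime,) for beta in other]
        new_grammar[A_prime] = [alpha + (A_prime,) for alpha in recursive] + [()]
    return new_grammar

expr_grammar = {
    "expr":   [("expr", "+", "term"), ("expr", "-", "term"), ("term",)],
    "term":   [("term", "*", "factor"), ("term", "/", "factor"), ("factor",)],
    "factor": [("num",), ("id",)],
}
# Produces expr ::= term expr', expr' ::= + term expr' | - term expr' | epsilon, etc.
print(eliminate_left_recursion(expr_grammar))
```

Indirect left recursion (A ⇒+ Aα through several nonterminals) needs the more general ordering-based algorithm; this sketch handles only the immediate case discussed above.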

How much lookahead is needed?

We saw that top-down parsers may need to backtrack when they select the wrong production. Do we need arbitrary lookahead to parse CFGs? In general, yes: use the Earley or Cocke-Younger-Kasami algorithms. Fortunately, large subclasses of CFGs can be parsed with limited lookahead, and most programming language constructs can be expressed in a grammar that falls in these subclasses. Among the interesting subclasses are:
- LL(1): left-to-right scan, leftmost derivation, 1-token lookahead
- LR(1): left-to-right scan, rightmost derivation, 1-token lookahead

Predictive parsing

Basic idea: for any two productions A ::= α and A ::= β, we would like a distinct way of choosing the correct production to expand. For some RHS α of G, define FIRST(α) as the set of tokens that appear first in some string derived from α. That is, for some w ∈ Vt*, w ∈ FIRST(α) iff α ⇒* wγ.

Key property: whenever two productions A ::= α and A ::= β both appear in the grammar, we would like FIRST(α) ∩ FIRST(β) = ∅. This would allow the parser to make a correct choice with a lookahead of only one symbol.

Left factoring

What if a grammar does not have this property? Sometimes we can transform a grammar to have this property. For each nonterminal A:
1. find the longest prefix α common to two or more of its alternatives;
2. if α ≠ ε, then replace all of the productions A ::= αβ1 | αβ2 | ... | αβn with A ::= α A' and A' ::= β1 | β2 | ... | βn, where A' is a new nonterminal.
Repeat until no two alternatives for a single nonterminal have a common prefix. (A code sketch of one factoring step follows below.)

Example

Our long-suffering expression grammar. Recall, we eliminated the left-recursion, obtaining:

  <goal>   ::= <expr>
  <expr>   ::= <term> <expr'>
  <expr'>  ::= + <term> <expr'> | - <term> <expr'> | ε
  <term>   ::= <factor> <term'>
  <term'>  ::= * <factor> <term'> | / <factor> <term'> | ε
  <factor> ::= num | id
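Left factoring is just as mechanical. The sketch below performs one factoring step for a single nonterminal: it finds the longest prefix shared by two or more alternatives and hoists the differing tails into a fresh nonterminal. It is illustrative only and would be applied repeatedly, as the rule above says.

```python
def common_prefix(seqs):
    """Longest prefix (tuple of symbols) shared by all the given RHS tuples."""
    prefix = []
    for symbols in zip(*seqs):
        if len(set(symbols)) == 1:
            prefix.append(symbols[0])
        else:
            break
    return tuple(prefix)

def left_factor(A, alts):
    """One left-factoring step for nonterminal A; returns new productions as a dict."""
    best = ()
    for i in range(len(alts)):                    # longest prefix common to some pair
        for j in range(i + 1, len(alts)):
            p = common_prefix([alts[i], alts[j]])
            if len(p) > len(best):
                best = p
    if not best:                                  # nothing to factor
        return {A: list(alts)}
    A_new  = A + "'"                              # a fresh nonterminal
    shared = [rhs for rhs in alts if rhs[:len(best)] == best]
    rest   = [rhs for rhs in alts if rhs[:len(best)] != best]
    return {
        A:     rest + [best + (A_new,)],
        A_new: [rhs[len(best):] for rhs in shared],   # () plays the role of epsilon
    }

# expr ::= term + expr | term - expr | term
# becomes expr ::= term expr' with expr' ::= + expr | - expr | epsilon
print(left_factor("expr", [("term", "+", "expr"), ("term", "-", "expr"), ("term",)]))
```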

Example

Consider the right-recursive and right-associative version of the expression grammar:

  <goal>   ::= <expr>
  <expr>   ::= <term> + <expr> | <term> - <expr> | <term>
  <term>   ::= <factor> * <term> | <factor> / <term> | <factor>
  <factor> ::= num | id

To choose between the productions for <expr>, the parser must see past the num or id and look at the +, -, * or /: the FIRST sets of the alternatives are not disjoint. This grammar fails the test.

Example

Applying the transformation: there are two nonterminals to left factor:

  <expr>  ::= <term> <expr'>
  <expr'> ::= + <expr> | - <expr> | ε
  <term>  ::= <factor> <term'>
  <term'> ::= * <term> | / <term> | ε

Example

Substituting back into the grammar yields:

  <goal>   ::= <expr>
  <expr>   ::= <term> <expr'>
  <expr'>  ::= + <expr> | - <expr> | ε
  <term>   ::= <factor> <term'>
  <term'>  ::= * <term> | / <term> | ε
  <factor> ::= num | id

Now, selection requires only a single token lookahead. Note: this grammar is still right-associative.

Example

The trace of sentential forms against remaining input for an input such as x - 2 * y shows that the next symbol determined each choice correctly: the parse proceeds with no backtracking.

Generality

Question: By left factoring and eliminating left-recursion, can we transform an arbitrary context-free grammar to a form where it can be predictively parsed with a single token lookahead?

Answer: Given a context-free grammar that doesn't meet our conditions, it is undecidable whether an equivalent grammar exists that does meet our conditions. Many context-free languages do not have such a grammar:

  { aⁿ 0 bⁿ | n ≥ 1 } ∪ { aⁿ 1 b²ⁿ | n ≥ 1 }

A predictive parser must look past an arbitrary number of a's to discover the 0 or the 1 and so determine the derivation.

Back to left-recursion elimination

Given a left-factored CFG, to eliminate left-recursion:
1. if there is a production A ::= Aα, then replace all of the productions A ::= Aα | β | γ | ... with A ::= N A', N ::= β | γ | ..., and A' ::= α A' | ε, where N and A' are new nonterminals;
2. repeat until there are no left-recursive productions.

Recursive descent parsing

Now, we can produce a simple recursive descent parser from the (right-associative) grammar: one procedure per nonterminal, with each procedure choosing among its productions based on the next token (a sketch follows below).
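Here is a sketch of what those mutually recursive routines look like for the left-factored, right-recursive grammar, written in Python. The token interface (a list of (kind, lexeme) pairs ending in a '$' marker) is an assumption made for the example, not part of the slides.

```python
# Recursive descent parser for:
#   goal   ::= expr
#   expr   ::= term expr'      expr'  ::= + expr | - expr | epsilon
#   term   ::= factor term'    term'  ::= * term | / term | epsilon
#   factor ::= num | id
# Tokens are (kind, lexeme) pairs; the stream ends with ('$', '$').

class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def eat(self, kind):
        if self.peek() != kind:
            raise ParseError(f"expected {kind}, found {self.peek()}")
        self.pos += 1

    def goal(self):
        self.expr()
        self.eat('$')                    # accept only when all input is consumed

    def expr(self):
        self.term()
        self.expr_prime()

    def expr_prime(self):
        if self.peek() in ('+', '-'):    # expr' ::= + expr | - expr
            self.eat(self.peek())
            self.expr()
        # otherwise expr' ::= epsilon: consume nothing

    def term(self):
        self.factor()
        self.term_prime()

    def term_prime(self):
        if self.peek() in ('*', '/'):    # term' ::= * term | / term
            self.eat(self.peek())
            self.term()
        # otherwise term' ::= epsilon

    def factor(self):
        if self.peek() in ('num', 'id'):
            self.eat(self.peek())
        else:
            raise ParseError(f"expected num or id, found {self.peek()}")

# Parse x + 2 * y, presented as a token stream.
Parser([('id', 'x'), ('+', '+'), ('num', 2), ('*', '*'), ('id', 'y'), ('$', '$')]).goal()
```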

Non-recursive predictive parsing

Observation: our recursive descent parser encodes state information in its run-time stack, or call stack. Using recursive procedure calls to implement a stack abstraction may not be particularly efficient. This suggests other implementation methods:
- an explicit stack in a hand-coded parser
- a stack-based, table-driven parser

Building the tree

One of the key jobs of the parser is to build an intermediate representation of the source code. This is true for both top-down and bottom-up parsers. To build an abstract syntax tree, we can simply insert code at the appropriate points:
- <factor> actions can stack nodes for id and num
- <term'> actions can stack an operator node, then pop, build and push a subtree
- <expr'> actions can stack an operator node, then pop, build and push a subtree
- <expr> actions can pop and return the tree

Table-driven parsers

A parser generator system often looks like:

  grammar -> parser generator -> parsing tables
  source code -> scanner -> tokens -> table-driven parser (stack + parsing tables) -> IR

Rather than writing code, we build tables. Building tables can be automated!

Non-recursive predictive parsing

Now, a predictive parser looks like: a stack and a table-driven parser, consuming tokens from the scanner and consulting the parsing tables. (A sketch of the driver loop follows below.)
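A sketch of the stack-based, table-driven driver that replaces the recursive procedures. The table format, a dict from (nonterminal, lookahead) to the RHS to push, is an assumption chosen for the example; the construction of such a table is taken up on the following pages.

```python
# Table-driven predictive (LL(1)) parser driver.
# table[(A, a)] gives the RHS to expand nonterminal A by on lookahead a;
# a missing entry is a syntax error.  Epsilon is the empty tuple ().

def ll1_parse(tokens, table, nonterminals, start):
    stack = ['$', start]                     # bottom marker, then the start symbol
    tokens = list(tokens) + ['$']
    pos = 0
    while stack:
        X = stack.pop()
        a = tokens[pos]
        if X == '$' or X not in nonterminals:          # terminal or end marker: must match
            if X != a:
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            pos += 1
        else:                                           # nonterminal: consult the table
            rhs = table.get((X, a))
            if rhs is None:
                raise SyntaxError(f"no rule for {X!r} on lookahead {a!r}")
            stack.extend(reversed(rhs))                 # push the RHS right to left
    return True                                         # goal matched and '$' consumed

# Tiny demonstration grammar:  S ::= ( S ) | id
demo_table = {('S', '('): ('(', 'S', ')'), ('S', 'id'): ('id',)}
assert ll1_parse(['(', '(', 'id', ')', ')'], demo_table, {'S'}, 'S')
```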

Non-recursive predictive parsing

What we need now is a parsing table. For our expression grammar, the parse table maps a (nonterminal, lookahead token) pair to the production to expand by; undefined entries are errors, and we use $ to represent EOF.

Input: a string w and a parsing table M for G.

  push $, then the Start Symbol, onto the stack
  token <- next_token()
  repeat
      X <- top of stack
      if X is a terminal or $ then
          if X = token then pop X; token <- next_token()
          else error()
      else    /* X is a nonterminal */
          if M[X, token] = X ::= Y1 Y2 ... Yk then
              pop X; push Yk, Yk-1, ..., Y1
          else error()
  until X = $

FIRST

For a string of grammar symbols α, define FIRST(α) as the set of terminals that begin strings derived from α: FIRST(α) = { a ∈ Vt | α ⇒* aβ }. If α ⇒* ε, then ε ∈ FIRST(α). FIRST(α) contains the tokens valid in the initial position in α.

To build FIRST(X):
1. If X ∈ Vt, then FIRST(X) is {X}.
2. If X ::= ε, then add ε to FIRST(X).
3. If X ::= Y1 Y2 ... Yk:
   (a) put FIRST(Y1) − {ε} in FIRST(X);
   (b) for each i, 2 ≤ i ≤ k: if ε ∈ FIRST(Y1) ∩ ... ∩ FIRST(Yi−1) (i.e., Y1 ... Yi−1 ⇒* ε), then put FIRST(Yi) − {ε} in FIRST(X);
   (c) if ε ∈ FIRST(Y1) ∩ ... ∩ FIRST(Yk), then put ε in FIRST(X).
Repeat until no more additions can be made.

FOLLOW

For a nonterminal A, define FOLLOW(A) as the set of terminals that can appear immediately to the right of A in some sentential form. Thus, a nonterminal's FOLLOW set specifies the tokens that can legally appear after it. A terminal symbol has no FOLLOW set.

To build FOLLOW(A):
1. Put $ in FOLLOW(<goal>).
2. If A ::= αBβ:
   (a) put FIRST(β) − {ε} in FOLLOW(B); and
   (b) if β = ε (i.e., A ::= αB) or ε ∈ FIRST(β) (i.e., β ⇒* ε), then put FOLLOW(A) in FOLLOW(B).
Repeat until no more additions can be made.
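Both constructions are simple fixed-point computations, so they translate directly into code. The sketch below computes FIRST for every grammar symbol and FOLLOW for every nonterminal over the dict-of-productions representation used in the earlier sketches; the spellings 'ε' for the empty string and '$' for end-of-file are choices made here.

```python
EPS, EOF = 'ε', '$'

def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols (step 3 of the FIRST construction)."""
    result = set()
    for X in symbols:
        result |= first[X] - {EPS}
        if EPS not in first[X]:
            break
    else:                        # every symbol derives epsilon (or the string is empty)
        result.add(EPS)
    return result

def compute_first(grammar, terminals):
    """Iterate the FIRST rules to a fixed point; returns FIRST(X) for every symbol."""
    first = {t: {t} for t in terminals}
    first.update({A: set() for A in grammar})
    changed = True
    while changed:
        changed = False
        for A, alts in grammar.items():
            for rhs in alts:                        # () represents an epsilon production
                before = len(first[A])
                first[A] |= first_of_string(rhs, first)
                changed |= len(first[A]) != before
    return first

def compute_follow(grammar, first, start):
    """Iterate the FOLLOW rules to a fixed point; returns FOLLOW(A) for every nonterminal."""
    follow = {A: set() for A in grammar}
    follow[start].add(EOF)                          # rule 1: $ is in FOLLOW(goal)
    changed = True
    while changed:
        changed = False
        for A, alts in grammar.items():
            for rhs in alts:
                for i, B in enumerate(rhs):
                    if B not in grammar:            # only nonterminals have FOLLOW sets
                        continue
                    tail = first_of_string(rhs[i + 1:], first)
                    before = len(follow[B])
                    follow[B] |= tail - {EPS}       # rule 2(a)
                    if EPS in tail:                 # rule 2(b): the tail can vanish
                        follow[B] |= follow[A]
                    changed |= len(follow[B]) != before
    return follow
```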

LL(1) grammars

Previous definition: a grammar G is LL(1) iff for all nonterminals A, each distinct pair of productions A ::= β and A ::= γ satisfies the condition FIRST(β) ∩ FIRST(γ) = ∅.

What if A ⇒* ε?

Revised definition: a grammar G is LL(1) iff for each set of productions A ::= α1 | α2 | ... | αn:
1. FIRST(α1), FIRST(α2), ..., FIRST(αn) are all pairwise disjoint; and
2. if αi ⇒* ε, then FIRST(αj) ∩ FOLLOW(A) = ∅ for all 1 ≤ j ≤ n, j ≠ i.
If G is ε-free, condition 1 is sufficient.

Provable facts about LL(1) grammars

1. No left-recursive grammar is LL(1).
2. No ambiguous grammar is LL(1).
3. Some languages have no LL(1) grammar.
4. An ε-free grammar where each alternative expansion for A begins with a distinct terminal is a simple LL(1) grammar.

Example: S ::= aS | a is not LL(1) because FIRST(aS) = FIRST(a) = {a}; the grammar S ::= aS', S' ::= aS' | ε accepts the same language and is LL(1).

Example

Our long-suffering expression grammar:

  <goal>   ::= <expr>
  <expr>   ::= <term> <expr'>
  <expr'>  ::= + <expr> | - <expr> | ε
  <term>   ::= <factor> <term'>
  <term'>  ::= * <term> | / <term> | ε
  <factor> ::= num | id

  FIRST:  <goal>, <expr>, <term>, <factor>: {num, id};  <expr'>: {+, -, ε};  <term'>: {*, /, ε}
  FOLLOW: <goal>, <expr>, <expr'>: {$};  <term>, <term'>: {+, -, $};  <factor>: {+, -, *, /, $}

LL(1) parse table construction

Input: grammar G. Output: parsing table M.
Method:
1. For all productions A ::= α:
   (a) for each a ∈ FIRST(α), add A ::= α to M[A, a];
   (b) if ε ∈ FIRST(α): (i) for each b ∈ FOLLOW(A), add A ::= α to M[A, b]; (ii) if $ ∈ FOLLOW(A), then add A ::= α to M[A, $].
2. Set each undefined entry of M to error.
If some M[A, a] has multiple entries, then the grammar is not LL(1).
Note: recall that a, b ∈ Vt, so a, b ≠ ε. (A code sketch of this construction follows below.)
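Given FIRST and FOLLOW, the table construction above is only a few lines. The sketch below follows the method directly and also reports multiply-defined entries, which is exactly the LL(1) test; it reuses the helpers and the EPS marker from the previous sketch, and the grammar in the demonstration is the left-factored expression grammar.

```python
def build_ll1_table(grammar, first, follow):
    """Build M[(A, a)] -> RHS; any multiply-defined entry means G is not LL(1)."""
    table, conflicts = {}, []
    for A, alts in grammar.items():
        for rhs in alts:
            f = first_of_string(rhs, first)
            lookaheads = (f - {EPS}) | (follow[A] if EPS in f else set())
            for a in lookaheads:                    # '$' arrives here via FOLLOW(A)
                if (A, a) in table and table[(A, a)] != rhs:
                    conflicts.append((A, a, table[(A, a)], rhs))
                table[(A, a)] = rhs
    return table, conflicts

grammar = {
    'goal':   [('expr',)],
    'expr':   [('term', "expr'")],
    "expr'":  [('+', 'expr'), ('-', 'expr'), ()],
    'term':   [('factor', "term'")],
    "term'":  [('*', 'term'), ('/', 'term'), ()],
    'factor': [('num',), ('id',)],
}
terminals = {'+', '-', '*', '/', 'num', 'id'}
first  = compute_first(grammar, terminals)
follow = compute_follow(grammar, first, 'goal')
table, conflicts = build_ll1_table(grammar, first, follow)
assert not conflicts            # the left-factored expression grammar is LL(1)
```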

A grammar that is not LL(1)

  <stmt> ::= if <expr> then <stmt>
          |  if <expr> then <stmt> else <stmt>
          |  ...

Left-factored:

  <stmt>  ::= if <expr> then <stmt> <stmt'> | ...
  <stmt'> ::= else <stmt> | ε

Now, FIRST(<stmt'>) = {ε, else}. Also, else ∈ FOLLOW(<stmt'>), so FIRST(<stmt'>) ∩ FOLLOW(<stmt'>) ≠ ∅. On seeing else, there is a conflict between choosing <stmt'> ::= else <stmt> and <stmt'> ::= ε: the grammar is not LL(1)! The fix: put priority on <stmt'> ::= else <stmt>, to associate each else with the closest previous then.

Error recovery

Key notion: for each nonterminal, construct a set of terminals on which the parser can synchronize. When an error occurs looking for A, scan until an element of SYNCH(A) is found.

Building SYNCH:
1. a ∈ FOLLOW(A) implies a ∈ SYNCH(A)
2. place keywords that start statements in SYNCH(A)
3. add symbols in FIRST(A) to SYNCH(A)

If we can't match a terminal a on top of the stack:
1. pop the terminal
2. print a message saying the terminal was inserted
3. continue the parse
(i.e., SYNCH(a) = Vt − {a})

Building the tree

Again, we insert code at the right points:
- push a node for the root, labelled with the Start Symbol
- when the top of the stack is a terminal that matches the token, pop its node and fill it in
- when a nonterminal X is expanded by X ::= Y1 ... Yk, pop the node for X, build a node for each child Yi, make each a child of the node for X, and push the child nodes along with their symbols

LL(k) and strong LL(k)

Definition: a grammar G is LL(k) iff whenever

  S ⇒*lm wAα ⇒lm wβα ⇒*lm wx   and   S ⇒*lm wAα ⇒lm wγα ⇒*lm wy

with FIRSTk(x) = FIRSTk(y), then β = γ. That is, knowing w and the next k input tokens accurately predicts the production to use.

Definition: G is strong LL(k) iff for any two distinct productions A ::= β and A ::= γ:

  FIRSTk(β FOLLOWk(A)) ∩ FIRSTk(γ FOLLOWk(A)) = ∅

Strong LL(k) example

Consider the grammar G:

  S ::= aAa | bAba
  A ::= b | ε

It is LL(2), but it is not strong LL(2): on lookahead ba, the right choice for A depends on whether the parse began with a or with b, and strong LL(2) ignores that left context.

Some definitions

Recall: for a grammar G with start symbol S, any string α such that S ⇒* α is called a sentential form. If α ∈ Vt*, then α is called a sentence of L(G); otherwise it is just a sentential form, not a sentence of L(G). A left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence. A right-sentential form is a sentential form that occurs in the rightmost derivation of some sentence.

Bottom-up parsing

Goal: given an input string w and a grammar G, construct a parse tree by starting at the leaves and working to the root.

The parser repeatedly matches a right-sentential form from the language against the tree's upper frontier. At each match, it applies a reduction to build on the frontier:
- each reduction matches an upper frontier of the partially built tree to the RHS of some production
- each reduction adds a node on top of the frontier
The final result is a rightmost derivation, in reverse.

Example

Consider the grammar

  1  S ::= aABe
  2  A ::= Abc
  3  A ::= b
  4  B ::= d

and the input string abbcde:

  abbcde   reduce by 3 (A ::= b)
  aAbcde   reduce by 2 (A ::= Abc)
  aAde     reduce by 4 (B ::= d)
  aABe     reduce by 1 (S ::= aABe)
  S

The trick appears to be scanning the input and finding valid sentential forms.

Handles

What are we trying to find? A substring α of the tree's upper frontier that matches some production A ::= α, where reducing α to A is one step in the reverse of a rightmost derivation. We call such a string a handle.

Formally, a handle of a right-sentential form γ is a production A ::= β and a position in γ where β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ. That is, if S ⇒*rm αAw ⇒rm αβw, then A ::= β in the position following α is a handle of αβw. Because γ is a right-sentential form, the substring to the right of a handle contains only terminal symbols.

The handle A ::= β in the parse tree for αβw: A spans β, with α to its left and the terminal string w to its right (parse-tree figure on the original slide).

Handles

Theorem: if G is unambiguous, then every right-sentential form has a unique handle.

Proof (sketch): G is unambiguous, so the rightmost derivation is unique; hence there is a unique production A ::= β applied to take γi−1 to γi, and a unique position k at which A ::= β is applied; hence there is a unique handle.

Example

The left-recursive expression grammar (original form): the rightmost derivation of an input such as x - 2 * y, listing at each step the production applied and the resulting sentential form; read in reverse, the RHS introduced at each step is the handle pruned by the corresponding reduction.

Handle-pruning

The process of constructing a bottom-up parse is called handle-pruning. To construct a rightmost derivation

  S = γ0 ⇒ γ1 ⇒ γ2 ⇒ ... ⇒ γn = w

we set i to n and apply the following simple algorithm: for i = n down to 1:
1. find the handle Ai ::= βi in γi
2. replace βi with Ai to generate γi−1
This takes 2n steps, where n is the length of the derivation.

Stack implementation

One scheme to implement a handle-pruning, bottom-up parser is called a shift-reduce parser. Shift-reduce parsers use a stack and an input buffer:
1. initialize the stack with $
2. repeat until the top of the stack is the goal symbol and the input token is $:
   (a) find the handle: if we don't have a handle on top of the stack, shift an input symbol onto the stack
   (b) prune the handle: if we have a handle A ::= β on the stack, reduce: (i) pop |β| symbols off the stack; (ii) push A onto the stack

Shift-reduce parsing

Shift-reduce parsers are simple to understand. A shift-reduce parser has just four canonical actions:
1. shift: the next input symbol is shifted onto the top of the stack
2. reduce: the right end of the handle is on top of the stack; locate the left end of the handle within the stack; pop the handle off the stack and push the appropriate nonterminal (the LHS)
3. accept: terminate parsing and signal success
4. error: call an error recovery routine

Key insight: recognize handles with a DFA: the DFA's transitions shift states instead of symbols, and accepting states trigger reductions.

Example

Back to x - 2 * y: the stack / input / action trace shifts operands and operators and reduces each handle as it appears on top of the stack, ending with accept once the goal symbol is on the stack and the input is exhausted.

LR(k) grammars

Informally, we say that a grammar G is LR(k) if, given a rightmost derivation

  S = γ0 ⇒ γ1 ⇒ γ2 ⇒ ... ⇒ γn = w

we can, for each right-sentential form in the derivation:
1. isolate the handle of each right-sentential form, and
2. determine the production by which to reduce,
by scanning γi from left to right, going at most k symbols beyond the right end of the handle of γi.

LR parsing: the skeleton parser

  push s0
  token <- next_token()
  repeat forever
      s <- top of stack
      if ACTION[s, token] = "shift si" then
          push si; token <- next_token()
      else if ACTION[s, token] = "reduce A ::= β" then
          pop |β| states
          s' <- top of stack
          push GOTO[s', A]
      else if ACTION[s, token] = "accept" then
          return
      else error()

This takes k shifts, l reduces, and 1 accept, where k is the length of the input string and l is the length of the reverse rightmost derivation. (A code sketch of this driver follows below.)

Example tables

The grammar (a simple little right-recursive grammar; it is not the same grammar as in previous lectures) and its ACTION/GOTO tables: each ACTION entry is a shift (s), a reduce (r), accept (acc), or an error; GOTO entries give the state entered after a reduction.

Example using the tables

The stack / input / action trace for a sample input: a sequence of shifts and reduces ending in accept.
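The skeleton parser as code. The table encoding, ACTION[(state, token)] yielding ('shift', s), ('reduce', A, n) with n = |β|, or ('accept',), and GOTO[(state, A)] giving the successor state, is an assumption made for this sketch; the constructions on the following pages produce tables that can be massaged into this form.

```python
# Skeleton shift-reduce (LR) driver.  The stack holds states; grammar symbols are implicit.
# action[(state, token)] is ('shift', s), ('reduce', lhs, rhs_len), or ('accept',).
# goto_[(state, nonterminal)] gives the state entered after a reduction.

def lr_parse(tokens, action, goto_, start_state=0):
    stack = [start_state]
    tokens = list(tokens) + ['$']
    pos = 0
    while True:
        state, a = stack[-1], tokens[pos]
        entry = action.get((state, a))
        if entry is None:
            raise SyntaxError(f"parse error at token {a!r} in state {state}")
        if entry[0] == 'shift':
            stack.append(entry[1])
            pos += 1                                 # one shift per input token
        elif entry[0] == 'reduce':
            _, lhs, rhs_len = entry
            if rhs_len:
                del stack[-rhs_len:]                 # pop |beta| states
            stack.append(goto_[(stack[-1], lhs)])    # then push GOTO[top, A]
        else:                                        # ('accept',)
            return True
```

Together with the item-set constructions sketched on the following pages, this driver is all that is needed for a small LR parser.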

Why study LR grammars?

LR(1) grammars are often used to construct parsers. We call these parsers LR(1) parsers.
- Virtually all context-free programming language constructs can be expressed in an LR(1) form.
- LR grammars are the most general grammars parsable by a deterministic, bottom-up parser.
- Efficient parsers can be implemented for LR(1) grammars.
- LR parsers detect an error as soon as possible in a left-to-right scan of the input.
- LR grammars describe a proper superset of the languages recognized by predictive (i.e., LL) parsers:
  LL(k): recognize the use of a production A ::= β after seeing only the first k symbols derived from β;
  LR(k): recognize the handle β after seeing everything derived from β plus k lookahead symbols.

An LR(1) parser for either Algol or Pascal has several thousand states, while an SLR(1) or LALR(1) parser for the same language may have several hundred states.

Formally, a grammar G is LR(k) iff:

  S ⇒*rm αAw ⇒rm αβw,  and  S ⇒*rm γBx ⇒rm αβy,  and  FIRSTk(w) = FIRSTk(y)
  together imply αAy = γBx (i.e., α = γ, A = B, and x = y).

That is, assume sentential forms αβw and αβy, with common prefix αβ and common k-symbol lookahead FIRSTk(w) = FIRSTk(y), such that αβw reduces to αAw and αβy reduces to γBx. The common prefix means αβy also reduces to αAy, for the same result; thus αAy = γBx.

LR parsing

Three common algorithms build tables for an LR parser:
1. SLR(1): smallest class of grammars; smallest tables (number of states); simple, fast construction.
2. LR(1): full set of LR(1) grammars; largest tables (number of states); slow, large construction.
3. LALR(1): intermediate-sized set of grammars; same number of states as SLR(1); the canonical construction is slow and large, but better construction techniques exist.

LR(k) items

The table construction algorithms use sets of LR(k) items, or configurations, to represent the possible states in a parse. An LR(k) item is a pair [α, β], where:
- α is a production from G with a • at some position in the RHS, marking how much of the RHS of the production has already been seen, and
- β is a lookahead string containing k symbols (terminals or $).
Two cases of interest are k = 0 and k = 1: LR(0) items play a key role in the SLR(1) table construction algorithm, and LR(1) items play a key role in the LR(1) and LALR(1) table construction algorithms.

Example

The • indicates how much of an item we have seen at a given state in the parse:
- [A ::= •XYZ] indicates that the parser is looking for a string that can be derived from XYZ;
- [A ::= XY•Z] indicates that the parser has seen a string derived from XY and is looking for one derivable from Z.

LR(0) items (no lookahead): A ::= XYZ generates four LR(0) items:

  [A ::= •XYZ], [A ::= X•YZ], [A ::= XY•Z], [A ::= XYZ•]

The characteristic finite state machine (CFSM)

The CFSM for a grammar is a DFA which recognizes viable prefixes of right-sentential forms: a viable prefix is any prefix that does not extend beyond the handle. It accepts when a handle has been discovered and needs to be reduced. To construct the CFSM we need two functions: closure0(I) to build its states, and goto0(I, X) to determine its transitions.

closure0

Given an item [A ::= α•Bβ], its closure contains the item and any other items that can generate legal substrings to follow α. Thus, if the parser has viable prefix α on its stack, the input should reduce to Bβ (or to γ, for some other item [B ::= •γ] in the closure).

  function closure0(I)
      repeat
          if [A ::= α•Bβ] ∈ I
              add every item [B ::= •γ] to I
      until no more items can be added to I
      return I
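closure0 as code, over the dict-of-productions grammar representation used earlier. An LR(0) item is represented as a (lhs, rhs, dot) triple, which is a choice made for these sketches.

```python
# An LR(0) item (lhs, rhs, dot) stands for:  lhs ::= rhs[:dot] . rhs[dot:]

def closure0(items, grammar):
    """Close a set of LR(0) items: whenever the dot sits before a nonterminal B,
    add every production of B with the dot at the left end."""
    closure = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in grammar:      # dot is before a nonterminal B
            B = rhs[dot]
            for prod in grammar[B]:
                item = (B, tuple(prod), 0)
                if item not in closure:
                    closure.add(item)
                    work.append(item)
    return frozenset(closure)
```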

goto0

Let I be a set of LR(0) items and X be a grammar symbol. Then GOTO0(I, X) is the closure of the set of all items [A ::= αX•β] such that [A ::= α•Xβ] ∈ I. If I is the set of valid items for some viable prefix γ, then GOTO0(I, X) is the set of valid items for the viable prefix γX: the state reached after recognizing X in state I.

Building the LR(0) item sets

We start the construction with the item [S' ::= •S$], where S' is the start symbol of the augmented grammar G', S is the start symbol of G, and $ represents EOF. To compute the collection of sets of LR(0) items:

  function items(G')
      s0 <- closure0({ [S' ::= •S$] })
      S <- { s0 }
      repeat
          for each set of items s ∈ S and each grammar symbol X
              if goto0(s, X) ≠ ∅ and goto0(s, X) ∉ S
                  add goto0(s, X) to S
      until no more item sets can be added to S
      return S

(A code sketch of goto0 and this collection follows below.)

Constructing the LR(0) parsing table

1. Construct the collection of sets of LR(0) items for G'.
2. State i of the CFSM is constructed from Ii:
   (a) [A ::= α•aβ] ∈ Ii and goto0(Ii, a) = Ij implies ACTION[i, a] = "shift j";
   (b) [A ::= α•] ∈ Ii implies ACTION[i, a] = "reduce A ::= α" for all a;
   (c) [S' ::= S•$] ∈ Ii implies ACTION[i, $] = "accept".
3. goto0(Ii, A) = Ij implies GOTO[i, A] = j.
4. Set all undefined entries in ACTION and GOTO to error.
5. The initial state of the parser, s0, is closure0({ [S' ::= •S$] }).

LR(0) example

A small example grammar and the corresponding CFSM: a state diagram whose transitions are labelled by grammar symbols such as id (diagram on the original slide).
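goto0 and the item-set collection as code, continuing the previous sketch: states are frozensets of items, and the construction starts from the closure of [goal' ::= •goal $]. The spelling of the augmented start symbol and the explicit '$' in its production are assumptions of the sketch.

```python
def goto0(items, X, grammar):
    """Advance the dot over X in every item that expects X, then take the closure."""
    moved = {(lhs, rhs, dot + 1)
             for (lhs, rhs, dot) in items
             if dot < len(rhs) and rhs[dot] == X}
    return closure0(moved, grammar) if moved else frozenset()

def canonical_collection(grammar, start):
    """All reachable sets of LR(0) items for the augmented grammar, plus the transitions."""
    aug = start + "'"
    augmented = dict(grammar)
    augmented[aug] = [(start, '$')]                 # goal' ::= goal $
    symbols = {sym for alts in augmented.values() for rhs in alts for sym in rhs}
    s0 = closure0({(aug, (start, '$'), 0)}, augmented)
    states, transitions, work = {s0}, {}, [s0]
    while work:
        I = work.pop()
        for X in symbols:
            J = goto0(I, X, augmented)
            if J:
                transitions[(I, X)] = J             # the CFSM edge I --X--> J
                if J not in states:
                    states.add(J)
                    work.append(J)
    return states, transitions
```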

Conflicts in the ACTION table

If the parsing table contains any multiply-defined ACTION entries, then G is not LR(0). Two conflicts arise:
- shift-reduce: both a shift and a reduce are possible in the same item set
- reduce-reduce: more than one distinct reduce action is possible in the same item set

Conflicts can be resolved through lookahead in ACTION. Consider:
- A ::= ε | aα: a shift-reduce conflict that requires lookahead to avoid;
- the expression grammar: after shifting an operand there is a shift-reduce conflict, and the parser needs to see the next operator to give * precedence over +.

SLR(1): simple lookahead LR

Add lookaheads after building the LR(0) item sets: a reduction by A ::= α is applied only on lookaheads in FOLLOW(A). For the previous example, the FOLLOW sets resolve the conflicting entries, and the resulting ACTION/GOTO table (shifts, reduces, accept) has no multiply-defined entries.

Constructing the SLR(1) parsing table

1. Construct the collection of sets of LR(0) items for G'.
2. State i of the parser is constructed from Ii:
   (a) [A ::= α•aβ] ∈ Ii and goto0(Ii, a) = Ij implies ACTION[i, a] = "shift j", for each terminal a;
   (b) [A ::= α•] ∈ Ii implies ACTION[i, a] = "reduce A ::= α" for all a ∈ FOLLOW(A);
   (c) [S' ::= S•$] ∈ Ii implies ACTION[i, $] = "accept".
3. goto0(Ii, A) = Ij implies GOTO[i, A] = j.
4. Set all undefined entries in ACTION and GOTO to error.
5. The initial state of the parser, s0, is closure0({ [S' ::= •S$] }).

SLR(1) example

The ACTION/GOTO table for the example grammar, with each reduction restricted to the FOLLOW set of its left-hand side (table on the original slide).

Example

But it is SLR(1): restricting each reduction to the FOLLOW set of its LHS removes the conflicting entries, and the resulting ACTION/GOTO table is single-valued.

Example: a grammar that is not SLR(1)

Consider the classic example of assignment through pointers:

  S ::= L = R | R
  L ::= *R | id
  R ::= L

Its LR(0) item sets include a state containing both [S ::= L•=R] and [R ::= L•]. Since = ∈ FOLLOW(R) (because S ⇒ L = R ⇒ *R = R), the SLR table has both "shift" and "reduce R ::= L" on lookahead =: a shift-reduce conflict, so the grammar is not SLR(1). Now consider what a finer notion of lookahead can do for this state.

LR(1) items

Recall: an LR(k) item is a pair [α, β], where α is a production from G with a • at some position in the RHS marking how much of the RHS of the production has been seen, and β is a lookahead string containing k symbols (terminals or $).

What about LR(1) items? All the lookahead strings are constrained to have length 1, so the items look something like [A ::= X•YZ, a].

LR(1) items

What is the point of the lookahead symbols?
- carry them along to choose the correct reduction when there is a choice
- the lookaheads are bookkeeping, unless the item has the • at the right end:
  in [A ::= X•YZ, a], the lookahead a has no direct use;
  in [A ::= XYZ•, a], a is useful
- this allows the use of grammars that are not uniquely invertible (two productions may share the same RHS)

The point: for items [A ::= α•, a] and [B ::= α•, b], we can decide between reducing to A or B by looking at limited right context.

closure1

Given an item [A ::= α•Bβ, a], its closure contains the item and any other items that can generate legal substrings to follow α. Thus, if the parser has viable prefix α on its stack, the input should reduce to Bβ (or to γ, for some other item [B ::= •γ, b] in the closure).

  function closure1(I)
      repeat
          if [A ::= α•Bβ, a] ∈ I
              add every item [B ::= •γ, b], where b ∈ FIRST(βa), to I
      until no more items can be added to I
      return I

goto1

Let I be a set of LR(1) items and X be a grammar symbol. Then GOTO1(I, X) is the closure of the set of all items [A ::= αX•β, a] such that [A ::= α•Xβ, a] ∈ I. If I is the set of valid items for some viable prefix γ, then GOTO1(I, X) is the set of valid items for the viable prefix γX: the state reached after recognizing X in state I.

Building the LR(1) item sets for grammar G

We start the construction with the item [S' ::= •S$, -], where S' is the start symbol of the augmented grammar G', S is the start symbol of G, and the lookahead - is irrelevant because $ ends every input. To compute the collection of sets of LR(1) items:

  function items(G')
      s0 <- closure1({ [S' ::= •S$, -] })
      S <- { s0 }
      repeat
          for each set of items s ∈ S and each grammar symbol X
              if goto1(s, X) ≠ ∅ and goto1(s, X) ∉ S
                  add goto1(s, X) to S
      until no more item sets can be added to S
      return S
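closure1 differs from closure0 only in how it chooses lookaheads for the added items: FIRST(βa). The sketch below represents an LR(1) item as a (lhs, rhs, dot, lookahead) quadruple and reuses first_of_string and EPS from the FIRST/FOLLOW sketch; augmenting with S' ::= S (no explicit $ in the RHS) is assumed, so FIRST never needs an entry for '$'.

```python
# An LR(1) item (lhs, rhs, dot, a) stands for:  [lhs ::= rhs[:dot] . rhs[dot:], a]

def closure1(items, grammar, first):
    """For every item [A ::= alpha . B beta, a], add [B ::= . gamma, b]
    for each production B ::= gamma and each b in FIRST(beta a)."""
    closure = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, a = work.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            B = rhs[dot]
            beta_first = first_of_string(rhs[dot + 1:], first)
            lookaheads = (beta_first - {EPS}) | ({a} if EPS in beta_first else set())
            for prod in grammar[B]:
                for b in lookaheads:
                    item = (B, tuple(prod), 0, b)
                    if item not in closure:
                        closure.add(item)
                        work.append(item)
    return frozenset(closure)
```

goto1 is identical to goto0 except that the lookahead rides along unchanged when the dot advances.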

Back to the previous example

With LR(1) items, the troublesome state for the pointer-assignment grammar contains [S ::= L•=R, $] and [R ::= L•, $]; it no longer has a shift-reduce conflict: reduce on $, shift on =.

Another example

Consider:

  0  S' ::= S
  1  S  ::= CC
  2  C  ::= cC
  3  C  ::= d

Its LR(1) item sets begin with

  I0 = closure1({ [S' ::= •S, $] })
     = { [S' ::= •S, $], [S ::= •CC, $], [C ::= •cC, c/d], [C ::= •d, c/d] }

and the full collection yields an ACTION/GOTO table over the terminals c, d, $ and the nonterminals S, C: shifts on c and d, reductions by productions 2 and 3 with distinct lookaheads, and accept on $.

Constructing the LR(1) parsing table

Build the lookahead into the DFA to begin with:
1. Construct the collection of sets of LR(1) items for G'.
2. State i of the machine is constructed from Ii:
   (a) [A ::= α•aβ, b] ∈ Ii and goto1(Ii, a) = Ij implies ACTION[i, a] = "shift j";
   (b) [A ::= α•, a] ∈ Ii and A ≠ S' implies ACTION[i, a] = "reduce A ::= α";
   (c) [S' ::= S•, $] ∈ Ii implies ACTION[i, $] = "accept".
3. goto1(Ii, A) = Ij implies GOTO[i, A] = j.
4. Set all undefined entries in ACTION and GOTO to error.
5. The initial state of the parser, s0, is closure1({ [S' ::= •S, $] }).

Example: back to the expression grammar

Building the LR(1) item sets for the expression grammar produces many states reached by shifting operands and operators; in general, LR(1) has many more states than LR(0)/SLR(1).

LALR(1) parsing

Define the core of a set of LR(1) items to be the set of LR(0) items derived by ignoring the lookahead symbols. Thus, the two sets

  { [A ::= α•β, a], [A ::= α•β, b] }   and   { [A ::= α•β, c], [A ::= α•β, d] }

have the same core.

Key idea: if two sets of LR(1) items, Ii and Ij, have the same core, we can merge the states that represent them in the ACTION and GOTO tables.

LALR(1) table construction

To construct LALR(1) parsing tables, we can insert a single step into the LR(1) algorithm: for each core present among the sets of LR(1) items, find all sets having that core and replace these sets by their union. The goto function must be updated to reflect the replacement sets. The resulting algorithm has large space requirements, since it still builds the full LR(1) collection first.

LALR(1) table construction

The revised and renumbered algorithm:
1. Construct the collection of sets of LR(1) items for G'.
2. For each core present among the sets of LR(1) items, find all sets having that core and replace them by their union; update the goto function incrementally.
3. State i of the machine is constructed from the merged set Ii exactly as for LR(1): shift, reduce and accept entries from the items, GOTO entries from goto1, undefined entries set to error, and initial state s0 = closure1({ [S' ::= •S, $] }).

Example

Reconsider the grammar S ::= CC; C ::= cC | d. The LR(1) states that differ only in their lookaheads, such as the pair of states containing [C ::= c•C, c/d] and [C ::= c•C, $] and the corresponding pairs for d and cC, are merged, giving an LALR(1) table with the same number of states as the LR(0)/SLR(1) construction.
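Merging by core, as a sketch over the LR(1) item representation above: the core of a state drops the lookahead component, and states with equal cores are replaced by their union.

```python
from collections import defaultdict

def core(state):
    """The LR(0) core of a set of LR(1) items: ignore the lookahead component."""
    return frozenset((lhs, rhs, dot) for (lhs, rhs, dot, _a) in state)

def merge_by_core(states):
    """Group LR(1) states with identical cores and replace each group by its union,
    yielding the LALR(1) states and a map from each old state to its merged state
    (used to redirect the goto function)."""
    groups = defaultdict(list)
    for st in states:
        groups[core(st)].append(st)
    merged, representative = [], {}
    for same_core in groups.values():
        union = frozenset().union(*same_core)
        merged.append(union)
        for st in same_core:
            representative[st] = union
    return merged, representative
```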

The role of precedence

Precedence and associativity can be used to resolve shift-reduce conflicts in ambiguous grammars:
- lookahead with higher precedence: shift
- same precedence, left associative: reduce
Advantages: more concise, albeit ambiguous, grammars; shallower parse trees, hence fewer reductions. Classic application: expression grammars.

The role of precedence

With precedence and associativity, we can use the ambiguous expression grammar directly:

  <expr> ::= <expr> + <expr> | <expr> - <expr> | <expr> * <expr> | <expr> / <expr> | num | id

This eliminates useless reductions (single productions).

More efficient construction

Observe that we can represent a set Ii by its basis, or kernel: the items that are either [S' ::= •S$] or do not have the • at the left end of the RHS. We can compute the shift, reduce and goto actions for the state derived from Ii directly from its kernel. This leads to a method that avoids building the complete canonical collection of sets of items.

Error recovery in shift-reduce parsers

The problem: we encounter an invalid token, bad pieces of tree hang from the stack, and there are incorrect entries in the symbol table. We want to parse the rest of the file. Restarting the parser:
- find a restartable state on the stack
- move to a consistent place in the input
- print an informative message, including the line number

Error recovery in yacc/bison/Java CUP

The error mechanism:
- a designated token, error, valid in any production
- error shows the synchronization points
When an error is discovered, the parser:
- pops the stack until error is legal
- skips input tokens until it successfully shifts again
Error productions can have actions. This mechanism is fairly general. See the "Error Recovery" section of the on-line CUP manual.

Example

Using error, a statement-list production can be augmented with an error alternative. This should:
- throw out the erroneous statement
- synchronize at the statement separator
- invoke the error-reporting routine
Other natural places for errors: all the "lists" (statement lists, parameter lists, and so on); missing parentheses or brackets; an extra operator or a missing operator.

Left versus right recursion

Right recursion:
- needed for termination in predictive parsers
- requires more stack space
- right-associative operators
Left recursion:
- works fine in bottom-up parsers
- limits the required stack space
- left-associative operators
Rule of thumb: right recursion for top-down parsers; left recursion for bottom-up parsers.

Parsing review

Recursive descent: a hand-coded recursive descent parser directly encodes a grammar (typically an LL(1) grammar) into a series of mutually recursive procedures. It has most of the linguistic limitations of LL(1).
LL(k): an LL(k) parser must be able to recognize the use of a production after seeing only the first k symbols of its right-hand side.
LR(k): an LR(k) parser must be able to recognize the occurrence of the right-hand side of a production after having seen all that is derived from that right-hand side, with k symbols of lookahead.

Complexity of parsing: the grammar hierarchy

  type 0 (unrestricted):       α ::= β
  type 1 (context sensitive):  αAβ ::= αδβ; recognized by a linear bounded automaton; PSPACE-complete
  type 2 (context free):       A ::= α; Earley's algorithm parses any CFG in O(n³); unambiguous CFGs in O(n²); LR(k) grammars, via Knuth's algorithm, and LL(k) grammars in O(n)
  type 3 (regular):            A ::= wX; recognized by a DFA in O(n)

Language vs. grammar

Note: this is a hierarchy of grammars, not languages. For example, every regular language has a grammar that is LL(1), but not all regular grammars are LL(1). Consider:

  S ::= ab | ac

Without left factoring, this grammar is not LL(1).


More information

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering Review Earley Algorithm Chapter 13.4 Lecture #9 October 2009 Top-Down vs. Bottom-Up Parsers Both generate too many useless trees Combine the two to avoid over-generation: Top-Down Parsing with Bottom-Up

More information

Fundamentele Informatica II

Fundamentele Informatica II Fundamentele Informatica II Answer to selected exercises 5 John C Martin: Introduction to Languages and the Theory of Computation M.M. Bonsangue (and J. Kleijn) Fall 2011 5.1.a (q 0, ab, Z 0 ) (q 1, b,

More information

MA/CSSE 474 Theory of Computation

MA/CSSE 474 Theory of Computation MA/CSSE 474 Theory of Computation CFL Hierarchy CFL Decision Problems Your Questions? Previous class days' material Reading Assignments HW 12 or 13 problems Anything else I have included some slides online

More information

Compiler Principles, PS4

Compiler Principles, PS4 Top-Down Parsing Compiler Principles, PS4 Parsing problem definition: The general parsing problem is - given set of rules and input stream (in our case scheme token input stream), how to find the parse

More information

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen Pushdown Automata Notes on Automata and Theory of Computation Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-Sen University Kaohsiung, Taiwan ROC Pushdown Automata p. 1

More information

Notes for Comp 497 (Comp 454) Week 10 4/5/05

Notes for Comp 497 (Comp 454) Week 10 4/5/05 Notes for Comp 497 (Comp 454) Week 10 4/5/05 Today look at the last two chapters in Part II. Cohen presents some results concerning context-free languages (CFL) and regular languages (RL) also some decidability

More information

UNIT-VIII COMPUTABILITY THEORY

UNIT-VIII COMPUTABILITY THEORY CONTEXT SENSITIVE LANGUAGE UNIT-VIII COMPUTABILITY THEORY A Context Sensitive Grammar is a 4-tuple, G = (N, Σ P, S) where: N Set of non terminal symbols Σ Set of terminal symbols S Start symbol of the

More information

Context-Free Grammars and Languages

Context-Free Grammars and Languages Context-Free Grammars and Languages Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr

More information

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully EXAM Please read all instructions, including these, carefully There are 7 questions on the exam, with multiple parts. You have 3 hours to work on the exam. The exam is open book, open notes. Please write

More information

Accept or reject. Stack

Accept or reject. Stack Pushdown Automata CS351 Just as a DFA was equivalent to a regular expression, we have a similar analogy for the context-free grammar. A pushdown automata (PDA) is equivalent in power to contextfree grammars.

More information

Automata and Computability. Solutions to Exercises

Automata and Computability. Solutions to Exercises Automata and Computability Solutions to Exercises Spring 27 Alexis Maciel Department of Computer Science Clarkson University Copyright c 27 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata

More information