PARSING AND TRANSLATION November 2010 prof. Ing. Bořivoj Melichar, DrSc. doc. Ing. Jan Janoušek, Ph.D. Ing. Ladislav Vagner, Ph.D.



Preface

More than 40 years of development in the area of compiler construction provide us with the following main results:

- The basic principles of translation process decomposition are now well understood. Moreover, the decomposition corresponds closely to the organization of the translator modules.
- There exist formal methods for describing languages, as well as formal methods for describing translations. In addition to their exactness, these methods allow a translation to be described without direct reference to its algorithmic implementation.
- There are methods that can be used to create a parser or a translator. These methods construct the parser or compiler program from the formal description of the language or translation, respectively.
- There are several utility programs that allow automated construction of compilers or their parts.

The theory of formal languages, grammars, and automata plays an important role in the development of compiler construction methods. One of the most important results of this theory is the parsing theory. It provides us with several parsing algorithms, the most important of which are the parsing algorithms for LL and LR grammars. These algorithms are in common use in today's compilers.

A milestone in the development of compiler construction methods is the concept of syntax-directed translation. The basis of this concept is the fact that the parser can take over the entire translation process. This idea led to the theory of formal translation, translation grammars, attribute grammars, and translation automata. The most important practical results of all the above-mentioned theories are algorithms that construct an algorithmic implementation of a parsing or translation method from its non-procedural description. Such algorithms are the basis of program tools for automated construction of compiler parts.
Even though the development of the above-mentioned theories is far from finished, our current knowledge in the area of compiler construction represents a large theoretical basis that is not matched in any other area of computer software development. A consequence of this fact are approaches that employ compiler construction principles in other areas, such as text editors, syntax-directed editors, information and database systems, pattern matching systems, text formatting systems, image drawing systems, and many other program products.

Prague, November 2010
Authors

Contents

1 Notions used in this textbook
2 LR grammars and languages
   Strong LR grammars
   Weak LR grammars
   LR(0) grammars
   Simple LR(k) grammars
   LALR grammars
   LR(k) grammars
   Properties of LR grammars and languages
3 Formal translation and bottom-up parsing
   Pushdown translation automata and postfix grammars
   Formal translation directed by LR parsing
   Postfix translation grammars with LR(k) input grammars
   LR(k) translation grammars
   Translation grammars with LR(k) input grammars
4 Attributed translation directed by LR parser
   S attributed grammars
   LR attributed translation grammars
5 Parallel parsing
   Fundamental parallel algorithms
      Parallel algorithm performance evaluation
      Parallel reduction
      Parallel prefix sum
      Parallel parentheses matching
      Parallel finite automaton
   Parallel LL parsing
      Parallel parser structure
      Nondeterministic parallel LL parsing
      Deterministic parallel LL parsing
      LLP(1,k) grammars
      LLP(q,k) grammars
      Performance analysis
      An optimal EREW PRAM algorithm
      LLP grammars and languages
   Parallel LR parsing
      Sequential LR parsing
      Ideal parallel LR parsing
      Deterministic parallel LR parsing
      Gluing processes
6 Parsing with Reduced Pushdown Store Activity
   Reductions between Shifts of Two Adjacent Symbols
   Faster GLR Parsing
   Some Empirical Results
   Reconstructing Derivations
Bibliography

List of Figures

2.1 LR automaton goto function as a graph
LR automaton from Example
LR automaton from Example
LR automaton for grammar from Example
LR automaton for LALR(1) grammar from Example
LR automaton for LR(1) grammar from Example
Transitions of translation automaton
LR parsing directed translation of string a+a*a from Example
Formal translation of an input string by LR(1) grammar from Example
Translation trees for pair (aabb, xxyyyy) from Examples 3.21 and
GOTO graph for grammar from Example
Attribute translation tree for string id;k[25]
Derivation tree
Parallel reduction algorithm
Parallel prefix sum algorithm
Parallel parentheses matching algorithm
Parallel finite automaton algorithm
Parsing table for the grammar from Example
Parallel LL parser structure
Processor network for parallel parsing of string a+a*a
Possible leaf processes for the grammar from Example
Parallel parsing of input string a+a*a
Possible processes for given lookahead and lookback strings
Deterministic parallel LL parsing for input string a+a*a
Deterministic parallel LL parsing with time optimal gluing
Relation between LL, LLP and regular languages
Parallel parsing and gluing for input string a+a*a
Partial parsing and some gluing for input string a]
Parsing table of standard LR(1) parser for expression grammar, generated by SLR technique
Our pushdown automaton for the expression grammar. The subroutine recognizing G_E is enclosed in the dashed box, and the grayed circle is the start state
Trace of standard LR parser (left), and our automaton (right)
6.4 The optimized pushdown automaton. Each edge is labelled by a triple a, x, y, where a is the symbol to be read, x is the change of the pushdown store, and y is the output
Trace of optimized pushdown automaton
Timing results for some ambiguous grammars
Timing results for the expression grammar from Example
A sample reduction log and derivations. Squares are reduction nodes and circles are fan-in nodes

Chapter 1

Notions used in this textbook

In this textbook, many terms from the theory of formal translation, attributed grammars, and the theory of translation automata will be introduced. We will use terms from logic, set theory, relation theory, graph theory, grammar theory, and automata theory to define the above-mentioned terms. These terms will be recalled in this section.

Logic

A statement is a sentence that can be unambiguously decided to be either true or false. To build complex assertions, we use the following operators: logical conjunction ∧, logical disjunction ∨, logical implication ⇒, and equivalence ⇔. Let P and Q be assertions. Then:

P ∧ Q is true if both P and Q are true simultaneously,
P ∨ Q is true if at least one of P and Q is true,
P ⇒ Q (P implies Q) is not true if and only if P is true and Q is false,
P ⇔ Q is true if and only if both P and Q are simultaneously true or both are simultaneously false.

Let us present an example statement: (A = B) ⇔ (A ⊆ B ∧ B ⊆ A). The statement says that two sets A and B are equal if and only if each set is a subset of the other. We will use the following equivalence in some proofs: let P and Q be arbitrary assertions. Then it holds that (P ⇔ Q) ⇔ [(P ⇒ Q) ∧ (Q ⇒ P)]. In other words, two statements P and Q are equivalent if and only if P implies Q and Q implies P.

The symbol ∀ denotes the universal quantifier, the symbol ∃ the existential quantifier. The notion ∀x P means that P holds for all x; the notion ∃x P means that there exists x such that P holds.
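The proof equivalence above can be checked mechanically by enumerating all truth assignments. A small sketch in Python (the function names are illustrative, not from the text):

```python
from itertools import product

def implies(p, q):
    # Material implication: P => Q is false only when P is true and Q is false.
    return (not p) or q

def equivalence_law_holds():
    # Checks (P <=> Q) <=> ((P => Q) and (Q => P)) over all truth assignments.
    return all((p == q) == (implies(p, q) and implies(q, p))
               for p, q in product((True, False), repeat=2))
```

Since each operator is a function of the truth values alone, a four-row truth table suffices as a complete proof of the equivalence.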

Sets

We will use the term set in the usual intuitive meaning. The notion x ∈ X denotes that x is an element of the set X; x ∉ X denotes the fact that x is not an element of the set X. Inclusion between two sets is denoted X ⊆ Y, which means that every element of the set X is also an element of the set Y, i.e. X is a subset of Y. The notion X = {x : P(x)} means that the set X contains exactly the elements x that satisfy the condition P(x). Some examples from set theory follow:

union: A ∪ B = {x : x ∈ A ∨ x ∈ B},
intersection: A ∩ B = {x : x ∈ A ∧ x ∈ B},
difference: A − B = {x : x ∈ A ∧ x ∉ B},
cartesian product: A × B = {(x,y) : x ∈ A ∧ y ∈ B},
power set: 2^A = {B : B ⊆ A}.

A set that contains only a finite number of elements is called a finite set. Finite sets may be specified by exhaustive enumeration of their elements. The notion X = {a,b,c} means that the set X contains just the three elements a, b, and c (it does not contain any other elements). The empty set is denoted by the symbol ∅. If A ∩ B = ∅ holds for sets A and B, then A and B are disjoint sets.

Relations

A binary relation between elements of set A and elements of set B is every set R where R ⊆ A × B. The fact (x,y) ∈ R is often denoted xRy. A relation over set A is every relation R ⊆ A × A. Such a relation can be:

reflexive: if for all x ∈ A it holds that xRx,
symmetric: if for all x,y ∈ A it holds that xRy ⇒ yRx,
antisymmetric: if xRy and yRx implies that x = y,
transitive: if xRy and yRz implies that xRz.

The product of a relation R ⊆ A × B and a relation S ⊆ B × C is the relation R ∘ S = {(x,y) : ∃z ∈ B : (x,z) ∈ R ∧ (z,y) ∈ S}. The k-th power of a relation R over set A is defined for k ≥ 0: R^0 = {(x,x) : x ∈ A}, R^k = R^(k−1) ∘ R for k > 0. The transitive closure of a relation R over set A is the relation R^+ = ⋃_{k≥1} R^k. The transitive and reflexive closure of a relation R over set A is the relation R^* = ⋃_{k≥0} R^k.

Mapping

A mapping is a special case of a relation. A mapping from set A to set B is any relation F ⊆ A × B such that for every x ∈ A there exists at most one y ∈ B such that xFy. For mappings, we use the notion F(x) = y instead of xFy. The element y denotes the value of F for x. A mapping F from set A to set B is denoted F : A → B. If there is some x ∈ A for which the value F(x) is not defined, then F is a partial mapping. The opposite case is a complete mapping, which is defined for all x ∈ A.

Graphs

An oriented graph is a pair (V,H), where V is a finite set of nodes and H ⊆ V × V is a set of edges.

Edges are denoted (x,y) ∈ H, where x stands for the node where the edge starts and y denotes the node where the edge ends (leads to). A finite sequence of edges (x_0,x_1), (x_1,x_2), ..., (x_{n−1},x_n), in which no node occurs more than once, is called a path of length n from node x_0 to node x_n. A graph (V,H) is a tree if it contains exactly one node into which no edge leads (this node is the root node), and for every node y different from the root node there exists a path from the root node to y. The nodes of a tree from which no edge starts are the leaves.

Formal languages

An alphabet is an arbitrary finite nonempty set of elements called symbols. A string over an alphabet is any finite sequence of symbols from the alphabet. The empty sequence is also a string, the empty string, which is denoted ε. The set of all strings over alphabet T is denoted T^*. The set of all nonempty strings over T is T^+. It holds that T^* = T^+ ∪ {ε}. If strings x and y are from T^*, then z = xy is the concatenation of strings x and y. The length of a string is the number of symbols the string consists of. The length of a string x is denoted |x|.

A formal language L over an alphabet T is an arbitrary subset of T^*. The complement of a language L_1 over alphabet T is the language L_2 = T^* − L_1. The product of languages L_1 and L_2 is the language L = L_1.L_2 = {xy : x ∈ L_1 ∧ y ∈ L_2}. The k-th power of a language L over T is defined for k ≥ 0 as follows: L^0 = {ε}, L^k = L^(k−1).L for k > 0. The iteration of a language L is the language L^* = ⋃_{n≥0} L^n. The positive iteration of a language L is the language L^+ = ⋃_{n≥1} L^n.

Grammars

A grammar is a four-tuple G = (N,T,P,S), where N is a finite set of nonterminal symbols (nonterminals for short), T is a finite set of terminal symbols, P ⊆ (N ∪ T)^*.N.(N ∪ T)^* × (N ∪ T)^* is a finite set of rules (a rule (α,β) from P is usually denoted α → β), and S ∈ N is the starting symbol. A context-free grammar is a grammar whose rules are of the form A → α, A ∈ N, α ∈ (N ∪ T)^*.

A regular grammar is a grammar whose rules are of the form A → a or A → aB, where A,B ∈ N, a ∈ T. The relation α ⇒ β, a subset of (N ∪ T)^* × (N ∪ T)^*, is a derivation in a grammar G if α = γXδ, β = γωδ, γ,δ,ω ∈ (N ∪ T)^*, X ∈ N, X → ω ∈ P. The k-th power, transitive closure, and reflexive and transitive closure of the derivation relation are denoted ⇒^k, ⇒^+, and ⇒^*, respectively. The language L generated by a grammar G is the set L(G) = {x : x ∈ T^* ∧ S ⇒^* x}.

Finite automata

A (nondeterministic) finite automaton is a quintuple A = (Q,T,δ,q_0,F), where Q is a finite set of states, T is an input alphabet, δ is a mapping from Q × T to 2^Q, q_0 ∈ Q is the initial state, and F ⊆ Q is a set of final states. A pair (q,w) ∈ Q × T^* is called a configuration of the finite automaton; (q_0,w) is the initial configuration, and (q,ε), where q ∈ F, is a final (accepting) configuration. The relation (q,aw) ⊢ (p,w), a subset of (Q × T^*) × (Q × T^*), is a transition of the automaton A if p ∈ δ(q,a). The k-th power, transitive closure, and transitive and reflexive closure of the relation ⊢ are denoted ⊢^k, ⊢^+, and ⊢^*, respectively. A finite automaton A is deterministic if it holds that ∀q ∈ Q, a ∈ T : |δ(q,a)| ≤ 1.
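The transition mapping and the determinism condition can be simulated directly by tracking the set of states reachable from q_0 while reading the input. A sketch, using a small hypothetical automaton (not from the text) that accepts strings over {a, b} ending in "ab":

```python
def accepts(delta, q0, final, word):
    # delta maps (state, symbol) to the set of successor states.
    states = {q0}                      # states reachable after the prefix read so far
    for a in word:
        states = {p for q in states for p in delta.get((q, a), set())}
    return bool(states & final)        # some reachable state is final

def is_deterministic(delta):
    # |delta(q, a)| <= 1 for every state q and input symbol a.
    return all(len(s) <= 1 for s in delta.values())

# Hypothetical automaton accepting strings over {a, b} that end in "ab".
DELTA = {(0, 'a'): {0, 1}, (0, 'b'): {0}, (1, 'b'): {2}}
```

The set-of-states simulation works for deterministic and nondeterministic automata alike; for the automaton above, `is_deterministic` reports nondeterminism because δ(0, a) = {0, 1}.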

The language L accepted by a finite automaton A is the set L(A) = {x : x ∈ T^* ∧ (q_0,x) ⊢^* (q,ε), q ∈ F}.

Pushdown automata

A (nondeterministic) pushdown automaton is a seven-tuple M = (Q,T,G,δ,q_0,Z_0,F), where Q is a finite set of states, T is an input alphabet, G is a pushdown store alphabet, δ is a mapping from Q × (T ∪ {ε}) × G^* into the set of finite subsets of Q × G^*, q_0 ∈ Q is the initial state, Z_0 ∈ G is the initial contents of the pushdown store, and F ⊆ Q is the set of final (accepting) states.

A triplet (q,w,x) ∈ Q × T^* × G^* denotes a configuration of the pushdown automaton. The initial configuration of the pushdown automaton is the triplet (q_0,w,Z_0) for an input word w ∈ T^*. The relation (q,aw,αβ) ⊢ (p,w,γβ), a subset of (Q × T^* × G^*) × (Q × T^* × G^*), is a transition of the pushdown automaton M if (p,γ) ∈ δ(q,a,α). The k-th power, transitive closure, and transitive and reflexive closure of the relation ⊢ are denoted ⊢^k, ⊢^+, and ⊢^*, respectively. A pushdown automaton M is deterministic if it holds that:

1. |δ(q,a,γ)| ≤ 1 for all q ∈ Q, a ∈ T ∪ {ε}, γ ∈ G^*.
2. If δ(q,a,α) ≠ ∅, δ(q,a,β) ≠ ∅, and α ≠ β, then α is not a suffix of β and β is not a suffix of α.
3. If δ(q,a,α) ≠ ∅ and δ(q,ε,β) ≠ ∅, then α is not a suffix of β and β is not a suffix of α.

The language L accepted by a pushdown automaton M is defined in two distinct ways:

Accepting by final state: L(M) = {x : (q_0,x,Z_0) ⊢^* (q,ε,γ), x ∈ T^*, γ ∈ G^*, q ∈ F},
Accepting by empty pushdown store: L_ε(M) = {x : (q_0,x,Z_0) ⊢^* (q,ε,ε), x ∈ T^*, q ∈ Q}.
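Both acceptance modes can be simulated by searching the configuration space. A sketch of acceptance by empty pushdown store, with a hypothetical PDA for {aⁿbⁿ : n ≥ 1}; the choice to keep the stack top as the leftmost character of the stack string is an assumption of this sketch:

```python
from collections import deque

def accepts_by_empty_store(delta, q0, z0, word):
    # delta: dict (state, symbol or '' for eps, stack top) -> set of (state, pushed string).
    # Breadth-first search over configurations (state, remaining input, stack).
    start = (q0, word, z0)
    seen, queue = {start}, deque([start])
    while queue:
        q, w, stack = queue.popleft()
        if w == "" and stack == "":
            return True                       # input consumed, store empty
        moves = []
        if w and stack:                       # reading moves
            moves += [(w[1:], t) for t in delta.get((q, w[0], stack[0]), ())]
        if stack:                             # eps-moves
            moves += [(w, t) for t in delta.get((q, "", stack[0]), ())]
        for rest, (p, push) in moves:
            conf = (p, rest, push + stack[1:])
            if conf not in seen:
                seen.add(conf)
                queue.append(conf)
    return False

# Hypothetical PDA for {a^n b^n : n >= 1}, accepting by empty pushdown store.
DELTA = {
    ('q', 'a', 'Z'): {('q', 'AZ')},
    ('q', 'a', 'A'): {('q', 'AA')},
    ('q', 'b', 'A'): {('p', '')},
    ('p', 'b', 'A'): {('p', '')},
    ('p', '', 'Z'): {('p', '')},
}
```

Since every move either consumes an input symbol or shrinks the store, the search space here is finite; a general simulator would need a bound on ε-move chains.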

Chapter 2

LR grammars and languages

This chapter introduces a group of parsing algorithms which create the parsing tree of the input string from the bottom to the top. These algorithms are named LR parsers, since they read the input string from left to right and they produce the right parse of the input. The algorithm may use information about the nearest k symbols of the unread part of the input string. Grammars which allow such a parser to be constructed are named LR(k) grammars.

The basic principle of an LR parser can be stated as follows. Let G = (N,T,P,S) be an unambiguous context-free grammar and let w = a_1 a_2 ... a_n be an input string from the language L(G). Then there exists a rightmost derivation S = γ_1 ⇒ γ_2 ⇒ ... ⇒ γ_m = w. Since the mentioned derivation is a rightmost one, every sentential form γ_i (i = 1,2,...,m−1) is of the form γ_i = αAa_j a_{j+1}...a_n, where A ∈ N, α ∈ (N ∪ T)^*, and the string a_j a_{j+1}...a_n ∈ T^* is a suffix of the input string w. Suppose γ_{i−1} = αBz and a rule B → β is used in the derivation step γ_{i−1} ⇒ γ_i (that is, αBz ⇒ αβz). The main problem of deterministic bottom-up parsing is to find the correct string β in the sentential form γ_i = αβz. If the string is found, the sentential form γ_i can be reduced to the sentential form γ_{i−1}.

The model of a bottom-up parser is the pushdown automaton. Such an automaton is, in general, nondeterministic and thus cannot be directly used as a parser. Let us consider how a deterministic pushdown automaton can be constructed for a given grammar and what conditions must be satisfied. Given a context-free grammar, a pushdown automaton can be constructed whose transition mapping δ is defined as follows (remember, the top of the pushdown store is on the right-hand side):

1. δ(q,a,ε) = {(q,a)} for all a ∈ T,
2. δ(q,ε,α) = {(q,A) : A → α ∈ P},
3. δ(q,ε,#S) = {(r,ε)}.

The operations are called shift (1), reduce (2), and accept (3). The construction shown above leads to a pushdown automaton that is nondeterministic in all cases.
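The nondeterministic shift/reduce construction can be simulated by backtracking over the two kinds of moves. A sketch, assuming a hypothetical toy grammar S → aSb | ab with single-character symbols and no ε-rules (the grammar is an illustration, not from the text):

```python
def recognizes(rules, start, word):
    # rules: list of (lhs, rhs) pairs over single-character symbols, no
    # eps-rules; the stack top is on the right, '#' marks the stack bottom.
    seen = set()

    def step(stack, i):
        if (stack, i) in seen:                # already explored and failed
            return False
        seen.add((stack, i))
        if stack == '#' + start and i == len(word):
            return True                       # accept: pushdown holds #S
        for lhs, rhs in rules:                # try every reduce move
            if stack.endswith(rhs) and step(stack[:-len(rhs)] + lhs, i):
                return True
        # try the shift move
        return i < len(word) and step(stack + word[i], i + 1)

    return step('#', 0)
```

The depth-first search mirrors the nondeterminism: at each configuration, every applicable reduce and the shift are tried. A deterministic parser must instead commit to one of these moves, which is the subject of the rest of the chapter.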
The reason resides in the fact that a shift is defined by a transition δ(q,a,ε) = {(q,a)} and a reduce by a transition δ(q,ε,α) = {(q,A)}. For these transitions, the string ε is a prefix (and also a suffix) of the string α, which violates the determinism conditions. To obtain a truly deterministic pushdown automaton, the construction has to be changed. The main problem of the construction is the fact that shift operations are performed regardless of the contents of the pushdown store. Therefore, we will attempt to modify the automaton in such a way that it can decide which operation (shift or reduce) to perform based on the symbol on the top of the pushdown store. We will demonstrate the technique in the following example.

Example 2.1:
Let a context-free grammar be G = ({S,A,B},{a,b,c,d},P,S), where P contains the rules:

(1) S → Aa
(2) A → bB
(3) A → Ac
(4) B → d

A pushdown automaton for that grammar can be constructed as follows: R = ({q,r},{a,b,c,d},{S,A,B,a,b,c,d,#},δ,q,#,{r}), where the transition mapping δ is defined:

1. δ(q,a,ε) = {(q,a)}
   δ(q,b,ε) = {(q,b)}
   δ(q,c,ε) = {(q,c)}
   δ(q,d,ε) = {(q,d)}
2. δ(q,ε,Aa) = {(q,S)}
   δ(q,ε,bB) = {(q,A)}
   δ(q,ε,Ac) = {(q,A)}
   δ(q,ε,d) = {(q,B)}
3. δ(q,ε,#S) = {(r,ε)}

This pushdown automaton is nondeterministic because of how its shifts are defined. Taking the contents of the pushdown store into account, the shifts may instead be defined as follows:

δ(q,a,A) = {(q,Aa)}   — symbols a and c appear in a sentential form only after
δ(q,c,A) = {(q,Ac)}     the symbol A,
δ(q,b,#) = {(q,#b)}   — symbol b can appear only at the beginning of a sentential form,
δ(q,d,b) = {(q,bd)}   — symbol d can appear only just after symbol b.

This modification leads to a deterministic pushdown automaton for the given grammar. However, the technique is not universal; it can be used only for a limited class of grammars (strong LR(0) grammars). For other grammars, the modification will not work and the resulting pushdown automaton will not be deterministic.

The bottom-up parser is similar to the top-down parser in that both parsers can use the following additional information to choose the next operation while parsing:

1. information about the not-yet-read part of the input string,
2. information about the parsing done in the past.

There are grammars that can be deterministically parsed by a bottom-up parser with additional information about up to k closest symbols of the unread part of the input string. These grammars are strong LR(k) grammars. In the next sections we will study two classes of LR grammars. First, we will introduce strong LR(k) grammars. Then, weak LR(k) grammars will be studied. Deterministic parsing of weak LR(k) grammars must use information about the parsing history.
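The modified transitions of Example 2.1 can be encoded directly: shifts are permitted only for the listed (stack top, input symbol) pairs, and a reduction is applied whenever a right-hand side appears on the top of the pushdown store. A minimal sketch of this deterministic automaton:

```python
# Stack top is on the right. Shifts are allowed only for the stack-top /
# input-symbol pairs derived in Example 2.1; reductions pop a rule's
# right-hand side from the top of the pushdown store.
SHIFTS = {('A', 'a'), ('A', 'c'), ('#', 'b'), ('b', 'd')}
REDUCES = [('Aa', 'S'), ('bB', 'A'), ('Ac', 'A'), ('d', 'B')]

def parse_deterministic(word):
    stack, i = '#', 0
    while True:
        if stack == '#S' and i == len(word):
            return True                      # corresponds to accept
        for rhs, lhs in REDUCES:
            if stack.endswith(rhs):          # reduce
                stack = stack[:-len(rhs)] + lhs
                break
        else:
            if i < len(word) and (stack[-1], word[i]) in SHIFTS:
                stack += word[i]             # shift
                i += 1
            else:
                return False                 # error
```

At every configuration at most one move applies, so no backtracking is needed; this is exactly what determinism buys.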
Both classes of LR grammars use the same (up to slight modifications) parsing algorithm, which is based on the pushdown automaton. For both classes of grammars, a parsing table is used to decide whether a reduction is to be performed and which reduction is to be chosen. The parsing table contains all the necessary information. The table is constructed from the grammar; the construction algorithm differs between the two classes of LR grammars. We will describe the algorithms for the individual cases in the following sections.

2.1 Strong LR grammars

Strong LR grammars are context-free grammars for which there exists a deterministic bottom-up parser that:

1. uses information about up to k closest symbols of the not-yet-read part of the input string,
2. does not use information about the parsing history.

Before defining strong LR(k) grammars, we introduce the functions BEFORE and EFF_k.

Definition 2.2:
Let G = (N,T,P,S) be a context-free grammar, X ∈ N, and α ∈ (N ∪ T)^*. The functions BEFORE(X) and EFF_k(α) are defined as follows:

BEFORE(X) = {Y : S ⇒^* αYXβ, Y ∈ (N ∪ T)} ∪ {# : S ⇒^* Xβ},
EFF_k(α) = {w : w ∈ FIRST_k(α), and there exists a rightmost derivation α ⇒^* wx such that no sentential form of the derivation has the form Awx, A ∈ N}.

The set EFF_k(α) contains all strings from the set FIRST_k(α) that can be derived without a derivation step in which the first nonterminal, standing immediately in front of w, is substituted by the empty string. The name EFF stands for ε-free first.

Now we can define strong LR(k) grammars.

Definition 2.3:
A context-free grammar G = (N,T,P,S) is a strong LR(k) grammar if the augmented grammar G' = (N ∪ {S'},T,P ∪ {S' → S},S') meets the following criteria:

1. If P' contains a pair of rules of the form:
   (a) A → αX, B → βX,
   (b) A → αX, B → ε and X ∈ BEFORE(B), or
   (c) A → ε, B → ε and X ∈ BEFORE(B), X ∈ BEFORE(A),
   then FOLLOW_k(A) ∩ FOLLOW_k(B) = ∅.
2. If P' contains a pair of rules of the form:
   (a) A → αX, B → βXγ,
   (b) A → ε, B → βXγ and X ∈ BEFORE(A), or
   (c) A → ε, B → γ and X ∈ BEFORE(A), X ∈ BEFORE(B),
   then FOLLOW_k(A) ∩ EFF_k(γFOLLOW_k(B)) = ∅.

The first condition ensures that in the case of a reduction, it is possible to choose the correct rule for the reduction based on up to k lookahead symbols. The second condition guarantees that it is possible to decide whether a reduction or a shift operation is to be performed. Similarly to the top-down parser, the bottom-up parser uses a parsing table when choosing the next operation to perform. The table entries contain the appropriate operation, based on the topmost pushdown store symbol and on the lookahead string.
The parsing table for a strong LR(k) grammar can be constructed using the following algorithm.

Algorithm 2.4:
Construction of a parsing table for a strong LR(k) grammar.
Input: Strong LR(k) grammar G = (N,T,P,S).
Output: Parsing table p for G.
Method: The parsing table p is defined over (N ∪ T ∪ {#}) × T^{*k}, where T^{*k} denotes the set of strings over T of length at most k.

1. The input grammar G is augmented: G' = (N ∪ {S'},T,P ∪ {S' → S},S').
2. The parsing table p is constructed:
   (a) p(X,u) = reduce(i), if A → αX is the i-th rule in P and u ∈ FOLLOW_k(A).
   (b) p(X,u) = reduce(i), if A → ε is the i-th rule in P, X ∈ BEFORE(A), and u ∈ FOLLOW_k(A).
   (c) p(S,ε) = accept.
   (d) p(X,u) = shift, if B → βXγ ∈ P' and u ∈ EFF_k(γFOLLOW_k(B)).
   (e) p(X,u) = error in all other cases.

Example 2.5:
Given the grammar G = ({E,E',T,T',F},{a,+,*,(,)},P,E), where P contains the rules below, evaluate the parsing table for G.

(1) E → E'T
(2) E' → E+
(3) E' → ε
(4) T → T'F
(5) T' → T*
(6) T' → ε
(7) F → (E)
(8) F → a

We augment the grammar by the rule (0) S' → E. Grammar G is a strong LR(1) grammar, thus the parsing table may be constructed. The table is shown below. The operations in the table are denoted as follows: Sh means shift, R(i) means reduce(i), A means accept; error entries are left blank. When constructing the table, we used the facts that BEFORE(E') = {#,(} and BEFORE(T') = {E'}.

p  | a    | +    | *    | (    | )    | ε
---+------+------+------+------+------+-----
E  |      | Sh   |      |      | Sh   | A
E' | R(6) |      |      | R(6) |      |
T  |      | R(1) | Sh   |      | R(1) | R(1)
T' | Sh   |      |      | Sh   |      |
F  |      | R(4) | R(4) |      | R(4) | R(4)
a  |      | R(8) | R(8) |      | R(8) | R(8)
+  | R(2) |      |      | R(2) |      |
*  | R(5) |      |      | R(5) |      |
(  | R(3) |      |      | R(3) |      |
)  |      | R(7) | R(7) |      | R(7) | R(7)
#  | R(3) |      |      | R(3) |      |

Strong LR(k) parsing can be done using the following algorithm.

Algorithm 2.6:
Strong LR(k) parsing algorithm.
Input: Parsing table p for a grammar G = (N,T,P,S), the rules of G, and an input string w ∈ T^*.
Output: Right parse of w in case w ∈ L(G), error signaling otherwise.
Method: The algorithm reads symbols from the input string w, makes use of the pushdown store, and creates the string of the numbers of the rules which were used in the reductions. The initial pushdown store contents is #. The algorithm repeats the steps below until the input string is either accepted or rejected (error signaling). In the description, the symbol X denotes the symbol on the top of the pushdown store.

1. Evaluate the lookahead string (of length up to k); let it be u.
   (a) If p(X,u) = shift, one symbol is read from the input string and pushed onto the pushdown store.

   (b) If p(X,u) = reduce(i), the algorithm finds rule i; let it be A → α. The string α is removed (popped) from the pushdown store, the symbol A is pushed onto the pushdown store, and the rule number i is appended to the right parse. If the top of the pushdown store did not contain α (and thus it was not possible to remove it), an error is detected and parsing ends with error signaling.
   (c) If p(X,ε) = accept and the contents of the pushdown store is #X, the parsing was successful and the output string is the correct right parse of the input string. If the contents of the pushdown store is different, error signaling takes place instead.
   (d) If p(X,u) = error, the parsing ends with error signaling.

A configuration of the parsing algorithm is a triplet (α,x,π), where α is the contents of the pushdown store (the topmost symbol is on the right-hand side), x is the not-yet-read part of the input string, and π is the so-far created part of the output; (#,w,ε) is the initial configuration and (#S,ε,π) is the final configuration.

Example 2.7:
Let us demonstrate the parsing of the input string a+a*a. We will use the parsing table evaluated in Example 2.5.

(#, a+a*a, ε) ⊢ (#E', a+a*a, 3)
⊢ (#E'T', a+a*a, 36)
⊢ (#E'T'a, +a*a, 36)
⊢ (#E'T'F, +a*a, 368)
⊢ (#E'T, +a*a, 3684)
⊢ (#E, +a*a, 36841)
⊢ (#E+, a*a, 36841)
⊢ (#E', a*a, 368412)
⊢ (#E'T', a*a, 3684126)
⊢ (#E'T'a, *a, 3684126)
⊢ (#E'T'F, *a, 36841268)
⊢ (#E'T, *a, 368412684)
⊢ (#E'T*, a, 368412684)
⊢ (#E'T', a, 3684126845)
⊢ (#E'T'a, ε, 3684126845)
⊢ (#E'T'F, ε, 36841268458)
⊢ (#E'T, ε, 368412684584)
⊢ (#E, ε, 3684126845841)

A special case of strong LR(k) grammars are strong LR(0) grammars. For these grammars, the information about the topmost symbol of the pushdown store is sufficient for choosing the next parsing operation. Every strong LR(0) grammar has these properties:

1. the right-hand sides of all rules in the augmented grammar G' end with mutually different symbols,
2. a symbol occurring at the end of the right-hand side of a rule does not appear on the right-hand side of any other rule.
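The table of Example 2.5 together with Algorithm 2.6 can be transcribed as follows. The dictionary encoding of the table and rules is an implementation choice of this sketch; `None` stands for the empty lookahead ε:

```python
# Grammar rules from Example 2.5; primed symbols are written "E'", "T'".
RULES = {
    1: ('E', ("E'", 'T')),      2: ("E'", ('E', '+')),
    3: ("E'", ()),              4: ('T', ("T'", 'F')),
    5: ("T'", ('T', '*')),      6: ("T'", ()),
    7: ('F', ('(', 'E', ')')),  8: ('F', ('a',)),
}

# Parsing table p(X, u); None stands for the empty lookahead string.
TABLE = {('E', '+'): ('shift',), ('E', ')'): ('shift',),
         ('E', None): ('accept',), ('T', '*'): ('shift',)}
for u in ('+', ')', None):
    TABLE[('T', u)] = ('reduce', 1)
for X, r in (('F', 4), ('a', 8), (')', 7)):
    for u in ('+', '*', ')', None):
        TABLE[(X, u)] = ('reduce', r)
for X, r in (('+', 2), ('*', 5), ('(', 3), ('#', 3)):
    for u in ('a', '('):
        TABLE[(X, u)] = ('reduce', r)
for u in ('a', '('):
    TABLE[("E'", u)] = ('reduce', 6)
    TABLE[("T'", u)] = ('shift',)

def strong_lr1_parse(word):
    # Algorithm 2.6: returns the right parse as a list of rule numbers,
    # or None on error.
    stack, i, output = ['#'], 0, []
    while True:
        u = word[i] if i < len(word) else None
        entry = TABLE.get((stack[-1], u))
        if entry is None:
            return None                      # error entry
        if entry[0] == 'shift':
            stack.append(word[i]); i += 1
        elif entry[0] == 'reduce':
            lhs, rhs = RULES[entry[1]]
            if rhs:
                if tuple(stack[-len(rhs):]) != rhs:
                    return None              # pushdown top does not match
                del stack[-len(rhs):]
            stack.append(lhs)
            output.append(entry[1])
        else:                                # accept
            return output if stack == ['#', 'E'] else None
```

Running this on a+a*a reproduces the configuration sequence of Example 2.7 and yields the right parse 3 6 8 4 1 2 6 8 4 5 8 4 1.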

The above properties imply that grammar G does not contain any ε-rules and that the starting symbol S does not appear on the right-hand side of any rule in G. Moreover, the symbols occurring at the ends of the right-hand sides of rules positively identify when a reduction is to be performed and which rule to use for the reduction. If a symbol X, which occurs at the end of a rule A → αX, appears on the top of the pushdown store, then a reduction by the rule A → αX is to be performed.

Example 2.8:
Let the grammar G = ({S,A,B},{a,b,c,d},P,S) have P containing the rules:

(1) S → Aa
(2) A → bB
(3) A → Ac
(4) B → d

Grammar G is a strong LR(0) grammar. We augment the grammar with the rule (0) S' → S and construct the parsing table.

p | ε
--+-----
S | A
A | Sh
B | R(2)
a | R(1)
b | Sh
c | R(3)
d | R(4)
# | Sh

The parsing of the input string bdca is shown below.

(#, bdca, ε) ⊢ (#b, dca, ε)
⊢ (#bd, ca, ε)
⊢ (#bB, ca, 4)
⊢ (#A, ca, 42)
⊢ (#Ac, a, 42)
⊢ (#A, a, 423)
⊢ (#Aa, ε, 423)
⊢ (#S, ε, 4231)

2.2 Weak LR grammars

The strong LR parsing used only information about k symbols of the not-yet-read part of the input string and one symbol on the top of the pushdown store. In the case of weak LR grammars, such information is not enough. When choosing the next rule during parsing, the weak LR parser uses, in addition to the information used by the strong LR parser, information about the parsing history. Obviously, weak LR grammars include strong LR grammars as a proper subset. For that reason, we will omit the adjective "weak" in the following text.

During the parsing of an input string x generated by a grammar G, the bottom-up parser uses the pushdown store to keep a string that corresponds to a prefix of some rightmost sentential form occurring in the rightmost derivation of x in G. If a grammar G = (N,T,P,S) allows a derivation S ⇒^* αAw ⇒ αβw ⇒^* xw, then the rightmost sentential form αβw may be reduced using the rule A → β to the rightmost sentential form αAw. The substring β is a handle of the sentential form αβw, α,β ∈ (N ∪ T)^*, w ∈ T^*.

Definition 2.9:
Let G = (N,T,P,S) be a context-free grammar and let G' = (N ∪ {S'},T,P ∪
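For k = 0 the table is indexed by the stack-top symbol alone. A sketch of Algorithm 2.6 specialized to the table of Example 2.8, returning the right parse:

```python
# Strong LR(0) table from Example 2.8; with k = 0 the lookahead is always
# empty, so the table is indexed by the topmost pushdown symbol alone.
RULES = {1: ('S', 'Aa'), 2: ('A', 'bB'), 3: ('A', 'Ac'), 4: ('B', 'd')}
TABLE = {'S': ('accept',), 'A': ('shift',), 'B': ('reduce', 2),
         'a': ('reduce', 1), 'b': ('shift',), 'c': ('reduce', 3),
         'd': ('reduce', 4), '#': ('shift',)}

def lr0_parse(word):
    # Returns the right parse as a list of rule numbers, or None on error.
    stack, i, output = '#', 0, []
    while True:
        entry = TABLE.get(stack[-1])
        if entry is None:
            return None
        if entry[0] == 'accept':
            return output if (stack == '#S' and i == len(word)) else None
        if entry[0] == 'shift':
            if i == len(word):
                return None                  # input exhausted, cannot shift
            stack += word[i]; i += 1
        else:
            lhs, rhs = RULES[entry[1]]
            if not stack.endswith(rhs):
                return None                  # pushdown top does not match
            stack = stack[:-len(rhs)] + lhs
            output.append(entry[1])
```

On bdca this reproduces the configuration sequence above and yields the right parse 4 2 3 1.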

{S' → S},S') be the augmented grammar for G. We say that G is an LR(k) grammar for k ≥ 0 if (all derivations are rightmost):

1. S' ⇒^* αAw ⇒ αβw,
2. S' ⇒^* γBx ⇒ αβy,
3. FIRST_k(w) = FIRST_k(y)

together imply that αAy = γBx (i.e. α = γ, A = B, and x = y). In other words, it is possible to positively decide that a reduction by a rule A → β is to be performed based on the string αβ and on the lookahead string of length up to k symbols.

Definition 2.10:
Assume that S ⇒^* αAw ⇒ αβw is a rightmost derivation in a context-free grammar G = (N,T,P,S). A string γ is a viable prefix in G if it is a prefix of αβ. This means that the string γ is a prefix of some rightmost sentential form and it does not extend past the right end of the handle of that sentential form. If γ = αβ, then γ is a complete viable prefix.

In bottom-up parsing, a viable prefix appears in the pushdown store during the parsing. If the pushdown store contains a complete viable prefix, a reduction can be performed. In addition to LR(k) grammars, we will study subclasses of LR(k) grammars: LR(0) grammars, simple LR(k) grammars (SLR(k) grammars), and LALR(k) grammars. The reasons for defining such subclasses of LR(k) grammars are the following:

- the construction of a parser for the subclasses is simpler than the construction of a general LR(k) parser,
- the tables of the parser are smaller.

LR(0) grammars

LR(0) grammars are grammars that can be parsed by a deterministic bottom-up parser which uses only the parsing history to decide the next parsing operation. This means that the parser does not use any lookahead information.

Example 2.11:
Given the grammar G = ({S,A,B},{a,b},P,S), where P contains the rules:

(1) S → aAb
(2) S → aaBba
(3) A → Aa
(4) A → b
(5) B → ε

We demonstrate that deterministic parsing of strings generated by this grammar can be done using information about the parsing history only.
This is not obvious, because the symbol b, for instance, appears in G in three different places: it is the right-hand side of the rule A → b, and it is a part of the right-hand sides of the rules S → aAb and S → aaBba. The parsing of the strings abab and aaba is demonstrated in the table below. These two examples show that reductions are chosen based on a certain contents of the pushdown store. The next table

depicts the strings that will be contained in the pushdown store when a reduction is to be performed using a certain rule. Such strings are always complete viable prefixes. We can see that the example is simple, because the contents of the pushdown store is just one string (one viable prefix) for each rule.

Input | Pushdown store contents | Operation
------+-------------------------+---------------------
abab  | ε                       | shift a
bab   | a                       | shift b
ab    | ab                      | reduction A → b
ab    | aA                      | shift a
b     | aAa                     | reduction A → Aa
b     | aA                      | shift b
ε     | aAb                     | reduction S → aAb
ε     | S                       | accept
------+-------------------------+---------------------
aaba  | ε                       | shift a
aba   | a                       | shift a
ba    | aa                      | reduction B → ε
ba    | aaB                     | shift b
a     | aaBb                    | shift a
ε     | aaBba                   | reduction S → aaBba
ε     | S                       | accept

There may exist several complete viable prefixes for every rule. Moreover, there may exist an infinite number of complete viable prefixes for a rule. However, it is proven that the sets of complete viable prefixes are regular. This means that it is possible to construct a finite automaton that analyzes the viable prefixes. This automaton is called the characteristic automaton for an LR grammar, or LR automaton for short.

The usage of an LR automaton for the analysis of viable prefixes makes LR parsing simpler. When using an LR automaton, it is not necessary to traverse the pushdown store to decide which operation to use. Instead, it is sufficient to look at the state of the LR automaton. The LR automaton for G augmented by the rule S' → S is depicted in Figure 2.1.

Rule      | Pushdown store contents
----------+------------------------
S → aAb   | aAb
S → aaBba | aaBba
A → Aa    | aAa
A → b     | ab
B → ε     | aa

The automaton in Figure 2.1 was constructed such that for every complete viable prefix, there exists a sequence of transitions from the starting state to a final state. The final state corresponds to a certain reduction. Therefore, the final states are distinct and every final state is labeled by the rule that will be used for the reduction. We outline the LR(0) parsing algorithm (the exact algorithm will be stated later): Read the input string and traverse the LR automaton accordingly. Store the reached states in the pushdown store.
When a final state is reached, a reduction is performed. This means that the states that correspond to the right-hand side of the reduction rule are replaced by the nonterminal standing on the left-hand side of the reduction rule. In terms of the LR automaton, this can be thought of as a return to the state that corresponds to the situation before treating the first symbol of the right-hand side of the reduction rule. In

that state, an edge labeled by the nonterminal standing on the left-hand side of the reduction rule must exist. Using such an edge, a new automaton state is reached.

Figure 2.1: LR automaton (state diagram not reproduced in this transcription)

To simply evaluate the end of the parsing, we augment the grammar with a new starting symbol S' and a new rule S' → S. The complete viable prefix for that rule is always S. The reduction by this rule can therefore be considered the end of parsing and the acceptance of the input string. The LR automaton states can be mnemonically labeled as follows: edges labeled by a symbol X lead to a state Xi, where the subscript i is chosen so that the label Xi is unique. The starting state will be labeled #. There are three methods to construct the LR automaton:

1. By constructing the collection of sets of LR items. The goto function is the transition function of the LR automaton.
2. By constructing the LR automaton directly from the grammar.
3. By establishing a system of regular equations. The solutions of the equation system are regular expressions describing the sets of complete viable prefixes. The LR automaton can be obtained by constructing a finite automaton from these regular expressions.

The first method will be studied in detail. The second and the third ones are described in [2,8].

Definition 2.12: An LR(0) item is a rule from the grammar with a position mark on its right-hand side. We will use the symbol . (dot) to mark the position, for instance A → α.β. A set of LR(0) items contains LR(0) items which have an identical symbol before the dot mark. A set of LR(0) items describes the parsing state at the moment when a certain symbol was pushed onto the pushdown store. The symbols after the dot mark are the symbols which might be pushed onto the pushdown store when changing to a new state. There are two important kinds of LR(0) items in a set of LR(0) items:

1. LR(0) items where the dot mark is followed by a terminal symbol.
These items represent situations where a shift will be performed.
2. LR(0) items where the dot mark is placed at the end of the right-hand side. These items represent reduction states.

For every set M of LR(0) items, there exists a successor set of LR(0) items for every symbol located after the dot mark. We start from the initial set of LR(0) items when constructing the

collection of sets of LR(0) items. Afterwards, all successors of the initial set are constructed. This operation is performed repeatedly for every set of LR(0) items. A set of LR(0) items is constructed from its kernel and its closure. When constructing the closure, new LR(0) items are added to the set. These items have the dot mark before the first symbol of the right-hand side, and their left-hand side nonterminal appears just after the dot mark in some other LR(0) item belonging to the set. The construction of the collection of sets of LR(0) items is described in Algorithm 2.13.

Algorithm 2.13: Construction of the collection of sets of LR(0) items.
Input: A context-free grammar G = (N,T,P,S).
Output: A collection C of sets of LR(0) items for grammar G.
Method:
1. Prepare the augmented grammar G': G' = (N ∪ {S'}, T, P ∪ {S' → S}, S'), where S' ∉ N.
2. The initial set of LR(0) items # is constructed as follows:
   (a) # := {S' → .S}.
   (b) If A → .Bα ∈ #, B ∈ N and B → β ∈ P, then # := # ∪ {B → .β}.
   (c) Repeat step (b) until no new item can be added to the set #.
   (d) C := {#}, # is the initial set.
3. Having constructed a set of LR(0) items Mi, a new set of LR(0) items Xj will be constructed for every symbol X ∈ (N ∪ T) which appears just after the dot mark in some LR(0) item in Mi. The index j is chosen to be higher than the so-far highest index n used in a label Xn.
   (a) Xj := {A → αX.β : A → α.Xβ ∈ Mi}.
   (b) If A → α.Bβ ∈ Xj, B ∈ N and B → γ ∈ P, then Xj := Xj ∪ {B → .γ}.
   (c) Repeat step (b) until no new item can be added to Xj.
   (d) C := C ∪ {Xj}, goto(Mi,X) = Xj.
4. Repeat step 3 until no new set of LR(0) items can be added to the collection C.
Note: Steps 2(a) and 3(a) create the kernel of a set of LR(0) items. The closure is computed by repeating steps 2(b) and 3(b).

Example 2.14: Given the grammar G = ({S,A,B},{a,b,c},P,S), where P contains the rules below, evaluate the collection of sets of LR(0) items.

(1) S → B
(2) B → aBb
(3) B → A
(4) A → bA
(5) A → c

The collection of sets of LR(0) items is:

# = {S' → .S, S → .B, B → .aBb, B → .A, A → .bA, A → .c}
S = {S' → S.}
B1 = {S → B.}
A1 = {B → A.}
a = {B → a.Bb, B → .aBb, B → .A, A → .bA, A → .c}
b1 = {A → b.A, A → .bA, A → .c}
c = {A → c.}
B2 = {B → aB.b}
A2 = {A → bA.}
b2 = {B → aBb.}

Now, we are ready to introduce the goto function defined over the sets of LR(0) items for grammar G.

Definition 2.15: Function goto(Mi,X) = Xj, if items of the form A → α.Xβ are in the set Mi and the kernel of Xj was formed by the items of the form A → αX.β.

The goto function can be represented as an oriented graph with labeled nodes and edges. The transition goto(Mi,X) = Xj corresponds to an edge labeled X leading from node Mi to node Xj, as depicted in Figure 2.2. The goto function is the transition mapping of the LR automaton.

Figure 2.2: goto function as a graph (an edge labeled X from node Mi to node Xj)

Example 2.16: Let us construct the goto function for the collection of sets of LR(0) items from Example 2.14. The function is depicted in Figure 2.3.

Figure 2.3: LR automaton from Example 2.16 (state diagram not reproduced in this transcription)
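Algorithm 2.13 can be sketched in a few lines of code. The following fragment is a minimal illustrative sketch, not the textbook's own tooling: items are represented as triples (left-hand side, right-hand side, dot position), sets of LR(0) items as frozensets, and all names are assumptions of this sketch. It is run on the grammar of Example 2.14.

```python
# A sketch of Algorithm 2.13: closure, goto, and the collection of sets
# of LR(0) items, for the grammar of Example 2.14.

def closure(items, rules):
    """Steps 2(b)/3(b): add B -> .beta for every nonterminal B after a dot."""
    result = set(items)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs, dot) in list(result):
            if dot < len(rhs):
                for (l, r) in rules:
                    if l == rhs[dot] and (l, r, 0) not in result:
                        result.add((l, r, 0))
                        changed = True
    return frozenset(result)

def goto(items, x, rules):
    """Step 3(a): kernel of the successor set for symbol x, then its closure."""
    kernel = {(l, r, d + 1) for (l, r, d) in items if d < len(r) and r[d] == x}
    return closure(kernel, rules) if kernel else None

def collection(rules, start):
    """Steps 2 and 4: build the collection C and the goto transitions."""
    symbols = {s for (_, r) in rules for s in r} | {start}
    initial = closure({("S'", (start,), 0)}, rules)
    c, transitions, work = {initial}, {}, [initial]
    while work:
        m = work.pop()
        for x in symbols:
            succ = goto(m, x, rules)
            if succ is not None:
                transitions[(m, x)] = succ
                if succ not in c:
                    c.add(succ)
                    work.append(succ)
    return c, transitions, initial

# Grammar from Example 2.14: S -> B, B -> aBb, B -> A, A -> bA, A -> c.
rules = [("S", ("B",)), ("B", ("a", "B", "b")), ("B", ("A",)),
         ("A", ("b", "A")), ("A", ("c",))]
c, transitions, initial = collection(rules, "S")
print(len(c))  # -> 10, matching the sets #, S, B1, A1, a, b1, c, B2, A2, b2
```

Note that sets with identical items are merged here because equal frozensets compare equal; this matches the example above, where a single state b1 is reachable both from # and from a.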

Definition 2.17: A context-free grammar G = (N,T,P,S) is an LR(0) grammar if the following holds: if a set M from the collection of sets of LR(0) items for the grammar G contains an item of the form A → α., then the set M does not contain any other item of the form B → β. or B → β.γ, where γ starts with a terminal.

Note: The above definition is equivalent to Definition 2.9 for the case of k = 0.

We will now present how to construct a parsing table based on a collection of sets of LR(0) items.

Algorithm 2.18: Construction of a parsing table for LR(0) grammars.
Input: A collection C of sets of LR(0) items for the augmented grammar G' = (N,T,P,S').
Output: Parsing table p for grammar G.
Method: The parsing table p will have rows labeled by the symbols that correspond to the sets from C. For all Mi ∈ C do:
1. p(Mi) = accept, if S' → S. ∈ Mi,
2. p(Mi) = reduce(j), if A → β. ∈ Mi and A → β is the j-th rule from P, where A ≠ S' and β ≠ S,
3. p(Mi) = shift in all other cases.

Example 2.19: Given the grammar G' = ({S',S,A,B},{a,b,0,1},P,S'), where P contains the rules:

(0) S' → S
(1) S → A
(2) S → B
(3) A → aAb
(4) A → 0
(5) B → aBbb
(6) B → 1

This grammar is not an LL(k) grammar for any k. We will demonstrate that it is an LR(0) grammar. First, we will construct the collection of sets of LR(0) items for G' using Algorithm 2.13.

# = {S → .A, S → .B, A → .aAb, A → .0, B → .aBbb, B → .1, S' → .S}
a = {A → a.Ab, B → a.Bbb, A → .aAb, A → .0, B → .aBbb, B → .1}
A1 = {S → A.}
B1 = {S → B.}
0 = {A → 0.}
1 = {B → 1.}
A2 = {A → aA.b}
B2 = {B → aB.bb}
b1 = {A → aAb.}
b2 = {B → aBb.b}
b3 = {B → aBbb.}
S = {S' → S.}

From the structure of the sets of LR(0) items, it is clear that the grammar is LR(0). We will construct the parsing table and the LR automaton. The automaton is depicted in Figure 2.4.
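The case analysis of Algorithm 2.18 can be mirrored directly in code. The following fragment is an illustrative sketch under stated assumptions: the completed items (dot at the end of the right-hand side) of each set of Example 2.19 are transcribed by hand from the listing above, since only they decide the table entry; the names are this sketch's, not the textbook's.

```python
# A sketch of Algorithm 2.18 applied to the item sets of Example 2.19.
# Rule numbering follows the example; rule (0) S' -> S is excluded from
# the reductions, exactly as step 2 of the algorithm requires.

rules = {1: ("S", "A"), 2: ("S", "B"), 3: ("A", "aAb"),
         4: ("A", "0"), 5: ("B", "aBbb"), 6: ("B", "1")}

# For every set of LR(0) items, only its completed items are recorded here.
completed = {
    "#": [], "a": [], "A2": [], "B2": [], "b2": [],
    "A1": [("S", "A")], "B1": [("S", "B")], "0": [("A", "0")],
    "1": [("B", "1")], "b1": [("A", "aAb")], "b3": [("B", "aBbb")],
    "S": [("S'", "S")],
}

def table_entry(m):
    # Step 1: accept on the completed augmented rule S' -> S.
    if ("S'", "S") in completed[m]:
        return "accept"
    # Step 2: reduce(j) for a completed j-th rule.
    for j, rule in rules.items():
        if rule in completed[m]:
            return "reduce(%d)" % j
    # Step 3: shift in all other cases.
    return "shift"

for m in completed:
    print(m, table_entry(m))
```

The printed entries reproduce the parsing table shown below for Example 2.19, e.g. reduce(3) for b1 and accept for S.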

    p     action
    #     shift
    A1    reduce(1)
    B1    reduce(2)
    a     shift
    0     reduce(4)
    1     reduce(6)
    A2    shift
    B2    shift
    b1    reduce(3)
    b2    shift
    b3    reduce(5)
    S     accept

Figure 2.4: LR automaton from Example 2.19 (state diagram not reproduced in this transcription)

The parsing algorithm for LR(0) grammars is similar to the parsing algorithm for strong LR(0) grammars. The difference is that the states of the LR automaton will be stored in the pushdown store instead of the grammar symbols.

Algorithm 2.20: Parsing algorithm for LR(0) grammars.
Input: Parsing table p and LR automaton for grammar G, input string w ∈ T*, and the initial symbol stored in the pushdown store (the label of the initial set of LR(0) items; # is the conventional name used in this textbook).
Output: Right parse of w in case w ∈ L(G), an error signaling otherwise.
Method: The algorithm reads symbols from the input string w, uses the pushdown store, and creates a sequence of the numbers of the rules used for reductions. The initial pushdown store contents is #. The algorithm repeats steps 1 to 7 until it either accepts the string or detects an error. In the steps below, let X denote the topmost symbol in the pushdown store.

1. If p(X) = shift, read one symbol and continue with step (5).
2. If p(X) = reduce(i), find the i-th rule in P, let it be A → α. Pop |α| symbols from the pushdown store and append the rule number (i) to the output. Continue with step (5).
3. If p(X) = accept and the entire input string was read, the parsing terminates, the input string w is accepted, and the output string is the right parse of w. If the entire input string was not read, the parsing ends with an error signaling.
4. If p(X) = error, the parsing ends with an error signaling.
5. Let Y be the symbol that is to be pushed onto the pushdown store (Y is either the input symbol read in step (1) or the left-hand side of the rule used in step (2)) and let X be the symbol on the top of the pushdown store (note that step (2) could have removed a certain number of symbols from the pushdown store).
6. If goto(X,Y) = Z, then push Z onto the pushdown store and continue with step (1).
7. If goto(X,Y) is not defined, the parsing ends with an error signaling.

Example 2.21: We will demonstrate LR(0) parsing for the input string aa0bb using the parsing table and the LR automaton from Example 2.19.

(#, aa0bb, ε) ⊢ (#a, a0bb, ε) ⊢ (#aa, 0bb, ε) ⊢ (#aa0, bb, ε) ⊢ (#aaA2, bb, 4) ⊢ (#aaA2b1, b, 4) ⊢ (#aA2, b, 43) ⊢ (#aA2b1, ε, 43) ⊢ (#A1, ε, 433) ⊢ (#S, ε, 4331)

Simple LR(k) grammars

In the previous section, the construction of the LR(0) parser was studied. The condition for an LR(0) grammar is fulfilled only by a small subset of grammars. Very often, there is a set of LR(0) items that contains an item of the form A → α., which represents a reduction, as well as another item of the form B → β. representing some other reduction, or an item of the form C → γ.aδ representing a shift. Such a grammar is not an LR(0) one. We will demonstrate it in the next example.

Example 2.22: Given the grammar G' = ({E',E,T,F},{+,*,(,),a},P,E'), where the set of rules P contains:

(0) E' → E
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → a

The collection of sets of LR(0) items for G' contains the following sets:

# = {E' → .E, E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .a}
E1 = {E' → E., E → E.+T}
T1 = {E → T., T → T.*F}
F1 = {T → F.}
a = {F → a.}
+ = {E → E+.T, T → .T*F, T → .F, F → .(E), F → .a}
* = {T → T*.F, F → .(E), F → .a}
( = {F → (.E), E → .E+T, E → .T, T → .T*F, T → .F, F → .(E), F → .a}
E2 = {F → (E.), E → E.+T}
T2 = {E → E+T., T → T.*F}
F2 = {T → T*F.}
) = {F → (E).}

The grammar G' is not an LR(0) one. For instance, in the set E1, there are two items E' → E. and E → E.+T. There does not exist a way to decide whether to accept the input string (reduce by E' → E) or to shift the symbol +. This situation is named a shift-reduce conflict. If there are two different items of the form A → α. and B → β. in one set, we name it a reduce-reduce conflict.

The conflicts in sets of LR(0) items can sometimes be removed. The decision about the operation to be performed in the conflicting situation can be made based on the symbols that appear at the beginning of the not-yet-read part of the input string. If this idea leads to the removal of the conflicts, we say that the grammar is a simple LR(k) grammar (also an SLR(k) grammar). The k ≥ 1 denotes the length of the prefix of the not-yet-read part of the input string that needs to be scanned to remove the conflicts in the sets of LR(0) items.

Definition 2.23: A context-free grammar G = (N,T,P,S) is a simple LR(k) (SLR(k)) grammar if the following holds: Let C be the collection of sets of LR(0) items for G and let A → α.β and B → γ.δ be two different LR(0) items in an arbitrary set of LR(0) items in C. Then any such pair of items must match at least one of the following conditions:

1. either β ∈ N(N ∪ T)* or δ ∈ N(N ∪ T)*,
2. neither β nor δ is the empty string,
3. β ≠ ε, δ = ε and FOLLOWk(B) ∩ FIRSTk(β FOLLOWk(A)) = ∅,
4. β = ε, δ ≠ ε and FOLLOWk(A) ∩ FIRSTk(δ FOLLOWk(B)) = ∅,
5. β = δ = ε and FOLLOWk(A) ∩ FOLLOWk(B) = ∅.

The notation β ∈ N(N ∪ T)* means that the string β starts with a nonterminal symbol. The parsing table p for SLR(k) grammars can be constructed using the following algorithm:

Algorithm 2.24: The construction of the parsing table p for an SLR(k) grammar.
Input: An SLR(k) grammar G = (N,T,P,S) and the collection C of sets of LR(0) items for grammar G.
Output: The parsing table p for grammar G.
Method: The parsing table p will have rows labeled by the names that correspond to the names of the sets from C. The columns will be labeled by strings from T^{*k} (strings of terminals of length at most k).

27 1. p(m i,u) = shift, if A β 1.β 2 M i, β 2 T(N T), and u FIRST k (β 2 FOLLOW k (A)). 2. p(m i,u) = reduce(j), if j 1, A β. M i, A β is j-th rule in P, and u FOLLOW k (A). 3. p(m i,ε) = accept, if S S. M i. 4. p(m i,u) = error in all other cases. We will now show how to use the above algorithm to construct parsing table for grammar G from Example Example 2.25: First, we will construct the goto function. Function goto(a, X) is defined for X N T. For the collection of sets of LR(0) items from Example 2.22, the goto function is depicted as an LR automaton in Figure 2.5. START # E F T a E 1 F 1 ( + F F T 1 a a + ( T + E E 2 * a T 2 ) * ) ( T a * F F 2 ( ( Figure 2.5: LR automaton for grammar from Example 2.22 Using Algorithm 2.24, we will construct parsing table F for grammar from Example The table contains the following abbreviations: Sh shift, R i reduce(i), and A accept. Error entries are left blank. 20

    p     a     +     *     (     )     ε
    #     Sh                Sh
    E1          Sh                      A
    T1          R2    Sh          R2    R2
    F1          R4    R4          R4    R4
    a           R6    R6          R6    R6
    (     Sh                Sh
    +     Sh                Sh
    *     Sh                Sh
    E2          Sh                Sh
    T2          R1    Sh          R1    R1
    F2          R3    R3          R3    R3
    )           R5    R5          R5    R5

Now, we have to present a modified parsing algorithm, since the parsing algorithm for LR(0) grammars cannot be used directly. The reason is the fact that the parsing table p is two-dimensional in the case of SLR(k) grammars.

Algorithm 2.26: Parsing algorithm for SLR(k) grammars. The parsing algorithm is suitable for LALR(k) and LR(k) grammars as well. (The latter two classes of grammars will be presented in the next sections.)
Input: Parsing table p and LR automaton for grammar G = (N,T,P,S), input string w ∈ T*, and the initial symbol stored in the pushdown store (the label of the initial set of LR(0) items; # is the conventional name in this textbook).
Output: Right parse of w in case w ∈ L(G), an error signaling otherwise.
Method: The algorithm reads symbols from the input string w, uses the pushdown store, and creates a sequence of the numbers of the rules used for reductions. The initial pushdown store contents is #. The algorithm repeats steps 1 to 8 until it either accepts the string or detects an error. In the steps below, let X denote the topmost symbol in the pushdown store.
1. Evaluate the first k symbols of the not-yet-read part of the input string. Let it be the string u.
2. If p(X,u) = shift, read one input symbol and proceed with step (6).
3. If p(X,u) = reduce(i), find the i-th rule in P, let it be A → α. Pop |α| symbols from the pushdown store and append the rule number (i) to the output. Continue with step (6).
4. If p(X,u) = accept (i.e. u = ε and therefore the entire input string was read), the parsing terminates, the input string w is accepted, and the output string is the right parse of w.
5. If p(X,u) = error, the parsing ends with an error signaling.
6.
Let Y be the symbol that is to be pushed onto the pushdown store (Y is either the input symbol read in step (2) or the left-hand side of the rule used in step (3)) and let X be the symbol on the top of the pushdown store (note that step (3) could have removed a certain number of symbols from the pushdown store).
7. If goto(X,Y) = Z, then push Z onto the pushdown store and continue with step (1).
8. If goto(X,Y) is not defined, the parsing ends with an error signaling.

Note: When reducing in step (3), the contents of the pushdown store can simply be removed without further examination. This is different from the case of strong LR parsers (see Algorithm 2.6).

Example 2.27: We will demonstrate the parsing for grammar G' from Example 2.22. The input string will be a+a*a. We will use the parsing table and the LR automaton from Example 2.25.
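Algorithm 2.26 with k = 1 can be sketched as a small table-driven driver. The following fragment is a minimal illustrative sketch under stated assumptions: the table p and the goto function are transcribed by hand from Example 2.25, terminals are single characters, and the empty string "" plays the role of the lookahead ε.

```python
# A sketch of Algorithm 2.26 (k = 1) for the SLR(1) grammar of Example 2.22.

rules = {1: ("E", "E+T"), 2: ("E", "T"), 3: ("T", "T*F"),
         4: ("T", "F"), 5: ("F", "(E)"), 6: ("F", "a")}

# Parsing table p(X, u): missing entries mean error; integers mean reduce(i).
Sh, A = "shift", "accept"
p = {"#":  {"a": Sh, "(": Sh},
     "E1": {"+": Sh, "": A},
     "T1": {"+": 2, "*": Sh, ")": 2, "": 2},
     "F1": {"+": 4, "*": 4, ")": 4, "": 4},
     "a":  {"+": 6, "*": 6, ")": 6, "": 6},
     "(":  {"a": Sh, "(": Sh},
     "+":  {"a": Sh, "(": Sh},
     "*":  {"a": Sh, "(": Sh},
     "E2": {"+": Sh, ")": Sh},
     "T2": {"+": 1, "*": Sh, ")": 1, "": 1},
     "F2": {"+": 3, "*": 3, ")": 3, "": 3},
     ")":  {"+": 5, "*": 5, ")": 5, "": 5}}

goto = {("#", "a"): "a", ("#", "("): "(", ("#", "E"): "E1",
        ("#", "T"): "T1", ("#", "F"): "F1",
        ("E1", "+"): "+", ("T1", "*"): "*", ("T2", "*"): "*",
        ("+", "a"): "a", ("+", "("): "(", ("+", "T"): "T2", ("+", "F"): "F1",
        ("*", "a"): "a", ("*", "("): "(", ("*", "F"): "F2",
        ("(", "a"): "a", ("(", "("): "(", ("(", "E"): "E2",
        ("(", "T"): "T1", ("(", "F"): "F1",
        ("E2", ")"): ")", ("E2", "+"): "+"}

def parse(w):
    stack, i, output = ["#"], 0, []
    while True:
        x = stack[-1]
        u = w[i] if i < len(w) else ""          # step 1: k = 1 lookahead
        action = p.get(x, {}).get(u)
        if action == Sh:                        # step 2: shift
            y = w[i]; i += 1
        elif action == A:                       # step 4: accept
            return "".join(output)
        elif action in rules:                   # step 3: reduce(i)
            lhs, rhs = rules[action]
            del stack[len(stack) - len(rhs):]   # pop |rhs| symbols
            output.append(str(action))
            y = lhs
        else:                                   # step 5: error
            raise SyntaxError("error at position %d" % i)
        x = stack[-1]                           # step 6
        if (x, y) not in goto:                  # step 8
            raise SyntaxError("goto undefined")
        stack.append(goto[(x, y)])              # step 7

print(parse("a+a*a"))  # -> 64264631, the right parse of a+a*a
```

Running the driver on a+a*a performs the reductions 6, 4, 2, 6, 4, 6, 3, 1, i.e. it produces the right parse 64264631.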


More information

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata : Organization of Programming Languages Theory of Regular Expressions Finite Automata Previous Course Review {s s defined} means the set of string s such that s is chosen or defined as given s A means

More information

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata. Pushdown Automata We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata. Next we consider a more powerful computation model, called a

More information

Uses of finite automata

Uses of finite automata Chapter 2 :Finite Automata 2.1 Finite Automata Automata are computational devices to solve language recognition problems. Language recognition problem is to determine whether a word belongs to a language.

More information

Compiler Design 1. LR Parsing. Goutam Biswas. Lect 7

Compiler Design 1. LR Parsing. Goutam Biswas. Lect 7 Compiler Design 1 LR Parsing Compiler Design 2 LR(0) Parsing An LR(0) parser can take shift-reduce decisions entirely on the basis of the states of LR(0) automaton a of the grammar. Consider the following

More information

Finite Automata and Regular Languages

Finite Automata and Regular Languages Finite Automata and Regular Languages Topics to be covered in Chapters 1-4 include: deterministic vs. nondeterministic FA, regular expressions, one-way vs. two-way FA, minimization, pumping lemma for regular

More information

String Suffix Automata and Subtree Pushdown Automata

String Suffix Automata and Subtree Pushdown Automata String Suffix Automata and Subtree Pushdown Automata Jan Janoušek Department of Computer Science Faculty of Information Technologies Czech Technical University in Prague Zikova 1905/4, 166 36 Prague 6,

More information

5 Context-Free Languages

5 Context-Free Languages CA320: COMPUTABILITY AND COMPLEXITY 1 5 Context-Free Languages 5.1 Context-Free Grammars Context-Free Grammars Context-free languages are specified with a context-free grammar (CFG). Formally, a CFG G

More information

Bottom-Up Syntax Analysis

Bottom-Up Syntax Analysis Bottom-Up Syntax Analysis Wilhelm/Seidl/Hack: Compiler Design Syntactic and Semantic Analysis, Chapter 3 Reinhard Wilhelm Universität des Saarlandes wilhelm@cs.uni-saarland.de and Mooly Sagiv Tel Aviv

More information

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I Internal Examination 2017-18 B.Tech III Year VI Semester Sub: Theory of Computation (6CS3A) Time: 1 Hour 30 min. Max Marks: 40 Note: Attempt all three

More information

Theory of Computation (Classroom Practice Booklet Solutions)

Theory of Computation (Classroom Practice Booklet Solutions) Theory of Computation (Classroom Practice Booklet Solutions) 1. Finite Automata & Regular Sets 01. Ans: (a) & (c) Sol: (a) The reversal of a regular set is regular as the reversal of a regular expression

More information

Bottom-Up Syntax Analysis

Bottom-Up Syntax Analysis Bottom-Up Syntax Analysis Wilhelm/Seidl/Hack: Compiler Design Syntactic and Semantic Analysis Reinhard Wilhelm Universität des Saarlandes wilhelm@cs.uni-saarland.de and Mooly Sagiv Tel Aviv University

More information

Theory of Computation - Module 3

Theory of Computation - Module 3 Theory of Computation - Module 3 Syllabus Context Free Grammar Simplification of CFG- Normal forms-chomsky Normal form and Greibach Normal formpumping lemma for Context free languages- Applications of

More information

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer.

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer. Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer. Language Generator: Context free grammars are language generators,

More information

n Top-down parsing vs. bottom-up parsing n Top-down parsing n Introduction n A top-down depth-first parser (with backtracking)

n Top-down parsing vs. bottom-up parsing n Top-down parsing n Introduction n A top-down depth-first parser (with backtracking) Announcements n Quiz 1 n Hold on to paper, bring over at the end n HW1 due today n HW2 will be posted tonight n Due Tue, Sep 18 at 2pm in Submitty! n Team assignment. Form teams in Submitty! n Top-down

More information

NODIA AND COMPANY. GATE SOLVED PAPER Computer Science Engineering Theory of Computation. Copyright By NODIA & COMPANY

NODIA AND COMPANY. GATE SOLVED PAPER Computer Science Engineering Theory of Computation. Copyright By NODIA & COMPANY No part of this publication may be reproduced or distributed in any form or any means, electronic, mechanical, photocopying, or otherwise without the prior permission of the author. GATE SOLVED PAPER Computer

More information

Chapter Five: Nondeterministic Finite Automata

Chapter Five: Nondeterministic Finite Automata Chapter Five: Nondeterministic Finite Automata From DFA to NFA A DFA has exactly one transition from every state on every symbol in the alphabet. By relaxing this requirement we get a related but more

More information

Computational Models - Lecture 5 1

Computational Models - Lecture 5 1 Computational Models - Lecture 5 1 Handout Mode Iftach Haitner and Yishay Mansour. Tel Aviv University. April 10/22, 2013 1 Based on frames by Benny Chor, Tel Aviv University, modifying frames by Maurice

More information

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario Parsing Algorithms CS 4447/CS 9545 -- Stephen Watt University of Western Ontario The Big Picture Develop parsers based on grammars Figure out properties of the grammars Make tables that drive parsing engines

More information

Languages, regular languages, finite automata

Languages, regular languages, finite automata Notes on Computer Theory Last updated: January, 2018 Languages, regular languages, finite automata Content largely taken from Richards [1] and Sipser [2] 1 Languages An alphabet is a finite set of characters,

More information

(NB. Pages are intended for those who need repeated study in formal languages) Length of a string. Formal languages. Substrings: Prefix, suffix.

(NB. Pages are intended for those who need repeated study in formal languages) Length of a string. Formal languages. Substrings: Prefix, suffix. (NB. Pages 22-40 are intended for those who need repeated study in formal languages) Length of a string Number of symbols in the string. Formal languages Basic concepts for symbols, strings and languages:

More information

Theory of Computation

Theory of Computation Thomas Zeugmann Hokkaido University Laboratory for Algorithmics http://www-alg.ist.hokudai.ac.jp/ thomas/toc/ Lecture 3: Finite State Automata Motivation In the previous lecture we learned how to formalize

More information

CS20a: summary (Oct 24, 2002)

CS20a: summary (Oct 24, 2002) CS20a: summary (Oct 24, 2002) Context-free languages Grammars G = (V, T, P, S) Pushdown automata N-PDA = CFG D-PDA < CFG Today What languages are context-free? Pumping lemma (similar to pumping lemma for

More information

Theory of computation: initial remarks (Chapter 11)

Theory of computation: initial remarks (Chapter 11) Theory of computation: initial remarks (Chapter 11) For many purposes, computation is elegantly modeled with simple mathematical objects: Turing machines, finite automata, pushdown automata, and such.

More information

Syntax Analysis (Part 2)

Syntax Analysis (Part 2) Syntax Analysis (Part 2) Martin Sulzmann Martin Sulzmann Syntax Analysis (Part 2) 1 / 42 Bottom-Up Parsing Idea Build right-most derivation. Scan input and seek for matching right hand sides. Terminology

More information

Shift-Reduce parser E + (E + (E) E [a-z] In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.

Shift-Reduce parser E + (E + (E) E [a-z] In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules. Bottom-up Parsing Bottom-up Parsing Until now we started with the starting nonterminal S and tried to derive the input from it. In a way, this isn t the natural thing to do. It s much more logical to start

More information

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example Administrivia Test I during class on 10 March. Bottom-Up Parsing Lecture 11-12 From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS14 Lecture 11 1 2/20/08 Prof. Hilfinger CS14 Lecture 11 2 Bottom-Up

More information

Context-free Grammars and Languages

Context-free Grammars and Languages Context-free Grammars and Languages COMP 455 002, Spring 2019 Jim Anderson (modified by Nathan Otterness) 1 Context-free Grammars Context-free grammars provide another way to specify languages. Example:

More information

Theory of Computation (IV) Yijia Chen Fudan University

Theory of Computation (IV) Yijia Chen Fudan University Theory of Computation (IV) Yijia Chen Fudan University Review language regular context-free machine DFA/ NFA PDA syntax regular expression context-free grammar Pushdown automata Definition A pushdown automaton

More information

3515ICT: Theory of Computation. Regular languages

3515ICT: Theory of Computation. Regular languages 3515ICT: Theory of Computation Regular languages Notation and concepts concerning alphabets, strings and languages, and identification of languages with problems (H, 1.5). Regular expressions (H, 3.1,

More information

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission.

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission. CSE 5 Homework Due: Monday October 9, 7 Instructions Upload a single file to Gradescope for each group. should be on each page of the submission. All group members names and PIDs Your assignments in this

More information

CS Pushdown Automata

CS Pushdown Automata Chap. 6 Pushdown Automata 6.1 Definition of Pushdown Automata Example 6.2 L ww R = {ww R w (0+1) * } Palindromes over {0, 1}. A cfg P 0 1 0P0 1P1. Consider a FA with a stack(= a Pushdown automaton; PDA).

More information

Context free languages

Context free languages Context free languages Syntatic parsers and parse trees E! E! *! E! (! E! )! E + E! id! id! id! 2 Context Free Grammars The CF grammar production rules have the following structure X α being X N and α

More information

CS Rewriting System - grammars, fa, and PDA

CS Rewriting System - grammars, fa, and PDA Restricted version of PDA If (p, γ) δ(q, a, X), a Σ {ε}, p. q Q, X Γ. restrict γ Γ in three ways: i) if γ = YX, (q, ay, Xβ) (p, y, YXβ) push Y Γ, ii) if γ = X, (q, ay, Xβ) (p, y, Xβ) no change on stack,

More information

UNIT-I. Strings, Alphabets, Language and Operations

UNIT-I. Strings, Alphabets, Language and Operations UNIT-I Strings, Alphabets, Language and Operations Strings of characters are fundamental building blocks in computer science. Alphabet is defined as a non empty finite set or nonempty set of symbols. The

More information

Nondeterministic Finite Automata

Nondeterministic Finite Automata Nondeterministic Finite Automata Not A DFA Does not have exactly one transition from every state on every symbol: Two transitions from q 0 on a No transition from q 1 (on either a or b) Though not a DFA,

More information

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u, 1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u, v, x, y, z as per the pumping theorem. 3. Prove that

More information

Chapter 3. Regular grammars

Chapter 3. Regular grammars Chapter 3 Regular grammars 59 3.1 Introduction Other view of the concept of language: not the formalization of the notion of effective procedure, but set of words satisfying a given set of rules Origin

More information

Chapter 0 Introduction. Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin

Chapter 0 Introduction. Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin Chapter 0 Introduction Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin October 2014 Automata Theory 2 of 22 Automata theory deals

More information

Closure under the Regular Operations

Closure under the Regular Operations Closure under the Regular Operations Application of NFA Now we use the NFA to show that collection of regular languages is closed under regular operations union, concatenation, and star Earlier we have

More information

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska LECTURE 14 SMALL REVIEW FOR FINAL SOME Y/N QUESTIONS Q1 Given Σ =, there is L over Σ Yes: = {e} and L = {e} Σ Q2 There are uncountably

More information

EXAM. Please read all instructions, including these, carefully NAME : Problem Max points Points 1 10 TOTAL 100

EXAM. Please read all instructions, including these, carefully NAME : Problem Max points Points 1 10 TOTAL 100 EXAM Please read all instructions, including these, carefully There are 7 questions on the exam, with multiple parts. You have 3 hours to work on the exam. The exam is open book, open notes. Please write

More information

CPS 220 Theory of Computation Pushdown Automata (PDA)

CPS 220 Theory of Computation Pushdown Automata (PDA) CPS 220 Theory of Computation Pushdown Automata (PDA) Nondeterministic Finite Automaton with some extra memory Memory is called the stack, accessed in a very restricted way: in a First-In First-Out fashion

More information

Theory of Computer Science

Theory of Computer Science Theory of Computer Science C1. Formal Languages and Grammars Malte Helmert University of Basel March 14, 2016 Introduction Example: Propositional Formulas from the logic part: Definition (Syntax of Propositional

More information

컴파일러입문 제 3 장 정규언어

컴파일러입문 제 3 장 정규언어 컴파일러입문 제 3 장 정규언어 목차 3.1 정규문법과정규언어 3.2 정규표현 3.3 유한오토마타 3.4 정규언어의속성 Regular Language Page 2 정규문법과정규언어 A study of the theory of regular languages is often justified by the fact that they model the lexical

More information

Theory of computation: initial remarks (Chapter 11)

Theory of computation: initial remarks (Chapter 11) Theory of computation: initial remarks (Chapter 11) For many purposes, computation is elegantly modeled with simple mathematical objects: Turing machines, finite automata, pushdown automata, and such.

More information

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2 BIJU PATNAIK UNIVERSITY OF TECHNOLOGY, ODISHA Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2 Prepared by, Dr. Subhendu Kumar Rath, BPUT, Odisha. UNIT 2 Structure NON-DETERMINISTIC FINITE AUTOMATA

More information

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska LECTURE 11 CHAPTER 3 CONTEXT-FREE LANGUAGES 1. Context Free Grammars 2. Pushdown Automata 3. Pushdown automata and context -free

More information

Compiler Design Spring 2017

Compiler Design Spring 2017 Compiler Design Spring 2017 3.4 Bottom-up parsing Dr. Zoltán Majó Compiler Group Java HotSpot Virtual Machine Oracle Corporation 1 Bottom up parsing Goal: Obtain rightmost derivation in reverse w S Reduce

More information