LCFRS Exercises and solutions

Laura Kallmeyer, SS 2010

Question 1

1. Give a CFG for the following language: $\{a^n b^m c^m d^n \mid n > 0, m \geq 0\}$.
2. Show that the following language is not context-free: $\{a^{2^n} \mid n \geq 0\}$.
   Hint: Show that this language does not satisfy the CFL pumping lemma.

Solution:

1. Nonterminals $N = \{S, T\}$, terminals $T = \{a, b, c, d\}$, start symbol $S$ and productions $\{S \to aSd,\ S \to aTd,\ T \to bTc,\ T \to \varepsilon\}$.
2. To show: $L = \{a^{2^n} \mid n \geq 0\}$ is not context-free. We assume that $L$ is context-free. Then it satisfies the pumping lemma with a certain constant $c \geq 1$. The word $a^{2^c}$ is in the language, and its length $2^c$ is at least $c$. The next longer word is $a^{2^{c+1}}$ with $|a^{2^{c+1}}| - |a^{2^c}| = 2^{c+1} - 2^c = 2^c > c$. This contradicts the pumping lemma, according to which pumping once must yield a word in $L$ that is longer than $a^{2^c}$ and has length at most $|a^{2^c}| + c$.

Question 2

Consider the language $L_2 = \{a^n b^n \mid n \geq 0\}$.

1. Give a CFG for $L_2$ with nested dependencies, i.e., such that for each word $a_1 \dots a_n b_1 \dots b_n$ (the subscripts mark the occurrences of the $a$s and $b$s respectively) $a_i$ and $b_{n+1-i}$ are added by the same production for all $1 \leq i \leq n$.
2. Show that for $L_2$ there is no CFG displaying cross-serial dependencies, i.e., no CFG such that for each word $a_1 \dots a_n b_1 \dots b_n$, $a_i$ and $b_i$ are added by the same production for all $1 \leq i \leq n$ and, furthermore, different $a$s are added by different productions.
   Hint: You can argue that if such a CFG exists, then there also exists a CFG for the copy language, which contradicts the fact that the copy language is not context-free.

Solution:

1. $G = \langle N, T, P, S \rangle$ with $N = \{S\}$, $T = \{a, b\}$, start symbol $S$ and productions $S \to aSb$, $S \to \varepsilon$.
2. Assume that such a CFG exists. Its productions are then all of the form $X \to \alpha a \beta b \gamma$ with $X \in N$, $\alpha, \beta, \gamma \in N^*$ such that if such a production is applied when generating a string $a_1 \dots a_n b_1 \dots b_n$, then the $a$ and $b$ of the production necessarily end up at positions $i$ and $n + i$ for some $i$, $1 \leq i \leq n$. Then replacing each of these productions $X \to \alpha a \beta b \gamma$ with $X \to \alpha a \beta a \gamma$ and $X \to \alpha b \beta b \gamma$ leads to a CFG generating the copy language. Contradiction.
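The grammar from Question 1.1 and the length gap used in Question 1.2 can be checked mechanically for small cases. The following Python sketch is only an illustration: the names `PRODUCTIONS`, `generate`, `expected` and `gap_exceeds_constant`, as well as the length bound, are assumptions introduced here and are not part of the exercise sheet. It enumerates all words up to a given length by leftmost derivation and compares them with the intended language.

```python
# Productions of the CFG from Question 1.1; the empty list encodes T -> epsilon.
PRODUCTIONS = {
    "S": [["a", "S", "d"], ["a", "T", "d"]],
    "T": [["b", "T", "c"], []],
}


def generate(max_len):
    """Return all terminal words of length <= max_len derivable from S."""
    results, seen, stack = set(), set(), [("S",)]
    while stack:
        form = stack.pop()
        if form in seen:
            continue
        seen.add(form)
        # Prune sentential forms that already carry too many terminals.
        if sum(1 for s in form if s not in PRODUCTIONS) > max_len:
            continue
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, s in enumerate(form) if s in PRODUCTIONS), None)
        if idx is None:
            results.add("".join(form))
            continue
        for rhs in PRODUCTIONS[form[idx]]:
            stack.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return results


def expected(max_len):
    """Words a^n b^m c^m d^n with n > 0, m >= 0 and length <= max_len."""
    return {
        "a" * n + "b" * m + "c" * m + "d" * n
        for n in range(1, max_len + 1)
        for m in range(0, max_len + 1)
        if 2 * n + 2 * m <= max_len
    }


def gap_exceeds_constant(c):
    """Check 2^(c+1) - 2^c > c, the inequality used in Question 1.2."""
    return 2 ** (c + 1) - 2 ** c > c


if __name__ == "__main__":
    print(generate(8) == expected(8))
    print(all(gap_exceeds_constant(c) for c in range(1, 30)))
```

A bounded enumeration like this does not prove that the grammar is correct, and checking the inequality for finitely many constants does not replace the argument above; both only guard against slips in small cases.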

Question 3

Similar to Shieber's (1985) argument for Swiss German, one can first apply a homomorphism $f$, then intersect with some regular language, and then apply another homomorphism $g$ in order to reduce the language of Swiss German to the copy language $\{ww \mid w \in \{a, b\}^*\}$. Find the corresponding homomorphisms and the regular language.

Solution:

The first homomorphism can be chosen as the $f$ from Shieber (1985). Then intersect with the regular language $w\{a, b\}^* x \{c, d\}^* y$, which leads to $\{w v_1 x v_2 y \mid v_1 \in \{a, b\}^*, v_2 \in \{c, d\}^*$ such that $|v_1| = |v_2|$ and for all $i$, $1 \leq i \leq |v_1|$: if the $i$th symbol in $v_1$ is an $a$ (a $b$), the $i$th symbol in $v_2$ is a $c$ (a $d$)$\}$. Finally we apply a second homomorphism $g$ with $g(w) := g(x) := g(y) := \varepsilon$, $g(a) := g(c) := a$, $g(b) := g(d) := b$. This leads to the copy language.

Question 4

Consider the MCFG given by the following clauses (in simple RCG notation):

$S(XYZ) \to A(Y)\,B(X, Z)$
$A(aX) \to A(X)$
$B(bX, bYb) \to B(X, Y)$
$A(a) \to \varepsilon$
$B(\varepsilon, \varepsilon) \to \varepsilon$

1. Give the sets $\mathit{yield}(A)$ and $\mathit{yield}(B)$.
2. What is the string language generated by the grammar?

Solution:

1. $\mathit{yield}(A) = \{\langle a^n \rangle \mid n \geq 1\}$, $\mathit{yield}(B) = \{\langle b^n, (bb)^n \rangle \mid n \geq 0\}$.
2. $\{b^m a^n (bb)^m \mid n \geq 1, m \geq 0\}$.

Question 5

Give the language generated by the following simple RCG and give the derivation tree for a string of length 9.

$\text{S-REL}(XYZ) \to \text{VP-REL}(X, Z)\,\text{N-SUBJ}(Y)$
$\text{VP-REL}(X, YZ) \to (X, Z)\,\text{V}(Y)$
$(X, \text{a copy of } Y) \to (X, Y)$
$(X, \text{a picture of } Y) \to (X, Y)$
$\text{N-SUBJ}(\text{Peter}) \to \varepsilon$
$\text{V}(\text{painted}) \to \varepsilon$
$(\text{whom}, \varepsilon) \to \varepsilon$

Solution:

The string language is the regular language whom Peter painted ((a copy of) + (a picture of))$^*$. For the string "whom Peter painted a copy of a picture of" (of length 9), we obtain the following derivation tree:

[Figure: derivation tree with the nodes S-REL, N-SUBJ, VP-REL and V over the string "whom Peter painted a copy of a picture of" and $\varepsilon$.]

Question 6

Consider the simple RCG with the following clauses:

$S(XYZU) \to A(X, Z)\,B(U, Y)$
$S(XYZ) \to A(X, Z)\,C(Y)$
$A(aX, aZ) \to A(X, Z)$
$A(\varepsilon, c) \to \varepsilon$
$B(Xb, Yb) \to B(X, Y)$
$B(\varepsilon, c) \to \varepsilon$
$C(aXY) \to D(X)\,C(Y)$
$D(d) \to \varepsilon$

1. Perform the following transformations on this simple RCG, always obtaining weakly equivalent simple RCGs:
   (a) Transform the grammar into an ordered simple RCG.
   (b) Remove useless rules.
   (c) Remove $\varepsilon$-rules.
2. What is the string language generated by this grammar?

Solution:

1. Simplifying the grammar:

(a) Transform the grammar into an ordered simple RCG. (If the superscript is the identity, we omit it.) The only problematic rule is $S(XYZU) \to A(X, Z)\,B(U, Y)$. It transforms into $S(XYZU) \to A(X, Z)\,B^{\langle 2,1 \rangle}(Y, U)$. Add $B^{\langle 2,1 \rangle}(Yb, Xb) \to B(X, Y)$ and $B^{\langle 2,1 \rangle}(c, \varepsilon) \to \varepsilon$. Then, $B^{\langle 2,1 \rangle}(Yb, Xb) \to B(X, Y)$ transforms into $B^{\langle 2,1 \rangle}(Yb, Xb) \to B^{\langle 2,1 \rangle}(Y, X)$. In the following, for reasons of readability, we replace $B^{\langle 2,1 \rangle}$ with a new symbol $E$. Result:

$S(XYZU) \to A(X, Z)\,E(Y, U)$
$S(XYZ) \to A(X, Z)\,C(Y)$
$A(aX, aZ) \to A(X, Z)$
$A(\varepsilon, c) \to \varepsilon$
$B(Xb, Yb) \to B(X, Y)$
$B(\varepsilon, c) \to \varepsilon$
$E(Yb, Xb) \to E(Y, X)$
$E(c, \varepsilon) \to \varepsilon$
$C(aXY) \to D(X)\,C(Y)$
$D(d) \to \varepsilon$

(b) Remove useless rules. $N_T = \{A, B, E, D, S\}$. Consequently, remove $S(XYZ) \to A(X, Z)\,C(Y)$ and $C(aXY) \to D(X)\,C(Y)$. In the result, $N_S = \{S, A, E\}$. Consequently, remove also $D(d) \to \varepsilon$, $B(Xb, Yb) \to B(X, Y)$ and $B(\varepsilon, c) \to \varepsilon$. Result:

$S(XYZU) \to A(X, Z)\,E(Y, U)$
$A(aX, aZ) \to A(X, Z)$
$E(Yb, Xb) \to E(Y, X)$
$A(\varepsilon, c) \to \varepsilon$
$E(c, \varepsilon) \to \varepsilon$

(c) Remove $\varepsilon$-rules. $N_\varepsilon = \{A^{01}, A^{11}, E^{10}, E^{11}, S^{1}\}$. Resulting productions:

$S^{1}(XYZU) \to A^{11}(X, Z)\,E^{11}(Y, U)$
$S^{1}(YZU) \to A^{01}(Z)\,E^{11}(Y, U)$
$S^{1}(XYZ) \to A^{11}(X, Z)\,E^{10}(Y)$
$S^{1}(YZ) \to A^{01}(Z)\,E^{10}(Y)$
$A^{11}(aX, aZ) \to A^{11}(X, Z)$
$A^{11}(a, aZ) \to A^{01}(Z)$
$A^{01}(c) \to \varepsilon$
$E^{11}(Yb, Xb) \to E^{11}(Y, X)$
$E^{11}(Yb, b) \to E^{10}(Y)$
$E^{10}(c) \to \varepsilon$

2. The string language generated by this grammar is $\{a^n c b^m a^n c b^m \mid n, m \geq 0\}$.

Question 7

Show that the language $\{w^5 \mid w \in \{a, b\}^*\}$ is not a 2-MCFL.
Hint: Intersect first with the regular language $a^+ b^+ a^+ b^+ a^+ b^+ a^+ b^+ a^+ b^+$ and then show that the result does not satisfy the pumping lemma.

Solution:

We assume that $L = \{w^5 \mid w \in \{a, b\}^*\}$ is a 2-MCFL. Then the language $L' = \{a^n b^m a^n b^m a^n b^m a^n b^m a^n b^m \mid n, m > 0\}$, which we obtain from intersecting $L$ with the regular language denoted by $a^+ b^+ a^+ b^+ a^+ b^+ a^+ b^+ a^+ b^+$, must also be a 2-MCFL. Consequently, with the pumping lemma, there must be at least one word in the language of the form $w_1 v_1 w_2 v_2 w_3 v_3 w_4 v_4 w_5$ where $v_1 v_2 v_3 v_4 \neq \varepsilon$ such that the $v_i$ ($1 \leq i \leq 4$) can be iterated. Each of the $v_1, \dots, v_4$ must necessarily contain either only $a$s or only $b$s, otherwise the next iteration step would lead to a word outside the language. However, this means that by these iterations only some and not all of the exponents $n$ and $m$ get increased (since at most four substrings are iterated but we have five exponents $n$ and five exponents $m$). That is, after the next iteration we necessarily obtain a word with either two $a$-sequences of different length or two $b$-sequences of different length. This means that the word we obtain by iteration is not in $L'$. Therefore, $L'$ does not satisfy the pumping lemma for 2-MCFL, which contradicts our assumption that $L$ (and $L'$) are 2-MCFLs.

Question 8

1. Show that the copy language $\{ww \mid w \in T^*\}$ for some alphabet $T$ is semilinear using the Parikh Theorem.
2. Show that $\{a^{2^n} \mid n \geq 0\}$ is not semilinear.
   Hint: If the language were semilinear, it would satisfy the constant growth property. Show that this is not the case.

Solution:

1. The copy language $L := \{ww \mid w \in T^*\}$ is letter equivalent to $L' := \{ww^R \mid w \in T^*$ and $w^R$ is $w$ in reverse order$\}$, which is a CFL: It is generated by the CFG with productions $S \to \varepsilon$ and $S \to xSx$ for all $x \in T$. Consequently (with Parikh's theorem) $L'$ and also $L$ are semilinear.
2. Assume that $\{a^{2^n} \mid n \geq 0\}$ satisfies the constant growth property with $c_0$ and $C$. Then take a $w = a^{2^m}$ with $|w| = 2^m > \max(\{c_0\} \cup C)$. Then, according to the definition of constant growth, for $w' = a^{2^{m+1}}$ there must be a $w'' = a^{2^k}$ with $|w'| = |w''| + c$ for some $c \in C$. That is, $2^{m+1} = 2^k + c$. Consequently (since $k \leq m$) $c \geq 2^m > \max(C)$, which contradicts $c \in C$.

Question 9

Consider the following TA: $M = \langle N, T, S', \mathit{ret}, \kappa, K, \delta, U, \Theta \rangle$ with $N = \{S, S', S_A, S_B, A_2, B_2, \mathit{ret}\}$, $T = \{a, b\}$, $K = N$ and $\kappa$ the identity, $\delta(S') = \delta(A_2) = \delta(S_A) = \delta(B_2) = \delta(S_B) = 1$, $\delta(\mathit{ret}) = \bot$ and the following transitions:

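$S' \to [S']S$, $S \xrightarrow{a} A_2$, $S \xrightarrow{a} S_A$, $S_A \to [S_A]S$, $S \xrightarrow{b} B_2$, $S \xrightarrow{b} S_B$, $S_B \to [S_B]S$, $A_2 \xrightarrow{a} \mathit{ret}$, $[S_A]\mathit{ret} \to A_2$, $B_2 \xrightarrow{b} \mathit{ret}$, $[S_B]\mathit{ret} \to B_2$

1. What is the string language accepted by this TA?
2. Choose a word of length 4 in this language and give the thread sets (only successful items) that are generated for this word.

Solution:

1. The language is $\{ww^R \mid w \in \{a, b\}^+\}$.
2. Successful configurations for $w = abba$:

$$
\begin{array}{lll}
\text{thread set} & \text{remaining input} & \text{operation} \\
\varepsilon : S' & abba & \\
\varepsilon : S',\ 1 : S & abba & S' \to [S']S \\
\varepsilon : S',\ 1 : S_A & bba & S \xrightarrow{a} S_A \\
\varepsilon : S',\ 1 : S_A,\ 11 : S & bba & S_A \to [S_A]S \\
\varepsilon : S',\ 1 : S_A,\ 11 : B_2 & ba & S \xrightarrow{b} B_2 \\
\varepsilon : S',\ 1 : S_A,\ 11 : \mathit{ret} & a & B_2 \xrightarrow{b} \mathit{ret} \\
\varepsilon : S',\ 1 : A_2 & a & [S_A]\mathit{ret} \to A_2 \\
\varepsilon : S',\ 1 : \mathit{ret} & \varepsilon & A_2 \xrightarrow{a} \mathit{ret}
\end{array}
$$

Question 10

Consider the following set-local MCTAG, given in bracket notation, where NA marks a node at which adjunction is not allowed and $*$ marks the foot node of an auxiliary tree:

Initial tree: $\alpha = S(A(\varepsilon),\ B(\varepsilon))$
Auxiliary tree set $\{\beta_A, \beta_B\}$ with $\beta_A = A_{NA}(a,\ A(b,\ A^*_{NA},\ c),\ d)$ and $\beta_B = B_{NA}(e,\ B(f,\ B^*_{NA},\ g),\ h)$

1. What is the string language generated by this set-local MCTAG?
2. Give an equivalent 4-MCFG.

Solution:

1. $\{a^n b^n c^n d^n e^n f^n g^n h^n \mid n \geq 0\}$.
2. Start symbol $S$, $N = \{\alpha, \beta, S\}$. Rules:

$S(X) \to \alpha(X)$
$\alpha(XYZU) \to \beta(X, Y, Z, U)$
$\beta(aXb, cYd, eZf, gUh) \to \beta(X, Y, Z, U)$
$\alpha(\varepsilon) \to \varepsilon$
$\beta(ab, cd, ef, gh) \to \varepsilon$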
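The 4-MCFG given for Question 10 is small enough to evaluate bottom-up by hand or by program. The following Python sketch is an illustration under stated assumptions: the function names `beta_yields`, `language` and `expected` and the bound `max_n` are introduced here and do not come from the exercise sheet. It builds the tuples in the yield of $\beta$ by applying the recursive $\beta$-rule repeatedly, concatenates them as the $\alpha$-rule prescribes, and compares the result with the claimed string language.

```python
def beta_yields(max_n):
    """Return the first max_n tuples in the yield of beta.

    The base tuple comes from beta(ab, cd, ef, gh) -> eps; each further tuple
    applies beta(aXb, cYd, eZf, gUh) -> beta(X, Y, Z, U) once more.
    """
    yields = []
    current = ("ab", "cd", "ef", "gh")
    for _ in range(max_n):
        yields.append(current)
        x, y, z, u = current
        current = ("a" + x + "b", "c" + y + "d", "e" + z + "f", "g" + u + "h")
    return yields


def language(max_n):
    """Words derivable from S using at most max_n beta-rule applications."""
    words = {""}  # alpha(eps) -> eps, then S(X) -> alpha(X)
    for x, y, z, u in beta_yields(max_n):
        # alpha(XYZU) -> beta(X, Y, Z, U), then S(X) -> alpha(X)
        words.add(x + y + z + u)
    return words


def expected(max_n):
    """Words a^n b^n c^n d^n e^n f^n g^n h^n for 0 <= n <= max_n."""
    return {"".join(ch * n for ch in "abcdefgh") for n in range(max_n + 1)}


if __name__ == "__main__":
    print(language(4) == expected(4))
    print(sorted(language(2), key=len))
```

Because the $\beta$-rule is the only recursive rule and is deterministic, the yield of $\beta$ is a single chain of tuples, which is why a simple loop suffices here; a grammar with several competing rules per nonterminal would need a proper fixpoint computation instead.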