Tree Adjoining Grammars


Tree Adjoining Grammars. TAG: Parsing and formal properties. Laura Kallmeyer & Benjamin Burkhardt, HHU Düsseldorf, WS 2017/2018

Outline:
1. Parsing as deduction
2. CYK for TAG
3. Closure properties of TALs
4. Pumping lemma for TAG

Parsing as deduction: Parsing schemata (1)

Pereira and Warren (1983); Shieber et al. (1995); Sikkel (1997)

Parsing schemata understand parsing as a deductive process: the deduction of new items from existing ones can be described using inference rules. General form:

  antecedent
  ----------  side conditions
  consequent

antecedent, consequent: lists of items.

Application: if the antecedent can be deduced and the side conditions hold, then the consequent can be deduced as well.

Parsing as deduction: Parsing schemata (2)

A parsing schema consists of deduction rules; one or more axioms, which can be written as deduction rules with an empty antecedent; and a goal item. The parsing algorithm succeeds if, for a given input, it is possible to deduce the goal item.

Parsing as deduction: Parsing schemata (3)

Example: CYK parsing for CFG in Chomsky normal form.

Goal item: [S, 1, n]

Deduction rules:

Scan:
  ---------  A → wᵢ ∈ P
  [A, i, 1]

Complete:
  [B, i, l₁]   [C, i + l₁, l₂]
  ----------------------------  A → B C ∈ P
  [A, i, l₁ + l₂]
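The two rules above can be turned directly into a naive recognizer. The following is a minimal sketch, assuming 0-based positions (so the goal item becomes (S, 0, n)) and a grammar encoded as sets of tuples; all names are illustrative, not from the slides.

```python
# Minimal sketch of the CYK deduction rules for a CFG in Chomsky normal
# form. Items are tuples (A, i, l): nonterminal A derives the l input
# symbols starting at 0-based position i.

def cyk_recognize(binary_rules, lexical_rules, start, w):
    n = len(w)
    chart = set()
    # Scan: deduce (A, i, 1) whenever A -> w[i] is in the grammar
    for i in range(n):
        for (A, a) in lexical_rules:
            if a == w[i]:
                chart.add((A, i, 1))
    # Complete: from (B, i, l1) and (C, i + l1, l2) deduce (A, i, l1 + l2)
    changed = True
    while changed:
        changed = False
        for (B, i, l1) in list(chart):
            for (C, k, l2) in list(chart):
                if k != i + l1:
                    continue
                for (A, B2, C2) in binary_rules:
                    if (B2, C2) == (B, C) and (A, i, l1 + l2) not in chart:
                        chart.add((A, i, l1 + l2))
                        changed = True
    return (start, 0, n) in chart  # goal item
```

For the toy grammar S → A B, A → a, B → b, the recognizer accepts "ab" and rejects "aa".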

Parsing as deduction: Chart parsing (1)

Chart parsing: we have two structures, the chart C and an agenda A, both initialized as empty. We start by computing all items that are axioms, i.e., that can be obtained by applying rules with an empty antecedent. Starting from these items, we extend the set C as far as possible by subsequent applications of the deduction rules. The agenda contains the items that are waiting to be used in further deduction rules; it avoids multiple applications of the same instance of a deduction rule.

Parsing as deduction: Chart parsing (2)

C = ∅; A = ∅
for all items I resulting from a rule application with an empty antecedent set:
    add I to C and to A
while A ≠ ∅:
    remove an item I from A
    for all items I′ deduced with I and items from C as antecedents:
        if I′ ∉ C: add I′ to C and to A
if there is a goal item in C: return true
else: return false
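The pseudo-code above can be sketched generically in Python. The encoding of the deduction rules as a `deduce` function is our own illustrative assumption, not part of the slides.

```python
# Generic chart/agenda loop: axioms are the items deducible with an empty
# antecedent; deduce(trigger, chart) yields all items deducible using the
# trigger item together with items already in the chart.

def chart_parse(axioms, deduce, is_goal):
    chart, agenda = set(), []
    for item in axioms:
        chart.add(item)
        agenda.append(item)
    while agenda:
        trigger = agenda.pop()
        for new_item in deduce(trigger, chart):
            if new_item not in chart:  # the chart blocks duplicate deductions
                chart.add(new_item)
                agenda.append(new_item)
    return any(is_goal(item) for item in chart)

# Toy instance: graph reachability as deduction, where items are nodes.
edges = {(1, 2), (2, 3), (3, 4)}
reachable = chart_parse(
    axioms={1},
    deduce=lambda node, chart: [b for (a, b) in edges if a == node],
    is_goal=lambda node: node == 4,
)
```

Any parsing schema with a finite item set fits this interface; only `axioms`, `deduce`, and `is_goal` change.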

CYK for TAG: Items (1)

CYK parsing for TAG: first presented in Vijay-Shanker and Joshi (1985); formulation with deduction rules in Kallmeyer and Satta (2009) and Kallmeyer (2010). Assumption: elementary trees are such that each node has at most two daughters. (Any TAG can be transformed into an equivalent TAG satisfying this condition.) The algorithm simulates a bottom-up traversal of the derived tree.

CYK for TAG: Items (2)

At each moment, we are at a specific node in an elementary tree and we know the yield of the part below. Either there is a foot node below, in which case the yield is separated into two parts, or there is no foot node below and the yield is a single substring of the input. We need to keep track of whether we have already adjoined at the node, since at most one adjunction can occur per node. For this, we distinguish between a bottom and a top position for the dot on a node: bottom signifies that we have not yet performed an adjunction, top that adjunction (if any) has been dealt with.

CYK for TAG: Items (3)

Item form: [γ, p_t, i, f₁, f₂, j] where

- γ ∈ I ∪ A,
- p is the Gorn address of a node in γ (ε for the root, p·i for the ith daughter of the node at address p),
- the subscript t ∈ {⊤, ⊥} specifies whether substitution or adjunction has already taken place (⊤) or not (⊥) at p, and
- 0 ≤ i ≤ f₁ ≤ f₂ ≤ j ≤ n are indices with i, j indicating the left and right boundaries of the yield of the subtree at position p and f₁, f₂ indicating the yield of a gap in case a foot node is dominated by p. We write f₁ = f₂ = – if no gap is involved.
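One possible Python encoding of this item form is a flat named tuple; the field names and the use of None for the dash are our own assumptions for illustration.

```python
from collections import namedtuple

# [γ, p_t, i, f1, f2, j] as a flat tuple. "bot"/"top" stand for the ⊥/⊤
# subscript; None stands for the dash (no foot node below this position).
Item = namedtuple("Item", ["tree", "addr", "pos", "i", "f1", "f2", "j"])

# Goal item [α, ε⊤, 0, –, –, n] for an initial tree named "alpha", n = 8:
goal = Item(tree="alpha", addr=(), pos="top", i=0, f1=None, f2=None, j=8)
```

Hashable tuples like this can be stored directly in the chart set of the generic chart-parsing loop.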

CYK for TAG: Inference rules (1)

Goal items: [α, ε⊤, 0, –, –, n] where α ∈ I

We need two rules to process leaf nodes while scanning their labels, depending on whether they have terminal labels or the label ε:

Lex-scan:
  ----------------------  l(γ, p) = w_{i+1}
  [γ, p⊤, i, –, –, i + 1]

Eps-scan:
  ----------------------  l(γ, p) = ε
  [γ, p⊤, i, –, –, i]

(Notation: l(γ, p) is the label of the node at address p in γ.)

CYK for TAG: Inference rules (2)

The rule foot-predict processes the foot node of auxiliary trees β ∈ A by guessing the yield below the foot node:

Foot-predict:
  -------------------  β ∈ A, p foot node address in β, i ≤ j
  [β, p⊤, i, i, j, j]

CYK for TAG: Inference rules (3)

When moving up inside a single elementary tree, we either move from an only daughter to its mother, or from the set of both daughters to the mother node:

Move-unary:
  [γ, (p·1)⊤, i, f₁, f₂, j]
  -------------------------  node address p·2 does not exist in γ
  [γ, p⊥, i, f₁, f₂, j]

Move-binary:
  [γ, (p·1)⊤, i, f₁, f₂, k]   [γ, (p·2)⊤, k, f₁′, f₂′, j]
  --------------------------------------------------------
  [γ, p⊥, i, f₁ ⊕ f₁′, f₂ ⊕ f₂′, j]

(f ⊕ f′ = f if f′ = –; f ⊕ f′ = f′ if f = –; f ⊕ f′ is undefined otherwise.)
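The f ⊕ f′ operation can be sketched directly, with None standing in for the dash; the function name is an assumption.

```python
def combine(f, f_prime):
    """f ⊕ f': at most one of the two daughters dominates the foot node,
    so at most one of the two gap boundaries may be set."""
    if f_prime is None:
        return f
    if f is None:
        return f_prime
    raise ValueError("f ⊕ f' is undefined if both values are set")
```

In move-binary this is applied once to the f₁ components and once to the f₂ components of the two antecedent items.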

CYK for TAG: Inference rules (4)

Move-unary: [figure: the dot moves from the top of the only daughter B, with yield from i to j, to the bottom of its mother A, keeping the same yield]

CYK for TAG: Inference rules (5)

Move-binary: [figure: the dots on the tops of the daughters B (yield i to k) and C (yield k to j) combine into a dot on the bottom of the mother A, with yield i to j]

CYK for TAG: Inference rules (6)

For nodes that do not require adjunction, we can move from the bottom position of the node to its top position:

Null-adjoin:
  [γ, p⊥, i, f₁, f₂, j]
  ---------------------  f_OA(γ, p) = 0
  [γ, p⊤, i, f₁, f₂, j]

CYK for TAG: Inference rules (7)

The rule substitute performs a substitution:

Substitute:
  [α, ε⊤, i, –, –, j]
  -------------------  l(α, ε) = l(γ, p), α ∈ I, p a substitution node in γ
  [γ, p⊤, i, –, –, j]

CYK for TAG: Inference rules (8)

The rule adjoin adjoins an auxiliary tree β at p in γ, under the precondition that the adjunction of β at p in γ is allowed:

Adjoin:
  [β, ε⊤, i, f₁, f₂, j]   [γ, p⊥, f₁, f₁′, f₂′, f₂]
  -------------------------------------------------  β ∈ f_SA(γ, p)
  [γ, p⊤, i, f₁′, f₂′, j]

CYK for TAG: Inference rules (9)

Adjoin: [figure: β spans i to j with a gap from f₁ to f₂; the subtree of γ below p spans f₁ to f₂ with internal gap f₁′ to f₂′; after adjoining, the node p in γ spans i to j with gap f₁′ to f₂′]

CYK for TAG: Complexity

Complexity of the algorithm: what is the upper bound for the number of applications of the adjoin operation? We have |A| possibilities for β, |A ∪ I| for γ, and m for p, where m is the maximal number of internal nodes in an elementary tree. The six indices i, f₁, f₁′, f₂, f₂′, j range from 0 to n. Consequently, adjoin can be applied at most |A| · |A ∪ I| · m · (n + 1)⁶ times, and therefore the time complexity of this algorithm is O(n⁶).
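Plugging illustrative numbers into the bound makes the sixth-power growth concrete; the grammar sizes below are assumptions chosen only for the example.

```python
def adjoin_bound(num_aux, num_elementary, max_internal_nodes, n):
    """Upper bound |A| * |A ∪ I| * m * (n+1)^6 on adjoin instantiations."""
    return num_aux * num_elementary * max_internal_nodes * (n + 1) ** 6

# For |A| = 2, |A ∪ I| = 5, m = 4: doubling the input length from
# n = 10 to n = 20 multiplies the bound by (21/11)^6, roughly 48.
bound_10 = adjoin_bound(2, 5, 4, 10)
bound_20 = adjoin_bound(2, 5, 4, 20)
```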

Closure properties (1)

One of the reasons why the TAG formalism is appealing from a formal point of view is that it has nice closure properties (Vijay-Shanker and Joshi, 1985; Vijay-Shanker, 1987).

Proposition: TALs are closed under union.

This can easily be shown as follows: assume the two sets of non-terminals to be disjoint. Then build a large TAG by putting the initial and auxiliary trees of the two grammars together.

Closure properties (2)

Proposition: TALs are closed under concatenation.

To show this, assume again that the sets of non-terminals are disjoint. Then build the union of the initial and auxiliary trees, introduce a new start symbol S and add one initial tree with root label S and two daughters labeled with the start symbols of the original grammars.

Closure properties (3)

Proposition: TALs are closed under Kleene closure.

The idea of the proof is as follows: we add an initial tree with the empty word and an auxiliary tree that can be adjoined to the roots of initial trees with the start symbol and that has a new leaf with the start symbol.

Closure properties (4)

Proposition: TALs are closed under substitution.

To obtain the TAG that yields the language after substitution, we replace each terminal with the start symbol of the corresponding TAG. As a corollary, one obtains:

Proposition: TALs are closed under arbitrary homomorphisms.
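At the string level, a homomorphism replaces each terminal by a fixed string and concatenates the images. A minimal sketch, where the particular mapping h is an assumed example and not taken from the slides:

```python
def apply_homomorphism(h, word):
    """Image of a word under the homomorphism h, given symbol-wise."""
    return "".join(h[symbol] for symbol in word)

# Assumed example mapping: h(a) = xy, h(b) = ε
h = {"a": "xy", "b": ""}
```

Closure under homomorphisms says that applying such an h to every word of a TAL again yields a TAL.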

Closure properties (5)

Proposition: TALs are closed under intersection with regular languages.

The proof in Vijay-Shanker (1987) uses embedded push-down automata (EPDA), the automata that recognize TALs. We will introduce EPDAs later. Vijay-Shanker combines such an automaton with the finite state automaton for the regular language in order to construct a new EPDA that recognizes the intersection.

Pumping lemma (1)

In CFLs, from a certain string length on, two parts of the string can be iterated ("pumped"). The proof idea is the following: context-free derivation trees with a path exceeding a certain length have the property that some non-terminal occurs twice on this path. The part between the two occurrences can then be iterated, which means that the strings to the left and right of this part are pumped. The same kind of iteration is possible in TAG derivation trees, since TAG derivation trees are context-free. This leads to a pumping lemma for TALs (Vijay-Shanker, 1987).

Pumping lemma (2)

Iteration on TAG derivation trees: [figure: a derivation tree in which a node labeled β dominates another node labeled β; the part between the two β occurrences can be iterated]

Pumping lemma (3)

In other words: the auxiliary tree β can be repeatedly adjoined into itself. Into the lowest β (low in the sense of the derivation tree), another auxiliary tree β′ derived from β is adjoined.

Pumping lemma (4)

What does that mean for the derived tree? Let n be the node in β at which β can be adjoined and at which the final β′ is adjoined as well. There are three cases for the corresponding derived trees before adjoining the final β′:

1. n is on the spine (i.e., on the path from the root to the foot node),
2. n is to the left of the spine, or
3. n is to the right of the spine.

Pumping lemma (5)

[figure: derived trees for cases 1 and 2, with the pumped substrings w₁, w₂, w₃, w₄ around the adjunction site n and the remaining material x, y, z, v₁, v₂]

Case 1 (n on the spine): pumping yields x w₁^n v₁ w₂^n y w₃^n v₂ w₄^n z.

Case 2 (n to the left of the spine): pumping yields x w₁^(n+1) v₁ w₂ v₂ w₃ (w₂ w₄ w₃)^n y w₄ z.

(Case 3 is exactly like case 2, except that everything is mirrored.)

Pumping lemma (6)

Proposition (Pumping lemma for TALs): If L is a TAL, then there is a constant c such that for every w ∈ L with |w| ≥ c there are x, y, z, v₁, v₂, w₁, w₂, w₃, w₄ ∈ T* such that |v₁ v₂ w₁ w₂ w₃ w₄| ≤ c, |w₁ w₂ w₃ w₄| ≥ 1, and one of the following three cases holds:

1. w = x w₁ v₁ w₂ y w₃ v₂ w₄ z and x w₁^n v₁ w₂^n y w₃^n v₂ w₄^n z is in the string language for all n ≥ 0, or
2. w = x w₁ v₁ w₂ v₂ w₃ y w₄ z and x w₁^(n+1) v₁ w₂ v₂ w₃ (w₂ w₄ w₃)^n y w₄ z is in the string language for all n ≥ 0, or
3. w = x w₁ y w₂ v₁ w₃ v₂ w₄ z and x w₁ y (w₂ w₁ w₃)^n w₂ v₁ w₃ v₂ w₄^(n+1) z is in the string language for all n ≥ 0.

Pumping lemma (7)

As a corollary, the following weaker pumping lemma holds:

Proposition (Weak pumping lemma for TALs): If L is a TAL, then there is a constant c such that for every w ∈ L with |w| ≥ c there are x, y, z, v₁, v₂, w₁, w₂, w₃, w₄ ∈ T* such that |v₁ v₂ w₁ w₂ w₃ w₄| ≤ c, |w₁ w₂ w₃ w₄| ≥ 1, w = x v₁ y v₂ z, and x w₁^n v₁ w₂^n y w₃^n v₂ w₄^n z ∈ L for all n ≥ 0.

In this weaker version, the w₁, w₂, w₃, w₄ need not be substrings of the original word w.
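The weak pumping lemma can be illustrated on the classic TAL {a^n b^n c^n d^n | n ≥ 0}. The decomposition used below (x = a^c, v₁ = v₂ = ε, y = b^c c^c, z = d^c, and w₁, w₂, w₃, w₄ = a, b, c, d) is our own illustrative assumption.

```python
def in_language(s):
    """Membership test for {a^n b^n c^n d^n | n >= 0}."""
    n, r = divmod(len(s), 4)
    return r == 0 and s == "a" * n + "b" * n + "c" * n + "d" * n

def pump(c, n):
    """x w1^n v1 w2^n y w3^n v2 w4^n z for the decomposition above."""
    x, y, z = "a" * c, "b" * c + "c" * c, "d" * c
    return x + "a" * n + "b" * n + y + "c" * n + "d" * n + z
```

Every pumped word equals a^(c+n) b^(c+n) c^(c+n) d^(c+n) and hence stays in the language, as the lemma demands.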

Pumping lemma (8)

A pumping lemma can be used to show that certain languages are not in the class of string languages satisfying the pumping property.

Proposition: The double copy language L := {www | w ∈ {a, b}*} is not a TAL.

Pumping lemma (9)

Proof: Assume that L is a TAL. Then L′ := L ∩ a*b*a*b*a*b* = {a^n b^m a^n b^m a^n b^m | n, m ≥ 0} is a TAL as well, by closure under intersection with regular languages. Assume that L′ satisfies the weak pumping lemma with a constant c, and consider the word w = a^(c+1) b^(c+1) a^(c+1) b^(c+1) a^(c+1) b^(c+1). None of the wᵢ, 1 ≤ i ≤ 4, from the pumping lemma can contain both as and bs. Furthermore, at least three of them must contain the same letter and be inserted into the three different a^(c+1) blocks (or into the three different b^(c+1) blocks, respectively). This is a contradiction, since then either |v₁| ≥ c + 1 or |v₂| ≥ c + 1, contradicting |v₁ v₂ w₁ w₂ w₃ w₄| ≤ c.

Pumping lemma (10)

Another example of a language that can be shown not to be a TAL using the pumping lemma is L₅ := {a^n b^n c^n d^n e^n | n ≥ 0}.

References

Kallmeyer, L. (2010). Parsing Beyond Context-Free Grammars. Cognitive Technologies. Springer, Heidelberg.

Kallmeyer, L. and Satta, G. (2009). A polynomial-time parsing algorithm for TT-MCTAG. In Proceedings of ACL, Singapore.

Pereira, F. C. N. and Warren, D. (1983). Parsing as deduction. In 21st Annual Meeting of the Association for Computational Linguistics, pages 137–144, MIT, Cambridge, Massachusetts.

Shieber, S. M., Schabes, Y., and Pereira, F. C. N. (1995). Principles and implementation of deductive parsing. Journal of Logic Programming, 24(1–2):3–36.

Sikkel, K. (1997). Parsing Schemata. Texts in Theoretical Computer Science. Springer-Verlag, Berlin, Heidelberg, New York.

Vijay-Shanker, K. (1987). A Study of Tree Adjoining Grammars. PhD thesis, University of Pennsylvania.

Vijay-Shanker, K. and Joshi, A. K. (1985). Some computational properties of Tree Adjoining Grammars. In Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics, pages 82–93.