Linking Theorems for Tree Transducers


Linking Theorems for Tree Transducers. Andreas Maletti, maletti@ims.uni-stuttgart.de. Theorietag 2015, Speyer, October 1, 2015.

Statistical Machine Translation
[figures: a word-aligned Arabic-English sentence pair (source words w, kana, ynzran, Aly, h, b, $kl, mdhk aligned to "And they were looking at him in a funny way") with parse trees using labels such as SBJ, PP-CLR, and PP-MNR, followed by a step-by-step extraction of rules from the aligned trees and a final slide listing the extracted rules]

Linear Multi Tree Transducer
MBOT: a linear multi tree transducer is a tuple (Q, Σ, I, R) consisting of
a finite set Q of states
an alphabet Σ of input and output symbols
a set I ⊆ Q of initial states
a finite set R ⊆ T_Σ(Q) × Q × T_Σ(Q)* of rules
such that each q ∈ Q occurs at most once in l for every rule (l, q, r) ∈ R, and each q ∈ Q that occurs in r also occurs in l for every rule (l, q, r) ∈ R.
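Purely as an illustration of this definition (not taken from the slides), the following Python sketch shows one way such rules and the two side conditions could be encoded; the names Tree, Rule, states_of, and is_valid_rule are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Tree = Tuple[str, tuple]  # a tree is (label, children); states appear as leaves

@dataclass(frozen=True)
class Rule:
    left: Tree                # l in T_Sigma(Q); every state occurs at most once
    state: str                # q in Q
    right: Tuple[Tree, ...]   # r in T_Sigma(Q)*; may only use states occurring in l

def states_of(tree: Tree, states: frozenset) -> list:
    """States occurring in a tree, in left-to-right order."""
    label, children = tree
    if label in states and not children:
        return [label]
    return [q for child in children for q in states_of(child, states)]

def is_valid_rule(rule: Rule, states: frozenset) -> bool:
    """The two side conditions of the MBOT definition above."""
    left_states = states_of(rule.left, states)
    right_states = [q for t in rule.right for q in states_of(t, states)]
    return (len(left_states) == len(set(left_states))   # each state at most once in l
            and set(right_states) <= set(left_states))  # r only uses states from l
```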

Linear Multi Tree Transducer
Syntactic properties: an MBOT (Q, Σ, I, R) is a
linear tree transducer with regular look-ahead (XTOP^R) if |r| ≤ 1 for all (l, q, r) ∈ R
linear tree transducer (XTOP) if |r| = 1 for all (l, q, r) ∈ R
ε-free if l ∉ Q for all (l, q, r) ∈ R
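These three properties are simple checks on the rule set; a hypothetical sketch, reusing the Rule encoding from the previous block:

```python
# Hypothetical classifiers for the syntactic properties above (illustration only).

def is_xtop_r(rules) -> bool:
    """XTOP^R: every rule has at most one output tree (|r| <= 1)."""
    return all(len(rule.right) <= 1 for rule in rules)

def is_xtop(rules) -> bool:
    """XTOP: every rule has exactly one output tree (|r| = 1)."""
    return all(len(rule.right) == 1 for rule in rules)

def is_epsilon_free(rules, states) -> bool:
    """epsilon-free: no left-hand side is just a single state."""
    return all(not (rule.left[0] in states and not rule.left[1]) for rule in rules)
```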

Linear Multi Tree Transducer
[figure: the rules extracted in the machine translation example, evaluated against the properties XTOP^R, XTOP, and ε-free]

Another Example
[textual and graphical representation of a small example MBOT M = (Q, Σ, I, R) over an alphabet containing γ and α, with rules for γ(id) and α built from paired states id; the slide again asks which of the properties XTOP^R, XTOP, and ε-free hold]

Semantics
[figures: the rules of the example MBOT shown graphically, followed by a step-by-step application to an input tree built from γ and α, illustrating how links between input and output positions are created]

Semantics
Sentential forms ⟨t, A, D, u⟩ with
t ∈ T_Σ(Q) the input tree
A ⊆ N* × N* the active links (red)
D ⊆ N* × N* the disabled links (gray)
u ∈ T_Σ(Q) the output tree
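A possible encoding of these sentential forms, for illustration only (the names Pos, Link, and SententialForm are hypothetical):

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Pos = Tuple[int, ...]   # positions are sequences of child indices, i.e. elements of N*
Link = Tuple[Pos, Pos]  # a link relates an input position to an output position

@dataclass(frozen=True)
class SententialForm:
    t: object                  # input tree in T_Sigma(Q)
    active: FrozenSet[Link]    # A: active links (drawn red)
    disabled: FrozenSet[Link]  # D: disabled links (drawn gray)
    u: object                  # output tree in T_Sigma(Q)
```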

Semantics
[figure: a derivation on the machine translation example, rewriting the input tree over w, kana, ynzran, ... step by step while the output grows from "And" to "And they were ..."]

Semantics
Dependencies and relation
state-computed dependencies: M_q = { ⟨t, D, u⟩ | t, u ∈ T_Σ, ⟨t, {(ε, ε)}, ∅, q⟩ ⇒*_M ⟨t, ∅, D, u⟩ }
computed dependencies: dep(M) = ⋃_{q ∈ I} M_q
computed transformation: τ_M = {(t, u) | ⟨t, D, u⟩ ∈ dep(M)}
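The last two definitions are just a union and a projection; a minimal sketch, assuming a hypothetical interface state_dep that maps a state to its set of ⟨t, D, u⟩ triples:

```python
def dep(state_dep, initial_states):
    """dep(M): union of the state-computed dependencies over all initial states."""
    return {triple for q in initial_states for triple in state_dep(q)}

def tau(dependencies):
    """tau_M = {(t, u) | <t, D, u> in dep(M)}: forget the link component."""
    return {(t, u) for (t, _links, u) in dependencies}
```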

Further Properties
Regularity-preserving: a transformation τ ⊆ T_Σ × T_Σ preserves regularity if τ(L) = {u | (t, u) ∈ τ, t ∈ L} is regular for all regular L ⊆ T_Σ; rp-MBOT = regularity-preserving transformations computable by MBOT.
Compositions: τ_1 ; τ_2 = {(s, u) | ∃t: (s, t) ∈ τ_1, (t, u) ∈ τ_2}; compositions support modular development, allow integration of external knowledge sources, and occur naturally in query rewriting.
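For finite, explicitly given relations, the composition τ_1 ; τ_2 can be computed directly from the definition; a toy illustration (function name compose is hypothetical):

```python
def compose(tau1, tau2):
    """{(s, u) | there is t with (s, t) in tau1 and (t, u) in tau2}."""
    successors = {}
    for t, u in tau2:
        successors.setdefault(t, set()).add(u)
    return {(s, u) for (s, t) in tau1 for u in successors.get(t, ())}

# compose({("a", "b")}, {("b", "c"), ("b", "d")}) == {("a", "c"), ("a", "d")}
```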

Contents
1 Basics
2 Linking technique

Dependencies
Recent research on models with dependencies: Bojańczyk, ICALP 2014; Maneth et al., ICALP 2015

Dependencies
Hierarchical properties: a dependency ⟨t, D, u⟩ is
input hierarchical if, for all (v_1, w_1), (v_2, w_2) ∈ D with v_1 < v_2, (1) w_2 ≤ w_1, or (2) there is a link (v_1, w) ∈ D with w ≤ w_2
strictly input hierarchical if, for all (v_1, w_1), (v_2, w_2) ∈ D, (1) v_1 < v_2 implies w_1 ≤ w_2, and (2) v_1 = v_2 implies w_1 ≤ w_2 or w_2 ≤ w_1
(output hierarchical and strictly output hierarchical are defined analogously with the roles of the input and output positions exchanged)
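Since links are pairs of tree positions, the strict variant is a direct check over all pairs of links; a sketch under the definition above, with positions encoded as tuples of child indices (all names hypothetical):

```python
def is_prefix(w1, w2):
    """w1 <= w2: w1 is a prefix of w2 in the tree-position order."""
    return w2[:len(w1)] == w1

def strictly_input_hierarchical(D):
    """Check the strict input-hierarchical property of a finite link set D."""
    for (v1, w1) in D:
        for (v2, w2) in D:
            if is_prefix(v1, v2) and v1 != v2:            # v1 < v2
                if not is_prefix(w1, w2):                 # requires w1 <= w2
                    return False
            elif v1 == v2:                                # same input position
                if not (is_prefix(w1, w2) or is_prefix(w2, w1)):
                    return False
    return True
```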

Dependencies
Distance properties: a dependency ⟨t, D, u⟩ has
input link-distance bounded by b ∈ N if for all (v_1, w_1), (v_1·v, w_2) ∈ D with |v| > b there is (v_1·v′, w_3) ∈ D such that v′ < v and 1 ≤ |v′| ≤ b
strict input link-distance bounded by b if for all v_1, v_1·v ∈ pos(t) with |v| > b there is (v_1·v′, w_3) ∈ D such that v′ < v and 1 ≤ |v′| ≤ b
(the output link-distance bounds are defined analogously on the output positions)
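For the non-strict bound, the condition again only quantifies over links; a sketch following the definition above (names hypothetical):

```python
def input_link_distance_bounded(D, b):
    """Check the (non-strict) input link-distance bound b on a finite link set D."""
    linked = {v for (v, _w) in D}
    for (v1, _w1) in D:
        for (v2, _w2) in D:
            if v2[:len(v1)] == v1 and len(v2) - len(v1) > b:   # v2 = v1.v with |v| > b
                v = v2[len(v1):]
                # need a linked position v1.v' with v' < v and 1 <= |v'| <= b
                if not any(v1 + v[:k] in linked for k in range(1, b + 1)):
                    return False
    return True
```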

Dependencies
[figure: the dependency computed in the earlier example derivation over γ and α]
This dependency is strictly input hierarchical and strictly output hierarchical, with strict input link-distance 2 and strict output link-distance 1.

Dependencies

                    hierarchical          link-distance bounded
Model \ Property    input      output     input      output
XTOP^R              strictly   strictly   strictly
MBOT                strictly              strictly

Linking Theorem
Theorem: Let M_1, ..., M_k be ε-free XTOP^R over Σ such that
{(c[t_1, ..., t_n], c′[t_1, ..., t_n]) | t_1, ..., t_n ∈ T} ⊆ τ_{M_1} ; ⋯ ; τ_{M_k}
for some contexts c, c′ ∈ C_Σ(X_n) and special T ⊆ T_Σ. Then for all 1 ≤ i ≤ k and 1 ≤ j ≤ n there are t_j ∈ T, ⟨u_{i−1}, D_i, u_i⟩ ∈ dep(M_i), and (v_{ji}, w_{ji}) ∈ D_i such that u_0 = c[t_1, ..., t_n] and u_k = c′[t_1, ..., t_n], and
pos_{x_j}(c′) ≤ w_{jk}
v_{ji} ≤ w_{j(i−1)} if i ≥ 2
pos_{x_j}(c) ≤ v_{j1}

Linking Theorem
Corollary [Arnold, Dauchet, TCS 1982]: the illustrated relation τ cannot be computed by any ε-free XTOP^R.
[figure: schematic trees over the subtrees t_1, ..., t_n illustrating the relation]

Linking Theorem
Corollary [M. et al., SICOMP 2009]: the illustrated relation τ cannot be computed by any ε-free XTOP^R.
[figure: trees with γ-spines over the subtrees s, t, u illustrating the relation]

Topicalization
Example: It rained yesterday night. Topicalized: Yesterday night, it rained.
We toiled all day yesterday at the restaurant that charges extra for clean plates. Topicalized: At the restaurant that charges extra for clean plates, we toiled all day yesterday.

Topicalization
On the tree level
[figures: the parse trees of the two example sentences before and after topicalization, with the fronted constituent moved before the comma]

Topicalization
[figure: schematic trees showing the rotation of the subtrees t_1, ..., t_n performed by topicalization]
Theorem: Topicalization is in rp-MBOT.
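On the children of the affected node, topicalization acts as a rotation; a toy illustration of that effect only (not the rp-MBOT construction itself), assuming the schematic suggested by the English examples, where the final constituent is fronted:

```python
def topicalize(children):
    """[t_1, ..., t_{n-1}, t_n] -> [t_n, t_1, ..., t_{n-1}] (front the last constituent)."""
    return children[-1:] + children[:-1]

# topicalize(["it", "rained", "yesterday night"])
#   == ["yesterday night", "it", "rained"]
```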

Topicalization
Theorem: Topicalization cannot be computed by any composition of ε-free XTOP^R.
[proof-sketch figure: contexts u_0, u_1, u_2, u_3 with links (1), (2), (3) between the subtrees t_1, ..., t_n and positions v_{12}, v_{13}, ..., v_{(n−1)2}, v_{n2}, v_{n3}, v_{(n−1)3}]
Note: 3 ε-free XTOP^R are sufficient to simulate any composition of ε-free XTOP^R.

Topicalization
Corollary: (XTOP^R)* ⊊ rp-MBOT

Linking Theorem
Theorem: Let M = (Q, Σ, I, R) be an ε-free MBOT such that
{(c[t_1, ..., t_n], c′[t_1, ..., t_n]) | t_1, ..., t_n ∈ T} ⊆ τ_M
for some contexts c, c′ ∈ C_Σ(X_n) and special T ⊆ T_Σ. Then for all 1 ≤ j ≤ n there are t_j ∈ T, ⟨u, D, u′⟩ ∈ dep(M), and (v_j, w_j) ∈ D with u = c[t_1, ..., t_n] and u′ = c′[t_1, ..., t_n], and
pos_{x_j}(c) ≤ v_j
pos_{x_j}(c′) ≤ w_j

Linking Theorem
Corollary: The inverse of topicalization cannot be computed by any ε-free MBOT.
[figure: schematic trees showing the inverse rotation of the subtrees t_1, ..., t_n]

Summary & References
Summary
1 (XTOP^R)* ⊊ rp-MBOT
2 rp-MBOT is not closed under inverses
3 What happens to invertible MBOT?

References
J. Engelfriet, E. Lilin, A. Maletti: Extended multi bottom-up tree transducers: Composition and decomposition. Acta Inf., 2009
Z. Fülöp, A. Maletti: Composition closure of ε-free linear extended top-down tree transducers. Proc. 17th DLT, LNCS 7907, 2013
P. Koehn: Statistical machine translation. Cambridge Univ. Press, 2009
A. Maletti, J. Graehl, M. Hopkins, K. Knight: The power of extended top-down tree transducers. SIAM J. Comput., 2009