Compositions of Bottom-Up Tree Series Transformations Andreas Maletti a Technische Universität Dresden Fakultät Informatik D 01062 Dresden, Germany maletti@tcs.inf.tu-dresden.de May 27, 2005 1. Motivation 2. Semirings, Tree Series, and Tree Series Substitution 3. Bottom-Up Tree Series Transducers 4. Composition Results a Financially supported by the German Research Foundation (DFG, GK 334) 1 May 27, 2005
Motivation Babel Fish Translation German Herzlich willkommen meine sehr geehrten Damen und Herren. Ich möchte mich vorab bei den Organisatoren für die vortrefflich geleistete Arbeit bedanken. English Cordially welcomely my very much honoured ladies and gentlemen. I would like to thank you first the supervisors for the splendid carried out work. Herzlich willkommen meine sehr geehrten Damen und Herren. Ich möchte mich vorab bei den Organisatoren, die diese Veranstaltung erst ermöglicht haben, für die vortrefflich geleistete Arbeit bedanken. Cordially welcomely my very much honoured ladies and gentlemen. I would like me first the supervisors, who made this meeting possible only, for whom splendid carried out work thank you. Motivation 2 May 27, 2005
Motivation Humans Conversation level?? Context, History Sentence level SENT NP ADJ NP VP (Tree) SENT NP ADJ NP VP (Tree) Grammar Word level Herzlich (String) Babel Fish Cordially (String) Motivation 3 May 27, 2005
Motivation Automatic translation is widely used (even Microsoft uses it to translate English documentation into German) Der Wert des Punkts, der zurückgegeben wird, wird in Bildschirmkoordinaten angegeben, während das Aufrufen von CWnd::GetCurrentMessage statt eine Nachricht manuell erstellen, sinnvoll schien. Die Quick-Info führt einen Treffertest aus, um wann festzulegen, den Test den Punkt von innerhalb der Grenze des Client-Rechtecks jeder zugeordneter Tools den weitergeleiteten Nachrichtenherbsten fehlzuschlagen und die Quick-Info nicht angezeigt werden. Dictionaries are very powerful word-to-word translators; leave few words untranslated Outcome is nevertheless usually unhappy and ungrammatical Post-processing necessary Motivation 4 May 27, 2005
Motivation Major problem: Ambiguity of natural language Common approach: Soft output (results equipped with a probability) Human choses the correct translation among the more likely ones Motivation 5 May 27, 2005
Motivation Humans Conversation level?? Context, History Sentence level 0.2 0.8 SENT NP VP ADJ NP SENT NP VP ADV VP 0.4 0.6 SENT NP VP ADJ NP SENT NP VP ADV VP Grammar Word level Herzlich... Babel Fish 0.9 Cordially... 0.1 Heartily... Motivation 6 May 27, 2005
Motivation Tree series transducers are a straightforward generalization of (i) tree transducers, which are applied in syntax-directed semantics, functional programming, and XML querying, (ii) weighted automata, which are applied in (tree) pattern matching, image compression and speech-to-text processing. Motivation 7 May 27, 2005
Generalization Hierarchy tree series transducer τ : T Σ A T weighted tree automaton L A T Σ weighted transducer τ : Σ A tree transducer τ : T Σ B T weighted automaton L A Σ tree automaton L B T Σ generalized sequential machine τ : Σ B string automaton L B Σ Motivation 8 May 27, 2005
Trees Σ ranked alphabet, Σ k Σ symbols of rank k, X = { x i i N + } T Σ (X) set of Σ-trees indexed by X, T Σ = T Σ ( ), t T Σ (X) is linear (resp., nondeleting) in Y X, if every y Y occurs at most (resp., at least) once in t, t[t 1,..., t k ] denotes the tree substitution of t i for x i in t Examples: Σ = {σ (2), γ (1), (0), β (0) } and Y = {x 1, x 2 } σ σ γ γ σ β x 1 σ ( σ(, β), γ(x 1 ) ) x 1 x 2 γ ( σ(x 1, x 2 ) ) Semirings, Tree Series, and Tree Series Substitution 9 May 27, 2005
Semirings A semiring is an algebraic structure A = (A,, ) (A, ) is a commutative monoid with neutral element 0, (A, ) is a monoid with neutral element 1, 0 is absorbing wrt., and distributes over (from left and right). Examples: semiring of non-negative integers N = (N { }, +, ) Boolean semiring B = ({0, 1},, ) tropical semiring T = (N { }, min, +) any ring, field, etc. Semirings, Tree Series, and Tree Series Substitution 10 May 27, 2005
Properties of Semirings We say that A is commutative, if is commutative, idempotent, if a a = a, complete, if there is an operation I : AI A such that 1. i {m,n} a i = a m a n, 2. i I a i = ) j J ( i I j a i, if I = j J I j is a (generalized) partition of I, and 3. ( i I a ) ( i j J b ) j = i I,j J (a i b j ). Semiring Commutative Idempotent Complete N YES no YES B YES YES YES T YES YES YES Semirings, Tree Series, and Tree Series Substitution 11 May 27, 2005
Tree Series A = (A,, ) semiring, Σ ranked alphabet Mappings ϕ : T Σ (X) A are also called tree series the set of all tree series is A T Σ (X), the coefficient of t T Σ (X) in ϕ, i.e., ϕ(t), is denoted by (ϕ, t), the sum is defined pointwise (ϕ 1 ϕ 2, t) = (ϕ 1, t) (ϕ 2, t), the support of ϕ is supp(ϕ) = { t T Σ (X) (ϕ, t) 0 }, ϕ is linear (resp., nondeleting in Y X), if supp(ϕ) is a set of trees, which are linear (resp., nondeleting in Y), the series ϕ with supp(ϕ) = is denoted by 0. Example: ϕ = 1 + 1 β + 3 σ(, ) +... + 3 σ(β, β) + 5 σ(, σ(, )) +... Semirings, Tree Series, and Tree Series Substitution 12 May 27, 2005
Tree Series Substitution A = (A,, ) complete semiring, ϕ, ψ 1,..., ψ k A T Σ (X) Pure substitution of (ψ 1,..., ψ k ) into ϕ: ϕ (ψ 1,..., ψ k ) = (ϕ, t) (ψ 1, t 1 ) (ψ k, t k ) t[t 1,..., t k ] t supp(ϕ), ( i [k]): t i supp(ψ i ) Example: 5 σ(x 1, x 1 ) (2 3 β) = 10 σ(, ) 15 σ(β, β) σ σ σ 5 (2 3 β) = 10 x 1 x 1 15 β β Semirings, Tree Series, and Tree Series Substitution 13 May 27, 2005
Tree Series Transducers Definition: A (bottom-up) tree series transducer (tst) is a system M = (Q, Σ,, A, F, µ) Q is a non-empty set of states, Σ and are input and output ranked alphabets, A = (A,, ) is a complete semiring, F A T (X 1 ) Q is a vector of linear and nondeleting tree series, also called final output, tree representation µ = ( µ k ) k N with µ k : Σ k A T (X k ) Q Qk. If Q is finite and µ k (σ) q, q is polynomial, then M is called finite. Tree Series Transducers 14 May 27, 2005
Semantics of Tree Series Transducers Mapping r : pos(t) Q is a run of M on the input tree t T Σ Run(t) set of all runs on t Evaluation mapping: eval r : pos(t) A T defined for every k N, lab t (p) Σ k by eval r (p) = µ k (lab t (p)) r(p),r(p 1)...r(p k) ( eval r (p 1),..., eval r (p k) ) Tree-series transformation induced by M is M : A T Σ A T defined M (ϕ) = ( ) eval r (ε) t T Σ r Run(t) Tree Series Transducers 15 May 27, 2005
Semantics Example M = (Q, Σ,, N, F, µ) with Q = {, }, Σ = {σ (2), (0) } and = {γ (1), (0) }, F = 0 and F = 1 x 1, and tree representation µ 0 () = 1 µ 0 () = 1 µ 2 (σ), = 1 µ 2 (σ), = 1 x 1 µ 2 (σ), = 1 x 2 Tree Series Transducers 16 May 27, 2005
Semantics Example (cont.) Input tree t Run r on t eval r on t Output tree σ γ 3 () γ σ σ γ 2 () γ σ σ γ() γ M (1 t) = 2 γ() 4 γ 3 () Tree Series Transducers 17 May 27, 2005
Known Results nl-bot ts-ts (A) nl-bot ts-ts (A) = nl-bot ts-ts (A) nl-bot ts-ts (A) h-bot ts-ts (A) BOT ts-ts (A) p-bot ts-ts (A) nlp-bot ts-ts (A) h-bot ts-ts (A) Corollary: For every commutative and ℵ 0 -complete semiring nlp-bot ts-ts (A) p-bot ts-ts (A) = p-bot ts-ts (A). Composition results 18 May 27, 2005
Extension (Q, Σ,, A, F, µ) tree series transducer, q Q k, q Q, ϕ A T Σ (X k ) Definition: We define hµ q : T Σ (X k ) A T (X k ) Q hµ(x q 1 x i, if q = q i i ) q = 0, otherwise hµ(σ(t q 1,..., t k )) q = µ k (σ) q,p1...p k (hµ(t q 1 ) p1,..., hµ(t q k ) pk ) p 1,...,p k Q We define hµ q : A T Σ (X k ) A T (X k ) Q by hµ(ϕ) q q = (ϕ, t) hµ(t) q q t T Σ (X k ) Composition results 19 May 27, 2005
Composition Construction M 1 = (Q 1, Σ,, A, F 1, µ 1 ) and M 2 = (Q 2,, Γ, A, F 2, µ 2 ) tree series transducer Definition: The product of M 1 and M 2, denoted by M 1 M 2, is the tree series transducer M = (Q 1 Q 2, Σ, Γ, A, F, µ) F pq = i Q 2 (F 2 ) i h q µ 2 ( (F1 ) p )i µ k (σ) pq,(p1 q 1,...,p k q k ) = h q 1...q k µ 2 ( (µ1 ) k (σ) p,p1...p k )q. Composition results 20 May 27, 2005
Composition On subtree: t a = M1 u b = M2 v Deletion: a a = M1 b = M2 t t u u v Composition results 21 May 27, 2005
Composition (cont.) On subtree: t a b = M1 M 2 v Deletion: t t a a b b = M1 M 2 v v Composition results 22 May 27, 2005
Another Product Definition: / Q 2, Q 2 = Q 2 { }, and 0 construct M 2 = (Q 2, Γ,, A, F 2, µ 2 ) with (F 2 ) q = (F 2 ) q ; q Q 2 and (F 2 ) = 0 and tree representation µ 2 (µ 2) k (γ) q,q1...q k = (µ 2 ) k (γ) q,q1...q k (µ 2) k (γ),... = 1. construct (M 1 M 2 ) = (Q 1 Q 2, Σ,, A, F, µ) with F (p,q) = (F 2 ) q ( h q ) µ 2((F 1 ) p ) q q Q 2 µ k (σ) (p,q),(p1,q 1 )...(p k,q k ) = h q 1...q k µ 2 µ k (σ) (p, ),(p1, )...(p k, ) = h... µ 2 ( t T Γ (k), ( i [k]): i/ var(t) q i = ( (µ1 ) k (σ) p,p1...p k ) ) ((µ 1 ) k (σ) p,p1...p k, t) t q Composition results 23 May 27, 2005
Main Theorem A commutative and complete semiring Main Theorem l-bot ts-ts (A) BOT ts-ts (A) = BOT ts-ts (A). BOT ts-ts (A) db-bot ts-ts (A) = BOT ts-ts (A), BOT ts-ts (A) d-bot ts-ts (A) = BOT ts-ts (A), provided that A is multiplicatively idempotent. Composition results 24 May 27, 2005
References [Borchardt 04] B. Borchardt: Code Selection by Tree Series Transducers. CIAA 04, Kingston, Canada, 2004. to appear [Engelfriet et al 02] J. Engelfriet, Z. Fülöp, and H. Vogler: Bottom-up and Topdown Tree Series Transformations. Journal of Automata, Languages, and Combinatorics 7:11 70, 2002 [Fülöp et al 03] Z. Fülöp and H. Vogler: Tree Series Transformations that Respect Copying. Theory of Computing Systems 36:247 293, 2003 [Kuich 99] W. Kuich: Tree Transducers and Formal Tree Series. Acta Cybernetica 14:135 149, 1999 Composition results 25 May 27, 2005