A canonical semi-deterministic transducer

Achilles A. Beros, joint work with Colin de la Higuera
Laboratoire d'Informatique de Nantes Atlantique, Université de Nantes
September 18, 2014

The Results

There are learning algorithms for deterministic transducers. How far toward non-determinism can we extend learning? We consider a natural extension of deterministic transducers called semi-deterministic transducers (SDTs), and we find three results:

There is a canonical form for SDTs.
Even with domain knowledge, SDTs are not learnable, regardless of efficiency bounds.
SDTs are learnable with very weak additional queries, within polynomial bounds.

It is also possible that these weak queries can be simulated through statistical methods.

The Model

A picture to illustrate the definition.

[State diagram of the running example: states q_λ, q_a, q_b and q_aa, with each transition labeled by an input symbol and a finite set of outputs, e.g. a : {AA, BB} and b : {AB, BA}.]

For a transition e, the input i(e) is a member of the input alphabet Σ. The output o(e) is a finite set of incomparable strings over the output alphabet Ω. From each state, there is at most one transition with input x ∈ Σ.

In the example, the translations of aa are AAAA, AABB, BBAA and BBBB. For a fixed transducer, the number of translations of a string can grow exponentially in the length of the string.

The Model

One final assumption: there is a unique initial state. This is for convenience. A few useful observations:

λ is never the input of a transition.
If λ is in a set of outputs, it is the only member.
Every string has a unique path through an SDT.
Every string has a finite number of translations.

These observations allow us to form a tree of outputs analogous to the one for a sequential or subsequential transducer. In our case it is a tree of output trees. A sketch of how translations are computed appears below.
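To make the model concrete, here is a minimal Python sketch of how an SDT computes translations. The transition table is a partial, hypothetical reconstruction of the running example (the state names q_lam, q_a, q_b, q_aa, q_bb and the function translations are illustrative, not from the paper): it follows the unique path of a string and combines one output per transition in every possible way.

```python
from itertools import product

# Hypothetical partial reconstruction of the running example's transition
# table: (state, input symbol) -> (target state, finite set of outputs).
SDT = {
    ("q_lam", "a"): ("q_a",  {"AA", "BB"}),
    ("q_lam", "b"): ("q_b",  {"AB", "BA"}),
    ("q_a",   "a"): ("q_aa", {"AA", "BB"}),
    ("q_b",   "b"): ("q_bb", {"AB", "BA"}),
}

def translations(sdt, start, word):
    """All translations of `word`: follow the unique path of states and
    combine one output string per transition in every possible way."""
    state, choices = start, []
    for symbol in word:
        state, outputs = sdt[(state, symbol)]  # at most one transition per input
        choices.append(sorted(outputs))
    return {"".join(pick) for pick in product(*choices)}

# Matches the slides: aa has exactly the four translations below, and the
# number of translations can grow exponentially in the length of the input.
assert translations(SDT, "q_lam", "aa") == {"AAAA", "AABB", "BBAA", "BBBB"}
assert translations(SDT, "q_lam", "bb") == {"ABAB", "ABBA", "BAAB", "BABA"}
```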

The Tree of Output Trees

Consider the tree of inputs that will not be rejected.

[Input tree of the running example: λ; a, b; aa, bb; aaa, aab, bba, bbb; aaaa, aaab, aaba, aabb, bbaa, bbab, bbba, bbbb.]

The Tree of Output Trees

The tree is the domain of a function to P(Ω*):

λ ↦ {λ}
a ↦ {AA, BB}
b ↦ {AB, BA}
aa ↦ {AAAA, AABB, BBAA, BBBB}
bb ↦ {ABAB, ABBA, BAAB, BABA}
(further outputs omitted)

The Tree of Output Trees

By looking one level higher in the ranked universe, a relation that was not well-defined becomes a well-defined function. This function is the semantic representation of the transducer.

In OSTIA and other state-merging algorithms, there are two phases: onwarding and merging. For deterministic transducers, onwarding is performed by finding common prefixes of translations, comparing outputs in a much simpler tree of outputs. We demonstrate that this is also possible in the much more general case of SDTs.

Onwarding

The existing technique: fix a dataset.

λ : λ
a : AB
ba : AAB
aaa : ABBA
abb : ABAA
bba : AABB
bbb : AABA

Consider inputs with a common prefix, here b, and list the outputs of those inputs: AAB, AABB and AABA. Find the prefixes common to all the outputs: for b these are A, AA and AAB. Choose a best common prefix to onward: here b : AAB, as sketched below.
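A minimal sketch of this step in Python, assuming the dataset is represented as (input, output) pairs exactly as listed above (the helper name common_prefixes is illustrative, not from the paper):

```python
import os

# The dataset from the slide, as (input, output) pairs.
dataset = [("", ""), ("a", "AB"), ("ba", "AAB"), ("aaa", "ABBA"),
           ("abb", "ABAA"), ("bba", "AABB"), ("bbb", "AABA")]

def common_prefixes(prefix, data):
    """Non-empty prefixes shared by the outputs of every input extending `prefix`."""
    outputs = [out for inp, out in data if inp.startswith(prefix)]
    lcp = os.path.commonprefix(outputs)            # longest common prefix
    return [lcp[:k] for k in range(1, len(lcp) + 1)]

# The inputs extending b are ba, bba and bbb, with outputs AAB, AABB and
# AABA; their common prefixes are A, AA and AAB, and AAB is onwarded.
print(common_prefixes("b", dataset))               # ['A', 'AA', 'AAB']
```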

Onwarding

To understand our new technique, we examine sets of prefixes. Consider the translations of aabb using the earlier example:

{AAAAAB, AAAABA, AABBAB, AABBBA, BBAAAB, BBAABA, BBBBAB, BBBBBA}

[Prefix tree of these eight translations, with the antichain {AA, BB} highlighted.]

Onwarding

{AA, BB} is a maximal antichain of the tree. We call it a valid antichain because the tree below AA is identical to the tree below BB. This is a consequence of AA and BB being outputs of the same transition. For a tree T and x ∈ T, T_x is the tree of suffixes of x in T.

Definition. Given a tree T, a subset S of T is a valid antichain if S is a maximal antichain and (∀x, y ∈ S)(T_x = T_y).

Definition. For P and Q, sets of strings over some common alphabet, we say that P <_ac Q (P is antichain less than Q) if either
1. |P| < |Q|, or
2. |P| = |Q| and, for all x ∈ P and y ∈ Q, if x and y are comparable, then x is a prefix of y.

A sketch of computing and ordering valid antichains follows the definitions.
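As an illustration, here is a minimal Python sketch, assuming the tree of a finite set of strings is its set of prefixes (the helper names maximal_antichains, subtree and valid_antichains are illustrative, not from the paper). It enumerates maximal antichains recursively and keeps those whose elements all have identical subtrees:

```python
from itertools import product

def maximal_antichains(tree, root=""):
    """All maximal antichains of the subtree rooted at `root`: either the
    root alone, or the union of one maximal antichain under each child."""
    children = sorted(n for n in tree
                      if len(n) == len(root) + 1 and n.startswith(root))
    if not children:
        return [frozenset({root})]
    per_child = [maximal_antichains(tree, c) for c in children]
    return [frozenset({root})] + \
           [frozenset().union(*combo) for combo in product(*per_child)]

def valid_antichains(strings):
    """Maximal antichains of the prefix tree whose subtrees all coincide."""
    tree = {s[:k] for s in strings for k in range(len(s) + 1)}
    def subtree(x):
        return frozenset(n[len(x):] for n in tree if n.startswith(x))
    return [a for a in maximal_antichains(tree)
            if len({subtree(x) for x in a}) == 1]

# On the slide's example every valid antichain has a different size, so
# sorting by size already realises the <_ac order: {λ}, then {AA, BB},
# then the four length-4 prefixes, then the eight translations themselves.
T = {"AAAAAB", "AAAABA", "AABBAB", "AABBBA",
     "BBAAAB", "BBAABA", "BBBBAB", "BBBBBA"}
for a in sorted(valid_antichains(T), key=len):
    print(sorted(a))
```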

Onwarding

Theorem. For a finite set of strings S, the valid antichains of S form a finite linear order under <_ac.

Consequence: a canonical way of decomposing trees into initial segments.

The new technique: consider all inputs that have a common prefix; list the trees of outputs for those inputs; find the valid antichains of the trees; and pick a valid antichain common to all of the trees, as sketched below.
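Continuing the sketch above (and reusing its valid_antichains function), the new step on the running example: the inputs a and aa share the prefix a, and their output trees, taken from the tree-of-output-trees slide, have {AA, BB} as a common non-trivial valid antichain, which is the candidate to onward.

```python
# Output trees of the inputs a and aa, which share the prefix a.
trees = [{"AA", "BB"},
         {"AAAA", "AABB", "BBAA", "BBBB"}]

# Keep only the valid antichains that occur in every tree.
common = set(valid_antichains(trees[0]))
for t in trees[1:]:
    common &= set(valid_antichains(t))

# Prints [''] (the trivial antichain {λ}) and ['AA', 'BB']; the largest
# common valid antichain, {AA, BB}, is the one to onward.
for a in sorted(common, key=len):
    print(sorted(a))
```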

Conclusion

Given access to translation queries, the canonical form can be found and SDTs can be learned from a characteristic sample of polynomial size. In other words, there is a single algorithm that learns every SDT under these conditions. Without translation queries, this is not possible. One avenue to pursue is to determine whether translation queries can be replaced with statistical analysis: both provide negative information. This will require a careful development of a probabilistic model.

Thank You