Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers


Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers
Eugene Weinstein, Google, NYU Courant Institute, eugenew@cs.nyu.edu
Slide credit: Mehryar Mohri

Preliminaries
Finite alphabet $\Sigma$, empty string $\epsilon$.
Set of all strings over $\Sigma$: $\Sigma^*$.
Length of a string $x$: $|x|$.
Mirror image or reverse of a string $x = x_1 \cdots x_n$: $x^R = x_n \cdots x_1$.
A language $L$: a subset of $\Sigma^*$.

Rational Operations
Rational operations over languages:
- union, also denoted $L_1 + L_2$: $L_1 \cup L_2 = \{x : x \in L_1 \vee x \in L_2\}$.
- concatenation: $L_1 L_2 = \{x = uv : u \in L_1 \wedge v \in L_2\}$.
- closure: $L^* = \bigcup_{n=0}^{\infty} L^n$, where $L^n = \underbrace{L \cdots L}_{n}$ and $L^0 = \{\epsilon\}$.
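The three operations can be tried out on small finite languages. The following is a minimal sketch, assuming languages are represented as Python sets of strings; the function names are illustrative, and since $L^*$ is infinite whenever $L$ contains a non-empty string, the closure is truncated at a hypothetical `max_power`.

```python
def union(l1, l2):
    """L1 + L2: strings that belong to either language."""
    return l1 | l2


def concat(l1, l2):
    """L1 L2: every string uv with u in L1 and v in L2."""
    return {u + v for u in l1 for v in l2}


def closure_up_to(lang, max_power):
    """Finite approximation of L*: the union of L^0, ..., L^max_power."""
    result = {""}   # L^0 = {epsilon}
    power = {""}
    for _ in range(max_power):
        power = concat(power, lang)   # L^(n+1) = L^n L
        result |= power
    return result


print(sorted(concat({"a", "ab"}, {"b", ""})))
print(sorted(closure_up_to({"ab"}, 2)))
```

The truncation is only there to keep the sets finite; the automata constructions later in the lecture represent the full closure exactly.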

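Regular or Rational Languages
Definition: the class of regular/rational languages over $\Sigma$ is the smallest set $\mathcal{L}$ containing the empty set and closed under the rational operations, i.e.,
- $\emptyset \in \mathcal{L}$;
- $\forall x \in \Sigma$, $\{x\} \in \mathcal{L}$;
- $\forall L_1, L_2 \in \mathcal{L}$: $L_1 \cup L_2 \in \mathcal{L}$, $L_1 L_2 \in \mathcal{L}$, $L_1^* \in \mathcal{L}$.
Examples of regular languages over $\Sigma = \{a, b, c\}$: $\Sigma^*$, $(a + b)^* c$, $a b^n c$, $(a + (b + c)^*)^* c$.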

Finite Automata
Definition: a finite automaton $A$ over the alphabet $\Sigma$ is a 4-tuple $(Q, I, F, E)$ where $Q$ is a finite set of states, $I \subseteq Q$ a set of initial states, $F \subseteq Q$ a set of final states, and $E$ a multiset of transitions, which are elements of $Q \times (\Sigma \cup \{\epsilon\}) \times Q$.
A path $\pi$ in an automaton $A = (Q, I, F, E)$ is an element of $E^*$.
A path from a state in $I$ to a state in $F$ is called an accepting path.
Language $L(A)$ accepted by $A$: the set of strings labeling accepting paths.
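To make the definition concrete, here is a small sketch of membership testing: a string is accepted when some path from an initial state to a final state is labeled with it. It assumes transitions are stored as `(source, label, destination)` triples and that $\epsilon$ is encoded as `None`; both choices, and the function names, are illustrative rather than part of the lecture.

```python
EPS = None  # assumed encoding of the empty-string label


def eps_closure(states, transitions):
    """All states reachable from `states` using only epsilon-transitions."""
    seen = set(states)
    stack = list(states)
    while stack:
        p = stack.pop()
        for src, label, dst in transitions:
            if src == p and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(initial, final, transitions, string):
    """True if some accepting path is labeled with `string`."""
    current = eps_closure(initial, transitions)
    for symbol in string:
        step = {dst for src, label, dst in transitions
                if src in current and label == symbol}
        current = eps_closure(step, transitions)
    return bool(current & set(final))


# Example automaton (hypothetical): accepts a followed by any number of b's.
E = [(0, "a", 1), (1, "b", 1)]
print(accepts({0}, {1}, E, "abb"))
print(accepts({0}, {1}, E, "ba"))
```

Tracking the whole set of currently reachable states avoids enumerating paths one by one, which matters because an automaton with several initial states or repeated labels may have many paths for the same string.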

Finite Automata - Example
[Figure: an automaton with states 0, 1, and 2.]

Finite Automata - Some Properties
- Trim: any state lies on some accepting path.
- Unambiguous: no two accepting paths have the same label.
- Deterministic: unique initial state, and two transitions leaving the same state have different labels.
- Complete: at least one outgoing transition labeled with any alphabet element at any state.
- Acyclic: no path with a cycle.

Normalized Automata
Definition: a finite automaton is normalized if
- it has a unique initial state with no incoming transition;
- it has a unique final state with no outgoing transition.
[Figure: an automaton $A$ drawn between its initial state $i$ and its final state $f$.]

Elementary Normalized Automata
Definition: normalized automaton accepting an element $a \in \Sigma \cup \{\epsilon\}$, constructed as follows.
[Figure: a single transition labeled $a$ from state 0 to state 1.]

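Normalized Automata: Union
Construction: the union of two normalized automata is a normalized automaton constructed as follows.
[Figure: automata $A_1$ (from $i_1$ to $f_1$) and $A_2$ (from $i_2$ to $f_2$) placed between a new initial state $i$ and a new final state $f$.]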

Normalized Automata: Concatenation
Construction: the concatenation of two normalized automata is a normalized automaton constructed as follows.
[Figure: automaton $A_1$ (from $i_1$ to $f_1$) followed by automaton $A_2$ (from $i_2$ to $f_2$).]

Normalized Automata: Closure
Construction: the closure of a normalized automaton is a normalized automaton constructed as follows.
[Figure: automaton $A$ (from $i$ to $f$) placed between a new initial state $i_0$ and a new final state $f_0$.]

Normalized Automata - Properties
Construction properties:
- each rational operation requires creating at most two states.
- each state has at most two outgoing transitions.
- the complexity of each operation is linear.

Thompson's Construction
Proposition: let $r$ be a regular expression over the alphabet $\Sigma$. Then, there exists a normalized automaton $A$ with at most $2|r|$ states representing $r$ (Thompson, 1968).
Proof:
- first, parse the regular expression.
- construction of a normalized automaton starting from the elementary expressions and following the operations of the tree.
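The proof outline can be sketched in code. This is a minimal illustration, not the lecture's implementation: it assumes the expression has already been parsed into nested tuples (`("sym", a)`, `("concat", r1, r2)`, `("union", r1, r2)`, `("star", r)`), encodes $\epsilon$ as `None`, and uses the usual Thompson $\epsilon$-wiring for union, concatenation, and closure, which is an assumption here since the slide figures did not survive extraction.

```python
import itertools

EPS = None  # assumed encoding of the empty-string label


def thompson(tree):
    """Build a normalized automaton from a parsed regular expression.

    Returns (number_of_states, initial_state, final_state, transitions).
    """
    counter = itertools.count()   # fresh state identifiers 0, 1, 2, ...
    edges = []

    def build(node):
        kind = node[0]
        if kind == "sym":          # elementary automaton: i --a--> f
            i, f = next(counter), next(counter)
            edges.append((i, node[1], f))
            return i, f
        if kind == "concat":       # link f1 to i2, no new state
            i1, f1 = build(node[1])
            i2, f2 = build(node[2])
            edges.append((f1, EPS, i2))
            return i1, f2
        if kind == "union":        # two new states i and f
            i1, f1 = build(node[1])
            i2, f2 = build(node[2])
            i, f = next(counter), next(counter)
            edges.extend([(i, EPS, i1), (i, EPS, i2),
                          (f1, EPS, f), (f2, EPS, f)])
            return i, f
        if kind == "star":         # two new states, a loop, and a bypass
            i1, f1 = build(node[1])
            i, f = next(counter), next(counter)
            edges.extend([(i, EPS, i1), (f1, EPS, f),
                          (f1, EPS, i1), (i, EPS, f)])
            return i, f
        raise ValueError(f"unknown node kind: {kind!r}")

    initial, final = build(tree)
    return next(counter), initial, final, edges


# Parsed form of the expression a b* + c (tree shape chosen for illustration).
tree = ("union",
        ("concat", ("sym", "a"), ("star", ("sym", "b"))),
        ("sym", "c"))
n_states, initial, final, edges = thompson(tree)
print(n_states, len(edges))
```

Each operator adds at most two states and each symbol exactly two, which is where a bound of the form $2|r|$ comes from when $|r|$ counts symbols and operators.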

Thompson's Construction - Example
[Figure: a normalized automaton with states 0 to 9, connected by $\epsilon$-transitions, including a transition labeled $c$ from state 7 to state 8.]
Normalized automaton for the regular expression $ab^* + c$.

Regular Languages and Finite Automata
Theorem: a language is regular iff it can be accepted by a finite automaton (Kleene, 1956).
Proof: let $A = (Q, I, F, E)$ be a finite automaton.
- for $(i, j, k) \in [1, |Q|] \times [1, |Q|] \times [0, |Q|]$, define
  $X_{ij}^{k} = \{i \to q_1 \to q_2 \to \cdots \to q_n \to j : n \ge 0,\ q_i \le k\}$.
- $X_{ij}^{0}$ is regular for all $(i, j)$ since $E$ is finite.
- by induction, $X_{ij}^{k}$ is regular for all $(i, j, k)$ since
  $X_{ij}^{k+1} = X_{ij}^{k} + X_{i,k+1}^{k} \, (X_{k+1,k+1}^{k})^* \, X_{k+1,j}^{k}$.
- $L(A) = \bigcup_{i \in I, f \in F} X_{if}^{|Q|}$ is thus regular.
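The recurrence in the proof is effectively an algorithm for turning an automaton into a regular expression. The sketch below is one hedged way to run it: it assumes states are numbered 1 to n, that there are no $\epsilon$-transitions, and that the result is written in Python `re` syntax (so `|` stands for `+`), with `None` standing for the empty language. As in the definition of $X_{ij}^{k}$, only paths with at least one transition are counted, so the empty string is added separately when a state is both initial and final. All names are illustrative.

```python
import re


def kleene_regex(n, initial, final, transitions):
    """Regular expression (Python `re` syntax) for an automaton with states 1..n."""

    def alt(r, s):                 # r + s
        if r is None or r == s:
            return s
        if s is None:
            return r
        return f"(?:{r}|{s})"

    def cat(*parts):               # concatenation; empty if any factor is empty
        if any(p is None for p in parts):
            return None
        return "".join(parts)

    def star(r):                   # closure; the closure of the empty set is {epsilon}
        return "" if r is None else f"(?:{r})*"

    # X[i][j] holds X^0_ij: the labels of direct transitions from i to j.
    X = [[None] * (n + 1) for _ in range(n + 1)]
    for p, a, q in transitions:
        X[p][q] = alt(X[p][q], re.escape(a))

    # X^k_ij = X^(k-1)_ij + X^(k-1)_ik (X^(k-1)_kk)* X^(k-1)_kj
    for k in range(1, n + 1):
        X = [[alt(X[i][j], cat(X[i][k], star(X[k][k]), X[k][j]))
              for j in range(n + 1)] for i in range(n + 1)]

    result = None
    for i in initial:
        for f in final:
            result = alt(result, X[i][f])
    if set(initial) & set(final):  # the empty path accepts the empty string
        result = alt(result, "")
    return result


# Example automaton (hypothetical): state 1 --a--> state 2, with a b-loop on 2.
pattern = kleene_regex(2, {1}, {2}, [(1, "a", 2), (2, "b", 2)])
print(pattern)
print(bool(re.fullmatch(pattern, "abb")))
```

The expressions produced this way are correct but far from minimal; their size can grow quickly with the number of states, which is why the construction is mainly of theoretical interest.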

Regular Languages and Finite Automata
Proof: the converse holds by Thompson's construction.
Notes:
- a more general theorem (Schützenberger, 1961) holds for weighted automata.
- not all languages are regular, e.g., $L = \{a^n b^n : n \in \mathbb{N}\}$ is not regular. Let $A$ be an automaton. If $L \subseteq L(A)$, then for large enough $n$, $a^n b^n$ corresponds to a path with a cycle: $a^n b^n = a^p u b^q$ and $a^p u^* b^q \subseteq L(A)$. Since some string $a^p u^k b^q$ with $k \ne 1$ is not in $L$, this implies $L(A) \ne L$.

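ε-removal
Theorem: any finite automaton $A = (Q, I, F, E)$ has an equivalent automaton with no ε-transitions.
Proof: for any state $q \in Q$, let $\epsilon[q]$ denote the set of states reached from $q$ by paths labeled with $\epsilon$. Define $A' = (Q', I', F', E')$ by
- $Q' = \{\epsilon[q] : q \in Q\}$, $I' = \bigcup_{q \in I} \{\epsilon[q]\}$, $F' = \{\epsilon[q] : \epsilon[q] \cap F \ne \emptyset\}$.
- $E' = \{(\epsilon[p], a, \epsilon[q]) : \exists (p', a, q) \in E,\ p' \in \epsilon[p]\}$.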
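The construction above translates almost line by line into code. This sketch assumes $\epsilon$ is encoded as `None`, that $\epsilon[q]$ always contains $q$ itself (the empty path), and that the new transitions are kept as a set rather than a multiset; the names are illustrative.

```python
EPS = None  # assumed encoding of the empty-string label


def eps_closure(q, transitions):
    """epsilon[q]: states reachable from q by epsilon-paths, q included."""
    seen = {q}
    stack = [q]
    while stack:
        p = stack.pop()
        for src, label, dst in transitions:
            if src == p and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)


def remove_epsilons(states, initial, final, transitions):
    """Return (Q', I', F', E') following the closure-based definition."""
    closure = {q: eps_closure(q, transitions) for q in states}
    new_states = set(closure.values())
    new_initial = {closure[q] for q in initial}
    new_final = {s for s in new_states if s & set(final)}
    new_transitions = {
        (closure[p], a, closure[q])
        for p in states
        for p2, a, q in transitions
        if a is not EPS and p2 in closure[p]
    }
    return new_states, new_initial, new_final, new_transitions


# Example (hypothetical): 0 --eps--> 1 --a--> 2, with 0 initial and 2 final.
Q2, I2, F2, E2 = remove_epsilons({0, 1, 2}, {0}, {2},
                                 [(0, EPS, 1), (1, "a", 2)])
print(sorted((sorted(s), a, sorted(t)) for s, a, t in E2))
```

The result may contain states that are no longer reachable from an initial state (here the closure of state 1); trimming them is a separate step.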

ε-removal - Illustration
[Figure: an automaton with states 0, 1, 2, 3, shown together with the state sets {0, 1}, {0, 2}, {0, 1, 3}, and {0}.]

Determinization
Theorem: any automaton $A = (Q, I, F, E)$ without ε-transitions has an equivalent deterministic automaton.
Proof: subset construction. $A' = (Q', I', F', E')$ with
- $Q' = 2^Q$.
- $I' = \{s \in Q' : s \cap I \ne \emptyset\}$.
- $F' = \{s \in Q' : s \cap F \ne \emptyset\}$.
- $E' = \{(s, a, s') : \exists (q, a, q') \in E,\ q \in s,\ q' \in s'\}$.
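In practice the subset construction is usually run only over the subsets that are actually reachable. The sketch below uses that common variant, which is an assumption rather than the exact formulation on the slide: the single initial state is the subset $I$ itself, and the $a$-successor of a subset $s$ is the set of all destinations of $a$-transitions leaving states of $s$. Names and the example automaton are illustrative.

```python
def determinize(alphabet, initial, final, transitions):
    """Reachable-subset construction for an automaton without epsilon-transitions.

    Returns (states, start, final_states, delta), where delta maps
    (subset, symbol) to the successor subset.
    """
    start = frozenset(initial)
    delta = {}
    seen = {start}
    queue = [start]
    while queue:
        s = queue.pop()
        for a in alphabet:
            target = frozenset(dst for src, label, dst in transitions
                               if src in s and label == a)
            if not target:          # no a-transition leaves this subset
                continue
            delta[(s, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    final_states = {s for s in seen if s & set(final)}
    return seen, start, final_states, delta


# Example (hypothetical): strings over {a, b} that end in "ab".
E = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2)]
states, start, finals, delta = determinize("ab", {0}, {2}, E)
print(len(states), sorted(sorted(s) for s in finals))
```

Only three of the eight possible subsets are created for this example; in the worst case the number of reachable subsets is still exponential in $|Q|$.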

Determinization - Illustration
[Figure: an automaton with states 0, 1, 2, shown together with the subsets {0}, {1}, {1, 2}, {2}, and {0, 1}.]

Completion
Theorem: any deterministic automaton has an equivalent complete deterministic automaton.
Proof: algorithm illustration.
[Figure: a deterministic automaton with states 0, 1, 2, 3 and its completion with an additional state 4.]

Complementation
Theorem: let $A = (Q, I, F, E)$ be a deterministic automaton, then there exists a deterministic automaton accepting $\overline{L(A)}$.
Proof: by the previous property, we can assume $A$ complete. The automaton $B = (Q, I, Q - F, E)$, obtained from $A$ by making non-final states final and final states non-final, accepts exactly $\overline{L(A)}$.
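Completion and complementation combine naturally in code. This is a sketch under stated assumptions: the deterministic automaton is given as a dictionary from `(state, symbol)` to the next state, completion adds a single non-final sink state (the name `"sink"` is hypothetical and must not clash with an existing state), and complementation then swaps final and non-final states.

```python
def complete(states, alphabet, delta, sink="sink"):
    """Add a sink state so that every (state, symbol) pair has a transition."""
    states = set(states)
    delta = dict(delta)
    missing = [(q, a) for q in states for a in alphabet if (q, a) not in delta]
    if missing:
        states.add(sink)
        for a in alphabet:
            delta[(sink, a)] = sink    # the sink loops on every symbol
        for key in missing:
            delta[key] = sink
    return states, delta


def complement(states, alphabet, initial, final, delta):
    """Complete the automaton, then swap final and non-final states."""
    states, delta = complete(states, alphabet, delta)
    return states, initial, states - set(final), delta


def accepts(initial, final, delta, string):
    """Run a deterministic automaton; a missing transition rejects."""
    q = initial
    for a in string:
        if (q, a) not in delta:
            return False
        q = delta[(q, a)]
    return q in final


# Example (hypothetical): a followed by any number of b's, over {a, b}.
delta = {(0, "a"): 1, (1, "b"): 1}
Qc, ic, Fc, dc = complement({0, 1}, "ab", 0, {1}, delta)
print(accepts(ic, Fc, dc, "abb"), accepts(ic, Fc, dc, "ba"))
```

Skipping the completion step would give a wrong answer: strings that fall off the incomplete automaton, such as "ba" here, would be rejected both before and after swapping final states.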

Complementation - Illustration
[Figure: a complete deterministic automaton with states 0 to 4 and the automaton obtained by exchanging its final and non-final states.]

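Finite-State Transducers
Definition: a finite-state transducer $T$ over the alphabets $\Sigma$ and $\Delta$ is a 4-tuple $(Q, I, F, E)$ where $Q$ is a finite set of states, $I \subseteq Q$ a set of initial states, $F \subseteq Q$ a set of final states, and $E$ a multiset of transitions, which are elements of $Q \times (\Sigma \cup \{\epsilon\}) \times (\Delta \cup \{\epsilon\}) \times Q$.
$T$ defines a relation via the pairs of input and output labels of its accepting paths:
$R(T) = \{(x, y) \in \Sigma^* \times \Delta^* : I \xrightarrow{x:y} F\}$.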
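As an illustration of the relation $R(T)$, the sketch below collects every output $y$ paired with a given input $x$ by exploring accepting paths. It assumes transitions are `(source, input, output, destination)` tuples, that $\epsilon$ is encoded as the empty Python string, and that the transducer has no cycle of input-$\epsilon$ transitions (otherwise the search would not terminate). The names and the example transducer are hypothetical.

```python
EPS = ""  # assumed encoding of epsilon, so that output labels concatenate directly


def transduce(initial, final, transitions, x):
    """Set of outputs y such that (x, y) is in R(T)."""
    outputs = set()
    # Each search item is (current state, input position, output so far).
    stack = [(q, 0, "") for q in initial]
    while stack:
        state, pos, out = stack.pop()
        if pos == len(x) and state in final:
            outputs.add(out)
        for src, inp, outp, dst in transitions:
            if src != state:
                continue
            if inp == EPS:                        # consumes no input symbol
                stack.append((dst, pos, out + outp))
            elif pos < len(x) and x[pos] == inp:  # consumes one input symbol
                stack.append((dst, pos + 1, out + outp))
    return outputs


# Example (hypothetical): rewrite every a as x and delete every b.
E = [(0, "a", "x", 0), (0, "b", EPS, 0)]
print(transduce({0}, {0}, E, "aba"))
```

A set is returned because a transducer defines a relation, not necessarily a function: one input may be paired with several outputs, or with none.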

References
Kleene, S. C. 1956. Representation of events in nerve nets and finite automata. Automata Studies.
Lewis, Harry R. and Papadimitriou, Christos H. 1981. Elements of the Theory of Computation, Chapter 2. Prentice Hall.
Nivat, Maurice. 1968. Transductions des langages de Chomsky. Annales 18, Institut Fourier.
Schützenberger, Marcel-Paul. 1961. On the definition of a family of automata. Information and Control, 4.
Thompson, K. 1968. Regular expression search algorithm. Comm. ACM, 11.