Introduction to Finite Automaton


Lecture 1 Introduction to Finite Automaton IIP-TL@NTU Lim Zhi Hao 2015

Lecture 1 Introduction to Finite Automata (FA) Intuition of FA Informally, a finite automaton is a finite set of states together with a set of state transitions, where each transition carries at least one label. Figure: Some examples of Finite Automata

Lecture 1 Introduction to Finite Automata (FA) Applications of FA Finding all occurrences of the pattern 10 in a binary string. [Figure: a three-state automaton with states x0, x1, x2 that recognizes occurrences of 10] Consider 3 binary strings: 001001 ✓, 000111 ✗, 101010 ✓
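The pattern-matching idea above can be sketched in code. This is a minimal illustration (the function name and the exact state encoding are mine, not from the slides): a small scan over the string that tracks whether a 1 has just been seen, so a following 0 completes an occurrence of 10.

```python
def occurrences_of_10(s):
    """Count occurrences of the pattern '10' by simulating a small DFA.

    State 0: nothing useful seen yet.
    State 1: a '1' has been seen; a following '0' completes '10'.
    """
    state, count = 0, 0
    for c in s:
        if state == 0:
            state = 1 if c == "1" else 0
        else:  # state == 1
            if c == "0":
                count += 1   # occurrence of '10' completed
                state = 0
            # on '1' we stay in state 1: the new '1' may still start a match
    return count

# The three strings from the slide:
assert occurrences_of_10("001001") == 1   # contains '10'
assert occurrences_of_10("000111") == 0   # no '10' anywhere
assert occurrences_of_10("101010") == 3   # three occurrences
```

A string is marked ✓ above exactly when this count is nonzero.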

Lecture 1 Introduction to Finite Automata (FA) Applications of FA A Bot is a character in a game that is controlled by the computer (not an NPC!). Coding Bot behaviours in games. [Figure: states roam, fight, flee; transitions labelled Weak enemy, Powerful enemy, Enemy escapes/dies, Escaped] And many other applications as well.

Lecture 1 Introduction to Finite Automata (FA) Definition of FA Finite Automaton (Definition): a 5-tuple written as (Q, A, E, I, F), where
Q: finite set of states
A: alphabet, or a formal language
E: finite set of state transitions
I: set of initial states
F: set of final states

Lecture 1 Introduction to Finite Automata (FA) Definition of FA Worked Example Let C = (Q, A, E, I, F), where Q = {1, 2}, I = {1}, F = {2}, A = {a, b} and E = {(1, a, 1), (1, a, 2), (2, b, 1), (2, b, 2)}. Solution: [Figure: the automaton C]
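A quick way to sanity-check the worked example is to simulate C directly as a nondeterministic automaton. A minimal sketch (the helper name is mine, not from the slides):

```python
# The automaton C from the worked example.
E = {(1, "a", 1), (1, "a", 2), (2, "b", 1), (2, "b", 2)}
I, F = {1}, {2}

def accepts(word):
    """True iff some sequence of transitions over `word` leads from I to F."""
    current = set(I)                       # all states reachable so far
    for sym in word:
        current = {q for (p, a, q) in E if p in current and a == sym}
    return bool(current & F)

assert accepts("a")          # via the transition (1, a, 2)
assert accepts("abab")       # e.g. 1 -a-> 2 -b-> 1 -a-> 2
assert not accepts("b")      # no b-transition leaves state 1
```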

Lecture 1 Introduction to Finite Automata (FA) Paths in FA Intuition of Path Note that a path is a series of directed edges, π = e₁ ⋯ eₙ. An Example: π₁ = (1, a, 1), (1, a, 2), (2, b, 2), (2, b, 1), (1, a, 2), (2, b, 2) is a path

Lecture 1 Introduction to Finite Automata (FA) Paths in FA Intuition of Path Another Example (Bot Behaviour): [Figure: states roam, fight, flee; transitions labelled Weak enemy, Powerful enemy, Enemy escapes/dies, Escaped] π₁ = (roam, Powerful enemy, flee), (flee, Escaped, roam) is a path

Lecture 1 Introduction to Finite Automata (FA) Paths in FA Definition of Paths in FA Given a path π = e₁ ⋯ eₙ:
Origin or previous state: p[π], e.g. p[π] = p[e₁]
Destination or next state: n[π], e.g. n[π] = n[eₙ]
Input label: i(π)
Output label: o(π)
Set of Paths (Definition):
P(R₁, R₂): set of paths from R₁ ⊆ Q to R₂ ⊆ Q
P(R₁, x, R₂): those with input label x ∈ A*
P(R₁, x, y, R₂): those with output label y ∈ A* as well

Lecture 1 Introduction to Finite Automata (FA) Paths in FA Summary/Recap What is a Finite Automaton? Definition of Finite Automaton. Definition of Paths and Sets of Paths

Lecture 1 Appendix References References I Jean-Eric Pin, Mathematical Foundations of Automata Theory. Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers.

Lecture 2 Introduction to Finite State Transducers IIP-TL@NTU Lim Zhi Hao 2015

Lecture 2 Introduction to FSTs and WFSTs Introduction to FST Intuition of FST An FST is an extension of an FA that has both an input and an output. E.g.: given input abbcd, the FST will return zyyxw. Why?

Lecture 2 Introduction to FSTs and WFSTs Introduction to FST FST: Formal Definition Finite State Transducer (Definition): a 6-tuple written as (Σ, Δ, Q, E, I, F)
Σ: set of input labels
Δ: set of output labels
Q: finite set of states
E: finite set of state transitions, E ⊆ Q × (Σ ∪ {ε}) × (Δ ∪ {ε}) × Q
I: set of initial states
F: set of final states

Lecture 2 Introduction to FSTs and WFSTs Introduction to FST What is Q × (Σ ∪ {ε}) × (Δ ∪ {ε}) × Q? Consider a walk that goes 0 → 1 → 2 → 5. Formally, the path is written as: π₁ = (0, a, z, 1), (1, c, x, 2), (2, d, w, 5). Notice each 4-tuple has the format Q × (Σ ∪ {ε}) × (Δ ∪ {ε}) × Q.

Lecture 2 Introduction to FSTs and WFSTs Introduction to WFST Intuition of WFST A WFST is a weighted FST: each transition, initial state and final state has an additional weight associated with it. E.g.: given input abbcd, the WFST will return zyyxw together with the weight of the path, 0.5 ⊗ 1.2 ⊗ 0.7 ⊗ 0.7 ⊗ 3 ⊗ 2 ⊗ 0.1.

Lecture 2 Introduction to FSTs and WFSTs Introduction to WFST Weighted Finite State Transducer (Definition): an 8-tuple written as (Σ, Δ, Q, E, I, F, λ, ρ)
Σ: set of input labels
Δ: set of output labels
Q: finite set of states
E: finite set of state transitions, E ⊆ Q × (Σ ∪ {ε}) × (Δ ∪ {ε}) × K × Q
I: set of initial states
F: set of final states
λ: initial weight function (λ: I → K)
ρ: final weight function (ρ: F → K)

Lecture 2 Introduction to FSTs and WFSTs Introduction to WFST What is Q × (Σ ∪ {ε}) × (Δ ∪ {ε}) × K × Q? Consider the same walk as before: 0 → 1 → 2 → 5. Formally, the path is written as: π₂ = (0, a, z, 1.2, 1), (1, c, x, 3, 2), (2, d, w, 2, 5). Each 5-tuple has the format Q × (Σ ∪ {ε}) × (Δ ∪ {ε}) × K × Q.

Lecture 2 Introduction to FSTs and WFSTs Introduction to WFST Differences between FST and WFST

Lecture 2 Introduction to FSTs and WFSTs Introduction to WFST Summary on Types of FA

Lecture 2 Introduction to FSTs and WFSTs Basic Properties of Transducers Basic Properties of Transducers Given T = (Σ, Δ, Q, E, I, F, λ, ρ) and x ∈ Σ*, y ∈ Δ*:
Deterministic: T has no two transitions leaving the same state that share the same input label
Sequential: T has a unique initial state
Functional: T has a unique output y for any input x
Epsilon-transitions: there exists a state transition that can be made without consuming any input (ε)
Stochastic WFST: for every state, the sum of the weights of all transitions leaving that state is EXACTLY 1

Lecture 2 Introduction to FSTs and WFSTs Basic Properties of Transducers Summary/Recap What is a Finite State Transducer? Definition of Weighted Finite State Transducers. Different types of Finite Automaton. Basic Properties of Transducers

Lecture 2 Appendix References References I Jean-Eric Pin, Mathematical Foundations of Automata Theory. Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers.

Lecture 3 Introduction to Semirings I IIP-TL@NTU Lim Zhi Hao 2015

Lecture 3 Introduction to Semirings Why study Semirings? Motivation and Importance I Semirings are a type of algebraic structure. In order to manipulate WFSTs, we need to learn the concept of semirings. In WFST-based speech recognition, the tropical semiring is mainly used; in some optimization steps for WFSTs, the log semiring is also used. The operations on WFSTs will be introduced in detail in the following sections.

Lecture 3 Introduction to Semirings Why study Semirings? Motivation and Importance II Reasons computer scientists are interested in semirings: reduce problem complexity; enable generic problem solving. Reasons mathematicians are interested in semirings: semirings are fundamentally different from fields; new areas of research thanks to applications in CS.

Lecture 3 Introduction to Semirings What are Semirings? What are Semirings: Algebra revisited I Based on your own knowledge, these should look familiar to you:
1 + (2 + 3) = (1 + 2) + 3
3 + 4 = 4 + 3
3 + 0 = 3
1 + (−1) = 0
2 × (3 + 4) = (2 × 3) + (2 × 4)
(3 + 4) × 2 = (3 × 2) + (4 × 2)
2 × (3 × 4) = (2 × 3) × 4
2 × 5 = 5 × 2
2 × 1 = 2
2 × 0.5 = 1
In pure mathematics, there is a rigorous study and extension of such algebraic structures, known as Abstract Algebra.

Lecture 3 Introduction to Semirings What are Semirings? What are Semirings: Algebra revisited II Rewriting into classical algebra, given x, y, z ∈ ℝ:
x + (y + z) = (x + y) + z
x + y = y + x
x + 0 = x
x + (−x) = 0
x × (y × z) = (x × y) × z
x × y = y × x
x × 1 = x
x × x⁻¹ = 1, for x ≠ 0
z × (x + y) = (z × x) + (z × y)
(x + y) × z = (x × z) + (y × z)
Each of these properties has its own name.

Lecture 3 Introduction to Semirings What are Semirings? What are Semirings: Algebra revisited III
Addition:
Associativity: x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
Commutativity: x ⊕ y = y ⊕ x
Identity 0̄: x ⊕ 0̄ = x
Inverse −x: x ⊕ (−x) = 0̄
Multiplication:
Associativity: x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z
Commutativity: x ⊗ y = y ⊗ x
Identity 1̄: x ⊗ 1̄ = x
Inverse x⁻¹: x ⊗ x⁻¹ = 1̄
Distributivity of Multiplication over Addition:
z ⊗ (x ⊕ y) = (z ⊗ x) ⊕ (z ⊗ y)
(x ⊕ y) ⊗ z = (x ⊗ z) ⊕ (y ⊗ z)
The algebraic structure in the above case is known as a FIELD. Recall POLYNOMIALS

Lecture 3 Introduction to Semirings What are Semirings? What are Semirings: Algebra revisited IV
Addition:
Associativity: x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
Commutativity: x ⊕ y = y ⊕ x
Identity 0̄: x ⊕ 0̄ = x
Inverse −x: x ⊕ (−x) = 0̄
Multiplication:
Associativity: x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z
Identity 1̄: x ⊗ 1̄ = x
(no multiplicative inverse is required, and ⊗ need not be commutative)
Distributivity of Multiplication over Addition:
z ⊗ (x ⊕ y) = (z ⊗ x) ⊕ (z ⊗ y)
(x ⊕ y) ⊗ z = (x ⊗ z) ⊕ (y ⊗ z)
The algebraic structure in the above case is known as a RING. Recall SQUARE MATRICES

Lecture 3 Introduction to Semirings What are Semirings? What are Semirings: Algebra revisited V
Addition:
Associativity: x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
Commutativity: x ⊕ y = y ⊕ x
Identity 0̄: x ⊕ 0̄ = x
(no additive inverse is required)
Multiplication:
Associativity: x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z
Identity 1̄: x ⊗ 1̄ = x
(no multiplicative inverse is required)
Distributivity of Multiplication over Addition:
z ⊗ (x ⊕ y) = (z ⊗ x) ⊕ (z ⊗ y)
(x ⊕ y) ⊗ z = (x ⊗ z) ⊕ (y ⊗ z)
The algebraic structure above is known as a SEMIRING. Recall PROBABILITY

Lecture 3 Introduction to Semirings What are Semirings? Summary/Recap Informally: Fields ⊂ Rings ⊂ Semirings

Lecture 3 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers.

Lecture 4 Introduction to Semirings II IIP-TL@NTU Lim Zhi Hao 2015

Lecture 4 Formal Definition of Semirings Semiring (Definition): a 5-tuple written as (K, ⊕, ⊗, 0̄, 1̄), where
K: a set containing some mathematical objects
⊕: addition operation
⊗: multiplication operation
0̄: additive identity (note: 0̄ ∈ K)
1̄: multiplicative identity (note: 1̄ ∈ K)

Lecture 4 Why study Semirings? ⊕-Addition: computes the weight of a sequence
Associativity: x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
Commutativity: x ⊕ y = y ⊕ x
Additive identity: x ⊕ 0̄ = x
⊗-Multiplication: computes the weight of a path
Associativity: x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z
Distributivity: z ⊗ (x ⊕ y) = (z ⊗ x) ⊕ (z ⊗ y); (x ⊕ y) ⊗ z = (x ⊗ z) ⊕ (y ⊗ z)
Multiplicative identity: x ⊗ 1̄ = x

Lecture 4 Why study Semirings? Why study Semirings? In most areas that use mathematics, understanding how a field works is more than enough. However, when it comes to manipulating WFSTs, knowledge of semirings is required. It is also useful for generic problem solving.

Lecture 4 Why study Semirings? Shortest Path from S to T Step 1, compute: 9 + 4 = 13; 1 + 6 + 5 = 12; 1 + 2 + 3 + 5 = 11. Step 2, then: min{13, 12, 11} = 11

Lecture 4 Why study Semirings? Maximum Reliability from S to T Step 1, compute: 0.4 × 0.8 = 0.32; 0.9 × 0.2 × 0.7 = 0.126; 0.9 × 0.9 × 1.0 × 0.7 = 0.567. Step 2, then: max{0.32, 0.126, 0.567} = 0.567

Lecture 4 Why study Semirings? Language leading from S to T in the Automaton Step 1, compute: {a} · {t} = {at}; {t} · {h} · {e} = {the}; {t} · {i} · {m} · {e} = {time}. Step 2, then: ∪{{at}, {the}, {time}} = {at, the, time}

Lecture 4 Why study Semirings? Generic Problem Solving These problems look vastly different, but at the core they are very much the same problem. Consider:
min{(9 + 4), (1 + 6 + 5), (1 + 2 + 3 + 5)}
max{(0.4 × 0.8), (0.9 × 0.2 × 0.7), (0.9 × 0.9 × 1.0 × 0.7)}
∪{{a} · {t}, {t} · {h} · {e}, {t} · {i} · {m} · {e}}
Is there a more general way of looking at the expressions above?

Lecture 4 Why study Semirings? Generic Problem Solving If we replace the inner operations with ⊗ and the outer operation with ⊕:
⊕{ w₁ ⊗ w₂, w₃ ⊗ w₄ ⊗ w₅, w₃ ⊗ w₆ ⊗ w₇ ⊗ w₅ } = ⊕{ w(π₁), w(π₂), w(π₃) }
Even more compactly: ⊕_{π ∈ P(S,T)} w(π)
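The generic view can be made concrete in a few lines. A sketch (the function and variable names are mine): one routine computes ⊕ over paths of ⊗ over edge weights, and the three problems above become three instantiations of it.

```python
from functools import reduce

def path_value(paths, plus, times, one):
    """Generic ⊕ over paths of (⊗ over the edge weights of each path)."""
    weights = [reduce(times, path, one) for path in paths]
    return reduce(plus, weights)

# Shortest path: ⊕ = min, ⊗ = +, 1̄ = 0
assert path_value([[9, 4], [1, 6, 5], [1, 2, 3, 5]],
                  min, lambda a, b: a + b, 0) == 11

# Maximum reliability: ⊕ = max, ⊗ = ×, 1̄ = 1
r = path_value([[0.4, 0.8], [0.9, 0.2, 0.7], [0.9, 0.9, 1.0, 0.7]],
               max, lambda a, b: a * b, 1.0)
assert abs(r - 0.567) < 1e-9

# Language of paths: ⊕ = set union, ⊗ = concatenation, 1̄ = {ε}
lang = path_value(
    [[{"a"}, {"t"}], [{"t"}, {"h"}, {"e"}], [{"t"}, {"i"}, {"m"}, {"e"}]],
    lambda A, B: A | B,
    lambda A, B: {u + v for u in A for v in B},
    {""})
assert lang == {"at", "the", "time"}
```

The same routine solves all three problems; only the semiring changes.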

Lecture 4 Why study Semirings? Summary/Recap Formal Definition Of Semiring? Why Study Semirings? Generic Problem Solving.

Lecture 4 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers.

Lecture 5 Introduction to Semirings III IIP-TL@NTU Lim Zhi Hao 2015

Lecture 5 Examples Of Semirings Common Examples of Semirings
Probability: ([0, 1], +, ×, 0, 1)
Strings: (Σ* ∪ {s_∞}, ∧, ·, s_∞, ε), where
Σ*: the finite strings; s_∞: the infinite string
∧: longest common prefix (e.g. bcde ∧ bcf = bc)
·: concatenation of 2 strings
Tropical: (ℝ ∪ {∞}, min, +, ∞, 0), where min: the minimum function
Log: (ℝ ∪ {∞}, ⊕_log, +, ∞, 0), where x ⊕_log y = −log(e^(−x) + e^(−y))

Lecture 5 Tropical Semirings Proof of Tropical Semiring Tropical Semiring (Definition): a 5-tuple (ℝ ∪ {∞}, min, +, ∞, 0) matching (K, ⊕, ⊗, 0̄, 1̄)
Addition:
Associativity: x ⊕ (y ⊕ z) = min(x, min(y, z)) = min(min(x, y), z) = (x ⊕ y) ⊕ z
Commutativity: x ⊕ y = min(x, y) = min(y, x) = y ⊕ x
Additive identity: x ⊕ 0̄ = min(x, ∞) = x

Lecture 5 Tropical Semirings Proof of Tropical Semiring Tropical Semiring (Definition): a 5-tuple (ℝ ∪ {∞}, min, +, ∞, 0) matching (K, ⊕, ⊗, 0̄, 1̄)
Multiplication:
Associativity: x ⊗ (y ⊗ z) = x + (y + z) = (x + y) + z = (x ⊗ y) ⊗ z
Multiplicative identity: x ⊗ 1̄ = x + 0 = x

Lecture 5 Tropical Semirings Proof of Tropical Semiring Tropical Semiring (Definition): a 5-tuple (ℝ ∪ {∞}, min, +, ∞, 0) matching (K, ⊕, ⊗, 0̄, 1̄)
Left distributivity of multiplication; WLOG suppose x ≤ y:
z ⊗ (x ⊕ y) = z + min(x, y) = z + x
(z ⊗ x) ⊕ (z ⊗ y) = min(z + x, z + y) = z + x = z ⊗ (x ⊕ y)
Right distributivity of multiplication:
(x ⊕ y) ⊗ z = min(x, y) + z = x + z
(x ⊗ z) ⊕ (y ⊗ z) = min(x + z, y + z) = x + z = (x ⊕ y) ⊗ z
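The tropical-semiring identities proved above are easy to spot-check numerically. A small sketch (helper names mine) over a grid of values, with ⊕ = min, ⊗ = +, 0̄ = ∞ and 1̄ = 0:

```python
import itertools
import math

INF = math.inf          # 0̄, the additive identity

def t_plus(a, b):       # ⊕ = min
    return min(a, b)

def t_times(a, b):      # ⊗ = +
    return a + b

for x, y, z in itertools.product([0.0, 1.5, 3.0, 7.0], repeat=3):
    assert t_plus(x, t_plus(y, z)) == t_plus(t_plus(x, y), z)   # ⊕ associative
    assert t_plus(x, y) == t_plus(y, x)                         # ⊕ commutative
    assert t_plus(x, INF) == x                                  # x ⊕ 0̄ = x
    assert t_times(x, 0.0) == x                                 # x ⊗ 1̄ = x
    # left and right distributivity of ⊗ over ⊕
    assert t_times(z, t_plus(x, y)) == t_plus(t_times(z, x), t_times(z, y))
    assert t_times(t_plus(x, y), z) == t_plus(t_times(x, z), t_times(y, z))
```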

Lecture 5 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers.

Lecture 6 Introduction to Semirings IV IIP-TL@NTU Lim Zhi Hao 2015

Lecture 6 Log Semirings Proof of Log Semiring Log Semiring (Definition): a 5-tuple (ℝ ∪ {∞}, ⊕_log, +, ∞, 0) matching (K, ⊕, ⊗, 0̄, 1̄)
Associativity of addition (x ⊕_log y = −log(e^(−x) + e^(−y))):
x ⊕_log (y ⊕_log z) = −log(e^(−x) + e^(−(y ⊕_log z)))
= −log(e^(−x) + e^(log(e^(−y) + e^(−z))))
= −log(e^(−x) + e^(−y) + e^(−z))
= −log(e^(log(e^(−x) + e^(−y))) + e^(−z))
= −log(e^(−(x ⊕_log y)) + e^(−z))
= (x ⊕_log y) ⊕_log z

Lecture 6 Log Semirings Proof of Log Semiring Log Semiring (Definition): a 5-tuple (ℝ ∪ {∞}, ⊕_log, +, ∞, 0) matching (K, ⊕, ⊗, 0̄, 1̄)
Addition:
Commutativity: x ⊕_log y = −log(e^(−x) + e^(−y)) = −log(e^(−y) + e^(−x)) = y ⊕_log x
Additive identity: x ⊕_log 0̄ = −log(e^(−x) + e^(−∞)) = −log(e^(−x)) = x
Multiplication:
Associativity: x ⊗ (y ⊗ z) = x + (y + z) = (x + y) + z = (x ⊗ y) ⊗ z
Multiplicative identity: x ⊗ 1̄ = x + 0 = x

Lecture 6 Log Semirings Proof of Log Semiring Log Semiring (Definition): a 5-tuple (ℝ ∪ {∞}, ⊕_log, +, ∞, 0) matching (K, ⊕, ⊗, 0̄, 1̄)
Left distributivity of multiplication:
z ⊗ (x ⊕_log y) = z + (−log(e^(−x) + e^(−y)))
= −log(e^(−z)) − log(e^(−x) + e^(−y))
= −log(e^(−z) e^(−x) + e^(−z) e^(−y))
= −log(e^(−(z+x)) + e^(−(z+y)))
= (z + x) ⊕_log (z + y)
= (z ⊗ x) ⊕_log (z ⊗ y)
The proof of the right distributive property is similar.
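These log-semiring identities can also be checked numerically. A sketch (function names mine), using the convention x ⊕_log y = −log(e^(−x) + e^(−y)) and factoring out the smaller operand for numerical stability:

```python
import itertools
import math

INF = math.inf          # 0̄, the additive identity

def log_plus(x, y):
    """x ⊕_log y = −log(e^(−x) + e^(−y)), computed stably."""
    if x == INF:        # ∞ absorbs nothing: it is the identity
        return y
    if y == INF:
        return x
    m = min(x, y)
    return m - math.log1p(math.exp(-abs(x - y)))

for x, y, z in itertools.product([0.5, 1.0, 2.0, 4.0], repeat=3):
    # associativity and commutativity of ⊕_log
    assert abs(log_plus(x, log_plus(y, z)) - log_plus(log_plus(x, y), z)) < 1e-12
    assert log_plus(x, y) == log_plus(y, x)
    assert log_plus(x, INF) == x                    # additive identity
    # left distributivity of ⊗ (= +) over ⊕_log
    assert abs((z + log_plus(x, y)) - log_plus(z + x, z + y)) < 1e-12
```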

Lecture 6 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers.

Lecture 7 Introduction to Semirings V IIP-TL@NTU Lim Zhi Hao 2015

Lecture 7 Special Cases of Semirings Special Cases of Semirings Recall that in some optimization steps for WFSTs, the addition and multiplication used are those defined by the semiring. Hence it is good to know some additional types of semirings, so as to simplify some of the computations (short-cuts):
Commutative Semirings
Idempotent Semirings
k-closed Semirings
Weakly Left Divisible Semirings
Zero-Sum Free Semirings

Lecture 7 Special Cases of Semirings Special Cases of Semirings
Commutative Semiring: a semiring is commutative if ⊗ is commutative as well. Tropical and Log are commutative semirings: the commutative property holds for classical addition.
Idempotent Semiring: a semiring is idempotent if x ⊕ x = x, ∀x ∈ K. Tropical and String are idempotent semirings:
Tropical: x ⊕ x = min(x, x) = x
String: x ⊕ x = x ∧ x = x

Lecture 7 Special Cases of Semirings Special Cases of Semirings
k-closed Semiring: ∀a ∈ K, let aⁿ = a ⊗ ⋯ ⊗ a (n times) and a⁰ = 1̄.
A semiring is k-closed if ⊕_{n=0}^{k} aⁿ = ⊕_{n=0}^{k+1} aⁿ
Or less compactly: 1̄ ⊕ a ⊕ ⋯ ⊕ a^k = 1̄ ⊕ a ⊕ ⋯ ⊕ a^k ⊕ a^(k+1)
Positive tropical semirings are 0-closed: 1̄ = 0 = min{0, a} = 1̄ ⊕ a, ∀a ∈ ℝ₊
String semirings are also 0-closed: 1̄ = ε = ε ∧ a = 1̄ ⊕ a, ∀a ∈ Σ*

Lecture 7 Special Cases of Semirings Special Cases of Semirings
Weakly Left Divisible Semiring: a semiring is weakly left divisible if, whenever (x ⊕ y) ≠ 0̄, ∃z ∈ K such that x = (x ⊕ y) ⊗ z.
Tropical and Log semirings are weakly left divisible:
Tropical (ℝ ∪ {∞}, min, +, ∞, 0): z = x − min(x, y)
Log (ℝ ∪ {∞}, ⊕_log, +, ∞, 0): z = log(1 + e^(x−y))

Lecture 7 Special Cases of Semirings Zero-Sum Free Semiring: a semiring is zero-sum free if x ⊕ y = 0̄ implies x = 0̄ and y = 0̄.
The probability semiring is zero-sum free. Proof: suppose not. Then ∃x, y ∈ [0, 1] such that x + y = 0 but x ≠ 0 and y ≠ 0. Then y = −x with x ∈ (0, 1], which implies y ∉ [0, 1]. Contradiction.
These special cases will come into play in the following section, where we discuss the operations on WFSTs.

Lecture 7 Special Cases of Semirings Summary of Special Cases When manipulating WFSTs, some of these properties are used

Lecture 7 Special Cases of Semirings Sneak Peek at Operations Most WFSTs are defined over a semiring; that is, the operations ⊕ and ⊗ are defined by the semiring. The next section will utilize some of the above-mentioned special properties.

Lecture 7 Special Cases of Semirings Summary/Recap Tropical Semirings? Log Semirings? Special Cases of Semirings.

Lecture 7 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers.

Lecture 8 Introduction to Basic Operations for Finite Automaton IIP-TL@NTU Lim Zhi Hao 2015

Lecture 8 Operations for FA Basic Operations of FA Basic Operations WFST over a semiring S = (K, ⊕, ⊗, 0̄, 1̄) (Definition):
T(x, y) = ⊕_{π ∈ P(I, x, y, F)} λ(p[π]) ⊗ w[π] ⊗ ρ(n[π])
Recall: ⊕_{π ∈ P(S,T)} w(π). The expression above is analogous to the definition of a WFST over a semiring. Figure: WFSA from Lecture 4. The extra λ(p[π]) and ρ(n[π]) are the weights of the initial and final states respectively.

Lecture 8 Operations for FA Basic Operations of FA Given a WFST over the tropical semiring (ℝ ∪ {∞}, min, +, ∞, 0), consider π₀ = (0, a, A, 1, 1), (1, c, B, 1, 3), x = ac, y = AB:
T(x, y) = ⊕_{π ∈ P(I, x, y, F)} λ(p[π]) ⊗ w[π] ⊗ ρ(n[π])
= λ(p[π₀]) ⊗ w[π₀] ⊗ ρ(n[π₀])
= λ(0) ⊗ (1 ⊗ 1) ⊗ ρ(3)
= 0 + 1 + 1 + 0.5 = 2.5
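The worked example's single accepting path can be replayed in code. A minimal sketch over the tropical semiring, with the initial and final weights λ(0) = 0 and ρ(3) = 0.5 read off the computation above:

```python
# One accepting path π₀, as (src, input, output, weight, dst) 5-tuples.
path = [(0, "a", "A", 1.0, 1), (1, "c", "B", 1.0, 3)]
lam = {0: 0.0}    # initial weight function λ
rho = {3: 0.5}    # final weight function ρ

# In the tropical semiring ⊗ is ordinary addition, so
# T(ac, AB) = λ(p[π₀]) ⊗ w[π₀] ⊗ ρ(n[π₀]) is a plain sum of weights.
w_path = sum(w for (_, _, _, w, _) in path)
T = lam[path[0][0]] + w_path + rho[path[-1][-1]]
assert T == 2.5
```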

Lecture 8 Operations for FA Basic Operations of FA Basic Operations
Kleene closure: modifies the automaton so that the set of symbol sequences or transductions is sequentially repeated zero or more times
Union: combines two automata in parallel
Concatenation: combines two automata in series

Lecture 8 Operations for FA Basic Operations of FA Basic Operations
Inverse: exchanges input and output symbols
Reversal: assigns to each pair of strings (x, y) what T assigns to its mirror images (x̃, ỹ)
Projection: converts a transducer into an acceptor by omitting input or output labels

Lecture 8 Operations for FA Basic Operations of FA An example of a Kleene Closure Definition: T*(x, y) = ⊕_{n=0}^{∞} Tⁿ(x, y)

Lecture 8 Operations for FA Basic Operations of FA An example of a Union Definition: (T_A ⊕ T_B)(x, y) = T_A(x, y) ⊕ T_B(x, y)

Lecture 8 Operations for FA Basic Operations of FA An example of a Concatenation Definition: (T_A ⊗ T_B)(x, y) = ⊕_{x = x_A x_B, y = y_A y_B} T_A(x_A, y_A) ⊗ T_B(x_B, y_B)

Lecture 8 Operations for FA Basic Operations of FA An example of an Inverse Definition: T⁻¹(x, y) = T(y, x)

Lecture 8 Operations for FA Basic Operations of FA An example of a Reversal Definition: T̃(x, y) = T(x̃, ỹ)

Lecture 8 Operations for FA Basic Operations of FA An example of a Projection Definition (projection onto the output side): ↓T(x) = ⊕_y T(y, x)

Lecture 8 Operations for FA Basic Operations of FA Summary/Recap What is meant by WFST over a Semiring. Kleene Closure Union Concatenation Inverse Reversal Projection

Lecture 8 Operations for FA 2 Main Operations of WFSTs:
Composition
Optimization: Epsilon-Removal, Determinization, Weight-Pushing, Minimization

Lecture 8 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers. Mehryar Mohri, Weighted Automata Algorithms.

Lecture 9 Introduction to 5 Basic Operations for WFST: Composition IIP-TL@NTU Lim Zhi Hao 2015

Lecture 9 Composition WFST Operations: Composition; Optimization (Epsilon-Removal, Determinization, Weight-Pushing, Minimization)

Lecture 9 Composition Rationale for Composition, an example T_A converts a sequence of letters so that they are all uppercase letters. T_B converts the sequence of uppercase letters to a sequence of specific words that match the letters. T = T_A ∘ T_B converts a sequence of letters into a sequence of specific words. The resulting transducer may be harder to construct from scratch, so a complicated transducer can be broken down into many simple transducers: composition simplifies the construction of complicated transducers.

Lecture 9 Composition Composition Combining two related transducers into one single transducer that represents the set of transductions obtained by cascading the original two transducers (Takaaki Hori, 2013). Given 2 WFSTs over the probability semiring, we obtain: [Figure]

Lecture 9 Composition Given 2 transducers T_A = (Σ, Δ, Q₁, E₁, I₁, F₁, λ₁, ρ₁) and T_B = (Δ, Γ, Q₂, E₂, I₂, F₂, λ₂, ρ₂):
(T_A ∘ T_B)(x, y) = ⊕_{z ∈ Δ*} T_A(x, z) ⊗ T_B(z, y)
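The formula can be checked by brute force when the transducers are small enough to write out as finite tables. A sketch over the probability semiring (⊕ = +, ⊗ = ×); the two tiny transducers and all their weights are made-up illustration values, not the figures from the slides:

```python
from collections import defaultdict

# Finite maps from (input string, output string) pairs to weights.
T_A = {("a", "x"): 0.5, ("a", "y"): 0.5}    # T_A: pairs over Σ* × Δ*
T_B = {("x", "u"): 0.4, ("y", "u"): 0.6}    # T_B: pairs over Δ* × Γ*

def compose(TA, TB):
    """(T_A ∘ T_B)(x, y) = ⊕_z T_A(x, z) ⊗ T_B(z, y), here + over ×."""
    T = defaultdict(float)
    for (x, z1), wa in TA.items():
        for (z2, y), wb in TB.items():
            if z1 == z2:                    # intermediate strings must match
                T[(x, y)] += wa * wb
    return dict(T)

T = compose(T_A, T_B)
# both intermediate strings x and y lead to output u:
assert abs(T[("a", "u")] - (0.5 * 0.4 + 0.5 * 0.6)) < 1e-12   # 0.5
```

The graph-based construction in the following slides computes the same quantity state by state instead of string by string.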

Lecture 9 Composition Composition I First iteration, consider (0_A, 0_B)

Lecture 9 Composition Composition II Second iteration, consider (1_A, 1_B)

Lecture 9 Composition Composition III Third iteration, consider (0_A, 1_B)

Lecture 9 Composition Composition IV Fourth iteration, consider (2_A, 1_B)

Lecture 9 Composition Composition V Fifth iteration, consider (3_A, 1_B)

Lecture 9 Composition Composition VI Sixth iteration, consider (3_A, 2_B)

Lecture 9 Composition Composition VII Seventh iteration, consider (3_A, 3_B)

Lecture 9 Composition Composition VIII We obtain the new transducer, T = T_A ∘ T_B, with an extra dead node (3, 2)

Lecture 9 Composition Composition IX Trimming away the dead node, we arrive at our previous result: (T_A ∘ T_B)(x, y) = (Σ, Γ, Q, E, I, F, λ, ρ), where Q ⊆ Q₁ × Q₂

Lecture 9 Composition Composition X Composition may produce incorrect results for non-idempotent semirings, as path weights may be counted more than once; hence the semiring used necessarily needs to be idempotent. ε-transitions create redundant paths during simple composition; the solution is to utilize an intermediate filter transducer to eliminate the redundant paths.

Lecture 9 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers. Mehryar Mohri, Weighted Automata Algorithms. Josef R. Novak, Weighted Finite-State Transducers: Important Algorithms. University of Tokyo, Minematsu Lab.

Lecture 10 Introduction to 5 Basic Operations for WFST: Epsilon Removal IIP-TL@NTU Lim Zhi Hao 2015

Lecture 10 Epsilon-Removal WFST Operations: Composition; Optimization (Epsilon-Removal, Determinization, Weight-Pushing, Minimization)

Lecture 10 Epsilon-Removal Introduction to Epsilon-Removal Removes ε-transitions from the input transducer and produces a new, ε-free transducer equivalent to the original. Figure: Before ε-Removal. Figure: After ε-Removal.

Lecture 10 Epsilon-Removal Rationale for Epsilon-Removal Various automata or transducer operations, such as the rational operations, generate ε-transitions. These transitions cause delay in the use of the resulting transducers, since the search requires reading sequences of ε's first. To make these weighted transducers more efficient to use, it may be preferable to remove all ε-transitions. Figure: Before ε-Removal. Figure: After ε-Removal.

Lecture 10 Epsilon-Removal Intuition of Algorithm Figure: Part of original FA. Consider the following paths:
0 → 1 → 2 → 4, requires input {a} only
0 → 1 → 2 → 5, requires input {b} only
0 → 1 → 3, requires input {c} only

Lecture 10 Epsilon-Removal Intuition of Algorithm Figure: ε-closure of FA. Remove redundant ε-transitions:
0 → 1 → 2 → 4 becomes 0 → 4
0 → 1 → 2 → 5 becomes 0 → 5
0 → 1 → 3 becomes 0 → 3
This step is to calculate the ε-closure of each state.

Lecture 10 Epsilon-Removal Description of Epsilon-Removal Algorithm Compute the weighted ε-closure for each state in the FST.
Weighted ε-closure (Definition):
C(p) = {(q, w) : P(p, ε, q) ≠ ∅, w = d_ε[p, q]}, ∀p ∈ Q
where d_ε[p, q] = ⊕_{π ∈ P(p, ε, q)} w(π)
Remove all ε-transitions, and ∀p ∈ Q and each (q, w) ∈ C(p), the transition set of p is augmented to:
1. {(p, a, b, d_ε[p, q] ⊗ w, r) : (q, a, b, w, r) ∈ E, (a, b) ≠ (ε, ε)}
2. If ∃(q, w) s.t. q ∈ F: ρ(p) ← ρ(p) ⊕ (d_ε[p, q] ⊗ ρ(q))

Lecture 10 Epsilon-Removal Epsilon-Removal Worked Example I Weighted ε-closure (Definition): C(p) = {(q, w) : P(p, ε, q) ≠ ∅, w = d_ε[p, q]}, where d_ε[p, q] = ⊕_{π ∈ P(p, ε, q)} w(π), ∀p ∈ Q
Figure: WFSA over a tropical semiring.
C(0) contains (1, w₁) and (2, w₂):
w₁ = d_ε[0, 1] = 1
w₂ = d_ε[0, 2] = 1 ⊗ 1 = 2
Thus C(0) = {(1, 1), (2, 2)}

Lecture 10 Epsilon-Removal Epsilon-Removal Worked Example II Augmenting all transitions associated with state 0:
1. {(p, a, b, d_ε[p, q] ⊗ w, r) : (q, a, b, w, r) ∈ E, (a, b) ≠ (ε, ε)}
2. If ∃(q, w) s.t. q ∈ F: ρ(p) ← ρ(p) ⊕ (d_ε[p, q] ⊗ ρ(q))
Figure: ε-closure of FA. Consider all transitions that begin at 1 and 2 and DO NOT contain any ε-transitions:
(2, a, 2, 4) → (0, a, d_ε[0, 2] ⊗ 2, 4)
(2, b, 3, 5) → (0, b, d_ε[0, 2] ⊗ 3, 5)
(1, c, 4, 3) → (0, c, d_ε[0, 1] ⊗ 4, 3)

Lecture 10 Epsilon-Removal Epsilon-Removal Worked Example III With d_ε[0, 2] = 2 and d_ε[0, 1] = 1, replacing the 3 transitions gives:
(2, a, 2, 4) → (0, a, 4, 4)
(2, b, 3, 5) → (0, b, 5, 5)
(1, c, 4, 3) → (0, c, 5, 3)
Figure: ε-closure of FA

Lecture 10 Epsilon-Removal Epsilon-Removal Worked Example IV Removing the states 1 and 2 yields: Figure: Before ε-Removal. Figure: After ε-Removal. The new transitions: (0, a, 4, 4), (0, b, 5, 5) and (0, c, 5, 3)
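The worked example can be replayed in code over the tropical semiring. The small automaton below is my reconstruction of the example's relevant part (ε-edges 0 → 1 and 1 → 2 of weight 1 each, matching the closure C(0) = {(1, 1), (2, 2)} computed above); function names are mine:

```python
# Transitions are (src, label, weight, dst); EPS marks an ε-transition.
EPS = None
E = [
    (0, EPS, 1.0, 1), (1, EPS, 1.0, 2),                     # ε-transitions
    (2, "a", 2.0, 4), (2, "b", 3.0, 5), (1, "c", 4.0, 3),   # labelled transitions
]

def eps_closure(p, edges):
    """Tropical ε-distances d_ε[p, q] for all q reachable from p via ε."""
    dist = {p: 0.0}
    frontier = [p]
    while frontier:
        s = frontier.pop()
        for (src, lab, w, dst) in edges:
            if src == s and lab is EPS:
                d = dist[s] + w                        # ⊗ = + (tropical)
                if d < dist.get(dst, float("inf")):    # ⊕ = min
                    dist[dst] = d
                    frontier.append(dst)
    return {q: d for q, d in dist.items() if q != p}

C0 = eps_closure(0, E)
assert C0 == {1: 1.0, 2: 2.0}

# Augment state 0 with the closure-weighted non-ε transitions:
new_edges = sorted((0, lab, C0[src] + w, dst)
                   for (src, lab, w, dst) in E
                   if lab is not EPS and src in C0)
assert new_edges == [(0, "a", 4.0, 4), (0, "b", 5.0, 5), (0, "c", 5.0, 3)]
```

The three augmented transitions reproduce the slide's result exactly.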

Lecture 10 Appendix References References I Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers. Mehryar Mohri, Speech Recognition With Weighted Finite-State Transducers. Mehryar Mohri, Weighted Automata Algorithms. Josef R. Novak, Weighted Finite-State Transducers: Important Algorithms. University of Tokyo, Minematsu Lab.

Lecture 11 Introduction to 5 Basic Operations for WFST: Determinization IIP-TL@NTU Lim Zhi Hao 2015

Lecture 11 Determinization

WFST's Operations
- Composition
- Optimization:
  - Epsilon-Removal
  - Determinization
  - Weight-Pushing
  - Minimization

Lecture 11 Determinization

Introduction to Determinization

NFA (informal): at certain states, we cannot determine which state the automaton will go to next; the set of destination states is an element of the powerset of states, rather than a single state.
For example: given the input ab, there are 2 possible states, 4 or 6.
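The nondeterminism can be made concrete by simulating the set of reachable states. The arcs below are hypothetical, chosen so that reading ab can end in state 4 or state 6, as the slide states:

```python
from collections import defaultdict

# Toy nondeterministic transition relation: (state, symbol) -> set of states.
delta = defaultdict(set, {(0, 'a'): {1, 2}, (1, 'b'): {4}, (2, 'b'): {6}})

def run(start, word):
    """Track the full set of states the NFA could be in."""
    states = {start}
    for ch in word:
        states = set().union(*(delta[(q, ch)] for q in states))
    return states

# run(0, 'ab') == {4, 6}: two possible states after reading 'ab'
```

Determinization replaces each such set of states with a single state of a new, deterministic machine.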

Lecture 11 Determinization

Rationale of Determinization

The availability of determinization is a big advantage when solving sequence recognition or transduction problems with FAs.
It produces a deterministic FA (DFA) that is equivalent to the original FA.
It accelerates run-time speed, as it yields the most efficient FA in terms of speed.

Lecture 11 Determinization

Description of the Algorithm

States are grouped based on the input symbols of their transitions.
Thus, each state of the resulting transducer has UNIQUE input labels and each input string has a UNIQUE path.
Multiple final states are compressed into a single, weighted final state; the final weight at each original final state is replaced with an epsilon transition to the new final state.

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

First, label the initial state with weight 1̄ (= 0 in the tropical semiring). Then start by focusing on all the transitions with the same input.
Compute the total weight for the input a:
w(e₁) = 1̄ ⊗ 1 = 0 + 1 = 1
w(e₂) = 1̄ ⊗ 2 = 0 + 2 = 2
w(e₁) ⊕ w(e₂) = min(1, 2) = 1
This is the new arc weight: w(e′₁) = 1

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

Next, compute the residual weights:
w_r(e_i) = w(e′_j)⁻¹ ⊗ w(p[e_i]) ⊗ w(e_i)
w_r(e₁) = −1 + 0 + 1 = 0
w_r(e₂) = −1 + 0 + 2 = 1
n[e₁] = 1 and n[e₂] = 2
Group using a new notation: ((1, 0), (2, 1)) is the new state.
Then the new transition is: {(0, 0), a, 1, ((1, 0), (2, 1))}

Lecture 11 Determinization Determinization: Worked Example Figure: Original WFSA over Tropical Semiring Figure: After first update

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

Now consider ((1, 0), (2, 1)). States 1 and 2 have residual weights 0 and 1 respectively.
Compute the total weight for the input b:
w(e₃) = 0 ⊗ 3 = 0 + 3 = 3
w(e₄) = 1 ⊗ 3 = 1 + 3 = 4
w(e₃) ⊕ w(e₄) = min(3, 4) = 3
This is the new arc weight: w(e′₂) = 3

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

Next, compute the residual weights:
w_r(e_i) = w(e′_j)⁻¹ ⊗ w(p[e_i]) ⊗ w(e_i)
w_r(e₃) = −3 + 0 + 3 = 0
w_r(e₄) = −3 + 1 + 3 = 1
n[e₃] = 1 and n[e₄] = 2
Group using the same notation: ((1, 0), (2, 1)), the (same) new state.
Then the next transition is: {((1, 0), (2, 1)), b, 3, ((1, 0), (2, 1))}

Lecture 11 Determinization Determinization: Worked Example Figure: Original WFSA over Tropical Semiring Figure: After second update

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

Now consider ((1, 0), (2, 1)). States 1 and 2 have residual weights 0 and 1 respectively.
Compute the total weight for the input c:
w(e₅) = 0 ⊗ 5 = 0 + 5 = 5
w(e₅) ⊕ 0̄ = min(5, ∞) = 5
This is the new arc weight: w(e′₃) = 5

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

Next, compute the residual weights:
w_r(e_i) = w(e′_j)⁻¹ ⊗ w(p[e_i]) ⊗ w(e_i)
w_r(e₅) = −5 + 0 + 5 = 0
n[e₅] = 3
Group using a new notation: (3, 0), the new state.
Then the next transition is: {((1, 0), (2, 1)), c, 5, (3, 0)}

Lecture 11 Determinization Determinization: Worked Example Figure: Original WFSA over Tropical Semiring Figure: After third update

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

Now consider ((1, 0), (2, 1)). States 1 and 2 have residual weights 0 and 1 respectively.
Compute the total weight for the input d:
w(e₆) = 1 ⊗ 6 = 1 + 6 = 7
w(e₆) ⊕ 0̄ = min(7, ∞) = 7
This is the new arc weight: w(e′₄) = 7

Lecture 11 Determinization

Determinization: Worked Example

Figure: Original WFSA over the Tropical Semiring

Next, compute the residual weights:
w_r(e_i) = w(e′_j)⁻¹ ⊗ w(p[e_i]) ⊗ w(e_i)
w_r(e₆) = −7 + 1 + 6 = 0
n[e₆] = 3
Group using the same notation: (3, 0), the new state.
Then the next transition is: {((1, 0), (2, 1)), d, 7, (3, 0)}

Lecture 11 Determinization Determinization: Worked Example Figure: Original WFSA over Tropical Semiring Figure: After fourth update

Lecture 11 Determinization

Determinization: Worked Example

Figure: Determinized WFSA

Some issues with weighted determinization:
- It may not terminate for some weighted transducers.
- It may not finish for some inputs, even if they are valid.
- However, every unweighted FSA is determinizable.
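The subset construction traced above can be sketched in a few lines. This is a simplified sketch, assuming tropical weights and omitting final-weight handling; the arc list (src, label, weight, dst) is e₁–e₆ from the worked example, and a new state is a frozenset of (old state, residual) pairs, matching the ((1, 0), (2, 1)) notation:

```python
import math
from collections import defaultdict

# Weighted subset construction over the tropical semiring (min, +).
arcs = [(0, 'a', 1, 1), (0, 'a', 2, 2),   # e1, e2
        (1, 'b', 3, 1), (2, 'b', 3, 2),   # e3, e4
        (1, 'c', 5, 3), (2, 'd', 6, 3)]   # e5, e6

by_src = defaultdict(list)
for p, a, w, q in arcs:
    by_src[p].append((a, w, q))

start = frozenset({(0, 0.0)})             # initial residual is 1-bar = 0
agenda, seen, det_arcs = [start], {start}, []
while agenda:
    S = agenda.pop()
    by_label = defaultdict(dict)          # label -> {dest: best total weight}
    for p, r in S:
        for a, w, q in by_src[p]:
            t = r + w                     # residual (x) arc weight
            by_label[a][q] = min(t, by_label[a].get(q, math.inf))
    for a, best in sorted(by_label.items()):
        W = min(best.values())            # (+)-sum over arcs = new arc weight
        dest = frozenset((q, w - W) for q, w in best.items())   # new residuals
        det_arcs.append((S, a, W, dest))
        if dest not in seen:
            seen.add(dest)
            agenda.append(dest)
```

Running this produces exactly the four arcs of the determinized WFSA above: a/1 into ((1, 0), (2, 1)), the b/3 self-loop, and c/5 and d/7 into (3, 0).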

Lecture 11 Appendix References

References I
Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers.
Mehryar Mohri, Weighted Automata Algorithms.

Lecture 12 Introduction to 5 Basic Operations for WFST: Weight-Pushing IIP-TL@NTU Lim Zhi Hao 2015

Lecture 12 Weight-Pushing

WFST's Operations
- Composition
- Optimization:
  - Epsilon-Removal
  - Determinization
  - Weight-Pushing
  - Minimization

Lecture 12 Weight-Pushing

Introduction to Weight-Pushing

Moves the weight distributed over the paths towards the initial state, without changing the function realized by the WFST/WFSA.
Example: given a WFST over a Tropical Semiring:
Figure: Original WFST, T
Figure: Weight-Pushed WFST in the Tropical Semiring, push(T)

Lecture 12 Weight-Pushing

Rationale of Weight-Pushing

Many sequence recognition or transduction problems reduce to finding the minimal-cost path.
With the weights pushed forward, unpromising paths can be eliminated at an early stage of the search, thereby reducing the total search time.
Figure: Weight-Pushed WFST in the Tropical Semiring, push(T)

Lecture 12 Weight-Pushing

Description of the Weight-Pushing Algorithm

Given a WFST T = (Σ, Δ, Q, E, I, F, λ, ρ) over a semiring S = (K, ⊕, ⊗, 0̄, 1̄):

1. Compute the potential of each state, given by
V[q] = ⊕_{π ∈ P(q, F)} w(π) ⊗ ρ(n[π])
where P(q, F) is the set of paths originating from state q and ending in a final state.
In the event of an infinite loop (cycles), employ the k-closed property.

2. Re-weight the transition weights, initial weights and final weights in the following way:
∀e ∈ E s.t. V[p[e]] ≠ 0̄: w[e] ← V[p[e]]⁻¹ ⊗ w[e] ⊗ V[n[e]]
∀q ∈ I: λ(q) ← λ(q) ⊗ V[q]
∀q ∈ F: ρ(q) ← V[q]⁻¹ ⊗ ρ(q)

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

First compute the potential of each state:
V[q] = ⊕_{π ∈ P(q, F)} w(π) ⊗ ρ(n[π])

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall the potential of a state: V[q] = ⊕_{π ∈ P(q, F)} w(π) ⊗ ρ(n[π])
Tropical Semiring: (ℝ ∪ {∞}, min, +, ∞, 0) as (K, ⊕, ⊗, 0̄, 1̄)

Step 1.1: State 0 has 2 paths going to the final state 3:
π₁ = e₁ e₃ and π₂ = e₂ e₄
w(π₁) ⊗ ρ(n[π₁]) = (1 ⊗ 1) ⊗ 0.5 = 1 + 1 + 0.5 = 2.5
w(π₂) ⊗ ρ(n[π₂]) = (0 ⊗ 3) ⊗ 0.5 = 0 + 3 + 0.5 = 3.5
Using the potential formula: V[0] = 2.5 ⊕ 3.5 = min(2.5, 3.5) = 2.5
Based on this semiring, what are we doing?

Lecture 12 Weight-Pushing

Recall: Shortest Path from S to T

Step 1, compute: 9 + 4 = 13; 1 + 6 + 5 = 12; 1 + 2 + 3 + 5 = 11
Step 2, then: min{13, 12, 11} = 11

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall the potential of a state: V[q] = ⊕_{π ∈ P(q, F)} w(π) ⊗ ρ(n[π])
Tropical Semiring: (ℝ ∪ {∞}, min, +, ∞, 0) as (K, ⊕, ⊗, 0̄, 1̄)

Step 1.2: State 1 has 1 path going to the final state 3: π₃ = e₃
w(π₃) ⊗ ρ(n[π₃]) = (1) ⊗ 0.5 = 1 + 0.5 = 1.5
Using the potential formula: V[1] = 1.5 ⊕ 0̄ = min(1.5, ∞) = 1.5

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall the potential of a state: V[q] = ⊕_{π ∈ P(q, F)} w(π) ⊗ ρ(n[π])
Tropical Semiring: (ℝ ∪ {∞}, min, +, ∞, 0) as (K, ⊕, ⊗, 0̄, 1̄)

Step 1.3: State 2 has 1 path going to the final state 3: π₄ = e₄
w(π₄) ⊗ ρ(n[π₄]) = (3) ⊗ 0.5 = 3 + 0.5 = 3.5
Using the potential formula: V[2] = 3.5 ⊕ 0̄ = min(3.5, ∞) = 3.5

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall the potential of a state: V[q] = ⊕_{π ∈ P(q, F)} w(π) ⊗ ρ(n[π])
Tropical Semiring: (ℝ ∪ {∞}, min, +, ∞, 0) as (K, ⊕, ⊗, 0̄, 1̄)

Step 1.4: State 3 is the final state; initialize V[3] = ρ(3) = 0.5.
Hence we obtain all the potentials in the transducer:
V[0] = 2.5, V[1] = 1.5, V[2] = 3.5, V[3] = 0.5
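In the tropical semiring the potentials are just shortest distances to the final state, so Step 1 can be checked with a small relaxation loop. A sketch, assuming the arc triples (source, weight, destination) for e₁–e₄ from the figure:

```python
import math

# V[q]: tropical shortest distance from q to final state 3, seeded with
# rho(3) = 0.5; (+)-sum = min and (x) = + in this semiring.
arcs = [(0, 1, 1), (0, 0, 2), (1, 1, 3), (2, 3, 3)]
V = [math.inf, math.inf, math.inf, 0.5]      # V[3] = rho(3)
for _ in range(len(V)):                      # Bellman-Ford-style relaxation
    for p, w, q in arcs:
        V[p] = min(V[p], w + V[q])
# V == [2.5, 1.5, 3.5, 0.5]
```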

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall: Re-weighting
1. ∀e ∈ E s.t. V[p[e]] ≠ 0̄: w[e] ← V[p[e]]⁻¹ ⊗ w[e] ⊗ V[n[e]]
2. ∀q ∈ I: λ(q) ← λ(q) ⊗ V[q]
3. ∀q ∈ F: ρ(q) ← V[q]⁻¹ ⊗ ρ(q)
4. Tropical Semiring: (ℝ ∪ {∞}, min, +, ∞, 0) as (K, ⊕, ⊗, 0̄, 1̄)
V[0] = 2.5, V[1] = 1.5, V[2] = 3.5, V[3] = 0.5

Step 2.1: Using (2) and (3), re-weight the initial and final states 0 and 3:
λ(0) ← λ(0) ⊗ V[0] = 0 + 2.5 = 2.5
ρ(3) ← V[3]⁻¹ ⊗ ρ(3) = V[3]⁻¹ ⊗ V[3] = 1̄ = 0

Lecture 12 Weight-Pushing Worked Example (T over Tropical Semiring) Figure: Before Update Figure: After Update

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall re-weighting rule (1): ∀e ∈ E s.t. V[p[e]] ≠ 0̄, w[e] ← V[p[e]]⁻¹ ⊗ w[e] ⊗ V[n[e]]
V[0] = 2.5, V[1] = 1.5, V[2] = 3.5, V[3] = 0.5

Step 2.2: Using (1), re-weight the transition e₁ = (0, 1, 1), written as (source, weight, destination):
w[e₁] ← V[p[e₁]]⁻¹ ⊗ w[e₁] ⊗ V[n[e₁]] = V[0]⁻¹ ⊗ w[e₁] ⊗ V[1] = −2.5 + 1 + 1.5 = 0
w[e₁] ← 0

Lecture 12 Weight-Pushing Worked Example (T over Tropical Semiring) Figure: Before Update Figure: After Update

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall re-weighting rule (1): ∀e ∈ E s.t. V[p[e]] ≠ 0̄, w[e] ← V[p[e]]⁻¹ ⊗ w[e] ⊗ V[n[e]]
V[0] = 2.5, V[1] = 1.5, V[2] = 3.5, V[3] = 0.5

Step 2.3: Using (1), re-weight the transition e₂ = (0, 0, 2):
w[e₂] ← V[p[e₂]]⁻¹ ⊗ w[e₂] ⊗ V[n[e₂]] = V[0]⁻¹ ⊗ w[e₂] ⊗ V[2] = −2.5 + 0 + 3.5 = 1
w[e₂] ← 1

Lecture 12 Weight-Pushing Worked Example (T over Tropical Semiring) Figure: Before Update Figure: After Update

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall re-weighting rule (1): ∀e ∈ E s.t. V[p[e]] ≠ 0̄, w[e] ← V[p[e]]⁻¹ ⊗ w[e] ⊗ V[n[e]]
V[0] = 2.5, V[1] = 1.5, V[2] = 3.5, V[3] = 0.5

Step 2.4: Using (1), re-weight the transition e₃ = (1, 1, 3):
w[e₃] ← V[p[e₃]]⁻¹ ⊗ w[e₃] ⊗ V[n[e₃]] = V[1]⁻¹ ⊗ w[e₃] ⊗ V[3] = −1.5 + 1 + 0.5 = 0
w[e₃] ← 0

Lecture 12 Weight-Pushing Worked Example (T over Tropical Semiring) Figure: Before Update Figure: After Update

Lecture 12 Weight-Pushing

Worked Example (T over the Tropical Semiring)

Recall re-weighting rule (1): ∀e ∈ E s.t. V[p[e]] ≠ 0̄, w[e] ← V[p[e]]⁻¹ ⊗ w[e] ⊗ V[n[e]]
V[0] = 2.5, V[1] = 1.5, V[2] = 3.5, V[3] = 0.5

Step 2.5: Using (1), re-weight the transition e₄ = (2, 3, 3):
w[e₄] ← V[p[e₄]]⁻¹ ⊗ w[e₄] ⊗ V[n[e₄]] = V[2]⁻¹ ⊗ w[e₄] ⊗ V[3] = −3.5 + 3 + 0.5 = 0
w[e₄] ← 0

Lecture 12 Weight-Pushing Worked Example (T over Tropical Semiring) Figure: Before Update Figure: After Update

Lecture 12 Weight-Pushing Worked Example (T over Tropical Semiring) Figure: Before Weight-Pushing, T Figure: After Weight-Pushing, push(t ) over a Tropical Semiring
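The whole re-weighting pass above can be verified in a few lines. A sketch assuming tropical weights (⊗ = +, inverse = negation); `lam0` and `rho3` are illustrative names for λ(0) and ρ(3):

```python
# Verifying Step 2 with the potentials from Step 1.
V = [2.5, 1.5, 3.5, 0.5]
arcs = [(0, 1, 1), (0, 0, 2), (1, 1, 3), (2, 3, 3)]   # e1..e4: (src, weight, dst)
pushed = [(p, -V[p] + w + V[q], q) for (p, w, q) in arcs]   # V[p]^-1 (x) w (x) V[q]
lam0 = 0 + V[0]       # lambda(0) (x) V[0]
rho3 = -V[3] + 0.5    # V[3]^-1 (x) rho(3)
# pushed == [(0, 0.0, 1), (0, 1.0, 2), (1, 0.0, 3), (2, 0.0, 3)]
```

Every path still costs 2.5 in total, but all of the weight now sits on the initial weight and on the first divergent arc, which is exactly what makes early pruning effective.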

Lecture 12 Appendix References

References I
Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers.
Mehryar Mohri, Weighted Automata Algorithms.

Lecture 13 Introduction to 5 Basic Operations for WFST: Minimization IIP-TL@NTU Lim Zhi Hao 2015

Lecture 13 Minimization

WFST's Operations
- Composition
- Optimization:
  - Epsilon-Removal
  - Determinization
  - Weight-Pushing
  - Minimization

Lecture 13 Minimization

Introduction to Minimization

Minimization reduces the number of states of a DFA to the minimum. Hence, given a transducer T, a determinization operation is usually performed first, resulting in det(T).
First, push the weights and output labels towards the initial state of the WFST.
Then minimize the WFST using a classical minimization algorithm, treating the triplet input:output/weight on each transition as one single label.
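The classical step can be sketched with Moore-style partition refinement, treating each input:output/weight triple as one opaque label, as suggested above. This is an illustrative sketch, not the slides' exact algorithm; the function name, label strings, and the tiny example automaton are all made up for the demonstration:

```python
# Partition refinement on a deterministic automaton. delta maps
# (state, label) -> next state; finals is the set of final states.

def minimize_blocks(states, delta, finals):
    labels = sorted({a for (_, a) in delta})
    block = {q: int(q in finals) for q in states}     # final / non-final split
    while True:
        # a state's signature: its block plus the block each label leads to
        sig = {q: (block[q],
                   tuple(block.get(delta.get((q, a))) for a in labels))
               for q in states}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values()), key=repr))}
        new_block = {q: ids[sig[q]] for q in states}
        if len(ids) == len(set(block.values())):      # no further refinement
            return new_block
        block = new_block

# States 1 and 2 carry the same triple-label to the same target,
# so they fall into the same block and can be merged.
states = {0, 1, 2, 3}
delta = {(0, 'a:x/1'): 1, (0, 'b:y/2'): 2, (1, 'c:z/0'): 3, (2, 'c:z/0'): 3}
blocks = minimize_blocks(states, delta, {3})
```

Hopcroft's algorithm does the same refinement more efficiently (O(n log n)); the quadratic version above is just easier to read.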

Lecture 13 Minimization Minimization Algorithm I For example, given a determinized WFST det(t ):

Lecture 13 Minimization

Minimization Algorithm II

Perform weight-pushing on det(T) to obtain push(det(T)).

Lecture 13 Minimization

Minimization Algorithm III

Finally, apply any classical minimization algorithm to push(det(T)), and obtain min(push(det(T))).

Lecture 13 Minimization

Rationale for Minimization

Hopcroft's algorithm efficiently minimizes DFAs.
If the WFST is acyclic, a more efficient algorithm, Revuz's algorithm, can be used instead.
By reducing the number of states, the number of search paths is greatly reduced, shortening search time.

Lecture 13 Minimization Recall Properties Required

Lecture 13 Applications

Some Applications of WFSTs
- Speech Recognition
- Optical Character Recognition
- Text Classification
- Machine Translation

Lecture 13 Appendix References

References I
Takaaki Hori, Speech Recognition Algorithms Using Weighted Finite-State Transducers.