OpenFst: An Open-Source, Weighted Finite-State Transducer Library and its Applications to Speech and Language. Part I. Theory and Algorithms


Overview

Preliminaries
- Semirings
- Weighted Automata and Transducers

Algorithms
- Rational Operations
- Elementary Unary Operations
- Fundamental Binary Operations
- Optimization Algorithms
- Search Operations
- Fundamental String Algorithms

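Weight Sets: Semirings

A semiring $(\mathbb{K}, \oplus, \otimes, \bar{0}, \bar{1})$ is a ring that may lack negation.
- Sum ($\oplus$): computes the weight of a sequence (the sum of the weights of the paths labeled with that sequence).
- Product ($\otimes$): computes the weight of a path (the product of the weights of its constituent transitions).

Semiring | Set | $\oplus$ | $\otimes$ | $\bar{0}$ | $\bar{1}$
Boolean | $\{0, 1\}$ | $\vee$ | $\wedge$ | $0$ | $1$
Probability | $\mathbb{R}_+$ | $+$ | $\times$ | $0$ | $1$
Log | $\mathbb{R} \cup \{-\infty, +\infty\}$ | $\oplus_{\log}$ | $+$ | $+\infty$ | $0$
Tropical | $\mathbb{R} \cup \{-\infty, +\infty\}$ | $\min$ | $+$ | $+\infty$ | $0$
String | $\Sigma^* \cup \{\infty\}$ | $\wedge$ | $\cdot$ | $\infty$ | $\epsilon$

$\oplus_{\log}$ is defined by $x \oplus_{\log} y = -\log(e^{-x} + e^{-y})$, and $\wedge$ is the longest common prefix. The string semiring is a left semiring.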
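The semiring abstraction above can be made concrete with a small sketch. The `Semiring` container and the constant names below are illustrative assumptions, not OpenFst's API; only the probability, log, and tropical semirings are shown.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Semiring:
    """Hypothetical container for (plus, times, zero, one)."""
    plus: Callable[[float, float], float]   # semiring sum
    times: Callable[[float, float], float]  # semiring product
    zero: float                             # identity of plus
    one: float                              # identity of times


def log_plus(x: float, y: float) -> float:
    """x (+)_log y = -log(e^-x + e^-y), computed in a numerically stable way."""
    if x == math.inf:
        return y
    if y == math.inf:
        return x
    return min(x, y) - math.log1p(math.exp(-abs(x - y)))


PROBABILITY = Semiring(lambda x, y: x + y, lambda x, y: x * y, 0.0, 1.0)
LOG = Semiring(log_plus, lambda x, y: x + y, math.inf, 0.0)
TROPICAL = Semiring(min, lambda x, y: x + y, math.inf, 0.0)
```

Passing a semiring object to an algorithm, rather than hard-coding `min` and `+`, is what lets one routine serve several weight sets.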

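Weighted Automaton/Acceptor

[Figure: weighted acceptor $A$ with states 0 to 3 and arcs labeled a, b, c; several arc weights were lost in extraction.]

- Probability semiring $(\mathbb{R}_+, +, \times, 0, 1)$: $[\![A]\!](ab) = 14$, since $1 \times 1 \times 2 + 2 \times 3 \times 2 = 14$.
- Tropical semiring $(\mathbb{R}_+ \cup \{\infty\}, \min, +, \infty, 0)$: $[\![A]\!](ab) = 4$, since $\min(1 + 1 + 2,\ 3 + 2 + 2) = 4$.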
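As a rough illustration of how the weight of a string is obtained in two different semirings, the sketch below runs a forward pass over a toy acceptor. The arcs and weights in `ARCS` are assumptions chosen for the example, not values read from the figure, and the function name `string_weight` is hypothetical.

```python
def string_weight(arcs, start, final, x, plus, times, zero, one):
    """Weight of string x: semiring sum over accepting paths of the
    semiring product of arc weights and the final weight (no epsilon arcs)."""
    forward = {start: one}                 # state -> weight of the prefix read so far
    for symbol in x:
        nxt = {}
        for (p, label, w, q) in arcs:
            if label == symbol and p in forward:
                nxt[q] = plus(nxt.get(q, zero), times(forward[p], w))
        forward = nxt
    total = zero
    for q, w in forward.items():
        if q in final:
            total = plus(total, times(w, final[q]))
    return total


# Hypothetical toy acceptor: arcs are (source, label, weight, destination).
ARCS = [(0, "a", 1, 1), (0, "a", 2, 2), (1, "b", 1, 3),
        (2, "b", 3, 3), (1, "c", 3, 3), (2, "c", 5, 3)]
FINAL = {3: 2}

prob = string_weight(ARCS, 0, FINAL, "ab",
                     lambda a, b: a + b, lambda a, b: a * b, 0, 1)
trop = string_weight(ARCS, 0, FINAL, "ab",
                     min, lambda a, b: a + b, float("inf"), 0)
print(prob, trop)
```

With these assumed weights the two accepting paths for "ab" contribute 1·1·2 and 2·3·2 in the probability semiring, and 1+1+2 and 2+3+2 in the tropical semiring.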

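Weighted Transducer

[Figure: weighted transducer $T$ with states 0 to 3 and arcs a:ε, a:r, b:r, b:ε, c:s; several arc weights were lost in extraction.]

- Probability semiring $(\mathbb{R}_+, +, \times, 0, 1)$: $[\![T]\!](ab, r) = 16$, since $1 \times 2 \times 2 + 3 \times 2 \times 2 = 16$.
- Tropical semiring $(\mathbb{R}_+ \cup \{\infty\}, \min, +, \infty, 0)$: $[\![T]\!](ab, r) = 5$, since $\min(1 + 2 + 2,\ 3 + 2 + 2) = 5$.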

Transducers as Weighted Automata

- A transducer $T$ is functional iff for each $x$ there exists at most one $y$ such that $[\![T]\!](x, y) \neq \bar{0}$.
- An unweighted functional transducer can be seen as a weighted automaton over the string semiring $(\Sigma^* \cup \{\infty\}, \wedge, \cdot, \infty, \epsilon)$.
- A weighted functional transducer over the semiring $\mathbb{K}$ can be seen as a weighted automaton over the cartesian product of the string semiring and $\mathbb{K}$.

[Figure: the transducer $T$ with arcs a:ε, a:r, b:r, b:ε, c:s, and the corresponding automaton $A$ whose arc weights are pairs (output string, weight), e.g. a/(ε, ·) and a/(r, 3).]

Over the tropical semiring $(\mathbb{R}_+ \cup \{\infty\}, \min, +, \infty, 0)$: $[\![T]\!](ab, r) = 5$ and $[\![A]\!](ab) = (r, 5)$.

Definitions and Notation: Paths

Path $\pi$
- Origin or previous state: $p[\pi]$.
- Destination or next state: $n[\pi]$.
- Input label: $i[\pi]$.
- Output label: $o[\pi]$.
- Pictorially: $p[\pi] \xrightarrow{\ i[\pi]:o[\pi]\ } n[\pi]$.

Sets of paths
- $P(R, R')$: set of all paths from $R \subseteq Q$ to $R' \subseteq Q$.
- $P(R, x, R')$: paths in $P(R, R')$ with input label $x$.
- $P(R, x, y, R')$: paths in $P(R, x, R')$ with output label $y$.

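Definitions and Notation: Automata and Transducers

General definitions
- Alphabets: input $\Sigma$, output $\Delta$.
- States: $Q$, initial states $I$, final states $F$.
- Transitions: $E \subseteq Q \times (\Sigma \cup \{\epsilon\}) \times (\Delta \cup \{\epsilon\}) \times \mathbb{K} \times Q$.
- Weight functions: initial weight function $\lambda : I \to \mathbb{K}$, final weight function $\rho : F \to \mathbb{K}$.

Machines
- Automaton $A = (\Sigma, Q, I, F, E, \lambda, \rho)$ with, for all $x \in \Sigma^*$:
  $[\![A]\!](x) = \bigoplus_{\pi \in P(I, x, F)} \lambda(p[\pi]) \otimes w[\pi] \otimes \rho(n[\pi])$
- Transducer $T = (\Sigma, \Delta, Q, I, F, E, \lambda, \rho)$ with, for all $x \in \Sigma^*$, $y \in \Delta^*$:
  $[\![T]\!](x, y) = \bigoplus_{\pi \in P(I, x, y, F)} \lambda(p[\pi]) \otimes w[\pi] \otimes \rho(n[\pi])$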

Rational Operations: Algorithms

Definitions
- Sum: $[\![T_1 \oplus T_2]\!](x, y) = [\![T_1]\!](x, y) \oplus [\![T_2]\!](x, y)$
- Product: $[\![T_1 \otimes T_2]\!](x, y) = \bigoplus_{x = x_1 x_2,\ y = y_1 y_2} [\![T_1]\!](x_1, y_1) \otimes [\![T_2]\!](x_2, y_2)$
- Closure: $[\![T^*]\!](x, y) = \bigoplus_{n=0}^{\infty} [\![T^n]\!](x, y)$

Conditions on the closure operation
- Condition on $T$: e.g. weight of ε-cycles $= \bar{0}$ (regulated transducers), or
- Semiring condition: e.g. $\bar{1} \oplus x = \bar{1}$, as with the tropical semiring (locally closed semirings).

Complexity and implementation
- Complexity (linear): $O((|E_1| + |Q_1|) + (|E_2| + |Q_2|))$ or $O(|Q| + |E|)$.
- Lazy implementation.
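The sum and product definitions can be checked directly on finite weighted relations. The sketch below is an assumption-laden illustration: it represents a transducer's behaviour as a Python dict from (input, output) string pairs to weights rather than as a state machine, and the helper names `t_sum` and `t_product` are hypothetical.

```python
def t_sum(t1, t2, plus):
    """Sum: combine the weights of each (x, y) pair with the semiring sum."""
    out = dict(t1)
    for key, w in t2.items():
        out[key] = plus(out[key], w) if key in out else w
    return out


def t_product(t1, t2, plus, times):
    """Product: semiring sum over all splits x = x1 x2, y = y1 y2."""
    out = {}
    for (x1, y1), w1 in t1.items():
        for (x2, y2), w2 in t2.items():
            key = (x1 + x2, y1 + y2)
            w = times(w1, w2)
            out[key] = plus(out[key], w) if key in out else w
    return out


# Tropical semiring: plus = min, times = +. Weights are illustrative.
T1 = {("a", "x"): 1.0, ("ab", "x"): 0.5}
T2 = {("b", ""): 2.0, ("", ""): 0.25}
union = t_sum(T1, T2, min)
concat = t_product(T1, T2, min, lambda a, b: a + b)
```

Here the pair ("ab", "x") arises from two different splits in the product, so its weight is the minimum of 1.0 + 2.0 and 0.5 + 0.25.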

Sum (Union) Illustration

Definition: $[\![T_1 \oplus T_2]\!](x, y) = [\![T_1]\!](x, y) \oplus [\![T_2]\!](x, y)$

[Figure: two weighted automata with arcs labeled red, green, blue, yellow, and their union, which connects the operands with ε-transitions.]

Product (Concatenation) Illustration

Definition: $[\![T_1 \otimes T_2]\!](x, y) = \bigoplus_{x = x_1 x_2,\ y = y_1 y_2} [\![T_1]\!](x_1, y_1) \otimes [\![T_2]\!](x_2, y_2)$

[Figure: two weighted automata with arcs labeled red, green, blue, yellow, and their concatenation, joined by a weighted ε-transition.]

Closure Illustration

Definition: $[\![T^*]\!](x, y) = \bigoplus_{n=0}^{\infty} [\![T^n]\!](x, y)$

[Figure: a weighted automaton with arcs labeled green and blue, and its closure, built with additional ε-transitions.]

Some Elementary Unary Operations: Algorithms

Definitions

Operation | Definition and Notation | Lazy Implementation
Reversal | $[\![\widetilde{T}]\!](x, y) = [\![T]\!](\widetilde{x}, \widetilde{y})$ | No
Inversion | $[\![T^{-1}]\!](x, y) = [\![T]\!](y, x)$ | Yes
Projection | $[\![\Pi_1(T)]\!](x) = \bigoplus_y [\![T]\!](x, y)$ and $[\![\Pi_2(T)]\!](x) = \bigoplus_y [\![T]\!](y, x)$ | Yes

Complexity and implementation
- Complexity (linear): $O(|Q| + |E|)$.
- Lazy implementation (see table).
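Inversion and projection act arc by arc, which is why they are linear and easy to apply lazily. A minimal sketch, assuming arcs are stored as tuples (source, input, output, weight, destination); the function names and the weights are hypothetical.

```python
def invert(arcs):
    """Inversion: swap the input and output label on every arc."""
    return [(p, o, i, w, q) for (p, i, o, w, q) in arcs]


def project(arcs, output=False):
    """Projection: keep one label per arc, giving acceptor arcs (p, label, w, q)."""
    return [(p, o if output else i, w, q) for (p, i, o, w, q) in arcs]


# Illustrative transducer arcs; the weights are assumptions.
T = [(0, "red", "bird", 0.5, 1), (0, "green", "pig", 0.3, 1)]
print(invert(T))
print(project(T))
print(project(T, output=True))
```

Reversal is different in kind: it needs the whole machine (every arc is flipped and initial and final states are exchanged), which is consistent with the table listing no lazy implementation for it.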

Reversal Illustration

Definition: $[\![\widetilde{T}]\!](x, y) = [\![T]\!](\widetilde{x}, \widetilde{y})$

[Figure: a weighted automaton with arcs labeled red, green, blue, yellow, and its reversal, which adds a new initial state with ε-transitions.]

Inversion Illustration

Definition: $[\![T^{-1}]\!](x, y) = [\![T]\!](y, x)$

[Figure: a transducer with arcs red:bird, green:pig, blue:cat, yellow:dog, and its inverse with arcs bird:red, pig:green, cat:blue, dog:yellow; weights are unchanged.]

Projection Illustration

Definition: $[\![\Pi_1(T)]\!](x) = \bigoplus_y [\![T]\!](x, y)$

[Figure: a transducer with arcs red:bird, green:pig, blue:cat, yellow:dog, and its input projection with arcs red, green, blue, yellow; weights are unchanged.]

Some Fundamental Binary Operations: Algorithms

Definitions

Operation | Definition and Notation | Condition
Composition | $[\![T_1 \circ T_2]\!](x, y) = \bigoplus_z [\![T_1]\!](x, z) \otimes [\![T_2]\!](z, y)$ | $\mathbb{K}$ commutative
Intersection | $[\![A_1 \cap A_2]\!](x) = [\![A_1]\!](x) \otimes [\![A_2]\!](x)$ | $\mathbb{K}$ commutative
Difference | $[\![A_1 - A_2]\!](x) = [\![A_1 \cap \overline{A_2}]\!](x)$ | $A_2$ unweighted and deterministic

Complexity and implementation
- Complexity (quadratic): $O((|E_1| + |Q_1|)(|E_2| + |Q_2|))$.
- Path multiplicity in the presence of ε-transitions: ε-filter.
- Lazy implementation.

Composition Illustration

Definition: $[\![T_1 \circ T_2]\!](x, y) = \bigoplus_z [\![T_1]\!](x, z) \otimes [\![T_2]\!](z, y)$

[Figure: two weighted transducers over the labels a, b, c and their composition, whose states are pairs of operand states such as (0, 0) and (3, 2).]

Composition Pseudocode

Composition(T1, T2)
    S ← Q ← I1 × I2
    while S ≠ ∅ do
        (q1, q2) ← Head(S)
        Dequeue(S)
        if (q1, q2) ∈ I1 × I2 then
            I ← I ∪ {(q1, q2)}
            λ(q1, q2) ← λ1(q1) ⊗ λ2(q2)
        if (q1, q2) ∈ F1 × F2 then
            F ← F ∪ {(q1, q2)}
            ρ(q1, q2) ← ρ1(q1) ⊗ ρ2(q2)
        for each (e1, e2) ∈ E[q1] × E[q2] such that o[e1] = i[e2] do
            if (n[e1], n[e2]) ∉ Q then
                Q ← Q ∪ {(n[e1], n[e2])}
                Enqueue(S, (n[e1], n[e2]))
            E ← E ∪ {((q1, q2), i[e1], o[e2], w[e1] ⊗ w[e2], (n[e1], n[e2]))}
    return T = (Σ, Δ, Q, I, F, E, λ, ρ)
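The pseudocode translates fairly directly into a short program. The sketch below is a minimal ε-free version under stated assumptions: each transducer is a hypothetical triple (initial weights, final weights, arcs by state), the semiring product is passed in as `times`, and no ε-filter is applied.

```python
from collections import deque


def compose(t1, t2, times):
    """Epsilon-free composition of two weighted transducers.

    Each transducer is (initial, final, arcs): initial and final map a state
    to its weight, arcs maps a state to a list of (input, output, weight, next).
    """
    init1, final1, arcs1 = t1
    init2, final2, arcs2 = t2
    initial = {(q1, q2): times(w1, w2)
               for q1, w1 in init1.items() for q2, w2 in init2.items()}
    states = set(initial)          # Q: state pairs discovered so far
    queue = deque(initial)         # S: state pairs still to expand
    final, arcs = {}, {}
    while queue:
        q = queue.popleft()
        q1, q2 = q
        if q1 in final1 and q2 in final2:
            final[q] = times(final1[q1], final2[q2])
        for (i1, o1, w1, n1) in arcs1.get(q1, []):
            for (i2, o2, w2, n2) in arcs2.get(q2, []):
                if o1 != i2:       # output of T1 must match input of T2
                    continue
                n = (n1, n2)
                if n not in states:
                    states.add(n)
                    queue.append(n)
                arcs.setdefault(q, []).append((i1, o2, times(w1, w2), n))
    return initial, final, arcs


# Tropical semiring example: the semiring product is ordinary addition.
T1 = ({0: 0.0}, {1: 0.0}, {0: [("a", "b", 1.0, 1)]})
T2 = ({0: 0.0}, {1: 0.5}, {0: [("b", "c", 2.0, 1)]})
initial, final, arcs = compose(T1, T2, lambda x, y: x + y)
```

Because new state pairs are only created when reached from an initial pair, the result contains accessible states only; handling ε-labels correctly would additionally require the filter discussed on the following slides.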

Multiplicity & ε-transitions: Problem

[Figure: transducer $T_1$ (arcs a:a, b:ε, c:ε, d:d) and transducer $T_2$ (arcs a:d, ε:e, d:a), their augmented versions $\hat{T}_1$ and $\hat{T}_2$ with added ε:ε self-loops, and the composition $\hat{T}_1 \circ \hat{T}_2$ on state pairs from (0, 0) to (4, 3), which contains redundant ε-paths.]

Solution: Filter F for Composition

[Figure: the filter transducer $F$, with arcs labeled x:x and ε:ε variants, and the composition lattice from the previous slide in which the filter keeps a single ε-path between (0, 0) and (4, 3).]

Replace $T_1 \circ T_2$ by $\hat{T}_1 \circ F \circ \hat{T}_2$.

Intersection Illustration

Definition: $[\![A_1 \cap A_2]\!](x) = [\![A_1]\!](x) \otimes [\![A_2]\!](x)$

[Figure: two weighted automata with arcs labeled red, green, blue, yellow, and their intersection.]

Difference Illustration

Definition: $[\![A_1 - A_2]\!](x) = [\![A_1 \cap \overline{A_2}]\!](x)$

[Figure: a weighted automaton $A_1$ with arcs labeled red, green, blue, yellow, an unweighted automaton $A_2$, and their difference.]

Optimization Algorithms: Overview

Definitions

Operation | Description
Connection | Removes non-accessible/non-coaccessible states
ε-removal | Removes ε-transitions
Determinization | Creates equivalent deterministic machine
Pushing | Creates equivalent pushed/stochastic machine
Minimization | Creates equivalent minimal deterministic machine

Conditions: There are specific semiring conditions for the use of these algorithms. Not all weighted automata or transducers can be determinized using that algorithm.

Connection Algorithm

Definition
- Input: weighted transducer $T$.
- Output: weighted transducer $T' \equiv T$ with all states connected.

Description
1. Depth-first search of $T$ from $I$.
2. Mark accessible and coaccessible states.
3. Keep marked states and corresponding transitions.

Complexity and implementation
- Complexity (linear): $O(|Q| + |E|)$.
- No natural lazy implementation.
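The three steps can be sketched as follows. This is a simplified variant, not the slide's exact procedure: it uses two separate depth-first traversals (forward from the initial states, backward from the final states) instead of a single pass, and the function name `connect` and the arc layout are assumptions.

```python
def connect(arcs, initial, final):
    """Keep only states that are both accessible and coaccessible.

    arcs: list of (source, label, weight, destination).
    initial, final: iterables of states.
    """
    def reach(sources, edges):
        # Iterative depth-first search over an adjacency mapping.
        seen, stack = set(sources), list(sources)
        while stack:
            p = stack.pop()
            for q in edges.get(p, ()):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        return seen

    fwd, bwd = {}, {}
    for (p, _label, _w, q) in arcs:
        fwd.setdefault(p, []).append(q)
        bwd.setdefault(q, []).append(p)
    useful = reach(initial, fwd) & reach(final, bwd)
    kept = [a for a in arcs if a[0] in useful and a[3] in useful]
    return useful, kept


ARCS = [(0, "a", 1.0, 1), (1, "b", 1.0, 2),
        (0, "c", 1.0, 3),      # state 3 is accessible but not coaccessible
        (4, "d", 1.0, 2)]      # state 4 is coaccessible but not accessible
useful, kept = connect(ARCS, initial=[0], final=[2])
```

Both traversals visit each state and arc at most once, so the sketch stays within the linear bound quoted above.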

Connection Illustration

Definition: Removes non-accessible/non-coaccessible states

[Figure: a weighted automaton with arcs labeled red, green, blue, yellow that contains unconnected states, and the connected result.]

ε-removal Algorithm

Definition
- Input: weighted transducer $T$ with ε-transitions.
- Output: weighted transducer $T' \equiv T$ with no ε-transition.

Description (two stages)
1. Computation of ε-closures: for any state $p$, the states $q$ that can be reached from $p$ via ε-paths and the total weight of the ε-paths from $p$ to $q$:
   $C[p] = \{(q, w) : q \in \epsilon[p],\ d[p, q] = w\}$, with $d[p, q] = \bigoplus_{\pi \in P(p, \epsilon, q)} w[\pi]$.
   This is an all-pairs $\mathbb{K}$-shortest-distance problem in $T_\epsilon$ ($T$ reduced to its ε-transitions).
2. Removal of ε's: actual removal of ε-transitions and addition of new transitions.

ε-removal: Complexity and Implementation

All-pairs shortest-distance algorithm in $T_\epsilon$
- k-closed semirings (for $T_\epsilon$) or approximation: generic sparse shortest-distance algorithm [See references].
- Closed semirings: Floyd-Warshall or Gauss-Jordan elimination algorithm with decomposition of $T_\epsilon$ into strongly connected components [See references]; space complexity (quadratic): $O(|Q|^2 + |E|)$; time complexity (cubic): $O(|Q|^3 (T_\oplus + T_\otimes + T_*))$.

Complexity
- Acyclic $T_\epsilon$: $O(|Q|^2 + |Q||E|(T_\oplus + T_\otimes))$.
- General case (tropical semiring): $O(|Q||E| + |Q|^2 \log |Q|)$.

Lazy implementation: integration with on-the-fly weighted determinization.
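The two stages of ε-removal can be sketched for the tropical semiring. The code below assumes non-negative weights so that Dijkstra's algorithm can compute the ε-distances; the marker `EPS`, the function names, and the arc layout (source, label, weight, destination) are illustrative assumptions, and parallel arcs with equal labels are left unmerged.

```python
import heapq

EPS = None  # hypothetical marker for an epsilon label


def eps_closure(arcs, p):
    """Stage 1: tropical epsilon-distances d[p, q], using epsilon arcs only."""
    dist = {p: 0.0}
    heap = [(0.0, p)]
    while heap:
        d, s = heapq.heappop(heap)
        if d > dist[s]:
            continue                      # stale heap entry
        for (src, label, w, dst) in arcs:
            if src == s and label is EPS and d + w < dist.get(dst, float("inf")):
                dist[dst] = d + w
                heapq.heappush(heap, (d + w, dst))
    return dist


def remove_epsilons(arcs, final, states):
    """Stage 2: replace epsilon-paths by direct arcs and adjusted final weights."""
    new_arcs, new_final = [], {}
    for p in states:
        for q, d in eps_closure(arcs, p).items():
            for (src, label, w, dst) in arcs:
                if src == q and label is not EPS:
                    new_arcs.append((p, label, d + w, dst))
            if q in final:
                new_final[p] = min(new_final.get(p, float("inf")), d + final[q])
    return new_arcs, new_final


ARCS = [(0, EPS, 0.5, 1), (1, "a", 1.0, 2), (0, "a", 2.0, 2)]
new_arcs, new_final = remove_epsilons(ARCS, final={2: 0.0}, states=[0, 1, 2])
```

States that become unreachable after the removal (none in this toy example) would then be dropped by the connection algorithm.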

ε-removal Illustration

Definition: Removes ε-transitions

[Figure: three panels. A weighted transducer $T$ over the labels a and b with ε:ε transitions; the table of ε-distances between its states; and the result rmeps($T$), which has no ε-transitions.]

Determinization Algorithm

Definition
- Input: determinizable weighted automaton or transducer $M$.
- Output: $M' \equiv M$, subsequential or deterministic: $M'$ has a unique initial state and no two transitions leaving the same state share the same input label.

Description
1. Generalization of subset construction: weighted subsets $\{(q_1, w_1), \ldots, (q_n, w_n)\}$, where $w_i$ is the remainder weight at state $q_i$.
2. Weight of a transition in the result: $\oplus$-sum of the original transitions pre-$\otimes$-multiplied by remainders.

Conditions
- Semiring: weakly left divisible semirings.
- $M$ is determinizable means that the determinization algorithm applies to $M$.
- All unweighted automata are determinizable.
- All acyclic machines are determinizable.

- Not all weighted automata or transducers are determinizable.
- Characterization based on the twins property.

Complexity and implementation
- Complexity: exponential.
- Lazy implementation.

Determinization of Weighted Automata Illustration

Definition: Creates an equivalent deterministic weighted automaton

[Figure: a non-deterministic weighted automaton over the labels a and b, and its determinized equivalent, whose states are weighted subsets of the form {(state, remainder weight), ...}.]

Determinization of Weighted Transducers Illustration

Definition: Creates an equivalent deterministic weighted transducer

[Figure: a non-deterministic weighted transducer over the labels a, b, c, and its determinized equivalent, whose states are subsets of triples (state, remainder output string, remainder weight), e.g. {(3, b, ·), (4, eps, ·)}, and whose final weights are pairs such as (eps, .7).]

Determinization of Weighted Automata Pseudocode

Determinization(A)
    i' ← {(i, λ(i)) : i ∈ I}
    λ'(i') ← 1̄
    S ← {i'}
    while S ≠ ∅ do
        p' ← Head(S)
        Dequeue(S)
        for each x ∈ i[E[Q[p']]] do
            w' ← ⊕ {v ⊗ w : (p, v) ∈ p', (p, x, w, q) ∈ E}
            q' ← {(q, ⊕ {w'⁻¹ ⊗ (v ⊗ w) : (p, v) ∈ p', (p, x, w, q) ∈ E}) : q = n[e], i[e] = x, e ∈ E[Q[p']]}
            E' ← E' ∪ {(p', x, w', q')}
            if q' ∉ Q' then
                Q' ← Q' ∪ {q'}
                if Q[q'] ∩ F ≠ ∅ then
                    F' ← F' ∪ {q'}
                    ρ'(q') ← ⊕ {v ⊗ ρ(q) : (q, v) ∈ q', q ∈ F}
                Enqueue(S, q')
    return A'
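For the tropical semiring the pseudocode specializes nicely: the semiring sum is `min`, the product is `+`, and division by w' is subtraction. The sketch below assumes an ε-free acceptor with arcs (source, label, weight, destination) and integer weights; the function name is hypothetical, and the loop does not terminate on inputs that are not determinizable.

```python
from collections import deque


def determinize(arcs, initial, final):
    """Weighted determinization in the tropical semiring (min, +)."""
    start = frozenset(initial.items())           # weighted subset {(i, lambda(i))}
    new_arcs, new_final = [], {}
    seen, queue = {start}, deque([start])
    while queue:
        subset = queue.popleft()
        finals = [v + final[q] for q, v in subset if q in final]
        if finals:
            new_final[subset] = min(finals)      # rho'(q') = min of v + rho(q)
        labels = sorted({a[1] for a in arcs for p, _ in subset if a[0] == p})
        for x in labels:
            cand = {}                            # destination -> min of v + w
            for p, v in subset:
                for (src, label, w, q) in arcs:
                    if src == p and label == x:
                        cand[q] = min(cand.get(q, float("inf")), v + w)
            w_out = min(cand.values())           # w': weight of the new arc
            dest = frozenset((q, c - w_out) for q, c in cand.items())  # remainders
            new_arcs.append((subset, x, w_out, dest))
            if dest not in seen:
                seen.add(dest)
                queue.append(dest)
    return start, new_arcs, new_final


ARCS = [(0, "a", 1, 1), (0, "a", 2, 2), (1, "b", 1, 3), (2, "b", 3, 3)]
start, new_arcs, new_final = determinize(ARCS, initial={0: 0}, final={3: 0})
```

In this toy input the two a-arcs leaving state 0 merge into one arc of weight 1, and the difference 2 - 1 = 1 is carried as the remainder on state 2 until it is consumed by the b-arc.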

Pushing Algorithm

Definition
- Input: weighted automaton or transducer $M$.
- Output: $M' \equiv M$ such that:
  - the longest common prefix of all outgoing paths is minimal, or
  - the $\oplus$-sum of the weights of all outgoing transitions $= \bar{1}$, modulo the string/weight at the initial state.

Description (two stages)
1. Single-source shortest-distance computation: for each state $q$, $d[q] = \bigoplus_{\pi \in P(q, F)} w[\pi]$.
2. Reweighting: for each transition $e$ such that $d[p[e]] \neq \bar{0}$, $w[e] \leftarrow (d[p[e]])^{-1} (w[e] \otimes d[n[e]])$.

Pushing: Conditions and Complexity

Conditions (automata case)
- Weakly divisible semiring.
- Zero-sum free semiring or zero-sum free machine.

Complexity
- Automata case:
  - Acyclic case (linear): $O(|Q| + |E|(T_\oplus + T_\otimes))$.
  - General case (tropical semiring): $O(|Q| \log |Q| + |E|)$.
- Transducer case: $O((|P_{\max}| + 1)|E|)$.
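Weight pushing in the tropical semiring can be sketched in a few lines. The code below is an illustration under assumptions: distances to the final states include the final weight, they are computed with a plain Bellman-Ford style relaxation rather than the faster algorithms quoted above, and the function name and arc layout (source, label, weight, destination) are hypothetical.

```python
def push_weights(arcs, initial, final, states):
    """Push weights toward the initial state in the tropical semiring."""
    inf = float("inf")
    # d[q]: shortest distance from q to a final state, final weight included.
    d = {q: final.get(q, inf) for q in states}
    for _ in range(len(states)):                  # Bellman-Ford style passes
        for (p, _label, w, q) in arcs:
            if w + d[q] < d[p]:
                d[p] = w + d[q]
    # Reweighting: w[e] <- w[e] + d[n[e]] - d[p[e]].
    # Arcs touching states that cannot reach a final state are dropped.
    new_arcs = [(p, label, w + d[q] - d[p], q)
                for (p, label, w, q) in arcs if d[p] < inf and d[q] < inf]
    new_final = {q: rho - d[q] for q, rho in final.items()}
    new_initial = {i: lam + d[i] for i, lam in initial.items() if d[i] < inf}
    return new_initial, new_arcs, new_final


ARCS = [(0, "a", 1, 1), (0, "b", 3, 2), (1, "c", 2, 3), (2, "d", 1, 3)]
new_initial, new_arcs, new_final = push_weights(
    ARCS, initial={0: 0}, final={3: 1}, states=[0, 1, 2, 3])
```

After pushing, the minimum over the outgoing weights of every state is 0 (the tropical $\bar{1}$), and the total shortest distance 4 sits on the initial state, so every path keeps its original weight.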

Weight Pushing Illustration

Definition: Creates an equivalent pushed/stochastic machine

Example: Tropical semiring

[Figure: a weighted automaton with arcs labeled a to f, before and after weight pushing in the tropical semiring.]

Example: Log semiring

[Figure: the same automaton with arcs labeled a to f, before and after weight pushing in the log semiring.]

Label Pushing Illustration

Definition: Minimizes at each state the length of the common prefix of all outgoing paths at that state.

[Figure: a transducer with states 0 to 6 and arcs such as a:a, c:ε, b:d, f:d, before and after label pushing; output labels move toward the initial state.]

Minimization Algorithm

Definition
- Input: deterministic weighted automaton or transducer $M$.
- Output: deterministic $M' \equiv M$ with minimal number of states and transitions.

Description (two stages)
1. Canonical representation: use pushing or another algorithm to standardize the input automaton.
2. Automata minimization: encode pairs (label, weight) as labels and use a classical unweighted minimization algorithm.

Complexity
- Automata case:
  - Acyclic case (linear): $O(|Q| + |E|(T_\oplus + T_\otimes))$.
  - General case (tropical semiring): $O(|E| \log |Q|)$.
- Transducer case:
  - Acyclic case: $O(S + |Q| + |E|(|P_{\max}| + 1))$.
  - General case: $O(S + |Q| + |E|(\log |Q| + |P_{\max}|))$.

Minimization Illustration

Definition: Computes a minimal equivalent deterministic machine

[Figure: three panels. A deterministic weighted automaton with states 0 to 7 and arcs labeled a to e; the same automaton after weight pushing; and the minimized automaton with fewer states.]

Equivalence Algorithm

Definition
- Input: deterministic weighted automata $A_1$ and $A_2$.
- Output: true if $A_1 \equiv A_2$, false otherwise.

Description (two stages)
1. Canonical representation: use pushing or another algorithm to standardize the input automata.
2. Test: encode pairs (label, weight) as labels and use a classical algorithm for testing the equivalence of unweighted automata.

Complexity
- First stage: $O((|E_1| + |E_2|) + (|Q_1| + |Q_2|) \log(|Q_1| + |Q_2|))$ if using pushing in the tropical semiring.
- Second stage (quasi-linear): $O(m\,\alpha(m, n))$, where $m = |E_1| + |E_2|$, $n = |Q_1| + |Q_2|$, and $\alpha$ is the inverse of Ackermann's function.

Equivalence Illustration

Definition: $A_1 \equiv A_2$ iff $[\![A_1]\!](x) = [\![A_2]\!](x)$ for all $x \in \Sigma^*$

[Figure: two weighted automata with arcs labeled red, blue, yellow; one arc weight in the second automaton is shown as "?".]

Single-Source Shortest-Distance Algorithms

Generic single-source shortest-distance algorithm
- Definition: for each state $q$, $d[q] = \bigoplus_{\pi \in P(q, F)} w[\pi]$.
- Works with any queue discipline and any semiring that is k-closed for the graph.
- Coincides with classical algorithms in the specific case of the tropical semiring and the specific queue disciplines: shortest-first (Dijkstra), FIFO (Bellman-Ford), or topological sort order (Lawler).

N-shortest paths algorithm
- General N-shortest paths algorithm augmented with the computation of the potentials.
- On-the-fly weighted determinization for n-shortest strings.
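A sketch of a generic queue-based shortest-distance computation is given below. It computes distances from a source state to every other state (the definition above, distances to the final states, corresponds to running it on the reversed machine), uses a FIFO queue, and takes the semiring operations as arguments. The function name and arc layout are assumptions, and termination is only expected when the semiring is k-closed for the graph or the graph is acyclic.

```python
from collections import deque


def shortest_distance(arcs, source, plus, times, zero, one):
    """Generic single-source shortest distance with a FIFO queue.

    arcs: list of (source, label, weight, destination).
    d[q] accumulates the total distance; r[q] holds the weight added to
    d[q] since q was last taken from the queue.
    """
    out = {}
    for (p, _label, w, q) in arcs:
        out.setdefault(p, []).append((w, q))
    d = {source: one}
    r = {source: one}
    queue, queued = deque([source]), {source}
    while queue:
        q = queue.popleft()
        queued.discard(q)
        pending = r.get(q, zero)
        r[q] = zero
        for (w, n) in out.get(q, []):
            update = times(pending, w)
            new = plus(d.get(n, zero), update)
            if new != d.get(n, zero):
                d[n] = new
                r[n] = plus(r.get(n, zero), update)
                if n not in queued:
                    queue.append(n)
                    queued.add(n)
    d[source] = one
    return d


ARCS = [(0, "a", 1, 1), (0, "b", 4, 2), (1, "c", 2, 2), (2, "d", 1, 3)]
tropical = shortest_distance(ARCS, 0, min, lambda a, b: a + b, float("inf"), 0)

P_ARCS = [(0, "a", 0.5, 1), (0, "b", 0.25, 2), (1, "c", 0.5, 2), (2, "d", 1.0, 3)]
probability = shortest_distance(P_ARCS, 0, lambda a, b: a + b,
                                lambda a, b: a * b, 0.0, 1.0)
```

Swapping the FIFO queue for a priority queue or a topological order changes only the order in which states are relaxed, which is the sense in which the generic algorithm covers the classical ones listed above.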

N-Shortest Paths Illustration

Definition: Computes the N-shortest paths in the input machine

Condition: The semiring needs to have the path property: $a \oplus b \in \{a, b\}$ (e.g. tropical semiring)

[Figure: a weighted automaton with arcs labeled red, green, blue, yellow, and the automaton containing only its N shortest paths.]

Pruning Illustration

Definition: Removes every path whose weight is more than the shortest distance $\otimes$-multiplied by a specified threshold

Condition: The semiring needs to be commutative and have the path property: $a \oplus b \in \{a, b\}$ (e.g. tropical semiring)

[Figure: a weighted automaton with arcs labeled red, green, blue, yellow, and the pruned result.]

String Algorithms: Overview

How to implement some fundamental string algorithms using the operations previously described:
- Counting patterns (e.g. n-grams) in automata
- Pattern matching using automata
- Compiling regular expressions into automata

Benefits: generality, efficiency and flexibility

Counting from Weighted Automata

Expected count of $x$ in $A$: let $A$ be a weighted automaton over the probability semiring,

$c(x) = \sum_{u \in \Sigma^*} |u|_x \, [\![A]\!](u)$

where:
- $|u|_x$: number of occurrences of $x$ in $u$
- $[\![A]\!](u)$: weight associated to $u$ by A; this is $\Pr(u)$ if $A$ is pushed

Condition: The weight of any cycle in $A$ should be less than 1. This is the case if $A$ represents a probability distribution.
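The definition of the expected count can be evaluated by brute force when the automaton accepts only finitely many strings. The sketch below assumes the distribution is given directly as a dict from strings to probabilities and that overlapping occurrences are counted; both the representation and the helper names are illustrative.

```python
def occurrences(u, x):
    """Number of (possibly overlapping) occurrences of x in u."""
    return sum(1 for i in range(len(u) - len(x) + 1) if u[i:i + len(x)] == x)


def expected_count(dist, x):
    """c(x): sum over strings u of (occurrences of x in u) times the weight of u."""
    return sum(occurrences(u, x) * p for u, p in dist.items())


# Hypothetical finite distribution over strings; weights sum to 1.
DIST = {"abab": 0.5, "ab": 0.25, "ba": 0.25}
print(expected_count(DIST, "ab"))
print(expected_count(DIST, "a"))
```

This enumeration is only a reference point: it is exponential in general and impossible for cyclic automata, which is the motivation for computing the same quantity with a counting transducer.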

Counting by Composition with a Transducer

Counting transducer $T$ for a set of sequences $X$ with $\Sigma = \{a, b\}$:

[Figure: counting transducer $T$ with two states; each state has self-loops a:ε and b:ε, and an arc X:X leads from the initial state to the final state. Below it, a path u x v in $A$ and the corresponding path u:ε x:x v:ε in $A \circ T$.]

- To each successful path $\pi$ in $A$ and each occurrence of $x$ along $\pi$ corresponds a successful path with output $x$ in $A \circ T$.
- $c(x)$ is the sum of the weights of all the successful paths with output $x$ in $A \circ T$.

Theorem: Let $\Pi_2$ denote projection onto output labels. For all $x \in X$, $c(x) = [\![\Pi_2(A \circ T)]\!](x)$.

Counting with Transducers: Example

[Figure: four panels. An automaton $A$ over {a, b, c}; the counting transducer $T$ with arcs such as a:ε, b:ε, c:ε, a:a, b:b; the composition $A \circ T$; and the projection $\Pi_2(A \circ T)$.]

$[\![\Pi_2(A \circ T)]\!](ab) = 3 = c(ab)$

Local Grammar Algorithm [Mohri, 94]

Definition
- Input: a deterministic finite automaton $A$
- Output: a compact representation of $\det(\Sigma^* A)$

Description
- A generalization of [Aho-Corasick, 75]
- Failure transitions: labeled by $\phi$, non-consuming, traversed when no transition with the required label is present
- Default transitions: labeled by $\rho$, consuming, traversed when no transition with the required label is present, only present at the initial state

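Local Grammar Illustration

Pattern matching: find all occurrences of pattern $A$ in text $T$, with $T = abbaabaab$ and $A = ab^{+}a$.

[Figure: the automaton $A$ with states 0 to 3; the compact representation of $\det(\Sigma^* A)$ with a default transition $\rho$ at the initial state and failure transitions $\phi$; and the run of $T$ through $\det(\Sigma^* A)$.]

Run, as (text position, state) pairs: (0,0) a (1,1) b (2,2) b (3,2) a (4,3) a (5,1) b (6,2) a (7,3) a (8,1) b (9,2). State 3 is final, so matches end at positions 4 and 7.

Complexity: search time linear in $|T|$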
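The matching run can be reproduced with a table-driven scan. The sketch below assumes the pattern is an a, followed by one or more b, followed by an a, and uses a hand-built, fully expanded transition table for the deterministic matcher; the compact representation with failure and default transitions described on the previous slide is not modeled, and all names are hypothetical.

```python
# Hand-built transitions of a deterministic matcher for the assumed pattern
# "a b+ a" over the alphabet {a, b}; state 3 means a match ends here.
DELTA = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 3, "b": 2},
    3: {"a": 1, "b": 2},
}
FINAL = {3}


def match_end_positions(text):
    """Return the 1-based positions at which an occurrence of the pattern ends."""
    state, ends = 0, []
    for pos, symbol in enumerate(text, start=1):
        state = DELTA[state][symbol]      # one table lookup per text symbol
        if state in FINAL:
            ends.append(pos)
    return ends


print(match_end_positions("abbaabaab"))
```

Each text symbol triggers exactly one lookup, which is where the linear search time comes from; the failure-transition representation trades some of that simplicity for a much smaller table.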

Regular Expression Compilation Algorithms

Definition
- Input: a (weighted) regular expression $\alpha$
- Output: a (weighted) automaton representing $\alpha$

Description: Thompson construction
1. Build a parse tree for $\alpha$.
2. Walk the tree bottom-up and apply the relevant rational operation at each node.

Complexity and implementation
- Linear in the length of $\alpha$
- Admits lazy implementation
- Other constructions (Glushkov, Antimirov, Follow) can be obtained from Thompson using epsilon-removal and minimization

Regular Expression Compilation: Thompson

Regular expression: $\alpha = (ab + 3b(4ab)^*)$ [some coefficients and operators may have been lost in extraction]

[Figure: the Thompson automaton $A_T(\alpha)$, built from weighted ε-transitions and arcs labeled a and b.]

Regular Expression Compilation: Glushkov

Regular expression: $\alpha = (ab + 3b(4ab)^*)$ [some coefficients and operators may have been lost in extraction]

[Figure: the Glushkov automaton, with states 0 to 5 and weighted arcs labeled a and b.]

$A_G(\alpha) = \text{rmeps}(A_T(\alpha))$