Speech Recognition
Lecture 2: Finite Automata and Finite-State Transducers
Eugene Weinstein
Google, NYU Courant Institute
eugenew@cs.nyu.edu
Slide Credit: Mehryar Mohri

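Preliminaries
- Finite alphabet $\Sigma$, empty string $\epsilon$.
- Set of all strings over $\Sigma$: $\Sigma^*$.
- Length of a string $x$: $|x|$.
- Mirror image or reverse of a string $x = x_1 \cdots x_n$: $x^R = x_n \cdots x_1$.
- A language $L$: a subset of $\Sigma^*$.
page 2 Courant Institute, NYU
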
Rational Operations
Rational operations over languages:
- union, also denoted $L_1 + L_2$: $L_1 \cup L_2 = \{x : x \in L_1 \vee x \in L_2\}$.
- concatenation: $L_1 \cdot L_2 = \{x = uv : u \in L_1 \wedge v \in L_2\}$.
- closure: $L^* = \bigcup_{n=0}^{\infty} L^n$, where $L^n = \underbrace{L \cdots L}_{n}$ and $L^0 = \{\epsilon\}$.

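The three rational operations can be tried out on small finite languages. The following is a minimal sketch, assuming languages are represented as Python sets of strings; the function names are illustrative, and since $L^*$ is infinite in general, `closure_up_to` only computes the truncation $L^0 \cup \cdots \cup L^k$.

```python
def union(l1, l2):
    """L1 + L2: strings that belong to either language."""
    return l1 | l2


def concat(l1, l2):
    """L1 . L2: every string uv with u in L1 and v in L2."""
    return {u + v for u in l1 for v in l2}


def closure_up_to(l, max_power):
    """Finite truncation of L*: the union of L^0, ..., L^max_power."""
    result = {""}   # L^0 = {epsilon}
    power = {""}
    for _ in range(max_power):
        power = concat(power, l)   # L^(n+1) = L^n . L
        result |= power
    return result


L1 = {"a", "ab"}
L2 = {"b", ""}
print(sorted(union(L1, L2)))
print(sorted(concat(L1, L2)))
print(sorted(closure_up_to({"a"}, 3)))
```

Note that the closure of the empty language is $\{\epsilon\}$, which the truncation reproduces because $L^0$ is always included.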
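Regular or Rational Languages
Definition: the class of regular/rational languages over $\Sigma$ is the smallest set $\mathcal{L}$ containing the empty set and the singletons, and closed under the rational operations, i.e.,
- $\emptyset \in \mathcal{L}$;
- $\forall x \in \Sigma^*$, $\{x\} \in \mathcal{L}$;
- $\forall L_1, L_2 \in \mathcal{L}$: $L_1 \cup L_2 \in \mathcal{L}$, $L_1 \cdot L_2 \in \mathcal{L}$, $L_1^* \in \mathcal{L}$.
Examples of regular languages over $\Sigma = \{a, b, c\}$: $\Sigma^*$, $(a + b)^* c$, $a b^n c$, $(a + (b + c)^*)^* c$.
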
Finite Automata
Definition: a finite automaton $A$ over the alphabet $\Sigma$ is a 4-tuple $(Q, I, F, E)$ where $Q$ is a finite set of states, $I \subseteq Q$ a set of initial states, $F \subseteq Q$ a set of final states, and $E$ a multiset of transitions which are elements of $Q \times (\Sigma \cup \{\epsilon\}) \times Q$.
- A path $\pi$ in an automaton $A = (Q, I, F, E)$ is an element of $E^*$ with consecutive transitions.
- A path from a state in $I$ to a state in $F$ is called an accepting path.
- Language $L(A)$ accepted by $A$: the set of strings labeling accepting paths.

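The definition of $L(A)$ translates directly into a membership test. The following is a minimal sketch, assuming an automaton is stored as a Python tuple `(Q, I, F, E)` with `E` a list of `(source, label, destination)` triples and the empty string standing for $\epsilon$; these representation choices and the function names are illustrative.

```python
EPS = ""  # assumption: the empty string encodes an epsilon label


def eps_closure(states, E):
    """All states reachable from `states` through epsilon-transitions only."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for (src, label, dst) in E:
            if src == p and label == EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(A, x):
    """True iff some accepting path of A is labeled with the string x."""
    Q, I, F, E = A
    current = eps_closure(I, E)
    for symbol in x:
        step = {dst for (src, label, dst) in E
                if src in current and label == symbol}
        current = eps_closure(step, E)
    return bool(current & F)


# a b* : state 0 reads one "a", then state 1 loops on "b"
A1 = ({0, 1}, {0}, {1}, [(0, "a", 1), (1, "b", 1)])
# a* with an epsilon-transition from the initial to the final state
A2 = ({0, 1}, {0}, {1}, [(0, EPS, 1), (1, "a", 1)])
print(accepts(A1, "abb"), accepts(A1, "ba"), accepts(A2, ""))
```

The simulation tracks the set of states reachable after each symbol, so it never enumerates paths explicitly.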
Finite Automata - Example
[Figure: a finite automaton with states 0, 1, and 2.]

Finite Automata - Some Properties
- Trim: any state lies on some accepting path.
- Unambiguous: no two accepting paths have the same label.
- Deterministic: unique initial state, and any two transitions leaving the same state have different labels.
- Complete: at least one outgoing transition labeled with any alphabet element at any state.
- Acyclic: no path with a cycle.

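Several of these properties can be checked mechanically. The following is a minimal sketch, assuming an $\epsilon$-free automaton stored as a tuple `(Q, I, F, E)` with `E` a list of `(source, label, destination)` triples; the function names are illustrative.

```python
def is_deterministic(A):
    """Unique initial state, and no two transitions share (source, label)."""
    Q, I, F, E = A
    if len(I) != 1:
        return False
    seen = set()
    for (src, label, _dst) in E:
        if (src, label) in seen:
            return False
        seen.add((src, label))
    return True


def is_complete(A, alphabet):
    """Every state has an outgoing transition for every alphabet symbol."""
    Q, I, F, E = A
    defined = {(src, label) for (src, label, _dst) in E}
    return all((q, a) in defined for q in Q for a in alphabet)


def _reachable(starts, edges):
    """States reachable from `starts` along the (source, destination) pairs."""
    seen, stack = set(starts), list(starts)
    while stack:
        p = stack.pop()
        for (src, dst) in edges:
            if src == p and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def is_trim(A):
    """Every state is reachable from I and can reach F."""
    Q, I, F, E = A
    forward = _reachable(I, [(s, d) for (s, _a, d) in E])
    backward = _reachable(F, [(d, s) for (s, _a, d) in E])
    return set(Q) <= (forward & backward)


# State 2 is a dead end, so this automaton is deterministic but not trim.
A = ({0, 1, 2}, {0}, {1}, [(0, "a", 1), (1, "b", 1), (0, "b", 2)])
print(is_deterministic(A), is_complete(A, "ab"), is_trim(A))
```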
Normalized Automata
Definition: a finite automaton is normalized if
- it has a unique initial state with no incoming transition;
- it has a unique final state with no outgoing transition.
[Figure: automaton $A$ drawn between its initial state $i$ and its final state $f$.]

Elementary Normalized Automaton
Definition: the normalized automaton accepting an element $a \in \Sigma \cup \{\epsilon\}$ is constructed as follows.
[Figure: states 0 and 1 joined by a single transition labeled $a$.]

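Normalized Automata: Union
Construction: the union of two normalized automata is a normalized automaton constructed as follows.
[Figure: a new initial state $i$ with $\epsilon$-transitions to $i_1$ and $i_2$, the automata $A_1$ (from $i_1$ to $f_1$) and $A_2$ (from $i_2$ to $f_2$), and $\epsilon$-transitions from $f_1$ and $f_2$ to a new final state $f$.]
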
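Normalized Automata: Concatenation
Construction: the concatenation of two normalized automata is a normalized automaton constructed as follows.
[Figure: the automaton $A_1$ (from $i_1$ to $f_1$) followed by the automaton $A_2$ (from $i_2$ to $f_2$), joined by an $\epsilon$-transition from $f_1$ to $i_2$.]
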
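Normalized Automata: Closure
Construction: the closure of a normalized automaton is a normalized automaton constructed as follows.
[Figure: a new initial state $i_0$ and a new final state $f_0$ around the automaton $A$ (from $i$ to $f$), with $\epsilon$-transitions from $i_0$ to $i$, from $f$ to $f_0$, from $i_0$ to $f_0$, and from $f$ back to $i$.]
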
Normalized Automata - Properties
Construction properties:
- each rational operation requires creating at most two states.
- each state has at most two outgoing transitions.
- the complexity of each operation is linear.

Thompson's Construction
Proposition (Thompson, 1968): let $r$ be a regular expression over the alphabet $\Sigma$. Then, there exists a normalized automaton $A$ with at most $2|r|$ states representing $r$.
Proof:
- first, parse the regular expression into a tree.
- then, construct a normalized automaton starting from the elementary expressions and following the operations of the tree.

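The proof can be sketched in code by walking a parse tree bottom-up. The following is a minimal sketch, assuming the regular expression has already been parsed into nested tuples such as `("+", left, right)`, `(".", left, right)`, and `("*", operand)`, with plain strings as elementary expressions; this tree encoding, the empty string for $\epsilon$, and the function names are all illustrative assumptions. Each fragment is a triple `(initial, final, transitions)`.

```python
import itertools

EPS = ""                      # assumption: empty string encodes epsilon
_fresh = itertools.count()    # generator of fresh state numbers


def symbol(a):
    """Elementary normalized automaton: one transition labeled a."""
    i, f = next(_fresh), next(_fresh)
    return i, f, [(i, a, f)]


def union(n1, n2):
    """New initial and final states, linked to both operands by epsilon."""
    (i1, f1, e1), (i2, f2, e2) = n1, n2
    i, f = next(_fresh), next(_fresh)
    return i, f, e1 + e2 + [(i, EPS, i1), (i, EPS, i2),
                            (f1, EPS, f), (f2, EPS, f)]


def concat(n1, n2):
    """No new state: an epsilon-transition from f1 to i2."""
    (i1, f1, e1), (i2, f2, e2) = n1, n2
    return i1, f2, e1 + e2 + [(f1, EPS, i2)]


def star(n):
    """New initial and final states, plus skip and loop-back epsilons."""
    i, f, e = n
    i0, f0 = next(_fresh), next(_fresh)
    return i0, f0, e + [(i0, EPS, i), (f, EPS, f0),
                        (i0, EPS, f0), (f, EPS, i)]


def build(tree):
    """Follow the operations of the parse tree from the leaves up."""
    if isinstance(tree, str):
        return symbol(tree)
    op, *args = tree
    if op == "+":
        return union(build(args[0]), build(args[1]))
    if op == ".":
        return concat(build(args[0]), build(args[1]))
    if op == "*":
        return star(build(args[0]))
    raise ValueError(f"unknown operator: {op!r}")


# Parse tree for the expression  a b* + c
tree = ("+", (".", "a", ("*", "b")), "c")
i, f, edges = build(tree)
states = {s for (p, _label, q) in edges for s in (p, q)}
print(len(states), len(edges))
```

Every operation adds at most two states and never gives a state more than two outgoing transitions, which is what keeps the result within the $2|r|$ bound.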
Thompson's Construction - Example
[Figure: normalized automaton with states 0 to 9, $\epsilon$-transitions, and a transition labeled $c$ from state 7 to state 8.]
Normalized automaton for the regular expression $ab^* + c$.

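Regular Languages and Finite Automata
Theorem (Kleene, 1956): a language is regular iff it can be accepted by a finite automaton.
Proof: let $A = (Q, I, F, E)$ be a finite automaton, with states numbered $1, \ldots, |Q|$.
- for $(i, j, k) \in [1, |Q|] \times [1, |Q|] \times [0, |Q|]$, define $X_{ij}^k = \{ i \to q_1 \to q_2 \to \cdots \to q_n \to j : n \geq 0,\ q_m \leq k \}$, the paths from $i$ to $j$ whose intermediate states are all numbered at most $k$, identified with the strings labeling them.
- $X_{ij}^0$ is regular for all $(i, j)$ since $E$ is finite.
- by induction, $X_{ij}^k$ is regular for all $(i, j, k)$ since $X_{ij}^{k+1} = X_{ij}^k + X_{i,k+1}^k \, (X_{k+1,k+1}^k)^* \, X_{k+1,j}^k$.
- $L(A) = \bigcup_{i \in I, f \in F} X_{if}^{|Q|}$ is thus regular.
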
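As a worked instance of the recurrence $X_{ij}^{k+1} = X_{ij}^k + X_{i,k+1}^k (X_{k+1,k+1}^k)^* X_{k+1,j}^k$, consider a hypothetical two-state automaton (not one from the slides) with $I = \{1\}$, $F = \{2\}$, and transitions $(1, a, 1)$, $(1, b, 2)$, $(2, b, 2)$. The derivation below assumes the usual convention that $X_{ii}^0$ also contains $\epsilon$, for the empty path.

```latex
\begin{align*}
X_{11}^0 &= \epsilon + a, \quad X_{12}^0 = b, \quad
X_{21}^0 = \emptyset, \quad X_{22}^0 = \epsilon + b \\
X_{12}^1 &= X_{12}^0 + X_{11}^0 (X_{11}^0)^* X_{12}^0
          = b + (\epsilon + a)(\epsilon + a)^* b = a^* b \\
X_{22}^1 &= X_{22}^0 + X_{21}^0 (X_{11}^0)^* X_{12}^0
          = (\epsilon + b) + \emptyset = \epsilon + b \\
X_{12}^2 &= X_{12}^1 + X_{12}^1 (X_{22}^1)^* X_{22}^1
          = a^* b + a^* b \, (\epsilon + b)^* (\epsilon + b) = a^* b \, b^* \\
L(A) &= X_{12}^2 = a^* b \, b^*
\end{align*}
```

The result matches what can be read off the automaton directly: any number of $a$'s, then at least one $b$.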
Regular Languages and Finite Automata
Proof: the converse holds by Thompson's construction.
Notes:
- a more general theorem (Schützenberger, 1961) holds for weighted automata.
- not all languages are regular, e.g., $L = \{a^n b^n : n \in \mathbb{N}\}$ is not regular. Let $A$ be an automaton. If $L \subseteq L(A)$, then for large enough $n$, $a^n b^n$ corresponds to a path with a cycle: $a^n b^n = p u q$ and $p u^* q \subseteq L(A)$. Since $p u^k q \notin L$ for $k \neq 1$ when $u \neq \epsilon$, this implies $L(A) \neq L$.

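ε-removal
Any finite automaton $A = (Q, I, F, E)$ has an equivalent automaton with no ε-transitions.
- For any state $q \in Q$, let $\epsilon[q]$ denote the set of states reached from $q$ by paths labeled with $\epsilon$.
- Define $A' = (Q', I', F', E')$ by
  $Q' = \{\epsilon[q] : q \in Q\}$, $I' = \bigcup_{q \in I} \{\epsilon[q]\}$, $F' = \{\epsilon[q] : \epsilon[q] \cap F \neq \emptyset\}$,
  $E' = \{(\epsilon[p], a, \epsilon[q]) : \exists (p', a, q) \in E,\ a \in \Sigma,\ p' \in \epsilon[p]\}$.
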
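The construction above can be sketched as follows. This is a minimal sketch, assuming an automaton stored as a tuple `(Q, I, F, E)` with `E` a list of `(source, label, destination)` triples and the empty string standing for $\epsilon$; closures are taken to include the state itself (the empty path), and the result stores transitions in a set, which drops multiplicities. The function names are illustrative.

```python
EPS = ""  # assumption: the empty string encodes an epsilon label


def eps_closure(q, E):
    """epsilon[q]: q itself plus every state reached by epsilon-paths."""
    seen, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for (src, label, dst) in E:
            if src == p and label == EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)


def remove_epsilons(A):
    """Build A' whose states are the epsilon-closures of the states of A."""
    Q, I, F, E = A
    cl = {q: eps_closure(q, E) for q in Q}
    Q2 = set(cl.values())
    I2 = {cl[q] for q in I}
    F2 = {s for s in Q2 if s & set(F)}
    # (eps[p], a, eps[q]) whenever some p' in eps[p] has a transition (p', a, q)
    E2 = {(cl[p], a, cl[q])
          for p in Q
          for (p1, a, q) in E
          if a != EPS and p1 in cl[p]}
    return Q2, I2, F2, E2


# a* b* written with epsilon-transitions between three states
A = ({0, 1, 2}, {0}, {2},
     [(0, EPS, 1), (1, "a", 1), (1, EPS, 2), (2, "b", 2)])
Q2, I2, F2, E2 = remove_epsilons(A)
print(len(Q2), len(E2))
```

In the example every closure contains the final state 2, so all three states of the result are final, which matches the fact that $\epsilon$, $a$, and $ab$ are all accepted.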
ε-removal - Illustration
[Figure: an automaton with states 0, 1, 2, 3, and the equivalent ε-free automaton with states {0, 1}, {0, 2}, {0, 1, 3}, {0}.]

Determinization
Any automaton $A = (Q, I, F, E)$ without ε-transitions has an equivalent deterministic automaton.
Subset construction: $A' = (Q', I', F', E')$ with
- $Q' = 2^Q$.
- $I' = \{s \in Q' : s \cap I \neq \emptyset\}$.
- $F' = \{s \in Q' : s \cap F \neq \emptyset\}$.
- $E' = \{(s, a, s') : (q, a, q') \in E,\ q \in s,\ q' \in s'\}$.

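In practice the subset construction is usually run lazily. The following is a minimal sketch of that common variant, which is an assumption here rather than the slide's exact formulation: it starts from the single subset $I$, sends each subset $s$ on symbol $a$ to the set of all $a$-successors of its members, and only creates subsets that are actually reached. The automaton is a tuple `(Q, I, F, E)` with `E` a list of `(source, label, destination)` triples, and the function name is illustrative.

```python
def determinize(A, alphabet):
    """Reachable subset construction for an epsilon-free automaton."""
    Q, I, F, E = A
    start = frozenset(I)
    states, stack, delta = {start}, [start], {}
    while stack:
        s = stack.pop()
        for a in alphabet:
            # All destinations of a-labeled transitions leaving a member of s.
            t = frozenset(q2 for (q1, label, q2) in E
                          if q1 in s and label == a)
            if not t:
                continue  # no a-transition: the result may be incomplete
            delta[(s, a)] = t
            if t not in states:
                states.add(t)
                stack.append(t)
    finals = {s for s in states if s & set(F)}
    edges = [(s, a, t) for (s, a), t in delta.items()]
    return states, {start}, finals, edges


# Non-deterministic automaton for strings over {a, b} that end in "ab"
A = ({0, 1, 2}, {0}, {2},
     [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2)])
states, initial, finals, edges = determinize(A, "ab")
print(len(states), len(edges))
```

Only three of the eight possible subsets are reachable in this example, which is why the lazy variant is preferred over enumerating $2^Q$ up front.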
Determinization - Illustration
[Figure: a non-deterministic automaton with states 0, 1, 2, and the result of the subset construction with states {0}, {1}, {1, 2}, {2}, {0, 1}.]

Completion
Any deterministic automaton has an equivalent complete deterministic automaton.
Algorithm illustration:
[Figure: a deterministic automaton with states 0, 1, 2, 3, and its completion, which has the additional state 4.]

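A standard way to complete a deterministic automaton is to add one non-final sink state that absorbs every missing transition. The following is a minimal sketch of that idea, assuming an automaton stored as a tuple `(Q, I, F, E)` with `E` a list of `(source, label, destination)` triples; the function name and the default sink name `"sink"` are illustrative, and the sink name is assumed not to clash with an existing state.

```python
def complete(A, alphabet, sink="sink"):
    """Add a non-final sink state receiving all missing transitions."""
    Q, I, F, E = A
    defined = {(p, a) for (p, a, _q) in E}
    missing = [(q, a) for q in Q for a in alphabet if (q, a) not in defined]
    if not missing:
        return A  # already complete, nothing to add
    Q2 = set(Q) | {sink}
    E2 = (list(E)
          + [(q, a, sink) for (q, a) in missing]    # redirect missing moves
          + [(sink, a, sink) for a in alphabet])    # the sink loops forever
    return Q2, I, F, E2


# Deterministic automaton accepting only the string "a"
A = ({0, 1}, {0}, {1}, [(0, "a", 1)])
Q2, I2, F2, E2 = complete(A, "ab")
print(len(Q2), len(E2))
```

Because the sink is not final and cannot be left, the accepted language is unchanged.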
Complementation
Let $A = (Q, I, F, E)$ be a deterministic automaton, then there exists a deterministic automaton accepting $\overline{L(A)}$.
- By the previous property, we can assume $A$ complete.
- The automaton $B = (Q, I, Q \setminus F, E)$, obtained from $A$ by making non-final states final and final states non-final, accepts exactly $\overline{L(A)}$.

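The complement construction is a one-line change once the automaton is complete and deterministic. The following is a minimal sketch, assuming an automaton stored as a tuple `(Q, I, F, E)` with `E` a list of `(source, label, destination)` triples; `accepts_dfa` is a small illustrative helper used only to check the result.

```python
def complement(A):
    """Exchange final and non-final states of a complete deterministic A."""
    Q, I, F, E = A
    return Q, I, set(Q) - set(F), E


def accepts_dfa(A, x):
    """Run a deterministic automaton on x; a missing transition rejects."""
    Q, I, F, E = A
    delta = {(p, a): q for (p, a, q) in E}
    (state,) = I  # deterministic: exactly one initial state
    for a in x:
        if (state, a) not in delta:
            return False
        state = delta[(state, a)]
    return state in F


# Complete deterministic automaton for strings over {a, b} ending in "a"
A = ({0, 1}, {0}, {1},
     [(0, "a", 1), (0, "b", 0), (1, "a", 1), (1, "b", 0)])
B = complement(A)
print(accepts_dfa(A, "ba"), accepts_dfa(B, "ba"), accepts_dfa(B, ""))
```

Completeness matters here: if a transition were missing, a string rejected by running off the automaton would still be rejected after the swap.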
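Complementation - Illustration
[Figure: a complete deterministic automaton with states 0 to 4, and its complement with the same states, where final and non-final states are exchanged.]
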
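Finite-State Transducers
Definition: a finite-state transducer $T$ over the alphabets $\Sigma$ and $\Delta$ is a 4-tuple $(Q, I, F, E)$ where $Q$ is a finite set of states, $I \subseteq Q$ a set of initial states, $F \subseteq Q$ a set of final states, and $E$ a multiset of transitions which are elements of $Q \times (\Sigma \cup \{\epsilon\}) \times (\Delta \cup \{\epsilon\}) \times Q$.
$T$ defines a relation via the pair of input and output labels of its accepting paths,
$R(T) = \{(x, y) \in \Sigma^* \times \Delta^* : I \xrightarrow{x:y} F\}$.
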
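The relation $R(T)$ can be explored by computing all outputs paired with a given input. The following is a minimal sketch, assuming a transducer stored as a tuple `(Q, I, F, E)` with `E` a list of `(source, input, output, destination)` quadruples and the empty string standing for $\epsilon$; it also assumes there is no cycle of input-$\epsilon$ transitions that emits output, since the search would not terminate in that case. The function name is illustrative.

```python
EPS = ""  # assumption: the empty string encodes an epsilon label


def transduce(T, x):
    """Return {y : (x, y) in R(T)} by searching over accepting paths."""
    Q, I, F, E = T
    results = set()
    # A configuration is (state, input position, output emitted so far).
    stack = [(q, 0, "") for q in I]
    seen = set(stack)
    while stack:
        state, pos, out = stack.pop()
        if pos == len(x) and state in F:
            results.add(out)
        for (p, a, b, q) in E:
            if p != state:
                continue
            if a == EPS:
                nxt = (q, pos, out + b)        # consume no input
            elif pos < len(x) and x[pos] == a:
                nxt = (q, pos + 1, out + b)    # consume one input symbol
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return results


# One-state transducer: a maps to x, b maps to either epsilon or y
T = ({0}, {0}, {0}, [(0, "a", "x", 0), (0, "b", EPS, 0), (0, "b", "y", 0)])
print(sorted(transduce(T, "ab")))
```

A transducer relates one input to possibly many outputs, so the function returns a set rather than a single string.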
References
- Kleene, S. C. 1956. Representation of events in nerve nets and finite automata. Automata Studies.
- Lewis, Harry R. and Papadimitriou, Christos H. 1981. Elements of the Theory of Computation, Chapter 2. Prentice Hall.
- Nivat, Maurice. 1968. Transductions des langages de Chomsky. Annales 18, Institut Fourier.
- Schützenberger, Marcel-Paul. 1961. On the definition of a family of automata. Information and Control, 4.
- Thompson, K. 1968. Regular expression search algorithm. Comm. ACM, 11.