General Algorithms for Testing the Ambiguity of Finite Automata

Cyril Allauzen (1,*), Mehryar Mohri (2,1), and Ashish Rastogi (1,*)

(1) Google Research, 76 Ninth Avenue, New York, NY 10011.
(2) Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012.

Abstract. This paper presents efficient algorithms for testing the finite, polynomial, and exponential ambiguity of finite automata with ε-transitions. It gives an algorithm for testing the exponential ambiguity of an automaton A in time $O(|A|_E^2)$, and finite or polynomial ambiguity in time $O(|A|_E^3)$, where $|A|_E$ denotes the number of transitions of A. These complexities significantly improve over the previous best complexities given for the same problem. Furthermore, the algorithms presented are simple and based on a general algorithm for the composition or intersection of automata. We also give an algorithm to determine in time $O(|A|_E^3)$ the degree of polynomial ambiguity of a polynomially ambiguous automaton A. Finally, we present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton.

1 Introduction

The question of the ambiguity of finite automata arises in a variety of contexts. In some cases, the application of an algorithm requires an input automaton to be finitely ambiguous; in others, the convergence of a bound or guarantee relies on finite ambiguity, or on the asymptotic rate of increase of ambiguity as a function of the string length. Thus, in all these cases, an algorithm is needed to test the ambiguity, either to determine if it is finite, or to estimate its asymptotic rate of increase.

The problem of testing ambiguity has been extensively analyzed in the past [9,7,13,3,6,15,12,14,16]. The problem of determining the degree of ambiguity of an automaton with finite ambiguity was shown by Chan and Ibarra to be PSPACE-complete [3]. However, testing finite ambiguity can be achieved in polynomial time using a characterization of exponential and polynomial ambiguity given by Ibarra and Ravikumar [6] and Weber and Seidl [15].
The most efficient algorithms for testing polynomial and exponential ambiguity, thereby testing finite ambiguity, were given by Weber and Seidl [14,16]. The algorithms they presented in [16] assume the input automaton to be ε-free, but they are extended by Weber to the case where the automaton has ε-transitions in [14]. In the presence of ε-transitions, the complexity of the algorithms given by Weber [14] is $O((|A|_E + |A|_Q^2)^2)$ for testing the exponential ambiguity of an automaton A and $O((|A|_E + |A|_Q^2)^3)$ for testing polynomial ambiguity, where $|A|_E$ stands for the number of transitions and $|A|_Q$ the number of states of A.

(*) Research done at the Courant Institute, partially supported by the New York State Office of Science Technology and Academic Research (NYSTAR).

This paper presents significantly more efficient algorithms for testing finite, polynomial, and exponential ambiguity for the general case of automata with ε-transitions. It gives an algorithm for testing the exponential ambiguity of an automaton A in time $O(|A|_E^2)$, and finite or polynomial ambiguity in time $O(|A|_E^3)$. The main idea behind our algorithms is to make use of the composition or intersection of finite automata with ε-transitions [11,10]. The ε-filter used in these algorithms crucially helps in the analysis and test of the ambiguity. The algorithms presented in this paper would not be valid and would lead to incorrect results without the use of the ε-filter. We also give an algorithm to determine in time $O(|A|_E^3)$ the degree of polynomial ambiguity of a polynomially ambiguous automaton A. Finally, we present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton.

The remainder of the paper is organized as follows. Section 2 presents general automata and ambiguity definitions. In Section 3, we give a brief description of existing characterizations for the ambiguity of automata and extend them to the case of automata with ε-transitions. In Section 4, we present our algorithms for testing finite, polynomial, and exponential ambiguity, and the proof of their correctness. Section 5 shows the relevance of the computation of the polynomial ambiguity to the approximation of the entropy of probabilistic automata.

2 Preliminaries

Definition 1. A finite automaton A is a 5-tuple $(\Sigma, Q, E, I, F)$ where $\Sigma$ is a finite alphabet; $Q$ is a finite set of states; $I \subseteq Q$ the set of initial states; $F \subseteq Q$ the set of final states; and $E \subseteq Q \times (\Sigma \cup \{\epsilon\}) \times Q$ a finite set of transitions, where ε denotes the empty string.
We denote by $|A|_Q$ the number of states, by $|A|_E$ the number of transitions, and by $|A| = |A|_E + |A|_Q$ the size of an automaton A. Given a state $q \in Q$, $E[q]$ denotes the set of transitions leaving q. For two subsets $R \subseteq Q$ and $R' \subseteq Q$, we denote by $P(R, x, R')$ the set of all paths from a state $q \in R$ to a state $q' \in R'$ labeled with $x \in \Sigma^*$. We also denote by $p[\pi]$ the origin state, by $n[\pi]$ the destination state, and by $i[\pi] \in \Sigma^*$ the label of a path π.

A string $x \in \Sigma^*$ is accepted by A if it labels an accepting path, that is a path from an initial state to a final state. A finite automaton A is said to be trim if all its states lie on some accepting path. It is said to be unambiguous if no string $x \in \Sigma^*$ labels two distinct accepting paths; otherwise, it is said to be ambiguous. The degree of ambiguity of a string x in A is denoted by $d(A, x)$ and defined as the number of accepting paths in A labeled by x. Note that if A contains an ε-cycle, there exists $x \in \Sigma^*$ such that $d(A, x) = \infty$. Using a depth-first search of A restricted to ε-transitions, it can be decided in linear time if A contains ε-cycles. Thus, in the following, we will assume, without loss of generality, that A is ε-cycle free.

The degree of ambiguity of A is defined as $d(A) = \sup_{x \in \Sigma^*} d(A, x)$. A is said to be finitely ambiguous if $d(A) < \infty$ and infinitely ambiguous if $d(A) = \infty$. It is said to be polynomially ambiguous if there exists a polynomial h in $\mathbb{N}[X]$ such that $d(A, x) \le h(|x|)$ for all $x \in \Sigma^*$. The minimal degree of such a polynomial is called the degree of polynomial ambiguity of A and is denoted by $dp(A)$. By definition, $dp(A) = 0$ iff A is finitely ambiguous. When A is infinitely ambiguous but not polynomially ambiguous, it is said to be exponentially ambiguous and $dp(A) = \infty$.

[Fig. 1. Illustration of the properties: (a) (EDA); (b) (IDA); and (c) (IDA_d).]

3 Characterization of infinite ambiguity

The characterization and test of finite, polynomial, and exponential ambiguity of finite automata without ε-transitions are based on the following three fundamental properties [6,15,14,16].

Definition 2. The properties (EDA), (IDA), and (IDA_d) for A are defined as follows.
(a) (EDA): there exists a state q with at least two distinct cycles labeled by some $v \in \Sigma^*$ (see Figure 1(a)) [6].
(b) (IDA): there exist two distinct states p and q with paths labeled with v from p to p, p to q, and q to q, for some $v \in \Sigma^*$ (see Figure 1(b)) [15,14,16].
(c) (IDA_d): there exist 2d states $p_1, \ldots, p_d, q_1, \ldots, q_d$ in A and 2d − 1 strings $v_1, \ldots, v_d$ and $u_2, \ldots, u_d$ in $\Sigma^*$ such that for all $1 \le i \le d$, $p_i \ne q_i$ and $P(p_i, v_i, p_i)$, $P(p_i, v_i, q_i)$, and $P(q_i, v_i, q_i)$ are non-empty, and, for all $2 \le i \le d$, $P(q_{i-1}, u_i, p_i)$ is non-empty (see Figure 1(c)) [15,14,16].

Observe that (EDA) implies (IDA). Assuming (EDA), let e and e' be the first transitions that differ in the two cycles at state q. Then, since Definition 1 disallows multiple transitions between the same two states with the same label, we must have $n[e] \ne n[e']$. Thus, (IDA) holds for the pair $(n[e], n[e'])$.
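To make the definitions of Section 2 and the properties above concrete, the following minimal Python sketch counts $d(A, x)$ for an ε-cycle free automaton by memoized recursion over (state, position) pairs. The tuple representation of transitions, the `EPS` marker, and the function name `degree_of_ambiguity` are assumptions of this sketch, not notation from the paper. On a small automaton satisfying (IDA) the count grows linearly with the string length, and on one satisfying (EDA) it grows exponentially.

```python
from functools import lru_cache

EPS = None  # label used for epsilon-transitions in this sketch (an assumption)


def degree_of_ambiguity(initial, final, transitions, x):
    """Number of accepting paths labeled x; assumes A has no epsilon-cycle."""
    out = {}
    for src, label, dst in transitions:
        out.setdefault(src, []).append((label, dst))

    @lru_cache(maxsize=None)
    def count(q, i):
        # Number of paths from q to a final state that read the suffix x[i:].
        total = 1 if (i == len(x) and q in final) else 0
        for label, dst in out.get(q, []):
            if label is EPS:
                total += count(dst, i)          # epsilon consumes no symbol
            elif i < len(x) and label == x[i]:
                total += count(dst, i + 1)
        return total

    return sum(count(q, 0) for q in initial)


# (IDA): loops labeled a at p = 0 and q = 1, plus a transition 0 -a-> 1.
ida = [(0, "a", 0), (0, "a", 1), (1, "a", 1)]
# (EDA): two distinct cycles labeled "ab" at state 0.
eda = [(0, "a", 1), (0, "a", 2), (1, "b", 0), (2, "b", 0)]

linear = [degree_of_ambiguity({0}, {1}, ida, "a" * n) for n in range(1, 6)]
exponential = [degree_of_ambiguity({0}, {0}, eda, "ab" * n) for n in range(1, 6)]
```

Here `linear` holds the counts for a, aa, ..., aaaaa in the (IDA) example and `exponential` the counts for ab, abab, ... in the (EDA) example.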
In the ε-free case, it was shown that a trim automaton A satisfies (IDA) iff A is infinitely ambiguous [15,16], that A satisfies (EDA) iff A is exponentially ambiguous [6], and that A satisfies (IDA_d) iff $dp(A) \ge d$ [14,16]. In the following proposition, these characterizations are straightforwardly extended to the case of automata with ε-transitions.

[Fig. 2. ε-filter and ambiguity: (a) Finite automaton A; (b) A ∩ A without using the ε-filter, which incorrectly makes A appear as exponentially ambiguous; (c) A ∩ A using an ε-filter. Weber's processing of ε-transitions: (d) Finite automaton B; (e) ε-free automaton B' such that dp(B) = dp(B').]

Proposition 1. Let A be a trim ε-cycle free finite automaton.
(i) A is infinitely ambiguous iff A satisfies (IDA).
(ii) A is exponentially ambiguous iff A satisfies (EDA).
(iii) $dp(A) \ge d$ iff A satisfies (IDA_d).

Proof. The proof is by induction on the number of ε-transitions in A. If A does not have any ε-transition, then the proposition holds as shown in [15,16] for (i), [6] for (ii) and [16] for (iii). Assume now that A has n + 1 ε-transitions, $n \ge 0$, and that the statement of the proposition holds for all automata with n ε-transitions. Select an ε-transition $e_0$ in A, and let A' be the finite automaton obtained after application of ε-removal to A limited to transition $e_0$. A' is obtained by deleting $e_0$ from A and by adding a transition $(p[e_0], l[e], n[e])$ for every transition $e \in E[n[e_0]]$. It is clear that A and A' are equivalent and that there is a label-preserving bijection between the paths in A and A'. Thus, (a) A satisfies (IDA) (resp. (EDA), (IDA_d)) iff A' satisfies (IDA) (resp. (EDA), (IDA_d)) and (b) for all $x \in \Sigma^*$, $d(A, x) = d(A', x)$. By induction, Proposition 1 holds for A' and thus, it follows from (a) and (b) that Proposition 1 also holds for A.

These characterizations have been used in [14,16] to design algorithms for testing infinite, polynomial, and exponential ambiguity, and for computing the degree of polynomial ambiguity in the ε-free case.

Theorem 1 ([14,16]). Let A be a trim ε-free finite automaton.
1. It is decidable in time $O(|A|_E^3)$ whether A is infinitely ambiguous.
2. It is decidable in time $O(|A|_E^2)$ whether A is exponentially ambiguous.
3. The degree of polynomial ambiguity of A, $dp(A)$, can be computed in $O(|A|_E^3)$.
The first result of Theorem 1 has also been generalized by [14] to the case of automata with ε-transitions but with a significantly worse complexity.

Theorem 2 ([14]). Let A be a trim ε-cycle free finite automaton. It is decidable in time $O((|A|_E + |A|_Q^2)^3)$ whether A is infinitely ambiguous.

[Fig. 3. Example of finite automaton intersection. (a) Finite automaton A1 and (b) finite automaton A2. (c) Result of the intersection of A1 and A2.]

The algorithms designed for the ε-free case cannot be readily used for finite automata with ε-transitions since they would lead to incorrect results (see Figure 2(a)-(c)). Instead, [14] proposed a reduction to the ε-free case. First, [14] gave an algorithm to test if there exist two states p and q in A with two distinct ε-paths from p to q. If that is the case, then A is exponentially ambiguous (complexity $O(|A|_Q^4 + |A|_E)$). Otherwise, [14] defined from A an ε-free automaton A' over the alphabet $\Sigma \cup \{\#\}$ such that A is infinitely ambiguous iff A' is infinitely ambiguous, see Figure 2(d)-(e).³ However, the number of transitions of A' is $|A|_E + |A|_Q^2$. This explains why the complexity in the ε-transition case is significantly worse than in the ε-free case. The same approach can be used to test the exponential ambiguity of A in time $O((|A|_E + |A|_Q^2)^2)$ and to compute $dp(A)$ when A is polynomially ambiguous in $O((|A|_E + |A|_Q^2)^3)$.

³ Observe that A' is not the result of applying the classical ε-removal algorithm to A, since ε-removal does not preserve infinite ambiguity and would lead to an even larger automaton. Instead [14] used a more complex algorithm where ε-transitions are replaced by regular transitions labeled with a special symbol while preserving infinite ambiguity, $dp(A) = dp(A')$, even though A' is not equivalent to A. States in A' are pairs (q, i) with q a state in A and $i \in \{1, 2\}$. There is a transition from (p, 1) to (q, 2) labeled by # if q belongs to the ε-closure of p and from (p, 2) to (q, 1) labeled by $\sigma \in \Sigma$ if there was such a transition from p to q in A.

Note that we give tighter estimates of the complexity of the algorithms of [14,16], where the authors gave complexities using the loose inequality $|A|_E \le |\Sigma| \, |A|_Q^2$.

4 Algorithms

Our algorithms for testing ambiguity are based on a general algorithm for the composition or intersection of automata, which we briefly describe in the following section.

4.1 Intersection of finite automata

The intersection of finite automata is a special case of the general composition algorithm for weighted transducers [11,10]. States in the intersection $A_1 \cap A_2$ of two finite automata $A_1$ and $A_2$ are identified with pairs of a state of $A_1$ and a state of $A_2$. The following rule specifies how to compute a transition of $A_1 \cap A_2$ in the absence of ε-transitions from appropriate transitions of $A_1$ and $A_2$:

$$(q_1, a, q_1') \text{ and } (q_2, a, q_2') \implies ((q_1, q_2), a, (q_1', q_2')).$$

Figure 3 illustrates the algorithm. A state $(q_1, q_2)$ is initial (resp. final) when $q_1$ and $q_2$ are initial (resp. final). In the worst case, all transitions of $A_1$ leaving a state $q_1$ match all those of $A_2$ leaving state $q_2$, thus the space and time complexity of composition is quadratic: $O(|A_1| \, |A_2|)$, or $O(|A_1|_E \, |A_2|_E)$ when $A_1$ and $A_2$ are trim.

4.2 Epsilon-filtering

A straightforward generalization of the ε-free case would generate redundant ε-paths. This is a crucial issue in the more general case of the intersection of weighted automata over a non-idempotent semiring, since it would lead to an incorrect result. The weight of two matching ε-paths of the original automata would then be counted as many times as the number of redundant ε-paths generated in the result, instead of once. It is also a crucial problem in the unweighted case since redundant ε-paths can affect the test of infinite ambiguity, as we shall see in the next section. A critical component of the composition algorithm of [11,10] consists however of precisely coping with this problem using an epsilon-filtering mechanism.

Figure 4(c) illustrates the problem just mentioned. To match ε-paths leaving $q_1$ and those leaving $q_2$, a generalization of the ε-free intersection can make the following moves: (1) first move forward on an ε-transition of $q_1$, or even an ε-path, and remain at the same state $q_2$ in $A_2$, with the hope of later finding a transition whose label is some label $a \ne \epsilon$ matching a transition of $q_2$ with the same label; (2) proceed similarly by following an ε-transition or ε-path leaving $q_2$ while remaining at the same state $q_1$ in $A_1$; or, (3) match an ε-transition of $q_1$ with an ε-transition of $q_2$.
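The redundancy created by the three moves above can be observed on a very small example. The sketch below is a deliberately naive product construction that applies the ε-free matching rule of Section 4.1 and allows moves (1), (2), and (3) without any restriction; it is not the filtered algorithm of [11,10]. The representation (triples `(initial, final, transitions)`, `EPS = None`) and the function names are assumptions made for illustration.

```python
from functools import lru_cache

EPS = None  # epsilon label in this sketch (an assumption)


def naive_intersect(a1, a2):
    """Product of two automata with moves (1), (2), (3) all allowed (no filter)."""
    init1, final1, trans1 = a1
    init2, final2, trans2 = a2
    initial = {(p, q) for p in init1 for q in init2}
    seen, todo, transitions = set(initial), list(initial), []
    while todo:
        q1, q2 = todo.pop()
        out1 = [(l, r) for s, l, r in trans1 if s == q1]
        out2 = [(l, r) for s, l, r in trans2 if s == q2]
        # Equal labels: the epsilon-free rule, and move (3) when both are EPS.
        moves = [(l1, (r1, r2)) for l1, r1 in out1 for l2, r2 in out2 if l1 == l2]
        # Move (1): follow an epsilon-transition of q1, stay at q2.
        moves += [(EPS, (r1, q2)) for l1, r1 in out1 if l1 is EPS]
        # Move (2): follow an epsilon-transition of q2, stay at q1.
        moves += [(EPS, (q1, r2)) for l2, r2 in out2 if l2 is EPS]
        for label, dst in moves:
            transitions.append(((q1, q2), label, dst))
            if dst not in seen:
                seen.add(dst)
                todo.append(dst)
    final = {(p, q) for (p, q) in seen if p in final1 and q in final2}
    return initial, final, transitions


def count_accepting_paths(automaton):
    """Count accepting paths; assumes the automaton is acyclic."""
    initial, final, transitions = automaton
    out = {}
    for src, _, dst in transitions:
        out.setdefault(src, []).append(dst)

    @lru_cache(maxsize=None)
    def count(state):
        return (state in final) + sum(count(d) for d in out.get(state, []))

    return sum(count(s) for s in initial)


# A accepts only "a", along a single path 0 -eps-> 1 -a-> 2.
a = ({0}, {2}, [(0, EPS, 1), (1, "a", 2)])
redundant = count_accepting_paths(naive_intersect(a, a))
```

Although A has a single accepting path, the unfiltered product of A with itself contains three accepting paths, one for each way of interleaving the two copies of the ε-transition.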
Let us rename existing ε-labels of $A_1$ as $\epsilon_2$, and existing ε-labels of $A_2$ as $\epsilon_1$, and let us augment $A_1$ with a self-loop labeled with $\epsilon_1$ at all states and similarly, augment $A_2$ with a self-loop labeled with $\epsilon_2$ at all states, as illustrated by Figures 4(a) and (b). These self-loops correspond to remaining at the same state in that machine while consuming an ε-label of the other transition. The three moves just described now correspond to the matches (1) $(\epsilon_2 : \epsilon_2)$, (2) $(\epsilon_1 : \epsilon_1)$, and (3) $(\epsilon_2 : \epsilon_1)$. The grid of Figure 4(c) shows all the possible ε-paths between intersection states. We will denote by $\tilde{A}_1$ and $\tilde{A}_2$ the automata obtained after application of these changes.

For the result of intersection not to be redundant, between any two of these states, all but one path must be disallowed. There are many possible ways of selecting that path. One natural way is to select the shortest path with the diagonal transitions (ε-matching transitions) taken first. Figure 4(c) illustrates in boldface the path just described from state (0, 0) to state (1, 2). Remarkably, this filtering mechanism itself can be encoded as a finite-state transducer such as the transducer M of Figure 4(d). We write $(p, q) \preceq (r, s)$ to indicate that (r, s) can be reached from (p, q) in the grid.
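The effect of the filter can be sketched in Python as follows. Rather than building $\tilde{A}_1$, M, and $\tilde{A}_2$ explicitly, this sketch inlines a three-state filter into the product construction: state 0 allows every move, state 1 is entered after an $(\epsilon_1 : \epsilon_1)$ match and then only allows further $(\epsilon_1 : \epsilon_1)$ matches, state 2 plays the same role for $(\epsilon_2 : \epsilon_2)$, and any labeled match returns to state 0. The state numbering, the data representation, and the function names are assumptions of this sketch and may differ from Figure 4(d).

```python
from collections import deque
from functools import lru_cache

EPS = None  # epsilon label in this sketch (an assumption)


def split(transitions):
    """Index epsilon and labeled transitions by source state."""
    eps, reg = {}, {}
    for src, label, dst in transitions:
        (eps if label is EPS else reg).setdefault(src, []).append((label, dst))
    return eps, reg


def intersect_eps(a1, a2):
    """Intersection with an inlined three-state epsilon-filter.

    States of the result are triples (q1, f, q2), where f is the filter state.
    """
    init1, final1, trans1 = a1
    init2, final2, trans2 = a2
    eps_a1, reg_a1 = split(trans1)
    eps_a2, reg_a2 = split(trans2)

    initial = {(p, 0, q) for p in init1 for q in init2}
    seen, queue, transitions = set(initial), deque(initial), []

    def add(src, label, dst):
        transitions.append((src, label, dst))
        if dst not in seen:
            seen.add(dst)
            queue.append(dst)

    while queue:
        state = queue.popleft()
        q1, f, q2 = state
        for l1, r1 in reg_a1.get(q1, []):       # x:x, back to filter state 0
            for l2, r2 in reg_a2.get(q2, []):
                if l1 == l2:
                    add(state, l1, (r1, 0, r2))
        if f == 0:                              # eps2:eps1, both sides move
            for _, r1 in eps_a1.get(q1, []):
                for _, r2 in eps_a2.get(q2, []):
                    add(state, EPS, (r1, 0, r2))
        if f in (0, 1):                         # eps1:eps1, only A2 moves
            for _, r2 in eps_a2.get(q2, []):
                add(state, EPS, (q1, 1, r2))
        if f in (0, 2):                         # eps2:eps2, only A1 moves
            for _, r1 in eps_a1.get(q1, []):
                add(state, EPS, (r1, 2, q2))
    final = {s for s in seen if s[0] in final1 and s[2] in final2}
    return initial, final, transitions


def count_accepting_paths(automaton):
    """Count accepting paths; assumes the automaton is acyclic."""
    initial, final, transitions = automaton
    out = {}
    for src, _, dst in transitions:
        out.setdefault(src, []).append(dst)

    @lru_cache(maxsize=None)
    def count(state):
        return (state in final) + sum(count(d) for d in out.get(state, []))

    return sum(count(s) for s in initial)


# A accepts only "a", along a single path 0 -eps-> 1 -a-> 2.
a = ({0}, {2}, [(0, EPS, 1), (1, "a", 2)])
filtered = count_accepting_paths(intersect_eps(a, a))
```

With the filter, the product of A with itself has exactly one accepting path for the single pair of accepting paths of A. The states reached by a solo ε-move, (0, 1, 1) and (1, 2, 0), are dead ends and would disappear after trimming.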

[Fig. 4. Marking of automata, redundant paths and filter. (a) $\tilde{A}_1$: self-loop labeled with $\epsilon_1$ added at all states of $A_1$, regular εs renamed to $\epsilon_2$. (b) $\tilde{A}_2$: self-loop labeled with $\epsilon_2$ added at all states of $A_2$, regular εs renamed to $\epsilon_1$. (c) Redundant ε-paths: a straightforward generalization of the ε-free case could generate all the paths from (0, 0) to (2, 2) for example, even when composing just two simple transducers. (d) Filter transducer M allowing a unique ε-path.]

Proposition 2. Let M be the transducer of Figure 4(d). M allows a unique path between any two states (p, q) and (r, s), with $(p, q) \preceq (r, s)$.

Proof. The full proof of this proposition is given in [2].

Thus, to intersect two finite automata $A_1$ and $A_2$ with ε-transitions, it suffices to compute $\tilde{A}_1 \circ M \circ \tilde{A}_2$, using the ε-free rules of composition. States in the intersection are now identified with triplets made of a state of $A_1$, a state of M, and a state of $A_2$. A transition $(q_1, a_1, q_1')$ in $\tilde{A}_1$, a transition $(f, a_1, a_2, f')$ in M, and a transition $(q_2, a_2, q_2')$ in $\tilde{A}_2$ are combined to form the following transition in the intersection: $((q_1, f, q_2), a, (q_1', f', q_2'))$, with $a = \epsilon$ if $\{a_1, a_2\} \subseteq \{\epsilon_1, \epsilon_2\}$ and $a = a_1 = a_2$ otherwise. In the rest of the paper, we will assume that the result of intersection is trimmed after its computation, which can be done in linear time.

Theorem 3. Let $A_1$ and $A_2$ be two finite automata with ε-transitions. To each pair $(\pi_1, \pi_2)$ of accepting paths in $A_1$ and $A_2$ sharing the same input label $x \in \Sigma^*$ corresponds a unique accepting path π in $A_1 \cap A_2$ labeled with x.

Proof. This follows straightforwardly from Proposition 2.

4.3 Ambiguity Tests

We start with a test of the exponential ambiguity of A. The key is that the (EDA) property translates into a very simple property for $A^2 = A \cap A$.

Lemma 1. Let A be a trim ε-cycle free finite automaton. A satisfies (EDA) iff there exists a strongly connected component of $A^2 = A \cap A$ that contains two states of the form (p, p) and (q, q'), where p, q and q' are states of A with $q \ne q'$.

Proof. Assume that A satisfies (EDA). There exist a state p and a string v such that there are two distinct cycles $c_1$ and $c_2$ labeled by v at p. Let $e_1$ and $e_2$ be the first edges that differ in $c_1$ and $c_2$. We can then write $c_1 = \pi e_1 \pi_1$ and $c_2 = \pi e_2 \pi_2$. If $e_1$ and $e_2$ share the same label, let $\pi_1' = \pi e_1$, $\pi_2' = \pi e_2$, $\pi_1'' = \pi_1$ and $\pi_2'' = \pi_2$. If $e_1$ and $e_2$ do not share the same label, exactly one of them must be an ε-transition. By symmetry, we can assume without loss of generality that $e_1$ is the ε-transition. Let $\pi_1' = \pi e_1$, $\pi_2' = \pi$, $\pi_1'' = \pi_1$ and $\pi_2'' = e_2 \pi_2$. In both cases, let $q = n[\pi_1'] = p[\pi_1'']$ and $q' = n[\pi_2'] = p[\pi_2'']$. Observe that $q \ne q'$. Since $i[\pi_1'] = i[\pi_2']$, $\pi_1'$ and $\pi_2'$ are matched by intersection resulting in a path in $A^2$ from (p, p) to (q, q'). Similarly, since $i[\pi_1''] = i[\pi_2'']$, $\pi_1''$ and $\pi_2''$ are matched by intersection resulting in a path from (q, q') to (p, p). Thus, (p, p) and (q, q') are in the same strongly connected component of $A^2$.

Conversely, assume that there exist states p, q and q' in A such that $q \ne q'$ and that (p, p) and (q, q') are in the same strongly connected component of $A^2$. Let c be a cycle at (p, p) going through (q, q'); it has been obtained by matching two cycles $c_1$ and $c_2$. If $c_1$ were equal to $c_2$, intersection would match these two paths creating a path c' along which all the states would be of the form (r, r), and since A is trim this would contradict Theorem 3. Thus, $c_1$ and $c_2$ are distinct and (EDA) holds.

Observe that the use of the ε-filter in composition is crucial for Lemma 1 to hold (see Figure 2). The lemma leads to a straightforward algorithm for testing exponential ambiguity.

Theorem 4. Let A be a trim ε-cycle free finite automaton. It is decidable in time $O(|A|_E^2)$ whether A is exponentially ambiguous.

Proof. The algorithm proceeds as follows. We compute $A^2$ and, using a depth-first search of $A^2$, trim it and compute its strongly connected components. It follows from Lemma 1 that A is exponentially ambiguous iff there is a strongly connected component that contains two states of the form (p, p) and (q, q') with $q \ne q'$.
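The procedure just described can be sketched in Python as follows, assuming the input automaton is trim and ε-cycle free. The sketch builds $A^2$ with an inlined three-state ε-filter (filter state 0 allows all moves, 1 and 2 restrict to one-sided ε-moves), trims it, computes strongly connected components with a recursive version of Tarjan's algorithm, and looks for a component mixing a state with equal components and a state with distinct ones. The filter component of a product state is ignored when classifying it. All function names and the data representation are assumptions of this sketch.

```python
EPS = None  # epsilon label in this sketch (an assumption)


def square(initial, final, transitions):
    """Trimmed A^2 as a successor map over states (q1, f, q2)."""
    eps, reg = {}, {}
    for src, label, dst in transitions:
        (eps if label is EPS else reg).setdefault(src, []).append((label, dst))
    succ, todo = {}, [(p, 0, q) for p in initial for q in initial]
    while todo:
        s = todo.pop()
        if s in succ:
            continue
        q1, f, q2 = s
        nxt = [(r1, 0, r2) for a, r1 in reg.get(q1, [])
               for b, r2 in reg.get(q2, []) if a == b]
        if f == 0:          # both copies follow an epsilon-transition
            nxt += [(r1, 0, r2) for _, r1 in eps.get(q1, [])
                    for _, r2 in eps.get(q2, [])]
        if f in (0, 1):     # only the second copy moves
            nxt += [(q1, 1, r2) for _, r2 in eps.get(q2, [])]
        if f in (0, 2):     # only the first copy moves
            nxt += [(r1, 2, q2) for _, r1 in eps.get(q1, [])]
        succ[s] = nxt
        todo.extend(nxt)
    # Trim: keep only states from which a final state can be reached.
    pred = {}
    for s, targets in succ.items():
        for t in targets:
            pred.setdefault(t, []).append(s)
    keep = {s for s in succ if s[0] in final and s[2] in final}
    todo = list(keep)
    while todo:
        for s in pred.get(todo.pop(), []):
            if s not in keep:
                keep.add(s)
                todo.append(s)
    return {s: [t for t in succ[s] if t in keep] for s in keep}


def strongly_connected_components(succ):
    """Tarjan's algorithm (recursive; adequate for small examples)."""
    index, low, on_stack, stack, comps = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(comp)

    for v in list(succ):
        if v not in index:
            visit(v)
    return comps


def is_exponentially_ambiguous(initial, final, transitions):
    succ = square(initial, final, transitions)
    for comp in strongly_connected_components(succ):
        diagonal = any(q1 == q2 for q1, _, q2 in comp)
        off_diagonal = any(q1 != q2 for q1, _, q2 in comp)
        if diagonal and off_diagonal:
            return True
    return False


eda = [(0, "a", 1), (0, "a", 2), (1, "b", 0), (2, "b", 0)]
ida = [(0, "a", 0), (0, "a", 1), (1, "a", 1)]
loop = [(0, EPS, 1), (1, "a", 0)]   # unambiguous, with an epsilon-transition

results = (
    is_exponentially_ambiguous({0}, {0}, eda),
    is_exponentially_ambiguous({0}, {1}, ida),
    is_exponentially_ambiguous({0}, {0}, loop),
)
```

The third automaton is unambiguous but contains an ε-transition on a cycle; it is the kind of input on which a product built without the filter would place a state with distinct components on a cycle through (0, 0).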
Finding such a strongly connected component can be done in time linear in the size of $A^2$, i.e. in $O(|A|_E^2)$ since A and $A^2$ are trim. Thus, the complexity of the algorithm is in $O(|A|_E^2)$.

Testing the (IDA) property requires finding three paths sharing the same label in A. As shown below, this can be done in a natural way using the automaton $A^3 = (A \cap A) \cap A$, obtained by applying twice the intersection algorithm.

Lemma 2. Let A be a trim ε-cycle free finite automaton. A satisfies (IDA) iff there exist two distinct states p and q in A with a non-ε path in $A^3 = A \cap A \cap A$ from state (p, p, q) to state (p, q, q).

Proof. Assume that A satisfies (IDA). Then, there exists a string $v \in \Sigma^*$ with three paths $\pi_1 \in P(p, v, p)$, $\pi_2 \in P(p, v, q)$ and $\pi_3 \in P(q, v, q)$. Since these three paths share the same label v, they are matched by intersection resulting in a path π in $A^3$ labeled with v from $(p[\pi_1], p[\pi_2], p[\pi_3]) = (p, p, q)$ to $(n[\pi_1], n[\pi_2], n[\pi_3]) = (p, q, q)$.

Conversely, if there is a non-ε path π from (p, p, q) to (p, q, q) in $A^3$, it has been obtained by matching three paths $\pi_1$, $\pi_2$ and $\pi_3$ in A with the same input $v = i[\pi] \ne \epsilon$. Thus, (IDA) holds.

This lemma appears already as Lemma 5.10 in [8]. Finally, Theorem 4 and Lemma 2 can be combined to yield the following result.

Theorem 5. Let A be a trim ε-cycle free finite automaton. It is decidable in time $O(|A|_E^3)$ whether A is finitely, polynomially, or exponentially ambiguous.

Proof. First, Theorem 4 can be used to test whether A is exponentially ambiguous by computing $A^2$. The complexity of this step is $O(|A|_E^2)$. If A is not exponentially ambiguous, we proceed by computing and trimming $A^3$ and then testing whether $A^3$ verifies the property described in Lemma 2. This is done by considering the automaton B on the alphabet $\Sigma' = \Sigma \cup \{\#\}$ obtained from $A^3$ by adding a transition labeled by # from state (p, q, q) to state (p, p, q) for every pair (p, q) of states in A such that $p \ne q$. It follows that $A^3$ verifies the condition in Lemma 2 iff there is a cycle in B containing both a transition labeled by # and a transition labeled by a symbol in Σ. This property can be checked straightforwardly using a depth-first search of B to compute its strongly connected components. If a strongly connected component of B is found that contains both a transition labeled with # and a transition labeled by a symbol in Σ, A verifies (IDA) but not (EDA) and thus A is polynomially ambiguous. Otherwise, A is finitely ambiguous. The complexity of this step is linear in the size of B: $O(|B|_E) = O(|A|_E^3 + |A|_Q^2) = O(|A|_E^3)$ since A and B are trim. The total complexity of the algorithm is $O(|A|_E^2 + |A|_E^3) = O(|A|_E^3)$.

When A is polynomially ambiguous, we can derive from the algorithm just described one that computes $dp(A)$.

Theorem 6. Let A be a trim ε-cycle free finite automaton. If A is polynomially ambiguous, $dp(A)$ can be computed in time $O(|A|_E^3)$.

Proof. We first compute $A^3$ and use the algorithm of Theorem 5 to test whether A is polynomially ambiguous and to compute all the pairs (p, q) that verify the condition of Lemma 2.
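As an illustration of the condition of Lemma 2, the sketch below enumerates the pairs (p, q) for which (p, q, q) is reachable from (p, p, q) by a non-empty path in the triple product. It makes two simplifying assumptions that are not in the text: the automaton is ε-free, so that no filter is needed and every non-empty path is a non-ε path, and each pair is checked by a separate search instead of a single strongly-connected-component pass over B. It therefore illustrates the condition but does not attain the $O(|A|_E^3)$ bound. The name `ida_pairs` and the data representation are assumptions of this sketch.

```python
from itertools import product


def ida_pairs(states, transitions):
    """Pairs (p, q), p != q, with a non-empty path (p, p, q) -> (p, q, q) in A^3.

    Assumes A is epsilon-free; each pair is checked by its own search.
    """
    out = {}
    for src, label, dst in transitions:
        out.setdefault(src, []).append((label, dst))

    def successors(triple):
        # Triple-product rule: three transitions sharing the same label.
        q1, q2, q3 = triple
        for a, r1 in out.get(q1, []):
            for b, r2 in out.get(q2, []):
                if a != b:
                    continue
                for c, r3 in out.get(q3, []):
                    if a == c:
                        yield (r1, r2, r3)

    pairs = set()
    for p, q in product(states, repeat=2):
        if p == q:
            continue
        seen = set()
        todo = list(successors((p, p, q)))   # at least one transition is taken
        while todo:
            t = todo.pop()
            if t in seen:
                continue
            seen.add(t)
            todo.extend(successors(t))
        if (p, q, q) in seen:
            pairs.add((p, q))
    return pairs


# (IDA) holds for the pair (0, 1): loops at 0 and 1, and a transition 0 -a-> 1.
ida = [(0, "a", 0), (0, "a", 1), (1, "a", 1)]
# Ambiguous (two paths for "a") but only finitely so: no pair qualifies.
finite = [(0, "a", 1), (0, "a", 2)]

ida_result = ida_pairs([0, 1], ida)
finite_result = ida_pairs([0, 1, 2], finite)
```

A non-empty result means that (IDA) holds; together with a negative (EDA) test this indicates polynomial ambiguity, and an empty result indicates finite ambiguity.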
This step has complexity $O(|A|_E^3)$. We then compute the component graph G of A, and for each pair (p, q) found in the previous step, we add a transition labeled with # from the strongly connected component of p to the one of q. If there is a path in that graph containing d edges labeled by #, then A verifies (IDA_d). Thus, $dp(A)$ is the maximum number of edges marked by # that can be found along a path in G. Since G is acyclic, this number can be computed in linear time in the size of G, i.e. in $O(|A|_Q^2)$. Thus, the overall complexity of the algorithm is $O(|A|_E^3)$.

5 Application to Entropy Approximation

In this section, we describe an application in which determining the degree of ambiguity of a probabilistic automaton helps estimate the quality of an approximation of its entropy. Weighted automata are automata in which each transition carries some weight in addition to the usual alphabet symbol. The weights are elements of a semiring, that is a ring that may lack negation. The following is a more formal definition.

Definition 3. A weighted automaton A over a semiring $(K, \oplus, \otimes, \bar{0}, \bar{1})$ is a 7-tuple $(\Sigma, Q, I, F, E, \lambda, \rho)$ where Σ is a finite alphabet, Q a finite set of states, $I \subseteq Q$ the set of initial states, $F \subseteq Q$ the set of final states, $E \subseteq Q \times (\Sigma \cup \{\epsilon\}) \times K \times Q$ a finite set of transitions, $\lambda : I \to K$ the initial weight function mapping I to K, and $\rho : F \to K$ the final weight function mapping F to K.

Given a transition $e \in E$, we denote by $w[e]$ its weight. We extend the weight function w to paths by defining the weight of a path as the ⊗-product of the weights of its constituent transitions: $w[\pi] = w[e_1] \otimes \cdots \otimes w[e_k]$. The weight associated by a weighted automaton A to an input string $x \in \Sigma^*$ is defined by

$$[\![A]\!](x) = \bigoplus_{\pi \in P(I, x, F)} \lambda[p[\pi]] \otimes w[\pi] \otimes \rho[n[\pi]].$$

The entropy H(A) of a probabilistic automaton A is defined as:

$$H(A) = -\sum_{x \in \Sigma^*} [\![A]\!](x) \log([\![A]\!](x)). \qquad (1)$$

The system $(K, \oplus, \otimes, (0, 0), (1, 0))$ with $K = (\mathbb{R} \cup \{+\infty, -\infty\}) \times (\mathbb{R} \cup \{+\infty, -\infty\})$ and ⊕ and ⊗ defined as follows defines a commutative semiring called the entropy semiring [4]: for any two pairs $(x_1, y_1)$ and $(x_2, y_2)$ in K,

$$(x_1, y_1) \oplus (x_2, y_2) = (x_1 + x_2, \; y_1 + y_2) \quad \text{and} \quad (x_1, y_1) \otimes (x_2, y_2) = (x_1 x_2, \; x_1 y_2 + x_2 y_1).$$

In [4], the authors showed that a generalized shortest-distance algorithm over this semiring correctly computes the entropy of an unambiguous probabilistic automaton A. The algorithm starts by mapping the weight of each transition to a pair where the first element is the probability and the second the entropy: $w[e] \mapsto (w[e], -w[e] \log w[e])$. The algorithm then proceeds by computing the generalized shortest-distance defined over the entropy semiring, which computes the ⊕-sum of the weights of all accepting paths in A.

Here, we show that the same shortest-distance algorithm yields an approximation of the entropy of an ambiguous probabilistic automaton A, where the approximation quality is a function of the degree of polynomial ambiguity, $dp(A)$.
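The entropy semiring operations and the weight mapping described above can be sketched in a few lines of Python. The sketch below is not the generalized shortest-distance algorithm of [4]: it computes the same ⊕-sum over accepting paths by memoized recursion, under the assumptions that the automaton is acyclic and that all initial and final weights equal 1. The names `eplus`, `etimes`, `lift`, and `path_sum` are hypothetical.

```python
import math
from functools import lru_cache

ZERO, ONE = (0.0, 0.0), (1.0, 0.0)   # identities of the entropy semiring


def eplus(a, b):
    # (x1, y1) ⊕ (x2, y2) = (x1 + x2, y1 + y2)
    return (a[0] + b[0], a[1] + b[1])


def etimes(a, b):
    # (x1, y1) ⊗ (x2, y2) = (x1 x2, x1 y2 + x2 y1)
    return (a[0] * b[0], a[0] * b[1] + b[0] * a[1])


def lift(w):
    # Initial mapping of a transition probability: w -> (w, -w log w)
    return (w, -w * math.log(w))


def path_sum(initial, final, transitions):
    """⊕-sum over accepting paths of the ⊗-product of lifted weights.

    Assumes an acyclic automaton with unit initial and final weights.
    """
    out = {}
    for src, _label, w, dst in transitions:
        out.setdefault(src, []).append((w, dst))

    @lru_cache(maxsize=None)
    def total(q):
        acc = ONE if q in final else ZERO
        for w, dst in out.get(q, []):
            acc = eplus(acc, etimes(lift(w), total(dst)))
        return acc

    result = ZERO
    for q in initial:
        result = eplus(result, total(q))
    return result


# Unambiguous example: "a" and "b" each accepted with probability 1/2.
unambiguous = [(0, "a", 0.5, 1), (0, "b", 0.5, 1)]
mass, entropy = path_sum({0}, {1}, unambiguous)

# The ⊗-product of two lifted weights is the lifted product of the weights.
chained = etimes(lift(0.5), lift(0.25))
```

For this unambiguous example the second component of the result equals the entropy log 2, and `chained` agrees with `lift(0.125)` up to rounding, which is why the second component of a path weight is $-w[\pi] \log w[\pi]$.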
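Our proofs make use of the standard log-sum inequality [5], a special case of Jensen's inequality, which holds for any positive reals $a_1, \ldots, a_k$ and $b_1, \ldots, b_k$:

$$\sum_{i=1}^{k} a_i \log \frac{a_i}{b_i} \;\ge\; \Big(\sum_{i=1}^{k} a_i\Big) \log \frac{\sum_{i=1}^{k} a_i}{\sum_{i=1}^{k} b_i}. \qquad (2)$$

Lemma 3. Let A be a probabilistic automaton and let $x \in \Sigma^+$ be a string accepted by A on k paths $\pi_1, \ldots, \pi_k$. Let $w[\pi_i]$ be the probability of path $\pi_i$. Clearly, $[\![A]\!](x) = \sum_{i=1}^{k} w[\pi_i]$. Then, $\sum_{i=1}^{k} w[\pi_i] \log w[\pi_i] \ge [\![A]\!](x)(\log [\![A]\!](x) - \log k)$.

Proof. The result follows straightforwardly from the log-sum inequality, with $a_i = w[\pi_i]$ and $b_i = 1$:

$$\sum_{i=1}^{k} w[\pi_i] \log w[\pi_i] \;\ge\; \Big(\sum_{i=1}^{k} w[\pi_i]\Big) \log \frac{\sum_{i=1}^{k} w[\pi_i]}{k} = [\![A]\!](x)(\log [\![A]\!](x) - \log k). \qquad (3)$$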

Let S(A) be the quantity computed by the generalized shortest-distance algorithm over the entropy semiring for a probabilistic automaton A. When A is unambiguous, it is shown by [4] that S(A) = H(A).

Theorem 7. Let A be a probabilistic automaton and let L denote the expected length of the strings accepted by A (i.e. $L = \sum_{x \in \Sigma^*} |x| \, [\![A]\!](x)$). Then,
1. if A is finitely ambiguous with $d(A) = k$ for some $k \in \mathbb{N}$, then $H(A) \le S(A) \le H(A) + \log k$;
2. if A is polynomially ambiguous with $dp(A) = k$ for some $k \in \mathbb{N}$, then $H(A) \le S(A) \le H(A) + k \log L$.

Proof. The lower bound $S(A) \ge H(A)$ follows from the observation that for a string x that is accepted in A by k paths $\pi_1, \ldots, \pi_k$,

$$\sum_{i=1}^{k} w[\pi_i] \log(w[\pi_i]) \;\le\; \Big(\sum_{i=1}^{k} w[\pi_i]\Big) \log\Big(\sum_{i=1}^{k} w[\pi_i]\Big). \qquad (4)$$

Since the quantity $-\sum_{i=1}^{k} w[\pi_i] \log(w[\pi_i])$ is string x's contribution to S(A) and the quantity $-(\sum_{i=1}^{k} w[\pi_i]) \log(\sum_{i=1}^{k} w[\pi_i])$ its contribution to H(A), summing over all accepted strings x, we obtain $H(A) \le S(A)$.

Assume that A is finitely ambiguous with degree of ambiguity k. Let $x \in \Sigma^*$ be a string that is accepted on $l_x \le k$ paths $\pi_1, \ldots, \pi_{l_x}$. By Lemma 3, we have

$$-\sum_{i=1}^{l_x} w[\pi_i] \log w[\pi_i] \;\le\; -[\![A]\!](x)(\log [\![A]\!](x) - \log l_x) \;\le\; -[\![A]\!](x)(\log [\![A]\!](x) - \log k).$$

Thus,

$$S(A) = -\sum_{x \in \Sigma^*} \sum_{i=1}^{l_x} w[\pi_i] \log w[\pi_i] \;\le\; H(A) + \sum_{x \in \Sigma^*} (\log k) \, [\![A]\!](x) = H(A) + \log k.$$

This proves the first statement of the theorem.

Next, assume that A is polynomially ambiguous with degree of polynomial ambiguity k. By Lemma 3, we have

$$-\sum_{i=1}^{l_x} w[\pi_i] \log w[\pi_i] \;\le\; -[\![A]\!](x)(\log [\![A]\!](x) - \log l_x) \;\le\; -[\![A]\!](x)(\log [\![A]\!](x) - \log(|x|^k)).$$

Thus,

$$S(A) \;\le\; H(A) + \sum_{x \in \Sigma^*} k \, [\![A]\!](x) \log |x| = H(A) + k \, E_A[\log |x|] \;\le\; H(A) + k \log E_A[|x|] = H(A) + k \log L, \qquad (5)$$

where the last inequality follows from Jensen's inequality. This proves the second statement of the theorem.

The theorem shows in particular that the quality of the approximation of the entropy of a polynomially ambiguous probabilistic automaton can be estimated by computing its degree of polynomial ambiguity, which can be achieved efficiently as described in the previous section. This also requires the computation of the expected length L of an accepted string.
L can be computed efficiently for an arbitrary probabilistic automaton using the entropy semiring and the generalized shortest-distance algorithms, using techniques similar to those described in [4]. The only difference is in the initial step, where the weight of each transition in A is mapped to a pair of elements by $w[e] \mapsto (w[e], w[e])$.

6 Conclusion

We presented simple and efficient algorithms for testing the finite, polynomial, or exponential ambiguity of finite automata with ε-transitions. We conjecture that the time complexity of our algorithms is optimal. These algorithms have a variety of applications, in particular to test a pre-condition for the applicability of other automata algorithms. Our application to the approximation of the entropy gives another illustration of their usefulness.

Our algorithms also demonstrate the prominent role played by the intersection or composition of automata and transducers with ε-transitions [11,10] in the design of testing algorithms. Composition can be used to devise simple and efficient testing algorithms. We have shown elsewhere how it can be used to test the functionality of a finite-state transducer, or the twins property for weighted automata and transducers [1].

References

1. C. Allauzen and M. Mohri. Efficient Algorithms for Testing the Twins Property. Journal of Automata, Languages and Combinatorics, 8(2):117-144, 2003.
2. C. Allauzen and M. Mohri. 3-way composition of weighted finite-state transducers. In CIAA 2008, volume 5148 of LNCS, pages 262-273. Springer, 2008.
3. T. Chan and O. H. Ibarra. On the finite-valuedness problem for sequential machines. Theoretical Computer Science, 23:95-101, 1983.
4. C. Cortes, M. Mohri, A. Rastogi, and M. Riley. Efficient computation of the relative entropy of probabilistic automata. In LATIN 2006, volume 3887 of LNCS, pages 323-336. Springer, 2006.
5. T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., New York, 1991.
6. O. H. Ibarra and B. Ravikumar. On sparseness, ambiguity and other decision problems for acceptors and transducers. In STACS 1986, volume 210 of LNCS, pages 171-179. Springer, 1986.
7. G. Jacob. Un algorithme calculant le cardinal, fini ou infini, des demi-groupes de matrices. Theoretical Computer Science, 5(2):183-202, 1977.
8. W. Kuich. Finite automata and ambiguity. Technical Report 253, Institute für Informationsverarbeitung - Technische Universität Graz und ÖCG, 1988.
9. A. Mandel and I. Simon. On finite semigroups of matrices. Theoretical Computer Science, 5(2):101-111, 1977.
10. M. Mohri, F. C. N. Pereira, and M. Riley. Weighted Automata in Text and Speech Processing. In Proceedings of ECAI-96, Workshop on Extended finite state models of language, Budapest, Hungary. John Wiley and Sons, 1996.
11. F. Pereira and M. Riley. Finite State Language Processing, chapter Speech Recognition by Composition of Weighted Finite Automata. The MIT Press, 1997.
12. B. Ravikumar and O. H. Ibarra. Relating the type of ambiguity of finite automata to the succinctness of their representation. SIAM Journal on Computing, 18(6):1263-1282, 1989.
13. C. Reutenauer. Propriétés arithmétiques et topologiques des séries rationnelles en variable non commutative. Thèse de troisième cycle, Université Paris VI, 1977.
14. A. Weber. Über die Mehrdeutigkeit und Wertigkeit von endlichen Automaten und Transducern. Dissertation, Goethe-Universität Frankfurt am Main, 1987.
15. A. Weber and H. Seidl. On the degree of ambiguity of finite automata. In MFCS 1986, volume 233 of LNCS, pages 620-629. Springer, 1986.
16. A. Weber and H. Seidl. On the degree of ambiguity of finite automata. Theoretical Computer Science, 88(2):325-349, 1991.