General Algorithms for Testing the Ambiguity of Finite Automata

TR2007-908

Cyril Allauzen 1,⋆, Mehryar Mohri 1,2, and Ashish Rastogi 1

1 Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012.
2 Google Research, 76 Ninth Avenue, New York, NY 10011.

Abstract. This paper presents efficient algorithms for testing the finite, polynomial, and exponential ambiguity of finite automata with ε-transitions. It gives an algorithm for testing the exponential ambiguity of an automaton A in time O(|A|_E^2), and finite or polynomial ambiguity in time O(|A|_E^3). These complexities significantly improve over the previous best complexities given for the same problem. Furthermore, the algorithms presented are simple and are based on a general algorithm for the composition or intersection of automata. We also give an algorithm to determine the degree of polynomial ambiguity of a finite automaton A that is polynomially ambiguous in time O(|A|_E^3). Finally, we present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton.

1 Introduction

The question of the ambiguity of finite automata arises in a variety of contexts. In some cases, the application of an algorithm requires an input automaton to be finitely ambiguous; in others, the convergence of a bound or guarantee relies on that finite ambiguity or on the asymptotic rate of the increase of ambiguity as a function of the string length. Thus, in all these cases, one needs an algorithm to test the ambiguity, either to determine if it is finite, or to estimate its asymptotic rate of increase.

The problem of testing ambiguity has been extensively analyzed in the past. The problem of determining the degree of ambiguity of an automaton with finite ambiguity was shown to be PSPACE-complete. However, testing finite ambiguity can be done in polynomial time using a characterization of polynomial and exponential ambiguity given by [6, 5, 9, 4, 11]. The most efficient algorithms for testing polynomial and exponential ambiguity, and thereby testing finite ambiguity, were presented by [10, 12].
The algorithms presented in [12] assume the input automaton to be ε-free, but they are extended to the case where the automaton has ε-transitions in [10]. In the presence of ε-transitions, the complexity of the algorithms given by [10] is O((|A|_E + |A|_Q^2)^2) for testing the exponential ambiguity of an automaton A and O((|A|_E + |A|_Q^2)^3) for testing polynomial ambiguity, where |A|_E stands for the number of transitions and |A|_Q the number of states of A.

⋆ This author's new address is: Google Research, 76 Ninth Avenue, New York, NY 10011.

This paper presents significantly more efficient algorithms for testing finite, polynomial, and exponential ambiguity for the general case of automata with ε-transitions. It gives an algorithm for testing the exponential ambiguity of an automaton A in time O(|A|_E^2), and finite or polynomial ambiguity in time O(|A|_E^3). The main idea behind our algorithms is to make use of the composition or intersection of finite automata with ε-transitions [8, 7]. The ε-filter used in these algorithms crucially helps in the analysis and test of the ambiguity. We also give an algorithm to determine the degree of polynomial ambiguity of a finite automaton A that is polynomially ambiguous in time O(|A|_E^3). Finally, we present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton.

The remainder of the paper is organized as follows. Section 2 presents general automata and ambiguity definitions. In Section 3 we give a brief description of existing characterizations for the ambiguity of automata and extend them to the case of automata with ε-transitions. In Section 4 we present our algorithms for testing the finite, polynomial, and exponential ambiguity, and the proof of their correctness. Section 5 details the relevance of these algorithms to the approximation of the entropy of probabilistic automata.

2 Preliminaries

Definition 1. A finite automaton A is a 5-tuple (Σ, Q, E, I, F) where: Σ is a finite alphabet; Q is a finite set of states; I ⊆ Q the set of initial states; F ⊆ Q the set of final states; and E ⊆ Q × (Σ ∪ {ε}) × Q a finite set of transitions, where ε denotes the empty string.

We denote by |A|_Q the number of states, by |A|_E the number of transitions, and by |A| = |A|_E + |A|_Q the size of an automaton A. Given a state q ∈ Q, E[q] denotes the set of transitions leaving q. For two subsets R ⊆ Q and R′ ⊆ Q, we denote by P(R, x, R′) the set of all paths from a state q ∈ R to a state q′ ∈ R′ labeled with x ∈ Σ*. We also denote by p[π] the origin state, by n[π] the destination state, and by i[π] ∈ Σ* the label of a path π.

A string x ∈ Σ* is accepted by A if it labels a successful path, i.e. a path from an initial state to a final state. A finite automaton A is trim if every state of A belongs to a successful path. A is unambiguous if for any string x ∈ Σ* there is at most one successful path labeled by x in A; otherwise, A is said to be ambiguous. The degree of ambiguity of a string x in A, denoted by d(A, x), is the number of successful paths in A labeled by x. Note that if A contains an ε-cycle, there exists x ∈ Σ* such that d(A, x) = ∞. Using a depth-first search restricted to ε-transitions, it can be decided in linear time whether A has ε-cycles. Thus, in the following, we will assume without loss of generality that A is ε-cycle free.

The degree of ambiguity of A is defined as d(A) = sup_{x ∈ Σ*} d(A, x). A is said to be finitely ambiguous if d(A) < ∞ and infinitely ambiguous if d(A) = ∞. A is said to be polynomially ambiguous if there exists a polynomial h in N[X] such that d(A, x) ≤ h(|x|) for all x ∈ Σ*. The minimal degree of such a polynomial is called the degree of polynomial ambiguity of A, denoted by dpa(A). By definition, dpa(A) = 0 iff A is finitely ambiguous. When A is infinitely ambiguous but not polynomially ambiguous, we say that A is exponentially ambiguous and that dpa(A) = ∞.

Fig. 1. Illustration of the (a) (EDA), (b) (IDA) and (c) (IDA_d) properties.

3 Characterization of infinite ambiguity

The characterization and test of finite, polynomial, and exponential ambiguity of finite automata without ε-transitions are based on the following fundamental properties [6, 5, 9, 4, 11, 10, 12].

Definition 2. The following are three key properties for the characterization of the ambiguity of an automaton A.
(a) (EDA): There exists a state q with at least two distinct cycles labeled by some v ∈ Σ* (Figure 1(a)).
(b) (IDA): There exist two distinct states p and q with paths labeled with v from p to p, p to q, and q to q, for some v ∈ Σ* (Figure 1(b)).
(c) (IDA_d): There exist 2d states p_1, ..., p_d, q_1, ..., q_d in A and 2d − 1 strings v_1, ..., v_d and u_2, ..., u_d in Σ* such that for all 1 ≤ i ≤ d, p_i ≠ q_i and P(p_i, v_i, p_i), P(p_i, v_i, q_i) and P(q_i, v_i, q_i) are non-empty, and for all 2 ≤ i ≤ d, P(q_{i−1}, u_i, p_i) is non-empty (Figure 1(c)).

Observe that (EDA) implies (IDA). Assuming (EDA), let e and e′ be the first transitions that differ in the two cycles at state q; then we must have n[e] ≠ n[e′], since Definition 1 disallows multiple transitions between the same two states with the same label. Thus, (IDA) holds for the pair (n[e], n[e′]).

In the ε-free case, it was shown that a trim automaton A satisfies (IDA) iff A is infinitely ambiguous [11, 12], that A satisfies (EDA) iff A is exponentially ambiguous [4], and that A satisfies (IDA_d) iff dpa(A) ≥ d [10, 12]. These characterizations can be straightforwardly extended to the case of automata with ε-transitions, as stated in Proposition 1.
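To make the (IDA) and (EDA) properties concrete, the following minimal sketch counts successful paths, that is d(A, x), for two tiny ε-free automata. The function name `count_accepting_paths`, the triple encoding of transitions, and both example automata are assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict

def count_accepting_paths(transitions, initial, final, string):
    """Return d(A, x): the number of successful paths labeled by `string`.

    Assumes an epsilon-free automaton whose transitions are given as
    (source, symbol, target) triples (hypothetical encoding).
    """
    counts = defaultdict(int)          # counts[q] = number of paths reaching q
    for q in initial:
        counts[q] = 1
    for symbol in string:
        nxt = defaultdict(int)
        for src, lab, dst in transitions:
            if lab == symbol and counts[src]:
                nxt[dst] += counts[src]
        counts = nxt
    return sum(counts[q] for q in final)

# (IDA) example: loops labeled a at p and at q, plus a transition p -a-> q.
ida = [("p", "a", "p"), ("p", "a", "q"), ("q", "a", "q")]
# (EDA) example: two distinct cycles labeled "ab" at state q.
eda = [("q", "a", "r"), ("q", "a", "s"), ("r", "b", "q"), ("s", "b", "q")]

# Ambiguity grows linearly with the length for the (IDA) automaton ...
linear = [count_accepting_paths(ida, {"p"}, {"q"}, "a" * n) for n in range(1, 5)]
# ... and doubles with every repetition of "ab" for the (EDA) automaton.
exponential = [count_accepting_paths(eda, {"q"}, {"q"}, "ab" * n) for n in range(1, 5)]
```

In this sketch the first automaton has one successful path per position at which the path can switch from p to q, while the second offers two choices per traversal of the cycle.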

Proposition 1. Let A be a trim ε-cycle free finite automaton.
(i) A is infinitely ambiguous iff A satisfies (IDA).
(ii) A is exponentially ambiguous iff A satisfies (EDA).
(iii) dpa(A) ≥ d iff A satisfies (IDA_d).

Proof. The proof is by induction on the number of ε-transitions in A. If A does not have any ε-transitions, then the proposition holds as shown in [11, 12] for (i), [4] for (ii) and [12] for (iii). Assume now that A has n + 1 ε-transitions, n ≥ 0, and that the statement of the proposition holds for all automata with n ε-transitions. Select an ε-transition e_0 in A, and let A′ be the finite automaton obtained after application of ε-removal to A limited to transition e_0. A′ is obtained by deleting e_0 from A and by adding a transition (p[e_0], l[e], n[e]) for every transition e ∈ E[n[e_0]], where l[e] denotes the label of e. It is clear that A and A′ are equivalent and that there is a label-preserving bijection between the paths in A and A′. Thus, (a) A satisfies (IDA) (resp. (EDA), (IDA_d)) iff A′ satisfies (IDA) (resp. (EDA), (IDA_d)) and (b) for all x ∈ Σ*, d(A, x) = d(A′, x). By induction, Proposition 1 holds for A′ and thus, it follows from (a) and (b) that Proposition 1 also holds for A.

These characterizations have been used in [10, 12] to design algorithms for testing infinite, polynomial, and exponential ambiguity, and for computing the degree of polynomial ambiguity in the ε-free case.

Theorem 1 ([10, 12]). Let A be a trim ε-free finite automaton.
1. It is decidable in time O(|A|_E^3) whether A is infinitely ambiguous.
2. It is decidable in time O(|A|_E^2) whether A is exponentially ambiguous.
3. The degree of polynomial ambiguity of A, dpa(A), can be computed in O(|A|_E^3).

The first result of Theorem 1 has also been generalized by [10] to the case of automata with ε-transitions but with a significantly worse complexity.

Theorem 2 ([10]). Let A be a trim ε-cycle free finite automaton. It is decidable in time O((|A|_E + |A|_Q^2)^3) whether A is infinitely ambiguous.

The main idea used in [10] is to define from A an ε-free automaton A′ such that A is infinitely ambiguous iff A′ is infinitely ambiguous. However, the number of transitions of A′ is |A|_E + |A|_Q^2. This explains why the complexity in the ε-transition case is significantly worse than in the ε-free case. A similar approach can be used straightforwardly to test the exponential ambiguity of A with complexity O((|A|_E + |A|_Q^2)^2) and to compute dpa(A) when A is polynomially ambiguous with complexity O((|A|_E + |A|_Q^2)^3).

Note that we give here tighter estimates of the complexity of the algorithms of [10, 12], where the authors gave complexities using the loose inequality |A|_E ≤ |Σ| · |A|_Q^2.

4 Algorithms

Our algorithms for testing ambiguity are based on a general algorithm for the composition or intersection of automata, which we describe in the following section both to be self-contained and to give a proof of the correctness of the ε-filter, which we have not presented in earlier publications.
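All of the results above assume that A is ε-cycle free, which Section 2 notes can be checked by a depth-first search restricted to ε-transitions. A minimal sketch of such a check follows. The integer state numbering, the use of the empty string as the ε label, and the name `has_epsilon_cycle` are assumptions for illustration only.

```python
EPS = ""  # assumed encoding of the epsilon label

def has_epsilon_cycle(num_states, transitions):
    """Iterative three-color DFS over epsilon-transitions only.

    `transitions` is a list of (source, label, target) triples with states
    numbered 0 .. num_states - 1 (hypothetical encoding).
    """
    eps_succ = [[] for _ in range(num_states)]
    for src, lab, dst in transitions:
        if lab == EPS:
            eps_succ[src].append(dst)

    WHITE, GRAY, BLACK = 0, 1, 2       # unvisited, on the DFS stack, finished
    color = [WHITE] * num_states
    for root in range(num_states):
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(eps_succ[root]))]
        while stack:
            node, successors = stack[-1]
            advanced = False
            for nxt in successors:
                if color[nxt] == GRAY:     # back edge: epsilon-cycle found
                    return True
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(eps_succ[nxt])))
                    advanced = True
                    break
            if not advanced:
                color[node] = BLACK
                stack.pop()
    return False
```

Each state and each ε-transition is visited at most once, so the running time is linear in the size of the automaton.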

Fig. 2. Example of finite automaton intersection. (a) Finite automata A_1 and (b) A_2. (c) Result of the intersection of A_1 and A_2.

4.1 Intersection of finite automata

The intersection of finite automata is a special case of the general composition algorithm for weighted transducers [8, 7]. States in the intersection A_1 ∩ A_2 of two finite automata A_1 and A_2 are identified with pairs of a state of A_1 and a state of A_2. Leaving aside ε-transitions, the following rule specifies how to compute a transition of A_1 ∩ A_2 from appropriate transitions of A_1 and A_2:

(q_1, a, q_1') \text{ and } (q_2, a, q_2') \Longrightarrow ((q_1, q_2), a, (q_1', q_2')).    (1)

Figure 2 illustrates the algorithm. A state (q_1, q_2) is initial (resp. final) when q_1 and q_2 are initial (resp. final). In the worst case, all transitions of A_1 leaving a state q_1 match all those of A_2 leaving state q_2; thus the space and time complexity of composition is quadratic: O(|A_1||A_2|), or O(|A_1|_E |A_2|_E) when A_1 and A_2 are trim.

Epsilon filtering. A straightforward generalization of the ε-free case would generate redundant ε-paths. This is a crucial issue in the more general case of the intersection of weighted automata over a non-idempotent semiring, since it would lead to an incorrect result. The weight of two matching ε-paths of the original automata would then be counted as many times as the number of redundant ε-paths generated in the result, instead of once. It is also a crucial problem in the unweighted case that we are considering, since redundant ε-paths can affect the test of infinite ambiguity, as we shall see in the next section. A critical component of the composition algorithm of [8, 7] consists however of precisely coping with this problem using a method called epsilon filtering.

Figure 3(c) illustrates the problem just mentioned.
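Before turning to the ε-filter in detail, rule (1) for the ε-free case can be sketched as a breadth-first construction of the accessible pair states. This is only an illustration under assumed encodings (an automaton as a `(transitions, initial, final)` triple, the name `intersect`); it does not handle ε-transitions and is not the filtered composition of [8, 7].

```python
from collections import defaultdict, deque

def intersect(a1, a2):
    """Epsilon-free intersection following rule (1).

    Each automaton is a (transitions, initial, final) triple with transitions
    as (source, symbol, target) triples (hypothetical encoding). Only pair
    states accessible from an initial pair are built.
    """
    (e1, i1, f1), (e2, i2, f2) = a1, a2
    out1, out2 = defaultdict(list), defaultdict(list)
    for s, a, d in e1:
        out1[s].append((a, d))
    for s, a, d in e2:
        out2[s].append((a, d))

    start = {(p, q) for p in i1 for q in i2}
    seen, queue, trans = set(start), deque(start), []
    while queue:
        q1, q2 = queue.popleft()
        for a, d1 in out1[q1]:
            for b, d2 in out2[q2]:
                if a == b:                          # matching labels only
                    trans.append(((q1, q2), a, (d1, d2)))
                    if (d1, d2) not in seen:
                        seen.add((d1, d2))
                        queue.append((d1, d2))
    final = {(p, q) for (p, q) in seen if p in f1 and q in f2}
    return trans, start, final

# A1 accepts a*b; A2 accepts every string of length 2 over {a, b}.
A1 = ([(0, "a", 0), (0, "b", 1)], {0}, {1})
A2 = ([(0, "a", 1), (0, "b", 1), (1, "a", 2), (1, "b", 2)], {0}, {2})
product_trans, product_init, product_final = intersect(A1, A2)
```

The nested loop over the transitions leaving q_1 and q_2 is where the quadratic worst case mentioned above arises.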
To match ε-paths leaving q_1 and those leaving q_2, a generalization of the ε-free intersection can make the following moves: (1) first move forward on an ε-transition of q_1, or even an ε-path, and stay at the same state q_2 in A_2, with the hope of later finding a transition whose label is some label a ≠ ε matching a transition of q_2 with the same label; (2) proceed similarly by following an ε-transition or ε-path leaving q_2 while staying at the same state q_1 in A_1; or, (3) match an ε-transition of q_1 with an ε-transition of q_2.

Let us rename existing ε-labels of A_1 as ε_2, and existing ε-labels of A_2 as ε_1, and let us augment A_1 with a self-loop labeled with ε_1 at all states and similarly, augment A_2 with a self-loop labeled with ε_2 at all states, as illustrated by Figures 3(a) and (b). These self-loops correspond to staying at the same state in that machine while consuming an ε-label of the other transition. The three moves just described now correspond to the matches (1) (ε_2:ε_2), (2) (ε_1:ε_1), and (3) (ε_2:ε_1). The grid of Figure 3(c) shows all the possible ε-paths between intersection states. We will denote by Ã_1 and Ã_2 the automata obtained after application of these changes.

Fig. 3. Marking of automata, redundant paths and filter. (a) Ã_1: self-loop labeled with ε_1 added at all states of A_1, regular εs renamed to ε_2. (b) Ã_2: self-loop labeled with ε_2 added at all states of A_2, regular εs renamed to ε_1. (c) Redundant ε-paths: a straightforward generalization of the ε-free case could generate all the paths from (0,0) to (2,2) for example, even when composing just two simple transducers. (d) Filter transducer M allowing a unique ε-path.

For the result of intersection not to be redundant, between any two of these states, all but one path must be disallowed. There are many possible ways of selecting that path. One natural way is to select the shortest path with the diagonal transitions (ε-matching transitions) taken first. Figure 3(c) illustrates in boldface the path just described from state (0, 0) to state (1, 2). Remarkably, this filtering mechanism itself can be encoded as a finite-state transducer such as the transducer M of Figure 3(d). We write (p, q) ≼ (r, s) to indicate that (r, s) can be reached from (p, q) in the grid.

Proposition 2. Let M be the transducer of Figure 3(d). M allows a unique path between any two states (p, q) and (r, s), with (p, q) ≼ (r, s).

Proof. Let a denote (ε_1:ε_1), b denote (ε_2:ε_2), c denote (ε_2:ε_1), and let x stand for any (x:x), with x ∈ Σ. The following sequences must be disallowed by a shortest-path filter with matching transitions first: ab, ba, ac, bc. This is because, from any state, instead of the moves ab or ba, the matching or diagonal transition c can be taken.
Similarly, instead of ac or bc, ca and cb can be taken for an earlier match. Conversely, it is clear from the grid or an immediate recursion that a filter disallowing these sequences accepts a unique path between two connected states of the grid.

Let L be the set of sequences over σ = {a, b, c, x} that contain one of the disallowed sequences just mentioned as a substring, that is L = σ*(ab + ba + ac + bc)σ*. Then the complement of L represents exactly the set of paths allowed by that filter and is thus a regular language. Let A be an automaton representing L (Figure 4(a)). An automaton representing the complement of L can be constructed from A by determinization and complementation (Figures 4(b)-(c)). The resulting automaton C is equivalent to the transducer M after removal of the state 3, which does not admit a path to a final state.

Fig. 4. (a) Finite automaton A representing the set of disallowed sequences. (b) Automaton B, result of the determinization of A. Subsets are indicated at each state. (c) Automaton C obtained from B by complementation; state 3 is not coaccessible.

Thus, to intersect two finite automata A_1 and A_2 with ε-transitions, it suffices to compute Ã_1 ∘ M ∘ Ã_2, using the ε-free rules of intersection or composition.

Theorem 3. Let A_1 and A_2 be two finite automata with ε-transitions. To each pair (π_1, π_2) of successful paths in A_1 and A_2 sharing the same input label x ∈ Σ* corresponds a unique successful path π in A_1 ∩ A_2 labeled by x.

Proof. This follows straightforwardly from Proposition 2.

4.2 Testing for infinite ambiguity

We start with a test of the exponential ambiguity of A. The key is that the (EDA) property translates into a very simple property for A^2 = A ∩ A.

Lemma 1. Let A be a trim ε-cycle free finite automaton. A satisfies (EDA) iff there exists a strongly connected component of A^2 = A ∩ A that contains two states of the form (p, p) and (q, q′), where p, q and q′ are states of A with q ≠ q′.

Proof. Assume that A satisfies (EDA). There exist a state p and a string v such that there are two distinct cycles c_1 and c_2 labeled by v at p. Let e_1 and e_2 be the first edges that differ in c_1 and c_2. We can then write c_1 = π e_1 π_1 and c_2 = π e_2 π_2. If e_1 and e_2 share the same label, let π′_1 = π e_1, π′_2 = π e_2, π″_1 = π_1 and π″_2 = π_2. If e_1 and e_2 do not share the same label, exactly one of them must be an ε-transition. By symmetry, we can assume without loss of generality that e_1 is the ε-transition. Let π′_1 = π e_1, π′_2 = π, π″_1 = π_1 and π″_2 = e_2 π_2. In both cases, let q = n[π′_1] = p[π″_1] and q′ = n[π′_2] = p[π″_2]. Observe that q ≠ q′. Since i[π′_1] = i[π′_2], π′_1 and π′_2 are matched by intersection, resulting in a path in A^2 from (p, p) to (q, q′). Similarly, since i[π″_1] = i[π″_2], π″_1 and π″_2 are matched by intersection, resulting in a path from (q, q′) to (p, p). Thus, (p, p) and (q, q′) are in the same strongly connected component of A^2.

Conversely, assume that there exist states p, q and q′ in A such that q ≠ q′ and that (p, p) and (q, q′) are in the same strongly connected component of A^2. Let c be a cycle at (p, p) going through (q, q′); it has been obtained by matching two cycles c_1 and c_2. If c_1 were equal to c_2, intersection would match these two paths, creating a path c′ along which all the states would be of the form (r, r), and since A is trim this would contradict Theorem 3. Thus, c_1 and c_2 are distinct and (EDA) holds.

Lemma 1 leads to a straightforward algorithm for testing exponential ambiguity.

Theorem 4. Let A be a trim ε-cycle free finite automaton. It is decidable in time O(|A|_E^2) whether A is exponentially ambiguous.

Proof. The algorithm proceeds as follows. We compute A^2 and, using a depth-first search of A^2, trim it and compute its strongly connected components. It follows from Lemma 1 that A is exponentially ambiguous iff there is a strongly connected component that contains two states of the form (p, p) and (q, q′) with q ≠ q′. Finding such a strongly connected component can be done in time linear in the size of A^2, i.e. in O(|A|_E^2) since A and A^2 are trim. Thus, the complexity of the algorithm is in O(|A|_E^2).

Testing the (IDA) property requires finding three paths sharing the same label in A. This can be done in a natural way using the automaton A^3 = A ∩ A ∩ A, as shown below.

Lemma 2. Let A be a trim ε-cycle free finite automaton. A satisfies (IDA) iff there exist two distinct states p and q in A with a non-ε path in A^3 = A ∩ A ∩ A from state (p, p, q) to state (p, q, q).

Proof. Assume that A satisfies (IDA). Then, there exists a string v ∈ Σ* with three paths π_1 ∈ P(p, v, p), π_2 ∈ P(p, v, q) and π_3 ∈ P(q, v, q). Since these three paths share the same label v, they are matched by intersection, resulting in a path π in A^3 labeled with v from (p[π_1], p[π_2], p[π_3]) = (p, p, q) to (n[π_1], n[π_2], n[π_3]) = (p, q, q). Conversely, if there is a non-ε path π from (p, p, q) to (p, q, q) in A^3, it has been obtained by matching three paths π_1, π_2 and π_3 in A with the same input v = i[π] ≠ ε. Thus, (IDA) holds.

Finally, Theorem 4 and Lemma 2 can be combined to yield Theorem 5.
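The conditions of Lemma 1 and Lemma 2 can be checked directly on small examples. The sketch below is restricted to trim ε-free automata, builds the k-fold product naively, and uses plain reachability instead of a linear-time strongly-connected-component computation, so it illustrates the conditions but does not achieve the O(|A|_E^2) and O(|A|_E^3) bounds. All names (`power_graph`, `satisfies_eda`, `satisfies_ida`) and the example automata are assumptions.

```python
from itertools import product

def power_graph(transitions, k):
    """Successor map of the k-fold epsilon-free product A x ... x A."""
    by_label = {}
    for src, lab, dst in transitions:
        by_label.setdefault(lab, []).append((src, dst))
    succ = {}
    for edges in by_label.values():
        for combo in product(edges, repeat=k):   # k transitions, same label
            src = tuple(e[0] for e in combo)
            dst = tuple(e[1] for e in combo)
            succ.setdefault(src, set()).add(dst)
    return succ

def reach(succ, start):
    """Product states reachable from `start` by one or more transitions."""
    seen, stack = set(), list(succ.get(start, ()))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(succ.get(s, ()))
    return seen

def satisfies_eda(states, transitions):
    """Lemma 1: some (p, p) and (q, q') with q != q' lie on a common cycle."""
    succ = power_graph(transitions, 2)
    for p in states:
        for q, q2 in reach(succ, (p, p)):
            if q != q2 and (p, p) in reach(succ, (q, q2)):
                return True
    return False

def satisfies_ida(states, transitions):
    """Lemma 2: a non-empty path from (p, p, q) to (p, q, q) with p != q."""
    succ = power_graph(transitions, 3)
    return any(p != q and (p, q, q) in reach(succ, (p, p, q))
               for p in states for q in states)

ida = [("p", "a", "p"), ("p", "a", "q"), ("q", "a", "q")]
eda = [("q", "a", "r"), ("q", "a", "s"), ("r", "b", "q"), ("s", "b", "q")]
loop = [("p", "a", "p")]
```

On these inputs the first automaton satisfies (IDA) only, the second satisfies both properties, and the single loop satisfies neither, matching the finite, polynomial, and exponential cases distinguished in the text.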
Theorem 5. Let A be a trim ε-cycle free finite automaton. It is decidable in time O(|A|_E^3) whether A is finitely, polynomially, or exponentially ambiguous.

Proof. First, Theorem 4 can be used to test whether A is exponentially ambiguous by computing A^2. The complexity of this step is O(|A|_E^2). If A is not exponentially ambiguous, we proceed by computing and trimming A^3 and then testing whether A^3 verifies the property described in Lemma 2. This is done by considering the automaton B on the alphabet Σ′ = Σ ∪ {#} obtained from A^3 by adding a transition labeled by # from state (p, q, q) to state (p, p, q) for every pair (p, q) of states in A such that p ≠ q. It follows that A^3 verifies the condition in Lemma 2 iff there is a cycle in B containing both a transition labeled by # and a transition labeled by a symbol in Σ. This property can be checked straightforwardly using a depth-first search of B to compute its strongly connected components. If a strongly connected component of B is found that contains both a transition labeled with # and a transition labeled by a symbol in Σ, A verifies (IDA) but not (EDA) and thus A is polynomially ambiguous. Otherwise, A is finitely ambiguous. The complexity of this step is linear in the size of B: O(|B|_E) = O(|A|_E^3 + |A|_Q^2) = O(|A|_E^3) since A and B are trim. The total complexity of the algorithm is O(|A|_E^2 + |A|_E^3) = O(|A|_E^3).

When A is polynomially ambiguous, we can derive from the algorithm just described one that computes dpa(A).

Theorem 6. Let A be a trim ε-cycle free finite automaton. If A is polynomially ambiguous, dpa(A) can be computed in time O(|A|_E^3).

Proof. We first compute A^3 and use the algorithm of Theorem 5 to test whether A is polynomially ambiguous and to compute all the pairs (p, q) that verify the condition of Lemma 2. This step has complexity O(|A|_E^3). We then compute the component graph G of A, and for each pair (p, q) found in the previous step, we add a transition labeled with # from the strongly connected component of p to the one of q. If there is a path in that graph containing d edges labeled by #, then A verifies (IDA_d). Thus, dpa(A) is the maximum number of edges marked by # that can be found along a path in G. Since G is acyclic, this number can be computed in linear time in the size of G, i.e. in O(|A|_Q^2). Thus, the overall complexity of the algorithm is O(|A|_E^3).

5 Application to the Approximation of Entropy

In this section, we describe an application in which determining the degree of ambiguity of a probabilistic automaton helps estimate the quality of an approximation of its entropy. Weighted automata are automata in which each transition carries some weight in addition to the usual alphabet symbol. The weights are elements of a semiring, that is a ring that may lack negation. The following is a more formal definition.

Definition 3. A weighted automaton A over a semiring (K, ⊕, ⊗, 0, 1) is a 7-tuple (Σ, Q, I, F, E, λ, ρ) where: Σ is the finite alphabet of the automaton, Q is a finite set of states, I ⊆ Q the set of initial states, F ⊆ Q the set of final states, E ⊆ Q × (Σ ∪ {ε}) × K × Q a finite set of transitions, λ : I → K the initial weight function mapping I to K, and ρ : F → K the final weight function mapping F to K.

Given a transition e ∈ E, we denote by w[e] its weight. We extend the weight function w to paths by defining the weight of a path as the ⊗-product of the weights of its constituent transitions: w[π] = w[e_1] ⊗ ··· ⊗ w[e_k]. The weight associated by a weighted automaton A to an input string x ∈ Σ* is defined by:

[[A]](x) = \bigoplus_{\pi \in P(I, x, F)} \lambda[p[\pi]] \otimes w[\pi] \otimes \rho[n[\pi]].    (2)

The entropy H(A) of a probabilistic automaton A is defined as:

H(A) = -\sum_{x \in \Sigma^*} [[A]](x) \log([[A]](x)).    (3)

Let K denote (R ∪ {+∞, −∞}) × (R ∪ {+∞, −∞}). The system (K, ⊕, ⊗, (0, 0), (1, 0)), where ⊕ and ⊗ are defined as follows, defines a commutative semiring called the entropy semiring [2]. For any two pairs (x_1, y_1) and (x_2, y_2) in K,

(x_1, y_1) \oplus (x_2, y_2) = (x_1 + x_2, y_1 + y_2)    (4)
(x_1, y_1) \otimes (x_2, y_2) = (x_1 x_2, x_1 y_2 + x_2 y_1).    (5)

In [2], the authors show that a generalized shortest-distance algorithm over this semiring correctly computes the entropy of an unambiguous probabilistic automaton A. The algorithm starts by mapping the weight of each transition to a pair where the first element is the probability and the second the entropy: w[e] ↦ (w[e], −w[e] log w[e]). The algorithm then proceeds by computing the generalized shortest-distance under the entropy semiring, which computes the ⊕-sum of the weights of all accepting paths in A.

In this section, we show that the same shortest-distance algorithm yields an approximation of the entropy of an ambiguous probabilistic automaton A, where the approximation quality is a function of the degree of polynomial ambiguity, dpa(A). Our proofs make use of the standard log-sum inequality [3], a special case of Jensen's inequality, which holds for any positive reals a_1, ..., a_k and b_1, ..., b_k:

\sum_{i=1}^{k} a_i \log\frac{a_i}{b_i} \geq \Big(\sum_{i=1}^{k} a_i\Big) \log\frac{\sum_{i=1}^{k} a_i}{\sum_{i=1}^{k} b_i}.    (6)

Lemma 3. Let A be a probabilistic automaton and let x ∈ Σ⁺ be a string accepted by A on k paths π_1, ..., π_k. Let w(π_i) be the probability of path π_i. Clearly, [[A]](x) = Σ_{i=1}^{k} w(π_i). Then,

\sum_{i=1}^{k} w(\pi_i) \log w(\pi_i) \geq [[A]](x)(\log [[A]](x) - \log k).    (7)

Proof. The result follows straightforwardly from the log-sum inequality, with a_i = w(π_i) and b_i = 1:

\sum_{i=1}^{k} w(\pi_i) \log w(\pi_i) \geq \Big(\sum_{i=1}^{k} w(\pi_i)\Big) \log\frac{\sum_{i=1}^{k} w(\pi_i)}{k} = [[A]](x)(\log [[A]](x) - \log k).    (8)

For a probabilistic automaton A, let S(A) be the quantity computed by the generalized shortest-distance algorithm with the entropy semiring. For an unambiguous automaton A, S(A) = H(A) [2].
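The entropy semiring operations (4) and (5), together with the initial mapping w ↦ (w, −w log w), can be sketched in a few lines. The function names and the numerical path probabilities below are assumptions chosen for illustration; the last lines check the bound of Lemma 3 on a single string accepted on k = 2 paths.

```python
import math

def eplus(u, v):
    """Entropy semiring sum, equation (4)."""
    return (u[0] + v[0], u[1] + v[1])

def etimes(u, v):
    """Entropy semiring product, equation (5)."""
    return (u[0] * v[0], u[0] * v[1] + v[0] * u[1])

def lift(w):
    """Map a transition probability w to the pair (w, -w log w)."""
    return (w, -w * math.log(w))

def path_weight(probs):
    """Product of the lifted transition weights along one path."""
    total = (1.0, 0.0)                 # multiplicative identity (1, 0)
    for w in probs:
        total = etimes(total, lift(w))
    return total

# One path with transition probabilities 0.5 and 0.4: the product is
# (0.2, -0.2 log 0.2), i.e. the pair associated with the path probability.
prob, ent = path_weight([0.5, 0.4])

# A string accepted on k = 2 paths of probabilities 0.2 and 0.1 (assumed).
k = 2
total = eplus(path_weight([0.5, 0.4]), path_weight([0.1]))
s_contribution = total[1]                        # -sum_i w_i log w_i
h_contribution = -total[0] * math.log(total[0])  # -[[A]](x) log [[A]](x)
gap = s_contribution - h_contribution
```

In this example `gap` stays between 0 and [[A]](x) log k, which is the per-string form of the inequality of Lemma 3.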

Theorem 7. Let A be a probabilistic automaton and let L denote the expected length of strings accepted by A (i.e. L = Σ_{x ∈ Σ*} |x| [[A]](x)). Then,
1. If A is finitely ambiguous with degree of ambiguity k (i.e. d(A) = k for some k ∈ N), then H(A) ≤ S(A) ≤ H(A) + log k.
2. If A is polynomially ambiguous with degree of polynomial ambiguity k (i.e. dpa(A) = k for some k ∈ N), then H(A) ≤ S(A) ≤ H(A) + k log L.

Proof. The lower bound, S(A) ≥ H(A), follows from the observation that for a string x that is accepted in A by k paths π_1, ..., π_k,

\sum_{i=1}^{k} w(\pi_i) \log(w(\pi_i)) \leq \Big(\sum_{i=1}^{k} w(\pi_i)\Big) \log\Big(\sum_{i=1}^{k} w(\pi_i)\Big).    (9)

Since the quantity −Σ_{i=1}^{k} w(π_i) log(w(π_i)) is string x's contribution to S(A) and the quantity −(Σ_{i=1}^{k} w(π_i)) log(Σ_{i=1}^{k} w(π_i)) its contribution to H(A), summing over all accepted strings x, we obtain H(A) ≤ S(A).

Assume that A is finitely ambiguous with degree of ambiguity k. Let x ∈ Σ* be a string that is accepted on l_x ≤ k paths π_1, ..., π_{l_x}. By Lemma 3,

\sum_{i=1}^{l_x} w(\pi_i) \log w(\pi_i) \geq [[A]](x)(\log [[A]](x) - \log l_x) \geq [[A]](x)(\log [[A]](x) - \log k).    (10)

Thus,

S(A) = -\sum_{x \in \Sigma^*} \sum_{i=1}^{l_x} w(\pi_i) \log w(\pi_i) \leq H(A) + \sum_{x \in \Sigma^*} (\log k) [[A]](x) = H(A) + \log k.    (11)

This proves the first statement of the theorem. Next, assume that A is polynomially ambiguous with degree of polynomial ambiguity k. By Lemma 3,

\sum_{i=1}^{l_x} w(\pi_i) \log w(\pi_i) \geq [[A]](x)(\log [[A]](x) - \log l_x) \geq [[A]](x)(\log [[A]](x) - \log(|x|^k)).    (12)

Thus,

S(A) \leq H(A) + \sum_{x \in \Sigma^*} k [[A]](x) \log |x| = H(A) + k E_A[\log |x|]    (13)
     \leq H(A) + k \log E_A[|x|] = H(A) + k \log L    (by Jensen's inequality),

which proves the second statement of the theorem.

The quality of the approximation of the entropy of a probabilistic automaton A depends on the expected length L of an accepted string. L can be computed efficiently for an arbitrary probabilistic automaton using the expectation semiring and the generalized shortest-distance algorithms, using techniques similar to the ones described in [2]. The definition of the expectation semiring is identical to the entropy semiring. The only difference is in the initial step, where the weight of each transition in A is mapped to a pair of elements. Under the expectation semiring, the mapping is w[e] ↦ (w[e], w[e]).

6 Conclusion

We presented simple and efficient algorithms for testing the finite, polynomial, or exponential ambiguity of finite automata with ε-transitions. We conjecture that the running-time complexity of our algorithms is optimal. These algorithms have a variety of applications, in particular to test a pre-condition for the applicability of other automata algorithms. Our application to the approximation of the entropy gives another illustration of the applications of these algorithms.

Our algorithms also illustrate the prominent role played by the general algorithm for the intersection or composition of automata and transducers with ε-transitions in the design of testing algorithms. Composition can be used to devise simple and efficient testing algorithms. We have shown elsewhere how it can be used to test the functionality of a finite-state transducer or to test the twins property for weighted automata and transducers [1].

Acknowledgments. The research of Cyril Allauzen and Mehryar Mohri was partially supported by the New York State Office of Science Technology and Academic Research (NYSTAR). This project was also sponsored in part by the Department of the Army Award Number W81XWH-04-1-0307. The U.S. Army Medical Research Acquisition Activity, 820 Chandler Street, Fort Detrick MD 21702-5014 is the awarding and administering acquisition office. The content of this material does not necessarily reflect the position or the policy of the Government and no official endorsement should be inferred.

References

1. Cyril Allauzen and Mehryar Mohri. Efficient Algorithms for Testing the Twins Property. Journal of Automata, Languages and Combinatorics, 8(2):117-144, 2003.
2. Corinna Cortes, Mehryar Mohri, Ashish Rastogi, and Michael Riley. Efficient computation of the relative entropy of probabilistic automata. In LATIN 2006, volume 3887 of Lecture Notes in Computer Science, pages 323-336. Springer, 2006.
3. Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., New York, 1991.
4. Oscar H. Ibarra and Bala Ravikumar. On sparseness, ambiguity and other decision problems for acceptors and transducers. In STACS 1986, volume 210 of Lecture Notes in Computer Science, pages 171-179. Springer, 1986.
5. Gérard Jacob. Un algorithme calculant le cardinal, fini ou infini, des demi-groupes de matrices. Theoretical Computer Science, 5(2):183-202, 1977.
6. Arnaldo Mandel and Imre Simon. On finite semigroups of matrices. Theoretical Computer Science, 5(2):101-111, 1977.
7. Mehryar Mohri, Fernando C. N. Pereira, and Michael Riley. Weighted Automata in Text and Speech Processing. In Proceedings of the 12th biennial European Conference on Artificial Intelligence (ECAI-96). John Wiley and Sons, 1996.
8. Fernando Pereira and Michael Riley. Finite State Language Processing, chapter Speech Recognition by Composition of Weighted Finite Automata. The MIT Press, 1997.
9. Christophe Reutenauer. Propriétés arithmétiques et topologiques des séries rationnelles en variable non commutative. Thèse de troisième cycle, Université Paris VI, 1977.
10. Andreas Weber. Über die Mehrdeutigkeit und Wertigkeit von endlichen, Automaten und Transducern. Dissertation, Goethe-Universität Frankfurt am Main, 1987.
11. Andreas Weber and Helmut Seidl. On the degree of ambiguity of finite automata. In MFCS 1986, volume 233 of Lecture Notes in Computer Science, pages 620-629. Springer, 1986.
12. Andreas Weber and Helmut Seidl. On the degree of ambiguity of finite automata. Theoretical Computer Science, 88(2):325-349, 1991.