A Parallel Algorithm for Minimization of Finite Automata

B. Ravikumar    X. Xiong
Department of Computer Science, University of Rhode Island, Kingston, RI 02881
E-mail: {ravi,xiong}@cs.uri.edu

Abstract

In this paper, we present a parallel algorithm for the minimization of deterministic finite state automata (DFA's) and discuss its implementation on a Connection Machine CM-5 using data parallel and message passing models. We show that its time complexity on a p-processor EREW PRAM for inputs of size n is O((n/p) log² n + log n log p), uniformly on almost all instances. The work done by our algorithm is thus within a factor of O(log n) of the best known sequential algorithm. The space used by our algorithm is linear in the input size. The actual resource requirements of our implementations are consistent with these estimates. Although parallel algorithms have been proposed for this problem in the past, they are not practical. We discuss the implementation details and the experimental results.

1. Introduction

Finite state machines are of fundamental importance in theory as well as in applications. Their applications range from classical ones such as compilers and sequential circuits to recent ones including neural networks, hidden Markov models, hypertext meta-languages, and protocol verification. A central problem in automata theory is to minimize a given Deterministic Finite Automaton (DFA). This problem has a long history going back to the origins of automata theory. Huffman [7] and Moore [11] presented an algorithm for DFA minimization in the 1950's. Their algorithm runs in time O(n²) on DFA's with n states and was adequate for most of the classical applications. Many variations of this algorithm have appeared over the years; see [14] for a comprehensive summary. Hopcroft [5] developed a significantly faster algorithm of time complexity O(n log n). Although Hopcroft's algorithm is very efficient in practice, it can't handle DFA's with millions of states. But there are many applications involving such large DFA's.
A parallel algorithm can handle such instances since the combined memory of the processors is adequate to store and process such large DFA's. This work is part of a collection of parallel programs we are developing for many computation-intensive testing problems involving DFA's. We will briefly review the previous work on the design of parallel algorithms for the DFA minimization problem. The problem has been studied extensively [10, 12, 9, 2]. JáJá and Kosaraju [9] studied the special case in which the input alphabet size is 1 on a mesh-connected parallel computer and obtained an efficient algorithm. Cho and Huynh [2] showed that the problem is in NC² and that it is NLOGSPACE-complete. The algorithm of [12] applies only to the case in which the input alphabet is of size 1. In this case, the most efficient algorithm is due to JáJá and Ryu [10]: a CRCW-PRAM algorithm of time complexity O(log n) and total work O(n log log n). When the input alphabet is of size greater than 1, no efficient parallel algorithm (i.e., one of time complexity O(log^k n) for some fixed k) is known with a 'reasonable' total work (say, O(n²) or below). In fact, the only published NC algorithm for this problem is due to [2], which has time complexity O(log² n) but requires Ω(n⁶) processors.

This paper differs from the work mentioned above in two ways: (i) our goal is to implement parallel algorithms for DFA minimization on actual parallel machines; (ii) our figure of merit is the average-case performance, not the worst-case performance. Here we report on the implementation of programs for minimizing DFA's (with unrestricted input alphabet size) on a CM-5 Connection Machine with 512 processors. We implemented two classes of algorithms, one based on data parallel programming and the other based on message passing. Our programs perform well in practice: they can minimize 'typical' DFA's with millions of states in a few minutes, and their scalability is good. The primary reason for their success is that they have very good performance on almost all instances. We formally establish this using a probabilistic model of Trakhtenbrot and Barzdin. We also discuss the implementation details and the experiments we conducted with our programs. We conclude with some new questions that arise from our study. In this abridged version, we omit the proofs of all the claims. They can be found in the full version.

2. Preliminaries

We assume familiarity with basic concepts about finite automata. To fix the notation, however, we will briefly introduce some of the central notions. For details, we refer the reader to [6]. A DFA M consists of a finite set Q of states, a finite input alphabet Σ, a start state q₀ ∈ Q, a set F ⊆ Q of accepting states, and a transition function δ: Q × Σ → Q. An input string x is a sequence of symbols over Σ. On an input string x = x₁x₂...xₘ, the DFA visits a sequence of states q₀q₁...qₘ, starting with the start state, by successively applying the transition function; thus q₁ = δ(q₀, x₁), q₂ = δ(q₁, x₂), etc. The language L(M) accepted by a DFA is the set of strings x that take the DFA to an accepting state, i.e., x ∈ L(M) if and only if qₘ ∈ F. Such languages are called regular languages, a well-understood and important class.
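As a small illustration of these definitions, acceptance is computed by iterating δ from the start state. The two-state DFA below is a hypothetical example (not from the paper), accepting exactly the strings with an even number of a's:

```python
# Minimal sketch of DFA acceptance as defined above.
# States are integers; delta maps (state, symbol) -> state.

def accepts(delta, q0, final, x):
    """Return True iff x is in L(M): compute q_{i+1} = delta(q_i, x_{i+1})
    starting from q0 and check whether the last state is accepting."""
    q = q0
    for sym in x:
        q = delta[(q, sym)]
    return q in final

# Hypothetical example DFA over {'a','b'}: state 0 = even number of 'a's seen.
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
print(accepts(delta, 0, {0}, "abba"))   # True: "abba" has two 'a's
print(accepts(delta, 0, {0}, "ab"))     # False: one 'a'
```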
Two DFA's M₁ and M₂ are said to be equivalent if they accept the same set of strings. It is common to present DFA minimization as an equivalent problem called the coarsest set partition problem [1], which is stated as follows: given a set Q, an initial partition of Q into disjoint blocks {B₀, ..., B_{m−1}}, and a collection of functions fᵢ: Q → Q, find the coarsest (having fewest blocks) partition of Q, say {E₁, E₂, ..., E_q}, such that: (1) the final partition is consistent with the initial partition, i.e., each Eᵢ is a subset of some Bⱼ; and (2) a and b in Eᵢ implies that, for all j, fⱼ(a) and fⱼ(b) are in the same block Eₖ. The connection to minimization of DFA's is the following: Q is the set of states, fᵢ is the map induced by the symbol aᵢ (that is, δ(q, aᵢ) = fᵢ(q)), and the initial partition divides Q into two sets, F (the accepting states) and Q∖F (the nonaccepting ones). It should be clear that the size of the minimal DFA is the number of equivalence classes in the coarsest partition. In the following, we will assume that the input is a DFA with n states defined over a k-letter alphabet in which every state is reachable from the start state. This assumption simplifies the presentation of our algorithm; otherwise a preprocessing step should be added to delete the unreachable states.

3. Parallel Algorithm and Its Analysis

A high-level description of our parallel algorithm is given below:

Algorithm ParallelMin1
input: DFA with n states, k inputs, 2 blocks B₀, B₁.
output: minimized DFA.
begin
  do
    for i = 0 to k − 1 do
      for j = 0 to n − 1 do in parallel
        Get the block number B of state qⱼ;
        Get the block number B′ of the state δ(qⱼ, xᵢ);
        Label the state qⱼ with the pair (B, B′);
        Re-arrange (e.g., by parallel sorting) the states into blocks so that the states with the same labels are contiguous;
      end parallel for;
      Assign to all states in the same block a new (unique) block number in parallel;
    end for;
  until no new block produced;
  Deduce the minimal DFA from the final partition;
end.

The outer loop of the above algorithm appears inherently sequential, and it seems very hard to parallelize it significantly. But this is not of great concern, since experiments show that the total number of iterations of the outer loop is very small (usually less than 10) for all the DFA's we tested, of sizes ranging from 4 thousand to more than 1 million states. This relative invariance of the number of iterations can be explained theoretically; see the discussion below. Thus, in practice, there is no real gain in attempting to parallelize the outer loop.

In the remainder of this section, we will mainly discuss the high-level details of implementing the above parallel algorithm on the EREW-PRAM model and present an analysis that holds for almost all instances in a precise sense. We start with a brief description of the model due to Trakhtenbrot and Barzdin [13]. Our notation will be similar to the one used by [3], who adapted the Trakhtenbrot-Barzdin model in their study of passive learning algorithms. Our model involves a slight generalization of theirs. Basically, the probability space on which we base our average case is narrowed by fixing a DFA G arbitrarily except for its set of accepting states. Thus an adversary can choose G (called the automaton graph) subject to the only condition that all states are reachable from the start state. The probability distribution is defined over all DFA's M_G which can be derived from this automaton graph by randomly selecting the set F of accepting states. In our model, we allow each state to be included in F independently with probability θ (θ can be a function of n). Let P_{n,ε,θ} be any predicate on an n-state automaton which depends on n, a confidence parameter ε (where 0 ≤ ε ≤ 1), and θ (defined above). We say that uniformly almost all automata have property P_{n,ε,θ} if the following holds: for all ε > 0, for all n > 0, and for any n-state underlying automaton graph G, if we randomly choose the set of accepting states, where each state is included independently with probability θ, then with probability at least 1 − ε the predicate P_{n,ε,θ} holds. Let M = ⟨Q, Σ, q₀, δ, F⟩ be a DFA.
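As a concrete illustration of this probabilistic model, the accepting set F is drawn by independent coin flips over a fixed automaton graph. The sketch below is illustrative (Python; the six-state graph size and the seed are assumptions, not from the paper):

```python
import random

def random_final_states(states, theta, rng):
    """Draw the accepting set F of the Trakhtenbrot-Barzdin model:
    each state enters F independently with probability theta."""
    return {q for q in states if rng.random() < theta}

# Fix an automaton graph on 6 states (transitions not shown); only F is random.
rng = random.Random(0)                        # seeded for reproducibility
F = random_final_states(range(6), 0.5, rng)
print(sorted(F))                              # a theta-random subset of {0,...,5}
```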
The output associated with a string x = x₁...xₘ from state q is defined as the sequence y₀...yₘ, where yᵢ = 1 if the state reached (starting at q) after applying the prefix x₁...xᵢ is an accepting state, and 0 otherwise. (Note that y₀ is 1 or 0 according as q is accepting or not.) We denote this output by q⟨x⟩. A string x ∈ Σ* is a distinguishing string for qᵢ and qⱼ if qᵢ⟨x⟩ ≠ qⱼ⟨x⟩. Finally, define d(qᵢ, qⱼ) as the length of the shortest distinguishing string for the states qᵢ and qⱼ. We are now ready to present our results about the expected performance of ParallelMin1. In the following, lg will denote logarithm to base 2.

Lemma 1. For uniformly almost all automata with n states, every pair of states has a distinguishing string of length at most

    2 lg(n²/ε) / lg(1/(1 − 2θ(1 − θ))).

Our next result presents the average time complexity (in the above sense) of an EREW (exclusive-read, exclusive-write) PRAM implementation of the parallel algorithm presented above. We will assume that the reader is familiar with the PRAM model; see [8] for a systematic treatment of PRAM algorithms.

Theorem 2. For uniformly almost all automata with n states, the algorithm ParallelMin1 can be implemented on an EREW PRAM with q ≤ n processors with time complexity

    O( (kn/q) lg n lg(n²/ε) / lg(1/(1 − 2θ(1 − θ))) + lg q lg n ).

The space complexity (= amount of shared memory used) is O(n). Thus the total work done by the algorithm is O(n log² n) if θ, ε, and k are constants.

The next result explains why Hopcroft's algorithm does not parallelize well.

Theorem 3. Let Q′ be the set of states of a DFA after minimization, and let i be the number of iterations of Hopcroft's algorithm; then i ≤ |Σ|(|Q′| − 1).

4. Implementation as a Data Parallel Program

Data-parallelism is the parallelism achievable through the simultaneous execution of the same operation across a set of data [4].
Since each operation is applied simultaneously to the data, a data-parallel program has only one program counter and is characterized by the absence of the multi-threaded execution that makes the design, implementation, and testing of programs so difficult. Although data parallel programming was originally developed for SIMD (or vector) hardware, it has become a successful paradigm in all types of parallel processing systems, such as shared-memory multiprocessors, distributed memory machines, and networks of workstations. Since our parallel algorithm ParallelMin1 has a simple control structure and uses a uniform data structure (an array), it is easy to implement it as a data parallel program. We present our data parallel version of the algorithm in pseudo-code similar to C*, a data parallel extension of the C programming language. We assume in our implementation that the computation begins with the input DFA already stored as parallel variables. The data structures used are as follows. delta is an array of |Σ| parallel variables, each a parallel variable of size n. The i-th element of delta[j], written [i]delta[j], stores the transition function of state i under input symbol j (i.e., the next state of state i under input symbol j, δ(i, j)), for 0 ≤ i ≤ n − 1 and 0 ≤ j ≤ |Σ| − 1. block is a parallel variable that stores partition numbers for all states: [i]block is the block number of state i. In ParallelMin2 below, statements like [parallel_index]parallel_var1 = parallel_var2 are send operations, which send values of elements of one parallel variable (parallel_var2) to elements of another parallel variable (parallel_var1) according to the index mapping parallel_index. Statements like parallel_var1 = [parallel_index]parallel_var2 are get operations, which get values of elements of one parallel variable (parallel_var1) from elements of another parallel variable (parallel_var2).
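Before the C* code, the effect of one pass of the refinement (get each state's label, make equal labels contiguous, rename blocks) can be sketched sequentially. The Python sketch below is illustrative, not the CM-5 implementation, and it refines on all symbols at once rather than one symbol per pass:

```python
def refine_once(delta, block):
    """One refinement pass of the coarsest-partition computation:
    label each state with (its block, block of each successor),
    then renumber so that states with equal labels share a block.
    delta[j][i] = next state of state i on input symbol j."""
    n = len(block)
    labels = [(block[i],) + tuple(block[d[i]] for d in delta) for i in range(n)]
    # Sorting the distinct labels stands in for the parallel sort;
    # the enumeration stands in for the prefix-sum renaming.
    rank = {lab: b for b, lab in enumerate(sorted(set(labels)))}
    return [rank[lab] for lab in labels]

def minimize_blocks(delta, block):
    """Iterate refinement until the partition stabilizes (the outer loop)."""
    while True:
        new = refine_once(delta, block)
        if new == block:
            return block
        block = new

# Tiny example: 4 states, one symbol with 0->1->2->3->3, state 3 accepting.
delta = [[1, 2, 3, 3]]
print(refine_once(delta, [0, 0, 0, 1]))      # prints [0, 0, 1, 2]
print(minimize_blocks(delta, [0, 0, 0, 1]))  # prints [0, 1, 2, 3]
```

The parallel programs below distribute exactly these steps: the label gathering becomes send/get operations, the grouping becomes a parallel sort, and the renumbering becomes a parallel prefix sum.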
Algorithm ParallelMin2
input: delta, array of parallel variables storing the transition function; block, initial block numbers of the DFA.
output: block, (updated) block numbers denoting the state equivalence classes of the minimized DFA.
begin
  q = 2;
  do
    check = 0;
    for k = 0 to |Σ| − 1 do
      Get block numbers:
        stateindex = coord(0);  { stateindex is used to keep track of the movement of state indices }
        output[0] = block;
        output[1] = [delta[k]]block;
      Sort according to outputs:
        sorted = rank(output[0]);
        [sorted]stateindex = stateindex; [sorted]output[0] = output[0]; [sorted]output[1] = output[1];
        sorted = rank(output[1]);
        [sorted]stateindex = stateindex; [sorted]output[0] = output[0]; [sorted]output[1] = output[1];
      Set new block names:
        link = 0;
        where (output[0] ≠ [.-1]output[0] or output[1] ≠ [.-1]output[1]) link = 1;
        newblock = prefixsum(link);
      Check if new blocks produced:
        if (q equal to [n-1]newblock) check++; continue;  { consistent for input k }
        else [stateindex]block = newblock; q = [n-1]newblock;
    end of for;
  until (check equal to |Σ|);  { consistent for all inputs }
end.

In our implementation, sorting is done by finding the ranks of the elements (a system library function) and three get operations to re-distribute them. The parallel prefix computation is performed by a library function as well. The performance characteristics of ParallelMin2 are discussed in Section 6.

5. Implementation as a Message Passing Program

Now we present the message passing version of the basic parallel algorithm. It is a little more complicated than the data parallel version because we have to control communication between processors explicitly. Our approach for storing the input DFA is to divide the transition function and the final state information equally among the different processors. Storage of the intermediate results will also be distributed among the processors. This is a standard approach used in distributed memory parallel algorithms. For simplicity, we assume that p divides n, the number of states in the input DFA. Processors are identified by integers from 0 to p − 1. A detailed description of the various steps of this algorithm can be found in the full paper.

Algorithm ParallelMin3 (for the processor myadd)
input: delta[0..n/p − 1], array storing this processor's part of the transition function; block[0..n/p − 1], initial block numbers of the states stored in this processor.
output: (updated) block[0..n/p − 1].
begin
  done = false;
  Initialization of indexedbuffer:
    indexedbuffer[0, 0..n/p − 1] = myadd·(n/p) .. (myadd + 1)·(n/p) − 1, so the first row of indexedbuffer contains the state numbers associated with this processor;
    indexedbuffer[1, 0..n/p − 1] = block[0..n/p − 1], i.e., the current block numbers of the states in this processor;
  Construct message passing tables according to delta. These tables contain, for each state q in the processor, the address of the processor that contains δ(q, xⱼ) and of the processors which contain the state(s) δ⁻¹(q, xⱼ), for each j.
  while (not done)
    done = true;
    for k = 0 to |Σ| − 1 do
      Exchange outputs: All-to-all communication takes place as follows. Suppose a state s_t belongs to a processor t and suppose δ(s_t, x_j) belongs to myadd. The processor will send a message to t containing the block number of δ(s_t, x_j). The received messages are put in the third row, namely indexedbuffer[2, 0..n/p − 1].
      Parallel-sort: Sort indexedbuffer using the second and third rows together as the key. Note that after sorting, indexedbuffer's first row (the state indices) may not be ordered and may not belong to this node; that is, the sorting re-distributes the data items.
      Rename blocks: If two states have the same 'key', they are put in the same block of the refined partition. Put the new block numbers in indexedbuffer's second row. This is a global scanning operation on the indexedbuffer of all processors.
      Parallel-sort: Sort indexedbuffer using the first row as the key. This brings the new block number of each state back to the processor to which it belongs.
      if (a new block produced) done = false;
    end of for;
  end of while;
  block[0..n/p − 1] = indexedbuffer[1, 0..n/p − 1];
end.

6. Experimental Results on CM-5

Next, we present our experimental results on the CM-5. We ran the programs described above on many DFA's of different sizes on a CM-5 with 32 nodes. Each node has 4 vector units. The instances of DFA's used in this experiment were constructed using four different methods. All of them have two input symbols.

[Figure 1: Running Time vs. Problem Size. Figure 2: Scalability. Log-scale plots of time in seconds versus DFA size (10⁴ to 10⁶ states) and versus number of processors (up to about 300), with x marking the message passing program and o the data parallel program.]

In Figure 1, the number of processors is fixed at 32. When the input size is less than about 100,000, the time complexity is almost linear, so that it takes about twice as much time to solve an instance twice as large. But when the input size exceeds 100,000, nonlinearity shows up, especially for the message passing program. To study the scalability of our programs, we chose three DFA's (with 109078, 262144, and 524288 states) and minimized them using both programs on a CM-5 configured with 32, 64, 128, and 256 nodes. Figure 2 shows the average times. Both programs seem to scale well, but the message passing program is the better one.

7. Acknowledgments

Our programs were implemented and tested on a Connection Machine CM-5 at the National Center for Supercomputing Applications at Urbana-Champaign. We thank A. Tridgell of the Australian National University for his help with the sorting routine used in the message passing program.

References

[1] A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley Publishing Co., Reading, MA, 1974.
[2] S. Cho and D. Huynh. The parallel complexity of coarsest set partition problems. Information Processing Letters, 42:89-94, 1992.
[3] Y. Freund et al. Efficient learning of typical finite automata from random walks. In 25th ACM Symposium on Theory of Computing, pages 315-324, 1993.
[4] P. Hatcher and M. Quinn. Data-Parallel Programming on MIMD Computers. MIT Press, Cambridge, MA, 1991.
[5] J. Hopcroft. An n log n algorithm for minimizing states in a finite automaton. In Theory of Machines and Computations, pages 189-196. Academic Press, New York, 1971.
[6] J. Hopcroft and J. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley Publishing Co., Reading, MA, 1979.
[7] D. Huffman. The synthesis of sequential switching circuits. Journal of the Franklin Institute, 257(3):161-190, 1954.
[8] J. JáJá. An Introduction to Parallel Algorithms. Addison-Wesley Publishing Co., Reading, MA, 1992.
[9] J. JáJá and S. Kosaraju. Parallel algorithms for planar graph isomorphism and related problems. IEEE Transactions on Circuits and Systems, 35(3), March 1988.
[10] J. JáJá and K. Ryu. An efficient parallel algorithm for the single function coarsest partition problem. In ACM Symposium on Parallel Algorithms and Architectures, 1993.
[11] E. Moore. Gedanken-experiments on sequential circuits. In C. Shannon and J. McCarthy, editors, Automata Studies, pages 129-153. Princeton University Press, Princeton, NJ, 1956.
[12] Y. Srikant. A parallel algorithm for the minimization of finite state automata. Intern. J. Computer Math., 32:1-11, 1990.

[13] B. Trakhtenbrot and Y. Barzdin'. Finite Automata: Behavior and Synthesis. North-Holland Publishing Company, 1973.
[14] B. Watson. A taxonomy of finite automaton minimization algorithms. Technical report, Faculty of Mathematics and Computer Science, Eindhoven University of Technology, The Netherlands, 1994.