Java II: Finite Automata I
Bernd Kiefer
Bernd.Kiefer@dfki.de
Deutsches Forschungszentrum für künstliche Intelligenz
Finite Automata I p.1/13

Processing Regular Expressions

- We already learned about Java's regular expression functionality
- Now we get to know the machinery behind the Pattern and Matcher classes
- Compiling a regular expression into a Pattern object produces a finite automaton
- This automaton is then used to perform the matching tasks
- We will see how to construct a finite automaton that recognizes an input string, i.e., tries to find a full match

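As a reminder of the full-match behaviour referred to above, here is a minimal sketch using java.util.regex. The class name FullMatchDemo, the helper fullMatch, and the example expression are illustrative assumptions, not part of the slides.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FullMatchDemo {
    // Returns true only if the whole input matches the regular expression.
    static boolean fullMatch(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);   // compile once; a Pattern can be reused
        Matcher matcher = pattern.matcher(input);
        return matcher.matches();                   // matches() demands a full match
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("(a|b)*ab", "abab"));  // true
        System.out.println(fullMatch("(a|b)*ab", "abba"));  // false
    }
}
```

Matcher.find() would instead look for a match anywhere in the input; the automata on the following slides correspond to matches().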
Definition: Finite Automaton

A finite automaton (FA) is a tuple A = ⟨Q, Σ, δ, q_0, F⟩
- Q: a finite non-empty set of states
- Σ: a finite alphabet of input letters
- δ: a (total) transition function Q × Σ → Q
- q_0 ∈ Q: the initial state
- F ⊆ Q: the set of final (accepting) states

Transition graphs (diagrams):
[Figure: legend for initial state, states, transition, and final state; example automaton q_0 --d--> q_1 --o--> q_2 --g--> q_3]

Finite Automata: Matching

A finite automaton accepts a given input string s if there is a sequence of states p_1, p_2, ..., p_{|s|+1} ∈ Q such that
1. p_1 = q_0, the start state
2. δ(p_i, s_i) = p_{i+1}, where s_i is the i-th character in s
3. p_{|s|+1} ∈ F, i.e., a final state

- A string is successfully matched if we have found the appropriate sequence of states
- Imagine the string on an input tape with a pointer that is advanced when using a δ transition
- The set of strings accepted by an automaton is the accepted language, analogous to regular expressions

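The three acceptance conditions can be sketched as a loop over the input. The sketch below encodes the d-o-g automaton from the transition diagram as a table; the class name DogDfa, the integer encoding of q_0..q_3, and the assumption that q_3 is the only final state are illustrative. Missing table entries stand for an implicit dead state, which is a simplification of the total function δ.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class DogDfa {
    static final int START = 0;   // q_0
    static final int FINAL = 3;   // q_3, assumed to be the only final state
    // DELTA.get(state).get(symbol) is the next state; a missing entry means "reject".
    static final Map<Integer, Map<Character, Integer>> DELTA = new HashMap<>();

    static void edge(int from, char symbol, int to) {
        DELTA.computeIfAbsent(from, k -> new HashMap<>()).put(symbol, to);
    }

    static {
        edge(0, 'd', 1);
        edge(1, 'o', 2);
        edge(2, 'g', 3);
    }

    static boolean accepts(String s) {
        int state = START;                                   // condition 1: start in q_0
        for (int i = 0; i < s.length(); i++) {               // the "tape pointer"
            Map<Character, Integer> row = DELTA.getOrDefault(state, Collections.emptyMap());
            Integer next = row.get(s.charAt(i));             // condition 2: follow delta
            if (next == null) return false;                  // no transition: dead state
            state = next;
        }
        return state == FINAL;                               // condition 3: end in a final state
    }
}
```

For example, DogDfa.accepts("dog") holds, while "do" and "dogs" are rejected because the run ends outside F or has no transition to follow.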
(Non)deterministic Automata

- In the definition of automata, δ was a total function
  - given an input string, the path through the automaton is uniquely determined
  - those automata are therefore called deterministic
- For nondeterministic FA, δ is a transition relation δ : Q × (Σ ∪ {ε}) → P(Q), where P(Q) is the powerset of Q
  - allows transitions from one state into several states with the same input symbol
  - need not be total
  - can have transitions labeled ε (not in Σ), which represents the empty string

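RegExps → Automata

Construct nondeterministic automata from regular expressions
[Diagrams: one construction per operator, built from the sub-automata q_0α ... q_fα and q_0β ... q_fβ]
- (αβ): the automaton for α followed by the automaton for β
- (α | β): the automata for α and β between a new initial state q_0 and a new final state q_f
- (α)*: the automaton for α between a new initial state q_0 and a new final state q_f
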
NFA vs. DFA

- Traversing a DFA is easy given the input string: the path is uniquely determined
- In contrast, traversing an NFA requires keeping track of a set of (current) states, starting with the set {q_0}
- Processing the next input symbol means taking all possible outgoing transitions from this set and collecting the new set
- From every NFA, an equivalent DFA (one which accepts the same language) can be computed
- Basic idea: track the subsets that can be reached for every possible input

Traversing an NFA

[Figure: step-by-step traversal of an NFA]

NFA → DFA: Subset Construction

- Simulate in parallel all possible moves the automaton can make
- The states of the resulting DFA will represent sets of states of the NFA, i.e., elements of P(Q)
- We use two operations on states/state-sets of the NFA
  - ε-closure(T): set of states reachable from any state s in T on ε-transitions
  - move(T, a): set of states to which there is a transition from one state in T on input symbol a
- The final states of the DFA are those where the corresponding NFA subset contains a final state

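The two operations are already enough to traverse an NFA directly by tracking the current state set. The following is a minimal sketch under stated assumptions: the class name NfaSim, the EPS marker character, the integer state numbering, and the small hand-built NFA for (a|b)*ab are all illustrative and not taken from the slides. The input is assumed never to contain the EPS character.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NfaSim {
    static final char EPS = '\0';   // hypothetical marker for epsilon-labeled edges
    static final Map<Integer, Map<Character, Set<Integer>>> DELTA = new HashMap<>();
    static final Set<Integer> FINAL = Collections.singleton(3);

    static void edge(int from, char symbol, int to) {
        DELTA.computeIfAbsent(from, k -> new HashMap<>())
             .computeIfAbsent(symbol, k -> new HashSet<>())
             .add(to);
    }

    // Hand-built example NFA for (a|b)*ab with start state 0 and final state 3.
    static {
        edge(0, EPS, 1);
        edge(1, 'a', 1);
        edge(1, 'b', 1);
        edge(1, 'a', 2);   // nondeterministic choice on 'a'
        edge(2, 'b', 3);
    }

    static Set<Integer> targets(int state, char symbol) {
        return DELTA.getOrDefault(state, Collections.emptyMap())
                    .getOrDefault(symbol, Collections.emptySet());
    }

    // epsilon-closure(T): all states reachable from T using only epsilon edges.
    static Set<Integer> epsClosure(Set<Integer> states) {
        Set<Integer> closure = new HashSet<>(states);
        Deque<Integer> toCheck = new ArrayDeque<>(states);
        while (!toCheck.isEmpty()) {
            int t = toCheck.pop();
            for (int u : targets(t, EPS)) {
                if (closure.add(u)) toCheck.push(u);   // add() is true only for new states
            }
        }
        return closure;
    }

    // move(T, a): all states with an a-labeled edge from some state in T.
    static Set<Integer> move(Set<Integer> states, char symbol) {
        Set<Integer> result = new HashSet<>();
        for (int s : states) result.addAll(targets(s, symbol));
        return result;
    }

    static boolean accepts(String input) {
        Set<Integer> current = epsClosure(Collections.singleton(0));
        for (char c : input.toCharArray()) {
            current = epsClosure(move(current, c));
        }
        return !Collections.disjoint(current, FINAL);   // some current state is final
    }
}
```

Each input symbol costs work proportional to the size of the current state set, which is why traversing an NFA is slower than following a single DFA path.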
Algorithm: Subset Construction

proc SubsetConstruction(s_0)
  DFAStates := { ε-closure({s_0}) }, unmarked
  while there is an unmarked state T in DFAStates do
    mark T
    for each input symbol a do
      U := ε-closure(move(T, a))
      DFADelta[T, a] := U
      if U ∉ DFAStates then
        add U as unmarked state to DFAStates

proc ε-closure(T)
  ε-closure := T; to_check := T
  while to_check not empty do
    get some state t from to_check and remove it
    for each state u with an edge labeled ε from t to u
      if u ∉ ε-closure then
        add u to ε-closure and to_check

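The pseudocode translates fairly directly into Java if DFA states are represented as sets of NFA states. This is a sketch, not a reference implementation: the class name SubsetConstruction, the EPS marker, the fixed alphabet {a, b}, and the hand-built NFA for (a|b)*ab are assumptions made for illustration. A worklist of unmarked states replaces the explicit marking.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class SubsetConstruction {
    static final char EPS = '\0';              // hypothetical marker for epsilon edges
    static final char[] ALPHABET = {'a', 'b'};
    static final Set<Integer> NFA_FINAL = Collections.singleton(3);
    static final Map<Integer, Map<Character, Set<Integer>>> NFA = new HashMap<>();

    static void edge(int from, char symbol, int to) {
        NFA.computeIfAbsent(from, k -> new HashMap<>())
           .computeIfAbsent(symbol, k -> new HashSet<>()).add(to);
    }

    // Hand-built example NFA for (a|b)*ab, start state 0.
    static {
        edge(0, EPS, 1); edge(1, 'a', 1); edge(1, 'b', 1); edge(1, 'a', 2); edge(2, 'b', 3);
    }

    static Set<Integer> targets(int state, char symbol) {
        return NFA.getOrDefault(state, Collections.emptyMap())
                  .getOrDefault(symbol, Collections.emptySet());
    }

    static Set<Integer> epsClosure(Set<Integer> states) {
        Set<Integer> closure = new HashSet<>(states);
        Deque<Integer> toCheck = new ArrayDeque<>(states);
        while (!toCheck.isEmpty()) {
            int t = toCheck.pop();
            for (int u : targets(t, EPS)) if (closure.add(u)) toCheck.push(u);
        }
        return closure;
    }

    static Set<Integer> move(Set<Integer> states, char symbol) {
        Set<Integer> result = new HashSet<>();
        for (int s : states) result.addAll(targets(s, symbol));
        return result;
    }

    // Returns DFADelta; its key set is DFAStates (each key is a set of NFA states).
    static Map<Set<Integer>, Map<Character, Set<Integer>>> build() {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfaDelta = new LinkedHashMap<>();
        Deque<Set<Integer>> unmarked = new ArrayDeque<>();
        Set<Integer> start = epsClosure(Collections.singleton(0));
        dfaDelta.put(start, new HashMap<>());
        unmarked.add(start);
        while (!unmarked.isEmpty()) {
            Set<Integer> t = unmarked.remove();          // taking T off the worklist marks it
            for (char a : ALPHABET) {
                Set<Integer> u = epsClosure(move(t, a));
                dfaDelta.get(t).put(a, u);
                if (!dfaDelta.containsKey(u)) {          // U is a new DFA state
                    dfaDelta.put(u, new HashMap<>());
                    unmarked.add(u);
                }
            }
        }
        return dfaDelta;
    }

    // Runs the constructed DFA; a DFA state is final if it contains a final NFA state.
    static boolean accepts(String input) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = build();
        Set<Integer> state = epsClosure(Collections.singleton(0));
        for (char c : input.toCharArray()) {
            state = dfa.get(state).get(c);
            if (state == null) return false;             // symbol outside ALPHABET
        }
        return !Collections.disjoint(state, NFA_FINAL);
    }
}
```

For this example NFA, build() yields the four DFA states {0,1}, {1,2}, {1} and {1,3}. Using sets as map keys works only because they are never modified after insertion.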
Example: Subset Construction

[Figure: step-by-step subset construction. The DFA states built are the NFA state sets {0,1,2,4,7}, {1,2,3,4,6,7,8}, {1,2,4,5,6,7} and {1,2,4,5,6,7,9}]

Time/Space Considerations

- DFA traversal is linear in the length of the input string x
- An NFA needs O(n) space (states + transitions), where n is the length of the regular expression
- NFA traversal may need time n × |x|, so why use NFAs?
- There are DFA that have at least 2^n states!
- Solution 1: Lazy construction of the DFA: construct DFA states on the fly up to a certain amount and cache them
- Solution 2: Try to minimize the DFA: there is a unique (modulo state names) minimal automaton for a regular language!

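Solution 1 can be sketched as an NFA traversal that remembers the state sets it has already computed. Everything named here is an assumption for illustration: the class LazyDfa, the EPS marker, the maxCachedStates limit, the convention that state 0 is the NFA start state, and the example NFA for (a|b)*ab. Once the cache is full, this sketch simply falls back to uncached NFA steps; real implementations may evict entries instead.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LazyDfa {
    static final char EPS = '\0';     // hypothetical marker for epsilon edges
    final int maxCachedStates;        // hypothetical limit on cached DFA states
    final Map<Integer, Map<Character, Set<Integer>>> nfa = new HashMap<>();
    final Set<Integer> nfaFinal = new HashSet<>();
    // DFA transitions computed so far: DFA state (set of NFA states) -> symbol -> DFA state.
    final Map<Set<Integer>, Map<Character, Set<Integer>>> cache = new HashMap<>();

    LazyDfa(int maxCachedStates) { this.maxCachedStates = maxCachedStates; }

    void edge(int from, char symbol, int to) {
        nfa.computeIfAbsent(from, k -> new HashMap<>())
           .computeIfAbsent(symbol, k -> new HashSet<>()).add(to);
    }

    Set<Integer> targets(int state, char symbol) {
        return nfa.getOrDefault(state, Collections.emptyMap())
                  .getOrDefault(symbol, Collections.emptySet());
    }

    Set<Integer> epsClosure(Set<Integer> states) {
        Set<Integer> closure = new HashSet<>(states);
        Deque<Integer> toCheck = new ArrayDeque<>(states);
        while (!toCheck.isEmpty()) {
            int t = toCheck.pop();
            for (int u : targets(t, EPS)) if (closure.add(u)) toCheck.push(u);
        }
        return closure;
    }

    // One DFA step: use the cached transition if present, otherwise compute and store it.
    Set<Integer> step(Set<Integer> t, char a) {
        Map<Character, Set<Integer>> row = cache.get(t);
        if (row != null && row.containsKey(a)) return row.get(a);   // cache hit
        Set<Integer> moved = new HashSet<>();
        for (int s : t) moved.addAll(targets(s, a));                // move(T, a)
        Set<Integer> u = epsClosure(moved);
        if (row == null && cache.size() < maxCachedStates) {        // room for a new DFA state
            row = new HashMap<>();
            cache.put(t, row);
        }
        if (row != null) row.put(a, u);
        return u;
    }

    boolean accepts(String input) {
        Set<Integer> current = epsClosure(Collections.singleton(0)); // state 0 assumed as start
        for (char c : input.toCharArray()) current = step(current, c);
        return !Collections.disjoint(current, nfaFinal);
    }

    // Hand-built example NFA for (a|b)*ab.
    static LazyDfa example() {
        LazyDfa d = new LazyDfa(100);
        d.edge(0, EPS, 1); d.edge(1, 'a', 1); d.edge(1, 'b', 1); d.edge(1, 'a', 2); d.edge(2, 'b', 3);
        d.nfaFinal.add(3);
        return d;
    }
}
```

Only the DFA states that the inputs actually reach are ever built, so an exponential blow-up is paid for only if the inputs force it.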
Minimization Algorithm by Hopcroft

proc Minimize()
  B_1 = F; B_2 = Q \ F
  E = {B_1, B_2}
  k = 3
  for a ∈ Σ do
    a(i) = {s ∈ Q | s ∈ B_i ∧ ∃t : δ(t, a) = s}
    L = the smaller of the a(i)
  while L ≠ ∅ do
    take some i ∈ L and delete it
    for j < k s.th. ∃t ∈ B_j