Finite-state Machines: Theory and Applications

1 Finite-state Machines: Theory and Applications Unweighted Finite-state Automata Thomas Hanneforth Institut für Linguistik Universität Potsdam December 10, 2008 Thomas Hanneforth (Universität Potsdam) Finite-state Machines: Theory and Applications December 10, / 99

2 Overview
  1. Applications of finite-state machines
  2. Finite state acceptors and transducers: formal characterization
  3. Closure properties and algebra of finite state acceptors
  4. Closure properties and algebra of finite state transducers
  5. Equivalence transformations on finite state acceptors
  6. Equivalence transformations on finite state transducers
  7. Decidability properties of unweighted finite-state acceptors and transducers

3 Outline: 1. Applications of Finite-State Machines

5 Some Applications of Finite-State Machines in Computational Linguistics
  Morphological analysis: lemmatization, word segmentation, segmentation disambiguation
  Spelling correction
  Lexicon representation
  Part-of-speech tagging
  Shallow parsing, chunking
  Speech recognition, speech synthesis
  Optimality theory
  Language modeling and statistical language processing
  Statistical machine translation

6 Outline: 2. Finite state acceptors and transducers: formal characterization
  Finite-state acceptors
  Finite-state transducers

7 Finite state acceptors and transducers: formal characterization

8 Finite-state acceptors: Non-deterministic finite-state acceptor
Definition (Non-deterministic finite-state acceptor (NFA)): A non-deterministic finite-state acceptor $A$ is a 5-tuple $\langle Q, \Sigma, q_0, F, \delta \rangle$ where
  $Q$ is a non-empty set of states,
  $\Sigma$ is a non-empty set, called the alphabet of $A$,
  $q_0 \in Q$ is the start state,
  $F \subseteq Q$ is the set of final states,
  $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$ is the transition function.
Notes:
  $\delta$ may be a partial function (and usually is).
  Nondeterminism: the transition function $\delta$ maps a state $q$ and an alphabet symbol $a$ to a set of successor states.
  A transition may be labeled with $\varepsilon$, the neutral element of concatenation.

9 Finite-state acceptors
Example (Nondeterministic FSA $A_{lex}$ accepting some animal names) [figure]

10 Finite-state acceptors: Deterministic finite-state acceptor
Definition (Deterministic finite-state acceptor (DFA)): A deterministic finite-state acceptor $A$ is a 5-tuple $\langle Q, \Sigma, q_0, F, \delta \rangle$ where
  $Q$ is a non-empty set of states,
  $\Sigma$ is a non-empty set, called the alphabet of $A$,
  $q_0 \in Q$ is the start state,
  $F \subseteq Q$ is the set of final states,
  $\delta : Q \times \Sigma \to Q$ is the transition function.
Notes:
  Again, $\delta$ may be a partial function.
  DFAs are by definition $\varepsilon$-free, that is, they contain no $\varepsilon$-transitions.
  DFAs and NFAs have the same generative power, that is, both concepts are equivalent (cf. subset construction).

11 Finite-state acceptors
Example (Deterministic version of $A_{lex}$) [figure]
Deterministic acyclic FSAs are also called tries. Tries are useful for lexicon representation.

12 Finite-state acceptors: Examples
Example (DFA $A_{NP}$ accepting English noun phrase patterns) [figure]
Example (DFA $A_{English}$ accepting some English sentences) [figure]

13 Finite-state acceptors: Generalized transition function, language
Definition (Generalized transition function $\delta^*$): $\delta^*$ is the reflexive and transitive closure of $\delta$:
  $\delta^*(q, \varepsilon) = q$, for all $q \in Q$
  $\delta^*(q, aw) = \delta^*(\delta(q, a), w)$
Definition (Language of a DFA $A$): $L(A) = \{w \in \Sigma^* \mid \delta^*(q_0, w) \in F\}$. We also say that $L(A)$ is recognized by $A$.
Definition (Regular language): A language is called regular if there exists some DFA which recognizes it.
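The definition of $\delta^*$ and $L(A)$ can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of $\delta$ and the toy trie for "cat" and "cow" are assumptions made here, not part of the slides.

```python
def delta_star(delta, q, w):
    """Generalized transition function: follow w symbol by symbol from q.

    delta is a dict (state, symbol) -> state and may be partial;
    None is returned as soon as a transition is undefined.
    """
    for a in w:
        q = delta.get((q, a))
        if q is None:
            return None
    return q


def accepts(delta, q0, finals, w):
    """w is in L(A) iff delta*(q0, w) is a final state."""
    return delta_star(delta, q0, w) in finals


# Hypothetical toy DFA (a trie) for the words "cat" and "cow".
delta = {(0, "c"): 1, (1, "a"): 2, (2, "t"): 3, (1, "o"): 4, (4, "w"): 3}
finals = {3}
```

Here `delta_star(delta, 0, "")` returns the start state itself, which mirrors the base case $\delta^*(q, \varepsilon) = q$.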

14 Finite-state transducers: Definition
Definition ((Non-deterministic) finite-state transducer (NFST)): A (non-deterministic) finite-state transducer $T$ is a 7-tuple $\langle Q, \Sigma, \Delta, q_0, F, \delta, \sigma \rangle$ where
  $Q$ is a non-empty set of states,
  $\Sigma$ is a non-empty set, called the input alphabet of $T$,
  $\Delta$ is a non-empty set, called the output alphabet of $T$,
  $q_0 \in Q$ is the start state,
  $F \subseteq Q$ is the set of final states,
  $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$ is the transition function,
  $\sigma : Q \times (\Sigma \cup \{\varepsilon\}) \times Q \to \Delta^*$ is the output function.

15 Finite-state transducers: Alternative definition
To simplify some definitions, we combine the transition function and the output function into a set of transitions. In addition, we restrict the output function to single symbols or $\varepsilon$.
Definition (Normalized finite-state transducer): A normalized finite-state transducer $T$ is a 6-tuple $\langle Q, \Sigma, \Delta, q_0, F, E \rangle$ where
  $Q$ is a non-empty set of states,
  $\Sigma$ is a non-empty set, the input alphabet of $T$,
  $\Delta$ is a non-empty set, the output alphabet of $T$,
  $q_0 \in Q$ is the start state,
  $F \subseteq Q$ is the set of final states,
  $E \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times (\Delta \cup \{\varepsilon\}) \times Q$ is the set of transitions.
Note that every transducer can be transformed into a normalized transducer.

16 Finite-state transducers
Example ($T_{lex}$ mapping surface forms to morphological features) [figure]
Note that fish is nondeterministically mapped to {NOUN sg, NOUN pl}.

17 Finite-state transducers
Example (Laughter machine $T_{laugh}$) [figure]
The input string laugh is mapped to the infinite set $\{ha^n \mid n \geq 1\}$.

18 Finite-state transducers
Example (Bracketing machine $T_{bracket}$) [figure]
Every occurrence of ab is enclosed within brackets. For example, the input string cabbabc is mapped to c{ab}b{ab}c by traversing the state sequence [figure]

19 Finite-state transducers: Language
Definition (Language of a FST): The language $L(T)$ of a FST $T = \langle Q, \Sigma, \Delta, q_0, F, \delta, \sigma \rangle$ is defined in the following way:
  $L(T) = \{\langle u, v \rangle \mid \delta^*(q_0, u) \cap F \neq \emptyset \wedge v \in \sigma^*(q_0, u)\}$
$\delta^*$ is recursively defined:
  $\delta^*(q, \varepsilon) = \{q\}$ and $\delta^*(q, wa) = \bigcup_{q' \in \delta^*(q, w)} \delta(q', a)$
$\sigma^*$ is the generalized output function, defined as:
  $\sigma^*(q, \varepsilon) = \{\varepsilon\}$ and $\sigma^*(q, wa) = \sigma^*(q, w) \cdot \bigcup_{q' \in \delta^*(q, w),\ p \in \delta(q', a)} \sigma(q', a, p)$

20 Finite-state transducers: Transduction mapping
Definition (Transduction mapping): The transduction mapping $T : \Sigma^* \to 2^{\Delta^*}$ of a FST $T$ is defined as $T(x) = \{y \mid \langle x, y \rangle \in L(T)\}$.
Definition (Functional transducer): A transducer $T$ is called functional if $|T(x)| \leq 1$ for all $x \in \Sigma^*$.
Example: $T_{bracket}$ is functional. $T_{lex}$ is not functional.
Definition (Ambiguous transducer): A transducer $T$ is called ambiguous if $|T(x)| > 1$ for some $x \in \Sigma^*$.
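A rough sketch of the transduction mapping $T(x)$ for a normalized transducer follows. The transition tuples `(p, a, b, q)`, the empty string as $\varepsilon$ on the output side, and the toy transducer for fish are assumptions made for illustration; input-$\varepsilon$ transitions are deliberately not supported, so the result is always finite here.

```python
def transduce(edges, q0, finals, x):
    """Return T(x): all outputs on accepting paths whose input labels spell x.

    edges is a set of (p, a, b, q) with a a single input symbol and b an
    output symbol or "" for epsilon. Input-epsilon transitions are not handled.
    """
    # Each configuration pairs a state with the output symbols emitted so far.
    configs = {(q0, ())}
    for a in x:
        configs = {
            (q, out + ((b,) if b else ()))
            for (state, out) in configs
            for (p, a2, b, q) in edges
            if p == state and a2 == a
        }
    return {" ".join(out) for (state, out) in configs if state in finals}


# Hypothetical fragment in the spirit of T_lex: "fish" is singular or plural.
edges = {
    (0, "f", "NOUN", 1), (1, "i", "", 2), (2, "s", "", 3),
    (3, "h", "sg", 4), (3, "h", "pl", 4),
}
readings = transduce(edges, 0, {4}, "fish")
is_functional_on_fish = len(readings) <= 1
```

Because `readings` contains two outputs, this toy transducer is ambiguous (and therefore not functional) in the sense of the definitions above.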

21 Finite-state transducers: Ambiguous transducers
Definition (Finitely ambiguous transducer): A transducer $T$ is called finitely ambiguous if $|T(x)|$ is finite for all $x \in \Sigma^*$.
Example: $T_{lex}$ is finitely ambiguous.
Definition (Infinitely ambiguous transducer): A transducer $T$ is called infinitely ambiguous if $|T(x)|$ is infinite for some $x \in \Sigma^*$.
Example: $T_{laugh}$ is infinitely ambiguous.

22 Finite-state transducers: Deterministic finite-state transducer
Definition (Deterministic finite-state transducer (DFST)): A deterministic finite-state transducer $T$ is a 7-tuple $\langle Q, \Sigma, \Delta, q_0, F, \delta, \sigma \rangle$ where
  $Q$ is a non-empty set of states,
  $\Sigma$ is a non-empty set, called the input alphabet of $T$,
  $\Delta$ is a non-empty set, called the output alphabet of $T$,
  $q_0 \in Q$ is the start state,
  $F \subseteq Q$ is the set of final states,
  $\delta : Q \times \Sigma \to Q$ is the (deterministic) transition function,
  $\sigma : Q \times \Sigma \to \Delta^*$ is the (deterministic) output function.
Theorem: Every deterministic transducer is functional.

23 Outline: 3. Closure properties and algebra of finite state acceptors
  Union
  Concatenation
  Star closure
  Plus closure
  Reversal
  Complementation
  Intersection
  Difference
  Substitution
  Homomorphism

24 Closure properties and algebra of finite state acceptors

25 Closure properties and algebra of finite state acceptors
Definition (Closure of a set): Let $S$ be a set and let $f_k$ be a $k$-ary function taking $k$-tuples over $S$ as arguments. We say that $S$ is closed under $f_k$ if $f_k(a_1, a_2, \ldots, a_k) \in S$ for all $a_i \in S$.
Note: Closure properties are what makes a formalism modular. They allow us to build complex objects out of simpler ones by combining them with a fixed set of operations.

26 Closure properties and algebra of finite state acceptors: Closure properties of regular languages
The set of languages recognized by finite-state acceptors (the regular languages) is closed under
  Union
  Concatenation
  Plus and star closure
  Reversal
  Complementation
  Intersection
  Difference
  Homomorphism and substitution

27 Closure properties and algebra of finite state acceptors: Union
Example (Union of two acceptors) [figures: $A_1$, $A_2$, $A_1 \cup A_2$]

28 Closure properties and algebra of finite state acceptors: Concatenation
Example (Concatenation of two acceptors) [figures: $A_1$, $A_2$, $A_1 \cdot A_2$]

29 Closure properties and algebra of finite state acceptors: Star closure
Example (Star (Kleene) closure of an acceptor) [figures: $A$, $A^*$]

30 Closure properties and algebra of finite state acceptors: Plus closure
Example (Plus closure of an acceptor) [figures: $A$, $A^+$]

31 Closure properties and algebra of finite state acceptors: Reversal
Definition (Reversal of a string): The reversal of a string $w \in \Sigma^*$, denoted by $w^R$, is defined as:
  $\varepsilon^R = \varepsilon$
  $(a \cdot w)^R = w^R \cdot a$, for $a \in \Sigma$, $w \in \Sigma^*$
Example: $obama^R = bama^R \cdot o = ama^R \cdot bo = ma^R \cdot abo = a^R \cdot mabo = \varepsilon^R \cdot amabo = amabo$
Definition (Reversal of a string set): Let $S \subseteq \Sigma^*$ be a set of strings. The reversal of $S$, denoted by $S^R$, is defined as $S^R = \{w^R \mid w \in S\}$.
Theorem: The set of regular languages is closed under reversal.

32 Closure properties and algebra of finite state acceptors: Reversal
Example (Reversal of an acceptor) [figures: $A$, $A^R$]

33 Closure properties and algebra of finite state acceptors: Complementation
Given an alphabet $\Sigma$ and a FSA $A$, you sometimes need a FSA $\overline{A}$ representing all strings $x$ over $\Sigma$ which are not in $L(A)$. Formally:
  $L(\overline{A}) = \{x \in \Sigma^* \mid x \notin L(A)\}$ or $L(\overline{A}) = \Sigma^* - L(A)$
[figure: $L(A)$ and $L(\overline{A})$]
Theorem: The set of regular languages is closed under complementation.

34 Closure properties and algebra of finite state acceptors: Complementation: algorithm
Algorithm:
  1. Determinize $A$ and obtain $A'$.
  2. Make $A'$ complete by adding a sink state $s$ and adding, for each state $q$ and each symbol $a \in \Sigma$ not already used at $q$, a transition $\delta(q, a) = s$.
  3. Exchange final and non-final states.
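Steps 2 and 3 of the complementation algorithm can be sketched as follows, assuming the input is already deterministic (step 1 is skipped). The dictionary encoding, the state name `"sink"` and the toy DFA for cat over {a, c, r, t} are illustrative assumptions.

```python
def complement(states, alphabet, delta, q0, finals):
    """Complement a DFA: complete it with a sink state, then swap finals.

    delta is a dict (state, symbol) -> state and may be partial.
    """
    sink = "sink"  # hypothetical name; must not clash with an existing state
    all_states = set(states) | {sink}
    total = dict(delta)
    for q in all_states:
        for a in alphabet:
            total.setdefault((q, a), sink)  # step 2: add missing transitions
    new_finals = all_states - set(finals)   # step 3: exchange final/non-final
    return all_states, total, q0, new_finals


def accepts(delta, q0, finals, w):
    """Run a complete DFA on w (all symbols of w must be in the alphabet)."""
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in finals


# Hypothetical DFA accepting only "cat" over the alphabet {a, c, r, t}.
alphabet = {"a", "c", "r", "t"}
delta = {(0, "c"): 1, (1, "a"): 2, (2, "t"): 3}
c_states, c_delta, c_q0, c_finals = complement({0, 1, 2, 3}, alphabet, delta, 0, {3})
```

The complemented acceptor rejects cat and accepts every other string over the alphabet, including the empty string.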

35 Closure properties and algebra of finite state acceptors: Complementation
Example (Complementation of a finite-state acceptor) [figures: $A$ with $\Sigma = \{a, c, r, t\}$, and $\overline{A}$]

36 Closure properties and algebra of finite state acceptors: Complementation
Example (Why complementation works): Consider a trie for $W = \{cat, camel, dog, frog\}$.
Definition (Trie): Let $W$ be a finite set of words over $\Sigma$. Let $Pref(W)$ be the set of all prefixes of $W$. Define a DFA $A = \langle Pref(W), \Sigma, \varepsilon, W, \delta \rangle$ with $\delta(x, a) = xa$ for all $a \in \Sigma$ and all $x, xa \in Pref(W)$.

37 Closure properties and algebra of finite state acceptors: Complementation
Example (A trie for $W = \{cat, camel, dog, frog\}$) [figure]
States are labeled with prefixes of $W$.

38 Closure properties and algebra of finite state acceptors: Complementation
In a trie, a special acyclic DFA, each state corresponds to a single prefix of a word in $W$. In a general DFA $A$, each state $q$ corresponds to a set of prefixes of the words in $L(A)$, the left language of $q$.
Definition (Left language): The left language of a state $q$, symbolically $\overleftarrow{L}(q)$, is defined as $\overleftarrow{L}(q) = \{w \in \Sigma^* \mid \delta^*(q_0, w) = q\}$.
Example (Left language) [figure]: $\overleftarrow{L}(1) = \{a(ba)^n \mid n \geq 0\}$

39 Closure properties and algebra of finite state acceptors: Complementation: Why can we only complement DFAs?
Example (Complementation of an NFA) [figure]
The complemented NFA still accepts ab.

40 Closure properties and algebra of finite state acceptors: Complementation: Importance
Complementation (negation) is important for the inherent robustness of methods based on finite-state automata.
An NLP system is called robust if there is no input string for which it fails. That means: a robust NLP system accepts $\Sigma^*$.
This is immediately related to negation: if there is some input string $w$ which is not accepted by some DFA $A$ (say, a DFA representing some NP grammar about stock indices), one could use $\overline{A}$ to accept $w$, since $L(A) \cup L(\overline{A}) = \Sigma^*$.
This property does not carry over to context-free languages: they are not closed under complementation.
Context-sensitive languages are, perhaps surprisingly, again closed under complementation. But their recognition problem is notoriously difficult.

41 Closure properties and algebra of finite state acceptors: Complementation: the sting in the tail: complexity
Theorem: Consider an NFA $A$ with state set $Q$. The state complexity of an equivalent DFA $A'$ can in the worst case be in $O(|\Sigma| \cdot 2^{|Q|})$.
Example (Worst case of determinization): Consider the regular language $L = \Sigma^* a (a \cup b)^k$ for some $k$ (with $\Sigma = \{a, b\}$). While an NFA for $L$ has $k + 2$ states, the equivalent DFA has $2^{k+1}$ states. [figure: $k = 2$]

42 Closure properties and algebra of finite state acceptors: Intersection
If we know that FSAs are closed under complementation and union, then we also know that they are closed under intersection. Why? By De Morgan:
  $A \cap B \equiv \overline{\overline{A} \cup \overline{B}}$
But this approach is very expensive, since it requires three complementation operations, which in turn require determinization. There is a more direct method: we let pairs of states of the original FSAs be the states of the intersection FSA.

43 Closure properties and algebra of finite state acceptors: Intersection: Example
Example (Intersection with the product state construction) [figures: $A_1$, $A_2$, $A_1 \cap A_2$]

44 Closure properties and algebra of finite state acceptors: Intersection: formal definition
Definition (Intersection of two finite-state acceptors): Let $A_1 = \langle Q_1, \Sigma_1, q_{01}, F_1, \delta_1 \rangle$ and $A_2 = \langle Q_2, \Sigma_2, q_{02}, F_2, \delta_2 \rangle$ be two FSAs. $A_1 \cap A_2$, the intersection of $A_1$ and $A_2$, is the acceptor
  $A = \langle Q_1 \times Q_2, \Sigma_1 \cap \Sigma_2, \langle q_{01}, q_{02} \rangle, F_1 \times F_2, \delta \rangle$
where $\langle p', q' \rangle \in \delta(\langle p, q \rangle, a)$ if $p' \in \delta_1(p, a)$ and $q' \in \delta_2(q, a)$, for all $a \in \Sigma_1 \cap \Sigma_2$.
This mathematical approach generates, in the worst as in the best case, a FSA with $|Q_1| \cdot |Q_2|$ states. But a lot of these states may not contribute to the language.

45 Closure properties and algebra of finite state acceptors: Intersection: useless states
Definition (Inaccessible and non-coaccessible states): Let $A$ be a finite-state automaton (acceptor or transducer) with start state $q_0$.
  A state $q$ in $A$ is called inaccessible if there is no path in $A$ from $q_0$ to $q$.
  A state $q$ in $A$ is called non-coaccessible if there is no path in $A$ from $q$ to a final state of $A$.
  A state is called useless if it is inaccessible or non-coaccessible.
  A finite-state automaton $A$ is called trim or connected if it has no useless states.

46 Closure properties and algebra of finite state acceptors: Intersection: removal of useless states
Algorithm connect(A)
Require: FSM $A$ with start state $q_0$, state set $Q$ and final state set $F$
Ensure: $A$ without useless states
  1. Perform a depth-first search starting at $q_0$ and mark each visited state
  2. Delete each unmarked state $q$ and all its incoming and outgoing transitions
  3. Reverse $A$
  4. Unmark all states in $Q$
  5. Perform a depth-first search starting at all states $q \in F$ and mark each visited state
  6. Delete each unmarked state $q$ and all its incoming and outgoing transitions
  7. Reverse $A$
Complexity of connect(A): If $A$ has $|Q|$ states and $|E|$ transitions, the complexity of connect(A) is in $O(|Q| + |E|)$.
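A compact way to sketch connect in Python is shown below. It is a variant rather than a literal transcription: instead of deleting states in two passes and reversing the automaton in place, it computes the accessible and the coaccessible states with two searches (the second over reversed edges) and keeps their intersection, which yields the same trim automaton. The triple encoding of transitions and the function names are assumptions.

```python
def reachable(start, succ):
    """Iterative depth-first search; returns all states reachable from start."""
    seen, stack = set(start), list(start)
    while stack:
        p = stack.pop()
        for q in succ.get(p, ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


def connect(edges, q0, finals):
    """Return the useful states and the transitions between them.

    edges is a set of (source, label, target) triples.
    """
    forward, backward = {}, {}
    for p, a, q in edges:
        forward.setdefault(p, set()).add(q)
        backward.setdefault(q, set()).add(p)  # edges of the reversed automaton
    accessible = reachable({q0}, forward)
    coaccessible = reachable(set(finals), backward)
    useful = accessible & coaccessible
    kept = {(p, a, q) for (p, a, q) in edges if p in useful and q in useful}
    return useful, kept


# Hypothetical automaton: state 3 is non-coaccessible, state 4 is inaccessible.
edges = {(0, "a", 1), (1, "b", 2), (0, "c", 3), (4, "d", 2)}
useful, kept = connect(edges, 0, {2})
```

Each search touches every state and transition at most once, in line with the $O(|Q| + |E|)$ bound stated above.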

47 Closure properties and algebra of finite state acceptors: Intersection: algorithm
Require: FSAs $A_1 = \langle Q_1, \Sigma_1, q_{01}, F_1, \delta_1 \rangle$ and $A_2 = \langle Q_2, \Sigma_2, q_{02}, F_2, \delta_2 \rangle$
Ensure: $A = A_1 \cap A_2$
  $Q := \{\langle q_{01}, q_{02} \rangle\}$; $F := \emptyset$
  if $q_{01} \in F_1 \wedge q_{02} \in F_2$ then $F := \{\langle q_{01}, q_{02} \rangle\}$
  ENQUEUE($S$, $\langle q_{01}, q_{02} \rangle$)
  while $S \neq \emptyset$ do
    $\langle q_1, q_2 \rangle$ := DEQUEUE($S$)
    for all $a \in \Sigma_1 \cap \Sigma_2$ do
      for all $q_1' \in \delta_1(q_1, a)$ and $q_2' \in \delta_2(q_2, a)$ do
        $\delta(\langle q_1, q_2 \rangle, a) := \delta(\langle q_1, q_2 \rangle, a) \cup \{\langle q_1', q_2' \rangle\}$
        if $\langle q_1', q_2' \rangle \notin Q$ then
          $Q := Q \cup \{\langle q_1', q_2' \rangle\}$
          if $q_1' \in F_1 \wedge q_2' \in F_2$ then
            $F := F \cup \{\langle q_1', q_2' \rangle\}$
          end if
          ENQUEUE($S$, $\langle q_1', q_2' \rangle$)
        end if
      end for
    end for
  end while
  CONNECT($A$)
  return $A$
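The queue-based product construction might look as follows in Python. This is a sketch under assumptions: both acceptors are $\varepsilon$-free, $\delta$ is encoded as a dict from (state, symbol) to a set of states, and the final trimming step (CONNECT) is left out. The two toy acceptors are hypothetical.

```python
from collections import deque


def intersect(delta1, q01, finals1, delta2, q02, finals2, alphabet):
    """Product construction that only builds accessible state pairs."""
    start = (q01, q02)
    states, finals, delta = {start}, set(), {}
    if q01 in finals1 and q02 in finals2:
        finals.add(start)
    queue = deque([start])
    while queue:
        q1, q2 = queue.popleft()
        for a in alphabet:
            for p1 in delta1.get((q1, a), ()):
                for p2 in delta2.get((q2, a), ()):
                    delta.setdefault(((q1, q2), a), set()).add((p1, p2))
                    if (p1, p2) not in states:
                        states.add((p1, p2))
                        if p1 in finals1 and p2 in finals2:
                            finals.add((p1, p2))
                        queue.append((p1, p2))
    return states, delta, start, finals


# A1: even number of a's (state 0 final); A2: strings ending in b (state 1 final).
d1 = {(0, "a"): {1}, (1, "a"): {0}, (0, "b"): {0}, (1, "b"): {1}}
d2 = {(0, "a"): {0}, (0, "b"): {1}, (1, "a"): {0}, (1, "b"): {1}}
states, delta, start, finals = intersect(d1, 0, {0}, d2, 0, {1}, {"a", "b"})
```

In this example all four pairs are accessible, and the only final pair is `(0, 1)`: an even number of a's and a trailing b.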

48 Closure properties and algebra of finite state acceptors: Intersection: practical importance
Closure under intersection means that we can develop constraints independently of each other and then enforce them simultaneously by intersecting them.
A lot of finite-state based NLP is based on intersection: (two-level) morphology, constraint-based grammar, pattern matching, etc.

49 Closure properties and algebra of finite state acceptors: Difference
Definition (Difference): Let $A_1$ and $A_2$ be two FSAs. The difference $A_1 - A_2$ is defined as:
  $A_1 - A_2 \equiv A_1 \cap \overline{A_2}$

50 Closure properties and algebra of finite state acceptors: Difference
Example (Difference) [figures: $A_1$, $A_2$, $A_1 - A_2$]

51 Closure properties and algebra of finite state acceptors: Substitution
Definition (Substitution): A substitution is a mapping $s : \Sigma \to 2^{\Delta^*}$ for two alphabets $\Sigma$ and $\Delta$. $s$ is generalized to $s^* : \Sigma^* \to 2^{\Delta^*}$ by:
  $s^*(\varepsilon) = \varepsilon$
  $s^*(xa) = s^*(x) s(a)$
Theorem (Closure under substitution): The set of regular languages is closed under substitution with regular languages.
Note: A lot of finite-state based NLP is based on closure under substitution.

52 Closure properties and algebra of finite state acceptors: Substitution
Example (Substitution in computational morphology) [figures: a morphology rule as a FSA $A$; $A_1$; $A_2$; result of the substitution $A\{NSTEM = A_1, NINFL = A_2\}$]

53 Closure properties and algebra of finite state acceptors: Homomorphism
Definition (Homomorphism): A homomorphism is a mapping $h : \Sigma \to \Delta^*$ for two alphabets $\Sigma$ and $\Delta$.
Definition (Inverse homomorphism): Given a homomorphism $h$, the inverse homomorphic image $h^{-1}$ of a language $L$ is defined as $h^{-1}(L) = \{w \mid h(w) \in L\}$.
Theorem (Closure under homomorphism): The set of regular languages is closed under homomorphism and inverse homomorphism.

54 Outline: 4. Closure properties and algebra of finite state transducers
  Projection
  Composition
  Cross product
  Inversion
  Intersection

55 Closure properties and algebra of finite state transducers

56 Closure properties and algebra of finite state transducers
The set of finite state transducers is closed under
  Union
  Concatenation
  Closure
  Reversal
  Projection (note that this leads to FSAs)
  Composition
  Inversion
Finite state transducers are not closed under
  Complementation
  Intersection (but acyclic and $\varepsilon$-free transducers are)
  Difference

57 Closure properties and algebra of finite state transducers: Projection
Definition (First and second projection): Let $T = \langle Q, \Sigma, \Delta, q_0, F, S \rangle$ be a transducer.
The first projection of $T$, symbolically $\pi_1(T)$, is the FSA $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ where, for all $a \in \Sigma \cup \{\varepsilon\}$, $\delta(p, a) = \{q \mid \exists b \in \Delta \cup \{\varepsilon\} : \langle p, a, b, q \rangle \in S\}$.
The second projection of $T$, symbolically $\pi_2(T)$, is the FSA $A = \langle Q, \Delta, q_0, F, \delta \rangle$ where, for all $b \in \Delta \cup \{\varepsilon\}$, $\delta(p, b) = \{q \mid \exists a \in \Sigma \cup \{\varepsilon\} : \langle p, a, b, q \rangle \in S\}$.
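On a set-of-transitions encoding, projection amounts to dropping one label from every transition. The sketch below assumes tuples `(p, a, b, q)` with the empty string standing for $\varepsilon$; the function name and the `side` parameter are hypothetical.

```python
def project(edges, side):
    """Project transducer transitions (p, a, b, q) onto one tape.

    side=1 keeps the input label a, side=2 keeps the output label b.
    States, start state and final states stay unchanged.
    """
    if side not in (1, 2):
        raise ValueError("side must be 1 or 2")
    return {(p, (a if side == 1 else b), q) for (p, a, b, q) in edges}


# Hypothetical two-transition transducer; "" stands for epsilon.
edges = {(0, "a", "x", 1), (1, "b", "", 2)}
first = project(edges, 1)
second = project(edges, 2)
```

Note that the second projection of this example contains an $\varepsilon$-transition, so a projected acceptor may need $\varepsilon$-removal afterwards.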

58 Closure properties and algebra of finite state transducers: Projection
Example (Projection) [figures: transducer $T$, $\pi_1(T)$, $\pi_2(T)$]

59 Closure properties and algebra of finite state transducers: Composition
Composing a transducer $T_1$ with a transducer $T_2$ (formally $T_1 \circ T_2$) means: take some input $u$ for $T_1$, collect the output $v$ of $T_1$, feed it as input into $T_2$ and collect the output $w$ of $T_2$.
Definition (Composition relation): Let $T_1 = \langle Q_1, \Sigma_1, \Delta_1, q_{01}, F_1, S_1 \rangle$ and $T_2 = \langle Q_2, \Sigma_2, \Delta_2, q_{02}, F_2, S_2 \rangle$ be transducers.
  $L(T_1 \circ T_2) = \{\langle u, w \rangle \in \Sigma_1^* \times \Delta_2^* \mid \exists v \in \Delta_1^* \cap \Sigma_2^* : \langle u, v \rangle \in L(T_1) \wedge \langle v, w \rangle \in L(T_2)\}$

60 Closure properties and algebra of finite state transducers: Composition
Definition ($\varepsilon$-free composition): Let $T_1 = \langle Q_1, \Sigma_1, \Delta_1, q_{01}, F_1, E_1 \rangle$ and $T_2 = \langle Q_2, \Sigma_2, \Delta_2, q_{02}, F_2, E_2 \rangle$ be two normalized, $\varepsilon$-free FSTs. $T_1 \circ T_2$, the composition of $T_1$ and $T_2$, is the transducer
  $T = \langle Q_1 \times Q_2, \Sigma_1, \Delta_2, \langle q_{01}, q_{02} \rangle, F_1 \times F_2, E \rangle$
where
  $E = \{\langle \langle p, q \rangle, a, b, \langle p', q' \rangle \rangle \mid \exists c \in \Delta_1 \cap \Sigma_2 : \langle p, a, c, p' \rangle \in E_1 \wedge \langle q, c, b, q' \rangle \in E_2\}$
Properties of composition:
  The composition operation is not commutative, that is, in general $T_1 \circ T_2 \neq T_2 \circ T_1$.
  The composition operation is associative, that is, $T_1 \circ T_2 \circ T_3 = (T_1 \circ T_2) \circ T_3 = T_1 \circ (T_2 \circ T_3)$.
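The definition of $E$ for $\varepsilon$-free composition translates almost literally into a set comprehension. The sketch below builds the full product, so useless pair states are not removed; the tuple encoding and the two one-state toy transducers are assumptions for illustration.

```python
def compose(edges1, q01, finals1, edges2, q02, finals2):
    """Epsilon-free composition of two normalized transducers.

    Transitions are (p, a, b, q) tuples. A transition of the result exists
    whenever the output label of T1 equals the input label of T2.
    """
    edges = {
        ((p1, p2), a, b, (q1, q2))
        for (p1, a, c1, q1) in edges1
        for (p2, c2, b, q2) in edges2
        if c1 == c2
    }
    finals = {(f1, f2) for f1 in finals1 for f2 in finals2}
    return edges, (q01, q02), finals


# Hypothetical transducers: t1 rewrites a to b, t2 rewrites b to c.
t1 = {(0, "a", "b", 0)}
t2 = {(0, "b", "c", 0)}
e12, start12, finals12 = compose(t1, 0, {0}, t2, 0, {0})
e21, start21, finals21 = compose(t2, 0, {0}, t1, 0, {0})
```

The two results differ: `e12` maps a to c, while `e21` has no transitions at all, which illustrates that composition is not commutative.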

61 Closure properties and algebra of finite state transducers: Composition
How does composition work? [figures]
Whenever $T_1$ contains a transition $\langle p, a, c, p' \rangle$ and $T_2$ contains a transition $\langle q, c, b, q' \rangle$, $T$ will contain a transition $\langle \langle p, q \rangle, a, b, \langle p', q' \rangle \rangle$.

62 Closure properties and algebra of finite state transducers: Composition
Example (Composition) [figures]: FST mapping NP-patterns to = FST repeatedly mapping NP category words to their categories

63 Closure properties and algebra of finite state transducers: Application
Definition (Identity transducer): Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be a FSA. The identity transducer $ID(A)$ is defined by $\langle Q, \Sigma, \Sigma, q_0, F, E \rangle$ where
  $E = \{\langle p, a, a, q \rangle \mid p, q \in Q, a \in \Sigma \cup \{\varepsilon\} : q \in \delta(p, a)\}$
Example (Identity transducer) [figure]
Definition (Application): The application of a FST $T$ to a FSA $A$, symbolically $T[A]$, is defined as $T[A] \equiv \pi_2(ID(A) \circ T)$.

64 Closure properties and algebra of finite state transducers: Application
Example [figures: FSA $A$; $ID(A)$; FST $T$; $ID(A) \circ T$; $T[A] = \pi_2(ID(A) \circ T)$]

65 Closure properties and algebra of finite state transducers: Composition: relationship to intersection
Composition can be considered a generalization of intersection. The intersection of two FSAs $A_1$ and $A_2$ can be defined as follows:
  $A_1 \cap A_2 = \pi_1(ID(A_1) \circ ID(A_2))$
So, intersecting two FSAs is done by composing their identity transducers and afterwards projecting one of the tapes.
Composing two transducers $X$ and $Y$ means synchronizing (intersecting) their inner tapes and then combining the outer tapes. [figure]

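66 Closure properties and algebra of finite state transducers: Composition: handling $\varepsilon$-transitions
It is possible to generalize the composition definition to transducers with $\varepsilon$-transitions:
Definition (Transducer composition): Let $T_1 = \langle Q_1, \Sigma_1, \Delta_1, q_{01}, F_1, E_1 \rangle$ and $T_2 = \langle Q_2, \Sigma_2, \Delta_2, q_{02}, F_2, E_2 \rangle$ be two normalized FSTs. $T_1 \circ T_2$, the composition of $T_1$ and $T_2$, is the transducer
  $T = \langle Q_1 \times Q_2, \Sigma_1, \Delta_2, \langle q_{01}, q_{02} \rangle, F_1 \times F_2, E \cup E_{\varepsilon} \cup E_{i,\varepsilon} \cup E_{o,\varepsilon} \rangle$
where
  1. $E = \{\langle \langle p, q \rangle, a, b, \langle p', q' \rangle \rangle \mid \exists c \in \Delta_1 \cap \Sigma_2 : \langle p, a, c, p' \rangle \in E_1 \wedge \langle q, c, b, q' \rangle \in E_2\}$
  2. $E_{\varepsilon} = \{\langle \langle p, q \rangle, a, b, \langle p', q' \rangle \rangle \mid \langle p, a, \varepsilon, p' \rangle \in E_1 \wedge \langle q, \varepsilon, b, q' \rangle \in E_2\}$
  3. $E_{i,\varepsilon} = \{\langle \langle p, q \rangle, \varepsilon, a, \langle p, q' \rangle \rangle \mid \langle q, \varepsilon, a, q' \rangle \in E_2 \wedge p \in Q_1\}$
  4. $E_{o,\varepsilon} = \{\langle \langle p, q \rangle, a, \varepsilon, \langle p', q \rangle \rangle \mid \langle p, a, \varepsilon, p' \rangle \in E_1 \wedge q \in Q_2\}$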

67 Closure properties and algebra of finite state transducers: Composition: handling $\varepsilon$-transitions
There are four different ways in which $\varepsilon$ and alphabet symbols on the second tape of $T_1$ and the first tape of $T_2$ can interact:
  1. $T_1$ contains an $a : c$-transition and $T_2$ contains a $c : b$-transition: this is handled in the same way as in the $\varepsilon$-free case.
  2. $T_1$ contains an $a : \varepsilon$-transition and $T_2$ contains an $\varepsilon : b$-transition: then $T$ contains an $a : b$-transition. That is, $\varepsilon$ is treated as a regular symbol.
  3. $T_1$ stays in the same state, $T_2$ moves on. [figure]
  4. $T_1$ moves on, $T_2$ stays in the same state. [figure]

68 Closure properties and algebra of WFSA: Composition
Example (Composition of two unweighted FSTs) [figures: $T_1$, $T_2$, $T_1 \circ T_2$]

69 Closure properties and algebra of finite state transducers: Composition: Application
Composition is a very important operation for building processing or filtering cascades, for example in robust parsing and morphological analysis.
Since composition is not commutative, the order of a transducer cascade $C = T_1 \circ T_2 \circ \ldots \circ T_k$ matters. This may lead to problems related to feeding, counter-feeding, bleeding and counter-bleeding.
Since the composition operation is associative, the order in which the compositions in $C$ are computed does not matter. This leaves some degrees of freedom for implementing such cascades.
Note that the state complexity of $T_1 \circ T_2 \circ \ldots \circ T_k$ is $|Q_1| \cdot |Q_2| \cdot \ldots \cdot |Q_k|$ in the worst case.

70 Closure properties and algebra of finite state transducers: Cross product
Definition ((Cartesian) Product): Given two sets $S_1$ and $S_2$, the Cartesian product $S_1 \times S_2$ is defined as:
  $S_1 \times S_2 = \{\langle x, y \rangle \mid x \in S_1 \wedge y \in S_2\}$
Theorem (Product of regular sets): Let $A_1 = \langle Q_1, \Sigma_1, q_{01}, F_1, \delta_1 \rangle$ and $A_2 = \langle Q_2, \Sigma_2, q_{02}, F_2, \delta_2 \rangle$ be two finite-state acceptors. Then $L(A_1) \times L(A_2)$ is representable by a finite-state transducer $A_1 \times A_2$.
Proof: $A_1 \times A_2 \equiv ID(A_1) \circ T_{\Sigma_1 \times \varepsilon} \circ T_{\varepsilon \times \Sigma_2} \circ ID(A_2)$
Note: Cross product is the core of all replacement operations.

71 Closure properties and algebra of finite state transducers: Cross product
Example (Cross product) [figures: $A_1$, $A_2$, $T_{\Sigma_1 \times \varepsilon}$, $T_{\varepsilon \times \Sigma_2}$, $A_1 \times A_2$]

72 Closure properties and algebra of finite state transducers: Inversion
Definition (Inversion): Let $T = \langle Q, \Sigma, \Delta, q_0, F, E \rangle$ be a transducer. The inversion of $T$, symbolically $T^{-1}$, is the FST $T^{-1} = \langle Q, \Delta, \Sigma, q_0, F, E^{-1} \rangle$ where
  $E^{-1} = \{\langle p, b, a, q \rangle \mid \langle p, a, b, q \rangle \in E\}$
Note: Thus, inversion simply exchanges the input and output tapes of a transducer.
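Since inversion only swaps the two labels of every transition, it is a one-liner on a set-of-transitions encoding. The tuple layout `(p, a, b, q)` and the toy transition below are assumptions for illustration.

```python
def invert(edges):
    """Swap input and output labels of every transition (p, a, b, q)."""
    return {(p, b, a, q) for (p, a, b, q) in edges}


# Hypothetical analysis transition: the word "fish" is mapped to the tag "NOUN".
analysis = {(0, "fish", "NOUN", 1)}
generation = invert(analysis)
```

Applying `invert` twice returns the original transition set, so the same machine can serve both directions of a mapping.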

73 Closure properties and algebra of finite state transducers: Inversion
Example (Morphological analysis vs. generation) [figures: FST $T_{Morph}$ mapping words to morphological categories; FST $T_{Morph}^{-1}$ mapping morphological categories to words]

74 Closure properties and algebra of finite state transducers: Why are FSTs not closed under intersection?
Example [figures: $T_{a^n b^n c^*}$, $T_{a^n b^* c^n}$]
The intersection of $T_{a^n b^n c^*}$ and $T_{a^n b^* c^n}$ would result in the relation $R = \{\langle a^n, b^n c^n \rangle \mid n \geq 0\}$, which is not regular and thus not representable by a finite-state transducer.
Note: This has consequences for creating applications based on finite-state transducers. They cannot be based on the intersection of constraints represented as transducers.

75 Closure properties and algebra of finite state transducers: Why are FSTs not closed under intersection?
Intuitively, the existence of $\varepsilon$ within loops, which leads to infinite ambiguity, is the reason why FSTs are not closed under intersection.
Thus, $\varepsilon$-free FSTs, also called equal-length transducers, are closed under intersection.
The same is true for acyclic FSTs, where we have some freedom in where to realize the $\varepsilon$-transitions.
By De Morgan, non-closure under intersection leads to non-closure under complementation.

76 Outline 5 Equivalence transformations on finite-state acceptors ε-removal Determinization Minimization

77 Equivalence transformations on finite-state acceptors

78 Equivalence transformations on finite-state acceptors Equivalence transformations Equivalence transformations are operations on automata which change the topology of an automaton without changing its language. They usually serve optimization purposes, that is, they create smaller and/or faster automata. Finite-state acceptors admit the following equivalence transformations: ε-removal, determinization, and minimization.

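79 Equivalence transformations on finite-state acceptors ε-removal
Definition ($\xrightarrow{w}$) Let $p$ and $q$ be states in $Q$ and let $w$ be a string in $\Sigma^*$. Let $\xrightarrow{w}$ be a relation $\subseteq Q \times Q$ such that $\langle p, q \rangle \in \xrightarrow{w}$ if there is a path labeled with $w$ from $p$ to $q$.
Definition (ε-closure) Given an NFA $A = \langle Q, \Sigma, q_0, F, \delta \rangle$, $\varepsilon\text{-closure}(q) = \{q\} \cup \{p \in Q \mid q \xrightarrow{\varepsilon} p\}$. For a set of states, the ε-closure is the union of the ε-closures of its members.
Definition (ε-free FSA) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be an FSA. Define $A'$, the equivalent ε-free FSA with $L(A') = L(A)$, as $A' = \langle Q, \Sigma, q_0, F', \delta' \rangle$ where:
$\delta'(q, a) = \varepsilon\text{-closure}\left(\bigcup_{q' \in \varepsilon\text{-closure}(q)} \delta(q', a)\right), \forall q \in Q, a \in \Sigma$
$F' = F \cup \{q_0\}$ if $\varepsilon\text{-closure}(q_0) \cap F \neq \emptyset$, else $F' = F$.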
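The ε-removal definition translates fairly directly into code. The following is a minimal sketch, assuming the transition function is stored as a dictionary from (state, symbol) pairs to sets of states and that ε is encoded as the empty string; the function names and the toy automaton are hypothetical.

```python
EPS = ""  # assumption: epsilon is encoded as the empty string


def eps_closure(delta, states):
    """States reachable from `states` via zero or more epsilon-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, EPS), ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure


def remove_epsilons(states, alphabet, start, finals, delta):
    """Return (delta', F') of the equivalent epsilon-free automaton."""
    new_delta = {}
    for q in states:
        for a in alphabet:
            step = set()
            for q1 in eps_closure(delta, {q}):
                step |= set(delta.get((q1, a), ()))
            target = eps_closure(delta, step)
            if target:  # keep delta' partial: omit empty targets
                new_delta[(q, a)] = target
    new_finals = set(finals)
    if eps_closure(delta, {start}) & set(finals):
        new_finals.add(start)
    return new_delta, new_finals


# Toy NFA accepting a*: 0 -eps-> 1, 1 -a-> 2, 2 -eps-> 0, with final state 1.
toy_delta = {(0, EPS): {1}, (1, "a"): {2}, (2, EPS): {0}}
free_delta, free_finals = remove_epsilons({0, 1, 2}, {"a"}, 0, {1}, toy_delta)
```

In the toy automaton the start state becomes final because its ε-closure contains the final state 1, exactly as the case distinction for $F'$ prescribes.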

80 Equivalence transformations on finite-state acceptors ε-removal Example

81 Equivalence transformations on finite-state acceptors Determinization
Definition (Subset construction) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be an FSA. Define $A'$, the equivalent DFA with $L(A') = L(A)$, as $A' = \langle 2^Q, \Sigma, \{q_0\}, F', \delta' \rangle$ with:
$F' = \{S \subseteq Q \mid S \cap F \neq \emptyset\}$
$\delta'(S, a) = \bigcup_{q \in S} \delta(q, a), \forall a \in \Sigma, S \subseteq Q$
Note A naive implementation of this construction is exponential in $|Q|$, since it builds all $2^{|Q|}$ subset states. In typical cases, most of the subset states in the DFA are not accessible or not coaccessible. A better algorithm based on a state queue avoids inaccessible states, but this doesn't change the worst-case complexity.
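The queue-based variant mentioned in the note can be sketched as follows. This is an illustrative sketch that assumes an ε-free NFA whose transition function is a dictionary from (state, symbol) pairs to sets of states; the function name and the toy automaton are hypothetical.

```python
from collections import deque


def determinize(alphabet, start, finals, delta):
    """Subset construction that only creates accessible subset states."""
    start_set = frozenset({start})
    dfa_delta = {}
    dfa_finals = set()
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        if subset & finals:
            dfa_finals.add(subset)
        for a in alphabet:
            target = frozenset(p for q in subset for p in delta.get((q, a), ()))
            if not target:
                continue  # leave delta' partial instead of adding a sink state
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, start_set, dfa_finals, dfa_delta


# Toy NFA over {a, b} accepting all strings that end in "ab".
nfa_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa_states, dfa_start, dfa_finals, dfa_delta = determinize(
    ["a", "b"], 0, {2}, nfa_delta
)
```

For this toy NFA only three of the eight possible subset states are ever created, which illustrates why the queue helps in practice even though the worst case stays exponential.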

82 Equivalence transformations on finite-state acceptors Determinization Example

83 Equivalence transformations on finite-state acceptors Minimization
Definition (Minimal DFA) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be a DFA. $A$ is minimal if for every DFA $A' = \langle Q', \Sigma, q_0', F', \delta' \rangle$: $L(A') = L(A) \rightarrow |Q| \leq |Q'|$.
Definition (Right language) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be a DFA. The right language of a state $q \in Q$, symbolically $\overrightarrow{L}(q)$, is defined as: $\overrightarrow{L}(q) = \{w \in \Sigma^* \mid \delta^*(q, w) \in F\}$

84 Equivalence transformations on finite-state acceptors Minimization
Definition (Equivalent states) Two states $p$ and $q$ are called equivalent if $\overrightarrow{L}(p) = \overrightarrow{L}(q)$.
This holistic definition based on right languages can be turned into a recursive definition of equivalence of states:
Definition (State equivalence $\equiv$) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be a DFA. Two states $p$ and $q$ are called equivalent, symbolically $p \equiv q$, if: $p \equiv q$ iff $(p \in F \leftrightarrow q \in F) \wedge \forall a \in \Sigma : \delta(p, a) \equiv \delta(q, a)$.

85 Equivalence transformations on finite-state acceptors Minimization Based on state equivalence, we come up with a definition of a minimal DFA:
Theorem (Minimal DFA I) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be a DFA. $A$ is minimal iff $\forall p, q \in Q : p \neq q \rightarrow \overrightarrow{L}(p) \neq \overrightarrow{L}(q)$.
By substituting the recursive definition of state equivalence into the last theorem, we arrive at:
Theorem (Minimal DFA II) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be a DFA. $A$ is minimal iff $\forall p, q \in Q : p \neq q \rightarrow \neg(p \in F \leftrightarrow q \in F) \vee \exists a \in \Sigma : \delta(p, a) \not\equiv \delta(q, a)$.

86 Equivalence transformations on finite-state acceptors Minimization: Myhill-Nerode theorem
Theorem (Myhill-Nerode) The following propositions are equivalent:
1 $L$ is recognized by a DFA $A_L = \langle Q, \Sigma, q_0, F, \delta \rangle$.
2 $L$ is the union of some equivalence classes of a right-invariant equivalence relation $R$ with finite index.
3 $R_L$ ($x \mathrel{R_L} y$ iff $\forall z \in \Sigma^* : xz \in L \leftrightarrow yz \in L$) is of finite index.
Notes The Myhill-Nerode theorem links states with subsets of $\Sigma^*$. It is central to the theorem that the number of states in a DFA is finite. The Myhill-Nerode theorem assumes complete DFAs, that is, the corresponding transition function $\delta$ is total.

87 Equivalence transformations on finite-state acceptors Minimization: Myhill-Nerode theorem
Definition (Equivalence relation, equivalence class) Let $S$ be a set and $E \subseteq S \times S$ a binary relation. $E$ is called an equivalence relation if $E$ is reflexive, symmetric and transitive. If $E$ is an equivalence relation, we call $[x]_E = \{y \mid x \mathrel{E} y\}$ the equivalence class of $x$ wrt $E$.
Properties of equivalence relations
1 $x \in [x]_E, \forall x \in S$
2 $[x]_E = [y]_E \vee [x]_E \cap [y]_E = \emptyset, \forall x, y \in S$
3 $\bigcup_{x \in S} [x]_E = S$
Definition (Index of an equivalence relation) Let $E$ be an equivalence relation. The index $I_E$ of $E$ is the number of $E$'s equivalence classes. $E$ is of finite index if $I_E$ is finite.

88 Equivalence transformations on finite-state acceptors Minimization: Myhill-Nerode theorem
Definition (Right-invariant equivalence relation) Let $R$ be an equivalence relation over $\Sigma^*$. $R$ is called right-invariant (with respect to concatenation) if $\forall x, y, z \in \Sigma^* : x \mathrel{R} y \rightarrow xz \mathrel{R} yz$.
Definition (Left language) Let $A = \langle Q, \Sigma, q_0, F, \delta \rangle$ be a DFA. Define the left language $\overleftarrow{L}(p)$ of a state $p \in Q$ as $\overleftarrow{L}(p) = \{w \in \Sigma^* \mid \delta^*(q_0, w) = p\}$.

89 Equivalence transformations on finite-state acceptors Minimization: Myhill-Nerode theorem
Myhill-Nerode theorem. We prove the theorem by chaining 1 → 2, 2 → 3 and 3 → 1.
(1 → 2) Let $A$ be a DFA recognizing $L$. Define $R_A$ as $x \mathrel{R_A} y$ iff $\delta^*(q_0, x) = \delta^*(q_0, y)$, $\forall x, y \in \Sigma^*$.
Subproof: $R_A$ is a right-invariant equivalence relation of finite index (the index of $R_A$ is at most $|Q|$). The equivalence classes of $R_A$ are the left languages $\overleftarrow{L}(p)$, $p \in Q$. $\bigcup_{q \in F} \overleftarrow{L}(q) = L$.
Example $\overleftarrow{L}(1) = \{\varepsilon, a(ba)^*b\}$, $\overleftarrow{L}(2) = \{a(ba)^*\}$, $\overleftarrow{L}(3) = \{a(ba)^*c(d)^*\}$, $L = \overleftarrow{L}(2) \cup \overleftarrow{L}(3)$

90 Equivalence transformations on finite-state acceptors Minimization: Myhill-Nerode theorem
Myhill-Nerode theorem (continued) (2 → 3) $R = R_A$ is a refinement of $R_L$, that is, every equivalence class of $R_A$ is contained in some equivalence class of $R_L$.
1 Assume that $x \mathrel{R_A} y$.
2 Since $R_A$ is right-invariant, $xz \mathrel{R_A} yz$ for all $z \in \Sigma^*$.
3 Since $L$ is a union of equivalence classes of $R_A$, $xz \in L$ if and only if $yz \in L$.
4 Thus $x \mathrel{R_L} y$, and the equivalence class of $x$ wrt $R_A$ is contained in the equivalence class of $x$ wrt $R_L$.
5 Since the index of $R_A$ is finite (at most equal to $|Q|$) and the index of $R_L$ is less than or equal to the index of $R_A$, we conclude that $R_L$ is of finite index.

91 Equivalence transformations on finite-state acceptors Minimization: Myhill-Nerode theorem Example ($R_A$ is a refinement of $R_L$)
State   Corresponding equiv. class of $R_A$
3       e
9       frie
15      dog
17      doll
19      dollar
22      coll
24      collar

State   Corresponding equiv. class of $R_L$
7       dog, collar, dollar, end, friend, frog
8       e, frie
12      doll
16      coll

92 Equivalence transformations on finite-state acceptors Minimization: Myhill-Nerode theorem
Myhill-Nerode theorem (continued) (3 → 1) Given $R_L$, construct a new FSA $A' = \langle Q', \Sigma, q_0', F', \delta' \rangle$ as follows:
1 $Q' = \{[x] \mid [x]$ is an equivalence class of $\Sigma^*$ under $R_L\}$
2 $q_0' = [\varepsilon]$
3 $\delta' : Q' \times \Sigma \rightarrow Q'$ with $\delta'([x], a) = [xa]$, $\forall [x] \in Q', a \in \Sigma$
4 $F' = \{[x] \mid x \in L\}$
Example
$\delta'([e], n) = [en]$, that is, $\delta'(\{frie, e\}, n) = \{frien, en\}$
$\delta'([en], d) = [end]$, that is, $\delta'(\{frien, en\}, d) = \{friend, end\}$ (a)
(a) Inspired by CAKE: "Friend is a four-letter word"

93 Equivalence transformations on finite-state acceptors Minimization: algorithms Approaches
1 Union-Find-based: Find all states $p$ and $q$ with $\overrightarrow{L}(p) = \overrightarrow{L}(q)$ and merge them.
2 Partition-based: Starting with sets of non-equivalent states, partition these sets further until each set contains only equivalent states.
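The partition-based approach can be sketched with a simple Moore-style refinement loop (not the faster Hopcroft algorithm). The sketch assumes a complete DFA in which every state is accessible, with the transition function stored as a dictionary from (state, symbol) pairs to states; the function name and the toy automaton are hypothetical.

```python
def refine_partition(states, alphabet, finals, delta):
    """Map each state to the id of its equivalence class."""
    symbols = sorted(alphabet)
    # Initial partition: final states versus non-final states.
    block = {q: int(q in finals) for q in states}
    while True:
        # A state's signature is its own block plus the blocks of its successors.
        signature = {
            q: (block[q],) + tuple(block[delta[(q, a)]] for a in symbols)
            for q in states
        }
        ids = {}
        new_block = {q: ids.setdefault(signature[q], len(ids)) for q in states}
        # Each round refines the previous partition, so an unchanged block
        # count means the partition is stable.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block


# Toy complete DFA for "ends in ab" with a redundant state 3 that behaves like 0.
toy_delta = {
    (0, "a"): 1, (0, "b"): 3,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
classes = refine_partition([0, 1, 2, 3], {"a", "b"}, {2}, toy_delta)
```

States that end up in the same class have the same right language and can be merged; in the toy automaton states 0 and 3 collapse, leaving a three-state minimal DFA.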

94 Outline 6 Equivalence transformations on finite-state transducers

95 Equivalence transformations on finite-state transducers

96 Outline 7 Decidability properties of unweighted finite-state acceptors and transducers

97 Decidability properties of unweighted finite-state acceptors and transducers

98 Decidability properties of unweighted finite-state acceptors and transducers
Given two finite-state acceptors $A$ and $A'$, the following properties are decidable:
$L(A) = \emptyset$
$L(A) = \Sigma^*$
$L(A) = L(A')$
$L(A) \subseteq L(A')$
Given two finite-state transducers $T$ and $T'$, the following properties are decidable:
$T$ is functional
Given two finite-state transducers $T$ and $T'$, the following properties are undecidable:
$L(T) = L(T')$
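Two of the decidable acceptor properties can be sketched as simple graph searches. The sketch assumes deterministic acceptors given as (start, finals, delta) triples with a possibly partial transition dictionary, uses `None` as an implicit dead state (so `None` must not be a real state name), and all names and toy automata are hypothetical.

```python
from collections import deque


def is_empty(start, finals, delta, alphabet):
    """L(A) is empty iff no final state is reachable from the start state."""
    seen, queue = {start}, deque([start])
    while queue:
        q = queue.popleft()
        if q in finals:
            return False
        for a in alphabet:
            p = delta.get((q, a))
            if p is not None and p not in seen:
                seen.add(p)
                queue.append(p)
    return True


def equivalent(dfa1, dfa2, alphabet):
    """Search the product automaton for a pair that disagrees on finality."""
    (s1, f1, d1), (s2, f2, d2) = dfa1, dfa2
    seen, queue = {(s1, s2)}, deque([(s1, s2)])
    while queue:
        p, q = queue.popleft()
        if (p in f1) != (q in f2):
            return False  # some string is accepted by exactly one automaton
        for a in alphabet:
            nxt = (d1.get((p, a)), d2.get((q, a)))  # None acts as a dead state
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


ENDS_AB = (0, {2}, {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1,
                    (1, "b"): 2, (2, "a"): 1, (2, "b"): 0})
ENDS_AB_REDUNDANT = (0, {2}, {(0, "a"): 1, (0, "b"): 3, (1, "a"): 1, (1, "b"): 2,
                              (2, "a"): 1, (2, "b"): 3, (3, "a"): 1, (3, "b"): 0})
ENDS_B = (0, {1}, {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1})
```

Inclusion $L(A) \subseteq L(A')$ can be checked with the same product search by rejecting only pairs where the first state is final and the second is not.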

99 Version history : version 0.1 (initial version) : version 0.2 (some error corrections, added definition of composition of FSTs with ε-transitions) : version 0.3 (added example for ε-composition, enhanced example for cross product, added subtitles) : version 0.4 (completed minimization section)


More information

Finite State Automata

Finite State Automata Trento 2005 p. 1/4 Finite State Automata Automata: Theory and Practice Paritosh K. Pandya (TIFR, Mumbai, India) Unversity of Trento 10-24 May 2005 Trento 2005 p. 2/4 Finite Word Langauges Alphabet Σ is

More information

This lecture covers Chapter 7 of HMU: Properties of CFLs

This lecture covers Chapter 7 of HMU: Properties of CFLs This lecture covers Chapter 7 of HMU: Properties of CFLs Chomsky Normal Form Pumping Lemma for CFs Closure Properties of CFLs Decision Properties of CFLs Additional Reading: Chapter 7 of HMU. Chomsky Normal

More information

Regular expressions and automata

Regular expressions and automata Regular expressions and automata Introduction Finite State Automaton (FSA) Finite State Transducers (FST) Regular expressions(i) (Res) Standard notation for characterizing text sequences Specifying text

More information

Computer Sciences Department

Computer Sciences Department 1 Reference Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER 3 objectives Finite automaton Infinite automaton Formal definition State diagram Regular and Non-regular

More information

Introduction to Finite Automaton

Introduction to Finite Automaton Lecture 1 Introduction to Finite Automaton IIP-TL@NTU Lim Zhi Hao 2015 Lecture 1 Introduction to Finite Automata (FA) Intuition of FA Informally, it is a collection of a finite set of states and state

More information

Computational Models - Lecture 4 1

Computational Models - Lecture 4 1 Computational Models - Lecture 4 1 Handout Mode Iftach Haitner. Tel Aviv University. November 21, 2016 1 Based on frames by Benny Chor, Tel Aviv University, modifying frames by Maurice Herlihy, Brown University.

More information

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen Pushdown automata Twan van Laarhoven Institute for Computing and Information Sciences Intelligent Systems Version: fall 2014 T. van Laarhoven Version: fall 2014 Formal Languages, Grammars and Automata

More information

arxiv: v2 [cs.fl] 29 Nov 2013

arxiv: v2 [cs.fl] 29 Nov 2013 A Survey of Multi-Tape Automata Carlo A. Furia May 2012 arxiv:1205.0178v2 [cs.fl] 29 Nov 2013 Abstract This paper summarizes the fundamental expressiveness, closure, and decidability properties of various

More information

Equivalence of Regular Expressions and FSMs

Equivalence of Regular Expressions and FSMs Equivalence of Regular Expressions and FSMs Greg Plaxton Theory in Programming Practice, Spring 2005 Department of Computer Science University of Texas at Austin Regular Language Recall that a language

More information

Formal Languages. We ll use the English language as a running example.

Formal Languages. We ll use the English language as a running example. Formal Languages We ll use the English language as a running example. Definitions. A string is a finite set of symbols, where each symbol belongs to an alphabet denoted by Σ. Examples. The set of all strings

More information

UNIT-VIII COMPUTABILITY THEORY

UNIT-VIII COMPUTABILITY THEORY CONTEXT SENSITIVE LANGUAGE UNIT-VIII COMPUTABILITY THEORY A Context Sensitive Grammar is a 4-tuple, G = (N, Σ P, S) where: N Set of non terminal symbols Σ Set of terminal symbols S Start symbol of the

More information

Notes on State Minimization

Notes on State Minimization U.C. Berkeley CS172: Automata, Computability and Complexity Handout 1 Professor Luca Trevisan 2/3/2015 Notes on State Minimization These notes present a technique to prove a lower bound on the number of

More information

Finite Universes. L is a fixed-length language if it has length n for some

Finite Universes. L is a fixed-length language if it has length n for some Finite Universes Finite Universes When the universe is finite (e.g., the interval 0, 2 1 ), all objects can be encoded by words of the same length. A language L has length n 0 if L =, or every word of

More information

Functions on languages:

Functions on languages: MA/CSSE 474 Final Exam Notation and Formulas page Name (turn this in with your exam) Unless specified otherwise, r,s,t,u,v,w,x,y,z are strings over alphabet Σ; while a, b, c, d are individual alphabet

More information

UNIT-III REGULAR LANGUAGES

UNIT-III REGULAR LANGUAGES Syllabus R9 Regulation REGULAR EXPRESSIONS UNIT-III REGULAR LANGUAGES Regular expressions are useful for representing certain sets of strings in an algebraic fashion. In arithmetic we can use the operations

More information

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata : Organization of Programming Languages Theory of Regular Expressions Finite Automata Previous Course Review {s s defined} means the set of string s such that s is chosen or defined as given s A means

More information

Theory of computation: initial remarks (Chapter 11)

Theory of computation: initial remarks (Chapter 11) Theory of computation: initial remarks (Chapter 11) For many purposes, computation is elegantly modeled with simple mathematical objects: Turing machines, finite automata, pushdown automata, and such.

More information

THEORY OF COMPUTATION

THEORY OF COMPUTATION THEORY OF COMPUTATION There are four sorts of men: He who knows not and knows not he knows not: he is a fool - shun him; He who knows not and knows he knows not: he is simple teach him; He who knows and

More information

CPS 220 Theory of Computation

CPS 220 Theory of Computation CPS 22 Theory of Computation Review - Regular Languages RL - a simple class of languages that can be represented in two ways: 1 Machine description: Finite Automata are machines with a finite number of

More information

Non-deterministic Finite Automata (NFAs)

Non-deterministic Finite Automata (NFAs) Algorithms & Models of Computation CS/ECE 374, Fall 27 Non-deterministic Finite Automata (NFAs) Part I NFA Introduction Lecture 4 Thursday, September 7, 27 Sariel Har-Peled (UIUC) CS374 Fall 27 / 39 Sariel

More information

Prime Languages, Orna Kupferman, Jonathan Mosheiff. School of Engineering and Computer Science The Hebrew University, Jerusalem, Israel

Prime Languages, Orna Kupferman, Jonathan Mosheiff. School of Engineering and Computer Science The Hebrew University, Jerusalem, Israel Prime Languages, Orna Kupferman, Jonathan Mosheiff School of Engineering and Computer Science The Hebrew University, Jerusalem, Israel Abstract We say that a deterministic finite automaton (DFA) A is composite

More information

Computational Theory

Computational Theory Computational Theory Finite Automata and Regular Languages Curtis Larsen Dixie State University Computing and Design Fall 2018 Adapted from notes by Russ Ross Adapted from notes by Harry Lewis Curtis Larsen

More information

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata. CMSC 330: Organization of Programming Languages Last Lecture Languages Sets of strings Operations on languages Finite Automata Regular expressions Constants Operators Precedence CMSC 330 2 Clarifications

More information

The View Over The Horizon

The View Over The Horizon The View Over The Horizon enumerable decidable context free regular Context-Free Grammars An example of a context free grammar, G 1 : A 0A1 A B B # Terminology: Each line is a substitution rule or production.

More information

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata CISC 4090: Theory of Computation Chapter Regular Languages Xiaolan Zhang, adapted from slides by Prof. Werschulz Section.: Finite Automata Fordham University Department of Computer and Information Sciences

More information

Regular Expressions and Finite-State Automata. L545 Spring 2008

Regular Expressions and Finite-State Automata. L545 Spring 2008 Regular Expressions and Finite-State Automata L545 Spring 2008 Overview Finite-state technology is: Fast and efficient Useful for a variety of language tasks Three main topics we ll discuss: Regular Expressions

More information

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs CSE : Foundations of Computing Lecture : Finite State Machine Minimization & NFAs State Minimization Many different FSMs (DFAs) for the same problem Take a given FSM and try to reduce its state set by

More information

NOTES ON AUTOMATA. Date: April 29,

NOTES ON AUTOMATA. Date: April 29, NOTES ON AUTOMATA 1. Monoids acting on sets We say that a monoid S with identity element ɛ acts on a set Q if q(st) = (qs)t and qɛ = q. As with groups, if we set s = t whenever qs = qt for all q Q, then

More information

Java II Finite Automata I

Java II Finite Automata I Java II Finite Automata I Bernd Kiefer Bernd.Kiefer@dfki.de Deutsches Forschungszentrum für künstliche Intelligenz November, 23 Processing Regular Expressions We already learned about Java s regular expression

More information