CS 54, Lecture 2: Finite Automata, Closure Properties Nondeterminism,
Why so Many Models?
Streaming Algorithms 0 42
Deterministic Finite Automata
Anatomy of Deterministic Finite Automata transition: for every state and alphabet symbol states accept states (F) q 0 0, q 0 q 2 0 0 start state (q 0 ) q 3 states
0 0, 0 The DFA reads its string from left to right 0 0 The automaton accepts the string if this process ends in a double-circle (accept state). Otherwise, the automaton rejects the string.
What and Why DFA Simple: Constant-size memory & read-once input Both as important? Foundational or archaic object? So close to a Turing Machine and yet so far: nondeterminism (verified guessing) is weak, optimizing (and some settings of learning) is easy Allows us to talk about useful vocabulary, nondeterminism, (learning), limitations, streaming, communication complexity
IBM JOURNAL APRIL 959 Turing Award winning paper
Some Vocabulary An alphabet Σ is a finite set (e.g., Σ = {0,}) A string over Σ is a finite length sequence of elements of Σ Σ* = the set of all strings over Σ For string x, x is the length of x The unique string of length 0 is denoted by ε and is called the empty string
Languages A language over Σ is a set of strings over Σ In other words: a language is a subset of Σ* Languages = Boolean functions over strings Every language L over Σ corresponds to a unique function f : Σ* {0,}: Define f such that f(x) = if x L = 0 otherwise
Definition. A DFA is a 5-tuple M = (Q, Σ,, q 0, F) Q is the set of states (finite) Σ is the alphabet (finite) : Q Σ Q is the transition function q 0 Q is the start state F Q is the set of accept/final states
Definition. A DFA is a 5-tuple M = (Q, Σ,, q 0, F) Let w,..., w n Σ and w = w... w n Σ* M accepts w if there are r 0, r,..., r n Q, s.t. r 0 = q 0 (r i, w i+ ) = r i+ for all i = 0,..., n-, and r n F
M = (Q, Σ,, q 0, F) where Q = {q 0, q, q 2, q 3 } Σ = {0,} : Q Σ Q transition function* q 0 Q is start state F = {q, q 2 } Q accept states * 0 q 0 q 0 q q q 2 q 2 q 2 q 3 q 2 q 3 q 0 q 2
A DFA is a 5-tuple M = (Q, Σ,, q 0, F) L(M) = set of all strings that M accepts = the language recognized by M = the function computed by M
0, M 00 00 0 0 0, L(M) = { w w begins with }
q 0 0, L(M) ={0,}*
0 0 q 0 q L(M) = { w w has an even number of s}
0 0 Theorem: p q L(M) = {w w has odd number of s } Proof: By induction on n, the length of a string. Base Case n=0: ε L and ε L(M) Induction Step Suppose for all w Σ*, w = n, M accepts w w has odd number of s A string of length n+ has the form w0 or w, w =n, continue with a case analysis
Build a DFA that accepts exactly the strings containing 00 0 0, 0 0 q q 0 q 00 q 00
Definition: A language L is regular if L is recognized by a DFA; that is, there is a DFA M where L = L(M). L = { w w contains 00} is regular L = { w w has an even number of s} is regular
Closure Properties for Regular Languages Union: A B = { w w A or w B } Intersection: A B = { w w A and w B } Complement: A = { w Σ* w A } Reverse: A R = { w w k w k w A, w i Σ} Concatenation: A B = { vw v A and w B } Star: A* = { s s k k 0 and each s i A } Theorem: if A and B are regular then so are: A B, A B, A, A R, A B, and A*
Union Theorem for Regular Languages Given two languages L and L 2 recall that the union of L and L 2 is L L 2 = { w w L or w L 2 } Theorem: The union of two regular languages is also a regular language
Proof: Let Theorem: The union of two regular languages is also a regular language M = (Q, Σ,, q 0, F ) be a finite automaton for L 2 and M 2 = (Q 2, Σ, 2, q 0, F 2 ) be a finite automaton for L 2 We want to construct a finite automaton M = (Q, Σ,, q 0, F) that recognizes L = L L 2
Proof Idea: Run both M and M 2 in parallel! M = (Q, Σ,, q 0, F ) recognizes L Q = pairs of states, one from M and one from M 2 = { (q, q 2 ) q Q and q 2 Q 2 } = Q Q 2 q 0 = (q 0, q 2 0 ) F = { (q, q 2 ) q F OR q 2 F 2 } ( (q,q 2 ), ) = ( (q, ), 2 (q 2, )) 2 M 2 = (Q 2, Σ, 2, q 0, F 2 ) recognizes L 2 and
Theorem: The union of two regular languages is also a regular language 0 0 q 0 q 0 p 0 p 0
q 0,p 0 q,p 0 0 0 0 0 q 0,p q,p What about the INTERSECTION of two languages?
Proof Idea: Run both M and M 2 in parallel! M = (Q, Σ,, q 0, F ) recognizes L Q = pairs of states, one from M and one from M 2 = { (q, q 2 ) q Q and q 2 Q 2 } = Q Q 2 q 0 = (q 0, q 2 0 ) F = { (q, q 2 ) q F AND q 2 F 2 } ( (q,q 2 ), ) = ( (q, ), 2 (q 2, )) 2 M 2 = (Q 2, Σ, 2, q 0, F 2 ) recognizes L 2 and
Union Theorem for Regular Languages: The union of two regular languages is also a regular language Regular Languages Are Closed Under Union Intersection Theorem for Regular Languages: The intersection of two regular languages is also a regular language Regular Languages Are Closed Under Intersextion
Complement Theorem for Regular Languages The complement of a regular language is also a regular language: If L is regular than so is L= { w Σ* w L } Proof?
Closure Properties for Regular Languages Union: A B = { w w A or w B } Intersection: A B = { w w A and w B } Complement: A = { w Σ* w A } Reverse: A R = { w w k w k w A, w i Σ} Concatenation: A B = { vw v A and w B } Star: A* = { s s k k 0 and each s i A } Theorem: if A and B are regular then so are: A B, A B, A, A R, A B, and A*
The Reverse of a Language Reverse of L: L R = { w w k w k w L, w i Σ} Theorem: If L is regular, then is L R also regular If L is recognized by the usual kind of DFA, Then L R is recognized by a DFA that reads its strings from right to left How can every Right-to-Left DFA be replaced by a normal Left-to-Right DFA?
Reversing DFAs Assume L is a regular language. Let M be a DFA that recognizes L We want to build a machine M R that accepts L R If M accepts w, then w describes a directed path in M from the start state to an accept state First Attempt: Try to define M R as M with the arrows reversed, turn start state into a final state, turn final states into starts
Problem: M R is not always a DFA! It could have many start states Some states may have more than one outgoing edge, or none at all!
Non-deterministic Finite Automata (NFA) What happens with 00? We will say this new machine accepts a string if there is some path that reaches some accept state from some start state
Another Example of an NFA At each state, we can have any number of out arrows for some letter Σ, including ε Set of strings accepted by this NFA = {w w contains a 0}
Multiple Start States We allow multiple start states for NFAs, and Sipser allows only one Can easily convert NFA with many start states into one with a single start state: ε ε ε
Yet Another Example of an NFA L(M) = {,00}
A non-deterministic finite automaton (NFA) is a 5-tuple N = (Q, Σ,, Q 0, F) where Q is the set of states Σ is the alphabet : Q Σ ε! 2 Q is the transition function Q 0 Q is the set of start states F Q is the set of accept states 2 Q is the set of all possible subsets of Q Σ ε = Σ {ε}
Def. Let w Σ*. Let N be an NFA. N accepts w if there s a sequence of states r 0, r,..., r k Q and w can be written as w... w k with w i Σ {ε} s.t.. r 0 Q 0 2. r i+ (r i, w i+ ) for all i = 0,..., k-, and 3. r n F L(N) = the language recognized by N = set of all strings machine N accepts Language L is recognized by an NFA N if L = L(N).
N = (Q, Σ,, Q 0, F) Q = {q, q 2, q 3, q 4 } Σ = {0,} Q 0 = {q, q 2 } F = {q 4 } Q 00 L(N)? 0 L(N)? (q 2,) = {q 4 } (q 3,) = (q,0) = {q 3 }
NFAs are generally simpler than DFAs A DFA recognizing the language {} An NFA recognizing the language {}
Deterministic Computation Non-Deterministic Computation reject accept or reject accept Are these equally powerful???
Parting thoughts: Regular Languages can be manipulated When life hands you ambiguity define Nondeterministic Finite Automata Is verifying easier than computing (I)? Questions?