Deterministic Finite Automata

Similar documents
Deterministic Finite Automata

Models of Computation I: Finite State Automata

Let us first give some intuitive idea about a state of a system and state transitions before describing finite automata.

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

Nondeterministic Finite Automata

Theory of Computation

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

What we have done so far

Finite Automata and Languages

Finite Automata. Warren McCulloch ( ) and Walter Pitts ( )

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

Finite Automata. Finite Automata

Decision, Computation and Language

CS 154, Lecture 3: DFA NFA, Regular Expressions

Finite Automata. Mahesh Viswanathan

Nondeterminism. September 7, Nondeterminism

Finite State Machines 2

Announcements. Problem Set Four due Thursday at 7:00PM (right before the midterm).

Sri vidya college of engineering and technology

The Pumping Lemma. for all n 0, u 1 v n u 2 L (i.e. u 1 u 2 L, u 1 vu 2 L [but we knew that anyway], u 1 vvu 2 L, u 1 vvvu 2 L, etc.

CISC4090: Theory of Computation

Undecidability COMS Ashley Montanaro 4 April Department of Computer Science, University of Bristol Bristol, UK

Turing Machines Part III

Nondeterministic Finite Automata

CS 154 Introduction to Automata and Complexity Theory

Languages, regular languages, finite automata

Finite Automata (contd)

Proving languages to be nonregular

Finite Automata Part One

Turing Machines. COMP2600 Formal Methods for Software Engineering. Katya Lebedeva. Australian National University Semester 2, 2014

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

More on Regular Languages and Non-Regular Languages

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Before we show how languages can be proven not regular, first, how would we show a language is regular?

Automata and Formal Languages - CM0081 Non-Deterministic Finite Automata

Turing Machines Part II

UNIT-III REGULAR LANGUAGES

CSC236 Week 10. Larry Zhang

Theory of computation: initial remarks (Chapter 11)

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Lecture 2: Connecting the Three Models

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Finite Automata. BİL405 - Automata Theory and Formal Languages 1

Undecidable Problems and Reducibility

Finite Automata Part One

Lecture Notes: The Halting Problem; Reductions

Computational Models - Lecture 4

CSCI 2200 Foundations of Computer Science Spring 2018 Quiz 3 (May 2, 2018) SOLUTIONS

3515ICT: Theory of Computation. Regular languages

(Refer Slide Time: 0:21)

Theory of Computation


Computational Models #1

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Computational Models: Class 3

Theory of Computation

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Regular Languages and Finite Automata

Theory of Computation Lecture 1. Dr. Nahla Belal

Theory of computation: initial remarks (Chapter 11)

Accept or reject. Stack

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Theory of Computation

CPSC 421: Tutorial #1

Deterministic Finite Automata (DFAs)

5 Context-Free Languages

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

September 7, Formal Definition of a Nondeterministic Finite Automaton

CS375: Logic and Theory of Computing

CSE 105 THEORY OF COMPUTATION

CSC173 Workshop: 13 Sept. Notes

Deterministic Finite Automata (DFAs)

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Part I: Definitions and Properties

We define the multi-step transition function T : S Σ S as follows. 1. For any s S, T (s,λ) = s. 2. For any s S, x Σ and a Σ,

Recognizing Safety and Liveness by Alpern and Schneider

Theory Bridge Exam Example Questions

CSE 105 THEORY OF COMPUTATION

Computational Models Lecture 2 1

Finite Automata Part Two

Introduction to Turing Machines. Reading: Chapters 8 & 9

Computational Models - Lecture 1 1

Context-Free Grammars and Languages. Reading: Chapter 5

arxiv: v1 [cs.fl] 19 Mar 2015

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

CSE 105 THEORY OF COMPUTATION

Foundations of Informatics: a Bridging Course

Alan Bundy. Automated Reasoning LTL Model Checking

Finite Automata Part One

Lecture Notes on Inductive Definitions

CMPSCI 250: Introduction to Computation. Lecture #23: Finishing Kleene s Theorem David Mix Barrington 25 April 2013

UNIT-VIII COMPUTABILITY THEORY

Computational Models - Lecture 3

Finite-State Machines (Automata) lecture 12

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Automata: a short introduction

Transcription:

Deterministic Finite Automata COMP2600 Formal Methods for Software Engineering Ranald Clouston Australian National University Semester 2, 2013 COMP 2600 Deterministic Finite Automata 1

Pop quiz What is x y( x y ) where x = 4195835 ; y = 3145727? COMP 2600 Deterministic Finite Automata 2

Pop quiz - answer If you answered 256, congratulations! You are a Pentium microprocessor released by Intel in 1993. This floating point error evaded Intel s testing, only to be discovered by computational number theorist Thomas R. Nicely while generating large prime numbers. A full recall of the chip ended up costing Intel US$475 million. But how can Intel ever have confidence that their hardware is error-free and so won t cost them money and reputation? Sounds like a job for formal methods! COMP 2600 Deterministic Finite Automata 3

Formal modelling Intel are now world leaders in the formal verification of hardware. A full description of the formal methods approaches they use is outside the bounds of this course. But here s a specific problem - how do you reason mathematically about a physical object like a computer chip? Photo by Matt Britt & Matt Gibbs COMP 2600 Deterministic Finite Automata 4

Abstract Machines The solution is to model the physical object via a mathematically defined abstract machine. This approach is also useful for software, and is particularly well suited to certain applications such as parsing. Abstract machines come in various flavours, and will be the focus of the rest of this course. The technical highlight will be Turing machines, which are powerful enough to perform any computation that can be imagined! However less powerful abstract machines are often easier to reason about and hence more useful for formal methods applications. COMP 2600 Deterministic Finite Automata 5

Finite State Automata The first flavour of abstract machine we will look at will be the least powerful - Finite State Automata (FSA). They are still powerful enough to encode the actions of microprocessors, and so are a fundamental formal methods concept applied by Intel and others. In fact they were invented, not by computer scientists interested in hardware, but by two neuropsychologists, Warren S. McCullough and Walter Pitts, who back in 1943 were trying to model neurons in the human brain! Such unlikely connections are a common theme in the history of science. COMP 2600 Deterministic Finite Automata 6

Input and Output Finite state automata in this course will Accept as input a string of symbols. Produce as output success or failure only. Therefore one way to think about abstract machines is to ask which strings are accepted and which rejected by them. Conversely, given a set of strings (called a language) we might try to design an abstract machine that recognises all and only those strings. The study of abstract machines and the study of (formal) languages are therefore closely linked. COMP 2600 Deterministic Finite Automata 7

Language Examples Formal Language Theory gets applied to natural and artificial languages and to programming languages, but in this course we will look at simpler examples over small alphabets, e.g.: 1. finite sets. {a,aa,ab,aaa,aab,aba,abb} 2. Palindromes consisting of bits (0,1): {0,1,00,11,010,101,000,111,0110,...} In each case the language is the set of strings. The first is a finite language and the second is an infinite language. COMP 2600 Deterministic Finite Automata 8

Language Terminology The alphabet (or vocabulary) of a formal language is a set of tokens (or letters). It is usually denoted Σ. A string (or word) over Σ is a sequence of tokens, or the null-string ε. For example, if Σ = {a,b,c}, then ababc is a string over Σ. A language with alphabet Σ is some set of strings over Σ. The strings of a language are called the sentences of the language. Notation: Σ is the set of all strings over Σ. Therefore, every language with alphabet Σ is some subset of Σ. COMP 2600 Deterministic Finite Automata 9

Finite State Automata The simplest useful abstraction of a computing machine consists of: A fixed, finite set of states A transition relation over the states Simplified Example: a traffic light machine has 3 states: G R G names state in which light is green. Y names state in which light is yellow. Y R names state in which light is red. Some information is missing where does the machine start and stop, and what triggers transitions between states? COMP 2600 Deterministic Finite Automata 10

Example: Vending Machine Imagine a vending machine which (1) accepts $1 and $2 coins; (2) refunds all money if more than $4 is added; (3) is ready to deliver if exactly $4 has been added. 1, 2 2 2 $2 $4 1 1 1 1 $0 $1 2 $3 2 Note that the transitions are labelled with important information, and the start state and end state(s) are specially labelled. COMP 2600 Deterministic Finite Automata 11

The Vending Machine ctd You start at the state $0, indicated by the at the left the alphabet Σ = {1,2} at the $4 state (circled) you have credit for a purchase what strings of tokens (starting at the $0 ) leave you in the circled state? the empty string ε 22 1222 1222221111 the accepted strings are the language defined by the automaton. COMP 2600 Deterministic Finite Automata 12

Terminology We will study deterministic finite automata (DFA) first. The alphabet of a DFA is a finite set of input tokens that an automaton acts on. a DFA has a finite set of states. One of the states is the initial state where the automaton starts. At least one of the states is a final state. A transition function (next state function): State Token State COMP 2600 Deterministic Finite Automata 13

Formal Definition of DFA A Deterministic Finite State Automaton (DFA) is completely characterised by five items: (Σ,S,s 0,F,N) an input alphabet Σ, the set of tokens a finite set of states S an initial state s 0 S (we start here) a set of final states F S (we hope to finish in one of these) a transition function N : S Σ S (We will see later that N being a function is the reason the automaton is deterministic.) COMP 2600 Deterministic Finite Automata 14

Example 1 Alphabet - {0,1} S0 1 S 0 1 1 2 0 S 1 0 States - {S 0,S 1,S 2 } Initial state - S 0 Final states - {S 2 } Transition function (as a table) - 0 1 S 0 S 1 S 0 S 1 S 1 S 2 S 2 S 1 S 0 (Note that the actual state names are irrelevant.) COMP 2600 Deterministic Finite Automata 15

Example 1, ctd If N is the transition function, then we write N(S 0,0) when we want to refer to the state that the above automaton goes to when it starts in S 0 and absorbs a 0. From the definition, N(S 0,0) = S 1. Similarly, N(S 1,1) = S 2 COMP 2600 Deterministic Finite Automata 16

Eventual State Function Revisit example 1: S0 1 S 0 1 1 2 0 S 1 0 Input 0101 takes the DFA from S 0 to S 2, Input 1011 takes the DFA from S 1 to S 0, etc. A complete list of such possibilities is a function from a given state and a string to an eventual state. This is the automaton s Eventual State Function. COMP 2600 Deterministic Finite Automata 17

Eventual State Function Definition Suppose A is a DFA with states S, alphabet Σ, and transition function N. The eventual state function for A is N : S Σ S N (s,w) is the state A reaches, starting in state s and reading string w. N can be defined inductively: N (s,ε) = s N (s,xα) = N (N(s,x),α) (N1) (N2) It is the eventual state function acting on the start state that we are really interested in. COMP 2600 Deterministic Finite Automata 18

An Important (but Unsurprising) Theorem about N The Append Theorem For all states s S and for all strings α,β Σ N (s,αβ) = N (N (s,α),β) Proof by induction on the length of α. Base case: α = ε LHS = N (s,εβ) = N (s,β) RHS = N (N (s,ε),β) = N (s,β) = LHS (by (N1)) COMP 2600 Deterministic Finite Automata 19

Proof ctd: Step case: Suppose N (s,αβ) = N (N (s,α),β) (IH) LHS = N (s,(xα)β) = N (s,x(αβ)) = N (N(s,x),αβ) (by (N2)) = N (N (N(s,x),α),β) (by IH) RHS = N (N (s,xα),β) = N (N (N(s,x),α),β) (by (N2)) Corollary when β is a single token N (s,αy) = N(N (s,α),y) COMP 2600 Deterministic Finite Automata 20

Example S0 1 S 0 1 1 2 0 S 1 0 N (S 1,1011) = N (N(S 1,1),011) = N (S 2,011) = N (S 1,11) = N (S 2,1) = N (S 0,ε) = S 0 COMP 2600 Deterministic Finite Automata 21

Language of an Automaton We say a DFA accepts a string if, starting from the start state, it terminates in one of the final states. More precisely, let A = (Σ,S,s 0,F,N) be a DFA and w be a string in Σ. We say w is accepted by A if N (s 0,w) F The language accepted by A is the set of all strings accepted by A: L(A) = {w Σ N (s 0,w) F} That is, w L(A) iff N (s 0,w) F. COMP 2600 Deterministic Finite Automata 22

Example 1 again S0 1 S 0 A 1 : 1 1 2 0 S 1 0 It is easy to see whether a given string is accepted or not. For example, 0011101 takes the machine from state S 0 through states S 1, S 1, S 2, S 0, S 0, S 1 to S 2 (a final state). N (S 0,0011101) = N (S 1,011101) = N (S 1,11101) =... = N (S 1,1) = S 2 Other strings that are accepted: 01, 001, 101, 0001, 0101, 00101101... COMP 2600 Deterministic Finite Automata 23

Example 1 (ctd.) S0 1 S 0 A 1 : 1 1 2 0 S 1 0 Accepted strings: 01, 001, 101, 0001, 0101, 00101101... Not accepted: ε, 0, 1, 00, 10, 11, 100... How is the first list different from the second list...? How do we justify our guess at this answer? COMP 2600 Deterministic Finite Automata 24

Proving an Acceptance Predicate in General If we are to claim that a machine M accepts the language that is characterised by a predicate P, then we must prove the following: Proof obligation 1: Show that any string satisfying P is accepted by M. Proof obligation 2: Show any string accepted by M satisfies P. COMP 2600 Deterministic Finite Automata 25

Proving an Acceptance Predicate for A 1 Proof obligation 1: If a string ends in 01, then it is accepted by A 1. That is: For all α Σ, N (S 0,α01) F Proof obligation 2: If a string is accepted by A 1, then it ends in 01. That is: For all w Σ, if N (S 0,w) F then α Σ. w = α01 COMP 2600 Deterministic Finite Automata 26

Part 1: α Σ, N (S 0,α01) F Lemma: s S. N (s,01) = S 2 Proof: Prove that N (S i,01) = S 2 by cases: N (S 0,01) = N (S 1,1) = S 2 N (S 1,01) = N (S 1,1) = S 2 N (S 2,01) = N (S 1,1) = S 2 So, by the append theorem above, whether N (S 0,α) is S 0,S 1 or S 2, N (S 0,α01) = N (N (S 0,α),01) = S 2 COMP 2600 Deterministic Finite Automata 27

Part 2: N (S 0,w) = S 2 = α. w = α01 Proof: If w were the empty string ε, then N (S 0,ε) = S 0 S 2, so the antecedent must be false, so by the rules of propositional logic, the claim is true! If w were one token long then again the antecedent is false - there is no one step path from S 0 to S 2. There clearly are paths from S 0 to S 2 of two or more steps, so suppose we are in that position with N (S 0,αxy) = S 2. Then by the corollary to the append theorem, N(N (S 0,αx),y) = S 2 By the definition of N, y must be 1 and N (S 0,αx) must be S 1. Similarly, N(N (S 0,α),x) = S 1, so x is 0, again by the definition of N. COMP 2600 Deterministic Finite Automata 28

Another Example What language does this DFA accept? O1B: S0 0 1 S 1 0 1 S 2 1 0 COMP 2600 Deterministic Finite Automata 29

Answer for O1B O1B accepts the language of bit-strings containing exactly one 1-bit. Proof obligations: Show that if a bit-string contains exactly one 1-bit then it is accepted by O1B. Show that if a string is accepted by O1B it contains exactly one 1-bit. O1B: S0 0 1 S 1 0 1 S 2 1 0 COMP 2600 Deterministic Finite Automata 30

Mapping to Mathematics Expressed mathematically, the main conclusion is L(O1B) = {w Σ w = 0 m 10 n } where m, n are any natural numbers. The two subgoals are 1. If w = 0 m 10 n then N (S 0,w) = S 1 2. If N (S 0,w) = S 1 then w = 0 m 10 n. For this DFA the phrase w is accepted by O1B is captured by the expression N (S 0,w) = S 1. COMP 2600 Deterministic Finite Automata 31

Proving these subgoals The first subgoal follows immediately from the following two lemmas, which are easily proved by induction: n 0. N (S 0,0 n ) = S 0 n 0. N (S 1,0 n ) = S 1 Therefore N (S 0,0 n 10 m ) = N (N (S 0,0 n ),10 m ) = N (S 0,10 m ) = N (N(S 0,1),0 m ) = N (S 1,0 m ) = S 1 The second subgoal, stated more formally as w : N (S 0,w) = S 1 = n,m 0. w = 0 n 10 m is proved on the next slide. COMP 2600 Deterministic Finite Automata 32

Proving these subgoals ctd. Lemma: N (S 0,β) = S 0 = β = 0 n for some n 0 (easy induction) We now prove N (S 0,w) = S 1 = w = 0 n 10 m for some n,m 0, by induction on the length of w. The base case holds because the antecedent is false N (S 0,ε) S 1. Suppose for induction that the theorem holds for all words of length k, and say that N (S 0,α) = S 1 for α with length k + 1. Case One: α = β1 and N (S 0,β) = S 0. Then by the lemma β = 0 k, so α = 0 k 10 0. Case Two: α = β0 and N (S 0,β) = S 1. Then by the induction hypothesis β = 0 n 10 m, so α = 0 n 10 m+1. COMP 2600 Deterministic Finite Automata 33

Limitations of DFAs General Question: How powerful are DFAs? Specifically, what class of languages can be recognised by DFAs? A very important example: Consider this language: L = { a n b n n N} That is, L = {ε,ab,aabb,aaabbb,a 4 b 4,a 5 b 5,...} This language cannot be recognised by any finite state automata This is because the memory of a DFA is limited, but the machine would have to remember how many a s it has seen. COMP 2600 Deterministic Finite Automata 34

Proof by contradiction Suppose A is a DFA that accepts L. That is L = L(A). Each of the following expressions denotes a state of A N (S 0,a), N (S 0,aa), N (S 0,a 3 )... Since this list is infinite and the number of states in A is finite, some of these expressions must denote the same state. Choose distinct i and j such that N (S 0,a i ) = N (S 0,a j ). (What we have done here is pick on two initial string fragments that the automaton will not be able to distinguish, in terms of what is allowed for the rest of the string) COMP 2600 Deterministic Finite Automata 35

Proof by contradiction (ctd) Since a i b i is accepted, we know N (S 0,a i b i ) F By the append theorem N (N (S 0,a i ),b i ) = N (S 0,a i b i ) F Now, since N (S 0,a i ) = N (S 0,a j ) N (S 0,a j b i ) = N (N (S 0,a j ),b i ) F So a j b i is accepted by A, but a j b i is not in L, contradicting the initial assumption. COMP 2600 Deterministic Finite Automata 36

Pigeon-Hole Principle The proof used the pigeon-hole principle: If we have more pigeons than pigeon-holes, then at least two pigeons must be sharing a hole. This was obviously true in our example as we had infinitely many pigeons crammed into finitely many holes. This is a useful generic way to prove that no finite state automata can recognise certain infinite languages. (but there are many infinite languages that can be recognised by DFAs, as we ve seen, so be careful!) COMP 2600 Deterministic Finite Automata 37

Equivalence of Automata Two automata are said to be equivalent if they accept the same language. a b b A 2 : S a S 1 S 2 a b 0 A 3 : b S 3 S a 4 b a Interesting question: a S a S 1 b a,b 0 S 2 b S 3 a b Can we simplify a DFA? Is there an equivalent DFA with fewer states? COMP 2600 Deterministic Finite Automata 38

Equivalence of States Two states S j and S k of an DFA are equivalent if, for all input strings w N (S j,w) F if and only if N (S k,w) F Note that N (S j,w) and N (S k,w) may be different states - we only care that both are in, or not in, F. In the following example, S 2 is equivalent to S 4. a b b A 2 : a S 1 S 2 S a b 0 b S 3 S a 4 b a COMP 2600 Deterministic Finite Automata 39

An Algorithm for Finding Equivalent States There is an iterative algorithm to compute a list of equivalence classes of states. The working data structure for the algorithm is a list of lists ( groups ) of states Each group contains states that appear to be equivalent, given the tests we have done so far On each iteration, we test one of the groups with a symbol from the alphabet. If we notice differing behaviour, we split the group. COMP 2600 Deterministic Finite Automata 40

The Algorithm Details Input: A list containing two groups (a group is represented as a list of states). One group consists of the Final states and the other consists of the non-final states. Data: The working data structure, W DS : [[State]], is a list of groups of states. When two states are in different groups, we know they are not equivalent. Loop: Pick a group, {s 1,...s j } and a symbol, x. If the states {N(s i,x)} are all in the same group, then the group {s 1,...s j } is not split. If the states {N(s i,x)} belong to different groups of WDS, then the group {s 1,...s j } should be split accordingly. Continue until we cannot, by any choice of letter, split any group. COMP 2600 Deterministic Finite Automata 41

Our Previous Example We split the non-final states, but leave the final states together: A 2 : a b [[s 0,s 1,s 3 ],[s 2,s 4 ]] b a S 1 S 2 a [[s S a 0,s 1 ],[s 3 ],[s 2,s 4 ]] b 0 b b S 3 S a 4 [[s 0 ],[s 1 ],[s 3 ],[s 2,s 4 ]] a b a [[s 0 ],[s 1 ],[s 3 ],[s 2,s 4 ]] [[s b 0 ],[s 1 ],[s 3 ],[s 2,s 4 ]] COMP 2600 Deterministic Finite Automata 42

Elimination of States Suppose A = (Σ,S,S 0,F,N) is a DFA with state S k equivalent to S j. (and S k is not S 0.) We can eliminate S k from this automaton by defining a new automaton A = (Σ,S,S 0,F,N ) as follows: S is S without S k F is F without S k N S j (s,w) = N(s, w) if N(s,w) = S k otherwise. COMP 2600 Deterministic Finite Automata 43

Example Since S 2 S 4 in A 2, let s eliminate S 4. New set of states is {S 0,S 1,S 2,S 3 } New set of final states is {S 2 } New transition function is: A 3 : a a S 1 b a,b S 0 S 2 b S 3 a b COMP 2600 Deterministic Finite Automata 44

DFA Minimisation Consider the DFA below: a a,b b S 1 S 2 a b S 3 None of S 1, S 2, S 3 are equivalent... But S 3 is inaccessible from the start state so can safely be deleted (along with the transitions emerging from it). Deleting equivalent and inaccessible states will give a minimal DFA. COMP 2600 Deterministic Finite Automata 45