Models of Computation I: Finite State Automata

Models of Computation I: Finite State Automata COMP1600 / COMP6260 Dirk Pattinson Australian National University Semester 2, 2017

Catch Up / Drop in Lab
When: Fridays, 15.00-17.00
Where: N335, CSIT Building (bldg 108)
Until the end of the semester.
This is not a tutorial, but we're happy to answer all your questions. No registration or anything required, just drop by!

The Story So Far...
Logic. Language and proofs to speak about systems precisely; useful to express properties and do proofs.
Functional Programs. Establish properties of functional programs; main tool: (structural) induction.
Imperative Programs. Again, the focus is on properties of programs; main tool: Hoare Logic.
Q. Is there a general notion of computation that encompasses both?

First Shot: Your Laptop
Abstract Characteristics.
  can do computation
  has memory, but only a finite amount
  has (lots of) internal states

From Laptops to Formal Models
Concrete (your laptop): realistic (it exists!), complex, hard to analyse.
Abstract (mathematical model): exists only as a model, simple, easy to analyse.
Q. What is a good simple model of computation?
  It should match what really exists (possibly by a long shot).
  It should be conceptually simple.

First Answer: Finite State Automata
Basic Components.
  internal states: finitely many
  state transitions: triggered by reading input
  simplifying assumption: just one output, yes/no
Data.
  basic input: strings (what you type in, a text/xml file)
  characters: drawn from a finite set (the alphabet)

Example: Java Identifiers
From Oracle's Java Language Specification. An identifier is a sequence of one or more characters. The first character must be a valid first character (letter, $, _) in an identifier of the Java programming language, hereafter in this chapter called simply Java. Each subsequent character in the sequence must be a valid nonfirst character (letter, digit, $, _) in a Java identifier.
Graphical Specification. [Syntax diagram "Identifier": a first character (Letter, $ or _) followed by a loop over Letter, Digit, $ or _]
Q. Can you see a machine that recognises Java identifiers?

Java Identifiers Example: Main Components
[Syntax diagram "Identifier": Letter, $ or _ first, then a loop over Letter, Digit, $ or _]
Data. Drawn from a finite alphabet (Unicode, or ASCII).
Control. The answer is yes if I can get from the left to the right of the diagram, no otherwise. The machine has states after taking a transition (implicit in the diagram).
Computational Problem with yes/no answer: is a given sequence of characters a valid Java identifier?

Preview
This Week. Finite Automata
  start with the simplest model: finite automata
  relate to regular languages, non-determinism
  conclusion: finite automata are too simple
Next Week. Pushdown Automata
  like finite automata, but with some more memory
  useful e.g. for specifying the syntax of programming languages
  still too simple for general computation
Then. Turing Machines
  the most widely accepted model of computation
  infinite memory (idea: buy another hard disk whenever your computation runs out of memory)
  limits of what can be computed

Finite State Automata: First Example
The simplest useful abstraction of a computing machine consists of:
  a fixed, finite set of states
  a transition relation over the states
Example: a traffic light FSA has 3 states.
  G names the state in which the light is green.
  Y names the state in which the light is yellow.
  R names the state in which the light is red.
[Transition diagram over the states G, Y and R]
System designs are often given in terms of state machines.

Second Example: Vending Machine
Operation.
  accepts 10c and 20c coins
  delivers if it has received at least 40c and a selection is made
[Transition diagram: states 0c, 10c, 20c, 30c, 40c and 50c; edges labelled 10, 20 and select]
Note.
  transitions are labelled
  new ingredient: final states (doubly circled)
Computation. Sequences of actions (labels) from the initial state to a final state.

Language Examples
Main Idea.
  input: a string over a fixed character set
  operation: transitions labelled with characters
  output: yes if in a final state after reading the input
More Generally.
  Setup: fix a finite set of characters (an alphabet).
  Problem: a set of strings (called a language) that are valid or good.
  Task: decide computationally which strings are good.
Example Languages.
  1. A finite set: {a, aa, ab, aaa, aab, aba, abb}
  2. Palindromes consisting of bits (0, 1): {0, 1, 00, 11, 010, 101, 000, 111, 0110, ...}
Languages in this sense are called formal languages.

Terminology
Alphabet. A finite set (of symbols), usually denoted by Σ.
Strings over an alphabet Σ. Finite sequences of characters (elements of Σ); a string can be the empty sequence. E.g. for Σ = {a, b, c}, ababc is a string over Σ.
Languages over an alphabet Σ are just sets of strings over Σ.
Sentences of the language is just another name for the elements (strings) of the language.
Notation. Σ* is the set of all strings over Σ. Therefore, every language with alphabet Σ is some subset of Σ*.

Automata
First Model of Computation. Deterministic Finite Automata (DFAs) solve a computational problem: given a string s, is s accepted?
Basic Ingredients (see e.g. the traffic light and vending machine examples).
  The alphabet of a DFA is a finite set of input tokens that the automaton acts on.
  A DFA consists of a finite set of states (a primitive notion).
  One of the states is the initial state, where the automaton starts.
  At least one of the states is a final state.
  A transition function (next state function): State × Token → State

Recurring Theme
Diagrammatic Notation.
  useful for humans
  e.g. the transition diagram of the vending machine
Mathematical Notation.
  useful for formal manipulation (e.g. proving theorems)
  useful for computer implementation
Glue between Diagrams and Maths.
  both notations convey precisely the same information
  crucial: being able to switch back and forth!

Formal Definition of DFA
A Deterministic Finite State Automaton (DFA) consists of five parts, A = (Σ, S, s_0, F, N):
  an input alphabet Σ, the set of tokens
  a set of states S
  an initial state s_0 ∈ S (we start here)
  a set of final states F ⊆ S (we hope to finish in one of these)
  a transition function N : S × Σ → S
Aside. Having a transition function is what makes the automaton deterministic.

Example 1
As a diagram. [Transition diagram with states S_0, S_1 and S_2; it carries the same information as the table below]
In Mathematical Notation.
  Alphabet: {0, 1}
  States: {S_0, S_1, S_2}
  Initial state: S_0
  Final states: {S_2}
  Transition function (as a table):

            0      1
    S_0    S_1    S_0
    S_1    S_1    S_2
    S_2    S_1    S_0

Aside. The actual names of the states are irrelevant.

Example 1, ctd
Recall. N : S × Σ → S is the transition function.

            0      1
    S_0    S_1    S_0
    S_1    S_1    S_2
    S_2    S_1    S_0

Single Steps of the automaton. N(S_0, 0) is the state that the automaton transitions to from state S_0 when reading the letter 0. Here: N(S_0, 0) = S_1.
Multiple Steps of the automaton. N(N(S_0, 0), 1) is the state of the automaton when starting in S_0 and reading first 0, then 1. Here: N(N(S_0, 0), 1) = S_2.
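The table above translates directly into a lookup structure. The following is a minimal sketch in Python, assuming states and tokens are represented as strings; the names N, STATES, ALPHABET, INITIAL and FINAL are chosen here for illustration and do not come from the slides.

```python
# Transition table of Example 1 as a dictionary:
# keys are (state, token) pairs, values are the next state.
N = {
    ("S0", "0"): "S1", ("S0", "1"): "S0",
    ("S1", "0"): "S1", ("S1", "1"): "S2",
    ("S2", "0"): "S1", ("S2", "1"): "S0",
}

ALPHABET = {"0", "1"}
STATES = {"S0", "S1", "S2"}
INITIAL = "S0"
FINAL = {"S2"}

# N is a total function: exactly one entry for every (state, token) pair.
assert set(N) == {(s, x) for s in STATES for x in ALPHABET}

single_step = N[("S0", "0")]             # N(S0, 0)
double_step = N[(N[("S0", "0")], "1")]   # N(N(S0, 0), 1)
print(single_step, double_step)
```

The totality check is what distinguishes a transition function from a mere relation: every state has exactly one successor for every token.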

Example 2
[Transition diagram with states U, V, Y and Z over the alphabet {a, b, c}]

          a    b    c
    U     Z    V    Y
    V     V    V    V
    Y     Z    V    Y
    Z     Z    Z    Z

(the table carries the same information as the diagram)
Q. What is the language of this automaton?

Eventual State Function
Revisit Example 1. [Transition diagram of Example 1]
  Input 0101 takes the DFA from S_0 to S_2.
  Input 1011 takes the DFA from S_1 to S_0, etc.
A complete list of such possibilities is a function from a given state and a string to an eventual state. This is the idea of the eventual state function.

Eventual State Function Definition
Definition. Let A be a DFA with states S, alphabet Σ, and transition function N. The eventual state function for A is of type
  N* : S × Σ* → S
and is defined inductively by:
  N*(s, ε) = s                      (N1)
  N*(s, xα) = N*(N(s, x), α)        (N2)
Informally. N*(s, w) is the state A reaches, starting in state s and reading string w.
For Haskell aficionados: N* = uncurry (foldl (curry N))
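The inductive definition can be transcribed almost literally. The sketch below assumes the transition table of Example 1 stored as a Python dictionary (a representation chosen for illustration); eventual follows (N1) and (N2), and eventual_fold is the left-fold reading suggested by the Haskell remark.

```python
from functools import reduce

# Transition table of Example 1 (keys: (state, token), values: next state).
N = {
    ("S0", "0"): "S1", ("S0", "1"): "S0",
    ("S1", "0"): "S1", ("S1", "1"): "S2",
    ("S2", "0"): "S1", ("S2", "1"): "S0",
}

def eventual(state, word):
    """N*(state, word), following (N1) and (N2) literally."""
    if word == "":                          # (N1): N*(s, ε) = s
        return state
    x, alpha = word[0], word[1:]            # word = xα
    return eventual(N[(state, x)], alpha)   # (N2)

def eventual_fold(state, word):
    """The same function written as a left fold over the tokens."""
    return reduce(lambda s, x: N[(s, x)], word, state)

print(eventual("S0", "0101"), eventual("S1", "1011"))
```

Both versions agree on every input; the fold avoids deep recursion on long strings.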

An Important (but Unsurprising) Theorem about N*
Theorem. For all states s ∈ S and for all strings α, β ∈ Σ*:
  N*(s, αβ) = N*(N*(s, α), β)
Proof by induction on the length of α.
Base case: α = ε
  LHS = N*(s, εβ) = N*(s, β)
  RHS = N*(N*(s, ε), β) = N*(s, β) = LHS      (by (N1))

Proof ctd: Step case
Step Case. Show that N*(s, (xα)β) = N*(N*(s, xα), β).
  LHS = N*(s, (xα)β)
      = N*(s, x(αβ))
      = N*(N(s, x), αβ)             (by (N2))
      = N*(N*(N(s, x), α), β)       (by IH)
  RHS = N*(N*(s, xα), β)
      = N*(N*(N(s, x), α), β)       (by (N2))
Corollary. When β is a single token y:
  N*(s, αy) = N(N*(s, α), y)

Example
[Transition diagram of Example 1]
  N*(S_1, 1011) = N*(N(S_1, 1), 011)
                = N*(S_2, 011)
                = N*(S_1, 11)
                = N*(S_2, 1)
                = N*(S_0, ε)
                = S_0

Language of an Automaton
Acceptance, Informally. A DFA accepts a string if, starting from the start state, it terminates in one of the final states.
Acceptance, Formally. Let A = (Σ, S, s_0, F, N) be a DFA and w be a string in Σ*. We say w is accepted by A if
  N*(s_0, w) ∈ F
The language accepted by A is the set of all strings accepted by A:
  L(A) = {w ∈ Σ* | N*(s_0, w) ∈ F}
(That is, w ∈ L(A) iff N*(s_0, w) ∈ F.)

Example 1 again
A_1: [Transition diagram of Example 1]
Q. Which strings are accepted?
E.g. 0011101 takes the machine from state S_0 through states S_1, S_1, S_2, S_0, S_0, S_1 to S_2 (a final state):
  N*(S_0, 0011101) = N*(S_1, 011101) = N*(S_1, 11101) = ... = N*(S_1, 1) = S_2
Others: 01, 001, 101, 0001, 0101, 00101101, ...

Example 1 (ctd.)
A_1: [Transition diagram of Example 1]
Accepted Strings. 01, 001, 101, 0001, 0101, 00101101, ...
Strings that are not accepted. ε, 0, 1, 00, 10, 11, 100, ...
Q. What do the accepted strings have in common? How do we justify this?

Proving an Acceptance Predicate in General
Our Claim. The automaton A accepts precisely the strings that are elements of the language L = {w ∈ Σ* | P(w)}. (P is sometimes called an acceptance predicate.)
Proof Obligations.
  1. Show that any string satisfying P is accepted by A.
  2. Show that any string accepted by A satisfies P.

Proving an Acceptance Predicate for A_1
Proof obligation 1: If a string ends in 01, then it is accepted by A_1. That is:
  for all α ∈ Σ*, N*(S_0, α01) ∈ F
Proof obligation 2: If a string is accepted by A_1, then it ends in 01. That is:
  for all w ∈ Σ*, if N*(S_0, w) ∈ F then ∃α ∈ Σ*. w = α01

Part 1: ∀α ∈ Σ*. N*(S_0, α01) ∈ F
Lemma: ∀s ∈ S. N*(s, 01) = S_2
Proof by cases:
  N*(S_0, 01) = N*(S_1, 1) = S_2
  N*(S_1, 01) = N*(S_1, 1) = S_2
  N*(S_2, 01) = N*(S_1, 1) = S_2
So, by the append theorem above,
  N*(S_0, α01) = N*(N*(S_0, α), 01) = S_2
and S_2 ∈ F.

Part 2: N*(S_0, w) = S_2 ⟹ ∃α. w = α01
Proof. None of the strings ε, 0 and 1 is accepted, so an accepted string has length at least two and can be written as αxy with single tokens x and y.
Suppose N*(S_0, αxy) = S_2. By the corollary to the append theorem (case of a single token):
  N(N*(S_0, αx), y) = S_2
By the definition of N, y must be 1 and N*(S_0, αx) must be S_1.
Similarly, N(N*(S_0, α), x) = S_1 and x is 0, again by the definition of N.
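The two proof obligations can also be sanity-checked by brute force on short strings. This is only a finite test and no substitute for the proof; the sketch assumes the transition table of Example 1 as a Python dictionary, and the helper names accepted and mismatches are made up for this illustration.

```python
from itertools import product

# Transition table of A_1 (Example 1); S2 is the only final state.
N = {
    ("S0", "0"): "S1", ("S0", "1"): "S0",
    ("S1", "0"): "S1", ("S1", "1"): "S2",
    ("S2", "0"): "S1", ("S2", "1"): "S0",
}
FINAL = {"S2"}

def accepted(word):
    """True iff N*(S0, word) is a final state."""
    state = "S0"
    for x in word:
        state = N[(state, x)]
    return state in FINAL

def mismatches(max_len):
    """Bit strings up to max_len on which acceptance and 'ends in 01' disagree."""
    return [
        w
        for n in range(max_len + 1)
        for w in map("".join, product("01", repeat=n))
        if accepted(w) != w.endswith("01")
    ]

print(mismatches(10))  # an empty list means no counterexample was found
```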

Another Example
What language does this DFA accept?
SOB: [Transition diagram with states S_0, S_1 and S_2 over the alphabet {0, 1}]

Answer for SOB
SOB accepts the language of bitstrings containing exactly one 1-bit.
Proof obligations:
  Show that if a bitstring contains exactly one 1-bit then it is accepted by SOB.
  Show that if a string is accepted by SOB then it contains exactly one 1-bit.
SOB: [Transition diagram with states S_0, S_1 and S_2 over the alphabet {0, 1}]

Mapping to Mathematics
Expressed mathematically, the main conclusion is
  L(SOB) = {w ∈ Σ* | w = 0^n 1 0^m}
The two subgoals are
  1. If w = 0^n 1 0^m then N*(S_0, w) = S_1.
  2. If N*(S_0, w) = S_1 then w = 0^n 1 0^m.
For this DFA the phrase "w is accepted by SOB" is captured by the expression N*(S_0, w) = S_1.

Proving these subgoals
The first subgoal follows immediately from the following two lemmas, which are easily proved by induction:
  ∀n ≥ 0. N*(S_0, 0^n) = S_0
  ∀n ≥ 0. N*(S_1, 0^n) = S_1
Therefore
  N*(S_0, 0^n 1 0^m) = N*(N*(S_0, 0^n), 1 0^m)
                     = N*(S_0, 1 0^m)
                     = N*(N(S_0, 1), 0^m)
                     = N*(S_1, 0^m)
                     = S_1
The second subgoal, stated more formally as
  ∀w. N*(S_0, w) = S_1 ⟹ ∃n, m ≥ 0. w = 0^n 1 0^m
can be proved in a similar fashion to Example 1 on earlier slides.
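As with Example 1, the claim about SOB can be tested on all short bit strings. The transition table below is an assumption: the diagram did not survive, so the table is reconstructed from the lemmas above (0 loops on S_0 and S_1, 1 leads from S_0 to S_1) plus the guess that a second 1 leads to a trap state S_2.

```python
from itertools import product

# ASSUMED transition table for SOB, reconstructed from the proof text;
# S2 is taken to be a non-accepting trap state.
N = {
    ("S0", "0"): "S0", ("S0", "1"): "S1",
    ("S1", "0"): "S1", ("S1", "1"): "S2",
    ("S2", "0"): "S2", ("S2", "1"): "S2",
}
FINAL = {"S1"}

def eventual(state, word):
    """N*(state, word), computed token by token."""
    for x in word:
        state = N[(state, x)]
    return state

def accepted(word):
    return eventual("S0", word) in FINAL

def mismatches(max_len):
    """Bit strings on which acceptance and 'exactly one 1-bit' disagree."""
    return [
        w
        for n in range(max_len + 1)
        for w in map("".join, product("01", repeat=n))
        if accepted(w) != (w.count("1") == 1)
    ]

print(mismatches(10))  # an empty list means no counterexample was found
```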

Limitations of FSAs
Q. Is an FSA a good model of computation? Suppose we have a program P that always terminates and outputs yes or no for every input string. Is there an FSA that accepts precisely the strings for which P says yes?
Technical Analysis. Properties of languages accepted by a DFA.
A very important example:
  L = {a^n b^n | n ∈ N}
  L = {ε, ab, aabb, aaabbb, a^4 b^4, a^5 b^5, ...}
Claim. There is no FSA that recognises this language (because an FSA's memory is limited).
Q. Given the claim above, are FSAs realistic models of computation?

Proof of Claim
Proof by contradiction. Suppose A is an FSA with initial state S_0 that accepts L, that is, L = L(A).
Then each of the following is a state of A:
  N*(S_0, a), N*(S_0, a^2), N*(S_0, a^3), ...
But A only has finitely many states, so some state must repeat: there are distinct i and j such that
  N*(S_0, a^i) = N*(S_0, a^j)
That is, the automaton cannot tell a^i and a^j apart.

Proof by contradiction (ctd)
Since a^i b^i is accepted, we know
  N*(S_0, a^i b^i) ∈ F
By the append theorem,
  N*(N*(S_0, a^i), b^i) = N*(S_0, a^i b^i) ∈ F
Now, since N*(S_0, a^i) = N*(S_0, a^j),
  N*(N*(S_0, a^j), b^i) = N*(S_0, a^j b^i) ∈ F
So a^j b^i is accepted by A. But a^j b^i is not in L because i ≠ j, contradicting the initial assumption.
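The argument is constructive: for any concrete DFA over {a, b} one can compute a pair i, j that the automaton confuses. The sketch below uses a small made-up DFA (the table EXAMPLE_N and the state names p0, p1, p2 are hypothetical, not from the slides) and shows that a^i b^i and a^j b^i end in the same state.

```python
def run(N, state, word):
    """N*(state, word) for a transition table N given as a dictionary."""
    for x in word:
        state = N[(state, x)]
    return state

def confusable_pair(N, s0):
    """Return (i, j) with i < j and N*(s0, a^i) == N*(s0, a^j).

    Terminates because the DFA has finitely many states (pigeon-hole).
    """
    first_seen = {}            # state -> number of a's read when first reached
    state, count = s0, 0
    while state not in first_seen:
        first_seen[state] = count
        state = N[(state, "a")]
        count += 1
    return first_seen[state], count

# HYPOTHETICAL three-state DFA over {a, b}, used only as a test subject.
EXAMPLE_N = {
    ("p0", "a"): "p1", ("p1", "a"): "p2", ("p2", "a"): "p1",
    ("p0", "b"): "p0", ("p1", "b"): "p0", ("p2", "b"): "p1",
}

i, j = confusable_pair(EXAMPLE_N, "p0")
same = run(EXAMPLE_N, "p0", "a" * i + "b" * i) == run(EXAMPLE_N, "p0", "a" * j + "b" * i)
print(i, j, same)
```

Whatever set of final states is chosen for such a DFA, it either accepts both a^i b^i and a^j b^i or rejects both, so it cannot accept exactly L.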

Pigeon-Hole Principle
The proof used the pigeon-hole principle: no function from one set to a smaller finite set can be one-to-one.
(Finiteness is not really necessary: no function from one set to another with smaller cardinality can be one-to-one.)
You cannot fit n + 1 pigeons into n holes.

Equivalence of Automata
Two automata are said to be equivalent if they accept the same language.
Example: [Transition diagrams of two equivalent automata over {0, 1}: A_4 with states S_0, S_1, S_2, S_3 and A_5 with states S_0, S_1]
Q. Can FSAs be simplified? Is there an equivalent FSA with fewer states?

Equivalence of States
Two states S_j and S_k of an FSA are equivalent if, for all input strings w,
  N*(S_j, w) ∈ F if and only if N*(S_k, w) ∈ F
Example. In A_4, S_2 is equivalent to S_0 and S_1 is equivalent to S_3.
A_4: [Transition diagram with states S_0, S_1, S_2, S_3 over {0, 1}]

Elimination of Equivalent States
Assumptions.
  A = (Σ, S, S_0, F, N) is an FSA
  S_k and S_j are equivalent
  S_k ≠ S_0 (don't eliminate the initial state!)
Elimination of S_k from A gives a new automaton A' = (Σ, S', S_0, F', N'), where
  S' is S without S_k
  F' is F without S_k
  N'(s, x) = (if N(s, x) = S_k then S_j else N(s, x))

Example
Since S_2 ≡ S_0 in A_4, let's eliminate S_2.
  New set of states is {S_0, S_1, S_3}
  New set of final states is {S_0}
  New transition function is:

            0      1
    S_0    S_0    S_1
    S_1    S_1    S_0
    S_3    S_3    S_0

A_6: [Transition diagram of the resulting automaton with states S_0, S_1, S_3]
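The elimination step is a small transformation on the transition table. In the sketch below, the table A4_N is an assumption: the diagram of A_4 did not survive, so a four-state table is used that is consistent with the stated equivalences and that reproduces the table of A_6 shown above. The function name eliminate is likewise chosen for illustration.

```python
def eliminate(states, final, N, k, j):
    """Remove state k; every transition into k is redirected to the
    equivalent state j (k must not be the initial state)."""
    new_states = states - {k}
    new_final = final - {k}
    new_N = {
        (s, x): (j if target == k else target)
        for (s, x), target in N.items()
        if s != k
    }
    return new_states, new_final, new_N

# ASSUMED transition table for A_4 (final states S0 and S2).
A4_STATES = {"S0", "S1", "S2", "S3"}
A4_FINAL = {"S0", "S2"}
A4_N = {
    ("S0", "0"): "S0", ("S0", "1"): "S1",
    ("S1", "0"): "S1", ("S1", "1"): "S2",
    ("S2", "0"): "S2", ("S2", "1"): "S3",
    ("S3", "0"): "S3", ("S3", "1"): "S0",
}

A6_STATES, A6_FINAL, A6_N = eliminate(A4_STATES, A4_FINAL, A4_N, k="S2", j="S0")
print(sorted(A6_STATES), sorted(A6_FINAL))
```

Note that S3 keeps its outgoing transitions in the result even though nothing leads to it any more; removing such unreachable states is a separate step.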

FSA Minimisation
Elimination of Equivalent States. If two states are equivalent, one can be eliminated.
Elimination of Unreachable States. If a state cannot be reached from the initial state, then it can also be eliminated.
Example. S_3 is not reachable in A_6.
A_6: [Transition diagram with states S_0, S_1, S_3]

The Standard Minimisation Algorithm
Main Idea.
  Aggregate states into groups (of possibly equivalent states).
  Initially, all states are possibly equivalent.
  Split a group of possibly equivalent states if we have evidence that they are not equivalent:
    a non-final state is never equivalent to a final state;
    two states are non-equivalent if the transition function takes them into different groups (with the same letter).
  Repeat until no more groups can be split.
Realisation.
  The working data structure for the algorithm is a list of lists ("groups") of states.
  On each iteration, we test one of the groups with a symbol from the alphabet. If we notice differing behaviour, we split the group.

The Algorithm Details
Input: A list containing two groups (a group is represented as a list of states). One group consists of the final states and the other consists of the non-final states.
Data: The working data structure, WDS : [[State]], is a list of groups of states. When two states are in different groups, we know they are not equivalent.
Loop: Pick a group {s_1, ..., s_j} and a symbol x.
  If the states {N(s_i, x) | i = 1, ..., j} are all in the same group, then the group {s_1, ..., s_j} is not split.
  If the states {N(s_i, x) | i = 1, ..., j} belong to different groups of WDS, then the group {s_1, ..., s_j} should be split accordingly.
Continue until we cannot, by any choice of letter, split any group.
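The loop described above can be sketched as follows. This is one straightforward reading of the slide, not an optimised algorithm; the function names refine and group_index are made up here, and the test subject is the transition table of Example 1 (A_1).

```python
def group_index(groups, state):
    """Position of the group that currently contains `state`."""
    return next(i for i, g in enumerate(groups) if state in g)

def refine(states, alphabet, final, N):
    """Split groups of possibly equivalent states until nothing changes."""
    # Initial working data structure: final states vs. non-final states.
    groups = [g for g in (sorted(final), sorted(states - final)) if g]
    while True:
        for group in groups:
            split = None
            for x in sorted(alphabet):
                # Bucket the states of this group by the group their
                # x-successor lies in.
                buckets = {}
                for s in group:
                    buckets.setdefault(group_index(groups, N[(s, x)]), []).append(s)
                if len(buckets) > 1:
                    split = list(buckets.values())
                    break
            if split:
                groups.remove(group)
                groups.extend(split)
                break
        else:
            return groups      # no group could be split by any letter

# Transition table of Example 1 (A_1), final state S2.
A1_N = {
    ("S0", "0"): "S1", ("S0", "1"): "S0",
    ("S1", "0"): "S1", ("S1", "1"): "S2",
    ("S2", "0"): "S1", ("S2", "1"): "S0",
}
A1_GROUPS = refine({"S0", "S1", "S2"}, {"0", "1"}, {"S2"}, A1_N)
print(sorted(A1_GROUPS))
```

For A_1 the group of non-final states is split by the letter 1, so all three states end up in separate groups and A_1 is already minimal.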

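Our Previous Example
Our running example is trivial. The initial split is it.
A: [Transition diagram with states S_0, S_1, S_2, S_3 over {0, 1}]
Initial groups: [[s_0, s_2], [s_1, s_3]]
Testing the groups with 0 and with 1 never causes a split; after each test the working data structure is still [[s_0, s_2], [s_1, s_3]].
A': [Transition diagram of the resulting automaton with two states S_a and S_b; edges labelled 0 and 1]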

Non-Deterministic Finite State Automata (NFAs)
Consider this FSA: [Transition diagram with states S_0, S_1, S_2, S_3; edges labelled a, b and c]
Q. Is it intuitively clear what it does?
Q. Is it a DFA in the sense of our definition?

Is it legal, i.e. a proper DFA?
[Transition diagram with states S_0, S_1, S_2, S_3; edges labelled a, b and c]
A. It makes sense, but it is nondeterministic: it is a nondeterministic finite automaton (NFA). So it is not a legal DFA, but a specimen of a different breed.
Differences to deterministic automata.
  Multiple edges with the same label can come out of a state.
  For some states, there is not an edge for every token.
Formally. NFAs have a transition relation rather than a transition function.
  The transition relation R(s_1, x, s_2) holds if there's an x-labelled edge from s_1 to s_2.
  There can be no x-labelled edge between s_1 and any state.
  There can be many states s_2, s_3, ... that are connected to s_1 via an x-labelled edge.

Is it clear what it does?
[Transition diagram with states S_0, S_1, S_2, S_3; edges labelled a, b and c]
Observations.
  Some states don't have an outgoing edge for a certain letter, so the NFA can get stuck.
  In some states, there's more than one possible successor state for a certain letter.
Acceptance condition for NFAs, given a string α: the NFA can get from the initial state to a final state, making the right choice of successor state without getting stuck.
Example. α = aaabcc
  need to look ahead to make the right choice
  (alternatively, try to backtrack if the wrong choice has been made)

DFAs vs NFAs
Key Differences.
  For each state in a DFA and for each input symbol, there is a unique successor state. DFAs have a transition function.
  NFAs allow zero, one or more transitions from a state for the same input symbol. NFAs have a transition relation.
  An input sequence a_1, a_2, ..., a_n is accepted by an NFA if there exists some sequence of transitions that leads from the initial state to a final state.

Why NFAs?
Example. NFAs are simpler.
An NFA recognising strings of letters ending in "man" (Σ is the Latin alphabet):
[Transition diagram: S_0 with a loop labelled Σ, then edges S_0 -m-> S_1 -a-> S_2 -n-> S_3]

An Equivalent DFA
Example. DFAs are (often) more complex.
A DFA that recognises strings of letters that end in "man":
[Transition diagram: states S_0, S_1, S_2, S_3 with edges labelled m, a, n, Σ-{m}, Σ-{a,m} and Σ-{m,n}]

NFAs: Formal Definition
A Nondeterministic Finite State Automaton (NFA) consists of five parts, A = (Σ, S, s_0, F, R):
  an input alphabet Σ, the set of tokens
  a set of states S
  an initial state s_0 ∈ S (we start here)
  a set of final states F ⊆ S (we hope to finish in one of these)
  a transition relation R ⊆ S × Σ × S
Aside. The transition relation is what makes the automaton nondeterministic.

Eventual State Relation for NFAs
Basic Idea. The eventual state relation R*(s, w, s') is true if s' is a state the NFA can reach, starting in state s and reading string w.
Formal Definition. The eventual state relation has type
  R* ⊆ S × Σ* × S    or    R* : S × Σ* × S → Bool
and is defined inductively as follows:
  R*(s, ε, s)
  R*(s, xα, s') = ∃s''. R(s, x, s'') ∧ R*(s'', α, s')

An Important (but Unsurprising) Theorem about R*
For all states s, s' and for all strings α, β ∈ Σ*:
  R*(s, αβ, s') if and only if ∃s''. R*(s, α, s'') ∧ R*(s'', β, s')
The proof is similar to the corresponding result for N* in DFAs.

Language of an NFA
Let A = (Σ, S, s_0, F, R) be an NFA.
Definition. A string w is accepted by A if
  ∃s ∈ F. R*(s_0, w, s)
The language accepted by A is the set of all strings accepted by A:
  L(A) = {w ∈ Σ* | ∃s ∈ F. R*(s_0, w, s)}
Informally. That is, w ∈ L(A) iff there exists a path through the diagram for A, from s_0 to a final state s (s ∈ F), such that the symbols on the path match the symbols in w.
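The eventual state relation and the acceptance condition can be sketched directly from these definitions. The example is the NFA for strings ending in "man"; representing R as a Python set of triples and restricting Σ to lower-case ASCII letters are assumptions made for this illustration.

```python
import string

# Assumption: Σ is restricted to the lower-case ASCII letters.
SIGMA = set(string.ascii_lowercase)

# Transition relation as a set of triples (source, token, target):
# a loop on S0 for every letter, then the path m, a, n.
R = {("S0", c, "S0") for c in SIGMA} | {
    ("S0", "m", "S1"),
    ("S1", "a", "S2"),
    ("S2", "n", "S3"),
}
FINAL = {"S3"}

def reachable(s, word):
    """The set of all s' such that R*(s, word, s') holds."""
    if word == "":                        # R*(s, ε, s)
        return {s}
    x, alpha = word[0], word[1:]
    result = set()
    for (p, y, q) in R:                   # some q with R(s, x, q) ...
        if p == s and y == x:
            result |= reachable(q, alpha)     # ... and R*(q, α, s')
    return result

def accepted(word):
    """True iff some final state is reachable from S0 on `word`."""
    return bool(reachable("S0", word) & FINAL)

print(accepted("human"), accepted("mane"))
```

An empty result from reachable corresponds to the NFA getting stuck on every possible choice.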

Power of Nondeterminism?
Q. Is there a language that is accepted by an NFA for which we cannot find a DFA that (also) accepts it?
  It seems easier to construct NFAs,
  but in our examples, DFAs did also exist.
A. A simple no.
Theorem. If a language L is accepted by an NFA, then there is some DFA which accepts the same language. Moreover, this DFA can be computed using an algorithm (just like the minimal automaton can be computed using state equivalence).
Drawback. The resulting DFA may have exponentially many states, because it has to record the set of states that the NFA could be in.

Constructing the Equivalent DFA from an NFA
Assumption. We have an NFA with state set {q_0, ..., q_n}.
Basic Idea.
  Consider all possible runs of the NFA in parallel.
  As a consequence, the NFA can be in a set of states.
Construction.
  A state of the DFA is a set of states of the NFA, e.g. {q_3, q_7} or ∅. It signifies the states that the NFA can be in after reading some input.
  The transition function records the possible next states: e.g. from {q_3, q_7} with letter x, take the union of the transitions (with x) from q_3 and q_7.
  Final states are the state sets that contain a final state.
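The construction can be sketched in a few lines if only the state sets reachable from {s_0} are generated. The example reuses the NFA for strings ending in "man", with the simplifying assumption that the alphabet is {m, a, n, x}, where x stands in for every other letter; the function names are chosen for this sketch.

```python
def subset_construction(alphabet, R, s0, final):
    """Build the reachable part of the DFA whose states are sets of NFA states."""
    start = frozenset({s0})
    dfa_N, todo, seen = {}, [start], {start}
    while todo:
        current = todo.pop()
        for x in alphabet:
            # Union of all x-transitions from the NFA states in `current`.
            nxt = frozenset(q for (p, y, q) in R if p in current and y == x)
            dfa_N[(current, x)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    dfa_final = {group for group in seen if group & final}
    return seen, start, dfa_final, dfa_N

def dfa_accepts(start, dfa_final, dfa_N, word):
    state = start
    for x in word:
        state = dfa_N[(state, x)]
    return state in dfa_final

# NFA for strings ending in "man"; "x" stands for any other letter (assumption).
ALPHABET = {"m", "a", "n", "x"}
R = {("S0", c, "S0") for c in ALPHABET} | {
    ("S0", "m", "S1"), ("S1", "a", "S2"), ("S2", "n", "S3"),
}

DFA_STATES, START, DFA_FINAL, DFA_N = subset_construction(ALPHABET, R, "S0", {"S3"})
print(len(DFA_STATES), dfa_accepts(START, DFA_FINAL, DFA_N, "xman"))
```

Here only four of the sixteen possible state sets are reachable, which matches the size of the four-state DFA for this language; in the worst case all subsets can occur.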

Regular Expressions
Challenge. Understand the computational power of DFAs / NFAs.
Approach. Characterise the languages that can be accepted by an NFA in a different form.
One Characterisation. Regular expressions (cf. Perl, Ruby, grep).
Basic Operators used to construct new expressions from old:
  vertical bar (pipe) |: choose either the left or the right expression
  Kleene star *: repeat strings from an expression
  ε, the empty string, and every letter of the alphabet
  concatenation, for sequencing expressions
  parentheses, for grouping
Example.
  a* indicates 0 or more a's.
  yes | no is the language with just the 2 given strings.
  (0 | 1)* indicates the set of binary numerals.

Regular Expressions: More Examples
  0 | (1(0 | 1)*) is the set of binary numerals with no leading zeros.
  (a | b)* c (a | b)* is the set of strings over {a, b, c} with just one c.
  (0 | 10*10*)* is the language of bit-strings that have an even number of ones. (Alternatively 0*(10*10*)*.)
  (x* | y*)(z(x* | y*))* is the set of strings over {x, y, z} with no x and y adjacent.
  1 | (0 (ε | (.(0 | 1)*1))) is the set of binary fractional numerals between 0 and 1 with no trailing zeroes (e.g. 0.1, 0.110011 but not .1 or 0.10).
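Some of these examples can be tried out with a practical regular expression library. The sketch below uses Python's re module with re.fullmatch, so that the whole string has to match; the translations into Python syntax are assumptions of this sketch (the dot is escaped, and the alternative "ε | ..." is written with the optional operator ?), and the constant names are made up.

```python
import re

NO_LEADING_ZEROS = r"0|(1(0|1)*)"
ONE_C = r"(a|b)*c(a|b)*"
EVEN_ONES = r"0*(10*10*)*"
FRACTION = r"1|(0(\.(0|1)*1)?)"

def denotes(pattern, word):
    """True iff `word` belongs to the language described by `pattern`."""
    return re.fullmatch(pattern, word) is not None

print(denotes(NO_LEADING_ZEROS, "101"), denotes(NO_LEADING_ZEROS, "01"))
print(denotes(FRACTION, "0.110011"), denotes(FRACTION, "0.10"))
```

Library regular expressions offer many extra features (such as backreferences) that go beyond the regular expressions defined here; the patterns above use only alternation, concatenation, star and the optional operator.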

The Definition of Regular Expressions
Key Concept.
  Regular expressions are purely syntactical, just like formulae.
  But: every expression denotes a set of strings; this is its meaning.
Definition. The regular expressions over alphabet Σ and the sets that they denote are:
  ∅ is a regular expression and denotes the empty set
  ε is a regular expression and denotes the set {ε}
  for each a ∈ Σ, a is a regular expression and denotes the set {a}
If α and β are regular expressions denoting languages R and S respectively, then:
  α | β denotes R ∪ S
  α β denotes RS, which is {xy | x ∈ R ∧ y ∈ S}
  α* denotes R*, i.e. the set of concatenations of finitely many r_i ∈ R
Note. R* is (inductively) defined as {ε} ∪ RR*.

Regular Expressions and FSAs
Key Insight. Regular expressions and NFAs / DFAs are equivalent.
  For every DFA A, there is a regular expression r with L(A) = L(r).
  For every regular expression r, there is a DFA A with L(r) = L(A).
  So the power of NFAs / DFAs is completely described by regular expressions.
Q. Can we compute more than what can be described by regular expressions?

From Regular Expressions to NFAs
Extra Ingredient: Spontaneous Transitions.
  NFAs that may change state without consuming a symbol.
  NFAs of this kind are called NFAs with ε-transitions.
  We can convert NFAs with ε-transitions to (standard) NFAs (so there is no more expressive power; we don't cover this translation).
Formal Definition. An NFA with ε-transitions is an NFA, but the transition relation has the form
  R ⊆ S × (Σ ∪ {ε}) × S
  (cf. NFAs with transition relation R ⊆ S × Σ × S)
R(s, ε, s') signifies a spontaneous transition (without reading an input symbol).

Regular Expressions to NFAs
Key Insight.
  Regular expressions are an inductively defined structure, e.g. representable by an inductive data type in Haskell.
  As a consequence, we can give an inductive definition of the corresponding automaton.
Construction. (start state on the left, final state on the right)
  When the regular expression is a symbol a of the alphabet (language is {a}), the automaton is: [diagram: start state, a-labelled edge, final state]
  When the regular expression is ε (language is {ε}), the automaton is: [diagram: start state, ε-labelled edge, final state]
  When the regular expression is ∅ (language is ∅), the automaton has no edges.

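Regular Expressions to NFAs, ctd
Suppose the NFA corresponding to some R is: [diagram: a box labelled R between its start state and its final state]
Then the NFAs corresponding to composite regular expressions are defined as follows:
[Diagrams: the NFAs for R_1 | R_2, R_1 R_2 and R*, each built from the NFAs for R_1, R_2 and R by adding ε-transitions]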

Example
Given the regular expression for binary numerals without leading zeros, (0 | 1(0 | 1)*), the above algorithm gives this NFA.
[Transition diagram: an NFA with ε-transitions and edges labelled 0 and 1]
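The inductive construction can be sketched as a recursive function over a small expression data type. This is a standard Thompson-style construction and may differ in detail from the diagrams on the slides, which did not survive; the tuple encoding of expressions, the use of None as the ε label and all function names are assumptions of this sketch.

```python
from itertools import count

_fresh = count()   # supply of fresh state names

def build(regex, edges):
    """Add an NFA with ε-transitions for `regex` to `edges`.

    Expressions are tuples: ("sym", a), ("eps",), ("empty",),
    ("alt", r1, r2), ("cat", r1, r2), ("star", r).
    Edges are triples (source, label, target); label None means ε.
    Returns the (start, final) states of the new automaton.
    """
    start, final = next(_fresh), next(_fresh)
    kind = regex[0]
    if kind == "sym":
        edges.add((start, regex[1], final))
    elif kind == "eps":
        edges.add((start, None, final))
    elif kind == "empty":
        pass                                  # no edges: nothing is accepted
    elif kind == "alt":
        for sub in regex[1:]:
            s, f = build(sub, edges)
            edges.update({(start, None, s), (f, None, final)})
    elif kind == "cat":
        s1, f1 = build(regex[1], edges)
        s2, f2 = build(regex[2], edges)
        edges.update({(start, None, s1), (f1, None, s2), (f2, None, final)})
    elif kind == "star":
        s, f = build(regex[1], edges)
        edges.update({(start, None, s), (f, None, s),
                      (f, None, final), (start, None, final)})
    return start, final

def closure(states, edges):
    """All states reachable from `states` using ε-transitions only."""
    result, todo = set(states), list(states)
    while todo:
        p = todo.pop()
        for (src, label, dst) in edges:
            if src == p and label is None and dst not in result:
                result.add(dst)
                todo.append(dst)
    return result

def matches(regex, word):
    """Run the constructed NFA on `word`, tracking the set of possible states."""
    edges = set()
    start, final = build(regex, edges)
    current = closure({start}, edges)
    for x in word:
        step = {dst for (src, label, dst) in edges if src in current and label == x}
        current = closure(step, edges)
    return final in current

# Binary numerals without leading zeros: 0 | 1(0 | 1)*
BIT = ("alt", ("sym", "0"), ("sym", "1"))
NUMERAL = ("alt", ("sym", "0"), ("cat", ("sym", "1"), ("star", BIT)))
print(matches(NUMERAL, "1010"), matches(NUMERAL, "01"))
```

The matcher tracks the set of states the NFA could be in, which is the same idea as the DFA construction, applied on the fly.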

Summary
Starting Point. Finite Automata
  motivated by computers having (only) finite memory
  solving simple problems: is string s accepted?
Limitations of Finite Automata
  e.g. cannot recognise L = {a^n b^n | n ≥ 0}
Characterisation of Expressive Power
  can go back and forth between automata and regular expressions
Q. Are finite automata a good model of computation?
  If yes, why?
  If not, why not? What is missing?

Literature
Introduction to Automata Theory, Languages, and Computation, by Hopcroft, Motwani, and Ullman. A classic text that has been re-worked from a standard textbook.
Introduction to the Theory of Computation, by Michael Sipser. The part on Automata and Languages covers (more than) what we have discussed here.