Lecture 4: Finite Automata

Similar documents
Lecture 3: Finite Automata

Lecture 3: Finite Automata

Outline. CS21 Decidability and Tractability. Machine view of FA. Machine view of FA. Machine view of FA. Machine view of FA.

CS 154, Lecture 3: DFA NFA, Regular Expressions

CS243, Logic and Computation Nondeterministic finite automata

Deterministic Finite Automaton (DFA)

Takeaway Notes: Finite State Automata

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Discrete Mathematics. CS204: Spring, Jong C. Park Computer Science Department KAIST

CS21 Decidability and Tractability

Compilers. Lexical analysis. Yannis Smaragdakis, U. Athens (original slides by Sam

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

(Refer Slide Time: 0:21)

Decision, Computation and Language

CSE 105 Theory of Computation Professor Jeanne Ferrante

2017/08/29 Chapter 1.2 in Sipser Ø Announcement:

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

Sri vidya college of engineering and technology

Fooling Sets and. Lecture 5

Finite Automata. Finite Automata

CS5371 Theory of Computation. Lecture 5: Automata Theory III (Non-regular Language, Pumping Lemma, Regular Expression)

Nondeterministic Finite Automata

Outline. CS21 Decidability and Tractability. Regular expressions and FA. Regular expressions and FA. Regular expressions and FA

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

CS21 Decidability and Tractability

CSE 105 THEORY OF COMPUTATION

Lecture 3: Nondeterministic Finite Automata

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

CS 154 Introduction to Automata and Complexity Theory

Deterministic Finite Automata (DFAs)

Theory of Computation Lecture 1. Dr. Nahla Belal

Nondeterministic finite automata

September 11, Second Part of Regular Expressions Equivalence with Finite Aut

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS 154. Finite Automata, Nondeterminism, Regular Expressions

1 More finite deterministic automata

Theory of Computation (II) Yijia Chen Fudan University

Computational Models - Lecture 4

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Theory of computation: initial remarks (Chapter 11)

Finite Automata and Formal Languages

Proofs, Strings, and Finite Automata. CS154 Chris Pollett Feb 5, 2007.

CISC4090: Theory of Computation

Theory of computation: initial remarks (Chapter 11)

Chapter 6. Properties of Regular Languages

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

UNIT II REGULAR LANGUAGES

Deterministic Finite Automata (DFAs)

What we have done so far

Deterministic Finite Automata (DFAs)

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata

Intro to Theory of Computation

Computational Models: Class 3

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Finite Automata Part Two

Languages, regular languages, finite automata

Structural Induction

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

CSE 105 THEORY OF COMPUTATION

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

Chapter Five: Nondeterministic Finite Automata

Johns Hopkins Math Tournament Proof Round: Automata

Lecture 4: More on Regexps, Non-Regular Languages

Chapter 2 The Pumping Lemma

Finite Automata and Formal Languages TMV026/DIT321 LP4 2012

CSE 105 THEORY OF COMPUTATION

Foundations of Informatics: a Bridging Course

COMP4141 Theory of Computation

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

3515ICT: Theory of Computation. Regular languages

Foundations of

CSE 105 THEORY OF COMPUTATION

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

CMSC 330: Organization of Programming Languages

CSC236 Week 10. Larry Zhang

Theory of Computation

Computational Models - Lecture 3

Deterministic Finite Automata (DFAs)

Automata: a short introduction

Regular Languages and Finite Automata

CS 322 D: Formal languages and automata theory

Lexical Analysis Part II: Constructing a Scanner from Regular Expressions

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Nondeterministic Finite Automata

Please give details of your calculation. A direct answer without explanation is not counted.

Deterministic finite Automata

Computer Sciences Department

Nondeterministic Finite Automata

Theory of Languages and Automata

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

Transcription:

Administrivia Lecture 4: Finite Automata Everyone should now be registered electronically using the link on our webpage. If you haven t, do so today! I dliketohaveteamsformedbyfriday,ifpossible,butnextmonday at the latest. First homework due next Wednesday(4 Feb); will be released Wednesday, 28 Jan. evening. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 1

An Alternative Style for Describing Languages Rather than giving a single pattern, we can give a set of rules of the forml A : α 1 α 2 α n, n 0, where A is a symbol that is intended to stand for a language (set of strings) a metavariable or nonterminal symbol. Each α i is either a literal character (like "a") or a nonterminal symbol. The interpretation of this rule is One way to form a string in L(A) (the language denoted by A) is to concatenate one string each from L(α 1 ), L(α 2 ),... (where L("c") is just the language {"c"}). This is Backus-Naur Form (BNF). A set of rules is a grammar. One of the nonterminals is designated as its start symbol denoting the language described by the grammar. Aside: You ll see that : written many different ways, such as ::=,, etc. We ll just use the same notation our tools use. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 2

Some Abbreviations The basic form from the last slide is good for formal analysis, but not for writing. So, we can allow some abbreviations that are obviously exandable into the basic forms: Abbreviation A : R 1 A : R 1 R n. A : R n A : (R) Meaning B : R A : B A : "c 1 " "c n " [c 1 c n ] (likewise other character classes) Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 3

Some Technicalities From the definition, each nonterminal in a grammar defines a language. Often, we are interested in just one of them (the start symbol), and the others are auxiliary definitions. The definition of what a rule means ( One way to form a string in L(A) is... ) leaves open the possibility that there are other ways to form items in L(A) than covered in the rule. We need that freedom in order to allow multiple rules for A, but we don t really want to include strings that aren t covered by some rule. So precise mathematical definitions throw in sentences like: A grammar defines the minimal languages that contain all strings that satisfy the rules. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 4

A Big Restriction (for now) For the time being, we ll also add a restriction. In each rule: A : α 1 α 2 α n, n 0, we ll require that if α i is a nonterminal symbol, then either All the rules for that symbol have to occured before all the rules for A, or i = n (i.e., is the last item) and α n is A. We call such a restricted grammar a Type 3 or regular grammar. The languages definable by regular grammars are called regular languages. Claim: Regular languages are exactly the ones that can be defined by regular expressions. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 5

Proof of Claim (I) Start with a regular expression, R, and make a (possibly not yet valid) rule, R: R Create a new (preceding) rule for each parenthesized expression. This will leave just the constructs X, X+, and X?. What do we do with them? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 6

Proof of Claim (II) Replace construct...... with Q, where R* Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 7

Proof of Claim (II) Replace construct... R*... with Q, where Q : Q : R Q R+ Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 7

Proof of Claim (II) Replace construct... R* R+... with Q, where Q : Q : R Q Q : R Q : R Q R? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 7

Proof of Claim (II) Replace construct... R* R+ R?... with Q, where Q : Q : R Q Q : R Q : R Q Q : Q : R Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 7

Example Consider the regular expression ("+" "-")?("0" "1")+ 1. R: ("+" "-")?("0" "1")+ replace with... 2. Q 1 : "+" "-" Q 2 : "0" "1" R: Q 1? Q 2 + replace with... 3. Q 3 : Q 1 Q 4 : Q 2 Q 2 Q 4 R: Q 3 Q 4 Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 8

Side Note: The Empty Language The grammar for the empty language is a bit non-intuitive: Q: Q Any set of strings, Q, satisfies this rule. Hence,by theimplicitrule thatwe choosethesmallest solution that satisfies all rules, Q represents the empty set. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 9

Classical Pattern-Matching Implementation For compilers, can generally make do with classical regular expressions. Implementable using finite(-state) automata or FAs. ( Finite state = finite memory ). Classical construction: regular expression nondeterministic FA (NFA) deterministic FA (DFA) table-driven program. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 10

Review: FA operation AFAisagraphwhosenodesarestates(ofmemory)andwhoseedges are state transitions. There are a finite number of nodes. One state is the designated start state. Some subset of the nodes are final states. Each transition is labeled with a set of symbols (characters, etc.) or. A FA recognizes a string c 1 c 2 c n if there is a path (sequence of edges) from the start state to a final state such that the labels of the edges in sequence, aside from edges, respectively contain c 1, c 2,...,c n. If the edges leaving any node have disjoint sets of characters and if there are no nodes, FA is a DFA, else an NFA. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 11

Example: What does this DFA recognize? 0 0 0 0 0 0 1 1 1 1 1 1 What is the simplest equivalent NFA you can think of? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 12

Example: What does this DFA recognize? 0 0 0 0 0 0 1 1 1 1 1 Bit strings with # of 1 s divisible by 2 or 3. 1 What is the simplest equivalent NFA you can think of? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 12

Example: What does this DFA recognize? 0 0 0 0 0 0 1 1 1 1 1 Bit strings with # of 1 s divisible by 2 or 3. 1 What is the simplest equivalent NFA you can think of? 0 0 1 0 1 0 0 1 1 1 Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 12

Example: What does this NFA recognize? [A-Z] A B C D A B D What is the simplest equivalent DFA you can think of? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 13

Example: What does this NFA recognize? [A-Z] A B C D A B D What is the simplest equivalent DFA you can think of? Strings of capitals ending in ABCDABD. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 13

Example: What does this NFA recognize? [A-Z] A B C D A B D What is the simplest equivalent DFA you can think of? Strings of capitals ending in ABCDABD. [B-Z] A 0 A A B B C C D A B D A A 0 A A 0 0 A A 0 A C C A 0 A (Edges without labels mean any character not covered by another edge. ) 0 Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 13

Example: What does this NFA recognize? X Y [XY] Z [XY] What is the simplest equivalent DFA you can think of? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 14

Review: Classical Regular Expressions to NFAs (I) a a R 1 R 2 R 1 R 2 Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 15

Review: Classical Regular Expressions to NFAs (II) R 1 R 1 R 2 R 2 R R Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 16

Extensions? How would you translate φ(the empty language, containing no strings) into an FA? How could you translate R? into an NFA? How could you translate R+ into an NFA? How could you translate R 1 R 2 R n into an NFA? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 17

Example of Conversion How would you translate ((ab)* c)* into an NFA (using the construction above)? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 18

Example of Conversion How would you translate ((ab)* c)* into an NFA (using the construction above)? a b c Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 18

Example of Conversion How would you translate ((ab)* c)* into an NFA (using the construction above)? a b c Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 18

Example of Conversion How would you translate ((ab)* c)* into an NFA (using the construction above)? a b c Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 18

Example of Conversion How would you translate ((ab)* c)* into an NFA (using the construction above)? a b c Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 18

Abstract Implementation of NFAs 0 1 X 2 Y 3 4 [XY] 5 Z 6 0 1 X 2 Y 3 4 [XY] 5 Z 6 0 [XY] 1 X 2 Y 3 4 [XY] 5 Z 6 0 [XY] 1 X 2 Y 3 4 [XY] 5 Z 6 0 [XY] 1 X 2 Y 3 4 [XY] 5 Z 6 [XY] String: XYYZ [XY] Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 19

Review: Converting to DFAs OBSERVATION: The set of states that are marked (colored red) changes with each character in a way that depends only on the set and the character. In other words, machine on previous slide acted like this DFA: 014 X 25 Y X 5 Y Z 35 [XY] Z 6 [XY] Z Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 20

DFAs as Programs Can realize DFA in program with control structure: state = INITIAL; for (s = input; *s!= \0 ; s += 1) { switch (state): case INITIAL: if (*s == a ) state = A_STATE; break; case A_STATE: if (*s == b ) state = B_STATE; else state = INITIAL; break;... } } return state == FINAL1 state == FINAL2; Or with data structure (table driven): state = INITIAL; for (s = input; *s!= \0 ; s += 1) state = transition[state][s]; return isfinal[state]; Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 21

What Flex Does Flex program specification is giant regular expression of the form R 1 R 2 R n, where none of the R i match. Each final state labeled with some action. Converted, by previous methods, into a table-driven DFA. But, this particular DFA is used to recognize prefixes of the (remaining) input: initial portions that put machine in a final state. Which final state(s) we end up in determine action. To deal with multiple actions: Match longest prefix ( maximum munch ). If there are multiple matches, apply first rule in order. Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 22

How Do They Do It? How can we use a DFA to recognize longest match? How can we use DFA to act on first of equal-length matches? How can we use a DFA to handle the R 1 /R 2 pattern (matches just R 1 but only if followed by R 2, like R 1 (?=R 2 ) in Python)? Last modified: Mon Feb 2 10:06:22 2015 CS164: Lecture #4 23