Tasks of lexer. CISC 5920: Compiler Construction Chapter 2 Lexical Analysis. Tokens and lexemes. Buffering

CISC 5920: Compiler Construction
Chapter 2: Lexical Analysis
Arthur G. Werschulz
Fordham University, Department of Computer and Information Sciences
Copyright Arthur G. Werschulz. All rights reserved.
Spring, 2017

Tasks of lexer

Scan source code and convert it into tokens. Example:
  if (big < x[i]) big = x[i];
becomes
  KEYWORD LPAREN ID LT ID LBRACK ID RBRACK RPAREN ID GETS ID LBRACK ID RBRACK SEMI
Remove comments
Case conversion (where applicable)
Remove white space (Fortran is a special case)
Interpret compiler directives
Communicate with symbol table
Prepare output listing

Tokens and lexemes

Go back to
  if (big < x[i]) big = x[i];
For parsing purposes:
  All identifiers are the same.
  All relational operators are the same.
But eventually we must distinguish them.
Distinguish between tokens and lexemes:
  Token:   id  relop  (  )  if  then  =  [  ]
  Lexeme:  id  <      (  )  if  then  =  [  ]  x  i

Buffering

Lexer may need to back up when scanning input data. How?
Read text chunks into a buffer array, divided in half:
  | if (big < x[i | ] big =       |
After the first half has been processed (the buffer wraps around):
  | x[i] ;        | ] big =       |
Only refresh one half if you're really done with it: "for" may be the beginning of "fork". Typically, you know you are done by seeing a separator (whitespace, punctuation).
Need two indices for the buffer: beginning of lexeme, current char.
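The token/lexeme distinction can be sketched in a few lines of C++. This is only an illustration for the single example statement, not the course's lexer: the `Token` struct, the `tokenize` function, and the `UNKNOWN` kind are hypothetical names, and only the token names (KEYWORD, LPAREN, ID, ...) come from the slide.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// A token pairs the class the parser sees with the exact source text.
struct Token {
    std::string kind;    // token, e.g. "ID"
    std::string lexeme;  // lexeme, e.g. "big"
};

// Hypothetical helper: tokenizes only the tiny subset used in the example.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }   // white space is dropped
        if (std::isalpha(c)) {                     // identifier or keyword
            std::size_t start = i;
            while (i < src.size() &&
                   std::isalnum(static_cast<unsigned char>(src[i]))) ++i;
            std::string text = src.substr(start, i - start);
            out.push_back({text == "if" ? "KEYWORD" : "ID", text});
            continue;
        }
        std::string kind;
        switch (src[i]) {
            case '(': kind = "LPAREN"; break;
            case ')': kind = "RPAREN"; break;
            case '[': kind = "LBRACK"; break;
            case ']': kind = "RBRACK"; break;
            case '<': kind = "LT";     break;
            case '=': kind = "GETS";   break;
            case ';': kind = "SEMI";   break;
            default:  kind = "UNKNOWN"; break;     // not part of the example
        }
        out.push_back({kind, std::string(1, src[i])});
        ++i;
    }
    return out;
}
```

Calling `tokenize("if (big < x[i]) big = x[i];")` yields sixteen pairs; the three ID tokens at positions 2, 4, and 6 carry the distinct lexemes `big`, `x`, and `i`, so a check such as `assert(t[2].lexeme == "big")` holds.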

Finite-state automata (FSAs)

Can use regular expressions to define lexemes, for instance:
  identifier: l(l | d)*
  integer constant: (ɛ | + | −)dd*
  floating-point number: (ɛ | + | −)(0 d ).(0 d
Finite-state automaton: a system with a finite set of states and rules for state transition upon inputs.
Game plan: RE-based lexeme description → FSA to recognize lexemes → code for recognizing lexemes

State diagrams and state tables for FSAs

Example: FSA for a 10¢ candy machine
  Inputs: s (select), n (nickel), d (dime), q (quarter)
  States: 0 (no money), 1 (5¢), 2 (10¢), 3 (overpayment)
Can draw a digraph (nodes are states, edges are transitions).
Can represent via a state table:
  [State table: one row per current state, one column per input n, d, q, s; a marked entry means "give candy and change".]
How about an FSA for an identifier?

Formal definition of FSA

A (deterministic) finite state machine M is given by a quintuple M = (Σ, Q, q_0, F, N), where
  Σ is a finite set (alphabet) of input symbols
  Q is a finite set of states
  q_0 ∈ Q: start state
  F ⊆ Q: set of accepting or final states
  N : Q × Σ → Q is the state-transition function
Interpretation: N(q_i, x) = q_j means that if M is in state q_i and the current input is x, the next state is q_j.
Can represent N by the state table (one row per state).

Formal definition of FSA (cont'd)

Example: Σ = {a, b}, Q = {1, 2, 3}, q_0 = 1, F = {3}
  [State table for N with columns a and b]
Example: Σ = {a, b}, Q = {1, 2, 3, 4}, q_0 = 1, F = {3}
  [State table for N with columns a and b]
State 4 is unreachable (can remove without loss of generality).
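One possible answer to the identifier question, written as a quintuple in code. The sketch assumes an identifier is a letter followed by any number of letters or digits; the state numbering (0 = start, 1 = inside an identifier, 2 = dead state) and the names `column_of`, `N`, and `accepts` are illustrative choices, not taken from the slides.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Sigma is collapsed into three column classes: letter, digit, anything else.
enum Column { LETTER = 0, DIGIT = 1, OTHER = 2 };

Column column_of(char ch) {
    unsigned char c = static_cast<unsigned char>(ch);
    if (std::isalpha(c)) return LETTER;
    if (std::isdigit(c)) return DIGIT;
    return OTHER;
}

// M = (Sigma, Q, q0, F, N) with Q = {0, 1, 2}, q0 = 0, F = {1}.
const int START = 0;
const std::set<int> FINAL = {1};
const int N[3][3] = {
    /* 0: start         */ {1, 2, 2},
    /* 1: in identifier */ {1, 1, 2},
    /* 2: dead          */ {2, 2, 2},
};

// Runs M on w and reports whether the state reached is accepting.
bool accepts(const std::string& w) {
    int state = START;
    for (char ch : w) state = N[state][column_of(ch)];
    return FINAL.count(state) > 0;
}
```

With this table `accepts("x1")` is true, while `accepts("1x")` and `accepts("")` are false, since a leading digit leads to the dead state and the start state is not accepting.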

Acceptance

w = x_0 x_1 ... x_{n−1} ∈ Σ* is accepted by M if we have
  q_{i+1} = N(q_i, x_i)   (0 ≤ i ≤ n − 1)
with q_n ∈ F.
Example (previous FSA): Is abab accepted? Is ababa accepted?
L(M) = { w ∈ Σ* : w is accepted by M } ... the language accepted by M
Machines M_i and M_j are equivalent if L(M_i) = L(M_j).

Use FSA for recognizing tokens

Example: If there is one FSA per token, then L(M) = { lexemes corresponding to M's token }.

Acceptance (cont'd)

Coding an FSA? Suppose that
  char table[n_states][n_symbols];
and that
  int char_to_column(char ch);
gives the column number for character ch. Then
  state = 0;
  for (int i = 0; i < w.size(); i++)
      state = table[state][char_to_column(w[i])];
computes the state that M reaches for the input string w.

Non-deterministic finite-state automata

A nondeterministic finite state machine M is given by a quintuple M = (Σ, Q, q_0, F, N), where
  Σ is a finite set (alphabet) of input symbols
  Q is a finite set of states
  q_0 ∈ Q: start state
  F ⊆ Q: set of accepting or final states
  N : Q × (Σ ∪ {ɛ}) → P(Q) is the nondeterministic state-transition function

NFAs (cont'd)

How does an NFA differ from a DFA?
  State can change without reading a character.
  Transition can be to a set of states (i.e., more than one state).
Why?
  It's easier to build an NFA to recognize REs than a DFA.
  It's straightforward to convert an NFA into a DFA.
Example: NFA with state table
        a        b
  1   {1,2}    {3}
  2   {1}      {2,3}
  3   {1,2}    {3}
Does it accept aab?
Only need one good path to accept, but all paths must be bad to reject.
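The "one good path" rule can be checked mechanically by tracking the set of all states the NFA could be in after each input character. The sketch below does this for the example table; it assumes start state 1, and because the slide does not state F for this example, the accepting set is a parameter. The names `StateSet`, `run`, and `accepts` are illustrative.

```cpp
#include <cassert>
#include <set>
#include <string>

using StateSet = std::set<int>;

// N(q, x) for the example NFA: row q-1, column 0 = a, column 1 = b.
const StateSet N[3][2] = {
    {{1, 2}, {3}},
    {{1},    {2, 3}},
    {{1, 2}, {3}},
};

// Returns every state reachable on input w, starting from `current`.
StateSet run(const std::string& w, StateSet current = {1}) {
    for (char ch : w) {
        assert(ch == 'a' || ch == 'b');
        StateSet next;
        for (int q : current) {
            const StateSet& targets = N[q - 1][ch == 'a' ? 0 : 1];
            next.insert(targets.begin(), targets.end());
        }
        current = next;
    }
    return current;
}

// w is accepted if at least one reachable state is accepting.
bool accepts(const std::string& w, const StateSet& finals) {
    for (int q : run(w))
        if (finals.count(q)) return true;
    return false;
}
```

For `aab` the reachable sets are {1} → {1,2} → {1,2} → {2,3}, so the string is accepted whenever F contains state 2 or state 3.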

NFAs: ɛ-transitions

Example: Let M have state table
        a        b        ɛ
  1   {1,2}    {3}      {2,3}
  2   {1}      {2,3}    {1}
  3   {1,2}    {3}      {}
M goes to accept-state 2 if no input!
Only need one good path to accept, but all paths must be bad to reject.

NFAs: Equivalence

Transitions between states: unpredictable.
Transitions between sets of states: predictable.
For any NFA M, we can construct a DFA M′ for which L(M) = L(M′).
The M′-states correspond to sets of M-states.
DFAs and NFAs accept the same languages.
Use? In our proposed workflow
  Tokens → REs → NFA → DFA
Since we're at the end of the chain, we continue working backwards through the chain.

The subset construction

Ken Thompson, AT&T Bell Labs
To start with, suppose that there are no ɛ-transitions.
Given: the NFA M = (Σ, Q, q_0, F, N)
Our DFA M′ = (Σ, Q′, q′_0, F′, N′), where
  Q′ = P(Q)
  q′_0 = [q_0] (use brackets, not braces, for subsets in M′)
  [q_i1, ..., q_in] ∈ F′ iff q_ij ∈ F for some index j
  If N({q_i1, ..., q_in}, x) = {q_k1, ..., q_km}, then N′([q_i1, ..., q_in], x) = [q_k1, ..., q_km].

The subset construction (cont'd)

NFA:
        a        b
  1   {1,2}    {3}
  2   {1}      {2,3}
  3   {1,2}    {3}
DFA:
              a        b
  [1]       [1,2]    [3]
  [2]*      [1]      [2,3]
  [3]       [1,2]    [3]
  [1,2]     [1,2]    [2,3]
  [1,3]*    [1,2]    [3]
  [2,3]     [1,2]    [2,3]
  [1,2,3]*  [1,2]    [2,3]
  []*       []       []
  (*: unreachable state)
Disadvantages:
  |Q′| = 2^|Q|
  unreachable states
Can we do better? Yes! Do one state at a time!
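The full power-set table can be generated mechanically, which is a convenient way to check each row. In this sketch a subset of {1, 2, 3} is stored as a 3-bit mask (bit 0 for state 1, bit 1 for state 2, bit 2 for state 3); the encoding and the names `step` and `name` are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// NFA without epsilon-transitions, as bit masks.
// Row q-1; column 0 = a, column 1 = b.
const unsigned NFA[3][2] = {
    {3, 4},   // 1: a -> {1,2}, b -> {3}
    {1, 6},   // 2: a -> {1},   b -> {2,3}
    {3, 4},   // 3: a -> {1,2}, b -> {3}
};

// N'(S, x): the union of N(q, x) over all q in the subset S.
unsigned step(unsigned subset, int col) {
    assert(subset < 8 && (col == 0 || col == 1));
    unsigned result = 0;
    for (int q = 0; q < 3; ++q)
        if (subset & (1u << q)) result |= NFA[q][col];
    return result;
}

// Renders a mask in bracket notation, e.g. 5 -> "[1,3]".
std::string name(unsigned subset) {
    std::string s = "[";
    for (int q = 0; q < 3; ++q)
        if (subset & (1u << q)) {
            if (s.size() > 1) s += ",";
            s += std::to_string(q + 1);
        }
    return s + "]";
}
```

Looping `subset` from 0 to 7 and printing `name(subset)`, `name(step(subset, 0))`, and `name(step(subset, 1))` reproduces all eight rows; for example, the row for [1,3] comes out as [1,2] and [3].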

The subset construction (cont'd)

Algorithm:
  create start state [q_0]
  Q′ = {[q_0]}
  while (∃ uncompleted row r in table for M′) do
    x = [s_1, ..., s_k] = state for row r
    for a ∈ Σ do
      T = N({s_1, ..., s_k}, a)
      y = [T]
      if y ∉ Q′ then Q′ = Q′ ∪ {y}
      add rule N′(x, a) = y to M′-transition rules
  identify accepting states in M′

The subset construction (cont'd)

        a        b
  1   {1,2}    {3}
  2   {1}      {2,3}
  3   {1,2}    {3}
becomes
            a        b
  [1]     [1,2]    [3]
  [1,2]   [1,2]    [2,3]
  [3]     [1,2]    [3]
  [2,3]   [1,2]    [2,3]

The subset construction (cont'd)

What about handling ɛ-transitions?
For [q_0] and for each new M′-state, also include the ɛ-closure of the state, i.e., the set of all states reachable from said state via ɛ-transitions.
Revised algorithm:
  Q′ = {ɛ-closure([q_0])}
  while (∃ uncompleted row r in table for M′) do
    x = [s_1, ..., s_k] = state for row r
    for a ∈ Σ do
      T = N({s_1, ..., s_k}, a)
      y = ɛ-closure([T])
      if y ∉ Q′ then Q′ = Q′ ∪ {y}
      add rule N′(x, a) = y to M′-transition rules
  identify accepting states in M′

The subset construction (cont'd)

        a        b        ɛ
  1   {1,2}    {3}      {2,3}
  2   {1}      {2,3}    {...}
  3   {1,2}    {3}      {1,2}
becomes
              a          b
  [1,2,3]   [1,2,3]    [1,2,3]
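A minimal sketch of the construction with ɛ-closure, run on the last example. State 2's ɛ-entry is elided on the slide; the code assumes it is empty, which does not change the result because the closure of state 1 already contains every state. The names `closure` and `subset_construction` and the use of `std::set` for DFA states are illustrative choices.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using StateSet = std::set<int>;

// Row q-1; columns: 0 = a, 1 = b, 2 = epsilon.
// Assumption: the elided epsilon-entry of state 2 is treated as empty.
const StateSet NFA[3][3] = {
    {{1, 2}, {3},    {2, 3}},
    {{1},    {2, 3}, {}},
    {{1, 2}, {3},    {1, 2}},
};

// Adds every state reachable through epsilon-transitions alone.
StateSet closure(StateSet states) {
    std::vector<int> work(states.begin(), states.end());
    while (!work.empty()) {
        int q = work.back();
        work.pop_back();
        for (int r : NFA[q - 1][2])
            if (states.insert(r).second) work.push_back(r);
    }
    return states;
}

// Builds only the reachable DFA rows: state -> {target on a, target on b}.
std::map<StateSet, std::vector<StateSet>> subset_construction() {
    std::map<StateSet, std::vector<StateSet>> dfa;
    std::vector<StateSet> todo = {closure({1})};
    while (!todo.empty()) {
        StateSet x = todo.back();
        todo.pop_back();
        if (dfa.count(x)) continue;              // row already completed
        std::vector<StateSet> row;
        for (int col = 0; col < 2; ++col) {
            StateSet t;
            for (int q : x)
                t.insert(NFA[q - 1][col].begin(), NFA[q - 1][col].end());
            row.push_back(closure(t));
        }
        dfa[x] = row;
        for (const StateSet& y : row) todo.push_back(y);
    }
    return dfa;
}
```

The resulting table has a single row, [1,2,3], whose a- and b-entries are both [1,2,3].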

Regular expressions

Lexing, parsing: use tables, rather than customized code.
Building a DFA by hand: difficult, error-prone.
Need a mechanism:
  How to represent a token? Regular expressions.
  Program to turn representation into DFA? Unix lex: regexp → NFA → DFA
Examples of regular expressions:
  b^4 = bbbb
  a^n: n instances of a in a row
  a*: any concatenation of a's (the Kleene star operation)
  b^+: any nonempty concatenation of b's
  ab | bc: ab or bc

Regular expressions (cont'd)

Formal definition: Let Σ be an alphabet. Then:
  x ∈ Σ ⟹ x is a regular expression.
  ɛ is a regular expression.
  If R is a regular expression, then R* is a regular expression.
  If R and S are regular expressions, then RS (sometimes written R·S) and R | S are regular expressions.
  Nothing else is a regular expression.
Here:
  L(RS) = { vw : v ∈ L(R), w ∈ L(S) }
  L(R | S) = L(R) ∪ L(S)
  L(R*) = { w_1 ... w_n : n ≥ 0 and w_1, ..., w_n ∈ L(R) }
  L(R^+) = { w_1 ... w_n : n ≥ 1 and w_1, ..., w_n ∈ L(R) }

Regular expressions (cont'd)

Example: Let Σ = {a, b, ..., z}.
  a ∈ Σ ⟹ a is a regular expression.
  b and c are regular expressions.
  ab is a regular expression.
  (ab | b) is a regular expression.
  (ab | b)* is a regular expression.

Regular expressions (cont'd)

Regular expressions R and S are equivalent if L(R) = L(S).
  L(R) = L(S) ⟺ L(R) ⊆ L(S) ∧ L(S) ⊆ L(R).
Useful equivalences:
  R(ST) = (RS)T
  R | (S | T) = (R | S) | T
  R | S = S | R
  R(S | T) = RS | RT
  R^+ = RR*
  R* = ɛ | R^+
  Rɛ = ɛR = R
  but RS ≠ SR!
If L is a language such that L(E) = L for a regular expression E, then L is a regular language.
Not all languages are regular! However, tokens are regular.
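The equivalences can be spot-checked with the standard `std::regex` library by comparing two patterns on every string over {a, b} up to a fixed length. This is a bounded test, not a proof of equivalence; the helper names `strings_up_to` and `agree_on_samples` are hypothetical, and the patterns use ECMAScript syntax, in which `?` stands in for the "ɛ |" alternative.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// All strings over {a, b} of length <= max_len (a finite sample of Sigma*).
std::vector<std::string> strings_up_to(int max_len) {
    std::vector<std::string> all = {""};
    std::size_t begin = 0;
    for (int len = 1; len <= max_len; ++len) {
        std::size_t end = all.size();
        for (std::size_t i = begin; i < end; ++i) {
            all.push_back(all[i] + "a");
            all.push_back(all[i] + "b");
        }
        begin = end;   // the next round extends only the longest strings
    }
    return all;
}

// True if both patterns accept exactly the same sample strings.
bool agree_on_samples(const std::string& r, const std::string& s, int max_len) {
    const std::regex re_r(r), re_s(s);
    for (const std::string& w : strings_up_to(max_len))
        if (std::regex_match(w, re_r) != std::regex_match(w, re_s))
            return false;
    return true;
}
```

For example, `agree_on_samples("(ab|b)+", "(ab|b)(ab|b)*", 6)` checks R^+ = RR* on 127 strings, while `agree_on_samples("ab", "ba", 2)` returns false, illustrating RS ≠ SR.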

Regular expressions and finite automata

Regular expressions are defined inductively.
Thompson's construction inductively builds an NFA to recognize regular expressions, with one building block for each case:
  recognizes ɛ
  recognizes a ∈ Σ
  recognizes R | S
  recognizes RS
  recognizes R^+
  recognizes R*
Exercise: Design an NFA to recognize a(a | bc)*.

Pumping lemma

Theorem. Let M = (Σ, Q, q_0, F, N) be a finite automaton, with n = |Q|. Then there exists k ≤ n such that the following holds: If w ∈ L(M) with |w| ≥ k, then there exist x, y, z ∈ Σ*, with y ≠ ɛ, such that w = xyz and xy*z ⊆ L(M).
Proof. Let w = w_1 ... w_n. There exist states q_0, ..., q_n ∈ Q such that
  N(q_i, w_{i+1}) = q_{i+1}   (0 ≤ i ≤ n − 1).
Since |Q| = n and the sequence q_0, ..., q_n has n + 1 entries, there exist i, j ∈ {0, ..., n} such that i < j and q_i = q_j.

Pumping lemma (cont'd)

Proof (cont'd). Let
  x = w_1 ... w_i
  y = w_{i+1} ... w_j
  z = w_{j+1} ... w_n
Then y ≠ ɛ and M accepts xy*z.
Now let S be the set of all positive k ∈ Z such that
  w ∈ L(M) ∧ |w| ≥ k ⟹ ∃ x, y, z ∈ Σ* with y ≠ ɛ such that w = xyz ∧ xy*z ⊆ L(M).
S is nonempty since n ∈ S. Hence S has a minimal element k.

Pumping lemma (cont'd)

Example: Let Σ = { (, ) } and let L_() = { (^k )^k : k ∈ N }. We show that L_() is not regular.
Proof. Suppose that L_() = L(M) for some FSA M. Let n = |Q|. Consider (^k )^k for some k ≥ n. Write (^k )^k = xyz where y ≠ ɛ and xy*z ⊆ L_(). Regardless of whether y ∈ (^+, y ∈ )^+, or y = (^l )^m for some l, m > 0, one can find strings in xy*z that are unbalanced, contradicting our assumption that L_() = L(M).
Conclusion: Lexer can't check for balanced parentheses!
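The proof is constructive: run the automaton, find the first state that repeats, and cut the input there. The sketch below does this for a hypothetical 3-state DFA over {a, b} that accepts strings ending in ab; the DFA and the names `pump_split` and `pumped` are assumptions chosen for illustration and do not come from the slides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical DFA: start state 0, accepting state 2, accepts strings ending in "ab".
// Columns: 0 = a, 1 = b.
const int N_STATES = 3;
const int NEXT[N_STATES][2] = {{1, 0}, {1, 2}, {1, 0}};

int column(char ch) {
    assert(ch == 'a' || ch == 'b');
    return ch == 'a' ? 0 : 1;
}

bool accepts(const std::string& w) {
    int state = 0;
    for (char ch : w) state = NEXT[state][column(ch)];
    return state == 2;
}

struct Split { std::string x, y, z; };

// Finds i < j with q_i = q_j and returns x = w_1..w_i, y = w_{i+1}..w_j, z = the rest.
Split pump_split(const std::string& w) {
    std::vector<int> first_seen(N_STATES, -1);   // position at which each state first occurs
    int state = 0;
    first_seen[state] = 0;
    for (std::size_t t = 0; t < w.size(); ++t) {
        state = NEXT[state][column(w[t])];
        int pos = static_cast<int>(t) + 1;       // number of characters read so far
        if (first_seen[state] >= 0) {
            int i = first_seen[state];
            return {w.substr(0, i), w.substr(i, pos - i), w.substr(pos)};
        }
        first_seen[state] = pos;
    }
    return {w, "", ""};   // no repeated state: only possible when |w| < N_STATES
}

// Builds x y^m z.
std::string pumped(const Split& s, int m) {
    std::string out = s.x;
    for (int k = 0; k < m; ++k) out += s.y;
    return out + s.z;
}
```

For w = abab the state sequence is 0, 1, 2, 1, so the split is x = a, y = ba, z = b, and every string a(ba)^m b is again accepted.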

Application to lexical analysis

Let Σ = {a, b, c}. Suppose there are exactly two tokens: X = ab* and Y = (a | c)*.
To build a DFA recognizing X | Y:
  1. Build NFA for X.
  2. Build NFA for Y.
  3. Build NFA M for X | Y.

Application to lexical analysis (cont'd)

NFA for X = ab*: [NFA diagram]
NFA for Y = (a | c)*: [NFA diagram]

Application to lexical analysis (cont'd)

NFA for X | Y = (ab*) | (a | c)*: [NFA diagram]

Application to lexical analysis (cont'd)

We want to build the DFA M′ corresponding to the NFA M:
  Let N and N′ denote the transition functions for M and M′.
  T is the start state for M.
  [T] = [AGHIKNTU] is the start state for M′.

                     Inputs
  State           a               b          c
  [AGHIKNTU]      [BCFHIJKMNU]    [ ]        [HIKLMNU]
  [BCFHIJKMNU]    [HIJKMNU]       [CDEFU]    [HIKLMNU]
  [ ]             [ ]             [ ]        [ ]
  [HIKLMNU]       [HIJKMNU]       [ ]        [HIKLMNU]
  [HIJKMNU]       [HIJKMNU]       [ ]        [HIKLMNU]
  [CDEFU]         [ ]             [CDEFU]    [ ]

Application to lexical analysis (cont'd)

Relabeling the rows of the DFA table as 1 through 6, in order, we get:

           Inputs
  State   a   b   c
  1       2   3   4
  2       5   6   4
  3       3   3   3
  4       5   3   4
  5       5   3   4
  6       3   6   3

State minimization

Use state minimization techniques (Appendix A) to remove equivalent states.
Two states q_i, q_j ∈ Q are equivalent (q_i ≡ q_j) if exactly the same strings lead from each of them to an accepting state: for every w ∈ Σ*, reading w from q_i ends in F iff reading w from q_j ends in F.
Two non-equivalent states are distinguishable (q_i ≢ q_j).
Reduce the machine to one for which all state pairs are distinguishable.
  1. Initially assume all state pairs indistinguishable (until proven otherwise).
  2. Only look at single input symbols:
       ∃ a ∈ Σ, N(q_i, a) ∈ F ∧ N(q_j, a) ∉ F ⟹ q_i ≢ q_j
       ∃ a ∈ Σ, N(q_i, a) ≢ N(q_j, a) ⟹ q_i ≢ q_j

Distinguishability matrix D: an (n − 1) × (n − 1) upper triangular bit matrix with
  d_{i,j} = 1 iff q_i ≢ q_j.
Here, n = |Q|.
Start with at least one pair of distinguishable states, say, q_i ∈ F and q_j ∉ F; we set d_{i,j} = 1 for same.
Consider all unmarked entries. Suppose the states are p and q. Then
  ∀ a ∈ Σ : N(p, a) ∈ F ⟺ N(q, a) ∈ F.
So look at the p-row and q-row in the state table:
  (a) Rows are identical or equivalent: leave entry unmarked.
  (b) Rows differ by known distinguishable states: states are distinguishable.
  (c) Rows differ by states whose distinguishability is unknown: don't know yet.

More on case (c): Suppose we have
  p:  r  t  v
  q:  s  u  w
where we know nothing about {(r, s), (t, u), (v, w)}.
Distinguishability of (p, q) depends on what we later learn about these pairs.
Put (p, q) on a list linked to each pair.
Once we find a pair to be non-equivalent, mark each pair on that list as also being non-equivalent.
Algorithm terminates when all entries are checked: if the row or column for q contains a zero, we've found a state equivalent to q.
The equivalence classes of states are the states of the new machine.
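A compact variant of the marking procedure, applied to the six-state DFA obtained by numbering the rows of the subset-construction table 1 through 6 in order. Two things here are assumptions rather than slide content: state 3 (the empty subset) is taken to be the only non-accepting state, and instead of the linked lists for case (c) the code simply repeats passes over the matrix until nothing changes, which yields the same marks.

```cpp
#include <cassert>
#include <vector>

const int N_STATES = 6, N_INPUTS = 3;

// Row i is state i+1; columns are inputs a, b, c; entries are 1-based states.
const int NEXT[N_STATES][N_INPUTS] = {
    {2, 3, 4}, {5, 6, 4}, {3, 3, 3}, {5, 3, 4}, {5, 3, 4}, {3, 6, 3},
};

// Assumption: only state 3 is non-accepting.
const bool FINAL[N_STATES] = {true, true, false, true, true, true};

// d[i][j] is true iff states i+1 and j+1 are distinguishable.
std::vector<std::vector<bool>> distinguishable() {
    std::vector<std::vector<bool>> d(N_STATES, std::vector<bool>(N_STATES, false));
    // Initial marks: one state accepting, the other not.
    for (int i = 0; i < N_STATES; ++i)
        for (int j = 0; j < N_STATES; ++j)
            d[i][j] = FINAL[i] != FINAL[j];
    // Propagate: (p, q) is marked once some input leads to a marked pair.
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < N_STATES; ++i)
            for (int j = i + 1; j < N_STATES; ++j) {
                if (d[i][j]) continue;
                for (int a = 0; a < N_INPUTS; ++a) {
                    if (d[NEXT[i][a] - 1][NEXT[j][a] - 1]) {
                        d[i][j] = d[j][i] = true;
                        changed = true;
                        break;
                    }
                }
            }
    }
    return d;
}
```

Under these assumptions the only pair left unmarked is (4, 5), so states 4 and 5 can be merged.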

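Go back to original problem

  [relabeled state table with inputs and states]
Then
  [initial distinguishability matrix]
Since 3 ≢ 6 ⟹ 1 ≢ 2, we now have
  [updated matrix]
Since 2 ? 5, we have
  [updated matrix]
  (2, 5) → (1, 4)
Since 2 ? 4, we also have
  [updated matrix]
  (2, 5) → (1, 4), (1, 5)
So 3 ≢ 6 ⟹ 1 ≢ 6. Hence
  [updated matrix]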

So 6 ≢ 3 ⟹ 2 ≢ 4. Hence
  [updated matrix]
So 6 ≢ 3 ⟹ 2 ≢ 5. Since (2, 5) → (1, 4), (1, 5), we now have 1 ≢ 4 and 1 ≢ 5. Hence
  [updated matrix]
So 5 ≢ 3 ⟹ 2 ≢ 6, and we now have
  [updated matrix]
So 4 ≡ 5, and so we now have
  [updated matrix]

Examining the remaining unmarked pairs (4, 6) and (5, 6) in like manner, we find that 4 ≢ 6 and 5 ≢ 6. So we finally have
  [final distinguishability matrix]
We can delete either state 4 or state 5. Let's delete state 5, leaving us with states 1, 2, 3, 4, and 6:

           Inputs
  State   a   b   c
  1       2   3   4
  2       4   6   4
  3       3   3   3
  4       4   3   4
  6       3   6   3

Finally, relabel state 6 as state 5, getting

           Inputs
  State   a   b   c
  1       2   3   4
  2       4   5   4
  3       3   3   3
  4       4   3   4
  5       3   5   3

Recognizing tokens

Modifications to the basic FSA for lexing program source:
  1. Ignore whitespace, except when it delimits a token. Use an extra state for this.
  2. Whenever we reach an accepting state, announce a token. Don't enter an accepting state until the entire token is read.
  3. Exactly one accepting state per token; the state identifies the token type.
  4. Treat keywords as identifiers, but do a table lookup in the symbol table.
Other considerations:
  Some tokens are prefixes of others.
  Can't recognize an identifier until past its end. May need to back up one character.
  Comments: may have multi-character delimiters (for instance, /* ... */ or // ...).
  Quotation marks within quotes (\" vs. "").

Stripped-down Pascal lexer

Need to identify the following tokens:
  Identifiers, constants, labels
  Keywords (such as for)
  Simple operators (such as <)
  Compound operators (such as <=)
  Multi-character tokens whose prefix is a token
  Comment syntax. Turbo Pascal allows { ... } as well as (* ... *).
  Compiler directives (analogous to #include)
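Several of these considerations (keyword lookup, backing up one character after a prefix such as `<`, and the two comment forms) fit into a short hand-written scanner. The sketch below is an illustration for a handful of Pascal-like tokens only; the keyword list, the token names (LE, NE, ASSIGN, LT, COLON, SYMBOL), and the function `scan` are hypothetical, and a real lexer would consult the symbol table instead of a fixed set.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>
#include <vector>

struct Token { std::string kind, lexeme; };

// Hypothetical keyword table standing in for the symbol-table lookup.
const std::set<std::string> KEYWORDS = {"for", "to", "do", "if", "then", "begin", "end"};

std::vector<Token> scan(const std::string& s) {
    std::vector<Token> out;
    std::size_t cp = 0;                                   // current character
    while (cp < s.size()) {
        std::size_t start = cp;                           // beginning of lexeme
        unsigned char c = static_cast<unsigned char>(s[cp]);
        if (std::isspace(c)) { ++cp; continue; }
        if (c == '{') {                                   // { ... } comment
            while (cp < s.size() && s[cp] != '}') ++cp;
            if (cp < s.size()) ++cp;
            continue;
        }
        if (c == '(' && cp + 1 < s.size() && s[cp + 1] == '*') {   // (* ... *) comment
            cp += 2;
            while (cp + 1 < s.size() && !(s[cp] == '*' && s[cp + 1] == ')')) ++cp;
            cp = (cp + 1 < s.size()) ? cp + 2 : s.size();
            continue;
        }
        if (std::isalpha(c)) {                            // identifier, then keyword lookup
            while (cp < s.size() && std::isalnum(static_cast<unsigned char>(s[cp]))) ++cp;
            std::string text = s.substr(start, cp - start);
            out.push_back({KEYWORDS.count(text) ? "KEYWORD" : "ID", text});
            continue;
        }
        if (std::isdigit(c)) {
            while (cp < s.size() && std::isdigit(static_cast<unsigned char>(s[cp]))) ++cp;
            out.push_back({"INT", s.substr(start, cp - start)});
            continue;
        }
        ++cp;                                             // consume the operator's first character
        std::string kind = "SYMBOL";
        if (c == '<' || c == ':') {
            char next = cp < s.size() ? s[cp] : '\0';
            ++cp;                                         // tentatively read one more character
            if (c == '<' && next == '=') kind = "LE";
            else if (c == '<' && next == '>') kind = "NE";
            else if (c == ':' && next == '=') kind = "ASSIGN";
            else { --cp; kind = (c == '<') ? "LT" : "COLON"; }   // back up one character
        }
        out.push_back({kind, s.substr(start, cp - start)});
    }
    return out;
}
```

On the input `if a <= b then x := y (* note *) { more } < 10` this produces ten tokens; `<=` comes out as a single LE token, while the later `<` is recognized as LT only after the scanner has looked at the following blank and backed up.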

Stripped-down Pascal lexer (cont'd)

Coding an FSA

Basic outline of pseudocode:
  repeat
    get next input char
    find new state table entry
    if (new state is final for some token) then
    begin
      isolate token
      pass to parser
      decrement cp if necessary
    end
  until no more input
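The outline can be turned into a small table-driven loop. The sketch below recognizes only identifiers and integer constants; the states, the three character classes, and the trailing sentinel character that flushes the last token are assumptions made for illustration. A final state is entered only after reading one character past the token, so `cp` is decremented before the token is isolated.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

struct Token { std::string kind, lexeme; };

enum State { START, IN_ID, IN_INT, ID_DONE, INT_DONE };
enum Column { LETTER, DIGIT, OTHER };

Column column(char ch) {
    unsigned char c = static_cast<unsigned char>(ch);
    if (std::isalpha(c)) return LETTER;
    if (std::isdigit(c)) return DIGIT;
    return OTHER;
}

// State table: one row per non-final state, one column per character class.
const State TABLE[3][3] = {
    /* START  */ {IN_ID,    IN_INT, START},
    /* IN_ID  */ {IN_ID,    IN_ID,  ID_DONE},
    /* IN_INT */ {INT_DONE, IN_INT, INT_DONE},
};

std::vector<Token> lex(std::string src) {
    src += '\n';                       // sentinel delimiter so the last token is flushed
    std::vector<Token> out;
    State state = START;
    std::size_t begin = 0, cp = 0;     // beginning of lexeme, current char
    while (cp < src.size()) {
        if (state == START) begin = cp;
        state = TABLE[state][column(src[cp])];   // find new state table entry
        ++cp;
        if (state == ID_DONE || state == INT_DONE) {
            --cp;                      // the last character read belongs to what follows
            out.push_back({state == ID_DONE ? "ID" : "INT",
                           src.substr(begin, cp - begin)});   // isolate token
            state = START;
        }
    }
    return out;
}
```

For instance, `lex("x1 42")` returns the tokens (ID, `x1`) and (INT, `42`), and `lex("12ab")` splits into (INT, `12`) followed by (ID, `ab`) because the letter after the digits triggers the back-up.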


More information

Finite Automata. BİL405 - Automata Theory and Formal Languages 1

Finite Automata. BİL405 - Automata Theory and Formal Languages 1 Finite Automata BİL405 - Automata Theory and Formal Languages 1 Deterministic Finite Automata (DFA) A Deterministic Finite Automata (DFA) is a quintuple A = (Q,,, q 0, F) 1. Q is a finite set of states

More information

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont ) CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont ) Sungjin Im University of California, Merced 2-3-214 Example II A ɛ B ɛ D F C E Example II A ɛ B ɛ D F C E NFA accepting

More information

Formal Languages. We ll use the English language as a running example.

Formal Languages. We ll use the English language as a running example. Formal Languages We ll use the English language as a running example. Definitions. A string is a finite set of symbols, where each symbol belongs to an alphabet denoted by Σ. Examples. The set of all strings

More information

Nondeterministic finite automata

Nondeterministic finite automata Lecture 3 Nondeterministic finite automata This lecture is focused on the nondeterministic finite automata (NFA) model and its relationship to the DFA model. Nondeterminism is an important concept in the

More information

1.3 Regular Expressions

1.3 Regular Expressions 51 1.3 Regular Expressions These have an important role in descriing patterns in searching for strings in many applications (e.g. awk, grep, Perl,...) All regular expressions of alphaet are 1.Øand are

More information

Finite Automata Part Two

Finite Automata Part Two Finite Automata Part Two Recap from Last Time Old MacDonald Had a Symbol, Σ-eye-ε-ey, Oh! You may have noticed that we have several letter- E-ish symbols in CS103, which can get confusing! Here s a quick

More information

Nondeterministic Finite Automata

Nondeterministic Finite Automata Nondeterministic Finite Automata Mahesh Viswanathan Introducing Nondeterminism Consider the machine shown in Figure. Like a DFA it has finitely many states and transitions labeled by symbols from an input

More information

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA) Deterministic Finite Automata Non deterministic finite automata Automata we ve been dealing with have been deterministic For every state and every alphabet symbol there is exactly one move that the machine

More information

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013 CMPSCI 250: Introduction to Computation Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013 λ-nfa s to NFA s to DFA s Reviewing the Three Models and Kleene s Theorem The Subset

More information

UNIT-VIII COMPUTABILITY THEORY

UNIT-VIII COMPUTABILITY THEORY CONTEXT SENSITIVE LANGUAGE UNIT-VIII COMPUTABILITY THEORY A Context Sensitive Grammar is a 4-tuple, G = (N, Σ P, S) where: N Set of non terminal symbols Σ Set of terminal symbols S Start symbol of the

More information

Finite Automata Part Two

Finite Automata Part Two Finite Automata Part Two DFAs A DFA is a Deterministic Finite Automaton A DFA is defined relative to some alphabet Σ. For each state in the DFA, there must be exactly one transition defined for each symbol

More information

Examples of Regular Expressions. Finite Automata vs. Regular Expressions. Example of Using flex. Application

Examples of Regular Expressions. Finite Automata vs. Regular Expressions. Example of Using flex. Application Examples of Regular Expressions 1. 0 10, L(0 10 ) = {w w contains exactly a single 1} 2. Σ 1Σ, L(Σ 1Σ ) = {w w contains at least one 1} 3. Σ 001Σ, L(Σ 001Σ ) = {w w contains the string 001 as a substring}

More information

Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions

Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions Andrés Sicard-Ramírez Universidad EAFIT Semester 2018-2 Introduction Equivalences DFA NFA -NFA RE Finite Automata and Regular

More information

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS Automata Theory Lecture on Discussion Course of CS2 This Lecture is about Mathematical Models of Computation. Why Should I Care? - Ways of thinking. - Theory can drive practice. - Don t be an Instrumentalist.

More information

Johns Hopkins Math Tournament Proof Round: Automata

Johns Hopkins Math Tournament Proof Round: Automata Johns Hopkins Math Tournament 2018 Proof Round: Automata February 9, 2019 Problem Points Score 1 10 2 5 3 10 4 20 5 20 6 15 7 20 Total 100 Instructions The exam is worth 100 points; each part s point value

More information

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2 BIJU PATNAIK UNIVERSITY OF TECHNOLOGY, ODISHA Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2 Prepared by, Dr. Subhendu Kumar Rath, BPUT, Odisha. UNIT 2 Structure NON-DETERMINISTIC FINITE AUTOMATA

More information

CSC236 Week 10. Larry Zhang

CSC236 Week 10. Larry Zhang CSC236 Week 10 Larry Zhang 1 Today s Topic Deterministic Finite Automata (DFA) 2 Recap of last week We learned a lot of terminologies alphabet string length of string union concatenation Kleene star language

More information

Theory of Computation (I) Yijia Chen Fudan University

Theory of Computation (I) Yijia Chen Fudan University Theory of Computation (I) Yijia Chen Fudan University Instructor Yijia Chen Homepage: http://basics.sjtu.edu.cn/~chen Email: yijiachen@fudan.edu.cn Textbook Introduction to the Theory of Computation Michael

More information

Theory of Languages and Automata

Theory of Languages and Automata Theory of Languages and Automata Chapter 1- Regular Languages & Finite State Automaton Sharif University of Technology Finite State Automaton We begin with the simplest model of Computation, called finite

More information

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY 5-453 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY NON-DETERMINISM and REGULAR OPERATIONS THURSDAY JAN 6 UNION THEOREM The union of two regular languages is also a regular language Regular Languages Are

More information

CSCE 551 Final Exam, Spring 2004 Answer Key

CSCE 551 Final Exam, Spring 2004 Answer Key CSCE 551 Final Exam, Spring 2004 Answer Key 1. (10 points) Using any method you like (including intuition), give the unique minimal DFA equivalent to the following NFA: 0 1 2 0 5 1 3 4 If your answer is

More information

Formal Languages, Automata and Models of Computation

Formal Languages, Automata and Models of Computation CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 5 School of Innovation, Design and Engineering Mälardalen University 2011 1 Content - More Properties of Regular Languages (RL)

More information

Outline. Nondetermistic Finite Automata. Transition diagrams. A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F)

Outline. Nondetermistic Finite Automata. Transition diagrams. A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F) Outline Nondeterminism Regular expressions Elementary reductions http://www.cs.caltech.edu/~cs20/a October 8, 2002 1 Determistic Finite Automata A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F) Q is a finite

More information

Finite Automata and Formal Languages TMV026/DIT321 LP4 2012

Finite Automata and Formal Languages TMV026/DIT321 LP4 2012 Finite Automata and Formal Languages TMV26/DIT32 LP4 22 Lecture 7 Ana Bove March 27th 22 Overview of today s lecture: Regular Expressions From FA to RE Regular Expressions Regular expressions (RE) are

More information

Let us first give some intuitive idea about a state of a system and state transitions before describing finite automata.

Let us first give some intuitive idea about a state of a system and state transitions before describing finite automata. Finite Automata Automata (singular: automation) are a particularly simple, but useful, model of computation. They were initially proposed as a simple model for the behavior of neurons. The concept of a

More information

Finite Automata. Finite Automata

Finite Automata. Finite Automata Finite Automata Finite Automata Formal Specification of Languages Generators Grammars Context-free Regular Regular Expressions Recognizers Parsers, Push-down Automata Context Free Grammar Finite State

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Theory of Regular Expressions DFAs and NFAs Reminders Project 1 due Sep. 24 Homework 1 posted Exam 1 on Sep. 25 Exam topics list posted Practice homework

More information

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1) CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1) Definition 1 (Alphabet) A alphabet is a finite set of objects called symbols. Definition 2 (String)

More information

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA) Languages Non deterministic finite automata with ε transitions Recall What is a language? What is a class of languages? Finite Automata Consists of A set of states (Q) A start state (q o ) A set of accepting

More information

How do regular expressions work? CMSC 330: Organization of Programming Languages

How do regular expressions work? CMSC 330: Organization of Programming Languages How do regular expressions work? CMSC 330: Organization of Programming Languages Regular Expressions and Finite Automata What we ve learned What regular expressions are What they can express, and cannot

More information

Java II Finite Automata I

Java II Finite Automata I Java II Finite Automata I Bernd Kiefer Bernd.Kiefer@dfki.de Deutsches Forschungszentrum für künstliche Intelligenz November, 23 Processing Regular Expressions We already learned about Java s regular expression

More information

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen Pushdown Automata Notes on Automata and Theory of Computation Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-Sen University Kaohsiung, Taiwan ROC Pushdown Automata p. 1

More information

Automata: a short introduction

Automata: a short introduction ILIAS, University of Luxembourg Discrete Mathematics II May 2012 What is a computer? Real computers are complicated; We abstract up to an essential model of computation; We begin with the simplest possible

More information

Closure under the Regular Operations

Closure under the Regular Operations September 7, 2013 Application of NFA Now we use the NFA to show that collection of regular languages is closed under regular operations union, concatenation, and star Earlier we have shown this closure

More information

Decidability (What, stuff is unsolvable?)

Decidability (What, stuff is unsolvable?) University of Georgia Fall 2014 Outline Decidability Decidable Problems for Regular Languages Decidable Problems for Context Free Languages The Halting Problem Countable and Uncountable Sets Diagonalization

More information

Applied Computer Science II Chapter 1 : Regular Languages

Applied Computer Science II Chapter 1 : Regular Languages Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany Overview Deterministic finite automata Regular languages

More information

CS 121, Section 2. Week of September 16, 2013

CS 121, Section 2. Week of September 16, 2013 CS 121, Section 2 Week of September 16, 2013 1 Concept Review 1.1 Overview In the past weeks, we have examined the finite automaton, a simple computational model with limited memory. We proved that DFAs,

More information

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism, CS 54, Lecture 2: Finite Automata, Closure Properties Nondeterminism, Why so Many Models? Streaming Algorithms 0 42 Deterministic Finite Automata Anatomy of Deterministic Finite Automata transition: for

More information

Computer Sciences Department

Computer Sciences Department 1 Reference Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER 3 objectives Finite automaton Infinite automaton Formal definition State diagram Regular and Non-regular

More information

Constructions on Finite Automata

Constructions on Finite Automata Constructions on Finite Automata Informatics 2A: Lecture 4 Mary Cryan School of Informatics University of Edinburgh mcryan@inf.ed.ac.uk 24 September 2018 1 / 33 Determinization The subset construction

More information

COMP4141 Theory of Computation

COMP4141 Theory of Computation COMP4141 Theory of Computation Lecture 4 Regular Languages cont. Ron van der Meyden CSE, UNSW Revision: 2013/03/14 (Credits: David Dill, Thomas Wilke, Kai Engelhardt, Peter Höfner, Rob van Glabbeek) Regular

More information

Automata and Computability. Solutions to Exercises

Automata and Computability. Solutions to Exercises Automata and Computability Solutions to Exercises Spring 27 Alexis Maciel Department of Computer Science Clarkson University Copyright c 27 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata

More information

CS243, Logic and Computation Nondeterministic finite automata

CS243, Logic and Computation Nondeterministic finite automata CS243, Prof. Alvarez NONDETERMINISTIC FINITE AUTOMATA (NFA) Prof. Sergio A. Alvarez http://www.cs.bc.edu/ alvarez/ Maloney Hall, room 569 alvarez@cs.bc.edu Computer Science Department voice: (67) 552-4333

More information

1 More finite deterministic automata

1 More finite deterministic automata CS 125 Section #6 Finite automata October 18, 2016 1 More finite deterministic automata Exercise. Consider the following game with two players: Repeatedly flip a coin. On heads, player 1 gets a point.

More information