What is this course about?

Similar documents
Formal Languages and Automata

Sri vidya college of engineering and technology

Theory of Computation

CS Automata, Computability and Formal Languages

Introduction to Kleene Algebras

Computational Theory

CS 121, Section 2. Week of September 16, 2013

Automata Theory and Formal Grammars: Lecture 1

Finite Automata and Languages

Theory of Computation 1 Sets and Regular Expressions

The Pumping Lemma. for all n 0, u 1 v n u 2 L (i.e. u 1 u 2 L, u 1 vu 2 L [but we knew that anyway], u 1 vvu 2 L, u 1 vvvu 2 L, etc.

Theoretical Computer Science

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Nondeterministic Finite Automata and Regular Expressions

Computational Models - Lecture 4

FLAC Context-Free Grammars

Chapter 3. Regular grammars

Introduction to Automata

Structural Induction

6.8 The Post Correspondence Problem

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

Mathematical Preliminaries. Sipser pages 1-28

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

Models of Computation,

(Refer Slide Time: 0:21)

Uses of finite automata

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

What we have done so far

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Chapter 0 Introduction. Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin

CS375 Midterm Exam Solution Set (Fall 2017)

MA/CSSE 474 Theory of Computation

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

Theory of Computer Science

CVO103: Programming Languages. Lecture 2 Inductive Definitions (2)

Automata: a short introduction

COSE312: Compilers. Lecture 2 Lexical Analysis (1)

Computational Models - Lecture 3

Finite Automata and Regular Languages

Author: Vivek Kulkarni ( )

CSCI 2670 Introduction to Theory of Computing

COM364 Automata Theory Lecture Note 2 - Nondeterminism

UNIT-III REGULAR LANGUAGES

How do regular expressions work? CMSC 330: Organization of Programming Languages

TAFL 1 (ECS-403) Unit- II. 2.1 Regular Expression: The Operators of Regular Expressions: Building Regular Expressions

Languages. A language is a set of strings. String: A sequence of letters. Examples: cat, dog, house, Defined over an alphabet:

Formal Languages, Automata and Models of Computation

Automata and Languages

CMSC 330: Organization of Programming Languages

3515ICT: Theory of Computation. Regular languages

Theory of computation: initial remarks (Chapter 11)

COMP4141 Theory of Computation

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Properties of Context-free Languages. Reading: Chapter 7

Context-Free Grammars and Languages. Reading: Chapter 5

UNIT II REGULAR LANGUAGES

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

CMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata

CMSC 330: Organization of Programming Languages

FABER Formal Languages, Automata. Lecture 2. Mälardalen University

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Chapter Five: Nondeterministic Finite Automata

CS 455/555: Finite automata

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

Proofs, Strings, and Finite Automata. CS154 Chris Pollett Feb 5, 2007.

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Deterministic Finite Automaton (DFA)

Lecture 23 : Nondeterministic Finite Automata DRAFT Connection between Regular Expressions and Finite Automata

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2017

Theory of Computation

Context-free Grammars and Languages

C1.1 Introduction. Theory of Computer Science. Theory of Computer Science. C1.1 Introduction. C1.2 Alphabets and Formal Languages. C1.

CS500 Homework #2 Solutions

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

Lecture 17: Language Recognition

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

Topics COSC Administrivia. Topics Today. Administrivia (II) Acknowledgements. Slides presented May 9th and 16th,

Pushdown Automata. Reading: Chapter 6

ICS141: Discrete Mathematics for Computer Science I

MA/CSSE 474 Theory of Computation

1. (a) Explain the procedure to convert Context Free Grammar to Push Down Automata.

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1)

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission.

Theory of Computation Lecture 1. Dr. Nahla Belal

Lexical Analysis. Reinhard Wilhelm, Sebastian Hack, Mooly Sagiv Saarland University, Tel Aviv University.

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Deterministic Finite Automata (DFAs)

Section 1 (closed-book) Total points 30

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

On Parsing Expression Grammars A recognition-based system for deterministic languages

Axioms of Kleene Algebra

Advanced Automata Theory 2 Finite Automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS256/Spring 2008 Lecture #11 Zohar Manna. Beyond Temporal Logics

Today s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets

SFWR ENG 2FA3. Solution to the Assignment #4

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1)

Transcription:

What is this course about? Examining the power of an abstract machine What can this box of tricks do?

What is this course about? Examining the power of an abstract machine Domains of discourse: automata and formal languages Automaton is the box of tricks, language recognition is what it can do.

What is this course about? Examining the power of an abstract machine Domains of discourse: automata and formal languages Formalisms to describe languages and automata Very useful for future courses.

What is this course about? Examining the power of an abstract machine Domains of discourse: automata and formal languages Formalisms to describe languages and automata Proving a particular case: relationship between regular languages and finite automata Perhaps the simplest result about power of a machine. Finite Automata are simply a formalisation of finite state machines you looked at in Digital Electronics.

A word about formalisms to describe languages Classically (i.e. when I was young) this would be done using formal grammars. e.g. S NV e.g. I ID, I D, I D

A word about formalisms to describe languages Classically (i.e. when I was young) this would be done using formal grammars. Here will we use rule induction Excuse to introduce now, useful in other things

Syllabus for this part of the course Inductive definitions using rules and proofs by rule induction. Abstract syntax trees. Regular expressions and pattern matching. Finite automata and regular languages: Kleene s theorem. The Pumping Lemma. mathematics needed for computer science

Common theme: mathematical techniques for defining formal languages and reasoning about their properties. Key concepts: inductive definitions, automata Relevant to: Part IB Compiler Construction, Computation Theory, Complexity Theory, Semantics of Programming Languages Part II Natural Language Processing, Optimising Compilers, Denotational Semantics, Temporal Logic and Model Checking N.B. we do not cover the important topic of context-free grammars, which prior to 2013/14 was part of the CST IA course Regular Languages and Finite Automata that has been subsumed into this course. see course web page for relevant Tripos questions

Formal Languages

Alphabets An alphabet is specified by giving a finite set, Σ, whose elements are called symbols. For us, any set qualifies as a possible alphabet, so long as it is finite. Examples: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, 10-element set of decimal digits. {a, b, c,..., x, y, z}, 26-element set of lower-case characters of the English language. {S S {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}}, 2 10 -element set of all subsets of the alphabet of decimal digits. Non-example: N = {0, 1, 2, 3,...}, set of all non-negative whole numbers is not an alphabet, because it is infinite.

Strings over an alphabet A string of length n (for n = 0, 1, 2,...) over an alphabet Σ is just an ordered n-tuple of elements of Σ, written without punctuation. Σ denotes set of all strings over Σ of any finite length. notation for the Examples: string of length 0 If Σ = {a, b, c}, then ε, a, ab, aac, and bbac are strings over Σ of lengths zero, one, two, three and four respectively. If Σ = {a}, then Σ contains ε, a, aa, aaa, aaaa, etc. In general, a n denotes the string of length n just containing a symbols

Strings over an alphabet A string of length n (for n = 0, 1, 2,...) over an alphabet Σ is just an ordered n-tuple of elements of Σ, written without punctuation. Σ denotes set of all strings over Σ of any finite length. Examples: If Σ = {a, b, c}, then ε, a, ab, aac, and bbac are strings over Σ of lengths zero, one, two, three and four respectively. If Σ = {a}, then Σ contains ε, a, aa, aaa, aaaa, etc. If Σ = (the empty set), then what is Σ?

Strings over an alphabet A string of length n (for n = 0, 1, 2,...) over an alphabet Σ is just an ordered n-tuple of elements of Σ, written without punctuation. Σ denotes set of all strings over Σ of any finite length. Examples: If Σ = {a, b, c}, then ε, a, ab, aac, and bbac are strings over Σ of lengths zero, one, two, three and four respectively. If Σ = {a}, then Σ contains ε, a, aa, aaa, aaaa, etc. If Σ = (the empty set), then Σ = {ε}.

Concatenation of strings The concatenation of two strings u and v is the string uv obtained by joining the strings end-to-end. This generalises to the concatenation of three or more strings. Examples: If Σ = {a, b, c,..., z} and u, v, w Σ are u = ab, v = ra and w = cad, then vu = raab uu = abab wv = cadra uvwuv = abracadabra

Concatenation of strings The concatenation of two strings u and v is the string uv obtained by joining the strings end-to-end. This generalises to the concatenation of three or more strings. Examples: If Σ = {a, b, c,..., z} and u, v, w Σ are u = ab, v = ra and w = cad, then vu = raab uu = abab wv = cadra uvwuv = abracadabra N.B. (uv)w = uvw = u(vw) ( any u,v,w ) uǫ = u =ǫu The length of a string u Σ is denoted u.

Formal languages An extensional view of what constitutes a formal language is that it is completely determined by the set of words in the dictionary : Given an alphabet Σ, we call any subset of Σ a (formal) language over the alphabet Σ. We will use inductive definitions to describe languages in terms of grammatical rules for generating subsets of Σ.

Inductive Definitions

Axioms and rules for inductively defining a subset of a given set U axioms a are specified by giving an element a of U rules h 1 h 2 h n c are specified by giving a finite subset {h 1, h 2,..., h n } of U (the hypotheses of the rule) and an element c of U (the conclusion of the rule)

Axioms and rules for inductively defining a subset of a given set U axioms are specified by giving an element a of U a which means that a is in the subset we are defining rules h 1 h 2 h n c are specified by giving a finite subset {h 1, h 2,..., h n } of U (the hypotheses of the rule) and an element c of U (the conclusion of the rule) which means that c is in the subset we are defining if all of h 1, h 2,..., h n are

usually draw with leaves at top, root at bottom Derivations Given a set of axioms and rules for inductively defining a subset of a given set U, a derivation (or proof) that a particular element u U is in the subset is by definition a finite rooted tree with vertexes labelled by elements of U and such that: the root of the tree is u (the conclusion of the whole derivation), each vertex of the tree is the conclusion of a rule whose hypotheses are the children of the node, each leaf of the tree is an axiom.

Example U = {a, b} axiom: rules: ε u aub u bua u uv v (for all u, v U) Example derivations: ε ε ab ε ba ε ab ab aabb baab abaabb abaabb

Example U = {a, b} The universal set from which we are specifying a subset. axiom: rules: ε u aub u bua u uv v (for all u, v U) Example derivations: ε ε ab ε ba ε ab ab aabb baab abaabb abaabb

Example U = {a, b} It is the set of all finite strings containing a s & b s. axiom: rules: ε u u u v aub bua uv Example derivations: ε ε ab ε ba ε ab ab aabb baab abaabb abaabb (for all u, v U)

Example U = {a, b} Now the axioms and rules to define the subset : axiom: rules: ε u aub u bua u uv v (for all u, v U) ε ab Example derivations: ε ε ε ab ba ab aabb baab

Inductively defined subsets Given a set of axioms and rules over a set U, the subset of U inductively defined by the axioms and rules consists of all and only the elements u U for which there is a derivation with conclusion u. For example, for the axioms and rules on Slide 13 abaabb is in the subset they inductively define (as witnessed by either derivation on that slide) abaab is not in that subset (there is no derivation with that conclusion why?) (In fact u {a, b} is in the subset iff it contains the same number of a and b symbols.)

rules or templates? u uv v (for all u, v U) is really a template for a (potentially) infinite set of rules

Example: transitive closure Given a binary relation R X X on a set X, its transitive closure R + is the smallest (for subset inclusion) binary relation on X which contains R and which is transitive ( x, y, z X. (x, y) R + &(y, z) R + (x, z) R + ). R + is equal to the subset of X X inductively defined by axioms (x, y) (for all (x, y) R) rules (x, y) (y, z) (x, z) (for all x, y, z X)

Example: reflexive-transitive closure Given a binary relation R X X on a set X, its reflexive-transitive closure R is defined to be the smallest binary relation on X which contains R, is both transitive and reflexive ( x X. (x, x) R ). R is equal to the subset of X X inductively defined by axioms (x, y) (for all (x, y) R) (x, x) (for all x X) rules (x, y) (y, z) (x, z) (for all x, y, z X)

Example: reflexive-transitive closure Given a binary relation R X X on a set X, its reflexive-transitive closure R is defined to be the smallest binary relation on X which contains R, is both transitive and reflexive ( x X. (x, x) R ). R is equal to the subset of X X inductively defined by axioms (x, y) (for all (x, y) R) (x, x) (for all x X) rules (x, y) (y, z) (x, z) (for all x, y, z X) we can use Rule Induction to prove this

Example: reflexive-transitive closure Given a binary relation R X X on a set X, its reflexive-transitive closure R is defined to be the smallest binary relation on X which contains R, is both transitive and reflexive ( x X. (x, x) R ). R is equal to the subset of X X inductively defined by axioms (x, y) (for all (x, y) R) (x, x) (for all x X) rules (x, y) (y, z) (x, z) (for all x, y, z X) we can use Rule Induction to prove this, since S X X being closed under the axioms & rules is the same as it containing R, being reflexive and being transitive.

Inductively defined subsets Given a set of axioms and rules over a set U, the subset of U inductively defined by the axioms and rules consists of all and only the elements u U for which there is a derivation with conclusion u. Derivation is a finite (labelled) tree with u at root, axiom at leaves and each vertex the conclusion of a rule whose hypotheses are the children of the vertex. (We usually draw the trees with the root at the bottom.)

Rule Induction Theorem. The subset I U inductively defined by a collection of axioms and rules is closed under them and is the least such subset: if S U is also closed under the axioms and rules, then I S. Given axioms and rules for inductively defining a subset of a set U, we say that a subset S U is closed under the axioms and rules if for every axiom a, it is the case that a S for every rule h 1 h 2 h n, if h 1, h 2,..., h n S, then c S. c

E.g. for the axiom & rules ǫ u aub u bua u v uv for all u, v {a, b} the subset {u {a, b} # a (u) = # b (u)} (where # a (u) is the number of a s in the string u)

E.g. for the axiom & rules ǫ u aub u bua u v uv for all u, v {a, b} the subset {u {a, b} # a (u) = # b (u)} is closed under the axiom & rules.

N.B. for a given set R of axioms & rules {u U S U.(S closed under R) = u S} is closed under R (Why?) and so is the smallest such (with respect to subset inclusion, )

N.B. for a given set R of axioms & rules {u U S U.(S closed under R) = u S} is closed under R (Why?) and so is the smallest such (with respect to subset inclusion, ) This set contains all items that are in every set that is closed under R

Theorem. The subset I U inductively defined by a collection of axioms and rules is closed under them and is the least such subset: if S U is also closed under the axioms and rules, then I S. the least subset closed under the axioms & rules is sometimes take as the definition of inductively defined subset

Proof of the Theorem [Page 23 of notes] Closure part I is closed under each axiom because we can construct a derivation witnessing a I...... which is simply a tree with one node containing a a

Closure part (2) I is closed under each rule r = h 1 h 2... h n a because if h 1 h 2... h n I... we have n derivations from axioms to each h i and so... we can just make these the n children to our rule r to form a big tree... which is a derivation witnessing c I

Proof of the Theorem so we have closure under rules & axioms Now the least such subset part We need to show, for every S U (S closed under axioms and rules ) I S That is, I is the least subset, in that any other subset that is closed under the axioms & rules contains I.

Least Subset So we need to show that every element of I is contained in any set S U which is closed under the rules & axioms Q: How can we characterise an element of I? A: For each element of I there is a derivation that witnesses its membership So let s do induction on the height of the derivation (i.e. the height of the tree)

Least Subset - Proof by Induction P(n) all derivations of height n have their conclusion in S Need to show: P(0) (consider these to be single (axiom) node derivations) (k n) P(k) P(n+1) since if P(n) is true for all n, then all derivations have their conclusion in S, and thus every element of I is in S.

Least Subset - Proof by Induction P(n) all derivations of height n have their conclusion in S P(0): trivially true since conclusion is an axiom and S is closed under axioms

Least Subset - Proof by Induction P(n) all derivations of height n have their conclusion in S P(0): trivially true since conclusion is an axiom and S is closed under axioms (k n) P(k) P(n+1):

Least Subset - Proof by Induction P(n) all derivations of height n have their conclusion in S P(0): trivially true since conclusion is an axiom and S is closed under axioms (k n) P(k) P(n+1): Suppose (k n) P(k) and that D is a derivation of height n+1 with, say, conclusion c

D D 1 c 1 D 2 c 2 D k c k c

D D 1 c 1 D 2 c 2 D k c k some rule c c is the result of applying some rule to a set of conclusions c 1 c 2... c k

D D 1 D 2 D k c 1 c 2 heights c n k some rule c But the derivations for the c i all have height n. So the c i are all in S by assumption and since S is closed under all axioms & rules, c S so (k n) P(k) P(n+1)

Thus every element in I is in any S that is closed under the axioms & rules that inductively defined I. Thus I is the least subset that is closed under those axioms & rules.

Rule Induction Theorem. The subset I U inductively defined by a collection of axioms and rules is closed under them and is the least such subset: if S U is also closed under the axioms and rules, then I S. We use the theorem as method of proof: given a property P(u) of elements of U, to prove u I. P(u) it suffices to show base cases: P(a) holds for each axiom a induction steps: P(h 1 ) & P(h 2 ) & &P(h n ) P(c) holds for each rule h 1 h 2 h n c

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. u u u v ǫ aub bua uv

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. u u u v ǫ aub bua uv Associated Rule Induction:

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. u u u v ǫ aub bua uv Associated Rule Induction: P(ǫ)

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. ǫ u aub u bua u v uv Associated Rule Induction: P(ǫ) u I. P(u) P(aub)

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. ǫ u aub u bua u v uv Associated Rule Induction: P(ǫ) u I. P(u) P(aub) u I. P(u) P(bua)

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. ǫ u aub u bua u v uv Associated Rule Induction: P(ǫ) u I. P(u) P(aub) u I. P(u) P(bua) u, v I. P(u) P(v) P(uv)

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. For u {a, b}, let P(u) be the property u contains the same number of a and b symbols We can prove u I. P(u) by rule induction: base case: P(ε) is true (the number of as and bs is zero!)

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. For u {a, b}, let P(u) be the property u contains the same number of a and b symbols We can prove u I. P(u) by rule induction: base case: P(ε) is true (the number of as and bs is zero!) induction steps: if P(u) and P(v) hold, then clearly so do P(aub), P(bua) and P(uv).

Example using rule induction Let I be the subset of {a, b} inductively defined by the axioms and rules on Slide 17 of the notes. For u {a, b}, let P(u) be the property u contains the same number of a and b symbols We can prove u I. P(u) by rule induction: base case: P(ε) is true (the number of as and bs is zero!) induction steps: if P(u) and P(v) hold, then clearly so do P(aub), P(bua) and P(uv). (It s not so easy to show u {a, b}. P(u) u I rule induction for I is not much help for that.)