Testing Emptiness of a CFL. Testing Finiteness of a CFL. Testing Membership in a CFL. CYK Algorithm

Testing Emptiness of a CFL

- As for regular languages, we really take a representation of some language and ask whether it represents ∅.
- We can use either a CFG or a PDA. The choice is ours, since there are algorithms to convert one to the other.
- The test: use a CFG and check whether the start symbol is useless, i.e., whether it derives no terminal string.

Testing Finiteness of a CFL

- Let L be a CFL. Then there is some pumping-lemma constant n for L.
- Test all strings of length between n and 2n - 1 for membership (see Testing Membership in a CFL below).
- If there is any such string, it can be pumped, and the language is infinite.
- If there is no such string, then n - 1 is an upper limit on the length of strings, so the language is finite.
- Trick: if there were a string z = uvwxy of length 2n or longer, you could find a shorter string uwy in L, but it is at most n shorter, because |vwx| ≤ n. Thus, if there are any strings of length 2n or more, you can repeatedly cut out vx to get, eventually, a string whose length is in the range n to 2n - 1.

COSC 00

Testing Membership in a CFL

- Simulating a PDA for L on string w doesn't quite work, because the PDA can grow its stack indefinitely on ε input, and we never finish, even if the PDA is deterministic.
- There is an O(n^3) algorithm (n = length of w) that uses a "dynamic programming" technique.
- It is called the Cocke-Younger-Kasami (CYK) algorithm.

CYK Algorithm

- Start with a CNF grammar for L.
- Build a two-dimensional table:
  - Row = length of substring of w
  - Column = start of substring
  - Entry in row i and column j = set of variables that generate the substring of w beginning at position j and including i positions
- In the notation used below, X_ij is the set of variables that derive a_i a_i+1 ... a_j, so X_ij sits in row j - i + 1, column i.
- w is in L if S is in X_1n, where n = |w|.

[Figure: triangular table with row 1 (X_11 ... X_nn) at the bottom, directly above the symbols a_1 ... a_n of w, and X_1n at the top.]

Basis (row 1): X_ii = the set of variables A such that A → a is a production, and a is the symbol at position i of w.

Induction: Assume the rows for substrings of length up to m - 1 have been computed, and compute the row for substrings of length m. We can derive a_i a_i+1 ... a_j from A if there is a production A → BC, B derives some prefix of a_i a_i+1 ... a_j, and C derives the rest. Thus, we must ask if there is any value of k such that

- i ≤ k < j
- B is in X_ik
- C is in X_k+1,j
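The basis and induction steps translate almost line for line into code. The following is a minimal Python sketch, not a definitive implementation: the function names `cyk_table` and `cyk_accepts` and the rule encoding (`UNIT` for productions A → a, `PAIR` for productions A → BC) are assumptions made for illustration. The grammar is the one from the worked example that follows (S → AB, A → BB | a, B → AB | b).

```python
def cyk_table(word, unit_rules, pair_rules):
    """Return X, where X[(i, j)] is the set of variables deriving
    word[i..j] (positions are 1-indexed and inclusive, as on the slides)."""
    n = len(word)
    X = {}
    # Basis: X[i, i] holds every A with a production A -> a_i.
    for i in range(1, n + 1):
        X[(i, i)] = {A for A, a in unit_rules if a == word[i - 1]}
    # Induction: fill in substrings of length m from shorter ones.
    for m in range(2, n + 1):
        for i in range(1, n - m + 2):
            j = i + m - 1
            cell = set()
            for k in range(i, j):  # every split point with i <= k < j
                for A, (B, C) in pair_rules:
                    if B in X[(i, k)] and C in X[(k + 1, j)]:
                        cell.add(A)
            X[(i, j)] = cell
    return X


def cyk_accepts(word, unit_rules, pair_rules, start="S"):
    """True if the start symbol derives the whole word."""
    if not word:
        return False  # assumption: the CNF grammar has no rule for the empty string
    return start in cyk_table(word, unit_rules, pair_rules)[(1, len(word))]


# Hypothetical encoding of the example grammar: S -> AB, A -> BB | a, B -> AB | b
UNIT = [("A", "a"), ("B", "b")]
PAIR = [("S", ("A", "B")), ("A", ("B", "B")), ("B", ("A", "B"))]

print(cyk_accepts("aabbb", UNIT, PAIR))
```

The three nested loops over m, i and k are where the O(n^3) running time comes from; the innermost loop over productions contributes only a grammar-dependent constant.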

Example

Determine if w = aabbb is in the language generated by G:

    S → AB
    A → BB | a
    B → AB | b

w_1 = a, so X_11 is the set of all variables that immediately derive a, that is, X_11 = {A}. Row 1 of the table is:

    {A}    {A}    {B}    {B}    {B}
    a      a      b      b      b      (w)

- Compute X_12: since X_11 = {A} and X_22 = {A}, look for productions with right side AA. X_12 consists of all variables on the left side of any production with right side AA; there are none, so X_12 is ∅.
- Compute X_23: look for AB. Productions B → AB and S → AB fit, so X_23 = {S,B}.
- The rest is easy.
- Since S is in X_15, w ∈ L(G).

Completed table (row = substring length, column = start position):

    5   {S,B}
    4   {A}    {S,B}
    3   {S,B}  {A}    {S,B}
    2   ∅      {S,B}  {A}    {A}
    1   {A}    {A}    {B}    {B}    {B}
        a      a      b      b      b      (w)

Outline of Turing Machines and Complexity

- Turing machine (TM) = formal model of a computer running a particular program.
- We must argue that the TM can do exactly what a computer can do, albeit slower.
- We use the simplicity of the TM model to prove formally that there are specific problems (languages) that the TM cannot solve.

Outline

- Undecidability: unsolvable problems
- Turing Machines: formalism for computers generally

Undecidability: Intuitive Argument

- Are there problems a programme cannot solve?
- Simple "hello, world" problem: decide if a programme prints hello, world.
- It's hard or impossible to determine, even for humans: cf. the International Obfuscated C Code Contest (IOCCC), whose goal is to write the most confusing programme you can.
- Even humans don't understand these programmes, so why should programmes do much better?
- Spoiler: they can't!

Simulation won't work

- Simple approach: run the programme or simulate it, wait for the output, and print yes or no depending on the output.
- Obvious problem: what if the programme is: maybe_wait_for_00_years(); print("hello, world");
- Is there no other way? Some oracle?
- No. Use proof by contradiction.

Undecidability: Informal Proof (I)

Suppose H is a programme which solves the hello, world problem:

- H takes two inputs: P (a programme) and I (input for P).
- H prints yes if P prints hello, world after reading I.
- H prints no otherwise.
- NB: even if P never stops, H must print yes or no.

[Figure: inputs P and I feed into H, which outputs yes or no.]

Undecidability: Informal Proof (II)

From H, create H1, which acts just like H but prints hello, world instead of no:

- if P prints hello, world, H1 prints yes
- if P prints anything else, H1 prints hello, world

[Figure: inputs P and I feed into H1, which outputs yes or hello, world.]

Undecidability: Informal Proof (III)

Now from H1, create H2, which acts just like H1 except that P doubles as input to itself, so H2(P) = H1(P,P):

- if P prints hello, world, H2 prints yes
- if P prints anything else, H2 prints hello, world

[Figure: input P feeds into H2, which outputs yes or hello, world.]

Undecidability: Informal Proof (IV)

Now feed H2 to itself, i.e., H2(H2)! What happens?

- If H2 prints hello, world, then H2 should print yes.
- But by printing yes, H2 is forced to print hello, world, and so on, ad infinitum.
- Hence, H2, H1, and H cannot exist.

[Figure: H2 feeds into H2, which outputs yes or hello, world.]

Turing Machines and Complexity

- Turing Machine (TM): formal model of computer + program
- A TM can do exactly what a computer can do, just slower.
- There are specific problems that a TM cannot solve:
  1. Recursively Enumerable: accept but not reject
  2. Non-RE: cannot even recognize
- Problems with TM's that accept and always halt, i.e., accept + reject:
  - P vs. NP-complete problems
  - specific NP-complete problem(s), e.g., satisfiability

The TM

- Finite-state control, like a PDA.
- One read-write tape serves as both input and unbounded storage device.
  - The tape is divided into cells.
  - Each cell holds one symbol from the tape alphabet.
- The tape is "semi-infinite"; it ends only at the left.
- The tape head marks the current cell, the only cell that can influence the move of the TM.
- Initially, the tape holds a_1 a_2 ... a_n B B ..., where a_1 a_2 ... a_n is the input, chosen from the input alphabet (a subset of the tape alphabet), and B is the blank.

[Figure: finite-state control with a transition from p to q labelled a/b, R; the tape head is about to read the current cell of the tape a_1 a_2 ... a_n B B ...]

Formal TM

M = (Q, Σ, Γ, δ, q_0, B, F) where:

- Q = finite set of states
- Σ = input alphabet
- Γ = tape alphabet, Σ ⊂ Γ
- B = blank, B in Γ - Σ
- q_0 = start state, q_0 in Q
- F = accepting states, F ⊆ Q
- δ = transition function: δ(q,X) = (q',X',d) maps state q and tape symbol X to new state q', replacement symbol X' (either might not change) and direction d (= L or R) for head motion

Example XX0

M accepts if the third input symbol is 0, and otherwise runs forever.

M = ({p, q, r, s, t}, {0, 1}, {0, 1, B}, δ, p, B, {s})

1. δ(p,X) = (q, X, R) for X = 0, 1, i.e., in state p on reading 0 or 1, rewrite the input symbol, move the tape head to the right and go to state q.
2. δ(q,X) = (r, X, R) for X = 0, 1.
3. δ(r,0) = (s, 0, L).
4. δ(r,1) = (t, 1, R).
5. δ(t,X) = (t, X, R) for X = 0, 1, B.

Example XX0 (II)

We can draw M:

[Figure: transition diagram of M over states p, q, r, s, t, with edges labelled 0/0,R and 1/1,R from p to q and from q to r, and edges from r to s and from r to t.]

ID's of a Turing Machine

- An ID (instantaneous description) captures what is going on at any moment: the current state, the contents of the tape, and the position of the tape head.
- Keep things finite by dropping all symbols to the right of the head and to the right of the rightmost nonblank.
- Subtle point: although there is no limit on how far right the head may move and write nonblanks, at any finite time, the TM has visited only a finite prefix of the infinite tape.

Notation

αqβ says:

- α is the tape contents to the left of the head
- the state is q
- β is the nonblank tape contents at or to the right of the tape head

One move is indicated by ⊢. Zero or more moves are represented by ⊢*. Check 8.. for the detailed definition of ⊢.

IDs for Example XX0

With input 0101, the sequence of ID's of the TM is:

    p0101 ⊢ 0q101 ⊢ 01r01 ⊢ 0s101

At that point it halts, since state s has no move when the head is scanning 1.

With input 0111, the sequence is:

    p0111 ⊢ 0q111 ⊢ 01r11 ⊢ 011t1 ⊢ 0111t ⊢ 0111Bt ⊢ ...

The TM never halts, but continues to move right.
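To make the ID notation concrete, here is a small Python sketch that runs the XX0 machine and records each ID as α, state, β. It is an illustration under assumptions, not part of the original slides: the function `run`, the outcome labels ("accept", "halt", "left-end", "timeout") and the `max_steps` cutoff are all hypothetical, and the cutoff exists only because a simulator cannot wait for a machine that runs forever.

```python
BLANK = "B"

# Transition function of the XX0 machine: (state, symbol) -> (state, symbol, direction)
DELTA = {}
for x in "01":
    DELTA[("p", x)] = ("q", x, "R")   # rule 1
    DELTA[("q", x)] = ("r", x, "R")   # rule 2
DELTA[("r", "0")] = ("s", "0", "L")   # rule 3
DELTA[("r", "1")] = ("t", "1", "R")   # rule 4
for x in "01" + BLANK:
    DELTA[("t", x)] = ("t", x, "R")   # rule 5

ACCEPTING = {"s"}


def run(delta, accepting, word, start="p", max_steps=20):
    """Simulate at most max_steps moves and return (outcome, list of IDs)."""
    tape = list(word) or [BLANK]
    state, head = start, 0
    ids = []
    for _ in range(max_steps):
        # ID = tape left of the head, then the state, then the nonblank rest.
        ids.append("".join(tape[:head]) + state + "".join(tape[head:]).rstrip(BLANK))
        if state in accepting:
            return "accept", ids
        move = delta.get((state, tape[head]))
        if move is None:
            return "halt", ids          # no next move, and not in an accepting state
        state, tape[head], direction = move
        head += 1 if direction == "R" else -1
        if head < 0:
            return "left-end", ids      # fell off the left end of the tape
        if head == len(tape):
            tape.append(BLANK)          # extend the visited prefix of the tape
    return "timeout", ids               # step bound reached without a verdict


print(run(DELTA, ACCEPTING, "0101"))
print(run(DELTA, ACCEPTING, "0111", max_steps=6))
```

Running it on an input such as 01 shows the third possible outcome: the machine halts in state r on a blank without accepting, which is why halting and acceptance need to be distinguished.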

Acceptance by Final State / by Halting

Two ways to define the language of a TM:

1. by the set of input strings that cause it to reach an accepting state: L(M) = {w | q_0 w ⊢* αpβ for some p in F and any α and β in Γ*}
2. by the set of strings that cause the TM to halt, i.e., have no next move: H(M) = {w | q_0 w ⊢* αpXβ and δ(p,X) is not defined}

Language of Example XX0

- We can describe L(M) with the RE (0+1)(0+1)0(0+1)*
- We can describe H(M) with the RE ε + (0+1) + (0+1)(0+1) + (0+1)(0+1)0(0+1)*
- Why the difference? No move on B is defined from p, q or r.
- We could fix this with δ(p,B) = δ(q,B) = δ(r,B) = (t, B, R).
- For this new machine M': L(M') = H(M') = L(M).

Final State = Halting

We need to show that L is L(M1) (final state) for some TM M1 iff L is H(M2) (halting) for some TM M2.

- If: build M1 from M2. Add a final state r to M2, and add transitions to r from any state where M2 might otherwise halt.
- Only-if: we can also do the reverse. Wherever M1 has a final state, M2 has no move; wherever M1 has no move on some input, add a transition to a new state r which loops forever on any input.

Falling Off the Left End of Tape

- There is a funny situation where the TM would halt but falls off the left end of the tape.
- This situation is not halting. Neither does a TM accept if it tries to enter an accepting state as it falls off the left end.
- We can prevent falling off the left end by marking the leftmost cell, as in the book.
- But it appears we do not need to do so in order to prove the equivalence of halting and accepting, since neither occurs when the TM falls off the left end.

Stupid Turing Machine Tricks

We can create structured state names and tape symbols:

- a state named [q,X], where X is in Γ
- a tape symbol [P,X], where P = * or blank and X is the real symbol

Structured State Names

Use them for swapping cells on the tape, for example:

    from r:      a/a,R → [q,a]      b/b,R → [q,b]
    from [q,a]:  a/a,L → [p,a]      b/a,L → [p,b]
    from [q,b]:  a/b,L → [p,a]      b/b,L → [p,b]
    from [p,a]:  a/a,R → s          b/a,R → s
    from [p,b]:  a/b,R → s          b/b,R → s

Swapping cells on a tape with Γ = {a,b}
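Structured state names are easy to mimic with tuples: the state [q,a] becomes the pair ("q", "a"), and the whole transition table can be generated by looping over Γ instead of being written out by hand. The sketch below is one possible reading of the swapping machine, with the hypothetical helper `swap_first_two` added only to exercise it; it assumes the head starts on the first of the two cells to swap and that both cells hold symbols of Γ.

```python
GAMMA = ["a", "b"]

# (state, symbol) -> (state, symbol, direction); structured states are tuples.
DELTA = {}
for x in GAMMA:
    # In r reading x: remember x in the state name and move right.
    DELTA[("r", x)] = (("q", x), x, "R")
    for y in GAMMA:
        # In [q,x] reading y: write x over y, remember y, move back left.
        DELTA[(("q", x), y)] = (("p", y), x, "L")
        # In [p,y] reading x: write the remembered y, move right, finish in s.
        DELTA[(("p", y), x)] = ("s", y, "R")


def swap_first_two(tape):
    """Run the machine from state r on the first cell until it reaches s."""
    cells = list(tape)
    state, head = "r", 0
    while state != "s":
        state, cells[head], direction = DELTA[(state, cells[head])]
        head += 1 if direction == "R" else -1
    return "".join(cells)


print(swap_first_two("abb"))
```

The point of the trick is that the finite control stores one tape symbol "for free": the number of states grows with the size of Γ, but it stays finite.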

Structured Tape Symbols

- They can simulate multi-tape TMs.
- * marks the cell to be read next.

    tape:     a      b      c
    symbols:  [ ,a]  [*,b]  [ ,c]

Single-tape TM using structured tape symbols

    tape 1:   a          b          c
    tape 2:   d          e          f
    symbols:  [ ,a, ,d]  [*,b, ,e]  [ ,c,*,f]

Two-tape TM using structured tape symbols

Example: Multiple Tracks

- A common use for multiple tracks is to use one track for data and the other for a single "mark."
- Symbols of Γ are pairs [A,X], where X is the "real" symbol, and A is either B (blank) or *.
- Input symbol a is identified with [B,a].
- The blank is [B,B].
- Here's a program to find the *, assuming it is somewhere to the left of the present position:
  1. δ(q,[B,X]) = (q,[B,X],L)
  2. δ(q,[*,X]) = (p,[B,X],R)

Other TM Models

- While regular or CF languages are classes of languages that we defined by convenient notations (RE's, CFG's, etc.), no one supposed that they represented "everything we can compute."
- The purpose of the TM was to define "everything we can compute."
- For convenience, we use recognition of languages as the space of possibly computable things; other spaces, e.g., computing arithmetic functions, yield the same conclusions.

Everything we can compute?

- Are TMs powerful enough to represent everything we can compute?
- Can we make a more powerful machine than the TM?
  - add more tapes? stacks? memory?
  - nondeterminism?
  - real computers?
- Seen another way: does adding these facilities make a more powerful machine?

Multitape TM's

- Allow the TM to have some finite number of tapes k, with a head for each tape.
- A move is a function of the state and the symbol scanned by each tape head.
- Action = new state, new symbol for each tape, and a head motion (L, R, or S, for "stationary").
- The first tape holds the input; the other tapes are initially blank.

Many Tapes to One Tape Simulation

- To simulate k tapes, use one tape with 2k tracks.
- One track holds the contents of each tape.
- Another track holds a mark representing the head position of that tape.

[Figure: a track carrying the head mark * above a track carrying the tape contents W X Y Z.]

- To simulate one move of the multitape TM, the one-tape TM must remember how many *'s are to its left.
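The mark-on-a-second-track idea, which the multitape simulation relies on for its head positions, can be tried out with pairs as tape symbols. This is a minimal sketch of the two-rule "find the *" program from the Multiple Tracks example; the helper names `make_delta` and `find_star` are assumptions made for illustration, and the error raised when no mark is found is an addition of this sketch, since the slide simply assumes the * exists to the left.

```python
B = "B"  # the blank mark on the upper track


def make_delta(symbols):
    """Build the two rules of the find-the-* program for the given real symbols."""
    delta = {}
    for x in symbols:
        delta[("q", (B, x))] = ("q", (B, x), "L")    # rule 1: unmarked cell, keep moving left
        delta[("q", ("*", x))] = ("p", (B, x), "R")  # rule 2: erase the mark, step right, enter p
    return delta


def find_star(tape, head):
    """tape is a list of (mark, symbol) pairs; head starts to the right of the mark.
    Returns (final state, final head position, resulting tape)."""
    cells = list(tape)  # work on a copy so the caller's tape is untouched
    delta = make_delta({symbol for _, symbol in cells})
    state = "q"
    while state == "q":
        state, cells[head], direction = delta[(state, cells[head])]
        head += 1 if direction == "R" else -1
        if head < 0:
            raise ValueError("no * to the left of the starting position")
    return state, head, cells


tape = [(B, "a"), ("*", "b"), (B, "c"), (B, "d")]
print(find_star(tape, 3))
```

In the many-tapes simulation the same pairing is applied k times over, so each cell of the single tape carries 2k components.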

Moves

1. Move left, then right, visiting all the *'s to see what each tape head is scanning.
2. Decide on the multitape TM's move, based on the scanned symbols and its state (remembered in the state of the one-tape TM).
3. Visit each * again, making the necessary adjustments: change symbols and move *'s one cell left or right, as needed.

Important observation for when we study polynomial time TM's:

- If the multitape TM makes T(n) moves when the input is of length n, then the one-tape TM makes O(T(n)^2) moves.
- Thus, if the multitape TM takes polynomial time, so does the one-tape TM.
- Key point in proof: after m moves of the multitape TM, the *'s can't be more than 2m cells apart, so the next move is simulated in O(m) + k moves of the one-tape TM (k = constant to account for reversals of direction to write a symbol). This happens at most T(n) times, so we get O(T(n)^2).
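The quadratic bound can be written out as a sum. This is a sketch under a stated assumption: simulating move number m of the multitape TM costs at most c·m + k moves of the one-tape TM, where c is an unspecified constant introduced here only for illustration and k is the constant for direction reversals. Summing over all T(n) moves gives:

```latex
\sum_{m=1}^{T(n)} \left( c\,m + k \right)
  = c\,\frac{T(n)\,\bigl(T(n)+1\bigr)}{2} + k\,T(n)
  = O\!\left( T(n)^{2} \right)
```

Squaring a polynomial yields a polynomial, which is why a polynomial-time multitape TM always has a polynomial-time one-tape equivalent.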