Two Lectures on Advanced Topics in Computability

Oded Goldreich
Department of Computer Science
Weizmann Institute of Science, Rehovot, Israel.
oded@wisdom.weizmann.ac.il

Spring 2002

Abstract. This text consists of notes for two short lectures on advanced topics in computability. The topic of the first lecture is Kolmogorov Complexity, and we merely present the most basic definitions and results regarding this notion. The topic of the second lecture is a comparison of two-sided error versus one-sided error probabilistic machines, when confined to the domain of finite automata.

Preface

These notes were prepared on the occasion of giving a guest lecture in David Harel's class on Advanced Topics in Computability. David's request was that the lecture should not rely on resource bounds (equivalently, complexity measures). This was a very challenging request for me, because all my professional thinking revolves around resource bounds. Still, I was able to find within my "knowledge base" two topics that meet David's request, although one may claim that I "cheated": these topics too are "related to quantities" and not merely to "qualities".

1 Kolmogorov Complexity

We start by presenting a paradox. Consider the following description of a natural number: the largest natural number that can be described by an English sentence of up to 1000 letters. (Something is wrong, because if the above is well-defined then so is the integer successor of the largest natural number that can be described by an English sentence of up to 1000 letters.)

Jumping ahead, we point out that the paradox presupposes that any sentence is a legal description in some adequate sense. One adequate sense of legal descriptions is that there exists a procedure that, given a (possibly succinct) "implicit" description of an object, outputs the "explicit" description of that object. Passing from implicit descriptions to explicit descriptions is what Kolmogorov Complexity is about.

Let us fix a Turing machine $M$. The Kolmogorov Complexity (w.r.t. $M$) of a binary string $x$, denoted $K_M(x)$, is the length of the shortest input $y$ such that $M(y) = x$; that is, $K_M(x) \stackrel{\rm def}{=} \min_y \{|y| : M(y) = x\}$. (In case there is no such $y$, we let $K_M(x) \stackrel{\rm def}{=} \infty$.) Clearly, $K_M(x)$ depends on $M$ (and not only on $x$), and so the question arises: which machine $M$ should we fix? The answer is that any universal machine will do. This is justified by the following fact:

Proposition 1.1: Let $U$ be a universal Turing machine. Then for every Turing machine $M$, there exists a constant $c$ such that $K_U(x) \leq K_M(x) + c$, for every $x \in \{0,1\}^*$.

Thus, universal machines provide the "most expressive" syntax for describing strings. Furthermore, the Kolmogorov Complexity is the same (up to an additive constant) for any two universal machines.

Proof Sketch: Suppose that $y$ satisfies both $|y| = K_M(x)$ and $M(y) = x$, and consider what happens when the input $(\langle M \rangle, y)$ is fed to $U$, where $\langle M \rangle$ denotes the description of $M$. Clearly, $U(\langle M \rangle, y) = M(y) = x$. Using a suitable encoding of pairs (i.e., using a prefix-free code for the first input), we have $K_U(x) \leq |\langle M \rangle| + |y|$, and the proposition follows since $|\langle M \rangle|$ depends only on $M$ (and not on $x$).

Conceptual Discussion: Kolmogorov Complexity measures the length of the most succinct descriptions of phenomena (or strings), where descriptions are with respect to an "effective" universal language that comes together with a procedure for generating the full description of a phenomenon out of its succinct description. Whereas some phenomena have very succinct descriptions, most phenomena (i.e., random phenomena) do not have succinct descriptions. These facts are stated in Theorem 1.2.
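To make the definition concrete, here is a minimal Python sketch. All names are illustrative: M_toy is a stand-in machine (deliberately total rather than universal, so that the search always terminates), and K_toy computes $K_{M_{\rm toy}}$ by brute-force enumeration, exactly as in the definition $K_M(x) = \min_y \{|y| : M(y) = x\}$.

```python
from itertools import product

def M_toy(y: str):
    """A stand-in 'machine' (illustrative; total, but not universal).
    Flag bit '0': output the rest of the input verbatim.
    Flag bit '1': read the rest as a binary number n and output '1' * n."""
    if y == "":
        return None  # no output on the empty input
    flag, rest = y[0], y[1:]
    if flag == "0":
        return rest
    if rest == "":
        return None
    return "1" * int(rest, 2)

def K_toy(x: str) -> int:
    """K w.r.t. M_toy: the length of the shortest y with M_toy(y) == x.
    Brute-force search in order of increasing length; this takes
    exponential time, so it is only usable on short strings."""
    for length in range(len(x) + 2):  # '0' + x is always a program for x
        for bits in product("01", repeat=length):
            if M_toy("".join(bits)) == x:
                return length
    raise AssertionError("unreachable: '0' + x describes x")

# K_toy(x) <= |x| + 1 for every x, while highly regular strings are far
# cheaper: K_toy('1' * 64) == 8 (the flag bit plus '1000000', 64 in binary).
print(K_toy("0110"), K_toy("1" * 64))
```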

Properties of Kolmogorov Complexity. In light of Proposition 1.1, we may fix an arbitrary universal machine $U$, and let $K(x) \stackrel{\rm def}{=} K_U(x)$. The quantitative properties of Kolmogorov Complexity are captured by the following theorem.

Theorem 1.2 (KC, quantitative properties):

1. There exists a constant $c$ such that $K(x) \leq |x| + c$ for all $x$'s.
2. There exist infinitely many $x$'s such that $K(x) \ll |x|$. Furthermore, for every monotonically increasing recursive function $f : \mathbb{N} \to \mathbb{N}$ there exist infinitely many $x$'s such that $K(x) < f^{-1}(|x|)$.
3. There exist infinitely many $x$'s such that $K(x) \geq |x|$. Furthermore, for most $x$'s of length $n$, it holds that $K(x) \geq |x| - 1$.

Proof Sketch: Part 1 follows by applying Proposition 1.1 to the machine $M_{\rm id}$ that computes the identity function (and thus satisfies $K_{M_{\rm id}}(x) = |x|$ for all $x$'s). Part 2 can be demonstrated by binary strings of the form $yy$ (for which $K(yy) \leq |y| + O(1)$). Another example is provided by strings of the form $1^n$ (for which $K(1^n) \leq \log_2 n + O(1)$). The furthermore part can be shown by defining $x_n \stackrel{\rm def}{=} 1^{f(n)}$, and observing that $K(x_n) \leq O(1) + \log_2 n < f^{-1}(f(n)) = f^{-1}(|x_n|)$, where the middle inequality holds for all sufficiently large $n$.

To prove Part 3, observe that if $K(x) = i$ then there exists an input $y$ of length $i$ that makes the universal machine $U$ output $x$ (i.e., $U(y) = x$). Since each input may yield only one output, there exists a one-to-one mapping from the $x$'s with $K(x) = i$ to $i$-bit long strings (i.e., strings $y$ satisfying $U(y) = x$). Thus, $|\{x \in \{0,1\}^* : K(x) = i\}| \leq |\{y \in \{0,1\}^i : U(y) \in \{0,1\}^*\}| \leq 2^i$, and $|\{x \in \{0,1\}^n : K(x) \leq k\}| \leq 2^{k+1} - 1$ follows (for all $k$'s, and in particular for $k = n-1$ and $k = n-2$).
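The counting argument of Part 3 can be checked empirically for the toy machine above; this is a sketch under the same illustrative assumptions, reusing M_toy from the previous code block. Fewer than $2^{k+1}$ programs have length at most $k$, and each yields at most one output, so at most $2^{k+1} - 1$ strings can have complexity at most $k$.

```python
from itertools import product

def outputs_of_short_programs(k: int):
    """The set of strings produced by some program of length <= k,
    under M_toy from the previous sketch."""
    out = set()
    for length in range(k + 1):
        for bits in product("01", repeat=length):
            x = M_toy("".join(bits))
            if x is not None:
                out.add(x)
    return out

# There are 2^0 + ... + 2^k = 2^(k+1) - 1 programs of length at most k,
# so at most 2^(k+1) - 1 strings have K_toy(x) <= k.
for k in range(1, 11):
    assert len(outputs_of_short_programs(k)) <= 2 ** (k + 1) - 1
print("counting bound verified for k = 1..10")
```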

The main computational property of Kolmogorov Complexity is that it is not computable. That is:

Theorem 1.3 (KC, a computational property): The Kolmogorov Complexity function $K$ defined above is not computable.

The proof is outlined below, and it is very much related to resolving the initial paradox.

Resolving the paradox. The paradox may be recast as follows. For every natural number $n$, we define $x_n \in \{0,1\}^*$ to be the largest string (according to the standard lexicographic order) that has Kolmogorov Complexity at most $n$; that is, $x_n \stackrel{\rm def}{=} \max_x \{x : K(x) \leq n\}$. The paradox implicitly (and wrongly) presupposes that there exists an input $y_n$ of length $O(\log n) \ll n$ that makes the universal machine output $x_n$ (i.e., $x_n = U(y_n)$). (Above, we assumed that the string described in the paradox can be written in an adequate language (i.e., one allowing explicit reconstruction) using fewer than 1000 symbols.)

The paradox is actually a proof that $x_n$ cannot be produced by the universal machine on an input that is much shorter than $n$. This is proved by considering the string $x'_n$ defined as the successor of $x_n$. Observe that if $y_n$ is a shortest input that makes $U$ produce $x_n$, then there is an input $y'_n$ that makes $U$ produce $x'_n$, where $y'_n$ encodes a program that first invokes the program encoded in $y_n$ and then invokes the constant-size program for computing successors (on the result). Thus, $K(x'_n) \leq |y'_n| \leq |y_n| + O(1) = K(x_n) + O(1) \ll n$ (using the contradiction hypothesis $K(x_n) \ll n$), which contradicts the definition of $x_n$ (since $x'_n$ exceeds $x_n$ yet has Kolmogorov Complexity at most $n$).

We comment that the wrong intuition regarding the existence of short programs for generating $x_n$ is implicitly based on the wrong assumption that, given $n$, we can effectively enumerate all strings having Kolmogorov Complexity less than $n$. (Note that if that were possible then we could compute $K(x)$ by enumerating all strings having Kolmogorov Complexity less than $i$, for $i = 1, \ldots, |x| + O(1)$. This does NOT prove that $K$ is not computable, because the reduction is in the wrong direction.)

Outline for the proof of Theorem 1.3: For every $n$, consider the string $z_n \stackrel{\rm def}{=} \min_z \{z : K(z) > n\}$. (Note that $z_n$ is well-defined, because there exist strings with Kolmogorov Complexity greater than $n$.) It is easy to show that if $K$ is computable then there exists an input of length $\log_2 n + O(1)$ that makes the universal machine produce $z_n$: such an input encodes $n$ in binary, together with a constant-size program that evaluates $K$ on successive strings until finding one with complexity exceeding $n$. Thus, $K(z_n) \leq \log_2 n + O(1)$, which (for sufficiently large $n$) is impossible (because $K(z_n) > n$ by the definition of $z_n$).

A related exercise: For any unbounded function $f : \{0,1\}^* \to \mathbb{N}$, define $z_n^f \stackrel{\rm def}{=} \min_z \{z : f(z) > n\}$. (Note that $z_n^f$ is well-defined because $f$ is unbounded.) Prove that if $f$ is computable then there exists an input of length $\log_2 n + O(1)$ that makes the universal machine produce $z_n^f$. (Hint: given $n$, we can generate $z_n^f$ by computing $f$ on finitely many inputs; i.e., on all $z$'s that precede $z_n^f$ in lexicographic order, as well as on $z_n^f$ itself.)
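The hint in this exercise can be turned into code. The sketch below uses the computable and unbounded stand-in $f(z) = |z|$ (the function $K$ itself would not do, being uncomputable), and it takes the minimum under the shortlex order (length first, then lexicographic), which is one natural reading of the order assumed above. The point is that $z_n^f$ is produced from the roughly $\log_2 n$-bit description of $n$ alone, which is exactly the step underlying the proof outline of Theorem 1.3.

```python
from itertools import count, product

def shortlex():
    """All binary strings in shortlex order: '', '0', '1', '00', '01', ..."""
    yield ""
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def z(f, n: int) -> str:
    """The first string z (in shortlex order) with f(z) > n. Well-defined
    since f is unbounded, and computable from n alone since f is computable.
    Were K computable, z(K, n) would describe a string of complexity
    greater than n using only log2(n) + O(1) bits: a contradiction."""
    for s in shortlex():
        if f(s) > n:
            return s

print(z(len, 5))  # '000000', the first string whose length exceeds 5
```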

2 Probabilistic Finite Automata: Two-Sided versus One-Sided Error

Probabilistic computation differs from non-deterministic computation in that the former is concerned with the quantity of accepting computations, whereas the latter is only concerned with their existence. That is, a non-deterministic machine $M$ is said to accept (or non-deterministically accept) the set $L$ if

- For every $x \in L$ there exists a computation of $M(x)$ that halts in an accepting state.
- For every $x \notin L$ there does not exist a computation of $M(x)$ that halts in an accepting state.

In contrast, a two-sided error probabilistic machine $M$ is said to accept (or probabilistically accept) the set $L$ if

- For every $x \in L$ a strict majority of the computations of $M(x)$ halt in an accepting state.
- For every $x \notin L$ a strict majority of the computations of $M(x)$ halt in a rejecting state.

We stress that (unlike in standard complexity-theoretic treatments) we only require a separation between the two cases, rather than a significant "separation gap"; that is, we only require a strict majority in each direction, rather than asking for a special majority (e.g., a 2/3-majority). We also consider one-sided error probabilistic machines. Such a machine is said to accept the set $L$ (with one-sided error on yes-instances) if

- For every $x \in L$ a strict majority of the computations of $M(x)$ halt in an accepting state.
- For every $x \notin L$ there does not exist a computation of $M(x)$ that halts in an accepting state.

Throughout the lecture we focus on finite automata (i.e., $M$ above is a finite automaton). We mention that similar phenomena (regarding two-sided error that is not bounded away from 1/2) seem to occur also in other models.

The power of one-sided error. Observe that acceptance by probabilistic machines with one-sided error on yes-instances is never stronger than acceptance by non-deterministic machines (and, for standard complexity classes, it is believed to be strictly weaker). In our setting of finite automata, both models coincide with deterministic machines.

The power of two-sided error. In contrast to the above, we will show that a probabilistic machine with two-sided error can accept non-regular sets, and thus this model is more powerful than non-deterministic machines. Specifically, we will show a probabilistic machine with two-sided error that accepts the set $\{w \in \{0,1\}^* : \#_0(w) < \#_1(w)\}$, where $\#_\sigma(w)$ denotes the number of $\sigma$'s in $w$.

The basic idea is to let the machine scan the input while tossing a coin for each 0-symbol it sees, and maintain a record of whether all these coin tosses turned out to be heads. (This record can be encoded in the machine's state.) Similarly (and in parallel), while scanning the input, the machine tosses a coin for each 1-symbol it sees, and maintains a record of whether all these coin tosses turned out to be heads. (The latter fact is recorded in a separate part of the state.) We say that a 0-win occurred if all coins tossed for 0-symbols turned out to be heads; similarly, we define the notion of a 1-win. Note that we may have both a 0-win and a 1-win, or neither a 0-win nor a 1-win (the latter being the most likely case for most sufficiently long inputs).

The probability that a 0-win (resp., a 1-win) occurs on input $x$ is exactly $2^{-\#_0(x)}$ (resp., $2^{-\#_1(x)}$). Thus, if $\#_0(x) < \#_1(x)$ then the probability of a 0-win is larger by a factor of at least two than the probability of a 1-win, whereas if $\#_0(x) \geq \#_1(x)$ then the probability of a 0-win is at most the probability of a 1-win. This motivates the following procedure.

1. We run the process described above, while recording whether a 0-win and/or a 1-win has occurred.
2. If either no win has occurred or both wins have occurred, then we accept with probability exactly 1/2.
3. Otherwise (i.e., exactly one win has occurred), we decide as follows: if a 0-win has occurred then we accept, else (i.e., a 1-win has occurred) we reject.

Note that on input $x$, we reach Step 3 with probability exactly

$\rho(x) \stackrel{\rm def}{=} 2^{-\#_0(x)} \cdot (1 - 2^{-\#_1(x)}) + 2^{-\#_1(x)} \cdot (1 - 2^{-\#_0(x)})$,

which may be exponentially vanishing with $|x|$. Still, conditioned on reaching Step 3, we accept with probability

$\Pr[\mbox{0-win} \mid \mbox{exactly one win}] = \frac{2^{-\#_0(x)} \cdot (1 - 2^{-\#_1(x)})}{2^{-\#_0(x)} \cdot (1 - 2^{-\#_1(x)}) + 2^{-\#_1(x)} \cdot (1 - 2^{-\#_0(x)})}$,

which is at least $\frac{2^{-\#_0(x)}}{2^{-\#_0(x)} + 2^{-\#_1(x)}} = \frac{1}{1 + 2^{\#_0(x) - \#_1(x)}}$ when $\#_0(x) < \#_1(x)$, and at most that expression when $\#_0(x) \geq \#_1(x)$. (Verifying these bounds is left as an exercise.) Let us denote $\delta(x) \stackrel{\rm def}{=} \#_0(x) - \#_1(x)$. Thus, if $\#_0(x) < \#_1(x)$ (i.e., $\delta(x) \leq -1$) then, conditioned on reaching Step 3, we accept with probability at least $1/(1 + 2^{\delta(x)}) \geq 1/(1 + 0.5) = 2/3$. On the other hand, if $\#_0(x) \geq \#_1(x)$ (i.e., $\delta(x) \geq 0$) then, conditioned on reaching Step 3, we accept with probability at most $1/(1 + 2^{\delta(x)}) \leq 1/(1 + 1) = 1/2$. It follows that if $\#_0(x) < \#_1(x)$ (resp., $\#_0(x) \geq \#_1(x)$) then the above finite automaton accepts with probability at least $(1 - \rho(x)) \cdot \frac{1}{2} + \rho(x) \cdot \frac{2}{3} \geq \frac{1}{2} + \exp(-|x|)$ (resp., at most $(1 - \rho(x)) \cdot \frac{1}{2} + \rho(x) \cdot \frac{1}{2} = \frac{1}{2}$).

We conclude that $x$'s in the language are accepted with probability strictly greater than 1/2, whereas $x$'s not in the language are rejected with probability at least 1/2. This almost meets the definition of two-sided probabilistic acceptance; all that is needed is to "shift the acceptance probabilities" a little. Towards doing so, observe that $x$'s in the language are actually accepted with probability at least $(1 - \rho(x)) \cdot \frac{1}{2} + \rho(x) \cdot \frac{2}{3} = \frac{1}{2} + \frac{\rho(x)}{6}$, where $\rho(x) \geq 2^{-\#_0(x)} \cdot (1 - 2^{-\#_1(x)}) > 2^{-(|x|/2)-1}$ (because such $x$'s satisfy $\#_0(x) < |x|/2$ and $\#_1(x) \geq 1$). Thus, if we unconditionally decrease the acceptance probability of each $x$ by $2^{-|x|}$ (or so), then we will be fine. This can be achieved by performing the above process in parallel with tossing a coin for each input bit (regardless of its value), and rejecting if all of the latter coins turned out to be heads.
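The following Monte Carlo sketch simulates the basic procedure (before the final probability shift). Python's random.random() stands in for the automaton's coin tosses, and the two boolean flags correspond to the two parts of the finite state.

```python
import random

def basic_run(x: str) -> bool:
    """One run of the basic procedure for {w : #0(w) < #1(w)}.
    Only two 'all heads so far?' flags are carried along the scan,
    i.e., constant memory, as befits a probabilistic finite automaton."""
    win0 = win1 = True
    for symbol in x:
        if symbol == "0":
            win0 = win0 and (random.random() < 0.5)  # one coin per 0-symbol
        else:
            win1 = win1 and (random.random() < 0.5)  # one coin per 1-symbol
    if win0 == win1:                  # Step 2: no win, or both wins
        return random.random() < 0.5  # accept with probability exactly 1/2
    return win0                       # Step 3: accept iff the single win is a 0-win

def acceptance_rate(x: str, trials: int = 200_000) -> float:
    return sum(basic_run(x) for _ in range(trials)) / trials

# '011' is in the language (exact acceptance probability 5/8), '001' is not
# (3/8), and '0011' sits on the boundary #0 = #1 (exactly 1/2).
print(acceptance_rate("011"), acceptance_rate("001"), acceptance_rate("0011"))
```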

A more complex example

The above example of a non-regular set accepted by a probabilistic finite automaton was suggested in class by Amos Gilboa, after I had presented the following more complex example. Specifically, I showed a probabilistic machine with two-sided error that accepts the (non-regular) set $\{w \in \{0,1\}^* : \#_0(w) = \#_1(w)\}$.

Following the above discussion, observe that if $\#_0(x) = \#_1(x)$ then the probability of a 0-win equals the probability of a 1-win, whereas if $\#_0(x) \neq \#_1(x)$ then these probabilities are at least a factor of 2 away from one another. Thus, to decide membership in the language, we should approximate both probabilities, or rather the ratio between them. This cannot be done by running the above procedure, which provides us (in case we reach Step 3) with the result of a single lottery with odds $2^{-\#_0(x)} : 2^{-\#_1(x)}$ for 0-win versus 1-win. In order to approximate the odds (or rather to distinguish the 50:50 case from all other cases), we need to obtain several results of the same lottery. This motivates the following procedure.

1. We run 1000 (parallel) copies of the basic procedure (described above), where each copy records whether a 0-win and/or a 1-win has occurred.
2. If in at least one of these 1000 copies either no win has occurred or both wins have occurred, then we accept with probability exactly 1/2.
3. Otherwise (i.e., in each of the 1000 copies exactly one win has occurred), we decide as follows: if the number of 0-wins is between 400 and 600 then we accept, else we reject.

Note that on input $x$, we reach Step 3 with probability exactly

$\rho'(x) \stackrel{\rm def}{=} \left( 2^{-\#_0(x)} \cdot (1 - 2^{-\#_1(x)}) + 2^{-\#_1(x)} \cdot (1 - 2^{-\#_0(x)}) \right)^{1000}$,

which may be exponentially vanishing with $|x|$. Still, conditioned on reaching Step 3, if $\#_0(x) = \#_1(x)$ then we accept with high probability (e.g., higher than 2/3), whereas if $\#_0(x) \neq \#_1(x)$ then we accept with low probability (e.g., lower than 1/3). Proving the last two statements is left as an exercise. It follows that if $\#_0(x) = \#_1(x)$ (resp., $\#_0(x) \neq \#_1(x)$) then the above finite automaton accepts with probability at least $(1 - \rho'(x)) \cdot \frac{1}{2} + \rho'(x) \cdot \frac{2}{3} > \frac{1}{2} + \exp(-O(|x|))$ (resp., at most $(1 - \rho'(x)) \cdot \frac{1}{2} + \rho'(x) \cdot \frac{1}{3} < \frac{1}{2} - \exp(-O(|x|))$).

Exercise: Present a probabilistic machine with two-sided error that accepts the set $\{0^n 1^n : n \in \mathbb{N}\}$. Do the same for $\{0^{an} 1^{bn+c} : n \in \mathbb{N}\}$, where $a$, $b$ and $c$ are fixed (positive) integers. Finally, show that the restriction on ($a$, $b$ and) $c$ being positive can be removed.
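A sketch of the many-copy procedure appears below. To keep it simulatable, the number of copies is a parameter and the acceptance window [400, 600] is scaled to [0.4k, 0.6k]; with the lecture's k = 1000, Step 3 is essentially never reached in simulation, since $\rho'(x)$ is astronomically small even for short inputs.

```python
import random

def one_copy(x: str):
    """One coin-tossing scan of x; returns (0-win occurred?, 1-win occurred?)."""
    win0 = win1 = True
    for symbol in x:
        if symbol == "0":
            win0 = win0 and (random.random() < 0.5)
        else:
            win1 = win1 and (random.random() < 0.5)
    return win0, win1

def equality_run(x: str, copies: int = 1000) -> bool:
    """One run of the procedure for {w : #0(w) = #1(w)}. The lecture fixes
    copies = 1000 with acceptance window [400, 600]; here the window is
    scaled to [0.4 * copies, 0.6 * copies] so small values of `copies`
    can be simulated."""
    results = [one_copy(x) for _ in range(copies)]
    if any(w0 == w1 for (w0, w1) in results):  # Step 2: a copy with no win or both wins
        return random.random() < 0.5
    zero_wins = sum(1 for (w0, _) in results if w0)
    return 0.4 * copies <= zero_wins <= 0.6 * copies  # Step 3

# With copies = 5, the bias away from 1/2 is tiny (rho' is small by design),
# which is exactly why the definition only demands a strict, rather than a
# bounded-away, majority: expect roughly 0.504 for '01' and 0.495 for '011'.
trials = 400_000
for x in ("01", "011"):
    rate = sum(equality_run(x, copies=5) for _ in range(trials)) / trials
    print(x, round(rate, 4))
```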