Kolmogorov Complexity
Carleton College, CS 254, Fall 2013, Prof. Joshua R. Davis
Based on Sipser, Introduction to the Theory of Computation

1. Introduction

Kolmogorov complexity is a theory of lossless data compression. It ponders the existence of compression/decompression schemes, in which long strings of data are compressed into short strings that can be decompressed back into their longer versions. Kolmogorov complexity is also a theory of information. Intuitively, the length of a compressed string is a measure of how much information the uncompressed string contains. To get an idea of what we mean, consider the uncompressed strings

0101010101010101010101010101010101010101010101010101010101010101010101,
0111101011000010100000101111111111101001101010100000000110111100000101.

Both strings have length 70. The first string can be described succinctly: repeat 01 35 times. In contrast, the second string has no apparent pattern. There is no obvious way to describe it, other than just laying out the whole string. It seems to contain more information than the first string does.

We begin with a minimal assumption: the process of decompressing a compressed string into its uncompressed version should be algorithmic. Therefore, we define a decompression scheme to be a deterministic Turing machine $F$ that halts on all inputs. This $F$ takes in an input string $w$, which we regard as the compressed string, and outputs the corresponding uncompressed string $F(w)$ (as the contents of its tape upon halting). For any $x$, any string $w$ such that $F(w) = x$ is called a description of $x$. Let $d_F(x)$ be the minimal description of $x$: the shortest description, with ties broken by lexicographic order. The Kolmogorov complexity $K_F(x) = |d_F(x)|$ is the length of the minimal description of $x$.

Throughout this tutorial, all strings are taken over the alphabet $\{0, 1\}$. Whenever we need to represent a non-negative integer $m$ as a string $\langle m \rangle$, we do so in binary. In particular, $|\langle m \rangle| \le \log_2 m + 1$ for all $m \ge 1$. These choices have no serious effect on the theory.

Theorem 1.1. Under any scheme $F$, for any string length $m$, there are incompressible strings of length $m$, meaning strings $x$ such that $K_F(x) \ge |x|$.

Proof. Let $m \ge 1$ be any positive string length. There are $2^m - 1$ strings of length less than $m$, which, when fed to $F$, can produce at most $2^m - 1$ decompressed strings. But there are $2^m$ strings of length $m$. Therefore, at least one string of length $m$ has no description of length less than $m$; that string is incompressible.
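Theorem 1.1 is concrete enough to verify by brute force, at least for small lengths. Here is a minimal sketch in Python (the toy scheme `F` and the exhaustive search are illustrative choices of ours, not part of the formal development), which inverts a small decompression scheme by trying every description:

```python
from itertools import product

def F(w: str) -> str:
    """A toy decompression scheme: a total function on binary strings.
    A leading 1 means "double the rest"; a leading 0 means "the rest, verbatim"."""
    if w.startswith("1"):
        return w[1:] * 2
    if w.startswith("0"):
        return w[1:]
    return ""  # the empty description decompresses to the empty string

def all_strings(max_len: int):
    """Every binary string of length 0 through max_len, shortest first."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def K_F(x: str, max_len: int) -> int:
    """Brute-force Kolmogorov complexity under F: the length of the shortest
    w with F(w) = x. Every x has the description '0' + x, so searching up to
    max_len = len(x) + 1 always succeeds."""
    return min(len(w) for w in all_strings(max_len) if F(w) == x)

# Theorem 1.1 by exhaustive check: the 2^m - 1 descriptions of length less
# than m cannot cover all 2^m strings of length m, so every length has at
# least one incompressible string.
for m in range(1, 7):
    count = sum(1 for x in all_strings(m)
                if len(x) == m and K_F(x, m + 1) >= m)
    print(f"length {m}: {count} incompressible strings")
```

Under this weak scheme most short strings turn out to be incompressible; the theorem only promises that at least one string of each length is.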

2. The default scheme

Let's define a particular decompression scheme $F$. This $F$ takes as input $\langle M, w \rangle$, where $M$ is a Turing machine and $w$ is an input for $M$, all encoded into bits according to some pre-arranged system. This $F$ runs $M$ on $w$ until it halts (if ever). Then $F$ halts, with its tape containing the same contents as $M$'s final tape.

It is conventional to assume that the encoding is such that $w$ is given explicitly at the end of $\langle M, w \rangle$. That is, $\langle M, w \rangle$ consists of some encoding of $M$, followed by some separator mark, followed by $w$. In particular, $|\langle M, w \rangle| = |\langle M, \rangle| + |w|$. This scheme will be our default. When we write $d$ and $K$ without any subscript, we mean this scheme.

Example 2.1. Let $M$ be the Turing machine that, on input $w$, produces 35 concatenated copies of $w$ on its tape, and then halts. Then $\langle M, 01 \rangle$ is a description of the first 70-bit string given above. Its length is $|\langle M, 01 \rangle| = |\langle M, \rangle| + 2$.

Our next example expresses the idea that $d(x)$ should never be much longer than $x$ itself, because "here is the string $x$" should always be a description of $x$.

Example 2.2. There exists a constant $c$ such that for all $x$, $K(x) \le c + |x|$.

Proof. Let $M$ be a Turing machine that immediately halts. Let $c = |\langle M, \rangle|$. Then, for any string $x$, $\langle M, x \rangle$ is a description of $x$, of length $|\langle M, x \rangle| = |\langle M, \rangle| + |x| = c + |x|$. Thus the minimal description of $x$ can be no longer than $c + |x|$, and $K(x) \le c + |x|$.

This next example says that $xx$ should not require much more description than $x$.

Example 2.3. There exists a constant $c$ such that for all $x$, $K(xx) \le c + |x|$.

Proof. Let $M$ be a Turing machine that repeats its input twice on its tape and then halts. Let $c = |\langle M, \rangle|$. Then, for any string $x$, $\langle M, x \rangle$ is a description of $xx$, of length $c + |x|$.

Now that we have a taste for how compression and decompression work, let's prove a result that says that our default scheme is about as good as any other.

Theorem 2.4. For any scheme $F$ there exists a constant $c$ such that for all $x$, $K(x) \le c + K_F(x)$.

Proof. Let $c = |\langle F, \rangle|$. Then $\langle F, d_F(x) \rangle$ is a description of $x$ in the default scheme. Its length is $|\langle F, \rangle| + |d_F(x)| = c + K_F(x)$.
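To make the constant-overhead bookkeeping of Examples 2.1 through 2.3 tangible, here is a loose analogue in Python, with `eval` standing in for the universal decompressor and Python expressions standing in for descriptions $\langle M, w \rangle$ (an analogy of ours, not the formal scheme):

```python
def decompress(description: str) -> str:
    """The 'scheme': evaluate the description and take its value as output."""
    return eval(description)

x = "0111101011000010100000101111111111101001101010100000000110111100000101"

# Example 2.2: "here is the string x" is always a description of x,
# so K(x) <= c + |x|, where c covers the quoting overhead.
d1 = repr(x)
assert decompress(d1) == x
print(len(d1) - len(x))      # the constant c: 2 quote characters

# Example 2.3: xx needs only constantly more description than x.
d2 = "2*" + repr(x)
assert decompress(d2) == x + x
print(len(d2) - len(x))      # constant overhead, independent of x

# Example 2.1 in the same spirit: the first 70-bit string from the intro.
d3 = "35*" + repr("01")
assert decompress(d3) == "01" * 35
print(len(d3))               # a 7-character description of a 70-bit string
```

The quoting and the `2*` prefix play the role of the constant $c = |\langle M, \rangle|$: they do not grow with $x$.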

3. Intermediate Results

Henceforth we shall work only with our default scheme. For any $c \ge 0$, we say that a string $x$ is incompressible by $c$ if $K(x) > |x| - c$. The notion of incompressibility introduced earlier is incompressibility by 1. This next theorem gets at the idea that $d(x)$, being the minimal description of $x$, should itself be incompressible.

Theorem 3.1. There exists a constant $c$ such that for all $x$, $d(x)$ is incompressible by $c$.

Proof. Let $N$ be a Turing machine that, on input $\langle M, w \rangle$, performs the following steps.

(1) Run $M$ on $w$.
(2) If the output of $M$ is not of the form $\langle P, y \rangle$, then reject.
(3) If the output is of the form $\langle P, y \rangle$, then run $P$ on $y$ and halt with that output.

Let $c = |\langle N, \rangle| + 1$. Now suppose, for the sake of contradiction, that $x$ is a string such that $d(x)$ is compressible by $c$. Then $|d(d(x))| \le |d(x)| - c$. But $\langle N, d(d(x)) \rangle$ is a description of $x$, and its length is
$$|\langle N, d(d(x)) \rangle| = |\langle N, \rangle| + |d(d(x))| \le (c - 1) + |d(x)| - c = |d(x)| - 1.$$
Therefore $K(x) \le |d(x)| - 1$, which contradicts the definition $K(x) = |d(x)|$.

Recall that this whole theory is founded on a mild assumption: that decompression should be algorithmic. Under that assumption, this next theorem shows that optimal compression cannot be algorithmic. (Perhaps we should place more constraints on decompression, to arrive at a theory in which decompression and optimal compression are both algorithmic?)

Theorem 3.2. The Kolmogorov complexity is not computable. In other words, there does not exist a Turing machine $M$ that, given any input $x$, halts with $\langle K(x) \rangle$ on its tape.

Proof. Suppose, for the sake of contradiction, that such an $M$ exists. Build a decider $N$ that, on input $\langle m \rangle$, outputs some string $x$ satisfying $K(x) \ge m$. ($N$ tries all strings $x$ of length $m$, using $M$ to compute $K(x)$, until it finds an $x$ such that $K(x) \ge m$. Our first theorem guarantees that such an $x$ will be found.) Now let $m$ be a number large enough that $m - \log_2 m - 1 > |\langle N, \rangle|$, and let $x$ be the output of $N$ on input $\langle m \rangle$. Then $\langle N, \langle m \rangle \rangle$ is a description of $x$, of length
$$|\langle N, \langle m \rangle \rangle| = |\langle N, \rangle| + |\langle m \rangle| < (m - \log_2 m - 1) + (\log_2 m + 1) = m.$$
So $K(x) < m$. On the other hand, $K(x) \ge m$, by the definition of $N$. This contradiction implies that $K$ is not computable.
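The proof's machine $N$ translates directly into code, except for the one subroutine that cannot exist. In the sketch below, the function `K` is a hypothetical stand-in for the machine $M$ assumed in the proof; the point of the theorem is precisely that no such function can ever be filled in:

```python
from itertools import product

def K(x: str) -> int:
    """Hypothetical subroutine computing Kolmogorov complexity. Theorem 3.2
    shows it cannot exist; it appears here only to express the machine N."""
    raise NotImplementedError("no computable K exists")

def N(m: int) -> str:
    """Output some string x with K(x) >= m, assuming K were computable.
    Theorem 1.1 guarantees an incompressible x of length m, so the loop
    would terminate."""
    for bits in product("01", repeat=m):
        x = "".join(bits)
        if K(x) >= m:
            return x

# The contradiction: <N, <m>> describes N(m) using at most
# |<N,>| + log2(m) + 1 bits, yet K(N(m)) >= m by construction. For any m
# with m - log2(m) - 1 > |<N,>|, the description is shorter than m.
```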

4. Random Strings

In this section, we explain the notion introduced earlier, that a random string has no pattern and hence should not be compressible. A property of strings over $\{0, 1\}$ is a function $f : \{0, 1\}^* \to \{\text{True}, \text{False}\}$. A property $f$ is said to hold for almost all strings if
$$\lim_{n \to \infty} \frac{\#\{x : |x| = n, f(x) = \text{False}\}}{\#\{x : |x| = n\}} = 0.$$
Intuitively, $f$ is True for typical strings $x$ and False for special cases of $x$. If you select a string $x$ randomly, then $f(x) = \text{True}$ with high probability. As $n \to \infty$, the probability that a randomly chosen string $x$ of length $n$ will have $f(x) = \text{True}$ goes to 1. Examples of such $f$ include:

"$x$ contains at least 40% 0s and at least 40% 1s";
"the longest run of 0s in $x$ has length between $0.5 \log_2 |x|$ and $1.5 \log_2 |x|$".

This notion allows us to investigate properties of random strings without really doing any probability theory. The following purely mathematical lemma shows that we can replace $=$ with $\le$ in certain parts of the above definition. Sipser uses this fact without proof. You may want to skip the proof on a first reading.

Lemma 4.1. Let $f$ be a property that holds for almost all strings. Then
$$\lim_{n \to \infty} \frac{\#\{x : |x| \le n, f(x) = \text{False}\}}{\#\{x : |x| \le n\}} = 0.$$

Proof. Let $\epsilon > 0$. We wish to show that there exists $N$ such that for all $n \ge N$,
$$\frac{\#\{x : |x| \le n, f(x) = \text{False}\}}{\#\{x : |x| \le n\}} < \epsilon.$$
For the sake of brevity, let $L_n = \#\{x : |x| = n, f(x) = \text{False}\}$. Because $f$ holds for almost all strings, there exists an $M$ such that for all $n > M$,
$$\frac{\#\{x : |x| = n, f(x) = \text{False}\}}{\#\{x : |x| = n\}} < \frac{\epsilon}{2}.$$
That is, $L_n < \frac{\epsilon}{2} 2^n$ for all $n > M$. Pick $N > M$ large enough so that
$$\sum_{i=0}^{M} L_i < \frac{\epsilon}{2} \left( 2^{N+1} - 1 \right).$$
Then for all $n \ge N$,
$$\#\{x : |x| \le n, f(x) = \text{False}\} = \sum_{i=0}^{M} L_i + \sum_{i=M+1}^{n} L_i < \frac{\epsilon}{2} \left( 2^{N+1} - 1 \right) + \sum_{i=M+1}^{n} \frac{\epsilon}{2} 2^i < \frac{\epsilon}{2} \left( 2^{N+1} - 1 \right) + \frac{\epsilon}{2} \left( 2^{n+1} - 1 \right) \le \epsilon \left( 2^{n+1} - 1 \right) = \epsilon \cdot \#\{x : |x| \le n\}.$$
Dividing by $\#\{x : |x| \le n\}$ gives the desired bound. This proves the lemma.
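The lemma, and the definition it strengthens, can be checked empirically for a concrete property. The sketch below is exhaustive, so only small lengths are feasible; the 25% threshold is a softened variant of the 40% example of our choosing, picked because its failure ratio shrinks visibly at lengths small enough to enumerate:

```python
from itertools import product

def f(x: str) -> bool:
    """A property holding for almost all strings: at least 25% 0s and 25% 1s."""
    return x.count("0") >= 0.25 * len(x) and x.count("1") >= 0.25 * len(x)

def failure_ratios(n: int):
    """The two ratios: failures among strings of length exactly n (the
    definition) and among strings of length at most n (Lemma 4.1)."""
    fails = [sum(1 for bits in product("01", repeat=i) if not f("".join(bits)))
             for i in range(n + 1)]
    return fails[n] / 2**n, sum(fails) / (2**(n + 1) - 1)

for n in (4, 8, 12, 16):
    exact, cumulative = failure_ratios(n)
    print(f"n={n}: exact {exact:.4f}, cumulative {cumulative:.4f}")
```

Both ratios shrink as $n$ grows, as the definition and the lemma respectively assert.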

The following theorem says, roughly, that long incompressible strings have every property that holds for almost all strings. In this sense, they are random.

Theorem 4.2. Let $f$ be a computable property that holds for almost all strings. Let $c \ge 1$. Then there exists an $N$ such that $f(x) = \text{True}$ for all $x$ such that $|x| \ge N$ and $x$ is incompressible by $c$.

Proof. If $f$ is False on only finitely many strings, then $f$ is True for all sufficiently long strings, and the theorem is obviously true. Henceforth assume that $f$ is False on infinitely many strings. Denote these strings $s_0, s_1, s_2, \ldots$ in lexicographic order (shorter strings first). For any string $x$ in the sequence $s_0, s_1, s_2, \ldots$, let $i_x$ be its index in the list. That is, $i_x$ is the unique number such that $s_{i_x} = x$. Let $M$ be a Turing machine that on input $\langle i \rangle$ outputs $s_i$. (How would you design $M$, using the fact that $f$ is computable?) Then $\langle M, \langle i_x \rangle \rangle$ is a description of $x$.

Fix $c \ge 1$. By the lemma, there exists a large $N$ so that for all $n \ge N$,
$$\frac{\#\{x : |x| \le n, f(x) = \text{False}\}}{\#\{x : |x| \le n\}} < \frac{1}{2^{c + |\langle M, \rangle| + 2}}.$$
Using the fact that $\#\{x : |x| \le n\} = 2^{n+1} - 1$, we have
$$\#\{x : |x| \le n, f(x) = \text{False}\} < \frac{2^{n+1}}{2^{c + |\langle M, \rangle| + 2}} = 2^{n - c - |\langle M, \rangle| - 1}.$$
If $x$ is any string of length $n \ge N$ such that $f(x) = \text{False}$, then every string preceding $x$ in the enumeration also has length at most $n$, so
$$i_x < 2^{n - c - |\langle M, \rangle| - 1}$$
and
$$|\langle i_x \rangle| \le \log_2 i_x + 1 \le n - c - |\langle M, \rangle|.$$
Therefore
$$K(x) \le |\langle M, \langle i_x \rangle \rangle| = |\langle M, \rangle| + |\langle i_x \rangle| \le |\langle M, \rangle| + n - c - |\langle M, \rangle| = n - c.$$
So $x$ is compressible by $c$. In other words, any $x$ of length at least $N$ that is incompressible by $c$ satisfies $f(x) = \text{True}$.
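For a concrete computable $f$, the machine $M$ of this proof, and the short index descriptions it enables, look like the following sketch (one answer to the design question posed above; the 25% property and the helper `index_of` are our illustrative choices):

```python
from itertools import count, product

def f(x: str) -> bool:
    """The same example property: at least 25% 0s and at least 25% 1s."""
    return x.count("0") >= 0.25 * len(x) and x.count("1") >= 0.25 * len(x)

def strings_in_order():
    """All binary strings, by length and then lexicographically."""
    for n in count(0):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def M(i: int) -> str:
    """The proof's machine M: output s_i, the i-th string (0-indexed)
    on which f is False."""
    seen = 0
    for x in strings_in_order():
        if not f(x):
            if seen == i:
                return x
            seen += 1

def index_of(x: str) -> int:
    """i_x: the index of x in the enumeration s_0, s_1, ...
    (x must satisfy f(x) = False)."""
    i = 0
    for s in strings_in_order():
        if s == x:
            return i
        if not f(s):
            i += 1

x = "1" * 16                 # fails f: no 0s at all
i_x = index_of(x)
assert M(i_x) == x
print(i_x, i_x.bit_length(), len(x))  # the index needs far fewer bits than |x|
```

Strings failing $f$ are rare, so the index $i_x$ (around 13 bits here, against a 16-bit string) is a genuinely shorter description, and the savings grow with length, exactly as the proof requires.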