Lecture 13: Foundations of Math and Kolmogorov Complexity

Similar documents
CS154, Lecture 12: Kolmogorov Complexity: A Universal Theory of Data Compression

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Finish K-Complexity, Start Time Complexity

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Computability Theory

q FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY. FLAC (15-453) Spring L. Blum. REVIEW for Midterm 2 TURING MACHINE

Introduction to Languages and Computation

TURING MAHINES

CSE 105 THEORY OF COMPUTATION

1 Showing Recognizability

,

CS154, Lecture 10: Rice s Theorem, Oracle Machines

Computability and Complexity

Lecture 11: Gödel s Second Incompleteness Theorem, and Tarski s Theorem

Decidability. Linz 6 th, Chapter 12: Limits of Algorithmic Computation, page 309ff

Turing Machine Recap

Most General computer?

Decidability and Undecidability

6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, Class 10 Nancy Lynch

CSE 555 Homework Three Sample Solutions

Section 14.1 Computability then else

CpSc 421 Homework 9 Solution

Theory of Computation Lecture Notes. Problems and Algorithms. Class Information

Theory of Computation

V Honors Theory of Computation

CISC 876: Kolmogorov Complexity

Chap. 4,5 Review. Algorithms created in proofs from prior chapters

Introduction to Turing Machines. Reading: Chapters 8 & 9

Theory of Computation (IX) Yijia Chen Fudan University

Lecture Notes: The Halting Problem; Reductions

CSCI3390-Lecture 6: An Undecidable Problem

Theory of Computation

Turing Machines. Lecture 8

Computable Functions

CSE 105 Theory of Computation

Decision Problems with TM s. Lecture 31: Halting Problem. Universe of discourse. Semi-decidable. Look at following sets: CSCI 81 Spring, 2012

Large Numbers, Busy Beavers, Noncomputability and Incompleteness

Lecture 16: Time Complexity and P vs NP

SD & Turing Enumerable. Lecture 33: Reductions and Undecidability. Lexicographically Enumerable. SD & Turing Enumerable

Computation. Some history...

17.1 The Halting Problem

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Turing Machines. Reading Assignment: Sipser Chapter 3.1, 4.2

Reading Assignment: Sipser Chapter 3.1, 4.2

1 Reducability. CSCC63 Worksheet Reducability. For your reference, A T M is defined to be the language { M, w M accepts w}. Theorem 5.

Gödel s Incompleteness Theorem. Overview. Computability and Logic

Undecidability. We are not so much concerned if you are slow as when you come to a halt. (Chinese Proverb)

CS20a: Turing Machines (Oct 29, 2002)

Computational Models Lecture 8 1

Computational Models Lecture 8 1

Turing Machines, diagonalization, the halting problem, reducibility

Harvard CS 121 and CSCI E-121 Lecture 14: Turing Machines and the Church Turing Thesis

Kolmogorov complexity and its applications

Turing Machines Part III

Computational Models Lecture 9, Spring 2009

CSCE 551 Final Exam, Spring 2004 Answer Key

Peano Arithmetic. CSC 438F/2404F Notes (S. Cook) Fall, Goals Now

CS 125 Section #10 (Un)decidability and Probability November 1, 2016

Lecture 14 Rosser s Theorem, the length of proofs, Robinson s Arithmetic, and Church s theorem. Michael Beeson

Decidability: Church-Turing Thesis

Chomsky Normal Form and TURING MACHINES. TUESDAY Feb 4

Kolmogorov complexity and its applications

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Decidability (What, stuff is unsolvable?)

Computational Models Lecture 8 1

CSE 105 THEORY OF COMPUTATION

CSE 311: Foundations of Computing. Lecture 27: Undecidability

2 Plain Kolmogorov Complexity

Undecidability COMS Ashley Montanaro 4 April Department of Computer Science, University of Bristol Bristol, UK

Introduction to Turing Machines

What languages are Turing-decidable? What languages are not Turing-decidable? Is there a language that isn t even Turingrecognizable?

Computation Histories

CSE 105 Theory of Computation

Reducability. Sipser, pages

The Turing machine model of computation

Gödel s First Incompleteness Theorem (excerpted from Gödel s Great Theorems) Selmer Bringsjord Intro to Logic May RPI Troy NY USA

7.1 The Origin of Computer Science

16.1 Countability. CS125 Lecture 16 Fall 2014

Lecture 23: Rice Theorem and Turing machine behavior properties 21 April 2009

problem X reduces to Problem Y solving X would be easy, if we knew how to solve Y

An example of a decidable language that is not a CFL Implementation-level description of a TM State diagram of TM

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Intro to Theory of Computation

Turing machines and linear bounded automata

1 The decision problem for First order logic

Computability Theory. CS215, Lecture 6,

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION. Spring 2018 review class

CSCE 551: Chin-Tser Huang. University of South Carolina

Computer Sciences Department

CPSC 421: Tutorial #1

15-251: Great Theoretical Ideas in Computer Science Fall 2016 Lecture 6 September 15, Turing & the Uncomputable

Decidable and undecidable languages

Decidability (intro.)

Undecidable Problems and Reducibility

Undecidability and Rice s Theorem. Lecture 26, December 3 CS 374, Fall 2015

CSE355 SUMMER 2018 LECTURES TURING MACHINES AND (UN)DECIDABILITY

ON COMPUTAMBLE NUMBERS, WITH AN APPLICATION TO THE ENTSCHENIDUGSPROBLEM. Turing 1936

Part I: Definitions and Properties

Transcription:

6.045 Lecture 13: Foundations of Math and Kolmogorov Complexity 1

Self-Reference and the Recursion Theorem 2

Lemma: There is a computable function q : Σ* Σ* such that for every string w, q(w) is the description of a TM P w that on every input, prints out w and then accepts Proof Define a TM Q: s w Q P w w 3

Theorem: There is a Self-Printing TM Proof: First define a TM B which does this: M B w P M M M Now consider the TM that looks like this: w B w P B B P B B B No explicit self-reference here! QED 4

The Recursion Theorem Theorem: For every TM T computing a function t : Σ* Σ* Σ* there is a Turing machine R computing a function r : Σ* Σ*, such that for every string w, r(w) = t(r, w) (a,b) T t(a,b) w R t(r,w) 5

Proof: (a,b) T t(a,b) Define M = N B N w B Y w T w P N N w N Define R: What is S? w P M M B S w T t(s,w) 6

Proof: (a,b) Define M = T M t(a,b) B What is M(M,w)? N w B Y w T w P M M w M Define R: S M S w P B T t(s,w) M w 7

Proof: (a,b) T t(a,b) M B w M PM w B Y w T Define R: S S = Y = R. QED M S w P B T t(s,w) M w 8

FOO x (y) := Output x and halt. BAR(M) := Output N(w) = Run FOO M outputting M. Run M on (M, w) Q(N, w) := Run BAR(N) outputting S. Run T on (S, w) R(w) := Run FOO Q outputting Q. Run BAR(Q) outputting S. Run T on (S, x) Claim: S is a description of R itself! S(w) = Run FOO Q outputting Q. Run Q on (Q, w)

FOO x (y) := Output x and halt. BAR(M) := Output N(w) = Run FOO M outputting M. Run M on (M, w) Q(N, w) := Run BAR(N) outputting S. Run T on (S, w) R(w) := Run FOO Q outputting Q. Run BAR(Q) outputting S. Run T on (S, w) Claim: S is a description of R itself! S(w) = Run FOO Q outputting Q. Run BAR(Q) outputting S. Run T on (S, w) Therefore R(w) = T(R, w)

For every computable t, there is a computable r such that r(w) = t(r,w) where R is a description of r Suppose we can design a TM T of the form: On input (x,w), do bla bla with x, do bla bla bla with w, etc. etc. We can then find a TM R with the behavior: On input w, do bla bla with R, do bla bla bla with w, etc. etc. We can use the operation: Obtain your own description in Turing machine pseudocode! 11

Theorem: A TM is undecidable Proof (using the recursion theorem) Assume H decides A TM Construct machine B such that on input w: 1. Obtains its own description B 2. Runs H on (B, w) and flips the output Running B on input w always does the opposite of what H says it should! A formalization of free will paradoxes! No single machine can predict behavior of all others 12

Turing Machine Minimization MIN = {M M is a minimal-state Turing machine} Theorem: MIN is unrecognizable! Proof: Suppose we could recognize MIN with TM M M(x) := Obtain description of M. For k = 1,2,, Run M on the TMs M 1, M k for k steps, until M accepts some M i with Q(M i ) > Q(M). Output M i (x). We have: 1. L M = L M i [by construction] 2. M has fewer states than M i 3. M i is minimal [by definition of MIN] CONTRADICTION! 13

Computability and the Foundations of Mathematics 17

Formal Systems of Mathematics A formal system describes a formal language for - writing (finite) mathematical statements, - has a definition of a proof of a statement Example: Every TM M defines some formal system F - {Mathematical statements in F } = * String w represents the statement M halts on w - A proof that M halts on w can be defined to be the computation history of M on w: the sequence of configurations C 0 C 1 C t that M goes through while computing on w Could sometimes prove M doesn t halt on w 18

Interesting Systems of Mathematics Define a formal system F to be interesting if: 1. Any mathematical statement about computation can be (computably) described as a statement of F. Given (M, w), there is a (computable) S M,w of F such that S M,w is true in F if and only if M accepts w. 2. Proofs are convincing a TM can check that a candidate proof of a theorem is correct This set is decidable: {(S, P) P is a proof of S in F } 3. If S is in F and there is a proof of S describable as a computation, then there s a proof of S in F. If M halts on w, then there s either a proof P of S M,w or a proof P of S M,w 19

Consistency and Completeness A formal system F is inconsistent if there is a statement S in F such that both S and S are provable in F F is consistent if it is NOT inconsistent A formal system F is incomplete if there is a statement S in F such that neither S nor S are provable in F F is complete if it is NOT incomplete 20

Limitations on Mathematics! For every consistent and interesting F, Theorem 1. (Gödel 1931) F must be incomplete! There are mathematical statements that are true but cannot be proved. Theorem 2. (Gödel 1931) The consistency of F cannot be proved in F. Theorem 3. (Church-Turing 1936) The problem of checking whether a given statement in F has a proof is undecidable. 21

Unprovable Truths in Mathematics (Gödel) Every consistent interesting F is incomplete: there are statements that cannot be proved or disproved. Let S M, w in F be true if and only if M accepts w Proof: Define TM G(w): 1. Obtain own description G [Recursion Theorem!] 2. For all strings P in lexicographical order, If (P is a proof of S G, w in F ) then reject If (P is a proof of S G, w in F ) then accept Note: If F is complete then G cannot run forever! 1. If (G accepts w) then have proof P of G doesn t accept w 2. If (G rejects w) then found proof P of G accepts w In either case, F is inconsistent! Proof of S G, w and S G, w 22

(Gödel 1931) The consistency of F cannot be proved within any interesting consistent F Proof: Assume we can prove F is consistent in F We constructed :S G, w = G does not accept w which is true, but has no proof in F G does not accept w There is no proof of :S G, w in F But if there s a proof of F is consistent in F, then there is a proof of :S G, w in F (here s the proof): If S G, w is true, then there is a proof in F of S G, w and a proof in in F of :S G, w But since F is consistent, this cannot be true. Therefore, :S G, w is true This contradicts the previous theorem! 23

Undecidability in Mathematics PROVABLE F = {S there s a proof in F of S, or there s a proof in F of :S} (Church-Turing 1936) For every interesting consistent F, PROVABLE F is undecidable Proof: Suppose PROVABLE F is decidable with TM P. Then we can decide A TM with the following procedure: On input (M, w), run the TM P on input S M,w If P accepts, examine all proofs in lex order If a proof of S M,w is found then accept If a proof of :S M,w is found then reject If P rejects, then reject. Why does this work? 24

Kolmogorov Complexity: A Universal Theory of Data Compression 25

The Church-Turing Thesis: Everyone s Intuitive Notion of Algorithms = Turing Machines This is not a theorem it is a falsifiable scientific hypothesis. A Universal Theory of Computation 26

A Universal Theory of Information? Can we quantify how much information is contained in a string? A = 01010101010101010101010101010101 B = 110010011101110101101001011001011 Idea: The more we can compress a string, the less information it contains. 27

Information as Description Thesis: The amount of information in a string x is the length of the shortest description of x Let x 2 {0,1}* How should we describe strings? Use Turing machines with inputs! Def: A description of x is a string <M,w> such that M on input w halts with only x on its tape. Def: The shortest description of x, denoted as d(x), is the lexicographically shortest string <M,w> such that M(w) halts with only x on its tape. 28

A Specific Pairing Function Theorem. There is a 1-1 computable function <,> : Σ* x Σ* Σ* and computable functions 1 and 2 : Σ* Σ* such that: z = <M,w> iff 1 (z) = M and 2 (z) = w Define: <M,w> := 0 M 1 M w (Example: <10110,101> = 00000110110101) Note that <M,w> = 2 M + w + 1 29

Kolmogorov Complexity (1960 s) Definition: The shortest description of x, denoted as d(x), is the lexicographically shortest string <M,w> such that M(w) halts with only x on its tape. Definition: The Kolmogorov complexity of x, denoted as K(x), is d(x). EXAMPLES?? Let s first determine some properties of K. Examples will fall out of this. 30

A Simple Upper Bound Theorem: There is a fixed c so that for all x in {0,1}* K(x) x + c The amount of information in x isn t much more than x Proof: Define a TM M = On input w, halt. On any string x, M(x) halts with x on its tape. Observe that <M,x> is a description of x. Let c = 2 M +1 Then K(x) <M,x> 2 M + x + 1 x + c 31

Repetitive Strings have Low K-Complexity Theorem: There is a fixed c so that for all n 2, and all x ϵ {0,1}*, K(x n ) K(x) + c log n The information in x n isn t much more than that in x Proof: Define the TM N = On input <n,<m,w>>, Let x = M(w). Print x for n times. Let <M,w> be the shortest description of x. Then K(x n ) K(<N,<n,<M,w>>>) 2 N + d log n + K(x) c log n + K(x) for some constants c and d 32

Repetitive Strings have Low K-Complexity Theorem: There is a fixed c so that for all n 2, and all x ϵ {0,1}*, K(x n ) K(x) + c log n The information in x n isn t much more than that in x Recall: A = 01010101010101010101010101010101 For w = (01) n, we have K(w) K(01) + c log n So for all n, K((01) n ) d + c log n for a fixed c, d 33

Does The Computational Model Matter? Turing machines are one programming language. If we use other programming languages, could we get significantly shorter descriptions? An interpreter is a semi-computable function p : Σ* Σ* Takes programs as input, and (may) print their outputs Definition: Let x ϵ {0,1}*. The shortest description of x under p, called d p (x), is the lexicographically shortest string w for which p(w) = x. Definition: The K p complexity of x is K p (x) := d p (x). 34

Does The Computational Model Matter? Theorem: For every interpreter p, there is a fixed c so that for all x ϵ {0,1}*, K(x) K p (x) + c Moral: Using another programming language would only change K(x) by some additive constant Proof: Define M = On w, simulate p(w) and write its output to tape Then <M,d p (x)> is a description of x, so K(x) <M,d p (x)> 2 M + K p (x) + 1 c + K p (x) 35

There Exist Incompressible Strings Theorem: For all n, there is an x ϵ {0,1} n such that K(x) n There are incompressible strings of every length Proof: (Number of binary strings of length n) = 2 n but (Number of descriptions of length < n) (Number of binary strings of length < n) = 1 + 2 + 4 + + 2 n-1 = 2 n 1 Therefore, there is at least one n-bit string x that does not have a description of length < n 36

Random Strings Are Incompressible! Theorem: For all n and c 1, Pr x ϵ {0,1} n[ K(x) n-c ] 1 1/2 c Most strings are highly incompressible Proof: (Number of binary strings of length n) = 2 n but (Number of descriptions of length < n-c) (Number of binary strings of length < n-c) = 2 n-c 1 Hence the probability that a random x satisfies K(x) < n-c is at most (2 n-c 1)/2 n < 1/2 c. 37

KOLMOGOROV DIRECTIONS 38

Kolmogorov Complexity: Try it! Give short algorithms for generating the strings: 1. 01000110110000010100111001011101110000 2. 123581321345589144233377610987 3. 126241207205040403203628803628800 39

Kolmogorov Complexity: Try it! Give short algorithms for generating the strings: 1. 01000110110000010100111001011101110000 2. 123581321345589144233377610987 3. 126241207205040403203628803628800 40

Kolmogorov Complexity: Try it! Give short algorithms for generating the strings: 1. 01000110110000010100111001011101110000 2. 123581321345589144233377610987 3. 126241207205040403203628803628800 41

Kolmogorov Complexity: Try it! Give short algorithms for generating the strings: 1. 01000110110000010100111001011101110000 2. 123581321345589144233377610987 3. 126241207205040403203628803628800 This seems hard to determine in general. Why? 42

Computing Compressibility? Can an algorithm perform optimal compression? Can algorithms tell us if a given string is compressible? COMPRESS = { (x,c) K(x) c} Theorem: COMPRESS is undecidable! Idea: If decidable, we could design an algorithm that prints the shortest incompressible string of length n But such a string could then be succinctly described, by providing the algorithm code and n in binary! Berry Paradox: The smallest integer that cannot be defined in less than thirteen words. 43

Computing Compressibility? COMPRESS = {(x,c) K(x) c} Theorem: COMPRESS is undecidable! Proof: Suppose it s decidable. Consider the TM: M = On input x ϵ {0,1}*, let N = 2 x. For all y ϵ {0,1}* in lexicographical order, If (y,n) COMPRESS then print y and halt. M(x) prints the shortest string y with K(y ) > 2 x. <M,x> is a description of y, and <M,x> d + x So 2 x < K(y ) d + x. CONTRADICTION for large x! 44

Yet Another Proof that A TM is Undecidable! COMPRESS = {(x,c) K(x) c} Theorem: A TM is undecidable. Proof: Reduction from COMPRESS to A TM. Given a pair (x,c), our reduction constructs a TM: M x,c = On input w, For all pairs <M,w > with <M,w > c, simulate each M on w in parallel. If some M halts and prints x, then accept. K(x) c M x,c accepts ε 45

More on Interesting Formal Systems A formal system F is interesting if it is finite and: 1. Any mathematical statement about computation can also be effectively described within F. For all strings x and integers c, there is a S x,c in F that is equivalent to K(x) c 2. Proofs are convincing: it should be possible to check that a proof of a theorem is correct This set is decidable: { (S,P) P is a proof of S in F } 46

The Unprovable Truth About K-Complexity Theorem: For every interesting consistent F, There is a t s.t. for all x, K(x) > t is unprovable in F Proof: Define an M that treats its input as an integer: M(k) := Search over all strings x and proofs P for a proof P in F that K(x) > k. Output x if found Suppose M(k) halts. It must print some output x Then K(x ) = K(<M,k>) c + k c + log k for some c Because F is consistent, K(x ) > k is true But k < c + log k only holds for small enough k If we choose t to be greater than these k then M(t) cannot halt, so K(x) > t has no proof! 47

Random Unprovable Truths Theorem: For every interesting consistent F, There is a t s.t. for all x, K(x) > t is unprovable in F For a randomly chosen x of length t+100, K(x) > t is true with probability at least 1-1/2 100 We can randomly generate true statements in F which have no proof in F, with high probability! For every interesting formal system F there is always some finite integer (say, t=10000) so that you ll never be able to prove in F that a random 20000-bit string requires a 10000-bit program! 48

Proving Theorems With K-Complexity Theorem: L = {x x x 0, 1 } is not regular. Proof: Suppose L is recognized by a DFA D. Let n 0 and choose an x 0, 1 such that K x n Let q x be the state of D reached after reading in x Define a TM M(D, q, n): Find a path P in D of length n that starts from state q and ends in an accept state. Print the n-bit string along path P, and halt. Claim: The pair <M,(D, q x, n)> is a description of x! So n K x <M,(D, q x, n)> O log n CONTRADICTION! 50

Next Episode: Complexity Theory! 51

Repetitive Strings have Low Information Theorem: There is a fixed c so that for all x ϵ {0,1}* K(xx) K(x) + c The information in xx isn t much more than that in x Proof: Let N = On <M,w>, let s = M(w). Print ss. Suppose <M,w> is the shortest description of x. Then <N,<M,w>> is a description of xx Therefore K(xx) <N,<M,w>> 2 N + <M,w> + 1 2 N + K(x) + 1 c + K(x) 57

Information as Description Thesis: The amount of information in a string x is the length of the smallest description of x Let x 2 {0,1}* How should we describe strings? Use Turing Machines! Def. The shortest description of x, denoted d(x), is the lexicographically shortest string M such that the TM M on input ε halts with only x on its tape. 58

Kolmogorov Complexity (1960 s) Def. The shortest description of x, denoted d(x), is the lexicographically shortest string M such that the TM M on input ε halts with only x on its tape. Definition: The Kolmogorov complexity of x, denoted as K(x), is d(x). EXAMPLES?? Let s first determine some properties of K. Examples will fall out of this. 59

A Simple Upper Bound Theorem: There is a fixed c so that for all x in {0,1}* K(x) x + c The amount of information in x isn t much more than x Proof: For any string x, define a TM M x = On input w, overwrite w with x, and halt. Then M x on ε halts with just x on its tape. We have K(x) M x x + c 60