
CS860, Winter 2010
Kolmogorov complexity and its applications
Ming Li
School of Computer Science, University of Waterloo
http://www.cs.uwaterloo.ca/~mli/cs860.html

We live in an information society. Information science is our profession. But do you know what information is, mathematically, and how to use it to prove theorems?

Examples:
- Average-case analysis of Shellsort.
- Proofs that certain sets are not regular.
- Complexity lower bounds for 1-tape Turing machines.
- Communication lower bounds: what is the distance between two information-carrying entities, for example between an internet query and an answer?

Lecture 1. History and Definitions
- History
- Intuition and ideas in the past
- Inventors
- Basic mathematical theory
- Text: Li and Vitányi, An Introduction to Kolmogorov Complexity and Its Applications.

1. Intuition & history

1903: an interesting year.

What is the information content of an individual string?
- 111...1 (n 1's)
- π = 3.1415926...
- n = 2^1024
- Champernowne's number 0.1234567891011121314... is normal in base 10 (every block of digits occurs with the same limiting frequency).

All these numbers share one commonality: there are small programs to generate them. Shannon's information theory does not help here.

Popular YouTube explanation: http://www.youtube.com/watch?v=kyb13pd-ume

[Photos: Kolmogorov, Church, von Neumann.]

Andrey Nikolaevich Kolmogorov (1903-1987, Tambov, Russia). [Photo: researchers in Kolmogorov complexity, 1987.] His work spans:
- Measure theory
- Probability
- Analysis
- Intuitionistic logic
- Cohomology
- Dynamical systems
- Hydrodynamics
- Kolmogorov complexity
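To make the commonality concrete, here is a minimal Python sketch (an illustration, not part of the original slides): each of these long strings is produced by a program far shorter than the string itself, which is exactly why its information content is small.

# Minimal sketch (illustration, not from the slides): long strings with
# short generating programs. The program length, not the string length,
# bounds the Kolmogorov complexity.

def ones(n):
    # n 1's: this one-line program plus about log n bits to encode n
    return "1" * n

def champernowne(n):
    # first n digits of Champernowne's number 0.123456789101112...
    digits, i = [], 1
    while len(digits) < n:
        digits.extend(str(i))
        i += 1
    return "".join(digits[:n])

print(ones(20))           # 11111111111111111111
print(champernowne(20))   # 12345678910111213141
print(len(str(2**1024)))  # 309: a 309-digit number from a 6-character expression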

Ray Solomonoff: 1926-2009

Motivation: a case of Dr. Samuel Johnson (1709-1784)

"Dr. Beattie observed, as something remarkable which had happened to him, that he chanced to see both No. 1 and No. 1000 hackney-coaches. 'Why, Sir,' said Johnson, 'there is an equal chance for one's seeing those two numbers as any other two.'"
— Boswell's Life of Johnson

The case of the cheating casino
- Bob proposes to flip a coin with Alice: Alice wins a dollar on Heads; Bob wins a dollar on Tails.
- Result: TTTTTT...T, 100 Tails in a row. Alice lost $100 and feels cheated.
- Alice goes to court and complains: T^100 is not random.
- Bob asks Alice to produce a random coin-flip sequence. Alice flips her coin 100 times and gets THTTHHTHTHHHTTTTH...
- But Bob claims that Alice's sequence has probability 2^-100, and so does his.
- How do we define randomness?

Roots of Kolmogorov complexity and preliminaries

(1) Foundations of probability
- P. Laplace (1749-1827): a sequence is extraordinary (nonrandom) because it contains a rare regularity.
- 1919, von Mises' notion of a random sequence S: lim_{n→∞} #{1's in the n-prefix of S}/n = p, with 0 < p < 1, and the same limit holds for any subsequence of S selected by an admissible selection function.
- But if arbitrary partial functions are admitted as selection rules, then no sequence is random à la von Mises.
- A. Wald: restrict to countably many selection functions; then random sequences exist.
- A. Church: take the recursive selection functions.
- J. Ville: von Mises-Wald-Church random sequences do not satisfy all laws of randomness.

(2) Information theory: the Shannon-Weaver theory is about an ensemble. But what is the information in an individual object?

(3) Inductive inference: the Bayesian approach using a universal prior distribution.
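One way to see intuitively why T^100 looks nonrandom while Alice's sequence does not is to use a real compressor as a crude stand-in for Kolmogorov complexity. The sketch below is my illustration of this intuition, not part of the original slides.

# Illustration (not from the slides): a real compressor as a crude stand-in
# for Kolmogorov complexity. Both outcomes have probability 2^-100, but only
# Bob's has a short description.
import random
import zlib

bob = "T" * 100
alice = "".join(random.choice("HT") for _ in range(100))

for name, s in [("Bob", bob), ("Alice", alice)]:
    print(name, len(s), "->", len(zlib.compress(s.encode())), "bytes")
# Bob's sequence shrinks to a handful of bytes; Alice's stays much larger
# (its only redundancy is the 2-letter alphabet).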

Preliminaries and notation
- Strings: x, y, z; usually binary.
- x = x_1 x_2 ... denotes an infinite binary sequence; x_{i:j} = x_i x_{i+1} ... x_j.
- |x| is the number of bits in x.
- Sets: A, B, C, ...; |A| is the number of elements in set A.
- Fix an effective enumeration of all Turing machines (TMs): T_1, T_2, ...
- Universal Turing machine U: U(0^n 1 p) = T_n(p), the output of TM T_n on input p.

3. Kolmogorov theory

Solomonoff (1960), Kolmogorov (1963), Chaitin (1965): the amount of information in a string is the size of the smallest program generating that string:

    C_U(x) = min { |p| : U(p) = x }

Invariance Theorem: It does not matter which universal Turing machine U we choose, i.e., all encoding methods are OK.

Proof of the Invariance Theorem:
- Fix an effective enumeration of all TMs: T_1, T_2, ...
- Let U be a universal TM such that U(0^n 1 p) = T_n(p) = x.
- Then for all x: C_U(x) ≤ C_{T_n}(x) + O(1).
- Note: the constant O(1) depends on n, but not on x.
- Fixing U, we write C(x) instead of C_U(x). QED

Formal statement of the Invariance Theorem: There exists a computable function S_0 such that for every computable function S there is a constant c_S such that for all strings x ∈ {0,1}*,

    C_{S_0}(x) ≤ C_S(x) + c_S.

Applications of Kolmogorov theory:
- Mathematics: probability theory, logic.
- Physics: chaos, thermodynamics.
- Computer science: average-case analysis, inductive inference and learning, shared information between documents, data mining and clustering, and the incompressibility method, with examples such as Shellsort average case, Heapsort average case, circuit complexity, lower bounds on Turing machines and formal languages, and combinatorics (the Lovász local lemma and related proofs).
- Philosophy, biology, etc.: randomness, inference, complex systems, sequence similarity.
- Information theory: information in individual objects, information distance.
- Classifying objects: documents, genomes.
- Query-answering systems.
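The prefix trick U(0^n 1 p) = T_n(p) in the proof can be made concrete with a toy interpreter. In the sketch below (my illustration only; Python functions stand in for the Turing machines T_1, T_2, ...), simulating T_n through U costs exactly the n+1 prefix bits, a constant independent of p, which is the whole content of the invariance argument.

# Toy model (illustration, not from the slides): Python functions stand in
# for the enumeration T_1, T_2, ..., and U decodes the prefix 0^n 1 to pick
# the machine, then runs it on the remaining input p.

machines = {
    1: lambda p: "1" * len(p),    # T_1: print |p| ones
    2: lambda p: p[::-1],         # T_2: reverse the input
    3: lambda p: p + p,           # T_3: double the input
}

def U(program):
    # U(0^n 1 p) = T_n(p)
    n = 0
    while program[n] == "0":      # count the leading zeros to recover n
        n += 1
    p = program[n + 1:]           # the part after the separating 1
    return machines[n](p)

print(U("00" + "1" + "10"))       # T_2("10") = "01"
print(U("000" + "1" + "01"))      # T_3("01") = "0101"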

Kolmogorov theory, continued
- Intuitively: C(x) = the length of a shortest description of x.
- Conditional Kolmogorov complexity is defined similarly: C(x|y) = the length of a shortest description of x given y.

Examples:
- C(xx) = C(x) + O(1)
- C(xy) ≤ C(x) + C(y) + O(log min{C(x), C(y)})
- C(1^n) ≤ O(log n)
- C(π_{1:n}) ≤ O(log n)
- For all x: C(x) ≤ |x| + O(1)
- C(x|x) = O(1)
- C(x|ε) = C(x)

3.1 Basics

Incompressibility: for a constant c > 0, a string x ∈ {0,1}* is c-incompressible if C(x) ≥ |x| - c. For constant c, we often simply say that x is incompressible. (We will call incompressible strings random strings.)

Lemma. There are at least 2^n - 2^{n-c} + 1 c-incompressible strings of length n.

Proof. There are only Σ_{k=0}^{n-c-1} 2^k = 2^{n-c} - 1 programs of length less than n-c. Hence at most that many of the 2^n strings of length n can have a program (description) shorter than n-c. QED

Facts:
- If x = uvw is incompressible, then C(v) ≥ |v| - O(log |x|).
- If p is a shortest program for x, then C(p) ≥ |p| - O(1), and C(x|p) = O(1).
- A is recursively enumerable (r.e.) if the elements of A can be listed by a Turing machine.
- A is sparse if the number of strings of length n in A is at most p(n) for some polynomial p.
- If A ⊆ {0,1}* is r.e. and sparse, then for all x in A with |x| = n, C(x) ≤ O(log p(n)) + O(C(n)) = O(log n).

3.3 Properties

Theorem (Kolmogorov). C(x) is not partially recursive. That is, there is no Turing machine M that accepts (x, k) if C(x) ≥ k and is undefined otherwise.

Proof. Suppose such an M exists. Design a machine M' as follows: choose n >> |M'|; M' simulates M on input (x, n) for all x with |x| = n in parallel (one step each), and outputs the first x that M accepts. Such an x exists, since incompressible strings of length n exist. Now M accepting (x, n) means C(x) ≥ n; but M' produces x from n alone, so C(x) ≤ |M'| + O(log n), which is << n by the choice of n, a contradiction. QED
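The counting in the lemma is worth checking numerically: the fraction of c-compressible strings drops as 2^{-c}. The following sketch (an illustration, not from the slides) does the arithmetic.

# Quick check of the counting argument (illustration, not from the slides):
# fewer than 2^(n-c) programs are shorter than n-c bits, so at most a
# 2^(-c) fraction of the 2^n strings of length n can be c-compressible.

def compressible_fraction(n, c):
    programs = sum(2**k for k in range(n - c))   # = 2^(n-c) - 1
    return programs / 2**n                       # < 2^(-c)

for c in (1, 4, 8):
    print(f"c={c}: at most {compressible_fraction(64, c):.6f} of all 64-bit strings")
# c=1: at most half; c=8: fewer than 1 in 256. Most strings are incompressible.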

3.4 Gödel's theorem

Theorem. The statement "x is random" is not provable (for any sufficiently long x).

Proof (G. Chaitin). Let F be an axiomatic theory, and let C(F) = C be the size of a compressed encoding of F. If the theorem were false and "x is random" were provable in F for some x with |x| >> C, then we could enumerate all proofs in F, find the first proof of a statement "x is random" with |x| >> C, and output that x. This gives C(x) ≤ C + O(1). But the provable statement "x is random" implies C(x) ≥ |x| >> C, a contradiction. QED

3.5 Barzdin's lemma

The characteristic sequence of a set A is the infinite binary sequence χ = χ_1 χ_2 ..., where χ_i = 1 iff i ∈ A.

Theorem. (i) The characteristic sequence χ of any r.e. set A satisfies C(χ_{1:n} | n) ≤ log n + c_A for all n. (ii) There is an r.e. set whose characteristic sequence satisfies C(χ_{1:n} | n) ≥ log n for all n.

Proof.
(i) Given n, describe χ_{1:n} by the number of 1's in the prefix χ_{1:n} (a number at most n, hence log n bits): enumerate A until that many elements ≤ n have appeared. This termination condition determines χ_{1:n}, so C(χ_{1:n} | n) ≤ log n + c_A. (A sketch of this reconstruction follows below.)
(ii) By diagonalization. Let U be the universal TM. Define χ = χ_1 χ_2 ... by χ_i = 1 if U(i-th program, i) = 0, and χ_i = 0 otherwise. Then χ is the characteristic sequence of an r.e. set. And for each n we have C(χ_{1:n} | n) ≥ log n: every program of length < log n is among the first n programs, and by construction χ differs from the output of the i-th program at position i, so none of them can produce χ_{1:n}. QED
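Part (i)'s termination trick can be seen in a toy form. In the sketch below (my illustration; the perfect squares stand in for an arbitrary r.e. set A), the prefix χ_{1:n} is reconstructed from n plus a single count of about log n bits.

# Toy illustration of Barzdin's Lemma, part (i) (not from the slides).
# Given n and the count k of 1's in chi_{1:n}, enumerating A until k
# elements <= n have appeared recovers the whole prefix.

def enumerate_A():
    # stand-in r.e. set: yields the perfect squares one by one
    i = 1
    while True:
        yield i * i
        i += 1

def prefix_from_count(n, k):
    # reconstruct chi_{1:n} from n and k = #{i <= n : i in A}
    seen, gen = set(), enumerate_A()
    while len(seen) < k:              # the count is the termination condition
        a = next(gen)
        if a <= n:
            seen.add(a)
    return "".join("1" if i in seen else "0" for i in range(1, n + 1))

n = 20
true_prefix = "".join("1" if round(i**0.5)**2 == i else "0" for i in range(1, n + 1))
k = true_prefix.count("1")
assert prefix_from_count(n, k) == true_prefix
print(true_prefix)                    # 10010000100000010000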