An Algebraic Characterization of the Halting Probability

CDMTCS Research Report Series
An Algebraic Characterization of the Halting Probability
Gregory Chaitin
IBM T. J. Watson Research Center, USA
CDMTCS-305, April 2007
Centre for Discrete Mathematics and Theoretical Computer Science

An Algebraic Characterization of the Halting Probability

Gregory Chaitin
IBM T. J. Watson Research Center, P. O. Box 218, Yorktown Heights, NY 10598, U.S.A., chaitin@us.ibm.com

Abstract

Using 1947 work of Post showing that the word problem for semigroups is unsolvable, we explicitly exhibit an algebraic characterization of the bits of the halting probability Ω. Our proof closely follows a 1978 formulation of Post's work by M. Davis. The proof is self-contained and not very complicated.

1. Introduction

Algorithmic information theory [4] shows that pure mathematics is infinitely complex and contains irreducible complexity. The canonical example of such irreducible complexity is the infinite sequence of bits in the base-two expansion of the halting probability Ω. The halting probability is defined by taking the following summation

    0 < Ω = Σ_{U(p) halts} 2^{-|p|} < 1

over all the self-delimiting programs p that halt when run on a suitably defined universal Turing machine U. Here |p| denotes the size in bits of the program p. The value of Ω depends on the choice of U, but its surprising properties do not.

The numerical value of Ω is maximally unknowable in the following precise sense: you need an N-bit theory in order to be able to determine N bits of Ω [7].

Nevertheless, Ω has a kind of diophantine reality, because there is a diophantine equation with a parameter k that has finitely or infinitely many solutions depending on whether the kth bit of Ω is respectively 0 or 1 [4]. More recently, Ord and Kieu [5] have shown that there is also a diophantine equation with a parameter k that has an even or odd number of solutions depending on whether the kth bit of Ω is respectively 0 or 1.

The purpose of this note is to discuss the fact that, as well as diophantine reality, Ω also possesses a kind of algebraic reality, because there is an algebraic problem with a parameter i which yields the infinite sequence of bits b_i in the binary expansion of Ω:

    Ω = Σ_{i=1,2,3,...} b_i 2^{-i}.

First of all, note that one can calculate better and better lower bounds on Ω, for example by using the simple LISP function given in [7, pp. 65-69]. This works because Ω is the limit of Ω_n, defined as follows:

    Ω_n = Σ_{|p| ≤ n and U(p) halts within n steps} 2^{-|p|}.

As n tends to infinity, Ω_n tends to Ω, and from some point on each bit of Ω_n will remain correct, since Ω is irrational. (I.e., this limiting process cannot give us .3659999... instead of .3660000..., because then Ω would be a rational number and would therefore not be irreducibly complex.) In other words, as n tends to infinity, the values of individual bits of Ω_n will fluctuate but eventually settle down to the correct values.

Our construction closely follows the presentation in Davis [1] of the idea in Post [2]. Davis [1] explains Post [2] so well that it seems foolish to use a different formulation here. Sections 2 through 4 are taken word for word from Davis [1], except for changes that adapt Davis [1] to our present needs. (Davis only allows his Turing Post programs to use the two-symbol alphabet 0, 1; here we use a bigger alphabet, as was originally done by Post.)
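To make the approximation Ω_n concrete, here is a minimal Python sketch (not from the paper): a real self-delimiting universal machine U is replaced by a hypothetical toy halting table, and the names omega_n, toy_halts_within and TOY_HALTERS are assumptions made for this illustration only.

```python
from itertools import product
from fractions import Fraction

def omega_n(n, halts_within):
    """Lower bound Omega_n: sum of 2^-|p| over programs p with |p| <= n
    that halt within n steps (a self-delimiting machine is assumed)."""
    total = Fraction(0)
    for length in range(1, n + 1):
        for bits in product("01", repeat=length):
            p = "".join(bits)
            if halts_within(p, n):
                total += Fraction(1, 2 ** length)
    return total

# Hypothetical stand-in for "U(p) halts within n steps": a prefix-free
# toy table of halting programs and their halting times.  A real U would
# be a self-delimiting universal Turing machine, as in the text.
TOY_HALTERS = {"0": 3, "10": 6, "110": 11}

def toy_halts_within(p, n):
    return p in TOY_HALTERS and TOY_HALTERS[p] <= n

if __name__ == "__main__":
    for n in (1, 4, 8, 12):
        print(n, float(omega_n(n, toy_halts_within)))
```

With exact rational arithmetic these Ω_n are genuine lower bounds that increase toward the toy limit 0.875, mirroring on a small scale the convergence described above.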

2. The Turing Post programming language

We work with a finite alphabet of α possible tape symbols

    ␢ 0 1 a b c ...

Here ␢ stands for a blank square on the tape. Any computation can be thought of as being carried out by an automatic scanning device, working with strings of these α symbols written on a linear tape, which executes instructions of the form:

    Write the symbol 0, write the symbol 1, etc.
    Move scanner one square to the right.
    Move scanner one square to the left.
    Observe the symbol currently scanned and choose the next step accordingly.
    Stop.

The procedure which our calculator is carrying out then takes the form of a numbered list of instructions of these kinds. As in modern computing practice, it is convenient to think of these kinds of instructions as constituting a special programming language. A list of such instructions written in this language is then called a program.

We are now ready to introduce the Turing Post programming language. In this language there are 2α + 3 kinds of instructions:

    WRITE 0
    WRITE 1
    etc.
    MOVE SCANNER RIGHT
    MOVE SCANNER LEFT
    GO TO INSTRUCTION #i IF 0 IS SCANNED
    GO TO INSTRUCTION #i IF 1 IS SCANNED
    etc.
    STOP

A Turing Post program is then a list of instructions, each of which is of one of these 2α + 3 kinds. Of course, in an actual program the letter i in each GO TO instruction must be replaced by a definite (positive whole) number.

In order that a particular Turing Post program begin to calculate, it must have some input data. That is, the program must begin scanning at a specific square of a tape already containing a sequence of symbols. The symbol ␢ functions as a blank; although the entire tape is infinite, there are never more than a finite number of non-␢ symbols that appear on it in the course of a computation. (A reader who is disturbed by the notion of an infinite tape can replace it for our purposes by a finite tape to which blank squares, that is, squares filled with ␢'s, are attached to the left or the right whenever necessary.)
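As an illustration (my own, not part of Davis's exposition), here is a minimal Python sketch of an interpreter for the Turing Post language just described. The tuple encoding of instructions and the helper name run are assumptions made for this sketch, and the blank symbol ␢ is written "_".

```python
BLANK = "_"  # stands for the blank tape symbol

def run(program, tape, head=0, max_steps=10_000):
    """Run a Turing Post program.

    program: list of instructions, numbered from 1, of the forms
        ("WRITE", s), ("RIGHT",), ("LEFT",),
        ("GOTO", i, s),   # go to instruction #i if s is scanned
        ("STOP",)
    tape: dict mapping square index -> symbol (finitely many non-blanks).
    Returns (halted, tape, head)."""
    pc = 1  # instructions are numbered starting at 1
    for _ in range(max_steps):
        op = program[pc - 1]
        if op[0] == "WRITE":
            tape[head] = op[1]
            pc += 1
        elif op[0] == "RIGHT":
            head += 1
            pc += 1
        elif op[0] == "LEFT":
            head -= 1
            pc += 1
        elif op[0] == "GOTO":
            scanned = tape.get(head, BLANK)
            pc = op[1] if scanned == op[2] else pc + 1
        elif op[0] == "STOP":
            return True, tape, head
    return False, tape, head  # did not halt within max_steps

# Example: scan right over a block of 1s, then stop on the first blank.
prog = [
    ("GOTO", 4, BLANK),   # 1: if a blank is scanned, go to instruction 4
    ("RIGHT",),           # 2: move scanner right
    ("GOTO", 1, "1"),     # 3: if a 1 is scanned, go back to instruction 1
    ("STOP",),            # 4: halt
]
tape = {0: "1", 1: "1", 2: "1"}
print(run(prog, tape))    # halts with the scanner on the blank at square 3
```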

3. What is a word problem?

We now explain what a word problem is. In formulating a word problem, one begins with a (finite) collection of symbols, called an alphabet; the symbols are called letters. Any string of letters is called a word on the alphabet. A word problem is specified by simply writing down a (finite) list of equations between words. Figure 1 exhibits a word problem specified by a list of 3 equations on the alphabet a, b, c.

From the given equations many other equations may be derived, by replacing, in any word, an expression by an equivalent expression found in the list of equations. In the example of Figure 1, we derive the equation bac = abcc by replacing the part ba by abc, as permitted by the first given equation.

We have explained how to specify the data for a word problem, but we have not yet stated what the problem is. It is simply the problem of determining, for two arbitrary given words on the given alphabet, whether one can be transformed into the other by a sequence of substitutions that are legitimate using the given equations.

Given an alphabet of three symbols a, b, c, and three equations

    ba = abc
    bc = cba
    ac = ca

we can obtain other equations by substitution:

    [ba]c = abcc

or

    b[ac] = [bc]a = c[ba]a = cabca = [ca]bca = acbca = ...
                                   = cab[ca] = cabac = ...
                                   = ca[bc]a = cacbaa = ...

(The expressions in brackets are the symbols about to be replaced.) In this context, questions such as the following can be raised: Can we deduce from the three equations listed above that bacabca = acbca? The word problem defined by the three equations is the general question: to determine, of an arbitrary given equation between two words, whether or not it can be deduced from the three given equations.

Figure 1. A Word Problem
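The substitution process of Figure 1 is easy to mechanize. The following Python sketch (again my own illustration; the names neighbours and derivable are hypothetical) searches breadth-first for a chain of single substitutions connecting two words under the three given equations.

```python
from collections import deque

# The three equations of Figure 1; each may be applied in either direction.
EQUATIONS = [("ba", "abc"), ("bc", "cba"), ("ac", "ca")]

def neighbours(word, equations=EQUATIONS):
    """All words obtainable from `word` by a single substitution."""
    out = set()
    for lhs, rhs in equations:
        for old, new in ((lhs, rhs), (rhs, lhs)):  # equations are symmetric
            start = 0
            while (i := word.find(old, start)) != -1:
                out.add(word[:i] + new + word[i + len(old):])
                start = i + 1
    return out

def derivable(u, v, max_words=50_000):
    """Breadth-first search: can u be transformed into v?
    Only a semi-decision in general; here the search is simply capped."""
    seen, queue = {u}, deque([u])
    while queue and len(seen) < max_words:
        w = queue.popleft()
        if w == v:
            return True
        for x in neighbours(w):
            if x not in seen:
                seen.add(x)
                queue.append(x)
    return False  # not found within the cap (so: unknown in general)

print(derivable("bac", "abcc"))    # True: the one-step derivation of Figure 1
print(derivable("bac", "acbca"))   # True: the longer chain of Figure 1
```

As an aside, all three equations preserve the number of b's in a word, so bacabca (two b's) cannot be transformed into acbca (one b); a bounded search like the one above can only report "not found", since the general word problem is, as Post showed, unsolvable.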

4. Converting the question of whether a Turing Post program halts into a word problem

Consider a Turing Post program P which we assume consists of n instructions. We now use an alphabet consisting of the α + n + 2 symbols

    ␢ 0 1 a b c ... h q_1 q_2 ... q_n q_{n+1}.

The fact that the ith step of P is about to be carried out, with some given tape configuration, is coded by a certain word (sometimes called a Post word) in this alphabet. This Post word is constructed by writing down the string of symbols constituting the current nonblank part of the tape, placing an h to its left and right (as punctuation marks), and inserting the symbol q_i (remember that it is the ith instruction which is about to be executed) immediately to the left of the symbol being scanned. For example, with a tape configuration 11011 and instruction number 4 about to be executed, the corresponding Post word would be h110 q_4 11h.

This correspondence between tape configurations and words makes it possible to translate the steps of a program into equations between words. For example, suppose that the fourth instruction of a certain program is WRITE 0. We translate this instruction into the α equations

    q_4 0 = q_5 0,  q_4 1 = q_5 0,  etc.

which, for example, yield the equation between Post words

    h110 q_4 11h = h110 q_5 01h

corresponding to the next step of the computation. Suppose next that the fifth instruction is MOVE SCANNER RIGHT. It requires α(α + 1) equations to fully translate this instruction, of which two typical ones are

    q_5 01 = 0 q_6 1,  q_5 1h = 1 q_6 ␢h.

In a similar manner each of the instructions of a program can be translated into a list of equations. In particular, when the ith instruction is STOP, the corresponding equation will be

    q_i = q_{n+1}.

So the presence of the symbol q_{n+1} in a Post word serves as a signal that the computation has halted.
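Post's translation is mechanical enough to write down. Here is a Python sketch of it (my own rendering of the construction described above; the GO TO and MOVE LEFT cases are my own filling-in of the obvious analogues, and multi-character state names like "q4" stand in for the single letters q_4). It also emits the 2α cleanup equations for q_{n+1} that the next paragraph introduces.

```python
# Alphabet of tape symbols (alpha symbols; the blank is written "_" here).
SYMBOLS = ["_", "0", "1"]           # alpha = 3 in this toy setting

def q(i):
    return f"q{i}"                  # stands for the single letter q_i

def post_equations(program):
    """Translate a Turing Post program (1-indexed list of instructions,
    same tuple format as in the earlier interpreter sketch) into the
    word equations of Post's construction."""
    n = len(program)
    eqs = []
    for i, op in enumerate(program, start=1):
        if op[0] == "WRITE":                      # alpha equations
            eqs += [(q(i) + t, q(i + 1) + op[1]) for t in SYMBOLS]
        elif op[0] == "RIGHT":                    # alpha*(alpha+1) equations
            eqs += [(q(i) + s + t, s + q(i + 1) + t)
                    for s in SYMBOLS for t in SYMBOLS]
            eqs += [(q(i) + s + "h", s + q(i + 1) + "_h") for s in SYMBOLS]
        elif op[0] == "LEFT":                     # symmetric to RIGHT
            eqs += [(s + q(i) + t, q(i + 1) + s + t)
                    for s in SYMBOLS for t in SYMBOLS]
            eqs += [("h" + q(i) + t, "h" + q(i + 1) + "_" + t) for t in SYMBOLS]
        elif op[0] == "GOTO":                     # branch on the scanned symbol
            target, s = op[1], op[2]
            eqs += [(q(i) + s, q(target) + s)]
            eqs += [(q(i) + t, q(i + 1) + t) for t in SYMBOLS if t != s]
        elif op[0] == "STOP":
            eqs += [(q(i), q(n + 1))]
    # Cleanup equations: absorb the tape symbols around q_{n+1}
    # (these are the 2*alpha equations introduced in the next paragraph).
    for s in SYMBOLS:
        eqs += [(q(n + 1) + s, q(n + 1)), (s + q(n + 1), q(n + 1))]
    return eqs

prog = [("GOTO", 4, "_"), ("RIGHT",), ("GOTO", 1, "1"), ("STOP",)]
for lhs, rhs in post_equations(prog):
    print(lhs, "=", rhs)
```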

Finally, the 2α equations

    q_{n+1} 0 = q_{n+1},  q_{n+1} 1 = q_{n+1},  0 q_{n+1} = q_{n+1},  1 q_{n+1} = q_{n+1},  etc.

serve to transform any Post word containing q_{n+1} into the word hq_{n+1}h.

Putting all of the pieces together, we see how to obtain a word problem which translates any given Turing Post program. Now let a Turing Post program P begin scanning the leftmost symbol of the string v; the corresponding Post word is hq_1 vh. Then if P will eventually halt, the equation

    hq_1 vh = hq_{n+1}h

will be derivable from the corresponding equations, as we could show by following the computation step by step. If on the other hand P will never halt, we would like to prove that this same equation will not be derivable. The problem is that every time we use one of the equations which translates an instruction, we are either carrying the computation forward or, in case we substitute from right to left, undoing a step already taken.

5. Running computations backwards and forwards simultaneously

So the computation, when it is expressed in terms of Post words, can run both forwards and backwards! Does this ruin things? Post realized that it cannot. (Unfortunately, Davis [1] does not explain why.) What we actually get is a tree containing all the computations that eventually halt. The final word hq_{n+1}h is the root of this tree. As time goes forward, computational trajectories can merge or join, but they can never split in two. So even if the computation runs backward for a while, and even splits off from the correct trajectory, when it starts forward again it will have to retrace its steps, so this detour does not affect the final result.

Post's argument is different; it is a proof by induction. (See Post's Lemma II in Davis [3, pp. 297-298].) He considers the last backwards step in the derivation. The step right after that goes forward (since we were looking at the last backwards step), and must undo that backwards step. Hence we can delete these two steps, which mutually annihilate each other, and then apply Post's argument again to the resulting derivation, which is two steps shorter.

6. Converting individual bits of the halting probability Ω into word problems

Recall the definition of Ω_j given in Section 1. Ω_j approximates Ω by looking at the finite set of all programs p up to j bits in size and running each of them for j steps. Each halting program p that is discovered in this way contributes 1/2^{|p|} to Ω_j, i.e., contributes 1 over 2 raised to the size in bits of p. These approximations will get better and better. More precisely, as j tends to infinity, the values of individual bits of Ω_j will fluctuate but eventually settle down to the correct values.

Now let a^j be a string of j letters a, and let b^k be a string of k letters b. Consider a Turing Post program which, when started scanning the leftmost symbol of the word a^j b^k on its tape, eventually halts if the kth bit of Ω_j is a 1, and never halts if the kth bit of Ω_j is a 0. This Turing Post program computes the kth bit of the jth approximation to Ω, and then halts at once or loops forever depending on whether this bit is a 1 or a 0.

Convert this Turing Post program to a set of equations using Post's method as explained above. Then

    hq_1 a^j b^k h = hq_{n+1}h

will be derivable from this set of equations if and only if the kth bit of Ω_j is a 1. Hence, fixing k and letting j vary, the set of all words of the form hq_1 a^j b^k h which are equal to the word hq_{n+1}h will be infinite if the kth bit of Ω is a 1 (for then the kth bit of Ω_j is a 1 for all sufficiently large j), and it will be finite if the kth bit of Ω is a 0 (for then the kth bit of Ω_j is a 1 for at most finitely many j). This is our algebraic characterization of the bits of the halting probability.

(Using a different construction due to Ord and Kieu [5], explained in Chaitin [6, pp. 135-139], we can instead construct a set of equations with the following property. Fix k and let j vary. The set of all words of the form hq_1 a^j b^k h which are equal to the word hq_{n+1}h will always be finite; furthermore, the cardinality of this set will be odd if the kth bit of Ω is a 1, and even if the kth bit of Ω is a 0.)

By the way, it is well worth it to read Davis [1], reprinted in Calude [8], in its entirety, not just the parts we have excerpted here. For more on the word problem, see Chapter 12 of Rotman [9]. For the philosophical significance of Ω, see [10].
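Finally, a small Python sketch (reusing the toy halting table assumed in the earlier sketch, so purely illustrative) of the bit-settling behaviour that the characterization rests on: fix k and tabulate the kth bit of Ω_j as j grows.

```python
from fractions import Fraction
from itertools import product

# Toy stand-in for "U(p) halts within n steps" (same assumption as before).
TOY_HALTERS = {"0": 3, "10": 6, "110": 11}

def halts_within(p, n):
    return p in TOY_HALTERS and TOY_HALTERS[p] <= n

def omega_j(j):
    """The jth approximation: programs of size <= j run for j steps."""
    total = Fraction(0)
    for length in range(1, j + 1):
        for bits in product("01", repeat=length):
            if halts_within("".join(bits), j):
                total += Fraction(1, 2 ** length)
    return total

def kth_bit(x, k):
    """The kth bit after the binary point of a rational 0 <= x < 1."""
    return int(x * 2 ** k) % 2

k = 3
for j in range(1, 13):
    print(j, kth_bit(omega_j(j), k))
# If the kth bit of the limit is 1, the printed column of bits is
# eventually all 1s (so 1 for infinitely many j); if it is 0, the bit
# is 1 for at most finitely many j.
```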

References

[1] M. Davis, "What is a computation?", in L. A. Steen (ed.), Mathematics Today: Twelve Informal Essays, Springer-Verlag, New York, 1978, pp. 241-267. Reprinted in Calude [8].
[2] E. Post, "Recursive unsolvability of a problem of Thue", Journal of Symbolic Logic, vol. 12 (1947), pp. 1-11. Reprinted in Davis [3, pp. 293-303].
[3] M. Davis, The Undecidable, Dover, 2004.
[4] G. Chaitin, Algorithmic Information Theory, Cambridge University Press, 1987.
[5] T. Ord, T. D. Kieu, "On the existence of a new family of diophantine equations for Ω", Fundamenta Informaticae, vol. 56 (2003), pp. 273-284.
[6] G. Chaitin, Meta Math!, Pantheon, 2005.
[7] G. Chaitin, The Limits of Mathematics, Springer-Verlag, 1998.
[8] C. Calude, Randomness & Complexity, from Leibniz to Chaitin, World Scientific, to appear.
[9] J. J. Rotman, An Introduction to the Theory of Groups, fourth edition, Springer-Verlag, 1995; corrected second printing, 1999.
[10] G. Chaitin, Thinking about Gödel & Turing: Essays on Complexity, 1970-2007, in preparation.