Lecture 9: The Recursion Theorem. Michael Beeson

Similar documents
7. The Recursion Theorem

7. The Recursion Theorem

7. The Recursion Theorem

Computability Theory. CS215, Lecture 6,

Theory of Computation

(a) Definition of TMs. First Problem of URMs

where Q is a finite set of states

This is logically equivalent to the conjunction of the positive assertion Minimal Arithmetic and Representability

Lecture 14 Rosser s Theorem, the length of proofs, Robinson s Arithmetic, and Church s theorem. Michael Beeson

Ordinary Differential Equations Prof. A. K. Nandakumaran Department of Mathematics Indian Institute of Science Bangalore

König s Lemma and Kleene Tree

Theory of computation: initial remarks (Chapter 11)

Axioms of Kleene Algebra

Lecture 15 The Second Incompleteness Theorem. Michael Beeson

More About Turing Machines. Programming Tricks Restrictions Extensions Closure Properties

Languages, regular languages, finite automata

λ Slide 1 Content Exercises from last time λ-calculus COMP 4161 NICTA Advanced Course Advanced Topics in Software Verification

MATH 105: PRACTICE PROBLEMS FOR CHAPTER 3: SPRING 2010

Lecture notes on Turing machines

Static Program Analysis

IV. Turing Machine. Yuxi Fu. BASICS, Shanghai Jiao Tong University

Type Systems. Today. 1. What is the Lambda Calculus. 1. What is the Lambda Calculus. Lecture 2 Oct. 27th, 2004 Sebastian Maneth

3 The Semantics of the Propositional Calculus

CS 361 Meeting 26 11/10/17

MATH The Chain Rule Fall 2016 A vector function of a vector variable is a function F: R n R m. In practice, if x 1, x n is the input,

Theory of Computation

1.1. The Kleene Recursion Theorem

Lecture Notes on From Rules to Propositions

Theory of Computation

Assignment 7 Properties of operations

form, but that fails as soon as one has an object greater than every natural number. Induction in the < form frequently goes under the fancy name \tra

Nondeterministic finite automata

Decision Problems with TM s. Lecture 31: Halting Problem. Universe of discourse. Semi-decidable. Look at following sets: CSCI 81 Spring, 2012

SYDE 112, LECTURE 7: Integration by Parts

TAYLOR POLYNOMIALS DARYL DEFORD

FINAL EXAM FOR PHIL 152 MICHAEL BEESON

LECTURE 11 - PARTIAL DIFFERENTIATION

15. Polynomial rings Definition-Lemma Let R be a ring and let x be an indeterminate.

CSC 5170: Theory of Computational Complexity Lecture 4 The Chinese University of Hong Kong 1 February 2010

Lecture 3. 1 Terminology. 2 Non-Deterministic Space Complexity. Notes on Complexity Theory: Fall 2005 Last updated: September, 2005.

Introduction to Turing Machines

Computation Theory, L 9 116/171

What are the recursion theoretic properties of a set of axioms? Understanding a paper by William Craig Armando B. Matos

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space

Lecture 2: Self-interpretation in the Lambda-calculus

Introduction to Turing Machines. Reading: Chapters 8 & 9

Mathematical Reasoning & Proofs

Numbers, proof and all that jazz.

Linear Equations & their Solutions

Informal Statement Calculus

Partial Derivatives for Math 229. Our puropose here is to explain how one computes partial derivatives. We will not attempt

Let us first give some intuitive idea about a state of a system and state transitions before describing finite automata.

2 Plain Kolmogorov Complexity

The Turing machine model of computation

P1 Calculus II. Partial Differentiation & Multiple Integration. Prof David Murray. dwm/courses/1pd

Large Numbers, Busy Beavers, Noncomputability and Incompleteness

Computability Crib Sheet

5.2 Proving Trigonometric Identities

To factor an expression means to write it as a product of factors instead of a sum of terms. The expression 3x

Closure under the Regular Operations

258 Handbook of Discrete and Combinatorial Mathematics

CSCE 222 Discrete Structures for Computing. Review for Exam 2. Dr. Hyunyoung Lee !!!

Proving languages to be nonregular

0.1 The Iteration Theorem of Kleene

Introduction to Metalogic

Supplement for MAA 3200, Prof S Hudson, Fall 2018 Constructing Number Systems

CS 2110: INDUCTION DISCUSSION TOPICS

Computability Theory

Announcements. Problem Set 6 due next Monday, February 25, at 12:50PM. Midterm graded, will be returned at end of lecture.

MATH 415, WEEKS 7 & 8: Conservative and Hamiltonian Systems, Non-linear Pendulum

Math 212-Lecture 8. The chain rule with one independent variable

1 Differentiability at a point

MATH 415, WEEKS 14 & 15: 1 Recurrence Relations / Difference Equations

Lecture 6: Finite Fields

Math Boot Camp Functions and Algebra

Type Systems. Lecture 2 Oct. 27th, 2004 Sebastian Maneth.

MAT137 Calculus! Lecture 10

Lecture 22: Integration by parts and u-substitution

Turing Machines. Nicholas Geis. February 5, 2015

CDM. Recurrences and Fibonacci

CS 4110 Programming Languages & Logics. Lecture 16 Programming in the λ-calculus

DR.RUPNATHJI( DR.RUPAK NATH )

Theory of Computer Science. Theory of Computer Science. D7.1 Introduction. D7.2 Turing Machines as Words. D7.3 Special Halting Problem

Connectedness. Proposition 2.2. The following are equivalent for a topological space (X, T ).

CLASSICAL PROBABILITY MODES OF CONVERGENCE AND INEQUALITIES

L Hopital s Rule. We will use our knowledge of derivatives in order to evaluate limits that produce indeterminate forms.

7.5 Partial Fractions and Integration

MATH 1231 MATHEMATICS 1B CALCULUS. Section 5: - Power Series and Taylor Series.

L Hopital s Rule. We will use our knowledge of derivatives in order to evaluate limits that produce indeterminate forms.

COMP6463: λ-calculus

University of Regina. Lecture Notes. Michael Kozdron

CSE 331 Winter 2018 Reasoning About Code I

Why write proofs? Why not just test and repeat enough examples to confirm a theory?

Preliminaries to the Theory of Computation

CDM. Recurrences and Fibonacci. 20-fibonacci 2017/12/15 23:16. Terminology 4. Recurrence Equations 3. Solution and Asymptotics 6.

Limits and Infinite Series Lecture Notes for Math 226. Department of Mathematics Western Washington University

Notes on ordinals and cardinals

A NOTE ON ARITHMETIC IN FINITE TYPES. 1. Introduction

Lecture 12. Statement Logic as a word algebra on the set of atomic statements. Lindenbaum algebra.

3 Algebraic Methods. we can differentiate both sides implicitly to obtain a differential equation involving x and y:

Transcription:

Lecture 9: The Recursion Theorem Michael Beeson

The problem of computing by recursion We still have not provided completely convincing evidence that every intuitively computable function is Turing computable, even though we have proved that the µ-recursive functions coincide with the Turing computable functions. For example, what about the Ackermann function? Another example: we previously showed how to assign indices to the µ-recursive functions, so that we can write ϕ n for the µ-recursive function with index n. The question arises whether ϕ n (x) is a partial recursive function of both n and x. If so then the partial recursive functions form a model of computation. In an earlier lecture, we had to postpone proving that, because we do not know a direct proof. Now, we see that ϕ n (x) is partial recursive if and only if it is Turing computable. But it is not trivial to write a Turing machine to compute ϕ n (x) (as a function of n as well as x).

λ notation We need a notation for as a function of. Thus f(x,y) can be considered as a function of x for each fixed y. We write this function as λxf(x,y). Then we have the basic law that (λxf(x,y))(u) = f(u,y) Note that x is bound in the expression λxf(x,y). We also use the λ-notation for several variables x = x 1,...,x n. λxf(x,y)(u) = f(u,y). The process described by λx is called, in English, lambda abstraction over the variables x.

Example of λ-notation This notation can bring more precision to the differential calculus, where the derivative operation really applies to a function. If we define, for example, f(x) = x 2, then the derivative Df is the function λx 2x. We have (Df)(u) = 2u, something which is not written in calculus books. the expression (df/dx) is usually interpreted as what technically should be (Df)(x) (which is 2x). Functions are not the same as numbers. sin is a function, sin(x) is a number (if x is a number). Thus sin = λx sin(x).

The S 1 1 function As a preliminary to the recursion theorem, we need to discuss the relationship between functions of n variables, and functions of m < n variables obtained by fixing some of the n variables. For example, g(x) = λxf(x,y). The point is that code for computing g can be found by an algorithm, given code for computing f. This algorithm is a simple example of a program that treats programs as data. The traditional notation for this algorithm is S1 1, which takes two arguments. The first argument is code for f. The second argument is x. The output is code for g.

The S m n functions We treat this abstractly, using only the notion of a model of computation, given by a set X and some partial application functions App(e,x) on X, where x = x 1,...,x n. We need computable functions Sn m (x) such that for y = y 1,...,y m we have App(Sn m (e,x),y) = App(e,x,y) (1) We say e is an index of f if f(x) = App(e,x) for all x. The role of S m n can be described this way: Sm,n(e,y) is an index of λxf(y,x) if e is an index of f. See Kleene, p. 342, where the S n m functions are introduced for a model of computation that we have not studied, but you can see that the defining property is the same as here. The notation is due to Kleene.

Λ notation Kleene also introduced the Λ-notation, according to which Λxϕ(y,x) is an index of λxϕ(y,x). This works if we define Λxϕ e (y,x) to be Sn m (e,y), where e is an index of ϕ. (See Kleene, p. 344.) Although the index e of f = ϕ e is not mentioned in the notation Λxf(y,x), the notation without the index is mere human shorthand. You must have an index in mind. To clarify the notation: Λxϕ e (x,y) is an index of the function λxϕ e (x,y). ϕ e refers either to a particular model of computation (e.g. Turing machines) or to an abstract model of computation, depending on the context.

The main property of Λ Λxϕ(y,x) is an index of λxϕ(y,x). Therefore φ Λxϕ(y,x) (u) = λxϕ(y,x)(u) = ϕ(y,u)

S m n and λ in an abstract model The existence of computable Sn m functions is the detail that we promised to supply when discussing the definition of model of computation. A model of computation, by definition, has a set X and partial App functions for any number of variables, and the App functions for different numbers of arguments are connected by (1). The S m n functions must be computable, i.e. have indices. We also need to form indices of functions like λx,yf(y,x), or λyf(x,y,z), which cannot be directly defined using the S m n. It suffices to have the S m n plus swap functions s i,j,n that swap the i-th and j-th arguments of n arguments. Thus for example s 1,2,3 (e) is an index of λy,xϕ e (x,y,z).

The Turing model of computation has S m n functions Proof. We describe a Turing machine that computes S m n. The input is a Turing machine e and a sequence y. The output should be a Turing machine that takes input x and simulates the computation by e at input x y. We describe the desired output machine as a two-tape machine. Starting with e y on the main tape, and input x on the auxiliary tape, we insert x between e and y, leaving a blank before and after x. Then we return to the first square of x and run machine e. (We do not need to appeal to the universal machine.) What we need to output, however, is a one-tape machine that simulates this two-tape machine. But in the exercises, you have already shown how to transform a two-tape machine algorithmically into an equivalent one-tape machine. That completes the proof of the existence of S m n functions.

The Turing model of computation has swap functions For notational simplicity, we give the proof of the existence of swap functions only for s 1,2,2, which swaps the order of arguments of a binary function. To compute s 1,2,2 (e), we first rename the start state of e. Our output will consist of a list of Turing machine instructions including those of e (with the renamed start state), plus additional instructions designed to interchange the order of the two inputs on the tape that e works on, for example by copying the first input to the right of the second, and then copying both inputs back to the original starting square. That completes the proof (or at least, that s as much as we are going to say about it).

The First Recursion Theorem The first recursion theorem says that any recursion equation can be solved, in which the values of the function to be recursively defined are used in any way whatsoever. Theorem (First recursion theorem) Let H be Turing computable. Then there exists a Turing-computable g such that g(x) = H(x,g(x)) Of course, the resulting function will generally be only partial; if it is total that will require an additional proof. For example, we can define g(x) = g(x+1), which has one solution that is nowhere defined. This example also points up the fact that recursion equations generally do not have unique solutions, since any constant function also satisfies the equation.

The Second Recursion Theorem We will prove the first recursion theorem as a corollary of an even more general theorem, which says that recursion equations are solvable even when F is allowed access to the code for the recursively defined function, not just to its values: Theorem (Second recursion theorem) Let F be Turing computable. Then there exists a Turing machine e computing a partial function g such that g(x) = F(x,e) For comparison, we repeat the equation from the First Recursion Theorem: g(x) = H(x,g(x))

The fixed-point theorem The second recursion theorem in turn is a consequence of a theorem first stated and proved by Rogers in his 1967 book, Recursion Theory: Theorem (Fixed-point theorem) Let G be Turing computable and total. Then there exists an e such that for all x, ϕ e = ϕ G(e).

The general setting for recursion theorems Theorem The three recursion theorems hold in any model of computation, i.e. set X with partial App functions and computable S m n and swap functions. Since the Turing-computable functions have S m n and swap functions, the result applies to that case. In the next slides, we will prove all four theorems in the abstract setting.

Proof of the first recursion theorem from the second We can write the equation g(x) = H(x,g(x)) as g(x) = H(x,App(e,x)) and apply the second recursion theorem with F(x,e) = H(x,App(e,x)). Then g(x) = F(x,e) = H(x,App(e,x)).

Proof of the second recursion theorem from the fixed-point theorem In the abstract setting, the fixed point theorem says that if G is total computable, then there exists an e such that for all x, App(e,x) = App(G(e),x). Suppose given the partial computable function F. Define G(e) = ΛxF(x,e). Note that G is a total function, since ΛxF(x,e) is always some Turing machine obtained from F, whether F is total or not. By the fixed-point theorem, there exists an e such that for all x, App(e,x) = App(G(e),x). Define g(x) := App(e,x). Then we have g(x) := App(e,x) = App(G(e),x) = App(ΛxF(x,e),x) = F(x,e) as required for the second recursion theorem.

Notation The calculations involved in the next proof are easier to follow if we stop writing App explicitly, and just write xy instead of App(x, y). Thus we write xy for App(x,y). App(App(x, x), y) becomes (xx)y. That is not the same as x(xy). We will try to avoid omitting parentheses but if we do omit them, xyz means (xy)z.

The fixed-point theorem Given a total computable G, define Then define ω := ΛxG(Λy((xx)y)). e := Λy((ωω)y) A calculation then reveals that e is the desired fixed point: ex = (Λy((ωω)y))x by definition of e = (ωω)x reducing the Λ term = ((Λx G(Λy((xx)y)))ω)x by definition of ω = G(Λy((ωω)y)))x = (G(e))x In the unabbreviated notation, this is App(e,x) = App(G(e),x), which is the abstract version of the fixed-point theorem. That completes the proof.

This wonderful, mysterious proof! The reader may be forgiven if he or she finds the proof mysterious, and has the feeling that although it is easy to check the proof line by line, it is still hard to understand. The origins of this proof will become clearer when we discuss the λ-calculus, but that will not entirely remove the mystery. Though the proof is mysterious, the theorem is basic and important. So memorize the proof. Write it out several times on a blank sheet of paper, copying at first, and eventually writing it out from memory. This is one of the secret, esoteric gems of knowledge passed on in the inner circles of mathematical logic.

Applications of the recursion theorem Corollary The µ-recursive functions form a model of computation. That is, if ϕ n is the partial µ-function with index n and App(n,x) := ϕ n (x), then App is µ-recursive. Proof. App is defined by recursion on n; that is, the value ϕ n (x) is defined in terms of some other values of ϕ m for m < n. These recursion equations are quite complicated, but their exact form is no longer relevant. Supposing that e is a Turing machine that computes g(n,x) = ϕ n (x), for some F the recursion equations can be written as g(n,x) = F(n,x,e). Therefore by the second recursion theorem, such an e actually exists. That completes the proof. Remark. The proof has a certain magical quality: we get to assume that the desired Turing machine e exists; then we only have to check that the recursion equations we want to solve can be written in terms of e, and magically the Turing machine e appears.

No smaller model of computation on N The following theorem shows that there is no weaker notion of computation than Turing machines, that permits a universal machine. Theorem (Minimality theorem) Any model of computation on the natural numbers N must include all the Turing-computable functions; furthermore the computable functions in any model of computation on N are closed under the least-number operator.

Proof of the minimality theorem Suppose given a model of computation on N; let App be the application function(s) of the model, and temporarily let computable mean, computable in the model. It is easy to show, using the recursion theorem, that all primitive recursive functions must computable. If we show that the computable partial functions are closed under the µ operator, it will follow that all µ-recursive functions are computable (in the model), and hence all Turing computable functions also. Thus the only remaining thing to do is to show closure under the µ operator. This is essentially a piece of beginning computer science: search can be done recursively. Details on the next slide.

Search from recursion Suppose χ(b, y) is (partial) computable. Then define { φ(b,z) 0 if χ(b,z) = 0 = φ(b,z ) if χ(b,z) 0) Then we claim µy(χ(b,y) = 0) = φ(b,0) To prove this we prove the more general fact that φ(b,z) is the least k such that y = z +k satisfies χ(b,y) = 0, if there is a y z such that χ(b,y) = 0. That is proved by induction on y z, where y is the least y > z such that χ(b,y) = 0. (We expect to use induction someplace, because this is a theorem about N.) The base case of the induction is the first line in the definition of φ, and the induction step follows from the second line, since when z increase to z, y z (which is not zero, or the first line would have applied) decreases by 1. That completes the proof.

No total App over N The question naturally arises: Could we have a model of computation in which App is a total function? In that case all functions would be total. It is clear that no such model of computation exists over N, since we know that µxx = x is undefined, and by the minimality theorem, the µ operator can be defined in any model of computation over N.