Recursion vs. Computability, and Kleene’s First Recursion Theorem

The Kleene recursion theorems are two basic (and often confused) results in computability theory. The first theorem guarantees that recursive definitions make sense, while the second one shows (among other things) the existence of quines. This post will explain the first recursion theorem.

Recursion vs. Computability

There are many equivalent definitions of the class of “computable functions” (cf. the Church-Turing thesis). The definition used in the Kleene recursion theorems, and in recursion theory in general, is built on the μ (minimization) operator. In contrast, a “programmer-friendly” definition says that the computable functions are exactly the functions computed by programs in one’s favorite programming language. Though they define the same class, “recursive” functions and “computable” functions have different definitions, and coupled with the fact that “recursion” is a common tool in computer programming, this has led to some confusion (see this excellent Stack Exchange answer for some discussion). Kleene’s first recursion theorem is a statement from the “recursion theory” perspective; on the way to it we’ll state a version from the computability (programming language) perspective, and we’ll start with a purely mathematical version. I think the discussion of this theorem shows how much clarity can be gained by keeping real-world programming languages in mind, even when working with “mathematically friendly” objects like the μ-recursive functions.

Say we want a function f that satisfies some recursive equalities,

f(0) = 0
f(1) = 1
f(n) = f(n-1) + f(n-2)

It’s not a priori clear that such an f exists. There is a slightly more primitive mathematical object that always exists, though, which is the recursive operator \Phi : g \mapsto f obtained by replacing recursive calls with calls to g,

f(0) = 0
f(1) = 1
f(n) = g(n-1) + g(n-2)

Given g, we can always define a new function f in this way. Now notice that the functions f satisfying our recurrence are exactly the fixed points of the recursive operator \Phi. Kleene’s first recursion theorem says that recursive operators always have fixed points; in this way, the theorem says that our recurrence equations do, in fact, always have solutions.

Theorem (Kleene’s First Recursion Theorem, mathematical version)
Every (mathematical) recursive operator has a fixed point.

Proof:
In the mathematical case, we require that every recursive operator evaluate g only at smaller natural numbers. This condition says that the recurrences for a desired fixed point f: \mathbb{N}\to \mathbb{N} only use previous values of f. In this case we can define f(n) by strong induction on n, exactly as in an ordinary recursive definition. \square
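For instance, for the Fibonacci recurrence the induction just fills in values bottom-up, each determined by earlier ones:

f(2) = f(1) + f(0) = 1, \quad f(3) = f(2) + f(1) = 2, \quad f(4) = f(3) + f(2) = 3, \ \ldots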

Kleene’s Recursion Theorem in Any Programming Language

Let’s now talk about recursive operators from the computability/programming language perspective. Say we’re writing a recursive function in our favorite language,

f(0) = 0
f(1) = 1
f(n) = f(n - 1) + f(n - 2)

As before we have a natural notion of “recursive operator” \Phi : g \mapsto f, which is just a higher-order function,

f(0) = 0
f(1) = 1
f(n) = g(n - 1) + g(n - 2)
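To make this concrete, here is a minimal Haskell sketch (the names phi and fib are my own, not from the original code): the operator is an ordinary higher-order function, and Haskell’s laziness lets fix tie the knot and hand back a fixed point directly.

import Data.Function (fix)

-- The recursive operator as an ordinary higher-order function:
-- recursive calls are replaced by calls to the argument g.
phi :: (Integer -> Integer) -> (Integer -> Integer)
phi _ 0 = 0
phi _ 1 = 1
phi g n = g (n - 1) + g (n - 2)

-- A fixed point of phi: fib = phi fib, so fib satisfies the recurrence.
fib :: Integer -> Integer
fib = fix phi

-- ghci> fib 10
-- 55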

Theorem (Kleene’s First Recursion Theorem, programming language version)
Every (programming language) recursive operator has a fixed point.

Proof:
This proof is rather different from the mathematical case, and from the proof of the full Kleene recursion theorem, but it is philosophically interesting. Compile the block of code

f(0) = 0
f(1) = 1
f(n) = f(n - 1) + f(n - 2)

The compiler automatically turns this into a valid definition of the function f. Note that we didn’t need the restriction from the mathematical case that recursive calls are only made on smaller natural numbers. Consequently the output of compilation might be a partial function (undefined, \bot, on some inputs), but it is a well-defined mathematical object. \square
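As a quick sketch of this point (my example, not from the original post): drop the base cases and the compiler is still perfectly happy, but the compiled function is partial, in fact undefined everywhere.

-- Compiles fine, but loop n never returns for any n:
-- the compiled function is \bot on every input.
loop :: Integer -> Integer
loop n = loop (n + 1)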

Kleene’s First Recursion Theorem

In recursion theory we have a different definition of the computable functions, via the μ minimization operator. Let R be the set of μ-recursive functions. This calls for a new definition of a recursive operator, which seems like a hard task: we don’t have any “code” to speak of, so a recursive operator must be singled out as a special kind of function \Phi : R \to R. We can extract one key mathematical property of programming language recursive operators to be our guide: we require that \Phi(g)(n) be computable from finitely many values of g (notice that every programming language recursive operator, i.e. higher-order function, has this property). More discussion of recursive operators appears at the end of this post; this intuition suffices to understand the proof of Kleene’s first recursion theorem,

Theorem (Kleene’s First Recursion Theorem)
Every (recursion theory) recursive operator \Phi : R \to R has a fixed point. Furthermore, there is a least fixed point, i.e. a function f \in R such that \Phi(f) = f and f \sqsubset g for every g \in R satisfying \Phi(g) = g.

The notation f \sqsubset g (pronounced “g is more defined than f”) says that f(n) \neq \bot implies f(n) = g(n). Notice that the function defined nowhere (denoted \Lambda) satisfies \Lambda \sqsubset f for every f.

Proof
The proof can be thought of as “making recursive calls until termination”. Say we want to know what value to assign to f(n). If we want \Phi(f) = f, this means we want to know the value of \Phi(f)(n). The operator tells us a finite list of values of f that, once defined, determine \Phi(f)(n); this is our recurrence for computing f(n). By repeatedly asking for and evaluating those values, either all recursive calls spawned by f(n) terminate at base cases, or they don’t; if the process terminates, we have the value of f(n), and otherwise f(n) = \bot is undefined. In our running example of the Fibonacci recurrence, \Phi tells us that in order to define f(n), it suffices to define f(n-1) and f(n-2), which in turn require f(n-3) and f(n-4), and so on until we meet the base cases 0 and 1, which don’t require any other values of f.

A bit more formally, start with the function that loops on every input, \Lambda. Define approximations f_i, where f_i is defined exactly on the inputs that terminate within i recursive calls, by f_0 = \Lambda and f_i = \Phi(f_{i-1}). Monotonicity of \Phi (more defined inputs give more defined outputs) makes these approximations an increasing chain f_0 \sqsubset f_1 \sqsubset \cdots, so we may define f = \bigcup_{i \in \mathbb{N}} f_i: an input is defined precisely when it terminates in some finite number of recursive calls. Continuity is what makes f a fixed point: \Phi(f)(n) depends on only finitely many values of f, all of which are already present in some finite stage f_i.
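Here is a sketch of this chain construction in Haskell, modeling partial functions as Integer -> Maybe Integer with Nothing playing the role of \bot (the names phiM, approx, and lfp are my own, not from the post):

-- The Fibonacci recursive operator, lifted to explicit partial functions.
phiM :: (Integer -> Maybe Integer) -> (Integer -> Maybe Integer)
phiM _ 0 = Just 0
phiM _ 1 = Just 1
phiM g n = (+) <$> g (n - 1) <*> g (n - 2)

-- f_i = Phi^i(Lambda): defined exactly on the inputs that terminate
-- within i recursive calls.
approx :: Int -> (Integer -> Maybe Integer)
approx i = iterate phiM (const Nothing) !! i

-- The union of the chain: search the stages for the first defined value.
-- If no stage defines f(n), the search diverges, which is exactly
-- f(n) = \bot.
lfp :: Integer -> Integer
lfp n = head [v | Just v <- [approx i n | i <- [0 ..]]]

Here approx 1 is defined only at 0 and 1, approx 2 also at 2, and so on; lfp 10 evaluates to 55.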

It’s not too hard to see that this f is least: if g is any fixed point, an easy induction shows f_i \sqsubset g for every i, since each f_i records only values forced by finitely many recursive calls; hence f \sqsubset g. \square

Ultimately, the proof is nothing more than “compiling our recursive operator,” in the same way the compiler in the programming language perspective “automatically” defines a function that terminates if all recursive calls terminate, or is undefined if there’s an infinite loop.

Finally, we should mention the idea behind the definition of recursive operators. As seen in the proof, and as is true of programming language recursion, the key idea is that computing a value \Phi(f)(n) should only require finitely many values of f. This can be formalized by saying that \Phi is a continuous function on R.

We define the positive information topology on R. Let \mathcal{F} be the set of functions that are defined on only finitely many inputs, and are \bot elsewhere. We think of functions in \mathcal{F} as specifying finitely many values of functions in R. A basis of open sets for the positive information topology is given by the sets U_\theta, for \theta \in \mathcal{F}, defined by

U_\theta = \{ g \in R \mid \theta \sqsubset g\}

That is, U_\theta is exactly the set of functions that agree with the (finitely many) output values specified by \theta. Continuity of \Phi means that the preimage of every basic open set is open: whenever \Phi(g) \in U_\xi, there is some finite \nu \sqsubset g with U_\nu contained in that preimage. Taking \xi to be a singleton function, so that U_\xi is exactly the set of functions with one specified output, continuity says that fixing the finitely many values in \nu ensures that single output is produced.
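Unwinding the definitions, continuity in this topology can be stated without mentioning open sets; this standard reformulation (not spelled out in the original post) is exactly the “finite use” property from before:

f \sqsubset g \implies \Phi(f) \sqsubset \Phi(g), \qquad \Phi(f) = \bigcup \{\, \Phi(\theta) \mid \theta \in \mathcal{F},\ \theta \sqsubset f \,\}

That is, \Phi is monotone, and every defined value of \Phi(f) is already a defined value of \Phi(\theta) for some finite piece \theta of f.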

A recursive operator is necessarily a continuous function on this space. A “coherence condition” is also needed to pin down the proper definition, but continuity suffices for intuition. For the full definitions, see Cutland’s book on computability theory, Chapter 10.