The Ackermann Function Defined By Hyperoperation

One of my associates recently remarked, “I can write down the Ackermann recursion, but I have zero intuition where it comes from”. This expository post explains the answer I gave him, that the Ackermann function arises via hyperoperation.

The Ackermann function A(m, n) is variously defined, but the most popular these days is the Ackermann-Péter function:

A(0, n) = n+1

A(m, 0) = A(m-1, 1)

A(m, n) = A(m - 1, A(m, n-1))

One way to build up to the Ackermann function is through hyperoperation. We’ll show that A(m, n) is pretty much 2 \wedge_m n for the m-th hyperoperator \wedge_m: in fact A(m, n) = 2 \wedge_m (n+3) -3.

Fixing the first argument of A(m, n) yields a function based on the m-th hyperoperator, which is why we often think of the Ackermann function as a family of functions parameterized by the first argument. In retrospect, maybe the better definition for the Ackermann function is the simpler A(m,n) = 2 \wedge_m n, so that the ubiquitous exercise to compute the first few functions x \mapsto A(m, x) becomes just a little bit less of an exercise.


Interpret multiplication x * y as repeated addition: add x to itself, y times. Interpret exponentiation x^y as repeated multiplication: multiply x with itself, y times. Going one step further, interpret tetration x \uparrow y as repeated exponentiation: exponentiate x with itself, y times. For example,

x \uparrow 6 = x^{x^{x^{x^{x^{x}}}}}

Going infinitely many steps further, we can define the k-th hyperoperation x \wedge_k y to “recursively apply the previous hyperoperation on x with itself, y times” (this of course stops being commutative at and beyond exponentiation, though the first hyperoperators constructed were actually commutative). To actually implement this we should use the recursion “compute x hyperoperated with itself y-1 times, then apply x one more time”:

x \wedge_k y \overset{\text{def}}{=} x \wedge_{k-1}(x \wedge_k (y-1))

Notice we fold right here i.e. compute x \wedge_{k-1} (x \wedge_{k-1} (x \wedge_{k-1}\cdots (x \wedge_{k-1} x)\cdots)) rather than ((\cdots(x \wedge_{k-1} x) \cdots\wedge_{k-1} x) \wedge_{k-1} x) \wedge_{k-1} x.

We also need a trio of base cases for what happens when you “apply x to itself 0 times” in order to align with the normal arithmetic operations:

x \wedge_1 0 \overset{\text{def}}{=} x

x \wedge_ 2 0 \overset{\text{def}}{=} 0

x \wedge_k 0 \overset{\text{def}}{=} 1 for k \geq 3

Finally, we set x \wedge_0 y\overset{\text{def}}{=} y+1, to capture that addition, \wedge_1, arises as the iterated successor function, our \wedge_0. After making all these definitions, we have our hyperoperators. These extend the normal arithmetic operators in the following way:

x \wedge_1 y = x + y

x \wedge_2 y = x * y

x \wedge_3 y = x^y

x \wedge_4 y = x \uparrow y

Proof of Ackermann/Hyperoperator Identity

The point of defining these hyperoperations is to note that the Ackermann function is itself a specific case: A(m, n) = 2 \wedge_m (n+3) -3. The proof of this isn’t too exciting, but let’s prove it by induction. Check the base case m = 0,

A(0,n) = n+1

2 \wedge_0 (n+3) - 3 = n+1

For the base case n = 0, we need key properties of using 2 instead of a different constant:

2 \wedge_k 1 = 2 for k \geq 3

2 \wedge_k 2 = 4 for k \geq 3

Now we have

A(m, 0) = A(m-1, 1)

2 \wedge_m (0+3) - 3 = 2 \wedge_{m-1} (2 \wedge_m 2) - 3 = 2 \wedge_{m-1} 4 -3

And the primary recurrence for 2 \wedge_m n and A(m, n):

A(m, n) = A(m-1, A(m, n-1))

2 \wedge_m (n+3) - 3 =2 \wedge_{m-1}(2 \wedge_{m} (n+2)) - 3 =2 \wedge_{m-1}((2 \wedge_{m} (n+2) - 3) + 3) - 3

Substituting 2 \wedge_m (n+2) - 3 for A(m, n-1) by induction, we see they are in fact the same recursion. \square