Ackermann function

The Ackermann function is an extremely fast growing mathematical function found by Wilhelm Ackermann in 1926 , with the help of which the limits of computer and calculation models can be shown in theoretical informatics . Today there is a whole series of functions that are known as the Ackermann function. These all show a similar formation law as the original Ackermann function and also have a similar growth behavior.

History of origin

In 1926 David Hilbert hypothesized that every computable function was primitive-recursive . Put simply, this means that every function that can be calculated by a computer can be composed of a few, very simple rules and that the duration of the calculation can be estimated in advance. This applies to almost all functions that occur in practice.

Also in 1926, Ackermann constructed a function that refuted this assumption and published it in 1928. In his honor, this function is now called the Ackermann function. It can be evaluated by a computer in a finite time, but it is not primitive-recursive.

In 1935, Rózsa Péter constructed a simplified version that had the same properties. This function, sometimes referred to as the Ackermann-Péter function, is mainly used today.

idea

One looks at the result . In this case, the operation of the previous term is applied- times to for each sequence element, i.e. is even , where the variable- times occurs and so on. The idea behind the Ackermann function is to understand this sequence as a function. ${\ displaystyle a + b, a \ cdot b, a ^ {b}, \ ldots}$ ${\ displaystyle (b-1)}$ ${\ displaystyle a}$ ${\ displaystyle a \ cdot b}$ ${\ displaystyle a + a + \ dotsb + a}$ ${\ displaystyle a}$ ${\ displaystyle b}$

Example : Substituting in the above sequence and , we obtain the sequence: 6, 8, 16, 65536, (65536 twos), ..., which is also called the Ackermann function and calls. The last number listed is already much larger than the estimated number of all atoms in the universe.

{\ displaystyle a = 2}

{\ displaystyle b = 4}

{\ displaystyle 2 ^ {2 ^ {. ^ {. ^ {. ^ {2}}}}}}

{\ displaystyle a = 2}

{\ displaystyle b = 4}

The Ackermann function, notated as , is a function that satisfies the following equations: ${\ displaystyle \ varphi}$

{\ displaystyle \ varphi (a, b, 0) = a + b \,}

{\ displaystyle \ varphi (a, b, 1) = a \ cdot b}

{\ displaystyle \ varphi (a, b, 2) = a ^ {b} \,}

{\ displaystyle \ ldots}

From the fourth line, the function values can no longer be formulated with conventional operators ; you need extended notations, such as the hyper operator .

Definition and variants

The Ackermann function is usually defined recursively , i.e. H. one makes explicit statements for some initial values and gives instructions (the recursion scheme) how to get further function values from those already calculated.

Definition of Ackermann

Ackermann himself defined the function in a rather cumbersome way, but shortly afterwards gave the following equivalent definition:

{\ displaystyle \ varphi (a, b, 0) = a + b \,}

{\ displaystyle \ varphi (a, 0, n + 1) = \ alpha (a, n) \,}

{\ displaystyle \ varphi (a, b + 1, n + 1) = \ varphi (a, \ varphi (a, b, n + 1), n) \,}

There is another function that Ackermann did not describe further. It provides the starting values : ${\ displaystyle \ alpha \ left (a, n \ right)}$ ${\ displaystyle a \ cdot 0, a ^ {0}, \ ldots}$

{\ displaystyle \ alpha (a, n) = {\ begin {cases} 0, & {\ text {if}} n = 0 \\ 1, & {\ text {if}} n = 1 \\ a, & {\ text {if}} n> 1 \ end {cases}}}

Examples :

If one wants to calculate, it is possible to apply the first line of the definition and obtained directly: . ${\ displaystyle \ varphi (4,3,0)}$ ${\ displaystyle \ varphi (4,3,0) = 4 + 3 = 7}$
If one wants to calculate, it is possible to apply the second line and obtains: . ${\ displaystyle \ varphi (4,0,2)}$ ${\ displaystyle \ varphi (4,0,2) = \ alpha (4,1) = 1}$
If neither the second nor the third argument is 0, the third line of the definition is used. For example . If you insert from the previous example, you get . Now you can apply the third line again, and then the second twice. It all adds up to: ${\ displaystyle \ varphi (4,1,2) = \ varphi (4, \ varphi (4,0,2), 1)}$ ${\ displaystyle \ varphi (4,0,2) = 1}$ ${\ displaystyle \ varphi (4,1,1)}$

{\ displaystyle {\ begin {aligned} \ varphi (4,1,2) & = \ varphi (4, \ varphi (4,0,2), 1) \\ & = \ varphi (4,1,1) \\ & = \ varphi (4, \ varphi (4,0,1), 0) \\ & = \ varphi (4,0,0) \\ & = 4 \ end {aligned}}}

When one speaks of the growth of the Ackermann function, one often means the function . ${\ displaystyle f \ left (n \ right): = \ varphi (n, n, n)}$

Definition of peter

In 1935, Rózsa Péter defined a simpler version of the Ackermann function, which has only two parameters and also works without the auxiliary function: ${\ displaystyle \ alpha}$

{\ displaystyle {\ begin {array} {lcl} \ operatorname {a} (0, m) & = & m + 1 \\\ operatorname {a} (n + 1,0) & = & \ operatorname {a} ( n, 1) \\\ operatorname {a} (n + 1, m + 1) & = & \ operatorname {a} (n, \ operatorname {a} (n + 1, m)) \ end {array}} }

Here, too, in connection with growth studies, one often means the function when one speaks of the Ackermann function . ${\ displaystyle f (n): = \ operatorname {a} (n, n)}$

From this definition it is not immediately apparent that the function is defined for all non-negative, integer and defined, since it is also increased in some cases. But one can see that on the one hand for is well-defined. On the other hand, under the condition that it is well-defined, it is also well-defined by iteratively using the last two recursion rules. ${\ displaystyle \ operatorname {a} (n, m)}$ ${\ displaystyle n}$ ${\ displaystyle m}$ ${\ displaystyle m}$ ${\ displaystyle \ operatorname {a} (n, \ cdot)}$ ${\ displaystyle n = 0}$ ${\ displaystyle \ operatorname {a} (n_ {0}, \ cdot)}$ ${\ displaystyle \ operatorname {a} (n_ {0} +1, \ cdot)}$

It should be noted, however, that there is no upper limit for the growth of in the following function calls when decreasing . ${\ displaystyle n}$ ${\ displaystyle m}$

Modified Ackermann function

Often, slightly modified versions of the Ackermann function are used in the complexity analysis, but they have the same asymptotic runtime behavior. One of these variants, which allows an interpretation as a "generalized exponential function", can be specified as follows:

{\ displaystyle {\ begin {array} {lcl} A_ {1} (n) & = & 2n & {\ text {if}} n \ geq 1, \\ A_ {k} (1) & = & 2 & {\ text { if}} k \ geq 2, \\ A_ {k} (n) & = & A_ {k-1} (A_ {k} (n-1)) & {\ text {if}} n \ geq 2, k \ geq 2. \ end {array}}}

The sequence of functions thus defined can now be used to define the modified Ackermann function by adding ${\ displaystyle A_ {k}}$ ${\ displaystyle A}$

{\ displaystyle A (n) = A_ {n} (n) \,}

puts.

The functions can now be interpreted as a natural continuation of addition, multiplication and exponentiation. For example, represents the fold addition of the number , the fold multiplication of the number , etc. It is ${\ displaystyle A_ {k}}$ ${\ displaystyle A_ {1} (n)}$ ${\ displaystyle n}$ ${\ displaystyle 2}$ ${\ displaystyle A_ {2} (n)}$ ${\ displaystyle n}$ ${\ displaystyle 2}$

{\ displaystyle A_ {1} (k) = 2 + 2 + 2 + \ dotsb + 2 = 2 \ cdot k}

{\ displaystyle A_ {2} (k) = 2 \ cdot 2 \ cdot 2 \ dotsm 2 = 2 ^ {k}}

{\ displaystyle A_ {3} (k) = 2 ^ {2 ^ {2 ^ {{} ^ {. \, ^ {. \, ^ {. \, ^ {2}}}}}}}}

Significance in theoretical computer science

As already mentioned at the beginning, Ackermann invented this function as an example of a function that is not primitive recursive, but is calculable .

When looking for the limits of computers, one very quickly comes across the concept of calculable functions . These are all the functions for whose evaluation an algorithm can be specified, i.e. all functions that a computer (especially a Turing machine ) can calculate. This definition quickly poses a problem when you want to decide whether a specific function is predictable. If one finds an algorithm that calculates the function, it is obviously calculable. Otherwise it is uncertain whether the function is really unpredictable or whether there is an algorithm but it has not been found.

For this reason one is looking for alternative definitions with which one can make such a proof more easily. A first approach for this was the primitive recursive functions. These are functions that can be composed of very simple functions using a few rules.

For some time it was assumed that all computable functions are primitive-recursive, so that the primitive-recursive functions provide a tool to solve the problem described above. However, this hope destroyed the Ackermann function, of which one can prove that it is predictable but not primitive-recursive. (See evidence below.)

If one introduces another construction rule, the so-called µ-recursion , to the class of primitive-recursive functions , one obtains a larger class of also computable functions, which contains the Ackermann function. It is assumed that this class of µ-recursive functions corresponds to the class of intuitively computable functions ( Church's thesis ).

proof

The proof that the Ackermann function is computable, but not primitive-recursive, essentially uses the fact that the Ackermann function grows faster than any primitive-recursive function.

Proof sketch for the claim that the Ackermann function is not primitive-recursive:

First you define a function for every primitive recursive function ${\ displaystyle g}$

${\ displaystyle f_ {g} (n): = \ max \ left \ {g (n_ {1}, \ ldots, n_ {k}): \ sum _ {i = 1} ^ {k} n_ {i} \ leq n \ right \}}$

This function specifies the maximum that can be achieved with if the sum of the arguments does not exceed. ${\ displaystyle g}$ ${\ displaystyle n}$
Then you show by structural induction over the inductive construction of the primitive recursive functions that every primitive recursive function is a natural number is such that for all the following applies: . This clearly shows that the Ackermann function grows faster than any primitive recursive function. ${\ displaystyle g}$ ${\ displaystyle k}$ ${\ displaystyle n \ geq k}$ ${\ displaystyle f_ {g} (n) <a (k, n)}$
This then proves, as follows, that the Ackermann function is not primitive-recursive:
Assuming it is primitive-recursive, then too . After the preliminary remarks but there is a such that for all the following applies: . If you put here , you get the contradiction: ${\ displaystyle a (k, n)}$ ${\ displaystyle g (n): = a (n, n)}$ ${\ displaystyle k}$ ${\ displaystyle n \ geq k}$ ${\ displaystyle g (n) <a (k, n)}$ ${\ displaystyle n = k}$
${\ displaystyle g (k) \ leq f_ {g} (k) <a (k, k) = g (k)}$

Applications

There are very few applications for the Ackermann function. The two most important are benchmark tests for recursive calls in programming languages and run-time estimates of the weighted union and path compression in the union find structure .

Benchmark for recursive calls

When introducing new programming languages , compilers and computers, one wants to examine their performance. To do this, u. a. Using benchmarking , special programs are used to check properties that have been set.

In this context, the Ackermann function is often used as a benchmark for checking recursive procedure calls, since a program for calculating the Ackermann function essentially only consists of such procedure calls. Péter's definition only calculates directly. The difficulty in calculating the function values is not just their size, but the deep nesting of the function calls, which easily leads to a stack overflow, i.e. the system running out of memory. The Ackermann function is therefore a simple and safe method of provoking a stack overflow , for example to test whether this error case is being processed and, if so, how this is done. The Ackermann function has the advantage that it is immune to compiler optimizations and that even static source code analyzes can practically not detect the (possible) stack overflow. ${\ displaystyle a (0, m)}$

This idea goes back to Yngve Sundblad, who used the function in 1971 to compare various programming languages. In order to calculate, calls are made. ${\ displaystyle f (n): = a (3, n)}$ ${\ displaystyle a (3, n)}$ ${\ displaystyle a (3, n) + 12 ^ {n} -2}$

Sundblad tested, among other things, how large can be chosen so that the computer is still able to calculate this number. At that time he reached . For comparison: With Java 1.4.2 and the standard memory settings you can nowadays . ${\ displaystyle n}$ ${\ displaystyle n = 1}$ ${\ displaystyle n = 13}$

In the course of the calculation, many identical calls are calculated multiple times. An intelligent compiler can take advantage of this and cache the results so that such calls only have to be made once. With this, calls to were already feasible in 1971 . You also get a significant time advantage if you calculate directly instead of expanding it recursively . The direct calculation of requires linear time in . The calculation of requires quadratic time because it leads to (that is, for a constant ; see Landau symbols ) nested calls of for different ones . Finally, the calculation of requires a time that is too proportional ( ). ${\ displaystyle n = 20}$ ${\ displaystyle a (1, n)}$ ${\ displaystyle a (1, a (1, a (1, \ dots, a (1,0) \ dots)))}$ ${\ displaystyle a (1, n)}$ ${\ displaystyle n}$ ${\ displaystyle a (2, n)}$ ${\ displaystyle {\ mathcal {O}} (n)}$ ${\ displaystyle c \ cdot n}$ ${\ displaystyle c}$ ${\ displaystyle a (1, i)}$ ${\ displaystyle i}$ ${\ displaystyle a (3, n)}$ ${\ displaystyle 4 ^ {n + 1}}$ ${\ displaystyle {\ mathcal {O}} (4 ^ {n + 1})}$

Run time estimates with the inverse function

Since the function grows very quickly, its inverse function grows very slowly. It is less than 5 for every practically imaginable input quantity, because the function value is greater than the number of atoms in the universe, as the calculation below shows. In the practical analysis of algorithms, it can therefore be regarded as constant. ${\ displaystyle f (n): = a (n, n)}$ ${\ displaystyle f ^ {- 1}}$ ${\ displaystyle f (4) = a (4,4)}$ ${\ displaystyle a (4,3)}$

This inverse function appears in the run-time analysis of certain algorithms , for example in the union find problem and in Chazelle's algorithm for minimal spanning trees . In this and other contexts, the original Ackermann function is often easily redefined to a function with similar asymptotic behavior by omitting additive constants or other modifications. These modified functions are not the same as the Ackermann function, but by the standards of the run-time analysis they can be regarded as equivalent.

implementation

The recursive implementation of the Ackermann function (here in pseudocode ) corresponds directly to the definition:

 function ack(n, m)
     if n = 0
         return m + 1
     else if m = 0
         return ack(n - 1, 1)
     else
         return ack(n - 1, ack(n, m - 1))

The following, partially iterative implementation is somewhat more efficient :

 function ack(n, m)
     while n ≠ 0
         if m = 0
             m:= 1
         else
             m:= ack(n, m - 1)
         n:= n - 1
     return m + 1

Even more efficient implementations use arrays to temporarily store values that have already been calculated, see also Dynamic Programming .

In Haskell , a functional programming language , the implementation directly reflects the definition:

ack 0 m = m+1
ack n 0 = ack (n-1) 1
ack n m = ack (n-1) (ack n (m-1))

In Prolog the implementation looks like this:

 ackermann(0,X,Y) :- X >= 0,!, Y is X + 1.
 ackermann(X,0,Z) :- X > 0,!, X1 is X - 1, ackermann(X1,1,Z).
 ackermann(X,Y,Z) :- X > 0, Y > 0, X1 is X-1, Y1 is Y - 1, ackermann(X,Y1,W), ackermann(X1,W,Z).

In lambda calculus , a purely iterative implementation is even possible. 1 and succ use the Church numerals to represent the natural numbers. The equations of Péter's definition can be shown directly by β-conversion .

  ack ≡ λn. n (λf.λm. m f (f 1)) succ

Table of values

The following table shows some function values for small values of and . The not fully calculated values are too large to be displayed in decimal format. ${\ displaystyle n}$ ${\ displaystyle m}$

Values of ${\ displaystyle a (n, m)}$
n \ m	0	1	2	3	4th	...	${\ displaystyle m}$
0	1	2	3	4th	5	...	${\ displaystyle m + 1}$
1	2	3	4th	5	6th	...	${\ displaystyle m + 2}$
2	3	5	7th	9	11	...	${\ displaystyle 2m + 3}$
3	5	13	29	61	125	...	${\ displaystyle 8 \ cdot 2 ^ {m} -3}$
4th	13	65533	${\ displaystyle 2 ^ {65536} -3 \ approx 2 \ cdot 10 ^ {19728}}$ (Number with 19729 decimal places)	${\ displaystyle a (3.2 ^ {65536} -3)}$	${\ displaystyle a (3, a (4,3))}$	...	${\ displaystyle {\ begin {matrix} \ underbrace {{2 ^ {2}} ^ {{\ cdot} ^ {{\ cdot} ^ {{\ cdot} ^ {2}}}}} -3 \ end { matrix}}}$ ( Power tower with m + 3 numbers)
5	65533	${\ displaystyle a (4,65533)}$	${\ displaystyle a (4, a (5,1))}$	${\ displaystyle a (4, a (5,2))}$	${\ displaystyle a (4, a (5,3))}$
6th	${\ displaystyle a (5,1)}$	${\ displaystyle a (5, a (5,1))}$	${\ displaystyle a (5, a (6,1))}$	${\ displaystyle a (5, a (6,2))}$	${\ displaystyle a (5, a (6,3))}$

Despite the unimaginably large numbers that already appear in this table, recursive procedures have been defined that deliver values that grow even faster, such as Graham's number .

Closer examination

Using the table of values, a scheme for calculating the function values can be derived that is easier to understand than the formal recursive definition. It's easy to see that the values on the first line are simply a list of all natural numbers. The respective entries can be calculated with the formula . All of the following lines simply contain instructions to find a value on that line. In the case of the line , this statement simplifies looking for the value in the line , but this simplification is a little more difficult to derive - for example: ${\ displaystyle a (0, m) = m + 1}$ ${\ displaystyle n = 1}$ ${\ displaystyle a (0, m + 1)}$ ${\ displaystyle n = 0}$

a(1, 2) = a(0, a(1,1))
        = a(0, a(0, a(1,0)))
        = a(0, a(0, 2))
        = a(0, 3)
        = 4

We now consider a more complex case, namely , the first function value, which is so large that it can practically not be written down in decimal. ${\ displaystyle a (4,3)}$

a(4, 3) = a(3, a(4, 2))
        = a(3, a(3, a(4, 1)))
        = a(3, a(3, a(3, a(4, 0))))
        = a(3, a(3, a(3, a(3, 1))))
        = a(3, a(3, a(3, a(2, a(3, 0)))))
        = a(3, a(3, a(3, a(2, a(2, 1)))))
        = a(3, a(3, a(3, a(2, a(1, a(2, 0))))))
        = a(3, a(3, a(3, a(2, a(1, a(1, 1))))))
        = a(3, a(3, a(3, a(2, a(1, a(0, a(1, 0)))))))
        = a(3, a(3, a(3, a(2, a(1, a(0, a(0, 1)))))))
        = a(3, a(3, a(3, a(2, a(1, a(0, 2))))))
        = a(3, a(3, a(3, a(2, a(1, 3)))))
        = a(3, a(3, a(3, a(2, a(0, a(1, 2))))))
        = a(3, a(3, a(3, a(2, a(0, a(0, a(1, 1)))))))
        = a(3, a(3, a(3, a(2, a(0, a(0, a(0, a(1, 0))))))))
        = a(3, a(3, a(3, a(2, a(0, a(0, a(0, a(0, 1))))))))
        = a(3, a(3, a(3, a(2, a(0, a(0, a(0, 2))))))
        = a(3, a(3, a(3, a(2, a(0, a(0, 3)))))
        = a(3, a(3, a(3, a(2, a(0, 4)))))
        = a(3, a(3, a(3, a(2, 5))))
        = ...

This has been the simplest case of such an expansion for some time, and it is obvious from the table why function values like this are rarely calculated directly. It is also interesting to see how many steps it takes to simplify Ackermann expressions that are already very simple. Each line in the previous example is a single application of one of the three parts of the definition of the Ackermann function.

    … = a(3, a(3, a(3, 13)))
        = ...
        = a(3, a(3, 65533))

If we skip several logical steps at this point, we could evaluate to 13 and then try to evaluate - that is 65533. But the next function call already gives me a number that goes far beyond the estimated number of atoms in the universe. This number is then used in the calculation , which at some point would be written out for a printout of the form , but which can no longer be written down with our means. ${\ displaystyle a (2.5)}$ ${\ displaystyle a (3.13)}$ ${\ displaystyle a (3,65533)}$ ${\ displaystyle m}$ ${\ displaystyle a (3, m)}$ ${\ displaystyle a (2, a (2, a (2, \ dots, a (2,0) \ dots)))}$

Another interesting aspect of the Ackermann function is that the only computation that actually comes up alongside the recursive calls is the computation of , which simply increases by 1. ${\ displaystyle a (0, m)}$ ${\ displaystyle m}$

literature

Wilhelm Ackermann: On Hilbert's structure of real numbers . In: Mathematische Annalen , 99, 1928, pp. 118-133, ISSN 0025-5831 .
Yngve Sundblad: The Ackermann Function. A Theoretical, Computational, and Formula Manipulative Study. In: BIT - numerical mathematics. Springer, Dordrecht 11.1971, pp. 107-119, ISSN 0006-3835 .
Uwe Schöning: Theoretical Computer Science - in a nutshell. Spektrum Akademischer Verlag, Heidelberg 2001, ISBN 3-8274-1099-1 .
Dexter C. Kozen: The Design and Analysis of Algorithms . Springer, Berlin 1992, ISBN 3-540-97687-6 .

Web links

Explanatory video for the Ackermann function (English)

Individual evidence

↑ Péter Rózsa: Construction of non-recursive functions . In: Mathematische Annalen , 111, 1935, pp. 42-60
↑ PK Agarwal, M. Sharir: Davenport-Schinzel sequences and their geometric applications . In: Handbook of computational geometry , 2000, pp. 1-47.
↑ For details on the proof see e.g. B. in the book by Uwe Schöning (see literature ).

This version was added to the list of excellent articles on August 22nd, 2004 .

[peter-1] Péter Rózsa: Construction of non-recursive functions . In: Mathematische Annalen , 111, 1935, pp. 42-60

[2] PK Agarwal, M. Sharir: Davenport-Schinzel sequences and their geometric applications . In: Handbook of computational geometry , 2000, pp. 1-47.

[3] For details on the proof see e.g. B. in the book by Uwe Schöning (see literature ).