Kolmogorov Complexity
The Kolmogorov complexity (after Andrei Nikolaevich Kolmogorov) is a measure of the structure of a character string and is given by the length of the shortest program that generates this string. This shortest program thus yields the best possible compression of the string without any loss of information.
If the Kolmogorov complexity of a string is at least as great as the length of the string itself, the string is called incompressible, random, or structureless. The closer the Kolmogorov complexity is to the length of the string, the more 'random' the string is (and the more information it contains).
The concept of Kolmogorov complexity was developed independently by Ray Solomonoff in 1964, Andrei Kolmogorov in 1965, and Gregory Chaitin in 1969, and is related to Shannon's information theory.
Kolmogorov complexity is sometimes also called algorithmic complexity or descriptive complexity, but it must not be confused with the time or space complexity of algorithms. The somewhat more precise term algorithmic information content also establishes the connection to Shannon's concept of information content. A related but clearly distinct approach is algorithmic depth, which refers to the effort required to generate or decode a particular message. Gregory Chaitin's algorithmic information theory refined Kolmogorov's approach with regard to the machine model. With the minimum description length, Jorma Rissanen describes a similar concept based on the compression of data.
Example
An example of generating a sequence of 1000 zeros illustrates the compression: the number 1000 can be represented in binary with 10 bits. Given a (constant-size) program for printing a sequence of zeros, the Kolmogorov complexity of a sequence of n zeros is "constant + log(n)":
Program Nullfolge(n)
begin
  for i := 1 to n    // in the example, n = 1000
    print "0"
end
The above program can be encoded with a constant number of bits, e.g. as machine code or as a simple text file. Encoding the number n requires log(n) bits. The entire encoding therefore requires "constant + log(n)" bits in total, and hence, for large n, significantly fewer than n bits. The zero sequence is therefore compressible.
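The bit-count argument above can be sketched in Python. The program constant below is an illustrative assumption, not part of the original text; the actual constant depends on the chosen encoding and machine model:

```python
import math

def zero_sequence(n):
    """The short 'program': generates a string of n zeros."""
    return "0" * n

n = 1000
sequence = zero_sequence(n)

# Encoding the number n in binary takes about log2(n) bits; the
# constant-size program text is assumed here to cost 100 bits.
program_constant_bits = 100                  # illustrative assumption
bits_for_n = math.ceil(math.log2(n + 1))     # 10 bits for n = 1000
description_bits = program_constant_bits + bits_for_n

print(len(sequence) * 8)   # 8000 bits as raw 8-bit text
print(description_bits)    # 110 bits: far less for large n
```

For large n the description length "constant + log(n)" grows far more slowly than n, which is exactly why the sequence is compressible.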
The following illustration shows the compressibility:
Program Nullfolge (n)00000000000000000000000000000
0begin00000000000000000000000000000000000000000000
00for i:= 1 to n0000000000000000000000000000000000
000print "0"00000000000000000000000000000000000000
0end0000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
The program that generates the sequence of 1000 zeros occupies barely more than 5% of the sequence itself.
Randomness or order?
However, some number sequences are (in this sense) only seemingly random. For example, there is a short program that generates the decimal expansion of the number π to any desired precision. This again yields a complexity of the form "constant + log(n)", where n denotes the accuracy of the representation.
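Such a short π-generating program can be sketched concretely. The following uses Gibbons' unbounded spigot algorithm, which is one possible choice, not the program the text has in mind; any fixed digit-generating program would do:

```python
def pi_digits():
    """Yield the decimal digits of pi one at a time
    (Gibbons' unbounded spigot algorithm)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale.
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            # Consume one more term of the underlying series.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

gen = pi_digits()
print("".join(str(next(gen)) for _ in range(10)))  # 3141592653
```

The program text is constant-size; only the number of requested digits n must be encoded, which again costs about log(n) bits.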
Computation
The Kolmogorov complexity is not computable; however, it is upper semicomputable, i.e. it can be approximated from above by a recursive enumeration.
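The approximation from above can be made tangible: every lossless compressor yields a computable upper bound on the Kolmogorov complexity, and adding more compressors (or searching over more programs) can only lower that bound. A sketch using standard-library compressors, which bounds K but never computes it exactly (the constant overhead of the decompression program is ignored here):

```python
import bz2
import lzma
import zlib

def complexity_upper_bound(data: bytes) -> int:
    """A computable upper bound on K(data): the shortest result
    among several lossless compressors."""
    candidates = [
        len(zlib.compress(data, 9)),
        len(bz2.compress(data, 9)),
        len(lzma.compress(data)),
    ]
    # Each candidate is a valid description; taking the minimum
    # over more candidates can only improve (lower) the bound.
    return min(candidates)

zeros = b"0" * 1000
print(complexity_upper_bound(zeros), "<<", len(zeros))
```

No such procedure can be turned around into a lower bound: one can never certify that no shorter program exists, which is precisely why K is not computable.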
Applications
Today, Kolmogorov complexity is applied in computer science, biology, and other sciences that deal with structures or information.
- Data compression
- Definition of randomness in strings
Literature
- Ming Li, Paul Vitányi: An Introduction to Kolmogorov Complexity and Its Applications. 3rd edition. Springer-Verlag, New York 2008, ISBN 978-0-387-33998-6, doi:10.1007/978-0-387-49820-1.
- Juraj Hromkovič: Theoretical Computer Science. Formal Languages, Computability, Complexity Theory, Algorithms, Communication and Cryptography. 3rd revised and expanded edition. Teubner Verlag, Wiesbaden 2007, ISBN 978-3-8351-0043-5.