Time complexity


In computer science, the time complexity of a problem is understood as the number of computational steps that an optimal algorithm needs to solve the problem, as a function of the length of the input. One speaks here of the asymptotic running time, meaning, in the sense of an asymptote, the time behavior of the algorithm for arbitrarily large inputs. What is of interest is therefore not the time required by a specific program on a specific computer, but how the time requirement grows when more data has to be processed, e.g. whether the effort merely doubles or grows quadratically when the amount of data is doubled (scalability). The running time is therefore given as a function of the length n of the input and is estimated asymptotically for ever larger n using Landau notation (big O notation). For recurrence relations, the master theorem or the substitution method allow a more precise estimate of the running time.
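
The following minimal Python sketch illustrates this scaling behavior by counting elementary steps of a linear and a quadratic procedure for two input lengths; the functions and the step counting are purely illustrative assumptions, not part of any standard algorithm.

    # Illustrative sketch: count elementary steps of a linear and a quadratic
    # procedure to see how the effort grows when the input length is doubled.

    def linear_steps(n):
        """One pass over the input: roughly n elementary steps."""
        steps = 0
        for _ in range(n):
            steps += 1
        return steps

    def quadratic_steps(n):
        """All pairs of elements: roughly n * n elementary steps."""
        steps = 0
        for _ in range(n):
            for _ in range(n):
                steps += 1
        return steps

    for n in (1000, 2000):
        print(n, linear_steps(n), quadratic_steps(n))
    # Doubling n doubles the linear count but quadruples the quadratic count.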

If the term time complexity refers to a specific algorithm, it means the number of steps that the algorithm needs to process an input of a certain length n (in the best, worst or average case, see below). In complexity theory, however, the actual subject of study is the complexity of problems; the complexity of algorithms is only of interest insofar as it allows statements about the problem under consideration. In practice, though, the complexity of algorithms is often of interest in its own right, especially in order to select a suitable algorithm for a specific context: bubble sort, for example, is a very slow procedure for large amounts of data, but is well suited for small amounts of data because of its low overhead (especially for n ≤ 8).
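
A minimal sketch of bubble sort in Python; the concrete implementation details are illustrative, but the nested loops make visible the roughly quadratic number of comparisons that limits the procedure to small inputs.

    # Illustrative bubble sort: the nested loops perform on the order of n^2
    # comparisons, which is why the procedure only pays off for small inputs.

    def bubble_sort(data):
        items = list(data)  # work on a copy
        n = len(items)
        for i in range(n - 1):
            for j in range(n - 1 - i):
                if items[j] > items[j + 1]:  # compare adjacent elements
                    items[j], items[j + 1] = items[j + 1], items[j]  # swap
        return items

    print(bubble_sort([5, 2, 8, 1, 3]))  # [1, 2, 3, 5, 8]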

The time complexity is always stated with respect to a machine model. As a rule, the reference model is the Turing machine; alternatively, the time complexity can also be stated with respect to a register machine, which behaves more like computers that actually exist. For parallel algorithms, a parallel machine model such as the PRAM can be used. One model related to the PRAM model is the external memory model. Any problem that can be solved efficiently in the PRAM model can also be computed efficiently in external memory.[1] In addition, a distinction is made between the complexity for deterministic and nondeterministic machines.

In complexity theory, time complexity is, alongside space complexity, the most frequently examined aspect of algorithms and problems. For example, a number of important complexity classes are based on the time complexity of the algorithms that solve a problem.

Variants

A distinction is made between the following variants for runtime estimation:

  • The worst-case running time indicates the maximum time the algorithm can need. For many algorithms there are only a few inputs that actually reach this worst-case running time, which is why it is not necessarily a realistic estimate. For real-time systems, however, the worst-case running time must of course be taken into account.
  • The average-case running time is the expected running time for a given distribution of the inputs. Since the distribution of the inputs to programs is not always known, however, computing the average-case running time is in these cases only possible under restrictive assumptions. See also: amortized runtime analysis
  • The best-case running time indicates the minimum time the algorithm needs, i.e. the time for the inputs on which it works ideally. This lower bound for solving the problem is only rarely given, since it applies to only a few cases and since the best-case running time is contained in the bounds for the worse cases.

Dependent time complexity

In most cases, the time complexity can only be stated relative to other time complexities. In sorting algorithms, for example, time is not actually measured directly but in "comparison operations", on the assumption that each such comparison takes constant time. While this normally holds for elementary data types, it is already not the case for strings, for example. However, if two algorithms (in this example two sorting algorithms) have to carry out similar comparisons, the result is still meaningful.
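
One way to make such a measure concrete is to count the comparison operations explicitly. The following sketch wraps the elements in a small counting class before sorting them; the wrapper class and the global counter are assumptions made for illustration only.

    # Illustrative sketch: measure a sorting algorithm in "comparison
    # operations" instead of wall-clock time by counting every comparison.

    import functools

    comparisons = 0

    @functools.total_ordering
    class Counted:
        def __init__(self, value):
            self.value = value

        def __eq__(self, other):
            global comparisons
            comparisons += 1
            return self.value == other.value

        def __lt__(self, other):
            global comparisons
            comparisons += 1
            return self.value < other.value

    data = [Counted(x) for x in [5, 2, 8, 1, 3, 7]]
    sorted(data)  # Python's built-in sort, measured in comparisons
    print("comparisons:", comparisons)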

In some cases this is the only way to make a definitive statement. The DBSCAN algorithm, for example, carries out exactly one neighborhood query for each point. Since a neighborhood query is regarded as the "slowest" operation in the algorithm, the algorithm is also said to be "linear" (in the number of neighborhood queries). However, to compare it with an algorithm that does not perform neighborhood queries, a different measure is needed. Depending on the available index structure, a neighborhood query can be answered at different speeds (without this changing the DBSCAN algorithm itself). Without index support, DBSCAN then has a "quadratic" time complexity (in the number of distance calculations).
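
The following sketch shows a naive neighborhood query without index support; the function names and parameters are illustrative assumptions. Because each query scans all points, an algorithm that asks one neighborhood query per point, as DBSCAN does, ends up performing a quadratic number of distance calculations.

    # Illustrative naive neighborhood query: one linear scan per query, hence
    # n queries (one per point) lead to n * n distance calculations in total.

    import math

    def region_query(points, center, eps):
        """Return all points within distance eps of center (linear scan)."""
        return [p for p in points
                if math.dist(p, center) <= eps]  # one distance calculation per point

    points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    for p in points:  # one neighborhood query per point, as in DBSCAN
        print(p, region_query(points, p, eps=0.5))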

Example

Twenty names are stored in a list. A name that is contained in the list is now entered and searched for. The list is searched sequentially, starting with the first entry, until the name is found.

  • best case: The name searched for is the first in the list; the search time is 1.
  • worst case: The name searched for is the last in the list; the search time is 20. This search time would also occur if the name were not in the list.
  • average case: If the name is definitely in the list, the average case is 10.5 (see the sketch below for how these figures arise).
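
The following sketch reproduces these figures by counting the comparisons of a sequential search; the placeholder names and the helper function are illustrative assumptions.

    # Illustrative sketch: count the comparisons of a sequential search over
    # a list of 20 entries to reproduce the best, worst and average case.

    names = [f"name{i}" for i in range(1, 21)]  # 20 placeholder names

    def search_cost(names, target):
        """Number of comparisons until the target is found."""
        for cost, name in enumerate(names, start=1):
            if name == target:
                return cost
        return len(names)  # not found: every entry was compared

    print(search_cost(names, "name1"))   # best case: 1
    print(search_cost(names, "name20"))  # worst case: 20
    costs = [search_cost(names, t) for t in names]
    print(sum(costs) / len(costs))       # average case: 10.5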

If one simply speaks of time complexity, the worst-case estimate is usually meant.

For complex algorithms, amortized analysis, which considers the average cost of the algorithm over all possible inputs instead of only the best/worst/uniformly distributed case, is suitable for a realistic estimate of the running time. The decisive factor is how likely it is that a particular case occurs. This kind of analysis is especially suitable when considering the effort of a partial operation within a larger algorithm: when sorting with a Fibonacci heap, for example, inserting a new entry is quite expensive in the worst case, but such cases occur only once during a run of the overall algorithm; in all subsequent steps the heap is already "almost sorted" and the individual step is cheap. The problem with amortized analysis is that it is not easy to carry out, because one first has to devise a function that models the behavior of the data structure as precisely as possible and thus indicates how likely the individual cases are.
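
The idea can be illustrated with a much simpler data structure than a Fibonacci heap, for example a dynamic array that occasionally has to copy all of its elements; the class below is such a simplified, illustrative substitute and not the structure discussed above.

    # Illustrative substitute for amortized analysis: individual append
    # operations are occasionally expensive (the array is copied), but the
    # cost averaged over the whole sequence - the amortized cost - stays small.

    class DynamicArray:
        def __init__(self):
            self.capacity = 1
            self.size = 0
            self.data = [None] * self.capacity

        def append(self, value):
            cost = 1                           # writing the new element
            if self.size == self.capacity:     # array full: copy everything
                self.capacity *= 2
                new_data = [None] * self.capacity
                for i in range(self.size):
                    new_data[i] = self.data[i]
                    cost += 1                  # one step per copied element
                self.data = new_data
            self.data[self.size] = value
            self.size += 1
            return cost

    arr = DynamicArray()
    costs = [arr.append(i) for i in range(1000)]
    print(max(costs), sum(costs) / len(costs))  # single steps up to 513, amortized about 2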

In information theory , time complexity is used to determine the algorithmic depth of a data set.

References

  1. Meyer, Sanders, Sibeyn: Algorithms for Memory Hierarchies: Advanced Lectures. Chapter 15.3, p. 335, Springer, 2003.