Randomized algorithm


A randomized algorithm (also called a stochastic or probabilistic algorithm) is an algorithm that tries to arrive at a result that is good on average or approximately correct by choosing random intermediate results. It is thus the counterpart of a deterministic algorithm. A randomized algorithm is not required to find a correct solution efficiently in every run. Randomized algorithms are in many cases easier to understand, easier to implement, and more efficient than deterministic algorithms for the same problem. An example that illustrates this is the AKS primality test, which is deterministic but far less efficient and far more difficult to implement than, for example, the primality test of Solovay and Strassen.
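
To illustrate, here is a minimal Python sketch of a Solovay–Strassen-style test; the function names and the default number of rounds are chosen freely for this example. The test draws random bases a and compares the Jacobi symbol (a/n) with a^((n−1)/2) mod n; an odd composite n passes a single round with probability at most 1/2.

```python
import random

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0; returns 0 if gcd(a, n) > 1."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:              # pull out factors of two
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                    # quadratic reciprocity step
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def solovay_strassen(n, rounds=20):
    """Monte Carlo primality test: a 'composite' answer is always correct,
    'probably prime' is wrong with probability at most 2**-rounds."""
    if n < 3:
        return n == 2
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        a = random.randrange(2, n)     # random base
        x = jacobi(a, n)
        if x == 0 or pow(a, (n - 1) // 2, n) != x % n:
            return False               # definitely composite
    return True                        # probably prime
```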

Types of algorithms

(Figure: different classes of randomized algorithms)

Las Vegas

Randomized algorithms that never give a wrong result are also known as Las Vegas algorithms . There are two flavors of Las Vegas algorithms:

  • Algorithms that always deliver the correct result, but whose computing time can become very large if the random bits are chosen poorly. In many cases the algorithm is fast only in expectation, but not in the worst case. One example is the variant of quicksort in which the pivot element is chosen at random (see the sketch after this list). The expected computing time is O(n log n); if the random bits are chosen unfavorably, however, Θ(n²) steps are required.
  • Algorithms that are allowed to fail (= allowed to give up), but must never deliver a wrong result. The quality of this type of algorithm can be described by an upper bound for the probability of failure.
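
As a concrete example of the first flavor, the following is a minimal Python sketch (with freely chosen names) of the randomized quicksort variant mentioned above: the output is always correctly sorted, and only the running time depends on the random pivot choices.

```python
import random

def randomized_quicksort(items):
    """Las Vegas sorting: always returns a correctly sorted list.
    Expected time O(n log n), worst case Theta(n^2) with unlucky pivots."""
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)                  # random pivot element
    smaller = [x for x in items if x < pivot]
    equal   = [x for x in items if x == pivot]
    larger  = [x for x in items if x > pivot]
    return randomized_quicksort(smaller) + equal + randomized_quicksort(larger)
```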

Monte Carlo

Randomized algorithms that are allowed to deliver a wrong result are also known as Monte Carlo algorithms. The quality of Monte Carlo algorithms can be described by an upper bound on the error probability.

One speaks of randomized algorithms only if the expected value of the computing time and/or the error or failure probability can be estimated. Algorithms for which it is only intuitively plausible that they deliver good results, or for which this has only been demonstrated by experiments on typical inputs, are called heuristic algorithms.

When analyzing the expected computing time and/or the error probability, it is generally assumed that the random bits are generated independently. In implementations, pseudo-random numbers are usually used instead of true random bits, and these are of course no longer independent. As a result, implementations may behave worse than the analysis suggests.

Types of errors of Monte Carlo algorithms for decision problems

For decision problems (problems that can be described by yes–no questions), a distinction is made between one-sided and two-sided errors:

  • In the case of a two-sided error, inputs for which the correct answer is yes may be rejected, and inputs for which the correct answer is no may be accepted. Here the error probability must sensibly be less than 1/2, since an error probability of 1/2 can be achieved with a coin toss, regardless of the input (which is obviously not a sensible solution to the problem under consideration).
  • In the case of a one-sided error, inputs for which the correct answer is yes may also be rejected, but inputs for which the correct answer is no must always be rejected. Here error probabilities smaller than 1 are useful.

Relationship between Las Vegas and Monte Carlo algorithms

Las Vegas algorithms can always be transformed into Monte Carlo algorithms: the output “?” is simply interpreted as an incorrect output. If only an upper bound s(n) on the expected computing time of the Las Vegas algorithm is known, the algorithm can be stopped after c · s(n) steps and “?” output. By Markov's inequality, the probability that this happens is bounded by 1/c.
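
A minimal sketch of this truncation, using a deliberately simple toy problem chosen for illustration only: a Las Vegas search that guesses random indices in an array in which half of the entries are marked needs 2 iterations in expectation; stopping after c · 2 iterations and answering “?” fails with probability at most 1/c by Markov's inequality.

```python
import random

def find_marked_index_bounded(marked, c):
    """Monte Carlo version of a toy Las Vegas search.

    `marked` is a list of booleans of which half are True; the underlying
    Las Vegas algorithm guesses indices until it hits a True entry
    (2 tries in expectation).  Here we stop after c * 2 tries and return
    '?', which counts as a wrong answer; by Markov's inequality the
    probability of this is at most 1/c."""
    budget = c * 2                      # c times the expected running time
    for _ in range(budget):
        i = random.randrange(len(marked))
        if marked[i]:
            return i                    # correct answer
    return "?"                          # gave up: interpreted as an error
```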

Since Monte Carlo algorithms may output incorrect solutions and these incorrect solutions need not be specially marked, it is harder to transform Monte Carlo algorithms into Las Vegas algorithms. This is possible if one additionally has a verifier for the solution, i.e. an algorithm that can test whether a proposed solution is correct. A Las Vegas algorithm is then obtained by executing the given Monte Carlo algorithm, checking with the verifier whether the computed solution is correct, and iterating this process until a correct solution has been computed. The worst-case computing time of this approach is not bounded, but the expected number of iterations can be bounded from above. Without a verifier, it is in general unclear how to construct a Las Vegas algorithm from a Monte Carlo algorithm.
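
Conversely, a sketch of the verifier-based construction; the Monte Carlo solver and the verifier are passed in as functions, and all names are placeholders for this example.

```python
def las_vegas_from_monte_carlo(monte_carlo, verify, instance):
    """Repeat a Monte Carlo algorithm until the verifier accepts its output.

    `monte_carlo(instance)` may return a wrong solution; `verify(instance,
    solution)` must decide correctness reliably.  The returned result is
    always correct, but only the expected number of iterations is bounded
    (at most 1/p if the Monte Carlo algorithm is correct with probability p).
    """
    while True:
        candidate = monte_carlo(instance)
        if verify(instance, candidate):
            return candidate
```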

Complexity classes

The class PP

w ∈ L: Pr[M accepts w] > 1/2
w ∉ L: Pr[M accepts w] ≤ 1/2
Type: Monte Carlo algorithm

The class PP (probabilistic polynomial) consists of all languages L for which there is a probabilistic, polynomially time-bounded Turing machine M such that the above conditions hold for all inputs w.

This class has a two-sided error. The probability that the computed result is correct is greater than 1/2, but may be arbitrarily close to 1/2.

The class BPP

w ∈ L: Pr[M accepts w] ≥ 2/3
w ∉ L: Pr[M accepts w] ≤ 1/3
Type: Monte Carlo algorithm

The class BPP (bounded error probabilistic polynomial) consists of all languages L for which there is a probabilistic, polynomially time-bounded Turing machine M such that the above conditions hold for all inputs w.

This class has a two-sided error, but the probability that the computed result is correct is bounded away from 1/2 by a constant (here it is at least 2/3).

The RP class

w ∈ L: Pr[M accepts w] ≥ 1/2
w ∉ L: Pr[M accepts w] = 0
Type: Monte Carlo algorithm (one-sided error)

The class RP (randomized polynomial) consists of all languages L for which there is a probabilistic, polynomially time-bounded Turing machine M such that the above conditions hold for all inputs w.

This class has a one-sided error: an input that does not belong to L is never accepted.

The ZPP class

w ∈ L: Pr[M accepts w] ≥ 1/2 and Pr[M rejects w] = 0 (otherwise M outputs “?”)
w ∉ L: Pr[M rejects w] ≥ 1/2 and Pr[M accepts w] = 0 (otherwise M outputs “?”)
Type: Las Vegas algorithm

The class ZPP (zero error probabilistic polynomial) consists of all languages L for which there is a probabilistic, polynomially time-bounded Turing machine M such that the above conditions hold for all inputs w; such a machine never gives a wrong answer, but it may answer “?”.

Reduction of the failure or error probability (probability amplification)

The failure or error probability of randomized algorithms can be reduced by independent repetition. For example, if an error-free algorithm with failure probability 1/2 is run a total of t times, the probability that it fails in all t iterations is only 2^(−t). If it delivers a result in at least one iteration, this result is also correct and can be output. In this way one can show that every constant failure probability from the interval (0, 1) is equally good, up to polynomial factors in the computing time. (Note that a failure probability of 1 − 2^(−100) corresponds to a really tiny success probability of 2^(−100); on average the algorithm would have to be run 2^100 times before it succeeds for the first time.) The same approach works for algorithms with one-sided error. For algorithms with two-sided error, on the other hand, a majority decision over the repeated runs is required, so the analysis becomes somewhat more difficult; the result, however, is again that every constant error probability from the interval (0, 1/2) is equally good, up to polynomial factors in the computing time.
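
A minimal sketch of both amplification schemes, assuming a decision procedure decide(w) that returns True or False with the stated error type (the names are placeholders for this example):

```python
def amplify_one_sided(decide, w, t):
    """For one-sided error (a 'yes' answer is always correct, as in RP):
    accept as soon as any of t independent runs accepts; the probability
    of wrongly rejecting drops from 1/2 to 2**-t."""
    return any(decide(w) for _ in range(t))

def amplify_two_sided(decide, w, t):
    """For two-sided error (as in BPP): take the majority of t independent
    runs; the error probability drops exponentially in t as long as the
    single-run error probability is a constant below 1/2."""
    votes = sum(1 for _ in range(t) if decide(w))
    return votes > t / 2
```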

Derandomization

Derandomization is the reduction of the number of random bits that a randomized algorithm uses. This is useful for the following reason: any randomized algorithm can be made deterministic by simulating it for all assignments of the random bits. However, if the algorithm uses k random bits, this increases the computing time by a factor of 2^k, which in most cases leads to exponential computing time. If the algorithm needs only O(log n) random bits for inputs of length n, the increase in computing time is only polynomial.
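
A minimal sketch of this brute-force derandomization, assuming a routine randomized_decide(w, bits) that reads its k random bits from the given tuple (both names are placeholders):

```python
from itertools import product

def derandomize_by_enumeration(randomized_decide, w, k):
    """Simulate the algorithm for all 2**k assignments of its k random bits
    and return the majority answer (appropriate for two-sided error).
    The running time grows by a factor of 2**k, which stays polynomial
    only if k = O(log n)."""
    accepts = sum(
        1 for bits in product((0, 1), repeat=k) if randomized_decide(w, bits)
    )
    return accepts > 2 ** (k - 1)       # majority over all random strings
```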

One approach to derandomization is to analyze in more detail how the algorithm uses its random bits. For some randomized algorithms it is sufficient that the random bits used are pairwise independent. For example, one can generate n pairwise independent random variables from O(log n) completely independent random variables. In a second step, the algorithm can then be made deterministic using the approach described above.
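
A minimal sketch of the standard construction behind this statement: from k completely independent random bits, 2^k − 1 pairwise independent bits are obtained by XOR-ing the seed bits of every non-empty index subset.

```python
import random
from itertools import combinations

def pairwise_independent_bits(k):
    """From k truly random seed bits, derive 2**k - 1 pairwise independent
    (but not mutually independent) uniform bits: one bit per non-empty
    subset of seed positions, namely the XOR of the seed bits it indexes."""
    seed = [random.getrandbits(1) for _ in range(k)]
    derived = []
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            bit = 0
            for i in subset:
                bit ^= seed[i]
            derived.append(bit)
    return derived
```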
