Pollard-Rho method
The Pollard-Rho methods are algorithms for determining the period length of a number sequence that is calculated using a mathematical function. Various difficult math problems such as the discrete logarithm and factorization can be calculated using these methods. An optimized variant of the Pollard-Rho method was developed by John M. Pollard in 1975 for prime factorization . Such methods can also be used to calculate collisions in hash functions.
With the Pollard-Rho methods, sequences of partial results are calculated. At a certain point, some of these partial results are only repeated. The partial results can be arranged graphically so that the shape of the letter ρ (Rho) can be recognized. The name of the methods is derived from this.
functionality
Find a prime factor of the number . In general, however, this divisor does not necessarily have to be a prime number. The method is based on the generation of a sequence of pseudo-random numbers . Any function can be used to generate the random sequence . It is only necessary that it also follows from, and this already applies, for example, when is given by a polynomial with integer coefficients.
The sequence starts with a largely freely selectable start value . The other values are calculated iteratively according to
The modulo function values can at most assume the various values . If one of these values occurs again, these values are then repeated modulo . This happens at the latest after iterations and on average after about iterations. For the same reasons, one can expect after about iterations that the values will repeat modulo . If it is already known that has a small prime factor, is considerably smaller than , so that it can be hoped that the repetition modulo starts considerably earlier than the repetition modulo .
In the case of a sequence of numbers calculated in this way with a finite number of possible function values, a few values are initially used in a previous period
accepted. As soon as a value occurs repeatedly, the values are then repeated cyclically
This behavior of the sequence gave the method its name, since the period can be imagined like a circle and the elements of the sequence at the beginning like a stem that leads into the circle. Graphically, it looks like the Greek letter ρ .
If two values and modulo from the sequence have the same value, for which consequently applies, the greatest common factor results in a multiple of and often a real factor of .
However, it is very time-consuming to compare all numerical values in this way. An optimized variant of the Pollard-Rho method therefore calculates two sequences to determine the period length. One episode
and the second episode
This trick can avoid comparing a large number of function values. The greatest common factor does not now have to be calculated for all pairs . It is sufficient to calculate or respectively .
Since , as a sought divisor of , is unknown, the remainder of the division by cannot initially be calculated. The equality of two values and is therefore not queried, but is calculated. If the values and only differ by a multiple of , the value of is a multiple of the divisor of . Integer multiples of are also integral multiples of and therefore do not need to be taken into account in the calculation. As a result, it is sufficient to calculate the function values modulo .
A function of the form can be used to calculate the sequence of numbers . With this choice, only a part, about half, of the values can occur up to the remainder, whereby the earlier occurrence of the sought repetitions is somewhat favored.
Formal definition
Let be the number from which a prime factor is to be calculated. Denote a sequence of pseudorandom numbers such as
If there is a real prime factor , then the following applies
- There is an index so that and with .
algorithm
Input : is the number to be factored and be the pseudo-random function modulo Output : a non-trivial factor of or an error message
- x ← 2, y ← x; d ← 1
- As long as d = 1:
- x ← f ( x )
- y ← f ( f ( y ))
- d ← ggT (| x - y |, n )
- If 1 < d < n then return d .
- If d = n , then output "error".
Note: This algorithm returns an error message for everyone who is only divisible by 1 and themselves. However, an error message can also be returned for the others . In this case, choose another function and try again.
If the result is a number, then this is really also a divisor and therefore a correct result, although this generally does not necessarily have to be a prime number.
For one chooses a polynomial with an integer coefficient. A common function for this algorithm is as follows:
Estimation of the running time
The number sequences and can be viewed as pseudo-random sequences. If a numerical value occurs again, the following values are inevitably repeated. Up to values can be assumed (with a quadratic as above: up to values). The expected value for the length of a cycle is . The fact that far fewer than calculations are required is sometimes called the birthday paradox .
The worst case occurs when a product of two prime numbers is the same length. The algorithm then terminates after O (n 1/4 polylog (n)) steps with a probability of . The method works well for factoring numbers with several smaller factors. The algorithm can factor a number with twice as many digits as the trial division in the same time (with high probability). The algorithm works exponentially in the length of the input and is therefore asymptotically slower than the square sieve and the number field sieve .
Numerical example
1st example
We are looking for the factors of number . We use the function and the starting value :
n = 703, f ( x ) = x 2 + c with c = 23, x 0 = 431 | |||
i | x i = f ( x i -1 ) | y i = x 2 i = f ( f ( y i -1 )) | d = gcd (| x - y |, n ) |
1 | 192 | 331 | 1 |
2 | 331 | 49 | 1 |
3 | 619 | 125 | 19th |
4th | 49 | 106 | 19th |
5 | 315 | 144 | 19th |
6th | 125 | 619 | 19th |
7th | 182 | 315 | 19th |
8th | 106 | 182 | 19th |
9 | 11 | 11 | 703 |
10 | 144 | 372 | 19th |
11 | 372 | 49 | 19th |
12 | 619 | 125 | 19th |
With that the prime factorization of is found.
2nd example
n = 2717, f ( x ) = x 2 + c mod n with c = 4, x 0 = 2 | |||
i | x i = f ( x i -1 ) | y i = x 2 i = f ( f ( y i -1 )) | d = gcd (| x - y |, n ) |
1 | 8th | 68 | 1 |
2 | 68 | 277 | 209 |
3 | 1911 | 2367 | 19th |
4th | 277 | 68 | 209 |
5 | 657 | 277 | 19th |
6th | 2367 | 2367 | 2717 |
7th | 239 | 68 | 19th |
8th | 68 | 277 | 209 |
This example shows that the factor found does not necessarily have to be a prime number. The factor found here is .
Factoring
Using the method described, in 1980 the Fermat number
be factored. denotes a (prime) number with 62 digits, which was only later proven to be a prime number.
Implementations
The Rho method is rho_factorize()
part of the function library of the ARIBAS program by Otto Forster .
literature
- A Monte Carlo Method for Factorization , JMPollard, BIT 15 (1975) 331-334
- An Improved Monte Carlo Factorization Algorithm , RPBrent, BIT 20 (1980) 176-184
- Otto Forster: Algorithmic Number Theory. Vieweg, 1996, ISBN 3-528-06580-X