Rainbow table

The Rainbow Table ( Engl. For rainbow table ) is one of Philippe Oechslin developed data structure , a fast, memory-efficient search for the original string (usually a password) for a given hash value allows. Searching over a rainbow table is considerably faster than with the brute force method , but the memory requirement is higher. Such a compromise is called a time-memory tradeoff .

A hash function without salt is assumed , as is e.g. This is the case, for example, with the passwords for earlier Windows versions and with many routers . Comparatively extensive tables were calculated for LM hashes and MD5 and are available from various sources.

Rainbow tables are used for password recovery, in IT forensics , for penetration tests and for password cracking.

overview

Simplified Rainbow Table with 3 iterations . In realistic rainbow tables, one challenge is to find reduction functions R ₁ , ..., R _k that generate passwords that are as "meaningful" as possible for given hashes.

A rainbow table is a compact representation of connected password sequences, so-called chains. Each of these chains starts with an initial password, which is passed through a hash function . The resulting hash is in turn passed through a reduction function, with the result of having another potential plaintext password. This process is repeated for a specified number of times, and finally the first password in the chain is saved along with the password from the last hash value.

When creating the table, make sure that, on the one hand, no password that appears in a chain is used as the start password (collision), but that, on the other hand, all possible passwords appear in the table. The tables are only created once and then serve as a look-up table.

Finding out a password takes a two-step process. First, the hash of the password to be found is passed through the hash reduction sequence described above until the result of the reduction appears in the table of the last chain links (the "right side" of the table). This gives you the start password of the chain and in the second step you can use the hash reduction sequence based on this to get the password you are looking for.

The length of the chain, i.e. the number of iterations for creating the tables, has an effect on the size of the table: the longer the iterations are selected, the smaller the resulting table. In the simplest case, the number of iterations is equal to 1, so that the table contains all password hash pairs.

functionality

A hash function assigns a binary sequence of fixed length to a binary sequence of any length. With the hash function MD5 , this output length is 128 bits or 32 4-bit characters. A hash value is calculated for a random string of length . This result of the hash function (with length 32) is converted by a reduction function R into a new character string which again has length . Because the successive execution of the hash function and the reduction function does not change the length of the character string, these two steps can be repeated alternately as often as required. The sequence of intermediate results ultimately forms a chain. The start and end values of this chain are saved. This sequence of steps is also repeated times and forms a universal rainbow table . ${\ displaystyle n}$ ${\ displaystyle n}$ ${\ displaystyle x}$

Reduction function

The reduction function shortens a hash value to characters. Each of these reductions yields e.g. B. MD5 creates a new "unique" 128 bit string, or a collision. A hash value that can be generated by various output character strings is called a collision. In order to avoid collisions, various reduction functions are used which, when applied periodically, allow the input character string and the output hash to be clearly assigned. This method is a more efficient method for n-digit strings than, for example, a brute force attack with a key search of [a - //////], since with the latter, many strings are converted into hashes which, with a high probability, will never be found or be chosen. ${\ displaystyle n}$

With poorly programmed or trivial reduction functions, collisions occur after a few runs, which lead to repetitions of the reduced texts and thus also of the hashes. Such internal loops then lead to the algorithm failing: You laboriously compute thousands of elements, and only a few of them are clearly distinguishable.

application

In this example, we are looking for the character string whose MD5 hash value in hexadecimal notation corresponds to 97fae39bfd56c35b6c860aa468c258e0 ( Domino ). The conventional way of calculating all MD5 hashes for all possible variations and comparing them is very computationally intensive and has to be repeated for new searches.

It would make sense to save all hashes that have already been calculated in a database and only have to compare when searching again whether the hash you are looking for is already known. When searching over 64 possible characters [A-Za-z0-9./], which each position of the input text could have, there are variations with 6 characters . If text and hash are now stored in a database, 16 bytes per pair are required for the hash and 6 bytes for the plain text and thus around 1.4 terabytes for the complete data . ${\ displaystyle 64 ^ {6}}$

These amounts of data can usually not be processed and must be reduced.

A sensible middle ground: Rainbow Table

Instead of saving all values including their keys, only the initial character string and the last character string of a -element string are saved. In this way, hashes can be represented by a start and end value and recalculated in a comparatively short time, so that the input text can be found again. If an end value is formed from a reduced hash (= plain text), this chain is recalculated from the start value until the given hash is obtained; the text preceding this is the original text sought. With a chain length of only 9,999 hash calculations are needed to find the original text for a hash. ${\ displaystyle n}$ ${\ displaystyle n-1}$ ${\ displaystyle n = 10,000}$

The probability of getting exactly the input text you are looking for from the reduced hashes depends on the quality of the reduction function (s) and the parameters when creating the rainbow table, since only the reduced hashes (= plain text) can be found later. For example, if the reduction function (s) only reduce to numbers, the plain text "Domino" cannot be found. If the reduction function (s) reduce to seven digits (from 32), then 6-digit plaintexts are not calculated and "Domino" cannot be found here either.

Perfect and imperfect rainbow tables

In a perfect rainbow table there are no passwords that appear twice in the chains. Thus, a perfect rainbow table has the fewest number of chains to be able to display all passwords. However, the calculation effort increases to create the perfect rainbow table. In imperfect rainbow tables, redundant passwords appear in the chains. They are faster to produce, but take up more space than a perfect rainbow table.

Countermeasures

Passwords are hashed with a cryptographic hash function before they are saved, so that the password cannot be inferred from the list of saved hash values if the file with the passwords is stolen. Rainbow tables make exactly that possible. To protect the passwords, there are various countermeasures that are supposed to make the use of rainbow tables less efficient.

Password length

The size of a rainbow table increases with the length of the passwords for which it is to be generated. Depending on the hash type, the calculation of a rainbow table is no longer economical from a certain password length, both in terms of the duration of the generation (electricity costs) and the space requirements of the tables. Long passwords arise e.g. B. when using sentences instead of a word (see passphrase ).

Salts

Another method to make the generation of rainbow tables uneconomical is the use of salt . A value - ideally - a randomly generated value, the salt, is appended to the password before hashing. The salt is saved together with the hash value so that the password can be checked later, so it is not a secret.

example

Password, exactly 6 digits, only numbers allowed: 123456
Salt, any combination of three capital letters: ABC

So now is hashed not 123456 , but 123456ABC . Creating a rainbow table for exactly this salt is not much more time-consuming than without salt, since the last part of the password (the salt) is already known. The table would have to be created for all possible combinations with ABC at the end, which of course are just as many entries as without a salt at the end:

password	Hash
111111ABC	hash (111111ABC)
111112ABC	hash (111112ABC)
...	...
999999ABC	hash (999999ABC)

The real highlight is that the salt does not always stay the same, but is generated randomly with each password. A rainbow table for the Salt ABC is of no use to us with a Salt from XYZ . For this a rainbow table would have to be created again. This would multiply the effort by the number of possible salts, because a rainbow table would have to be created for each possible salt.

Since a rainbow table can no longer be used to find all passwords, this method becomes uneconomical and technically not feasible. Salts cannot protect short passwords, however - they only increase the effort involved in brute-forcing if there are several, but they do not prevent it in any way.

Iterations

The effort can be further increased if a password is hashed not once but several times - several thousand iterations are common . Only salts and iterations combined result in a hashing method that has a certain resistance to typical attack methods. The salt makes creating tables uneconomical or even impossible; together with the iterations, brute force attacks are slowed down considerably. A successful implementation is e.g. B. MD5 (crypt). The method is based on MD5 , but uses salts with a length of 12 to 48 bits and 1000 iterations. The creation of rainbow tables for this method is uneconomical due to the length of the salt under real conditions, as is pure bruteforcing .

"Pepper" method

Pepper means combining the password with a secret string before calculating the hash value. The pepper is secret and is not stored in the database. Instead, it is stored in a location that is as secure as possible; the same pepper applies to all passwords. If the attacker knows this code (for example, if he gains control of the server), the pepper does not bring any advantages. However, if the attacker only has access to the database (for example through SQL injection), he will recognize the hash values, but these no longer come from weak passwords. They are hash values of long combinations of password and a strong pepper. No dictionary ever contains such passwords, so a dictionary attack is pointless. It is often recommended to use an HMAC to combine password and pepper. If, on the other hand, they are simply attached to one another, the pepper should be added after and not in front of the password, as some hash functions ignore characters from a certain position.

Web links

Philippe Oechslin: Making a Faster Cryptanalytic Time-Memory Trade-Off . In: CRYPTO 2003 . vol. 2729, 2003, p. 617–630 ( epfl.ch [PDF; 243 kB ]).
Vrizlynn LL Thing, Hwei-Ming Ying: A novel time-memory trade-off method for password recovery . In: Digital Investigation . No. 6 , 2009, p. 114–120 ( dfrws.org [PDF; 361 kB ]).
Program / Live CD for using Rainbow Tables
Distributed computing project for the creation of rainbow tables
Rainbow tables databases , overview of the Chaos Computer Club
An explanation of Rainbow Tables for Beginners
A specialist work, u. a. About the structure and functionality of Rainbow Tables (PDF; 1.6 MB)
Karsten Nohl, Colorful key guessing - from dictionaries and rainbows , c't 15/2008