Dictionary compression


Dictionary compression, also known as string replacement or substitution compression, refers to data compression methods that search the raw data for recurring character strings, record each such string in a so-called dictionary, and replace its occurrences with a shorter symbol (e.g. the string's position in the dictionary).

Dictionary methods are often combined with other methods that exploit other forms of redundancy. Combining them with a subsequent entropy coding stage is very common.

Methods

Some dictionary methods use a static dictionary, whose entries are fixed before coding and are not changed during it.
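
A static dictionary can be illustrated with a minimal Python sketch. The dictionary contents (`STATIC_DICT`) and the function names `encode_static` and `decode_static` are hypothetical and chosen only for illustration. A real coder would also pack the output into bits rather than a Python list.

```python
# Hypothetical static dictionary: fixed before coding and never changed.
STATIC_DICT = ["the ", "and ", "ing ", "tion"]


def encode_static(text):
    """Replace dictionary strings with their index; pass other characters through."""
    out = []
    i = 0
    while i < len(text):
        # Pick the longest dictionary entry that matches at position i.
        best = -1
        for idx, entry in enumerate(STATIC_DICT):
            if text.startswith(entry, i):
                if best < 0 or len(entry) > len(STATIC_DICT[best]):
                    best = idx
        if best >= 0:
            out.append(best)          # symbol = position in the dictionary
            i += len(STATIC_DICT[best])
        else:
            out.append(text[i])       # literal character, not in the dictionary
            i += 1
    return out


def decode_static(symbols):
    """Expand integer symbols back into their dictionary strings."""
    return "".join(STATIC_DICT[s] if isinstance(s, int) else s for s in symbols)


encoded = encode_static("the cat and the dog")
print(encoded)
print(decode_static(encoded))
```

Because encoder and decoder share the same fixed table, the dictionary itself never has to be transmitted with the data.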

More common are adaptive methods that start with an empty or a predefined dictionary and extend it during coding according to the content of the data.
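
As a rough illustration of a dictionary that starts empty and grows during coding, here is a simplified LZ78-style sketch in Python. The function names and the `(index, character)` output format are assumptions for this example; practical implementations limit the dictionary size and encode the pairs compactly.

```python
def lz78_encode(data):
    """Emit (dictionary index, next character) pairs; index 0 is the empty phrase."""
    dictionary = {}   # phrase -> index, built up while coding
    out = []
    phrase = ""
    for ch in data:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate        # keep extending the current match
        else:
            out.append((dictionary.get(phrase, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it with no extra character.
        out.append((dictionary[phrase], ""))
    return out


def lz78_decode(pairs):
    """Rebuild the same dictionary from the pairs and concatenate the phrases."""
    phrases = [""]    # index 0 = empty phrase
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


pairs = lz78_encode("abababab")
print(pairs)
print(lz78_decode(pairs))
```

The decoder reconstructs the dictionary from the symbol stream alone, so no dictionary needs to be stored alongside the compressed data.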

Examples

The LZ77 and LZ78 methods, published by Abraham Lempel and Jacob Ziv in 1977 and 1978 respectively, and their numerous derivatives and variants (LZW, LZSS, LZMA, LZO, ...) are based on the dictionary principle.
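
In LZ77, the recently processed data itself serves as the dictionary. The following Python sketch is a simplified, unoptimized illustration. The function names, the window size of 32, and the `(offset, length, next character)` triple format are assumptions for this example, not the layout of any particular file format.

```python
def lz77_encode(data, window=32):
    """Emit (offset, length, next_char) triples using a sliding window."""
    out = []
    i = 0
    while i < len(data):
        best_off, best_len = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one character before the end so a next_char always exists.
            while (i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decode(triples):
    """Copy `length` characters from `offset` back, then append next_char."""
    buf = []
    for off, length, ch in triples:
        for _ in range(length):
            buf.append(buf[-off])     # character-wise copy allows overlapping matches
        buf.append(ch)
    return "".join(buf)


triples = lz77_encode("aaaaaa")
print(triples)
print(lz77_decode(triples))
```

The brute-force match search is quadratic in the window size; real implementations typically use hash chains or similar index structures instead.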

Another example is Sequitur, which builds a grammar from the repeated phrases in the input.