Homophonic encryption

from Wikipedia, the free encyclopedia

Homophonic encryption (from ancient Greek ὅμος homos "same" and φωνή phone "voice", i.e. "sounding the same") is a monoalphabetic encryption method that was already widespread in the 17th century. In contrast to the simple monoalphabetic substitution cipher, each plaintext character (usually a letter) can be replaced by any one of several different ciphertext characters.

The main weakness of simple monoalphabetic substitution is that each plaintext letter is always encoded by the same single ciphertext character. The resulting ciphertext is therefore susceptible to statistical attacks. For example, simply counting the frequency of the ciphertext characters is enough to quickly identify the letter E, which is the most common letter in many languages (its frequency in German is about 17.7%).
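The counting attack described above can be sketched in a few lines of Python. This is only an illustration: the function name `letter_frequencies`, the sample sentence, and the use of a Caesar shift as a stand-in for an arbitrary simple substitution are all assumptions made for this example.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, relative frequency) pairs for A-Z, most common first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return [(ch, n / total) for ch, n in counts.most_common()]

# Hypothetical German sample text; a Caesar shift by 3 stands in for
# any simple monoalphabetic substitution.
plaintext = "DER SCHNEE BEDECKT DIE BERGE UND DIE SEEN"
ciphertext = "".join(
    chr((ord(ch) - 65 + 3) % 26 + 65) if ch.isalpha() else ch
    for ch in plaintext
)

# The most frequent ciphertext symbol is the likely substitute for E.
most_common_symbol = letter_frequencies(ciphertext)[0][0]
print(most_common_symbol)
```

In this sample the most frequent ciphertext letter is H, which is exactly E shifted by three positions, so a single count already reveals one entry of the key.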

Homophonic encryption counteracts this attack by providing multiple substitutes for the more frequently used letters, such as E or N. Seen from the point of view of the ciphertext, different ciphertext characters can stand for the same plaintext letter (hence the name homophonic), which makes unauthorized decipherment of the ciphertext much more difficult. Homophonic encryption is thus a cryptographic improvement over simple monoalphabetic substitution, while still being easier to handle than a polyalphabetic substitution, in which several different secret alphabets are used.

The opposite method to homophonic encryption is polyphonic encryption.

Example

As with all monoalphabetic substitution methods, homophonic encryption uses only one fixed substitution alphabet for encryption and decryption. The goal is to level out the differing frequencies of the plaintext letters. One way to achieve this is to assign each letter of the alphabet as many ciphertext characters as its relative frequency in percent, which results in a ciphertext alphabet of 100 characters. The typical frequencies of letters in the German language are shown in the following diagram:

[Figure: typical letter frequencies in German]

If the 26 letters of the alphabet are now mapped to 100 secret characters, in the simplest case the numbers 00 to 99, in such a way that A is assigned six secret characters, B two, C two, D five, and so on, then every (secret) number in the ciphertext occurs with an average frequency of 1%. A frequency analysis of the individual characters no longer provides any starting point for deciphering.

[Figure: homophonic encryption]
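The mapping of 26 letters to the numbers 00 to 99 can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference implementation: only the counts for A, B, C and D come from the text, while the remaining counts are illustrative values chosen to roughly follow German letter frequencies and to add up to 100. The names `HOMOPHONE_COUNTS`, `build_key`, `encrypt` and `decrypt` are hypothetical.

```python
import random

# Number of homophones per letter. A, B, C and D follow the text; all other
# counts are illustrative assumptions that sum, together with them, to 100.
HOMOPHONE_COUNTS = {
    "A": 6, "B": 2, "C": 2, "D": 5, "E": 17, "F": 2, "G": 3, "H": 4,
    "I": 8, "J": 1, "K": 1, "L": 3, "M": 2, "N": 10, "O": 2, "P": 1,
    "Q": 1, "R": 7, "S": 7, "T": 6, "U": 4, "V": 1, "W": 2, "X": 1,
    "Y": 1, "Z": 1,
}
assert sum(HOMOPHONE_COUNTS.values()) == 100

def build_key(rng):
    """Shuffle the symbols 00..99 and hand each letter its share."""
    symbols = [f"{i:02d}" for i in range(100)]
    rng.shuffle(symbols)
    key, pos = {}, 0
    for letter, count in HOMOPHONE_COUNTS.items():
        key[letter] = symbols[pos:pos + count]
        pos += count
    return key

def encrypt(plaintext, key, rng):
    """Replace each letter by a randomly chosen one of its homophones."""
    return " ".join(rng.choice(key[ch]) for ch in plaintext.upper() if ch in key)

def decrypt(ciphertext, key):
    """Invert the key: every symbol belongs to exactly one letter."""
    reverse = {sym: letter for letter, syms in key.items() for sym in syms}
    return "".join(reverse[sym] for sym in ciphertext.split())

# A fixed seed is used only to make the example reproducible.
rng = random.Random(2024)
key = build_key(rng)
ciphertext = encrypt("GEHEIMNIS", key, rng)
recovered = decrypt(ciphertext, key)
print(ciphertext, recovered)
```

Decryption is unambiguous because every number belongs to exactly one letter, while encryption involves a free choice among the homophones. Python's `random.Random` is not a cryptographically secure generator; `random.SystemRandom` would be a more suitable source of randomness outside of a demonstration.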

To crack the text anyway, the attacker has to use more sophisticated methods. Instead of analyzing individual characters (monograms), the attacker can extend the analysis to bigrams (pairs of characters), trigrams or tetragrams. Possible points of attack are characteristic bigrams such as CH, CK or QU, as well as pairs that are also common in reverse order, such as EN and NE or ER and RE. This, however, requires much longer texts. Sufficiently short homophonically encrypted texts (fewer than eighty letters) are very well protected against unauthorized decipherment.
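The first step of such a bigram analysis, counting adjacent pairs of ciphertext symbols, might look like this in Python. The function name `symbol_bigrams` and the short symbol sequence are made up for illustration; a real attack would need a far longer ciphertext and a comparison against known bigram statistics of the plaintext language.

```python
from collections import Counter

def symbol_bigrams(symbols):
    """Count adjacent pairs of ciphertext symbols."""
    return Counter(zip(symbols, symbols[1:]))

# Hypothetical ciphertext, already split into two-digit symbols.
symbols = "17 42 08 17 42 93 08 17 42".split()

# Pairs that repeat unusually often are candidates for characteristic
# plaintext bigrams such as CH or QU.
top_pair, top_count = symbol_bigrams(symbols).most_common(1)[0]
print(top_pair, top_count)
```

Repeated pairs arise most readily from letters that have only a few homophones, because the number of distinct symbol pairs for a plaintext bigram is the product of the homophone counts of its two letters.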
