Monoalphabetic substitution

from Wikipedia, the free encyclopedia

In cryptography, monoalphabetic substitution (from Greek μόνο mono, 'only', and αλφάβητο alphabeto, 'alphabet', and from Latin substituere, 'to replace') is an encryption method in which only a single (fixed) key alphabet is used for encryption, i.e. to convert the plain text into the ciphertext.


The letters or characters, or groups of letters or characters, in the plain text are replaced by other letters, characters or groups as specified by this one alphabet, which is also called the key alphabet or secret alphabet.

Classic examples of monoalphabetic substitution are the Caesar encryption and the Playfair method. In contrast to monoalphabetic substitutions, polyalphabetic substitutions use several (many) different alphabets for encryption. Examples of these are the Vigenère encryption and the Enigma cipher machine.


Simple monoalphabetic substitution

An example of monoalphabetic encryption is the following procedure, in which individual letters of the plain text are replaced by individual characters of the ciphertext using the key alphabet. This method is therefore precisely referred to as “monographic monoalphabetic monopartite substitution”, or simply as “simple monoalphabetic substitution”.

Plain text alphabet: a b c d e f g h i j k l m n o p q r s t u v w x y z
Secret alphabet:     U F L P W D R A S J M C O N Q Y B V T E X H Z K G I

After encryption, the German plain text "wikipedia ist informativ" ("Wikipedia is informative") becomes the ciphertext "ZSMSYWPSU STE SNDQVOUESH". The plain text can be reconstructed from the ciphertext by replacing each letter found in the second line with the letter above it in the first line. The ciphertext, sometimes simply called the cipher, is mostly written in capital letters to make it easier to distinguish from the plain text.
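The table lookup described above can be sketched in a few lines of Python. This is only an illustration: the function names encrypt and decrypt are chosen here, the secret alphabet is the one from the table, and the input is the German wording "wikipedia ist informativ", which is the text that actually produces the quoted ciphertext.

```python
import string

PLAIN = string.ascii_lowercase
SECRET = "UFLPWDRASJMCONQYBVTEXHZKGI"  # secret alphabet from the table above

# Translation tables: plain letter -> secret letter, and the reverse.
ENC = str.maketrans(PLAIN, SECRET)
DEC = str.maketrans(SECRET, PLAIN)


def encrypt(plaintext: str) -> str:
    """Replace every letter a-z; spaces and other characters pass through."""
    return plaintext.lower().translate(ENC)


def decrypt(ciphertext: str) -> str:
    """Invert the substitution by reading the table from bottom to top."""
    return ciphertext.upper().translate(DEC)


print(encrypt("wikipedia ist informativ"))
print(decrypt("ZSMSYWPSU STE SNDQVOUESH"))
```

Because the secret alphabet is a permutation of the 26 letters, the decryption table is simply the encryption table with its two rows swapped.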

Caesar encryption

This is a special case of simple monoalphabetic substitution in which the alphabet used for encryption is obtained by cyclically shifting every letter of the standard alphabet. The number of places to shift is the key. Caesar already used this method, mostly with the key “C”, which corresponds to a shift of three letters.

Example of Caesar encryption:

 Plain text alphabet: a b c d e f g h i j k l m n o p q r s t u v w x y z
 Secret alphabet:     D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

In this example the word "wikipedia" is encoded as "ZLNLSHGLD".
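A minimal sketch of the Caesar shift in Python follows. The function names and the convention that the shift is passed as a number (3 for the key “C”) are assumptions made for this illustration.

```python
import string

ALPHABET = string.ascii_lowercase


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift every letter cyclically by `shift` places; output in capitals."""
    shift %= 26
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    return plaintext.lower().translate(str.maketrans(ALPHABET, shifted.upper()))


def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Undo the shift by mapping the shifted alphabet back to a-z."""
    shift %= 26
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    return ciphertext.upper().translate(str.maketrans(shifted.upper(), ALPHABET))


print(caesar_encrypt("wikipedia", 3))  # ZLNLSHGLD
print(caesar_decrypt("ZLNLSHGLD", 3))  # wikipedia
```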

Secret alphabet creation

There are different methods of generating the secret alphabet required for encryption and decryption. Particularly simple (and particularly insecure) variants are:

  • Caesar shift: Only 25 different keys are possible here. Example with key E, i.e. a shift by five characters:
  Plain:  abcdefghijklmnopqrstuvwxyz
  Secret: FGHIJKLMNOPQRSTUVWXYZABCDE
  • Atbash: Reversed alphabet, only one fixed key available:
  Plain:  abcdefghijklmnopqrstuvwxyz
  Secret: ZYXWVUTSRQPONMLKJIHGFEDCBA
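Both variants can be generated mechanically, and the small key space of the Caesar shift makes an exhaustive search trivial. The following Python sketch illustrates this; the helper names are hypothetical and chosen only for this example.

```python
import string

ALPHABET = string.ascii_uppercase


def shifted_alphabet(shift: int) -> str:
    """Secret alphabet for a Caesar shift by `shift` places."""
    shift %= 26
    return ALPHABET[shift:] + ALPHABET[:shift]


def atbash_alphabet() -> str:
    """Secret alphabet for Atbash: the standard alphabet reversed."""
    return ALPHABET[::-1]


def all_caesar_decryptions(ciphertext: str) -> dict[int, str]:
    """Try each of the 25 non-trivial shifts and return every candidate."""
    candidates = {}
    for shift in range(1, 26):
        table = str.maketrans(shifted_alphabet(shift), ALPHABET.lower())
        candidates[shift] = ciphertext.upper().translate(table)
    return candidates


print(shifted_alphabet(5))
print(atbash_alphabet())
for shift, text in all_caesar_decryptions("ZLNLSHGLD").items():
    print(shift, text)
```

A reader only has to scan the 25 printed candidates for the one that is readable text, which is why these variants offer no real protection.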

It is also common to generate a scrambled secret alphabet using a password (key). The advantage of this method is that a large number of different secret alphabets can be formed without having to transmit the key in written form. It is sufficient to provide the authorized recipient with the appropriate password (key) verbally or in some other (secret) way. The password is easy to remember and thus well protected against spying. Both the encryptor (sender) and the decryptor (receiver) create the identical secret alphabet from the password in the same way.

For example, the two parties agree on the password “Regenschirm” (German for “umbrella”) as their secret key. First, all repeated letters are removed from the password, so that “REGENSCHIRM” becomes REGNSCHIM. These letters form the beginning of the secret alphabet. The rest of the alphabet, i.e. the letters that do not appear in the password, is appended on the right in alphabetical order. This yields the secret alphabet:

  Plain:  abcdefghijklmnopqrstuvwxyz
  Secret: REGNSCHIMABDFJKLOPQTUVWXYZ

It is better to fill in the remaining letters not alphabetically but in reverse alphabetical order. This avoids the disadvantage that the secret alphabet otherwise often (as here) ends with ...XYZ, so that the last plain text letters are mapped to themselves. Appending the remaining letters in reverse order after the password results in the secret alphabet:

  Plain:  abcdefghijklmnopqrstuvwxyz
  Secret: REGNSCHIMZYXWVUTQPOLKJFDBA

As an alternative, the missing letters can be appended in alphabetical order starting from the last letter of the password and wrapping around at Z (progressive filling), which also creates a scrambled secret alphabet:

  Plain:  abcdefghijklmnopqrstuvwxyz
  Secret: REGNSCHIMOPQTUVWXYZABDFJKL
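The three filling rules can be sketched as one Python function. The function name keyword_alphabet and the fill labels "forward", "reverse" and "progressive" are assumptions of this example. The password is the German word REGENSCHIRM ("umbrella"), which reduces to REGNSCHIM once repeated letters are dropped.

```python
import string

ALPHABET = string.ascii_uppercase


def dedupe(password: str) -> str:
    """Keep only the first occurrence of each letter of the password."""
    seen = []
    for ch in password.upper():
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def keyword_alphabet(password: str, fill: str = "forward") -> str:
    """Build a secret alphabet from a password and a filling rule."""
    head = dedupe(password)
    if not head:
        raise ValueError("password must contain at least one letter")
    rest = [ch for ch in ALPHABET if ch not in head]
    if fill == "reverse":
        rest.reverse()
    elif fill == "progressive":
        # Continue alphabetically after the last password letter, wrapping at Z.
        start = sum(1 for ch in rest if ch < head[-1])
        rest = rest[start:] + rest[:start]
    elif fill != "forward":
        raise ValueError(f"unknown fill rule: {fill}")
    return head + "".join(rest)


for rule in ("forward", "reverse", "progressive"):
    print(rule, keyword_alphabet("REGENSCHIRM", rule))
```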

It is also conceivable to use a completely randomly scrambled secret alphabet. The disadvantage here, however, is that the two partners usually cannot memorize it. It therefore has to be written down and may then be discovered by a spy.

  Plain:  abcdefghijklmnopqrstuvwxyz

Using the above secret alphabet, the German plain text "Wasser kocht im Teekessel" ("Water boils in the tea kettle") is converted into the ciphertext "INRRZQ VPJMU LF UZZVZRRZY". To make unauthorized deciphering more difficult, the spaces would of course be removed before transmission and the text sent as a continuous “worm”, “INRRZQVPJMULFUZZVZRRZY”, or in groups, “INRRZ QVPJM ULFUZ ZVZRR ZY”.
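Drawing a random secret alphabet and regrouping a ciphertext can be sketched as follows. The function names are hypothetical, and the use of the secrets module as the source of randomness is a choice made for this example.

```python
import secrets
import string


def random_alphabet() -> str:
    """Return a randomly scrambled secret alphabet (a permutation of A-Z)."""
    letters = list(string.ascii_uppercase)
    secrets.SystemRandom().shuffle(letters)
    return "".join(letters)


def as_worm(ciphertext: str) -> str:
    """Remove spaces and punctuation so that word boundaries are hidden."""
    return "".join(ch for ch in ciphertext if ch.isalpha())


def in_groups(ciphertext: str, size: int = 5) -> str:
    """Split the worm into groups of equal size (the last may be shorter)."""
    worm = as_worm(ciphertext)
    return " ".join(worm[i:i + size] for i in range(0, len(worm), size))


print(random_alphabet())
print(as_worm("INRRZQ VPJMU LF UZZVZRRZY"))    # INRRZQVPJMULFUZZVZRRZY
print(in_groups("INRRZQ VPJMU LF UZZVZRRZY"))  # INRRZ QVPJM ULFUZ ZVZRR ZY
```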


In contrast to the Caesar encryption with only 25 options, there are very many ways to scramble the standard alphabet: the first letter “A” can be placed in any of 26 alphabet positions. For the second letter “B” there are then 25 positions left to choose from, for the third 24, and so on. In total there are 26 · 25 · 24 · 23 · … · 4 · 3 · 2 · 1 = 26! (factorial) ways to scramble the alphabet. That is approximately 4 · 10^26 cases and corresponds to approximately 88 bits. As a result, deciphering by trying out all cases (brute-force method) is practically impossible. Nonetheless, monoalphabetic substitution is insecure and easy to “crack”. Even relatively short monoalphabetically encrypted ciphertexts (thirty to fifty characters are entirely sufficient) can be deciphered with the help of statistical analysis (frequency counts) and pattern searches.
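The size of the key space can be checked with a short calculation, sketched here in Python:

```python
import math

keys = math.factorial(26)   # number of possible secret alphabets
bits = math.log2(keys)      # equivalent key length in bits

print(keys)
print(f"{keys:.2e}")        # about 4.03e+26
print(f"{bits:.1f} bits")   # about 88.4 bits
```

The large number only rules out blind trial and error; it says nothing about attacks that exploit the statistics of the language.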


Frequency analysis

To decipher a monoalphabetic encryption without knowing the key, a frequency analysis of the letters in the ciphertext is carried out. In this way certain letters can be inferred, from these whole words, and from the words in turn more and more assignments of ciphertext letters to plain text letters. (Some frequency tables can be found under German alphabet.)


Mjjp nop cni Hzgfzqosmqgr zqo scd Gjdkqpcmucmcngf. Cm rjddp tjd Ciabnogfci qis fcnoop vjcmpbngf qcucmocpyp: Vqmycb.

Letter frequencies: 12.6 %: c; 6.7 %: m, p; 5.9 %: o, q; 5 %: d, g, j. From this distribution it can be assumed that e, the most frequent letter in German, is encoded by c. This results in the following:

Mjjp nop cni Hzgfzqosmqgr zqo scd Gjdkqpcmucmcngf. Cm rjddp tjd Ciabnogfci qis fcnoop vjcmpbngf qcucmocpyp: Vqmycb.
.... ... e.. ............ ... .e. ......e..e.e.... E. ..... ... E.......e. ... .e.... ..e...... .e.e..e...: ....e..
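The letter count behind this first step can be sketched in Python as follows. The function name is chosen for this example. It counts letters only and ignores case, so the exact percentages depend on that choice and need not agree with the figures quoted above, but the ranking with c at the top is what matters.

```python
from collections import Counter

CIPHERTEXT = (
    "Mjjp nop cni Hzgfzqosmqgr zqo scd Gjdkqpcmucmcngf. "
    "Cm rjddp tjd Ciabnogfci qis fcnoop vjcmpbngf qcucmocpyp: Vqmycb."
)


def letter_frequencies(text: str) -> list[tuple[str, int, float]]:
    """Return (letter, count, percent) sorted from most to least frequent."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    total = len(letters)
    return [(ch, n, 100 * n / total) for ch, n in Counter(letters).most_common()]


for letter, count, percent in letter_frequencies(CIPHERTEXT)[:5]:
    print(f"{letter}: {count} ({percent:.1f} %)")
```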

Word patterns are now searched for. German three-letter words with e in the middle are usually articles (der, den, dem, ...), especially if they occur more than once; from this the d can be inferred. A three-letter word beginning with e is often ein ("a"). It is important to proceed by trial and to document each step so that errors can be undone by backtracking.

Mjjp nop cni Hzgfzqosmqgr zqo scd Gjdkqpcmucmcngf. Cm rjddp tjd Ciabnogfci qis fcnoop vjcmpbngf qcucmocpyp: Vqmycb.
.... i.. ein .......d.... ... de. ......e..e.ei... E. ..... ... En..i...en .nd .ei... ..e...i.. .e.e..e...: ....e..

From this, the German words und ("and") and ist ("is") are easily found:

...t ist ein .....usd.u.. .us de. ....ute..e.ei... E. ....t ... En..is..en und .eisst ..e.t.i.. ue.e.set.t: .u..e..

With a little imagination and practice, further words and letter sequences (such as aus, sch/ch, en, etc.) and finally the complete plain text can be inferred from this:

...t ist ein Fachausd.uc. aus de. C...ute..e.eich. E. ....t ... En..ischen und heisst ..e.t.ich ue.e.set.t: .u..e..
Root ist ein Fachausdruck aus dem Computerbereich. Er kommt vom Englischen und heisst woertlich uebersetzt: Wurzel.
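The stepwise filling-in of a partially known key can be sketched as follows. The function name partial_decrypt is hypothetical. The letter assignments are the ones inferred in the worked example (c for e, then d, i and n, then u, s and t), and unknown letters are shown as dots.

```python
CIPHERTEXT = (
    "Mjjp nop cni Hzgfzqosmqgr zqo scd Gjdkqpcmucmcngf. "
    "Cm rjddp tjd Ciabnogfci qis fcnoop vjcmpbngf qcucmocpyp: Vqmycb."
)


def partial_decrypt(ciphertext: str, known: dict[str, str]) -> str:
    """Replace known cipher letters, keep case, and mark unknown ones as '.'."""
    out = []
    for ch in ciphertext:
        if not ch.isalpha():
            out.append(ch)          # spaces and punctuation stay as they are
            continue
        plain = known.get(ch.lower())
        if plain is None:
            out.append(".")
        else:
            out.append(plain.upper() if ch.isupper() else plain)
    return "".join(out)


known = {"c": "e"}                              # step 1: most frequent letter
print(partial_decrypt(CIPHERTEXT, known))
known.update({"s": "d", "n": "i", "i": "n"})    # step 2: "de." and "ein"
print(partial_decrypt(CIPHERTEXT, known))
known.update({"q": "u", "o": "s", "p": "t"})    # step 3: "und" and "ist"
print(partial_decrypt(CIPHERTEXT, known))
```

Keeping the assumed assignments in a dictionary also documents each step, so a wrong guess can be removed again when backtracking.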

Deciphering the ciphertext by evaluating letter frequencies can be made difficult or impossible by a lipogrammatic text. Because one or more letters are not used in a lipogrammatic text (e.g. by avoiding all words containing e), the entire letter frequency distribution is shifted, and without knowledge of the avoided letter or letters an evaluation is impossible or at least very difficult.

Plain text attack (pattern search)

If parts of the plain text are known (individual terms), their pattern can be searched for in the ciphertext, for example by looking for double letters. Under monoalphabetic substitution, repeated characters occur at the same positions in the plain text and in the ciphertext. In the same way, the ciphertext can be searched for patterns that match the pattern of a suspected word.


Suspected:   INTERNET
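A pattern search for a suspected word can be sketched as follows. The function names are hypothetical, and the sample ciphertext XYZLQWHUQHWABC is invented for this illustration (it contains INTERNET under a Caesar shift of three); it is not taken from the article.

```python
def pattern(word: str) -> tuple[int, ...]:
    """Number the letters by first appearance, e.g. INTERNET -> 0 1 2 3 4 1 3 2."""
    first_seen: dict[str, int] = {}
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in word.upper())


def find_candidates(ciphertext: str, suspected: str) -> list[int]:
    """Return all positions where the ciphertext shows the suspected word's pattern."""
    text = "".join(ch for ch in ciphertext.upper() if ch.isalpha())
    target = pattern(suspected)
    n = len(suspected)
    return [i for i in range(len(text) - n + 1) if pattern(text[i:i + n]) == target]


print(pattern("INTERNET"))
print(find_candidates("XYZLQWHUQHWABC", "INTERNET"))
```

Each position found yields a set of candidate letter assignments, which can then be tested against the rest of the ciphertext.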

MAKE PROFIT encryption

This very simple monoalphabetic encryption of digits is based on replacing each digit by the letter assigned to it from the easily remembered phrase "MAKE PROFIT":

   Digits: 1 2 3 4 5 6 7 8 9 0
      Key: M A K E P R O F I T
 Examples: 3719346 87550 46025504 12892
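The digit table can be applied mechanically in both directions, as this Python sketch shows. The function names encode and decode are chosen for this example.

```python
DIGITS = "1234567890"
KEY = "MAKEPROFIT"  # ten different letters, one per digit

TO_LETTERS = str.maketrans(DIGITS, KEY)
TO_DIGITS = str.maketrans(KEY, DIGITS)


def encode(number: str) -> str:
    """Replace every digit by its key letter."""
    return number.translate(TO_LETTERS)


def decode(code: str) -> str:
    """Replace every key letter by its digit."""
    return code.upper().translate(TO_DIGITS)


for example in ("3719346", "87550", "46025504", "12892"):
    print(example, encode(example))
print(decode("APP"))  # 255
```

The scheme only works because no letter occurs twice in the phrase, so every digit has exactly one letter and vice versa.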

Such an encryption is less suitable as a secret code; rather, the key is used to convert letters into digits where letters cannot or should not be used. Examples are type codes in catalogs and price information in lists for salespeople. According to what the former Siemens manager Michael Kutschenreuter told the public prosecutor in Munich, the code was also used at Siemens AG as a secret key in connection with instructions on bribe payments.

Related encryption methods

Homophonic encryption
The plain text characters can be substituted by different ciphertext characters.
Playfair
A bigraphic monoalphabetic substitution.
Polyalphabetic substitution
"Many" ciphertext alphabets are used for the characters of the plaintext.
Polygram substitution (also: polygraphic substitution)
Instead of individual plain text characters, character N-grams (e.g. groups of letters) are substituted.

References

  1. Example: coding of identification keys for pistols and revolvers in the Gun Stock Book Record of Army & Navy Store Ltd. (London). Even this simple code is often misrepresented: in Using the Army & Navy Co-Operative Society firearms records (PDF; 555 kB), University of Glasgow, October 2008, T = 10 and S = 11, so that the zero could not be coded. An S was also added to the code in the German reporting on the use of the code at Siemens.
  2. David Crawford, Mike Esterl: At Siemens, witnesses cite pattern of bribery. In: The Wall Street Journal, January 31, 2007. “Back at Munich headquarters, he [Michael Kutschenreuter] told prosecutors, he learned of an encryption code he alleged was widely used at Siemens to itemize bribe payments. He said it was derived from the phrase 'Make Profit,' with the phrase's 10 letters corresponding to the numbers 1-2-3-4-5-6-7-8-9-0. Thus, with the letter A standing for 2 and P standing for 5, a reference to 'file this in the APP file' meant a bribe was authorized at 2.55 percent of sales. - A spokesman for Siemens said it has no knowledge of a 'Make Profit' encryption system.”
  3. Christian Buchholz: The code for bribes. In: Manager-Magazin, February 8, 2007. “... The system, which is said to have been known to most of the sales representatives, according to information from corporate circles, lasted until 1997 ...”