Base32
Base32 is a method for encoding binary data as a string made up of only 32 different ASCII characters (plus a 33rd character used as padding at the end of the data). Compared with the related method Base64, it is suitable for data formats that do not distinguish between uppercase and lowercase letters.
Basic principle
RFC 3548 describes the encoding of arbitrary binary data as follows: five bytes of 8 bits each (40 bits in total) are divided into eight 5-bit groups. Each of these groups corresponds to a number between 0 and 31. Using the conversion table below, these numbers are converted into printable ASCII characters and output. If a complete 40-bit block can no longer be formed at the end of the data, the block is padded with zero bytes, and the 5-bit groups that consist only of filler bits are encoded as = to tell the decoder how many filler bits have been added.
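The procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names `RFC4648_ALPHABET` and `base32_encode` are made up for this example, and the alphabet string is simply the standard coding table written out in value order.

```python
# Hypothetical helper names; the alphabet is the RFC 3548 / RFC 4648 table in value order.
RFC4648_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def base32_encode(data: bytes, alphabet: str = RFC4648_ALPHABET) -> str:
    """Encode bytes in 40-bit blocks of eight 5-bit groups, padding with '='."""
    out = []
    for start in range(0, len(data), 5):
        chunk = data[start:start + 5]
        # Pad an incomplete final chunk with zero bytes to a full 40-bit block.
        block = int.from_bytes(chunk.ljust(5, b"\x00"), "big")
        # Number of 5-bit groups that contain at least one real data bit.
        used = (len(chunk) * 8 + 4) // 5
        for i in range(8):
            if i < used:
                value = (block >> (35 - 5 * i)) & 0b11111
                out.append(alphabet[value])
            else:
                # Groups made up only of filler bits are written as '='.
                out.append("=")
    return "".join(out)


print(base32_encode(b"hello"))  # a full 5-byte block needs no padding
print(base32_encode(b"AB"))     # a 2-byte input ends in four '=' characters
```

Processing one 40-bit integer per block keeps the bit shifting easy to follow; a production encoder would more likely rely on an existing library.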
Coding table
While Base64 is mainly used in machine-to-machine communication, Base32-like encodings are often used where codes are read and typed in by humans. Various alphabets are in use; they aim to minimize the risk of confusing similar-looking characters and deliberately exclude individual characters that are considered ambiguous. For this reason, the mapping from Base32 values to characters is usually specified as a table.
Base32 according to RFC 3548 / RFC 4648
value | character | value | character | value | character | value | character
---|---|---|---|---|---|---|---
0 | A | 8 | I | 16 | Q | 24 | Y
1 | B | 9 | J | 17 | R | 25 | Z
2 | C | 10 | K | 18 | S | 26 | 2
3 | D | 11 | L | 19 | T | 27 | 3
4 | E | 12 | M | 20 | U | 28 | 4
5 | F | 13 | N | 21 | V | 29 | 5
6 | G | 14 | O | 22 | W | 30 | 6
7 | H | 15 | P | 23 | X | 31 | 7
The digits 0 and 1 are not used because there is a risk of confusion with the letters O and I when reproduced in writing.
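As an illustration of how a decoder can tolerate such confusions, the following sketch uses the `casefold` and `map01` parameters of `base64.b32decode` from Python's standard library. The wrapper name `forgiving_decode` is hypothetical, and the choice to map the digit 1 to the letter I (rather than L) is an assumption made for this example.

```python
import base64


def forgiving_decode(text: str) -> bytes:
    """Decode standard Base32 typed by a human (hypothetical helper).

    casefold=True accepts lower-case letters; map01="I" maps the digit 1
    to the letter I, and the digit 0 is then always mapped to the letter O.
    """
    return base64.b32decode(text, casefold=True, map01="I")


# Lower-case input with a mistyped '1' instead of 'I' still decodes.
print(forgiving_decode("1fba===="))
```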
Base32hex according to RFC 4648
value | character | value | character | value | character | value | character
---|---|---|---|---|---|---|---
0 | 0 | 8 | 8 | 16 | G | 24 | O
1 | 1 | 9 | 9 | 17 | H | 25 | P
2 | 2 | 10 | A | 18 | I | 26 | Q
3 | 3 | 11 | B | 19 | J | 27 | R
4 | 4 | 12 | C | 20 | K | 28 | S
5 | 5 | 13 | D | 21 | L | 29 | T
6 | 6 | 14 | E | 22 | M | 30 | U
7 | 7 | 15 | F | 23 | N | 31 | V
RFC 3548 has been superseded by RFC 4648, which introduces an additional encoding. As in the hexadecimal system, it uses the decimal digits for the values 0 to 9; the values 10 to 31 are represented by the letters A to V. As with hexadecimal numbers, the encoded values keep their order under lexicographical sorting.
This encoding is used in DNSSEC, among others.
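The sort-order property can be checked with a small experiment. The sketch below derives Base32hex from the standard encoder by translating one alphabet into the other; `base32hex_encode` and the sample values are assumptions made for this illustration, and the samples deliberately all have the same length so that padding does not influence the comparison.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
EXTENDED_HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
_TO_HEX = str.maketrans(STANDARD, EXTENDED_HEX)


def base32hex_encode(data: bytes) -> str:
    """Encode with the Base32hex alphabet (hypothetical helper).

    The '=' padding characters are not in the translation table and stay as-is.
    """
    return base64.b32encode(data).decode("ascii").translate(_TO_HEX)


samples = [b"\xff\xff", b"\x00\xff", b"\x7f\x00", b"\x00\x01"]

by_bytes = sorted(samples)
by_hex_text = sorted(samples, key=base32hex_encode)
by_std_text = sorted(samples, key=lambda d: base64.b32encode(d))

print(by_hex_text == by_bytes)  # Base32hex text sorts like the raw bytes
print(by_std_text == by_bytes)  # the standard alphabet does not, for these samples
```

Recent Python versions also ship `base64.b32hexencode`; the translation approach is used here only to make the relationship between the two tables visible.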
Bech32 encoding of Bitcoin addresses
value | character | value | character | value | character | value | character
---|---|---|---|---|---|---|---
0 | q | 8 | g | 16 | s | 24 | c
1 | p | 9 | f | 17 | 3 | 25 | e
2 | z | 10 | 2 | 18 | j | 26 | 6
3 | r | 11 | t | 19 | n | 27 | m
4 | y | 12 | v | 20 | 5 | 28 | u
5 | 9 | 13 | d | 21 | 4 | 29 | a
6 | x | 14 | w | 22 | k | 30 | 7
7 | 8 | 15 | 0 | 23 | h | 31 | l
Bitcoin addresses are usually given in an encoding called "base58check", which allows a relatively compact textual representation but has some disadvantages in practice:
- The text representation is compact, but quite inefficient as a QR code.
- Since both upper and lower case letters are used, it is difficult to pass addresses on orally, for example.
- The Base58 encoding is quite computationally intensive and requires 256-bit arithmetic.
- The checksum was not chosen on the basis of carefully considered error-detection or error-correction properties.
The format proposed in the Bitcoin Improvement Proposal 0173 (BIP0173) called "Bech32" tries to circumvent these disadvantages:
- Base32 is about 15% longer than Base58. If addresses are passed on via copy and paste, however, the slightly greater length does not matter.
- Lower case letters should be used in the text representation. In a QR code, however, capital letters should be used, as this allows the more compact "Alphanumeric Mode", which encodes 2 characters in 11 bits.
- Base32 is efficient to implement using 32-bit arithmetic.
- The checksum algorithm was specifically selected for the desired error detection and correction properties.
Bech32 uses a special coding table designed so that the 5-bit values of visually similar (and therefore most easily confused) characters differ, wherever possible, in only a single bit. The checksum algorithm benefits from this, because confusing two such characters then produces only a small bit error.
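The Bech32 coding table can be applied like any other Base32 alphabet. The sketch below only maps 5-bit values to characters and back; it does not compute the Bech32 checksum or handle the human-readable prefix, so its output is not a valid address. The helper names are hypothetical, and rejecting mixed-case input reflects my reading of BIP0173 rather than anything stated in this article.

```python
# The Bech32 coding table written out in value order (0..31), in lower case.
BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def values_to_bech32(values):
    """Map 5-bit values (0..31) to Bech32 characters; no checksum is added."""
    return "".join(BECH32_CHARSET[v] for v in values)


def bech32_to_values(text):
    """Map Bech32 characters back to 5-bit values.

    Accepts all-lower-case or all-upper-case input and rejects mixed case
    (assumption based on BIP0173). Unknown characters raise ValueError.
    """
    if text != text.lower() and text != text.upper():
        raise ValueError("mixed-case input is rejected")
    return [BECH32_CHARSET.index(ch) for ch in text.lower()]


text_form = values_to_bech32([8, 5, 1, 0])  # lower case for text display
qr_form = text_form.upper()                 # upper case for QR alphanumeric mode
print(text_form, qr_form, bech32_to_values(qr_form))
```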
More coding alphabets
In video games, passwords and level codes are often represented in a modified Base32 encoding. The alphabets used are not standardized. Often only digits and consonants are used, so that generated passwords do not accidentally spell out words.
ZRTP uses its own coding table called z-base-32, which has also been optimized to avoid misunderstandings when codes are read aloud (e.g. over the telephone).
Examples
Example coding for a byte with the value 0
Step | Block 1 | Block 2 | Block 3 | Block 4 | Block 5 | Block 6 | Block 7 | Block 8
---|---|---|---|---|---|---|---|---
Integer value | 0 | - | - | - | - | - | - | -
Represented as 8 bits | 00000000 | - | - | - | - | - | - | -
Divided into 5-bit blocks | 00000 | 000.. | - | - | - | - | - | -
Padded with missing zeros | 00000 | 00000 | - | - | - | - | - | -
Integer value | 0 | 0 | - | - | - | - | - | -
Base32 encoding | A | A | = | = | = | = | = | =
Example coding for the string "AB" (corresponds to the values 65 and 66 in ASCII coding)
Step | Block 1 | Block 2 | Block 3 | Block 4 | Block 5 | Block 6 | Block 7 | Block 8
---|---|---|---|---|---|---|---|---
Integer values | 65 | 66 | - | - | - | - | - | -
Represented as 8 bits | 01000001 | 01000010 | - | - | - | - | - | -
Divided into 5-bit blocks | 01000 | 00101 | 00001 | 0.... | - | - | - | -
Padded with missing zeros | 01000 | 00101 | 00001 | 00000 | - | - | - | -
Integer values | 8 | 5 | 1 | 0 | - | - | - | -
Base32 encoding | I | F | B | A | = | = | = | =
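Both worked examples can be reproduced in Python. In the sketch below, `five_bit_groups` is a hypothetical helper that mirrors the "divide and pad" rows of the tables, while the final strings come from `base64.b32encode` in the standard library.

```python
import base64


def five_bit_groups(data: bytes) -> list:
    """Return the input as 5-bit groups, zero-padding the last group (hypothetical helper)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 5)  # fill the last group up to 5 bits
    return [bits[i:i + 5] for i in range(0, len(bits), 5)]


for example in (b"\x00", b"AB"):
    groups = five_bit_groups(example)
    values = [int(group, 2) for group in groups]
    encoded = base64.b32encode(example).decode("ascii")
    print(groups, values, encoded)
```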