UUencode

from Wikipedia, the free encyclopedia

UUencode was the first widely used program that made it possible to convert binary files (e.g. images or programs) so that they consist only of printable ASCII characters and can be sent without trouble by e-mail, where only ASCII characters are allowed.

History

The UU points to the program's roots in UNIX. The UU in UUencode and UUdecode, like the UU in UUcp, stands for "UNIX to UNIX copy protocol", that is, the transfer from one UNIX computer to another UNIX computer.

The principle is similar to the Base64 method that is common today for e-mail attachments: three bytes of the binary file (= 24 bits) are split into four 6-bit groups, and each 6-bit value is assigned a printable ASCII character. The first versions of UUencode simply used the ASCII characters with the values 32 to 95.

Since the ASCII character with the value 32 is the space, which often does not survive mail transport, the ASCII character with the value 96 ("`") was later used in its place.
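The split into 6-bit groups and the character mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular uuencode implementation; the function names `split_into_sextets` and `sextet_to_char` and the `use_backtick` flag are made up for this example.

```python
def split_into_sextets(b0: int, b1: int, b2: int) -> list[int]:
    """Split three 8-bit values (24 bits) into four 6-bit values."""
    word = (b0 << 16) | (b1 << 8) | b2
    return [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]


def sextet_to_char(value: int, use_backtick: bool = True) -> str:
    """Map a 6-bit value to a printable ASCII character.

    The earliest variant used chr(32 + value) throughout; the later
    variant writes the value 0 as '`' (96) instead of a space (32).
    """
    if value == 0 and use_backtick:
        return "`"
    return chr(32 + value)


sextets = split_into_sextets(*b"Cat")
encoded = "".join(sextet_to_char(v) for v in sextets)
print(sextets, encoded)
```

For the three bytes of "Cat" the four 6-bit values are 16, 54, 5 and 52, which map to the characters "0V%T".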

File format

UUencode uses a special format for the encoded file:

begin mode filename
length data
length data
...
length data
`
end

The mode gives the file permissions as a 3- or 4-digit octal number, as is customary under Unix. The file name is the name of the original file, without a directory.

Each data line begins with a 1-byte length field that indicates how many original bytes have been encoded in this line. The length is a number between 1 and 63 and is itself uuencoded, i.e. written as a character from "!" to "_". Usually 45 bytes (i.e. the value "M") are encoded in 60 characters.

To mark the end of the file, an "empty line" must always be encoded, containing only the length byte 0 (encoded as "`"). Finally there is a line with the keyword end.
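The file layout described above can be assembled as follows. This is only a sketch: the function name `uuencode_file` and its parameters are made up for this example, and the per-line encoding is delegated to `binascii.b2a_uu` from the Python standard library (its `backtick` keyword assumes Python 3.7 or newer).

```python
import binascii


def uuencode_file(data: bytes, filename: str, mode: int = 0o644) -> str:
    """Wrap data in the begin/end frame, 45 input bytes per data line."""
    lines = [f"begin {mode:o} {filename}"]
    for offset in range(0, len(data), 45):
        chunk = data[offset:offset + 45]
        # b2a_uu emits the length character, the data and a newline.
        line = binascii.b2a_uu(chunk, backtick=True).decode("ascii")
        lines.append(line.rstrip("\n"))
    lines.append("`")    # "empty line": length byte 0
    lines.append("end")
    return "\n".join(lines) + "\n"


print(uuencode_file(b"Cat", "cat.txt"), end="")
```

A full line of 45 input bytes starts with "M" (32 + 45 = 77), and a shorter final line starts with the character for its own length, for example "#" for three bytes.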

Coding method

uuencode encodes three bytes of source data into four bytes. In the uuencoded file the data sit in the lower six bits of each byte; the upper bits are set by the encoding:

   unencoded bit stream     ↔     encoded bit stream
aaaaaaaa bbbbbbbb cccccccc  ↔  0kaaaaaa 0kaabbbb 0kbbbbcc 0kcccccc


For encoding, the new six-bit groups "00eeeeee" are first XORed with 32. Bit k is then set if the result is ≤ 32.

unencoded                      (XOR 32)                   (set k?)                           encoded
[0]     = [00000000]              →            [00100000]   -yes→               [01100000] =    [96]
[1,31]  = [00000001,00011111]     →   [00100001,00111111]   -no→       [00100001,00111111] = [33,63]
[32,63] = [00100000,00111111]     →   [00000000,00011111]   -yes→      [01000000,01011111] = [64,95]

In other words: for 0 the result is 96, for all others 32 must be added.
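The XOR rule and the shortcut "0 becomes 96, everything else plus 32" can be checked against each other for all 64 possible values. The function name `encode_sextet` is made up for this sketch.

```python
def encode_sextet(value: int) -> int:
    """Encode one 6-bit value using the XOR-32 / bit-k rule."""
    if not 0 <= value <= 63:
        raise ValueError("value must fit in six bits")
    result = value ^ 32      # XOR with 32
    if result <= 32:
        result |= 0x40       # set bit k (value 64)
    return result


# Compare with the shortcut for every possible six-bit value.
for v in range(64):
    shortcut = 96 if v == 0 else v + 32
    assert encode_sextet(v) == shortcut

print([encode_sextet(v) for v in (0, 1, 31, 32, 63)])
```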

Decoding works in reverse: uudecode decodes four bytes of source data into three bytes. If bit k is set in the source data, it must be cleared. The resulting value is then XORed with 32.

encoded                      (clear k)                     (XOR 32)                        unencoded
[96]    = [01100000]            -yes→            [00100000]   →                [00000000] =      [0]
[33,63] = [00100001,00111111]   -no→    [00100001,00111111]   →       [00000001,00011111] =   [1,31]
[64,95] = [01000000,01011111]   -yes→   [00000000,00011111]   →       [00100000,00111111] =  [32,63]

In other words: for 96 the result is 0, for all others 32 must be subtracted.

    encoded bit stream               ↔     unencoded bit stream
0kaaaaaa 0kaabbbb 0kbbbbcc 0kcccccc  ↔  aaaaaaaa bbbbbbbb cccccccc
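The decoding direction can be sketched the same way: clear bit k, XOR with 32, then pack four 6-bit values back into three bytes. The names `decode_char` and `decode_quad` are made up for this example, and the sketch does no validation of the input characters.

```python
def decode_char(code: int) -> int:
    """Turn one encoded character code back into its 6-bit value."""
    if code & 0x40:          # bit k is set
        code &= ~0x40        # clear it
    return code ^ 32         # XOR with 32


def decode_quad(chars: str) -> bytes:
    """Decode four encoded characters into three bytes."""
    if len(chars) != 4:
        raise ValueError("expected exactly four characters")
    v0, v1, v2, v3 = (decode_char(ord(c)) for c in chars)
    word = (v0 << 18) | (v1 << 12) | (v2 << 6) | v3
    return word.to_bytes(3, "big")


print(decode_quad("0V%T"))
```

Because a space (32) has bit k clear and gives 0 after the XOR, the same function also accepts output from the early variant that wrote the value 0 as a space.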

Example

The German original of the first History paragraph above serves as input:

Geschichte
Das UU steht für die Wurzeln in UNIX. Das UU in UUencode und -decode steht
ebenso wie das UU bei UUcp für UNIX to UNIX copy protocol. Also die Übertragung
von einem UNIX-Computer zu einem anderen UNIX-Computer.

UUencoding turns it into:

begin 644 uuencode-Test.txt
M1V5S8VAI8VAT90T*#0I$87,@554@<W1E:'0@9OQR(&1I92!7=7)Z96QN(&EN
M(%5.25@N($1A<R!552!I;B!5565N8V]D92!U;F0@+61E8V]D92!S=&5H="`-
M"F5B96YS;R!W:64@9&%S(%55(&)E:2!556-P(&;\<B!53DE8('1O(%5.25@@
M8V]P>2!P<F]T;V-O;"X@06QS;R!D:64@W&)E<G1R86=U;F<@#0IV;VX@96EN
M96T@54Y)6"U#;VUP=71E<B!Z=2!E:6YE;2!A;F1E<F5N(%5.25@M0V]M<'5T
%97(N#0H`
`
end
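Individual data lines of the example can be checked with `binascii.a2b_uu` from the Python standard library. The sketch below decodes the first and the last data line; displaying the bytes as ISO 8859-1 (Latin-1) text is an assumption based on the umlauts appearing as single bytes in the encoded data.

```python
import binascii

# First and last data line of the example above, copied verbatim.
first_line = b"M1V5S8VAI8VAT90T*#0I$87,@554@<W1E:'0@9OQR(&1I92!7=7)Z96QN(&EN"
last_line = b"%97(N#0H`"

first_bytes = binascii.a2b_uu(first_line)
last_bytes = binascii.a2b_uu(last_line)

# The length character "M" announces 45 bytes, "%" announces 5.
print(len(first_bytes), repr(first_bytes.decode("latin-1")))
print(len(last_bytes), repr(last_bytes.decode("latin-1")))
```

The decoded bytes show that the encoded file used CR LF line endings and had an empty line after the heading.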

XXencode

XXencode works in the same way as UUencode, but uses only letters, digits, and the two special characters plus (+) and minus (-). This is meant to minimize the risk that characters in the text file are irreparably damaged by automatic character set conversions (e.g. from ASCII to EBCDIC) during transmission.

In addition, some xxencode versions offer the option of sending along a list of all the characters used. If this list is also altered by faulty character set conversions, the recipient can detect this and still decode the file correctly, as long as the alterations are reversible.

Coding table of XXencode
value  character    value  character    value  character    value  character
    0  +               16  E               32  U               48  k
    1  -               17  F               33  V               49  l
    2  0               18  G               34  W               50  m
    3  1               19  H               35  X               51  n
    4  2               20  I               36  Y               52  o
    5  3               21  J               37  Z               53  p
    6  4               22  K               38  a               54  q
    7  5               23  L               39  b               55  r
    8  6               24  M               40  c               56  s
    9  7               25  N               41  d               57  t
   10  8               26  O               42  e               58  u
   11  9               27  P               43  f               59  v
   12  A               28  Q               44  g               60  w
   13  B               29  R               45  h               61  x
   14  C               30  S               46  i               62  y
   15  D               31  T               47  j               63  z
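The coding table is simply "+", "-", the digits, the upper-case letters and the lower-case letters in order, so it can be written as a 64-character lookup string. The names `XX_ALPHABET` and `xx_encode_triple` are made up for this sketch, which covers only the character mapping and not the surrounding file format.

```python
import string

# Index = 6-bit value, character = XXencode symbol from the table.
XX_ALPHABET = "+-" + string.digits + string.ascii_uppercase + string.ascii_lowercase


def xx_encode_triple(chunk: bytes) -> str:
    """Encode exactly three bytes as four XXencode characters."""
    if len(chunk) != 3:
        raise ValueError("expected exactly three bytes")
    word = int.from_bytes(chunk, "big")
    return "".join(XX_ALPHABET[(word >> shift) & 0x3F] for shift in (18, 12, 6, 0))


print(len(XX_ALPHABET), xx_encode_triple(b"Cat"))
```

The bit splitting is the same as in UUencode; only the lookup from 6-bit value to character differs.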

Related topics

  • 7plus - more efficient and more robust encoding method used in amateur radio
  • Kermit - protocol that also maps binary characters to ASCII characters
  • Base64 - MIME encoding, used in e-mails to transfer binary files

Web links