KOI8-U

from Wikipedia, the free encyclopedia

KOI8-U, a member of the KOI8 family, is a character set used to encode the Cyrillic alphabet for the Ukrainian language on computer systems. It is a single-byte encoding: every character is represented by exactly one byte.

KOI8-U is a superset of ASCII and therefore also contains the 26 letters of the Latin alphabet. It shares most of its layout with KOI8-R, the corresponding encoding for Russian. Unicode, the international character set standard, contains every KOI8-U character and can replace KOI8-U completely.

KOI8-U is described in RFC 2319, is registered with IANA, and is approved for use with MIME.
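The single-byte and ASCII-superset properties described above can be checked with a short sketch. It assumes Python's built-in `koi8_u` codec as the KOI8-U implementation; the sample string is only an illustration.

```python
# Assumption: Python's standard-library "koi8_u" codec stands in for KOI8-U.
text = "Київ, Ukraine"  # illustrative sample mixing Cyrillic and Latin letters

encoded = text.encode("koi8_u")

# Single-byte encoding: one byte per character, Cyrillic included.
assert len(encoded) == len(text)

# ASCII superset: the Latin part keeps its plain ASCII byte values.
assert encoded.endswith(b"Ukraine")

# The Cyrillic letters land in the upper half (0x80-0xFF) of the byte range.
print(encoded[:4].hex(" "))

# Decoding restores the original text.
assert encoded.decode("koi8_u") == text
```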

Table

     …0   …1   …2   …3   …4   …5   …6   …7   …8   …9   …A   …B   …C   …D   …E   …F
0…   unused
1…   unused
2…   SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3…   0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4…   @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5…   P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6…   `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7…   p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
8…   ─    │    ┌    ┐    └    ┘    ├    ┤    ┬    ┴    ┼    ▀    ▄    █    ▌    ▐
9…   ░    ▒    ▓    ⌠    ■    ∙    √    ≈    ≤    ≥    NBSP ⌡    °    ²    ·    ÷
A…   ═    ║    ╒    ё    є    ╔    і    ї    ╗    ╘    ╙    ╚    ╛    ґ    ╝    ╞
B…   ╟    ╠    ╡    Ё    Є    ╣    І    Ї    ╦    ╧    ╨    ╩    ╪    Ґ    ╬    ©
C…   ю    а    б    ц    д    е    ф    г    х    и    й    к    л    м    н    о
D…   п    я    р    с    т    у    ж    в    ь    ы    з    ш    э    щ    ч    ъ
E…   Ю    А    Б    Ц    Д    Е    Ф    Г    Х    И    Й    К    Л    М    Н    О
F…   П    Я    Р    С    Т    У    Ж    В    Ь    Ы    З    Ш    Э    Щ    Ч    Ъ

KOI8-U differs from KOI8-R at positions A4, A6, A7 and AD (hex), which hold the lowercase letters є, і, ї and ґ, and at positions B4, B6, B7 and BD (hex), which hold the uppercase letters Є, І, Ї and Ґ. These are the four additional letters that Ukrainian requires; KOI8-R has box-drawing characters at these positions.
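The positions listed above can be reproduced by decoding every byte value with both encodings and collecting the mismatches. This is a minimal sketch that assumes Python's built-in `koi8_r` and `koi8_u` codecs follow the respective tables; the helper name `differing_bytes` is made up for this example.

```python
def differing_bytes(codec_a="koi8_r", codec_b="koi8_u"):
    """Return (byte value, char in codec_a, char in codec_b) for each mismatch."""
    diffs = []
    for value in range(0x100):
        raw = bytes([value])
        char_a = raw.decode(codec_a)
        char_b = raw.decode(codec_b)
        if char_a != char_b:
            diffs.append((value, char_a, char_b))
    return diffs


# Print each differing position with the character from both encodings.
for value, r_char, u_char in differing_bytes():
    print(f"{value:02X}: KOI8-R {r_char} / KOI8-U {u_char}")
```

With the standard-library codecs, the loop is expected to report only the eight positions named in the text.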

RFC 2319 maps byte 95 (hex) to U+2219 (∙), but it is often converted to U+2022 (•) instead, for compatibility with code page 1251. Some references contain a typo and assign byte B4 (hex) to U+0403 instead of the correct U+0404. The same typo appears in Appendix A of RFC 2319; the table in the main text of the RFC is correct.
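One way to cope with the two bullet conventions is to normalize them at the encoding boundary. The sketch below assumes Python's built-in `koi8_u` codec, which follows the RFC mapping for byte 95; the function names `decode_koi8u` and `encode_koi8u` and the `bullet_style` parameter are hypothetical and exist only for illustration.

```python
BULLET_OPERATOR = "\u2219"  # RFC 2319 mapping for byte 0x95
BULLET = "\u2022"           # variant produced by some converters


def decode_koi8u(data: bytes, bullet_style: bool = False) -> str:
    """Decode KOI8-U; optionally present byte 0x95 as U+2022 instead of U+2219."""
    text = data.decode("koi8_u")
    if bullet_style:
        text = text.replace(BULLET_OPERATOR, BULLET)
    return text


def encode_koi8u(text: str) -> bytes:
    """Encode to KOI8-U, folding U+2022 back to U+2219 so both map to 0x95."""
    return text.replace(BULLET, BULLET_OPERATOR).encode("koi8_u")


# Byte B4 should decode to U+0404 (Є), not U+0403.
assert bytes([0xB4]).decode("koi8_u") == "\u0404"
```

Folding on encode avoids an encoding error for text that already contains U+2022, which has no position of its own in the KOI8-U table.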
