Thai Industrial Standard 620-2533

from Wikipedia, the free encyclopedia

Thai Industrial Standard 620-2533, commonly abbreviated TIS-620, is the most widely used character set and character encoding for the Thai script. The standard was approved by the Thai Industrial Standards Institute (TISI), an agency of the Royal Thai Government, and is the only valid standard of its kind in the Kingdom of Thailand.

The descriptive name of the standard is "Standard for codes of Thai letters for use in computers" (Thai: รหัสสำหรับอักขระไทยที่ใช้กับคอมพิวเตอร์).

The suffix 2533 is the year of the Buddhist calendar (1990 CE) in which the standard was published. The previous version, TIS 620-2529 (1986), is no longer valid.

Structure

TIS-620 is a conventional ASCII extension: it is fully compatible with 7-bit ASCII and encodes the Thai characters in the 8-bit range from A1 hex to FB hex. Because of the complex placement of Thai vowels and tone marks, TIS-620 only defines the codes used for information exchange; a rendering engine for Thai text is additionally required for correct display.
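The split between the ASCII half and the Thai half can be illustrated with a short Python sketch. It assumes the `tis_620` codec that ships with CPython's standard library; the helper name `describe_bytes` and the sample string are illustrative only and not part of the standard.

```python
def describe_bytes(text: str) -> list[tuple[str, int, str]]:
    """Encode text as TIS-620 and label each resulting byte as ASCII or Thai."""
    rows = []
    for ch in text:
        # Every character covered by TIS-620 encodes to exactly one byte.
        (b,) = ch.encode("tis_620")
        area = "ascii" if b < 0x80 else "thai"
        rows.append((ch, b, area))
    return rows


# Illustrative input: a Latin letter, a space, and the first Thai consonant.
for ch, b, area in describe_bytes("A ก"):
    print(f"{ch!r} -> {b:02X} ({area})")
```

The ASCII characters keep their 7-bit values, while the Thai consonant lands at A1 hex, the first code of the Thai range.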

Variants

An almost identical version of TIS-620 was adopted as ISO 8859-11 in 1999. The only difference is that ISO 8859-11 defines the code A0 hex as a non-breaking space, whereas in TIS-620 this code is reserved but not defined. (In practice, this small difference is usually ignored.)
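The A0 hex difference can be probed with a small Python sketch. It assumes the `tis_620` and `iso8859_11` codecs from CPython's standard library; how a particular runtime treats A0 under `tis_620` is not asserted here, so the hypothetical helper `decode_byte` reports the result instead of presuming it.

```python
from typing import Optional


def decode_byte(value: int, codec: str) -> Optional[str]:
    """Return the character a codec assigns to one byte, or None if unmapped."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Compare how the two codecs treat the single code that differs.
for codec in ("tis_620", "iso8859_11"):
    ch = decode_byte(0xA0, codec)
    label = "unmapped" if ch is None else f"U+{ord(ch):04X}"
    print(f"{codec}: A0 -> {label}")
```

For every other code in the Thai range, the two codecs are expected to agree, which is why the difference is usually ignored in practice.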

The ISO 8859-11 character set has also been registered as ISO-IR-166 with Ecma International, but this variant additionally contains explicit escape sequences to mark the beginning and end of a Thai word. (Thai is written without spaces between words.)

The Windows code page 874 is also based on TIS-620, but adds a few more characters.

The order of the characters in TIS-620 was also adopted in Unicode (ISO 10646). The Thai Unicode block ranges from U+0E00 to U+0E7F, with the first assigned character at U+0E01. A TIS-620 byte in the Thai range can be converted to its Unicode code point, and therefore to a single UTF-16 code unit, by subtracting A0 hex from the byte value and adding 0E00 hex.
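The conversion rule can be written out as a minimal Python sketch. The function names are hypothetical, and passing 7-bit values through unchanged as ASCII is an assumption of this sketch; the rejected ranges follow the unassigned codes listed for TIS-620 (80-A0, DB-DE and FC-FF hex).

```python
THAI_OFFSET = 0x0E00 - 0xA0  # add 0E00 hex, subtract A0 hex


def tis620_to_code_point(b: int) -> int:
    """Map one TIS-620 byte to a Unicode code point."""
    if b < 0x80:
        return b  # assumption: 7-bit values pass through as ASCII
    if 0xA1 <= b <= 0xDA or 0xDF <= b <= 0xFB:
        return b + THAI_OFFSET
    raise ValueError(f"byte {b:#04x} is not assigned in TIS-620")


def decode_tis620(data: bytes) -> str:
    """Decode a TIS-620 byte string by applying the offset rule per byte."""
    return "".join(chr(tis620_to_code_point(b)) for b in data)


# Six bytes spelling a common Thai greeting.
print(decode_tis620(b"\xca\xc7\xd1\xca\xb4\xd5"))
```

Because every resulting code point lies in the Basic Multilingual Plane, each one corresponds to exactly one UTF-16 code unit.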

     …0  …1  …2  …3  …4  …5  …6  …7  …8  …9  …A  …B  …C  …D  …E  …F
0…   unused
1…   unused
2…   SP  !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3…   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4…   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5…   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6…   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7…   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~
8…   unused
9…   unused
A…       ก   ข   ฃ   ค   ฅ   ฆ   ง   จ   ฉ   ช   ซ   ฌ   ญ   ฎ   ฏ
B…   ฐ   ฑ   ฒ   ณ   ด   ต   ถ   ท   ธ   น   บ   ป   ผ   ฝ   พ   ฟ
C…   ภ   ม   ย   ร   ฤ   ล   ฦ   ว   ศ   ษ   ส   ห   ฬ   อ   ฮ   ฯ
D…   ะ   ั   า   ำ   ิ   ี   ึ   ื   ุ   ู   ฺ                   ฿
E…   เ   แ   โ   ใ   ไ   ๅ   ๆ   ็   ่   ้   ๊   ๋   ์   ํ   ๎   ๏
F…   ๐   ๑   ๒   ๓   ๔   ๕   ๖   ๗   ๘   ๙   ๚   ๛

In the table above, 20 hex is the regular space. The values 00-1F hex, 7F hex, 80-9F hex, A0 hex, DB-DE hex and FC-FF hex are not assigned any characters in TIS-620. The characters at D1, D4-DA and E7-EE hex are diacritics that are combined with other characters.
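The set of combining diacritics can be cross-checked against Unicode with a short Python sketch. It assumes that the diacritics in the table correspond to characters of Unicode general category `Mn` (nonspacing mark), which is an assumption of this sketch and not something stated by TIS-620 itself; the function name is hypothetical.

```python
import unicodedata


def tis620_diacritics() -> list[int]:
    """List the TIS-620 byte values whose Unicode counterpart is a nonspacing mark."""
    marks = []
    # Only the assigned Thai codes are examined: A1-DA and DF-FB hex.
    for b in list(range(0xA1, 0xDB)) + list(range(0xDF, 0xFC)):
        ch = chr(b - 0xA0 + 0x0E00)
        if unicodedata.category(ch) == "Mn":  # Mn = nonspacing (combining) mark
            marks.append(b)
    return marks


print(" ".join(f"{b:02X}" for b in tis620_diacritics()))
```

Under that assumption the result contains sixteen codes: D1, D4 through DA, and E7 through EE hex.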

Web links