Thai Industrial Standard 620-2533
Thai Industrial Standard 620-2533, commonly abbreviated TIS-620, is the most widely used character set and character encoding for the Thai script. The standard was approved by the Thai Industrial Standards Institute (TISI), a body of the Royal Thai Government, and is the only valid standard of its kind in the Kingdom of Thailand.
The descriptive title of the standard is “Standard for codes of Thai letters for use in computers” (Thai: รหัสสำหรับอักขระไทยที่ใช้กับคอมพิวเตอร์).
The suffix 2533 is the year of the Buddhist calendar (1990 CE) in which the standard was published. The previous version, TIS-620-2529 (1986), is no longer valid.
Structure
TIS-620 is a conventional ASCII extension: it is fully compatible with 7-bit ASCII and encodes the Thai characters in the 8-bit range from A1 hex to FB hex. Because of the complex placement of Thai vowels and tone marks, TIS-620 only defines codes for information interchange; correct display additionally requires a rendering engine for Thai text.
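As a minimal illustration of this layout, the following sketch encodes a mixed ASCII/Thai string with Python's built-in `tis_620` codec. It assumes a standard CPython installation that ships this codec; the sample string and variable names are arbitrary.

```python
# Encode a mixed ASCII/Thai string with Python's built-in "tis_620" codec.
sample = "Thai: ไทย"  # arbitrary example text
encoded = sample.encode("tis_620")

# ASCII characters keep their 7-bit values.
ascii_bytes = bytes(b for b in encoded if b < 0x80)

# Each Thai character becomes exactly one byte in the range A1-FB hex.
thai_bytes = bytes(b for b in encoded if 0xA1 <= b <= 0xFB)

print(encoded.hex(" "))      # 54 68 61 69 3a 20 e4 b7 c2
print(thai_bytes.hex(" "))   # e4 b7 c2
```

The first six bytes are plain ASCII, and the three Thai letters occupy one byte each. How those three bytes are drawn on screen is left to the rendering engine.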
Variants
A nearly identical version of TIS-620 was adopted as ISO 8859-11 in 1999. The only difference is that ISO 8859-11 defines A0 hex as a non-breaking space, whereas TIS-620 reserves that position without defining it. (In practice, this small difference is usually ignored.)
The ISO 8859-11 character set has also been registered with Ecma International as ISO-IR-166; this variant additionally contains explicit escape sequences that mark the beginning and end of a Thai word. (Thai is written without spaces between words.)
The Windows code page 874 is also based on TIS-620, but adds a few more characters.
The order of the characters in TIS-620 was also adopted in Unicode (ISO 10646). The Thai Unicode block spans U+0E00 to U+0E7F, with the first character at U+0E01. A TIS-620 byte in the Thai range can be converted to its UTF-16 code unit by subtracting A0 hex from the byte value and adding 0E00 hex; ASCII bytes keep their values.
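The offset rule can be sketched as a small decoder. The function name `tis620_to_unicode` is hypothetical, and two choices are assumptions rather than part of the standard: bytes below 80 hex (including control codes) are passed through unchanged, and bytes that TIS-620 leaves unassigned are mapped to U+FFFD.

```python
def tis620_to_unicode(data: bytes) -> str:
    """Decode TIS-620 bytes using the fixed offset to the Thai Unicode block."""
    chars = []
    for byte in data:
        if byte < 0x80:
            # 7-bit range: same value as the Unicode code point.
            chars.append(chr(byte))
        elif 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
            # Thai range: subtract A0 hex, add 0E00 hex.
            chars.append(chr(byte - 0xA0 + 0x0E00))
        else:
            # Unassigned in TIS-620 (assumption: use the replacement character).
            chars.append("\ufffd")
    return "".join(chars)


greeting = tis620_to_unicode(b"\xca\xc7\xd1\xca\xb4\xd5")
print(greeting)  # สวัสดี
```

Every resulting code point lies in the Basic Multilingual Plane, so each TIS-620 byte corresponds to exactly one UTF-16 code unit.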
| | …0 | …1 | …2 | …3 | …4 | …5 | …6 | …7 | …8 | …9 | …A | …B | …C | …D | …E | …F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0… | unused | | | | | | | | | | | | | | | |
| 1… | unused | | | | | | | | | | | | | | | |
| 2… | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3… | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4… | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5… | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6… | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7… | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
| 8… | unused | | | | | | | | | | | | | | | |
| 9… | unused | | | | | | | | | | | | | | | |
| A… | | ก | ข | ฃ | ค | ฅ | ฆ | ง | จ | ฉ | ช | ซ | ฌ | ญ | ฎ | ฏ |
| B… | ฐ | ฑ | ฒ | ณ | ด | ต | ถ | ท | ธ | น | บ | ป | ผ | ฝ | พ | ฟ |
| C… | ภ | ม | ย | ร | ฤ | ล | ฦ | ว | ศ | ษ | ส | ห | ฬ | อ | ฮ | ฯ |
| D… | ะ | ั | า | ำ | ิ | ี | ึ | ื | ุ | ู | ฺ | | | | | ฿ |
| E… | เ | แ | โ | ใ | ไ | ๅ | ๆ | ็ | ่ | ้ | ๊ | ๋ | ์ | ํ | ๎ | ๏ |
| F… | ๐ | ๑ | ๒ | ๓ | ๔ | ๕ | ๖ | ๗ | ๘ | ๙ | ๚ | ๛ | | | | |
In the table above, 20 hex is the regular space. The values 00-1F hex, 7F hex, 80-9F hex, A0 hex, DB-DE hex and FC-FF hex are not assigned any characters in TIS-620. The diacritics that combine with other characters are located at D1 hex, D4-DA hex and E7-EE hex.
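One way to pick out the combining diacritics programmatically is to map each assigned Thai byte to its Unicode counterpart and check its general category. This is a sketch under the assumption that the Unicode category `Mn` (nonspacing mark) matches what the table treats as diacritics; the variable names are illustrative.

```python
import unicodedata

# Assigned Thai positions according to the text: A1-DA hex and DF-FB hex.
assigned = list(range(0xA1, 0xDB)) + list(range(0xDF, 0xFC))

# Treat a byte as a diacritic if its Unicode counterpart
# (byte - A0 hex + 0E00 hex) is a nonspacing mark (category "Mn").
diacritic_bytes = [
    b for b in assigned
    if unicodedata.category(chr(b - 0xA0 + 0x0E00)) == "Mn"
]

print(len(assigned))  # 87
print(" ".join(f"{b:02X}" for b in diacritic_bytes))
# D1 D4 D5 D6 D7 D8 D9 DA E7 E8 E9 EA EB EC ED EE
```

The count of 87 follows directly from the two assigned ranges, and the 16 bytes printed are the positions whose characters are drawn above or below a base consonant rather than occupying their own cell.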