Input systems for the Chinese script

from Wikipedia, the free encyclopedia

A keyboard can be used to enter Chinese text digitally. There are various input methods for Chinese characters (Chinese 漢字輸入法 / 汉字输入法, Pinyin Hànzì shūrùfǎ).

A Chinese character carries two broad kinds of information. On the one hand, it conveys one or more meanings of a word (or syllable), and a reader can grasp them without knowing the pronunciation. Speakers of different varieties (e.g. Standard Chinese, Cantonese, Shanghainese) can therefore often communicate with one another in writing, across spoken-language barriers. It is thus possible to write Chinese without committing to any pronunciation, and hence to any transcription system. Input systems based on character structure make use of this advantage.

On the other hand, each character has one (in some cases more than one) conventional pronunciation, which can vary by region. Most pronunciation-based input systems are built on Standard Chinese, which is the officially designated standard and the most widely used variety in the Chinese-speaking world. For other dialects, the difficulty is that a standardized transcription system tailored to them must exist before typed input can be mapped to the corresponding characters. This is a particular problem given the centralized orientation of the People's Republic, where regional dialects generally receive little support in the educational system and are therefore not standardized. The exception is Cantonese, for which the Jyutping transcription developed in Hong Kong provides a basis.

Input through character structure

With input methods based on the appearance of the characters, each character is entered individually according to its geometric structure. Pronunciation plays no role; only the shape matters.

The most common are:

Others are:

  • Qūwèi shūrùfǎ 區位輸入法 / 区位输入法 (roughly: zone input method)
  • Xíngmǎ shūrùfǎ 形碼輸入法 / 形码输入法 (Xingma input method)
  • Wángmǎ shūrùfǎ 王碼輸入法 / 王码输入法 (Wangma input method)
  • Èrbǐ shūrùfǎ 二筆輸入法 / 二笔输入法 (roughly: two-stroke input method)
  • Zhèngmǎ shūrùfǎ 鄭碼輸入法 / 郑码输入法 (Zhengma input method)
  • Sìjiǎo hàomǎ 四角號碼 / 四角号码 (four-corner method)

Chinese writing can also be scanned and converted into digital text using an OCR system.

A significant advantage of the structure-based input methods is their independence from pronunciation. In addition, because many Chinese words are homophones, entering characters by shape produces far fewer ambiguous candidates than entering them by sound, which significantly increases accuracy. The disadvantage is that all of these methods require extensive training.
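To illustrate how a structure-based lookup works without any reference to pronunciation, here is a minimal sketch in the style of the Cangjie method, where each key stands for a basic component. The table is a tiny hand-picked excerpt chosen for illustration, and the function names `describe` and `lookup` are hypothetical; a real input method uses a complete code table and additional decomposition rules.

```python
from typing import Optional

# Toy Cangjie-style key assignments: each key stands for one basic component.
# Only a handful of keys and characters are listed (illustration only).
KEY_COMPONENTS = {"a": "日", "b": "月", "d": "木", "n": "弓", "v": "女"}

# Shape code -> character, for a few characters built from those components.
CODE_TO_CHAR = {
    "ab": "明",   # 日 + 月
    "dd": "林",   # 木 + 木
    "ddd": "森",  # 木 + 木 + 木
    "vnd": "好",  # 女 + 弓 + 木
}


def describe(code: str) -> str:
    """Spell out which components a key sequence stands for."""
    return "".join(KEY_COMPONENTS[key] for key in code)


def lookup(code: str) -> Optional[str]:
    """Return the character for a shape code, or None if the code is unknown."""
    return CODE_TO_CHAR.get(code)


if __name__ == "__main__":
    for code in ("ab", "dd", "vnd"):
        print(code, describe(code), lookup(code))
```

In this toy table every code leads to exactly one character, so no selection list is needed, but the user has to know how each character decomposes into components.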

Input through pronunciation

[Figures: entering Chinese characters with Latin letters; correction and selection of already entered characters; Chinese keyboard layout for Zhuyin and Cangjie]

Here, Chinese sentences are entered using one of the common transcription systems, for example the Pinyin transcription in Latin letters, which is in use worldwide, or the Zhuyin transcription, which is mainly used in Taiwan. However, many Chinese characters are homophones, that is, they share the same pronunciation. As a result, the user is sometimes offered dozens of individual characters to choose from for a single entered syllable. Only as the input grows longer can the input program narrow down which sequence of characters the user means.
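The narrowing effect described above can be sketched with a toy lexicon. The entries below are a small hand-picked sample of real characters and words with toneless Pinyin readings; the names `LEXICON` and `candidates` are hypothetical, and real input programs additionally rank candidates by frequency and context.

```python
from typing import List, Sequence, Tuple

# (text, toneless Pinyin reading) pairs; a tiny sample, not a real dictionary.
LEXICON: List[Tuple[str, Tuple[str, ...]]] = [
    ("是", ("shi",)), ("事", ("shi",)), ("时", ("shi",)), ("十", ("shi",)),
    ("市", ("shi",)), ("试", ("shi",)), ("师", ("shi",)), ("史", ("shi",)),
    ("时间", ("shi", "jian")),
    ("事件", ("shi", "jian")),
    ("实践", ("shi", "jian")),
    ("时间表", ("shi", "jian", "biao")),
]


def candidates(syllables: Sequence[str]) -> List[str]:
    """Return every lexicon entry whose reading equals the typed syllables."""
    key = tuple(syllables)
    return [text for text, reading in LEXICON if reading == key]


if __name__ == "__main__":
    print(candidates(["shi"]))                  # many single characters
    print(candidates(["shi", "jian"]))          # a few two-character words
    print(candidates(["shi", "jian", "biao"]))  # a single word remains
```

Even in this small sample, one syllable matches eight characters, two syllables match three words, and three syllables match only one.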

Pinyin and Zhuyin input do not differ fundamentally from one another: in both cases, the sentence is typed as it is pronounced. However, with Pinyin the tones are usually omitted, whereas with Zhuyin they have to be typed. The sentence 我用電腦打字, Wǒ yòng diànnǎo dǎzì ("I type with the computer"), would be typed as follows in the two systems:

  • Pinyin: woyongdiannaodazi
  • Zhuyin: ㄨㄛˇ ㄩㄥˋ ㄉㄧㄢˋ ㄋㄠˇ ㄉㄚˇ ㄗˋ
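Because the Pinyin input above contains neither spaces nor tones, an input program first has to split it into syllables. The following is a minimal sketch of one way to do this, using longest-match with backtracking; it is not the algorithm of any particular input method. The syllable set is a small subset of valid Pinyin syllables chosen for this example, and `segment` is a hypothetical helper.

```python
from typing import List, Optional

# A small subset of valid toneless Pinyin syllables (illustration only).
SYLLABLES = {
    "wo", "yong", "dian", "nao", "da", "zi",
    "di", "an", "na", "ao", "a", "o", "e",
}
MAX_LEN = max(len(s) for s in SYLLABLES)


def segment(text: str) -> Optional[List[str]]:
    """Split text into known syllables, trying the longest prefix first.

    Returns None if no complete segmentation exists.
    """
    if not text:
        return []
    for length in range(min(MAX_LEN, len(text)), 0, -1):
        head = text[:length]
        if head in SYLLABLES:
            rest = segment(text[length:])
            if rest is not None:
                return [head] + rest
    return None


if __name__ == "__main__":
    print(segment("woyongdiannaodazi"))
```

Some strings allow more than one split (for example, "xian" can be read as one syllable or as "xi" plus "an"), so a real input program has to weigh alternative segmentations rather than stop at the first one found.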

One advantage of pronunciation-based input methods is that they are easy to use: anyone already familiar with the standard pronunciation and the transcription system does not have to learn a new system. The drawback is that the desired character is less likely to appear right away than with structure-based input, and touch typing is often made harder because the intended word first has to be picked out from the candidates.

Most input methods for Korean script also allow Chinese characters to be entered. This is done character by character: first the Korean pronunciation of the desired character is entered in Hangeul; then, at the press of a key, a selection list opens from which the appropriate Chinese character can be chosen. The selection list contains all characters with the given Korean pronunciation. Since Korean, unlike Chinese, is not a tonal language, the selection lists are often quite long. This method of entering Chinese characters is not suitable for people without any knowledge of Korean.
