Bidirectional text

from Wikipedia, the free encyclopedia

Bidirectional text is text, usually multilingual, that contains scripts with two different writing directions. Handling it has been a challenge in information and computer technology, particularly since data began to be exchanged worldwide via the Internet.

Scripts with different writing directions

Different languages use different writing systems. In Europe and in European-influenced cultures, scripts written from left to right predominate, for example the Latin script used for Latin, German, or English. Other scripts, particularly those of Semitic languages such as Hebrew and Arabic, and scripts derived from or influenced by them, such as Thaana, Kharoshthi, and the Arabic-based script used for Persian, are written from right to left. Some scripts can be written in either direction, for example Egyptian hieroglyphs or Chinese characters.

Bidirectional texts in computer systems

Sample text for checking whether Hebrew is displayed correctly (Hebrew for "Garden of Eden"):

גן עדן

Bidirectional script support (BiDi or bidi) is the ability of a computer system to correctly handle text that mixes writing directions. Older systems mostly supported only one writing direction, usually left to right. As computer technology has spread around the world and across cultures, text editors and word processors must be able to handle both writing directions.

Some computer programs cannot display bidirectional text correctly. For example, the Hebrew name for the Garden of Eden (גן עדן) should appear written from right to left: gimel (ג), final nun (ן), ayin (ע), dalet (ד), final nun (ן).
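The letter order described above can be checked independently of how a display renders it. The following is a minimal sketch using Python's standard `unicodedata` module; the variable names are illustrative. It assumes the sample is stored in logical (reading) order, which is how Unicode text is normally stored, so the first code point is the gimel even though it appears rightmost on screen.

```python
import unicodedata

# The Hebrew sample, written with escapes so the source code itself
# does not depend on how an editor renders right-to-left text.
garden_of_eden = "\u05D2\u05DF \u05E2\u05D3\u05DF"

# Character names in storage (logical) order.
names = [unicodedata.name(char) for char in garden_of_eden]

for index, char in enumerate(garden_of_eden):
    print(index, f"U+{ord(char):04X}", names[index])
```

A renderer with bidirectional support reverses this sequence for display; a renderer without it would draw the gimel on the left.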

Computer systems still have display problems today, especially with mixed text in which scripts with different writing directions appear within a single paragraph.

Unicode

Unicode encodes many writing systems and assigns each letter a writing direction. Punctuation marks, on the other hand, have no fixed writing direction. Characters with a fixed writing direction are called "strong characters"; characters that can be used in either writing direction are called "weak characters" (the Unicode Bidirectional Algorithm itself further distinguishes weak types, such as digits, from neutral types, such as most punctuation and whitespace). The character encoding alone does not fix a direction for such characters. Instead, the Unicode Bidirectional Algorithm (Unicode Standard Annex #9) determines a suitable writing direction for them from their context.
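The directional property of a character can be inspected with Python's standard `unicodedata.bidirectional` function, which returns the character's bidirectional class from the Unicode Character Database. The sketch below is illustrative only: the helper `describe` and the sample table are hypothetical names, and the grouping into "strong" versus "not strong" follows the simplified wording of this article rather than the full set of categories in the standard.

```python
import unicodedata

# Bidirectional classes that have a fixed direction:
# L (left-to-right), R (right-to-left), AL (right-to-left Arabic letter).
STRONG_CLASSES = {"L", "R", "AL"}

def describe(char):
    """Return (bidi class, 'strong' or 'not strong') for one character."""
    bidi_class = unicodedata.bidirectional(char)
    kind = "strong" if bidi_class in STRONG_CLASSES else "not strong"
    return bidi_class, kind

samples = {
    "Latin a": "a",
    "Hebrew gimel": "\u05D2",
    "Arabic alef": "\u0627",
    "exclamation mark": "!",
    "digit one": "1",
    "space": " ",
}

for label, char in samples.items():
    print(label, describe(char))
```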

A simplified illustration of how such a direction is determined: if a "weak character" stands between two "strong characters" with the same writing direction, it inherits that direction. If it stands between two "strong characters" with different writing directions, it takes the main writing direction of the text. If a "weak character" stands between other "weak characters", the direction is determined from the closest "strong characters". To influence this behavior, the bidirectional control characters include two "pseudo-strong characters", also called "marks": U+200E LEFT-TO-RIGHT MARK (LRM) and U+200F RIGHT-TO-LEFT MARK (RLM). These characters are invisible, but they behave like a "strong character" of the corresponding direction and can therefore set the writing direction of an adjacent punctuation mark.
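The three rules in the paragraph above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Unicode Bidirectional Algorithm: it ignores embedding levels, number handling, and mirroring, and it treats every non-strong character as one category. The function names `strong_direction` and `resolve_directions` and the `base` parameter (standing in for the main writing direction of the text) are assumptions made for this example.

```python
import unicodedata

def strong_direction(char):
    """Return 'L' or 'R' for strong characters, None for all others."""
    bidi_class = unicodedata.bidirectional(char)
    if bidi_class == "L":
        return "L"
    if bidi_class in ("R", "AL"):
        return "R"
    return None

def resolve_directions(text, base="L"):
    """Assign 'L' or 'R' to every character using the simplified rules."""
    directions = [strong_direction(char) for char in text]
    i = 0
    while i < len(text):
        if directions[i] is not None:
            i += 1
            continue
        # Find the end of this run of non-strong characters.
        j = i
        while j < len(text) and directions[j] is None:
            j += 1
        # Strong neighbors of the run; text edges count as the base direction.
        before = directions[i - 1] if i > 0 else base
        after = directions[j] if j < len(text) else base
        # Same direction on both sides: inherit it. Otherwise: use the base.
        resolved = before if before == after else base
        for k in range(i, j):
            directions[k] = resolved
        i = j
    return directions

# Hebrew word, exclamation mark, space, Latin word (left-to-right base).
plain = "\u05D2\u05DF! abc"
# Same text with U+200F RIGHT-TO-LEFT MARK inserted after the exclamation mark.
marked = "\u05D2\u05DF!\u200F abc"

print(resolve_directions(plain))
print(resolve_directions(marked))
```

In `plain`, the exclamation mark sits between a Hebrew and a Latin letter, so under these rules it takes the left-to-right base direction. In `marked`, the invisible mark gives it a right-to-left neighbor on both sides, so it is resolved as part of the Hebrew word.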
