Internationalization (software development)

from Wikipedia, the free encyclopedia

In computer science and software development, internationalization means designing a program so that it can easily be adapted to other languages and cultures without changes to the source code.

Internationalization (also spelled internationalisation) is often abbreviated with the numeronym i18n, because 18 letters lie between the first letter i and the last letter n of the English word.

Internationalization covers the tasks that fall to the developer of a program. For example, user-visible text must not be hard-coded in the source code; instead, the program uses variables whose values are read from an external source at runtime. Date formatting and language-dependent user interface design also belong here, since translated text can differ in length and some scripts run from right to left.

The next step is localization, abbreviated l10n (10 letters between l and n). Localization is the actual adaptation to a particular language and culture, for example the translation of texts into a national language. Internationalization, the preceding step, should leave the program designed so that these changes no longer have to be made by the programmer.
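
The counting rule behind the numeronyms i18n and l10n can be checked with a few lines of Python. This is only an illustrative sketch: the function name `numeronym` and the cutoff for short words are assumptions made for this example, not part of any standard.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of inner letters + last letter."""
    if len(word) < 4:
        # Assumed cutoff: very short words gain nothing from abbreviation.
        return word.lower()
    return f"{word[0]}{len(word) - 2}{word[-1]}".lower()


print(numeronym("internationalization"))  # i18n
print(numeronym("localization"))          # l10n
```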

Scope of internationalization and localization (selection)

Language

  • Translations: Text data is stored in external files and loaded dynamically. Besides plain text, translation can also concern voice output and text embedded in graphic elements such as images and videos (for example, subtitles in films).
  • Graphic representation: The program logic is independent of the user interface and of output media such as printers.

Texts

  • Unicode: In contrast to older character sets such as ASCII or EBCDIC, modern systems use the Unicode character set. Its much larger repertoire removes many character encoding problems and allows characters from different scripts to be used in the same system.
  • Bidirectional text: Depending on the script, different writing directions must be supported.
  • Script: Some languages can be written in more than one script, for example Serbian in Cyrillic script and Serbian in Latin script.
  • Text processing: Concepts such as upper and lower case do not exist in every script. Different rules also apply to text segmentation, for example when breaking a line.
  • Input methods: Keyboard shortcuts must be usable with any keyboard layout.
  • Sorting: Language-specific characters, such as the umlauts in German, must be sorted according to national rules. Conventions can also depend on the context, so that a special sort order is expected for telephone books, for example.
  • Search: In a text search, some characters have to be mapped to equivalent forms. The texts "¼" and "1/4" are encoded differently, but their meaning is identical. This transformation of texts is also called normalization.
  • Transformation: Text can be converted into a different script, either to support a different character set or to improve legibility. A distinction is made between transliteration (letter-by-letter conversion) and transcription (conversion according to pronunciation).
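
The sorting and search items in the list above can be illustrated with a small Python sketch. It assumes a simplified policy: search keys are built with Unicode NFKC normalization plus case folding, and the two German sort orders (dictionary and telephone book) are approximated with hand-written replacement tables. The function names `search_key` and `german_sort_key` are hypothetical; production systems would normally rely on a full collation implementation instead.

```python
import unicodedata


def search_key(text: str) -> str:
    """Map equivalent spellings to one comparable form (assumed policy)."""
    # NFKC replaces compatibility characters: "¼" becomes "1", U+2044, "4".
    folded = unicodedata.normalize("NFKC", text)
    # U+2044 FRACTION SLASH is not the ASCII slash, so map it explicitly.
    folded = folded.replace("\u2044", "/")
    # casefold() is a stronger lower(): for example, "ß" becomes "ss".
    return folded.casefold()


# Simplified replacement tables for the two German conventions.
_DICTIONARY = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
_PHONE_BOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_sort_key(word: str, phone_book: bool = False) -> str:
    """Return a key that sorts umlauts by dictionary or telephone book rules."""
    table = _PHONE_BOOK if phone_book else _DICTIONARY
    # NFC makes sure "ä" is a single code point before the table is applied.
    return unicodedata.normalize("NFC", word).lower().translate(table)


print(search_key("¼") == search_key("1/4"))  # True

words = ["Zahn", "Ärger", "Apfel"]
print(sorted(words))                       # ['Apfel', 'Zahn', 'Ärger']
print(sorted(words, key=german_sort_key))  # ['Apfel', 'Ärger', 'Zahn']
print(sorted(words, key=lambda w: german_sort_key(w, phone_book=True)))
# ['Ärger', 'Apfel', 'Zahn']
```

Sorting by raw code points places "Ärger" after "Zahn", which is why a language-aware key is needed at all.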

Text formatting

Culture

  • Locale: To define the culture to which it should adapt, software usually uses a locale. A locale contains information about the language, the country and, where applicable, other regional properties, such as the script to be used.
  • Images and colors: Problems of intelligibility and cultural appropriateness
  • Names and titles
  • Government-assigned numbers such as passport numbers, the Social Security number in the United States, the National Insurance number in the United Kingdom or the isikukood in Estonia
  • Telephone numbers, addresses and international postcodes
  • Paper sizes
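
As a rough illustration of the locale item in the list above, the following Python sketch models a locale as language, optional script and optional region, and derives a fallback chain from it. The `Locale` class, its deliberately simplified parser and the `fallbacks` method are assumptions for this example; real locale identifiers have more parts than are handled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Locale:
    language: str
    script: Optional[str] = None
    region: Optional[str] = None

    @classmethod
    def parse(cls, tag: str) -> "Locale":
        """Parse tags such as 'de', 'de-AT' or 'sr_Latn_RS' (simplified)."""
        parts = tag.replace("_", "-").split("-")
        script = region = None
        for part in parts[1:]:
            if len(part) == 4 and part.isalpha():
                script = part.title()   # four letters: script, e.g. "Latn"
            elif len(part) == 2 and part.isalpha():
                region = part.upper()   # two letters: country, e.g. "RS"
            elif len(part) == 3 and part.isdigit():
                region = part           # three digits: numeric region code
        return cls(parts[0].lower(), script, region)

    def fallbacks(self) -> List[str]:
        """Return tags from most specific to the bare language."""
        subtags = [self.language]
        chain = [self.language]
        for extra in (self.script, self.region):
            if extra:
                subtags.append(extra)
                chain.append("-".join(subtags))
        return chain[::-1]


serbian = Locale.parse("sr_latn_rs")
print(serbian)              # Locale(language='sr', script='Latn', region='RS')
print(serbian.fallbacks())  # ['sr-Latn-RS', 'sr-Latn', 'sr']
```

A fallback chain of this kind lets a program use a regional resource when one exists and a more general one otherwise.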

Difference between internationalization and localization

The difference between internationalization and localization may seem subtle, but it is important. Internationalization is the adaptation of a product so that it can theoretically be used anywhere. Localization is the addition of special properties for use in a certain geographically or ethnically defined sales or use area (country, region or ethnic group). Internationalization is carried out once per product. Localization is then performed once for each combination of product and usage area. The processes complement each other and have to be combined to create a system that works globally.

Business process of internationalization of software

In order to internationalize a product, the various markets into which the product is likely to be introduced must be considered. Details such as address field lengths, optional fields for postcodes and the introduction of new registration processes to cater to local legal situations are examples of how complex an internationalization project can be.

A comprehensive approach extends to cultural factors such as the adaptation of business process logic or the consideration of individual cultural behavioral aspects.

Programming practice

The prevailing practice for applications is to keep texts (and other elements, such as the names of graphics files) in resource strings, which are loaded as required during program execution. The strings are stored in separate resource files and are relatively easy to translate. Programs are often built so that they load the resource library that matches the configured target region. One library that supports this is GNU gettext.
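
The resource string approach described above can be sketched in Python as follows. The sketch does not use gettext itself; it assumes a hypothetical layout of one JSON file per language (`messages.<language>.json`) and writes two demo files to a temporary directory so that it runs on its own. The function names are likewise invented for this example.

```python
import json
import tempfile
from pathlib import Path


def write_demo_catalogs(directory: Path) -> None:
    """Create two tiny resource files; a real project ships these with the app."""
    catalogs = {
        "en": {"greeting": "Hello", "quit": "Quit"},
        "de": {"greeting": "Hallo"},  # "quit" is deliberately left untranslated
    }
    for language, strings in catalogs.items():
        path = directory / f"messages.{language}.json"
        path.write_text(json.dumps(strings, ensure_ascii=False), encoding="utf-8")


def load_strings(directory: Path, language: str, default: str = "en") -> dict:
    """Load the default catalog, then overlay the requested language on top."""
    strings = {}
    for code in (default, language):
        path = directory / f"messages.{code}.json"
        if path.exists():
            strings.update(json.loads(path.read_text(encoding="utf-8")))
    return strings


resource_dir = Path(tempfile.mkdtemp())
write_demo_catalogs(resource_dir)

german = load_strings(resource_dir, "de")
print(german["greeting"])  # Hallo
print(german["quit"])      # Quit (taken from the default catalog)
```

Because the program only refers to keys such as "greeting", a translator can add or correct a language file without touching the source code.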

Problems

While translating existing text into another language appears easy, it is far more difficult to simultaneously manage language versions of texts throughout the product lifecycle. If a message displayed to the user is modified, all translated versions must also be modified. That extends the development cycle.

The different lengths of the texts can lead to undesirable results in the display (truncated text, undesired line breaks).

Many localization tasks (writing direction, text sorting, etc.) require deeper changes to the software than translation alone. OpenOffice.org, for example, handles this with compilation switches.

Problems often arise from the syntactic characteristics of different languages. The display of "NUMBER days ago" cannot be implemented with a single translated fragment for "days ago", because in other languages the phrase may split into two parts that surround the number, as in German: "vor NUMBER Tagen". The internationalized version of the program must therefore allow text both before and after the number; the text before the number would be empty in the English localization, while in the German one it would contain "vor".

In practice, this problem is usually solved with placeholders: the entire sentence is still treated as a unit, but special character strings mark the positions of the parameters. “Showing results {0} to {1} out of {2}” becomes, in German, “Zeige Treffer {0} bis {1} von {2}”. This technique also allows the order of the arguments to be rearranged where a language requires it. For wording that depends on the parameter values (e.g. singular and plural), several alternatives are built into the text resources; the system evaluates them at runtime and selects one according to the current parameter value: “The search returned {PLURAL|{0}|no results|1 result|{0} results}.”
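
The placeholder technique can be tried out with Python's built-in `str.format`, which uses the same `{0}`, `{1}` notation. The following is a minimal sketch, not a full message formatting system: the `MESSAGES` table, the `render` function and the three-way plural rule are assumptions that happen to fit English and German, and other languages need different plural rules.

```python
MESSAGES = {
    "en": {
        "days_ago": "{0} days ago",
        "range": "Showing results {0} to {1} out of {2}",
        # Alternatives for zero, one and many results.
        "found": [
            "The search returned no results.",
            "The search returned 1 result.",
            "The search returned {0} results.",
        ],
    },
    "de": {
        "days_ago": "vor {0} Tagen",
        "range": "Zeige Treffer {0} bis {1} von {2}",
        "found": [
            "Die Suche lieferte keine Ergebnisse.",
            "Die Suche lieferte 1 Ergebnis.",
            "Die Suche lieferte {0} Ergebnisse.",
        ],
    },
}


def plural_index(count: int) -> int:
    """Select zero / one / many; a simplification valid for these two languages."""
    return 0 if count == 0 else 1 if count == 1 else 2


def render(language: str, key: str, *args) -> str:
    """Look up the whole sentence for a language and fill in its parameters."""
    template = MESSAGES[language][key]
    if isinstance(template, list):
        # The first argument decides which plural alternative is used.
        template = template[plural_index(args[0])]
    return template.format(*args)


print(render("en", "days_ago", 3))       # 3 days ago
print(render("de", "days_ago", 3))       # vor 3 Tagen
print(render("de", "range", 1, 10, 57))  # Zeige Treffer 1 bis 10 von 57
print(render("de", "found", 1))          # Die Suche lieferte 1 Ergebnis.

# Numbered placeholders may appear in any order within a translation.
print("{2} results, showing {0} to {1}".format(1, 10, 57))
# 57 results, showing 1 to 10
```

Since each translation is a complete sentence, the translator decides where the number goes, and no sentence fragments are concatenated in code.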

Beyond a certain level of complexity (e.g. for quality assurance), the development team needs someone who understands other languages and cultures and also has a technical background.

Cost-benefit analysis

In a commercial setting, the benefit of localization is access to more markets. One view holds that localizing a product for different languages or cultures is a matter of course and that only the cost needs to be confirmed. Producing products for international markets costs more, but in an increasingly global economy, supporting only one language or only one market is hardly an option. The localization of proprietary software, however, is subject to economic uncertainties and usually does not allow end users and volunteers to do the localization themselves, as is common in open source environments. Because open source software can generally be freely modified and redistributed, it lends itself more readily to internationalization. The KDE project, for example, has been translated into 100 languages.

References

  1. J. M. Pawlowski: Culture Profiles: Facilitating Global Learning and Knowledge Sharing. Proc. of ICCE 2008, Taiwan, Nov. 2008 (in English; PDF, 350 kB). Retrieved on October 21, 2009.
  2. The current list of KDE localizations (in English). Retrieved on October 24, 2009.