Common Locale Data Repository
The Common Locale Data Repository (CLDR) is a project of the Unicode Consortium that makes locale information available to application programs. It thus supports internationalization and localization. The data is published in the XML-based language LDML (Locale Data Markup Language).
History
The CLDR was originally developed by a working group of the Free Standards Group, which was founded by IBM, Sun Microsystems and OpenOffice.org. The first version was released in early 2004. The project was then continued under the leadership of the Unicode Consortium. New versions with extended and improved data are usually published twice a year; as of April 2019, the current version was 35.1.
Data
The data is distributed as XML files written in LDML.
The following short excerpt from one of the files for German illustrates the format:
<ldml>
  <localeDisplayNames>
    <languages>
      <language type="en">Englisch</language>
      <language type="fr">Französisch</language>
    </languages>
    <territories>
      <territory type="AT">Österreich</territory>
      <territory type="CH">Schweiz</territory>
      <territory type="CI">Côte d’Ivoire</territory>
      <territory type="CI" alt="variant">Elfenbeinküste</territory>
      <territory type="DE">Deutschland</territory>
    </territories>
  </localeDisplayNames>
  <delimiters>
    <quotationStart>„</quotationStart>
    <quotationEnd>“</quotationEnd>
  </delimiters>
  <dates>
    <calendars>
      <calendar type="generic">
        <dateFormats>
          <dateFormatLength type="long">
            <dateFormat>
              <pattern>d. MMMM y G</pattern>
            </dateFormat>
          </dateFormatLength>
        </dateFormats>
      </calendar>
    </calendars>
    <timeZoneNames>
      <zone type="Europe/Vienna">
        <exemplarCity>Wien</exemplarCity>
      </zone>
    </timeZoneNames>
  </dates>
</ldml>
The example shows localized names of languages and countries, the quotation marks, and information about dates and times: here the pattern for long dates and the exemplar city of a time zone.
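To make the structure of the excerpt concrete, the following minimal sketch reads a shortened copy of it with Python's standard `xml.etree.ElementTree` module. The helper name `display_names` and the choice to skip entries carrying an `alt` attribute are assumptions made for this illustration; they are not part of CLDR or of any CLDR library.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the German excerpt (only the parts read below).
LDML = """
<ldml>
  <localeDisplayNames>
    <languages>
      <language type="en">Englisch</language>
      <language type="fr">Französisch</language>
    </languages>
    <territories>
      <territory type="CI">Côte d’Ivoire</territory>
      <territory type="CI" alt="variant">Elfenbeinküste</territory>
      <territory type="DE">Deutschland</territory>
    </territories>
  </localeDisplayNames>
  <delimiters>
    <quotationStart>„</quotationStart>
    <quotationEnd>“</quotationEnd>
  </delimiters>
</ldml>
"""


def display_names(root, list_tag, item_tag):
    """Map type codes to names, skipping entries that carry an alt attribute."""
    names = {}
    for item in root.findall(f"./localeDisplayNames/{list_tag}/{item_tag}"):
        if item.get("alt") is None:  # assumption: keep only the default form
            names[item.get("type")] = item.text
    return names


root = ET.fromstring(LDML)
languages = display_names(root, "languages", "language")
territories = display_names(root, "territories", "territory")
quote_start = root.findtext("./delimiters/quotationStart")
quote_end = root.findtext("./delimiters/quotationEnd")

print(languages["fr"])
print(f"{quote_start}{territories['CI']}{quote_end}")
```

Real applications normally do not parse the XML files themselves but use a library that ships the data in a preprocessed form.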
The values of a locale are inherited by its sub-locales, so that data does not have to be duplicated unnecessarily. Accordingly, only a small amount of data is given for de-CH (Swiss Standard German); most of it is taken from de (Standard German). The starting point of inheritance for all locales is root.
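The inheritance described above can be sketched as a lookup that walks from the most specific locale to root. The data values, the key names and the functions `parent_chain` and `lookup` are hypothetical and chosen only for illustration; they are not copied from CLDR. The sketch also assumes that the parent is found by simply cutting off the last subtag, whereas the LDML specification defines additional special cases.

```python
# Hypothetical, heavily reduced data: each locale stores only
# the values that differ from its parent.
DATA = {
    "root": {"quotationStart": "“", "territory_AT": "AT"},
    "de": {"quotationStart": "„", "territory_AT": "Österreich"},
    "de-CH": {"quotationStart": "«"},
}


def parent_chain(locale):
    """Return the lookup order, for example de-CH, de, root."""
    chain = []
    while locale and locale != "root":
        chain.append(locale)
        # Drop the last subtag; an empty string ends the loop.
        locale = locale.rpartition("-")[0]
    chain.append("root")
    return chain


def lookup(locale, key):
    """Return the first value found along the inheritance chain."""
    for candidate in parent_chain(locale):
        values = DATA.get(candidate, {})
        if key in values:
            return values[key]
    raise KeyError(key)


print(parent_chain("de-CH"))
print(lookup("de-CH", "quotationStart"))  # defined in de-CH itself
print(lookup("de-CH", "territory_AT"))    # inherited from de
```

Because de-CH stores only its differences, a change to a shared value in de automatically applies to de-CH as well.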
The project includes the following data:
- Translations for
  - Languages
  - Writing systems
  - Countries
- Date and time
  - Names of calendars
  - Formats for points in time and time intervals
  - Time zone names
- Numbers
  - Characters for the thousands separator, decimal separator, sign and others
  - Different number formats
  - Rules for representing numbers in words
  - Names and symbols for currencies
- Plural rules
- Names of units
- Formats for postal codes
- Adjustments to the Unicode segmentation algorithms (e.g. information on abbreviations whose period does not mark the end of a sentence)
- Sorting rules that are used in the Unicode Collation Algorithm and its extensions
- Language-specific rules for the Unicode casing algorithms
- Rules for transliterations
The data is available for more than 740 locales covering over 200 different languages, though not to the same extent for all of them.
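As an illustration of the plural rules in the list above: for German and English, CLDR assigns the category "one" when the integer part of a number is 1 and no fraction digits are visible, and "other" in all remaining cases. The sketch below hard-codes that single rule; the function names, the `MESSAGES` table and the convention of passing the number as text (so that trailing zeros stay visible) are assumptions made for this example.

```python
def plural_category(number_text):
    """Return the plural category for German or English.

    The number is passed as text so that "1" and "1.0" can be told apart.
    """
    integer_part, _, fraction_part = number_text.partition(".")
    i = abs(int(integer_part))  # integer digits of the number
    v = len(fraction_part)      # count of visible fraction digits
    return "one" if i == 1 and v == 0 else "other"


# Hypothetical message table keyed by plural category.
MESSAGES = {"one": "{n} Tag", "other": "{n} Tage"}


def format_days(number_text):
    """Pick the message form that matches the plural category."""
    return MESSAGES[plural_category(number_text)].format(n=number_text)


print(format_days("1"))
print(format_days("3"))
```

Other languages have more categories and more complex conditions, which is why the rules are kept as data per locale rather than written into application code.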
Use
Libraries for using the CLDR data are available for all common programming languages, including those of the ICU project.
CLDR is used in numerous software products, for example by Apple in its operating systems Mac OS X and iOS, and by Google in its web applications and the Google Chrome browser. MediaWiki, the software that runs Wikipedia among other sites, also uses CLDR for its various language versions.
CLDR also contains conversion tools for generating POSIX locales from the data.
Web links
- Common Locale Data Repository
- Locale Explorer of the ICU project