Private use area

from Wikipedia, the free encyclopedia

In Unicode, certain ranges are designated as Private Use Areas (PUA). The code points in these ranges are never assigned characters by the Unicode Standard itself. They can therefore be used for privately defined characters, whose meaning has to be agreed on individually between the creators and the users of the texts that contain them. Such an agreement can consist, for example, of sharing a font file that provides these characters.

Code ranges

The Unicode Standard designates the three ranges described below for private use.

Private Use Zone

The Private Use Zone is located in plane 0 (BMP, Basic Multilingual Plane) and covers the range from U+E000 to U+F8FF. That is 6,400 code points.
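The bounds and the count given above can be checked with a few lines of Python. This is only a minimal sketch: the constant and function names are made up for illustration, and it assumes that the Unicode database shipped with the Python interpreter reports private use code points with the general category "Co".

```python
import unicodedata

# Bounds of the BMP private use range as given in the text.
PUA_FIRST = 0xE000
PUA_LAST = 0xF8FF


def is_bmp_private_use(ch: str) -> bool:
    """Return True if the single character ch lies in U+E000..U+F8FF."""
    return PUA_FIRST <= ord(ch) <= PUA_LAST


# Number of code points in the range (both ends inclusive).
pua_size = PUA_LAST - PUA_FIRST + 1

# Count the code points in the range that Python's Unicode database
# classifies as general category "Co" (private use).
co_count = sum(
    1
    for cp in range(PUA_FIRST, PUA_LAST + 1)
    if unicodedata.category(chr(cp)) == "Co"
)

print(pua_size, co_count)
```

Both numbers should come out as 6400, matching the figure in the text.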

Which characters a font defines in the Private Use Zone can be checked with the template Private-Use-Area-Test.

Private Use Planes

Unicode planes 15 and 16 contain only the two blocks Supplementary Private Use Area-A and Supplementary Private Use Area-B. Instead of PUA-A and PUA-B, the two are sometimes referred to collectively as the Private Use Planes (PUP).

Supplementary Private Use Area-A

The Supplementary Private Use Area-A covers plane 15, i.e. the range from U+F0000 to U+FFFFD. The last two code points of the plane, U+FFFFE and U+FFFFF, are noncharacters and are not part of the block. That is 65,534 code points.

Which characters a font defines in the Supplementary Private Use Area-A can be checked with the template Supplementary-Private-Use-Area-A-Test.

Supplementary Private Use Area-B

The Supplementary Private Use Area-B covers plane 16, i.e. the range from U+100000 to U+10FFFD. The last two code points of the plane, U+10FFFE and U+10FFFF, are noncharacters and are not part of the block. That is 65,534 code points.

Which characters a font defines in the Supplementary Private Use Area-B can be checked with the template Supplementary-Private-Use-Area-B-Test.
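The three ranges described in this section can be collected in a small lookup table. The following is a rough sketch only: the dictionary keys are illustrative labels and the helper names are hypothetical, while the numeric bounds are taken from the text.

```python
from typing import Optional

# Inclusive code point ranges as listed in the text.
# The keys are illustrative labels, not official identifiers.
PRIVATE_USE_RANGES = {
    "Private Use Zone": (0xE000, 0xF8FF),
    "Supplementary Private Use Area-A": (0xF0000, 0xFFFFD),
    "Supplementary Private Use Area-B": (0x100000, 0x10FFFD),
}


def private_use_range(cp: int) -> Optional[str]:
    """Return the label of the private use range containing cp, or None."""
    for label, (first, last) in PRIVATE_USE_RANGES.items():
        if first <= cp <= last:
            return label
    return None


def range_size(label: str) -> int:
    """Return the number of code points in the labelled range."""
    first, last = PRIVATE_USE_RANGES[label]
    return last - first + 1


def plane_of(cp: int) -> int:
    """Return the plane number of cp (each plane holds 0x10000 code points)."""
    return cp >> 16


for label in PRIVATE_USE_RANGES:
    print(label, range_size(label))
```

With these bounds the sizes are 6400, 65534 and 65534, and `plane_of` returns 0, 15 and 16 for the first code point of each range.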

Use

The assignment of characters to code points in these ranges is not regulated by the Unicode Consortium. However, there are various organizations and initiatives that coordinate the allocation of character codes in these ranges.

Medieval Unicode Font Initiative

The Medieval Unicode Font Initiative (MUFI) coordinates the encoding of historical characters, character variants and ligatures. It assigns them code points from the Private Use Zone, mainly from the range U+E000 to U+EFFF.

Use on Linux

On Linux, the Private Use Zone has been divided into two ranges:

  • U+E000…U+EFFF: "End User Zone"
  • U+F000…U+F8FF: "Linux Zone"
    • U+F000…U+F7FF: 1:1 mapping to the characters of the current console font
    • U+F800…U+F8FF: characters defined Linux-wide that are required or desired under Linux but are not yet included in Unicode

The End User Zone is freely available to the end user. The Linux Zone is reserved for internal operating system purposes. The range from U+F000 to U+F7FF provides a 1:1 mapping of the screen font used for the console. This enables programs such as consolechars to display all characters of the screen font currently in use without knowing their character encodings. Since the Linux text console supports at most 512 characters in a screen font, this range is more than sufficient. The range from U+F800 to U+F8FF is used for characters that are required or desired under Linux but are not (yet) included in the Unicode character set:

Code point      Character                                Comment
U+F800          DEC VT GRAPHICS HORIZONTAL LINE SCAN 1   These four code points have been deprecated since the characters were included in Unicode 3.2.
U+F801          DEC VT GRAPHICS HORIZONTAL LINE SCAN 3
U+F803          DEC VT GRAPHICS HORIZONTAL LINE SCAN 7
U+F804          DEC VT GRAPHICS HORIZONTAL LINE SCAN 9
U+F810          KEYBOARD SYMBOL FLYING FLAG              "Waving flag" key symbol (Windows key)
U+F811          KEYBOARD SYMBOL PULLDOWN MENU            Menu key symbol
U+F812          KEYBOARD SYMBOL OPEN APPLE               "Hollow apple" key symbol
U+F813          KEYBOARD SYMBOL SOLID APPLE              "Solid apple" key symbol
U+F8D0…U+F8FF   Letters and numerals of the fictional Klingon language

The allocation of code points in the "Linux Zone" is coordinated by the Linux Assigned Names and Numbers Authority (LANANA).
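The subdivision described in this section can be expressed as a simple classification. The sketch below uses hypothetical names throughout. It also assumes that the 1:1 mapping is a plain offset, so that glyph index i of the console font corresponds to U+F000 + i; the text does not spell this out.

```python
from typing import Optional

# Sub-ranges of the BMP private use range on Linux, as listed in the text.
# The labels are illustrative.
LINUX_SUBRANGES = [
    ("End User Zone", 0xE000, 0xEFFF),
    ("Linux Zone: console font mapping", 0xF000, 0xF7FF),
    ("Linux Zone: Linux-wide characters", 0xF800, 0xF8FF),
]

# Maximum number of glyphs in a console screen font, per the text.
MAX_CONSOLE_GLYPHS = 512


def linux_subrange(cp: int) -> Optional[str]:
    """Return the label of the sub-range containing cp, or None."""
    for label, first, last in LINUX_SUBRANGES:
        if first <= cp <= last:
            return label
    return None


def glyph_index_to_code_point(index: int) -> int:
    """Map a console font glyph index to a code point.

    Assumption: the mapping is a plain offset from U+F000.
    """
    if not 0 <= index < MAX_CONSOLE_GLYPHS:
        raise ValueError("glyph index out of range for a console font")
    return 0xF000 + index


# Number of code points available for the console font mapping.
mapping_capacity = 0xF7FF - 0xF000 + 1

print(linux_subrange(0xF8D0), hex(glyph_index_to_code_point(0x41)))
```

The mapping range holds 2048 code points, which is why a font with at most 512 glyphs fits comfortably.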

ConScript Unicode Registry

This volunteer project coordinates the encoding of fictional scripts used in novels or films, such as the scripts of the Middle-earth languages invented by J. R. R. Tolkien. It assigns code points in all three private use blocks and coordinates with LANANA, but not with MUFI.

Other uses

See also

Andreas Stötzner: LINCUA - A Unicode PUA harmonization plan. June 20, 2012. Retrieved August 26, 2012.

References

  1. Michael Everson et al.: Roadmap to the BMP, revision 6.1.0. The Unicode Consortium, February 1, 2012. Retrieved August 26, 2012. (The term "Private Use Zone" is not used in the text of the Unicode Standard, but it appears on this website, which is officially provided by the Unicode Consortium.)
  2. Unicode 6.3, Chapter 2.8, page 34, first paragraph. (The core specification was not changed or republished for version 6.3, so the files from version 6.2 continue to apply unchanged.)
  3. Medieval Unicode Font Initiative. Retrieved August 21, 2012.
  4. H. Peter Anvin (ed.): Linux Zone Unicode Assignments. (TXT) The "Linux Assigned Names And Numbers Authority" (LANANA) project, January 17, 2005. Retrieved September 12, 2012.
  5. ConScript Unicode Registry. Retrieved August 21, 2012.
  6. Peter Constable and Lorna A. Priest: SIL Corporate PUA Assignments. April 17, 2012. Retrieved August 21, 2012.
  7. Chris Harvey: Languagegeek Fonts. June 29, 2012. Retrieved August 21, 2012.