Mathematical characters in Unicode

Unicode provides a wide range of letters and symbols specifically for use in mathematical formulas. These characters support a linear notation for formulas, in which superscripts and subscripts for powers, indices or integral limits are possible only to a limited extent, and multi-line structures such as matrices cannot be built at all. Such exact positioning requires higher-level protocols that carry, in addition to the plain formula text, instructions for its layout.

General

Characters that are used exclusively or mainly in formulas are identified as such by the Math property. Mathematical symbols can also be recognized by their general category, which is Sm.
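The general-category check can be sketched in Python with the standard unicodedata module. The helper name is_math_symbol is an illustrative assumption; note also that unicodedata exposes the general category but not the Math property itself, so this only covers the second criterion.

```python
import unicodedata

def is_math_symbol(ch: str) -> bool:
    """Return True if the general category of ch is Sm (Symbol, math)."""
    return unicodedata.category(ch) == "Sm"

# PLUS SIGN, LESS-THAN OR EQUAL TO, N-ARY SUMMATION, a plain letter,
# and MATHEMATICAL BOLD CAPITAL A (a letter, not a symbol).
for ch in ["+", "\u2264", "\u2211", "x", "\U0001D400"]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_math_symbol(ch))
```

Styled letters such as U+1D400 have a letter category (Lu or Ll) rather than Sm, which is why the category alone does not identify every character used in formulas.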

In contrast to most other characters, mathematical characters were encoded according to their appearance. U+2264 (≤), U+2266 (≦) and U+2A7D (⩽) are three different characters for “less than or equal to” that differ only slightly. In addition to the ordinary letters, Latin letters are also encoded as separate characters in 13 different styles. Conversely, characters that look the same but serve different purposes are often encoded only once: for example, U+2206 (∆) is used as the Laplace operator, for the symmetric difference, in difference calculus, and in physical or chemical formulas for the change in a quantity.
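As a small illustration, the three “less than or equal to” characters can be listed with their character names, and a normalization pass shows that they stay distinct. This is a sketch using Python's unicodedata module; the names LOOKALIKES and describe are assumptions made for the example.

```python
import unicodedata

# The three visually similar relation signs mentioned in the text.
LOOKALIKES = ["\u2264", "\u2266", "\u2A7D"]

def describe(chars):
    """Return (code point, character name) pairs for the given characters."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]

for code_point, name in describe(LOOKALIKES):
    print(code_point, name)

# Compatibility normalization does not merge them into one character.
distinct_after_nfkc = {unicodedata.normalize("NFKC", c) for c in LOOKALIKES}
print(len(distinct_after_nfkc))
```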

Encoded characters

The mathematical characters in Unicode come partly from existing standards, such as ISO 9573-13; in addition, characters that were used in mathematical or physical publications were included.

Letters

Latin letters

In addition to the ordinary Latin letters, letters in special styles are also available.

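Style	Range	Block
normal	U+0041–U+005A, U+0061–U+007A	Basic Latin
bold	U+1D400–U+1D433	Mathematical Alphanumeric Symbols
italic	U+1D434–U+1D467 ^*	Mathematical Alphanumeric Symbols
bold italic	U+1D468–U+1D49B	Mathematical Alphanumeric Symbols
script (calligraphic)	U+1D49C–U+1D4CF ^*	Mathematical Alphanumeric Symbols
bold script	U+1D4D0–U+1D503	Mathematical Alphanumeric Symbols
Fraktur	U+1D504–U+1D537 ^*	Mathematical Alphanumeric Symbols
double-struck	U+1D538–U+1D56B ^*	Mathematical Alphanumeric Symbols
bold Fraktur	U+1D56C–U+1D59F	Mathematical Alphanumeric Symbols
sans-serif	U+1D5A0–U+1D5D3	Mathematical Alphanumeric Symbols
sans-serif bold	U+1D5D4–U+1D607	Mathematical Alphanumeric Symbols
sans-serif italic	U+1D608–U+1D63B	Mathematical Alphanumeric Symbols
sans-serif bold italic	U+1D63C–U+1D66F	Mathematical Alphanumeric Symbols
monospace	U+1D670–U+1D6A3	Mathematical Alphanumeric Symbols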

^* Individual characters of these styles are located in the Letterlike Symbols block, in particular ℕ, ℝ and others; these were not assigned again in the 1Dxxx range.
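Because each style occupies a contiguous range, a styled letter can be computed as an offset from the start of that range. The following Python sketch does this for the bold style, which has no gaps, and then checks two of the gaps described in the footnote. The function names to_math_bold and is_assigned are assumptions made for illustration.

```python
import unicodedata

BOLD_CAPITAL_A = 0x1D400  # MATHEMATICAL BOLD CAPITAL A
BOLD_SMALL_A = 0x1D41A    # MATHEMATICAL BOLD SMALL A (26 positions later)

def to_math_bold(text: str) -> str:
    """Map ASCII letters to the bold mathematical letters; keep the rest."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(BOLD_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(BOLD_SMALL_A + ord(ch) - ord("a")))
        else:
            out.append(ch)
    return "".join(out)

def is_assigned(code_point: int) -> bool:
    """Return False for code points with general category Cn (unassigned)."""
    return unicodedata.category(chr(code_point)) != "Cn"

print(to_math_bold("x + y"))
# Where italic small h would sit, the range has a gap; U+210E is used instead.
print(is_assigned(0x1D455), unicodedata.name("\u210E"))
# Likewise for double-struck capital N, which is U+2115 (ℕ).
print(is_assigned(0x1D545), unicodedata.name("\u2115"))
```

A mapping for the italic, script, Fraktur or double-struck styles would need a table of such exceptions in addition to the offset. Compatibility normalization (NFKC) reverses the mapping and yields the plain letters again.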

In combination with accents, the letters i and j are also used in a dotless variant; these are encoded as italic letters in the Mathematical Alphanumeric Symbols block at the code points U+1D6A4 and U+1D6A5.

Small subscript letters for indices are found mostly in the Superscripts and Subscripts block; superscript letters for powers are largely absent from that block, with n (U+207F) among the few exceptions. In mathematical contexts, using these characters is discouraged in favor of proper formatting.

The Letterlike Symbols block contains some further characters derived from Latin letters that are used in formulas, including the Weierstrass p (℘).

Greek letters

In addition to the ordinary Greek letters, letters in a few selected styles are also available. Some characters, such as the lowercase phi, are encoded in two different glyph variants. There are also some symbols derived from Greek letters, such as the nabla.

Style	Range	Block
normal	U+0391–U+03E1 ^**	Greek and Coptic
bold	U+1D6A8–U+1D6E1, U+1D7CA, U+1D7CB	Mathematical Alphanumeric Symbols
italic	U+1D6E2–U+1D71B	Mathematical Alphanumeric Symbols
bold italic	U+1D71C–U+1D755	Mathematical Alphanumeric Symbols
sans-serif bold	U+1D756–U+1D78F	Mathematical Alphanumeric Symbols
sans-serif bold italic	U+1D790–U+1D7C9	Mathematical Alphanumeric Symbols
double-struck	only a few characters	Letterlike Symbols

^** Symbols derived from Greek letters can be found in the Mathematical Operators block.

Other letters

The Arabic Mathematical Alphabetic Symbols block encodes Arabic letters for use in formulas. Some Hebrew letters used in mathematics are encoded in the Letterlike Symbols block as characters that, unlike the actual Hebrew letters, do not affect the left-to-right writing direction. In individual cases, letters from other alphabets can also appear in mathematical formulas, such as the Cyrillic Ш (U+0428) for the Tate–Shafarevich group.
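The difference in directionality can be observed through the bidirectional class of the characters. The following Python sketch compares the letterlike alef symbol with the Hebrew letter alef; the choice of alef as the example and the helper name bidi_class are assumptions made for illustration.

```python
import unicodedata

ALEF_SYMBOL = "\u2135"   # letterlike symbol used in mathematics
HEBREW_ALEF = "\u05D0"   # the actual Hebrew letter

def bidi_class(ch: str) -> str:
    """Return the bidirectional class of a character ('L', 'R', ...)."""
    return unicodedata.bidirectional(ch)

print(unicodedata.name(ALEF_SYMBOL), bidi_class(ALEF_SYMBOL))
print(unicodedata.name(HEBREW_ALEF), bidi_class(HEBREW_ALEF))
```

The symbol has the left-to-right class L, while the Hebrew letter has the right-to-left class R, so only the latter changes the direction of the surrounding text.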

Accents

Letters in mathematical formulas often carry accents and other diacritical marks, such as the circumflex or the macron. In physics especially, dots above a letter are used for time derivatives. There are also accents that are used only in formulas, such as the arrow above a letter to mark vectors. In addition to the Combining Diacritical Marks block, the Combining Diacritical Marks for Symbols block is used.
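An accented letter is written as the base letter followed by a combining mark. The Python sketch below builds a dotted x and a vector v; the specific marks chosen (U+0307 and U+20D7) and the helper name accent are assumptions made for the example.

```python
import unicodedata

COMBINING_DOT_ABOVE = "\u0307"          # from Combining Diacritical Marks
COMBINING_RIGHT_ARROW_ABOVE = "\u20D7"  # from Combining Diacritical Marks for Symbols

def accent(base: str, mark: str) -> str:
    """Attach a combining mark to a base letter (base first, then mark)."""
    return base + mark

x_dot = accent("x", COMBINING_DOT_ABOVE)          # time derivative of x
v_vec = accent("v", COMBINING_RIGHT_ARROW_ABOVE)  # vector v

for text in (x_dot, v_vec):
    composed = unicodedata.normalize("NFC", text)
    print(text, len(text), len(composed))
```

Normalization to NFC replaces the dotted x with a precomposed character where one exists, whereas the vector arrow always remains a separate combining mark.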

Digits

The ordinary digits are also encoded multiple times in different styles.

Style	Range	Block
normal	U+0030–U+0039	Basic Latin
bold	U+1D7CE–U+1D7D7	Mathematical Alphanumeric Symbols
double-struck	U+1D7D8–U+1D7E1	Mathematical Alphanumeric Symbols
sans-serif	U+1D7E2–U+1D7EB	Mathematical Alphanumeric Symbols
sans-serif bold	U+1D7EC–U+1D7F5	Mathematical Alphanumeric Symbols
monospace	U+1D7F6–U+1D7FF	Mathematical Alphanumeric Symbols

Small superscript and subscript digits for powers and indices can be found in the Superscripts and Subscripts block (superscript 1, 2 and 3 are in the Latin-1 Supplement block). In mathematical contexts, using these characters is discouraged in favor of proper formatting (although uses such as km² are common).
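Both the styled digits and the superscript digits carry a numeric value that software can query, although they belong to different general categories. A minimal Python sketch, with digit_value as an assumed helper name:

```python
import unicodedata

def digit_value(ch: str) -> int:
    """Return the digit value that Unicode assigns to a character."""
    return unicodedata.digit(ch)

DOUBLE_STRUCK_ZERO = "\U0001D7D8"  # first double-struck digit
SUPERSCRIPT_TWO = "\u00B2"         # from the Latin-1 Supplement block

for ch in (DOUBLE_STRUCK_ZERO, SUPERSCRIPT_TWO):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), digit_value(ch))

# Compatibility normalization flattens the superscript to a plain digit.
print(unicodedata.normalize("NFKC", "km\u00B2"))
```

The last line shows one reason the superscript characters are fragile in formulas: after normalization the exponent is indistinguishable from an ordinary digit.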

Some fractions are encoded in the Number Forms block and the Latin-1 Supplement block; other fractions can be produced with the fraction slash (U+2044) from the General Punctuation block. The intent is that a renderer identifies the digit sequences before and after the fraction slash and formats each as a whole as numerator and denominator. There is no direct support for fractions whose numerators or denominators are not ordinary numbers but letters standing for variables.
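The relationship between precomposed fractions and the fraction slash can be sketched in Python as follows. The helper name make_fraction is an assumption made for illustration; whether the result is actually drawn as a stacked or slanted fraction depends on the renderer and font.

```python
import unicodedata

FRACTION_SLASH = "\u2044"  # from the General Punctuation block

def make_fraction(numerator: int, denominator: int) -> str:
    """Join two integers with the fraction slash, e.g. 3 and 16."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"

print(make_fraction(3, 16))

ONE_HALF = "\u00BD"  # precomposed fraction from the Latin-1 Supplement block
print(unicodedata.numeric(ONE_HALF))
# The compatibility decomposition of a precomposed fraction uses the same
# digit, fraction slash, digit pattern.
print(unicodedata.normalize("NFKC", ONE_HALF) == make_fraction(1, 2))
```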

Symbols

Arrows

There are five blocks for arrows: Arrows, Supplemental Arrows-A, Supplemental Arrows-B, Supplemental Arrows-C, and Miscellaneous Symbols and Arrows, the last of which also contains some geometric symbols.

Operators and relation signs

Four blocks are provided for operators, relation signs and other mathematical symbols: Mathematical Operators, Miscellaneous Mathematical Symbols-A, Miscellaneous Mathematical Symbols-B, and Supplemental Mathematical Operators. Some elementary symbols are in the Basic Latin block. If a negated relation is not encoded separately, it can be formed with combining characters.
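Forming a negated relation with a combining character can be sketched in Python as below. The text does not name a specific combining character; the use of U+0338 (COMBINING LONG SOLIDUS OVERLAY) and the function name negate are assumptions made for this example.

```python
import unicodedata

COMBINING_LONG_SOLIDUS_OVERLAY = "\u0338"

def negate(relation: str) -> str:
    """Append U+0338 and compose to a single code point where one exists."""
    return unicodedata.normalize("NFC", relation + COMBINING_LONG_SOLIDUS_OVERLAY)

# EQUALS SIGN, LESS-THAN SIGN, ELEMENT OF, LESS-THAN OVER EQUAL TO
for relation in ["=", "<", "\u2208", "\u2266"]:
    negated = negate(relation)
    print(relation, negated, [f"U+{ord(c):04X}" for c in negated])
```

For the first three relations, normalization yields the separately encoded negated sign (U+2260, U+226E, U+2209). For U+2266 no such character exists, so the result remains a sequence of the relation and the combining overlay.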

Other symbols

Some other blocks also contain symbols that can appear in formulas. These include the Miscellaneous Technical block, which defines, among other things, characters from which large brackets can be assembled out of several pieces. The Geometric Shapes block contains various triangles, squares, circles and other shapes for general use. In addition to spaces of different widths, the General Punctuation block also contains some invisible characters that can structure formulas semantically: an implicit multiplication can be expressed with the character U+2062.
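The implicit multiplication can be sketched in Python as follows; the helper name implicit_product is an assumption made for illustration.

```python
import unicodedata

INVISIBLE_TIMES = "\u2062"  # from the General Punctuation block

def implicit_product(*factors: str) -> str:
    """Join factors with INVISIBLE TIMES to mark an implied multiplication."""
    return INVISIBLE_TIMES.join(factors)

term = implicit_product("a", "b")
print(term, len(term))  # displays like "ab" but contains three code points
print(unicodedata.name(INVISIBLE_TIMES), unicodedata.category(INVISIBLE_TIMES))
```

A program that reads the formula can use the invisible character to tell the product of a and b apart from a two-letter identifier, while the displayed text stays unchanged.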

Variation selectors

For some symbols, display variants can be requested with variation selectors. The negation stroke in U+2268 (≨) is normally slanted; the sequence <U+2268, U+FE00>, by contrast, should be displayed with a vertical stroke.
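In plain text, such a variation sequence is simply the base character followed by the selector. A minimal Python sketch, in which the function name with_variation_selector_1 is an assumption; whether the variant glyph actually appears depends on font support.

```python
import unicodedata

VARIATION_SELECTOR_1 = "\uFE00"

def with_variation_selector_1(ch: str) -> str:
    """Build the variation sequence <ch, U+FE00>."""
    return ch + VARIATION_SELECTOR_1

default_form = "\u2268"                              # slanted stroke
variant_form = with_variation_selector_1("\u2268")   # vertical stroke requested

print(default_form, variant_form)
print([f"U+{ord(c):04X}" for c in variant_form])
print(unicodedata.name(VARIATION_SELECTOR_1))
```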

Usage

TeX and LaTeX are older than Unicode and therefore traditionally use specially adapted fonts to represent formulas. The LaTeX package unicode-math allows most Unicode characters for mathematical symbols to be used in the input in place of the usual commands, and it also uses these characters in the output. Other systems for formatting formulas, such as MathML, use Unicode characters directly throughout and thus benefit from the large number of Unicode characters for formulas.

In some programming languages, it is possible to a certain extent, with the help of preprocessors and similar methods, to use these Unicode characters in program code and thereby make formulas more readable.
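As one illustration, Python accepts Unicode letters in identifiers without any preprocessor, so Greek variable names can be written directly; operator symbols such as ∑ or ≤ are not valid in identifiers. The function names and the formulas below are assumptions chosen only to show the notation.

```python
import math

π = math.pi  # Greek small letter pi as an ordinary variable name

def circle_area(r: float) -> float:
    """Area of a circle, written close to the textbook formula."""
    return π * r ** 2

def damped_oscillation(t: float, ω: float, γ: float) -> float:
    """Amplitude exp(-γ t) cos(ω t) of a damped oscillation at time t."""
    return math.exp(-γ * t) * math.cos(ω * t)

print(circle_area(1.0))
print(damped_oscillation(0.0, ω=2.0, γ=0.5))
```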

Sources

Julie D. Allen et al.: The Unicode Standard. Version 6.2, Core Specification. The Unicode Consortium, Mountain View, CA, 2012. ISBN 978-1-936213-07-8. Chapter 15: Symbols. (online, PDF)
Barbara Beeton, Asmus Freytag and Murray Sargent III: Unicode Technical Report #25: Unicode Support for Mathematics. (online, PDF)

References

^ Will Robertson, Philipp Stephani and Khaled Hosny: Experimental Unicode mathematical typesetting: The unicode-math package. Version of July 28, 2012. (online, PDF)
^ Murray Sargent III: Unicode Nearly Plain-Text Encoding of Mathematics. Unicode Technical Note #28, version of March 10, 2010. (online, PDF)