HTML entity

from Wikipedia, the free encyclopedia

An HTML entity is an entity (i.e. a clearly delimited character string with a special meaning) that is used in HTML (i.e. the text-based markup language in which, for example, web pages are written). Numeric entities and named entities are often used to represent characters that are not available in the character encoding chosen for the page or in the input method used to write it. Certain control characters can also be displayed visibly in the text in this way.

Numeric entities

A numeric entity denotes a character by its Unicode code point. Two formats are defined for this:

  • &#nnn; - nnn represents the code point as a decimal number (without leading zeros).
  • &#xhhhh; - hhhh represents the code point as a hexadecimal number, that is, in the form in which Unicode code points are usually given (without the leading "U+"). Leading zeros are allowed and are common for values with fewer than four digits, so that the value matches the four-digit Unicode code point notation.
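
The two formats above can be illustrated with a minimal Python sketch. The function name numeric_entities is hypothetical and chosen only for this example; the hexadecimal form is padded to four digits, following the convention described above.

```python
def numeric_entities(char):
    """Return the decimal and hexadecimal numeric entities for one character."""
    code_point = ord(char)                 # Unicode code point as an integer
    decimal = f"&#{code_point};"           # decimal form, no leading zeros
    hexadecimal = f"&#x{code_point:04X};"  # hex form, padded to at least 4 digits
    return decimal, hexadecimal


print(numeric_entities("·"))  # ('&#183;', '&#x00B7;')
print(numeric_entities("ſ"))  # ('&#383;', '&#x017F;')
```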

Regardless of the character encoding ("charset") in which the HTML document is stored, only the Unicode code point counts. This means that numeric entities in the range &#128; to &#159; (hexadecimal &#x80; to &#x9F;) are incorrect if they are used to represent the characters that occupy these positions in the Windows-1252 encoding. These include, among others, the characters € and ‰, the letters Œ, œ, Š, š, Ÿ, Ž and ž, as well as various quotation marks and dashes. The characters with Unicode code points U+0080 to U+009F do not normally appear in texts.
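
The difference between a Windows-1252 position and a Unicode code point can be checked with a short Python sketch. The helper name compare_positions is an assumption made for this example; "cp1252" is Python's name for the Windows-1252 codec.

```python
def compare_positions(char):
    """Return (Windows-1252 byte value, Unicode code point, correct entity)."""
    byte_value = char.encode("cp1252")[0]  # position in Windows-1252
    code_point = ord(char)                 # position in Unicode
    return byte_value, code_point, f"&#{code_point};"


for char in "€‰Ÿ":
    byte_value, code_point, entity = compare_positions(char)
    print(f"{char}: Windows-1252 {byte_value}, Unicode U+{code_point:04X}, entity {entity}")
```

For €, for example, the Windows-1252 position is 128 but the Unicode code point is U+20AC, so the correct numeric entity is &#8364; and not &#128;.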

Named entities

A named entity has the format &aaa; - aaa represents a name consisting of uppercase and lowercase letters of the basic Latin alphabet and digits, which uniquely identifies the character in question. The names are case-sensitive: uppercase and lowercase letters must be used exactly as defined, and names that differ only in case can denote different characters. The names are defined by the W3C (World Wide Web Consortium).
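
The case sensitivity of entity names can be seen in a small Python sketch. It relies on html.entities.name2codepoint from the standard library, which holds the HTML 4 entity names; the helper name named_entity_char is hypothetical.

```python
from html.entities import name2codepoint


def named_entity_char(name):
    """Look up the character for an HTML 4 entity name (without & and ;)."""
    return chr(name2codepoint[name])


print(named_entity_char("Scaron"))   # Š
print(named_entity_char("scaron"))   # š
print("MIDDOT" in name2codepoint)    # False, only "middot" is defined
```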

Examples

Character  Unicode                               Designation     Decimal  Numeric entity          Named
           position  name                                        code     decimal     hexadec.    entity
·          U+00B7    middle dot                  Raised dot      0183     &#183;      &#x00B7;    &middot;
ſ          U+017F    latin small letter long s   Long s          0383     &#383;      &#x017F;    (none)
‰          U+2030    per mille sign              Per mille sign  8240     &#8240;     &#x2030;    &permil;
🖷          U+1F5B7   fax icon                    Fax icon        128439   &#128439;   &#x1F5B7;   (none)
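
As a rough check of the examples, the following Python sketch decodes the entities with html.unescape from the standard library and compares the result with the expected character. Note that html.unescape follows the HTML5 entity list; the helper name all_decode_to is an assumption for this example.

```python
import html


def all_decode_to(char, entities):
    """Return True if every entity in the list decodes to the given character."""
    return all(html.unescape(entity) == char for entity in entities)


print(all_decode_to("·", ["&#183;", "&#x00B7;", "&middot;"]))   # True
print(all_decode_to("‰", ["&#8240;", "&#x2030;", "&permil;"]))  # True
print(all_decode_to("ſ", ["&#383;", "&#x017F;"]))               # True
```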

Under Windows, the per mille sign can also be typed using the decimal code 0137, but this code cannot be used for HTML entities, because 137 is the character's position in Windows-1252 and not its Unicode code point.

References

  1. W3C (World Wide Web Consortium): Character entity references in HTML 4 - list of the named entities available in HTML 4 (and thus, for example, for writing Wikipedia articles)
  2. W3C (World Wide Web Consortium): Character entity reference chart - list of the named character entities available in HTML 4 and HTML5