Windows-1252
874 | Thai |
932 | Japanese |
936 | Simplified Chinese |
949 | Korean |
950 | Traditional Chinese |
1250 | Central European |
1251 | Cyrillic |
1252 | Western European |
1253 | Greek |
1254 | Turkish |
1255 | Hebrew |
1256 | Arabic |
1257 | Baltic |
1258 | Vietnamese |
Windows-1252, also known as CP 1252, Western European, or ANSI, is an 8-bit character encoding that was developed for the Microsoft Windows operating system. The character set is based on ISO 8859-1 (Latin-1) but deviates from it in the range 80 to 9F (hexadecimal): instead of the very rarely used C1 control characters, these 32 positions hold 27 displayable characters, including the characters added in ISO 8859-15 and some needed for better typography.
Some applications mix up the definitions of ISO 8859-1 and Windows-1252. Because the additional control characters of ISO 8859-1 have no meaning in HTML anyway, the HTML5 standard stipulates that text labelled as ISO 8859-1 is to be interpreted as Windows-1252. Nonetheless, Windows-1252 is also registered with the IANA. In January 2019, 3.5% of all websites used the encoding implicitly, labelled as ISO 8859-1, and 0.6% declared Windows-1252 explicitly, with a falling trend. That made Latin-1 the second most common website encoding after UTF-8 (93.0%) and Windows-1252 the fourth most common, after Windows-1251. The differences between these encodings, and the inconsistent support for different character sets, are a common source of interoperability problems.
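The practical effect of the HTML5 rule can be illustrated with a short sketch. The byte string `raw` is a made-up sample, and the codec names `latin-1` and `cp1252` are the ones Python's standard library uses for ISO 8859-1 and Windows-1252; this is not how any particular browser is implemented.

```python
# Hypothetical bytes as they might arrive from a page labelled "ISO-8859-1".
raw = b"\x93Price: 5 \x80\x94"

# Read strictly as ISO 8859-1, bytes 0x80-0x9F become invisible
# C1 control characters.
as_latin1 = raw.decode("latin-1")

# Read as Windows-1252 (the interpretation HTML5 prescribes for that
# label), the same bytes become curly quotes and the euro sign.
as_cp1252 = raw.decode("cp1252")

print(repr(as_latin1))
print(as_cp1252)
```

Both decodings agree on every byte outside the range 80 to 9F, which is why the mislabelling often goes unnoticed until typographic quotes or the euro sign appear.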
Code | …0 | …1 | …2 | …3 | …4 | …5 | …6 | …7 | …8 | …9 | …A | …B | …C | …D | …E | …F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0… | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | HT | LF | VT | FF | CR | SO | SI |
1… | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
2… | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3… | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4… | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5… | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6… | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7… | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
8… | € | | ‚ | ƒ | „ | … | † | ‡ | ˆ | ‰ | Š | ‹ | Œ | | Ž | |
9… | | ‘ | ’ | “ | ” | • | – | — | ˜ | ™ | š | › | œ | | ž | Ÿ |
A… | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ |
B… | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
C… | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
D… | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
E… | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
F… | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
The code points in rows 8… and 9… are the ones that differ from ISO 8859-1: 27 of them are assigned to characters, and five (81, 8D, 8F, 90 and 9D) are not used.
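The split into assigned and unused positions can be checked with a minimal sketch. It relies on Python's built-in `cp1252` codec, which raises an error for the unassigned bytes; other decoders may instead pass these bytes through as control characters. The helper name `classify_c1_range` is made up for this illustration.

```python
def classify_c1_range(encoding="cp1252"):
    """Split bytes 0x80-0x9F into assigned characters and unused positions."""
    defined, unused = {}, []
    for byte in range(0x80, 0xA0):
        try:
            defined[byte] = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            # Python's cp1252 codec leaves this position undefined.
            unused.append(byte)
    return defined, unused


defined, unused = classify_c1_range()
print(len(defined), "assigned; unused:", [f"{b:02X}" for b in unused])
```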
Since Unicode is based on ISO 8859-1 and not on Windows-1252, the Unicode code points of all characters outside the range 80 to 9F are identical to their Windows-1252 code values. This does not hold for the characters in rows 8… and 9…, which map to the following code points:
Code | …0 | …1 | …2 | …3 | …4 | …5 | …6 | …7 | …8 | …9 | …A | …B | …C | …D | …E | …F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8… | € U+20AC | | ‚ U+201A | ƒ U+0192 | „ U+201E | … U+2026 | † U+2020 | ‡ U+2021 | ˆ U+02C6 | ‰ U+2030 | Š U+0160 | ‹ U+2039 | Œ U+0152 | | Ž U+017D | |
9… | | ‘ U+2018 | ’ U+2019 | “ U+201C | ” U+201D | • U+2022 | – U+2013 | — U+2014 | ˜ U+02DC | ™ U+2122 | š U+0161 | › U+203A | œ U+0153 | | ž U+017E | Ÿ U+0178 |
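The relationship between Windows-1252 byte values and Unicode code points can be verified mechanically. The following sketch, again assuming Python's built-in `cp1252` codec, collects every byte whose decoded code point differs from the byte value itself; the helper name `differing_code_points` is made up for this illustration.

```python
def differing_code_points(encoding="cp1252"):
    """Map each byte to its Unicode code point where the two values differ."""
    diffs = {}
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # unassigned position, nothing to compare
        if ord(ch) != byte:
            diffs[byte] = ord(ch)
    return diffs


diffs = differing_code_points()
for byte, code_point in diffs.items():
    print(f"{byte:02X} -> U+{code_point:04X} {chr(code_point)}")
```

Every byte reported this way lies in the range 80 to 9F; all other assigned bytes decode to the code point with the same numeric value.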
Differences between ISO 8859-1, ISO 8859-15, Windows-1252 and Unicode
In addition to the characters from ISO 8859-1, Windows-1252 also contains the characters that ISO 8859-15 added in place of some less frequently used ISO 8859-1 characters. However, the positions of these characters differ between Windows-1252 and ISO 8859-15, and both differ from the Unicode code points. The following tables list the positions of all characters that are missing from at least one of the two ISO encodings.
character | € | Š | š | Ž | ž | Œ | œ | Ÿ | ¤ | ¦ | ¨ | ´ | ¸ | ¼ | ½ | ¾ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ISO 8859-1 | - | - | - | - | - | - | - | - | A4 | A6 | A8 | B4 | B8 | BC | BD | BE |
ISO 8859-15 | A4 | A6 | A8 | B4 | B8 | BC | BD | BE | - | - | - | - | - | - | - | - |
Windows-1252 | 80 | 8A | 9A | 8E | 9E | 8C | 9C | 9F | A4 | A6 | A8 | B4 | B8 | BC | BD | BE |
Unicode | 20AC | 160 | 161 | 17D | 17E | 152 | 153 | 178 | A4 | A6 | A8 | B4 | B8 | BC | BD | BE |
character | ‚ | ƒ | „ | … | † | ‡ | ˆ | ‰ | ‹ | ‘ | ’ | “ | ” | • | – | — | ˜ | ™ | › |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ISO 8859-1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
ISO 8859-15 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
Windows-1252 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 8B | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 9B |
Unicode | 201A | 192 | 201E | 2026 | 2020 | 2021 | 2C6 | 2030 | 2039 | 2018 | 2019 | 201C | 201D | 2022 | 2013 | 2014 | 2DC | 2122 | 203A |
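The positions in the first of these tables can be reproduced with a short sketch. It assumes the Python codec names `latin-1`, `iso8859-15` and `cp1252` for the three encodings; the helper `position` is a made-up name for this illustration.

```python
def position(ch, encoding):
    """Return the hex byte value of ch in the encoding, or "-" if absent."""
    try:
        return ch.encode(encoding).hex().upper()
    except UnicodeEncodeError:
        return "-"


# The characters that exist in only one of the two ISO encodings.
for ch in "€ŠšŽžŒœŸ¤¦¨´¸¼½¾":
    row = [position(ch, enc) for enc in ("latin-1", "iso8859-15", "cp1252")]
    print(ch, *row, f"U+{ord(ch):04X}")
```

Each printed line holds the character, its position in ISO 8859-1, ISO 8859-15 and Windows-1252, and its Unicode code point, with "-" marking an encoding that lacks the character.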