Internationalized domain name
Internationalized domain names (IDN), colloquially also called umlaut domains or special-character domains, are domain names that contain umlauts, diacritics, or characters from alphabets other than the Latin alphabet. Such characters were not originally provided for in the Domain Name System; they were made possible later by the Internet standard Internationalizing Domain Names in Applications (IDNA).
The share of IDNs among all domains registered under .de is around four percent.
Unicode domain names are converted to ASCII-compatible encodings (ACE). The conversion takes place in the client (e.g. the browser or mail program), so the server infrastructure does not have to be adapted. Instead of the Unicode string, the user can also enter the ACE string directly in the client. Clients without IDN support can therefore also work with internationalized domains, provided the user knows the ACE string. This is more cumbersome, however, because the Unicode domain name cannot easily be read from an ACE string.
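The client-side conversion can be sketched with the `idna` codec from Python's standard library. This is only an illustration: that codec follows IDNA2003, and the helper names `to_ace` and `to_unicode` are made up for this example.

```python
def to_ace(domain: str) -> str:
    """Convert a Unicode domain name to its ACE form, as a client would."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ACE domain name back to its Unicode form."""
    return domain.encode("ascii").decode("idna")


print(to_ace("dömäin.example"))              # xn--dmin-moa0i.example
print(to_unicode("xn--dmin-moa0i.example"))  # dömäin.example
print(to_ace("example.com"))                 # plain ASCII names pass through unchanged
```

Because the result is plain ASCII, it can be handed to any resolver that knows nothing about IDNs.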
In the original IDNA2003 (RFC 3490), domain names were first normalized using Nameprep. Normalization consisted of replacing all uppercase letters with lowercase letters and mapping equivalent characters to a common form. For example, “ß” was defined as equivalent to “ss”, so the domain names “STRaße” and “strasse” were identical. In the newer IDNA2008, sometimes also called IDNAbis and developed from 2008 to 2010 (RFC 5890, RFC 5891, RFC 5892, RFC 5893, RFC 5894), normalization is no longer part of IDNA but is the responsibility of the user interface. IDNA2008 does not prescribe a normalization, but it recommends a general algorithm that still includes the conversion from uppercase to lowercase letters and a few other rules. Under .de it has been possible since November 16, 2010 (even earlier for holders of the corresponding “ss” domain) to register separate domains containing “ß”.
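The IDNA2003 normalization step can be tried out with the `nameprep` function that Python ships in `encodings.idna`. The sketch below assumes that this implementation is representative of Nameprep; the helper `same_domain_idna2003` is a hypothetical name used only here.

```python
from encodings.idna import nameprep


def same_domain_idna2003(a: str, b: str) -> bool:
    """Return True if two labels normalize to the same name under Nameprep."""
    return nameprep(a) == nameprep(b)


print(nameprep("STRaße"))                         # strasse
print(same_domain_idna2003("STRaße", "strasse"))  # True
```

Under IDNA2008 this mapping is not applied by the protocol itself, which is why “ß” can be registered as a separate name.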
Following normalization, Punycode removes the non-ASCII characters from the name and appends an ASCII string derived from them, which encodes the position and identity of each Unicode character. To distinguish an IDN from an ASCII domain name, the encoded label begins with the prefix xn--. This unusual string was chosen because it practically never occurs in real words or proper names, so conflicts with ASCII domains are extremely unlikely.
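How the appended string encodes the position of a character can be seen by moving a single “ä” through an otherwise identical label. The following minimal sketch uses Python's built-in `punycode` codec and adds the prefix by hand; `encode_label` is an illustrative helper, not part of any standard API, and it skips normalization entirely.

```python
ACE_PREFIX = "xn--"


def encode_label(label: str) -> str:
    """Punycode-encode a single label; pure ASCII labels stay untouched."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


# The ASCII part ("aaa") stays in front, the suffix changes with the position of "ä".
for label in ["äaaa", "aäaa", "aaäa", "aaaä"]:
    print(label, "->", encode_label(label))
# äaaa -> xn--aaa-pla
# aäaa -> xn--aaa-qla
# aaäa -> xn--aaa-rla
# aaaä -> xn--aaa-sla
```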
Incompatibilities of IDNA2003 and IDNA2008
Unicode Technical Standard 46 describes measures intended to minimize the incompatibilities between IDNA2003 and IDNA2008 in practice and so ease the switch from IDNA2003 to IDNA2008. But even three years after its introduction, browser support for IDNA2008 is still poor (see also the section Support in the browser): because IDNA2003 converts “ß” to “ss”, the new “ß” domains are often unreachable or resolve to the existing “ss” domains. As long as the “ß” domain and the “ss” domain belong to the same site, the user usually notices nothing; if they belong to different sites, however, this sometimes leads to confusion.
In addition, IDNA2008 no longer allows about 8000 Unicode characters that were still valid components of domain names under IDNA2003, so previously valid domain names containing these characters become invalid when switching from IDNA2003 to IDNA2008.
- dömäin.example → xn--dmin-moa0i.example
- äaaa.example → xn--aaa-pla.example
- aäaa.example → xn--aaa-qla.example
- aaäa.example → xn--aaa-rla.example
- aaaä.example → xn--aaa-sla.example
- déjà.vu.example → xn--dj-kia8a.vu.example
- efraín.example → xn--efran-2sa.example
- ñandú.example → xn--and-6ma2c.example
- foo.âbcdéf.example → foo.xn--bcdf-9na9b.example
- موقع.وزارة-الاتصالات.مصر → xn--4gbrim.xn----ymcbaaajlc6dj7bxne2c.xn--wgbh1c
- ☃.example → xn--n3h.example (permitted under IDNA2003, but not allowed under IDNA2008)
- fußball.example → xn--fuball-cta.example (necessarily becomes fussball.example under IDNA2003, but not under IDNA2008)
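The “ß” incompatibility can be reproduced in a few lines. The sketch below contrasts Python's built-in `idna` codec, which follows IDNA2003, with a bare Punycode encoding that skips the mapping step. The second path is not a full IDNA2008 implementation; it only shows which ACE string results when “ß” is kept.

```python
label = "fußball"

# IDNA2003 path: Nameprep maps "ß" to "ss", so no Punycode is needed at all.
idna2003 = label.encode("idna").decode("ascii")

# Path without mapping (as for a separately registered "ß" domain):
# the label is Punycode-encoded as it is and prefixed with "xn--".
unmapped = "xn--" + label.encode("punycode").decode("ascii")

print(idna2003)  # fussball
print(unmapped)  # xn--fuball-cta
```

A client that still applies the IDNA2003 mapping therefore queries a different name than one that does not.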
A Whois query of the form
whois -h whois.denic.de -- -C ISO-8859-1 example.com
or
whois -h whois.denic.de -- -C UTF-8 example.com
on Unicode-based systems returns, among other things, the Punycode spelling of registered domains.
IDN top-level domains have existed since May 2010, making possible complete domain names composed entirely of non-Latin characters. For example, the top-level domain .مصر is the Arabic word for Egypt (Misr); the website of the Egyptian Ministry of Communication and Information Technology can be reached via a domain consisting entirely of Arabic characters:
موقع.وزارة-الاتصالات.مصر
As is usual for Arabic, this domain name is read from right to left.
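The reading direction affects only the display. In the stored (logical) order the top-level domain is still the last label, as the following sketch illustrates by decoding the ACE form of the domain above label by label. The helpers `decode_label` and `is_rtl_label` are illustrative names, and the right-to-left test is a simple check of the Unicode bidirectional classes, not the complete Bidi rule of IDNA.

```python
import unicodedata

ACE = "xn--4gbrim.xn----ymcbaaajlc6dj7bxne2c.xn--wgbh1c"


def decode_label(label: str) -> str:
    """Decode one ACE label back to Unicode; other labels are returned as is."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def is_rtl_label(label: str) -> bool:
    """True if the label contains right-to-left characters (classes R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in label)


labels = [decode_label(part) for part in ACE.split(".")]
for index, label in enumerate(labels):
    print(index, label, "right-to-left:", is_rtl_label(label))

# The last label in logical order is the top-level domain (the word for Egypt).
print(labels[-1] == "\u0645\u0635\u0631")
```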
Below is a list of some top-level domains and the non-ASCII characters permitted in their IDN domains:
- .com and .net
- à á â ã ä å æ ā ă ą ç ć ĉ ċ č ď đ è é ê ë ē ĕ ė ę ě ĝ ğ ġ ģ ĥ ħ ì í î ï ĩ ī ĭ į ı ð ĵ ñ ĸ ĺ ļ ľ ł ł ĸ ĺ ļ ń ņ ň ŋ ò ó ô õ ö ø ō ŏ ő œ ŕ ŗ ř ś ŝ ş š ţ ť ŧ ù ú û ü ũ ū ŭ ů ű ų ŵ ý ŷ ÿ ź ż ž þ
- á ä å æ ā ą ć č é ē ė ę ģ í ī į ð ķ ļ ł ñ ń ņ ó ö ø ō ő ŗ ś š ú ü ū ű ų ý ź ż ž þ
- ä ö ü
- à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø œ š ù ú û ü ý я ž þ
- .ch and .li
- à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø œ ù ú û ü ý ÿ þ
- à á â ã ä å æ ā ă ą ç ć ĉ ċ č ď đ è é ê ë ē ĕ ė ę ě ĝ ğ ġ ģ ĥ ħ ì í î ï ĩ ī ĭ į ı ð ĵ ñ ĸ ĺ ļ ľ ł ł ĸ ĺ ļ ń ņ ň ŋ ò ó ô õ ö ø ō ŏ ő œ ŕ ŗ ř ś ŝ ş š ţ ť ŧ ù ú û ü ũ ū ŭ ů ű ų ŵ ý ŷ ÿ ź ż ž þ ß
- à á â ã ä å æ ā ă ą ç ć ĉ ċ č ď đ è é ê ë ē ĕ ė ę ě ĝ ğ ġ ģ ĥ ħ ì í î ï ĩ ī ĭ į ı ð ĵ ñ ĺ ļ ľ ŀ ł ł ĺ ļ ľ ń ņ ň ŉ ŋ ò ó ô õ ö ø ō ŏ ő œ ŕ ŗ ř ś ŝ š ș ť ŧ ț ù ú û ü ũ ū ŭ ů ű ų ŵ ý ŷ ÿ ź ż ž þ ΐ ά έ ή ί ΰ α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ ς σ τ υ φ χ ψ ω ϊ ϋ ό ύ ώ а б в г д е ж з и й к л м у о п р с т т о п р с т т х ц ч ш щ ъ ы ь э ю я ἀ ἁ ἂ ἃ ἄ ἅ ἆ ἇ ἐ ἑ ἒ ἓ ἔ ἕ ἠ ἡ ἢ ἣ ἤ ἥ ἦ ἧ ἰ ἱ ἲ ἳ ἴ ἵ ἶ ἷ ὀ ὁ ὂ ὃ ὑ ὒ ὐ ὓ ὔ ὕ ὖ ὗ ὠ ὡ ὢ ὣ ὤ ὥ ὦ ὧ ὰ ά ὲ έ ὴ ή ὶ ί ί ὸ ό ὺ ύ ὼ ώ ᾀ ᾀ ᾂ ᾂ ᾃ ᾄ ᾅ ᾆ ᾇ ᾐ ᾑ ᾒ ᾓ ᾔ ᾕ ᾖ ᾗ ᾠ ᾡ ᾢ ᾣ ᾧ ᾰ ᾱ ᾲ ᾳ ᾴ ᾶ ᾷ ῂ ῃ ῄ ῆ ῇ ῐ ῑ ῒ ΐ ῖ ῗ ῠ ῡ ῢ ΰ ῤ ῥ ῥ ῦ ῧ ῲ ῳ ῴ ῶ ῷ
Support in the browser
Support for internationalized domain names is common in current browsers, at least according to IDNA2003. IDNA2008, by contrast, was still supported by hardly any browser in 2013.
Some IDNA2003-capable browsers:
- Firefox from version 0.8
- Konqueror from KDE 3.2 with GNU IDN library
- Internet Explorer from version 7.0
- Mozilla Application Suite from version 1.4
- Netscape Navigator from version 7.1
- Opera from version 7.11
- Safari from version 1.2 (v125)
- SeaMonkey from version 1.0
Some IDNA2008-capable browsers (as of December 2016):
- Firefox (since Firefox Nightly 46.0a1)
- Safari from version 10.1 (from Safari Technology Preview 19)
ASCII spoofing problem (→ homograph attack)
The use of Unicode in domain names makes it easier to spoof web pages, because depending on the character set used, the visual representation of the IDN string in a browser can make it impossible to distinguish a legitimate page from a spoofed one. For example, the Unicode character U+0430, the Cyrillic lowercase а, looks like the Unicode character U+0061, the lowercase letter a of the Latin writing system. This Cyrillic character is, for example, part of the list above of characters permitted within .eu.
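Although the two characters look alike, they are easy to tell apart programmatically. The following sketch uses Python's `unicodedata` module; the label `exаmple` with a Cyrillic а is a made-up test case, and deriving the script from the first word of the character name is a rough heuristic assumed for this example, not the policy of any particular browser.

```python
import unicodedata

LATIN_A = "\u0061"
CYRILLIC_A = "\u0430"

print(unicodedata.name(LATIN_A))     # LATIN SMALL LETTER A
print(unicodedata.name(CYRILLIC_A))  # CYRILLIC SMALL LETTER A


def scripts_in(label: str) -> set:
    """Rough heuristic: take the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_mixed(label: str) -> bool:
    """Flag labels whose letters come from more than one script."""
    return len(scripts_in(label)) > 1


genuine = "example"
spoofed = "ex\u0430mple"  # hypothetical label with a Cyrillic "а"

print(genuine == spoofed)                       # False
print(looks_mixed(genuine))                     # False
print(looks_mixed(spoofed))                     # True
print(spoofed.encode("idna").decode("ascii"))   # the ACE form begins with xn--
```

Displaying the ACE form instead of the Unicode form is one way a client can make such a substitution visible to the user.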