REGISTRATION RULES

The Verisign Shared Registration System (SRS) supports Internationalised Domain Names (IDNs) containing characters from various Unicode scripts.


Verisign has developed a policy for IDN registrations specifying permissible and prohibited code points. The policy is implemented using the following five validation rules. IDNs that adhere to these five rules are considered valid registrations.

1. IETF STANDARDS

The IDNA2008 specification defines rules and algorithms that permit or prohibit Unicode code points in IDN registrations. Verisign complies fully with all of the RFC documents that comprise the IDNA2008 standard. Please review the IETF Standards.

2. RESTRICTIONS ON SPECIFIC LANGUAGES

All IDN registrations require a 3-letter Language Tag; CHI, for instance, denotes the Chinese language. If the Language Tag associated with the registration appears in the following table, then Verisign has a List of Included Characters for that language, and every code point in the requested IDN must appear in that list. If even one code point from the IDN is not a valid character for the language, the registration is rejected.

The following table lists the languages that have an associated List of Included Characters.

LANGUAGE TAG    LANGUAGE
AZE             Azerbaijani
BEL             Belarusian
BUL             Bulgarian
CHI             Chinese
GRE             Greek
JPN             Japanese
KOR             Korean
KUR             Kurdish
MAC             Macedonian
MOL             Moldavian
POL             Polish
RUS             Russian
SCC             Serbian
SCR             Croatian
SRP             Serbian
UKR             Ukrainian
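
As a rough illustration of the language rule, the check amounts to a set-membership test over the code points of the requested IDN. The sketch below is not Verisign's implementation: the function name rejected_code_points and the INCLUDED_CHARACTERS table are hypothetical, and the single RUS entry is a placeholder range rather than the real List of Included Characters.

```python
# Hypothetical, truncated character list for illustration only.
# The real Lists of Included Characters are published by Verisign;
# this placeholder covers only U+0430..U+044F (basic lowercase Cyrillic).
INCLUDED_CHARACTERS = {
    "RUS": {chr(cp) for cp in range(0x0430, 0x0450)},
}


def rejected_code_points(idn, language_tag):
    """Return the code points that would cause rejection under the language rule.

    Returns None when the tag has no character list, meaning this rule
    does not apply to the registration.
    """
    allowed = INCLUDED_CHARACTERS.get(language_tag.upper())
    if allowed is None:
        return None
    return [f"U+{ord(ch):04X}" for ch in idn if ch not in allowed]


# All six letters are Cyrillic, so nothing is rejected.
print(rejected_code_points("\u043f\u0440\u0438\u043c\u0435\u0440", "RUS"))
# The fifth letter is replaced by a Latin "e", which is reported.
print(rejected_code_points("\u043f\u0440\u0438\u043ce\u0440", "RUS"))
# FRE has no list in this sketch, so the rule does not apply.
print(rejected_code_points("exemple", "FRE"))
```

Returning the offending code points, rather than a bare true/false, makes it easier to tell a registrant which character caused the rejection.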

3. RESTRICTIONS ON COMMINGLING OF SCRIPTS

If the Language Tag specified in the IDN registration is not in the above table and so does not have a List of Included Characters, then Verisign applies an alternative restriction to prevent commingling of different scripts in a single domain.

The Unicode Standard defines a set of Unicode Scripts by assigning each code point exactly one Unicode Script value. As a rule, Verisign’s registries reject the commingling of code points from different Unicode Scripts. That is, if an IDN contains code points from two or more Unicode Scripts, then that IDN registration is rejected. For example, a character from the Latin script cannot be used in the same IDN with any Cyrillic character. All code points within an IDN must come from the same Unicode Script. This prevents confusingly similar code points from different scripts from appearing in the same IDN.

Again, this rule only applies to Languages for which there is not a strictly defined List of Included Characters. For example, the FRE Language Tag, indicating the French language, does not have a strict List of Included Characters. Therefore, the commingling rule applies. All code points in a French domain must come from a single Script. However, that Script may be any of the valid Unicode-defined Scripts.
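
The commingling rule can be sketched as: look up the Script of each code point, then reject the IDN if more than one Script is present. The example below is a minimal sketch under stated assumptions, not the registry's code. SCRIPT_RANGES is a hand-written excerpt covering a few lowercase ranges only (a real implementation would load the full Scripts.txt file from the Unicode Character Database), and ignoring digits and the hyphen is an assumption, since the text does not say how those are treated.

```python
# Simplified excerpt of Unicode Script assignments (lowercase letters only).
# Assumption: a real implementation would load the complete Scripts.txt data.
SCRIPT_RANGES = [
    (0x0061, 0x007A, "Latin"),     # a to z
    (0x00E0, 0x00F6, "Latin"),     # a-grave to o-diaeresis
    (0x00F8, 0x00FF, "Latin"),     # o-stroke to y-diaeresis
    (0x03B1, 0x03C9, "Greek"),     # alpha to omega
    (0x0430, 0x044F, "Cyrillic"),  # Cyrillic a to ya
]

# Assumption: digits and the hyphen are skipped when comparing scripts.
IGNORED = set("0123456789-")


def script_of(ch):
    """Return the Script name for one character, or "Unknown" if not in the excerpt."""
    cp = ord(ch)
    for first, last, script in SCRIPT_RANGES:
        if first <= cp <= last:
            return script
    return "Unknown"


def scripts_used(idn):
    """Return the set of Scripts that appear in the IDN."""
    return {script_of(ch) for ch in idn if ch not in IGNORED}


def is_commingled(idn):
    """True when the IDN mixes code points from two or more Scripts."""
    return len(scripts_used(idn)) > 1


# "francais" with c-cedilla: every letter is Latin, so it is not commingled.
print(is_commingled("fran\u00e7ais"))
# "paypal" with the first "a" replaced by Cyrillic U+0430: two Scripts are present.
print(is_commingled("p\u0430ypal"))
```

In this sketch a code point missing from the excerpt is labelled "Unknown" and counts as its own Script, so it errs on the side of rejection.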

The following table lists the Unicode Scripts that have an associated table of allowed code points.

Unicode Scripts and Associated Code Points
Arabic                Georgian                Latin              Rejang
Armenian              Glagolitic              Lepcha             Runic
Avestan               Greek                   Limbu              Samaritan
Balinese              Gujarati                Lisu               Saurashtra
Bamum                 Gurmukhi                Lycian             Sinhala
Batak                 Han                     Lydian             Sundanese
Bengali               Hangul                  Malayalam          Syloti Nagri
Bopomofo              Hanunoo                 Mandaic            Syriac
Brahmi                Hebrew                  Meetei Mayek       Tagalog
Buginese              Hiragana                Mongolian          Tagbanwa
Buhid                 Imperial Aramaic        Myanmar            Tai Le
Canadian Aboriginal   Inscriptional Pahlavi   New Tai Lue        Tai Tham
Carian                Inscriptional Parthian  Nko                Tai Viet
Cham                  Javanese                Ogham              Tamil
Cherokee              Kaithi                  Ol Chiki           Telugu
Coptic                Kannada                 Old Persian        Thaana
Cuneiform             Katakana                Old South Arabian  Thai
Cyrillic              Kayah Li                Old Turkic         Tibetan
Devanagari            Kharoshthi              Oriya              Tifinagh
Egyptian Hieroglyphs  Khmer                   Phags Pa           Vai
Ethiopic              Lao                     Phoenician         Yi

A comprehensive list of all Unicode code points allowed for IDN registration is also available.

4. ICANN’S RESTRICTED UNICODE POINTS

The Verisign SRS also adheres to ICANN’s Guidelines for the Implementation of Internationalised Domain Names. Section 5 of that document outlines characters that are allowed by the IETF standard but should be prohibited for IDN registration. For this reason, the Verisign SRS prohibits those Unicode code points in all registrations. A complete list of ICANN’s restricted Unicode code points is also available.

5. SPECIAL CHARACTERS

There are exactly two (2) Unicode characters whose latest definitions are not backwards compatible with previous versions of the IDNA Standard. The Latin Sharp S and Greek Final Sigma were previously mapped to other characters. Clients and Registries compliant with the older standard would, for instance, map a Latin Sharp S into two lowercase Latin letter S characters. This mapping is irreversible: the original character cannot be recovered from the mapped form. The latest version of the IDNA standard does not apply this mapping. So, whereas the Latin Sharp S was previously prohibited in effect (it was mapped into other characters), the latest standard allows Registries to accept this character at their own discretion.

Since these changes are not backwards compatible, Verisign has elected to continue to disallow these two (2) characters, until a clear and fair approach to their registration has been reached and communicated.

CHARACTER                         UNICODE POINT    GLYPH
Latin Small Letter Sharp S        U+00DF           ß
Greek Small Letter Final Sigma    U+03C2           ς
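
To make the compatibility problem concrete, the sketch below flags the two code points from the table and shows the older mapping using Python's built-in "idna" codec, which implements the earlier IDNA2003 rules. The function names find_special_characters and legacy_ascii_form are hypothetical helpers introduced for illustration; this is not how the SRS itself is implemented.

```python
# The two code points from the table above that remain disallowed.
DISALLOWED_SPECIAL = {
    0x00DF: "Latin Small Letter Sharp S",
    0x03C2: "Greek Small Letter Final Sigma",
}


def find_special_characters(idn):
    """Return the names of any disallowed special characters found in the IDN."""
    return [DISALLOWED_SPECIAL[ord(ch)] for ch in idn if ord(ch) in DISALLOWED_SPECIAL]


def legacy_ascii_form(label):
    """Encode a label with Python's built-in IDNA2003 codec (the older mapping)."""
    return label.encode("idna")


# The sharp S is detected, so this label would be rejected under rule 5.
print(find_special_characters("stra\u00dfe"))
# Under the older standard the sharp S is silently mapped to "ss".
print(legacy_ascii_form("stra\u00dfe"))
```

Because the older mapping turns the sharp S form into plain "strasse", both spellings collapse to the same name under the older rules, which is why accepting U+00DF as a distinct character is not backwards compatible.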

