Top Level Domain Name Specification
Netnod Internet Exchange
Box 30194
SE-104 25 Stockholm
Sweden
liman@netnod.se
http://www.netnod.se/

ICANN
4676 Admiralty Way
Suite 330
Marina del Rey 90292
USA
joe.abley@icann.org
http://www.icann.org/

Individual Submission

The syntax for allowed Top-Level Domain (TLD) labels in
the Domain Name System (DNS) is not clearly applicable to
the encoding of Internationalised Domain Names (IDNs) as
TLDs.

This document provides a concise specification of TLD label
syntax based on existing syntax documentation, extended
minimally to accommodate IDNs.

This document updates RFC 1123.

The syntax of TLD labels ("TLD DNS-Labels", as defined later in
this document) is specified
in RFC 1123, where such labels are asserted
to be "alphabetic" within a section of that document entitled
"DISCUSSION". This can be interpreted as requiring
that the hyphen character ("-") and numeric digits be
excluded from TLD DNS-Labels. Such a restriction would not
accommodate the US-ASCII encoding of Internationalised
Domain Names (IDNs), as specified in the IDNA specifications. A more
detailed discussion of the existing specifications appears later in
this document.

This document extends the syntax of allowable TLD DNS-Labels
to support IDNs, but places some restrictions on the choice
of IDN labels. These restrictions are intended to be
consistent with the existing specification for US-ASCII TLD
DNS-Labels. See the definition of Restricted-A-Label below for the
updated specification.

This document focuses narrowly on the issue of allowable
DNS-Labels in TLDs and does not (and is not intended to)
make any other changes or clarifications to existing domain
name syntax rules.

Note that the specification in this
document is not the only factor in choosing suitable TLD
DNS-Labels, and that many considerations external to the
IETF are included in that wider policy. See the policy discussion
later in this document for more detail.

The term DNS-Label is used in this document with precisely
the same meaning as the term "label", as introduced in RFC 1034,
section 3.1. A DNS-Label denotes one
node in a DNS tree. A DNS-Label is zero to 63 octets in
length. The term "DNS-Label" refers exclusively to the
"wire format" of the label, and not to any presentation
format of the label.

A Top-Level Domain (TLD) DNS-Label is the right-most
("highest-level") DNS-Label in a fully-qualified domain
name.

The terms A-Label and U-Label are used in this document
as defined in RFC 5890.

RFC 952 defines a host name as follows:
'A "name" ... is a text string up to 24 characters
drawn from the alphabet (A-Z), digits (0-9), minus sign
(-), and period (.). Note that periods are only allowed
when they serve to delimit components of "domain style
names". (See RFC-921, "Domain Name System Implementation
Schedule", for background). No blank or space characters
are permitted as part of a name. No distinction is
made between upper and lower case. The first character
must be an alpha character. The last character must
not be a minus sign or period.' [Unnumbered section
titled "ASSUMPTIONS", first paragraph]

RFC 1123 reaffirms this definition, but
makes one change to the syntax:
'The syntax of a legal Internet host name was specified
in RFC-952 [DNS:4]. One aspect of host name syntax is
hereby changed: the restriction on the first character
is relaxed to allow either a letter or a digit. Host
software MUST support this more liberal syntax.' [Section
2.1]
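The two quoted rules can be compared with a small sketch. This is an illustration only, not part of either specification: the function and constant names are hypothetical, the first-character and last-character rules are applied to each dot-separated component (an assumption about how the quoted text is read), and no length limit is applied for the relaxed syntax because the quoted text does not give one.

```python
import re

# Per-label patterns. Applying the first/last-character rules to every
# label, rather than only to the whole name, is an assumption of this sketch.
RFC952_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
RFC1123_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")

# From the quoted text: "a text string up to 24 characters".
RFC952_MAX_NAME_LENGTH = 24


def _all_labels_match(name, pattern):
    """Return True if every dot-separated component fully matches pattern."""
    return all(pattern.fullmatch(label) is not None for label in name.split("."))


def is_rfc952_name(name):
    """Original syntax: each component must start with a letter."""
    return len(name) <= RFC952_MAX_NAME_LENGTH and _all_labels_match(name, RFC952_LABEL)


def is_rfc1123_name(name):
    """Relaxed syntax: each component may start with a letter or a digit."""
    return _all_labels_match(name, RFC1123_LABEL)
```

Under these assumptions a name such as "3com.example" is rejected by the first function and accepted by the second, which is exactly the relaxation described in the quoted Section 2.1 text.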
In addition, the DISCUSSION section of Section 2.1 says:
'However, a valid host name can never have the
dotted-decimal form #.#.#.#, since at least the
highest-level component label will be alphabetic.'
[Section 2.1]
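A strict reading of "will be alphabetic" might be implemented as in the following sketch. It is a hypothetical illustration of the kind of check that deployed software may perform, not a recommended test; the function names are invented for this example, and names are assumed to be given in presentation format with an optional trailing dot.

```python
def tld_label(name: str) -> str:
    """Return the right-most label of a domain name in presentation format."""
    if name.endswith("."):
        name = name[:-1]  # tolerate the trailing dot of a fully-qualified name
    return name.rsplit(".", 1)[-1]


def tld_is_alphabetic(name: str) -> bool:
    """Strict reading: the highest-level label consists only of A-Z / a-z."""
    label = tld_label(name)
    return label.isascii() and label.isalpha()
```

With this check a dotted-decimal string such as "192.0.2.1" is never mistaken for a host name, but any top-level label containing a hyphen or a digit is rejected as well.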
Some implementers may have understood the above phrase 'will
be alphabetic' to be a protocol restriction.

Neither RFC 952 nor RFC 1123
explicitly states the reasons for these restrictions. It
might be supposed that human factors were a consideration;
RFC 1123 appears to suggest that one of the
reasons was to prevent confusion between dotted-decimal
IPv4 addresses and host domain names. In any case, it is
reasonable to believe that the restrictions have been assumed
in some deployed software, and that changes to the rules should
be undertaken with caution.

The Internationalised Domain Names in Applications 2008
specification (IDNA2008)
provides a protocol for encoding Unicode strings in DNS-Labels.
The Unicode string used by applications is known as a
U-Label; its corresponding encoding in the DNS is known as
an A-Label. The terms A-Label and U-Label are used in this
document as defined in RFC 5890.
Valid A-Labels always contain non-alphabetic characters, because every
A-Label begins with the prefix "xn--".

In order to accommodate the wish to express TLD names in
scripts other than Latin (or rather, the US-ASCII subset
of Latin), it is necessary to allow non-alphabetic characters
in the corresponding TLD DNS-Labels. To minimize changes, the
U-Label form of a TLD name is restricted in ways functionally
compatible with the restrictions (from RFC 952
and RFC 1123) on US-ASCII TLD names, by
applying rules analogous to those already imposed on US-ASCII
TLD DNS-Labels to TLD U-Labels.

However, deployed software that checks DNS top-level labels
for conformance with an alphabetic restriction will not
accept the corresponding A-Labels (i.e., U-Labels
represented in their US-ASCII form).

This document relaxes the existing specification to allow
TLD DNS-Labels to be well-formed A-Labels, but places
restrictions on their corresponding U-Labels. That is, not
every well-formed A-Label is a valid TLD DNS-Label.

A Restricted-A-Label is a DNS-Label which satisfies all
the following conditions:

1.  the DNS-Label is a valid A-Label according to RFC 5890;

2.  the derived property value of all code points in the
    corresponding U-Label, as defined by RFC 5892, is PVALID;

3.  the general category of all code points in the corresponding
    U-Label is one of { Ll, Lo, Lm, Mn, Mc }.
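The general-category condition can be sketched as follows. This is a partial illustration only: it decodes the label with Python's built-in "punycode" codec and checks Unicode general categories, but it does not establish that the label is a valid A-Label, and it does not consult the PVALID derived property tables, so passing this check is necessary but not sufficient. The function names are hypothetical.

```python
import unicodedata

# General categories permitted by the last condition above.
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc"}


def u_label_from_a_label(a_label: str) -> str:
    """Decode "xn--" + Punycode to Unicode (no IDNA2008 validation)."""
    lowered = a_label.lower()
    if not lowered.startswith("xn--"):
        raise ValueError("not an A-Label: missing 'xn--' prefix")
    return lowered[4:].encode("ascii").decode("punycode")


def has_allowed_categories(u_label: str) -> bool:
    """True if the label is non-empty and every code point is in an allowed category."""
    return bool(u_label) and all(
        unicodedata.category(ch) in ALLOWED_CATEGORIES for ch in u_label
    )


def passes_category_condition(a_label: str) -> bool:
    """Apply only the general-category condition to a candidate A-Label."""
    try:
        u_label = u_label_from_a_label(a_label)
    except (ValueError, UnicodeError):
        return False
    return has_allowed_categories(u_label)
```

Because decimal digits have general category Nd, a U-Label containing any digit fails this check, while combining marks in categories Mn and Mc are accepted.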
This new specification reflects current practice in
registration of TLD names by the IANA, extended to accommodate
IDNs.

This document provides a technical specification that
limits the set of TLD DNS-Labels that are available for
assignment; it does not aim to encapsulate the full policy
framework within which TLD names are chosen.

At the time of writing, the policy under which TLD names
are chosen is developed and maintained by ICANN in consultation
with a wide base of stakeholders. As the Internet continues
to grow to serve new user communities, applications and
services, it is to be expected that the corresponding policy
will be changed accordingly.

While this document makes no requests of the IANA, management
of the root zone is an IANA function. This document expands
the set of strings permitted for delegation from the root
zone, and hence establishes new limits for the corresponding
IANA policy.

This document is believed to have limited security
implications.

General discussion about the security effects of
internationalized labels can be found in RFC 5890, section 4. Those
considerations apply equally to TLD labels.

The creation of new TLDs has the potential to conflict
with software which (for example) predates, and correspondingly
does not accommodate, new TLD names. Such software problems
might in turn lead to security vulnerabilities, e.g. in the
case where a DNS name specified by a user is truncated or
otherwise misinterpreted, causing an application to interact
with a different remote host from that which the user
intended. It should be noted that this is not a new
phenomenon, and has been observed following the creation
of new (US-ASCII) TLD names prior to the publication of
this document.

The issue that some Unicode characters can be confused
with each other is discussed at length in the Security
Considerations section of RFC 5890.

Tina Dam, Patrik Faltstrom, John Klensin, Thomas Narten
and Andrew Sullivan contributed text to this document, and
their contributions are hereby acknowledged.

RFC 1123  "Requirements for Internet Hosts - Application and
          Support", University of Southern California (USC),
          Information Sciences Institute.

RFC 5890  "Internationalized Domain Names for Applications (IDNA):
          Definitions and Document Framework".

RFC 5891  "Internationalized Domain Names in Applications (IDNA):
          Protocol".

RFC 5892  "The Unicode Code Points and Internationalized Domain
          Names for Applications (IDNA)".

RFC 952   "DoD Internet host table specification", SRI International.

RFC 1034  "Domain names - concepts and facilities", Information
          Sciences Institute (ISI).

This section (and sub-sections) should be removed before
publication.

Add Mc as an allowable code-point, required for names
in Devanagari script.

New affiliation and address for Liman, due to company
merger.

Removed subjective and unverified statements regarding
deployed software. Replaced with more generic text. Polishing
a few expressions to make them less obtrusive. Removed
confusing paragraph after ABNF table. Updated some references
that are now published as RFCs.

More wordsmithing, and explanatory text. Work on the IANA
and the security considerations sections.

Wordsmithing and rearrangement of text following discussions
with Joe Abley, Tina Dam, Thomas Narten and Andrew Sullivan.
Incorporated revised ABNF and associated specification
from Patrik Faltstrom. Tightened definitions and introduced
the term "DNS-Label" to avoid ambiguity with various other
uses of the word "label".

Substantial comments and improvements supplied by Thomas
Narten and John Klensin. Decided to go for a minimal
change approach. Also noted that U-labels have to be
letters due to jumping digit problem. Rewritten major
parts.

First cut. Prompted by Olafur Gudmundsson and Tina Dam.
$Id: draft-liman-tld-names.xml,v 1.40 2011/04/12 08:20:42 liman Exp $