LTRU D. Ewell, Ed.
Internet-Draft Consultant
Intended status: Informational September 28, 2006
Expires: April 1, 2007
Update to the Language Subtag Registry
draft-ietf-ltru-4645bis-00
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 1, 2007.
Copyright Notice
Copyright (C) The Internet Society (2006).
Ewell Expires April 1, 2007 [Page 1]

Internet-Draft Update to the Language Subtag Registry September 2006
1. Introduction
[RFC4646] provided for a Language Subtag Registry and described its
format. The initial contents of the Registry and rules for
determining them were specified in [RFC4645].
[draft-4646bis] expands on [RFC4646] by adding support for almost
7,200 primary and extended language subtags based on [ISO639-3]
alpha-3 code elements. This memo describes the process of updating
the Registry to include these additional subtags, and to make
secondary changes to the Registry that result from adding the new
subtags.
In its initial phase as an Internet-Draft, this memo also contained
the complete updated contents of the Language Subtag Registry. The
purpose was to deliver a complete, revised Registry to the Internet
Assigned Numbers Authority (IANA), replacing the previous version.
This content was deleted from this memo prior to publication as an
RFC.
The format of the Language Subtag Registry, and the definition and
intended purpose of each of the fields, are described in
[draft-4646bis].
The Registry is expected to change over time, as new subtags are
registered and existing subtags are modified or deprecated. The
process of updating the Registry is described in Section 3 of
[draft-4646bis]. In its Internet-Draft phase, this memo did not
define the permanent contents of the Registry and should not be
represented as doing so.
Many of the subtags defined in the Language Subtag Registry are based
on code elements defined in [ISO639-1], [ISO639-2], [ISO639-3],
[ISO3166-1], [ISO15924], and [UN_M.49]. The Registry is not a mirror
of the code lists defined by the standards and should not be used as
one.
2. Updating the Registry
This section describes the process for determining the updated
contents of the Language Subtag Registry.
2.1. Starting Point
The version of the Language Subtag Registry that was current at the
time of IESG approval of this memo served as the starting point for
this update. The process of creating that version was described in
[RFC4645].
The source data for [ISO639-3] used for this update consisted of two
files, [iso-dis-639-3_20060421] and
[iso-dis-639-3-macrolanguages_20060922], available from the official
site of the ISO 639-3 Registration Authority. [RFC EDITOR NOTE: this
information is expected to be updated before approval of this memo.]
o [iso-dis-639-3_20060421] is a list of all language code elements
in [ISO639-3], including the alpha-3 code element and inverted
name for each code element. For example, the entry for
Northwestern Kolami contained the code element "kfb" and the name
"Kolami, Northwestern" (among other things).
o [iso-dis-639-3-macrolanguages_20060922] is a list of all alpha-3
code elements for languages that are contained within a
macrolanguage in [ISO639-3], together with the alpha-3 code
element for the macrolanguage. For example, a line containing the
code elements "mon" and "khk" indicated that the macrolanguage
"Mongolian" includes the individual language "Mongolian, Halh".
(Note that these alpha-3 code elements may not have corresponded
directly to subtags in the Registry, which uses 2-letter subtags
derived from [ISO639-1] when possible.)
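As an illustration only, the macrolanguage mappings could be loaded
into a lookup table along the following lines. This sketch assumes,
without confirmation from the source file itself, that each data line
holds the macrolanguage code element followed by the member code
element, separated by a tab. The function name
load_macrolanguage_map is hypothetical.

```python
def load_macrolanguage_map(lines):
    """Map each member code element to its macrolanguage code element.

    Assumption: each data line is "<macrolanguage>TAB<member>".
    Lines that do not fit that shape (headers, blanks) are skipped.
    """
    mapping = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue
        macro, member = fields[0].strip(), fields[1].strip()
        # Keep only pairs that look like alpha-3 code elements.
        if len(macro) == 3 and len(member) == 3 \
                and macro.isalpha() and member.isalpha():
            mapping[member] = macro
    return mapping


# The "mon"/"khk" pair is the example given in the text above.
example = load_macrolanguage_map(["mon\tkhk\n"])
print(example)
```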
The value of the File-Date field and the Added date for each new
subtag record were set to a date near the date of IESG approval of
this memo.
2.2. New Subtags
For each language in [ISO639-3] that was not already represented by a
primary language subtag in the Language Subtag Registry, a new subtag
was added to the Registry, using the [ISO639-3] code element as the
value for the Subtag field and the inverted name as the value for the
Description field. The following rules were used to determine
whether to add a primary or extended language subtag:
o If the language was included within a macrolanguage, an extended
language subtag was added, with the primary language subtag of the
macrolanguage as the value for the Prefix field.
o If the name of the language included the words "Sign Language", an
extended language subtag was added, with the string "sgn" as the
value for the Prefix field. This is a special case that treats
the existing primary language subtag for "Sign Languages" as if it
were a macrolanguage encompassing all sign languages. (Note that
"sgn" is defined as a "collection code" by [ISO639-3] and hence
not included in that standard.)
o Otherwise, a primary language subtag was added.
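The three rules above can be sketched as a small decision function.
This is a minimal illustration, not the tooling actually used for the
update. The names classify_new_subtag, macrolanguage_of, and
primary_subtag_for are hypothetical; primary_subtag_for stands in for
the mapping from an alpha-3 macrolanguage code element to the subtag
the Registry actually uses for it (for example, a 2-letter subtag
derived from [ISO639-1]).

```python
def classify_new_subtag(code, name, macrolanguage_of, primary_subtag_for):
    """Return (type, prefix) for a new ISO 639-3 code element.

    macrolanguage_of:   member alpha-3 code -> macrolanguage alpha-3 code
    primary_subtag_for: alpha-3 code -> primary subtag used in the
                        Registry (falls back to the alpha-3 code itself)
    """
    # Rule 1: the language is included within a macrolanguage.
    if code in macrolanguage_of:
        macro = macrolanguage_of[code]
        return ("extlang", primary_subtag_for.get(macro, macro))
    # Rule 2: sign languages are treated as if "sgn" were a macrolanguage.
    if "Sign Language" in name:
        return ("extlang", "sgn")
    # Rule 3: everything else becomes a primary language subtag.
    return ("language", None)


print(classify_new_subtag("khk", "Mongolian, Halh",
                          {"khk": "mon"}, {"mon": "mn"}))
```

The rules are tested in the order in which they are listed, so a
language that belongs to a macrolanguage takes that macrolanguage as
its prefix before the "Sign Language" check is considered.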
As a special case, the code elements "diq" (Dimli) and "kiu"
(Kirmanjki) were added as extended language subtags, with "zza" as
the value of their Prefix field, even though [iso-dis-639-3_20060421]
did not include an entry for code element "zza" (because it was
released before "zza" was added to [ISO639-2]).
All subtags were added to the Registry maintaining alphabetical order
within each type of subtag: all "language" subtags first, followed by
all "extlang" subtags. No existing records were moved.
For consistency with naming changes that were reported to have been
implemented in [ISO639-3], but were not yet present in
[iso-dis-639-3_20060421], the following additional transformations
were applied to language names before applying them as Description
fields:
o If the name listed in [ISO639-3] included the substring
"(generic)", the substring and preceding space were deleted.
o If the name included the substring "(specific)", the substring was
replaced by the string "(macrolanguage)". The sole exception to
this rule was "Zande (specific)", which was left unchanged because
a primary language subtag for "Zande" already existed and neither
was a macrolanguage.
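The two name transformations, together with the "Zande (specific)"
exception, might be expressed as follows. This is a sketch under the
assumption that the marker substrings appear exactly as quoted above;
the function name transform_name and the placeholder name "Example"
are hypothetical and are not taken from the source data.

```python
def transform_name(name):
    """Apply the Section 2.2 naming transformations to an ISO 639-3 name."""
    # Sole exception: this name was left unchanged.
    if name == "Zande (specific)":
        return name
    # Delete "(generic)" together with the space that precedes it.
    name = name.replace(" (generic)", "")
    # Replace "(specific)" with "(macrolanguage)".
    name = name.replace("(specific)", "(macrolanguage)")
    return name


for sample in ("Example (generic)", "Example (specific)", "Zande (specific)"):
    print(sample, "->", transform_name(sample))
```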
2.3. Modified Subtags
For each language in [ISO639-3] that was already represented by a
primary language subtag in the Language Subtag Registry, but whose
name in [iso-dis-639-3_20060421] did not exactly match the value for
one of the Description fields of that subtag, the name from
[iso-dis-639-3_20060421] was added as a new Description field after
the existing Description fields. This principle was followed even
when the only difference between the names was that one was inverted
and the other was not, or that one contained a country name or other
string in parentheses (to distinguish it from another language of the
same name) and the other did not.
Names from [iso-dis-639-3_20060421] were transformed as listed in
Section 2.2. If the resulting name after the transformation exactly
matched an existing Description field, it was not added.
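The rule for modified subtags amounts to an exact-match test followed
by an append. A minimal sketch, assuming the name has already been
transformed as described in Section 2.2 and that the Description
fields of a record are held as a list of strings (the function name
merge_description is hypothetical):

```python
def merge_description(existing, new_name):
    """Return the Description list after considering new_name.

    The name is appended only when it does not exactly match an
    existing Description; existing values are never changed or removed.
    """
    if new_name in existing:
        return list(existing)
    return list(existing) + [new_name]


# An inverted and a non-inverted form are not an exact match,
# so both are kept.
print(merge_description(["Northwestern Kolami"], "Kolami, Northwestern"))
```

Comparison is deliberately exact, so names that differ only in
inversion or in a parenthesized qualifier are both retained.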
No existing Description fields were changed or deleted. As a special
case, the capitalization of the Tag field for redundant tag
"yi-latn" was changed to "yi-Latn", for consistency with the
capitalization conventions described in Section 2.1 of
[draft-4646bis].
2.4. Grandfathered and Redundant Tags
As stated in [draft-4646bis], "grandfathered" and "redundant" tags
are complete tags in the Language Subtag Registry that were
registered under [RFC1766] or [RFC3066] and remain valid.
Grandfathered tags cannot be generated from a combination of valid
subtags, while "redundant" tags can.
Under certain conditions, registration of a subtag under
[draft-4646bis] may cause a grandfathered tag to be reclassified as
redundant. It may also enable the creation of a generative tag with
the same meaning as a grandfathered or redundant tag; in that case,
the grandfathered or redundant tag is marked as Deprecated, and the
generative tag (including the new subtag) becomes its Preferred-
Value.
As a result of adding the new subtags in this update, the following
grandfathered tags became composable and were reclassified as
redundant:
zh-cmn
zh-cmn-Hans
zh-cmn-Hant
zh-gan
zh-wuu
zh-yue
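The notion of a tag becoming composable can be illustrated with a
deliberately simplified check. The sketch below handles only tags of
the form language, optional extended language, optional script; it
ignores regions, variants, and every other part of the full syntax in
[draft-4646bis]. The function name is_composable and the in-memory
shape of the Registry data are assumptions made for this example.

```python
def is_composable(tag, languages, extlangs, scripts):
    """Check whether tag can be built from registered subtags.

    languages: set of primary language subtags (lowercase)
    extlangs:  dict of extended language subtag -> required Prefix
    scripts:   set of script subtags (title case, e.g. "Hans")
    Simplified: language [-extlang] [-script] only.
    """
    parts = tag.split("-")
    primary = parts[0].lower()
    if primary not in languages:
        return False
    rest = parts[1:]
    if rest and rest[0].lower() in extlangs:
        # An extlang is usable only with its registered Prefix.
        if extlangs[rest[0].lower()] != primary:
            return False
        rest = rest[1:]
    if rest and rest[0].title() in scripts:
        rest = rest[1:]
    # Anything left over could not be matched to a registered subtag.
    return not rest


languages = {"zh"}
scripts = {"Hans", "Hant"}
before = {}                      # no extlang subtags registered yet
after = {"cmn": "zh", "gan": "zh", "wuu": "zh", "yue": "zh"}
print(is_composable("zh-cmn-Hans", languages, before, scripts))
print(is_composable("zh-cmn-Hans", languages, after, scripts))
```

With no extended language subtags registered, "zh-cmn-Hans" cannot be
composed; once "cmn" is registered with Prefix "zh", it can.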
The following grandfathered tags were deprecated, with the indicated
generative tag serving as the Preferred-Value:
i-ami (Preferred-Value: ami)
i-bnn (Preferred-Value: bnn)
i-pwn (Preferred-Value: pwn)
i-tao (Preferred-Value: tao)
i-tay (Preferred-Value: tay)
i-tsu (Preferred-Value: tsu)
sgn-CH-de (Preferred-Value: sgn-sgg)
zh-hakka (Preferred-Value: zh-hak)
zh-min (no Preferred-Value; see below)
zh-min-nan (Preferred-Value: zh-nan)
zh-xiang (Preferred-Value: zh-hsn)
The tag "zh-min" is a special case: it represents a small class of
languages, but is not a true macrolanguage. It could never become a
generative tag, because the [ISO639-3] code element "min" is assigned
to an individual language (Minangkabau) that is not related to
Chinese ("zh"). Because "zh-min" does not represent a useful
linguistic distinction for tagging purposes, it was deprecated
without a Preferred-Value.
The following redundant sign-language tags were deprecated, with the
indicated generative tag serving as the Preferred-Value:
sgn-BR (Preferred-Value: sgn-bzs)
sgn-CO (Preferred-Value: sgn-csn)
sgn-DE (Preferred-Value: sgn-gsg)
sgn-DK (Preferred-Value: sgn-dsl)
sgn-ES (Preferred-Value: sgn-ssp)
sgn-FR (Preferred-Value: sgn-fsl)
sgn-GB (Preferred-Value: sgn-bfi)
sgn-GR (Preferred-Value: sgn-gss)
sgn-IE (Preferred-Value: sgn-isg)
sgn-IT (Preferred-Value: sgn-ise)
sgn-JP (Preferred-Value: sgn-jsl)
sgn-MX (Preferred-Value: sgn-mfs)
sgn-NI (Preferred-Value: sgn-ncs)
sgn-NL (Preferred-Value: sgn-dse)
sgn-NO (Preferred-Value: sgn-nsl)
sgn-PT (Preferred-Value: sgn-psr)
sgn-SE (Preferred-Value: sgn-swl)
sgn-US (Preferred-Value: sgn-ase)
sgn-ZA (Preferred-Value: sgn-sfs)
A Comments field was added to each of the deprecated grandfathered
and redundant tags, with a value in the form "replaced by ISO code
xxx", where "xxx" represents the new primary or extended language
subtag (derived from [ISO639-3]) that caused the tag to become
deprecated. A similar comment was exceptionally added to the entry
for the grandfathered tag "zh-guoyu", which was already deprecated.
These comments were added for consistency with other grandfathered
tags in the Registry.
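The effect of deprecating a grandfathered or redundant tag on its
record can be sketched as follows. The record is modeled as a plain
dictionary of field names to values, which is an assumption of this
example, as are the function name deprecate and the date used; the
Comments wording follows the form quoted above.

```python
def deprecate(record, iso_code, date, preferred_value=None):
    """Return a copy of record marked as deprecated.

    iso_code:        the new ISO 639-3-derived subtag that caused the
                     deprecation
    date:            value for the Deprecated field (YYYY-MM-DD)
    preferred_value: generative tag replacing this one, if any
    """
    updated = dict(record)          # leave the original record untouched
    updated["Deprecated"] = date
    if preferred_value is not None:
        updated["Preferred-Value"] = preferred_value
    updated["Comments"] = "replaced by ISO code " + iso_code
    return updated


sgn_us = {
    "Type": "redundant",
    "Tag": "sgn-US",
    "Description": "American Sign Language",
}
# The date here is illustrative only.
print(deprecate(sgn_us, "ase", "2007-01-01", preferred_value="sgn-ase"))
```

The Description field is carried over unchanged, matching the
statement that no Description was altered for these tags.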
No change was made to the Description field(s) for any of the
grandfathered or redundant tags. For example, the redundant tag
"sgn-US" continues to carry the Description "American Sign Language".
The sign language tags registered prior to [RFC4646] remain an
exception to the general principle that the meaning of a non-
grandfathered tag can be derived from its component subtags.
3. Updated Registry Contents
The remainder of this section specified the updated set of records
for the Language Subtag Registry. This material was deleted before
publication of this memo, to avoid any potential confusion with the
Registry itself. The IANA Language Subtag Registry can be found at
<http://www.iana.org/numbers.html> under "Language Tags".
[RFC EDITOR NOTE: the remainder of this section is to be deleted upon
publication.]
The updated contents of the Language Subtag Registry follow. This
data is intended as a complete replacement for the current contents
of the Registry. The Registry begins with the line that starts with
the string "File-Date" and continues to the end of this section.
Headers, footers, line breaks, and other vertical whitespace
introduced by the RFC process are not significant. Leading
horizontal whitespace relative to the "File-Date" line indicates a
continued line in the record-jar format, and must not be deleted.
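For readers who wish to process the data, the conventions just
described (records separated by "%%" lines, and continuation lines
marked by leading whitespace) might be handled by a reader such as
the one sketched below. It is a minimal illustration rather than a
complete record-jar implementation; the function name parse_registry
is hypothetical, and field values are kept in lists on the assumption
that a field such as Description may occur more than once in a
record.

```python
def parse_registry(text):
    """Split record-jar text into a list of {field: [values]} records."""
    records = []
    current = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue                        # vertical whitespace is ignored
        if line.strip() == "%%":
            records.append(current)         # record separator
            current, last_key = {}, None
        elif line[0] in " \t" and last_key is not None:
            # Leading whitespace marks a continuation of the last field.
            current[last_key][-1] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            current.setdefault(key, []).append(value.strip())
            last_key = key
    if current:
        records.append(current)
    return records


sample = "\n".join([
    "File-Date: 2007-01-01", "%%",
    "Type: language", "Subtag: aa", "Description: Afar",
    "Added: 2005-10-16",
])
for record in parse_registry(sample):
    print(record)
```

The first record returned holds only the File-Date field; each
subsequent record corresponds to one subtag or tag.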
File-Date: 2007-01-01
%%
Type: language
Subtag: aa
Description: Afar
Added: 2005-10-16
%%
Type: language
Subtag: ab
Description: Abkhazian
Added: 2005-10-16
Suppress-Script: Cyrl
%%
Type: language
Subtag: ae
Description: Avestan
Added: 2005-10-16
%%
Type: language
Subtag: af
Description: Afrikaans
Added: 2005-10-16
Suppress-Script: Latn
%%
Type: language
Subtag: ak
Description: Akan
Added: 2005-10-16
4. Security Considerations
This document specifies the complete updated contents to be used by
IANA in populating the Language Subtag Registry. For security
considerations relevant to the Registry and the use of language tags,
see [draft-4646bis].
5. IANA Considerations
In its initial phase as an Internet-Draft, this memo contained the
complete updated contents of the Language Subtag Registry. As an
RFC, it contains a pointer to the updated content for the Registry,
which is maintained by IANA. The Language Subtag Registry can be
found at <http://www.iana.org/numbers.html> under "Language Tags".
For details on the procedures for the format and ongoing maintenance
of this Registry, see [draft-4646bis].
6. References
6.1. Normative References
[ISO639-3]
International Organization for Standardization, "ISO/FDIS
639-3:2006. Codes for the representation of names of
languages -- Part 3: Alpha-3 code for comprehensive
coverage of languages, first edition", 2006.
[draft-4646bis]
Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
Languages", September 2006.
[iso-dis-639-3-macrolanguages_20060922]
International Organization for Standardization, "ISO/DIS
639-3 macrolanguage mappings", September 2006,
<http://www.sil.org/iso639-3/iso-dis-639-3-macrolanguages_20060922.tab>.
[iso-dis-639-3_20060421]
International Organization for Standardization, "ISO/DIS
639-3 code set", April 2006,
<http://www.sil.org/iso639-3/iso-dis-639-3_20060421.tab>.
6.2. Informative References
[ISO15924]
International Organization for Standardization, "ISO
15924:2004. Information and documentation -- Codes for
the representation of names of scripts", January 2004.
[ISO3166-1]
International Organization for Standardization, "ISO 3166:
1988. Codes for the representation of names of countries,
3rd edition", August 1988.
[ISO639-1]
International Organization for Standardization, "ISO 639-
1:2002. Codes for the representation of names of
languages -- Part 1: Alpha-2 code", 2002.
[ISO639-2]
International Organization for Standardization, "ISO 639-
2:1998. Codes for the representation of names of
languages -- Part 2: Alpha-3 code, first edition", 1998.
[RFC1766] Alvestrand, H., "Tags for the Identification of
Full Copyright Statement
Copyright (C) The Internet Society (2006).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).