L2/01-161
From: Kenneth Whistler [kenw@sybase.com]
Sent: Monday, April 09, 2001 8:52 PM
Subject: WG2 #40 Meeting Summary (Mountain View)
Unicordates,
WG2 met last week in Mountain View. You can go dig out the
resolutions (WG2 N2354) and eventually the minutes yourself,
but, as usual, I'd like to send around a short--well, kind of
long, actually--summary report, emphasizing the issues of relevance
to the UTC.
1. FDIS 10646-2 (see Resolution M40.3)
This is approved now, and on its way to publication as an IS.
The editor (Michel) accommodated a few minor editorial comments.
The main issue that was discussed was the status of the Extension B
font, needed both for publication of 10646-2 and for the final
chart publication of Extension B for Unicode 3.1.
China reported that the FDIS font was created by 3 vendors.
The current font is now the work of a single vendor, so it
is stylistically consistent and the quality has been improved.
It is still being checked for correctness -- and that will have
to be reviewed by the IRG meeting June 18-20 in Hong Kong.
China promised delivery of the "final, final font" by July 15.
I'll spare you the rights-to-online-publication hassling about
the font. I am assuming that that will be worked out, but if not,
then we have a production problem for both standards.
2. Amendment 1 for 10646-1 (see Resolutions M40.4, M40.5, M40.6)
The resolution of comments for PDAM 1 was the main order of
business for the meeting. The PDAM passed, with negative votes
by Japan and Ireland, and with lots of comments from other national
bodies. WG2 accommodated most of the comments, including enough
to turn the Japanese and Irish votes to YES. And the amendment
now progresses to its FPDAM balloting.
In the process of resolving national ballot comments or as
a result of considering independent proposal documents brought
into the meeting, a number of characters were added to the
amendment. I'll list these here, ordered by their status for
the UTC, and then list other significant changes.
Characters added to the BMP, already approved by the UTC,
with the same code point and names. These are non-problematical
additions, synching up with what UTC has approved, and require
no further action by the UTC:
0220 LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
034F COMBINING GRAPHEME JOINER
066E ARABIC LETTER DOTLESS BEH
066F ARABIC LETTER DOTLESS QAF
267A RECYCLING SYMBOL FOR GENERIC MATERIALS
267B BLACK UNIVERSAL RECYCLING SYMBOL
2768..2775 (14 Dingbat ornamental brackets)
FE73 ARABIC TAIL FRAGMENT
Characters added to the BMP, already approved by the UTC,
with the same code point, but with different names. The
UTC needs to revisit these, to update their approval to the
new names:
10F7 GEORGIAN LETTER YN
(UTC approved GEORGIAN LETTER IRRATIONAL VOWEL)
10F8 GEORGIAN LETTER ELIFI
(UTC approved GEORGIAN LETTER AINI)
267C RECYCLED PAPER SYMBOL
(UTC approved RECYCLED PAPER)
267D PARTIALLY-RECYCLED PAPER SYMBOL
(UTC approved PARTIALLY RECYCLED PAPER)
Math characters added to the BMP, considered, but not yet
approved by the UTC. There are 74 of these, documented in
WG2 N2356. That was based on WG2 N2336 "Additional Mathematical
Symbols", which superseded WG2 N2318, the document that
resulted from the discussion of these at the last UTC meeting.
The most problematical of the characters in the earlier
documents were omitted, in favor of progressing those which
seemed less controversial and of higher priority. The UTC
will need to review WG2 N2336, N2356, and N2341R (the draft
charts for the FPDAM, which includes these 74) and decide either
to approve them, or ask for modifications or removals from
the FPDAM. I won't type up the entire list of 74 here, as it
is available in those other documents.
Character whose code point was moved from that shown in the
PDAM. This will need to be reviewed and approved by the UTC:
27D0 WHITE DIAMOND WITH CENTRED DOT (moved from 255F)
Character removed from the PDAM. This removal accords with
the UTC decision to reject this character, and so requires no
further action by the UTC:
17DD KHMER SIGN LAAK
Characters that the U.S. ballot comments requested be removed
from the PDAM, but which were not. These are the four extra
radicals in the Japanese compatibility ideograph set. At this
point, since the removal was refused by WG2, the UTC will need
to reconfirm the four characters in question to ensure that
the two standards are synchronized. (Or it could ask for their
removal one more time in FPDAM ballot comments from the U.S.
NB -- but that seems pointless at this time, since the question
was already decided by the WG2 and will get the same result
in the FPDAM unless someone can come up with a stronger
implementation argument for their removal.)
FA4A CJK COMPATIBILITY CHARACTER-FA4A
FA5E CJK COMPATIBILITY CHARACTER-FA5E
FA5F CJK COMPATIBILITY CHARACTER-FA5F
FA67 CJK COMPATIBILITY CHARACTER-FA67
Characters whose names were changed from those printed in the
PDAM text. These fall into two categories: (1) those requested
in U.S. ballot comments, which can be considered to be
pre-approved by the UTC, since they came out of decisions made
in the joint UTC/L2 ad hoc meeting; and (2) those requested in
other NB comments or by the WG2 plenary, which will need to
be reviewed and approved by the UTC.
"Pre-approved" name changes:
2140 DOUBLE-STRUCK N-ARY SUMMATION
291D LEFTWARDS ARROW TO BLACK DIAMOND
291E..2920 (same change of "FILLED" to "BLACK" as for 291D)
2933 WAVE ARROW POINTING DIRECTLY RIGHT
29A8 MEASURED ANGLE WITH OPEN ARM ENDING IN ARROW POINTING UP AND RIGHT
29A9..29AB (same removal of "TO THE" as for 2933 and 29A8)
29D1 LEFT BLACK BOWTIE
29D2 RIGHT BLACK BOWTIE
29D3 BLACK BOWTIE
29D4 LEFT BLACK TIMES
29D5 RIGHT BLACK TIMES
29D7 BLACK HOURGLASS
29EA BLACK DIAMOND WITH DOWN ARROW
29EB BLACK LOZENGE
29ED BLACK CIRCLE WITH DOWN ARROW
29EF ERROR-BARRED BLACK SQUARE
29F1 ERROR-BARRED BLACK DIAMOND
29F3 ERROR-BARRED BLACK CIRCLE
2A28 PLUS SIGN WITH BLACK TRIANGLE
[Incidentally, of these, I think the changes for 29D1..29D5 create
misnomers, and should be reverted to FILLED. I will raise that as
a UTC issue for comment on the FPDAM.]
Name changes requiring further review and approval:
2144 TURNED SANS-SERIF CAPITAL Y
(changed "INVERTED" to "TURNED")
23BE DENTISTRY SYMBOL LIGHT VERTICAL AND TOP RIGHT
23BF..23CC (comparable change of "DENTIST" to "DENTISTRY" in each name)
In addition to all the additions and name changes, WG2 also
agreed to a number of glyph corrections -- some of them requested
in NB ballot comments (technically out of scope for the PDAM, but
accommodated anyway), and others from other sources. Most of
these were non-controversial small fixes, and are going to be
rolled in as soon as possible.
3. Dis-unification of Brackets for CJK and Math (see Resolution M40.7)
Acting as individuals, Asmus and Michel brought in a proposal to
disunify the CJK brackets also used in math, to solve the implementation
problem Asmus talked about at the last UTC meeting. This, and the
proposal for adding more mathematical symbols (see above), led to
an ad hoc meeting on mathematical symbols, whose report got written
up as WG2 N2344. The ad hoc served as a vehicle to get Irish and,
in particular, Japanese support for the proposal. I argued against
the disunification, in accord with the official U.S. position.
Michel argued for the disunification, in opposition to the U.S.
position, and Asmus argued for the disunification, in opposition to
the UTC position. Korea effectively abstained in the ad hoc, and
China was not officially represented. The net of the ad hoc on this
issue was for Kent Karlsson (Sweden), who also supported the disunification,
to write up a proposal. In plenary I argued that proposal down from
10 disunifications to 6, but in the end WG2 approved the 6:
2B00 MATHEMATICAL LEFT WHITE SQUARE BRACKET
2B01 MATHEMATICAL RIGHT WHITE SQUARE BRACKET
2B02 MATHEMATICAL LEFT ANGLE BRACKET
2B03 MATHEMATICAL RIGHT ANGLE BRACKET
2B04 MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
2B05 MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
These are disunification clones of 301A, 301B, 3008, 3009, 300A, and 300B,
respectively.
In addition, WG2 approved the renaming of two characters in Amendment 1:
2985 MATHEMATICAL WHITE LEFT PARENTHESIS
2986 MATHEMATICAL WHITE RIGHT PARENTHESIS
and the addition of two CJK (wide) disunification clones of those two:
33DE WHITE LEFT PARENTHESIS
33DF WHITE RIGHT PARENTHESIS
The resolution didn't actually add these to the FPDAM text, but in
some ways, the wording is even worse than if they had been added
to the FPDAM:
"WG 2 provisionally accepts to add 6 new math symbols ... (etc.) ...
per recommendation of the Math ad hoc group in document N2345R, with
the intent of including these in the standard in the FDAM-1 to 10646-1.
This provisional acceptance is to permit member bodies and liaison
organizations to review and comment by the next meeting of WG 2 in October
2001."
What this means, in effect, is that the disunifications will be added
to the FDAM-1 in October, without them having gone through PDAM or
FPDAM ballot comment and review, unless one or more member bodies or
liaison organizations raise strong enough objections to overturn the
consensus that was manufactured at WG2 #40, as captured in
Resolution M40.7.
Since the UTC and the U.S. national body are on record as opposing this
disunification, that means that if they want the disunification reversed,
they will have to line up a very strong objection before the October
WG2 meeting -- and will not have the nominal vehicle of the FPDAM
ballot comments to do it in, since these disunifications will not actually
appear in the ballot text.
4. Future Script Additions: Limbu, Ugaritic Cuneiform, Aegean scripts
WG2 didn't take any resolutions on these, but minuted the fact
that these proposals are now considered mature. The revised
proposals are in the hopper now, with national body comments
invited. And I made it clear that the intent by the proposers
is to progress these to amendment balloting by resolution at
the Singapore meeting this October. In particular:
Limbu (for the BMP) would be in Amendment 2 for 10646-1.
Ugaritic Cuneiform and the Aegean scripts (Linear B, etc.)
would be in Amendment 1 for 10646-2 (they go on Plane 1).
Given the mature status of the proposals now, and the buy-in
from the relevant academic communities for the historic
scripts, these are now the most likely candidate additions
we can see coming that would meet the presumptive deadline for
inclusion in Unicode 4.0. The Dai scripts (for the BMP) might
also make it, if they get pushed between now and October,
since those proposals are also fairly mature, have had a long
history in WG2, and since China has an interest in completing
them.
5. Korean Ad Hoc Meeting Report (see Resolution M40.1)
The DPRK and ROK were both heavily represented at the meeting. China
also brought along a Korean expert in their delegation. There was
a fairly extensive discussion of the meeting report (N2331) that
the Korean script ad hoc group put on the record. However, the net
effect was no impact on anything currently encoded. Everyone was
invited to continue talking, with Kyongsok Kim (ROK) and a player
to be named later from the DPRK nominated as co-chairs of the ad hoc
to coordinate any future reports to WG2.
Side discussions with the DPRK further clarified the distinction
between the standard arrangement of characters in 10646 and the
use of tables for arbitrary collation orders.
And there was a breakthrough of sorts in clarifying that it is
perfectly o.k. to *translate* the English reference names of
characters in the standard to whatever may be locally appropriate --
just as the French translators have done for the French edition
of 10646. This may help remove the pressure from the DPRK to change
the names of all the Korean Hangul characters in 10646.
--Ken