There is also the following Note:
Note: The difference between Variants B and C in Step 1 (Variant B
using normalization with NFC while Variant C not using any
normalization) is to account for the fact that in many non-Unicode
character encodings, some text cannot be represented directly.
For example, Vietnam is natively written "Vi&#x1EC7;t Nam"
(containing a LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW)
in NFC, but a direct transcoding from the windows-1258 character
encoding leads to "Vi&#xEA;&#x323;t Nam" (containing a LATIN SMALL
LETTER E WITH CIRCUMFLEX followed by a COMBINING DOT BELOW),
whereas direct transcoding of other 8-bit encodings of Vietnamese
may lead to other representations.
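
To make the Note concrete, here is a rough Python sketch of the difference between normalizing and not normalizing before the UTF-8 percent-encoding of Step 2. It is only an illustration, not text from the draft. The helper name percent_encode is made up, and the sketch assumes Python's "cp1258" codec is an adequate stand-in for windows-1258.

```python
import unicodedata
from urllib.parse import quote

# The sequence the Note describes: U+00EA (e with circumflex)
# followed by U+0323 (combining dot below).
decomposed = "Vi\u00ea\u0323t Nam"

# Round-trip through windows-1258 (Python codec "cp1258", assumed here to
# match the encoding named in the Note) to mimic a direct transcoding
# from legacy bytes to Unicode.
legacy_bytes = decomposed.encode("cp1258")
transcoded = legacy_bytes.decode("cp1258")

# Normalizing with NFC yields the single precomposed letter U+1EC7.
nfc = unicodedata.normalize("NFC", transcoded)

def percent_encode(text):
    """Hypothetical helper: UTF-8 encode, then percent-encode every octet
    that is not an unreserved character."""
    return quote(text, safe="", encoding="utf-8")

print(percent_encode(nfc))         # with NFC normalization
print(percent_encode(transcoded))  # without normalization
```

The two printed strings differ, so whether normalization is applied changes the resulting URI.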
Would moving this closer to the A/B/C variants, and maybe adding
some text, be a solution to your last call comment?
Regards, Martin.
At 14:50 04/08/18 +0900, Martin Duerst wrote:
>Hello Chris,
>
>Many thanks for your comment. I have made it issue why-not-normalize-42
>(see http://www.w3.org/International/iri-edit#why-not-normalize-42).
>
>A few ideas on how to deal with it below.
>
>At 22:22 04/08/11 +0200, Chris Lilley wrote:
>
>>Hello,
>>
>> > If the IRI is in a Unicode-based character encoding (for example
>> > UTF-8 or UTF-16): Do not normalize. Apply Step 2 directly to the
>> > encoded Unicode character sequence.
>>I believe that I understand why this step says 'do not normalize'
>>(otherwise, certain Unicode strings could never be used in query parts,
>>for example).
>>
>>However, as the two preceding steps say 'normalize' and this step says
>>'do not normalize' the reader could be confused - or perhaps consider it
>>an 'obvious error'.
>>
>>Do not tease the reader like this. Please explain *why* at this stage no
>>normalization is performed.
>
>You definitely have a point. But as you have noticed, the explanations
>are already given elsewhere in the document. I think there are several
>things that can be done:
>
>- capitalize 'NOT', to make clear that this is not an 'obvious error'.
>- add a pointer to 5.3 Normalization
>  (http://www.w3.org/International/iri-edit/draft-duerst-iri.html#normalization)
>- do both of the above
>
>Which one do you prefer? Do you think this is enough, or do you have
>some other idea (actual wording preferred)?
>
>
>Regards, Martin.