* Phillips, Addison wrote:
>1. Change the text above to read:
>
> If the IRI or IRI reference is an octet stream in some known non-
> Unicode character encoding, convert the IRI to a sequence of
> characters from the UCS.
>
> In other cases (written on paper, read aloud, or otherwise
> represented independent of any character encoding) represent the IRI
> as a sequence of characters from the UCS.
IRIs are by definition sequences of characters from the UCS. With the
requirement gone, I do not see the point of keeping this section in the
document.
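
To illustrate the only case in the proposed text that involves real work,
here is a minimal Python sketch of going from octets in a known legacy
encoding to a sequence of UCS characters. The function name, the sample
IRI, and the choice of ISO-8859-1 are my own assumptions for the example,
not anything taken from the draft.

```python
# Hypothetical example data: an IRI received as octets in ISO-8859-1.
raw = b"http://example.org/r\xe9sum\xe9"


def iri_from_octets(octets: bytes, encoding: str) -> str:
    """Decode octets in a known legacy encoding into UCS characters.

    The helper name is illustrative only; it simply wraps bytes.decode().
    """
    return octets.decode(encoding)


# Each 0xE9 octet becomes U+00E9 LATIN SMALL LETTER E WITH ACUTE.
iri = iri_from_octets(raw, "iso-8859-1")
```

Once decoded, the result is just a string of characters, which is what an
IRI is in the first place.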
>2. Add the following text just after the second paragraph above:
>
>NOTE: Some character encodings or transcriptions can be converted to or
>represented by more than one sequence of Unicode characters. Ideally the
>resulting IRI would use a normalized form, such as Unicode Normalization
>Form C (NFC, [UTR15]), since that ensures a stable, consistent
>representation that is most likely to produce the intended results.
>Implementers and users are cautioned that, while denormalized character
>sequences are valid, they might be difficult for other users or
>processes to guess and might produce unexpected results.
Normalization is already discussed in 5.3.2.2 "Character Normalization";
any further discussion of it should be moved there if it is not already
covered.
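
For what the proposed note is getting at, a small Python sketch: the same
visible text can arrive as two different character sequences, and NFC maps
both to one form. The sample strings are made up for illustration, and
this only shows the effect, not what the draft requires.

```python
import unicodedata

# Two spellings of the same visible text (illustrative values).
decomposed = "re\u0301sume\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "r\u00e9sum\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# Without normalization the two sequences compare unequal.
differ_before = decomposed != precomposed

# NFC composes "e" + U+0301 into U+00E9, so both end up identical.
nfc_a = unicodedata.normalize("NFC", decomposed)
nfc_b = unicodedata.normalize("NFC", precomposed)
```

Both inputs are valid character sequences; only the normalized forms
compare equal code point for code point.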
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/