Hi,
Re <http://tools.ietf.org/html/draft-iab-identifier-comparison-07>, in
section 3.3 there is
   Also, when a URI is embedded in plain text (e.g., an email message),
   there is an additional concern because there is no termination
   criterion for a URI. For example, consider
   http://unicode.org/cldr/utility/list-unicodeset.jsp?a=a&amp;g=gc.
   Some applications that detect URIs will stop before the first '.' in
   the path, while others go to last '.', and yet others may stop at the
   ';'. As another point of comparison, Section 2.37 of [EE] (a
   standard for history citations) specifies the use of a space after a
   URI and before the punctuation.
It's unclear to me whether the `&amp;` in there is intentional or an
encoding error. If it is intentional, that should be made very explicit.
I also find the claim a bit dubious: STD 66 quite clearly recommends
using <> around URIs, and you could use white space as well. More
generally, this seems a bit far-fetched as an issue in "comparison"; it
is more a discussion of applying heuristics to extract data from
ambiguous text. Perhaps the document can do without this paragraph.
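
To illustrate the difference, here is a rough Python sketch. Both regular
expressions and the example.org address are made up for this mail and are
not taken from the draft or from STD 66; the first is a naive "run until
white space" heuristic, the second relies on <> delimiters.

```python
import re

# Hypothetical heuristic: a scheme, "://", then everything up to white space.
NAIVE_URI = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://\S+")

# Hypothetical delimiter-based matcher: a URI enclosed in angle brackets.
ANGLE_URI = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*:[^\s<>]*)>")

plain_text = "See http://example.org/list.jsp?a=a&g=gc."
delimited_text = "See <http://example.org/list.jsp?a=a&g=gc>."

# The naive heuristic swallows the sentence-ending full stop.
naive_matches = NAIVE_URI.findall(plain_text)

# With delimiters, the end of the URI is unambiguous.
delimited_matches = ANGLE_URI.findall(delimited_text)

print(naive_matches)
print(delimited_matches)
```

With the delimiters in place there is nothing left to guess, which is why
this reads to me as a text-extraction problem rather than a comparison
problem.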
Section 3.1 on hostnames seems to be missing the issue of "example.com"
versus "example.com." with a trailing full stop; it might be useful to
mention it there.
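
A small Python sketch of the issue, assuming an application that wants to
treat both spellings as the same host (whether that is appropriate depends
on the application; normalize_host is a hypothetical helper, not something
the draft defines):

```python
def normalize_host(host: str) -> str:
    """Lowercase the name and drop one trailing full stop, if present.

    Assumption: "example.com" and "example.com." are meant to identify
    the same host in this application.
    """
    host = host.lower()
    return host[:-1] if host.endswith(".") else host


# A plain string comparison treats the two spellings as different names.
naive_equal = "example.com" == "example.com."

# After normalization they compare equal.
normalized_equal = normalize_host("example.com") == normalize_host("example.com.")

print(naive_equal, normalized_equal)
```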
In section 3.3.2.3.,
   [RFC3986] defines the userinfo production that allows arbitrary data
   about the user of the URI to be placed before '@' signs in URIs. For
   example: "http://alice:bob:chuck@example.com/bar" has the value
   "alice:bob:chuck" as its userinfo. [...]
This is somewhat misleading as it fails to mention that while the
generic syntax allows this, individual schemes like the HTTP scheme, as
currently defined in RFC 2616, do not allow this. It might be better to
pick a scheme that actually allows this form.
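
As an aside, generic parsers tend to apply the generic syntax regardless of
scheme. A short sketch with Python's urllib.parse, which splits the draft's
example without checking whether the scheme permits userinfo:

```python
from urllib.parse import urlsplit

# The example URI from the draft.
parts = urlsplit("http://alice:bob:chuck@example.com/bar")

# Everything before the last "@" in the authority is the userinfo.
userinfo = parts.netloc.rpartition("@")[0]
host = parts.hostname

print(userinfo)
print(host)
```

So a comparison routine built on such a parser will see the userinfo even
for schemes whose own definition does not allow it.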
Section 3.3.3,
   [RFC3986] supports the use of path segment values such as "./" or
   "../" for relative URIs. Strictly speaking, including such path
   segment values in a fully qualified URI is syntactically illegal but
   [RFC3986] section 4.1 nevertheless defines an algorithm to remove
   them.
This should include a reference to STD 66 indicating where it defines
them as illegal (I could not find that myself, so the text might be
mistaken).
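
For reference, the removal algorithm I am aware of is the
remove_dot_segments routine in section 5.2.4 of RFC 3986. Below is a
minimal Python sketch of those steps as I read them; the function name and
structure are my own, so treat it as an illustration and not as a
normative implementation.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of the dot-segment removal steps in RFC 3986, section 5.2.4."""
    output = []  # completed segments, each with its leading "/" if it had one
    while path:
        if path.startswith("../"):        # step A: drop a leading "../"
            path = path[3:]
        elif path.startswith("./"):       # step A: drop a leading "./"
            path = path[2:]
        elif path.startswith("/./"):      # step B: "/./" becomes "/"
            path = path[2:]
        elif path == "/.":                # step B: final "/." becomes "/"
            path = "/"
        elif path.startswith("/../"):     # step C: "/../" pops one segment
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":               # step C: final "/.." pops one segment
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):         # step D: lone "." or ".." vanishes
            path = ""
        else:                             # step E: move one segment to output
            cut = path.find("/", 1)
            if cut == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:cut])
                path = path[cut:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))
print(remove_dot_segments("mid/content=5/../6"))
```

The two inputs are the examples given in that section of the RFC, which
lists "/a/g" and "mid/6" as the expected results.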
The reference [TR36] should link to http://www.unicode.org/reports/tr36/
or some other suitable address (currently it does not link to anything).
regards,
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/