http://www.w3.org/Bugs/Public/show_bug.cgi?id=1502
------- Additional Comments From timbl@w3.org 2005-06-17 02:05 -------
Michael (Kay), the two functions are *quite different* as I understand it. It is not that one operates on
a part and the other on a whole URI. You can feed a whole URI or a part of one to either.
encode-for-uri(s) takes ANY STRING (not necessarily related to a URI) and encodes it as
something which can be transferred as a path segment. It is an encoding in that there is a corresponding
decode. If you use it twice, then you get something double-encoded. Example: use it when encoding a
string argument to an HTML-form-style query.
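A minimal Python sketch of that behaviour, for illustration only. It assumes that urllib.parse.quote with an empty safe set is a close enough stand-in for encode-for-uri; the names encode_for_uri and decode_from_uri are mine and not taken from any spec.

```python
from urllib.parse import quote, unquote

def encode_for_uri(s: str) -> str:
    # Stand-in (assumption): escape everything except unreserved characters
    # (letters, digits, "-", "_", ".", "~"), including "%" itself.
    return quote(s, safe="")

def decode_from_uri(s: str) -> str:
    # The corresponding decode.
    return unquote(s)

once = encode_for_uri("a b&c")   # "a%20b%26c"
twice = encode_for_uri(once)     # "a%2520b%2526c", i.e. double-encoded
```

Because "%" is itself escaped, each application adds a layer, and each layer needs its own decode to get back to the original string.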
clean(s) takes a URI (or part) and just cleans it up so that any unacceptable characters are encoded in
ASCII. It doesn't encode anything which is already encoded. There is no inverse function, as you can't
tell which characters were not clean in the original string. If you use it twice, it's the same as
using it once. Example: use it when encoding an IRI for transmission in HTTP.
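A rough Python sketch of what I mean by clean(), again only as an illustration. The choice of which characters to leave alone (the URI reserved characters plus "%") is my assumption, and example.org is a placeholder.

```python
from urllib.parse import quote

# Left untouched (assumption): URI reserved characters, plus "%" so that
# existing escapes are not encoded a second time.
_KEEP = ":/?#[]@!$&'()*+,;=%"

def clean(s: str) -> str:
    # Escape only what is not acceptable in a URI (spaces, non-ASCII, ...).
    return quote(s, safe=_KEEP)

cleaned = clean("http://example.org/résumé file?q=a%20b")
# "http://example.org/r%C3%A9sum%C3%A9%20file?q=a%20b"
```

Note that clean("a b") and clean("a%20b") both give "a%20b", which is why there can be no inverse, and why a second application changes nothing.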
Why would you want to perform both operations? The result of encode-for-uri will always be clean, so
performing clean() on it will have no effect. The result of cleaning a URI will be a clean URI, which one may
then want to encode as a URI-encoded parameter within a new query URI being built up. But that is a
separate function, and should be programmed as such.
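To make the distinction concrete, here is a small Python sketch that keeps the two steps separate. Everything in it is an assumption for illustration: the two helpers are stand-ins built on urllib.parse.quote, and the example.org and proxy.example URIs and the "url" parameter are hypothetical.

```python
from urllib.parse import quote

def encode_for_uri(s: str) -> str:
    # Stand-in: escape everything except unreserved characters.
    return quote(s, safe="")

def clean(s: str) -> str:
    # Stand-in: escape only unacceptable characters, keep existing escapes.
    return quote(s, safe=":/?#[]@!$&'()*+,;=%")

# Step 1: clean a URI so it can be transmitted.
inner = clean("http://example.org/a b")
# "http://example.org/a%20b"

# Step 2 (a separate operation): embed it as a parameter in a new query URI.
outer = "http://proxy.example/fetch?url=" + encode_for_uri(inner)
# "http://proxy.example/fetch?url=http%3A%2F%2Fexample.org%2Fa%2520b"
```

In this sketch clean(encode_for_uri(s)) always equals encode_for_uri(s), since the encoded string contains nothing left for clean() to escape.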