On Mon, 30 Jun 2008, Julian Reschke wrote:
>
> With question marks, there will be data loss. You may or may not notice
> it, because the page you get may look ok (for instance, it depends on
> how important that part of the query was). If you notice that something
> is wrong, then, yes, spotting the question mark may help. If you
> understand the issue itself. For how many users is that the case?
>
> With UTF-8/percent-escaping, the page may very well work as desired,
> because the server happens to understand that encoding
There is no question that always using UTF-8 would be better than the
current mess.
> (see Google case cited in Webkit bug report).
Do you mean the case that gets converted to &#...;? That's not UTF-8.
(If you mean something else, could you provide a link?)
> Finally, if you copy & paste the URL, you wouldn't see the replacement
> characters anyway, right? In which case the default handling (using
> UTF-8) would apply; which even more is a reason to consider making this
> mandatory (because otherwise following the link inside the document and
> the copy/paste case yield different results).
Having the encoding be essentially random is far worse than converting the
character to a question mark, IMHO.
In any case, authors can easily avoid the whole issue by just using UTF-8.
The problem can only be reached in invalid documents in the first place.
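To make the three behaviours discussed above concrete, here is a rough
sketch in Python. It is only an illustration, not what any browser or the
spec actually does: the helper names and the "fallback" parameter are made
up for this example, and leaving "?" unescaped is an assumption chosen so
that the lossy replacement stays visible.

```python
from urllib.parse import quote


def encode_query_value(value: str, page_encoding: str,
                       fallback: str = "question_mark") -> str:
    """Percent-escape a query value using the page's own encoding.

    Hypothetical helper: `fallback` selects what happens to characters
    that cannot be represented in `page_encoding`.
    """
    # "replace" turns an unencodable character into "?";
    # "xmlcharrefreplace" turns it into a numeric reference like &#26085;
    errors = {"question_mark": "replace", "ncr": "xmlcharrefreplace"}[fallback]
    raw = value.encode(page_encoding, errors=errors)
    # Assumption: keep "?" literal so the data loss is visible in the result.
    return quote(raw, safe="?")


def encode_query_value_utf8(value: str) -> str:
    """Percent-escape a query value as UTF-8, whatever the page encoding is."""
    return quote(value, safe="", encoding="utf-8")


if __name__ == "__main__":
    for text in ("caf\u00e9", "\u65e5"):
        print(repr(text))
        print("  page encoding, '?' fallback:",
              encode_query_value(text, "iso-8859-1"))
        print("  page encoding, &#...; fallback:",
              encode_query_value(text, "iso-8859-1", fallback="ncr"))
        print("  always UTF-8:", encode_query_value_utf8(text))
```

For U+65E5 on an ISO-8859-1 page, the first variant loses the character
entirely, the second sends an escaped "&#26085;" (which is not UTF-8), and
only the third sends the UTF-8 bytes. A character the page encoding can
represent, such as U+00E9, still comes out differently under the page
encoding and under UTF-8, which is the link vs. copy/paste mismatch
mentioned above.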
> > > I care because I'd like to see documents using non-ASCII characters
> > > in query parts become compliant no matter what encoding they are in.
> >
> > Unless we change the definition of HTML5's URLs to be conforming even
> > when those URLs would not be treated as IRIs, I don't see any way to
> > get there from here.
>
> We could break the affected pages and/or add a mechanism through which
> pages can opt-in into the sane UTF-8 based behavior.
Breaking the pages isn't an option, and an opt-in is already available:
use UTF-8. This issue is not even remotely important enough on the grand
scale of things to deserve special syntax or options or whatnot.
> > The HTMLWG is only a small part of the broad range of places from
> > which I take input, which includes hundreds of blogs, at least three
> > separate bug systems, multiple other mailing lists, face to face
> > discussions, IRC conversations on dozens of channels and privately,
> > private e-mails, etc. I try to keep as much of the discussions to the
> > HTMLWG and WHATWG lists, but the sheer volume of traffic that would be
> > generated by archiving all the sources of input on public-html would
> > be staggering, and that's without even considering whether all those
> > people would actually be willing to have their input forwarded in that
> > way.
>
> In which case it seems to me we have a big process problem.
My goal is to get a good specification and bring the Web forward, not to
follow process, so that's quite possible, yes. I'm certainly not going to
start putting process ahead of getting quality feedback.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'