On Tue, 28 Apr 2009, Boris Zbarsky wrote:
> >
> > Following hyperlinks:
> > http://www.whatwg.org/specs/web-apps/current-work/#following-hyperlinks
>
> If I read this right, this requires spaces to end up in the parsed URL
> as-is (since they are added to the <unreserved> production), right? Is
> there a good reason for this?
Resolving the URL escapes the spaces in the host, path, and query
components. The spaces in the fragment identifier aren't changed, for
compatibility with IE (I believe the only place this is relevant is in
the DOM -- the fragment identifier is never used in a wire protocol).
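
As a rough illustration of the behavior described above, here is a minimal Python sketch. It is not the spec's resolution algorithm: the function name is hypothetical, only spaces are handled, and the host component is passed through untouched for simplicity. It escapes spaces in the path and query and leaves the fragment exactly as written.

```python
from urllib.parse import urlsplit, urlunsplit


def escape_spaces_outside_fragment(url: str) -> str:
    """Percent-encode spaces in the path and query, but not in the fragment.

    Hypothetical helper for illustration only; real URL resolution
    handles many more characters and also processes the host.
    """
    parts = urlsplit(url)
    path = parts.path.replace(" ", "%20")
    query = parts.query.replace(" ", "%20")
    # The fragment is deliberately passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    print(escape_spaces_outside_fragment("http://example.com/a b?c d#e f"))
```

The fragment stays unescaped because, as noted, it is only consulted client-side and never sent over the wire.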
> > > with a fragment (e.g. a Location HTTP header with a fragment), the
> > > only way to make that fragment match an id is to have the ID
> > > URI-escaped, and in particular have all non-ASCII characters
> > > URI-escaped.
> >
> > Right.
>
> Actually, I got that wrong; for an ID things are OK (you'd need to
> escape the fragment in the URI, but the ID itself can be unescaped). But
> for an <a name> the name would have to be escaped in the HTML.
Oh, I thought you meant the ID in the fragment identifier.
Yes, for name="" you have to escape the name="" attribute's value to
exactly match the way the fragment identifier is written.
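
The asymmetry between IDs and names discussed in this thread can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the spec's algorithm: the function name, the set-based lookup, and the "try ID first" ordering are all hypothetical. An ID is compared against the percent-decoded fragment, while a name is compared against the fragment exactly as written.

```python
from urllib.parse import unquote


def find_target(fragment: str, ids: set, names: set):
    """Return ("id", value) or ("name", value) for the matched target, else None.

    Assumption for illustration: IDs match the unescaped fragment,
    names match only the literal fragment text (no unescaping).
    """
    decoded = unquote(fragment)  # percent-decoding, UTF-8 by default
    if decoded in ids:
        return ("id", decoded)
    if fragment in names:
        return ("name", fragment)
    return None


if __name__ == "__main__":
    # An unescaped ID is found via an escaped fragment.
    print(find_target("caf%C3%A9", {"café"}, set()))
    # An unescaped name is not found via an escaped fragment.
    print(find_target("caf%C3%A9", set(), {"café"}))
    # The name has to be escaped in the HTML to match.
    print(find_target("caf%C3%A9", set(), {"caf%C3%A9"}))
```

Under this model, an IRI fragment written as `café` would not match a name stored as `caf%C3%A9`, which is why the reference has to be escaped in the IRI as well.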
> > > Then that same ID is a pain to match from IRIs (they also end up
> > > needing to have those characters escaped).
> >
> > Why?
>
> Still talking about <a name>, the name in the HTML would be escaped so
> that it can be matched by URIs, and then the IRIs have to have the ref
> escaped as well, because no unescaping happens for names.
>
> This is probably ok, especially because everything should "just work"
> for cases when IRIs are used end-to-end (not the case in Gecko right
> now, effectively, but I'm working on getting that changed).
Ok.
Since the <a name=""> attribute is obsolete in HTML5 anyway, hopefully we
won't have too much in the way of non-ASCII values in them.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'