On Thursday, February 19, 2004, 12:17:56 AM, Boris wrote:
>> Because most stylesheets out there are in what? Most are in US-ASCII,
>> I would guess, since the entire syntax of CSS uses US-ASCII. The only
>> opportunities to have anything else are replaced content in :before and
>> :after, which is not too common in practice since it doesn't work in
>> MSIE/Win.
BZ> You forgot these niggling little things developers tend to put in code
BZ> (including stylesheets) to make it comprehensible -- comments. Lots and lots
BZ> of sheets have comments. Copyright notices, especially. With people's names.
BZ> Which tend to NOT be ascii, often enough, except in the US.
Good point. I had not considered comments. On the other hand
BZ> In the wild, most stylesheets that are not associated with US websites are
BZ> either ISO-8859-1 or Shift_JIS, from what I've seen. I would be hard pressed
BZ> to estimate relative frequencies of those two as compared to us-ascii.
Figures would be handy, but the point is well made.
>> So, if most stylesheets are US-ASCII then a default of UTF-8 would
>> work pretty well.
BZ> Yeah, as long as you stick to US sites....
No, I was not making that assumption, nor would I consider that
limitation to be at all suitable.
BZ> since treating ISO-8859-1 or
BZ> Shift_JIS as UTF-8 will at best lead to recoverable decoding errors (and at
BZ> worst to irrecoverable ones, depending on what your decoder looks like). Note
BZ> that attempting recovery from decoding errors has security issues, so I can
BZ> perfectly well understand people not trying to do that.
Could such security issues not be triggered by taking such a
stylesheet and referencing it from a page with a suitable encoding
that would, if applied to the stylesheet, trigger the error?
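To make the decoding problem discussed above concrete, here is a minimal
Python sketch. The sample stylesheets and the names in their comments are
made up for illustration, and strict decoding with Python's built-in codecs
only stands in for whatever decoder a real user agent would use.

```python
# Sketch: decode three small stylesheets as UTF-8 and see which ones fail.
# All sample content below is hypothetical.

SAMPLES = {
    # Pure US-ASCII: every byte is below 0x80, so it is also valid UTF-8.
    "us-ascii": "/* Copyright Jane Smith */ p { color: red }".encode("ascii"),
    # ISO-8859-1 comment containing e-acute (0xE9) and u-umlaut (0xFC).
    "iso-8859-1": "/* Copyright Jos\u00e9 M\u00fcller */ p { color: red }".encode("iso-8859-1"),
    # Shift_JIS comment containing the two kanji U+5C71 U+7530.
    "shift_jis": "/* Copyright \u5c71\u7530 */ p { color: red }".encode("shift_jis"),
}


def try_utf8(data: bytes) -> str:
    """Return 'ok' if the bytes are valid UTF-8, otherwise 'error'."""
    try:
        data.decode("utf-8")  # strict mode: raises on the first bad sequence
        return "ok"
    except UnicodeDecodeError:
        return "error"


results = {name: try_utf8(data) for name, data in SAMPLES.items()}

# One possible "recovery": substitute U+FFFD for each undecodable sequence.
# The rules still parse, but the names in the comment are damaged.
recovered = SAMPLES["iso-8859-1"].decode("utf-8", errors="replace")

if __name__ == "__main__":
    for name, outcome in results.items():
        print(f"{name:12s} decoded as UTF-8: {outcome}")
    print(recovered)
```

With these samples only the US-ASCII sheet decodes cleanly; the other two
raise errors in strict mode. The errors="replace" line shows one form of
recovery, which is the step the security concern is about.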
To clarify: the situation I would like to see is that all stylesheets
declare what encoding they are in, preferably using an @charset rule
so that authoring tools, which know this information, can reliably
pass on this info in the stylesheets they write.
If there are multiple sources of information, then they should all say
the same thing.
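
As a rough illustration of "all sources should say the same thing", the
Python sketch below reads an @charset rule from the start of a stylesheet
and compares it with a charset parameter from the HTTP Content-Type header.
The function names are hypothetical, and raising an error on disagreement
is an assumption made for the example, not something any specification
mandates. Labels are compared case-insensitively only; aliases such as
"latin1" are not normalized, and a leading byte order mark is not handled.

```python
import re
from typing import Optional

# An @charset rule has to come at the very start of the sheet, so it can
# be matched on the raw bytes before the encoding is known.
CHARSET_RULE = re.compile(rb'^@charset "([A-Za-z0-9._:-]+)";')


def sniff_at_charset(data: bytes) -> Optional[str]:
    """Return the lowercased label from a leading @charset rule, if any."""
    match = CHARSET_RULE.match(data)
    return match.group(1).decode("ascii").lower() if match else None


def declared_encoding(data: bytes, http_charset: Optional[str]) -> Optional[str]:
    """Return the one declared encoding, or None if nothing is declared.

    Raises ValueError when the HTTP header and @charset disagree
    (an illustrative policy, chosen for this sketch).
    """
    candidates = (http_charset, sniff_at_charset(data))
    labels = {label.lower() for label in candidates if label}
    if len(labels) > 1:
        raise ValueError(f"conflicting encoding declarations: {sorted(labels)}")
    return labels.pop() if labels else None


if __name__ == "__main__":
    sheet = b'@charset "ISO-8859-1";\np { color: red }'
    print(declared_encoding(sheet, "ISO-8859-1"))  # both sources agree
    print(declared_encoding(b"p { color: red }", None))  # nothing declared
```

An authoring tool that already knows the encoding it is writing would only
need to emit the matching @charset line for a check like this to succeed.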
The limitations of text/* media types mean that application/css would
be a better bet in terms of consistent decoding without guesswork.
--
Chris Lilley mailto:chris@w3.org
Chair, W3C SVG Working Group
Member, W3C Technical Architecture Group