On Jul 5, 2008, at 00:44, Jirka Kosek wrote:
> Henri Sivonen wrote:
>
>> I disagree with the simplified framing of the issue, since it gives
>> the wrong idea of how little fixing is needed and where the
>> sensible place for the fix is. The doctype is the least of the
>> problems with XSLT and HTML5.
>
> Hi Henri, actually there are two issues. One is very simple -- how to
> allow producing of HTML5 compliant output with *existing* XSLT
> language and its implementation. This issue is very important because
> it is very common approach for producing HTML content. Moreover even
> HTML WG charter explicitly states that "legacy implementation" of
> "classic HTML" should be taken into account. And XSLT could be
> considered as such legacy application.
To use the existing XSLT language and implementations, one needs to
use <xsl:text disable-output-escaping="yes">. It's the
document.write() of the XML community, so it's ugly. However, it can
be done. It is an optional feature, but then having a serializer at
all in an XSLT processor is optional. For the optionality to be a
problem, it would need to be shown that there are notable
implementations that do implement the optional
<xsl:output method="html"/> but don't implement the optional
disable-output-escaping="yes".
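
For illustration, here is a minimal sketch of that workaround driven through TrAX. The class name, the stylesheet, and the input document are all made up for the example. Whether the doctype really comes out unescaped depends on the processor honoring the optional feature, and it can only do so when it serializes the result itself (a stream result, as here).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DoctypeViaDoe {
    // Hypothetical stylesheet: writes the doctype as raw text, then an
    // ordinary no-namespace HTML tree for the html output method.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/'>"
      + "<xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE html&gt;</xsl:text>"
      + "<html><head><title><xsl:value-of select='/doc/title'/></title></head>"
      + "<body><p><xsl:value-of select='/doc/para'/></p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet with whatever engine is the TrAX default and
    // returns the serialized output as a string.
    static String render(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
            render("<doc><title>Hello</title><para>World</para></doc>"));
    }
}
```

If the processor ignores disable-output-escaping, the same stylesheet would emit the doctype with escaped angle brackets instead, which is the failure mode the optionality argument is about.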
> Of course there is second issue on which you really elaborate in your
> email and this is how to extend some *future version* of XSLT language
> and its implementation to support all bits of HTML5. I almost agree
> with your analysis on this issue.
The issues can be fixed without changing the XSLT language. I released
version 1.1.0 of the Validator.nu HTML Parser the other day. The
package comes with a sample program that uses an unmodified XSLT
engine (whatever you have set as the TrAX default) with an HTML5
parser and an HTML5 serializer. There's running code for addressing
the issues *today*.
http://about.validator.nu/htmlparser/
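
The shape of such a pipeline can be sketched with the JDK alone. This is not the sample program from the package: the class names and the stylesheet are invented, a plain XML parser stands in for the HTML5 parser, and a handler that merely records element names stands in for the HTML5 serializer. The point is only that the XSLT engine in the middle is stock TrAX, and that the tree it hands to the serializer side carries the XHTML namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class TraxPipelineSketch {
    // Hypothetical stylesheet whose literal result elements are in the
    // XHTML namespace (note the default xmlns on the root).
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns='http://www.w3.org/1999/xhtml'>"
      + "<xsl:template match='/'>"
      + "<html><head><title>t</title></head>"
      + "<body><p><xsl:value-of select='/doc'/></p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Stand-in for a serializer: records {namespace}localName per element.
    static class RecordingHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add("{" + uri + "}" + localName);
        }
    }

    static List<String> run(String xml) throws Exception {
        // Stand-in for a parser: any namespace-aware XMLReader fits here.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();

        // The unmodified default TrAX engine sits between the two ends.
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        RecordingHandler sink = new RecordingHandler();
        t.transform(new SAXSource(reader, new InputSource(new StringReader(xml))),
                    new SAXResult(sink));
        return sink.names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc>hi</doc>"));
    }
}
```

Swapping in a real HTML5 parser and serializer would mean replacing the XMLReader and the ContentHandler; the transformation step stays as it is.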
On the serialization side, it is up to the programmer of the XSLT
transformation to make sure that the output tree is conforming XHTML5
+ SVG 1.1 + MathML 2.0. If it isn't, the serialization results can be
wildly wrong. However, this isn't worse than
<xsl:output method="html"/>, since it, too, produces wrong results if
the XSLT programmer doesn't make sure the output trees are sanely
shaped HTML 4 trees.
>> HTML5 defines HTML elements to go into the
>> "http://www.w3.org/1999/xhtml" namespace in order to abstract away
>> the difference of serialization from programs that operate on a
>> namespace-aware tree representation. HTML5 parsers that expose XML
>> APIs to allow unified application internals regardless of whether
>> the data came in as text/html or application/xhtml+xml put HTML
>> elements in the "http://www.w3.org/1999/xhtml" namespace per spec.
>> Moreover, with support for MathML and SVG, there can also be
>> element nodes in those namespaces. Programs operating on trees
>> shouldn't have to have different code throughout depending on
>> whether the program is targeted at text/html or
>> application/xhtml+xml.
>
> On the other hand, in past HTML (4 and previous) has not been using
> anything like namespaces while XHTML used this concept. If you have
> existing XSLT code that emits HTML and you want to use few new
> elements introduced in HTML5 why you should also start thinking about
> namespaces?
Starting to think about namespaces is not cool, but XSLT is on the XML
side of the fence, and XML has namespaces, so XHTML5 has them.
> You simply want to add those few new tags into your stylesheet and
> modify public identifier to make it clear that you are using brand
> new HTML5 language.
That works for people who know both XSLT and HTML really well.
However, for everyone who isn't a language lawyer, the bounds of
this approach are mysterious and arbitrary. That is, you hit the
limits of what the HTML output method of XSLT can do and those limits
depend on historical details.
> So, your idea sounds perfectly reasonable and I think once there is
> something like HTML5 output method in XSLT and HTML5 is widely
> deployed everyone should use such approach. But we are not there
> yet, we can propose such academically clean approach, but at the
> same time we should pragmatically solve todays' problems.
Within these constraints, there's
<xsl:text disable-output-escaping="yes">, which doesn't require us to
allow cruftier syntactic alternatives in HTML5 syntax.
If we allow a placeholder public id, cargo cultists will think that
the more complicated syntax is somehow better because HTML 4 had
similar cruft and cruft exists for a *reason*, will make up a
rationalization for it that doesn't even mention XSLT (something like
"it helps browsers better understand semantics") and will start
evangelizing the more crufty syntax to other people who will end up
wasting their time looking up a public id that is useless if they
aren't using XSLT. Time is the most valuable resource people have, so
inflicting time-wasting cruft on Web authors isn't nice.
>> I think the right way to deal with this is to define an HTML5
>> output method for XSLT.
>
> I agree, and I'm willing to manage that next version of XSLT will
> have such method. Of course this means that serialization of HTML5
> and other related issues are resolved before. Is this part of HTML5
> stable or are there any changes expected?
The SVG stuff is still commented out. Also, Julian contested the new
void elements.
--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/