On Oct 12, 2007, at 19:09, Doug Schepers wrote:
> Henri Sivonen wrote (on 10/12/2007 7:23 AM):
>> We don't do inline SVG in text/html yet. Personally, I hope we'll
>> get there. However, if we do, the main SVG complications will be
>> the xlink mapping, the /> syntax and SVG-native camelCaps. I don't
>> think it is a good idea to introduce more complications if we are
>> already entertaining inline SVG in text/html as a possibility.
>
> Thanks for outlining the challenges to integrating SVG into
> text/html, from an HTML5 standpoint. That's very helpful.
>
> I also want that to happen, and would like to facilitate that when
> the time comes. Also like you, I do have certain concerns about
> how it's done. I'll give you my viewpoint (which is not
> necessarily shared by the rest of the SVG or CDF WGs).
>
> From a technical and market viewpoint (an odd pairing, perhaps), I
> feel very strongly that SVG-in-HTML should maintain identical
> markup syntax with standalone SVG (or SVG-in-XHTML, and probably
> X/HTML-in-SVG); any differences between the two syntaxes would be
> actively harmful to SVG.
Do you mean you'd like to bring in the complication of arbitrary
namespace prefixes? I'd like to make the following deviations from
SVG-as-XML syntax:
1) I'd like to limit tokenizer parametrization to toggling
case-folding behavior and, if we must, CDATA sections.
Specifically, I think attribute tokenization should run the same code
as attribute tokenization for the HTML parts of text/html.
2) I'd like to avoid supporting arbitrary namespace prefixes both
in order to sidestep issues in shipped IE versions and in order to
relieve authors of namespace syntax. (xlink: should probably be
considered non-arbitrary and hard-wired.)
More concretely, I've been thinking something like this might work:
* Case folding in the tokenizer should be made conditional so that
potentially camelCap names in <svg> subtrees would not be case-folded.
- Issue: Should case folding be toggled on and off (in which case
tokenizing "<svg " would happen in the case-folding state allowing
"<SvG ") or should names be collected unfolded and then whole names
conditionally case-folded (in which case we could require "<svg " to
be in lower case)?
- Issue 2: If the latter, to avoid expensively case-folding whole
start tag tokens *including* attributes later on, the tokenizer
would probably have to know about tag names that turn on the
case-preserving mode before looking for attributes, but the tree
builder should be the part of the parser telling the tokenizer to
switch back to the case-folding mode. This would be ugly but probably
necessary.
* Start tag tokens should have a flag about the /> presence. The
tree builder would ignore this for HTML elements but would pop
immediately for SVG elements.
* The <svg> element would establish "an SVG scope" in the tree
builder. The <svg> start tag token would itself be handled in the
HTML state of the tree builder so that the <svg> element would be
subject to foster parenting.
* When in an SVG scope, the tree builder would ignore the HTML tree
building rules. This means that stray tags looking like HTML tags
could not cause the tree builder to pop out of the SVG scope. While
in the SVG scope, the tree builder would assign the SVG namespace URI
to the element nodes it creates.
- Issue: What to do if there is a prefixed element?
* When in the SVG scope, a start tag token would unconditionally
result in the corresponding element node being appended to the
current node. (And if the /> flag is set on the token, the node would
be popped immediately.)
* When in the SVG scope, an end tag token would cause a
corresponding element to be searched starting with the current node
towards the start of the SVG scope (and no further). If an element
were found in scope, the stack would be popped until that element got
popped. If there were no such element in scope, the end tag would be
ignored. Any outcome but a single pop would be a parse error.
* When the current node is a foreignObject element in an SVG scope,
the start tag token <html> would establish a "nested HTML scope".
</html>, <body> and </body> would act like "normal" tokens in a nested
HTML scope. Specifically, any token other than </html> encountered in
a nested HTML scope would be unable to break out of the nested HTML
scope.
* Attributes with the name "xlink:href" on the tokenization level
would be reported by the tokenizer as local name "href" in the XLink
namespace.
* xmlns or xmlns:* attributes would have no meaning and would be
non-conforming except xmlns="http://www.w3.org/2000/svg" and
xmlns:xlink="http://www.w3.org/1999/xlink" would be allowed as
"talismans" on the <svg> start tag.
The above trial balloon proposal is designed to optimize SVG
integration in text/html in *future* browsers in a way that would
create a namespace-aware DOM that current DOM-based SVG
implementations would grok immediately but would at the same time
remove namespace declaration syntax from the sight of authors. The
proposal specifically isn't designed to clone the colon-based
namespaces-in-text/html mechanism of IE. OTOH, it shouldn't interfere
with it, either, except perhaps for xlink:href, which could be worked
around by introducing href.
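
To make the tree builder bullets above a bit more concrete (SVG scope,
the /> flag, the end tag search and the xlink:href mapping), here is a
rough Python sketch. It is only an illustration under assumptions, not
a proposed implementation: the names Element, StartTag, EndTag and
SvgScope are made up for this example, the <svg> element is assumed to
have been created already by the HTML state, and prefixed elements,
xmlns talismans and the nested HTML scope under foreignObject are left
out entirely.

```python
from dataclasses import dataclass, field

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


@dataclass
class Element:
    namespace: str
    name: str
    attrs: dict = field(default_factory=dict)     # (namespace or None, local name) -> value
    children: list = field(default_factory=list)


@dataclass
class StartTag:
    name: str
    attrs: dict = field(default_factory=dict)     # raw attribute name -> value
    self_closing: bool = False                    # the "/>" flag from the tokenizer


@dataclass
class EndTag:
    name: str


class SvgScope:
    """Hypothetical tree builder state for one <svg> subtree."""

    def __init__(self, svg_root):
        self.stack = [svg_root]    # stack[0] marks the start of the SVG scope
        self.parse_errors = 0

    def start_tag(self, token):
        # Every start tag becomes an SVG-namespace child of the current
        # node, even if its name looks like an HTML tag.
        attrs = {}
        for raw_name, value in token.attrs.items():
            if raw_name == "xlink:href":
                attrs[(XLINK_NS, "href")] = value   # hard-wired prefix
            else:
                attrs[(None, raw_name)] = value
        node = Element(SVG_NS, token.name, attrs)
        self.stack[-1].children.append(node)
        # With "/>" the node would be pushed and popped immediately,
        # which is the same as never pushing it.
        if not token.self_closing:
            self.stack.append(node)

    def end_tag(self, token):
        """Return True once the <svg> element itself has been popped."""
        # Search from the current node towards the start of the scope,
        # and no further.
        for depth in range(len(self.stack) - 1, -1, -1):
            if self.stack[depth].name == token.name:
                if len(self.stack) - depth != 1:
                    self.parse_errors += 1          # more than a single pop
                del self.stack[depth:]
                return not self.stack
        self.parse_errors += 1                      # no such element: ignore
        return False


root = Element(SVG_NS, "svg")
scope = SvgScope(root)
scope.start_tag(StartTag("g"))
scope.start_tag(StartTag("use", {"xlink:href": "#a"}, self_closing=True))
scope.start_tag(StartTag("p"))     # HTML-looking tag stays inside the scope
scope.end_tag(EndTag("g"))         # pops <p> and <g>: parse error
scope.end_tag(EndTag("div"))       # not in scope: ignored, parse error
scope.end_tag(EndTag("svg"))       # single pop, leaves the SVG scope
```

When end_tag returns True, the caller would presumably hand control
back to the HTML state of the tree builder; how exactly that handover
looks is not something this sketch tries to answer.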
The approach outlined above could be used for MathML as well.
However, in that case, the tokenizer should probably be modified to
switch to MathML entity tables when the tree builder is in a MathML
scope.
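
As a rough illustration of the tokenizer-side switches discussed in
this message (case folding for SVG, entity tables for MathML), the
following Python sketch shows one way the pieces could fit together.
Everything in it is an assumption made for the example: the class
TokenizerMode and its method names are hypothetical, the entity tables
are tiny subsets, the MathML table is assumed to be a superset of the
HTML one, MathML names are assumed to be case-preserved like SVG names,
and it follows the variant where "<SvG " is folded before the mode
switches. Nested scopes are not handled.

```python
# Tiny illustrative subsets; real entity tables are far larger.
HTML_ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "nbsp": "\u00a0"}
MATHML_ENTITIES = {
    **HTML_ENTITIES,
    "InvisibleTimes": "\u2062",
    "ApplyFunction": "\u2061",
}

# Tag names the tokenizer itself has to know about, so that it can stop
# folding before it reaches the attributes of the same start tag.
CASE_PRESERVING_ROOTS = ("svg", "math")


class TokenizerMode:
    """Hypothetical tokenizer parametrization: case folding and entities."""

    def __init__(self):
        self.scope = "html"

    def tag_name(self, raw):
        if self.scope != "html":
            return raw                  # camelCap names survive as written
        name = raw.lower()
        if name in CASE_PRESERVING_ROOTS:
            self.scope = name           # switch before attribute tokenization
        return name

    def attribute_name(self, raw):
        return raw.lower() if self.scope == "html" else raw

    def entity(self, name):
        table = MATHML_ENTITIES if self.scope == "math" else HTML_ENTITIES
        return table.get(name)          # None for an unknown entity name

    def leave_scope(self):
        # Called by the tree builder, which is the part that knows when
        # the foreign scope has ended.
        self.scope = "html"


mode = TokenizerMode()
mode.tag_name("SvG")                    # folded to "svg", switches mode
mode.attribute_name("viewBox")          # kept as "viewBox"
mode.leave_scope()                      # tree builder saw the matching end tag
mode.tag_name("math")
mode.entity("InvisibleTimes")           # resolved from the MathML table
```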
> From a logistics standpoint, this work should be done in
> coordination between the HTML, SVG, and CDF Working Groups. All
> have a vested interest in it, and each has a unique set of
> perspectives, needs, and knowledge. Perhaps we can begin talking
> about it at the upcoming Tech Plenary. We are all busy with other
> things right now, but opening the dialog will prepare us for what
> we'll need to consider going forward.
I agree it would make sense to talk about it at the Tech Plenary.
--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/