I found a problem with the table tags (TR, TD, TH). They are declared as
SGML_MIXED in HTMLPDTD.c, and they do accept a close tag. But I think the
DTD makes the close tag optional, so they can also be closed by another
instance of themselves (i.e. <TR> closes the previous <TR>, and so on).
Since there is currently no way to express this in the code, a table that
contains only open tags quickly overflows the HTML element stack, and there
is no way to recover from it.
To "fix" the problem I just declared the tags as SGML_EMPTY.
The right fix would be to add some information about which tags can close
which other tags.
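One way such "which tag closes which" information could look, as a minimal sketch only: the element numbers, the ElementStack type and begin_element() below are hypothetical and are not libwww names. Before a new element is pushed, any open elements it implicitly closes are popped, so a table written with open tags only stays at a constant stack depth.

```c
/* Hypothetical element numbers; libwww's real HTML_* constants differ. */
enum { EL_TABLE, EL_TR, EL_TD, EL_TH, EL_COUNT };

#define MAX_DEPTH 32

typedef struct {
    int items[MAX_DEPTH];   /* currently open elements, innermost last */
    int depth;
} ElementStack;

/* implied_close[new][open] != 0 means: opening `new` closes an open `open`.
 * Rows that are not listed (e.g. EL_TABLE) close nothing. */
static const unsigned char implied_close[EL_COUNT][EL_COUNT] = {
    [EL_TR] = { [EL_TR] = 1, [EL_TD] = 1, [EL_TH] = 1 },
    [EL_TD] = { [EL_TD] = 1, [EL_TH] = 1 },
    [EL_TH] = { [EL_TD] = 1, [EL_TH] = 1 },
};

/* Open element `el` (assumed to be in 0..EL_COUNT-1), first popping every
 * element at the top of the stack that `el` implicitly closes.
 * Returns the new depth, or -1 if the stack is full. */
static int begin_element(ElementStack *s, int el)
{
    while (s->depth > 0 && implied_close[el][s->items[s->depth - 1]])
        s->depth--;                     /* implied end tag */
    if (s->depth == MAX_DEPTH)
        return -1;
    s->items[s->depth++] = el;
    return s->depth;
}
```

Popping stops at the first element that the new tag does not close, so a <TR> inside a nested <TABLE> does not close the rows of the outer table.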
Also, it would be useful to be able to "drive" the HTML parser from the
user level, so that a user callback could do things like:
void myBeginElement(HText *text, int element_number, ...)
{
    /* close the TD that is still open before starting a new one */
    if (element_number == HTML_TD && TD_already_open)
        (myStream->actions->end_element)(myStream->target, HTML_TD, ...);
    ...
}
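The callback idea above could be fleshed out roughly as follows. This is a self-contained sketch under stated assumptions: Sink, CellFilter, CountingSink and the EL_* numbers are made up for illustration and do not reproduce libwww's stream interface. The filter sits between the parser and the real target, remembers whether a TD is open, and synthesizes the missing end_element call.

```c
/* Hypothetical element numbers, not libwww's HTML_* constants. */
enum { EL_TR = 1, EL_TD = 2 };

/* Hypothetical target interface with the two callbacks used here. */
typedef struct Sink Sink;
struct Sink {
    void (*begin_element)(Sink *self, int el);
    void (*end_element)(Sink *self, int el);
};

/* Filter placed in front of the target; tracks whether a TD is open. */
typedef struct {
    Sink *target;
    int td_open;
} CellFilter;

static void filter_begin(CellFilter *f, int el)
{
    /* A new TD or TR implicitly closes the TD that is still open. */
    if ((el == EL_TD || el == EL_TR) && f->td_open) {
        f->target->end_element(f->target, EL_TD);
        f->td_open = 0;
    }
    if (el == EL_TD)
        f->td_open = 1;
    f->target->begin_element(f->target, el);
}

static void filter_end(CellFilter *f, int el)
{
    if (el == EL_TD) {
        if (!f->td_open)
            return;             /* already closed by the filter */
        f->td_open = 0;
    }
    f->target->end_element(f->target, el);
}

/* Example target that only counts the calls it receives. */
typedef struct {
    Sink base;                  /* must be the first member */
    int begins, ends;
} CountingSink;

static void count_begin(Sink *s, int el) { (void)el; ((CountingSink *)s)->begins++; }
static void count_end(Sink *s, int el)   { (void)el; ((CountingSink *)s)->ends++; }
```

A single flag is enough only for tables that are not nested; nested tables would need one flag per open table.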
Anyway, here is the "fix":
Index: HTMLPDTD.c
===================================================================
RCS file: /sources/public/libwww/Library/src/HTMLPDTD.c,v
retrieving revision 2.28
diff -b -r2.28 HTMLPDTD.c
480c490
< { "TD" , td_attr, HTML_TABLE_ATTRIBUTES, SGML_MIXED },
---
> { "TD" , td_attr, HTML_TABLE_ATTRIBUTES, SGML_EMPTY /*SGML_MIXED*/ },
482c492
< { "TH" , td_attr, HTML_TD_ATTRIBUTES, SGML_MIXED },
---
> { "TH" , td_attr, HTML_TD_ATTRIBUTES, SGML_EMPTY /*SGML_MIXED*/},
484c494
< { "TR" , id_attr, HTML_ID_ATTRIBUTE, SGML_MIXED },
---
> { "TR" , id_attr, HTML_ID_ATTRIBUTE, SGML_EMPTY /*SGML_MIXED*/},
---------------------------------------------
Raffaele Sena
Senior Software Engineer ( "THE" Linux Guy :)
NuvoMedia, Inc.
310 Villa Street
Mt. View, CA 94041
Main: +1.650.314.1200
Direct: +1.650.314.1255
Fax: +1.650.314.1201
mailto:raff@nuvomedia.com
http://www.rocket-ebook.com