Arnaud Desitter wrote:
> On 22/01/2008, John Snelson <john.snelson@oracle.com> wrote:
>> Is there a better way to do what I want? I would be quite happy to
>> implement a new API method to do this if that's required - does anyone
>> else think this would be useful?
>
> Please refer to http://tidy.sf.net/issue/1636028.
> Your contribution to a new API would be welcome. Please post it using the
> tidy patch tracker.

Thanks for the pointer. From the linked bug report, it's not obvious
what the correct fix is. Should I change tidyNodeGetText() to return
the unescaped value of the node, or should I add a new function?

Here's what I propose - I'll add a new function:

    Bool tidyNodeGetValue( TidyDoc tdoc, TidyNode tnod, TidyBuffer* buf );

For attribute, text, comment, and processing instruction nodes this
function will fill the buffer with the value of the node. The value will
be unescaped and not serialized (no "<!--", "<?", or similar delimiters).
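
To make "unescaped and not serialized" concrete, here is a rough,
self-contained C sketch. It does not call Tidy at all:
unescape_predefined() and comment_value() are hypothetical helpers made
up for this illustration, and handling only the five predefined entities
is a simplifying assumption. A real implementation would presumably read
the node's content from Tidy's internal data rather than re-parse
serialized text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Tidy: copies src into dst, replacing
 * the five predefined entities with the characters they stand for.
 * Anything else, including unknown entities, is copied through unchanged.
 * dst must have room for strlen(src) + 1 bytes; unescaping never makes
 * the text longer. Returns the length of the result. */
size_t unescape_predefined(const char *src, char *dst)
{
    static const struct { const char *name; char ch; } entities[] = {
        { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
        { "&quot;", '"' }, { "&apos;", '\'' }
    };
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        size_t i;
        int matched = 0;

        for (i = 0; i < sizeof entities / sizeof entities[0]; ++i) {
            size_t len = strlen(entities[i].name);
            if (strncmp(src, entities[i].name, len) == 0) {
                dst[out++] = entities[i].ch;
                src += len;          /* skip the whole entity */
                matched = 1;
                break;
            }
        }
        if (!matched)
            dst[out++] = *src++;     /* ordinary character */
    }
    dst[out] = '\0';
    return out;
}

/* Hypothetical helper, not part of Tidy: given a serialized comment such
 * as "<!-- note -->", copies only the text between the delimiters into
 * dst. Returns 1 on success; returns 0 and leaves dst empty if src is
 * not wrapped in comment delimiters. dst needs strlen(src) + 1 bytes. */
int comment_value(const char *src, char *dst)
{
    size_t len;

    assert(src != NULL && dst != NULL);

    len = strlen(src);
    dst[0] = '\0';
    if (len < 7 || strncmp(src, "<!--", 4) != 0
                || strcmp(src + len - 3, "-->") != 0)
        return 0;

    memcpy(dst, src + 4, len - 7);   /* drop "<!--" and "-->" */
    dst[len - 7] = '\0';
    return 1;
}
```

With these helpers, the serialized text "a &lt; b" would come back as
"a < b", and the comment "<!-- note -->" would come back as " note ".
That is the kind of result I would expect tidyNodeGetValue() to put in
the buffer.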

Some questions:

1) Are there other node types the function should work for?
2) Should I respect the specified output encoding, or use UTF-8? (For
   instance, the tidyNodeGetName() function always returns UTF-8.)
3) What should I do about unrepresentable characters?

John
--
John Snelson, Oracle Corporation http://snelson.org.uk/john
Berkeley DB XML: http://www.oracle.com/database/berkeley-db/xml
XQilla: http://xqilla.sourceforge.net