The walk() method of the DOM Walker class is defined as follows:

    def walk(self, root):
        if root.get_nodeType() == DOCUMENT_NODE:
            c = root.get_documentElement()
            assert c.get_nodeType() == ELEMENT_NODE
            return self.walk1(c)
        else:
            return self.walk1(root)

This behaves unexpectedly if the Document node has several children,
as can happen when processing instructions (PIs) precede or follow the
root element. Only the root element is walked, so the Document node's
other children are skipped. This becomes apparent if you're walking
the tree in order to print it.
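
To make the problem concrete, here is a minimal sketch. It uses the
standard library's xml.dom.minidom as a stand-in for the DOM
implementation discussed here, so the attribute-style accessors
(documentElement, childNodes) and the collect() helper are my
assumptions for illustration, not part of the Walker class.

```python
from xml.dom import minidom

# A document with a PI before and after the root element.
doc = minidom.parseString("<?before x?><root><child/></root><?after y?>")

def collect(node, out):
    """Hypothetical helper: append node names in document order."""
    out.append(node.nodeName)
    for child in node.childNodes:
        collect(child, out)
    return out

# What the current walk() effectively does: start at the root element only.
only_root = collect(doc.documentElement, [])

# What a printer would need: every child of the Document node.
all_children = []
for c in doc.childNodes:
    collect(c, all_children)

print(only_root)
print(all_children)
```

The first list contains only the root element and its descendants;
the second also contains the two PIs, which is what you would want
when printing the document back out.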
How should this be fixed? One choice is to change the
DOCUMENT_NODE case to:

    for c in root.get_childNodes():
        self.walk1(c)

However, this change really makes the distinction between walk() and
walk1() unnecessary. walk() is basically a wrapper for walk1() that
fetches the root element when it is given a Document node; if we just
traverse all the children, the same logic works for any node type, so
walk() and walk1() could be collapsed into one function. Merging them,
though, would break code that subclasses Walker and overrides walk()
or walk1() with something customized.
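
As a rough sketch of the merged option and of the compatibility risk:
the classes below are hypothetical (MergedWalker, its visit() hook,
NameCollector, and OldStyleSubclass are names I made up), and
xml.dom.minidom again stands in for the DOM in question. The point is
only that a subclass written against the two-method API would stop
being called, without any error being raised.

```python
from xml.dom import minidom

class MergedWalker:
    """Hypothetical merged walker: one recursive method for any node type."""

    def walk(self, node):
        self.visit(node)
        for child in node.childNodes:
            self.walk(child)

    def visit(self, node):
        # Hook for subclasses; does nothing by default.
        pass

class NameCollector(MergedWalker):
    """Written against the merged API: overrides visit()."""

    def __init__(self):
        self.seen = []

    def visit(self, node):
        self.seen.append(node.nodeName)

class OldStyleSubclass(MergedWalker):
    """Written against the old API: overrides walk1(), which the merged
    walk() never calls."""

    def __init__(self):
        self.seen = []

    def walk1(self, node):
        self.seen.append(node.nodeName)

doc = minidom.parseString("<?before x?><root><child/></root><?after y?>")

collector = NameCollector()
collector.walk(doc)

old = OldStyleSubclass()
old.walk(doc)

print(collector.seen)
print(old.seen)
```

The new-style subclass sees the Document node, both PIs, and the
elements, while the old-style subclass sees nothing at all. Keeping a
walk1() that simply delegates to the merged method might soften this,
but only for code that calls walk1(), not for code that overrides it.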
What do people think should be done? Just fix walk(), or merge walk()
and walk1()?
--
A.M. Kuchling http://starship.skyport.net/crew/amk/
May you go safe, my friend, across that dizzy way / No wider than a hair, by
which your people go / From earth to Paradise; may you go safe today / With
stars and space above, and time and stars below.
-- Lord Dunsany