# This script segfaults:
use strict;
use XML::LibXML;
my $xml = <<EOF;
<!DOCTYPE bug [
<!ENTITY myent "xyz">
]>
<bug>
<elem>&myent;</elem>
</bug>
EOF
my $dom = XML::LibXML->load_xml (string => $xml,
expand_entities => 0);
my $root = $dom->documentElement;
my @nodes = $root->childNodes;
foreach my $node (@nodes) {
    next if $node->nodeType != XML_ELEMENT_NODE;
    next if $node->nodeName ne 'elem';
    $root->removeChild ($node);
}
__END__
The script doesn't crash if you do one of the following:
- set expand_entities to 1
- do not use the entity &myent;
- instead of &myent; use &amp; (or &lt;, &gt; ...)
In dom.c, at one point the function _domReconcileNs() gets called with a
tree argument where tree->ns is not NULL but tree->ns->prefix is either
NULL or points to lala-land.
In the above script, the pointer is NULL. But with a more complicated
XML document, tree->ns->prefix pointed to an address beyond the end of
the virtual RAM. That means that it is not sufficient to extend the
check to
if (tree->ns != NULL && tree->ns->prefix != NULL)
That just fixes the test case but the crash still happens with my "real"
data.
The bug occurs on Gentoo x86_64 with version 1.9000 and also with
version 2.0008.

> In dom.c, at one point the function _domReconcileNs() gets called with a
> tree argument where tree->ns is not NULL but tree->ns->prefix is either
> NULL or points to lala-land.

Why is this the case? Do you know whether it's a bug in XML::LibXML or
in libxml2?

> > In dom.c, at one point the function _domReconcileNs() gets called with a
> > tree argument where tree->ns is not NULL but tree->ns->prefix is either
> > NULL or points to lala-land.

>
> Why is this the case? Do you know whether it's a bug in XML::LibXML or
> in libxml2?

Hard to say, since removeChild() does not exist in libxml2. But it's
definitely one that has to be debugged in XML::LibXML first. Even if it
is a bug in libxml2, we still don't know which usage in XML::LibXML
triggers it.
Regards,
Guido

The problem is that _domReconcileNs() dereferences the ns member of
xmlNodePtr for every node type it encounters, but that member is only
valid for element and attribute nodes.
The attached patch fixes that. All tests succeeded here with the patch
applied.
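
For readers without the attachment, the idea described above can be sketched roughly as follows. This is not the attached patch and does not use libxml2: the types `node`, `ns_decl` and `node_kind` and the functions `node_has_ns_member()` and `count_prefixed()` are hypothetical, simplified stand-ins. The sketch only illustrates checking the node type before touching the ns member, so that a NULL check alone is never relied on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libxml2's node types. */
typedef enum {
    NODE_ELEMENT,
    NODE_ATTRIBUTE,
    NODE_TEXT,
    NODE_ENTITY_REF
} node_kind;

typedef struct ns_decl {
    const char *prefix;
    const char *href;
} ns_decl;

typedef struct node {
    node_kind type;
    ns_decl *ns;            /* assumed meaningful only for elements/attributes */
    struct node *children;  /* first child */
    struct node *next;      /* next sibling */
} node;

/* Returns 1 if the ns member of this node may be read, 0 otherwise. */
static int node_has_ns_member(const node *n)
{
    assert(n != NULL);
    return n->type == NODE_ELEMENT || n->type == NODE_ATTRIBUTE;
}

/* Counts the nodes in the subtree rooted at n that carry a namespace
 * prefix. The type check comes first, so ns is never dereferenced on a
 * node kind for which it is not valid. */
static int count_prefixed(const node *n)
{
    int count = 0;
    const node *child;

    if (n == NULL)
        return 0;
    if (node_has_ns_member(n) && n->ns != NULL && n->ns->prefix != NULL)
        count = 1;
    for (child = n->children; child != NULL; child = child->next)
        count += count_prefixed(child);
    return count;
}
```

With a guard like this, an entity reference child whose ns slot holds garbage is skipped rather than dereferenced, which is why a type check covers cases that the `tree->ns->prefix != NULL` test cannot.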

Hi,
sorry for the late response - I was preoccupied with other things. I
have applied a variation of your patch now in the Mercurial repository,
and it will be part of a new release (with a test). I'll make the new
release - XML-LibXML-2.0011 (or maybe XML-LibXML-2.0100) once I have
looked at this bug report:
https://rt.cpan.org/Ticket/Display.html?id=80521
Regards,
-- Shlomi Fish
