In this article, we discuss the XML functionality available in MySQL, with an emphasis on new features coming online in MySQL 5.1 and MySQL 6.0. We assume that you already have a working knowledge of XML, and that you know what the terms “valid” and “well-formed” mean. We also assume that you have some knowledge of XPath.

We cover the following topics:

Methods for outputting MySQL data in XML format, including the use of

lib_mysqludf_xql

, a third-party library that can be used for this task

Using the functions (new in MySQL 5.1)

ExtractValue()

and

UpdateXML()

for working with XML and XPath

Storing data from XML in a MySQL database using the

LOAD XML

statement (implemented in MySQL 6.0)

Some security considerations to keep in mind when using these techniques

An XML namespace declaration is included in the XML output of both mysql and mysqldump, beginning with MySQL 5.1.18.

There may be times when your application expects an XML format that is different from what is normally obtain from mysql or mysqldump. Suppose, for example, that your application expects a format such as this one:

You can see that generating even relatively simple XML output by such means can rapidly grow very complicated. While you can wrap such

SELECT

statements in stored procedures, the fact that MySQL stored routines cannot (at least at present) take a variable number of arguments. Fortunately, there is a third-party library that can be used with MySQL to make this task easier.

Using the

lib_mysqludf_xql

library. The

lib_mysqludf_xql

library is a set of UDFs originally written and placed under the GPL by Arnold Daniels. The source for these is now housed at www.mysqludf.org.

the section called “XPath Functionality”); however having to do so every time you want to access the data s not terribly convenient. However, MySQL developer Alexander Barkov has written a stored procedure

xmldump_load

that extracts data from the XML elements and attributes found in a file created using mysqldump --xml, and inserts this data into the columns of a MySQL table. The source of this stored procedure is shown here:

This statement reads data from an XML file into a table, and acts as the complement mysql or mysqldump in XML output mode (that is, using the

--xml

option). The

filename

must be given as a literal string. The

tagname

in the optional

ROWS IDENTIFIED BY

clause must also be given as a literal string, and must be surrounded by angle brackets (

<

and

>

).

LOAD DATA

clauses used in

LOAD XML

. If you have used MySQL's

LOAD DATA

statement before, then you should find the following clauses familiar, since they work in essentially the same way for

LOAD XML

as they do for

LOAD DATA

:

LOW_PRIORITY

or

CONCURRENT

LOCAL

REPLACE

or

IGNORE

CHARACTER SET

The

IGNORE number LINES

clause is analogous to the

LOAD DATA

statement's

IGNORE ... LINES

clause. LOAD XML also accepts

IGNORE number ROWS

. In either case, the clause causes the first

number

rows in the XML file to be skipped, and not to be imported.

(column_or_user_var,...)

SET

See

LOAD DATA INFILE

Syntax in the MySQL 5.1 Manual for more information about these clauses.

ROWS IDENTIFIED BY '<

tagname

>'.

Accepted input formats.

LOAD XML

supports three different XML formats:

Attributes are interpreted as column names, and attribute values as interpreted as column values:

<

row

column1

="

value1

"

column2

="

value2

" .../>

Tag names are interpreted as column names, and the content of these tags are interpreted as column values:

<

row

> <

column1

>

value1

</

column1

> <

column2

>

value2

</

column2

></

row

>

Table column names are derived from the

name

attributes of

<field>

tags, and column values are taken from the contents of these tags:

<row> <field name='

column1

'>

value1

</field> <field name='

column2

'>

value2

</field></row>

This is the format used by MySQL tools such as mysqldump

The import routine used by

LOAD XML

automatically detects the format used for each row and interprets it correctly, matching based on the tag or attribute name and the column name. You can easily verify this for yourself, by creating an XML file that uses any two or even all three formats and then using

It is also important to check the output being sent back to the client. For an example, let's revisit our previous example; however, this time, instead of using PHP's XPath functionality, we use the MySQL

ExtractValue()

function:

mysql>

SELECT ExtractValue(

->

LOAD_FILE('users.xml'),

->

'//user[login/text()="" or 1=1 and password/text()="" or 1=1]/attribute::id'