PDO: Learn how the PDO SQL parser causes bug reports

2008/07/03by admin

The PDO SQL parser causes many bugs. All drivers suffer from it. By default the parser replaces placeholder with '<bound_value>'.'<bound_value>' is a string. This can clash with the SQL syntax. PDO can make an INSERT fail. Please learn why and stop reporting bogus bugs.

Why does PDO parse my SQL commands?

PDO has chosen Prepared Statements as its preferred way of running database queries. The entire API is build around prepared statements. Wez has explained that primarily security considerations have caused this decision. Read PDO_MYSQLND: Prepared Statements, again for a discussion of prepared statements in the context of PDO.

Not every database system supports prepared statements. And those who do use a different syntax for placeholder. Some systems support named parameters (SELECT * FROM test WHERE id = :name), some support positional/ordinal parameter (SELECT * FROM test WHERE id = ?) and some support both. The cause of the differences is the lack of a standard. PDO tries to hide the differences from the user. This is done by a prepared statement emulation. The emulation includes a SQL parser which allows you to use whatever format you prefer for placeholder. If the underlying database system does not support the placeholder format you have chosen, PDO emulates it for you.

In many cases the emulation works flawless. But the bug databases on bugs.php.net and pecl.php.net show plenty of reports that boil down to the PDO SQL parser. Please learn how the SQL parser works. This should help avoiding duplicates in the bug database.

How the SQL parser operates

The PDO SQL parser is a very basic parser. That is no surprise. Its only task is to detect placeholders of the kind :name (named placeholders) or ? (positional/ordinal placeholder). Every found placeholder will be replaced by bound values. Bound values are input values you have associated with the placeholders using PDOStatement::bindValue or PDOStatement::bindParam.

The parser does its job by tokenizing your SQL command, like every parser. A statement like INSERT INTO test(id, label) VALUES (:id, :label) becomes a stream of token. The stream looks like this:

Type

Token

SQL

INSERT INTO test(id, label) VALUES (

Named Placeholder

:id

SQL

,

Named Placeholder

:label

SQL

)

The parser reads the token one by one. Whenever it hits a token of the type "Named Placeholder" (or "Positional Placeholder") it looks up the appropriate input value. Given that an input value can be found, PDO calls the underlying PDO driver of the database system to quote the input value. PDO does use PDO::quote(). You could to the very same in the userland. The quoted value gets used to rebuild the SQL command from the individual token.

Type

Token

Generated SQL

SQL

INSERT INTO test(id, label) VALUES (

INSERT INTO test(id, label) VALUES (

Named Placeholder

:id

'1'

SQL

,

,

Named Placeholder

:label

'a'

SQL

)

)

In the example PDO creates INSERT INTO test(id, label) VALUES ('1', 'a'). Note that PDO has added quotes around the 1. By default PDO takes the most secure default and converts everything to string. Some databases, however, might expect an unquoted integer value for the column id as it might be defined as INTEGER.

PDO is not aware of types

PDO does not know how to quote your input values, unless you tell PDO how to do it! PDO has no knowledge of underlying table definitions or a database specific SQL syntax. You must tell PDO how to quote an input value. PDO is not aware of the type. This is client-side emulated prepared statements without all the context information available with server-side prepared statements.

Start to make use the third parameter of the bind*-functions. Tell PDO what type to use for your input data! If you want to insert a integer value use $stmt->bindParam(":id", $id, PDO::PARAM_INT);. You must not use $stmt->bindParam(":id", $id);, unless you are sure that the PDO prepared statement emulation is not involved because you are using server-side prepared statement ($db->setAttribute(PDO::ATTR_EMULATE_PREPARES, 0);) and you use the placeholder syntax (named and/or positional/ordinal) supported by the underlying database system.

Example of “Bogus”: LIMIT – not a driver bug

$stmt = $db->prepare('SELECT * FROM test LIMIT :limit');
$stmt->bindValue(':limit', 1);
if (!$stmt->execute())
var_dump($stmt->errorInfo());
else
var_dump($stmt->fetchAll(PDO::FETCH_ASSOC));
Warning: PDOStatement::execute(): SQLSTATE[42000]: Syntax error or access violation:
1064 You have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near ''1'' at line 1 in
/home/nixnutz/php53/ext/pdo_mysqlnd/tests/bug_44707.php on line 27
array(3) {
[0]=>
string(5) "2000"
[1]=>
int(1064)
[2]=>
string(149) "You have an error in your SQL syntax; check the manual that corresponds
to your MySQL server version for the right syntax to use near ''1'' at line 1"
}

This is not a bug! The PDO_MYSQLND client trace, which you can turn on by setting the PHP configuration option pdo_mysql.debug (see also PHP: Debugging ext/mysqli and mysqlnd), tells you why. The trace tells you what SQL has been send by PDO to the PDO_MYSQLND driver:
>pdo_mysql_stmt_execute
| info : stmt=0
| info : SELECT * FROM test LIMIT '1'
| >_pdo_mysql_error
| | info : file=/home/nixnutz/php53/ext/pdo_mysqlnd/mysql_statement.c line=303
| >_pdo_mysql_error
>pdo_mysql_stmt_execute

SELECT * FROM test LIMIT '1' is not a valid MySQL SQL command! PDO has inserted a quoted string at a place where you must insert an integer. PDO has done this because you told PDO to do it. How to fix it? Simple – tell PDO what you mean!
$stmt->bindValue(':limit', 1, PDO::PARAM_INT);
-->
>pdo_mysql_stmt_execute
| info : stmt=0
| info : SELECT * FROM test LIMIT 1
>pdo_mysql_stmt_execute

Alternatively, in case of MySQL, forget about PDO_MYSQL and give the upcoming PDO_MYSQLND a try. Avoid the prepared statement emulation and be happy. Or, go ext/mysqli. As PDO is not an abstraction layer and it does not help you much writing portable code, you can go back to those database vendor specific extensions who feature a usable API.

Real bugs in the PDO SQL parser

If you browse the bug databases you will find a good number of real bugs in the PDO SQL parser. The parser is not only not aware of types. It also fails to handle all the different SQL dialects, see PDO_MYSQLND: Prepared Statements, again. And it has real bugs like the segfault caused by "SELECT 1, 2 FROM DUAL WHERE 'test' LIKE 'O'chaos' AND :id". The segfault has been reported as a PDO_MYSQL bug, Bug #41125 PDO mysql + quote() + prepare() can result in seg fault more than a year ago. But the backtrace proofs that its a PDO bug. Not a driver bug.

Think a minute before reporting a bug

Learn how to use PDO properly, if you really insist on using it. Learn how the PDO SQL parser works. Think a minute before reporting a PDO bug. It might not be a driver bug. Nor might it be a PDO bug. It might just be the way how PDO is designed that bites you.

After one day with the bug databases and teaching this PDO lesson I feel good about closing some bugs as "Bogus".

But in any case we end up with the need for a parser. No matter where we put the parser – into the core or into the drivers – it will need context information. As you might know, I’m currently in favour of moving the parser (optionally) into the drivers. This will, however, not free the users to give type hints.

Once again: don’t get me wrong on PDO. I’m 110% behind PDO as such for PHP. PHP deserves a DB “abstraction”. But we need to streamline PDO. And we need to teach people about its current limitations.

One can learn a lot from the PDO code. But it goes back to PHP 5.0 and today we’re talking about 6.0. Its time for an upgrade.