At the moment, there are two big features in the works in MySQL optimizer - Subquery optimizations and Batched Key Access. While the former is a part of MySQL 6.0, I wrote about it here in my blog, and so forth, the latter was in nearly stealth mode until a couple of weeks ago.

That’s no longer the case:

Batched Key Access source code is now published as mysql-6.0-bka-preview tree.

Igor Babaev, the author of the feature, gave a talk about Batched Key Access at MySQL User Conference, and the slides are available here.

There is now Batched Key Access page on forge.mysql.com which a semi-official feature homepage which describes how you can get source, binaries, etc

If you’re developing any kind of “remote” storage engine or are using MySQL in a data warehousing application where disk IO is the bottleneck, I recommend to take a look.

After this post I’ve got a question how one can tell if his outer join was converted to inner. You can find it out by looking at the warning generated by EXPLAIN EXTENDED. If the outer join wasn’t converted, you’ll see it in the rewritten query in the warning:

Looking at the warning text we see an inner join instead of the left join, and also we see that the ON clause has been added into the WHERE.

…

Yes, those warning messages are hard to read, they have excessive quoting and the lines are too long. But at the moment certain kinds of query plan details are displayed only there, so skiming through the rewritten query text may pay off when you have doubts about what is going on.

“Inner” table access methods are constructed from parts of the ON condition. WHERE condition can’t be used to construct table accesses.

Parts of ON condition are checked as soon as possible

Parts of the WHERE condition
- are not checked until we’ve found a row combination that matches the ON clause
- are checked as soon as possible after that.

Or, in more detail:

Conversions

RIGHT JOIN to LEFT JOIN conversion is obvious:

(t1 RIGHT JOIN t2 ON cond) = (t2 LEFT JOIN t1 ON cond)

Conversion from outer to inner join is possible when the result of inner join will be the same. It will be the same if the row combination with NULL-complimented row will not pass the WHERE clause. For example, if we look at the query

t1 LEFT JOIN t2 ON some_cond WHERE t2.a=t1.b

we’ll see that a row with t2.a IS NULL will not satisfy the WHERE condition. Hence, this outer join can be converted to inner.

Constraints on join order

Outer tables go first

any outer table used in the outer join’s ON clause must be before all of the inner tables.

No interleaving

tables contained within an outer join must form a continuous sequence in the join order. Interleaving with tables that are outside of the outer join is not allowed.

Table access rules

Now, this requires some explanation. MySQL’s nested-loops join code tries to check parts of the WHERE as soon as
possible. For example when a query

SELECT * FROM
t1,t2, ...
WHERE
t1.col1=c1 AND
t2.col1=t1.col2 AND t2.col2=c3 AND
...

is executed using a join order of (t1, t2,…), it proceeds according to this kind of scenario:

We can see here that the the WHERE condition is split into parts that are checked “as early as possible”.

With outer joins is more complicated. We need to know if we’ll need to generate a NULL-complemented row combination. We won’t need to if there was a combination of inner tables that matched the ON (but not necessarily the WHERE) clause. The solution is to switch the WHERE parts checking on and off.

The best way to show it is with example: Suppose we have a query

SELECT * FROM ... ot1 LEFT JOIN (it1, it2) ON somecond WHERE ...

and suppose the join order is (ot1, it1, it2, …). The execution will proceed in this manner:

What’s visible there? When we start scanning table it1, we check only the ON condition. We can’t check the WHERE - we could iscard some it1’s row that is the only row that will match the ON condition, think there will be no matches, and erroneously generate the NULL-complimented row.

After we’ve found the match for the ON condition, we go back and check all parts of the WHERE we did not check because of the above mentioned reason.

After that, the execution proceeds as if this was an inner join, with ON merged into the WHERE clause.

The diagram also shows why we can’t use parts of the WHERE clause to create table acccess methods: because there are times when we can’t use parts of the WHERE for filtering. We always can use parts of the ON though.