I had been involved with subquery optimizations fairly closely, but last week I was surprised to find out that MySQL 5.6 does not support derived table merging. This feature was among the subquery features in the abandoned MySQL 6.0. In MariaDB, it was finished and released as part of MariaDB 5.3/5.5. As for MySQL, neither MySQL 5.6, nor MySQL 5.7 has this feature.

So what is this “derived merge”? It’s simple to understand. When one writes complex queries, it is common to use FROM-clause subqueries as a way to structure the query:

select
sum(o_totalprice)
from(select * from orders where o_orderpriority=’1-URGENT’)as high_prio_orders
where
o_orderdate between ‘1995-01-01′ and ‘1995-01-07′

MySQL optimizer processes this syntax very poorly. The basic problem is that FROM-subqueries are always materialized exactly as-specified. Conditions from outside the subquery are applied only after the materialization.

In our example, table orders has an index on o_orderdate, and there is a highly selective condition o_orderdate BETWEEN ... which one can use for reading through the index. But the condition is located outside the subquery, so it will not be used when reading the table. Instead, we will get the following plan:

Note that we see only one line, and the table orders is accessed through an index on o_orderdate. Running EXPLAIN EXTENDED will show why:
Message: select sum(`dbt3sf1`.`orders`.`o_totalprice`) AS `sum(o_totalprice)` from `dbt3sf1`.`orders` where ((`dbt3sf1`.`orders`.`o_orderpriority` = ‘1-URGENT’) and (`dbt3sf1`.`orders`.`o_orderDATE` between ‘1995-01-01′ and ‘1995-01-07′))

There is no FROM-clause subquery anymore. It has been merged into the upper select. This allowed the optimizer to avoid doing materialization, and also to use the condition and index on o_orderdate to construct a range access.

Query execution time for this particular example went down from 15 sec to 0.25 sec, but generally, the difference can be as big as your table is big.