Improve performance of WP_Query core

WP_Query, by default, selects wp_posts.* when building post queries. This can result in extremely large result sets when used with a large wp_posts table, resulting in MySQL being forced to use a temp table for sorting, which leads to all sorts of performance problems. SELECT * is generally considered to be a code smell as is, and is commonly known to cause problems at scale.

This patch breaks default WP_Query queries into two parts - one to select a list of IDs with the given selection criteria, and a second to actually select the posts once the IDs have been determined. This vastly reduces the amount of data that has to be processed by the database, resulting in fewer context switches and vastly improved performance.

I have implemented this in a production system with a wp_posts table with over 450k rows and reduced query times by two orders of magnitude. Database CPU burn is down by 90% and context switches have been drastically reduced.

This should be of some benefit even in small cases, since smaller datasets will result in faster sorts. The overhead of the second query is minimal. In large cases, the performance benefits range from "good" to "dramatic".

I have run the Wordpress PHPUnit test suite against the patch, and it did not break any tests. The suite, when I checked it out, had multiple failures present, but no new ones were introduced by this change.

Change History (105)

Unfortunately, not many of our unit tests cover WP_Query. After some bad experiences in version 3.1 when we added taxonomy queries, we're not likely to significantly alter WP_Query again (for both this ticket and #10964, among otherS) until we have them. Building some unit tests for WP_Query would be quite the undertaking, but it'd be one that would move along tickets like this.

That said, this does look relatively simple. The main concern would be seeing how plugins that hook deeply into WP_Query might be affected.

Another concern would be figuring out where old and new intersect with regards to performance. I imagine for simple queries or smaller datasets, the current version would be more performant.

What's one of your queries that isn't performing well? If you're doing an interesting join or sorting or querying by an unindexed field, you might be doing a tablesort that you should work to avoid. Can we see some EXPLAIN results?

All the patch does is reduce the amount of data that MySQL has to manage to perform the sort - with small dataset sizes, this isn't likely to have much of an impact, but once the size of your potential resultset (with included post content) passes your query into filesort territory, the differences become *big*.

One possible solution might be as simple as a config switch; the administrator could be warned that the switch may break some plugins. It's certainly not a perfect solution, but for people with large installs, it might be a pretty significant quality-of-experience improvement.

My first draft of this fix implemented query rewrites as a plugin using the posts_request and posts_results filters, but it's a hack of a solution, and seems like the sort of thing that belongs in the core anyhow. ​https://github.com/cheald/wp_fasterquery/blob/master/filter.php is the plugin, but it's very clearly tailored specifically to our needs and install, and may not work well for others.

While it's possible to fix this with a plugin, it feels like the fact that it can happen at all is a somewhat self-limiting design issue that could use some more formal resolution. If the team disagrees, that's certainly fine, but I felt I'd be remiss if I didn't at least make an attempt! :)

Some caching plugins return an empty query string from the posts_request filter. With the current query code, this causes get_results() to return null and skip the query. The plugin then fills in the posts array by fetching the posts from cache and returning them through the posts_results filter. This fails with 18536.diff applied. The get_col() call in the expand_ids block receives an empty query string. get_col() uses last_result when an empty query is passed. This results in some totally whacky stuff ending up in the SELECT * query in the expand IDs block.

A quick fix would be to save the query string before calling the filter and use that with get_col() if the filtered query string is empty. But, who knows what other back compat problems await. I'd still like to give this a go, but I think we have hold off until 3.4 to give plugin authors time to test and accommodate.

So this is basically what was suggested in #10964 i.e. split the query in two, right?

Yes. That ticket is all over the place though.

If there's a hook registered for posts_request, then don't do the expand_ids block. Create two new request filters for the two queries in the expand_ids block. That way plugins have a clear way forward. Sort of like Denis' suggestion in 10964, but I think new filters will handle back compat better than adding a flag to posts_request.

With regards to back-compat, I agree with ryan that it's better to not fire some of the old filters anymore and introduce new filters. That way, if a plugin wants to be compatible with WP < 3.4, it can just keep the old callbacks without any modification and add new code for WP 3.4+.

I was thinking about this ticket in combination with some others like #19726 and wondering if we should look into create a separate function that that would fill in the posts array from the passed in ID's that would first check cache for each post_id before trying to get it's data from the DB. This would allow the second query to be avoided in a lot of cases where an object cache was available. The order of the posts could easily be rebuilt from the order of the post_ids passed in.

I was thinking about this ticket in combination with some others like #19726 and wondering if we should look into create a separate function that that would fill in the posts array from the passed in ID's that would first check cache for each post_id before trying to get it's data from the DB. This would allow the second query to be avoided in a lot of cases where an object cache was available. The order of the posts could easily be rebuilt from the order of the post_ids passed in.

Essentially replicates some of the work of the advanced caching plugin. It's definitely a good idea.

I noticed that the query that fetches the rest of the rows contains the $orderby fragment again. This would obviously break when the orderby refers to any table other than wp_posts, for example when ordering by a meta key.

Going to use an approach similar to WP_User_Query: fetch only the non-cached posts and then populate the posts array according to the order of the ids found.

As a possible performance improvement for sites without a persistent object cache, do the get_post() array_map() only if $non_cached_ids and $ids don't have the same count? If they're the same just use $fresh_posts.

As a possible performance improvement for sites without a persistent object cache, do the get_post() array_map() only if $non_cached_ids and $ids don't have the same count? If they're the same just use $fresh_posts.

Although we'd still need something to restore order since orderby had to be dropped for the second query. So, probably not worth the worry. Regardless, we can hash out with further optimizations later.

As a possible performance improvement for sites without a persistent object cache, do the get_post() array_map() only if $non_cached_ids and $ids don't have the same count? If they're the same just use $fresh_posts.

Even with a non-persistent object cache, get_post() shouldn't trigger any new queries, since update_post_cache() is called before it.

I don't think _prime_post_caches() has a use case outside of WP_Query. If you wish to prime post caches elsewhere, simply use WP_Query. (For example, we do this for nav menus.)

Yes, but doing it this way still causes the first query to get the posts.ID of the posts that were already in the posts_in clause.

Take for instance the update_post_thumbnail_cache() function. Since we already have the post thumbnail ids, there is no reason to run the ID query just to get the same IDs which will be fed into the primer. The same goes in the nav menus. I also use a similar function a lot in my own code any time I need to show a user sorted list of posts.

It would be hard to avoid this extra query to get the id's you already told it unless you specifically tell WP_Query that you won't be using the sorted result.

That brings up a good point. A query such as WP_Query( 'post__in' => array( 5, 15 ) ) will look rather silly: It will first grab all IDs when we already have them. Perhaps a further optimization would be that, given a simple $where and a big enough $limit, we can avoid making the query all together, and simply populate the post objects and order them in PHP. But I'm perhaps getting a bit ahead of things. Ryan might have a better idea on how we can abstract this out so existing prime attempts are not adversely affected, and there is a canonical way to prime a cache.

Just coming in from a fresh experience of upgrading a production site to the latest (3.3.1), there is also the use of SQL_CALC_FOUND_ROWS in the user.php that was introduced recently (I haven't checked the revision history of it just yet). This should utilize a different approach as well. Is that easy to add to this patch?

I think avoiding expand_ids if a posts_request filter is present is safest for 3.4. There are a number of caching plugins that will break in bizarre fashions with expand_ids. Like we talked about in comment:14 and #10964.

Do we know whether this will work with plugins like W3 Total Cache? Sorry, please forgive me and don't flame, I'm not huge on knowledge of edits like this and the WP core. It's why I'm here, to learn :)

Really glad you guys are working on this and hope to have it down for 3.4. My site has been completely crippled by this slow query since we crossed about 100k posts.

Currently I run 3.1. My question is: if I want to test this right now, first I'll need to update my install to trunk version (3.4) then I install the patch: 18536.6.diff. Is that correct? Or are we better served to wait for 3.4 at this point?

first I'll need to update my install to trunk version (3.4) then I install the patch: 18536.6.diff. Is that correct?

Since [19918] (2 weeks ago) it's been in trunk, so installing the latest trunk SVN onto the server would allow you to test it. although it should work, just realise there's a lot of things still in development which aren't quite finished yet.

Previously, forums, topics, and replies (all custom post types) would come with their ancestors pre-populated so they could be walked up through their forums to root. Now sometimes there are no ancestors, so things like breadcrumbs are showing inaccurate results.

I've only been able to track it down to being inconsistent ancestor results in the cache; I haven't come up with the best place to drop in a fix. Most likely a call to _get_post_ancestors( $_post ); is needed somewhere.

In 3.3 WP_Query::get_posts(), get_post_status() called get_post() called _get_post_ancestors() before the caches were primed. This meant a post object with ancestors made it into the cache. In 3.4, the caches are primed without ancestors before get_post_status() -> get_post() -> _get_post_ancestors() is called. Since get_post() does cache adds, not sets, the first add without the ancestors wins. We could fix this by doing a cache set in get_post() if the ancestors property is not set in the cached version. I'd rather not mess with that right now, however. A simpler alternative that should suffice for the time being is to call _get_post_ancestors() from get_post_ancestors() if the ancestors property is not set. This could result in the query being run on every _get_post_ancestors() call in the situation where the cached object does not have ancestors, but I can live with that. In 3.5 we can contemplate a proper WP_Post object with lazy ancestors fetching and improved caching.

If we pass $this->posts[0]->ID instead of the $this->posts[0] object to get_post_status(), the extra query will be avoided. (The _get_post_ancestors() call is caused by an object with empty(filter) getting into get_post(). The lack of a query is caused by an ID getting into get_post() for a post object that is already cached.)

This avoids the unnecessary ancestors query, at least until get_post_ancestors() gets called.

However, subsequent calls to get_post_ancestors() will result in subsequent queries for the same data, as get_post_status() is no longer putting ancestors into the cache. Perhaps we need to call _get_post_ancestors() for is_singular(), as that's what was happening already with 3.3, and that's where ancestors are most likely to be used.

Nope. $post->ancestors is only sometimes set, and (under 3.3) it is never set for a post object that comes from WP_Query. It *is*, however, set for the post object in cache (but not $wp_query->posts[0]) when is_singular, due to get_post_status() caching the object (as described above).

Call _get_post_ancestors() from get_post_ancestors() if the ancestors property is not set in the post object. Works around situations where ancestors is not set in the cached version of the post object. see #18536

In 3.3 the $posts array in WP_Query can contain a post object with the ancestors property set. This is because get_post_status() is working on an object (reference) and that object gets the ancestors property set.

Can [21073] please be included in the 3.4 release? Passing the post ID breaks plugin "teaser" functionality, which uses the 'posts_results' filter to spoof post_status to 'publish' (while replacing post_content with a placeholder).