Use more granular caching for common queries in `WP_Site_Query`

Description

The new function get_site_by() that should serve as a modern replacement for get_blog_details() ended up not making it into 4.9, because it had several drawbacks of the latter related to caching, particularly caused by the caches used in WP_Site_Query being invalidated on any update to any site, which makes caching almost useless in a very big and active setup.

Better handling of caching specific queries that are common should be explored, particularly for queries which can be invalidated only by invalidating the caches for a specific site instead of relying on the aggressively invalidated last_changed. Let's look at how some of the old caches like blog-lookup or blog-id-cache work, and apply that to the modern behavior of WP_Site_Query.

While mentioning get_site_by() above, the changes should preferably be made to WP_Site_Query to benefit any other multisite area making such queries as well.

Change History (25)

42252.diff​ implements several caching improvements, some major, some minor. All of them are part of a new WP_Site_Query::get_cache_key( $query_args ) method. The goal of these changes is to bring the same granular handling that old caches such as blog-lookup use to WP_Site_Query in general and thus bring this solid foundation to any code that uses get_sites().

The most significant improvement and also kind of new concept introduces in the patch is that there is no longer just one last_changed key in the sites group. Instead there are several more last_changed keys, prefixed with md5-generated values from specific query arguments. There are currently four sets of arguments that are handled in a special way:

domain, path and network_id

domain and path

domain

path

If a query contains all arguments for one of the above bullet point without including any other "database field related" argument, a special last_changed key will be used which is generated by passing the concatenated values for the respective arguments (only the arguments of the respective bullet point above) through md5. The term "database field related" here means query arguments that deal with a specific database field, such as domain, path, ID, site__in, lang__in, public, to give only a few examples. Contrary, arguments that are not database field related would be for example number, offset, fields, no_found_rows, orderby. The latter of course do affect the outcome of a query - however they do not affect when the cache for a query needs to be invalidated (via a last_changed cache value).

While the logic is rather complex (and we may surely be able to improve it!), this change brings a significant improvement to these specific queries which are rather common. Notably, the technique of using more granular last_changed values provides much more flexibility than something like the existing blog-lookup cache, which only stores one result for a specific domain and path.

For example, both of the following sets would use md5( 'example.com' ) . '_last_changed' for their last changed key:

Set 1:

domain: example.com

number: 1

Set 2:

domain: example.com

number: 100

orderby: path

order: DESC

Another example, both of the following sets would use md5( 'example.com/foo/3' ) . '_last_changed' for their last changed key:

Set 1:

network_id: 3

domain: example.com

path: /foo/

number: 1

Set 2:

network_id: 3

domain: example.com

path: /foo/

number: 5

offset: 5

no_found_rows: false

These examples show the flexibility of this approach while not being subject to aggressive cache invalidation. To explain a bit further, the second example's cache would only be invalidated if a site with either network ID 3, domain example.com or path /foo/ was created/updated/deleted (see the change in clean_blog_cache() for that part specifically).

Concluding, I think this approach gives us the possibility to for example get get_site_by() merged as it currently is, as simply by making these improvements to WP_Site_Query it would have caching that is better than that in get_blog_details(). Furthermore, this will obviously benefit any query using get_sites(), as said before.

The code for this is admittedly rather complex, so don't hesitate to ask question or tear it apart. This is a new pattern for caching query results, so I hope we can get it to a solid state. Once we get there, WP_Network_Query (and possibly even non-multisite query classes) could be enhanced in the same way. Nicely enough, all existing unit tests passed for me with these changes. Of course we'll also need a bunch of tests for this behavior specifically, which I'll add later.

One last thing: Lines 707-724 of class-wp-site-query.php in the patch implement what we've discussed today during office-hours (plus I ignore update_site_cache completely, since it doesn't affect the query at all). I consider this a minor improvement and wanted to do it at the same time, but we could also take this out for now to make the patch better readable and focus on the more important things first.

So looking at the type of queries that WP_Site_Query is used for, these can be broken into 2, queries for a list of sites or query for a single site by domain / path / network_id. In my patch, it detects if query is for one site and uses different caching rules. If you are looking for a single site, then generate a cache key based on a domain/path/network_id. The cache lives forever, until it is invalidated in clean_blog_cache. This cache will have some massive improvements to profile of WordPress, but cahce invalidation will be the trick here.

I think if my patch in some form goes into core, it will have to go with #42284 to see the benefits.

The logic for creating a special last_changed prefix is a bit more flexible: Now any kind of permutation of the domain, path and network_id arguments or subset (as sole arguments of course) will determine a special prefix. By iterating through the arguments in a predefined order, it is ensured that there is only one key for the same argument values.

The logic in clean_blog_cache() has been adjusted so that it clears all possibly relevant last_changed caches. The most complicated part to figure out here was how to reliably create all relevant subsets to clear, but eventually I found ​https://gist.github.com/christophervalles/1066801 to help me out.

Both the logic in WP_Site_Query::get_cache_key() and clean_blog_cache() ensure that all relevant values before serializing are parsed into strings (in case they aren't already). This is necessary to ensure that they can be reliably cleared (think about for example $network_id which may be a string or integer).

The $ignore_args array (containing arguments which are irrelevant for determining a last_changed value) is brought back, any particular concerns why you stripped that out in your patch @spacedmonkey?

With this current behavior, the only regarding the arguments we'd need to worry about in the future is to keep them in sync between WP_Site_Query::get_cache_key() and clean_blog_cache().

Still much needed are unit tests, which I will work on in the next week or two. Let's of course keep continuing to discuss the approach and patch in the meantime.

42252.7.diff​ applies cleanly against trunk after the coding standards commit, and also fixes a bug in WP_Site_Query::get_cache_key() where a temporary loop variable would accidentally override another variable used later on, previously causing two tests to fail.

It essentially adds only one test, however it uses systematically generated datasets for it, so that all cases of domain/path/network ID combinations are covered, making it sufficient for the change. In addition, each set is tested with both a random site field updated and then again with a field updated that is used as orderby in the query, to ensure those fields cause a change accordingly.

You will notice that some existing tests have been modified. However this is only to account for the three new sites that have been added in wpSetUpBeforeClass() and are used for the new test.

Missing are tests that ensure that when one of the special fields is changed on a site, the caches for both the new and old value are cleared. However those would currently fail anyway since WordPress doesn't have logic for this in place. #40364 implements that logic, therefore I suggest we merge that one first (when it's ready), and then proceed with this one here.