Blog/ How to optimize a Drupal View so that it uses INNER JOIN instead of LEFT JOIN

In a recent project, we needed to show a list of 5 most recently updated users. To do this, we added a Unix timestamp field “User Updated” to the user object and created a simple view with required fields and ordered by User Updated field in descending order. Here is how the view looks like.

The problem was that the site had over 80,000 users and it took over 1.10 seconds to execute this view’s SQL query every time.

This was unacceptable and we decided that we had to optimize the query or the view to make it faster. Let’s look at the query in more detail:

You will notice that MySQL first gets all the information from the users table, puts in a temporary file and then joins all those entries with the field_data_field_user_updated table, adding the last updated value for each user. Then it sorts all the results by the last updated value in descending order and returns us the top 5. This query is inefficient in two ways:

For the users table, it is not using any index.

Instead of starting with the users table, it should start with the field_data_field_user_updated table, get the 5 most recently updated uids and then get all the fields for that user from the users table.

Add an index to the user table:

The users table uses the following fields: uid, status, picture, name and mail. Since it’s a limited number of fields, we decided to create a covering index. Here is the command:

After adding this index, the view executed in around 1 second, which is a gain of 0.10 seconds from before.

Start with field_data_field_user_updated table:

MySQL starts with users table because the view creates a left join. It can not omit any entry in the users table even if there is no corresponding entry in the field_data_field_user_updated table. But we have more information about our application than MySQL does. We know that there will be at least 5 out of 80,000 users who have been updated at least once and hence will have an entry in the field_data_field_user_updated table. So filtering on field_data_field_user_updated table first makes more sense. This effectively turns LEFT JOIN into INNER JOIN. If this were an INNER JOIN, MySQL will apply its optimizations and will use the field_data_field_user_updated table first. We confirmed this by replacing LEFT JOIN with INNER JOIN in the MySQL command line. The new query becomes:

You will notice that now MySQL uses field_data_field_user_updated first.

The next step is how to structure the View so that it uses an INNER JOIN instead of LEFT JOIN. Unfortunately with the fields, Views always uses LEFT JOIN so this was not possible. But we could give enough information via the View to the MySQL so that it uses field_data_field_user_updated table first. To do this, we added a NOT EMPTY filter on the User Updated field in the view.

This change adds an additional WHERE clause in the SQL query. Note that LEFT JOIN does not change to INNER JOIN. The new SQL query is:

Because of this condition, MySQL considers LEFT JOIN and INNER JOIN to be equivalent and effectively treats this as an INNER JOIN. As a result, field_data_field_user_updated table is first used to find 5 most recently updated uids and then corresponding fields are loaded from the user table. This change reduced the time execution time from about 1 second to about 0.13 seconds.

If you have any insights or have another way to optimize such queries, feel free to write a comment below.