Self-joined HABTM with Cache Updates

You have a self-joined HABTM (has_and_belongs_to_many) table. It’s probably called User too. All is well until you realize that when relationships change, your site does not display those changes. The reason is that the related objects (users) are not updated when a relation is created or removed, or when one of the objects is updated or destroyed. The cache keys that Rails generates for the parent object therefore still reference old data.

Cache keys come into play in views. When a view generates its HTML, the cache method asks your caching system (most likely memcached) whether a key exists. If it does, the fragment is pulled from the cache without the expense of generating the HTML. If the key doesn’t exist, the view renders the HTML and then stores the result in the cache. Here is a very simple example.

    = cache @user do
      %p
        %strong Name:
        = @user.name
      %p
        %strong Children:
        = @user.children.map(&:name).join(', ')

When rendered, this will display the user’s name, and the names of all their children.
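
To make the lookup concrete, here is a minimal plain-Ruby sketch of the fetch-or-render logic described above. It is not how Rails implements fragment caching: the FragmentCache class, the Hash standing in for memcached, and the simplified key format are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for a model: only the fields the cache key needs.
User = Struct.new(:id, :name, :updated_at) do
  # Simplified key (an assumption): model name, id, and updated_at timestamp.
  def cache_key
    "users/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

# Hypothetical cache wrapper; a Hash stands in for memcached.
class FragmentCache
  attr_reader :renders

  def initialize
    @store = {}
    @renders = 0
  end

  # Return the stored fragment for this record's key, or render and store it.
  def fetch(record)
    @store.fetch(record.cache_key) do
      @renders += 1
      @store[record.cache_key] = yield
    end
  end
end

cache = FragmentCache.new
user = User.new(1, "Ann", Time.utc(2014, 1, 1))

cache.fetch(user) { "<p>#{user.name}</p>" } # miss: renders and stores
cache.fetch(user) { "<p>#{user.name}</p>" } # hit: the block is not called

user.updated_at = Time.utc(2014, 1, 2)      # a new timestamp means a new key
cache.fetch(user) { "<p>#{user.name}</p>" } # miss again, so it re-renders

puts cache.renders # prints 2
```

The point to take away is that nothing ever deletes the old fragment. The view only stops reading it once the record's key changes.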

Changing Relationship Doesn’t Change the Rendered View

All is going well until you start getting reports of problems. A user deletes one of her children, but when she views her show page (see the view above), she still sees the deleted child’s name. The reason is that when you make a change to a HABTM relationship, it only removes the row in the uses_relationships table. It doesn’t update the updated_at datetime for the parent record. Thus, the cache key for the parent record does not change, and the old (incorrect) information is read from the cache.
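
The failure mode can be sketched in a few lines of plain Ruby. The Record struct, the join_rows array standing in for the join table, and the simplified key format are assumptions for illustration, not Rails internals.

```ruby
# Hypothetical stand-in for a user row: an id and a timestamp.
Record = Struct.new(:id, :updated_at) do
  # Simplified key (an assumption): it depends only on id and updated_at.
  def cache_key
    "users/#{id}-#{updated_at.to_i}"
  end
end

adult = Record.new(1, Time.at(1_000))
join_rows = [[1, 2], [1, 3]] # [adult_id, child_id] pairs in the join table

key_before = adult.cache_key

# Removing a relationship only deletes a join row; the adult row is untouched.
join_rows.delete([1, 3])
key_after_delete = adult.cache_key

# Bumping updated_at is what actually changes the key.
adult.updated_at = Time.at(2_000)
key_after_touch = adult.cache_key

puts key_before == key_after_delete # true: the stale fragment is still served
puts key_before == key_after_touch  # false: the fragment would be regenerated
```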

How To Fix

I wasn’t able to find a definitive example of how to fix this completely and correctly for all cases. So here is my implementation.

What you need to do is “touch” all related objects in a relationship any time that relationship changes. Those changes could be:

1. The parent (adult) object is updated.

2. Any child object is updated.

3. A relationship is created.

4. A relationship is deleted.

To fix the first two cases, we just need a couple of callbacks.

    after_update :touch_related_users
    before_destroy :touch_related_users

    def touch_related_users
      if self.child?
        self.adults.update_all(updated_at: Time.now)
      elsif self.adult?
        self.children.update_all(updated_at: Time.now)
      end
    end

Astute coders will notice that we used update_all instead of touch here. While we could have used touch, it could set off a circular chain of updates that never terminates. The touch method causes each touched object to also execute any touches defined on its own relationships, whereas update_all issues a single SQL UPDATE and skips callbacks.

In our particular case, we are not concerned about touching any additional relationships, because we are only updating the updated_at attribute to trigger a cache invalidation. We aren’t actually changing the related objects’ data, such as a name or birthday.

Now we need to fix the last two cases, when relationships are created or deleted.

Note that we added after_add: :touch_updated_at, after_remove: :touch_updated_at to the HABTM definition. These callbacks are triggered when a relationship is added or removed. They both call the touch_updated_at method which, like the callbacks above, updates the updated_at attribute. In this case, though, the HABTM association passes a reference to the user object that was added or removed.

Here we use the update_column method because we do not want to trigger any callbacks, validations, or cascading touches.
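
The listings for these two steps are not shown above, so what follows is only a sketch of how the pieces might fit together, not the original code. The module name TouchesRelatedUsers, the adult_id and child_id column names, and the choice to bump both sides of the relationship are all assumptions. The join table name is taken from the text above. It is written as a module with a guard in included so that the callback can also be exercised outside of Rails.

```ruby
# Sketch only: meant to be included in the User model (hypothetical module name).
module TouchesRelatedUsers
  def self.included(base)
    # Outside ActiveRecord there is no association to declare.
    return unless base.respond_to?(:has_and_belongs_to_many)

    base.has_and_belongs_to_many :children,
      class_name: "User",
      join_table: "uses_relationships",
      foreign_key: "adult_id",             # assumed column name
      association_foreign_key: "child_id", # assumed column name
      after_add: :touch_updated_at,
      after_remove: :touch_updated_at
  end

  # The association calls this with the user that was added or removed.
  def touch_updated_at(user)
    now = Time.now
    [self, user].each do |record|
      # update_column writes straight to the database: no callbacks,
      # no validations, no cascading touches. It raises on unsaved
      # records, hence the persisted? guard.
      record.update_column(:updated_at, now) if record.persisted?
    end
  end
end
```

In the model this would be a single include TouchesRelatedUsers line inside the User class. Updating both the owner and the added or removed user is a deliberate choice in this sketch, so that fragments keyed on either side are invalidated.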

Fixed!

Now, any time a user adds a child, deletes a child, or data on any related child is changed, the updated_at attribute is set to the current datetime. This changes the calculated cache key for the user objects, and voilà, the view regenerates the HTML and users see the modifications.