Russian doll fragment caching with auto expiring keys - performance pros and cons


The question is based on 2 articles:
- Basecamp Next by DHH from 37signals
- Advanced Caching in Rails by Adam Hawkins

I'm a little confused about the performance implications of using Russian doll caching. Specifically:

  1. When using auto expiring keys it seems like every request will result in an access to the database to fetch the object timestamp - am I missing something? (I understand that in the best case scenario you have to do that only for the top level key in the hierarchy, but still ...)

  2. In the 1st article they cache a todo list, as well as every todo item. Caching the list makes perfect sense as it saves a lot of work (a DB query for all the items). But why cache the individual items? You are already going to the database to get the item's timestamp, so what exactly are you saving? Generating a few lines of HTML?

  3. In the 2nd article Adam caches chunks of the view like this: cache [post, 'main-content'] ... cache [post, 'comments']. When a comment is added it changes the timestamp of the post, and therefore invalidates both entries. However, main-content hasn't changed, so you don't want to regenerate it! How would one go about invalidating only comments? (That's actually a very common use case: a model that has a few logically independent parts, such as the object itself, its different associations, data in some other store, etc.)
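
To make point 3 concrete, here is a minimal plain-Ruby sketch of how both fragment keys end up depending on the same timestamp. It does not use Rails; the Post struct, the fragment_key helper, and the exact key format are assumptions made for illustration, only loosely modelled on what cache [post, 'main-content'] produces.

```ruby
# Hypothetical stand-in for an ActiveRecord model; only id and updated_at matter here.
Post = Struct.new(:id, :updated_at) do
  # Mimics the general shape of a timestamp-based cache key: "posts/<id>-<timestamp>".
  def cache_key
    "posts/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

# Assumed helper: builds a fragment key from the model key plus a fragment name.
def fragment_key(post, name)
  "#{post.cache_key}/#{name}"
end

post = Post.new(1, Time.utc(2013, 1, 1, 12, 0, 0))
main_before     = fragment_key(post, "main-content")
comments_before = fragment_key(post, "comments")

# Simulates a new comment touching its parent post.
post.updated_at = Time.utc(2013, 1, 1, 12, 5, 0)

main_after     = fragment_key(post, "main-content")
comments_after = fragment_key(post, "comments")

puts main_before # posts/1-20130101120000/main-content
puts main_after  # posts/1-20130101120500/main-content
```

Both keys change after the touch, so the main-content fragment is re-rendered even though only the comments changed.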

To me it seems like Russian doll caching makes sense only when you have a deep hierarchy of nested objects (in Basecamp you have project -> todo lists -> todo list -> items). However, if you have a shallow hierarchy, it's better to just do the invalidation yourself.

Any feedback would be appreciated!
Thanks.


There are 2 answers

Kelvin On BEST ANSWER
  1. The top-level does need to hit the database. You can avoid this by storing the timestamp in a separate cache entry, keyed by model and id. One of the commenters on article 1 (Manuel F. Lara) suggested the same: "Is there another cache like projects/15-time where you always have the last timestamp for the projects list?"

  2. I think you're right about the "lowest" level in the nesting. You might need to do some testing to see the relative performance of DB access vs rendering the tiny partial.

  3. Another good point. According to the Rails docs, if you pass a symbol to the :touch option of belongs_to, it will update that attribute on the parent in addition to updated_at. Maybe there's a way to skip changing Post#updated_at and only update a column like comments_updated_at. Then you can use the latter for caching the comments fragment. But if you're trying to avoid DB access, you'll have to store yet another cache key for this timestamp (like in #1 above).
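
The idea in #1 (a side entry such as projects/15-time holding the latest timestamp) can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: TinyCache is a hypothetical in-memory stand-in for a real cache store, load_timestamp_from_db simulates the query, and record_update would have to be called from some after-save hook in a real app.

```ruby
# Hypothetical in-memory stand-in for a cache store such as memcached.
class TinyCache
  def initialize
    @data = {}
  end

  def write(key, value)
    @data[key] = value
  end

  # Returns the cached value, or computes, stores, and returns it on a miss.
  def fetch(key)
    return @data[key] if @data.key?(key)
    @data[key] = yield
  end
end

CACHE = TinyCache.new
DB_READS = Hash.new(0) # counts simulated database hits per post id

# Simulated database lookup of a post's timestamp.
def load_timestamp_from_db(post_id)
  DB_READS[post_id] += 1
  "20130101120000"
end

# Reads the timestamp from a side entry like "posts/15-time"; hits the DB only on a miss.
def post_timestamp(post_id)
  CACHE.fetch("posts/#{post_id}-time") { load_timestamp_from_db(post_id) }
end

# Assumed to be called whenever the post is saved, so the side entry stays current.
def record_update(post_id, timestamp)
  CACHE.write("posts/#{post_id}-time", timestamp)
end

def render_post(post_id)
  key = "posts/#{post_id}-#{post_timestamp(post_id)}/main-content"
  CACHE.fetch(key) { "<article>post #{post_id}</article>" }
end

3.times { render_post(15) }
puts DB_READS[15] # 1: only the first request reached the simulated database
```

The trade-off is that the side entry becomes one more thing to keep in sync: if it is evicted or not updated on save, you either pay the DB read again or serve a stale fragment.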

I guess you have to decide whether all this is worth the trouble to you. The 2 articles show simple, contrived examples to teach the principles. In an app with complex associations, the "generational" caching method may be more manageable.
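
For comparison, a rough sketch of the generational approach in plain Ruby: every key in a namespace embeds a generation number, and any write bumps that number so all old keys are simply never read again. The STORE hash and the generation, bump_generation, and cached helpers are hypothetical names for illustration, not an existing library API.

```ruby
# Hypothetical in-memory cache; a real app would use a shared store.
STORE = {}

# Current generation for a namespace, starting at 1.
def generation(namespace)
  STORE["gen/#{namespace}"] ||= 1
end

# Bumping the generation orphans every key built from the old one.
def bump_generation(namespace)
  STORE["gen/#{namespace}"] = generation(namespace) + 1
end

# Caches the block's result under a key that includes the current generation.
def cached(namespace, name)
  key = "#{namespace}/v#{generation(namespace)}/#{name}"
  STORE.fetch(key) { STORE[key] = yield }
end

renders = 0
2.times { cached("posts", "index") { renders += 1; "<ul>posts</ul>" } }
renders_before_bump = renders

bump_generation("posts") # e.g. after any post or comment is written
cached("posts", "index") { renders += 1; "<ul>posts</ul>" }
renders_after_bump = renders
```

This invalidates more than strictly necessary, but there is no per-object timestamp to look up and no association graph to touch.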

Prathan Thananart On
  1. Yes. But usually the best-case and pretty-good-case scenarios come up pretty often.
  2. Yes. In some apps I develop, rendering the view takes 10-20 times as long as executing the queries. Look at your benchmarks.
  3. If rendering the post is costly, you may not want to touch the post but instead compose the comment list's cache key from [@post.comments.maximum(:updated_at), @post.comments.size].
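
A plain-Ruby sketch of the key composition in #3, using an array of structs in place of the association. The Comment struct and comments_cache_key helper are assumptions for illustration; in a Rails app the two ingredients would come from the association itself, e.g. the newest updated_at via maximum(:updated_at) and the number of comments.

```ruby
# Hypothetical stand-in for a comment record.
Comment = Struct.new(:id, :updated_at)

# Builds a key that changes when a comment is added, edited, or removed,
# without depending on the parent post's updated_at.
def comments_cache_key(post_id, comments)
  latest = comments.map(&:updated_at).max
  stamp  = latest ? latest.utc.strftime("%Y%m%d%H%M%S") : "none"
  "posts/#{post_id}/comments/#{stamp}-#{comments.size}"
end

comments = [
  Comment.new(1, Time.utc(2013, 1, 1, 12, 0, 0)),
  Comment.new(2, Time.utc(2013, 1, 1, 12, 5, 0))
]

key_initial = comments_cache_key(1, comments)

# Deleting an older comment leaves the newest timestamp alone but changes the count.
key_after_delete = comments_cache_key(1, comments.drop(1))

# Editing a comment bumps its updated_at, which changes the newest timestamp.
comments[0].updated_at = Time.utc(2013, 1, 1, 12, 10, 0)
key_after_edit = comments_cache_key(1, comments)

puts key_initial # posts/1/comments/20130101120500-2
```

Including the count matters because deleting a comment that is not the newest one would otherwise leave the key unchanged.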