How do I make NHibernate cache fetched child collections?

I've got a fairly simple criteria query that fetches child collections, like so:

var order = Session.CreateCriteria<Order>()
    .Add(Restrictions.Eq("Id", id))
    .SetFetchMode("Customer", FetchMode.Eager)
    .SetFetchMode("Products", FetchMode.Eager)
    .SetFetchMode("Products.Category", FetchMode.Eager)
    .SetCacheable(true)
    .UniqueResult<Order>();

Using NH Prof, I've verified that this makes just one round trip to the database (as expected) with a cold cache; however, on successive executions, it retrieves only the Order from the cache and then hits the database with a SELECT(N+1) for every child entity in the graph, as in:

Cached query: SELECT ... FROM Order this_ left outer join Customer customer2 [...]
SELECT ... FROM Customer WHERE Id = 123;
SELECT ... FROM Products WHERE Id = 500;
SELECT ... FROM Products WHERE Id = 501;
...
SELECT ... FROM Categories WHERE Id = 3;

And so on and so forth. Clearly it's not caching the whole query or graph, only the root entity. The first "cached query" line actually has all of the join conditions that it's supposed to - it's definitely caching the query itself correctly, just not the entities, apparently.

I've tried this using the SysCache, SysCache2, and even HashTable cache providers and I always seem to get this same behaviour (NH version 3.2.0).

Googling turned up a number of ancient issues. However, these all seem to have been fixed a long time ago, and I get the same bad behaviour regardless of which provider I use.

I've read through the nhibernate.info documentation on SysCache and SysCache2 and there doesn't seem to be anything I'm missing. I've tried adding cacheRegion lines to the Web.config file for all tables involved in the query, but it doesn't change anything (and AFAIK those elements are just to invalidate the cache, so they shouldn't matter anyway).

With all of these super-old issues that all seem to be fixed/resolved, I figure this can't possibly still be a bug in NHibernate, it must be something that I'm doing wrong. But what?

Is there something special I need to do when combining fetch instructions in NHibernate with the second-level cache? What am I missing here?

There are 2 answers

Answer by Aaronaught (best answer)

I did manage to figure this out, so other folks can finally get a straight answer:

To sum it up, I've been confused for a while on the difference between the second-level cache and the query cache; Jason's answer is technically correct but it somehow didn't click for me. Here is how I would explain it:

  • The query cache keeps track of which entities are emitted by a query. It does not cache the entire result set. It's the equivalent of doing a Session.Load on a lazy-loaded entity; it knows/expects that one exists but doesn't track any other information about it unless specifically asked, at which point it will actually load the real entity.

  • The second-level cache tracks the actual data for each entity. When NHibernate needs to load any entity by its ID (by virtue of a Session.Load, Session.Get, lazy-loaded relationship, or, in the case above, an entity "reference" that's part of a cached query), it will look in the second-level cache first.

Of course this makes perfect sense in hindsight, but it's not so obvious when you hear the terms "query cache" and "second-level cache" being used almost interchangeably in so many places.
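To make the distinction concrete, here is a toy model of the two caches. It is only a sketch: plain dictionaries stand in for the caches and the database, and ToyCaches, RunQuery and HitsAfterTwoRuns are made-up names, not NHibernate APIs. It also flattens the object graph into a list of IDs, so it shows the shape of the problem rather than NHibernate's actual behaviour.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: none of these types are NHibernate APIs.
public sealed class ToyCaches
{
    // Query cache: query key -> IDs of the entities the query returned.
    private readonly Dictionary<string, List<int>> queryCache =
        new Dictionary<string, List<int>>();

    // Second-level cache: entity ID -> entity data.
    private readonly Dictionary<int, string> entityCache =
        new Dictionary<int, string>();

    // Stand-in for the database.
    private readonly Dictionary<int, string> database;

    // Analogue of declaring <cache usage="..."/> on the entity mapping.
    private readonly bool entitiesCacheable;

    public int DatabaseHits { get; private set; }

    public ToyCaches(Dictionary<int, string> database, bool entitiesCacheable)
    {
        this.database = database;
        this.entitiesCacheable = entitiesCacheable;
    }

    public List<string> RunQuery(string queryKey, IReadOnlyList<int> matchingIds)
    {
        var results = new List<string>();

        if (queryCache.TryGetValue(queryKey, out var cachedIds))
        {
            // Query cache hit: only IDs are known, so every entity is
            // resolved one at a time.
            foreach (var id in cachedIds)
            {
                results.Add(LoadById(id));
            }
            return results;
        }

        // Query cache miss: a single round trip returns every row.
        DatabaseHits++;
        foreach (var id in matchingIds)
        {
            var data = database[id];
            if (entitiesCacheable) entityCache[id] = data;
            results.Add(data);
        }
        queryCache[queryKey] = new List<int>(matchingIds);
        return results;
    }

    private string LoadById(int id)
    {
        if (entityCache.TryGetValue(id, out var cached)) return cached;

        // Not in the second-level cache: one SELECT per entity.
        DatabaseHits++;
        var data = database[id];
        if (entitiesCacheable) entityCache[id] = data;
        return data;
    }
}

public static class Program
{
    // Runs the same "query" twice and reports the total database round trips.
    public static int HitsAfterTwoRuns(bool entitiesCacheable)
    {
        var db = new Dictionary<int, string>
        {
            { 123, "Customer" }, { 500, "Product" }, { 501, "Product" }, { 3, "Category" }
        };
        var caches = new ToyCaches(db, entitiesCacheable);
        var ids = new[] { 123, 500, 501, 3 };

        caches.RunQuery("order-graph", ids); // cold cache
        caches.RunQuery("order-graph", ids); // query cache is now warm
        return caches.DatabaseHits;
    }

    public static void Main()
    {
        Console.WriteLine(HitsAfterTwoRuns(false)); // prints 5 (1 query + 4 single loads)
        Console.WriteLine(HitsAfterTwoRuns(true));  // prints 1 (second run served from cache)
    }
}
```

With the entity cache switched off, the second run costs more round trips than the first, which is the same pattern NH Prof showed in the question.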

Essentially there are two sets of two settings each that you need to configure in order to see the expected results with query caching:

1. Enable both caches

In XML configuration, this means adding the following two lines:

<property name="cache.use_second_level_cache">true</property>
<property name="cache.use_query_cache">true</property>
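Those two properties only switch the caches on; a cache provider has to be named as well. As a sketch, assuming the SysCache provider from the NHibernate.Caches.SysCache assembly (substitute your own provider), the relevant part of the session-factory configuration might look like this:

```xml
<hibernate-configuration xmlns="urn:nhibernate-configuration-2.2">
  <session-factory>
    <!-- connection and dialect properties omitted -->
    <!-- Assumption: SysCache is the provider in use -->
    <property name="cache.provider_class">NHibernate.Caches.SysCache.SysCacheProvider, NHibernate.Caches.SysCache</property>
    <property name="cache.use_second_level_cache">true</property>
    <property name="cache.use_query_cache">true</property>
  </session-factory>
</hibernate-configuration>
```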

In Fluent NHibernate, it's this:

.Cache(c => c
    .UseQueryCache()
    .UseSecondLevelCache()
    .ProviderClass<SysCacheProvider>())

Please note the UseSecondLevelCache above because (at the time of this posting) it is never mentioned on the Fluent NHibernate wiki page; there are several examples of enabling the query cache but not the second-level cache!

2. Enable caching for each entity

Just enabling the second-level cache does pretty much nothing, and this is where I got tripped up. The second-level cache has to be not only enabled but configured for every single individual entity class that you want cached.

In XML, this is done inside the <class> element:

<cache usage="read-write"/>

In Fluent NHibernate (non-automap), it's done in the ClassMap constructor or wherever you put the rest of your mapping code:

Cache.ReadWrite().Region("Configuration");

This has to be done for every entity that is going to be cached. It's probably possible to set up in one place as a convention, but then you pretty much miss out on the ability to use regions (and in most systems you don't want to cache transactional data as much as configuration data).
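For the graph in the question, a set of Fluent mappings might look roughly like the sketch below. The entity classes and the "Orders" region name are assumptions made for illustration (the real classes aren't shown in the question, Customer is left out for brevity, and Products is assumed to be a one-to-many). As far as I can tell, collections are cached separately from the entities they contain, so the collection gets its own Cache call.

```csharp
using System.Collections.Generic;
using FluentNHibernate.Mapping;

// Hypothetical entities, inferred from the query in the question.
public class Category
{
    public virtual int Id { get; set; }
}

public class Product
{
    public virtual int Id { get; set; }
    public virtual Category Category { get; set; }
}

public class Order
{
    public virtual int Id { get; set; }
    public virtual IList<Product> Products { get; set; }
}

public class CategoryMap : ClassMap<Category>
{
    public CategoryMap()
    {
        Cache.ReadWrite().Region("Configuration");
        Id(x => x.Id);
    }
}

public class ProductMap : ClassMap<Product>
{
    public ProductMap()
    {
        Cache.ReadWrite().Region("Configuration");
        Id(x => x.Id);
        References(x => x.Category);
    }
}

public class OrderMap : ClassMap<Order>
{
    public OrderMap()
    {
        // "Orders" is a made-up region name for transactional data.
        Cache.ReadWrite().Region("Orders");
        Id(x => x.Id);

        // Cache the collection itself as well as the entities in it.
        HasMany(x => x.Products).Cache.ReadWrite().Region("Orders");
    }
}
```

Every class reachable from the cached query gets its own Cache call; leaving one out brings back the per-entity SELECTs for that type.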

And that's it. It's really not that hard to do but surprisingly difficult to find a good, complete example, especially for FNH.


One last point: The natural consequence of this is that it makes eager join/fetching strategies very unpredictable when used with the query cache. Apparently, if NHibernate sees that a query is cached, it will make no effort whatsoever to check first if all or even any of the actual entities are cached. It pretty much just assumes that they are, and tries to load each one up individually.

This is the reason for the SELECT N+1 disaster; it wouldn't be that big of a deal if NH noticed that the entities weren't in the second-level cache and just executed the query normally, as written, with fetches and futures and so on. But it doesn't do that; instead it tries to load every entity, and its relations, and its sub-relations, and its sub-sub-relations, and so on, one at a time.

So there is almost no point in using the query cache unless you've explicitly enabled caching for all of the entities in the entire graph, and even then, you'll want to be very careful (by way of expirations, dependencies, etc.) that cached queries don't outlast the entities that they are supposed to retrieve, otherwise you will just end up making the performance worse.
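One way to keep cached queries from outliving their entities is to put the query in its own cache region and give that region a shorter expiration in the provider's configuration. A minimal sketch, assuming a region named "ShortLivedQueries" (a made-up name) and the Order entity and session from the question:

```csharp
using NHibernate;
using NHibernate.Criterion;

public static class OrderQueries
{
    // Order is the mapped entity from the question; it is not defined here.
    public static Order GetOrderGraph(ISession session, int id)
    {
        return session.CreateCriteria<Order>()
            .Add(Restrictions.Eq("Id", id))
            .SetFetchMode("Customer", FetchMode.Eager)
            .SetFetchMode("Products", FetchMode.Eager)
            .SetFetchMode("Products.Category", FetchMode.Eager)
            .SetCacheable(true)
            // Hypothetical region; its expiration is set in the cache
            // provider's own configuration, not here.
            .SetCacheRegion("ShortLivedQueries")
            .UniqueResult<Order>();
    }
}
```

The idea is that the query region expires no later than the entity regions, so a cached list of IDs never points at entities that have already been evicted.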

Answer by Jason Meckley

A cached query stores only the IDs of the entities, not their values. Within a cached entity, only the IDs of related entities are cached. Therefore, if you don't cache all of the entities involved, as well as marking the related entities as cached, you could end up with a SELECT N+1.