Firstly, I am not much of an expert in multi-threading and parallel programming.
I am trying to optimize the performance of a legacy application (.Net 4
, NHibernate 2.1
).
**So far, upgrading NHibernate
is not a priority, but is in the pipeline.
Over time, performance has become a nightmare with the growth of data. One item I have seen is a Parallel.ForEach
statement that calls a method that fetches and updates a complex entity(with multiple relationships - propeties & collections).
The piece of code has the following form (simplified for clarity):
void SomeMethod(ICollection<TheClass> itemsToProcess)
{
Parallel.ForEach(itemsToProcess, item => ProcessItem(item);
}
TheClass ProcessItem(TheClass i)
{
var temp = NHibernateRepository.SomeFetchMethod(i);
var result = NHibernateRepository.Update(temp);
return result;
}
SQL Server intermittently reports database lock errors with the following error:
Transaction (Process ID 20) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction
I suspect it is due to some race condition happening leading up to a deadlock, even if ISessions
are separate.
The ICollection<TheClass>
can have up to 1000 items and each with properties and sub-collections that are processed, generating many SELECT
and UPDATE
statements (confirmed using 'NHibernate Profiler')
Is there a better way to handle this in a parallel way, or shall I refactor the code to a traditional loop?
I do know that I can alternatively implement my code using:
- A
foreach
loop in the sameISession
context - With a Stateless Session
- With
Environment.BatchSize
set to a reasonable value
OR
- Using SQL BulkCopy
I have also read quite a bit of good info about SQL Server deadlocks and Parallel.ForEach
being an easy pitfall:
This is a very complicated topic. There's one strategy that is guaranteed to be safe and probably will result in a speedup:
Retry in case of deadlock.
Since a deadlock rolls back the transaction you can safely retry the entire transaction. If the deadlock rate is low the parallelism speedup will be high.
The nice thing about retry is that you can make a simple code change in a central place.
Since it's not apparent from the code posted: Make sure, that threads do not share the session or entities. Neither of them are thread-safe.