I was playing around with Constrained Execution Regions tonight to better round out my understanding of the finer details. I have used them on occasion before, but in those cases I mostly adhered strictly to established patterns. Anyway, I noticed something peculiar that I cannot quite explain.
Consider the following code. Note, I targeted .NET 4.5 and I tested it with a Release build without the debugger attached.
    using System;
    using System.Runtime.CompilerServices;
    using System.Threading;

    public class Program
    {
        public static void Main(string[] args)
        {
            bool toggle = false;
            bool didfinally = false;
            var thread = new Thread(
                () =>
                {
                    Console.WriteLine("running");
                    // Must immediately precede the try block; marks the
                    // finally block below as a constrained execution region.
                    RuntimeHelpers.PrepareConstrainedRegions();
                    try
                    {
                        // Spin forever so the abort has to arrive inside the try block.
                        while (true)
                        {
                            toggle = !toggle;
                        }
                    }
                    finally
                    {
                        didfinally = true;
                    }
                });
            thread.Start();
            Console.WriteLine("sleeping");
            Thread.Sleep(1000);
            Console.WriteLine("aborting");
            thread.Abort();
            Console.WriteLine("aborted");
            thread.Join();
            Console.WriteLine("joined");
            Console.WriteLine("didfinally=" + didfinally);
            Console.Read();
        }
    }
What do you think the output of this program will be?
1. didfinally=True
2. didfinally=False
Before you guess, read the documentation. I include the pertinent sections below.
A constrained execution region (CER) is part of a mechanism for authoring reliable managed code. A CER defines an area in which the common language runtime (CLR) is constrained from throwing out-of-band exceptions that would prevent the code in the area from executing in its entirety. Within that region, user code is constrained from executing code that would result in the throwing of out-of-band exceptions. The PrepareConstrainedRegions method must immediately precede a try block and marks catch, finally, and fault blocks as constrained execution regions. Once marked as a constrained region, code must only call other code with strong reliability contracts, and code should not allocate or make virtual calls to unprepared or unreliable methods unless the code is prepared to handle failures. The CLR delays thread aborts for code that is executing in a CER.
and
The reliability try/catch/finally is an exception handling mechanism with the same level of predictability guarantees as the unmanaged version. The catch/finally block is the CER. Methods in the block require advance preparation and must be noninterruptible.
My particular concern right now is guarding against thread aborts. There are two kinds: your normal variety via `Thread.Abort`, and then the one where a CLR host can go all medieval on you and do a forced abort. `finally` blocks are already protected against `Thread.Abort` to some degree. Then, if you declare that `finally` block as a CER, you get added protection from CLR host aborts as well... at least I think that is the theory.
So, based on what I think I know, I guessed #1. It should print didfinally=True. The `ThreadAbortException` gets injected while the code is still in the `try` block, and then the CLR allows the `finally` block to run, as would be expected even without a CER, right?
Well, this is not the result I got. I got a totally unexpected result. Neither #1 nor #2 happened for me. Instead, my program hung at `Thread.Abort`. Here is what I observe.
- The presence of `PrepareConstrainedRegions` delays thread aborts inside `try` blocks.
- The absence of `PrepareConstrainedRegions` allows them in `try` blocks.
So the million dollar question is why? The documentation does not mention this behavior anywhere that I can see. In fact, most of the stuff I am reading actually suggests that you put critical uninterruptible code in the `finally` block specifically to guard against thread aborts.
Perhaps `PrepareConstrainedRegions` delays normal aborts in a `try` block in addition to the `finally` block, but CLR host aborts are only delayed in the `finally` block of a CER? Can anyone provide more clarity on this?
[Cont'd from comments]
I will break my answer into two parts: CER and handling ThreadAbortException.
I don't believe a CER is intended to help with thread aborts in the first place; these are not the droids you're looking for. It's possible I'm misunderstanding the statement of the problem as well (this stuff tends to get pretty heavy), but the phrases I found to be key in the documentation (admittedly, one of which was actually in a different section than I mentioned) were:
The code cannot cause an out-of-band exception
and
user code creates non-interruptible regions with a reliable try/catch/finally that *contains an empty try/catch block* preceded by a PrepareConstrainedRegions method call
Despite not originating directly in the constrained code, a thread abort is an out-of-band exception. A constrained region only guarantees that, once the finally block is executing and as long as it obeys the constraints it has promised, it will not be interrupted by managed runtime operations that would otherwise not interrupt unmanaged finally blocks. Thread aborts interrupt unmanaged code just as they interrupt managed code, but without constrained regions there are some guarantees, and probably also a different recommended pattern, for the behavior you may be looking for. I suspect this primarily functions as a barrier against thread suspension for garbage collection (probably by switching the thread out of preemptive garbage collection mode for the duration of the region, if I had to guess). I could imagine using this in combination with weak references, wait handles, and other low-level management routines.
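The shape described in the second quote (an empty `try` with the critical work in the `finally`) can be sketched roughly as follows. This is a minimal illustration assuming .NET Framework, as in the question; `CerPatternSketch`, `_counter`, and `Increment` are hypothetical names made up for the example. On .NET Core and later the CER APIs are marked obsolete and the feature is not supported, so there the code only demonstrates the shape.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.ConstrainedExecution;

public static class CerPatternSketch
{
    // Hypothetical state that must be updated without interruption.
    private static int _counter;

    // The contract declares that this method will not corrupt state and
    // is expected to succeed when called from inside a CER.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    private static void Increment()
    {
        _counter++; // no allocation, no virtual calls
    }

    public static int Run()
    {
        // Must immediately precede the try block.
        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
            // Intentionally empty: the try block itself is not the CER.
        }
        finally
        {
            // The finally block is the constrained region; it only calls
            // code that carries a reliability contract.
            Increment();
        }
        return _counter;
    }
}
```

Note that the spinning loop in the question sits in the `try`, which is exactly the part this pattern leaves empty.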
As for the unexpected behavior, my thoughts are that you did not meet the contract you promised by declaring the constrained region, so the result is not documented and should be considered unpredictable. It does seem odd that the thread abort would be deferred in the try, but I believe this to be a side effect of unintended usage, which is only worth exploring further for academic understanding of the runtime (a class of knowledge that is volatile, since the behavior is not guaranteed and future updates could change it).
Now, I'm not sure what the extent of said side effects is when using the above-mentioned in unintended ways, but if we exit the context of using the force to influence our controlling body and let things run the way they normally would, we do get some guarantees:
With that, here is a sample of techniques meant to be used in cases where abort resiliency is necessary. I have mixed multiple techniques in a single sample, which you generally would not need to use at the same time, just to give you a sampling of options depending on your needs.
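One way such a sample might look, as a minimal sketch rather than a definitive implementation: it combines catching `ThreadAbortException`, cancelling the abort with `Thread.ResetAbort`, and cleanup in a plain `finally`. It assumes .NET Framework, as in the question (on .NET Core and later `Thread.Abort` throws `PlatformNotSupportedException`); `AbortHandlingSketch`, `AbortObserved`, and `CleanupRan` are hypothetical names for the demo.

```csharp
using System;
using System.Threading;

public static class AbortHandlingSketch
{
    // Hypothetical flags so the caller can see which blocks ran.
    public static bool AbortObserved;
    public static bool CleanupRan;

    public static void Run()
    {
        var started = new ManualResetEventSlim(false);
        var worker = new Thread(() =>
        {
            try
            {
                started.Set(); // signal from inside the try block
                while (true)
                {
                    Thread.Sleep(10); // a pending abort can be delivered here
                }
            }
            catch (ThreadAbortException)
            {
                AbortObserved = true;
                // Without ResetAbort the runtime re-raises the exception
                // at the end of this catch block.
                Thread.ResetAbort();
            }
            finally
            {
                // Ordinary finally: runs after the catch either way.
                CleanupRan = true;
            }
        });
        worker.IsBackground = true; // do not keep the process alive if the abort fails
        worker.Start();
        started.Wait();  // ensure the worker is inside the try before aborting
        worker.Abort();  // assumption: .NET Framework only
        worker.Join();
    }
}
```

None of this uses `PrepareConstrainedRegions`; the `catch` and `finally` blocks rely only on the ordinary abort semantics.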
I'm no expert on CER, so anybody please let me know if I've misunderstood. I hope this helps :)