Let's consider a worker role that:
- Hosts a WCF server
- Listens to a few Azure Storage Queues and Service Bus queues
The processing methods perform some Azure Storage I/O, HttpClient calls to external APIs, and Entity Framework calls.
Now I want my worker role to shut down gracefully, so that all pending operations are either finished or cancelled in a managed manner:
- Stop accepting any incoming requests once RoleEntryPoint.OnStop() is triggered. Does Azure do this for me? If not, how do I enforce it?
- Allow N seconds for any pending operations to complete.
- After N seconds, cancel any operations that are left. The cancellation must not exceed M seconds, so that N + M < 5 minutes. I believe 5 minutes is the time the Azure runtime is guaranteed to wait after it triggers OnStop() and before it terminates the process.
I'm imagining something like this:
public override void Run() {
    // create a cancellation token source
    try {
        // pass the token to all processing/listening routines
    }
    catch (Exception e) { }
}

public override void OnStop() {
    try {
        // trigger the cancellation token source
    }
    catch (Exception e) { }
}
The naive sample above assumes that all my processing routines are async from top to bottom (down to the EF/HttpClient calls). If that is the way to go, I need a working example that takes care of the preconditions (WCF host, queue listeners).
Open questions:
- How do I make sure no more incoming TCP requests are sent to my worker role after OnStop() is triggered? This is important for fitting the shutdown code into the 5-minute limit.
- How do I find concrete numbers for N and M, considering things like WCF channel timeouts, EF timeouts, etc. in the configuration file?
- Is this even possible for synchronous code?
See the official documentation for ServiceHost.Close(). To gracefully stop a WCF service from receiving new requests while allowing existing connections to continue, you could refer to this issue.
For listening to Service Bus queues, you could define a CancellationTokenSource object and invoke CancellationTokenSource.Cancel() once RoleEntryPoint.OnStop() is triggered. The listening loop can then check whether cancellation has been requested on that CancellationTokenSource, for example through its IsCancellationRequested property.
Per my understanding, you could call Task.Delay(TimeSpan.FromSeconds(N)).Wait() after you invoke CancellationTokenSource.Cancel() and terminate the WCF service in the OnStop function. Any operations still pending after that would be discarded when the worker role instance shuts down.
To choose a reasonable value for N, you could use Application Insights with your worker role to collect metrics, aiming to reduce the failed request rate while still letting the VM restart quickly and begin processing new requests. You could also refer to this tutorial about handling the Azure OnStop event.
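The cancellation check and the N/M time budget discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a drop-in worker role: WorkerSketch, Start, Stop and Processed are hypothetical names, no Azure SDK or WCF types are used, and Task.Delay stands in for real message handling. Using two token sources (one to stop taking new work, one to abort pending work) is a design choice of this sketch rather than something the answer prescribes. Waiting on the listener task with a timeout replaces the fixed Task.Delay(...).Wait() so that shutdown can return early when the work drains before N elapses.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a RoleEntryPoint subclass (assumption: no Azure types involved).
public sealed class WorkerSketch
{
    // Signalled first: tells the loop to stop picking up new messages.
    private readonly CancellationTokenSource _stopAccepting = new CancellationTokenSource();
    // Signalled after the grace period: aborts work that is still pending.
    private readonly CancellationTokenSource _abortPending = new CancellationTokenSource();
    private Task _listener = Task.CompletedTask;

    public int Processed { get; private set; }

    // Plays the role of Run(): starts the listening loop.
    public void Start(TimeSpan workPerMessage)
    {
        _listener = Task.Run(() => ListenAsync(workPerMessage));
    }

    private async Task ListenAsync(TimeSpan workPerMessage)
    {
        while (!_stopAccepting.IsCancellationRequested)
        {
            try
            {
                // Simulated handler; a real one would pass this token to its I/O calls.
                await Task.Delay(workPerMessage, _abortPending.Token);
                Processed++;
            }
            catch (OperationCanceledException)
            {
                break; // pending work was cancelled after the grace period
            }
        }
    }

    // Plays the role of OnStop(): returns true if pending work finished within
    // the grace period (N), false if it had to be cancelled (budget M).
    public bool Stop(TimeSpan grace, TimeSpan cancelBudget)
    {
        _stopAccepting.Cancel();
        if (_listener.Wait(grace)) return true;
        _abortPending.Cancel();
        _listener.Wait(cancelBudget);
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        var worker = new WorkerSketch();
        worker.Start(TimeSpan.FromMilliseconds(50));
        Thread.Sleep(200); // let the loop handle a few simulated messages
        bool drained = worker.Stop(TimeSpan.FromSeconds(2), TimeSpan.FromSeconds(1));
        Console.WriteLine($"drained={drained}, processed={worker.Processed}");
    }
}
```

In a real worker role, the body of Stop would live in OnStop(), with grace and cancelBudget chosen so that their sum stays under the 5-minute limit mentioned in the question, and the WCF host would be closed in the same method. Purely synchronous handlers cannot observe the second token while they are blocked, so for them only the "stop accepting new work, then wait" half of the sketch applies.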