Dataflow (Task Parallel Library) and async await

Let's say I use the Dataflow blocks in .NET. It is stated that "This dataflow model promotes actor-based programming", which is exactly what I want to get at here.

However, if I process messages from, say, a BufferBlock<T> and decide to use async/await in the message processor, this will fork execution into the current thread and a worker thread for the awaited task.

Is there any way to prevent concurrent execution within the actor/message processor here?

It's fine if the awaited task executes using non-blocking I/O operations with native callbacks, but I'd really like to ensure that any .NET code never executes concurrently.

There are 2 answers

svick (best answer)

This very much depends on how exactly you will process those messages.

If you use an ActionBlock with an async action, leave its MaxDegreeOfParallelism unset (so the default value of 1 is used), and link it to the BufferBlock, then the action will execute for one message at a time; there will be no parallelism.
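As a rough sketch of that first setup (not the asker's actual code): the class name, the concurrency counter, and the Task.Delay standing in for real I/O are all made up for illustration. It assumes the System.Threading.Tasks.Dataflow package is available.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class SequentialActionBlockDemo
{
    // Returns the highest number of handler invocations that overlapped.
    public static async Task<int> RunAsync()
    {
        var gate = new object();
        int active = 0;     // handlers currently in flight
        int maxActive = 0;  // highest value of 'active' observed

        var bufferBlock = new BufferBlock<int>();

        // No options are passed, so MaxDegreeOfParallelism keeps its default of 1.
        var actionBlock = new ActionBlock<int>(async message =>
        {
            lock (gate) { active++; maxActive = Math.Max(maxActive, active); }
            await Task.Delay(20); // hypothetical stand-in for non-blocking I/O
            lock (gate) { active--; }
        });

        bufferBlock.LinkTo(actionBlock, new DataflowLinkOptions { PropagateCompletion = true });

        for (int i = 0; i < 5; i++)
        {
            bufferBlock.Post(i);
        }

        bufferBlock.Complete();
        await actionBlock.Completion;
        return maxActive;
    }

    public static async Task Main()
    {
        // With the default settings the handlers should never overlap.
        Console.WriteLine(await RunAsync());
    }
}
```

The block only picks up the next message once the task returned by the async delegate has completed, which is why the counter should stay at 1 here.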

If you process the messages manually using a loop like this:

while (await bufferBlock.OutputAvailableAsync())
{
    var message = await bufferBlock.ReceiveAsync();
    await ProcessMessageAsync(message);
}

Then the messages will be processed one at a time too.

But in both cases, it doesn't mean that a message will be processed by a single thread. It may be processed by multiple threads, but not in parallel. This is because after an await, execution can resume on a different thread than the one where it was paused.
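To see the thread hop for yourself, something like the following could be used. It is only a sketch: this ProcessMessageAsync is a hypothetical handler, and whether the two thread IDs actually differ depends on the environment (for example, whether a SynchronizationContext is present), so no particular output is promised.

```csharp
using System;
using System.Threading.Tasks;

public static class ThreadHopDemo
{
    // Hypothetical handler that records which thread runs each half of the method.
    public static async Task<(int Before, int After)> ProcessMessageAsync(string message)
    {
        int before = Environment.CurrentManagedThreadId;

        // Stand-in for real asynchronous work; no thread is blocked while waiting.
        await Task.Delay(50).ConfigureAwait(false);

        // The continuation runs on whatever thread the scheduler picks.
        int after = Environment.CurrentManagedThreadId;
        return (before, after);
    }

    public static async Task Main()
    {
        var (before, after) = await ProcessMessageAsync("hello");
        Console.WriteLine($"before await: thread {before}, after await: thread {after}");
    }
}
```

Even when the IDs differ, the two halves run one after the other, never at the same time.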

If you use some other way of processing messages (e.g. using the loop above, but omitting the await before ProcessMessageAsync()), then multiple messages could be processed concurrently.
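For comparison, here is a sketch of the loop with the await before ProcessMessageAsync() left out. The handler, the counter, and the 100 ms delay are invented for the demonstration; the tasks are collected only so the program can wait for them at the end.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class MissingAwaitDemo
{
    private static readonly object Gate = new object();
    private static int _active;
    private static int _maxActive;

    // Hypothetical handler: counts how many invocations overlap.
    private static async Task ProcessMessageAsync(int message)
    {
        lock (Gate) { _active++; _maxActive = Math.Max(_maxActive, _active); }
        await Task.Delay(100); // stand-in for asynchronous work
        lock (Gate) { _active--; }
    }

    // Returns the highest number of handler invocations that overlapped.
    public static async Task<int> RunAsync()
    {
        lock (Gate) { _active = 0; _maxActive = 0; }

        var bufferBlock = new BufferBlock<int>();
        for (int i = 0; i < 5; i++)
        {
            bufferBlock.Post(i);
        }
        bufferBlock.Complete();

        var pending = new List<Task>();
        while (await bufferBlock.OutputAvailableAsync())
        {
            var message = await bufferBlock.ReceiveAsync();
            // Not awaited: the loop moves on while this handler is still running.
            pending.Add(ProcessMessageAsync(message));
        }

        await Task.WhenAll(pending);
        return _maxActive;
    }

    public static async Task Main()
    {
        // Expected to be greater than 1, since handlers start before earlier ones finish.
        Console.WriteLine(await RunAsync());
    }
}
```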

Panagiotis Kanavos

You misunderstand what await does. It doesn't fork anything; it simply waits for the result of an already asynchronous operation.

Methods marked with the async keyword do not automatically become asynchronous. Execution continues asynchronously only when the method awaits an operation that has not yet completed. The await keyword marks the point where execution should resume once that operation completes.
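A small sketch of that point, with made-up method names: the first method never hits an incomplete operation, so it has already finished by the time the call returns; the second awaits a Task.Delay and hands back a task that is still pending.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncKeywordDemo
{
    // Marked async, but nothing it awaits is still pending: runs synchronously.
    public static async Task<int> NoRealAsyncWorkAsync()
    {
        await Task.CompletedTask; // already completed, so execution just continues
        return 42;
    }

    // Awaits an operation that has not finished yet: returns to the caller at the await.
    public static async Task<int> RealAsyncWorkAsync()
    {
        await Task.Delay(200);
        return 42;
    }

    public static async Task Main()
    {
        Task<int> first = NoRealAsyncWorkAsync();
        Console.WriteLine(first.IsCompleted);  // True: the method ran to the end synchronously

        Task<int> second = RealAsyncWorkAsync();
        Console.WriteLine(second.IsCompleted); // expected False: the delay is still pending

        await Task.WhenAll(first, second);
    }
}
```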

No ThreadPool threads are wasted or harmed while awaiting, so you shouldn't be trying to limit, prevent or circumvent this. In fact, you get better scalability when using asynchronous operations, because the ThreadPool threads used by TPL Dataflow do not block while waiting for long-running asynchronous operations like I/O or web service calls.