I have a file with around 8000 employee records that I need to process by calling a REST API for each record. The sequential API calls take a long time, so I want to make them asynchronously in tasks and wait for all tasks to finish. I plan to have three tasks running at a time.
I have written the following code, but I'm concerned about race conditions or multi-threading problems since I am updating the employee entity inside the task. My understanding is I can update the entities but cannot call dbcontext methods. I know that the DBContext is not thread safe, so I am calling SaveChanges outside of the task loop. Can anyone review my code and let me know if I'm doing it right? Here's my pseudo code:
    private async Task TempMethod()
    {
        var dbcontext = new DBContext();
        var employees = dbcontext.Employees.ToList();
        var allTasks = new List<Task<APIResult>>();
        // Allow at most three API calls in flight at any time.
        var throttler = new SemaphoreSlim(initialCount: 3);
        foreach (var employee in employees)
        {
            await throttler.WaitAsync();
            allTasks.Add(
                Task.Run(async () =>
                {
                    try
                    {
                        var apiResult = await apiClient.Update(employee);
                        if (apiResult == "Success")
                        {
                            employee.lastupdatedby = "Importer";
                        }
                        apiResult.recordNumber = employee.recordNumber;
                        return apiResult;
                    }
                    finally
                    {
                        throttler.Release();
                    }
                }));
        }
        var results = await Task.WhenAll(allTasks);
        foreach (var result in results)
        {
            dbcontext.APIResults.Add(result);
        }
        // Save both the updated Employee and the Result entities.
        await dbcontext.SaveChangesAsync();
    }
Your code seems right to me, under these conditions:
- employee.lastupdatedby and apiResult.recordNumber are either public fields, or trivial properties backed by private fields (without side effects).
- The class of which apiClient is an instance is thread-safe.
- [...] employees anyway, even in case of an early exception. In other words, you don't want to complete ASAP in case of errors.
- [...] employees having their lastupdatedby updated.
- [...] employee in the employees list that failed).

As a side note, personally I would prefer to abstract the parallelization/throttling functionality into a separate helper method, instead of mixing threading and TPL mechanisms with my application code. I would like to link to a good-quality implementation of a ForEachAsync that returns results and is compatible with .NET 4.6.1, but I can't find any. Jon Skeet's implementation here is decent, but it doesn't have ideal behavior in case of exceptions, and it doesn't preserve the order of the results.
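
To illustrate what such a helper could look like, here is a minimal sketch, not a vetted library implementation. The class name AsyncHelpers, the method name ForEachAsync and its parameter names are my own assumptions. It uses only SemaphoreSlim, Task.Run and Task.WhenAll, so it should compile against .NET 4.6.1. It returns the results in the order of the source items, and it keeps processing the remaining items after a failure, which matches the behavior of the code in the question rather than a "fail fast" policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class; the names here are illustrative only.
public static class AsyncHelpers
{
    // Invokes `action` for every item, with at most `maxConcurrency`
    // invocations in flight, and returns the results in source order.
    public static async Task<TResult[]> ForEachAsync<TSource, TResult>(
        IEnumerable<TSource> source,
        int maxConcurrency,
        Func<TSource, Task<TResult>> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (maxConcurrency < 1)
            throw new ArgumentOutOfRangeException(nameof(maxConcurrency));

        var throttler = new SemaphoreSlim(maxConcurrency);
        var tasks = new List<Task<TResult>>();
        foreach (var item in source)
        {
            // Wait for a free slot before starting the next invocation.
            await throttler.WaitAsync().ConfigureAwait(false);
            tasks.Add(RunOneAsync(item, action, throttler));
        }

        // Task.WhenAll returns the results in the order of `tasks`,
        // which is the order of `source`. If any invocation failed,
        // awaiting it rethrows the first exception, but only after
        // every started task has completed.
        return await Task.WhenAll(tasks).ConfigureAwait(false);
    }

    private static async Task<TResult> RunOneAsync<TSource, TResult>(
        TSource item,
        Func<TSource, Task<TResult>> action,
        SemaphoreSlim throttler)
    {
        try
        {
            // Task.Run offloads the invocation to the thread pool,
            // as the code in the question does.
            return await Task.Run(() => action(item)).ConfigureAwait(false);
        }
        finally
        {
            // Free the slot whether the invocation succeeded or failed.
            throttler.Release();
        }
    }
}
```

With a helper like this, the loop in the question would shrink to something like `var results = await AsyncHelpers.ForEachAsync(employees, 3, async employee => { ... });`, with the body of the lambda being the API call and the entity update. The conditions listed above still apply, since the lambda runs on thread-pool threads.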