How to avoid a 404 when deleting batches from Azure Table Storage

Problem

I am trying to delete a lot of rows from Table Storage, some of which may not exist. I need to minimise I/O and maximise bandwidth, so a single request that handles them all would be ideal. The problem is that if any entity in the batch doesn't exist, the whole batch fails.

Why

This also brings me to a design question: why doesn't the request simply return a deletion result indicating which objects were not deleted because of a 404? Why does it throw an exception? What is the reason for it?

More info

The batch size is within the Table Storage limit of 100 entities, and all entities are in the same partition.

There are 2 answers

Antonio Fiumanò On BEST ANSWER

You could PUT empty entities with the same PartitionKey and RowKey before DELETEing them. This way you can be sure every entity exists, so the delete batch won't fail with a 404. It costs two calls per batch, but the cost is fixed, which beats retrying many times and complicating the logic of your app. Not an ideal answer, but we don't live in an ideal world :)
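To make the idea concrete, here is a minimal sketch in Python. It does not call the real storage service: `InMemoryTable`, `EntityNotFound` and `delete_regardless` are hypothetical names made up for illustration, and the toy table only imitates the all-or-nothing behaviour of a batch within one partition.

```python
class EntityNotFound(Exception):
    """Hypothetical stand-in for the service's 404 response."""


class InMemoryTable:
    """Toy model of one partition: a batch applies fully or not at all."""

    def __init__(self):
        self.rows = {}  # row_key -> entity dict

    def upsert_batch(self, row_keys):
        # Insert-or-replace succeeds whether or not the row already exists.
        for key in row_keys:
            self.rows[key] = {}

    def delete_batch(self, row_keys):
        # Validate first, so one missing row rejects the whole batch.
        for key in row_keys:
            if key not in self.rows:
                raise EntityNotFound(key)
        for key in row_keys:
            del self.rows[key]


def delete_regardless(table, row_keys):
    """Upsert empty placeholders, then delete: two calls per batch."""
    table.upsert_batch(row_keys)
    table.delete_batch(row_keys)


table = InMemoryTable()
table.upsert_batch(["a", "b"])             # only "a" and "b" exist
delete_regardless(table, ["a", "b", "c"])  # "c" never existed
remaining = sorted(table.rows)             # nothing is left afterwards
```

Note that the placeholder upsert overwrites whatever was stored in those rows, which is harmless here only because the rows are about to be deleted anyway.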

Gaurav Mantri On

To answer your question, there's no way to avoid this kind of situation. If an entity in the batch fails, the whole batch will fail.

However there's one thing you could do:

When the batch fails, it returns the index of the failed entity. What you could do is take that batch and create 3 separate batches out of it. The 1st batch runs from the first entity (index 0) to the entity just before the failed one (failed index minus one), the 2nd is the failed entity on its own, and the 3rd runs from the entity just after the failed one (failed index plus one) to the last entity. For the failed entity, you could simply try DeleteIfExists. So assuming you have 100 entities in a batch and the 30th entity fails, you would create 3 batches:

Batch 1: 1st to 29th entity (index 0 - 28)

Batch 2: 30th entity (single entity) (index 29)

Batch 3: 31st to 100th entity (index 30 - 99)
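The split above is plain list slicing. A small sketch in Python, assuming the failed index reported by the service is zero-based; `split_at_failure` and the `row-N` keys are hypothetical names used only for illustration:

```python
def split_at_failure(batch, failed_index):
    """Return (before, failed, after) around a zero-based failed index."""
    before = batch[:failed_index]                  # entities ahead of the failure
    failed = batch[failed_index:failed_index + 1]  # the failed entity on its own
    after = batch[failed_index + 1:]               # everything behind the failure
    return before, failed, after


batch = [f"row-{n}" for n in range(1, 101)]          # 100 entities: row-1 .. row-100
before, failed, after = split_at_failure(batch, 29)  # the 30th entity failed

print(len(before), failed, len(after))  # prints: 29 ['row-30'] 70
```

Since the failed batch is rejected as a whole, all three pieces still have to be submitted, and the third piece may itself fail on another missing entity, in which case it can be split again the same way.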

"This also brings me to a design question: why doesn't the request simply return a deletion result indicating which objects were not deleted because of a 404? Why does it throw an exception? What is the reason for it?"

One possible reason I can think of is the Storage API's adherence to REST: you try to delete a resource, it's not there, so the API returns an error. Furthermore, a delete can fail not only because the entity is missing but also because the If-Match conditional header specified in the request does not match. To elaborate, you may want to delete an entity only if its ETag matches. In that case the delete operation fails even though the entity is present. To deal with 404 errors on single-entity delete operations, the client SDKs have implemented DeleteIfExists-style functionality that swallows the 404 error.
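A rough sketch of what such a wrapper does, again in Python and without touching the real SDK. `EntityNotFound`, `PreconditionFailed`, `delete_entity` and `delete_if_exists` are all hypothetical names, and the table is modelled as a plain dict of row key to ETag. The point is that only the "not found" case is swallowed, while an ETag mismatch still surfaces as an error:

```python
class EntityNotFound(Exception):
    """Hypothetical stand-in for a 404 from the service."""


class PreconditionFailed(Exception):
    """Hypothetical stand-in for an If-Match (ETag) mismatch."""


def delete_entity(rows, row_key, etag="*"):
    """Toy single-entity delete over a dict of row_key -> etag."""
    if row_key not in rows:
        raise EntityNotFound(row_key)
    if etag != "*" and rows[row_key] != etag:
        raise PreconditionFailed(row_key)  # entity exists, but the ETag differs
    del rows[row_key]


def delete_if_exists(rows, row_key, etag="*"):
    """Return True if deleted, False if the row was already gone."""
    try:
        delete_entity(rows, row_key, etag)
        return True
    except EntityNotFound:
        return False  # swallow only the 404 case


rows = {"a": "v1", "b": "v7"}
results = [
    delete_if_exists(rows, "a"),        # exists, so it is deleted
    delete_if_exists(rows, "missing"),  # not found, error swallowed
]
```

Calling `delete_if_exists(rows, "b", etag="v1")` would still raise `PreconditionFailed`, because that failure means something different from "already deleted".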