Azure Data Explorer batching policy modifications


I have a huge amount of data flowing from Event Hub to Azure Data Explorer. We have not modified the batching policy, so ingestion is batched every 5 minutes. We need to reduce this to a lower value so that the end-to-end lag is reduced.

How can I calculate the ideal batching time for this setup? Is there a calculation based on the CPU of ADX and the data ingestion rate on Event Hub, so that I can work out an ideal time without affecting the CPU usage of ADX?


There are 2 answers

2
Avnera On BEST ANSWER

There is no tool or other functionality that lets you calculate this today. You will need to try the desired setting for "MaximumBatchingTimeSpan" and observe the impact on CPU usage.
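To make the trial-and-observe loop a bit more concrete, here is a minimal Python sketch that only builds the text of an `.alter table ... policy ingestionbatching` management command for a candidate setting. It does not connect to a cluster. The table name `MyTable` and the three values used are hypothetical placeholders, not recommendations, and the helper names are made up for this example.

```python
import json
from datetime import timedelta


def format_timespan(span: timedelta) -> str:
    """Format a timedelta as hh:mm:ss (this sketch only handles spans under one day)."""
    total = int(span.total_seconds())
    if not 0 < total < 86400:
        raise ValueError("span must be positive and shorter than one day")
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def batching_policy_command(table: str, max_time: timedelta,
                            max_items: int, max_raw_mb: int) -> str:
    """Build the management command text for a table-level ingestion batching policy."""
    policy = {
        "MaximumBatchingTimeSpan": format_timespan(max_time),
        "MaximumNumberOfItems": max_items,
        "MaximumRawDataSizeMB": max_raw_mb,
    }
    # The policy is passed as a JSON document inside a Kusto string literal.
    return f".alter table {table} policy ingestionbatching @'{json.dumps(policy)}'"


if __name__ == "__main__":
    # Hypothetical trial value: try a 1-minute window, then watch CPU usage.
    print(batching_policy_command("MyTable", timedelta(minutes=1), 500, 1024))
```

You would run the printed command against the database, leave it in place for a while, and compare CPU usage and ingestion latency before and after. Lowering the window in steps makes it easier to back off if CPU usage climbs.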

0
Vladik Branevich On

Essentially, if you are ingesting huge volumes of data (per table), you are probably not hitting the 5-minute batching window, or you can decrease it significantly without detrimental impact. Please have a look at the latency and batching metrics for your cluster (https://learn.microsoft.com/en-us/azure/data-explorer/using-metrics#ingestion-metrics) and see a) whether your actual latency is below 5 minutes, which would indicate the batching is not driven by time, and b) which "Batching type" your cluster most often enacts: time, size, or number of items. Based on these numbers you can tweak down the time component of your ingestion batching policy.
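A rough sketch of how the two checks above could be combined, in Python. The counts and latency are assumed to be read by hand from the metrics page; the labels `"time"`, `"size"`, and `"count"`, the function names, and the sample numbers are all hypothetical and are not the metric's actual dimension values.

```python
def dominant_batching_type(batch_counts: dict) -> tuple:
    """Return the most frequent batching type and its share of all batches."""
    total = sum(batch_counts.values())
    if total == 0:
        raise ValueError("no batches recorded")
    kind, count = max(batch_counts.items(), key=lambda kv: kv[1])
    return kind, count / total


def is_time_driven(batch_counts: dict, observed_latency_s: float,
                   time_window_s: float = 300.0) -> bool:
    """True if batches are mostly sealed by time and latency reaches the window."""
    kind, _share = dominant_batching_type(batch_counts)
    return kind == "time" and observed_latency_s >= time_window_s


if __name__ == "__main__":
    # Hypothetical numbers for one table over some observation period.
    counts = {"time": 20, "size": 70, "count": 10}
    print(dominant_batching_type(counts))
    print(is_time_driven(counts, observed_latency_s=95.0))
```

If the result says batching is not time-driven, the time window is mostly a ceiling you rarely hit, so lowering it should change little. If it is time-driven, lowering the window will produce more, smaller batches, which is where watching CPU usage matters.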