Advantages of using Event Hubs Capture

I am rather new to Azure and cloud in general.

I was looking at this tutorial on how to integrate Event Hubs, Event Grid and Azure Functions to stream data into a SQL data warehouse.

My question is:

What are the advantages of first storing the data in blob storage, as opposed to processing the incoming data directly with an HTTP-triggered Azure Function, thus eliminating the need for Event Hubs and Event Grid?

Thank you for taking the time to read my question. Any help is greatly appreciated :-)

There are 2 answers

Ivan Glasenberg On BEST ANSWER

This feature is used to back up and reuse the event data.

By default (if Capture is not set up), event data stays in the event hub only for the retention period, which is at most 7 days on the Standard tier. If you don't process the events within that period, the event data is lost.

If you have the Capture feature configured, however, you can always reuse the event data, since it is also stored in blob storage.
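As a rough illustration of "reusing" captured data, here is a minimal sketch that picks which captured blobs to replay for a given time range. It assumes the default Capture naming convention, `{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}` with an `.avro` extension, and it assumes the timestamp in the path is UTC. The function names `capture_time` and `blobs_to_replay` are made up for this example, and listing or downloading the blobs is left out.

```python
from datetime import datetime, timezone

def capture_time(blob_name):
    """Parse the timestamp encoded in a captured blob's name.

    Assumes the default Capture path layout, which ends in
    .../{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}.avro
    """
    if blob_name.endswith(".avro"):
        blob_name = blob_name[: -len(".avro")]
    # The last six path segments carry the timestamp.
    year, month, day, hour, minute, second = (
        int(part) for part in blob_name.split("/")[-6:]
    )
    # Assumption: the path timestamp is expressed in UTC.
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)

def blobs_to_replay(blob_names, start, end):
    """Return blobs whose timestamp falls in [start, end), oldest first."""
    selected = [name for name in blob_names if start <= capture_time(name) < end]
    return sorted(selected, key=capture_time)

# Hypothetical blob names, for illustration only.
names = [
    "mynamespace/myeventhub/0/2017/12/08/03/03/17.avro",
    "mynamespace/myeventhub/1/2017/12/20/10/00/00.avro",
    "mynamespace/myeventhub/0/2017/12/01/00/05/00.avro",
]
window_start = datetime(2017, 12, 1, tzinfo=timezone.utc)
window_end = datetime(2017, 12, 10, tzinfo=timezone.utc)
replay = blobs_to_replay(names, window_start, window_end)
```

With the sample names above, `replay` holds the two blobs from 1 and 8 December, in that order; the 20 December blob falls outside the range.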

In the end, whether to enable this feature depends on your needs.

Kashyap On

"This feature is used to backup / reuse the event data."

I don't quite agree with Ivan. The article the OP linked itself shows a great use of Capture that is not backup/recovery.

If you want to process a large number of events from an event hub using Azure Functions (with the Event Hubs trigger for Functions), the biggest problem is batching. maxBatchSize is only a suggestion to the Functions runtime; too many variables are involved, and you may not (read: will not) get big enough batches even if you set maxBatchSize to a large number. Also remember that HTTP triggers have a 230-second limit on Function execution time. The same applies to the Blob trigger, if I recall correctly (because the Blob trigger is implemented internally as a REST call to the Azure Function).

The alternative is to use Capture the way the OP posted.
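To make that pattern a bit more concrete: with Capture plus Event Grid, the function is notified once per captured file instead of once per small batch, so each invocation can work on a whole file. Below is a minimal sketch of the parsing step only, assuming the Event Grid event schema is used and the delivery body is a JSON array of events. The `CapturedFile` class, the `parse_capture_events` helper, and the sample values are made up for illustration; downloading the Avro file and loading it into the warehouse are not shown.

```python
import json
from dataclasses import dataclass
from typing import List

CAPTURE_EVENT_TYPE = "Microsoft.EventHub.CaptureFileCreated"

@dataclass
class CapturedFile:
    file_url: str
    partition_id: str
    event_count: int
    size_in_bytes: int

def parse_capture_events(body: str) -> List[CapturedFile]:
    """Extract non-empty captured files from an Event Grid delivery body."""
    files = []
    for event in json.loads(body):
        if event.get("eventType") != CAPTURE_EVENT_TYPE:
            continue  # skip other event types, e.g. subscription validation
        data = event["data"]
        if data["eventCount"] == 0:
            continue  # Capture may write empty files for idle time windows
        files.append(
            CapturedFile(
                file_url=data["fileUrl"],
                partition_id=data["partitionId"],
                event_count=data["eventCount"],
                size_in_bytes=data["sizeInBytes"],
            )
        )
    return files

# Hypothetical delivery body; all values are invented for the example.
sample_body = json.dumps([
    {
        "eventType": CAPTURE_EVENT_TYPE,
        "data": {
            "fileUrl": "https://example.blob.core.windows.net/capture/ns/hub/0/2017/12/08/03/03/17.avro",
            "fileType": "AzureBlockBlob",
            "partitionId": "0",
            "sizeInBytes": 249168,
            "eventCount": 1500,
        },
    },
    {
        "eventType": CAPTURE_EVENT_TYPE,
        "data": {
            "fileUrl": "https://example.blob.core.windows.net/capture/ns/hub/1/2017/12/08/03/03/17.avro",
            "fileType": "AzureBlockBlob",
            "partitionId": "1",
            "sizeInBytes": 0,
            "eventCount": 0,
        },
    },
])
captured = parse_capture_events(sample_body)
```

Here `captured` contains only the first file (1500 events); the empty file from partition 1 is skipped. The batch size is then effectively governed by the Capture time and size window rather than by maxBatchSize.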

Some references: