AWS MSK with AWS Lambda — message acknowledgment


I failed to find any information in the official AWS documentation about the acknowledgment model used for message consumption by AWS Lambda from AWS MSK (managed Kafka).

How does AWS Lambda acknowledge Kafka messages from AWS MSK? Is it possible to configure this (automatic vs. manual acks)?


There are 2 answers

Mradul Yd (Best Answer)

This is confusing for sure, but after playing around with it I was able to understand how it works.

As far as I explored, there is no direct way for AWS Lambda to communicate with the Kafka broker to commit the offsets (and even if there were, I am not sure what mess we could make with sequencing; I would love to hear in the comments). Your Lambda function, being serverless, cannot maintain a consumer group itself. Instead, the Lambda service handles all of this internally and essentially manages a consumer group on your behalf.

The Lambda service continuously polls messages from Kafka and invokes the Lambda function with batches of records. The ack (offset commit) only happens if all of the messages are successfully processed, i.e. the invocation for the batch succeeds and the function did not hit any runtime errors or invocation failures.
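For illustration, here is a rough sketch of what a handler might look like with a Python runtime (the `process` function is just a placeholder for your own logic). The "ack" is simply the handler returning without raising; records in the MSK event arrive grouped by topic-partition with base64-encoded values:

```python
import base64

def handler(event, context):
    # The MSK event groups records by "topic-partition"; each value is base64-encoded.
    for topic_partition, records in event.get("records", {}).items():
        for record in records:
            payload = base64.b64decode(record["value"]).decode("utf-8")
            process(payload)  # hypothetical business logic

    # Returning normally (no exception) is what allows the Lambda service to
    # commit the offsets for this batch; there is no explicit ack call to make.
    return "ok"

def process(payload):
    # Placeholder for your own processing.
    print(payload)
```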

However, if any error occurs, the Lambda service does not commit that batch and retries the same batch until it succeeds.

Thus you need to design your streaming so that unwanted errors do not block your processing. This can be achieved by carefully and accurately catching errors in your code and then deciding whether to raise the error or let the function exit successfully.

Avoidable errors can block your stream and result in higher consumer lag, so you want to be precise about which errors you catch.
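As a rough illustration of that trade-off (assuming a Python runtime; `process` and the chosen exception types are just examples, not anything specific to MSK):

```python
import base64
import json
import logging

logger = logging.getLogger()

def handler(event, context):
    for records in event.get("records", {}).values():
        for record in records:
            raw = base64.b64decode(record["value"])
            try:
                process(json.loads(raw))  # hypothetical business logic
            except (json.JSONDecodeError, KeyError) as err:
                # A known, "avoidable" failure: log the bad record and move on
                # so one malformed message does not block the whole batch.
                logger.warning("Skipping record at offset %s: %s",
                               record["offset"], err)
            # Any other exception propagates, the invocation fails, nothing is
            # committed, and the Lambda service retries the same batch.
    return "ok"

def process(message):
    # Placeholder for your own processing.
    print(message)
```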

The key takeaway: you cannot explicitly ack messages to a Kafka topic from Lambda; it is managed internally by the Lambda service. You can only influence it through error handling, by letting your Lambda fail -> no commit, or succeed -> commit. I had pretty much the same doubt months ago.

Ricardo Ferreira

Kafka's acknowledgement model is the same regardless of whether you're using MSK or something else, which is why you didn't find anything in the AWS docs. Essentially, your consumer needs to set the property enable.auto.commit to true so that it acknowledges all returned records every 5 seconds. This 5-second interval can be configured via the property auto.commit.interval.ms. If you set enable.auto.commit to false, then it is up to your consumer to acknowledge records by calling the commit() method explicitly.
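For example, with the confluent-kafka Python client (broker address, group id, topic, and the handle() function are placeholders, not anything MSK-specific):

```python
from confluent_kafka import Consumer

def handle(value):
    # Placeholder for real processing.
    print(value)

# Auto-commit: offsets of the records returned by poll() are committed
# in the background every 5 seconds.
auto_consumer = Consumer({
    "bootstrap.servers": "broker:9092",
    "group.id": "my-group",
    "enable.auto.commit": True,
    "auto.commit.interval.ms": 5000,
})

# Manual commit: the application acknowledges explicitly.
manual_consumer = Consumer({
    "bootstrap.servers": "broker:9092",
    "group.id": "my-group",
    "enable.auto.commit": False,
})
manual_consumer.subscribe(["my-topic"])

msg = manual_consumer.poll(timeout=1.0)
if msg is not None and msg.error() is None:
    handle(msg.value())
    manual_consumer.commit(message=msg)  # explicit acknowledgment
```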

Now keep in mind the nature of Lambda functions. The underlying container that backs each deployed function is recycled from time to time, and whatever objects you have instantiated in your function (such as a KafkaConsumer) will be destroyed and recreated later. That means you might experience some performance delays during consumption, and your records might end up duplicated if the last poll didn't commit all read records, because the consumer resumes processing from the last committed offset.
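One common way to make that duplication harmless is to process records idempotently, for example keyed on topic/partition/offset. A rough sketch, where the DynamoDB table name and process() are placeholders and DynamoDB is just one possible store:

```python
import boto3
from botocore.exceptions import ClientError

# "processed-offsets" is a hypothetical table with partition key "pk".
table = boto3.resource("dynamodb").Table("processed-offsets")

def process_once(record):
    dedupe_key = f"{record['topic']}-{record['partition']}-{record['offset']}"
    try:
        # The conditional write fails if this offset was already processed,
        # so a redelivered record is simply skipped.
        table.put_item(
            Item={"pk": dedupe_key},
            ConditionExpression="attribute_not_exists(pk)",
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return  # duplicate delivery
        raise
    process(record)

def process(record):
    # Placeholder for your own processing.
    print(record)
```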

Luckily, AWS released support for executing Lambda functions for Kafka records on MSK. Here is a link where you can learn more about this:

https://aws.amazon.com/blogs/compute/using-amazon-msk-as-an-event-source-for-aws-lambda/
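The blog post walks through the console flow; roughly the same thing can be done programmatically, for example with boto3 (the cluster ARN, function name, and topic below are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Create the event source mapping that polls MSK and invokes the function.
response = lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc-123",
    FunctionName="my-msk-consumer",
    Topics=["my-topic"],
    StartingPosition="LATEST",  # or "TRIM_HORIZON" to start from the oldest records
    BatchSize=100,
)
print(response["UUID"], response["State"])
```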