I have a consumer polling from a subscribed topic. It consumes each message, does some processing (within seconds), pushes the result to a different topic, and commits the offset.
There are 5000 messages in total:
before restart - consumed 2900 messages and committed offset
after restart - started consuming from offset 0.
Even though the consumer is created with the same consumer group, it started processing messages from offset 0 again.
Kafka version (Strimzi) > 2.0.0, kafka-python == 2.0.1
We don't know how many partitions your topic has, but consumers created within the same consumer group consume records from different partitions. Two consumers in the same group cannot consume from the same partition, and if you add a consumer, the group coordinator triggers a rebalance to reassign the partitions among the consumers.
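To make the one-owner-per-partition rule concrete, here is a minimal sketch in plain Python. It is a simplified model of a round-robin split, not Kafka's actual assignor code; the function name `assign_round_robin` and the consumer names are made up for illustration.

```python
from typing import Dict, List


def assign_round_robin(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Hypothetical helper: give every partition exactly one owner in the group."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Two consumers share three partitions.
before = assign_round_robin([0, 1, 2], ["consumer-a", "consumer-b"])

# After two more consumers join, a "rebalance" recomputes the split.
# With more consumers than partitions, one consumer ends up idle.
after = assign_round_robin([0, 1, 2], ["consumer-a", "consumer-b", "consumer-c", "consumer-d"])

print(before)  # {'consumer-a': [0, 2], 'consumer-b': [1]}
print(after)   # {'consumer-a': [0], 'consumer-b': [1], 'consumer-c': [2], 'consumer-d': []}
```

No partition ever appears under two consumers, which is why adding consumers beyond the partition count does not increase parallelism.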
I think the offset 0 comes from the auto.offset.reset property, which can be:
latest: start at the latest offset in the log.
earliest: start with the earliest record.
none: throw an exception when there is no existing offset data.
However, this property kicks in only if your consumer group doesn't have a valid committed offset.
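The decision described above can be sketched as a small pure-Python function. This is only a model of the rule, not client code: `resolve_start_offset`, `NoOffsetForPartition`, and the offset numbers are assumptions chosen to mirror the 2900/5000 scenario from the question.

```python
from typing import Optional


class NoOffsetForPartition(Exception):
    """Hypothetical error for the 'none' policy when no valid offset exists."""


def resolve_start_offset(committed: Optional[int], log_start: int, log_end: int,
                         policy: str = "latest") -> int:
    """Return the offset a consumer would start from under this simplified model."""
    # A committed offset wins as long as it still points inside the log.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    # Otherwise the reset policy decides.
    if policy == "earliest":
        return log_start
    if policy == "latest":
        return log_end
    if policy == "none":
        raise NoOffsetForPartition("no valid committed offset for this group")
    raise ValueError(f"unknown policy: {policy}")


# Group committed 2900 of 5000 records: the policy is ignored.
resumed = resolve_start_offset(2900, log_start=0, log_end=5000, policy="earliest")

# No committed offset found for the group: the policy applies.
fresh_earliest = resolve_start_offset(None, log_start=0, log_end=5000, policy="earliest")
fresh_latest = resolve_start_offset(None, log_start=0, log_end=5000, policy="latest")

# Committed offset fell out of range (e.g. retention removed old records).
truncated = resolve_start_offset(2900, log_start=3000, log_end=5000, policy="earliest")

print(resumed, fresh_earliest, fresh_latest, truncated)  # 2900 0 5000 3000
```

Under this model, a restart at offset 0 would suggest that the group had no valid committed offset at startup and the policy was set to earliest. In kafka-python the policy is passed as the `auto_offset_reset` argument of `KafkaConsumer`, and `KafkaConsumer.committed(TopicPartition(topic, partition))` can be used to check what the group has actually committed.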
N.B.: Records in a topic are subject to a retention period (the log.retention.ms property), so older records can be deleted before your consumer has processed them.
Question: since you want to consume messages from one topic, process the data, and write it to another topic, why didn't you use Kafka Streams?