I get a strange Kafka server error when mirroring data with MirrorMaker 1 in Apache Kafka 2.6:

org.apache.kafka.common.errors.NotEnoughReplicasException: The size of the current ISR Set(3) is insufficient to satisfy the min.isr requirement of 2 for partition FooBar-0

The strange thing is that the min.isr setting is 2 and the ISR set has 3 nodes. Nevertheless, I get the NotEnoughReplicasException.

Taking a closer look at the topic does not show anything unusual either:

[root@LoremIpsum kafka]# /usr/lib/kafka/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic FooBar
Topic: FooBar       PartitionCount: 1       ReplicationFactor: 3    Configs: min.insync.replicas=2,cleanup.policy=compact,segment.bytes=1073741824,max.message.bytes=5242880,min.compaction.lag.ms=604800000,message.timestamp.type=LogAppendTime,unclean.leader.election.enable=false
        Topic: FooBar       Partition: 0    Leader: 3       Replicas: 2,3,1 Isr: 3

The logs of the 3 nodes look normal (as far as I can judge). Is there any other reason that could produce this message? What else could be checked?

Thank you very much for any advice!



Michael Heil

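The term "ISR Set(3)" does not mean that the ISR has three members. It is the printed form of a set containing the single broker id 3, so only broker #3 is in sync. The ISR size is therefore 1, which is below the min.insync.replicas value of 2, and that is why the exception is thrown. The same is visible in the output of the kafka-topics command: "Replicas: 2,3,1" but "Isr: 3". Apparently, something is going wrong with the replication of the data between the brokers.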
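
To make the comparison concrete, here is a minimal sketch in plain Java (no Kafka dependency) that reads the "Isr:" field of a partition line as printed by kafka-topics.sh and compares its size to min.insync.replicas. The class and method names (IsrCheck, parseIsr, satisfiesMinIsr) are made up for this illustration and are not part of any Kafka API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class IsrCheck {

    // Extracts the broker ids listed after "Isr:" in a partition line
    // from `kafka-topics.sh --describe` (hypothetical helper).
    static List<Integer> parseIsr(String partitionLine) {
        int idx = partitionLine.indexOf("Isr:");
        if (idx < 0) {
            throw new IllegalArgumentException("no Isr field in: " + partitionLine);
        }
        String rest = partitionLine.substring(idx + "Isr:".length()).trim();
        // Only the first whitespace-delimited token holds the id list.
        String ids = rest.split("\\s+")[0];
        return Arrays.stream(ids.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(Integer::parseInt)
                .collect(Collectors.toList());
    }

    // The broker rejects acks=all writes when the ISR has fewer members
    // than min.insync.replicas.
    static boolean satisfiesMinIsr(List<Integer> isr, int minInsyncReplicas) {
        return isr.size() >= minInsyncReplicas;
    }

    public static void main(String[] args) {
        String line = "Topic: FooBar  Partition: 0  Leader: 3  Replicas: 2,3,1  Isr: 3";
        List<Integer> isr = parseIsr(line);
        System.out.println("ISR = " + isr + ", size = " + isr.size());
        System.out.println("min.insync.replicas=2 satisfied: " + satisfiesMinIsr(isr, 2));
    }
}
```

For the partition line from the question, the parsed ISR contains only broker 3, so the check against a minimum of 2 fails, matching the error message.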

Under the hood, MirrorMaker 1 uses a plain KafkaConsumer and KafkaProducer to do the work. According to the JavaDocs of the producer Callback, the NotEnoughReplicasException is a retriable exception.

Therefore, you are likely to get rid of this error by letting the producer retry until the replicas are back in sync, using the following producer configurations:

acks=all: The number of acknowledgments the producer requires the leader to have received before considering a request complete. This controls the durability of records that are sent. 
retry.backoff.ms=1000: The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.
delivery.timeout.ms=300000: An upper bound on the time to report success or failure after a call to send() returns. This limits the total time that a record will be delayed prior to sending, the time to await acknowledgement from the broker (if expected), and the time allowed for retriable send failures.
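
As a rough sketch, the three settings above can be collected in a java.util.Properties object and written out in the key=value format of a producer properties file. MirrorMaker 1 takes such a file through its --producer.config option. The class and method names here (MirrorProducerConfig, retryFriendlyProducerProps) are hypothetical, and plain string keys are used so the example runs without the kafka-clients library on the classpath.

```java
import java.util.Properties;

public class MirrorProducerConfig {

    // Builds the producer overrides discussed above (hypothetical helper).
    static Properties retryFriendlyProducerProps() {
        Properties props = new Properties();
        // Wait for all in-sync replicas to acknowledge each record.
        props.setProperty("acks", "all");
        // Pause one second between retries of a failed request.
        props.setProperty("retry.backoff.ms", "1000");
        // Keep retrying for up to five minutes before reporting failure.
        props.setProperty("delivery.timeout.ms", "300000");
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Prints the settings in properties-file format.
        retryFriendlyProducerProps().store(System.out, "producer overrides");
    }
}
```

Note that retrying only helps if the lagging followers rejoin the ISR within the delivery timeout. If they stay out of sync, the underlying replication problem between the brokers still has to be fixed.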

All details on the KafkaProducer configurations are given in the producer configuration section of the Apache Kafka documentation.