I have configured 3 Kafka brokers running on different ports, and I am using Spring Cloud Stream Kafka:

brokers: localhost:9092,localhost:9093,localhost:9094
I am building a data pipeline that receives a continuous stream of data and stores it in a Kafka topic, with all 3 brokers running. So far there is no problem. My concern is this: suppose all 3 brokers go down for 5 minutes. During that time I cannot write to the Kafka topic, so there will be 5 minutes of data loss. From Spring Boot I get this warning:
2020-10-06 11:44:20.840 WARN 2906 --- [ad | producer-2] org.apache.kafka.clients.NetworkClient : [Producer clientId=producer-2] Connection to node 0 (/192.168.1.78:9092) could not be established. Broker may not be available.
Is there a way to store data temporarily when all brokers go down and resume writing to the topic from that temporary storage once the brokers are up again?
You could make use of the internal buffer the producer uses to send data to the cluster. The KafkaProducer keeps a queue under the covers, and a dedicated I/O thread actually sends the data to the cluster.
In combination with the producer configuration retries (set to 0 by default in older client versions), you may want to increase buffer.memory, which the Kafka documentation describes as:

"The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception."
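As a minimal sketch of how these settings fit together, here is a plain kafka-clients producer (rather than the Spring Cloud Stream binder you are using); the topic name "my-topic" and the buffer/timeout sizes are placeholders you would tune yourself:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BufferedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "localhost:9092,localhost:9093,localhost:9094");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Retry sends instead of failing immediately while brokers are unreachable.
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        // Enlarge the internal buffer so records can accumulate during an outage
        // (64 MB here, an arbitrary choice).
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64 * 1024 * 1024L);
        // How long send() may block once the buffer is full before throwing
        // (5 minutes here, matching the outage window in the question).
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 5 * 60 * 1000L);
        // Upper bound on the total time a record may spend retrying.
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 5 * 60 * 1000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // While the cluster is down, records queue up in buffer.memory and the
            // background I/O thread retries delivery until delivery.timeout.ms expires.
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}
```

Note that this only bridges outages shorter than the configured timeouts and smaller than the buffer; once buffer.memory fills up, send() blocks and eventually throws.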
However, I do not think that having the producer itself deal with a complete cluster failure is generally a good idea. Kafka is designed to cope with failures of individual brokers, but if all your brokers go down uncontrollably at the same time, you may run into bigger issues than just missing some data from an individual producer.
If only one broker is unreachable for a period of time, there is nothing to be done: Kafka will internally switch the partition leadership to another broker (provided the partition is replicated, of course).
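For completeness, here is a sketch of creating such a replicated topic with the Kafka AdminClient (topic name, partition count, and replication factor are illustrative values for your 3-broker setup):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "localhost:9092,localhost:9093,localhost:9094");

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions with replication factor 3: every partition has a copy on
            // each broker, so losing any single broker still leaves two replicas,
            // one of which can take over partition leadership.
            NewTopic topic = new NewTopic("my-topic", 3, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```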