kafkaRDD next() slow

40 views Asked by At

My spark consumption Kafka has data accumulation problems. After troubleshooting, I found that data consumption is very time-consuming once about 10 minutes. I located the specific code here.

   while (consumerRecords.hasNext()) {
       long begin = System.currentTimeMillis();
       ConsumerRecord<String, Message> consumerRecord = consumerRecords.next();
       long next = System.currentTimeMillis() - begin ;
   ....

The type of consumerRecords object is KafkaRDD, and the next() method took about 40 seconds to return the data. This caused data accumulation.

This is my information from monitoring

2020/10/19 18:03:44.000 7 records 40 min 0.4 s 40 min
2020/10/19 18:03:43.500 2 records 40 min 0.4 s 40 min
2020/10/19 18:03:43.000 7 records 39 min 40 s 40 min
2020/10/19 18:03:42.500 2 records 39 min 0.4 s 39 min
2020/10/19 18:03:42.000 8 records 39 min 0.4 s 39 min

I don’t know how to continue to troubleshoot this problem, or what causes it to be so time-consuming?

Please give me some guidance and suggestions, thank you

0

There are 0 answers