Kafka message log files rolling over frequently


I installed and configured Confluent Kafka, and the broker is running with a 1 GB heap size:

export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G" #from /bin/kafka-server-start

I created a topic “thing-data” with a single partition, and an automated job pushes data into this topic every 5 seconds. Each message is around 2400 bytes in size.
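For context, the job is roughly equivalent to the loop below (a minimal sketch, not the actual job; the broker address localhost:9092 and the random payload are assumptions):

# Hypothetical stand-in for the ingest job: one ~2400-byte message every 5 seconds.
# 1800 random bytes become roughly 2400 characters after base64 encoding.
while true; do
  head -c 1800 /dev/urandom | base64 -w 0 | \
    kafka-console-producer --broker-list localhost:9092 --topic thing-data
  sleep 5
done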

What I see is that the earliest offset of the topic changes far too frequently, which means Kafka is holding only a small number of records at any point in time. I had a look at the sizes of the topic's log segment files in /var/log/kafka/thing-data-0/:

[hduser@laptop thing-data-0]$ ll

-rw-r--r--. 1 confluent confluent 10485760 Dec 30 17:05 00000000000000148868.index
-rw-r--r--. 1 confluent confluent   119350 Dec 30 17:05 00000000000000148868.log

[hduser@laptop thing-data-0]$ ll

-rw-r--r--. 1 confluent confluent 10485760 Dec 30 17:08 00000000000000148928.index
-rw-r--r--. 1 confluent confluent    54901 Dec 30 17:08 00000000000000148928.log

[hduser@laptop thing-data-0]$ ll

-rw-r--r--. 1 confluent confluent 10485760 Dec 30 17:12 00000000000000148988.index
-rw-r--r--. 1 confluent confluent    38192 Dec 30 17:13 00000000000000148988.log

As you can see, the log files roll over very frequently. Each time, the old files are marked as .deleted and then removed after the configured delay.
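The shrinking window also shows up in the earliest offset, which can be checked with GetOffsetShell (a sketch, assuming the broker listens on localhost:9092):

# --time -2 prints the earliest available offset, --time -1 the latest;
# a rapidly increasing earliest offset confirms old segments are being deleted.
kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic thing-data --time -2
kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic thing-data --time -1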

Below are the log-related configuration settings from /etc/kafka/server.properties:

log.roll.hours=168
log.retention.hours=168  # I tried with log.retention.ms as well :-)
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
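Per-topic settings such as segment.ms, segment.bytes, and retention.ms take precedence over these broker defaults, so one thing worth checking is whether any topic-level override exists (a sketch, assuming ZooKeeper on localhost:2181):

# List any per-topic config overrides for thing-data; if nothing is listed,
# the broker-level values above should apply.
kafka-configs --zookeeper localhost:2181 --describe --entity-type topics --entity-name thing-data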

When I restart Kafka, the files look like this:

-rw-r--r--. 1 confluent confluent 10485760 Dec 30 17:21 00000000000000149099.index
-rw-r--r--. 1 confluent confluent        0 Dec 30 17:21 00000000000000149099.log

I suspect something related to the .index file size, because it is already at the maximum (the default value of segment.index.bytes is 10485760). I suspect this because the cluster had been working fine for almost a month.

I am not sure what is going wrong here; any help would be appreciated.

Some of the references I have consulted are given below.

http://kafka.apache.org/documentation/

https://stackoverflow.com/questions/28586008/delete-message-after-consuming-it-in-kafka

1 Answer

Answer from Mohit Gupta:

Did you check log.roll.ms? This is the primary configuration; by default it has no value, but if it is set it overrides log.roll.hours.
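As a quick verification (a sketch; the properties path comes from the question and the ZooKeeper address is an assumption), you could grep the broker config for any roll/segment settings and, if a per-topic segment.ms override shows up, remove it so the broker default applies again:

# Look for roll/segment-related settings in the broker config
grep -E 'log\.(roll|segment)' /etc/kafka/server.properties

# If the topic carries a segment.ms override, deleting it falls back to the broker-level setting
kafka-configs --zookeeper localhost:2181 --alter --entity-type topics --entity-name thing-data --delete-config segment.ms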