Kafka writes its logs to the page cache and relies on the OS to flush them to disk periodically. The log flush rate metric in Kafka measures this flushing, and its value is reported as a time. So what does a log flush rate of 10 seconds mean?
I was load testing our Kafka servers and saw the log flush rate shoot up from milliseconds to seconds. It stays there for several hours before coming back down to the millisecond level. While this happens, we do not see any message loss or added latency. During the load test, disk utilization also hits 100% at times and then returns to normal immediately. Could someone please help me understand this behaviour? What are the consequences of these high values?
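
To make clear what I think the metric is timing, here is a minimal Python sketch. It is not Kafka code: the function name `timed_flush` and the payload size are made up for illustration. It assumes the metric reflects how long an explicit flush (fsync) of already-written data takes. A write that only reaches the page cache returns quickly, while the fsync blocks until the OS has pushed the data to disk.

```python
import os
import tempfile
import time


def timed_flush(payload):
    """Return (write_seconds, fsync_seconds) for one write to a temp file.

    Hypothetical helper for illustration only; this is not how Kafka is
    implemented, just the write-then-flush pattern described above.
    """
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        os.write(fd, payload)  # normally lands in the OS page cache and returns
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        os.fsync(fd)  # blocks until the OS has flushed the file's data to disk
        fsync_s = time.perf_counter() - start
        return write_s, fsync_s
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    w, f = timed_flush(b"x" * (4 * 1024 * 1024))
    print(f"write: {w * 1000:.3f} ms, fsync: {f * 1000:.3f} ms")
```

If my understanding is right, the fsync part is what I would expect to grow from milliseconds to seconds when the disk is saturated, while the write part stays fast because it only touches memory.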