What is a partition leader in Apache Kafka?

Are Kafka leaders partitions themselves, or are they brokers? My initial understanding was that they were partitions that acted as read/write agents and then deferred their values to the ISRs.

Recently, however, I have heard them mentioned as though leadership happens at the "broker" level, hence my confusion.

I know there are other posts which aim to answer this question, but the answers there did not help.

There are 4 answers

0
Yogi On

The partition leader concept comes into play when a Kafka topic has a --replication-factor greater than 1 (which also means the cluster's broker count must be greater than or equal to the replication factor).

In that scenario, whenever a producer pushes a message to a topic's partition, the request first goes to the partition's leader (one of the partition's replicas in the Kafka cluster). The leader stores the message, the follower replicas copy it, and, when the producer asks for full acknowledgement (acks=all), the leader acknowledges the message to the producer only after the in-sync followers have it.

Only after this process completes is the message available for consumers to read.
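To make the flow above concrete, here is a minimal toy model in Python. It is only a sketch: the class and method names are made up for illustration, every follower is assumed to be in sync, and nothing here talks to a real Kafka cluster. It shows the leader appending first and a record becoming visible to consumers only once every replica has it.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionModel:
    """Hypothetical toy model of one partition: a leader log plus follower logs."""
    leader_log: list = field(default_factory=list)
    follower_logs: dict = field(default_factory=dict)  # broker id -> copy of the log

    def produce(self, record):
        # The producer always sends to the leader, which appends first.
        self.leader_log.append(record)

    def replicate(self, broker_id):
        # A follower fetches everything it is missing from the leader.
        self.follower_logs[broker_id] = list(self.leader_log)

    def high_watermark(self):
        # Number of records that every replica in this model already holds.
        sizes = [len(self.leader_log)]
        sizes += [len(log) for log in self.follower_logs.values()]
        return min(sizes)

    def consumable(self):
        # Consumers only see records below the high watermark.
        return self.leader_log[: self.high_watermark()]


partition = PartitionModel(follower_logs={2: [], 3: []})
partition.produce("m1")
print(partition.consumable())  # nothing is visible before replication
partition.replicate(2)
partition.replicate(3)
print(partition.consumable())  # the record is visible once both followers have it
```

In a real cluster the set of replicas that counts is the ISR rather than all followers, so a lagging follower that has been dropped from the ISR does not hold messages back.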

I recommend the official documentation for more details.

0
H.Ç.T On

Every topic-partition in Kafka has one leader, and if the replication factor is greater than 1, the leader has one or more followers. Partition leaders can be checked with this command:

bin/kafka-topics.sh --bootstrap-server localhost:9092 --topic myTopic --describe

In the output of this command, the broker id of each partition's leader is shown as Leader: xx

7
Ofek Hod On

Some answers here are not entirely correct, so I would like to make things clearer.

Every partition has exactly one partition leader, which handles all the read/write requests of that partition. (update: from Kafka 2.4.0, consumers are allowed to read from replicas)
If the replication factor is greater than 1, the additional partition replicas act as partition followers.
Kafka guarantees that every replica of a partition resides on a different broker (whether it's the leader or a follower), so the maximum replication factor is the number of brokers in the cluster.
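As an illustration of why the replication factor is capped by the broker count, here is a small Python sketch. It uses a plain round-robin placement, which is a simplification and not Kafka's actual assignment algorithm; the function name is hypothetical.

```python
def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Round-robin sketch: each partition's replicas land on distinct brokers."""
    if replication_factor > len(broker_ids):
        # There are not enough brokers to give every replica its own broker.
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        # Take replication_factor consecutive brokers, wrapping around the list.
        assignment[partition] = [
            broker_ids[(partition + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
    return assignment


# Three brokers, three partitions, two replicas per partition.
print(assign_replicas([0, 1, 2], num_partitions=3, replication_factor=2))
```

Asking for a replication factor of 4 with the same three brokers raises the ValueError, since a fourth replica would have to share a broker with another replica of the same partition.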

Every partition follower reads messages from the partition leader (it acts like a kind of consumer) and, by default, does not serve any consumers of that partition (only the partition leader serves reads and writes).
A partition follower is considered in-sync if it is reading records from the partition leader without lagging behind and without losing its connection to ZooKeeper (the maximum lag, replica.lag.time.max.ms, defaulted to 10 seconds and the ZooKeeper session timeout, zookeeper.session.timeout.ms, to 6 seconds when this was written; Kafka 2.5 raised the defaults to 30 and 18 seconds, and both are configurable).
If a partition follower is lagging behind or has lost its connection to ZooKeeper, it is considered out-of-sync.
When a partition leader shuts down for any reason (e.g. a broker shuts down), one of its in-sync partition followers becomes the new leader.
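The in-sync rule and the failover described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function names, the timestamps and the 10-second default are illustrative, and broker liveness stands in for the ZooKeeper session. It is not the controller's real code.

```python
def in_sync_replicas(replicas, leader, last_caught_up_ms, now_ms, max_lag_ms=10_000):
    """Return the leader plus every follower that caught up within max_lag_ms."""
    isr = []
    for broker in replicas:
        if broker == leader:
            isr.append(broker)  # the leader is always in its own ISR
        elif broker in last_caught_up_ms and now_ms - last_caught_up_ms[broker] <= max_lag_ms:
            isr.append(broker)
    return isr


def elect_leader(replicas, isr, live_brokers):
    """Pick the first assigned replica that is both in sync and alive."""
    for broker in replicas:
        if broker in isr and broker in live_brokers:
            return broker
    return None  # no eligible replica: the partition would be offline


replicas = [1, 2, 3]  # broker 1 is currently the leader
# Follower 2 last caught up 91 s ago, follower 3 only 5 s ago.
isr = in_sync_replicas(replicas, leader=1,
                       last_caught_up_ms={2: 9_000, 3: 95_000}, now_ms=100_000)
print(isr)
# Broker 1 goes down; only in-sync, live replicas are eligible.
print(elect_leader(replicas, isr, live_brokers={2, 3}))
```

Here follower 2 has fallen out of sync, so when broker 1 fails the leadership moves to broker 3 even though broker 2 comes earlier in the replica list.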

The replication section in the Kafka documentation explains this in detail.
Confluent also wrote a nice blog post about this topic.

0
Giorgos Myrianthous On

tl;dr

Are kafka leaders partitions themselves or are they brokers?

The partition leader is a Kafka Broker.


Partition Leader

This is clearly mentioned in Kafka Docs:

Each partition has one server which acts as the "leader" and zero or more servers which act as "followers". The leader handles all read and write requests for the partition while the followers passively replicate the leader. If the leader fails, one of the followers will automatically become the new leader. Each server acts as a leader for some of its partitions and a follower for others so load is well balanced within the cluster.

Therefore, a partition leader is actually the broker that serves this purpose and is responsible for all read and write requests for this particular partition.


Partition Leader Election

The assignment of a leader for a particular partition happens during a process called partition leader election. This process happens when the topic/partition is created or when the partition leader (i.e. the broker) is unavailable for any reason.

Additionally, you can force a preferred replica election by using the Preferred Replica Leader Election Tool:

With replication, each partition can have multiple replicas. The list of replicas for a partition is called the "assigned replicas". The first replica in this list is the "preferred replica". When topic/partitions are created, Kafka ensures that the "preferred replica" for the partitions across topics are equally distributed amongst the brokers in a cluster. In an ideal scenario, the leader for a given partition should be the "preferred replica". This guarantees that the leadership load across the brokers in a cluster are evenly balanced. However, over time the leadership load could get imbalanced due to broker shutdowns (caused by controlled shutdown, crashes, machine failures etc). This tool helps to restore the leadership balance between the brokers in the cluster.

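To do so on Kafka versions before 3.0, run the following command (the tool was deprecated in Kafka 2.4 in favour of kafka-leader-election.sh and removed in 3.0):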

bin/kafka-preferred-replica-election.sh --zookeeper localhost:12913/kafka --path-to-json-file topicPartitionList.json

where the content of topicPartitionList.json should look like the one below:

{
 "partitions":
  [
    {"topic": "topic1", "partition": 0},
    {"topic": "topic1", "partition": 1},
    {"topic": "topic1", "partition": 2},
    {"topic": "topic2", "partition": 0},
    {"topic": "topic2", "partition": 1}
  ]
}
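If you do not want to write that file by hand, a short Python sketch like the one below can build it. It assumes you already know the current leader and the assigned replica list of each partition (the state dictionary and the function name are hypothetical), and it lists only the partitions whose leader is not the preferred replica, i.e. the first replica in the list.

```python
import json


def partitions_needing_election(topic_state):
    """topic_state maps (topic, partition) -> {"leader": id, "replicas": [ids]}."""
    return [
        {"topic": topic, "partition": partition}
        for (topic, partition), state in sorted(topic_state.items())
        # The preferred replica is the first entry of the assigned replicas.
        if state["leader"] != state["replicas"][0]
    ]


# Illustrative state: topic1-1 and topic2-0 are not led by their preferred replica.
state = {
    ("topic1", 0): {"leader": 1, "replicas": [1, 2, 3]},
    ("topic1", 1): {"leader": 3, "replicas": [2, 3, 1]},
    ("topic2", 0): {"leader": 2, "replicas": [3, 2, 1]},
}

payload = {"partitions": partitions_needing_election(state)}
print(json.dumps(payload, indent=2))
```

The printed JSON has the same shape as topicPartitionList.json above and can be saved to that file before running the election tool.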

How to find which broker serves as the partition leader

In order to find which broker serves as the partition leader and which serve as In-Sync Replicas (ISR), you have to run the following command:

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic myTopic

and the output should look similar to the one below:

Topic:myTopic       PartitionCount:4        ReplicationFactor:1     Configs:
    Topic: myTopic      Partition: 0    Leader: 2       Replicas: 2     Isr: 2
    Topic: myTopic      Partition: 1    Leader: 3       Replicas: 3     Isr: 3
    Topic: myTopic      Partition: 2    Leader: 4       Replicas: 4     Isr: 4
    Topic: myTopic      Partition: 3    Leader: 0       Replicas: 0     Isr: 0
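If you need the leaders programmatically, the text output above can be parsed with a small Python sketch like this one. The regular expression is written against the sample output shown here and is an assumption about its layout; other Kafka versions may print extra fields or a non-numeric leader, which this pattern simply skips.

```python
import re

# Matches the per-partition lines of `kafka-topics.sh --describe`.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)"
)

SAMPLE = """\
Topic:myTopic       PartitionCount:4        ReplicationFactor:1     Configs:
    Topic: myTopic      Partition: 0    Leader: 2       Replicas: 2     Isr: 2
    Topic: myTopic      Partition: 1    Leader: 3       Replicas: 3     Isr: 3
    Topic: myTopic      Partition: 2    Leader: 4       Replicas: 4     Isr: 4
    Topic: myTopic      Partition: 3    Leader: 0       Replicas: 0     Isr: 0
"""


def parse_leaders(describe_output):
    """Return {partition number: leader broker id} from the describe output."""
    leaders = {}
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match:  # the summary line has no "Partition: N" field and is skipped
            leaders[int(match["partition"])] = int(match["leader"])
    return leaders


print(parse_leaders(SAMPLE))
```

The values in the resulting dictionary are broker ids, which is another way of seeing that the "leader" of a partition is a broker.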