Kubernetes Node healthcheck


I am trying to understand the node controller in Kubernetes. The Kubernetes documentation mentions that node heartbeats are sent through both NodeStatus updates and Lease object updates. Could someone please explain why both mechanisms are needed for monitoring node health? Does the Kubernetes master internally use a Job/CronJob to process node health checks?

Answered by Wytrzymały Wiktor

Lease is a lightweight resource, which improves the performance of the node heartbeats as the cluster scales.

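Each node has a corresponding Lease object in the kube-node-lease namespace, which the kubelet renews at a short, fixed interval. Because a Lease is far smaller than a full Node status, these frequent renewals stay cheap as the cluster scales up. According to the docs, this is the Lease's primary function with respect to heartbeats.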

The kubelet also uses NodeStatus for heartbeats, but NodeStatus is more than a liveness signal: it is an important input for other controllers in k8s.
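To make the cost difference concrete, here is a minimal sketch that counts how many writes of each kind a single node would send. It is a toy simulation, not kubelet code: the function and class names are hypothetical, and the two intervals (10 seconds for Lease renewals, 5 minutes for periodic NodeStatus reports) are the defaults described in the Kubernetes docs as I understand them, so treat them as assumptions and check your own kubelet configuration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed defaults; both are configurable on a real kubelet.
LEASE_RENEW_INTERVAL_S = 10      # small Lease object, renewed often
STATUS_REPORT_INTERVAL_S = 300   # full NodeStatus, reported rarely


@dataclass
class HeartbeatCounter:
    """Counts the writes a hypothetical kubelet sends to the API server."""
    lease_renewals: int = 0
    status_updates: int = 0


def simulate_heartbeats(duration_s: int, status_change_times: set[int]) -> HeartbeatCounter:
    """Simulate one node for duration_s seconds, one tick per second.

    status_change_times holds the seconds at which the node's status
    actually changes (for example, a condition flips).
    """
    counter = HeartbeatCounter()
    last_status_update = 0
    for now in range(1, duration_s + 1):
        # Cheap heartbeat: renew the Lease on a fixed period.
        if now % LEASE_RENEW_INTERVAL_S == 0:
            counter.lease_renewals += 1
        # Expensive heartbeat: send the full NodeStatus only when something
        # changed or the periodic report interval has elapsed.
        changed = now in status_change_times
        overdue = now - last_status_update >= STATUS_REPORT_INTERVAL_S
        if changed or overdue:
            counter.status_updates += 1
            last_status_update = now
    return counter


if __name__ == "__main__":
    quiet_hour = simulate_heartbeats(3600, set())
    print(quiet_hour)
```

Under these assumptions, a quiet node sends many small Lease renewals and only a handful of full status reports per hour, which is why the frequent liveness signal was moved to the lightweight object.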

For example, the k8s scheduler is responsible for scheduling pods onto nodes. It tries to find the best-fit node for each pod, optimizing memory, CPU, and other resource usage on the node. However, it would not want to schedule a pod onto a node whose status has the NetworkUnavailable condition set to True, or any other condition that makes the node unsuitable for running that pod.
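As a rough illustration of how a consumer of NodeStatus might use those conditions, the sketch below filters out nodes that are not Ready or that report a problem condition. This is a deliberate simplification and not the scheduler's actual logic (the real scheduler works through its own filtering plugins and node taints); the helper names and the sample node data are hypothetical, while the condition types and the "True"/"False"/"Unknown" status values are the standard ones.

```python
from __future__ import annotations

# Standard node condition types that signal a problem when "True".
PROBLEM_CONDITIONS = (
    "NetworkUnavailable",
    "MemoryPressure",
    "DiskPressure",
    "PIDPressure",
)


def node_is_suitable(conditions: dict[str, str]) -> bool:
    """Return True if the node is Ready and reports no problem condition."""
    # A missing or "Unknown" Ready condition counts as not suitable.
    if conditions.get("Ready") != "True":
        return False
    # A missing problem condition is treated as "False" (no problem).
    return all(conditions.get(name, "False") != "True" for name in PROBLEM_CONDITIONS)


def filter_nodes(nodes: dict[str, dict[str, str]]) -> list[str]:
    """Keep only the names of nodes that pass the condition check."""
    return [name for name, conditions in nodes.items() if node_is_suitable(conditions)]


if __name__ == "__main__":
    # Hypothetical cluster state, for illustration only.
    cluster = {
        "node-a": {"Ready": "True", "NetworkUnavailable": "False"},
        "node-b": {"Ready": "True", "NetworkUnavailable": "True"},
        "node-c": {"Ready": "Unknown"},
    }
    print(filter_nodes(cluster))
```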

If there is a signal or field that you don't know or understand, there is a good chance that some controller relies on it to implement its logic.

EDIT:

The node-controller is a part of the kube-controller-manager:

The Kubernetes controller manager is a daemon that embeds the core control loops shipped with Kubernetes. In applications of robotics and automation, a control loop is a non-terminating loop that regulates the state of the system. In Kubernetes, a controller is a control loop that watches the shared state of the cluster through the apiserver and makes changes attempting to move the current state towards the desired state. Examples of controllers that ship with Kubernetes today are the replication controller, endpoints controller, namespace controller, and serviceaccounts controller.

Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.
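So the health check is a long-running loop inside the controller manager process rather than a Job or CronJob. A minimal sketch of such a loop is below, assuming a simplified model in which a node whose last heartbeat is older than a grace period gets its Ready status reported as "Unknown". The function names, the callback-based structure, and the example values (a 5 second check period and a 40 second grace period) are illustrative assumptions, not the node controller's real implementation or guaranteed defaults.

```python
from __future__ import annotations

import time
from typing import Callable


def check_nodes(last_heartbeat: dict[str, float], now: float, grace_period_s: float) -> dict[str, str]:
    """One pass of the loop: derive a Ready status per node from heartbeat age."""
    return {
        node: "True" if now - seen <= grace_period_s else "Unknown"
        for node, seen in last_heartbeat.items()
    }


def run_monitor(
    get_heartbeats: Callable[[], dict[str, float]],
    report: Callable[[dict[str, str]], None],
    passes: int,
    period_s: float = 5.0,          # illustrative check period
    grace_period_s: float = 40.0,   # illustrative grace period
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Observe, compare, act, sleep, repeat.

    A real controller loops until the process stops; 'passes' bounds the
    loop here so the sketch terminates.
    """
    for _ in range(passes):
        statuses = check_nodes(get_heartbeats(), clock(), grace_period_s)
        report(statuses)
        sleep(period_s)


if __name__ == "__main__":
    # Fake clock and no-op sleep so the demo finishes instantly.
    fake_times = iter([10.0, 50.0])
    run_monitor(
        get_heartbeats=lambda: {"node-a": 0.0},
        report=print,
        passes=2,
        clock=lambda: next(fake_times),
        sleep=lambda _seconds: None,
    )
```

Passing the clock and sleep functions in as parameters is only a convenience for trying the loop without waiting in real time.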

EDIT_2:

Based on your latest comments, we have 2 additional points to address:

  1. "how the node-controller processes the node health check"

If you are deploying or operating k8s, you probably don't need to know this level of detail. Everything that should be useful to you is already in the linked public docs. There is no need to worry about it, but I understand that it led to a more practical question:

  2. I am not sure how much load a big cluster can generate.

This is where the "Considerations for large clusters" page of the docs helps. It shows how to handle big clusters and which tools are at your disposal for managing them.