I have set up a Kubernetes HA cluster (stacked etcd) using kubeadm. When I deliberately shut down one master node, the whole cluster goes down and kubectl returns the following error:

[vagrant@k8s-master01 ~]$ kubectl get nodes
Error from server: etcdserver: request timed out

I am using Nginx as a load balancer in front of the kube-apiservers. My nodes:

NAME           STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8s-master01   Ready    master   27d   v1.19.2   192.168.30.5    <none>        CentOS Linux 7 (Core)   3.10.0-1127.19.1.el7.x86_64   docker://19.3.11
k8s-master02   Ready    master   27d   v1.19.2   192.168.30.6    <none>        CentOS Linux 7 (Core)   3.10.0-1127.19.1.el7.x86_64   docker://19.3.11
k8s-worker01   Ready    <none>   27d   v1.19.2   192.168.30.10   <none>        CentOS Linux 7 (Core)   3.10.0-1127.19.1.el7.x86_64   docker://19.3.11
k8s-worker02   Ready    <none>   27d   v1.19.2   192.168.30.11   <none>        CentOS Linux 7 (Core)   3.10.0-1127.19.1.el7.x86_64   docker://19.3.11

[vagrant@k8s-master01 ~]$ kubectl get pods -n kube-system
NAME                                   READY   STATUS    RESTARTS   AGE
coredns-f9fd979d6-wkknl                0/1     Running   9          27d
coredns-f9fd979d6-wp854                1/1     Running   8          27d
etcd-k8s-master01                      1/1     Running   46         27d
etcd-k8s-master02                      1/1     Running   10         27d
kube-apiserver-k8s-master01            1/1     Running   60         27d
kube-apiserver-k8s-master02            1/1     Running   13         27d
kube-controller-manager-k8s-master01   1/1     Running   20         27d
kube-controller-manager-k8s-master02   1/1     Running   15         27d
kube-proxy-7vn9l                       1/1     Running   7          26d
kube-proxy-9kjrj                       1/1     Running   7          26d
kube-proxy-lbmkz                       1/1     Running   8          27d
kube-proxy-ndbp5                       1/1     Running   9          27d
kube-scheduler-k8s-master01            1/1     Running   20         27d
kube-scheduler-k8s-master02            1/1     Running   15         27d
weave-net-77ck8                        2/2     Running   21         26d
weave-net-bmpsf                        2/2     Running   24         27d
weave-net-frchk                        2/2     Running   27         27d
weave-net-zqjzf                        2/2     Running   22         26d
[vagrant@k8s-master01 ~]$

Nginx config:

stream {
        upstream apiserver_read {
             server 192.168.30.5:6443;
             server 192.168.30.6:6443;
        }
        server {
                listen 6443;
                proxy_pass apiserver_read;
        }
}

Nginx logs:

2020/10/19 09:12:01 [error] 1215#0: *12460 no live upstreams while connecting to upstream, client: 192.168.30.11, server: 0.0.0.0:6443, upstream: "apiserver_read", bytes from/to client:0/0, bytes from/to upstream:0/0
2020/10/19 09:12:01 [error] 1215#0: *12465 no live upstreams while connecting to upstream, client: 192.168.30.5, server: 0.0.0.0:6443, upstream: "apiserver_read", bytes from/to client:0/0, bytes from/to upstream:0/0
2020/10/19 09:12:02 [error] 1215#0: *12466 no live upstreams while connecting to upstream, client: 192.168.30.10, server: 0.0.0.0:6443, upstream: "apiserver_read", bytes from/to client:0/0, bytes from/to upstream:0/0
2020/10/19 09:12:02 [error] 1215#0: *12467 no live upstreams while connecting to upstream, client: 192.168.30.11, server: 0.0.0.0:6443, upstream: "apiserver_read", bytes from/to client:0/0, bytes from/to upstream:0/0
2020/10/19 09:12:02 [error] 1215#0: *12468 no live upstreams while connecting to upstream, client: 192.168.30.5, server: 0.0.0.0:6443, upstream: "apiserver_read", bytes from/to client:0/0, bytes from/to upstream:0/0
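The "no live upstreams" errors just mean nginx has marked both backends as failed, so the problem is on the kube-apiservers themselves rather than on the load balancer. A quick direct check of each backend, bypassing nginx, would look roughly like this (a sketch; it assumes the default kubeadm setup where /healthz is reachable without authentication, and -k skips TLS verification):

$ curl -k https://192.168.30.5:6443/healthz
$ curl -k https://192.168.30.6:6443/healthz

A healthy apiserver answers "ok"; with etcd out of quorum both requests are expected to fail or time out.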


There are 2 answers

CrowDev (accepted answer):

The reason etcd times out is that it is a distributed key-value store and it needs quorum to stay healthy. This basically means that all members of an etcd cluster vote on certain decisions and the majority decides what to do. When you have 3 nodes you can always lose 1, because the remaining 2 nodes still form a majority.

The problem with having 2 nodes is that when 1 goes down, the last etcd node waits for a majority vote before deciding anything, and with a single remaining member that majority can never be reached.

This is why you always need an odd number of master nodes in a Kubernetes cluster.
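While the cluster is still healthy, membership can be inspected from inside one of the etcd pods, roughly like this (a sketch, assuming the default kubeadm certificate paths under /etc/kubernetes/pki/etcd):

# List etcd members via the local client endpoint of the etcd pod on master01.
$ kubectl -n kube-system exec etcd-k8s-master01 -- etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    member list

With 2 members the quorum is 2 (quorum = floor(n/2) + 1), so losing either member blocks all writes; with 3 members the quorum is still 2, so one member can fail without taking the cluster down.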

mre:

I have the same setup (stacked etcd, but with keepalived and HAProxy instead of nginx) and I had the same problem.

You need at least 3 (!) control plane nodes. Only then can you shut down one of the three control plane nodes without losing functionality.

3 out of 3 control plane nodes up:

$ kubectl get pods -n kube-system
[...list of pods...]

2 out of 3 control plane nodes up:

$ kubectl get pods -n kube-system
[...list of pods...]

1 out of 3 control plane nodes up:

$ kubectl get pods -n kube-system
Error from server: etcdserver: request timed out

Again 2 out of 3 up:

$ kubectl get pods -n kube-system
[...list of pods...]
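For completeness, adding a third control plane node with kubeadm looks roughly like this (a sketch; the token, hash and certificate key are placeholders printed by the commands run on an existing master, and the join address should be the load-balancer endpoint):

# On an existing master: re-upload the control-plane certificates and get a join token.
$ sudo kubeadm init phase upload-certs --upload-certs
$ sudo kubeadm token create --print-join-command

# On the new node (e.g. k8s-master03): join as an additional control plane.
$ sudo kubeadm join <load-balancer-ip>:6443 --token <token> \
      --discovery-token-ca-cert-hash sha256:<hash> \
      --control-plane --certificate-key <certificate-key>

Remember to add the new node's IP to the nginx upstream block as well.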