Kubernetes failed to set up network for pod after executing kubeadm reset


I initialized Kubernetes with kubeadm init, then used kubeadm reset to reset it because --pod-network-cidr was wrong. After correcting it, I tried to initialize Kubernetes again like this:

kubeadm init --use-kubernetes-version v1.5.1 --external-etcd endpoints=http://10.111.125.131:2379 --pod-network-cidr=10.244.0.0/16

Then I got these errors on the nodes:

Dec 28 15:30:55 ydtf-node-137 kubelet[13333]: E1228 15:30:55.838700   13333 cni.go:255] Error adding network: no IP addresses available in network: cbr0
Dec 28 15:30:55 ydtf-node-137 kubelet[13333]: E1228 15:30:55.838727   13333 cni.go:209] Error while adding to cni network: no IP addresses available in network: cbr0
Dec 28 15:30:55 ydtf-node-137 kubelet[13333]: E1228 15:30:55.838781   13333 docker_manager.go:2201] Failed to setup network for pod "test-701078429-tl3j2_default(6945191b-ccce-11e6-b53d-78acc0f9504e)" using network plugins "cni": no IP addresses available in network: cbr0; Skipping pod
Dec 29 10:20:02 ydtf-node-137 kubelet: E1229 10:20:02.065142   22259 pod_workers.go:184] Error syncing pod 235cd9c6-cd6c-11e6-a9cd-78acc0f9504e, skipping: failed to "SetupNetwork" for "test-701078429-zmkdf_default" with SetupNetworkError: "Failed to setup network for pod \"test-701078429-zmkdf_default(235cd9c6-cd6c-11e6-a9cd-78acc0f9504e)\" using network plugins \"cni\": \"cni0\" already has an IP address different from 10.244.1.1/24; Skipping pod"


Why can't I create a network for the new pods?

By the way, I use flannel as the network provider, and it was working fine before the reset.

[root@ydtf-master-131 k8s151]# kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                      READY     STATUS                       RESTARTS   AGE       IP               NODE
default       test-701078429-tl3j2                      0/1       ContainerCreating   0          2h        <none>           ydtf-node-137
kube-system   dummy-2088944543-hd7b7                    1/1       Running             0          2h        10.111.125.131   ydtf-master-131
kube-system   kube-apiserver-ydtf-master-131            1/1       Running             7          2h        10.111.125.131   ydtf-master-131
kube-system   kube-controller-manager-ydtf-master-131   1/1       Running             0          2h        10.111.125.131   ydtf-master-131
kube-system   kube-discovery-1769846148-bjgp8           1/1       Running             0          2h        10.111.125.131   ydtf-master-131
kube-system   kube-dns-2924299975-q8x2m                 4/4       Running             0          2h        10.244.0.3       ydtf-master-131 
kube-system   kube-flannel-ds-3fsjh                     2/2       Running             0          2h        10.111.125.137   ydtf-node-137
kube-system   kube-flannel-ds-89r72                     2/2       Running             0          2h        10.111.125.131   ydtf-master-131
kube-system   kube-proxy-7w8c4                          1/1       Running             0          2h        10.111.125.137   ydtf-node-137
kube-system   kube-proxy-jk6z6                          1/1       Running             0          2h        10.111.125.131   ydtf-master-131
kube-system   kube-scheduler-ydtf-master-131            1/1       Running             0          2h        10.111.125.131   ydtf-master-131

There are 6 answers

Robat.Michael

I figured it out: if you change --pod-network-cidr when you reinitialize Kubernetes via kubeadm init, you need to delete some auto-created artifacts. Follow the steps below before executing kubeadm init again:

  1. execute kubeadm reset on master and nodes.

  2. execute etcdctl rm --recursive registry to reset the data stored in etcd.

  3. rm -rf /var/lib/cni on master and nodes

  4. rm -rf /run/flannel on master and nodes

  5. rm -rf /etc/cni on master and nodes

  6. ifconfig cni0 down on master and nodes

  7. brctl delbr cni0 on master and nodes
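For convenience, the steps above can be collected into a sketch of a cleanup script. This is an assumption-laden sketch, not a tested tool: it uses the external etcd endpoint from the question (adjust it for your cluster), and the node-local parts must be run on the master and on every node.

```shell
#!/bin/sh
# Sketch of the cleanup steps above; destructive, adjust before use.
# Run on the master and on every node; the etcd step is needed only once.

kubeadm reset

# Wipe the old cluster state from the external etcd
# (endpoint taken from the question; replace with yours).
etcdctl rm --recursive registry

# Remove leftover CNI and flannel state.
rm -rf /var/lib/cni /run/flannel /etc/cni

# Tear down the stale cni0 bridge.
ifconfig cni0 down
brctl delbr cni0
```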

Now, my Kubernetes works fine :)

Filipe Felisbino

I had a similar issue and the fix in that case was to apply the flannel pod network to the cluster:

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml
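After applying the manifest, one way to verify the flannel pods came up is to list them (as in the question's own output; exact pod names vary by flannel version):

```shell
kubectl get pods -n kube-system -o wide | grep flannel
```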
Michel Schildmeijer

What helped for me:

  • ip link set cni0 down

  • brctl delbr cni0

  • Delete and re-apply flannel

So there is no need to build up your cluster again.
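A sketch of those steps as commands, assuming kube-flannel.yml is the manifest downloaded in the previous answer:

```shell
# Tear down the stale cni0 bridge on the affected node.
ip link set cni0 down
brctl delbr cni0

# Delete and re-apply flannel from its manifest
# (kube-flannel.yml as fetched from the flannel repository).
kubectl delete -f kube-flannel.yml
kubectl apply -f kube-flannel.yml
```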

Ivan Aracki

After kubeadm reset and before kubeadm init, do the following on both master and worker nodes:

  1. kubeadm reset
  2. systemctl stop kubelet && systemctl stop docker
  3. rm -rf /var/lib/cni/
  4. rm -rf /var/lib/kubelet/*
  5. rm -rf /etc/cni/
  6. ifconfig cni0 down && ip link delete cni0
  7. ifconfig flannel.1 down && ip link delete flannel.1
  8. ifconfig docker0 down
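Put together, a sketch of that sequence as a script (destructive; intended to be run as root on master and worker nodes):

```shell
#!/bin/sh
# Sketch of the reset sequence above; wipes kubelet and CNI state.
kubeadm reset
systemctl stop kubelet && systemctl stop docker

rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/cni/

# Remove the stale network interfaces left behind by flannel/CNI.
ifconfig cni0 down && ip link delete cni0
ifconfig flannel.1 down && ip link delete flannel.1
ifconfig docker0 down
```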

Tested with Kubernetes server version v1.13.2 and flannel v0.10.0-amd64.

github issue reference

solsson

I had an issue after changing --pod-network-cidr: join reported success, but no node was added. kubeadm reset and re-joining had no effect. I solved it with apt-get remove kubelet kubeadm kubectl kubernetes-cni after the reset, followed by a Docker and/or machine restart, then a reinstall, and then joining again.
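A sketch of that recovery on a Debian/Ubuntu node. The join line is a placeholder, not a real command: use the exact kubeadm join command that kubeadm init printed for your cluster.

```shell
#!/bin/sh
# Sketch of the remove-and-reinstall recovery described above.
kubeadm reset
apt-get remove -y kubelet kubeadm kubectl kubernetes-cni
systemctl restart docker   # or reboot the machine

apt-get install -y kubelet kubeadm kubectl kubernetes-cni
# kubeadm join ...   # placeholder: use your cluster's own join command
```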

Daniel Maldonado

This document helped a lot:

https://github.com/feiskyer/kubernetes-handbook/blob/master/en/troubleshooting/pod.md

especially the part that applies to this issue:

$ ip link set cni0 down
$ brctl delbr cni0  

If you do this on the API servers and then reboot the machine, it should stabilize pretty quickly.