Docker Swarm: Calico node pod is unhealthy


Details:

OS: RHEL 7.4

uname -r: 3.10.0-693.el7.x86_64

Docker version

Client: Docker Enterprise Edition (EE) 2.0
 Version:       17.06.2-ee-10
 API version:   1.30
 Go version:    go1.8.7
 Git commit:    66261a0
 Built: Fri Apr 27 00:38:41 2018
 OS/Arch:       linux/amd64

Server: Docker Enterprise Edition (EE) 2.0
 Engine:
  Version:      17.06.2-ee-10
  API version:  1.30 (minimum version 1.12)
  Go version:   go1.8.7
  Git commit:   66261a0
  Built:        Fri Apr 27 00:40:03 2018
  OS/Arch:      linux/amd64
  Experimental: false

Error

Calico-node pod is unhealthy: %!s(<nil>)

I am trying to join a node to a Docker Swarm cluster as a worker, but I'm getting the aforementioned error in the health status check. As a result, the node is unable to join the swarm cluster.

The desired result is for the node to be added to the swarm cluster successfully.

Regards Aditya


There are 2 answers

Answer by CK5 (best answer)

I was able to resolve the issue by changing to /proc/sys/net/ipv4/conf/all and checking the rp_filter file there. If rp_filter is 2, change the value to 1 or 0. If you edit the file in vi, save and exit with :wq.

The node should now join the network without any issue.
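As a rough sketch of the check described above, the value can be read without opening an editor. The path is the one named in this answer; the function name `rp_filter_needs_change` and the `RP_FILTER_FILE` override are made up for illustration. The `sysctl -w` command in the message is the usual way to change the value as root, and it is only printed here, not run.

```shell
# Path taken from the answer; RP_FILTER_FILE is a hypothetical override for dry runs.
RP_FILTER_FILE="${RP_FILTER_FILE:-/proc/sys/net/ipv4/conf/all/rp_filter}"

# Hypothetical helper: succeeds (exit 0) when the given file contains the value 2.
rp_filter_needs_change() {
    [ "$(cat "$1" 2>/dev/null)" = "2" ]
}

if [ -r "$RP_FILTER_FILE" ]; then
    if rp_filter_needs_change "$RP_FILTER_FILE"; then
        echo "rp_filter is 2; as root, run: sysctl -w net.ipv4.conf.all.rp_filter=1"
    else
        echo "rp_filter is $(cat "$RP_FILTER_FILE"); no change needed"
    fi
fi
```

Values written under /proc/sys are lost on reboot. To keep the setting, it would normally also go into a file under /etc/sysctl.d/ (the file name is up to you).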

Regards

KrisT

Answer by bgercken

I ran into the same issue in my test environment.

In my case it turned out that I was running out of disk space when I joined the node to the swarm.

Make sure that you have enough free space in /var/lib/docker on your host.
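A quick way to check this might look like the sketch below. The 5 GiB threshold, the `DOCKER_DIR` and `MIN_FREE_KB` variables, and the `free_kb` helper are assumptions for illustration, not values from UCP; pick a threshold that matches the images your node has to pull.

```shell
DOCKER_DIR="${DOCKER_DIR:-/var/lib/docker}"
MIN_FREE_KB=$((5 * 1024 * 1024))   # hypothetical threshold: 5 GiB

# Hypothetical helper: print the available space, in KiB, on the filesystem holding $1.
free_kb() {
    # -P keeps each filesystem on one line; -k reports 1024-byte blocks; column 4 is "Available".
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

if [ -d "$DOCKER_DIR" ]; then
    avail=$(free_kb "$DOCKER_DIR")
    if [ "$avail" -lt "$MIN_FREE_KB" ]; then
        echo "Low disk space in $DOCKER_DIR: ${avail} KiB available"
    else
        echo "Disk space looks fine in $DOCKER_DIR: ${avail} KiB available"
    fi
fi
```

Running `docker system df` on the node also shows how much of that space is taken by images, containers, and volumes.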

You can tell that this is the problem if the ucp-calico-cni container ("/install-cni.sh") starts and then suddenly fails.

You can see this by doing the following:

  1. Remove the node from the swarm:

    docker swarm leave
    
  2. Then rejoin it using your own join token and manager address:

    docker swarm join --token SWMTKN-1-0le10al9t1coov7c23mg28gcviozrr1ggueqwlyjt51i7gpefd-5xxre29bwafxg0xud1abcdefg 192.168.0.191:2377
    
  3. Then immediately start watching the containers:

    watch "docker ps"
    

You should see a ucp-pause container start, followed by a ucp-calico-cni container.

If ucp-calico-cni starts and then fails, disk space may be your problem.
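Because `docker ps` hides exited containers, a failed install-cni container can disappear from the `watch` output before you notice it. The sketch below lists matching containers including exited ones. The function name `calico_cni_status` is made up, and the default filter `install-cni` is an assumption based on the container names in the sample output that follows.

```shell
# Hypothetical helper: list containers (running or exited) whose name matches the filter.
calico_cni_status() {
    docker ps -a --filter "name=${1:-install-cni}" \
        --format '{{.ID}}  {{.Status}}  {{.Names}}'
}

# Usage on the node (not executed here):
#   calico_cni_status
#   docker logs <container-id>    # inspect why install-cni.sh exited
```

An "Exited" status shortly after the join, together with errors in the container logs, would point toward the failure described above.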

A successful start up should look like:

34ed65e25213        docker/ucp-calico-cni        "/install-cni.sh"        8 seconds ago       Up 7 seconds
                          k8s_install-cni_calico-node-c2zd5_kube-system_ce6396d7-b16b-11e8-b3c7-0242ac11000b_0
21e1e3ff96f0        docker/ucp-calico-node       "start_runit"            14 seconds ago      Up 13 seconds
                          k8s_calico-node_calico-node-c2zd5_kube-system_ce6396d7-b16b-11e8-b3c7-0242ac11000b_0
a206f3242319        docker/ucp-pause:3.0.3       "/pause"                 29 seconds ago      Up 27 seconds
                          k8s_POD_calico-node-c2zd5_kube-system_ce6396d7-b16b-11e8-b3c7-0242ac11000b_0
840a48831f1b        docker/ucp-agent:3.0.3       "/bin/ucp-agent agent"   35 seconds ago      Up 29 seconds             2376/tcp
                          ucp-agent.u0a7uoqgrav90039vbvj43qt8.kdlov8fvojxjo291dph3ihcm2
74acd9eaabba        docker/ucp-hyperkube:3.0.3   "kubelet --allow-p..."   36 seconds ago      Up 35 seconds
                          ucp-kubelet
6f196e802795        docker/ucp-hyperkube:3.0.3   "kube-proxy --clus..."   36 seconds ago      Up 35 seconds
                          ucp-kube-proxy
1e695e3ac165        docker/ucp-agent:3.0.3       "/bin/ucp-agent pr..."   37 seconds ago      Up 36 seconds (healthy)   0.0.0.0:6444->6444/tcp, 0.0.0.0:12378->12378/tcp, 0.0.0.0:12376->2376/tcp
                          ucp-proxy