Starting this Monday, after an upgrade to 1.17.2 from within Rancher, each new node (DigitalOcean droplets, all running Ubuntu 18.04.3) gets its InternalIP incorrectly set to its ExternalIP, i.e. the public one! This is in turn my primary suspect for the lack of intra-cluster DNS resolution we've been experiencing since Monday: I've just found that the unresponsive services were running on the new node where InternalIP=ExternalIP.
kubectl describe node exacto-devel-mail-01
...
Addresses:
  InternalIP:  37.139.20.177
  Hostname:    exacto-devel-mail-01
An "old" node (created before the 1.17.2 upgrade, so presumably while we were running 1.16.6):
kubectl describe node exacto-devel-06
...
Addresses:
  InternalIP:  10.129.254.119
  Hostname:    exacto-devel-06
I've tried editing the node object and assigning the correct InternalIP value, but nothing happened: it just keeps showing the wrong address!
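For context on why the edit doesn't stick: node addresses are reported by the kubelet on every status update, so any value written into the Node object gets overwritten. This is a sketch of how I checked what the kubelet is advertising on an affected node (in Rancher/RKE clusters the kubelet runs as a Docker container named "kubelet"; the eth1 interface for DigitalOcean private networking is an assumption based on our droplets):

```shell
# On the affected node: check whether the kubelet was started with a
# --node-ip flag (if absent, it picks an address itself).
docker inspect kubelet | grep -- --node-ip

# Compare with the droplet's private address (DigitalOcean private
# networking is usually on eth1 -- assumption, verify on your droplets).
ip -4 addr show eth1
```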
This failure to resolve cluster-DNS names for containers running on the "bad" nodes with broken InternalIPs also appeared on another cluster after upgrading it to v1.16.6. So I can say the issue affects at least Kubernetes 1.16.6 and 1.17.2 in Rancher-managed k8s clusters.
To further clarify the issue, here's the current list of nodes in my development environment cluster:
kubectl get nodes -o wide
NAME                  STATUS   ROLES               AGE    VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
exacto-devel-01       Ready    controlplane,etcd   11d    v1.17.2   37.139.0.68      <none>        RancherOS v1.5.5     4.14.138-rancher    docker://19.3.5
exacto-devel-02       Ready    controlplane,etcd   11d    v1.17.2   37.139.0.231     <none>        RancherOS v1.5.5     4.14.138-rancher    docker://19.3.5
exacto-devel-03       Ready    controlplane,etcd   11d    v1.17.2   37.139.4.139     <none>        RancherOS v1.5.5     4.14.138-rancher    docker://19.3.5
exacto-devel-04       Ready    worker              11d    v1.17.2   10.129.254.158   <none>        Ubuntu 18.04.3 LTS   4.15.0-74-generic   docker://19.3.5
exacto-devel-05       Ready    worker              11d    v1.17.2   10.129.254.200   <none>        Ubuntu 18.04.3 LTS   4.15.0-74-generic   docker://19.3.5
exacto-devel-06       Ready    worker              10d    v1.17.2   10.129.254.119   <none>        Ubuntu 18.04.3 LTS   4.15.0-74-generic   docker://19.3.2
exacto-devel-elk-01   Ready    worker              25h    v1.17.2   185.14.186.204   <none>        Ubuntu 18.04.4 LTS   4.15.0-76-generic   docker://19.3.5
exacto-devel-elk-02   Ready    worker              7h8m   v1.17.2   198.211.118.87   <none>        Ubuntu 18.04.4 LTS   4.15.0-76-generic   docker://19.3.5
As you can see, nodes 01, 02, 03 and elk-01, elk-02 are all affected: under the INTERNAL-IP column header there's a clearly identifiable public IP! While it doesn't seem to matter on nodes 01, 02 and 03, since they carry the etcd and controlplane roles, it is actually blocking the expansion of the cluster with new functionality (ELK in this example), since any workload deployed on those new nodes faces intra-cluster DNS resolution issues.
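The DNS symptom can be reproduced by scheduling a throwaway pod onto an affected node and resolving an in-cluster name (the node name is one of mine; the pod name and image tag are just illustrative):

```shell
# Run a temporary busybox pod pinned to an affected node and try to
# resolve the API service name via cluster DNS.
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.31 \
  --overrides='{"spec":{"nodeName":"exacto-devel-elk-01"}}' \
  -- nslookup kubernetes.default.svc.cluster.local
# On the "bad" nodes this times out; on the old nodes it resolves.
```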
Please advise on what to do next! Thank you
When you create a custom cluster from Rancher, you can specify the IP addresses on the Rancher agent registration command, and the agent will register the node with those addresses.
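Concretely, the node registration command that Rancher generates for a custom cluster accepts `--address` and `--internal-address` flags; appending the droplet's private IP makes the node register with the correct InternalIP. A sketch, with the server URL, token, checksum, agent version and IPs as placeholders for your cluster's values:

```shell
# Rancher agent registration with explicit public and private addresses.
# <TOKEN>, <CA_CHECKSUM>, the URL, the agent tag and the IPs are placeholders.
sudo docker run -d --privileged --restart=unless-stopped --net=host \
  -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run \
  rancher/rancher-agent:v2.3.5 \
  --server https://rancher.example.com \
  --token <TOKEN> --ca-checksum <CA_CHECKSUM> \
  --worker \
  --address 198.211.118.87 \
  --internal-address 10.129.254.x
```

With `--internal-address` set, the kubelet advertises that address as the node's InternalIP, so already-registered nodes with the wrong address have to be removed and re-registered with the corrected command.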