Kubernetes networking issue - Service nodePort can't be reached externally


I have a 3-node Kubernetes cluster: node1 (master), node2, and node3. I have a pod currently scheduled on node3 that I'd like to expose outside the cluster, so I have a Service of type NodePort with nodePort set to 30080. I can successfully curl localhost:30080 locally on each node: node1, node2, and node3. But from outside the cluster, curl nodeX:30080 only works against node3; the other two time out. tcpdump confirms that node1 and node2 receive the request but never respond.
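
For reference, the Service is defined roughly like this (a sketch from memory; the name, selector, and ports match the kubectl output below, but the exact manifest may differ):

$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: kube-registry
  namespace: kube-system
  labels:
    k8s-app: kube-registry
spec:
  type: NodePort
  selector:
    k8s-app: kube-registry
  ports:
  - name: registry
    port: 5000
    targetPort: 5000
    nodePort: 30080
EOF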

How can I make this work for all three nodes so I don't have to keep track of which node the pod is currently scheduled on? My best guess is that this is an iptables issue: I'm missing a rule to DNAT traffic whose source IP isn't localhost. That said, I have no idea how to troubleshoot to confirm that this is actually the problem, or how to fix it. It seems like that rule should be created automatically.
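
For what it's worth, this is roughly how I've been checking things on the nodes so far (standard iptables and tcpdump tooling; the full iptables-save output is further down):

# On a node where the external curl hangs (e.g. node1):

# confirm the NodePort DNAT rules exist in the nat table
$ sudo iptables -t nat -S KUBE-NODEPORTS | grep 30080

# check the filter FORWARD chain policy and packet counters while retrying
# the external curl, in case forwarded traffic is being filtered
$ sudo iptables -L FORWARD -v -n

# confirm the request actually reaches the node
$ sudo tcpdump -ni any tcp port 30080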

Here's some info on my setup:
kube-ravi196: 10.163.148.196
kube-ravi197: 10.163.148.197
kube-ravi198: 10.163.148.198
CNI: Canal (flannel + calico)
Host OS: Ubuntu 16.04
Cluster set up through kubeadm

$ kubectl get pods --namespace=kube-system -l "k8s-app=kube-registry" -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP              NODE
kube-registry-v0-1mthd   1/1       Running   0          39m       192.168.75.13   ravi-kube198

$ kubectl get service --namespace=kube-system -l "k8s-app=kube-registry"
NAME            CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
kube-registry   10.100.57.109   <nodes>       5000:30080/TCP   5h

$ kubectl get pods --namespace=kube-system -l "k8s-app=kube-proxy" -o wide
NAME               READY     STATUS    RESTARTS   AGE       IP               NODE
kube-proxy-1rzz8   1/1       Running   0          40m       10.163.148.198   ravi-kube198
kube-proxy-fz20x   1/1       Running   0          40m       10.163.148.197   ravi-kube197
kube-proxy-lm7nm   1/1       Running   0          40m       10.163.148.196   ravi-kube196

Note that curl localhost:30080 from node ravi-kube196 is successful (a 404 here is expected; it means the service answered):

deploy@ravi-kube196:~$ curl localhost:30080/test
404 page not found

But trying to curl that node's IP from a machine outside the cluster hangs:

ravi@rmac2015:~$ curl 10.163.148.196:30080/test
(hangs)

Then trying to curl the IP of the node that the pod is scheduled on works:

ravi@rmac2015:~$ curl 10.163.148.198:30080/test
404 page not found

Here are my iptables rules for that service/pod on the 196 node:

deploy@ravi-kube196:~$ sudo iptables-save | grep registry
-A KUBE-NODEPORTS -p tcp -m comment --comment "kube-system/kube-registry:registry" -m tcp --dport 30080 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "kube-system/kube-registry:registry" -m tcp --dport 30080 -j KUBE-SVC-JV2WR75K33AEZUK7
-A KUBE-SEP-7BIJVD3LRB57ZVM2 -s 192.168.75.13/32 -m comment --comment "kube-system/kube-registry:registry" -j KUBE-MARK-MASQ
-A KUBE-SEP-7BIJVD3LRB57ZVM2 -p tcp -m comment --comment "kube-system/kube-registry:registry" -m tcp -j DNAT --to-destination 192.168.75.13:5000
-A KUBE-SEP-7QBKTOBWZOW2ADYZ -s 10.163.148.196/32 -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-MARK-MASQ
-A KUBE-SEP-7QBKTOBWZOW2ADYZ -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m tcp -j DNAT --to-destination 10.163.148.196:1
-A KUBE-SEP-DARQFIU6CIZ6DHSZ -s 10.163.148.198/32 -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-MARK-MASQ
-A KUBE-SEP-DARQFIU6CIZ6DHSZ -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m tcp -j DNAT --to-destination 10.163.148.198:1
-A KUBE-SEP-KXX2UKHAML22525B -s 10.163.148.197/32 -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-MARK-MASQ
-A KUBE-SEP-KXX2UKHAML22525B -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m tcp -j DNAT --to-destination 10.163.148.197:1
-A KUBE-SERVICES ! -s 192.168.0.0/16 -d 10.106.192.243/32 -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc: cluster IP" -m tcp --dport 1 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.106.192.243/32 -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc: cluster IP" -m tcp --dport 1 -j KUBE-SVC-E66MHSUH4AYEXSQE
-A KUBE-SERVICES ! -s 192.168.0.0/16 -d 10.100.57.109/32 -p tcp -m comment --comment "kube-system/kube-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.100.57.109/32 -p tcp -m comment --comment "kube-system/kube-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-SVC-JV2WR75K33AEZUK7
-A KUBE-SVC-E66MHSUH4AYEXSQE -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-7QBKTOBWZOW2ADYZ
-A KUBE-SVC-E66MHSUH4AYEXSQE -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-KXX2UKHAML22525B
-A KUBE-SVC-E66MHSUH4AYEXSQE -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-SEP-DARQFIU6CIZ6DHSZ
-A KUBE-SVC-JV2WR75K33AEZUK7 -m comment --comment "kube-system/kube-registry:registry" -j KUBE-SEP-7BIJVD3LRB57ZVM2
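
One thing that helps localize the problem (a sketch; assumes the conntrack-tools package is installed on the node) is checking whether the DNAT is actually being applied to the external traffic:

# while the external client retries curl 10.163.148.196:30080, list the
# conntrack entries for the NodePort; an entry rewritten to the pod IP
# (192.168.75.13:5000) but stuck unreplied suggests the DNAT worked and
# the packet is being lost on the forwarding path instead
$ sudo conntrack -L -p tcp | grep 30080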

kube-proxy logs from the 196 node:

deploy@ravi-kube196:~$ kubectl logs --namespace=kube-system kube-proxy-lm7nm
I0105 06:47:09.813787       1 server.go:215] Using iptables Proxier.
I0105 06:47:09.815584       1 server.go:227] Tearing down userspace rules.
I0105 06:47:09.832436       1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0105 06:47:09.836004       1 conntrack.go:66] Setting conntrack hashsize to 32768
I0105 06:47:09.836232       1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0105 06:47:09.836260       1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600

1 Answer

Answered by ravishi:

I found the cause of why the service couldn't be reached externally: the iptables FORWARD chain was dropping the packets. I raised an issue with Kubernetes at https://github.com/kubernetes/kubernetes/issues/39658 with a lot more detail. A (poor) workaround is to change the default FORWARD policy to ACCEPT.
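
To confirm this on an affected node and apply that workaround, it's roughly the following (plain iptables commands; note that an ACCEPT default policy loosens the host's forwarding filter):

# show the FORWARD chain's default policy and per-rule packet counters
$ sudo iptables -L FORWARD -v -n

# (poor) workaround: accept forwarded traffic by default
$ sudo iptables -P FORWARD ACCEPT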

Update 1/10

I raised an issue with Canal, https://github.com/projectcalico/canal/issues/31, as it appears to be a Canal-specific issue: traffic being forwarded to the flannel.1 interface is getting dropped. A better fix than changing the default FORWARD policy to ACCEPT is to add an ACCEPT rule just for the flannel.1 interface: sudo iptables -A FORWARD -o flannel.1 -j ACCEPT.
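
To make that survive a reboot on Ubuntu 16.04, one option (assuming the iptables-persistent package; adjust to however you already manage firewall rules) is roughly:

# allow traffic forwarded out the flannel overlay interface
$ sudo iptables -A FORWARD -o flannel.1 -j ACCEPT

# persist the current rules across reboots
$ sudo apt-get install -y iptables-persistent
$ sudo netfilter-persistent save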