I have a three-node Kubernetes cluster: node1 (master), node2, and node3. A pod currently scheduled on node3 needs to be reachable from outside the cluster, so I created a service of type NodePort with the nodePort set to 30080. Running curl localhost:30080 locally succeeds on every node: node1, node2, and node3. From outside the cluster, however, curl nodeX:30080 only works against node3; requests to the other two time out. tcpdump confirms that node1 and node2 receive the request but never respond.
How can I make this work for all three nodes so I don't have to keep track of which node the pod is currently scheduled on? My best guess is that this is an iptables issue: I seem to be missing a rule that DNATs traffic when the source IP isn't localhost. That said, I don't know how to confirm that this is the problem or how to fix it. It seems like that rule should be there automatically.
Here's some info about my setup:
ravi-kube196: 10.163.148.196
ravi-kube197: 10.163.148.197
ravi-kube198: 10.163.148.198
CNI: Canal (flannel + calico)
Host OS: Ubuntu 16.04
Cluster set up through kubeadm
$ kubectl get pods --namespace=kube-system -l "k8s-app=kube-registry" -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP              NODE
kube-registry-v0-1mthd   1/1       Running   0          39m       192.168.75.13   ravi-kube198
$ kubectl get service --namespace=kube-system -l "k8s-app=kube-registry"
NAME            CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
kube-registry   10.100.57.109   <nodes>       5000:30080/TCP   5h
$ kubectl get pods --namespace=kube-system -l "k8s-app=kube-proxy" -o wide
NAME               READY     STATUS    RESTARTS   AGE       IP               NODE
kube-proxy-1rzz8   1/1       Running   0          40m       10.163.148.198   ravi-kube198
kube-proxy-fz20x   1/1       Running   0          40m       10.163.148.197   ravi-kube197
kube-proxy-lm7nm   1/1       Running   0          40m       10.163.148.196   ravi-kube196
Note that curl against localhost on node ravi-kube196 succeeds (a 404 is fine here; it means the registry responded).
deploy@ravi-kube196:~$ curl localhost:30080/test
404 page not found
But curling that node's IP from a machine outside the cluster fails:
ravi@rmac2015:~$ curl 10.163.148.196:30080/test
(hangs)
Curling the IP of the node that the pod is scheduled on works:
ravi@rmac2015:~$ curl 10.163.148.198:30080/test
404 page not found
Here are my iptables rules for that service/pod on the 196 node:
deploy@ravi-kube196:~$ sudo iptables-save | grep registry
-A KUBE-NODEPORTS -p tcp -m comment --comment "kube-system/kube-registry:registry" -m tcp --dport 30080 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "kube-system/kube-registry:registry" -m tcp --dport 30080 -j KUBE-SVC-JV2WR75K33AEZUK7
-A KUBE-SEP-7BIJVD3LRB57ZVM2 -s 192.168.75.13/32 -m comment --comment "kube-system/kube-registry:registry" -j KUBE-MARK-MASQ
-A KUBE-SEP-7BIJVD3LRB57ZVM2 -p tcp -m comment --comment "kube-system/kube-registry:registry" -m tcp -j DNAT --to-destination 192.168.75.13:5000
-A KUBE-SEP-7QBKTOBWZOW2ADYZ -s 10.163.148.196/32 -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-MARK-MASQ
-A KUBE-SEP-7QBKTOBWZOW2ADYZ -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m tcp -j DNAT --to-destination 10.163.148.196:1
-A KUBE-SEP-DARQFIU6CIZ6DHSZ -s 10.163.148.198/32 -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-MARK-MASQ
-A KUBE-SEP-DARQFIU6CIZ6DHSZ -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m tcp -j DNAT --to-destination 10.163.148.198:1
-A KUBE-SEP-KXX2UKHAML22525B -s 10.163.148.197/32 -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-MARK-MASQ
-A KUBE-SEP-KXX2UKHAML22525B -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m tcp -j DNAT --to-destination 10.163.148.197:1
-A KUBE-SERVICES ! -s 192.168.0.0/16 -d 10.106.192.243/32 -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc: cluster IP" -m tcp --dport 1 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.106.192.243/32 -p tcp -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc: cluster IP" -m tcp --dport 1 -j KUBE-SVC-E66MHSUH4AYEXSQE
-A KUBE-SERVICES ! -s 192.168.0.0/16 -d 10.100.57.109/32 -p tcp -m comment --comment "kube-system/kube-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.100.57.109/32 -p tcp -m comment --comment "kube-system/kube-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-SVC-JV2WR75K33AEZUK7
-A KUBE-SVC-E66MHSUH4AYEXSQE -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-7QBKTOBWZOW2ADYZ
-A KUBE-SVC-E66MHSUH4AYEXSQE -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-KXX2UKHAML22525B
-A KUBE-SVC-E66MHSUH4AYEXSQE -m comment --comment "kube-system/glusterfs-dynamic-kube-registry-pvc:" -j KUBE-SEP-DARQFIU6CIZ6DHSZ
-A KUBE-SVC-JV2WR75K33AEZUK7 -m comment --comment "kube-system/kube-registry:registry" -j KUBE-SEP-7BIJVD3LRB57ZVM2
kube-proxy logs from the 196 node:
deploy@ravi-kube196:~$ kubectl logs --namespace=kube-system kube-proxy-lm7nm
I0105 06:47:09.813787 1 server.go:215] Using iptables Proxier.
I0105 06:47:09.815584 1 server.go:227] Tearing down userspace rules.
I0105 06:47:09.832436 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0105 06:47:09.836004 1 conntrack.go:66] Setting conntrack hashsize to 32768
I0105 06:47:09.836232 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0105 06:47:09.836260 1 conntrack.go:81] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I found why the service couldn't be reached externally: the iptables FORWARD chain was dropping the packets. I raised an issue with Kubernetes at https://github.com/kubernetes/kubernetes/issues/39658, which has a lot more detail. A (poor) workaround is to change the default FORWARD policy to ACCEPT.
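To check whether a node is in this state, it can help to look at the default policy of the FORWARD chain. The following is a minimal sketch, not part of the original diagnosis: forward_policy is a hypothetical helper name, and it assumes the standard `iptables -S FORWARD` output, where the first line has the form `-P FORWARD <policy>`.

```shell
# forward_policy: hypothetical helper that reads `iptables -S FORWARD`
# output on stdin and prints the chain's default policy (ACCEPT or DROP).
forward_policy() {
  awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# On a node (requires root):
#   sudo iptables -S FORWARD | forward_policy
#
# If this prints DROP, the (poor) workaround mentioned above would be:
#   sudo iptables -P FORWARD ACCEPT
```

The packet counters shown by `sudo iptables -L FORWARD -n -v` can also indicate whether the policy is actually dropping traffic while the external curl is hanging.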
Update 1/10
I raised an issue with Canal, https://github.com/projectcalico/canal/issues/31, as this appears to be a Canal-specific issue: traffic forwarded to the flannel.1 interface is being dropped. A better fix than changing the default FORWARD policy to ACCEPT is to add a rule for just the flannel.1 interface:
sudo iptables -A FORWARD -o flannel.1 -j ACCEPT
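Because -A appends a new copy of the rule every time it runs, rerunning the command leaves duplicates in the chain. A minimal sketch of an idempotent variant follows, assuming an iptables version that supports the -C (check) option. The function name ensure_forward_accept and the IPTABLES variable are hypothetical, introduced only for illustration.

```shell
# IPTABLES can be overridden (for example with "sudo iptables");
# this variable is an assumption of the sketch, not part of the original fix.
IPTABLES="${IPTABLES:-iptables}"

# ensure_forward_accept IFACE
# Appends "-o IFACE -j ACCEPT" to the FORWARD chain only if an identical
# rule is not already present (iptables -C exits non-zero when it is missing).
ensure_forward_accept() {
  iface="$1"
  if ! $IPTABLES -C FORWARD -o "$iface" -j ACCEPT 2>/dev/null; then
    $IPTABLES -A FORWARD -o "$iface" -j ACCEPT
  fi
}

# On each node (requires root):
#   IPTABLES="sudo iptables"; ensure_forward_accept flannel.1
```

Note that a rule added this way lives only in the running kernel tables, so it has to be reapplied after a reboot unless it is saved by some other mechanism.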