Problem
I'm trying to create a service of type NodePort in my Kubernetes cluster, but it's not working as expected, and I suspect it has to do with the fact that I've disabled ELB permissions for the IAM role used on my master node. I wouldn't think ELB permissions should matter for NodePort, but I'm seeing an error message that leads me to think they do. Am I doing something wrong? Is this a known issue others have seen before?
Attempt
Deployed a service of type NodePort to my cluster, expecting to be able to reach my service on any of the nodes' public IPs at the given port, but I can't. There are one master and two non-master nodes, and no process is even bound to port 30095 (the assigned NodePort) except on the master node. SSHing onto the master and curling that port in a variety of ways does nothing (curl just hangs). Curling the endpoints associated with the service works fine. kubectl describe on the service suggests there was some error creating a load balancer, but I don't know why it would be trying to create one at all.
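For reference, the Service manifest looked roughly like this (a sketch reconstructed from the kubectl output below; the metadata section is my assumption, and nodePort is left unset so the cluster assigns one):

apiVersion: v1
kind: Service
metadata:
  name: frontend        # assumed from the service name in the output below
  labels:
    app: guestbook
    tier: frontend
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80      # nodePort left unset; the cluster assigned 30095
  selector:
    app: guestbook
    tier: frontend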
I'll reiterate that I specifically prevented the IAM role used by the master node from performing any ELB actions. I don't want developers using my Kubernetes cluster to be able to spin up ELBs in my account, or to do anything, for that matter, that would create AWS resources in my account.
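The restriction looks roughly like this (a sketch of an IAM policy statement; whether it's an explicit Deny like this or simply the absence of any elasticloadbalancing grant, the effect is the same, and the Sid is illustrative):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllELBActionsExample",
      "Effect": "Deny",
      "Action": "elasticloadbalancing:*",
      "Resource": "*"
    }
  ]
}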
Actual Result
information about the service (commands run from local workstation) -- note the CreatingLoadBalancerFailed error in the output of kubectl describe service:

$ kubectl get services frontend -oyaml
apiVersion: v1
kind: Service
---SNIP---
  ports:
  - nodePort: 30095
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: guestbook
    tier: frontend
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

$ kubectl describe services frontend
Name:                   frontend
Namespace:              default
Labels:                 app=guestbook
                        tier=frontend
Selector:               app=guestbook,tier=frontend
Type:                   NodePort
IP:                     100.67.10.125
Port:                   <unset> 80/TCP
NodePort:               <unset> 30095/TCP
Endpoints:              100.96.1.2:80,100.96.2.2:80,100.96.2.4:80
Session Affinity:       None
Events:
  FirstSeen  LastSeen  Count  From                  SubObjectPath  Type     Reason                      Message
  ---------  --------  -----  ----                  -------------  -------  ------                      -------
  1h         4m        15     {service-controller }                Warning  CreatingLoadBalancerFailed  (events with common reason combined)
looking for processes bound to the port on a non-master node:

$ netstat -tulpn | grep 30095
# no output
looking for processes bound to the port on the master node:

$ netstat -tulpn | grep 30095
tcp6       0      0 :::30095                :::*                    LISTEN      1540/kube-proxy
attempting to curl the service (just hangs):
$ curl localhost:30095
# just hangs
^C

$ curl -g -6 http://[::1]:30095
# just hangs
^C

$ curl -vvvg -6 http://[::1]:30095
* Rebuilt URL to: http://[::1]:30095/
* Hostname was NOT found in DNS cache
*   Trying ::1...
* Connected to ::1 (::1) port 30095 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.38.0
> Host: [::1]:30095
> Accept: */*
>
# just hangs after that
^C

$ curl 100.67.10.125:30095
# just hangs
^C
curling an endpoint from master node (works, so the pods are running fine):
$ curl 100.96.2.4
<html ng-app="redis">
  <head>
---SNIP---
  </body>
</html>
Expected Result
Expected to see the same result as from curling the endpoints directly when curling the external IP of any of the nodes on the service's assigned NodePort of 30095.
Additional details:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1+82450d0", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"not a git tree", BuildDate:"2016-12-14T04:09:31Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.6", GitCommit:"e569a27d02001e343cb68086bc06d47804f62af6", GitTreeState:"clean", BuildDate:"2016-11-12T05:16:27Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
- GitHub issue: https://github.com/kubernetes/kubernetes/issues/39214
- Mailing list post: https://groups.google.com/forum/#!topic/kubernetes-dev/JNC_bk1L3iI
Kubernetes does this because it assumes that a new NodePort service may previously have been a LoadBalancer service, so it may need to clean up the cloud load balancer. A PR was opened that would fix this issue, but it was then closed. In the meantime, switching the IAM policy for the master role to have elasticloadbalancing:DescribeLoadBalancers instead of elasticloadbalancing:* solved the issue: the rest of the cluster, including NodePort services, works fine, and people are still restricted from creating ELBs.
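A minimal sketch of the adjusted policy statement (assuming ELB permissions were previously granted via elasticloadbalancing:*; the Sid and the resource scoping here are illustrative):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowDescribeLoadBalancersOnly",
      "Effect": "Allow",
      "Action": "elasticloadbalancing:DescribeLoadBalancers",
      "Resource": "*"
    }
  ]
}

With only Describe allowed, the service controller can presumably confirm there is no cloud load balancer to clean up for the NodePort service, while any attempt to actually create an ELB is still refused.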