How to add external GCP loadbalancer to kubespray cluster?


I deployed a kubernetes cluster on Google Cloud using VMs and Kubespray.

Right now I am trying to expose a simple Node app on an external IP using a LoadBalancer service, but assigning my external IP from gcloud to the service does not work. The service stays in the pending state when I run kubectl get services.

According to this, kubespray does not have any load balancer mechanism included or integrated by default. How should I proceed?


There is 1 answer

Matt On

Let me start off by summarizing the problem we are trying to solve here.

The problem is that you have a self-hosted Kubernetes cluster and you want to be able to create a service of type=LoadBalancer and have k8s create an LB with an external IP for you in a fully automated way, just like it would if you used GKE (a Kubernetes-as-a-service solution).
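For reference, this is roughly how such a service gets created from the command line. It is only a sketch: the deployment name node-app and the ports are made-up values, so swap in your own. The function prints the commands instead of running them, so you can review them first.

```shell
# Hypothetical names and ports: adjust to your own deployment.
APP="node-app"
PORT=80
TARGET_PORT=8080

# Print the commands instead of running them, so they can be reviewed first.
# Pipe the output to `sh` to execute them against your cluster.
expose_commands() {
  echo "kubectl expose deployment $APP --type=LoadBalancer --port=$PORT --target-port=$TARGET_PORT"
  echo "kubectl get service $APP --watch"
}

expose_commands
```

Until the steps described in this answer are done, the EXTERNAL-IP column of that service stays on pending.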

Additionally, I have to mention that I don't know much about kubespray, so I will only describe the steps that need to be done to make it work and leave the rest to you. So if you want to make changes in the kubespray code, it's on you. I did all my tests with a kubeadm cluster, but it should not be very difficult to apply them to kubespray.


I will start off by summarizing all that has to be done into 4 steps:

  1. tagging the instances
  2. enabling cloud-provider functionality
  3. IAM and service accounts
  4. additional info

Tagging the instances

All worker node instances on GCP have to be tagged with a unique tag that is the name of the instance; these tags are later used to create firewall rules and target lists for the LB. So let's say you have an instance called worker-0; you need to tag that instance with the tag worker-0.

Otherwise it will result in an error (that can be found in controller-manager logs):

Error syncing load balancer: failed to ensure load balancer: no node tags supplied and also failed to parse the given lists of hosts for tags. Abort creating firewall rule
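If you prefer the command line over the console, tagging could look something like the sketch below. The worker names and the zone are assumptions, so replace them with yours. The function prints one gcloud compute instances add-tags command per worker (tag equal to the instance name) rather than running anything.

```shell
# Hypothetical values: replace with your own instance names and zone.
ZONE="us-central1-a"
WORKERS="worker-0 worker-1"

# Print one add-tags command per worker; the tag equals the instance name.
# Review the output, then pipe it to `sh` to actually run it.
tag_commands() {
  for node in $WORKERS; do
    echo "gcloud compute instances add-tags $node --tags=$node --zone=$ZONE"
  done
}

tag_commands
```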

Enabling cloud-provider functionality

K8s has to be informed that it is running in a cloud, and which cloud provider that is, so that it knows how to talk to the API.

Without this, the controller manager logs a warning informing you that it won't create an LB:

WARNING: no cloud provider provided, services of type LoadBalancer will fail
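To find messages like this without scrolling through everything, a small filter helps. This is just a sketch: the function name is mine, and the label selector in the usage comment assumes the component=kube-controller-manager label that kubeadm puts on the pod, which may differ on your cluster.

```shell
# Keep only log lines that mention load balancers or the cloud provider.
lb_log_lines() {
  grep -iE 'load ?balancer|cloud provider'
}

# Usage (label selector is an assumption based on a kubeadm cluster):
#   kubectl -n kube-system logs -l component=kube-controller-manager --tail=200 | lb_log_lines
```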

The controller manager is responsible for creating the LoadBalancer. It can be passed the flag --cloud-provider. You can add this flag manually to the controller manager pod manifest file or, since you are running kubespray, you can add it somewhere in the kubespray code (maybe it's already automated and just requires you to set some variable or something, but you need to find that out yourself).

Here is what this file looks like with the flag:

apiVersion: v1
kind: Pod
metadata:
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    ...
    - --cloud-provider=gce    # <----- HERE

As you can see, the value in our case is gce, which stands for Google Compute Engine. It informs k8s that it's running on GCE/GCP.
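A quick way to confirm the flag actually made it into the manifest is a grep on the master node. The path below is the kubeadm default and is an assumption on my side; kubespray may keep the manifest somewhere else, so override MANIFEST if needed.

```shell
# Assumed kubeadm default path; override MANIFEST if yours differs.
MANIFEST="${MANIFEST:-/etc/kubernetes/manifests/kube-controller-manager.yaml}"

# Succeeds if the given manifest file contains the gce cloud provider flag.
has_cloud_provider_flag() {
  grep -q -- '--cloud-provider=gce' "$1"
}

if has_cloud_provider_flag "$MANIFEST" 2>/dev/null; then
  echo "cloud provider flag is set"
else
  echo "cloud provider flag is missing (or no manifest at $MANIFEST)"
fi
```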


IAM and service accounts

Now that you have the provider enabled and the tags covered, I will talk about IAM and permissions.

For k8s to be able to create an LB in GCE, it needs to be allowed to do so. Every GCE instance has a default service account assigned. The controller manager uses the instance's service account, stored within the instance metadata, to access the GCP API.

For this to happen you need to set Access Scopes for the GCE instance (the master node; the one where the controller manager is running) so it can use the Compute Engine API.

Access scopes -> Set access for each API -> compute engine=Read Write

To do this the instance has to be stopped, so stop the instance now. It's better to set these scopes during instance creation so that you can skip this extra step.
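The same thing can probably be done with gcloud instead of the console. A sketch, assuming a master instance called master-0 in zone us-central1-a (both made up): stop the instance, set the compute-rw scope, start it again. As far as I know --scopes sets the complete list of scopes rather than adding to it, so include any other scopes you rely on.

```shell
# Hypothetical values: replace with your master instance name and zone.
MASTER="master-0"
ZONE="us-central1-a"

# Print the stop / change scopes / start sequence for review.
# Pipe the output to `sh` to actually run it.
scope_commands() {
  echo "gcloud compute instances stop $MASTER --zone=$ZONE"
  echo "gcloud compute instances set-service-account $MASTER --zone=$ZONE --scopes=compute-rw"
  echo "gcloud compute instances start $MASTER --zone=$ZONE"
}

scope_commands
```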

You also need to go to the IAM & Admin page in the GCP Console and add permissions so that the master instance's service account has the Kubernetes Engine Service Agent role assigned. This is a predefined role that has many more permissions than you probably need, but I have found that everything works with this role, so I decided to use it for demonstration purposes. You probably want to follow the principle of least privilege.
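From the command line the role binding would look roughly like this. The project ID and service account email are placeholders I made up (the default compute service account normally has the form PROJECT_NUMBER-compute@developer.gserviceaccount.com), and roles/container.serviceAgent is, to my knowledge, the ID behind the Kubernetes Engine Service Agent role. I did this step in the console myself, so treat it as a sketch.

```shell
# Hypothetical values: replace with your project ID and the master's service account.
PROJECT_ID="my-project"
SA_EMAIL="123456789012-compute@developer.gserviceaccount.com"

# Print the IAM binding command for review; pipe the output to `sh` to run it.
iam_command() {
  echo "gcloud projects add-iam-policy-binding $PROJECT_ID --member=serviceAccount:$SA_EMAIL --role=roles/container.serviceAgent"
}

iam_command
```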


Additional info

There is one more thing I need to mention. It does not impact you, but while testing I found out an interesting thing.

At first I created a one-node cluster (a single master node). Even though this is allowed from the k8s point of view, the controller manager would not allow me to create an LB and point it at the master node where my application was running. The conclusion is that one cannot use an LB with only a master node and has to create at least one worker node.


PS: I had to figure this out the hard way, by looking at logs, changing things and looking at logs again to see if the issue got solved. I didn't find a single article or documentation page where all of this is documented in one place. If you manage to solve it for yourself, write up the answer for others. Thank you.