kube-controller-manager doesn't start when using "cloud-provider=aws" with kubeadm

I'm trying to use the Kubernetes integration with AWS, but kube-controller-manager doesn't start. (BTW: Everything works perfectly without the AWS option.)

Here is what I do:

-- 1 --

ubuntu@ip-172-31-17-233:~$ more /etc/kubernetes/aws.conf

apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
cloudProvider: aws
kubernetesVersion: 1.10.3

-- 2 --

ubuntu@ip-172-31-17-233:~$ more /etc/kubernetes/cloud-config.conf


(I tried lots of combinations here, according to the examples I found, including "aws_access_key_id", "aws_secret_access_key", omitting the .conf, or removing this file, but nothing worked.)

-- 3 --

ubuntu@ip-172-31-17-233:~$ sudo kubeadm init --config /etc/kubernetes/aws.conf

[init] Using Kubernetes version: v1.10.3
[init] Using Authorization modes: [Node RBAC]
[init] WARNING: For cloudprovider integrations to work --cloud-provider must be set for all kubelets in the cluster.
        (/etc/systemd/system/kubelet.service.d/10-kubeadm.conf should be edited for this purpose)
[preflight] Running pre-flight checks.
        [WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [ip-172-31-17-233 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs []
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [localhost] and IPs []
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [ip-172-31-17-233] and IPs []
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 19.001348 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node ip-172-31-17-233 as master by adding a label and a taint
[markmaster] Master ip-172-31-17-233 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: x8hi0b.uxjr40j9gysc7lcp
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:

You can now join any number of machines by running the following on each node
as root:

  kubeadm join --token x8hi0b.uxjr40j9gysc7lcp --discovery-token-ca-cert-hash sha256:8ad9dfbcacaeba5bc3242c811b1e83c647e2e88f98b0d783875c2053f7a40f44

-- 4 --

ubuntu@ip-172-31-17-233:~$ mkdir -p $HOME/.kube
ubuntu@ip-172-31-17-233:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite '/home/ubuntu/.kube/config'? y
ubuntu@ip-172-31-17-233:~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

-- 5 --

ubuntu@ip-172-31-17-233:~$ kubectl get pods --all-namespaces

NAMESPACE     NAME                                       READY     STATUS             RESTARTS   AGE
kube-system   etcd-ip-172-31-17-233                      1/1       Running            0          40s
kube-system   kube-apiserver-ip-172-31-17-233            1/1       Running            0          45s
kube-system   kube-controller-manager-ip-172-31-17-233   0/1       CrashLoopBackOff   3          1m
kube-system   kube-scheduler-ip-172-31-17-233            1/1       Running            0          35s

kubectl version

Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:05:37Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Any idea? I'm new to Kubernetes, and I have no idea what I can do...

Thanks, Michal.


Answer:

Any idea?

Check the following points as potential issues:

  • The kubelet has the proper provider set. Check that /etc/systemd/system/kubelet.service.d/20-cloud-provider.conf contains:

    Environment="KUBELET_EXTRA_ARGS=--cloud-provider=aws --cloud-config=/etc/kubernetes/cloud-config.conf"

    If not, add it, then reload systemd and restart the kubelet service.

  • In /etc/kubernetes/manifests/, check that the following files have the proper configuration:

    • kube-controller-manager.yaml and kube-apiserver.yaml:


      If not, just add it, and the pod will be restarted automatically.

  • Just in case, check that the AWS resources (EC2 instances, etc.) are tagged with the kubernetes tag (taken from your cloud-config.conf) and that the IAM policies are properly set.
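
The first two checks above can be scripted. This is only a sketch: the listing for the manifests is missing from this answer, so looking for --cloud-provider=aws there (mirroring the kubelet setting) is an assumption, and has_cloud_provider_flag is a made-up helper name. The manifests are normally readable only by root, so run it with sudo.

```shell
#!/bin/sh
# Hypothetical helper: does the given file mention --cloud-provider=aws?
# Returns 0 if the flag is present, 1 if it is absent,
# 2 if the file is missing or cannot be read.
has_cloud_provider_flag() {
    [ -f "$1" ] || return 2
    grep -q -e '--cloud-provider=aws' "$1" 2>/dev/null
}

# Paths as named in the answer above; adjust them if your layout differs.
for f in \
    /etc/systemd/system/kubelet.service.d/20-cloud-provider.conf \
    /etc/kubernetes/manifests/kube-controller-manager.yaml \
    /etc/kubernetes/manifests/kube-apiserver.yaml
do
    rc=0
    has_cloud_provider_flag "$f" || rc=$?
    case $rc in
        0) echo "ok:                    $f" ;;
        1) echo "flag not found:        $f" ;;
        *) echo "missing or unreadable: $f" ;;
    esac
done
```

Any line that does not say "ok" points at a file worth opening by hand.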

If you could supply the logs, as requested by Artem in the comments, that could shed more light on the issue.
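
One way to collect those logs, as a sketch: because the pod is in CrashLoopBackOff, the --previous flag of kubectl logs is useful, since it prints the output of the last terminated container rather than the one currently restarting. The node name is copied from the question, and static_pod_name is just an illustrative helper.

```shell
#!/bin/sh
# Static pods are named <component>-<node name>, as in the pod listing
# in the question. This helper only joins the two parts.
static_pod_name() {
    printf '%s-%s\n' "$1" "$2"
}

# Node name taken from the question; replace it with your own master's name.
pod=$(static_pod_name kube-controller-manager ip-172-31-17-233)

if command -v kubectl >/dev/null 2>&1; then
    # --previous prints the output of the last crashed container instance.
    kubectl -n kube-system logs "$pod" --previous \
        || echo "could not fetch logs for $pod" >&2
    # The Events section at the end often names the failure reason as well.
    kubectl -n kube-system describe pod "$pod" \
        || echo "could not describe $pod" >&2
else
    echo "kubectl not found in PATH" >&2
fi
```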


As requested in the comments, here is a short overview of IAM policy handling:

  • Create a new IAM policy (or edit it appropriately if already created), say k8s-default-policy. The policy given below is quite liberal, and you can fine-tune the exact settings to match your security preferences. Pay attention to the load balancer section in your case. In the description, put something along the lines of "Allows EC2 instances to call AWS services on your behalf." or similar.

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": [ ... ]
          },
          {
            "Effect": "Allow",
            "Action": "ec2:Describe*",
            "Resource": "*"
          },
          {
            "Effect": "Allow",
            "Action": "ec2:AttachVolume",
            "Resource": "*"
          },
          {
            "Effect": "Allow",
            "Action": "ec2:DetachVolume",
            "Resource": "*"
          },
          {
            "Effect": "Allow",
            "Action": ["ec2:*"],
            "Resource": ["*"]
          },
          {
            "Effect": "Allow",
            "Action": ["elasticloadbalancing:*"],
            "Resource": ["*"]
          }
        ]
      }

  • Create a new role (or edit it appropriately if already created) and attach the previous policy to it, say attach k8s-default-policy to k8s-default-role.

  • Attach the role to the instances that can handle AWS resources. You can create different roles for master and for workers if you need to. In the console: EC2 -> Instances -> (select instance) -> Actions -> Instance Settings -> Attach/Replace IAM Role -> (select appropriate role)

  • Apart from this, also check that all resources in question are tagged with the kubernetes tag.
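
If you prefer the AWS CLI to the console, the policy, role, and attach steps could look roughly like the sketch below. The account ID and instance ID are made-up placeholders. policy.json is assumed to hold the policy document, and trust.json is assumed to hold a trust policy that lets ec2.amazonaws.com call sts:AssumeRole. Unlike the console, the CLI needs the instance profile created explicitly. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run wrapper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

# Policy and role names follow the answer; the IDs are hypothetical.
POLICY=k8s-default-policy
ROLE=k8s-default-role
ACCOUNT_ID=123456789012          # hypothetical AWS account ID
INSTANCE_ID=i-0123456789abcdef0  # hypothetical EC2 instance ID

attach_k8s_role() {
    # 1. Create the policy from the JSON document saved as policy.json.
    run aws iam create-policy --policy-name "$POLICY" \
        --policy-document file://policy.json
    # 2. Create a role that EC2 may assume and attach the policy to it.
    run aws iam create-role --role-name "$ROLE" \
        --assume-role-policy-document file://trust.json
    run aws iam attach-role-policy --role-name "$ROLE" \
        --policy-arn "arn:aws:iam::${ACCOUNT_ID}:policy/${POLICY}"
    # 3. Instances receive a role through an instance profile.
    run aws iam create-instance-profile --instance-profile-name "$ROLE"
    run aws iam add-role-to-instance-profile \
        --instance-profile-name "$ROLE" --role-name "$ROLE"
    run aws ec2 associate-iam-instance-profile --instance-id "$INSTANCE_ID" \
        --iam-instance-profile "Name=$ROLE"
}

attach_k8s_role
```

Review the printed commands first, then rerun with DRY_RUN=0 once the names and IDs match your account.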