All my permissions were working fine before. After upgrading to EKS 1.25, I started getting the error below when running kubectl logs <pod> -n <namespace>.

I tried to debug it. I looked at the aws-auth ConfigMap, the ClusterRole, and the ClusterRoleBinding, and I don't see any apparent issues. (It's actually been two years since I created these objects; perhaps I'm missing something with the latest version of Kubernetes?)

Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
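
For anyone debugging the same error, a rough way to check whether the user named in it may use the kubelet API is standard kubectl impersonation (the grep is just to spot any kubelet-related bindings; your cluster may name them differently):

kubectl auth can-i get nodes/proxy --as=kube-apiserver-kubelet-client
kubectl get clusterrolebindings | grep -i kubelet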

aws-auth ConfigMap

apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::<some-number>:role/eksctl-<xyz-abs>-nodegrou-NodeInstanceRole-DMQXBTLLXHNU
      username: system:node:{{EC2PrivateDNSName}}
  mapUsers: |
    - userarn: arn:aws:iam::043519645107:user/kube-developer
      username: kube-developer
      groups:
       - kube-developer
kind: ConfigMap
metadata:
  creationTimestamp: "2020-07-03T16:55:08Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "104191269"
  uid: 844f189d-b3d6-4204-bf85-7b789c0ee91a
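
For reference, the ConfigMap above can be dumped and cross-checked like this (the second command assumes eksctl is installed; <cluster-name> is a placeholder):

kubectl get configmap aws-auth -n kube-system -o yaml
eksctl get iamidentitymapping --cluster <cluster-name>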

ClusterRole and ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-developer-cr
rules:
- apiGroups: ["*"]
  resources:
    - configmaps
    - endpoints
    - events
    - ingresses
    - ingresses/status
    - services
  verbs:
    - create
    - get
    - list
    - update
    - watch
    - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-developer-crb
subjects:
- kind: Group
  name: kube-developer
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: kube-developer-cr
  apiGroup: rbac.authorization.k8s.io
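
A rough sanity check of what the kube-developer group actually gets from this binding, using impersonation (run with an admin identity; the group name matches the aws-auth mapping above):

kubectl auth can-i --list --as=kube-developer --as-group=kube-developer
kubectl auth can-i watch configmaps --as=kube-developer --as-group=kube-developer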

Error when drilling down into a running pod (screenshot).

---EDIT----

I tried creating a ClusterRoleBinding for the same user as the one in the error message, kube-apiserver-kubelet-client, with a roleRef to the system:kubelet-api-admin ClusterRole, but I'm still getting the same issue.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-apiserver
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubelet-api-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kube-apiserver-kubelet-client
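
After applying it, this is roughly how to confirm that the binding and the built-in role it points at exist, and whether the permission check now passes:

kubectl get clusterrolebinding kube-apiserver -o yaml
kubectl get clusterrole system:kubelet-api-admin
kubectl auth can-i get nodes/proxy --as=kube-apiserver-kubelet-client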

---Edit---

Second day of debugging: I launched another EKS cluster and found that it has CSRs (CertificateSigningRequests), whereas my cluster has none.
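
This is the kind of check I mean, run against both clusters (<csr-name> is a placeholder):

kubectl get csr
kubectl describe csr <csr-name>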


1 Answer

Answer from Amnon:

I got the same symptom while upgrading EKS: I had upgraded the cluster and added nodes running a newer kubelet version, but did not move the running workloads to the new nodes, hence the error message. I got it working when I did the following (rough commands below):

  1. moved the instances backing the nodes with the old k8s version to "Standby" (I used the AWS console, but this is also possible via the CLI)
  2. drained those nodes so Kubernetes rescheduled the workloads onto the new nodes, using kubectl drain <node>
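
A minimal sketch of those two steps, assuming the old nodes sit in an Auto Scaling group (the ASG, instance, and node names are placeholders, and the drain flags depend on what is running on the node):

aws autoscaling enter-standby \
  --auto-scaling-group-name <asg-name> \
  --instance-ids <instance-id> \
  --should-decrement-desired-capacity

kubectl drain <node> --ignore-daemonsets --delete-emptydir-data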