I have been trying to deploy my application to AWS EKS Fargate for the first time, and it has taken me months. I hit a lot of errors and fixed them one by one, but now I am stuck on the following error:
{"level":"info","ts":1697396478.544361,"msg":"version","GitVersion":"v2.4.5","GitCommit":"d9482de36bf51a17e7def869f677f35c2b6c6045","BuildDate":"2022-11-09T20:46:04+0000"}
{"level":"info","ts":1697396478.6424556,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1697396478.656497,"logger":"setup","msg":"adding health check for controller"}
{"level":"info","ts":1697396478.6567256,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-v1-pod"}
{"level":"info","ts":1697396478.6568844,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1697396478.6569781,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-elbv2-k8s-aws-v1beta1-targetgroupbinding"}
{"level":"info","ts":1697396478.6570742,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-networking-v1-ingress"}
{"level":"info","ts":1697396478.6571383,"logger":"setup","msg":"starting podInfo repo"}
{"level":"info","ts":1697396480.6573372,"msg":"starting metrics server","path":"/metrics"}
I1015 19:01:20.657319 1 leaderelection.go:243] attempting to acquire leader lease default/aws-load-balancer-controller-leader...
{"level":"info","ts":1697396480.657604,"logger":"controller-runtime.webhook.webhooks","msg":"starting webhook server"}
E1015 19:01:20.657922 1 leaderelection.go:325] error retrieving resource lock default/aws-load-balancer-controller-leader: Get "https://10.100.0.1:443/api/v1/namespaces/default/configmaps/aws-load-balancer-controller-leader": context canceled
{"level":"error","ts":1697396480.6580412,"logger":"setup","msg":"problem running manager","error":"open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory"}
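To separate the fatal entry from the info-level noise, the log can be filtered by level. This is only a minimal sketch, assuming a POSIX shell with grep and sed. The two sample lines are copied from the log above, and the variable names sample_log and fatal_errors are my own. Against a live cluster the input would come from kubectl logs for the controller pod instead of a variable.

```shell
# Two lines copied from the controller log above, held in a variable so the
# sketch runs without a cluster.
sample_log='{"level":"info","ts":1697396480.657604,"logger":"controller-runtime.webhook.webhooks","msg":"starting webhook server"}
{"level":"error","ts":1697396480.6580412,"logger":"setup","msg":"problem running manager","error":"open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory"}'

# Keep only error-level entries, then strip everything except the value of
# the trailing "error" field.
fatal_errors=$(printf '%s\n' "$sample_log" \
  | grep '"level":"error"' \
  | sed 's/.*"error":"\(.*\)"}$/\1/')

printf '%s\n' "$fatal_errors"
```

With the sample input this leaves a single message, the missing tls.crt under /tmp/k8s-webhook-server/serving-certs, which is the entry that stops the manager.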
Here is one of the tutorials I was following: https://www.youtube.com/watch?v=cRODPz9GXb0&t=231s The tutorial does not run into the issue I am getting.
So I checked that my manifest files and the commands I issued are correct. These are the commands I ran:
> eksctl create cluster --name <cluster name> --version 1.24 --region us-east-1 --fargate --alb-ingress-access
> eksctl utils associate-iam-oidc-provider --region us-east-1 --cluster <cluster name> --approve
> aws iam create-policy --policy-name AWSLoadBalancerControllerIAMPolicy --policy-document file://iam_policy.json
> eksctl create iamserviceaccount \
--cluster <cluster name> \
--region us-east-1 \
--namespace default \
--name alb-ingress-controller \
--attach-policy-arn arn:aws:iam::<accountId>:policy/AWSLoadBalancerControllerIAMPolicy \
--approve
> kubectl apply -f full.yaml
> kubectl apply -f alb.yaml
To resolve the error "controller":"TargetGroupBinding","error":"no matches for kind \"TargetGroupBinding\" in version \"elbv2.k8s.aws/v1beta1\"" I ran:
> kubectl apply -k "github.com/aws/eks-charts/stable/aws-load-balancer-controller/crds?ref=master"
To resolve the error E1015 17:46:57.040954 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:default:alb-ingress-controller" cannot list resource "pods" in API group "" at the cluster scope, I ran:
> kubectl apply -f rbac-role.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: deployment-2048
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: app-2048
  replicas: 2
  template:
    metadata:
      labels:
        app.kubernetes.io/name: app-2048
    spec:
      containers:
        - image: public.ecr.aws/l6m2t8p7/docker-2048:latest
          imagePullPolicy: Always
          name: app-2048
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  namespace: default
  name: service-2048
spec:
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  type: NodePort
  selector:
    app.kubernetes.io/name: app-2048
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: default
  name: ingress-2048
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing # internal
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/healthcheck-path: /
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-2048
                port:
                  number: 80
# alb.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: alb-ingress-controller
  name: alb-ingress-controller
  namespace: default
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: alb-ingress-controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: alb-ingress-controller
    spec:
      containers:
        - name: alb-ingress-controller
          args:
            - --ingress-class=alb
            - --cluster-name=<cluster name>
            - --aws-vpc-id=<vpc-id>
            - --aws-region=us-east-1
          env:
          image: docker.io/amazon/aws-alb-ingress-controller:v2.4.5
      serviceAccountName: alb-ingress-controller
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/name: alb-ingress-controller
  name: alb-ingress-controller
rules:
  - apiGroups:
      - ""
      - extensions
    resources:
      - configmaps
      - endpoints
      - events
      - ingresses
      - ingresses/status
      - services
      - pods/status
    verbs:
      - create
      - get
      - list
      - update
      - watch
      - patch
  - apiGroups:
      - ""
      - extensions
    resources:
      - nodes
      - pods
      - secrets
      - services
      - namespaces
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: alb-ingress-controller
  name: alb-ingress-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alb-ingress-controller
subjects:
  - kind: ServiceAccount
    name: alb-ingress-controller
    namespace: default
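The directory named in the error, /tmp/k8s-webhook-server/serving-certs, does not appear anywhere in alb.yaml. A quick way to confirm that is to grep the container section of the manifest for any volume declaration. This is a rough sketch, assuming a POSIX shell with mktemp. The temp file and the variable name mounts_declared are only there for illustration. It shows only that nothing is mounted into the pod, not why the controller expects the certificate to be there.

```shell
# Container section of alb.yaml as posted, written to a temp file so the
# check runs without a cluster.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
    spec:
      containers:
        - name: alb-ingress-controller
          args:
            - --ingress-class=alb
            - --cluster-name=<cluster name>
            - --aws-vpc-id=<vpc-id>
            - --aws-region=us-east-1
          env:
          image: docker.io/amazon/aws-alb-ingress-controller:v2.4.5
      serviceAccountName: alb-ingress-controller
EOF

# The log complains about /tmp/k8s-webhook-server/serving-certs/tls.crt,
# so look for any volume or volume mount declared in the manifest.
if grep -Eq 'volumeMounts:|volumes:' "$manifest"; then
  mounts_declared=yes
else
  mounts_declared=no
fi

echo "volume mounts declared: $mounts_declared"
rm -f "$manifest"
```

For the manifest as posted the answer is no, so the file the controller tries to open cannot exist inside the container.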
I really cannot find what I am missing. The YouTube link above is one of the tutorials I was following. I also tried to follow the AWS workshop, but no luck so far. Can someone tell me what I am missing?
I saw the GitHub issue https://github.com/kubernetes-sigs/kubebuilder/issues/1501 but I don't get why I would need to edit the application source code to enable the webhook. Also, I don't remember any of the AWS EKS Fargate tutorials talking about a webhook...
Initially I was trying to deploy with Terraform or CloudFormation, but I could not figure out how, as most tutorials use eksctl. I could not find how to do the following in either CloudFormation or Terraform:
eksctl utils associate-iam-oidc-provider --region us-east-1 --cluster <cluster name> --approve
or how to create the service account.
So I switched to eksctl just to get a deployment working (regardless of whether this is what I want to use in the end).
I even removed my ECR repository and set the image to public.ecr.aws/l6m2t8p7/docker-2048:latest to make it as simple as possible.
But at this point, I have no clue how to make this setup any simpler to debug and get a first simple application deployment working.
Any help or suggestions would be very much appreciated.
Also, I intentionally did not use Helm. I am new to Kubernetes and did not want to overcomplicate things. Once this is resolved, I plan to move to Helm after doing enough research on it.