Facing Issues with Load Balancing using NGINX Load Balancer on AWS EKS

I am deploying Triton Inference Server on Amazon Elastic Kubernetes Service (Amazon EKS) and using the open-source NGINX load balancer for load balancing. Our EKS cluster is private (the EKS nodes are in private subnets), so it cannot be accessed from the outside world.

Triton Inference Server exposes three endpoints:

port 8000: for HTTP requests

port 8001: for gRPC requests

port 8002: Prometheus metrics server

First, I created a Deployment for Triton on AWS EKS and exposed it through a headless Service (clusterIP: None), so that the endpoints of all replicas are exposed and can be discovered by the NGINX load balancer.

apiVersion: v1
kind: Service
metadata:
  name: triton
  labels:
    app: triton
spec:
  clusterIP: None
  ports:
     - protocol: TCP
       port: 8000
       name: http
       targetPort: 8000
     - protocol: TCP
       port: 8001
       name: grpc
       targetPort: 8001
     - protocol: TCP
       port: 8002
       name: metrics
       targetPort: 8002
  selector:
    app: triton
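
Because a headless Service has no cluster IP, a DNS lookup of its name should return one A record per ready pod. A minimal sketch for checking that from a pod inside the cluster is shown here; the function name `resolve_endpoints` and the injectable `resolver` argument are illustrative assumptions, not part of my setup, and the host name is the one used in the NGINX configuration.

```python
import socket
from typing import Callable, List


def resolve_endpoints(host: str, port: int,
                      resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated IPs that `host` resolves to.

    `resolver` defaults to socket.getaddrinfo; it is injectable only so the
    function can be exercised without a live cluster (an assumption of this sketch).
    """
    infos = resolver(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_endpoints("triton.default.svc.cluster.local", 8000)` from a pod should list one IP per Triton replica; if the list grows after scaling up, the Service and DNS side is behaving as intended.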

Then I built an image for the open-source NGINX load balancer using the configuration below. The configuration file is located at /etc/nginx/conf.d/nginx.conf on the EKS node.

resolver kube-dns.kube-system.svc.cluster.local valid=5s;

upstream backend {
    zone upstream-backend 64k;
    server triton.default.svc.cluster.local:8000;
}

upstream backendgrpc {
    zone upstream-backend 64k;
    server triton.default.svc.cluster.local:8001;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend/;
    }
}

server {
    listen 89 http2;

    location / {
        grpc_pass grpc://backendgrpc;
    }
}

server {
    listen 8080;
    root /usr/share/nginx/html;
    location = /dashboard.html { }
    location = / {
        return 302 /dashboard.html;
    }
}

The Dockerfile for the open-source NGINX load balancer is:

FROM nginx
RUN rm /etc/nginx/conf.d/default.conf
COPY /etc/nginx/conf.d/nginx.conf /etc/nginx/conf.d/default.conf

I created a ReplicationController for NGINX. Because the image is in a private registry, Kubernetes needs credentials to pull it; the imagePullSecrets field in the manifest tells Kubernetes to get those credentials from a Secret named ecr-cred.

The nginx-rc manifest looks like this:

 apiVersion: v1
 kind: ReplicationController
 metadata:
   name: nginx-rc
 spec:
   replicas: 1
   selector:
     app: nginx
   template:
     metadata:
       labels:
         app: nginx
     spec:
       imagePullSecrets:
       - name: ecr-cred
       containers:
       - name: nginx
         command: [ "/bin/bash", "-c", "--" ]
         args: [ "nginx; while true; do sleep 30; done;" ]
         imagePullPolicy: IfNotPresent
         image: <Image URL with tag>
         ports:
           - name: http
             containerPort: 80
             hostPort: 8085
           - name: grpc
             containerPort: 89
             hostPort: 8087
           - name: http-alt
             containerPort: 8080
             hostPort: 8086
           - name: triton-svc
             containerPort: 8000
             hostPort: 32309

The issue I am facing is that when the number of Triton pods increases, the NGINX load balancer does not balance requests across the newly added pods.
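
One way to narrow this down is to compare the backend addresses NGINX is actually using with the addresses DNS currently returns for the Service. The sketch below is only an illustration: `diff_endpoints` is a hypothetical helper, and it assumes the NGINX-side addresses have been collected separately, for example from an access log that records the `$upstream_addr` variable.

```python
from typing import Iterable, Set, Tuple


def diff_endpoints(nginx_upstreams: Iterable[str],
                   dns_ips: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Compare the IPs NGINX sends traffic to with the IPs DNS returns now.

    Returns (missing_from_nginx, stale_in_nginx).
    """
    known, current = set(nginx_upstreams), set(dns_ips)
    missing_from_nginx = current - known  # pods in DNS that NGINX never uses
    stale_in_nginx = known - current      # addresses NGINX uses that DNS no longer lists
    return missing_from_nginx, stale_in_nginx
```

If the first set is non-empty after scaling up, DNS already knows about the new pods and the gap is on the NGINX side, which would suggest the upstream host name is not being re-resolved after NGINX starts.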

Can anyone help me?
