I am deploying Triton Inference Server on Amazon Elastic Kubernetes Service (Amazon EKS) and using open-source NGINX as the load balancer. Our EKS cluster is private (the EKS nodes are in private subnets), so it cannot be reached from outside.
Triton Inference Server exposes three endpoints:
port 8000: HTTP requests
port 8001: gRPC requests
port 8002: Prometheus metrics
First, I created a Deployment for Triton on EKS and exposed it through a headless Service (clusterIP: None), so that the endpoint of every replica is exposed and can be discovered by the NGINX load balancer.
apiVersion: v1
kind: Service
metadata:
  name: triton
  labels:
    app: triton
spec:
  clusterIP: None
  ports:
    - protocol: TCP
      port: 8000
      name: http
      targetPort: 8000
    - protocol: TCP
      port: 8001
      name: grpc
      targetPort: 8001
    - protocol: TCP
      port: 8002
      name: metrics
      targetPort: 8002
  selector:
    app: triton
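To check that the headless Service really returns one address per replica, something like the following could be run from a pod inside the cluster. It is only a minimal sketch: the helper names unique_ips and resolve_pod_ips are hypothetical, and the host name assumes the Service lives in the default namespace, as in the NGINX upstream definition further down.

```python
import socket

# Assumed Service DNS name (default namespace), matching the NGINX upstream in this post.
SERVICE_HOST = "triton.default.svc.cluster.local"


def unique_ips(addrinfo):
    """Return the sorted, de-duplicated IP addresses from getaddrinfo() results."""
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in addrinfo})


def resolve_pod_ips(host=SERVICE_HOST, port=8000):
    """Resolve the headless Service name and list the pod IPs DNS currently returns."""
    return unique_ips(socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP))


if __name__ == "__main__":
    try:
        print(resolve_pod_ips())
    except OSError as exc:
        # Expected outside the cluster, where the Service name does not resolve.
        print(f"could not resolve {SERVICE_HOST}: {exc}")
```

Running it before and after scaling the Deployment should show whether the list of pod IPs grows, which separates a DNS problem from a problem on the NGINX side.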
Then I built an image for the open-source NGINX load balancer using the configuration below. The configuration file is on the EKS node at /etc/nginx/conf.d/nginx.conf.
resolver kube-dns.kube-system.svc.cluster.local valid=5s;

upstream backend {
    zone upstream-backend 64k;
    server triton.default.svc.cluster.local:8000;
}

upstream backendgrpc {
    zone upstream-backend 64k;
    server triton.default.svc.cluster.local:8001;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend/;
    }
}

server {
    listen 89 http2;
    location / {
        grpc_pass grpc://backendgrpc;
    }
}

server {
    listen 8080;
    root /usr/share/nginx/html;
    location = /dashboard.html { }
    location = / {
        return 302 /dashboard.html;
    }
}
The Dockerfile for the NGINX load balancer image is:
FROM nginx
RUN rm /etc/nginx/conf.d/default.conf
COPY /etc/nginx/conf.d/nginx.conf /etc/nginx/conf.d/default.conf
I have created a ReplicationController for NGINX. To pull the image from the private registry, Kubernetes needs credentials. The imagePullSecrets field in the configuration file specifies that Kubernetes should get the credentials from a Secret named ecr-cred.
The nginx-rc manifest looks like this:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-rc
spec:
  replicas: 1
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      imagePullSecrets:
        - name: ecr-cred
      containers:
        - name: nginx
          command: [ "/bin/bash", "-c", "--" ]
          args: [ "nginx; while true; do sleep 30; done;" ]
          imagePullPolicy: IfNotPresent
          image: <Image URL with tag>
          ports:
            - name: http
              containerPort: 80
              hostPort: 8085
            - name: grpc
              containerPort: 89
              hostPort: 8087
            - name: http-alt
              containerPort: 8080
              hostPort: 8086
            - name: triton-svc
              containerPort: 8000
              hostPort: 32309
The issue I am facing is that when the number of Triton pods increases, the NGINX load balancer does not distribute requests to the newly added pods.
Can anyone help me?