I have installed Knative/KServe in my AWS EKS cluster. Everything is working fine, but recently we decided to try gRPC for our services deployed there. It's deployed with Istio, with an nginx ingress in front of everything, and that ingress points to the Istio ingress gateway:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: default
    kubernetes.io/tls-acme: "true"
  name: computing-ingress
  namespace: istio-system
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - '*.default.knative.company.com'
    secretName: cert-knative-wildcard
  rules:
  - host: '*.default.knative.company.com'
    http:
      paths:
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /
        pathType: Prefix
As mentioned in the Knative gRPC tutorial https://github.com/meteatamel/knative-tutorial/blob/master/docs/grpc.md, I have changed my InferenceService YAML by adding an h2c port block:
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  annotations:
  finalizers:
  - inferenceservice.finalizers
  name: triton-test
  namespace: default
spec:
  predictor:
    model:
      args:
      - --model-control-mode=poll
      - --repository-poll-secs=5
      - --allow-grpc=true
      - --grpc-port=9000
      - --log-verbose=0
      env:
      - name: CUDA_VISIBLE_DEVICES
        value: "0"
      - name: S3_DATA_PATH
        value: s3://mymodeldata/
      - name: S3_PARAMS
        value: --region us-east-2 --no-sign-request
      image: XXXXXXX.dkr.ecr.us-east-2.amazonaws.com/ml:mytriton
      modelFormat:
        name: triton
      name: kserve-container
      ports:
      - containerPort: 9000
        name: h2c
        protocol: TCP
      protocolVersion: v2
      resources:
        limits:
          cpu: "2"
          memory: 8Gi
          nvidia.com/gpu: "1"
        requests:
          cpu: "2"
          memory: 8Gi
          nvidia.com/gpu: "1"
      storageUri: s3://mymodeldata/
      volumeMounts:
      - mountPath: /dev/shm
        name: dshm
    nodeSelector:
      DedicatedFor: GPU
    volumes:
    - emptyDir:
        medium: Memory
        sizeLimit: 2Gi
      name: dshm
Since the gRPC annotation is ingress-level, I have changed the main ingress to use more specific paths:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: default
    kubernetes.io/tls-acme: "true"
  name: computing-ingress
  namespace: istio-system
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - '*.default.knative.company.com'
    secretName: cert-knative-wildcard
  rules:
  - host: '*.default.knative.company.com'
    http:
      paths:
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /v1
        pathType: Prefix
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /v2
        pathType: Prefix
After that, I created a second ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: default
    kubernetes.io/tls-acme: "true"
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/grpc-backend: "true"
  name: computing-grpc-ingress
  namespace: istio-system
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - '*.default.knative.company.com'
    secretName: cert-knative-wildcard
  rules:
  - host: '*.default.knative.company.com'
    http:
      paths:
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /grpc.reflection.v1alpha.ServerReflection/ServerReflectionInfo
        pathType: ImplementationSpecific
But I cannot get it working in any way. I have tried a million different configurations, but I get either 404 or 502 errors. My Istio ingress gateway service is:
apiVersion: v1
kind: Service
metadata:
  annotations:
    alb.ingress.kubernetes.io/healthcheck-path: /healthz/ready
    alb.ingress.kubernetes.io/healthcheck-port: "31619"
  labels:
    app: istio-ingressgateway
    install.operator.istio.io/owning-resource: unknown
    istio: ingressgateway
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio
  name: istio-ingressgateway
  namespace: istio-system
spec:
  clusterIP: 172.17.17.19
  clusterIPs:
  - 172.17.17.19
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  - name: http2
    port: 80
    protocol: TCP
    targetPort: 8080
  - name: https
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  sessionAffinity: None
  type: ClusterIP
Is there any way to get this working? I'm not sure what additional information I should add. Thank you!
Since you're chaining two different HTTP routers together, you might want to try isolating the behavior for each one:
- Try invoking the Knative service from a container in the cluster using the address of the internal Istio balancer that the Nginx ingress is pointing at (i.e. 172.17.17.19) with the appropriate Host header. If that's not working, your problem is in the Istio + Knative combination.
- Try running a gRPC container directly behind the Nginx Ingress, and make sure that Nginx is able to pass gRPC traffic.
If both of those are working, then there's some difference between your test in-cluster traffic and the way Nginx is sending the traffic. My guess here would be that the forwarded traffic is missing the Host header, but I'd check the two other debugging steps outlined above first.
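
The first check (calling the Istio gateway directly from inside the cluster with an explicit Host header) could be sketched in Python roughly as follows. This is only a sketch under assumptions that are not in the question: it needs the grpcio package, the hostname is a guess at the route Knative generated for the service, and the RPC path assumes Triton exposes the KServe v2 ServerLive method. For gRPC, the Host header is the HTTP/2 :authority value, which grpcio lets you override with the grpc.default_authority channel option.

```python
"""In-cluster gRPC probe that bypasses nginx and talks to the Istio gateway directly."""

# Assumptions (not from the original post): KNATIVE_HOST is a hypothetical route
# hostname, and LIVE_METHOD assumes the KServe v2 gRPC service is what Triton serves.
GATEWAY_ADDR = "172.17.17.19:80"  # ClusterIP and http2 port of istio-ingressgateway
KNATIVE_HOST = "triton-test.default.knative.company.com"  # hypothetical, check your route
LIVE_METHOD = "/inference.GRPCInferenceService/ServerLive"


def channel_options(host):
    """Return channel options that set the HTTP/2 :authority (Host) header."""
    return [("grpc.default_authority", host)]


def probe(addr=GATEWAY_ADDR, host=KNATIVE_HOST, timeout=5.0):
    """Send an empty ServerLive request over plaintext HTTP/2; return the raw reply bytes."""
    import grpc  # lazy import: requires the grpcio package to be installed

    with grpc.insecure_channel(addr, options=channel_options(host)) as channel:
        # No serializers are given, so request and response pass through as raw
        # bytes. A protobuf message with no fields set serializes to b"".
        call = channel.unary_unary(LIVE_METHOD)
        return call(b"", timeout=timeout)
```

Calling probe() from a pod in the cluster either returns the raw response bytes or raises grpc.RpcError. If it fails here, nginx is not involved yet, so the problem would be on the Istio + Knative side; if it succeeds, pointing the same probe at the nginx ingress (with TLS) is the next comparison.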