Knative and gRPC with nginx-ingress


I have installed Knative/KServe in my AWS EKS cluster. Everything is working fine, but recently we decided to try gRPC for the services deployed there. The stack is deployed with Istio, with an nginx ingress in front of everything; the Ingress points to the Istio ingress gateway:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: default
    kubernetes.io/tls-acme: "true"
  name: computing-ingress
  namespace: istio-system
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - '*.default.knative.company.com'
    secretName: cert-knative-wildcard
  rules:
  - host: '*.default.knative.company.com'
    http:
      paths:
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /
        pathType: Prefix

As described in the Knative gRPC tutorial (https://github.com/meteatamel/knative-tutorial/blob/master/docs/grpc.md), I have changed my InferenceService YAML by adding an h2c port block:

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  annotations:
  finalizers:
  - inferenceservice.finalizers
  name: triton-test
  namespace: default
spec:
  predictor:
    model:
      args:
      - --model-control-mode=poll
      - --repository-poll-secs=5
      - --allow-grpc=true
      - --grpc-port=9000
      - --log-verbose=0
      env:
      - name: CUDA_VISIBLE_DEVICES
        value: "0"
      - name: S3_DATA_PATH
        value: s3://mymodeldata/
      - name: S3_PARAMS
        value: --region us-east-2 --no-sign-request
      image: XXXXXXX.dkr.ecr.us-east-2.amazonaws.com/ml:mytriton
      modelFormat:
        name: triton
      name: kserve-container
      ports:
      - containerPort: 9000
        name: h2c
        protocol: TCP
      protocolVersion: v2
      resources:
        limits:
          cpu: "2"
          memory: 8Gi
          nvidia.com/gpu: "1"
        requests:
          cpu: "2"
          memory: 8Gi
          nvidia.com/gpu: "1"
      storageUri: s3://mymodeldata/
      volumeMounts:
      - mountPath: /dev/shm
        name: dshm
    nodeSelector:
      DedicatedFor: GPU
    volumes:
    - emptyDir:
        medium: Memory
        sizeLimit: 2Gi
      name: dshm

Since the gRPC annotations apply at the Ingress level, I changed the main Ingress to use more specific paths:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: default
    kubernetes.io/tls-acme: "true"
  name: computing-ingress
  namespace: istio-system
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - '*.default.knative.company.com'
    secretName: cert-knative-wildcard
  rules:
  - host: '*.default.knative.company.com'
    http:
      paths:
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /v1
        pathType: Prefix
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /v2
        pathType: Prefix

and after that created a second Ingress for the gRPC traffic:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: default
    kubernetes.io/tls-acme: "true"
    nginx.ingress.kubernetes.io/backend-protocol: GRPC
    nginx.ingress.kubernetes.io/grpc-backend: "true"
  name: computing-grpc-ingress
  namespace: istio-system
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - '*.default.knative.company.com'
    secretName: cert-knative-wildcard
  rules:
  - host: '*.default.knative.company.com'
    http:
      paths:
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
      - backend:
          service:
            name: istio-ingressgateway
            port:
              number: 80
        path: /grpc.reflection.v1alpha.ServerReflection/ServerReflectionInfo
        pathType: ImplementationSpecific

But I cannot get it working in any way. I have tried many different configurations, but I always get either 404 or 502 errors. My Istio ingress gateway Service is:

apiVersion: v1
kind: Service
metadata:
  annotations:
    alb.ingress.kubernetes.io/healthcheck-path: /healthz/ready
    alb.ingress.kubernetes.io/healthcheck-port: "31619"
  labels:
    app: istio-ingressgateway
    install.operator.istio.io/owning-resource: unknown
    istio: ingressgateway
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio
  name: istio-ingressgateway
  namespace: istio-system
spec:
  clusterIP: 172.17.17.19
  clusterIPs:
  - 172.17.17.19
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  - name: http2
    port: 80
    protocol: TCP
    targetPort: 8080
  - name: https
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  sessionAffinity: None
  type: ClusterIP

Is there any possible way to get this working? I'm not sure what additional information I should add. Thank you!

There is 1 answer

E. Anderson

Since you're chaining two different HTTP routers together, you might want to try isolating the behavior for each one:

  • Try invoking the Knative service from a container in the cluster, using the address of the internal Istio balancer that the Nginx ingress is pointing at (i.e. 172.17.17.19) with the appropriate Host header. If that's not working, your problem is in the Istio + Knative combination.

  • Try running a gRPC container directly behind the Nginx ingress, and make sure that Nginx is able to pass gRPC traffic.

  • If both of those are working, then there's some difference between your test in-cluster traffic and the way Nginx is sending the traffic. My guess here would be that the forwarded traffic is missing the Host header, but I'd check the two other debugging steps outlined above first.
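The first isolation step above can be sketched like this (ClusterIP and host are taken from the question; the debug image and the plaintext-h2c assumption are mine):

```shell
# Start a throwaway pod with grpcurl inside the cluster.
kubectl run grpc-debug --rm -it --restart=Never \
  --image=fullstorydev/grpcurl:latest --command -- /bin/sh

# From inside the pod: talk to the Istio ingress gateway's ClusterIP on
# port 80, overriding the :authority header so Knative routes to the
# right service. -plaintext assumes h2c (no TLS) on that port.
grpcurl -plaintext \
  -authority triton-test.default.knative.company.com \
  172.17.17.19:80 list
```

If this in-cluster call succeeds while the call through Nginx fails, the problem is in the Nginx hop rather than in Istio or Knative.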