SSL with GRPC on AWS EKS and Istio Ingress gives StatusCode.UNAVAILABLE


I'm running a Kubernetes cluster on the AWS EKS service (K8s version 1.17), with Istio (1.7.1) installed on it via the Operator installation.

The services themselves have been running just fine. The Istio Ingress Gateway is used as the ingress service, published through an AWS NLB created with the following annotations on the Istio Ingress Gateway Service:

metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "redacted arn"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"

This successfully creates the NLB with 4 listeners (as per the Istio ingress definition), with the 443 listener serving TLS using the provided certificate.

Behind it, the Gateway is configured as follows:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: service-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: grpc-plain
        protocol: GRPC
      hosts:
        - redacted
    - port:
        number: 443
        name: grpc-tls
        protocol: GRPC
      hosts:
        - redacted
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: service-vservice
  namespace: app
spec:
  gateways:
  - istio-system/service-gateway
  hosts:
  - redacted
  http:
  - route:
    - destination:
        host: service
        port:
          number: 8000

While the plain port (80) works just fine through the load balancer, the TLS port (443) gives the following error with clients in every language tested (C, C++, Python):

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses"
    debug_error_string = "{"created":"@1601027790.018488379","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":4089,"referenced_errors":[{"created":"@1601027790.018476348","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":393,"grpc_status":14}]}"
>
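Since StatusCode.UNAVAILABLE only says that no subchannel could be established, gRPC's built-in tracing is one way to see whether it is the TLS handshake or the HTTP/2 connection that actually fails. A debugging sketch (the environment variables are the standard gRPC C-core ones, honoured by the Python, C, and C++ clients):

```shell
# Ask gRPC to log connection-level details, including TCP connects and
# TLS/secure-endpoint activity, before re-running the failing client.
export GRPC_VERBOSITY=debug
export GRPC_TRACE=tcp,http,secure_endpoint
# then re-run the failing client, e.g.: python client.py
```

If the trace shows the TCP connection succeeding but the handshake or HTTP/2 preface failing, the problem is at the load-balancer/TLS layer rather than in DNS or routing.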

As an example, the Python client has been initialized as follows:

import grpc
from service_pb2_grpc import ServiceStub

# Use the system root certificates; the NLB presents the ACM certificate
creds = grpc.ssl_channel_credentials()

# url is the redacted NLB hostname, with port 443
with grpc.secure_channel(url, creds) as channel:
    grpc_client = ServiceStub(channel)

What am I doing wrong to get this error from such a simple client?
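For comparison, if TLS were terminated at the Istio gateway itself rather than at the NLB (i.e. if the NLB 443 listener forwarded raw TCP), the 443 server in the Gateway would need an explicit tls block. A hypothetical sketch, assuming a TLS secret named ingress-cert had been created in istio-system (this is not part of the setup above):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: service-gateway-tls   # hypothetical alternative Gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443
        name: grpc-tls
        protocol: GRPC
      tls:
        mode: SIMPLE
        credentialName: ingress-cert  # hypothetical secret holding cert/key
      hosts:
        - redacted
```

In the setup from the question, the NLB terminates TLS and forwards plaintext, so the Gateway's 443 server correctly has no tls block; the sketch is only meant to make the distinction between the two termination points explicit.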


There is 1 answer

Piotr Malec

According to this article about using gRPC on AWS, getting gRPC to work properly behind AWS load balancers is a challenging task.

There is another article about how to create a load balancer for gRPC on AWS:

Here’s the fact – gRPC does not work with AWS load balancers properly.

It follows up with a workaround using Envoy:

How Can you Load Balance gRPC on AWS using Envoy

So what’s the solution here? We decided to use third-party software for load balancing. In this case, we used Envoy (for AWS load balancer Layer 7). It’s an open-source software created by Lyft.
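A minimal sketch of what such an Envoy deployment might look like (hypothetical values throughout; the upstream address assumes the service from the question, and the config uses the Envoy v3 API). The essential points for gRPC are that the listener speaks HTTP/2 end-to-end and the upstream cluster is explicitly marked as HTTP/2, so streams are load-balanced at Layer 7 rather than per TCP connection:

```yaml
static_resources:
  listeners:
  - name: grpc_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 8000 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: grpc
          codec_type: AUTO          # negotiates HTTP/2 with gRPC clients
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: grpc_backend }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: grpc_backend
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    # Mark the upstream as HTTP/2 so gRPC calls are proxied correctly
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}
    load_assignment:
      cluster_name: grpc_backend
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: service.app.svc.cluster.local, port_value: 8000 }
```

An NLB (Layer 4) in front of such an Envoy then only passes TCP through, and Envoy does the per-request gRPC load balancing that the AWS load balancers struggle with.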