I'm trying to use a GKE private cluster with the standard config and the Anthos Service Mesh managed profile. However, when I deploy the "Iris" model as a test, the deployment gets stuck calling "storage.googleapis.com":
$ kubectl get all -n test
NAME                                                  READY   STATUS     RESTARTS   AGE
pod/iris-model-default-0-classifier-dfb586df4-ltt29   0/3     Init:1/2   0          30s

NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/iris-model-default              ClusterIP   xxx.xxx.65.194   <none>        8000/TCP,5001/TCP   30s
service/iris-model-default-classifier   ClusterIP   xxx.xxx.79.206   <none>        9000/TCP,9500/TCP   30s

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/iris-model-default-0-classifier   0/1     1            0           31s

NAME                                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/iris-model-default-0-classifier-dfb586df4   1         1         0       31s
$ kubectl logs -f -n test pod/iris-model-default-0-classifier-dfb586df4-ltt29 -c classifier-model-initializer
2022/11/19 20:59:34 NOTICE: Config file "/.rclone.conf" not found - using defaults
2022/11/19 20:59:57 ERROR : GCS bucket seldon-models path v1.15.0-dev/sklearn/iris: error reading source root directory: Get "https://storage.googleapis.com/storage/v1/b/seldon-models/o?alt=json&delimiter=%2F&maxResults=1000&prefix=v1.15.0-dev%2Fsklearn%2Firis%2F&prettyPrint=false": dial tcp 199.36.153.8:443: connect: connection refused
2022/11/19 20:59:57 ERROR : Attempt 1/3 failed with 1 errors and: Get "https://storage.googleapis.com/storage/v1/b/seldon-models/o?alt=json&delimiter=%2F&maxResults=1000&prefix=v1.15.0-dev%2Fsklearn%2Firis%2F&prettyPrint=false": dial tcp 199.36.153.8:443: connect: connection refused
2022/11/19 21:00:17 ERROR : GCS bucket seldon-models path v1.15.0-dev/sklearn/iris: error reading source root directory: Get "https://storage.googleapis.com/storage/v1/b/seldon-models/o?alt=json&delimiter=%2F&maxResults=1000&prefix=v1.15.0-dev%2Fsklearn%2Firis%2F&prettyPrint=false": dial tcp 199.36.153.8:443: connect: connection refused
2022/11/19 21:00:17 ERROR : Attempt 2/3 failed with 1 errors and: Get "https://storage.googleapis.com/storage/v1/b/seldon-models/o?alt=json&delimiter=%2F&maxResults=1000&prefix=v1.15.0-dev%2Fsklearn%2Firis%2F&prettyPrint=false": dial tcp 199.36.153.8:443: connect: connection refused
I used "sidecar injection" with the namespace labeling:
kubectl create namespace test
kubectl label namespace test istio-injection- istio.io/rev=asm-managed --overwrite
kubectl annotate --overwrite namespace test mesh.cloud.google.com/proxy='{"managed":"true"}'
When I don't use "sidecar injection", the deployment succeeds. But in that case I need to inject the proxy manually to get access to the model API. I wonder whether this is the intended behavior.
Istio sidecars block network connectivity for other init containers. Unfortunately, this is a known issue with Istio sidecars: the traffic redirect rules are typically already in place while the sidecar proxy itself has not started yet, so outbound calls from an init container are refused. A potential workaround is to tell Istio not to intercept traffic going to storage.googleapis.com (i.e. don't route that traffic through Istio's egress), which can be done through Istio's excludeIPRanges flag. In the longer term, due to these shortcomings, Istio seems to be moving away from sidecars toward its new "Ambient mesh".
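As a rough sketch of that workaround, the per-pod form of the exclusion is the annotation traffic.sidecar.istio.io/excludeOutboundIPRanges on the pod template. The CIDR used here is an assumption: 199.36.153.8/30 is the private.googleapis.com range, which matches the address in the error log from the question, so adjust it if storage.googleapis.com resolves elsewhere in your cluster. Where the annotation has to go so that it ends up on the generated pod depends on how the Seldon operator builds the Deployment, so the placement below is illustrative only, and I have not verified it against the managed Anthos Service Mesh profile.

```yaml
# Pod template metadata (illustrative placement, not taken from the original setup)
metadata:
  annotations:
    # Assumed CIDR: the private.googleapis.com range seen in the error log.
    # Outbound traffic to these addresses bypasses the sidecar's redirect rules.
    traffic.sidecar.istio.io/excludeOutboundIPRanges: "199.36.153.8/30"
```

With the range excluded, the model initializer's call to storage.googleapis.com should go straight out of the pod instead of being redirected to a proxy that is not running yet. The trade-off is that this traffic is no longer subject to mesh policy or telemetry.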