I have three clusters named gke-1 (an Autopilot cluster), gke-2, and gke-3. Each cluster is deployed in a separate project and a different VPC. The VPCs are connected through VPC peering so that all the projects can communicate internally. I set up multi-cluster Services (MCS) by following this documentation: https://cloud.google.com/kubernetes-engine/docs/how-to/migrate-gke-multi-cluster.
Our fleet host cluster is gke-1, and the registered clusters are gke-2 and gke-3. When I export a service from gke-1 to gke-2 and gke-3, it works as expected. The problem is the opposite direction: exporting a service from gke-2 to gke-1, or from gke-3 to gke-1, does not work. In that case the service is exported from gke-2 to gke-1, and the Traffic Director resources are created and appear healthy. But when I make an API call from a gke-1 pod to the gke-2 service, it fails with a 503 error and the message: "Failed to connect to ..svc.clusterset.local port 80 after 131287 ms: Couldn't connect to the server."
Can anyone please help me debug this issue?
According to the documentation, you can run the following command to describe the status of your multi-cluster Services and help debug the issue:
gcloud container fleet multi-cluster-services describe
After running the command, evaluate the output based on the status code it reports.
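As a rough sketch of how these checks could be scripted, the snippet below wraps the describe command and adds two kubectl queries for the ServiceExport and ServiceImport resources that MCS uses. The project, namespace, and service names are hypothetical placeholders, and the helper function names are my own, not part of any Google tooling. It assumes your kubectl context already points at the cluster you want to inspect.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders: replace with your own values.
FLEET_HOST_PROJECT="${FLEET_HOST_PROJECT:-my-fleet-host-project}"
NAMESPACE="${NAMESPACE:-my-namespace}"
SERVICE="${SERVICE:-my-service}"

# Build the clusterset DNS name under which an exported Service is reachable.
mcs_dns_name() {
  local service="$1" namespace="$2"
  printf '%s.%s.svc.clusterset.local\n' "$service" "$namespace"
}

# Show the MCS feature status for the fleet host project.
check_mcs_feature() {
  gcloud container fleet multi-cluster-services describe \
    --project "$FLEET_HOST_PROJECT"
}

# Run against the exporting cluster (for example gke-2):
# lists the ServiceExport objects in the namespace.
check_export() {
  kubectl get serviceexports --namespace "$NAMESPACE"
}

# Run against the importing cluster (for example gke-1):
# lists the ServiceImport objects in the namespace.
check_import() {
  kubectl get serviceimports --namespace "$NAMESPACE"
}
```

You could call check_mcs_feature first, then check_export with the kubectl context set to gke-2 and check_import with the context set to gke-1. If the ServiceImport is missing in gke-1, one thing worth checking is that the same namespace exists there. mcs_dns_name "$SERVICE" "$NAMESPACE" prints the hostname to use in a test request from a gke-1 pod.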
Lastly, please note that the documentation lists limitations for MCS when clusters are spread across multiple projects.