ECS task stuck in PENDING state


I am trying out AWS App Mesh. I have pushed an image to ECR that starts a web server on port 8080, and I have created an ECS service for it. I have been following the getting-started guide (https://docs.aws.amazon.com/app-mesh/latest/userguide/getting-started-ecs.html) just to try out the service. When I get to the part where I update my service with App Mesh enabled, my task gets stuck in the PENDING state and the envoy container is unhealthy (screenshot attached).

I am using this envoy image:

840364872350.dkr.ecr.us-east-1.amazonaws.com/aws-appmesh-envoy:v1.21.1.2-prod


To be honest, I don't really understand how this works, and I would like to know whether I can debug this somehow to understand the problem. Thank you in advance!


There is 1 answer

Patrick

With the ECS App Mesh integration, a container dependency (container startup order) is added to the application container so that it starts only once the envoy container is healthy. See the images below:

Application Container with Container Ordering (View container)

Application Container with Container Ordering (Edit task/container)
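For readers who cannot see the screenshots, here is a minimal sketch of what that ordering looks like in the task definition, written as a Python dict. The container name "app" and its image are placeholders, and the envoy health check values are taken from the App Mesh getting-started guide as I remember it, so treat them as assumptions. The small helper `blocked_containers` is hypothetical and only mimics the rule "do not start until the dependency is HEALTHY".

```python
# Relevant part of an ECS task definition after App Mesh is enabled (sketch).
# "app" and "<your-ecr-image>" are placeholders, not values from the question.
task_definition = {
    "containerDefinitions": [
        {
            "name": "envoy",
            "image": "840364872350.dkr.ecr.us-east-1.amazonaws.com/aws-appmesh-envoy:v1.21.1.2-prod",
            "essential": True,
            # Assumed health check: envoy's admin endpoint must report LIVE.
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "curl -s http://localhost:9901/server_info | grep state | grep -q LIVE",
                ],
                "startPeriod": 10,
                "interval": 5,
                "timeout": 2,
                "retries": 3,
            },
        },
        {
            "name": "app",
            "image": "<your-ecr-image>",
            "portMappings": [{"containerPort": 8080}],
            # The container ordering: start "app" only when "envoy" is HEALTHY.
            "dependsOn": [{"containerName": "envoy", "condition": "HEALTHY"}],
        },
    ]
}


def blocked_containers(task_def, health):
    """Return the names of containers still waiting on a HEALTHY dependency.

    `health` maps container name to its current health status string.
    """
    blocked = []
    for container in task_def["containerDefinitions"]:
        for dep in container.get("dependsOn", []):
            if dep["condition"] == "HEALTHY" and health.get(dep["containerName"]) != "HEALTHY":
                blocked.append(container["name"])
                break
    return blocked


print(blocked_containers(task_definition, {"envoy": "UNHEALTHY"}))  # ['app']
```

As long as envoy never reports HEALTHY, the application container is never started, which is why the task as a whole sits in PENDING.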

To find out why the envoy container is UNHEALTHY, I suggest enabling logging on the envoy container. In my case, the envoy container could not retrieve credentials from the EC2 metadata service (see the log line below), so it stayed UNHEALTHY and the ECS task remained in PENDING indefinitely.

[error][aws] [source/extensions/common/aws/credentials_provider_impl.cc:118] Could not retrieve credentials listing from the EC2MetadataService
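One way to enable logging on the envoy container is to add an `awslogs` log configuration to its container definition. The sketch below does that on a task definition held as a Python dict. The function name `with_envoy_logging`, the log group name, and the choice of `debug` are my own assumptions; `ENVOY_LOG_LEVEL` is the log-level variable that the App Mesh envoy image reads, as far as I know.

```python
import copy


def with_envoy_logging(task_def, log_group, region, envoy_name="envoy", level="debug"):
    """Return a copy of task_def whose envoy container logs to CloudWatch Logs."""
    updated = copy.deepcopy(task_def)
    for container in updated["containerDefinitions"]:
        if container["name"] != envoy_name:
            continue
        # Send the container's stdout/stderr to CloudWatch Logs.
        container["logConfiguration"] = {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
                "awslogs-stream-prefix": envoy_name,
            },
        }
        # Replace any existing ENVOY_LOG_LEVEL entry with the requested level.
        env = [e for e in container.get("environment", []) if e["name"] != "ENVOY_LOG_LEVEL"]
        env.append({"name": "ENVOY_LOG_LEVEL", "value": level})
        container["environment"] = env
    return updated


# Hypothetical minimal input; only the container names matter here.
example = {"containerDefinitions": [{"name": "envoy"}, {"name": "app"}]}
logged = with_envoy_logging(example, "/ecs/appmesh-demo", "us-east-1")
print(logged["containerDefinitions"][0]["logConfiguration"]["logDriver"])  # awslogs
```

The log group has to exist already, and the task execution role needs permission to write to it (`logs:CreateLogStream` and `logs:PutLogEvents`); otherwise no envoy output will show up.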

After I added the necessary permission to the ECS task role, the issue was resolved: the envoy container became healthy, so the application container could start as well. Hope this helps.
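The exact permission is not spelled out above. As an illustration only, the App Mesh documentation on Envoy proxy authorization lists `appmesh:StreamAggregatedResources` on the virtual node as the action envoy needs, so the sketch below assumes that is the one. The function name, account ID, mesh name, and virtual node name are all hypothetical placeholders.

```python
import json


def envoy_task_role_policy(region, account_id, mesh_name, virtual_node_name):
    """Build an IAM policy document for the ECS task role (assumed permission)."""
    resource = (
        f"arn:aws:appmesh:{region}:{account_id}:"
        f"mesh/{mesh_name}/virtualNode/{virtual_node_name}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: this is the permission envoy was missing.
                "Action": "appmesh:StreamAggregatedResources",
                "Resource": [resource],
            }
        ],
    }


# Placeholder values, not taken from the question.
policy = envoy_task_role_policy("us-east-1", "111122223333", "my-mesh", "my-virtual-node")
print(json.dumps(policy, indent=2))
```

The task definition also has to reference a task role at all (the `taskRoleArn` field); without one, the containers in the task have no role credentials to retrieve.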