Pulumi Fargate abort deploy on bad health check

I'm deploying an app to AWS Fargate using Pulumi. A healthy deploy finishes in about 10 minutes. On an unhealthy deploy, the app reports an unhealthy status and Fargate restarts it over and over for about 45 minutes before deciding the deploy has failed. In 99% of cases we know within a minute of deploying whether it will never succeed.

The health checks during deploys thus have three states:

  • healthy, please drain the old app version and switch all new load to this new app version
  • not sure yet
  • abort the deploy; the old app version keeps serving requests throughout the failed deployment

How can we tell Fargate to give up restarting once the app reports "no, this code is broken, no deploys of this code will ever work, please abort deployment"?

Here's the current config:

const lb = new awsx.classic.lb.ApplicationLoadBalancer('app-alb', {})

const lbTargetGroup = new awsx.classic.lb.ApplicationTargetGroup(
  'app-targetgroup',
  {
    loadBalancer: lb,
    protocol: 'HTTP',
    port: appConfig.requireNumber('webPort'),
    healthCheck: {
      protocol: 'HTTP',
      path: '/health',
      unhealthyThreshold: 10,
      healthyThreshold: 2,
      timeout: 20,
      interval: 30,
      matcher: '200',
    },
  }
)

As far as I can tell, the matcher only differentiates two states: OK and everything else. Can we add a third state?
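To illustrate the two-state limit: the target group matcher is a list or range of HTTP status codes (for example '200', '200,202' or '200-299'), and a response either matches it or does not. The sketch below is a simplified model of that check, not the load balancer's actual implementation; `matches` and `verdict` are hypothetical names.

```typescript
type Verdict = 'healthy' | 'unhealthy'

// Returns true when `code` satisfies a matcher such as '200', '200,202' or '200-299'.
function matches(matcher: string, code: number): boolean {
  return matcher.split(',').some((part) => {
    // A single code like '200' is treated as the range 200-200.
    const [low, high = low] = part.trim().split('-').map(Number)
    return code >= low && code <= high
  })
}

// Every response collapses into one of two outcomes; there is no slot for "abort".
function verdict(matcher: string, code: number): Verdict {
  return matches(matcher, code) ? 'healthy' : 'unhealthy'
}

const ok = verdict('200', 200)
const notReady = verdict('200', 503)
const broken = verdict('200', 500)
```

Whatever status code the app picks for "this code is broken", it lands in the same bucket as "not sure yet".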

Best answer, by Mark B

The only option you have is to enable the ECS deployment circuit breaker, if you haven't already. Other than that, there is no way to make ECS fail a deployment any sooner.
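In Pulumi, the circuit breaker is configured on the ECS service rather than on the load balancer or target group. Here is a minimal sketch, assuming the service is declared with `aws.ecs.Service` from `@pulumi/aws`, whose args accept `deploymentCircuitBreaker` with `enable` and `rollback`. `withCircuitBreaker` and the placeholder args are hypothetical names for illustration, and no Pulumi resources are created here.

```typescript
// Shape of the setting; mirrors `deploymentCircuitBreaker` on aws.ecs.Service.
interface DeploymentCircuitBreaker {
  enable: boolean // stop the deployment once ECS decides it cannot reach steady state
  rollback: boolean // return to the last completed deployment after the failure
}

// Hypothetical helper: returns a copy of the service args with the circuit breaker switched on.
function withCircuitBreaker<T extends object>(
  serviceArgs: T
): T & { deploymentCircuitBreaker: DeploymentCircuitBreaker } {
  return {
    ...serviceArgs,
    deploymentCircuitBreaker: { enable: true, rollback: true },
  }
}

// Placeholder args for illustration; in a real program the result would be passed on,
// e.g. `new aws.ecs.Service('app', withCircuitBreaker({ ... }))`.
const serviceArgs = withCircuitBreaker({ desiredCount: 2, launchType: 'FARGATE' })
```

If the service is created through a higher-level component, check whether that component exposes the same property before relying on this.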

ECS Circuit Breaker is the closest thing to what you describe as:

"tell fargate to give up restarting once the app reports 'no, this code is broken, no deploys of this code will ever work, please abort deployment'"

However, it will still retry a minimum of 10 times, so it will still take a long time before it decides the deployment is a failure.
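A rough back-of-the-envelope estimate shows why the wait is so long with the settings in the question (`interval: 30`, `unhealthyThreshold: 10`). This is a sketch under simplifying assumptions: attempts run one after another, and task startup and scheduling time are ignored. The function names are hypothetical.

```typescript
// Seconds until the load balancer marks one task unhealthy:
// it needs `unhealthyThreshold` consecutive failed checks, one per `interval`.
function secondsToUnhealthy(interval: number, unhealthyThreshold: number): number {
  return interval * unhealthyThreshold
}

// Rough lower bound in minutes for the whole failed deployment,
// assuming `attempts` failed tasks are tried sequentially.
function minutesToFail(interval: number, unhealthyThreshold: number, attempts: number): number {
  return (secondsToUnhealthy(interval, unhealthyThreshold) * attempts) / 60
}

// Values from the question's health check, with the 10 retries mentioned above.
const estimate = minutesToFail(30, 10, 10) // 50
```

Under these assumptions each attempt takes about 5 minutes to be declared unhealthy, and ten attempts add up to roughly 50 minutes, the same order as the 45 minutes reported in the question.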

At this time, there are no other settings in ECS, Fargate, the load balancer, or the target group to control this further.