I am planning to build an auto-scaling group (ASG) in a multi-AZ (AZ = availability zone) network. Let's say that we ran some diagnostics and discovered that we need at least 8 instances for normal load, and 24 instances during peak times.
Here's a sample screenshot of the console.
I am confused about whether these 8 instances (or 24 instances) will run across AZs or in a single AZ. Moreover, if I want to force the ASG to keep, say, 8 instances in each AZ, how do I do that?
When you create the Auto Scaling group, you nominate the AZs in which instances should be launched.
Auto Scaling will aim to keep the number of instances in each AZ balanced. For example, when launching a new instance, it will launch in the AZ with the fewest number of instances in the Auto Scaling group (or a random AZ if they are equal). When terminating an instance, it will select an instance in the AZ with the most instances in the Auto Scaling group (or a random AZ if they are equal).
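To make the balancing behaviour concrete, here is a small simulation of the rule described above. It is only a toy model, not the actual AWS implementation: the AZ names are made up, the helper functions are hypothetical, and ties are broken alphabetically rather than randomly so that the output is repeatable.

```python
def launch(counts):
    """Add one instance to the AZ that currently has the fewest instances."""
    # Assumption: ties go to the alphabetically first AZ (AWS picks at random).
    az = min(sorted(counts), key=lambda zone: counts[zone])
    counts[az] += 1
    return az


def terminate(counts):
    """Remove one instance from the AZ that currently has the most instances."""
    az = max(sorted(counts), key=lambda zone: counts[zone])
    counts[az] -= 1
    return az


def scale_to(counts, desired):
    """Launch or terminate one instance at a time until the total equals desired."""
    while sum(counts.values()) < desired:
        launch(counts)
    while sum(counts.values()) > desired:
        terminate(counts)
    return counts


# Hypothetical AZ names, used only for illustration.
azs = {"us-east-1a": 0, "us-east-1b": 0, "us-east-1c": 0}

normal = dict(scale_to(azs, 8))   # normal load: 8 instances in total
peak = dict(scale_to(azs, 24))    # peak load: 24 instances in total

print(normal)
print(peak)
```

With 3 AZs, 8 instances cannot be split evenly, so the model ends up with a 3/3/2 distribution at normal load and 8/8/8 at peak. The instance count is a total for the whole group, spread across the AZs, not a per-AZ figure.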
Therefore, to ensure 8 instances in each AZ, the Auto Scaling group would need to have an instance count equal to 8 times the number of configured AZs.
If you wish to ensure that 8 instances will be running at all times, and the Auto Scaling group is using 3 AZs, then you should allow for the (small) possibility that one AZ might fail. If this happens, Auto Scaling will launch more instances in the remaining AZs. If your application cannot wait for these extra instances to launch, then it will need to have 4 instances in each of the 3 AZs (12 in total). This way, if one AZ fails, there will still be two AZs each with 4 instances, giving 8 instances running.
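The arithmetic behind that sizing can be sketched as a small helper. The function name and parameters are hypothetical, and it assumes instances are spread evenly across the AZs and that whole AZs fail at once.

```python
import math


def capacity_for_az_failure(required, num_azs, az_failures_tolerated=1):
    """Return (instances per AZ, total instances) so that `required`
    instances are still running after losing `az_failures_tolerated` AZs."""
    surviving = num_azs - az_failures_tolerated
    if surviving < 1:
        raise ValueError("at least one AZ must survive")
    # The surviving AZs alone must be able to carry the required load.
    per_az = math.ceil(required / surviving)
    return per_az, per_az * num_azs


per_az, total = capacity_for_az_failure(required=8, num_azs=3)
print(per_az, total)  # 4 per AZ, 12 in total

peak_per_az, peak_total = capacity_for_az_failure(required=24, num_azs=3)
print(peak_per_az, peak_total)  # 12 per AZ, 36 in total
```

Under these assumptions, the normal-load figure of 8 becomes a group size of 12, and the peak figure of 24 becomes 36, if the application must ride out the loss of one AZ without waiting for replacements.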
Therefore: