Why do all App Service requests end up in the same instance?


I have configured my App Service Plan to have 4 instances.

Within the App Service Plan, I host 1 web application and 1 api (node.js).

After waiting an hour or so after setting the instance count to 4 manually, I perform a load test, but I see only 1 instance being hit.


What is preventing me from using all the servers?

I've set:

  • Session affinity to OFF
  • Load test to use 4 instances with 100 users each, public traffic enabled
  • App Service Plan instance count to 4

To me it seems this is all I would need to balance the traffic evenly over the 4 instances, yet it doesn't seem to do so. What am I missing here?


Update:

I've got these settings under scale out:

(screenshot of the scale-out settings; image not included)


Update:

I ran another load test, with the same result: only 1 instance is hit.

3

There are 3 answers

6
user2794745 On

You may need to confirm the instances configured on the App Service. This is configured under scaling.

It's possible to scale out the number of instances on the ASP to a particular value, but have a different number of instances for each App Service hosted on the plan. This is useful if some applications aren't designed to work with multiple instances.
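To compare the two values, something like the following sketch could be used. It assumes the Az PowerShell module and a signed-in session; the resource names are placeholders, and the property names (`Sku.Capacity`, `PerSiteScaling`, `SiteConfig.NumberOfWorkers`) are as I understand them from the Az.Websites module, so check them against your installed version.

```powershell
# Placeholder names (assumptions): replace with your own resources.
$ResourceGroup  = 'my-resource-group'
$AppServicePlan = 'my-plan'
$WebApp         = 'my-api'

# Plan level: how many instances the plan runs, and whether per-app scaling is on.
$plan = Get-AzAppServicePlan -ResourceGroupName $ResourceGroup -Name $AppServicePlan
"Plan instance count (Sku.Capacity): $($plan.Sku.Capacity)"
"PerSiteScaling enabled:             $($plan.PerSiteScaling)"

# App level: the worker count configured for this particular app.
$app = Get-AzWebApp -ResourceGroupName $ResourceGroup -Name $WebApp
"App NumberOfWorkers:                $($app.SiteConfig.NumberOfWorkers)"
```

If the plan reports 4 instances but the app is limited to fewer workers, that mismatch would be the first thing to look at.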

2
nmbrphi On

What did you set for the PerSiteScaling configuration?

By default, this setting is false. If you change it to true, the platform automatically spreads the instances of the Web App across all available instances of the App Service plan.

This is what the docs say:

Apps are allocated to available App Service plan using a best effort approach for an even distribution across instances. While an even distribution is not guaranteed, the platform will make sure that two instances of the same app will not be hosted on the same App Service plan instance.

The platform does not rely on metrics to decide on worker allocation. Applications are rebalanced only when instances are added or removed from the App Service plan.

For example, to enable PerSiteScaling using PowerShell:

Set-AzAppServicePlan -ResourceGroupName $ResourceGroup `
   -Name $AppServicePlan -PerSiteScaling $true
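Once per-app scaling is enabled, the number of instances an individual app uses is set on the app itself. The sketch below follows the pattern I recall from Microsoft's per-app scaling documentation; `$WebApp` and the value 4 are placeholders chosen for this question, not values from the original post.

```powershell
# Placeholder (assumption): the name of the app to configure.
$WebApp = 'my-api'

# Fetch the app that lives on the per-app-scaling plan.
$app = Get-AzWebApp -ResourceGroupName $ResourceGroup -Name $WebApp

# Ask for 4 workers for this app (must not exceed the plan's instance count).
$app.SiteConfig.NumberOfWorkers = 4

# Push the updated configuration back to Azure.
Set-AzWebApp -WebApp $app
```

The worker count on the app only has an effect while `PerSiteScaling` is `$true` on the plan.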
0
BlackStar On

The reason only 1 of the instances appears to receive all the traffic is how scaling works.

In the Microsoft documentation you can find this text:

https://learn.microsoft.com/en-us/azure/app-service/manage-automatic-scaling?tabs=azure-portal

How automatic scaling works: You enable automatic scaling for an App Service Plan and configure a range of instances for each of the web apps. As your web app starts receiving HTTP traffic, App Service monitors the load and adds instances. Resources may be shared when multiple web apps within an App Service Plan are required to scale out simultaneously.

It is not very clear what it is trying to say, but you need to consider the App Service and the App Service plan as different entities.

The App Service can be thought of as the actual process on the server that is doing the work. For example, when you open Chrome (or any other browser) on your laptop, you'll see there are several Chrome processes.

Take this as an example (image not included).

But all of those processes are running on your machine.

Even if you have 4 users (each needing only one window), you can serve them by opening 4 different windows on the same machine (leaving other complications, such as monitors and keyboards, out of the example).

Now let's say you need 1000 windows. You'll probably run out of RAM on a single computer, so to get there you'll need a second computer, or several more.

This is where the App Service plan comes in as a concept. It is like the physical machines you have at your disposal to do your work.

At the level of the App Service, a certain number of processes is needed to complete the work, and for your scenario one machine is enough most of the time. That's why you only see one of them being hit with all the traffic.

You can test this by pushing harder and sending 10K requests, which should force the App Service plan to make use of the rest of the nodes.

Another test could be to deploy another App Service (or several more); you'll probably see those get allocated to different nodes for balancing.

For simplicity, Azure hides this kind of complexity from the user, because it is in charge of deciding how to route traffic and use the nodes, adjusting for better performance and timings.

Your metrics show that you barely used 50% of the CPU on one of the nodes, so you'll definitely need to push harder to see the other nodes being used.