Nginx performance issue


I am having some performance issues with Nginx.

I've got a single Nginx server sitting in front of 4 node.js applications. The node application doesn't do anything fancy: it just runs through the Express.js stack and doesn't access any external DB or other data source.

I'm using ApacheBench to get a benchmark of how my application performs. When I use a single server in my upstream rule, I get a constant 1000 req/s throughput. For example:

upstream cp_backend {
    server 10.134.14.128:5000;
    keepalive 32;
}

Returns 1000 req/s. When I add a second node application (running on a second server), for example:

upstream cp_backend {
    server 10.134.14.128:5000;
    server 10.134.34.226:5000;
    keepalive 32;
}

I see around 2000 req/s throughput (as expected).

However, as soon as I add more than 2 backend servers, the app no longer scales linearly. For example:

upstream cp_backend {
    server 10.134.14.128:5000;
    server 10.134.34.226:5000;
    server 10.134.26.172:5000;
    keepalive 32;
}

Only returns about 2500 req/s, instead of 3000 as expected. Similarly:

upstream cp_backend {
    server 10.134.14.128:5000;
    server 10.134.34.226:5000;
    server 10.134.26.172:5000;
    server 10.134.26.178:5000;
    keepalive 32;
}

Only returns about 3000 req/s, instead of 4000 as expected.

To give some more background: I have tested another (non-node.js) application running on this same set of servers and I'm able to get upwards of 17k req/s, so I know Nginx and my testing machine are capable of reaching those numbers. It's also worth mentioning that the content-length was actually bigger in the 17k test than in the node app test.

Each node app is running on a separate VM with its own resources. When using simple tools like top, I don't see any major bottlenecks. Plus, I know that with a single upstream server each node.js VM is capable of 1000 req/s. It seems like an Nginx bottleneck, because I can get 1000 req/s when going through Nginx to any one of them individually. If I know that each node app is capable of 1000 req/s, why can't I achieve 4k req/s with 4 upstream servers?

Any recommendations on good profiling software to use?


There are 2 answers

5
Soviut

Because you haven't specified any attempts to diagnose or what kind of hardware you're developing on, you're not going to get very good answers.

I'd assume it's some kind of resource shortage due to your hardware. You're probably either maxing out your CPUs or saturating your network connection.

Each Node app gets its own thread, but if two of those threads are running on the same core then they could be impacting each other's performance. Even if each app is running on its own core, you could be saturating the bus.

If it's network saturation, that just means your apps are fighting for bandwidth and should be given better network isolation.

Finally, the computer running ApacheBench may be struggling to keep up.

In all cases, find yourself some hardware profilers and see what your hardware is doing because your servers are obviously a lot more powerful and are scaling just fine.

0
Pramod Waikar

To understand the bottleneck, we need details of the CPU and memory utilisation on each node, including the Nginx proxy server itself.
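One way to collect part of that picture from the Nginx side is to log which backend handled each request and how long it took. This is only a sketch: the format name upstream_timing and the log path are hypothetical, and the directives are assumed to sit in the http context. The variables $upstream_addr, $upstream_response_time and $request_time are standard Nginx variables.

# Hypothetical log format: records the chosen backend and timing per request.
log_format upstream_timing '$remote_addr [$time_local] "$request" $status '
                           'upstream=$upstream_addr '
                           'request_time=$request_time '
                           'upstream_response_time=$upstream_response_time';

# Hypothetical path; point it wherever your logs normally go.
access_log /var/log/nginx/upstream_timing.log upstream_timing;

If upstream_response_time grows for every backend as servers are added while the backends themselves look idle, that would point toward the proxy or the load generator rather than the node apps.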

That said, it is good practice to use the least_conn directive when you have more than two upstream servers behind the proxy. It sends each new request to the server with the fewest active connections at that moment, which helps keep the load distribution even.
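As a minimal sketch, assuming the four backends from the question, the directive goes inside the upstream block. The Nginx documentation notes that a non-default balancing method should be declared before the keepalive directive, so it is placed first here.

upstream cp_backend {
    # Pick the backend with the fewest active connections instead of round-robin.
    least_conn;

    server 10.134.14.128:5000;
    server 10.134.34.226:5000;
    server 10.134.26.172:5000;
    server 10.134.26.178:5000;

    # Must come after the load-balancing method.
    keepalive 32;
}

Whether this closes the gap to 4000 req/s is not guaranteed; it only removes uneven distribution as a possible cause.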

You will get more info from http://nginx.org/en/docs/http/load_balancing.html

You may need to tune Nginx itself as well: http://clouditops.blogspot.in/2016/11/improve-nginx-performance.html
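The question does not show the rest of the proxy configuration, so the following is a sketch under assumptions rather than a diagnosis. The listen port and the worker_connections value are illustrative. The two proxy settings in the location block are what the Nginx documentation requires for the upstream keepalive directive to take effect with HTTP backends; without them, Nginx opens a new connection to the backend for every request.

# Assumed values; adjust to the number of cores and expected concurrency.
worker_processes auto;

events {
    worker_connections 4096;
}

http {
    # upstream cp_backend { ... } as defined in the question

    server {
        listen 80;

        location / {
            proxy_pass http://cp_backend;

            # Needed for "keepalive 32" in the upstream block to be used:
            # speak HTTP/1.1 to the backend and clear the Connection header.
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}

If these settings are already present, the keepalive pool size itself may be worth revisiting, since 32 idle connections per worker are shared across all servers in the upstream group.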