Strange Nginx throughput after switching from Apache


System information (AWS EC2 m4.large instance behind Elastic Beanstalk):

Region: us-west-1
Memory: 8GB
CPU: 2 core / 2.4GHz
PHP Version: 7.0.22 (ZTS) with FPM
Nginx Version: 1.10.2

The API is used by web, mobile, and other clients. Each endpoint makes database requests and uses a cache (APCu or Redis).

Apache

Apache served ~40 requests per second. Latency was ~500-1200 ms, depending on the API endpoint.

Nginx

Then we decided to move to Nginx, but ran into strange behavior: throughput dropped to ~20 requests per second, and latency keeps increasing over the course of a test (e.g. a test starts at 300 ms and ends at >31000 ms).
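As a rough sanity check on these numbers, Little's Law (requests in flight = throughput x latency) gives an estimate of how many requests are open at once. The sketch below only uses the figures quoted in this post. The helper name `in_flight` is made up for illustration, and the law assumes a steady state, which the Nginx run clearly is not, so the last figure is only indicative.

```shell
#!/bin/sh
# Little's Law: requests in flight = throughput (req/s) x latency (s).
# in_flight is a hypothetical helper; the inputs are the figures from this post.
in_flight() {
    awk -v rps="$1" -v lat="$2" 'BEGIN { printf "%.0f\n", rps * lat }'
}

in_flight 40 0.5    # Apache, fastest endpoints  -> 20
in_flight 40 1.2    # Apache, slowest endpoints  -> 48
in_flight 20 31     # Nginx, end of the test run -> 620
```

If the estimate holds, roughly 620 requests are open at the end of the Nginx test, far more than the 75 workers allowed by pm.max_children in the pool config further down. That would be consistent with requests piling up in a queue rather than being processed, but it does not say where the queue is.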

/etc/nginx/nginx.conf:

user webapp;
pid /var/run/nginx.pid;

worker_processes auto;
worker_rlimit_nofile 10000;

error_log /var/log/nginx/error.log;
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    fastcgi_buffers 8 16k;
    fastcgi_buffer_size 32k;
    fastcgi_connect_timeout 60;
    fastcgi_send_timeout 300;
    fastcgi_read_timeout 300;

    charset utf-8;

    client_max_body_size 50m;

    gzip  on;
    gzip_vary on;
    gzip_min_length 10240;
    gzip_proxied expired no-cache no-store private auth;
    gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml application/json;
    gzip_disable "MSIE [1-6]\.";

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    upstream php {
        server 127.0.0.1:9000;
    }

    include /etc/nginx/conf.d/*.conf;

    index   index.html index.htm;
}

/fpm/pools/www.conf:

[www]
user = webapp
group = webapp
listen = 127.0.0.1:9000

pm = dynamic
pm.max_children = 75
pm.start_servers = 30
pm.min_spare_servers = 30
pm.max_spare_servers = 35
pm.max_requests = 500

... the rest is default

Performance is measured with Apache JMeter using custom scenarios. Tests are run from another EC2 instance in the same region.

cURL stats:

lookup: 0.125
connect: 0.125
appconnect: 0.221
pretransfer: 0.221
redirect: 0.137
starttransfer: 0.252
total: 0.389
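For anyone who wants to reproduce this kind of breakdown, the fields listed above correspond to curl's `--write-out` timing variables. The exact command used for this post is not shown, so the following is only a minimal sketch. The function name `curl_timings` and the URL in the usage note are placeholders.

```shell
#!/bin/sh
# curl --write-out variables matching the fields listed above (values are in seconds).
CURL_FORMAT='lookup: %{time_namelookup}\nconnect: %{time_connect}\nappconnect: %{time_appconnect}\npretransfer: %{time_pretransfer}\nredirect: %{time_redirect}\nstarttransfer: %{time_starttransfer}\ntotal: %{time_total}\n'

# Print the timing breakdown for a single request; the response body is discarded.
# curl_timings is a hypothetical helper name.
curl_timings() {
    curl --silent --output /dev/null --write-out "$CURL_FORMAT" "$1"
}
```

Usage, with a placeholder endpoint: `curl_timings https://api.example.com/some/endpoint`. A single request measured this way shows per-request latency when the server is idle, not behavior under load.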

tcptraceroute also looks fine (1 ms).

Please advise! I cannot find the cause of the problem by myself. Thanks!
