PostgreSQL VACUUM/CLUSTER/UPDATE disk at 100% but only 5MB/sec

1.1k views Asked by At

I am getting a very strange PostgreSQL 9.4 behavior. When it runs an UPDATE on a large table, or performs VACUUM or CLUSTER of a large table it seems to hang for a very long time. In fact I just end up killing the process the following day. What's odd about it is that CPU is idle and at the same time disk activity is at 100% BUT it only reports a 4-5 MB/sec reads and writes (see screenshot of nmap & atop).

Screenshot, nmap & atop, disk activity

My server is 24CPU, 32GB RAM and RAID1 (2 SAS 15K x 2). Normally when disk is at 100% utilization it gives me 120-160 MB/s combined reads/writes which can stay almost indefinitely at >100MB/sec of sustained IO.

The system becomes very sluggish altogether even terminal command line. My guess it has something to do with shared memory and virtual memory. When this happens PostgreSQL consumes maximum configured shared memory.

I have disabled swapping vm.swappiness=0. I didn't play with vm.dirty_ratio, vm.dirty_background_ratio and such. System huge pages are disabled vm.nr_hugepages=0.

The following are my postgresql.conf settings:

shared_buffers = 8200MB
temp_buffers = 12MB
work_mem = 32MB
maintenance_work_mem = 128MB
#-----------------------------------------------------
synchronous_commit = off
wal_sync_method = fdatasync
checkpoint_segments = 32
checkpoint_completion_target = 0.9
#-----------------------------------------------------
random_page_cost = 3.2      # RAIDed disk
effective_cache_size = 20000MB  # 32GB RAM
geqo_effort = 10
#-----------------------------------------------------
autovacuum_max_workers = 4
autovacuum_naptime = 45s
autovacuum_vacuum_scale_factor = 0.16
autovacuum_analyze_scale_factor = 0.08

How can disk be at 100% when it is only doing 5MB/sec? Even the most exhausting random read/write routine should still be a level of magnitude faster. It must have something to do with the way PostgreSQL deals with the mapped/shared memory. Also this wasn't occurring with postgres 9.1.

I am trying to educate myself on disk/memory behavior but at this point I need help from the PROs.

1

There are 1 answers

0
oᴉɹǝɥɔ On BEST ANSWER

After lengthy investigation I found correlation between disk saturation with low read/write speeds and the IOPS number. The higher the number of IOPS the lower the IO saturation bandwidth. One of the screenshots in my question has "Transfers/sec". When than number goes high the transfer rate falls.

Unfortunately there isn't much can be done on the database configuration side. PostgreSQL heavily relies on shared memory mapping files to memory pages. When time comes to sync some/all memory pages back to disk it may have tens/hundreds of thousands of dirty pages to sync for a database with large tables. It causes a lot of random disk access and a zillion of small atomic IOs.

Since neither installing SSD nor enabling writeback is an option in my case I had to resolve the problem by approaching it from a different angle. I addressed each case individually.

The UPDATE statement I had was affecting more than half or table records every time it ran. Instead of doing the update I recreate the table each time. This almost doubled performance.

CLUSTER-ing a table results in rebuilding all table indexes except the one by which clustering is performed. For large tables with many indexes this is an important consideration to keep that in mind when performing clustering.

I also replaced VACUUM with ANALYSE which didn't seem like it affected table performance much but runs measurably quicker than VACUUM.