On one of my Postgres server autovacuum worker is holding ShareUpdateExclusiveLock for hours. At the same time, one of the application programs is trying to detach a partition from the table.
The application which is trying to detach the partition has acquired AccessExclusiveLock on the parent table and waiting to acquire ShareUpdateExclusiveLock on the partition to be detached, which blocks all select/insert operations on the table.
The data in the table is written by only one writer, no delete or update operation is performed.
The issue has triggered twice and the system became usable only after killing the process from the shell.
As per Postgres document if there is a conflict autovacuum worker will be interrupted. In which exceptional case this interruption may not work?
Postgres version is 12.11 and is using Patroni for clustering.
As Jeff commented, this is most likely an anti-wraparound autovacuum process, because other autovacuum processes would not block your SQL statements for more than a second. If you kill that autovacuum run, it will only come back, because it has an important job to do. If you had the nerve to keep killing that process, you would eventually get into worse trouble than you have now.
The solution is to let the autovacuum process run to completion. If that takes too long, see that
autovacuum_vacuum_cost_delayis set low enough for this table and that you setmaintenance_work_memhigh enough for good autovacuum performance.Perhaps you got into this quandary by constantly attaching and detaching partitions, so that autovacuum could never complete its job. In that case, rethink your application design.