We have a 50-node Redshift cluster, and we run VACUUM periodically. We are currently running a pipeline that moves some data to S3 and deletes it from Redshift.
After about two weeks of processing, disk usage on the 49 compute nodes (everything except the leader) came down from 95% to 80%, but disk usage on the leader went up and is now at 100%.
I tried rebooting the cluster in case transient files were holding the space, but that didn't help.
Any suggestion would be a great help at this point.
Thanks!
You might have some "skewed" tables, meaning tables whose rows are not distributed evenly across the nodes. The following SQL will give you a list of tables; based on the skew column, you may need to redistribute some of them.
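A query along these lines should work, using Redshift's SVV_TABLE_INFO system view (the exact columns you pull out are up to you; skew_rows is the ratio of the row count on the slice with the most rows to the slice with the fewest):

```sql
-- List tables ordered by row-count skew; a high skew_rows value means
-- one slice is holding far more of the table's rows than the others.
SELECT "table",
       size      AS size_mb,   -- table size in 1 MB blocks
       tbl_rows,               -- total rows in the table
       skew_rows               -- max-slice rows / min-slice rows
FROM svv_table_info
ORDER BY skew_rows DESC;
```

Tables near the top with skew_rows well above 1 are candidates for a different DISTKEY (or DISTSTYLE EVEN), which you can apply by recreating the table and copying the data over.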