I am using the SizeTieredCompactionStrategy (STCS) in ScyllaDB. I deleted half of my data in a specific token range (let's say x to y). My gc_grace_seconds is set to 6 hours. I want to get rid of all the tombstones that were created in this token range. If I run nodetool compact --start-token x --end-token y keyspace table on all the nodes in the cluster after gc_grace_seconds has passed, what will happen? Will it delete the tombstones, and how much disk space will it consume? Will it be the same as a nodetool compact major compaction, which needs 50% more space?
Disk space requirement for compaction on a token range in scylla/cassandra
334 views Asked by Dinesh Raj At
There are 2 answers
To delete the tombstones you also need to run nodetool repair. See the repair procedure documentation for details. Basically, repair compares data between nodes so that tombstones can be safely expired.
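As a rough illustration of the timing involved, the sketch below checks whether a tombstone is old enough to be dropped by a compaction. It is a simplification under stated assumptions: the function name and the sample timestamps are hypothetical, and age is only one condition, since the database also has to be sure the tombstone does not shadow older data in sstables that are not part of the compaction.

```python
from datetime import datetime, timedelta, timezone

# 6 hours, the gc_grace_seconds value given in the question.
GC_GRACE_SECONDS = 6 * 60 * 60


def tombstone_is_old_enough(deleted_at: datetime, now: datetime,
                            gc_grace_seconds: int = GC_GRACE_SECONDS) -> bool:
    """Hypothetical helper: True once the tombstone is older than gc_grace_seconds.

    This models only the age check, not the overlap check against
    sstables that are left out of the compaction.
    """
    return now - deleted_at > timedelta(seconds=gc_grace_seconds)


# Sample timestamps, made up for illustration.
deleted_at = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
print(tombstone_is_old_enough(deleted_at, deleted_at + timedelta(hours=5)))
print(tombstone_is_old_enough(deleted_at, deleted_at + timedelta(hours=7)))
```

With a 6 hour grace period, the first check (5 hours after the delete) is still inside the window and the second (7 hours after) is past it, which is why the repair has to complete within gc_grace_seconds of the delete.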
The space required for compaction depends on the specific workload, so it is impossible to give an exact answer without data about your workload. But 2x the current data size is a safe bet that includes a safety margin. After a full compaction the space used will be minimal, as only one copy of the data is saved on each node.
Scylla's documentation of nodetool compact (see https://docs.scylladb.com/operating-scylla/nodetool-commands/compact/) doesn't even mention the token range option, unfortunately. But the Cassandra documentation (https://cassandra.apache.org/doc/latest/operating/compaction/index.html) explains what the so-called sub-range compaction does.

With STCS the common case is that every sstable has tokens from all over the token ring, so your nodetool compact call will usually invoke a full major compaction of all sstables. The token range option will likely not exempt any of the sstables from being compacted.

So the temporary disk space overhead will be as usual with STCS: at the end of the compaction, you have both the old sstables and the new one on disk. You assumed the new sstable holds only half of the original data, so it will be around half the total size of the old sstables. This is probably the "50%" you asked about.
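The arithmetic behind that "50%" can be written out as a small sketch. The function name and the 100 GiB figure are assumptions for illustration only. The model also ignores compression differences, the size of any tombstones that survive, and other compactions running at the same time.

```python
GIB = 1024 ** 3


def peak_compaction_usage(old_sstables_bytes: int, surviving_fraction: float) -> dict:
    """Hypothetical estimate of disk usage while a major compaction runs.

    Until the compaction finishes, the old sstables and the newly
    written sstable coexist on disk, so the peak is their sum.
    """
    new_sstable_bytes = int(old_sstables_bytes * surviving_fraction)
    return {
        "new_sstable_bytes": new_sstable_bytes,
        "peak_bytes": old_sstables_bytes + new_sstable_bytes,
        "overhead_ratio": new_sstable_bytes / old_sstables_bytes,
    }


# Assumed example: 100 GiB of sstables, half of the data deleted.
estimate = peak_compaction_usage(100 * GIB, surviving_fraction=0.5)
print(estimate["peak_bytes"] // GIB, "GiB peak")
print(f"{estimate['overhead_ratio']:.0%} temporary overhead")
```

If nothing had been deleted (surviving_fraction of 1.0), the same model gives 100% temporary overhead, which matches the "2x is a safe bet" rule of thumb from the other answer.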