Clarification about Cassandra tombstones and manual compaction


I have a few questions about Cassandra tombstones and manual compaction.

Let's say that I delete a row (partition key) in my Cassandra cluster at time X. Let's assume that gc_grace_seconds has its default value (ten days).

  1. Is it true that if I manually start nodetool compact before X+10 days, the old data will still be on disk after the compaction?

  2. If I instead start nodetool compact after X+10 days, is the old data really removed from disk?

  3. Let's assume that the delete was issued at time X and that I later change gc_grace_seconds to a lower value (say 1 day). If I start nodetool compact at time X+2 days, will the old data really be removed from disk? In other words, the tombstone, when created, contains the deletion time and not the expiration time, right?

Answer by Manish Khandelwal

Is it true that if I manually start nodetool compact before X+10 days, the old data will still be on disk after the compaction?

Yes. Tombstones are not removed if the compaction runs before gc_grace_seconds has elapsed since the deletion.

If I instead start nodetool compact after X+10 days, is the old data really removed from disk?

Generally yes, but it also depends on the compaction strategy, so you cannot be 100% sure of this.

Let's assume that the delete was issued at time X and that I later change gc_grace_seconds to a lower value (say 1 day). If I start nodetool compact at time X+2 days, will the old data really be removed from disk? In other words, the tombstone, when created, contains the deletion time and not the expiration time, right?

Yes, you are correct. A tombstone contains the deletion time. Its expiry depends on the gc_grace_seconds value of the table at the time the compaction runs.
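To make the timing in the three cases concrete, here is a minimal Python sketch of the eligibility rule as described above. It is a simplified model, not Cassandra's actual code: the function name tombstone_purgeable and the date chosen for X are made up for illustration, and real compaction applies further conditions (for example, whether SSTables outside the compaction still hold older data for the same partition).

```python
from datetime import datetime, timedelta, timezone


def tombstone_purgeable(deletion_time, gc_grace_seconds, now):
    """Simplified model: a tombstone may be dropped by compaction only
    once gc_grace_seconds has fully elapsed since the deletion time.

    gc_grace_seconds is read at compaction time, so it is the table's
    current value that matters, not the value at deletion time.
    """
    return now > deletion_time + timedelta(seconds=gc_grace_seconds)


# Hypothetical deletion time X, chosen only for this example.
X = datetime(2024, 1, 1, tzinfo=timezone.utc)

TEN_DAYS = 10 * 24 * 3600  # default gc_grace_seconds (864000)
ONE_DAY = 24 * 3600        # lowered value from question 3

# Case 1: compaction before X+10 days, tombstone is kept.
print(tombstone_purgeable(X, TEN_DAYS, X + timedelta(days=5)))   # False

# Case 2: compaction after X+10 days, tombstone is eligible for removal.
print(tombstone_purgeable(X, TEN_DAYS, X + timedelta(days=11)))  # True

# Case 3: gc_grace_seconds lowered to 1 day, compaction at X+2 days.
print(tombstone_purgeable(X, ONE_DAY, X + timedelta(days=2)))    # True
```

Because the stored value is the deletion time, changing gc_grace_seconds after the delete shifts the point at which an existing tombstone becomes eligible.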

You should generally not run the nodetool compact command (major compaction) yourself; your compactions should run automatically (minor compactions).