Delete row key from Cassandra CLI

I set my column family's gc_grace (GC grace seconds) to 0, but the row key is still not deleted; it remains in my column family.

create column family workInfo123
with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and populate_io_cache_on_flush = true
  and gc_grace = 0 
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and default_time_to_live = 0
  and speculative_retry = 'NONE'
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.LZ4Compressor'}
  and index_interval = 128;

See the listing below:

[default@winoriatest] list workInfo123;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: a
-------------------
RowKey: xx

2 Rows Returned.
Elapsed time: 17 msec(s).

I am using cassandra-cli. Should I change anything else?

UPDATE:-

After running ./nodetool -host 127.0.0.1 compact:

[default@winoriatest] list workInfo123;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: xx

2 Rows Returned.
Elapsed time: 11 msec(s).

Why does xx remain?

There are 2 answers

Ralf On BEST ANSWER

When you delete a row in Cassandra, it is not removed straight away. Instead, it is marked with a tombstone. The effect is that a range query such as list still returns the key, but with no columns. The tombstone is required because

  1. Cassandra data files (SSTables) are never modified once written, so the row cannot be erased in place; the delete is recorded as a new write, the tombstone.
  2. The cluster needs a chance to propagate the delete to all nodes holding a copy of the row.

Removing the row and its tombstone requires a compaction. Compaction rewrites the data files and, while doing so, drops deleted rows, but only those whose tombstone is older than the GC grace period. On a single-node(!) cluster it is fine to set the grace period to 0, because the delete does not have to be propagated to any other node (one that might have been down when you issued the delete).
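The interplay of tombstone, grace period, and compaction can be sketched with a toy model. This is not Cassandra code: the class ToyColumnFamily, its methods, and the row keys are hypothetical, and the exact boundary check is a simplification. It only illustrates why a deleted key keeps showing up until a compaction runs after the grace period has passed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Row:
    columns: Dict[str, str] = field(default_factory=dict)
    # Tombstone timestamp in seconds; None means the row is live.
    deleted_at: Optional[int] = None


class ToyColumnFamily:
    """Hypothetical in-memory stand-in for a column family."""

    def __init__(self, gc_grace_seconds: int) -> None:
        self.gc_grace_seconds = gc_grace_seconds
        self.rows: Dict[str, Row] = {}

    def insert(self, key: str, columns: Dict[str, str]) -> None:
        self.rows[key] = Row(dict(columns))

    def delete(self, key: str, now: int) -> None:
        # A delete does not remove the key; it records a tombstone.
        self.rows[key] = Row({}, deleted_at=now)

    def list_keys(self) -> List[str]:
        # Like "list" in the CLI: tombstoned keys still appear, without columns.
        return sorted(self.rows)

    def compact(self, now: int) -> None:
        # Drop only tombstones older than the grace period (simplified check).
        cutoff = now - self.gc_grace_seconds
        self.rows = {
            key: row
            for key, row in self.rows.items()
            if row.deleted_at is None or row.deleted_at >= cutoff
        }


cf = ToyColumnFamily(gc_grace_seconds=0)
cf.insert("row1", {"name": "x"})
cf.insert("row2", {"name": "y"})
cf.delete("row1", now=100)
keys_before = cf.list_keys()   # tombstoned key is still listed
cf.compact(now=101)
keys_after = cf.list_keys()    # tombstone purged, live row kept
```

With a non-zero grace period in the same model, a compaction that runs before the period has elapsed leaves the tombstoned key in place.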

If you want to force the removal of deleted rows, you can trigger a flush (which writes in-memory data to the data files) and a major compaction via the nodetool utility. For example:

./nodetool flush your_key_space the_column_family && ./nodetool compact your_key_space the_column_family

After the compaction completes, the deleted rows should truly be gone.

prabhakaran On

The default GC grace period is ten days (864000 seconds). To remove the row key immediately, set it to 0:

UPDATE COLUMN FAMILY column_family_name with GC_GRACE= 0;

Execute the above CLI statement, then run the nodetool flush and compact operations.
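As a quick check on the ten-day default, the conversion to seconds can be worked out as below. The variable name is illustrative only.

```python
from datetime import timedelta

# Ten days expressed in seconds, the unit gc_grace is given in.
default_gc_grace_seconds = int(timedelta(days=10).total_seconds())
print(default_gc_grace_seconds)  # 864000
```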