How to delete a record in Cassandra?

21.5k views Asked by At

I have a table like this:

CREATE TABLE mytable (
    user_id int,
    device_id ascii,
    record_time timestamp,
    timestamp timeuuid,
    info_1 text,
    info_2 int, 
    PRIMARY KEY (user_id, device_id, record_time, timestamp)
);

When I ask Cassandra to delete a record (an entry in the columnfamily) like this:

DELETE from my_table where user_id = X and device_id = Y and record_time = Z and timestamp = XX;

it returns without an error, but when I query again the record is still there. Now if I try to delete a whole row like this:

DELETE from my_table where user_id = X

It works and removes the whole row, and querying again immediately doesn't return any more data from that row.

What I am doing wrong? How you can remove a record in Cassandra?

Thanks

2

There are 2 answers

1
Aaron On BEST ANSWER

Ok, here is my theory as to what is going on. You have to be careful with timestamps, because they will store data down to the millisecond. But, they will only display data to the second. Take this sample table for example:

aploetz@cqlsh:stackoverflow> SELECT id, datetime  FROM data;

 id     | datetime
--------+--------------------------
 B25881 | 2015-02-16 12:00:03-0600
 B26354 | 2015-02-16 12:00:03-0600

(2 rows)

The datetimes (of type timestamp) are equal, right? Nope:

aploetz@cqlsh:stackoverflow> SELECT id, blobAsBigint(timestampAsBlob(datetime)),
                                  datetime FROM data;

 id     | blobAsBigint(timestampAsBlob(datetime)) | datetime
--------+-----------------------------------------+--------------------------
 B25881 |                           1424109603000 | 2015-02-16 12:00:03-0600
 B26354 |                           1424109603234 | 2015-02-16 12:00:03-0600

(2 rows)

As you are finding out, this becomes problematic when you use timestamps as part of your PRIMARY KEY. It is possible that your timestamp is storing more precision than it is showing you. And thus, you will need to provide that hidden precision if you will be successful in deleting that single row.

Anyway, you have a couple of options here. One, find a way to ensure that you are not entering more precision than necessary into your record_time. Or, you could define record_time as a timeuuid.

Again, it's a theory. I could be totally wrong, but I have seen people do this a few times. Usually it happens when they insert timestamp data using dateof(now()) like this:

INSERT INTO table (key, time, data) VALUES (1,dateof(now()),'blah blah');
0
Chandra Duddukuri On
CREATE TABLE worker_login_table (
    worker_id text,
    logged_in_time timestamp, 
    PRIMARY KEY (worker_id, logged_in_time)
);

INSERT INTO worker_login_table (worker_id, logged_in_time) 
   VALUES ("worker_1",toTimestamp(now()));

after 1 hour executed the above insert statement once again

select * from worker_login_table;

 worker_id| logged_in_time
----------+--------------------------
 worker_1 | 2019-10-23 12:00:03+0000
 worker_1 | 2015-10-23 13:00:03+0000

(2 rows)

Query the table to get absolute timestamp

select worker_id, blobAsBigint(timestampAsBlob(logged_in_time )), logged_in_time from worker_login_table;

worker_id     | blobAsBigint(timestampAsBlob(logged_in_time)) | logged_in_time 
--------+-----------------------------------------+--------------------------
 worker_1 |                           1524109603000 | 2019-10-23 12:00:03+0000
 worker_1 |                           1524209403234 | 2019-10-23 13:00:03+0000

(2 rows)

The below command will not delete the entry from Cassandra as the precise value of timestamp is required to delete the entry

DELETE from worker_login_table where worker_id='worker_1' and logged_in_time ='2019-10-23 12:00:03+0000';

By using the timestamp from blob we can delete the entry from Cassandra

DELETE from worker_login_table where worker_id='worker_1' and logged_in_time ='1524209403234';