I am using Cassandra as my database. I initially had a table with a column A of datatype text. The use case then changed and I wanted column A to be a list, so I wrote migration files that do the following:
File 1 - drop column A (datatype text).
Query - ALTER TABLE DB_NAME.demo DROP A;
File 2 - add column A again, this time with datatype list<text>.
Query - ALTER TABLE DB_NAME.demo ADD A list<text>;
I used the above approach based on articles and suggestions (I also referred to: Unable to change/alter the data type of column in cql cassandra).
After that, I ran a simple script which updated the values in column A for some records.
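For reference, the script ran statements of roughly this shape (a minimal sketch; the partition key column id and the list values are placeholders, not my actual schema):
Query - UPDATE DB_NAME.demo SET A = ['value1', 'value2'] WHERE id = 123;
Query - UPDATE DB_NAME.demo SET A = A + ['value3'] WHERE id = 123; (appends to the existing list)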
Will the above activity cause any problem for my table, since I am reusing the same column name with a new datatype?
After a Cassandra scale-up and restart, will I get corruption in the SSTables or exceptions in the Cassandra DB? Are there any other concerns that may arise?
I am asking because my database got corrupted and I cannot find the reason behind it! :(
I also cannot work out why I am getting the exception below in Cassandra:
Caused by: java.io.IOException: Corrupt (negative) value length encountered
at org.apache.cassandra.utils.ByteBufferUtil.skipWithVIntLength(ByteBufferUtil.java:359) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.marshal.AbstractType.skipValue(AbstractType.java:456) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:249) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.rows.UnfilteredSerializer.readComplexColumn(UnfilteredSerializer.java:670) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:611) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1221) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1176) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.Columns.apply(Columns.java:384) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:605) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.UnfilteredDeserializer$CurrentDeserializer.readNext(UnfilteredDeserializer.java:209) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.computeNext(SSTableIterator.java:153) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.hasNextInternal(SSTableIterator.java:182) ~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:378) ~[apache-cassandra-3.11.3.jar:3.11.3]
... 32 common frames omitted
I need help / guidance regarding the above issue!
Or is there any possibility that the issue is not caused by the activity described above?
Dropping a column from a table in Cassandra doesn't immediately drop the data. The dropped column is only recorded as deleted in a special system table (system_schema.dropped_columns in Cassandra 3.x); the data for that column is still present in the SSTables and is filtered out before a response is sent back to the driver. So when you re-add a column with the same name but a different type, Cassandra will use the wrong codec when it tries to read the old data and will fail to decode it, leading to errors like the one in your stack trace.
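You can check that the drop was recorded by querying that system table (a sketch; it assumes your keyspace is literally named db_name, so adjust the names to your real, lower-cased keyspace and table):
Query - SELECT column_name, dropped_time, type FROM system_schema.dropped_columns WHERE keyspace_name = 'db_name' AND table_name = 'demo';
If column A shows up there with type text, any of its old cells still sitting in SSTables were written with the old serializer.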
The actual removal of the data for a dropped column happens during compaction, so you should only re-add a column with the same name after you have made sure that all of the old data has been removed by compaction.
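As a sketch of how to force that cleanup, you can trigger a major compaction of the table on each node (note that with SizeTieredCompactionStrategy a major compaction produces one large SSTable, which has its own operational trade-offs):
Command - nodetool compact DB_NAME demo
If SSTables are already corrupt, nodetool scrub DB_NAME demo attempts to rewrite them while skipping unreadable data; failing that, a common recovery path is to remove the corrupted SSTable files and restore the node's data from its replicas with nodetool repair.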