Cassandra data reappearing


Inspired by this, I wrote a simple mutex on Cassandra 2.1.4.

Here is how the lock/unlock (pseudo) code looks:

public boolean lock(String uuid){
    try {
        Statement stmt = new SimpleStatement("INSERT INTO LOCK (id) VALUES (?) IF NOT EXISTS", uuid);
        stmt.setConsistencyLevel(ConsistencyLevel.QUORUM);
        ResultSet rs = session.execute(stmt);
        if (rs.wasApplied()) {
            return true;
        }
    } catch (Throwable t) {
        Statement stmt = new SimpleStatement("DELETE FROM LOCK WHERE id = ?", uuid);
        stmt.setConsistencyLevel(ConsistencyLevel.QUORUM);
        session.execute(stmt); // DATA DELETED HERE REAPPEARS!
    }
    return false;
}

public void unlock(String uuid) {
    try {
        Statement stmt = new SimpleStatement("DELETE FROM LOCK WHERE id = ?", uuid);
        stmt.setConsistencyLevel(ConsistencyLevel.QUORUM);
        session.execute(stmt);
    } catch (Throwable t) {
    }
}

Now, in a high-load test I am able to recreate at will a situation where a WriteTimeoutException is thrown in lock(). This means the data may or may not have been written. After this my code deletes the lock - and again a WriteTimeoutException is thrown. However, the lock remains (or reappears).

Why is this?

Now I know I can easily put a TTL on this table (for this use case), but how do I reliably delete that row?
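
For reference, the TTL variant I have in mind would be something like the sketch below (a hypothetical 60-second expiry, not what my real code does):

// Hypothetical sketch of the TTL fallback: even if the DELETE is never seen
// by a quorum, the lock row expires on its own after 60 seconds.
Statement stmt = new SimpleStatement(
        "INSERT INTO LOCK (id) VALUES (?) IF NOT EXISTS USING TTL 60", uuid);
stmt.setConsistencyLevel(ConsistencyLevel.QUORUM);
boolean acquired = session.execute(stmt).wasApplied();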


1 Answer

RussS (accepted answer):

My guess, on seeing this code, is that it makes a common error in distributed-systems programming: the assumption that, when a failure occurs, your attempt to correct that failure will itself succeed.

In the above code you check that the initial write succeeded, but you don't check that the "rollback" also succeeded. This can leave the system in a variety of unwanted states.
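
For illustration, a rollback that is actually verified might look something like this (a hypothetical helper with a retry loop, not part of the question's code):

// Hypothetical sketch: keep retrying the compensating DELETE until a quorum
// of replicas acknowledges it, instead of assuming the first attempt worked.
// WriteTimeoutException is the driver's com.datastax.driver.core.exceptions type.
private void deleteLockWithRetry(Session session, String uuid, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            Statement stmt = new SimpleStatement("DELETE FROM LOCK WHERE id = ?", uuid);
            stmt.setConsistencyLevel(ConsistencyLevel.QUORUM);
            session.execute(stmt);
            return; // a quorum of replicas acknowledged the delete
        } catch (WriteTimeoutException e) {
            // the delete may have reached some replicas; retry until confirmed
        }
    }
    // still unconfirmed -- surface the problem instead of silently continuing
    throw new IllegalStateException("Could not confirm deletion of lock " + uuid);
}

Even this only narrows the window, of course; if the network stays unstable, every retry can keep timing out.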


Let's imagine a few scenarios with Replicas A, B and C.

The client creates the lock but an error is thrown. The lock is present on all replicas, but the client gets a timeout because the connection is lost or broken.

State of System

A[Lock], B[Lock], C[Lock]

We get an exception on the client and attempt to undo the lock by issuing a delete, but this also fails with an exception back at the client. The system can now be in a variety of states:

0 Successful Writes of the Delete

A[Lock], B[Lock], C[Lock] All quorum requests will see the Lock. There exists no combination of replicas which would show us the Lock has been removed.

1 Successful Write of the Delete

A[Lock], B[Lock], C[] In this case we are still vulnerable. Any request which excludes C from the quorum call will miss the deletion: if only A and B are polled, then we'll still see the lock as existing.

2/3 Successful Writes of the Delete (Quorum CL Is Met)

A[Lock], B[], C[] (or A[], B[], C[]) In this case we have once more lost the connection to the driver, but internally the delete did reach a quorum of replicas. These are the only scenarios in which we are actually safe and in which future reads will not see the Lock.
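
For completeness, the difference between these states is exactly what a QUORUM read reports (a hypothetical check against the same LOCK table): in the first state it always returns the lock, in the second it may or may not depending on which replicas answer, and only in the last is the lock guaranteed to be gone.

// Hypothetical check: read at QUORUM to see whether the lock row is still visible.
// Which replicas happen to answer decides the outcome in the partially-deleted state.
Statement check = new SimpleStatement("SELECT id FROM LOCK WHERE id = ?", uuid);
check.setConsistencyLevel(ConsistencyLevel.QUORUM);
boolean lockStillVisible = session.execute(check).one() != null;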


Conclusion

One of the tricky things with situations like this is that if you fail to create your lock correctly because of network instability, it is also unlikely that your correction will succeed, since it has to work in the exact same environment.

This may be an instance where CAS operations can be beneficial. But in most cases it is better not to attempt distributed locking at all if you can avoid it.
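
As an illustration of the CAS idea (a sketch only, not from the original answer), the delete itself can be issued as a lightweight transaction, so the response tells you whether a row was actually removed:

// Hypothetical sketch: DELETE ... IF EXISTS is a CAS operation (lightweight
// transaction), so wasApplied() reports whether a lock row was found and removed.
Statement stmt = new SimpleStatement("DELETE FROM LOCK WHERE id = ? IF EXISTS", uuid);
ResultSet rs = session.execute(stmt);
if (rs.wasApplied()) {
    // the row existed and the delete was committed
} else {
    // no lock row was present (never written, or already removed)
}

Note that the Paxos round behind the conditional delete can still time out under the same network conditions, so this narrows the uncertainty rather than eliminating it.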