How can I fix "Snapshot isolation transaction aborted due to update conflict"?

14.5k views Asked by At

I see an error message related to transaction isolation levels. There are two tables involved, first one is updated frequently with transaction isolation level set to SERIALIZABLE, the second one has a foreign key on first one.

Problem occurs when doing insert or update of the second table. Once in few hours I get the following error message:

Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.first' directly or indirectly in database 'DB' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

I don't set transaction isolation level when inserting or updating second table, also I ran command DBCC USEROPTIONS and it returns read_committed.

3

There are 3 answers

7
ntziolis On BEST ANSWER

First:
It seems, you're not using SERIALIZABLE, but snapshot isolation which was introduced with MSSQL 2005. Here is an article to understand the difference:
http://blogs.msdn.com/b/craigfr/archive/2007/05/16/serializable-vs-snapshot-isolation-level.aspx

=> This was based on the error, message, but as you have explained again in the comments the error comes when editing the second table.

Second:
For modifications MSSQL Server always tries to acquire locks, and since there are locks (by using a transaction) on the first table which escalate to locks on the second table because of the (foreign key) the operation fails. So every modification causes in fact a mini transaction.

The default transaction level on MSSQL is READ COMMITTED, but if you turn on the option READ_COMMITTED_SNAPSHOT it will convert READ COMMITTED to a SNAPSHOT like transaction every time you use READ COMMITTED. Which then leads to the error message you get.

To be precise as VladV pointed out, it's not really using the SNAPSHOT isolation level, but READ COMMITTED with row versioning rather than locking, but only on a statement basis, where SNAPSHOT is using row versioning on a transaction basis.

To understand the difference check out this:
http://msdn.microsoft.com/en-us/library/ms345124(SQL.90).aspx

To find out more about the READ_COMMITTED_SNAPSHOT, its explained in detail here:
http://msdn.microsoft.com/en-us/library/tcbchxcb(VS.80).aspx
and here: Default SQL Server IsolationLevel Changes

Another reason for you to see SNAPSHOT isolation if you have not specified it, is by using implicit transaction. After turing this option on and you don't actually specify the isolation level on a modifying statement (which you don't), MS SQL server will choose whatever he believes is the right isolation level. Here are the details:
http://msdn.microsoft.com/en-us/library/ms188317(SQL.90).aspx

For all theses scenarios the solution is the same though.

Solution:
You need to execute the operations in sequence, and you can do this by specifically using a transaction with SERIALIZABLE isolation level on both operations: when inserting/updating the first and when inserting/updating the second.
This way you block the respective other until it is completed.

0
Adriaan de Beer On

We had a similar issue - and you'd be glad to know that you should be able to solve the problem without removing the FK constraint.

Specifically, in our scenario, we had frequent updates to the parent table in a READ COMMITTED transaction. We also had frequent concurrent (long running) snapshot transactions occurring that needed to insert rows into a child table with a FK to parent table - so essentially it's the same scenario as yours, except we used a READ COMMITTED instead of SEREALIZABLE transaction.

To solve the problem, create a new UNIQUE NONCLUSTERED constraint on the primary table over the FK column. In addition you must also re-create the FK after you've created the unique constraint as this will ensure that the FK now references the constraint (not the clustered key).

Note: the disadvantage is that you now have a seemingly redundant constraint on the table that needs to be maintained by SQL server when updates are made to the parent table. That said, it may be a good opportunity for you to consider a different/alternate clustered key...and if you're lucky, it could even replace the need for another index on this table...

Unfortunately I can't find a good explanation on the web on why creating a unique constraint solves the problem. The easiest way I can explain why this works is because the FK now only references the unique constraint - and a modification to the parent table (i.e. to the non-FK referenced columns) does not cause an update conflict in the snapshot transaction as the FK now references an unchanged unique constraint entry. Contrast this with the clustered key where a change to any column in parent table would affect the row version in this table - and since the FK sees an updated version number, the snapshot transaction needs to abort.

Furthermore, if the parent row is deleted in the non-snapshot transaction, then both the clustered and unique constraints would be affected and, as expected, the snapshot transaction will roll back (so FK integrity is maintained).

I've been able to reproduce this problem using the above sample code that I have adapted from this blog entry

---------------------- SETUP Test database
-- Creating Customers table without unique constraint
USE master;
go

IF EXISTS (SELECT * FROM sys.databases WHERE name = 'SnapshotTest')
BEGIN;
DROP DATABASE SnapshotTest;
END;
go

CREATE DATABASE SnapshotTest;
go

ALTER DATABASE SnapshotTest
SET ALLOW_SNAPSHOT_ISOLATION ON;
go

USE SnapshotTest;
go

CREATE TABLE Customers
   (CustID int NOT NULL PRIMARY KEY,CustName varchar(40) NOT NULL);

CREATE TABLE Orders
  (OrderID char(7) NOT NULL PRIMARY KEY,
   OrderType char(1) CHECK (OrderType IN ('A', 'B')),
   CustID int NOT NULL REFERENCES Customers (CustID)
  );

INSERT INTO Customers (CustID, CustName) VALUES (1, 'First test customer');

INSERT INTO Customers (CustID, CustName) VALUES (2, 'Second test customer');
GO

---------------------- TEST 1: Run this test before test 2
USE SnapshotTest;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;

-- Check to see that the customer has no orders
SELECT * FROM Orders WHERE  CustID = 1;

-- Update the customer
UPDATE Customers SET CustName='Updated customer' WHERE  CustID = 1;
-- Twiddle thumbs for 10 seconds before commiting
WAITFOR DELAY '0:00:10';

COMMIT TRANSACTION;
go

-- Check results
SELECT * FROM Customers (NOLOCK);
SELECT * FROM Orders (NOLOCK);
GO

---------------------- TEST 2: Run this test in a new session shortly after test 1
USE SnapshotTest;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;

SELECT * FROM   Customers WHERE  CustID = 1;
INSERT INTO Orders (OrderID, OrderType, CustID) VALUES ('Order01', 'A', 1);

-- Twiddle thumbs for 10 seconds before commiting
WAITFOR DELAY '0:00:10';

COMMIT TRANSACTION;
go

-- Check results
SELECT * FROM Customers (NOLOCK);
SELECT * FROM Orders (NOLOCK);
go

And to fix the above scenario, re-setup the test database. Then run the following script before running Test 1 and 2.

ALTER TABLE Customers 
ADD CONSTRAINT UX_CustID_ForSnapshotFkUpdates UNIQUE NONCLUSTERED (CustID)

-- re-create the existing FK so it now references the constraint instead of clustered index (the existing FK probably has a different name in your DB)
ALTER TABLE [dbo].[Orders] DROP CONSTRAINT [FK__Orders__CustID__1367E606]

ALTER TABLE [dbo].[Orders]  WITH CHECK ADD FOREIGN KEY([CustID])
REFERENCES [dbo].[Customers] ([CustID])
GO
0
Super Kai - Kazuya Ito On

According to my 3 experiments with "SNAPSHOT", "SERIALIZABLE" and "READ COMMITTED SNAPSHOT" isolation levels, I got the same error below with only "SNAPSHOT" isolation level when updating the row which is already updated by other transaction while I did not get the same error below with "SERIALIZABLE" and "READ COMMITTED SNAPSHOT" isolation levels:

Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.person' directly or indirectly in database 'test' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

And, as the documentation says below, with "SNAPSHOT" isolation level, we get the same error above when updating the row which is already updated by other transaction. And, I do not think we can solve or avoid the error above so what we can do is retry the transaction again as the error above says. So if our applications get the error above, we will be able to handle the error above as an exception handling to retry the transaction again:

A snapshot transaction always uses optimistic concurrency control, withholding any locks that would prevent other transactions from updating rows. If a snapshot transaction attempts to commit an update to a row that was changed after the transaction began, the transaction is rolled back, and an error is raised.

For my experiment with "SNAPSHOT" isolation level, I created "person" table with "id" and "name" in "test" database as shown below:

id name
1 John
2 David

Now, I did these steps with SQL queries as shown below:

Flow Transaction 1 (T1) Transaction 2 (T2) Explanation
Step 1 BEGIN; T1 starts.
Step 2 BEGIN; T2 starts.
Step 3 UPDATE person SET name = 'Tom' WHERE id = 2; T1 updates "David" to "Tom" so this row is locked by T1 until T1 commits.
Step 4 UPDATE person SET name = 'Lisa' WHERE id = 2; T2 cannot update "Tom" to "Lisa" because this row is locked by T1 so T2 is waiting T1 to unlock this row by commit to update this row.
Step 5 COMMIT; Waiting... T1 commits.
Step 6 ROLLBACK;

Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.person' directly or indirectly in database 'test' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
Now, T2 automatically rollbacks and gets the error.