Disk full because of transaction log caused by expensive query

I have a problem with a query that may look simple but is causing me a lot of trouble in my development environment.

What I'm trying to do is replace each comment's articleID with a new article ID that I have in a new table.

+-----------------+
|     comments    |
+-----------------+
| cid | articleID |
+-----------------+
|  1  |     1     |
+-----------------+
|  2  |     1     |
+-----------------+
|  3  |     2     |
+-----------------+

+------------------------+
|      new_comments      | 
+------------------------+
| comment_id | articleID |
+------------------------+
|      1     |    10     |
+------------------------+
|      2     |    10     |
+------------------------+
|      3     |    32     |
+------------------------+ 
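(For reference, a minimal setup script for these two tables; the INT column types are an assumption, since only the data is shown above.)

CREATE TABLE comments (
    cid       INT PRIMARY KEY,
    articleID INT NOT NULL
);

CREATE TABLE new_comments (
    comment_id INT PRIMARY KEY,
    articleID  INT NOT NULL
);

INSERT INTO comments (cid, articleID) VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO new_comments (comment_id, articleID) VALUES (1, 10), (2, 10), (3, 32);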

And I want the "comments" table to end up like this:

+-----------------+
|     comments    |
+-----------------+
| cid | articleID |
+-----------------+
|  1  |    10     |
+-----------------+
|  2  |    10     |
+-----------------+
|  3  |    32     |
+-----------------+

So I execute this query:

UPDATE comments SET comments.articleID = new_comments.articleID
FROM comments INNER JOIN new_comments
ON comments.cid = new_comments.comment_id;

The problem is that I only have 20 GB of free disk space, and when I execute this query the transaction log starts growing and uses up all the available space before the query finishes. In about 6 minutes, the 20 GB of free disk space are gone.

I have changed the recovery model to simple. However, the problem persists and the transaction log keeps growing.
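For what it's worth, the current recovery model and the reason the log cannot be truncated can be checked through sys.databases (using the same MyDatabase name as in the size query below):

SELECT name,
       recovery_model_desc,  -- should show SIMPLE after the change
       log_reuse_wait_desc   -- typically ACTIVE_TRANSACTION while the big UPDATE is running
FROM sys.databases
WHERE name = 'MyDatabase';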

I can watch the database's transaction log grow using the following query, which I found here on Stack Overflow:

SELECT (size * 8.0)/1024.0 AS size_in_mb,
CASE
    WHEN max_size  = -1 THEN 9999999 -- Unlimited growth, so handle this how you want
    ELSE (max_size * 8.0)/1024.0
END AS max_size_in_mb
FROM MyDatabase.sys.database_files
WHERE data_space_id = 0;
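DBCC SQLPERF(LOGSPACE) gives a similar picture, reporting the log size and the percentage of log space used for every database on the instance:

DBCC SQLPERF(LOGSPACE);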

Does anyone know what options I have, or what I can do to stop the query from writing such a large amount of information to the transaction log?

1 Answer

Answer by Ben Thul (accepted):

Try something like this:

declare @batchSize int = 10000;

while (1 = 1)
begin
    -- update at most @batchSize rows per pass
    UPDATE top(@batchSize) comments
    SET comments.articleID = new_comments.articleID
    FROM comments
    INNER JOIN new_comments ON comments.cid = new_comments.comment_id
    WHERE comments.articleID != new_comments.articleID;

    -- once a pass touches fewer rows than the batch size, we're done
    if (@@rowcount < @batchSize)
        break;
end

This will run your update 10,000 rows at a time (the @batchSize value) until the update is done. As always, run it on a test system first, but this should do the trick.

Now, as to why you're running into this: the database needs to be able to roll a transaction back right up until it commits, so the log has to keep every record for the open transaction. If you try to modify too much data in one transaction, you'll fill up your log; that is also why switching to the simple recovery model didn't help, since log records for an active transaction can't be truncated until it commits or rolls back. So, even though SQL is declarative in an academic sense, we don't live in an academic world sometimes. :)
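If the log still grows faster than automatic checkpoints can reclaim it, one variation on the loop above (a sketch, not part of the original answer) is to issue an explicit CHECKPOINT after each committed batch; under the simple recovery model this lets the log space used by earlier batches be reused instead of piling up:

declare @batchSize int = 10000;

while (1 = 1)
begin
    UPDATE top(@batchSize) comments
    SET comments.articleID = new_comments.articleID
    FROM comments
    INNER JOIN new_comments ON comments.cid = new_comments.comment_id
    WHERE comments.articleID != new_comments.articleID;

    if (@@rowcount < @batchSize)
        break;

    -- simple recovery: truncate the inactive part of the log so its space can be reused
    CHECKPOINT;
end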