MS-SQL: Space overhead for rebuild vs. recreate

122 views Asked by At

we need to remove old data from a huge table once a year. The rows of the table are rather compact (around 40 bytes) and there is just a clustered index on the table.

The database is about 750 GB in total, and the table worked on is 640 GB in size and contains 8.7 billion rows before cleaning. After removing, only 3.7 billion rows remain, the size drops to about 500 GB for data.

These numbers look odd, but they are fine: each page had some rows removed. Some pages were emptied and dropped, some were unchanged and are still 100% full, but most pages are partly filled now and there is a lot of unclaimed space on each page.

To reclaim this space, I need to rebuild the index. My question is: How to defrag an index in a database that is about the size of the index itself?

If I remember correctly, an INDEX DEFRAG REBUILD will require as free space 1.3x the size it works on, as it copies the data in a sorted way. The DB would grow by almost 1 TB, and this new space will not be needed once the defrag finishes.

Shrinking back after the defrag is not helpful as it introduces new (heavy) fragmentation.

I am aware of the "SORT_IN_TEMPDB" setting. Is there an estimate how much free space in the DB will be required with this setting?

As an alternative, I could drop & recreate the clustered index, but I am unsure what the space requirement for that operation is.

Reorganising the index does not reclaim space on each page (?), so this operation is also not what I want.

Thanks for any ideas! Ralf

1

There are 1 answers

2
TomTom On

we need to remove old data from a huge table once a year.

That is exactlyx the use case of partitioning. Partition per year, drop subtable, done. Downtime? Millisecond if you do it smart.