I am trying to find which user records in a table are taking up the most space. To this end, I am using the DATALENGTH
function in SqlServer.
SELECT
UserName,
SUM(
ISNULL(DATALENGTH(columnA), 1) +
ISNULL(DATALENGTH(columnB), 1) +
....
ISNULL(DATALENGTH(columnZ), 1) +
)/1000000 AS SizeInMegaBytes
FROM MyTable
GROUP BY UserName
ORDER BY SizeInMegaBytes DESC
Results:
+----------+-----------------+
| UserName | SizeInMegaBytes |
+----------+-----------------+
| User1 | 1700 |
+----------+-----------------+
| User2 | 1504 |
+----------+-----------------+
| .... | .... |
+----------+-----------------+
| User75 | 20 |
+----------+-----------------+
Total Size = 16,523 MB
The only problem is that the results don't match up with the size of the table. I use the built-in stored procedure to get the size of the table
sp_spaceused [MyTable]
Results:
+---------+-------+-------------+-------------+------------+-------------+
| name | rows | reserved | data | index_size | unused |
+---------+-------+-------------+-------------+------------+-------------+
| MyTable | 61477 | 59425416 KB | 42482152 KB | 62584 KB | 16880680 KB |
+---------+-------+-------------+-------------+------------+-------------+
The stored procedure shows the total data size as 42 GB yet the query of all the columns shows 16 GB. What could be taking up the extra space if I have accounted for the size of all the columns?
EDIT - I don't think my issue is the same as the duplicate mentioned because here I am taking the SUM
of all the grouped records while the previous question did not. There seems to be such a large disparity between the SUM of the DataLength function and the results of sp_spaceused
(29 GB) I don't think it could be accounted for by indexes or header information alone.
First; your math is suspect; 1MB = (1KB * 1KB) = 1024B * 1024B
Second; there could be metadata associated with table and records. Inspecting the table definition can give insight here.