Sparse columns not showing any savings

947 views Asked by At

Ive ran a test comparing a table with a few sparse columns to a table with no sparse columns, and i am seeing zero space saving.

I have two tables, both storing Address info mainly in varchar columns. both tables allow nulls, one has columns sparse property set.

I insert 1000 rows of default values in each (default values are null). Sparse columns store nulls differently so i believe i should see a space saving. but on running sp_spaceUsed i dont see any saving. Any ideas on what i am doing wrong or where my understanding is incorrect?

    Create Table SparseColTest_NonSparse
    (
        AddressID int identity(1,1) not null,
        AddressLine1 varchar(500)  null,
        AddressLine2 varchar(500)  null,
        AddressLine3 varchar(500)  null,
        PostalCode varchar(20)  null,
        Country varchar(50) 
    )

    Create Table SparseColTest_Sparse
    (
        AddressID int identity(1,1) not null,
        AddressLine1 varchar(500) sparse null,
        AddressLine2 varchar(500)  sparse null,
        AddressLine3 varchar(500)  sparse null,
        PostalCode varchar(20)  sparse null,
        Country varchar(50) 
    )


    declare @i int
    set @i = 0

    while(@i <= 100000)
    BEGIN

        insert into SparseColTest_NonSparse Default values
        insert into SparseColTest_Sparse default values

        set @i = @i + 1
    END

    exec sp_spaceUsed 'SparseColTest_NonSparse'
    exec sp_spaceUsed 'SparseColTest_Sparse'

    /*
    name                          rows                 reserved           data                       index_size         unused
    ----------------------------- -------------------- ------------------ -------------        ----- ------------------ ------------------
    SparseColTest_NonSparse       210003               2888 KB            2840 KB                    8 KB               40 KB


    name                          rows                 reserved           data                       index_size         unused
    ----------------------------- -------------------- ------------------ -------------        ----- ------------------ ------------------
    SparseColTest_Sparse                210003               2888 KB            2840 KB                    8 KB               40 KB

    ****NOTE - even with 210k rows sparse and non sparse tables are identical in size.
    */
1

There are 1 answers

2
Gordon Linoff On BEST ANSWER

The issue is the storage on the pages. This is an approximation of what happens.

Space is taken up for the following reasons:

  1. AddressID (4 bytes)
  2. NULL bit map (1 byte in both tables)
  3. List of sparse columns (2 bytes per NULL field)
  4. List of column values (not any, because NULL varchars take no space)
  5. 6 bytes of overhead for sparse columns (only reference is here)
  6. Some other minor overhead

This might require some explanation. Both tables have the AddressID, so that takes up the same amount of space.

All non-sparse columns are in the NULL bit-map. You might think only the NULL-able ones would be there, but no. SQL Server has the bit-map for all of them. Stored one byte at a time. So, the six columns in the non-sparse table take the same space as the two non-sparse columns (identity and country) in the sparse table.

Varchars are variable length data. When NULL, they still reserve 2 bytes per column. So this is 10 bytes in the first record. And 2-bytes in the second.

Sparse records have an overhead of 6 bytes.

Sparse columns only occupy space when they have values. None here have values.

The end result is that between the overheads, and possibly byte-alignment restrictions, add up to the same length in both records. The values for the corresponding columns in the non-sparse table are taking up some space for the lengths, and this seems to match the space used by the columns. I also suspect that the data rows on the page are aligned on 2- or 4-byte boundaries. The NULL bit-maps are the same size in both tables.

I see a space savings when I have a least 8 more sparse columns in the sparse table than the non-sparse table. This definitely gives you the savings on the NULL-bit map, which is rounded up to the nearest byte.

The real savings for sparse columns are on the fixed-length data types. There is more overhead for storing each value, but that is where sparseness comes in. There aren't many values. You lose the overhead both of the bit-mask and of the space the value would occupy.

For your example, sparse columns are not useful. With real data, they are likely to actually increase storage space.

Check out the documentation here.