Save space writing bitset to a file in C++


I was wondering how I can save space when writing a bitset to a file (probably using iostream) in C++. Will breaking the bitset up into bitsets of size 8 and then writing each of those to the file save space? What are your thoughts on this? The goal is data compression.


There are 2 answers

5
Fred Foo On

If you currently write one byte per bit of the bitset, then yes: packing eight bits into each byte saves 7/8 of the space in the limit. (You will of course have to store the size of the bitset somewhere.)

For example, this writes a bitset using one character per bit (7/8 overhead):

for (size_t i=0, n=bs.size(); i<n; ++i)
    stream << bs[i];

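while this packs eight bits into each byte, which is as compact as it gets without real compression (the last byte is zero-padded if the size is not a multiple of 8):

// stream should be opened in binary mode (std::ios::binary)
for (size_t i=0, n=(bs.size() + 7) / 8; i<n; ++i) {
    uint8_t byte=0;
    for (size_t j=0; j<8; ++j) {
        size_t k = i*8 + j;
        // shift in the next bit, most significant first; pad with 0 past the end
        byte = static_cast<uint8_t>((byte << 1) | (k < bs.size() && bs[k]));
    }
    stream.put(static_cast<char>(byte));
}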

Note that uint8_t is not part of standard C++03. It is declared in C99's <stdint.h> and, since C++11 (known as C++0x at the time of writing), in <cstdint>. You can also use a std::bitset<8> if you prefer.
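
To make the packing idea concrete, here is a minimal round-trip sketch for std::bitset. The helper names write_packed and read_packed are hypothetical (they are not from this answer or any library), and the sketch assumes C++11 and a stream opened in binary mode. Because the size of a std::bitset<N> is a compile-time constant, no size header is written here; a runtime-sized container would need one.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes the N bits as ceil(N/8) bytes.
// Bit 0 of the bitset becomes the most significant bit of the first byte.
template <std::size_t N>
void write_packed(std::ostream& out, const std::bitset<N>& bs) {
    for (std::size_t i = 0; i < (N + 7) / 8; ++i) {
        std::uint8_t byte = 0;
        for (std::size_t j = 0; j < 8; ++j) {
            const std::size_t k = i * 8 + j;
            const bool bit = k < N && bs[k];  // zero-pad past the end
            byte = static_cast<std::uint8_t>((byte << 1) | (bit ? 1 : 0));
        }
        out.put(static_cast<char>(byte));
    }
}

// Hypothetical helper: reads back what write_packed<N> produced.
template <std::size_t N>
std::bitset<N> read_packed(std::istream& in) {
    std::bitset<N> bs;
    for (std::size_t i = 0; i < (N + 7) / 8; ++i) {
        const int c = in.get();
        assert(c != std::char_traits<char>::eof() && "stream ended early");
        const std::uint8_t byte = static_cast<std::uint8_t>(c);
        for (std::size_t j = 0; j < 8; ++j) {
            const std::size_t k = i * 8 + j;
            if (k < N) bs[k] = ((byte >> (7 - j)) & 1) != 0;  // skip padding bits
        }
    }
    return bs;
}
```

With these helpers, a std::bitset<12> takes 2 bytes in the stream instead of the 12 characters produced by the one-character-per-bit loop.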

0
Rexxar On

If you use boost::dynamic_bitset instead, you can specify the type of the underlying blocks, export them with the to_block_range function, and load them back with from_block_range.

http://www.boost.org/doc/libs/1_46_0/libs/dynamic_bitset/dynamic_bitset.html#to_block_range

(for example, use unsigned char as block type and store them in a stream in binary mode)