Are custom data types a possible and viable option for variable size efficiency?


Instead of storing a single boolean in a one-byte block, why not store 8 booleans in that same block?

Example: 01010101 = 8 booleans, 1 3 5 7 = false, 2 4 6 8 = true.
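The packing described above can be sketched with plain shifts and masks. This is only a minimal illustration: `set_flag` and `get_flag` are hypothetical helper names, not part of any library, and numbering bits from the least significant end is an assumption made here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; bit 0 is the least significant bit.
inline std::uint8_t set_flag(std::uint8_t byte, unsigned index, bool value) {
    assert(index < 8);
    const unsigned mask = 1u << index;
    // Set the bit with OR, clear it with AND of the inverted mask.
    return static_cast<std::uint8_t>(value ? (byte | mask) : (byte & ~mask));
}

inline bool get_flag(std::uint8_t byte, unsigned index) {
    assert(index < 8);
    // Shift the wanted bit down to position 0 and test it.
    return ((byte >> index) & 1u) != 0;
}
```

With this numbering, the 01010101 pattern from the example is the value 0x55, with bits 0, 2, 4 and 6 set.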

Would this work?

Is there a better way?

What are the pros and cons of doing this?

Would this have much of an impact on networking?


There are 2 answers

0
congusbongus On BEST ANSWER

What you've described are often called bit fields; they're commonly used when space is at a premium (at the bits and bytes level) or you're really trying to shrink something. This includes (but is not limited to):

  • compression algorithms
  • general purpose protocols, to limit overhead
  • shrinking working data to fit caches

Otherwise, as with most problems in programming, you're better off using solutions that handle such low-level details for you, or keeping your code as simple as possible for human readers. Sometimes that means sticking with plain bools, since they state your code's intent directly. If your language and code base easily support bit fields, though, that's also fine. For completeness, C and C++ support bit fields natively via this struct colon syntax:

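struct Foo {
    // unsigned, so each 1-bit field holds 0 or 1
    // (a signed 1-bit field would hold 0 or -1)
    unsigned char f1 : 1;
    unsigned char f2 : 1;
    unsigned char f3 : 1;
    // ...
};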

... where the number after the colon is the width of that field in bits. There's also vector<bool>, but it's a problematic type that's seldom used these days, and it's clumsier too.
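As a rough sketch of how such a struct behaves in use, the following compares eight one-bit fields with eight plain bools. The names `PackedFlags`, `PlainFlags` and `demo_packed_flags` are made up for this illustration, and the exact layout of bit fields is implementation-defined, so the sizes mentioned in the comments are what mainstream compilers typically produce rather than a guarantee.

```cpp
#include <cassert>

// Eight one-bit flags; typically 1 byte in total on mainstream compilers.
struct PackedFlags {
    unsigned char f1 : 1;
    unsigned char f2 : 1;
    unsigned char f3 : 1;
    unsigned char f4 : 1;
    unsigned char f5 : 1;
    unsigned char f6 : 1;
    unsigned char f7 : 1;
    unsigned char f8 : 1;
};

// The same eight flags as plain bools; typically 8 bytes in total.
struct PlainFlags {
    bool f1, f2, f3, f4, f5, f6, f7, f8;
};

inline void demo_packed_flags() {
    PackedFlags f{};   // value-initialised: every flag starts at 0
    f.f2 = 1;          // fields are read and written like ordinary members
    f.f8 = 1;
    assert(f.f1 == 0 && f.f2 == 1 && f.f8 == 1);
    // unsigned char* p = &f.f2;  // would not compile: a bit field has no address
}
```

The trade-off is that individual fields cannot be pointed to or bound to a non-const reference, which is the price of sharing one byte between them.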

To answer your question more directly: unless you are working on a very-low-overhead network protocol, it's highly unlikely that you'll need to work with bit fields. Saving a few bytes makes little difference at the time scales usual in networking, and if you are really worried about it, you are better off using an off-the-shelf solution like protocol buffers.
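For the rare case where a hand-rolled protocol does need packed flags, a minimal sketch might look like the following. `pack_flags` and `unpack_flags` are hypothetical names, and the bit order (flag i goes into bit i % 8 of byte i / 8) is an assumption that sender and receiver would both have to agree on. The receiver also has to know the flag count, since the last byte may be only partly used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs `count` bools into ceil(count / 8) bytes.
inline std::vector<std::uint8_t> pack_flags(const bool* flags, std::size_t count) {
    assert(flags != nullptr || count == 0);
    std::vector<std::uint8_t> bytes((count + 7) / 8, 0);
    for (std::size_t i = 0; i < count; ++i) {
        if (flags[i]) {
            bytes[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
        }
    }
    return bytes;
}

// Hypothetical helper: the inverse of pack_flags; writes `count` bools to `out`.
inline void unpack_flags(const std::vector<std::uint8_t>& bytes,
                         bool* out, std::size_t count) {
    assert(out != nullptr || count == 0);
    assert(bytes.size() * 8 >= count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = ((bytes[i / 8] >> (i % 8)) & 1u) != 0;
    }
}
```

Ten flags fit in two bytes this way instead of ten, which shows the scale of the saving: real, but small next to the rest of a typical message.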

0
ravi On

You can use vector<bool> for this purpose, as it is designed for space efficiency.

vector<bool> is a pseudo-container that contains not actual bools, but a packed representation of bools that is designed to save space. In a typical implementation, each "bool" stored in the "vector" occupies a single bit, and an eight-bit byte holds eight "bools".

But there's a problem with this approach.

Suppose you want the address of an individual bool in the vector. You would try something like:

vector<bool> v;
bool *pb = &v[0];  // does not compile: v[0] is a temporary proxy object, not a bool

But there is no actual bool in the vector: v[0] returns a temporary proxy object (vector<bool>::reference) rather than a bool&, so you cannot get a bool* from it. This can cause problems in networking code, where at some point you will likely need to refer to individual bools.
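To make the proxy behaviour concrete, here is a small sketch. `flip_first` is a hypothetical function written for this illustration; `std::vector<bool>::reference` is the standard proxy type that operator[] returns on a non-const vector<bool>.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// For an ordinary vector, operator[] yields a real reference to the element...
static_assert(std::is_same<decltype(std::declval<std::vector<int>&>()[0]),
                           int&>::value,
              "vector<int>::operator[] returns int&");

// ...but for vector<bool> it yields a proxy class, not bool&.
static_assert(!std::is_same<decltype(std::declval<std::vector<bool>&>()[0]),
                            bool&>::value,
              "vector<bool>::operator[] does not return bool&");

// Hypothetical helper: toggles the first element through the proxy.
inline bool flip_first(std::vector<bool>& v) {
    assert(!v.empty());
    std::vector<bool>::reference first = v[0];  // proxy for one packed bit
    first = !first;                             // writes through to the vector
    return v[0];                                // proxy converts to bool
}
```

Reading and writing elements works through the proxy, so the type is usable; what is lost is anything that needs a genuine bool* or bool&, such as passing elements to an API that expects a pointer to bool.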