constructing key by bit shifting 3 integers in C

246 views Asked by At

I want to construct a key composed of 3 values by using bit shifting operations:

According to my understanding, the C statement code I am starting from creates a hash table by constructing its keys from certain data variables:

uint64_t key = (uint64_t)c->pos<<32 | c->isize;

My interpretation is that key is a combination of the last 32 digits of c->pos, which must be a 64 bit unsigned integer, and c->isize, also a 64bit unsigned integer.

But I am not sure if that is the case, and maybe the | pipe operator has a different meaning when applied to bit shifting operations.

What I want to do next is to modify the way key is constructed and include a third c->barc element into the variable. Given the number of possibilities of c->barc and c->isize, I was thinking that instead of building key with 32+32 bits (pos+isize), I would build it with 32+16+16 bits (pos+isize+barc) splitting the last 32 bits between isize and barc.

Any ideas how to do that?

3

There are 3 answers

0
John M On BEST ANSWER

What I think you need is a solid explanation of bitmasking.

For this particular case, you should use the & operator to mask out the upper 16 bits of c->isize before shifting it up, and then use the & operator again to mask the upper 48 bits of c->barc.

Let's look at some diagrams.

let
    c->pos          = xxxx_xxxx_....._xxxx
    c->isize        = yyyy_yyyy_....._yyyy
    c->barc         = zzzz_zzzz_....._zzzz
where
    x, y, and z are bits.

note: underscores are to identify groups of 4 bits.

If I understand correctly, you want a 64-bit number like this:

    xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_yyyy_yyyy_yyyy_yyyy_zzzz_zzzz_zzzz_zzzz

right?

As you already know, we get the upper 32 x's by doing

                       |-----32 bits of pos----|---32 0 bits--|
(uint64_t)c->pos<<32 = xxxx_xxxx_...._xxxx_xxxx_0000_...._0000

Now, we want to bitwise-or that with the following:

|----------32 0 bits----|
0000_0000_...._0000_0000_yyyy_yyyy_yyyy_yyyy_0000_0000_0000_0000

To get that number there, we do this:

((c->isize & 0xffff) << 16)
because:
c->isize & 0xffff gives 
    yyyy_yyyy_yyyy_yyyy_yyyy_yyyy_yyyy_yyyy
&   0000_0000_0000_0000_1111_1111_1111_1111
---------------------------------------------
    0000_0000_0000_0000_yyyy_yyyy_yyyy_yyyy

and then we shift it left by 16 to get 

|--------32 0 bits------|
0000_0000_...._0000_0000_yyyy_yyyy_yyyy_yyyy_0000_0000_0000_0000

Now, the final part, the

|-------48 0 bits-------|
0000_0000_...._0000_0000_zzzz_zzzz_zzzz_zzz

is the result plain and simply of

(c->barc & 0xffff) = 

    zzzz_zzzz_zzzz_zzzz_zzzz_zzzz_zzzz_zzzz
&   0000_0000_0000_0000_1111_1111_1111_1111
-------------------------------------------------
    0000_0000_0000_0000_zzzz_zzzz_zzzz_zzzz

So we take all of these expressions and bitwise-or them together.

uint64_t key = ((uint64_t)c->pos << 32) | ((c->isize & 0xffff) << 16) 
               | (c->barc & 0xffff);

if we diagram it out, we see
    xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_0000_0000_0000_0000_0000_0000_0000_0000
    0000_0000_0000_0000_0000_0000_0000_0000_yyyy_yyyy_yyyy_yyyy_0000_0000_0000_0000
or  0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_zzzz_zzzz_zzzz_zzzz
-----------------------------------------------------------------------------------
    xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_xxxx_yyyy_yyyy_yyyy_yyyy_zzzz_zzzz_zzzz_zzzz
0
bremby On

The "pipe operator" is actually a bitwise OR operator. The code takes two (presumably) 32-bit integers, one of them shifts left by 32 bits and combines them together. Thus you get a single 64-bit number. See Wiki for more info about bitwise operations.

If you want to compose your key from three 32-bit integers, then you obviously have to manipulate them to fit them into 64 bits. You can do something like this:

uint64_t key = (uint64_t)c->pos<<32 | (c->isize & 0xFFFF0000) | (c->barc & 0xFFFF);

This code takes 32 bits from c->pos, shifts them in the higher 32 bits of the 64-bit key, then takes the higher 16 bits of c->isize and finally the lower 16 bits of c->barc. See here for more.

2
luk32 On

I wouldn't do it. It is not safe if you are not designing whole thing by yourself. But let's explain some things.

My interpretation is that key is a combination of the last 32 digits of c->pos,

Generally, yes.

which must be a 64 bit unsigned integer, and c->isize, also a 64bit unsigned integer.

No. You know nothing about size of type of pos andisize, it is cast onto uint64_t it might be any type that allows such a cast.

My bet is that both values are 32-bit. 1st value is being cast onto 64bit type, because bit shift equal to or greater than the width of the type is undefined behaviour. So to stay safe it is widened.

The code probably packs two 32bit values into a 64bit one, otherwise it would loose information.

Moreover, if it wanted to construct key from values which would overlap it would most probably use xor rather than or. Your way is not a good approach, unless you precisely know what are you doing. You should find out what types your operands are and then choose a method for creation keys out of them.