I wish to move bits 0,8,16,24 of a 32-bit value to bits 0,1,2,3 respectively. All other bits in the input and output will be zero.
Obviously I can do that like this:
c = (c >> 21) + (c >> 14) + (c >> 7) + c;  /* parentheses needed: + binds tighter than >> in C */
c &= 0xF;
But is there a faster (fewer instructions) way?
Or wait for Intel Haswell processors, whose BMI2 extension does all of this in exactly one instruction (pext).
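For illustration, here is a minimal sketch of how the pext approach might look from C. It assumes a compiler that exposes the BMI2 intrinsic `_pext_u32` through `<immintrin.h>` and defines `__BMI2__` when BMI2 code generation is enabled (GCC and Clang do this with `-mbmi2`). The helper name `gather_bits` and the macro `GATHER_MASK` are made up for this example, and the fallback branch is just the shift-and-mask method from the question.

```c
#include <stdint.h>

#if defined(__BMI2__)
#include <immintrin.h>  /* _pext_u32 */
#endif

/* Bits 0, 8, 16 and 24: the four source bits named in the question. */
#define GATHER_MASK 0x01010101u

/* Hypothetical helper name, not taken from the original post. */
static uint32_t gather_bits(uint32_t c)
{
#if defined(__BMI2__)
    /* pext packs the bits selected by the mask into the low-order bits. */
    return _pext_u32(c, GATHER_MASK);
#else
    /* Portable fallback: shift each source bit into place, then mask. */
    c &= GATHER_MASK;
    return ((c >> 21) | (c >> 14) | (c >> 7) | c) & 0xFu;
#endif
}
```

Without `-mbmi2` (or an equivalent such as `-march=haswell`) the fallback branch is compiled instead, so the function should give the same result either way. For example, `gather_bits(0x01000001u)` yields 9, since bits 24 and 0 map to bits 3 and 0.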
Update
Taking into account the clarified constraints and assuming 32-bit unsigned values, the code may be simplified to this: