I have a 512-bit vector register (16 32-bit values) and a mask, and I store the values to memory using _mm512_mask_i32scatter_epi32(). To determine how many values are written to memory, I count the leading zeros of the mask using __builtin_clz(). As long as the mask is not empty, everything works fine. But when the mask is empty, something strange happens:
std::cout << "mask = " << mask << " clz(mask) " << __builtin_clz(mask) << "\n";
mask = 0 clz(mask) 31
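To make the setup concrete, here is a minimal sketch of the count I am after, not my actual code. It assumes the mask is an __mmask16 widened to a 32-bit unsigned int and that the set bits are contiguous starting at bit 0. The names lanes_via_clz and lanes_via_popcount are hypothetical. The zero mask is special-cased so that __builtin_clz is never called with 0, and the popcount variant is an alternative I am considering:

```cpp
#include <cstdint>

// Hypothetical helper: position of the highest set lane plus one, via clz.
// Assumes unsigned int is 32 bits wide. The empty mask is handled up front,
// so __builtin_clz is never called with 0 in this sketch.
inline int lanes_via_clz(std::uint16_t mask) {
    if (mask == 0) return 0;
    return 32 - __builtin_clz(static_cast<unsigned int>(mask));
}

// Hypothetical alternative: count the set bits directly. This equals the
// clz-based result only when the set bits are contiguous from bit 0.
inline int lanes_via_popcount(std::uint16_t mask) {
    return __builtin_popcount(static_cast<unsigned int>(mask));
}
```

For a mask such as 0x0005 the two helpers disagree (3 versus 2), so which one is right depends on whether the set bits are guaranteed to be contiguous.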
I have two questions:
- Does anyone know why clz returns 31 and not 32 for an empty mask?
- Is there a better way to determine the number of written values?
Sincerely