We permute a vector in a few places, and we need a distinguished zero vector to pass to the vec_perm
built-in. We have not been able to find a vec_zero()
or similar, so we would like to know how to handle this.
The code currently uses two strategies. The first strategy is a vector load:
__attribute__((aligned(16)))
static const uint8_t z[16] =
{ 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0 };
const uint8x16_p8 zero = vec_ld(0, z);
The second strategy is an xor using the mask we intend to use:
__attribute__((aligned(16)))
static const uint8_t m[16] =
{ 15,14,13,12, 11,10,9,8, 7,6,5,4, 3,2,1,0 };
const uint8x16_p8 mask = vec_ld(0, m);
const uint8x16_p8 zero = vec_xor(mask, mask);
We have not run benchmarks yet, so we don't know whether one is better than the other. The first strategy uses a VMX load, which could be expensive. The second strategy avoids the extra load but introduces a data dependency on the mask.
How do we obtain a VSX value of zero?
I'd suggest letting the compiler handle it for you. Just initialise the vector to zero, which will likely compile to an xor. For example, a simple test:
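A minimal sketch of what such a test might look like, assuming GCC or Clang (with -maltivec when targeting PowerPC). The typedef of uint8x16_p8 and the helper names make_zero and is_all_zero are illustrative assumptions, not taken from the post. The non-PowerPC branch is there only so the sketch also builds on other machines:

```c
#include <stdint.h>
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
/* Assumed typedef; the post uses uint8x16_p8 without showing its definition. */
typedef __vector unsigned char uint8x16_p8;
#else
/* Fallback so the sketch builds off PowerPC (GCC/Clang vector extension). */
typedef uint8_t uint8x16_p8 __attribute__((vector_size(16)));
#endif

/* Hypothetical helper: returns an all-zero vector. How the zero is
 * materialised (xor, splat, or load) is left to the compiler. */
uint8x16_p8 make_zero(void)
{
    /* Elements not named in the initializer are zero-filled. */
    const uint8x16_p8 zero = {0};
    return zero;
}

/* Hypothetical helper: returns 1 if all 16 bytes of v are zero, else 0. */
int is_all_zero(uint8x16_p8 v)
{
    uint8_t bytes[16];
    memcpy(bytes, &v, sizeof bytes);
    for (size_t i = 0; i < sizeof bytes; ++i)
        if (bytes[i] != 0)
            return 0;
    return 1;
}
```

Compiling a file like this with optimisation enabled and -S, then reading the assembly for make_zero, is one way to see which instruction the compiler actually picked on a given target.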
On my machine, this compiles to: