Neon intrinsic to prevent overflow by subtracting the minimum element from all elements [no looping]


I am looking for a way to prevent overflow when using NEON intrinsics in C on ARM. Here is the logic, performed element by element:

min = array[0];
for(i=1;i<64;i++)
{
    if(min > array[i])
    {
        min = array[i];
    }
}
for(i=0;i<64;i++)
{
    array[i] -= min;
}

I want an alternative solution that avoids element-by-element operations by doing the work in a SIMD fashion. Thanks.

NOTE: In my case, the array has 64 elements, held in four vectors of type uint8x16_t. I want to find the single minimum across all four vectors and subtract it from every element (i.e., normalize).


There is 1 answer

Armali On
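  1. Use vminq_u8 repeatedly to accumulate the lane-wise minimum of the four uint8x16_t vectors into one vector, then fold its two halves into a uint8x8_t with vmin_u8(vget_low_u8(v), vget_high_u8(v)).

  2. Use vpmin_u8 on that vector with itself three times. Each pass halves the number of candidates (8 -> 4 -> 2 -> 1), so log2(8) = 3 passes are enough, and lane 0 then holds the overall minimum. This is a pairwise reduction rather than a bubble sort.

  3. Use vdupq_lane_u8(result, 0) to broadcast the minimum into a vector of the target length (vdup_lane_u8 for an 8-lane vector; there is no vdup_8 intrinsic).

  4. Use vsubq_u8 to subtract it from each of the four vectors => normalize.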

    – ffox
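
The steps above can be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the function name normalize64_u8 is hypothetical, the array is assumed to hold exactly 64 uint8_t values, and the scalar branch is an addition so that the snippet also compiles on targets without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: subtract the minimum of 64 bytes from every byte. */
static void normalize64_u8(uint8_t *array)
{
    /* Load the 64 elements as four 16-lane vectors. */
    uint8x16_t v0 = vld1q_u8(array);
    uint8x16_t v1 = vld1q_u8(array + 16);
    uint8x16_t v2 = vld1q_u8(array + 32);
    uint8x16_t v3 = vld1q_u8(array + 48);

    /* Step 1: lane-wise minimum of the four vectors, then fold 16 -> 8 lanes. */
    uint8x16_t m16 = vminq_u8(vminq_u8(v0, v1), vminq_u8(v2, v3));
    uint8x8_t m8 = vmin_u8(vget_low_u8(m16), vget_high_u8(m16));

    /* Step 2: three pairwise-min passes reduce 8 -> 4 -> 2 -> 1 candidates. */
    m8 = vpmin_u8(m8, m8);
    m8 = vpmin_u8(m8, m8);
    m8 = vpmin_u8(m8, m8);

    /* Step 3: broadcast lane 0 (the overall minimum) to all 16 lanes. */
    uint8x16_t vmin = vdupq_lane_u8(m8, 0);

    /* Step 4: subtract and store; no lane can underflow because vmin <= x. */
    vst1q_u8(array,      vsubq_u8(v0, vmin));
    vst1q_u8(array + 16, vsubq_u8(v1, vmin));
    vst1q_u8(array + 32, vsubq_u8(v2, vmin));
    vst1q_u8(array + 48, vsubq_u8(v3, vmin));
}

#else

/* Scalar fallback (assumption: only here so the sketch builds without NEON). */
static void normalize64_u8(uint8_t *array)
{
    uint8_t min = array[0];
    for (int i = 1; i < 64; i++) {
        if (array[i] < min) {
            min = array[i];
        }
    }
    for (int i = 0; i < 64; i++) {
        array[i] -= min;
    }
}

#endif
```

On AArch64, the fold and the three vpmin_u8 passes could instead be replaced by a single vminvq_u8(m16), which returns the horizontal minimum as a scalar; that intrinsic is not available on 32-bit ARM.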