Bitwise weighted sum for the QOI index hash function

109 views Asked by At

I recently contributed to a Rust library for the QOI image format. As part of the encoding process, the encoder keeps track of an array of already seen RGBA pixel values. It calculates the index using a simple weighted sum hash function:

index = (3*r + 5*g + 7*b + 11*a) % 64

The library uses bitwise operations to calculate the hash which proves to be faster than the naive solution. I need help understanding the algorithm.

pub fn hash_index(px: [u8; 4]) -> u8 {
    let v = u32::from_le_bytes([px[0], px[1], px[2], px[3]]) as u64;
    let s = ((v & 0xff00_ff00) << 32) | (v & 0x00ff_00ff);
 
    (s.wrapping_mul(0x0300_0700_0005_000b_u64) >> 56) as u8 & 63
}

From my understanding the algorithm:

  • Creates a u32 representation of the pixel bytes
  • Expands it to a u64 to fit the multiplication by a shl 32. It works because the maximum mul value, 11x255 does not overflow a word.
  • Masks the higher and lower bytes. The higher bytes contain A and B and the lower bytes contain G and R.

What I don't understand:

  • How both multiplication and summation are done in the same call? Where is the addition operation take place?
  • Why is the weights constant inverted? Shouldn't it be 0x0300_0700_0005_000b?
  • Where is the result? Why call shr 56 to extract the MSB?

Can someone please elaborate on those points, or walk me through the code? My bitwise skills are quite lacking but I happy I understand this far.

Thanks

1

There are 1 answers

3
harold On

The multiplication is used to add the different parts. The sum adds up in the upper byte of the product, some other incomplete results ends up in the lower bytes of the product and they are discarded.

Consider that s * 0x0300_0700_0005_000b = s * 0xb + s * 0x5_0000 + s * 0x0700_0000_0000 + s * 0x0300_0000_0000_0000, and in turn, s * 0x5_0000 = (s * 5) << 16.

The goal of the multiplication is to accumulate the result in the most significant byte of the product. The most significant byte is the only byte of the product that is "reachable" from every byte of s: a multiplication can only shift data left, not right (well, a double-width multiplication could, if you look at its upper half, but that's not used here).

In order for a byte of s to get to the most significant byte, it needs to be shifted by a distance which is the "opposite" in some sense of the position where it started. So the least significant byte of s (the r component) needs to be shifted the most, that is why the weight of the r component is at the top of the multiplier. The other non-zero digits of the multiplier make extra copies of r throughout the lower 7 bytes of the product too, but they will be ignored.

Speaking of "copies", consider that s has the form r + (b << 16) + (g << 40) + (a << 56) and the multiplier has the form 11 + (5 << 16) + (7 << 40) + (3 << 56). If you work out all the partial products, you get:

11r + (11b << 16) + (11g << 40) + (11a << 56) +
(5r << 16) + (5b << 32) + (5g << 56) + (5a << 72) +
(7r << 40) + (7b << 56) + (7g << 80) + (7a << 96) +
(3r << 56) + (3b << 72) + (3g << 96) + (3a << 112)

Of these, anything shifted by 64 or more entirely disappears, because we're doing a wrapping 64-bit multiplication.

Anything shifted by less than 56 will be discarded by the right-shift by 56. By the way this is only true if you chose the inputs such that the additions performed by the multiplication do not carry into the part of the product that we care about. That works out OK here, because there is a generous amount of "buffer space": (11g << 40) + (7r << 40) (the only two partial products that come anywhere close to the most significant byte) can be at most ((11*255 << 40) + (7*255 << 40)) = 4590 << 40, which is not enough to affect the most significant byte.

What's left is 11a + 5g + 7b + 3r. Since this all happened in the most significant byte, only the 8 least significant bits of that sum are produced, but we only need the 6 least significant bits here anyway. That "automatic" loss of the higher bits could be used to remove the & 63 operation, by shifting the multiplier left by 2 and right-shifting by 58 instead of 56.