Hamming distance of two integers mysql

273 views Asked by At

mysql schema and query are here

http://sqlfiddle.com/#!9/444873/1

The query seems to work and returns to me only the rows that have hamming distance less than 7 bits.

It seems that the following property applies:

bit_count(a ^ b ) >= abs(bit_count(a) - bit_count(b))

Some examples

                bit_count
a     1111       4
b     0000       0
a^b   1111       4

a     1010       2
b     0110       2
a^b   1100       2

a     1001       2    
b     1001       2
a^b   0000       0

Is the above inequality true?

If yes can somebody provide a proof?

I am asking that because if the above inequality is true then the index I used makes sense to reduce the query time

1

There are 1 answers

1
harold On BEST ANSWER

Here's a proof, it can probably be done simpler but it's not too bad.

With induction in length of the bit string, the base case being the empty string, for which the inequality obviously holds.

The induction step is prepending (or appending, doesn't make any difference) a bit to both A and B.

  • if we prepend a 0 to both, the popcnts don't change so the inequality still holds.
  • if we prepend a 0 to one of them but 1 to the other, their XOR will have one more bit set so the LHS goes up by 1. In the RHS, one of the popcnts goes up (doesn't matter which, absolute difference is commutative), and as a result the RHS will either go up by 1 (same as the LHS, no problem) or down by 1 (still fine, the RHS is allowed to be less than the LHS, but this is why it's not an equality)
  • if we prepend a 1 to both, their XOR does not change so the LHS stays the same. In the RHS, both popcnts go up by 1, they cancel each other, so the RHS also stays the same.

Thus this inequality holds for bit strings of any length.