Hamming distance between two binary strings not working

I found an interesting algorithm to calculate the Hamming distance on this site:

def hamming2(x,y):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x) == len(y)
    count,z = 0,x^y
    while z:
        count += 1
        z &= z-1 # magic!
    return count

The problem is that this algorithm only works on integers (it relies on bitwise operators), and I'm trying to compare two binary numbers that are stored as strings, like

'100010'
'101000'

How can I make them work with this algorithm?

There are 4 answers

dlask On BEST ANSWER

Implement it directly on the strings by comparing them character by character:

def hamming2(s1, s2):
    """Calculate the Hamming distance between two bit strings"""
    assert len(s1) == len(s2)
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

And test it:

assert hamming2("1010", "1111") == 2
assert hamming2("1111", "0000") == 4
assert hamming2("1111", "1111") == 0
Adam Hammes On

If we are to stick with the original algorithm, we need to convert the strings to integers to be able to use the bitwise operators.

def hamming2(x_str, y_str):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x_str) == len(y_str)
    x, y = int(x_str, 2), int(y_str, 2)  # '2' specifies we are reading a binary number
    count, z = 0, x ^ y
    while z:
        count += 1
        z &= z - 1  # magic!
    return count

Then we can call it as follows:

print(hamming2('100010', '101000'))
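
The step marked `# magic!` clears the lowest set bit of `z`, so the loop body runs once per differing bit. A small trace of that behaviour, as a sketch (the helper name `trace_clear_lowest_bit` is made up for illustration):

```python
def trace_clear_lowest_bit(z):
    """Return the successive values of z while `z &= z - 1` is applied (hypothetical helper)."""
    steps = []
    while z:
        steps.append(format(z, "b"))  # record z in binary before clearing a bit
        z &= z - 1  # clears the lowest set bit of z
    return steps


x, y = int("100010", 2), int("101000", 2)
steps = trace_clear_lowest_bit(x ^ y)
print(steps)       # ['1010', '1000']
print(len(steps))  # 2, the Hamming distance
```

The XOR leaves a `1` wherever the inputs differ, and each pass removes exactly one of those bits, which is why the pass count equals the distance.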

While this algorithm is cool as a novelty, having to convert the strings to integers first likely negates any speed advantage it might have. The answer @dlask posted is much more succinct.
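
If the XOR approach is still wanted, the loop can be replaced by counting the `1` characters in the binary representation of `x ^ y`. A minimal sketch, assuming both inputs contain only `0` and `1`; the name `hamming_xor` and the use of `ValueError` are illustrative choices, not part of the original answer:

```python
def hamming_xor(x_str, y_str):
    """Hamming distance of two equal-length binary strings via XOR (illustrative name)."""
    if len(x_str) != len(y_str):
        raise ValueError("strings must have the same length")
    # int(..., 2) parses a binary string; bin() renders the XOR result as '0b...'
    return bin(int(x_str, 2) ^ int(y_str, 2)).count("1")


print(hamming_xor("100010", "101000"))  # 2
```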

Mikheil Zhghenti On

I think this illustrates the Hamming distance between two strings well:

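def hammingDist(s1, s2):
    # Encode to bytes so that indexing yields integers that can be XORed
    bytesS1 = bytes(s1, encoding="ascii")
    bytesS2 = bytes(s2, encoding="ascii")
    diff = 0
    # Only positions up to the length of the shorter string are compared
    for i in range(min(len(bytesS1), len(bytesS2))):
        if bytesS1[i] ^ bytesS2[i] != 0:
            diff += 1
    return diff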
Panos Kalatzantonakis On

This is what I use to calculate the Hamming distance.
It counts the number of positions at which two equal-length strings differ.

def hamdist(str1, str2):
    diffs = 0
    for ch1, ch2 in zip(str1, str2):
        if ch1 != ch2:
            diffs += 1
    return diffs
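
Note that `zip` stops at the end of the shorter string, so `hamdist` silently ignores any extra characters when the lengths differ. A sketch of a variant that rejects unequal lengths instead; the name `hamdist_checked` and the choice of `ValueError` are assumptions made for illustration:

```python
def hamdist_checked(str1, str2):
    """Count differing positions; raise if the lengths differ (illustrative variant)."""
    if len(str1) != len(str2):
        raise ValueError("strings must have the same length")
    # Each comparison yields True/False, which sum() counts as 1/0
    return sum(ch1 != ch2 for ch1, ch2 in zip(str1, str2))


print(hamdist_checked("100010", "101000"))  # 2
```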