I'm trying to work through how fuzzywuzzy calculates this simple fuzz ratio:
print(fuzz.ratio("66155347", "12026599"))
25
Why is the fuzz ratio not 0, since the two strings have a different character in every position?
Levenshtein Distance = 8 (because every character needs to be substituted)
a = 8 (length of string 1)
b = 8 (length of string 2)
fuzz.ratio is (a+b - Levenshtein Distance)/(a+b)
fuzz.ratio is (8+8 - 8)/(8+8) = .50
fuzz.ratio is 50
There also must be something wrong with my math; I'm getting 50.
How does the fuzz ratio arrive at 25?
Any guidance would be appreciated.
Thanks
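The fuzzywuzzy library uses a weighted version of the Levenshtein distance which gives a weight of 2 to replacements, the same cost as one deletion plus one insertion. Replacing all 8 characters would then cost 16, so it is cheaper to keep the two characters the strings share in the same order (for example "6" and "5") and delete the other 6 characters of the first string and insert the other 6 of the second. That brings the Levenshtein distance to 6 + 6 = 12 instead of 8. Then (8 + 8 - 12) / (8 + 8) = 0.25, which fuzz.ratio reports as 25.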
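If you want to check the arithmetic yourself, here is a minimal pure-Python sketch of that weighting. It does not call fuzzywuzzy and is not its actual implementation; the names weighted_levenshtein and ratio_from_distance are made up for this example, and it assumes insertions and deletions cost 1 while the replacement cost is a parameter.

```python
def weighted_levenshtein(s1, s2, sub_cost=2):
    """Edit distance with insert/delete cost 1 and a configurable
    replacement cost (hypothetical helper, not part of fuzzywuzzy)."""
    # prev[j] holds the distance between the processed prefix of s1 and s2[:j]
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        curr = [i]  # turning s1[:i] into an empty string takes i deletions
        for j, c2 in enumerate(s2, start=1):
            replace = prev[j - 1] + (0 if c1 == c2 else sub_cost)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(replace, delete, insert))
        prev = curr
    return prev[-1]


def ratio_from_distance(s1, s2, sub_cost=2):
    """(a + b - distance) / (a + b), scaled to 0..100 and rounded."""
    total = len(s1) + len(s2)
    if total == 0:
        return 100
    dist = weighted_levenshtein(s1, s2, sub_cost)
    return int(round(100 * (total - dist) / total))


s1, s2 = "66155347", "12026599"
print(weighted_levenshtein(s1, s2, sub_cost=1))  # 8, the plain distance from the question
print(weighted_levenshtein(s1, s2))              # 12, with replacements weighted 2
print(ratio_from_distance(s1, s2))               # 25
```

With sub_cost=1 the same function gives the 50 from the question, so the only difference between the two results is the replacement weight.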