The Levenshtein distance algorithm in Python is too slow, as I am comparing many strings.
So I want to use difflib.ndiff to compute it instead.
I tried parsing the output by interpreting the "+", "-" and " " codes in the result of ndiff, but failed.
Here is what I tried:
import difflib
edit_dist = sum(op[0] == "-" for op in difflib.ndiff("split", "sitting"))
But the returned result is incorrect.
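To see where the count goes wrong, it can help to print the operation codes that ndiff emits for these two strings. The sketch below assumes ndiff is given plain strings, so that each character is treated as one "line"; the helper name `ndiff_ops` is made up for illustration and is not part of difflib.

```python
import difflib

def ndiff_ops(a, b):
    """Return (code, char) pairs from difflib.ndiff, skipping any '?' hint lines."""
    return [(op[0], op[2:]) for op in difflib.ndiff(a, b) if op[0] != "?"]

ops = ndiff_ops("split", "sitting")
print(ops)

# Count each kind of operation separately.
deletions = sum(code == "-" for code, _ in ops)
insertions = sum(code == "+" for code, _ in ops)
print(deletions, insertions)
```

Counting only the "-" codes ignores every insertion, so the one-liner above can only ever report the number of deleted characters.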
I want to use solutions from the stdlib. Any advice / solutions?
P.S. I need to get the edit distance as a number, not a ratio, so SequenceMatcher.ratio doesn't work.
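For illustration, here is a minimal sketch of one way to turn the ndiff codes into a single number using only the stdlib. It assumes that, within each run of changes between two matching characters, a deletion and an insertion can be paired up as one substitution, so the run costs max(insertions, deletions). The function name `approx_edit_distance` is hypothetical. Because difflib aligns strings by longest matching blocks rather than by minimal edit cost, the result should be read as an upper bound on the Levenshtein distance, not necessarily the exact value.

```python
import difflib

def approx_edit_distance(a, b):
    """Estimate the edit distance from ndiff codes (an upper bound, not exact)."""
    distance = 0
    counts = {"+": 0, "-": 0}  # pending insertions / deletions in the current run
    for op in difflib.ndiff(a, b):
        code = op[0]
        if code == " ":
            # A matching character closes the run: paired +/- count as substitutions.
            distance += max(counts.values())
            counts = {"+": 0, "-": 0}
        elif code in counts:
            counts[code] += 1
        # Any '?' hint lines are ignored.
    # Flush a run that reaches the end of the strings.
    return distance + max(counts.values())

print(approx_edit_distance("kitten", "sitting"))
print(approx_edit_distance("split", "sitting"))
```

For "kitten" and "sitting" this agrees with the Levenshtein distance of 3, but for "split" and "sitting" it reports 6 where the true distance is 5, so it only fits cases where an approximation is acceptable.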