I'm trying to find a diff (longest common subsequence) between two lists of strings. I'm guessing difflib could be useful here, but difflib.ndiff annotates the output with -, +, etc. For instance:
from difflib import ndiff
t1 = 'one 1\ntwo 2\nthree 3'.splitlines()
t2 = 'one 1\ntwo 29\nthree 3'.splitlines()
d = list(ndiff(t1, t2))
print(d)
['  one 1', '- two 2', '+ two 29', '?      +\n', '  three 3']
Is tokenising and removing the letter-codes in the output the right way? Is this the proper Pythonic way of diffing lists?
If all you want is the difference of the first list from the second, you can convert both to set and take the set difference using the - operator. Example:
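A minimal sketch of the set-difference idea, reusing t1 and t2 from the question. The names only_in_t1, only_in_t2 and ordered are illustrative, not from the original post. Keep in mind that sets drop ordering and duplicates, so this only tells you which lines differ, not where; the list comprehension at the end is one way to keep the original order.

```python
t1 = 'one 1\ntwo 2\nthree 3'.splitlines()
t2 = 'one 1\ntwo 29\nthree 3'.splitlines()

# Lines that appear in t1 but not in t2 (order and duplicates are lost).
only_in_t1 = set(t1) - set(t2)

# Lines that appear in t2 but not in t1.
only_in_t2 = set(t2) - set(t1)

print(only_in_t1)  # {'two 2'} on Python 3
print(only_in_t2)  # {'two 29'} on Python 3

# If the original order matters, filter t1 against a set built from t2.
t2_lines = set(t2)
ordered = [line for line in t1 if line not in t2_lines]
print(ordered)     # ['two 2']
```

This is not a longest-common-subsequence diff: a line that merely moved to another position would not show up at all.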