DeepDiff: printing the record keys when any values are mismatched


I am using DeepDiff to compare data from two databases. Here is an example:

from deepdiff import DeepDiff
users1 = [{'id': 1, 'name': 'John', 'age': 30}, {'id': 2, 'name': 'Jane', 'age': 25}]
users2 = [{'id': 1, 'name': 'John', 'age': 20}, {'id': 2, 'name': 'Bob', 'age': 35}]
diff = DeepDiff(users1, users2)
print(diff)

It gives me output like the following:

{'values_changed': {"root[0]['age']": {'new_value': 20, 'old_value': 30}, "root[1]['name']": {'new_value': 'Bob', 'old_value': 'Jane'}, "root[1]['age']": {'new_value': 35, 'old_value': 25}}}

But I want a key such as the id to be printed as well, so that I can tell which id has the mismatch.

A sample of the desired output could be

 {'values_changed': {"root[ID_VALUE]['age']": {'new_value': 20, 'old_value': 30}, "root[1]['name']": {'new_value': 'Bob', 'old_value': 'Jane'}, "root[ID_VALUE]['age']": {'new_value': 35, 'old_value': 25}}}

or

{'values_changed': {1: {"root[0]['age']": {'new_value': 20, 'old_value': 30}}, 2: {"root[1]['name']": {'new_value': 'Bob', 'old_value': 'Jane'}, "root[1]['age']": {'new_value': 35, 'old_value': 25}}}}

Is there any way to do this?

Thanks

2

There are 2 answers

10
Mark On
from deepdiff import DeepDiff

# Making a slightly larger example
users1 = [{'id': 69, 'name': 'John', 'age': 30}, {'id': 420, 'name': 'Jane', 'age': 25}, {'id': 123, 'name': 'Janet', 'age': 32}, {'id': 42, 'name': 'Jack', 'age': 22}]
users2 = [{'id': 69, 'name': 'John', 'age': 30}, {'id': 420, 'name': 'Bob', 'age': 35}, {'id': 123, 'name': 'Janet', 'age': 69}, {'id': 42, 'name': 'Jack', 'age': 22}]

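dd = DeepDiff(users1, users2)

# One row can have several changes, so group them by id rather than zipping ids with changes
changes = {}
for path, change in dd["values_changed"].items():
    index = int(path[len("root["):path.index("]")])  # "root[1]['name']" -> 1
    changes.setdefault(users1[index]["id"], {})[path] = change

print(changes)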

This assumes that the records appear in the same order in both lists and that the IDs don't change between the two lists.
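If the rows might come back in a different order from each database, one way around the positional assumption is to index both lists by id before comparing. The following is only a minimal sketch in plain Python with no DeepDiff dependency; `diff_by_id` is a hypothetical helper name, and it assumes every row has a unique `id` and only compares ids present in both lists. DeepDiff's documentation also describes a `group_by` parameter for lists of dictionaries like these, which may be worth checking against your installed version.

```python
def diff_by_id(old_rows, new_rows, key="id"):
    """Compare two lists of dicts, matching rows on `key` instead of position."""
    old_by_id = {row[key]: row for row in old_rows}
    new_by_id = {row[key]: row for row in new_rows}

    changes = {}
    # Only ids present in both lists are compared in this sketch.
    for row_id in old_by_id.keys() & new_by_id.keys():
        old, new = old_by_id[row_id], new_by_id[row_id]
        changed = {
            field: {"old_value": old.get(field), "new_value": new.get(field)}
            for field in old.keys() | new.keys()
            if old.get(field) != new.get(field)
        }
        if changed:
            changes[row_id] = changed
    return changes


users1 = [
    {"id": 69, "name": "John", "age": 30},
    {"id": 420, "name": "Jane", "age": 25},
    {"id": 123, "name": "Janet", "age": 32},
    {"id": 42, "name": "Jack", "age": 22},
]
# Same rows as users2 above, deliberately listed in a different order.
users2 = [
    {"id": 42, "name": "Jack", "age": 22},
    {"id": 123, "name": "Janet", "age": 69},
    {"id": 69, "name": "John", "age": 30},
    {"id": 420, "name": "Bob", "age": 35},
]

result = diff_by_id(users1, users2)
print(result)
```

Because rows are matched on the id, the result is keyed by 420 and 123 regardless of where those rows sit in either list.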

0
Sanjay Kumar On

I tried the following and got the desired result, but I would like to know if there is a more efficient way to do this.

import re

import pandas as pd
from deepdiff import DeepDiff

users1 = [{'id': 69, 'name': 'John', 'age': 30}, {'id': 420, 'name': 'Jane', 'age': 25}, {'id': 123, 'name': 'Janet', 'age': 32}, {'id': 42, 'name': 'Jack', 'age': 22}]
df = pd.DataFrame(users1)  # built here for the example; in my real code I go from DataFrame to dict
users2 = [{'id': 69, 'name': 'John', 'age': 30}, {'id': 420, 'name': 'Bob', 'age': 35}, {'id': 123, 'name': 'Janet', 'age': 69}, {'id': 42, 'name': 'Jack', 'age': 22}]
diff = DeepDiff(users1, users2)
# print(diff)
diff = diff.get('values_changed')
if diff is not None:
    for key in list(diff.keys()):
        # The first number in a path such as "root[1]['age']" is the list index
        find_index = re.search(r'\b\d+\b', key)
        index = int(find_index.group())
        # Cast to a plain int so the key is not a NumPy integer
        primary_key_from_df = int(df['id'].iloc[index])
        # Note: a later change for the same row overwrites an earlier one
        diff[primary_key_from_df] = diff.pop(key)
print(diff)

This gives me the result below, which is fine for me:

{420: {'new_value': 35, 'old_value': 25}, 123: {'new_value': 69, 'old_value': 32}}
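Note that the result above keeps only one change per id: the name change for id 420 is replaced by the age change because both are stored under the same key. If every changed field is needed, a sketch along the following lines could keep them all. It uses only the standard library and works on a dictionary shaped like DeepDiff's `values_changed`; the `changes_by_id` helper and the hand-written `values_changed` literal are assumptions for illustration, and the pattern only handles paths of the form `root[<index>]['<field>']`.

```python
import re

# Captures the list index and the field name from a path like "root[1]['age']".
PATH_PATTERN = re.compile(r"^root\[(\d+)\]\['([^']+)'\]$")


def changes_by_id(values_changed, rows, key="id"):
    """Regroup path-keyed changes as {row id: {field name: change}}."""
    result = {}
    for path, change in values_changed.items():
        match = PATH_PATTERN.match(path)
        if match is None:
            continue  # skip paths that are not root[<index>]['<field>']
        index, field = int(match.group(1)), match.group(2)
        result.setdefault(rows[index][key], {})[field] = change
    return result


users1 = [
    {"id": 69, "name": "John", "age": 30},
    {"id": 420, "name": "Jane", "age": 25},
    {"id": 123, "name": "Janet", "age": 32},
    {"id": 42, "name": "Jack", "age": 22},
]

# Hand-written stand-in for DeepDiff(users1, users2)["values_changed"].
values_changed = {
    "root[1]['name']": {"new_value": "Bob", "old_value": "Jane"},
    "root[1]['age']": {"new_value": 35, "old_value": 25},
    "root[2]['age']": {"new_value": 69, "old_value": 32},
}

grouped = changes_by_id(values_changed, users1)
print(grouped)
```

Nesting the field name under the id avoids the overwrite and does not need pandas for the index lookup.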