Why are the MD5 hexdigest values of two CSV files with the same contents different when I check them? The only difference between the two files is that one is tab-separated and the other is comma-separated; otherwise the values are the same.
import hashlib
f1 = open(r'D:\Temporary\New File.csv', mode='r')
f2 = open(r'D:\Temporary\Old File.csv', mode='r')
# read the full contents of each file before hashing
t1 = f1.read()
t2 = f2.read()
print hashlib.md5(t1).hexdigest(),' ',hashlib.md5(t2).hexdigest()
if hashlib.md5(t1).hexdigest() == hashlib.md5(t2).hexdigest():
    print "Match"
else:
    print "Not Match"
The output is:
a4b2720cafdcb859e7ef07a7a3564ba3 237a5c28b890f94636035482a363853a
Not Match
On the other hand, this code gives what looks like the correct output: I added calls to read() and printed the contents before taking the MD5 digest, and now the hashes match.
import hashlib
f1 = open(r'D:\Temporary\New File.csv', mode='r')
f2 = open(r'D:\Temporary\Old File.csv', mode='r')
print f1.read()
print f2.read()
t1 = f1.read()
t2 = f2.read()
print hashlib.md5(t1).hexdigest(),' ',hashlib.md5(t2).hexdigest()
if hashlib.md5(t1).hexdigest() == hashlib.md5(t2).hexdigest():
    print "Match"
else:
    print "Not Match"
Now, the output is:
Ultimator Start Code Start Count
Ultimator,Start Code,Start Count,,,,
d41d8cd98f00b204e9800998ecf8427e d41d8cd98f00b204e9800998ecf8427e
Match
MD5 is a cryptographic hash function that operates on the raw bytes of a file. If two CSV files have the same contents (as you see it) but use different delimiters, their raw bytes differ, so their MD5 hexdigest values will differ as well.
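To make this concrete, here is a minimal sketch (written for Python 3, unlike the Python 2 code in the question) that hashes a tab-separated and a comma-separated version of the same row. The sample strings and the helper md5_of_rows are made up for illustration; the idea of hashing the parsed rows instead of the raw bytes is only one possible way to compare the files by content.

```python
import csv
import hashlib
import io


def md5_of_bytes(data: bytes) -> str:
    """MD5 hexdigest of the raw bytes, as hashlib sees them."""
    return hashlib.md5(data).hexdigest()


def md5_of_rows(text: str, delimiter: str) -> str:
    """Hypothetical helper: hash the parsed rows instead of the raw text."""
    digest = hashlib.md5()
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    for row in reader:
        # Re-serialize each row in one fixed form so the original
        # delimiter no longer influences the hash.
        digest.update("\x1f".join(row).encode("utf-8") + b"\n")
    return digest.hexdigest()


# Illustrative data, not the actual files from the question.
tab_text = "Ultimator\tStart Code\tStart Count\n"
comma_text = "Ultimator,Start Code,Start Count\n"

raw_equal = md5_of_bytes(tab_text.encode("utf-8")) == md5_of_bytes(comma_text.encode("utf-8"))
rows_equal = md5_of_rows(tab_text, "\t") == md5_of_rows(comma_text, ",")

print(raw_equal)   # False: the raw bytes differ
print(rows_equal)  # True: the parsed rows are identical
```

Note that the comma-separated file in the question's output ends its header with several empty fields (",,,,"), so a row-based comparison like this one would still report those files as different unless trailing empty fields are stripped first.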
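When you call file.read() first, the file's position pointer is left at the end of the file, so calling file.read() again afterwards returns ''. In your second snippet both hashes are therefore the MD5 of the empty string (d41d8cd98f00b204e9800998ecf8427e), which is why they match.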
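The behaviour can be reproduced with a short sketch (Python 3 here; an in-memory io.BytesIO object stands in for the opened file, and the sample bytes are made up). It also shows seek(0), which is one way to rewind the file if you really need to read it twice.

```python
import hashlib
import io

# Stand-in for a file opened with open(...); the contents are illustrative.
f = io.BytesIO(b"Ultimator,Start Code,Start Count\n")

first = f.read()    # reads everything and leaves the pointer at the end
second = f.read()   # nothing left to read, so this is empty

empty_digest = hashlib.md5(second).hexdigest()
print(empty_digest)  # d41d8cd98f00b204e9800998ecf8427e

f.seek(0)           # rewind to the start of the file
third = f.read()    # the full contents again
print(third == first)  # True
```

In the question's code, storing the result of a single read() in a variable and both printing and hashing that variable avoids the second read entirely.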