Comparison between one-way SHA-256 hash in Node.js vs Python

I have the following version of code for Python:

import hashlib
msg = 'abc'
print msg
sha256_hash = hashlib.sha256()
sha256_hash.update(msg)
hash_digest = sha256_hash.digest()
print hash_digest

And the corresponding Node.js version:

var crypto= require('crypto');
var msg = 'abc';
var shasum = crypto.createHash('sha256').update(msg);
var hashDigest = shasum.digest();
console.log(hashDigest);

However, the binary output differs slightly between the two:

  • Node : �x����AA@�]�"#�a��z���a��
  • Python:�x���AA@�]�"#�a��z���a��

The hex representations do match between the two libraries, though. Am I doing something wrong here?

3

There are 3 answers

2
loganfsmyth On BEST ANSWER

TL;DR

Your Node code is trying to decode the result of the hash as UTF-8 and failing.


The difference is in how the two languages treat binary data and string types. In terms of the final binary output, your examples both produce the same values. So let's examine the output of your two examples, in hex:

ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

In Python:

'\xbax\x16\xbf\x8f\x01\xcf\xeaAA@\xde]\xae"#\xb0\x03a\xa3\x96\x17z\x9c\xb4\x10\xffa\xf2\x00\x15\xad'

In Node:

<SlowBuffer ba 78 16 bf 8f 01 cf ea 41 41 40 de 5d ae 22 23 b0 03 61 a3 96 17 7a 9c b4 10 ff 61 f2 00 15 ad>

In this case, the core thing to notice is that the result in Python is returned as a string. In Python 2, strings are simply arrays of chars (values 0-255). The value in Node, however, is stored as a Buffer, which also represents an array of values (0-255). That is the key difference here. Node does not return a string, because strings in Node are not arrays of single-byte characters, but arrays of UTF-16 code units. Python 2 supports Unicode using a separate string class designated by u''.
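As a rough illustration of the "array of values (0-255)" point, here is a minimal sketch. It assumes Python 3 rather than the Python 2 used in the question; there the raw digest is a `bytes` object, which is closer in spirit to Node's Buffer than to a text string.

```python
import hashlib

# Python 3 (an assumption; the question uses Python 2): the input must be
# bytes, and digest() returns bytes, i.e. a sequence of values 0-255.
digest = hashlib.sha256(b"abc").digest()

print(type(digest).__name__)  # bytes
print(len(digest))            # 32 raw bytes for SHA-256
print(list(digest[:4]))       # [186, 120, 22, 191], i.e. ba 78 16 bf
print(digest.hex())           # the same hex string shown above
```

Nothing here has been decoded as text; the hex string is just a readable rendering of the same 32 byte values.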

So then compare your examples of printing the output, shortened for readability:

print '\xbax\x16\xbf\x8f\x01\xcf\xeaAA'

vs

console.log('' + 
    new Buffer([0xba, 0x78, 0x16, 0xbf, 0x8f, 0x01, 0xcf, 0xea, 0x41, 0x41]))

The Python code says: write this array of bytes to the terminal. The Node code, however, says something very different: convert this array of bytes into a string, and then write that string to the terminal. But the buffer holds binary data, not UTF-8 encoded text, so it cannot be decoded cleanly into a string, which causes the garbled results. If you wish to directly compare the binary values as they are actually displayed in a terminal, you need to give the equivalent instructions in both languages:
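The failed decode can be approximated in Python 3 (again an assumption, since the question uses Python 2). The sketch below decodes the same ten bytes as UTF-8 with replacement characters, which is roughly analogous to what converting the Buffer to a string does, and shows that the conversion is lossy.

```python
# The same ten bytes used in the shortened examples above.
raw = bytes([0xBA, 0x78, 0x16, 0xBF, 0x8F, 0x01, 0xCF, 0xEA, 0x41, 0x41])

# Decode as UTF-8, substituting U+FFFD for every invalid byte sequence.
text = raw.decode("utf-8", errors="replace")

# Re-encoding the decoded text does not give the original bytes back,
# so the string is not a faithful representation of the hash.
roundtrip = text.encode("utf-8")

print("\ufffd" in text)   # True: some bytes were not valid UTF-8
print(roundtrip == raw)   # False: information was lost in decoding
```

Valid ASCII bytes such as 0x78 ("x") and 0x41 ("A") survive, which is why the garbled output in the question still shows a recognizable "x" and "AA".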

print '\xbax\x16\xbf\x8f\x01\xcf\xeaAA'

vs

process.stdout.write(
    new Buffer([0xba, 0x78, 0x16, 0xbf, 0x8f, 0x01, 0xcf, 0xea, 0x41, 0x41]))

Here, process.stdout.write is a way to write binary values to the terminal, rather than strings.

Really though, you should just compare the hashes as hex, since hex is already a string representation of a binary value, and it's easier to read than improperly decoded Unicode characters.
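A minimal sketch of that hex comparison, assuming Python 3 and assuming the Node side was produced with `digest('hex')`; the `node_hex` value is simply the hex string quoted earlier in this answer.

```python
import hashlib

# Hex digest computed on the Python side.
python_hex = hashlib.sha256(b"abc").hexdigest()

# Hex digest as it would be copied from the Node side (value quoted above).
node_hex = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

print(python_hex == node_hex)  # True

# If the raw bytes are needed again, hex converts back without loss.
raw = bytes.fromhex(node_hex)
print(len(raw))  # 32
```

Unlike a UTF-8 decode, the hex form round-trips exactly, so equal hex strings mean equal digests.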

1
mscdex On

It matches for me.

Python 2.7.3:

Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> msg = 'abc'
>>> sha256_hash = hashlib.sha256()
>>> sha256_hash.update(msg)
>>> hash_digest = sha256_hash.hexdigest()
>>> print hash_digest
ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
>>>

Node v0.10.30:

> crypto.createHash('sha256').update('abc').digest('hex')
'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad'

Both hex strings match.

0
basebandit On

I had a similar situation converting the Python HMAC-SHA256 function below to its Node.js equivalent:

import hashlib
import hmac

def HmacSha256(key, sign):
    # Return the raw (binary) HMAC-SHA256 digest of sign under key.
    return hmac.new(key, sign, hashlib.sha256).digest()

hash = HmacSha256("\0"*32, rawMsg)
print hash

Sample output of the snippet above.

python test.py sasa
_��"/���q���h�u$�k�w�)R]n�mf�

This is the string representation of the bytes you get after hashing. Its Node.js equivalent was as simple as:

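const crypto = require("crypto");

function HmacSha256(key, sign) {
  // Return the raw HMAC-SHA256 digest as a Buffer.
  return crypto
    .createHmac("sha256", key)
    .update(sign)
    .digest();
}

// toString() decodes the Buffer as UTF-8 by default.
const hash = HmacSha256("\0".repeat(32), rawMsg).toString();
console.log(hash);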

Sample output of the Node.js snippet above:

node test.js sasa
_��"/���q���h�u$�k�w�)R]n�mf�

Note that the outputs are the same. All I had to do was convert the Buffer returned by HmacSha256("\0".repeat(32), rawMsg) to a string. I am using Node v8.11.2 and Python 2.7.15rc1.
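One caveat when comparing by eye: every invalid byte prints as the same replacement character, so two different digests could in principle look alike in a terminal. A hex comparison avoids that. Below is a minimal sketch, assuming Python 3; the message `b"sasa"` is a hypothetical stand-in, since the original rawMsg is not shown.

```python
import hashlib
import hmac

key = b"\0" * 32     # the same all-zero 32-byte key as in the snippets above
raw_msg = b"sasa"    # hypothetical message; the original rawMsg is not shown

mac = hmac.new(key, raw_msg, hashlib.sha256)
raw_digest = mac.digest()     # 32 raw bytes, not meant to be printed as text
hex_digest = mac.hexdigest()  # 64 hex characters, safe to print and compare

print(hex_digest)
```

On the Node side, passing 'hex' to digest() (as in the earlier createHash example) should give a string that can be compared directly with this one.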