Convert strings to 16Bit UTF integers in Javascript

1k views Asked by At

I'm working with firebase firestore and for some reason I need to hash every new user document ID to integer specifically a 16Bit UTF integer. Mostly it's an attempt at duplicating the .hashcode method in Dart.

what I have found and tried is this

// Convert to 32bit integer
const hashFunction = (string) => {
  var hash = 0;
  if (string.length == 0) return hash;

  for (i = 0; i < string.length; i++) {
    char = string.charCodeAt(i);
    hash = (hash << 5) - hash + char;
    hash = hash & hash;
  }
  return hash >>> 0;   // returns hash as positive integer
};

but this only converts to 32bit int. Any and all help is appreciated.

EDIT1: Using @selbie's changes i made some progress but the hash i'm getting is quite different from the .hashCode method in Dartthis is dart's output and this is JS output

is there anyway i can get thesame result

1

There are 1 answers

1
selbie On

If you only want a 16-bit hash, then:

Instead of this:

return hash >>> 0;   // returns hash as positive integer

This:

return hash & 0xffff; // return lower 16-bits of hash value

This line:

if (string.length == 0) return hash;

Isn't needed since your for-loop will simply not run if string.length is 0.

Finally, not a good idea to have a variable name the same as a type. So let's just rename string to be s.

Here's a simplified solution:

const hashFunction = (s) => {
  var hash = 0;

  for (i = 0; i < s.length; i++) {
    hash = (hash << 5) - hash + s.charCodeAt(i);
    hash = hash & hash; // prevent overflow from happening
  }
  return hash & 0xffff;   // returns lower 16-bit of hash value
};

With that in mind, a 16-bit hash isn't a very strong hash. There are sure to be collisions after a few thousand strings are hashed. Consider a crytographically strong hash or a larger hash width as appropriate.