In my use case, a JavaScript tracker generates a unique ID for each visitor whenever they visit the site, using the following function:
function generateUUID() {
    return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
        var r = Math.random() * 16 | 0, v = c == 'x' ? r : (r & 0x3 | 0x8);
        return v.toString(16);
    });
}
It generates strings in the RFC 4122 (version 4) format, like this:
"3314891e-285e-40a7-ac59-8b232863bead"
Now I need to encode that string as a number (e.g. a BigInteger in Java) that can be read by Mahout. Likewise, I need to restore the original string (in PHP) to display results. Is there a fast, consistent and reliable way to do that?
Some solutions I have considered:
- Mapping each possible char (hex digit + '-') to a number [1..M] and combining the values according to each char's position
- Getting two longs from an MD5 hash
- Keeping a hash map in memory
Any ideas appreciated!
If Mahout can use a compound ID of two longs, you can use the most significant and least significant 64 bits of the UUID as the two longs.
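A minimal sketch of the two-long approach, assuming the ID always arrives in the canonical 8-4-4-4-12 hex form shown in the question and that Java's standard java.util.UUID class is acceptable. The class name UuidAsTwoLongs and its helper methods are hypothetical names chosen for illustration. The split is lossless, so the original string can be rebuilt exactly from the two values.

```java
import java.util.UUID;

// Hypothetical helper class, for illustration only.
class UuidAsTwoLongs {

    // Splits a canonical UUID string into its high and low 64-bit halves.
    static long[] toLongs(String uuidString) {
        UUID uuid = UUID.fromString(uuidString);
        return new long[] {
            uuid.getMostSignificantBits(),
            uuid.getLeastSignificantBits()
        };
    }

    // Rebuilds the canonical lowercase string from the two halves.
    static String fromLongs(long high, long low) {
        return new UUID(high, low).toString();
    }

    public static void main(String[] args) {
        String original = "3314891e-285e-40a7-ac59-8b232863bead";
        long[] parts = toLongs(original);
        // The two halves are signed longs, so either may print as negative.
        System.out.println(parts[0] + " " + parts[1]);
        // Round trip: prints the original string again.
        System.out.println(fromLongs(parts[0], parts[1]));
    }
}
```

Because nothing is lost, the display side does not need a lookup table: formatting each half as 16 zero-padded hex digits and reinserting the dashes gives back the original ID.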
If you really are stuck with one long, then I'd agree with your idea to use a portion of a hash based on the entire UUID.
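A rough sketch of the single-long variant, assuming MD5 (as suggested in the question) and taking the first 8 bytes of the digest as the ID. The class name UuidAsOneLong, its methods, and the in-memory map are hypothetical and only illustrate the idea. Truncating a hash discards information, so the long cannot be turned back into the UUID by computation; some mapping from long to UUID has to be kept, which is where the hash map from the question comes in.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class UuidAsOneLong {

    // Reverse lookup: a truncated hash cannot be inverted, so remember the source.
    private static final Map<Long, String> idToUuid = new HashMap<>();

    // Hashes the whole UUID string and keeps the first 8 of MD5's 16 bytes.
    static long toLong(String uuidString) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(uuidString.getBytes(StandardCharsets.UTF_8));
            long id = ByteBuffer.wrap(digest).getLong(); // big-endian, bytes 0..7
            idToUuid.put(id, uuidString);
            return id;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns the UUID previously registered for this ID, or null if unknown.
    static String toUuid(long id) {
        return idToUuid.get(id);
    }
}
```

The same input always yields the same long, so the mapping is consistent across runs. With 64 bits, collisions between distinct UUIDs are possible in principle but should only become a practical concern at billions of IDs. In a real setup the map would more likely live in a database table that both the Java and PHP sides can read.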