I have a problem with using a hash function. I have to assign a number (128-bit or 64-bit) to every word in a document, and the hash values of similar words must be close to each other. For example, if the hash value of "similarity" is 10022 (say), then the hash value of "similar" should be something like 10025. Words of the same kind should also get nearby values: the hash value of a name such as "john" should be close to that of "michel" or "sita", and so on. Does anybody have an idea how to do this?
Thanks in advance. :)
It doesn't work that way. First you have to fit a general model to a sample of the available data, and then apply that model to the streaming log messages.
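
To make this concrete: an ordinary hash function deliberately scatters similar inputs, so "similar" and "similarity" end up with unrelated values. One way to get part of what the question asks for is a SimHash-style fingerprint over character n-grams. This is a swapped-in technique, not something from the posts above, and the names `char_ngrams`, `simhash` and `hamming` are made up for this sketch. Note the assumptions: closeness is measured as Hamming distance (number of differing bits), not numeric distance like 10022 vs 10025, and it only captures spelling overlap. Grouping "john" with "sita" needs a model learned from data, which this does not do.

```python
import hashlib


def char_ngrams(word, n=2):
    """Split a word into overlapping character n-grams, with ^ and $ as boundary marks."""
    padded = f"^{word.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def simhash(word, bits=64, n=2):
    """SimHash-style fingerprint: words sharing many n-grams tend to share many bits."""
    counts = [0] * bits
    for gram in char_ngrams(word, n):
        # Hash each n-gram to a `bits`-wide integer (bits must be a multiple of 8, at most 512).
        digest = hashlib.blake2b(gram.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Each n-gram votes +1 or -1 on every bit position.
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when the votes for it are positive.
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    base = simhash("similar")
    for other in ["similarity", "john", "sita"]:
        print(other, hamming(base, simhash(other)))
```

Words with overlapping spelling should usually show a smaller distance to "similar" than unrelated words do, but this is a statistical tendency rather than a guarantee. Pass `bits=128` for a 128-bit value.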