SimHash implementation in Java?


Has anyone come across a simhash function implemented in Java?

I've already searched for it, but couldn't find anything.


There are 4 answers

1
Aaron Digulla On

According to this page, you should ask the developers of BibSonomy.

4
jitter On

By the way, it looks like Google has patented the algorithm. If you are in the US, compete successfully with Google, and do not have your own patent portfolio, then do not tell them you are using it.

An implementation in C: http://dsrg.mff.cuni.cz/~holub/sw/shash/


[Removed no longer relevant BibSonomy text]

1
aNeurone On

Here you can find the full Java source code. It's very simple, and a demo is also provided: http://aneurone.blogspot.com/2012/09/simhash.html

0
otmar On

In the meantime, the hash4j library has added a SimHash implementation for Java. It also has a FastSimHash implementation, which is up to 10x faster thanks to a bit hack described in this blog post.
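If you only want to see the idea rather than pull in a dependency, here is a minimal sketch of the textbook SimHash scheme in plain Java. It does not use hash4j and does not reflect its API. The class name `SimHashSketch`, the whitespace tokenizer, and the choice of 64-bit FNV-1a as the per-token hash are all assumptions made for illustration; a real implementation would likely use weighted features and a stronger hash.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Illustrative sketch only: names and design choices here are hypothetical.
final class SimHashSketch {

    // 64-bit FNV-1a, chosen only because it is short and dependency-free.
    static long fnv1a64(String token) {
        long hash = 0xcbf29ce484222325L;          // FNV offset basis
        for (byte b : token.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;               // FNV prime
        }
        return hash;
    }

    // Computes a 64-bit SimHash fingerprint over whitespace-separated tokens.
    static long simHash(String text) {
        int[] counters = new int[64];
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue;                         // skip empty tokens from leading whitespace
            }
            long h = fnv1a64(token);
            // Each token votes +1 or -1 on every bit position.
            for (int bit = 0; bit < 64; bit++) {
                if (((h >>> bit) & 1L) == 1L) {
                    counters[bit]++;
                } else {
                    counters[bit]--;
                }
            }
        }
        // A fingerprint bit is set where the positive votes won.
        long fingerprint = 0L;
        for (int bit = 0; bit < 64; bit++) {
            if (counters[bit] > 0) {
                fingerprint |= (1L << bit);
            }
        }
        return fingerprint;
    }

    // Number of differing bits; a small distance suggests similar inputs.
    static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        long a = simHash("the quick brown fox jumps over the lazy dog");
        long b = simHash("the quick brown fox jumped over the lazy dog");
        System.out.println("Hamming distance: " + hammingDistance(a, b));
    }
}
```

Two documents are then treated as near-duplicates when the Hamming distance between their fingerprints falls below a threshold that you pick for your data.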