Hashing n-grams by cyclic polynomials - java implementation

1.5k views Asked by At

I'm solving some problem that involves Rabin–Karp string search algorithm. This algorithm requires rolling hash to be faster then naive search. This article describes how to implement rolling hash. I implemented "Rabin-Karp rolling hash" without problems and found few implementations implementations, but article also mentions computational complexity and that hashing n-grams by cyclic polynomials is prefered. It links to BuzHash implementation of such technique but I wonder how it can be used to build n-gram hash on top of it. I want to have something like this or

CPHash cp = new CPHash("efghijk");
cp.shiftRight('l') // now we got hash of "fghijki"
cp.shiftLeft('d') // "defghi"

for java.

For people who will encounter problems related with string search (like me) there are some articles that I found usefull: 1, 2, 3

1

There are 1 answers

0
Daniel Lemire On BEST ANSWER

I recently published an Apache licensed Java library which implements several rolling hash functions including Cyclic and Rabin-Karp:

http://code.google.com/p/rollinghashjava/

https://github.com/lemire/rollinghashjava