Hashes generated by Rabin Karp Rolling Hash not reflecting on the Text

Question

Hashes generated by Rabin Karp Rolling Hash not reflecting on the Text

622 views Asked by Nitish Upreti At 15 January 2012 at 07:09

Note: Lots of Possible duplicates, but nothing seems to be solving my problem.

I am working on a Plagiarism detection based on MOSS.

After successfully implementing a filter which strips out all the necessary details(comments,punctuations etc) I hash the content using a Rolling Hash Implementation(Rabin Karp)

However the Hashes which match in two text-files of source code, have very different underlying text(No plagiarism and yet same hashes)

The Algorithm I implemented(Ruby) --> (Partial Snippet)

 #Preprocessing from RobinKarp Algorithm
  for c in 0...k do
    text_hash=(radix*text_hash+text_to_process[c].ord)%q
  end

  #Main loop
  for c in 0...loop do   
        text_hash=((radix*text_hash-(text_to_process[c].ord)*highorder)+(text_hash[c+k].ord))%q

Is there an issue with my implementation? Or the Parameters I specify can be at fault?

I take radix=34 ( I am not sure if it is the right value, I am assuming the stripped out text will only contain alphabets+some special charcters like '+','-','*','/' so a rough estimate of total 34 characters)

I am taking q(prime) to be 101

Is this a collision issue I am dealing with? Any pointers as to how to tackle the problem?

Original Q&A

There are 1 answers

**mcdowella** · Accepted Answer · 2012-01-15T08:07:38+00:00

I note that with q = 101, there are only 101 possible hash values - 0, 1, 2...100. Have you tried increasing q? Another approach would be to look and see if the hash values look like they are randomly chosen values within the possible values of 0,1..q-1.

You should of course also test your program on cases where there are repeated strings for it to find - a failure there could be another symptom of any problem that is also causing collisions, and it would be easier to find and debug.

TechQA.

Hashes generated by Rabin Karp Rolling Hash not reflecting on the Text

There are 1 answers

Related Questions in RUBY

Related Questions in ALGORITHM

Related Questions in PLAGIARISM-DETECTION

Related Questions in RABIN-KARP

Popular Questions

Popular Tags

Trending Questions