I am looking at various algorithms' efficiency: not just big-O efficiency, but practical efficiency. I was testing a Rabin-Karp implementation I wrote against a brute-force string comparison, and the brute-force approach was almost consistently twice as fast. Why is this? I thought a rolling hash was supposed to be much more efficient.
My only guess is that my Rabin-Karp implementation just sucks, but otherwise I'm not sure. I've tested it on very large strings (400 KB) and the results are always the same.
For example, on a 400 KB string:
- brute force - 0.023785114288330078 seconds
- rabin karp - 0.041351318359375 seconds
The code for both is below:
    def rabinKarp(self, inpStr, containStr) -> bool:
        m, n = len(containStr), len(inpStr)
        if m > n:
            return False
        d, p = 52, 2**16 + 7
        # hash of the pattern
        searchHash = 0
        for ch in containStr:
            searchHash = (searchHash * d + ord(ch)) % p
        # hash of the first window
        currHash = 0
        for i in range(m):
            currHash = (currHash * d + ord(inpStr[i])) % p
        # weight of the leading character, reduced mod p
        h = pow(d, m - 1, p)
        for i in range(m, n):
            # compare the strings on a hash hit to rule out spurious collisions
            if currHash == searchHash and inpStr[i - m:i] == containStr:
                return True
            currHash = (currHash - ord(inpStr[i - m]) * h) % p
            currHash = (currHash * d + ord(inpStr[i])) % p
        # the loop above never tests the final window, so check it here
        return currHash == searchHash and inpStr[n - m:] == containStr
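As a sanity check that the rolling update in the loop matches hashing each window from scratch, here is a standalone sketch (same `d` and `p` as above; the string and pattern length are arbitrary choices):

```python
# Same constants as in rabinKarp above.
d, p = 52, 2**16 + 7

def window_hash(s):
    """Polynomial hash of s computed from scratch."""
    h = 0
    for ch in s:
        h = (h * d + ord(ch)) % p
    return h

s, m = "abcdefgh", 3          # arbitrary test string and window size
h = d ** (m - 1)              # weight of the leading character
curr = window_hash(s[:m])     # hash of the first window
for i in range(m, len(s)):
    # drop the leading character, then shift in the next one
    curr = (curr - ord(s[i - m]) * h) % p
    curr = (curr * d + ord(s[i])) % p
    assert curr == window_hash(s[i - m + 1: i + 1])
```

If these assertions pass, the rolling recurrence itself is correct, so any slowdown is not coming from recomputing wrong hashes.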
    def program(self, stringInp, matchInp):
        if len(matchInp) > len(stringInp):
            return False
        for i in range(len(matchInp), len(stringInp) + 1):
            if stringInp[i - len(matchInp): i] == matchInp:
                return True
        return False
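For reference, this is roughly how I measure them. The harness below is a minimal sketch: the `Solution` class name in the comments and the worst-case input (an absent needle, forcing a full scan of ~400 KB) are assumptions, not part of the code above.

```python
import random
import string
import time

def bench(fn, haystack, needle, reps=5):
    """Time fn(haystack, needle) over several repetitions, return the best run."""
    best = float("inf")
    for _ in range(reps):
        start = time.perf_counter()
        fn(haystack, needle)
        best = min(best, time.perf_counter() - start)
    return best

# Worst-case input: ~400 KB of random lowercase text and a needle that
# cannot occur (uppercase), so both searches must scan the whole string.
random.seed(0)
haystack = "".join(random.choices(string.ascii_lowercase, k=400_000))
needle = "Z" * 20

# Plug in the two methods above, e.g. (assuming they live on a Solution class):
# print("brute force:", bench(lambda s, m: Solution().program(s, m), haystack, needle))
# print("rabin karp: ", bench(lambda s, m: Solution().rabinKarp(s, m), haystack, needle))
```

Taking the best of several repetitions reduces noise from other processes, which matters when the two timings are only a factor of two apart.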
I am just asking out of personal curiosity.