As the size of my substring grows, how can I find the complexity of this section of code?
if size > 160:
    sub = hashlib.sha1(sub.encode('utf-8')).hexdigest()
I became curious when I noticed my program running as if the hash function executed in constant time. For my program, if `size` is 165, the code above will execute up to 165 times in the worst case. A test I've just done shows sha1 executing with an unstable relationship to input length:
Length    Time (s)
0         0
1         0.015000105
2         0.016000032
3         0.046000004
4         0.046999931
5         0.062000036
6         0.078000069
7         0.078000069
8         0.07799983
9         0.108999968
Test code:
import string
import random
import hashlib
import time

def randomly(size=6, chars=string.ascii_uppercase + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))

for i in range(1, 10000001, 1000000):
    random_str = randomly(i)
    start = time.time()
    str_hash = hashlib.sha1(random_str.encode('utf-8')).hexdigest()
    print(time.time() - start)
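(Editorial note: SHA-1 processes its padded input in fixed 512-bit / 64-byte chunks, so the work grows linearly with message length. A minimal sketch of the chunk count, assuming standard SHA-1 padding — one 0x80 byte, zero fill, and an 8-byte length field, rounded up to a multiple of 64 bytes:)

```python
def sha1_chunks(n: int) -> int:
    """Number of 64-byte chunks SHA-1 processes for an n-byte message.

    Padding appends one 0x80 byte, zero bytes, and an 8-byte length,
    rounding the padded total up to a multiple of 64 bytes.
    """
    return (n + 8) // 64 + 1

# Chunk count grows in direct proportion to the message length.
for n in (0, 55, 56, 64, 1_000_000):
    print(n, sha1_chunks(n))
```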
I don't agree with DarthGizka. Here is more of the description from the same Wikipedia article: the padding is only a pre-processing step, and most of the work is done inside the "for each chunk" loop. Since mattkaeo's data sizes are larger than 1,000,000 chars (except for the first one), the chunk loop should consume most of the time, while the cost of padding is negligible. mattkaeo's result is not very linear, I believe, because he only ran each sample once, so system noise (e.g. the OS and other processes sharing CPU) is significant. I ran each sample 200 times:
The result is much more linear: