understanding Universal hashing chapter on CLRS

631 views Asked by At

Hi I am reading the chapter about universal hashing on CLRS.

On page 234

Corollary 11.4

Using universal hashing and collision resolution by chaining in a table with m slots, it takes expected time Theta(n) to handle any sequence of n INSERT, SEARCH and DELETE operations containing O(m) INSERT operations.

I kinda get the idea but I have a hard time to understand this English sentence. What does the end "containing O (m) INSERT operations" mean?

Does it mean n = O(m) insertion was performed already. Then, .... I don't know. I can't parse this sentence. What is the difference between the 1st INSERT and 2nd INSERT?

I would like to hear your opinion.

Thanks!

2

There are 2 answers

1
mcdowella On BEST ANSWER

I think that there is only one sequence of n insert, search, and delete operations, but the parameter m is used to limit the number of insert operations you are allowed to put within those n operations. Suppose that you had a table of size 10, so m=10, and then you set n=1 000 000, with the first 500 000 operations inserts, and the next 500 000 searches for an item not in the table. Then the performance would be very poor, because the table would have chains about 100 000 items long.

So if you have a table with m slots, the theorem only allows you about m insert operations, so that the table never holds more than about m items, and the chains won't get too long, and all the operations are pretty much O(1) - in the example above you can only have about 10 insert operations, so the other 999 990 operations must be either search or delete.

0
Omri Barel On

Saying "O(m)" INSERT operations means that there are C*m+B INSERT operations in the sequence, where m is the number of slots.

You have a sequence of n operations on a table with m slots. The number of INSERT operations is proportional to the number of slots (regardless of the length of the sequence), and not to the number of operations.

In other words, the expected time is theta(n) only if the number of insertions is not some function of n (for example, half of the operations in the sequence are insertions) but a linear function of the number of slots, and it doesn't grow with the length of the sequence.