Hash function for Unordered associative containers in C++

Question

Hash function for Unordered associative containers in C++

460 views Asked by Shambo At 07 December 2013 at 02:22

In C++ for every unordered associative containers( like unordered_map, unordered_set, unordered_multimap) we need to define a hash function. As pointed out by Wikipedia,

struct X{int i,j,k;};

struct hash_X{
  size_t operator()(const X &x) const{
    return hash<int>()(x.i) ^ hash<int>()(x.j) ^ hash<int>()(x.k);
  }
};

struct hash_X is a custom hashing function for struct X. But what does this function do? Why do we need a hash function? Can there be any other type of custom hashing functions? If so how do we compare the efficiency between any two such functions.

Original Q&A

There are 2 answers

Erbureth On 07 December 2013 at 02:38

Short answer: To look-up the elements really fast.

As opposed to ordered containters, which store elements into some form of red-black trees (or another AVL tree), unordered ones uses indexed buckets to contain the nodes. Retrieving a bucket by index has O(1) complexity.

Hash function is a function, that takes a element and transform it into such integer index.

Consequently, as a domain of indices is smaller than domain of all the elements, collision may occur and more elements can be placed into one bucket, and thus reducing the effectiveness of element look-up. Therefore the least probability of a collision is definitely the property of a hash function to strive for. Another one should be the effectiveness of the hash computation.

See Perfect Hash Function for some more analysis

**Joe Z** · Accepted Answer · 2013-12-07T02:38:42+00:00

The goal of a hashing function is to map the contents of an arbitrary data structure to an integer in such a way that most of the items you're likely to encounter map to different integers, and that the complete set of items you're likely to encounter together spreads out evenly over the set of integers. With such a function in hand, it becomes easy to build a container (such as unordered_map) that looks up arbitrary items very quickly.

I realize that definition is somewhat abstract. More concretely, consider the example you gave above from Wikipedia. It XORs the i, j and k fields of the structure together to form a hash value. This is a valid hashing function (it merged the structure down to a single integer). But, if i, j and k all have similar ranges of values, then it may not be a terribly great hashing function. For example, (1,2,3) and (3,1,2) both will hash to the same value.

An ideal hashing function usually looks more like a random number generator: For predictable inputs, it gives seemingly random outputs. (But remember, the same input must always give the same output.) The best hash function for your data structure really depends on what sort of data you'll be hashing.

This set of lecture notes looks like it covers most of the important points: http://www.cs.cornell.edu/Courses/cs312/2008sp/lectures/lec21.html

You can find others by googling.

TechQA.

Hash function for Unordered associative containers in C++

There are 2 answers

Related Questions in C++

Related Questions in HASH

Related Questions in UNORDERED-MAP

Related Questions in UNORDERED-SET

Related Questions in HASH-FUNCTION

Popular Questions

Popular Tags

Trending Questions