Preferred way of structuring a Redis hash store

I have a number of objects; each object belongs to one or more categories. For each object, I have a serialized string representing the properties of the relevant object-category pair.

I would like to set up a store to retrieve those properties, knowing the category id and object id. I could set up the store in the following ways:

1: Fewer keys, more fields per key

  • Have one hash per category, with key category:{category_id}. Each field name is an object_id, and each field value is that object-category pair's JSON-stringified properties.

2: More keys, one field per key

  • Have one hash per pair, with key category-object:{category_id}-{object_id}. Each hash stores a single field whose value is that category-object pair's JSON-stringified properties.

I would like to compare the two approaches' performance properties, but I cannot find any meaningful differences. Are there any advantages of one approach over the other?

Note that the value format is a given: the stored value needs to be the JSON-stringified properties.

There is 1 answer

LeoMurillo On BEST ANSWER

You should expect slightly better performance from one key per pair (category-object:{category_id}-{object_id}) with the SET and GET commands than from a hash with key name category:{category_id} and field name {object_id} with the HSET and HGET commands.
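As a minimal sketch of the two access patterns being compared, the helpers below write and read one pair either way. They assume a redis-py style client `r` created with `decode_responses=True` (so reads return `str`). All function names are hypothetical and only for illustration.

```python
import json


def category_key(category_id):
    # Hash layout: one key per category.
    return f"category:{category_id}"


def pair_key(category_id, object_id):
    # String layout: one key per category-object pair.
    return f"category-object:{category_id}-{object_id}"


def save_in_hash(r, category_id, object_id, props):
    # HSET category:{category_id} {object_id} <json>
    r.hset(category_key(category_id), str(object_id), json.dumps(props))


def load_from_hash(r, category_id, object_id):
    # HGET category:{category_id} {object_id}
    raw = r.hget(category_key(category_id), str(object_id))
    return None if raw is None else json.loads(raw)


def save_as_string(r, category_id, object_id, props):
    # SET category-object:{category_id}-{object_id} <json>
    r.set(pair_key(category_id, object_id), json.dumps(props))


def load_as_string(r, category_id, object_id):
    # GET category-object:{category_id}-{object_id}
    raw = r.get(pair_key(category_id, object_id))
    return None if raw is None else json.loads(raw)
```

With a real server, `r` could be something like `redis.Redis(decode_responses=True)`. Both read paths return the decoded properties, or `None` when the pair is missing.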

Both GET and HGET have O(1) time complexity. GET runs the hashing function once on the key name and does one lookup in the hash map. HGET does the same twice: once to find the hash by its key name, and once to find the field inside that hash. Hence a slight difference.
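The one-lookup versus two-lookup difference can be pictured with plain Python dictionaries. This is only a toy model of the reasoning above, not a description of Redis internals (for instance, Redis stores small hashes in a compact encoding rather than a hash table). The names `keyspace`, `toy_get` and `toy_hget` are made up for the illustration.

```python
# Toy keyspace: a dict standing in for the top-level Redis key table.
keyspace = {
    "category-object:7-42": '{"color": "red"}',   # string layout
    "category:7": {"42": '{"color": "red"}'},     # hash layout
}


def toy_get(key):
    # One hashed lookup: key name -> value.
    return keyspace.get(key)


def toy_hget(key, field):
    # First lookup: key name -> inner hash.
    inner = keyspace.get(key)
    if inner is None:
        return None
    # Second lookup: field name -> value.
    return inner.get(field)
```

Both calls return the same stored string; `toy_hget` simply pays for one extra dictionary lookup on the way.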

But this difference is negligible next to the network round-trip time, and I wouldn't use it as a criterion for the design decision.

Instead, other criteria would be:

  • Would I ever need to get all objects for a given category? Then use the first approach, as one command will get you that (HGETALL) whereas in the second approach you would need to SCAN and then GET.
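  • Would I ever need to EXPIRE a given {category_id}-{object_id}? This would imply the second approach, because EXPIRE applies to whole keys and not to individual hash fields. (Redis 7.4 added HEXPIRE for per-field expiration; on older versions there is no equivalent.)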
  • Would I ever need to get all the {category_id}-{object_id} values for a given object? This may tilt the balance toward the second approach, because there I only need to SCAN with a matching pattern. In the first approach I would need to SCAN for the category hashes and then HSCAN on each result.
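The criteria above can be sketched in Python as follows. This is an illustration, not a definitive implementation. It assumes a redis-py style client `r` with `decode_responses=True`, and ids that contain no hyphens or glob characters. The function names are hypothetical. For the first approach's per-object lookup it uses HGET on each category hash instead of HSCAN, which is a substitution on my part.

```python
import json


def category_objects_first(r, category_id):
    # First approach: a single HGETALL returns every object in the category.
    raw = r.hgetall(f"category:{category_id}")
    return {object_id: json.loads(value) for object_id, value in raw.items()}


def category_objects_second(r, category_id):
    # Second approach: SCAN for the category's keys, then GET each one.
    prefix = f"category-object:{category_id}-"
    result = {}
    for key in r.scan_iter(match=prefix + "*"):
        value = r.get(key)
        if value is not None:  # the key may vanish between SCAN and GET
            result[key[len(prefix):]] = json.loads(value)
    return result


def expire_pair_second(r, category_id, object_id, seconds):
    # EXPIRE works on whole keys, which fits the second approach.
    return r.expire(f"category-object:{category_id}-{object_id}", seconds)


def object_pairs_second(r, object_id):
    # Second approach: one SCAN pattern finds every category of the object.
    prefix, suffix = "category-object:", f"-{object_id}"
    result = {}
    for key in r.scan_iter(match=f"{prefix}*{suffix}"):
        value = r.get(key)
        if value is not None:
            result[key[len(prefix):-len(suffix)]] = json.loads(value)
    return result


def object_pairs_first(r, object_id):
    # First approach: SCAN all category hashes, then look inside each one.
    # Assumes every key matching category:* is one of these hashes.
    prefix = "category:"
    result = {}
    for key in r.scan_iter(match=prefix + "*"):
        value = r.hget(key, str(object_id))
        if value is not None:
            result[key[len(prefix):]] = json.loads(value)
    return result
```

Note that both per-object functions walk the keyspace with SCAN, so their cost grows with the total number of keys, not just with the number of matches.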

If you want to run some actual benchmarks, see "How fast is Redis?"
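For a quick client-side comparison, a rough timing helper along these lines may be enough. It is a sketch rather than a proper benchmark: it measures the whole call including the network round trip, so it mostly confirms how small the GET versus HGET difference is. The name `time_calls` is hypothetical.

```python
import time


def time_calls(fn, n=10_000):
    # Call fn() n times and return the average wall-clock seconds per call.
    start = time.perf_counter()
    for _ in range(n):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / n
```

Assuming a redis-py style client `r` and the key names from the question, the two layouts could be compared with `time_calls(lambda: r.get("category-object:7-42"))` and `time_calls(lambda: r.hget("category:7", "42"))`.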