GROUP BY-Like query in DynamoDB with DynamoDbEnhancedClient

682 views Asked by At

I'm trying to write a Query in DynamoDB with logic similar to a GROUP BY in SQL. Let's consider the following table with partition key simulation_id:

Simulation (Table):
   simulation_id_1 - type_1 - ...
   simulation_id_1 - type_1 - ...

   simulation_id_1 - type_2 - ...
   simulation_id_1 - type_2 - ...
   simulation_id_1 - type_2 - ...

   simulation_id_2 - type_2 - ...
   simulation_id_2 - type_2 - ...
   simulation_id_2 - type_2 - ...
   simulation_id_2 - type_2 - ...

   simulation_id_2 - type_3 - ...

The result would be:


simulation_id_1 - type_1 - 2 (count of entries)
simulation_id_1 - type_2 - 3
simulation_id_2 - type_2 - 4
simulation_id_2 - type_3 - 1

What is the best way for achieving that using DynamoDbEnhancedClient?

1

There are 1 answers

0
Leeroy Hannigan On BEST ANSWER

GROUP BY or SUM or any other type of math metric is not supported by DynamoDB and is considered somewhat of an anti-pattern.

With that said, there is 2 ways to achieve your use-case needs.

  1. You can execute a Query operation and set the Select parameter to COUNT DOCS. This will return the count of all the items which match a given simulationId, however, this will require the Query to read all of the items which relate to that simulationId, only returning the count. It will also require you to use a lower level client I believe.
  2. The second option is the most favourable, use DynamoDB Streams and a Lambda function to keep an "aggregate" record for you simulationId. Meaning for every item you add or remove, you increment or decrement the aggregate record. This will allow you to get the sum of all records in an efficient and cost-effective way.
PK SK Agg
simulation_id_1 aggregate 3
simulation_id_1 type1
simulation_id_1 type2
simulation_id_1 type3
simulation_id_2 aggregate 1
simulation_id_2 type1