Modeling a NoSQL database (DynamoDB) for different sorting queries

I have a fairly specific question about how to model my DynamoDB tables so that I can handle the necessary queries.

My application is centered on the idea of "events." Every event has attributes (name, location, time, number of attendees, etc). Events are associated with the cities in which they are located. I am trying to figure out how to perform a get / query request (probably a series of get / query requests) to obtain the top 25 events with the most attendees for a specific city.

I come from a relational database background, where this would be a really simple query (select * from events where city = x order by attendees desc limit 25). But I am having a hard time figuring out how to do the same with a non-relational database. I know I will have to create additional tables to store mappings of hashes, but I can't seem to figure it out.

One way I have thought of implementing it is to somehow let the "attendees" (of Number type) be the range key, and let the city be the hash key. But this will not necessarily be a unique key because multiple events in the same city could have the same number of attendees. Also, is it even possible to "update / atomically increment" a range key?

Thanks for all your help!

There is 1 answer

Chen Harel

Note: I still believe an RDBMS is better suited to these queries, but here goes:

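First, you can only atomically increment a non-key attribute. A hash or range key cannot be updated in place; changing it means deleting the item and writing a new one.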
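As a rough illustration of an atomic increment on a non-key attribute, the sketch below only builds the parameters for a DynamoDB UpdateItem request. It assumes the Events table from this answer, a Number-typed eventId key, and an attribute named attendees; the helper name build_attendee_increment is hypothetical.

```python
def build_attendee_increment(event_id: int, delta: int = 1) -> dict:
    """Build UpdateItem parameters that atomically add `delta` to `attendees`.

    Assumptions (not from the original post): the table is named "Events",
    its hash key `eventId` is a Number, and the counter attribute is called
    `attendees`.
    """
    return {
        "TableName": "Events",
        "Key": {"eventId": {"N": str(event_id)}},
        # ADD on a Number attribute is applied atomically on the server side.
        "UpdateExpression": "ADD #attendees :delta",
        "ExpressionAttributeNames": {"#attendees": "attendees"},
        "ExpressionAttributeValues": {":delta": {"N": str(delta)}},
        "ReturnValues": "UPDATED_NEW",
    }


# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("dynamodb").update_item(**build_attendee_increment(42))
```

Because the key itself is not part of the update expression, this works for attendees stored as a plain attribute but not for attendees used as a range key.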
Now for your scenario I would suggest the following:

Table: Events
hk: eventId
attributes

Table: Top_Attendees_Per_City
hk: city
rk: eventId

Table: Event_Id_Generator
hk: event_counter
running_counter

Table: Minimum_Attendees_Per_City
hk: city
min_attendees_number, max_attendees_number, events_number
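The four tables above could be written down as CreateTable parameters roughly as follows. This is only a sketch: the attribute types (city and event_counter as String, eventId as Number), the on-demand billing mode, and the helper name table_definition are assumptions, not part of the original layout.

```python
def table_definition(name, hash_key, range_key=None):
    """Build CreateTable parameters for one table.

    `hash_key` and `range_key` are (attribute_name, attribute_type) pairs,
    where the type is "S" (String) or "N" (Number). Types are assumed here.
    """
    keys = [(hash_key[0], hash_key[1], "HASH")]
    if range_key is not None:
        keys.append((range_key[0], range_key[1], "RANGE"))
    return {
        "TableName": name,
        "KeySchema": [
            {"AttributeName": attr, "KeyType": kind} for attr, _, kind in keys
        ],
        # Only key attributes are declared; other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": attr, "AttributeType": typ} for attr, typ, _ in keys
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption; provisioned also works
    }


TABLES = [
    table_definition("Events", ("eventId", "N")),
    table_definition("Top_Attendees_Per_City", ("city", "S"), ("eventId", "N")),
    table_definition("Event_Id_Generator", ("event_counter", "S")),
    table_definition("Minimum_Attendees_Per_City", ("city", "S")),
]
```

Each dictionary could be passed to a DynamoDB client's create_table call; the non-key attributes (running_counter, min_attendees_number, and so on) need no declaration up front.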

Once an event reaches your backend, assign it a running id (from Event_Id_Generator). This is not mandatory, and it has scaling issues, but it ensures that when events have the same number of attendees, newer events take precedence in your top 25.
You then check whether the event's number of attendees falls between the min and max stored for its city, while counting the events up to 25. This lets the Minimum_Attendees_Per_City row decide whether the new event belongs in the top 25. If it does, add it to Top_Attendees_Per_City.
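One possible reading of this admission check, kept entirely in memory so it can be run without DynamoDB, is sketched below. The names CitySummary, summarize, qualifies_for_top, and admit are hypothetical, and the rule "admit while fewer than 25 events are stored, otherwise require at least the current minimum" is an interpretation of the description rather than the author's exact logic.

```python
from dataclasses import dataclass

TOP_N = 25


@dataclass
class CitySummary:
    """Mirrors one Minimum_Attendees_Per_City row."""
    min_attendees_number: int = 0
    max_attendees_number: int = 0
    events_number: int = 0


def summarize(top):
    """Compute the summary row from a list of (attendees, event_id) pairs."""
    if not top:
        return CitySummary()
    counts = [attendees for attendees, _ in top]
    return CitySummary(min(counts), max(counts), len(top))


def qualifies_for_top(summary, attendees):
    """Decide from the summary alone whether an event may enter the top list."""
    if summary.events_number < TOP_N:
        return True
    # >= lets a newer event replace an older one with the same attendance.
    return attendees >= summary.min_attendees_number


def admit(top, event_id, attendees):
    """Offer one event to a city's top list and return the resulting list."""
    if not qualifies_for_top(summarize(top), attendees):
        return top
    candidates = top + [(attendees, event_id)]
    # Highest attendance first; on ties the larger (newer) event id wins.
    candidates.sort(reverse=True)
    return candidates[:TOP_N]
```

In a real deployment the summary would be read from and written back to Minimum_Attendees_Per_City, and the evicted event would be deleted from Top_Attendees_Per_City; those writes are left out of this sketch.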
Finally, Query that table with setScanIndexForward(false) and setLimit(25), which yields the 25 events with the highest number of attendees. Final note: the items come back ordered by eventId (the range key, descending), not by attendee count, so sort them at the application level before returning.
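The final query and the application-level sort might look like the sketch below. It builds Query parameters equivalent to setScanIndexForward(false) and setLimit(25), and sorts items in DynamoDB's attribute-value format. It assumes each Top_Attendees_Per_City item also carries an attendees Number attribute, which the table layout above does not list; the helper names are hypothetical.

```python
def build_top_events_query(city, limit=25):
    """Build Query parameters for one city's rows, newest eventId first."""
    return {
        "TableName": "Top_Attendees_Per_City",
        "KeyConditionExpression": "#city = :city",
        "ExpressionAttributeNames": {"#city": "city"},
        "ExpressionAttributeValues": {":city": {"S": city}},
        "ScanIndexForward": False,  # descending range-key (eventId) order
        "Limit": limit,
    }


def sort_by_attendees(items):
    """Order returned items by attendee count, highest first.

    Assumes every item has an `attendees` attribute of type Number.
    """
    return sorted(items, key=lambda item: int(item["attendees"]["N"]), reverse=True)


# With boto3 installed and credentials configured:
#   import boto3
#   response = boto3.client("dynamodb").query(**build_top_events_query("Paris"))
#   top_events = sort_by_attendees(response["Items"])
```

Python's sort is stable, so events with equal attendance keep the descending eventId order returned by the query, which matches the "newer events take precedence" rule.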