Amazon QLDB Design Principles


I'm wondering about the best way to design tables in QLDB: is it better to perform joins or to use nested documents? For example, suppose I have the tables transaction and payment, where a payment must be associated with a transaction. Which of the following options is best?

Nested Document Option (One table)

{
    'payment_reference': 'abc123',
    'transaction': {
        'id': 123,
        'name': 'John Doe',
        'amount': '$10'
    },
    'fees': '$2',
    'amount_paid': '$12'
}

Two Table Option

Payment Document

{
    'payment_reference': 'abc123',
    'transaction_id': 123,
    'fees': '$2',
    'amount_paid': '$12'
}

Transaction Document

{
    'id': 123,
    'amount': '$10',
    'name': 'John Doe'
}


There are 2 answers

Ymartin On BEST ANSWER

I think @Aurgho has answered your question, but I'll add some general thoughts building on that answer, which might help others who come to this post with a similar question.

Several factors can influence your design decision, including the quotas and limits QLDB imposes. Here are a few pointers that might help:

  • Query pattern: At this point, Amazon QLDB allows indexes only on top-level fields. In the nested document design (Option #1), queries that filter on any field of the nested document can't use an index and will perform table scans, which can hurt performance. With Option #2, you can create indexes on both tables and use the indexed fields in your join criteria.

  • Access pattern: Are you going to have significantly more writes than reads? If your reads are sparse and can tolerate slightly higher latency, Option #1 might be better from a data modeling perspective, because all the payment-related information is captured in a single document. On the other hand, if you have many more reads and they are latency sensitive, you should evaluate your options from the perspective of the previous point.

  • Quotas and limits: Amazon QLDB has a quota on document size (currently 128 KB): https://docs.aws.amazon.com/qldb/latest/developerguide/limits.html#limits.fixed. If you plan to add more fields over time, the size of each document keeps growing with the nested fields, and you might eventually run into the document size limit. There are other quotas too, which can affect your decision depending on your use case.
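To make the query pattern point concrete, here is a minimal sketch of the PartiQL statements each option might involve, held as plain Python strings. The table names Payments, Transactions, and NestedPayments are hypothetical (the plural forms are an assumption, chosen to steer clear of possible reserved words), the field names come from the question, and nothing here is executed against a ledger. Check the statement syntax against the QLDB PartiQL reference before relying on it.

```python
# Option #2 (two tables): indexes on the top-level fields used for lookups and joins.
# Table names are hypothetical; field names follow the question's examples.
CREATE_STATEMENTS = [
    "CREATE TABLE Payments",
    "CREATE TABLE Transactions",
    "CREATE INDEX ON Payments (payment_reference)",
    "CREATE INDEX ON Payments (transaction_id)",
    "CREATE INDEX ON Transactions (id)",
]

# Join on indexed top-level fields, using equality predicates.
# The '?' placeholder stands for a parameter supplied by the application.
JOIN_QUERY = (
    "SELECT p.payment_reference, p.amount_paid, t.amount "
    "FROM Payments AS p INNER JOIN Transactions AS t "
    "ON p.transaction_id = t.id "
    "WHERE p.payment_reference = ?"
)

# Option #1 (nested document): the filter is on a nested field, which cannot be
# indexed, so per the point above this query would fall back to a table scan.
NESTED_QUERY = "SELECT * FROM NestedPayments AS p WHERE p.transaction.id = ?"

if __name__ == "__main__":
    for statement in CREATE_STATEMENTS + [JOIN_QUERY, NESTED_QUERY]:
        print(statement)
```

In an application, these strings would be passed to whichever QLDB driver you use inside a transaction; the sketch only shows the shape of the statements.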

Generally speaking, Option #1 can work if you are not going to query on a field in the nested document, your writes far outnumber your reads, your reads are not very sensitive to latency, and your document size will stay within the currently imposed limits. Having all your data in one document can simplify the application layer, both when you push data into QLDB (just one insert) and when you process the documents in your code, but you will have to choose your trade-offs carefully.
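As an illustration of that application-layer trade-off, here is a minimal Python sketch that converts the nested Option #1 shape into the two Option #2 documents. The function name split_payment is hypothetical, the field names follow the question's examples, and no QLDB call is involved.

```python
from typing import Tuple


def split_payment(nested: dict) -> Tuple[dict, dict]:
    """Split an Option #1 document into (payment, transaction) documents."""
    # Copy the nested transaction so the input document is left untouched.
    transaction = dict(nested["transaction"])
    # Keep every top-level payment field except the nested transaction itself.
    payment = {key: value for key, value in nested.items() if key != "transaction"}
    # Store only a reference to the transaction, as in the two-table option.
    payment["transaction_id"] = transaction["id"]
    return payment, transaction


nested_payment = {
    "payment_reference": "abc123",
    "transaction": {"id": 123, "name": "John Doe", "amount": "$10"},
    "fees": "$2",
    "amount_paid": "$12",
}

payment_doc, transaction_doc = split_payment(nested_payment)
```

With Option #1 the application writes nested_payment as a single document; with Option #2 it writes payment_doc and transaction_doc separately and has to join them again when reading.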

These are just general pointers to help you think it through. Other use cases may make one design option clearly more convincing than the other, and you can trade off the advantages and disadvantages of the two accordingly.

Also, the QLDB documentation has recommendations for optimizing query performance, which can further help with your decision: https://docs.aws.amazon.com/qldb/latest/developerguide/working.optimize.html

Aurgho Bhattacharjee
  • If, as in the nested document option, transaction documents are nested inside payment documents, keep in mind that the document size limit is 128 KB, as mentioned in the QLDB limits documentation. If you can foresee the payment document growing large enough to hit this limit after nesting, this option could be risky.
  • If you foresee having to index some of the fields in the transaction documents, you can create two separate tables and perform a join instead. (As noted in the create index reference, QLDB does not allow indexing on nested values of a document, and as mentioned in our limits documentation, QLDB allows a maximum of 5 indexes per table.)
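A rough way to keep an eye on both limits from application code is sketched below in Python. This is an approximation under stated assumptions: 128 KB is taken as 128 * 1024 bytes, the size is estimated from the compact JSON encoding rather than from how QLDB actually stores the document, and the helper names and the safety_margin parameter are hypothetical.

```python
import json

# Limits quoted in the answers above. Assumption: 1 KB = 1024 bytes.
MAX_DOCUMENT_BYTES = 128 * 1024
MAX_INDEXES_PER_TABLE = 5


def estimated_size_bytes(document: dict) -> int:
    """Estimate the document size from its compact JSON encoding (UTF-8)."""
    return len(json.dumps(document, separators=(",", ":")).encode("utf-8"))


def fits_document_limit(document: dict, safety_margin: float = 0.8) -> bool:
    """Return True if the estimate stays under a fraction of the size limit."""
    return estimated_size_bytes(document) <= MAX_DOCUMENT_BYTES * safety_margin


def within_index_limit(indexed_fields: list) -> bool:
    """Return True if the distinct indexed fields fit the per-table limit."""
    return len(set(indexed_fields)) <= MAX_INDEXES_PER_TABLE


nested_payment = {
    "payment_reference": "abc123",
    "transaction": {"id": 123, "name": "John Doe", "amount": "$10"},
    "fees": "$2",
    "amount_paid": "$12",
}

print(estimated_size_bytes(nested_payment), fits_document_limit(nested_payment))
```

The safety margin leaves headroom because the JSON estimate will not match the stored size exactly; treat the result as an early warning, not a guarantee.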

The above recommendations are based only on the information provided in the post. We are unaware of the current access patterns in this use case and would need to understand them better to give a more specific answer. You can reach out to the team at qldb-outbound AT amazon.com for further consultation regarding your use case.

Thanks