I'm working on a project to manage documents (e.g. create, read, maintain different versions, etc.), and my plan is to use the following AWS architecture.
When a document is created or updated, it will be saved to a versioning-enabled S3 bucket via an API Gateway S3 proxy. The S3 PUT event will trigger a Lambda function that gets the latest version and all version IDs and saves them to DynamoDB. Once the record is saved in a DynamoDB table, it will be indexed in Elasticsearch via a DynamoDB stream.
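To make the S3-to-DynamoDB step concrete, here is a minimal sketch of what that Lambda might look like. The table name `documents`, the attribute names, and the helper `build_item` are all hypothetical. It assumes the standard S3 event notification shape and a single page of results from `list_object_versions` (pagination is not handled).

```python
from urllib.parse import unquote_plus

TABLE_NAME = "documents"  # hypothetical table name


def build_item(bucket, key, versions_response):
    """Build a DynamoDB item from a list_object_versions-style response."""
    # Prefix matching can return other keys that share the prefix, so filter.
    versions = [v for v in versions_response.get("Versions", []) if v["Key"] == key]
    latest = next((v["VersionId"] for v in versions if v["IsLatest"]), None)
    return {
        "document_id": f"{bucket}/{key}",  # hypothetical partition key
        "latest_version_id": latest,
        "version_ids": [v["VersionId"] for v in versions],
    }


def handler(event, context):
    # Imported here so build_item can be exercised without AWS dependencies.
    import boto3

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = s3.list_object_versions(Bucket=bucket, Prefix=key)
        table.put_item(Item=build_item(bucket, key, response))
```

The handler runs asynchronously after the PUT completes, which is exactly where the first consistency gap comes from.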
My plan is to use Elasticsearch for all search queries, and to load the latest documents from DynamoDB. Since each record has the S3 version IDs, I can fetch old versions from S3 as well.
Since my architecture relies heavily on eventual consistency (S3 to DynamoDB, and DynamoDB to Elasticsearch), I'm worried that I will not get the latest document data when I query Elasticsearch or DynamoDB right after I create a document.
Any suggestions for improvements will be much appreciated.
Thanks!
As you said, your application architecture has multiple points where eventual consistency is used.
If your business case absolutely requires that every query returns the very latest version, then this architecture is a poor fit and you should consider, for example, using RDS for persistence instead.
If not, then design the rest of your system keeping in mind that a completed PUT does not guarantee that queries immediately return the data. How to do this depends heavily on your application and cannot feasibly be generalized.
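As one illustration only, not a general recipe: if the client can see the version ID that S3 returns for the PUT (the `x-amz-version-id` response header on a versioned bucket, which the API Gateway proxy would have to pass through), it can poll the downstream store until that version shows up. The sketch below assumes a hypothetical callable `fetch_version_ids` that returns the version IDs currently recorded for the document, for example by wrapping a DynamoDB `GetItem`. The function name and parameters are made up for this example.

```python
import time


def wait_for_version(fetch_version_ids, expected_version_id,
                     timeout=5.0, interval=0.2,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until expected_version_id is visible downstream or timeout expires.

    fetch_version_ids is a hypothetical callable returning the version IDs
    currently stored for the document. Returns True if the expected version
    became visible, False if the timeout was reached first.
    """
    deadline = clock() + timeout
    while True:
        # Membership check rather than equality with the latest version,
        # so a newer write arriving in the meantime does not cause a miss.
        if expected_version_id in fetch_version_ids():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would do the PUT, keep the returned version ID, and call `wait_for_version` before reading. Whether blocking like this is acceptable, or whether the UI should simply show the just-written data optimistically, is an application-level decision.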