Problem Description:
It is not viable to use a distributed transaction that spans the database and the message broker to atomically update the database and publish messages/events.
The outbox pattern describes an approach for letting services execute these two tasks in a safe and consistent manner; it provides source services with instant "read your own writes" semantics, while offering reliable, eventually consistent data exchange across service boundaries.
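For context, here is a minimal sketch of the outbox write path in plain JDBC, assuming illustrative table names (`orders`, `outbox`): the business row and the event row are written in one local database transaction, and a separate relay (or CDC connector) publishes the outbox rows to the broker afterwards.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

public class OrderService {
    private final Connection connection;

    public OrderService(Connection connection) {
        this.connection = connection;
    }

    public void placeOrder(String orderId, String orderJson) throws SQLException {
        connection.setAutoCommit(false);
        try (PreparedStatement order = connection.prepareStatement(
                 "INSERT INTO orders (id, payload) VALUES (?, ?)");
             PreparedStatement outbox = connection.prepareStatement(
                 "INSERT INTO outbox (id, aggregate_id, type, payload) VALUES (?, ?, ?, ?)")) {
            order.setString(1, orderId);
            order.setString(2, orderJson);
            order.executeUpdate();

            outbox.setString(1, UUID.randomUUID().toString());
            outbox.setString(2, orderId);
            outbox.setString(3, "OrderPlaced");
            outbox.setString(4, orderJson);
            outbox.executeUpdate();

            // Both rows commit or neither does, which is what gives the
            // source service "read your own writes" on its own database.
            connection.commit();
        } catch (SQLException e) {
            connection.rollback();
            throw e;
        }
    }
}
```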
What would be the downside if I read a message from topicA -> write a message to topicB (exactly-once semantics with Kafka Streams) and then update the database with an event listener?
That means I would have eventual consistency until the database entity is persisted, but no data loss, since I still have the message in the Kafka topic (and can retry until persistence succeeds).
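Roughly, the Kafka Streams part of that idea could look like the sketch below (String keys/values and the topic names are assumptions). The key point is that exactly-once only covers the topicA -> topicB hop inside Kafka; the event listener that updates the database runs outside that guarantee, which is exactly where the eventual consistency comes from.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

public class TopicAToTopicB {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "topic-a-to-topic-b");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Exactly-once only protects the Kafka-to-Kafka read-process-write cycle.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);

        StreamsBuilder builder = new StreamsBuilder();
        // topicA -> topicB; the DB update happens in a separate listener on topicB,
        // outside this transaction, so the database is only eventually consistent.
        builder.stream("topicA").to("topicB");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```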
This pattern also has the following issues:
The Message Relay might publish a message more than once. It might, for example, crash after publishing a message but before recording the fact that it has done so. When it restarts, it will then publish the message again. As a result, a message consumer must be idempotent, perhaps by tracking the IDs of the messages that it has already processed. Fortunately, since Message Consumers usually need to be idempotent (because a message broker can deliver messages more than once) this is typically not a problem.
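A minimal sketch of such an idempotent handler, assuming JDBC, a `processed_messages` table with a unique `message_id` column, and PostgreSQL's `ON CONFLICT DO NOTHING` (table names and SQL dialect are illustrative):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class IdempotentHandler {
    private final Connection connection;

    public IdempotentHandler(Connection connection) {
        this.connection = connection;
    }

    /** Returns true if the message was processed now; false if it was a duplicate. */
    public boolean handle(String messageId, String payload) throws SQLException {
        connection.setAutoCommit(false);
        try (PreparedStatement insert = connection.prepareStatement(
                "INSERT INTO processed_messages (message_id) VALUES (?) ON CONFLICT DO NOTHING")) {
            insert.setString(1, messageId);
            if (insert.executeUpdate() == 0) {
                connection.rollback();     // already seen: skip the business update
                return false;
            }
            applyBusinessUpdate(payload);  // domain-specific write against the same connection
            connection.commit();           // dedup record and business update commit atomically
            return true;
        } catch (SQLException e) {
            connection.rollback();
            throw e;
        }
    }

    private void applyBusinessUpdate(String payload) {
        // update the domain tables; omitted in this sketch
    }
}
```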
Question:
So when it comes to compromises, which is better: keeping Kafka as the single source of truth and dealing with eventual consistency in the database, or keeping the DB as the source of truth and using Kafka as a dumb message broker?
I'm really interested in your opinion! Thanks!
I'm not sure you really require the stream processor for this. A good approach might be writing to the DB and using a CDC connector to stream the data to Kafka. The CDC connector tails the transaction log of the DB tables and reflects the changes to the Kafka topic. Even in case of a connector failure, once it is restarted the topic will eventually be synchronized with the DB state.
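For illustration, the downstream side could look like the sketch below, assuming a Debezium-style connector that publishes change events to a topic named `dbserver1.public.orders` (the topic name and event layout depend entirely on your connector configuration):

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ChangeEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-projection");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Topic produced by the CDC connector from the DB's transaction log.
            consumer.subscribe(Collections.singletonList("dbserver1.public.orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // The value is the change event (e.g. JSON with "before"/"after" state);
                    // apply it idempotently to whatever downstream view you maintain.
                    System.out.printf("key=%s value=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```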