AxonIQ process new events while replaying older ones

Every now and then, we have to replay events from the Axon event store to update our projections. This replay takes several hours, and it will only take longer as the event store grows. While the replay is running, our application does not process new events.

Is there a way to keep processing new events while a replay is running? Does Axon provide such a mechanism, or do we have to create two distinct event handlers: one for "normal" processing and one for the replay?

There are 2 answers

Steven On

Axon does not provide a solution to this out of the box. There are roughly speaking two approaches you can take:

  1. Do a form of blue/green deployment. Nine times out of ten, a replay is required to comply with a new Query Model format introduced during development. If you use a blue/green deployment style, you can simply warm up (replay) the new instance and switch over once it is done.
  2. Implement additional components that construct a new TrackingEventProcessor, together with its event handling components, when the replay is invoked. Furthermore, a new query model store needs to run in parallel to hold the new format, with aliases in place to switch over once the process is done.

I'd wager option 1 means the least code work but more ops work. Option 2 flips that around, of course.
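
To make the "aliases in place to switch over" part of option 2 a bit more concrete, here is a minimal sketch in plain Java. It assumes the query model store can be represented as a simple map; `QueryModelAlias` and its methods are hypothetical names for illustration and are not part of Axon.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical alias in front of two query model stores:
 * the current one (kept up to date by the live processor) and
 * the new one (filled by the replaying processor).
 */
final class QueryModelAlias {
    // The store that queries are currently served from.
    private final AtomicReference<Map<String, String>> live;

    QueryModelAlias(Map<String, String> currentStore) {
        this.live = new AtomicReference<>(currentStore);
    }

    /** Queries always go through the alias, so they never read a half-built store. */
    String find(String id) {
        return live.get().get(id);
    }

    /** Atomically points the alias at the rebuilt store once the replay is done. */
    void switchTo(Map<String, String> rebuiltStore) {
        live.set(rebuiltStore);
    }
}
```

The idea is that the old store keeps receiving new events for the whole duration of the replay, and readers only notice the new format after `switchTo` is called. With a real database, the alias would more likely be a table, view or index alias than an in-memory reference.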

Note that you will always need to deal with some level of eventual consistency when doing a replay. Events simply keep happening in a system, so reaching the end of the event stream is conceptually not going to happen. Hence, before constructing any solution in this space, you will need to decide at which point the replay counts as far enough along.
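
One way to pin down "far enough" is to pick a tolerated lag up front and compare the replaying processor's position with the head of the stream. The sketch below is plain Java with hypothetical names (`ReplayProgress`, `maxLag`); where the two positions come from, for instance the processor's status reporting, is an assumption left outside the example.

```java
/** Hypothetical check for "is the replay far enough?" based on stream positions. */
final class ReplayProgress {
    private ReplayProgress() {
    }

    /** Number of events the replaying processor still trails behind the head of the stream. */
    static long lag(long replayPosition, long headPosition) {
        return Math.max(0L, headPosition - replayPosition);
    }

    /** True once the remaining lag is within the tolerance chosen up front. */
    static boolean isFarEnough(long replayPosition, long headPosition, long maxLag) {
        return lag(replayPosition, headPosition) <= maxLag;
    }
}
```

A switch-over could then be triggered the first time `isFarEnough` returns true, accepting that the new query model is at most `maxLag` events behind at that moment.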

Lastly, there are means to improve event handling speed too, and thus replay speed as well. This blog gives a quick overview of what you can do to this end. In short: adjust the event handling batchSize, tune the number of processing threads and segments, and tie into Axon's UnitOfWork to improve database access.
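
To illustrate why the batch size matters, here is a small plain-Java sketch of the general batching idea: events are buffered and handed to the store once per batch instead of once per event. This is not Axon's implementation; `BatchingProjector` is a hypothetical helper, and the `commit` callback stands in for a single database transaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: hands events to a sink in batches, one "commit" per batch. */
final class BatchingProjector<E> {
    private final int batchSize;
    private final Consumer<List<E>> commit; // stands in for one database transaction
    private final List<E> buffer = new ArrayList<>();
    private int commits = 0;

    BatchingProjector(int batchSize, Consumer<List<E>> commit) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.commit = commit;
    }

    /** Buffers the event and commits as soon as a full batch is available. */
    void handle(E event) {
        buffer.add(event);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Commits whatever is buffered; call at the end of the stream for a partial last batch. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        commit.accept(new ArrayList<>(buffer));
        buffer.clear();
        commits++;
    }

    /** Number of commits performed so far. */
    int commits() {
        return commits;
    }
}
```

With a batch size of 100, handling 1,050 events followed by a final `flush()` results in 11 commits rather than 1,050, which is where most of the database round trips are saved during a replay.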

Update

What's good to take into account about the replay functionality is that it is a powerful but heavy tool. Any eventing system (so not just Axon) in which you'd need to replay billions of events would put some strain on the implementation and deployment strategies. Taking this a step further, you may ask whether you should even do such a replay.

Hence, you will have to work out whether a replay is the best solution for the problem at hand. Sometimes, simply updating your Query Models directly is the most pragmatic solution and a massive timesaver. In other scenarios, you will want to have implemented the enhancements in event handling and deployment strategy that allow quick replays.

I've also seen a lot of domains where a replay did not require the full event store. If the users of the application only need the last year's worth of models to be present immediately, then it is far simpler to replay only the last year's worth of events. Maybe even the last 3 months will suffice.
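
As a rough illustration of such a partial replay, the plain-Java sketch below computes a cutoff and keeps only the events at or after it. `PartialReplay` and `StoredEvent` are hypothetical names, and the in-memory list stands in for the event store; in a real setup you would start the replay from a position in the stream that corresponds to the cutoff instead of filtering in memory.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch of selecting only recent events for a replay. */
final class PartialReplay {
    private PartialReplay() {
    }

    /** Hypothetical stored event; only the timestamp matters here. */
    static final class StoredEvent {
        final String id;
        final Instant timestamp;

        StoredEvent(String id, Instant timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }
    }

    /** The point in time from which events should be replayed. */
    static Instant cutoff(Instant now, Duration window) {
        return now.minus(window);
    }

    /** Keeps the events at or after the cutoff, preserving stream order. */
    static List<StoredEvent> eventsToReplay(List<StoredEvent> stream, Instant cutoff) {
        return stream.stream()
                .filter(event -> !event.timestamp.isBefore(cutoff))
                .collect(Collectors.toList());
    }
}
```

This only works when the models being rebuilt do not depend on events older than the cutoff, so the window has to be chosen per projection.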

From a conceptual point of view, a replay can answer any question you have of your system. That's definitely the good thing about going for an Event Sourcing solution, as you effectively have a single source of truth. But as with any tool, you should use it for the right problem at the right time.

Marc On

Thank you, Steven. The improvements in the blog are interesting, but even then, a stream of 1 billion events would take close to 10 hours to replay, and missing new events for such a long period is simply not acceptable. So I guess that even with very good event handling speed, one would still need to tackle the problem by implementing one of the two solutions you propose.