We have a legacy system that produces files, each containing hundreds of messages (financial transactions). We need to transform these messages into another format and submit them individually to a target system. The question is: should the ESB accept these files for processing directly, or should there be an adapter application between the legacy system and the ESB that splits the received files into individual messages and lets the ESB process the messages individually (instead of processing the whole file)?
In the first solution we expect two ESB flows. The first would transform the file into the new format, split it into individual messages, and store these messages in a temporary location. The transformation needs to process the file as a whole, because the file contains some common sections that are needed to transform the individual messages. The second flow would take the individual transformed messages (each in a separate DB transaction), pass them to the target system, and wait for its answer (synchronously or asynchronously).
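To make the first flow concrete, here is a minimal sketch in Java of the transform-and-split step, assuming a simple line-based legacy format whose first line carries the common sections; all class, field, and format details are hypothetical:

```java
// A minimal sketch of the first flow (all names are hypothetical), assuming a
// line-based legacy format where the first line holds the common sections.
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class TransformAndSplitFlow {

    // Common sections shared by every message in the file.
    record FileHeader(String batchId, String currency) {}

    public List<String> transformAndSplit(Path legacyFile) throws Exception {
        List<String> lines = Files.readAllLines(legacyFile);

        // The file must be processed as a whole: the header applies to
        // every individual message.
        FileHeader header = parseHeader(lines.get(0));

        List<String> transformed = new ArrayList<>();
        for (String raw : lines.subList(1, lines.size())) {
            transformed.add(transform(header, raw));
        }
        // The caller stores these in the temporary location for the
        // second flow to pick up, one DB transaction per message.
        return transformed;
    }

    private FileHeader parseHeader(String headerLine) {
        String[] fields = headerLine.split(";");
        return new FileHeader(fields[0], fields[1]);
    }

    private String transform(FileHeader header, String rawMessage) {
        // Target format is purely illustrative.
        return "{\"batchId\":\"" + header.batchId()
                + "\",\"currency\":\"" + header.currency()
                + "\",\"payload\":\"" + rawMessage + "\"}";
    }
}
```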
The second solution would replace the first flow with an external application that transforms the file, splits it into individual transformed messages, and stores them in a temporary location (the local file system). The second flow would stay in the ESB.
In our eyes, the disadvantage of the first solution is that the ESB would have to process huge files (in the first flow), which is commonly considered an antipattern. On the other hand, the ESB would adapt directly to the interface of the legacy system, which is one of the purposes of an ESB.
In the second solution, the adapter application would contain the transformation logic, which is supposed to be another of the purposes and responsibilities of an ESB.
What is the commonly suggested solution for this situation (a pattern)? Could you provide some references that are more descriptive than these two links that I've found?
https://www.ibm.com/developerworks/wikis/display/esbpatterns/File+Processing
Edit: Another reference: http://www.ibm.com/developerworks/webservices/library/ws-largemessaging/
Remember that there are three message types in SOA: Command, Event, and Document.
That 'Document' bit is for chunks of data. It is probably better suited to 'real' document types such as 'Order' or 'Invoice' and the like, but there is nothing stopping you from going with a 'TransactionBatch'.
That being said, it is a rather underused message type, in that not many service buses actually implement anything around it.
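For illustration, a 'TransactionBatch' document message might look roughly like this (a sketch only; the field names are assumptions, not a prescribed schema):

```java
// A sketch only: one possible shape for a 'TransactionBatch' document message.
import java.util.List;

public record TransactionBatch(
        String batchId,            // identifier of the source file
        String sourceSystem,       // the legacy system that produced it
        List<Transaction> items) { // the individual financial transactions

    public record Transaction(String id, String account, long amountMinor) {}
}
```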
So what I would do in your scenario is have an endpoint that processes the file: something like a ProcessTransactionFileCommand sent to the processing endpoint, containing only a reference to the actual file (stored somewhere in the file system, or even a URL to download from). That processing endpoint can process the file and send the individual messages (all within a transaction) to the integration endpoint, which sends each message off to the external system. You could have a SendTransactionCommand to do that. In this way your system is very flexible: the integration endpoint can receive individual integration commands from other parts of your solution, while the processing endpoint handles the batch and splits it into individual integration commands.
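A minimal sketch of that arrangement in Java, using a hypothetical Bus interface as a stand-in for whatever service bus you use (each real bus has its own API):

```java
// A minimal sketch of the processing endpoint; the Bus interface is a
// hypothetical stand-in, not the API of any particular service bus.
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

record ProcessTransactionFileCommand(String filePath) {} // a reference only, not the contents

record SendTransactionCommand(String transactionPayload) {}

interface Bus {
    void send(Object command); // routes the command to its endpoint
}

class ProcessingEndpoint {
    private final Bus bus;

    ProcessingEndpoint(Bus bus) {
        this.bus = bus;
    }

    // Reads the referenced file, splits it, and sends one
    // SendTransactionCommand per message to the integration endpoint.
    void handle(ProcessTransactionFileCommand command) throws Exception {
        List<String> messages = Files.readAllLines(Path.of(command.filePath()));
        // Ideally all sends happen within one transaction, so a failure
        // re-processes the whole file rather than leaving a partial batch.
        for (String message : messages) {
            bus.send(new SendTransactionCommand(message));
        }
    }
}
```

The point is that ProcessTransactionFileCommand carries only a path (or URL), so the bus never has to move the huge file itself; only the small per-transaction commands travel through it.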
Should you be in the .NET space, you may want to look at my FOSS service bus project: http://shuttle.codeplex.com/
But any service bus will do the trick (MassTransit, NServiceBus, etc.).