Find byte offsets for e-mail attachments

81 views Asked by At

I got a requirement to deliver emails to a legacy system that needs to read the attachments.

For each part in a multipart email I need to provide the byte offset for where the attachment starts in the email, so the legacy system doesn't need to know how to parse emails.

Performance and memory usage is an issue, so the solution can't load the entire email into memory. And to my eyes that leaves out javax.mail.

How would you go about it in Java?

My first idea was to use mime4j, but the library does not keep of byte offset or even the line number. I investigated making a PR to mime4j to add tracking of line numbers and byte offsets. But it is not very easy, since it is a very mature project and it uses lots of buffering internally.

Now I am thinking that maybe I am going about this the wrong way. So I would very much appreciate any ideas of how to solve this in a simple matter.

1

There are 1 answers

2
rockwotj On

You're going to run into issues just sending the byte offsets and the full email, as emails still can be base64 encoded or printed-quoteable encoded.

You'll want to use a MimeStreamParser and give your own ContentHandler and override the body method. You can then directly send the BodyDescriptor and InputStream to the legacy system. The InputStream is the "decoded" email (IE handles any Content-Transfer-Encoding). The BodyDescriptor is useful to extract stuff from the headers of the part that you may care about (MimeType and Charset are the most useful ones).

This does not buffer the whole email, and allows you to stream out just the body parts. I'm not sure how you're communicating with the legacy system (via the network or if it's an inprocess subcomponent) but hopefully that works!