Scala: parse MIME/multipart raw emails over HTTP one at a time


I'm trying to parse raw email messages in MIME multipart format, which I fetch over HTTP one at a time. On the most recent mail, my code threw this exception:

java.nio.charset.MalformedInputException: Input length = 1

And here is (I think) the relevant chunk of that mail:

Content-Type: multipart/alternative;
 boundary="------------000401070001090809020709"

--------------000401070001090809020709
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit

Is there a Scala library out there for easily handling this type of input? Otherwise is there an easy way to write some code that handles it?

I've been looking at mime4j, and at this Scala code in particular.

As of now, my code just uses scala.io.Source.fromURL to scrape the raw mail as follows:

scrape(scala.io.Source.fromURL(url))

which turns the BufferedSource into a String and splits it:

source.mkString.split("\n\n", 2) 
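For context on the exception: Source.fromURL decodes the bytes with whatever codec is in scope (often UTF-8), and an 8bit windows-1252 body can contain byte sequences that are not valid UTF-8. A minimal sketch of one way around that, assuming the raw bytes of the mail are already in hand, is to decode them as ISO-8859-1 (which maps every byte to exactly one char, so decoding cannot fail) purely to find the header/body boundary, and to keep the body as bytes for later. The names RawMail and splitHeadersAndBody are hypothetical, not part of any library.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper, not from any library.
object RawMail {
  /** Splits a raw message into (header text, undecoded body bytes). */
  def splitHeadersAndBody(raw: Array[Byte]): (String, Array[Byte]) = {
    // ISO-8859-1 maps each byte 0x00-0xFF to one char, so this never throws
    // and the original bytes can be recovered exactly with getBytes below.
    val text = new String(raw, StandardCharsets.ISO_8859_1)

    // Raw mail normally uses CRLF line endings; bare LF is accepted as well.
    val parts = text.split("\r?\n\r?\n", 2)

    val headers = parts(0)
    val body    = if (parts.length > 1) parts(1) else ""

    // The body is returned as bytes so it can be decoded later with the
    // charset declared for that particular part.
    (headers, body.getBytes(StandardCharsets.ISO_8859_1))
  }
}
```

With scala.io.Source, the same lossless decoding can probably be had by passing the codec explicitly, e.g. Source.fromURL(url)(scala.io.Codec.ISO8859).mkString, before splitting. Note that splitting on "\n\n" alone will not match a message that uses CRLF line endings.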

I've also tried using an implicit codec since scala.io.Source.fromURL can take a codec:

import java.nio.charset.CodingErrorAction
import scala.io.Codec

implicit val codec: Codec = Codec("UTF-8")
codec.onMalformedInput(CodingErrorAction.REPLACE)
codec.onUnmappableCharacter(CodingErrorAction.REPLACE)

but I think I'd need one of these for each charset?
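On the per-charset question: rather than defining one codec per charset up front, each part's charset can be read from its own Content-Type header and looked up by name. The following is a rough sketch under that assumption, using only the JDK's java.nio.charset.Charset; PartCharset, charsetOf and decode are hypothetical names. It falls back to US-ASCII, the default RFC 2045 gives for text parts that declare no charset, and it does not handle Content-Transfer-Encoding values such as base64 or quoted-printable.

```scala
import java.nio.charset.{Charset, StandardCharsets}
import scala.util.Try

// Hypothetical helper, not from any library.
object PartCharset {
  // Matches charset=windows-1252 and charset="windows-1252", case-insensitively.
  private val CharsetParam = """(?i)charset\s*=\s*"?([^";\s]+)"?""".r

  /** Charset named in a Content-Type value, or US-ASCII if absent or unknown. */
  def charsetOf(contentType: String): Charset =
    CharsetParam
      .findFirstMatchIn(contentType)
      // Charset.forName throws for illegal or unsupported names, hence the Try.
      .flatMap(m => Try(Charset.forName(m.group(1))).toOption)
      .getOrElse(StandardCharsets.US_ASCII)

  /** Decodes the raw bytes of one part using the charset its header declares. */
  def decode(body: Array[Byte], contentType: String): String =
    new String(body, charsetOf(contentType))
}
```

For the part shown above, PartCharset.decode(bodyBytes, "text/plain; charset=windows-1252; format=flowed") would decode the bytes as windows-1252. new String(bytes, charset) substitutes a replacement character for bytes it cannot map instead of throwing, so this path does not raise MalformedInputException.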

Any help is greatly appreciated.
