I ran into a problem parsing an email with mime4j. The email has an attachment, and I parse it with MimeStreamParser. The parser never calls the startMultipart method. Instead, it calls the body method only once, and the BodyDescriptor reports "text/plain".
I can't tell whether the cause is the email format or my program.
Here is my test program:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import org.apache.james.mime4j.*;
import org.apache.james.mime4j.dom.BinaryBody;
import org.apache.james.mime4j.dom.Body;
import org.apache.james.mime4j.dom.Entity;
import org.apache.james.mime4j.dom.Header;
import org.apache.james.mime4j.dom.Message;
import org.apache.james.mime4j.dom.MessageBuilder;
import org.apache.james.mime4j.dom.Multipart;
import org.apache.james.mime4j.dom.TextBody;
import org.apache.james.mime4j.dom.address.Mailbox;
import org.apache.james.mime4j.dom.address.MailboxList;
import org.apache.james.mime4j.dom.field.AddressListField;
import org.apache.james.mime4j.dom.field.ContentTypeField;
import org.apache.james.mime4j.dom.field.DateTimeField;
import org.apache.james.mime4j.dom.field.UnstructuredField;
import org.apache.james.mime4j.field.address.AddressFormatter;
import org.apache.james.mime4j.message.BodyPart;
import org.apache.james.mime4j.message.MessageImpl;
import org.apache.james.mime4j.message.DefaultMessageBuilder;
import org.apache.james.mime4j.message.SimpleContentHandler;
import org.apache.james.mime4j.parser.ContentHandler;
import org.apache.james.mime4j.parser.MimeStreamParser;
import org.apache.james.mime4j.stream.BodyDescriptor;
import org.apache.james.mime4j.stream.Field;
import org.apache.james.mime4j.stream.MimeConfig;
public class TestClass extends SimpleContentHandler {

    public static void main(String[] args) throws MimeException, IOException {
        ContentHandler handler = new TestClass();
        MimeConfig config = new MimeConfig();
        MimeStreamParser parser = new MimeStreamParser(config);
        parser.setContentHandler(handler);
        InputStream instream = new FileInputStream("mail/testuser1");
        try {
            parser.parse(instream);
        } finally {
            instream.close();
        }
    }

    @Override
    public void headers(Header arg0) {
        System.out.println("headers args: " + arg0);
    }

    @Override
    public void body(BodyDescriptor bd, InputStream is) {
        System.out.println("body descriptor: " + bd);
    }

    public void startMessage() {
        System.out.println("startMessage");
    }

    public void endMessage() {
        System.out.println("endMessage");
    }

    public void startBodyPart() {
        System.out.println("startBodyPart");
    }

    public void endBodyPart() {
        System.out.println("endBodyPart");
    }

    public void preamble(InputStream is) {
        System.out.println("preamble");
    }

    public void epilogue(InputStream is) {
        System.out.println("epilogue");
    }

    public void startMultipart(BodyDescriptor bd) {
        System.out.println("startMultipart");
    }

    public void endMultipart() {
        System.out.println("endMultipart");
    }

    public void raw(InputStream is) {
        System.out.println("raw");
    }
}
Here is a part of my email file:
From MAILER_DAEMON Wed Aug 21 19:24:53 2013
Date: Wed, 21 Aug 2013 19:24:53 +0800
From: Mail System Internal Data <[email protected]>
Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
Message-ID: <[email protected]>
X-IMAP: 1377072167 0000000003
Status: RO
This text is part of the internal format of your mail folder, and is not
a real message. It is created automatically by the mail system software.
If deleted, important folder data will be lost, and it will be re-created
with the data reset to initial values.
From [email protected] Sat Aug 24 10:53:42 2013
Return-Path: <[email protected]>
X-Original-To: [email protected]
Delivered-To: [email protected]
Received: from shupc (unknown [192.168.75.130])
by mail.abc.com (Postfix) with SMTP id C0F5B1EFBC3
for <[email protected]>; Sat, 24 Aug 2013 10:53:42 +0800 (CST)
Message-ID: <7F1C30C9CB284CA594D01CBE210257D3@shupc>
From: "john" <[email protected]>
To: "smith" <[email protected]>
Subject: aaa
Date: Sat, 24 Aug 2013 10:53:42 +0800
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_NextPart_000_000B_01CEA0B8.32903020"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512
X-UID: 3
Status: O
Content-Length: 386430
This is a multi-part message in MIME format.
------=_NextPart_000_000B_01CEA0B8.32903020
Content-Type: multipart/alternative;
boundary="----=_NextPart_001_000C_01CEA0B8.32903020"
------=_NextPart_001_000C_01CEA0B8.32903020
Content-Type: text/plain;
charset="gb2312"
Content-Transfer-Encoding: base64
dGVzdCBhYSBiYiBjYw==
------=_NextPart_001_000C_01CEA0B8.32903020
Content-Type: text/html;
charset="gb2312"
Content-Transfer-Encoding: base64
PCFET0NUWVBFIEhUTUwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9uYWwv
L0VOIj4NCjxIVE1MPjxIRUFEPg0KPE1FVEEgaHR0cC1lcXVpdj1Db250ZW50LVR5cGUgY29udGVu
dD0idGV4dC9odG1sOyBjaGFyc2V0PWdiMjMxMiI+DQo8TUVUQSBjb250ZW50PSJNU0hUTUwgNi4w
MC4yOTAwLjU1MTIiIG5hbWU9R0VORVJBVE9SPg0KPFNUWUxFPjwvU1RZTEU+DQo8L0hFQUQ+DQo8
Qk9EWSBiZ0NvbG9yPSNmZmZmZmY+DQo8RElWPjxGT05UIHNpemU9Mj50ZXN0IGFhIGJiIGNjPC9G
T05UPjwvRElWPjwvQk9EWT48L0hUTUw+DQo=
------=_NextPart_001_000C_01CEA0B8.32903020--
------=_NextPart_000_000B_01CEA0B8.32903020
Content-Type: application/octet-stream;
name="10112716229607.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="10112716229607.doc"
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAAFAAAAKAIAAAAAAAAA
EAAAKgIAAAEAAAD+////AAAAACMCAAAkAgAAJQIAACYCAAAnAgAA////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////s
pcEAcWAJBAAA8FK/AAAAAAAAEAAAAAAABgAArJ0CAA4AYmpianFQcVAAAAAAAAAAAAAAAAAAAAAA
AAAECBYAOBIDABM6AQATOgEA1gwBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//w8AAAAA
AAAAAAD//w8AAAAAAAAAAAD//w
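The problem is that the sample file is not a multipart message. It is an mbox-style folder file whose first entry (the "From MAILER_DAEMON" block) has no Content-Type header, so the parser defaults to text/plain and treats everything after those headers, including the real multipart email, as inline text in a single body.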
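If the file has to stay as it is, one option is to split it into individual messages before handing each one to the parser. The following is only a minimal sketch under stated assumptions, not part of mime4j: `MboxSplitter` and `split` are hypothetical names, and the code assumes the common mbox convention that a message starts at a line beginning with "From " that is either the first line of the file or preceded by a blank line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a mime4j class): splits mbox-style text into messages.
public class MboxSplitter {

    // Returns each message without its "From " separator line, using CRLF line ends.
    public static List<String> split(String mbox) {
        List<String> messages = new ArrayList<>();
        StringBuilder current = null;
        boolean previousBlank = true; // the start of the file counts as a boundary

        for (String line : mbox.split("\\r?\\n", -1)) {
            if (previousBlank && line.startsWith("From ")) {
                // Separator line: close the previous message and start a new one.
                if (current != null) {
                    messages.add(current.toString());
                }
                current = new StringBuilder();
            } else if (current != null) {
                current.append(line).append("\r\n");
            }
            // Lines before the first separator are ignored (current is still null).
            previousBlank = line.isEmpty();
        }
        if (current != null) {
            messages.add(current.toString());
        }
        return messages;
    }
}
```

Each returned string could then be wrapped in a ByteArrayInputStream and passed to parser.parse(...) in the test program above. Reading and re-encoding the file with ISO-8859-1 keeps the original bytes unchanged, which matters for 8-bit content.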
Remove the first block of headers (the one starting with "From MAILER_DAEMON"). Then make sure every continuation line that follows a Content-Type header (e.g. charset and boundary) is indented by at least one whitespace character, as required by the spec (RFC 822 and its successors), or join it onto the Content-Type line by removing the line break. See example:
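Change from:
    Content-Type: multipart/mixed;
    boundary="----=_NextPart_000_000B_01CEA0B8.32903020"
to either (continuation line starts with a space):
    Content-Type: multipart/mixed;
     boundary="----=_NextPart_000_000B_01CEA0B8.32903020"
or (everything on one line):
    Content-Type: multipart/mixed; boundary="----=_NextPart_000_000B_01CEA0B8.32903020"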
Alternatively, try a different message.
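To illustrate why the indentation of the boundary and charset lines matters, here is a small sketch of the header unfolding rule: a line that starts with a space or tab continues the previous header field, while a line that starts in the first column begins a new field. `HeaderUnfolder` and `unfold` are hypothetical names used only for this illustration, and this is not how mime4j implements it internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a mime4j class): joins folded header lines.
public class HeaderUnfolder {

    // Appends every line that starts with a space or tab to the preceding field.
    public static List<String> unfold(List<String> rawLines) {
        List<String> fields = new ArrayList<>();
        for (String line : rawLines) {
            boolean continuation = !line.isEmpty()
                    && (line.charAt(0) == ' ' || line.charAt(0) == '\t');
            if (continuation && !fields.isEmpty()) {
                // Unfolding only removes the line break; the whitespace is kept.
                int last = fields.size() - 1;
                fields.set(last, fields.get(last) + line);
            } else {
                fields.add(line);
            }
        }
        return fields;
    }
}
```

With an indented boundary line, the Content-Type field comes out as one field that carries its boundary parameter. Without the indentation, the boundary line is left as a separate, malformed line, and the Content-Type field has no boundary for a parser to split the body on.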