I'm using Zend's Zend_Mail_Storage_Pop3 to connect to a mail server, open an email, and iterate through its attachments. If an attachment is a PDF, I need to download it. For each message part, I call getHeaders and use a regex to determine the MIME type of the attachment. In most cases, I get something like this:
["content-type"]=> string(64) "application/octet-stream; name=abc.pdf"
["content-transfer-encoding"]=> string(6) "base64"
But in some cases, I get something like this:
multipart/mixed; boundary=--boundary_2_1dca5b3b-499e-4109-b074-d8b5f914404a
How do I determine the mime type of such attachments?
This is a somewhat complicated case. When the content-type is multipart/mixed, the email consists of several parts. One or more of these might be an attachment, possibly alongside an HTML or plain-text body.
When the content-type is multipart/mixed, a boundary is also given. A regex on the content-type header can tell you whether you are dealing with a multipart email and capture that boundary.
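As a rough illustration of that check, here is a minimal sketch assuming PHP 7.1 or later. The function name getMultipartBoundary is hypothetical and is not part of Zend; it only inspects the content-type header string you already get back from the part.

```php
<?php
// Hypothetical helper (not a Zend API): returns the boundary string if the
// Content-Type value describes a multipart message, or null otherwise.
function getMultipartBoundary(string $contentType): ?string
{
    // Accept any multipart/* subtype and both quoted and unquoted boundaries.
    $pattern = '/^multipart\/[a-z0-9.+-]+\s*;.*?\bboundary\s*=\s*(?:"([^"]+)"|([^;\s]+))/is';
    if (!preg_match($pattern, $contentType, $m)) {
        return null; // not multipart, or no boundary parameter present
    }
    // Group 1 holds a quoted boundary, group 2 an unquoted one.
    return ($m[1] ?? '') !== '' ? $m[1] : $m[2];
}

$header = 'multipart/mixed; boundary=--boundary_2_1dca5b3b-499e-4109-b074-d8b5f914404a';
$boundary = getMultipartBoundary($header);
```

A null result means the part can be treated as a single attachment or body; any other result is the boundary to split on.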
If your message is a multipart email, the next step is to separate all of the parts by splitting the body on the boundary. In the body, each delimiter line is the boundary value prefixed with --, per the email standards, and the closing delimiter also ends with --.
Then the only thing left to do is to parse each of the individual parts. You probably already have code to do that. Each part will have the same headers that you mentioned in your question: content-type and content-transfer-encoding. There might be other part headers as well, and you will want to strip them from the part before using its content (they should all start with the prefix Content-, if I remember correctly).
If the part is base64 encoded, make sure that you account for that (you can check the content-transfer-encoding header to determine this). The MIME type of the individual attachment will be stored in the part's content-type header, just like in the case of a single-part message.
One note: this assumes that you are dealing with the raw source of the message. To get it, you can use getRawHeader and getRawContent.
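The splitting and per-part parsing described above could look roughly like the sketch below, assuming PHP 7.1 or later and a raw message body already in hand. The names splitMultipartBody and parsePart are hypothetical helpers, not Zend methods, and the plain explode assumes the delimiter never occurs inside a part's content.

```php
<?php
// Hypothetical helper: split a raw multipart body into its raw parts.
function splitMultipartBody(string $body, string $boundary): array
{
    // Delimiter lines are "--" followed by the boundary value.
    $chunks = explode('--' . $boundary, $body);
    array_shift($chunks); // drop the preamble before the first delimiter

    $parts = [];
    foreach ($chunks as $chunk) {
        if (strncmp($chunk, '--', 2) === 0) {
            break; // closing delimiter reached: nothing after it is a part
        }
        // The line break before the next delimiter belongs to the delimiter.
        $parts[] = preg_replace('/\r?\n\z/', '', $chunk);
    }
    return $parts;
}

// Hypothetical helper: separate a part's headers from its decoded content.
function parsePart(string $rawPart): array
{
    // Headers end at the first blank line.
    $pieces = preg_split('/\r?\n\r?\n/', $rawPart, 2);
    $content = $pieces[1] ?? '';
    // Unfold header lines that continue on the next line with whitespace.
    $rawHeaders = preg_replace('/\r?\n[ \t]+/', ' ', $pieces[0]);

    $headers = [];
    foreach (preg_split('/\r?\n/', $rawHeaders) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip empty or malformed lines
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }

    $encoding = strtolower($headers['content-transfer-encoding'] ?? '7bit');
    if ($encoding === 'base64') {
        $content = base64_decode($content);
    } elseif ($encoding === 'quoted-printable') {
        $content = quoted_printable_decode($content);
    }

    // The MIME type is the Content-Type value before the first ";".
    $type = strtolower(trim(explode(';', $headers['content-type'] ?? 'text/plain')[0]));

    return ['type' => $type, 'headers' => $headers, 'content' => $content];
}
```

Since your PDFs arrive as application/octet-stream with a name parameter, checking the file name in the part's content-type header is probably more reliable than the type alone. If a part's type is itself multipart/*, the same split can be applied again to that part's content using its own boundary.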