Parsing emails for TIFF attachments in C#


I built an email parser that extracts TIFF attachments from emails sent by two different fax providers, RingCentral and eFax.

The application uses POP3 to retrieve each email as a text stream and then parses the text to identify the section that represents the TIFF image.

By converting that section of text to a byte array and using a BinaryWriter, I'm able to create the TIFF file on my local hard drive.

public void SaveToFile(string filepath)
{
    // The using block disposes the writer and its underlying stream even if Write throws.
    using (BinaryWriter bw = new BinaryWriter(new FileStream(filepath, FileMode.Create)))
    {
        bw.Write(this.Data);
    }
}

The issue is that the eFax email attachments cause a runtime error when the text is converted to a byte array.

//_data is a byte array
//RawData is a string
_data = Convert.FromBase64String(RawData);  //fails on this line

I get the following error:

The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or a non-white space character among the padding characters.

I assume it has something to do with the encoding/decoding of the string, but I've tried various encodings and still get the error.
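One way to narrow down an error like this is to locate the first character in the string that Convert.FromBase64String cannot accept. The following is only a minimal sketch: `Base64Diagnostics` and `FindFirstInvalidChar` are hypothetical names that are not part of the original parser, and the check covers the alphabet only, not padding placement or overall length.

```csharp
using System;

public static class Base64Diagnostics
{
    // Hypothetical helper (not from the original parser).
    // Returns the index of the first character that is neither a Base64
    // alphabet character, the padding character '=', nor one of the
    // whitespace characters Convert.FromBase64String ignores
    // (space, tab, CR, LF). Returns -1 if no such character exists.
    public static int FindFirstInvalidChar(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            bool isBase64 = (c >= 'A' && c <= 'Z')
                         || (c >= 'a' && c <= 'z')
                         || (c >= '0' && c <= '9')
                         || c == '+' || c == '/' || c == '=';
            bool isIgnoredWhitespace = c == ' ' || c == '\t' || c == '\r' || c == '\n';

            if (!isBase64 && !isIgnoredWhitespace)
                return i;
        }
        return -1;
    }
}
```

Calling `Base64Diagnostics.FindFirstInvalidChar(RawData)` and printing the text around the returned index should show whether something other than the encoded image, such as a stray header line, ended up in the string.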

Some additional information:

  • Programming Language: C#
  • Email Host: Gmail
  • If I manually forward the email back to myself, the parser works on the forwarded copy, but it does not work on the original.
  • I even tried auto-forwarding in Gmail, but this did not work.

I'm responding here to the first comment below. Thanks for your response.

The TIFF file is created by taking the section of text from the email that is associated with the TIFF attachment, converting it to a byte array, and saving the file with a .tiff extension. This works fine for all RingCentral emails. For example, the RingCentral email section header looks like this:

------=_NextPart_3327195283162919167883
Content-Type: image/tiff; name="18307730038-0803-141603-326.tif"
Content-Transfer-Encoding: base64
Content-Description: 18307730038-0803-141603-326.tif
Content-Disposition: attachment; filename="18307730038-0803-141603-326.tif"

Please note the Content-Transfer-Encoding value of base64. This explains why I use the following C# conversion code:

_data = Convert.FromBase64String(tiffEmailString);

_data is the private field behind the Data property, so it is the value that the SaveToFile method above writes to disk via this.Data.

Now for the eFax section header (the email that fails):

Content-Type: image/tiff; name=FAX_20130802_1375447833_61.tif
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="FAX_20130802_1375447833_61.tif"
Content-MD5: 1B2M2Y8AsgTpgAmY7PhCfg==

It also shows base64. So shouldn't the Convert.FromBase64String() call work?

I'm also going to check whether my parser is grabbing additional text. But if I'm missing something, please point it out. Thanks.

Latest update:

As it turns out, the issue was not the encoding but my parser! I was inadvertently including an additional header value in the attachment text. It's working now. Thanks.
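The original parser code is not shown, so the following is only a sketch of one way to avoid picking up header lines. In a MIME part, the headers end at the first blank line, so everything after that line can be treated as the encoded body regardless of how many headers the sender adds. `MimePartDecoder` and `DecodeBase64Body` are hypothetical names, and the sketch assumes the input is the text of a single part with the boundary lines already removed.

```csharp
using System;

public static class MimePartDecoder
{
    // Hypothetical helper (not from the original parser).
    // Assumes mimePart holds one MIME part: header lines, a blank line,
    // then the Base64 body, with no boundary lines included.
    public static byte[] DecodeBase64Body(string mimePart)
    {
        // Normalize line endings so the blank-line search is uniform.
        string normalized = mimePart.Replace("\r\n", "\n");

        // The first blank line separates the headers from the body.
        int separator = normalized.IndexOf("\n\n", StringComparison.Ordinal);
        if (separator < 0)
            throw new FormatException("No blank line separating headers from body.");

        // Skip the two newline characters; line breaks inside the body
        // are ignored by Convert.FromBase64String.
        string body = normalized.Substring(separator + 2);
        return Convert.FromBase64String(body);
    }
}
```

With this approach an extra header such as Content-MD5 stays on the header side of the split instead of being passed to the decoder.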
