Can an email header have a different character encoding than the body of the email?


Is an email with different character encodings for its header and body valid? The use case: while processing an email, should I check the character encoding of its header separately, or is checking that of its body sufficient? Can someone guide me on how to figure this out? Thanks in advance!


There is 1 answer

Answered by klarki (best answer)

Email headers should use the ASCII charset. If you want a header field to carry text in a different encoding, you need to use the encoded-word syntax: http://en.wikipedia.org/wiki/MIME#Encoded-Word

The email body can be sent directly in a non-ASCII encoding only if every mail server that relays it supports the 8BITMIME extension (nowadays nearly every mail server should, but it's not guaranteed). Otherwise you need to apply a transfer encoding to the body (quoted-printable or base64).

The charset can be different in each case: every encoded word can use a different charset, and every mail part can use a different charset and even a different transfer encoding.

For example, you can have a header value in UTF-8, encoded with quoted-printable:

Subject: =?UTF-8?Q?Zg=C5=82oszenie?=

and the body in ISO-8859-2, encoded with base64:
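
To see what that header value actually contains, here is a minimal sketch using Python's standard `email.header` module. The variable names are only illustrative; the point is that each encoded word carries its own charset label, so the header is decoded without looking at the body at all.

```python
from email.header import decode_header, make_header

raw_subject = "=?UTF-8?Q?Zg=C5=82oszenie?="

# decode_header returns a list of (value, charset) pairs, one per
# encoded word or plain run found in the header value.
parts = decode_header(raw_subject)

# make_header reassembles the pairs into a Header object;
# str() on it yields the decoded Unicode text.
subject = str(make_header(parts))
print(subject)
```

If a header mixes several encoded words with different charsets, `decode_header` returns one pair per word, and `make_header` decodes each with its own charset.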

Content-Type: text/plain; charset="iso-8859-2"
Content-Transfer-Encoding: base64

WmG/87PmIEfqtmyxIEphvPE=

Different charsets and different transfer encodings in the same email: no problem.

From experience I can tell you that such mails are very common. Even worse, you can get an email that states one charset in the Content-Type header and another charset in the HTML body's meta tag:
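
As a rough illustration of processing such a message, the sketch below parses the two example fragments above as one message with Python's standard `email` package. The raw message is assembled by hand here (the `MIME-Version` line and the CRLF line endings are my additions, not part of the original example), and the modern `email.policy.default` policy is assumed so that headers and body are decoded automatically.

```python
from email import message_from_bytes
from email.policy import default

# Hand-assembled message combining the header and body examples above.
raw = (
    b"Subject: =?UTF-8?Q?Zg=C5=82oszenie?=\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: text/plain; charset="iso-8859-2"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"WmG/87PmIEfqtmyxIEphvPE=\r\n"
)

msg = message_from_bytes(raw, policy=default)

# The header is decoded from its own encoded-word charset (UTF-8).
subject = str(msg["Subject"])

# The body charset comes from the part's Content-Type header (iso-8859-2);
# get_content() undoes the base64 transfer encoding and decodes the text.
body_charset = msg.get_content_charset()
body = msg.get_content()

print(subject)
print(body_charset)
print(body)
```

The header and the body are handled independently, which is why checking only the body's charset is not enough. For multipart messages, the same check would be repeated per part, for example by iterating over `msg.walk()`.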

Content-Type: text/html; charset="iso-8859-2"

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

It's up to you to guess which charset was actually used. It's probably the one in the meta tag.
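
One possible way to do that guessing is sketched below, in Python. It is a heuristic of my own for illustration, not a standard algorithm: the function name `decode_html_body`, the regular expression, and the fallback order (meta tag first, then the Content-Type charset, then UTF-8, then Latin-1) are all assumptions you may want to change.

```python
import re
from typing import Optional, Tuple

# Loose pattern for "charset=..." inside a <meta> tag. This is an
# illustrative heuristic, not a full HTML parser.
META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE
)

def decode_html_body(payload: bytes, declared: Optional[str]) -> Tuple[str, str]:
    """Return (text, charset_used) for an already transfer-decoded HTML body."""
    candidates = []
    match = META_CHARSET.search(payload)
    if match:
        # Assumption from the answer above: the meta tag is the better guess.
        candidates.append(match.group(1).decode("ascii"))
    if declared:
        candidates.append(declared)
    candidates.append("utf-8")

    for name in candidates:
        try:
            return payload.decode(name), name
        except (LookupError, UnicodeDecodeError):
            # Unknown charset name or bytes invalid in it: try the next one.
            continue

    # Last resort: Latin-1 maps every byte, so this never raises.
    return payload.decode("latin-1"), "latin-1"

html = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=utf-8"></head>'
    "<body>Zg\u0142oszenie</body></html>"
).encode("utf-8")
text, used = decode_html_body(html, "iso-8859-2")
print(used)
```

Strict UTF-8 decoding fails on most text that is really in a single-byte charset, so trying it early is a useful sanity check. Single-byte charsets such as ISO-8859-2 accept almost any byte sequence, so a successful decode with them proves little.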

Assume nothing. Expect everything. Take no prisoners. This is Sparta.