How to get the two-byte quoted-printable representation of a UTF-8 character using JavaMail's MimeUtility or Apache Commons Codec?


I have a string that contains the German character ü. Its Unicode code point is U+00FC (the single byte 0xFC in ISO-8859-1), but in UTF-8 it is the two bytes 0xC3 0xBC, so its quoted-printable sequence should be =C3=BC rather than =FC. However, using JavaMail's MimeUtility as shown below, I only get the single-byte representation.

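import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.mail.internet.MimeUtility;

String s = "Für";
ByteArrayOutputStream baos = new ByteArrayOutputStream ();
// Closing the encoder stream makes sure all encoded output has reached baos.
try (OutputStream encodedOut = MimeUtility.encode (baos, "quoted-printable")) {
    encodedOut.write (s.getBytes (StandardCharsets.UTF_8));
}
String encoded = baos.toString ();   // observed result: F=FCr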

(Using StandardCharsets.US_ASCII instead of UTF_8 resulted in F?r, which is obviously not what I want.)
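For comparison, here is a minimal sketch that uses only the JDK to show which quoted-printable output each byte encoding of the string would lead to. The class name QpBytesDemo and the helper qp are hypothetical and written for illustration; qp is a simplified encoder (no soft line breaks, no special handling of trailing whitespace or line endings) and is not how MimeUtility is implemented. The ü is written as the escape \u00FC so that the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class QpBytesDemo {
    // Hypothetical, simplified quoted-printable encoder for illustration only:
    // printable ASCII (including space) other than '=' is copied as-is,
    // every other byte becomes "=XX" with two uppercase hex digits.
    static String qp(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 32 && v <= 126 && v != '=') {
                sb.append((char) v);
            } else {
                sb.append('=').append(String.format("%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "F\u00FCr"; // "Für", independent of source-file encoding
        // UTF-8 encodes U+00FC as the two bytes C3 BC.
        System.out.println(qp(s.getBytes(StandardCharsets.UTF_8)));      // F=C3=BCr
        // ISO-8859-1 encodes U+00FC as the single byte FC.
        System.out.println(qp(s.getBytes(StandardCharsets.ISO_8859_1))); // F=FCr
    }
}
```

If the string really holds U+00FC and is encoded as UTF-8, the two-byte form appears, so output such as F=FCr suggests that the bytes or the string differ from what is expected before they reach the encoder.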

I have also looked at Apache Commons Codec's QuotedPrintableCodec, which I used like this:

String s = "Für";
QuotedPrintableCodec qpc = new QuotedPrintableCodec ();
String encoded = qpc.encode (s, StandardCharsets.UTF_8);

However, this resulted in F=EF=BF=BDr, which resembles what Java's URLEncoder produces (F%EF%BF%BDr, with % instead of = as the escape character), and which I don't understand.
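The bytes EF BF BD are the UTF-8 encoding of U+FFFD, the Unicode replacement character. One way such a sequence can arise, sketched below with the JDK only, is when the single ISO-8859-1 byte 0xFC is decoded as UTF-8 and the resulting string is encoded again. Whether this is what happens here is an assumption; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Formats bytes as space-separated uppercase hex, e.g. "46 FC 72".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the bytes as UTF-8 (malformed input becomes U+FFFD)
    // and encodes the resulting string as UTF-8 again.
    static byte[] decodeAsUtf8AndReencode(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Für" as ISO-8859-1 bytes: the ü is the single byte 0xFC.
        byte[] latin1 = { 'F', (byte) 0xFC, 'r' };
        System.out.println(hex(latin1));                          // 46 FC 72
        // 0xFC alone is not valid UTF-8, so it is replaced by U+FFFD.
        System.out.println(hex(decodeAsUtf8AndReencode(latin1))); // 46 EF BF BD 72
    }
}
```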

I'm getting the string from a JavaMail MimeMessage using a ByteArrayOutputStream like so:

ByteArrayOutputStream baos = new ByteArrayOutputStream ();
message.writeTo (baos);
String s = baos.toString ();
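Note that baos.toString () without an argument decodes the bytes with the JVM's default charset, so the resulting string depends on the platform. The sketch below, which uses only the JDK, shows the difference an explicit charset makes. The method sampleOutput is a hypothetical stand-in for message.writeTo (baos) and assumes the message bytes are UTF-8, which may not hold for a real message.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitCharsetDemo {
    // Hypothetical stand-in for message.writeTo(baos):
    // writes "Für" as the UTF-8 bytes 46 C3 BC 72.
    static ByteArrayOutputStream sampleOutput() {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] raw = { 'F', (byte) 0xC3, (byte) 0xBC, 'r' };
        baos.write(raw, 0, raw.length);
        return baos;
    }

    public static void main(String[] args) {
        byte[] bytes = sampleOutput().toByteArray();

        // Decoding with the charset the bytes were written in restores the text.
        String asUtf8 = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(asUtf8.equals("F\u00FCr")); // true

        // ISO-8859-1 maps every byte to exactly one char, so the text looks
        // wrong (4 chars) but the original bytes can be recovered unchanged.
        String asLatin1 = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.length()); // 4
        byte[] roundTrip = asLatin1.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(roundTrip, bytes)); // true
    }
}
```

Working on the byte array directly, rather than on a decoded string, avoids the charset question entirely when the goal is only to store or forward the raw message.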

On the initial store procedure, I receive a string containing a literal (whose correct quoted-printable sequence seems to be =EF=BF=BD) instead of an umlaut-u. However, on any consecutive request Thunderbird makes (e.g. copying to Sent), I receive the correct ü. Is that something I can fix?

What I would like to get is the two-byte representation (=C3=BC) as required by IMAP and the respective mail clients. How would I go about that?
