How to convert Latin-1 to UTF-8 correctly in Java?


When I try to convert a latin1 String to UTF-8 in Java, something goes wrong. Here is the code:

    byte[] latin2 = "¦ñ¨ãÓñ²½ìá".getBytes("ISO-8859-1");
    byte[] latin1 = "¦á¨ãÓñ²½ìá".getBytes("ISO-8859-1");
    byte[] utf8 = new String(latin1, "GB2312").getBytes("GB2312");
    byte[] utf81 = new String(latin2, "GB2312").getBytes("GB2312");
    System.out.println(new String(utf8,"GB2312"));
    System.out.println(new String(utf81,"GB2312"));

The output is

 ?ㄣ玉步灬
 ?ㄣ玉步灬

So I'm confused about it. How can I convert latin1 to UTF-8 exactly?

The DB field is:

`name` char(20) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,

There is 1 answer

Bertrand G.

The second parameter of a new String(bytes, charset) call sets the charset used to decode the byte array (from the Javadoc: "charset The charset to be used to decode the bytes"). Hence in your case it should be the same charset you used to encode the bytes, "ISO-8859-1":

    new String(latin1, "ISO-8859-1").getBytes("GB2312");
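
If the target really is UTF-8, as the title suggests, the same rule applies: decode with the charset the bytes were written in, then encode with the one you want. Below is a minimal sketch of that, assuming the input bytes are genuinely ISO-8859-1. The class name `Latin1ToUtf8` and the method `latin1ToUtf8` are made up for illustration, and the sample bytes are not taken from the question.

```java
import java.nio.charset.StandardCharsets;

class Latin1ToUtf8 {
    // Hypothetical helper: re-encode bytes that are assumed to be ISO-8859-1 as UTF-8.
    static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        // Decode with the charset the bytes were originally encoded in...
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        // ...then encode the resulting String with the target charset.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xE9 and 0xF1 are U+00E9 and U+00F1 in ISO-8859-1 (sample data, not from the question).
        byte[] latin1 = {(byte) 0xE9, (byte) 0xF1};
        byte[] utf8 = latin1ToUtf8(latin1);

        // Each of these two characters takes two bytes in UTF-8.
        System.out.println(utf8.length); // prints 4
    }
}
```

Using the `StandardCharsets` constants instead of charset names as strings also avoids the checked `UnsupportedEncodingException`. Note that this only helps when the bytes really are Latin-1 text. If the latin1 column actually holds bytes written in another encoding, those bytes would have to be decoded with that encoding instead.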