I have a .rtf file. The file is in windows-1251 encoding.
I need to save its contents to another file in UTF-8 encoding, and the resulting file must be readable.
I have tried many variants, read the Javadocs and other sources, and spent two days searching for an answer, but I still cannot convert it into a readable file.
Here is a file with that string, which you can download to run my tests.
This is an image of the file's content.
Here are my Java tests, which you can use to try converting the file.
These are short excerpts of my code from that file:
@Test
public void windows1251toUtf8() throws IOException {
    // Prepare the destination file
    File dir = new File("/tmp/TESTS/");
    if (!dir.exists() && !dir.mkdirs()) {
        throw new RuntimeException("Can't create destination dir");
    }
    File destination = new File(dir, "test.rtf");
    if (!destination.exists() && !destination.createNewFile()) {
        throw new RuntimeException("Can't create destination file");
    }
    //-----------------------------------------------------------------------------------------
    // Does not work
    InputStream inputStream = getClass().getClassLoader().getResourceAsStream("utils/encoding/windows1521File.rtf");
    Scanner sc = new Scanner(inputStream, "WINDOWS-1251");
    StringJoiner stringBuilder = new StringJoiner("\n");
    while (sc.hasNextLine()) {
        stringBuilder.add(sc.nextLine());
    }
    String text = decode(stringBuilder.toString(), "WINDOWS-1251", "UTF-8");
    byte[] bytes = text.getBytes(Charset.forName("UTF-8"));
    Files.write(bytes, destination);
    //-----------------------------------------------------------------------------------------
    // Does not work
    URL resource = getClass().getClassLoader().getResource("utils/encoding/windows1521File.rtf");
    String string = FileUtils.readFileToString(new File(resource.getPath()), Charset.forName("WINDOWS-1251"));
    byte[] bytes = convertEncoding(string.getBytes(), "WINDOWS-1251", "UTF-8");
    FileUtils.writeByteArrayToFile(destination, bytes);
    //-----------------------------------------------------------------------------------------
    // Does not work
    InputStream inputStream = getClass().getClassLoader().getResourceAsStream("utils/encoding/windows1521File.rtf");
    byte[] bytes = IOUtils.toByteArray(inputStream);
    String s = new String(bytes);
    byte[] bytes2 = s.getBytes("WINDOWS-1251");
    FileUtils.writeByteArrayToFile(destination, bytes2);
}
public static byte[] convertEncoding(byte[] bytes, String from, String to) throws UnsupportedEncodingException {
    return new String(bytes, from).getBytes(to);
}

public static String decode(String text, String textCharset, String resultCharset) {
    if (StringUtils.isEmpty(text)) {
        return text;
    }
    try {
        byte[] bytes = text.getBytes(textCharset);
        ByteArrayInputStream inputStream = new ByteArrayInputStream(bytes);
        byte[] tmp = new byte[bytes.length];
        int n = inputStream.read(tmp);
        byte[] res = new byte[n];
        System.arraycopy(tmp, 0, res, 0, n);
        return new String(res, resultCharset);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
In all cases, the result looks something like this:
Or like this:
Is there any way to do this conversion?
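
For comparison, the direct transcoding that the tests above are aiming for can be sketched as follows. This is only a minimal sketch, assuming the source file really contains raw windows-1251 bytes; the class name TranscodeSketch and the helper transcode are hypothetical and not part of the tests above. If the .rtf instead stores Cyrillic text as RTF hex escapes such as \'ef, the file is effectively ASCII and this kind of transcoding would leave it unchanged.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TranscodeSketch {

    // Hypothetical helper: decode the source bytes with `from`,
    // then re-encode the resulting text with `to`.
    public static void transcode(Path source, Path target, Charset from, Charset to) throws IOException {
        byte[] raw = Files.readAllBytes(source);   // bytes exactly as stored on disk
        String text = new String(raw, from);       // decode once, with the SOURCE charset
        Files.write(target, text.getBytes(to));    // encode once, with the TARGET charset
    }

    public static void main(String[] args) throws IOException {
        Charset cp1251 = Charset.forName("windows-1251");

        // Build a small windows-1251 sample in a temp file (assumption: raw bytes, no RTF escapes).
        Path source = Files.createTempFile("cp1251-", ".txt");
        Path target = Files.createTempFile("utf8-", ".txt");
        String sample = "\u041f\u0440\u0438\u0432\u0435\u0442"; // Cyrillic sample word
        Files.write(source, sample.getBytes(cp1251));

        transcode(source, target, cp1251, StandardCharsets.UTF_8);

        // Read the result back as UTF-8 and compare with the original text.
        String roundTrip = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals(sample));
    }
}
```

The point of the sketch is that the source charset is used exactly once (when decoding bytes to a String) and the target charset exactly once (when encoding the String back to bytes); no platform-default charset is involved at any step.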