Wrong default system encoding detected in Java application

327 views Asked by Soamid At 16 March 2022 at 14:40

What is the problem?

I've noticed a strange problem: Java reports different default charsets when run on the same machine and OS (Windows 10). If I run my Gradle application from a console, Charset.defaultCharset() returns Windows-1250. When I run it from IntelliJ (also as a Gradle app), it returns Windows-1252.

It gets even stranger when I run it on a different computer with Windows 11: the results are reversed, with Windows-1252 from a console and Windows-1250 in IntelliJ.

The correct system encoding for my OS (Polish version of Win 10/11) should always be Windows-1250 as far as I know.

I use AdoptOpenJDK 16, Gradle 7.0 and IJ 2021.3.2.
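
To compare the two launch environments, it may help to print the properties that feed the default charset. On JDK 16, Charset.defaultCharset() is derived from the file.encoding system property at JVM startup, so a launcher that passes -Dfile.encoding would change the result. The sketch below is only a diagnostic aid; the class name EncodingReport is hypothetical, sun.jnu.encoding is an internal, unsupported property, and native.encoding exists only from JDK 17 onward, so it prints null on JDK 16.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the values that influence the default charset.
class EncodingReport {
    static Map<String, String> describe() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        // file.encoding: read once at JVM startup; can be overridden with -Dfile.encoding=...
        // sun.jnu.encoding: internal property, shown for comparison only
        // native.encoding: added in JDK 17, null on older versions
        for (String key : new String[] {"file.encoding", "sun.jnu.encoding", "native.encoding"}) {
            report.put(key, System.getProperty(key));
        }
        return report;
    }

    public static void main(String[] args) {
        describe().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running this once from the console and once from IntelliJ should show which property differs between the two environments.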

Why is it important in my case?

My Java application executes external Python scripts and communicates with the Python processes, created by ProcessBuilder, via Process.getInputStream() and Process.getOutputStream(). When I send data containing non-ASCII characters through that stream, the characters are replaced with ? and read as such on the Python side. For example, on the Java side I send a line like this:

try (var inputWriter = new BufferedWriter(new OutputStreamWriter(scriptProcess.getOutputStream()))) {
    inputWriter.write("Właściciel");
}

and on the Python side I read the data like this:

from sys import stdin

inputBuffer = []
for line in stdin:
    inputBuffer.append(line.rstrip())

When I print inputBuffer or write it to a file, it shows W?a?ciciel. It's worth noting that this behavior doesn't depend on the encoding of the input string itself: "Właściciel" can be read from a UTF-8, Windows-1250, or Windows-1252 file and the problem remains the same.
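
The question marks are consistent with how Java encodes characters that the target charset cannot represent: String.getBytes(Charset) substitutes the charset's replacement byte, which is ? for Windows-1252. Java strings are UTF-16 internally, which would also explain why the source file's encoding makes no difference. A minimal sketch of that effect, using a hypothetical LossyEncodingDemo class and Unicode escapes so the example does not depend on the source file's own encoding:

```java
import java.nio.charset.Charset;

// Hypothetical demo: encode a string with a charset, then decode it again.
class LossyEncodingDemo {
    static String roundTrip(String text, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        // Unmappable characters are replaced during encoding, not during decoding.
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String word = "W\u0142a\u015Bciciel"; // "Właściciel"
        // Windows-1252 has no ł or ś, so both become '?'.
        System.out.println(roundTrip(word, "windows-1252").equals("W?a?ciciel")); // prints true
        // Windows-1250 contains both letters, so the word survives.
        System.out.println(roundTrip(word, "windows-1250").equals(word)); // prints true
    }
}
```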

If I force the correct encoding by passing it as a parameter to the Writer:

var writer = new OutputStreamWriter(scriptProcess.getOutputStream(), "Windows-1250");

...then it works and the question marks disappear. But I feel that hardcoding the "system encoding" is not a good solution, because it will break if someone runs my app on Windows with other regional settings (e.g. English, where the default ANSI code page is typically Windows-1252 rather than Windows-1250).

So my question is: is there another way to determine the valid system encoding, or to set up communication between the processes that is independent of the system encoding and regional settings?
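
One approach that avoids guessing the system encoding is to fix both ends of the pipe to UTF-8. The sketch below is an illustration under assumptions, not a confirmed fix for this setup: it assumes a python executable on the PATH and a caller-supplied script path, and it relies on Python's documented PYTHONIOENCODING environment variable to make the child decode stdin as UTF-8. The class and method names are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: both sides agree on UTF-8 regardless of regional settings.
class Utf8Pipe {
    // Builds (but does not start) a Python process whose stdio is forced to UTF-8.
    static ProcessBuilder pythonProcess(String scriptPath) {
        ProcessBuilder builder = new ProcessBuilder("python", scriptPath); // assumed executable name
        builder.environment().put("PYTHONIOENCODING", "utf-8");
        return builder;
    }

    // Writes one line as UTF-8 bytes, independent of Charset.defaultCharset().
    static void sendLine(OutputStream out, String line) throws IOException {
        Writer writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        writer.write(line);
        writer.write('\n');
        writer.flush(); // the caller keeps ownership of the stream and closes it
    }
}
```

A caller would start the process with Utf8Pipe.pythonProcess(path).start(), pass scriptProcess.getOutputStream() to sendLine, and then close the stream so the Python loop over stdin terminates. Output read back from the script would need an InputStreamReader with the same charset.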

There are 0 answers
