Display (Polish) characters properly

I'm reading an XML file which contains German, French, Spanish, English and Polish text.

To handle the Polish letters (which caused the most trouble) I tried to do it like this:

File file = new File(path);
InputStream is = new FileInputStream(file);
Reader reader = new InputStreamReader(is, charset);

InputSource src = new InputSource(reader);
src.setEncoding(charset.name());

SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();

saxParser.parse(src, handler);

The problem I encountered was that none of the default charsets displays the text properly. Some produce question marks, others a combination of other characters, e.g. ÄÖ.

To break it down a bit, I wrote another snippet to test which charset works:

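import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetTest {
    public static void main(String[] args) {
        Charset charset = StandardCharsets.UTF_8;
        String chars = "śłuna długie";
        System.out.println(new String(chars.getBytes(charset), charset));
    }
}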

Again, I tested every single one, but nothing works. I hope you've got an idea.

1 answer

Answer by codewing (best answer)

My solution: change the encoding of your IDE.

I had been using the default encoding of my IDE (IntelliJ), which was "windows-1252" because I'm using Windows on this PC. That encoding cannot represent Polish letters such as ś and ł.
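To see the effect in isolation, here is a minimal sketch (not my original test code). The class name EncodingCheck and the helper roundTrip are made up for illustration. The string is written with Unicode escapes, so the result does not depend on how the source file itself is saved, and it prints booleans only, so the console encoding does not matter either.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Hypothetical helper: encode the text with the charset, then decode it again.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The test string from the question, written with Unicode escapes
        // so this file compiles the same under any source encoding.
        String polish = "\u015b\u0142una d\u0142ugie";
        Charset cp1252 = Charset.forName("windows-1252");

        // Can windows-1252 represent every character of the string?
        System.out.println(cp1252.newEncoder().canEncode(polish));
        // Does the text survive a windows-1252 round trip?
        System.out.println(roundTrip(polish, cp1252).equals(polish));
        // Does the text survive a UTF-8 round trip?
        System.out.println(roundTrip(polish, StandardCharsets.UTF_8).equals(polish));
    }
}
```

This should print false, false, true: getBytes substitutes a question mark for every character that windows-1252 cannot map, which matches the question marks described in the question.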

So I changed it to UTF-8, and the short test code worked fine for me.
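For the XML file itself, one option that goes beyond what I tested is to hand the parser the raw byte stream instead of a Reader, so that it picks the encoding from the XML declaration. The following is only a sketch under that assumption: the class name SaxEncodingSketch, the helper readText and the ISO-8859-2 sample document are all invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEncodingSketch {
    // Hypothetical helper: parse from bytes and collect all character data.
    // No Reader and no explicit charset: the parser reads the declared encoding.
    static String readText(InputStream in) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document, stored as ISO-8859-2 bytes (an assumption,
        // chosen only because it differs from UTF-8 and covers Polish letters).
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-2\"?><t>d\u0142ugie</t>";
        byte[] bytes = xml.getBytes("ISO-8859-2");

        String text = readText(new ByteArrayInputStream(bytes));
        // Compare against the expected string instead of printing it,
        // so the console encoding cannot distort the result.
        System.out.println(text.equals("d\u0142ugie"));
    }
}
```

With a real file, a FileInputStream would take the place of the ByteArrayInputStream. If the file has no encoding declaration or a wrong one, this approach will not help, and an explicit charset is still needed.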