How to print Unicode symbols U+2610 and U+2612 to Windows console with Java?


What I do:

public class Main {
    public static void main(String[] args) {
        char i = 0x25A0; // U+25A0 BLACK SQUARE
        System.out.println(i);
        i = 0x2612;      // U+2612 BALLOT BOX WITH X
        System.out.println(i);
        i = 0x2610;      // U+2610 BALLOT BOX
        System.out.println(i);
    }
}

What I get in IDE:

[image]

What I get in Windows console:

[image]

I have Windows 10 (Russian locale), with Cp866 as the default console encoding and UTF-8 as the encoding in the IDE. How can I make the characters display correctly in the console?


There are 3 answers

Joey (best answer)

Two problems here, actually:

  1. Java converts output to its default encoding which doesn't have anything to do with the console encoding, usually. This can apparently only be overridden at VM startup with, e.g.

    java -Dfile.encoding=UTF-8 MyClass
    
  2. The console window has to use a TrueType font in order to display Unicode. However, neither Consolas nor Lucida Console has ☐ or ☒. So they show up as boxes with Lucida Console and as boxes with a question mark with Consolas (i.e. the glyph the font uses for missing glyphs). The output is still fine, and you can copy/paste it easily; it just doesn't look right. Since the Windows console doesn't use font substitution (hard to do with a character grid anyway), there's little you can do to make them show up.

I'd probably just use [█], [ ], and [X] instead.
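A minimal sketch of that kind of substitution, assuming the only thing you can check from Java is whether the console's charset can encode a symbol (font coverage can't be detected this way). The class name, the method name and the "IBM866" default are my own choices for illustration, not anything from the question:

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class CheckboxSymbols {
    // Returns the Unicode symbol if the target charset can encode it,
    // otherwise the plain ASCII stand-in.
    public static String pick(Charset target, String symbol, String asciiFallback) {
        CharsetEncoder encoder = target.newEncoder();
        return encoder.canEncode(symbol) ? symbol : asciiFallback;
    }

    public static void main(String[] args) {
        // Assumption: the console code page is passed as the first argument;
        // "IBM866" is Java's canonical name for Cp866.
        Charset console = Charset.forName(args.length > 0 ? args[0] : "IBM866");
        System.out.println(pick(console, "\u2612", "[X]") + " done");
        System.out.println(pick(console, "\u2610", "[ ]") + " to do");
    }
}
```

With "IBM866" this falls back to [X] and [ ]; with "UTF-8" it keeps ☒ and ☐, which then still depend on the console font having the glyphs.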

bobince

Cp866 default coding in console

Well, yeah. Code page 866 doesn't include U+2610 or U+2612 (of the three, only U+25A0 is in it, as byte 0xFE). So even if Java were using the correct encoding for the console (either because you set something like -Dfile.encoding=cp866, or because it guessed the right encoding, which it almost never manages), you couldn't get those two characters out.
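To see what would actually reach a Cp866 console, you can encode the three characters by hand and look at the bytes. This is only a sketch: the class and method names are made up for illustration, and it assumes your JDK ships the IBM866 charset (Java's canonical name for Cp866).

```java
import java.nio.charset.Charset;

public class Cp866Bytes {
    // Encodes a single character with the given charset and returns
    // the resulting bytes as space-separated uppercase hex.
    public static String hexBytes(char c, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : String.valueOf(c).getBytes(cs)) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Charset cp866 = Charset.forName("IBM866");
        for (char c : new char[] {0x25A0, 0x2610, 0x2612}) {
            System.out.printf("U+%04X -> %s%n", (int) c, hexBytes(c, cp866));
        }
    }
}
```

A byte of 3F is a literal '?': String.getBytes silently substitutes it for any character the charset cannot represent, so that character is lost before the console ever sees it.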

How to make characters in console look correct?

You can't.

In theory you could use -Dfile.encoding=utf-8, and set the console encoding to UTF-8 (or near enough, code page 65001). Unfortunately the Windows console is broken for multi-byte encodings (other than the legacy locale-default supported ones, which UTF-8 isn't); you'll get garbled output and hangs on input. This approach is normally unworkable.
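For completeness, here is roughly what the in-theory route looks like from the Java side. Note that this swaps -Dfile.encoding for a different technique, wrapping standard output in a PrintStream with an explicit encoding, and it assumes the console has already been switched with chcp 65001. The class and helper names are hypothetical, and none of this works around the console problems described above.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Out {
    // Wraps a byte stream in a PrintStream that always encodes as UTF-8,
    // regardless of the JVM's default encoding.
    public static PrintStream utf8(OutputStream raw) {
        try {
            return new PrintStream(raw, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(new FileOutputStream(FileDescriptor.out));
        out.println("\u25A0 \u2612 \u2610");
    }
}
```

This only guarantees that the right UTF-8 bytes leave the JVM; whether the console decodes and draws them is a separate matter.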

The only reliable way to get Unicode to the Windows console is to skip the byte-based C-standard-library I/O functions that Java uses and go straight to the Win32 native WriteConsoleW interface, which accepts Unicode characters (well, UTF-16 code units, same as Java strings) and so avoids the console bugs in byte conversion. You can use JNA to access this API; see the example code in the question "Java, UTF-8, and Windows console". It takes some extra tedious work, though, if you want it to switch between console character output and regular byte output for command piping.

And then you have to hope the user has non-raster fonts (as @Joey mentioned), and then you have to hope the font has glyphs for the characters you want (Consolas doesn't for U+2610 or U+2612). Unless you really, really have to, getting the Windows console to do Unicode is largely a waste of your time.

GAlexMES

Are you sure that the font you use has glyphs for these characters? No font supports every possible Unicode character. U+2610, U+25A0 and U+2612 (decimal 9744, 9632 and 9746) are not supported by, e.g., the Arial font. You can change the font of your IDE console and of your Windows console too.