Print emoji/foreign characters in CLI/shell/Terminal using PHP


When I run a script in the terminal containing <?php echo "🚀पीएचपी";, it displays garbage characters instead of the emoji and foreign text.

Specifically, it displays 🚀पीएचपी.

However, running a Node.js script containing console.log("🚀पीएचपी") correctly displays the emoji and the foreign text as 🚀पीएचपी.

How can I echo/print emojis and the foreign text properly so they display as intended in the CLI when using PHP?

Any suggestions on how to resolve this and get PHP to display emojis and unicode text correctly in the terminal?

This scenario has been tested in Windows Terminal (PowerShell 7), cmd and Git Bash (MINGW64).

Running chcp in my Windows Terminal returns 65001 (which is UTF-8), so the terminal itself is properly configured for UTF-8. Reference for chcp: https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers?redirectedfrom=MSDN

Minimal Reproducible Example:

  1. Run chcp 65001 in Windows Terminal/cmd.
  2. Run chcp again to make sure it returns Active code page: 65001.
  3. Run the PHP script below (make sure extension=mbstring is enabled in php.ini):
<?php

$utf8_string = "🚀पीएचपी";
$detected_encoding = mb_detect_encoding($utf8_string);

echo "Detected encoding[$utf8_string]: " . $detected_encoding;
  4. I still got this displayed:
Detected encoding[🚀पीएचपी]: UTF-8

ADDENDUM: I am using PHP 7.0. It works in PHP 8.2 but not in PHP 7.0.


There are 4 answers

0
Aizzat Suhardi On BEST ANSWER
<?php
// Switch the console to code page 65001 (UTF-8) before printing anything.
shell_exec("chcp 65001");
echo "Hello, पीएचपी";

You have to run shell_exec("chcp 65001") once before outputting emojis and foreign text. This answer has been tested with PHP 7.0 using Windows Terminal and PowerShell.

sapi_windows_cp_set(), the counterpart of the sapi_windows_cp_get() function pointed out by @Olivier, is only available in PHP >= 7.1.
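To illustrate how the two approaches could be combined, here is a minimal sketch that prefers sapi_windows_cp_set() when it exists and otherwise falls back to the chcp call from this answer. The helper name use_utf8_console() and its return strings are made up for the example, and I have not tested it on every PHP version, so treat it as a starting point.

```php
<?php
// Hypothetical helper (not part of PHP): try to switch the Windows console
// to UTF-8 using whichever mechanism the running PHP version offers.
function use_utf8_console()
{
    // PHP_OS_FAMILY only exists from PHP 7.2, so check the path separator
    // instead to stay compatible with PHP 7.0.
    if (DIRECTORY_SEPARATOR !== '\\') {
        return 'not-windows'; // nothing to change on Linux/macOS
    }

    if (function_exists('sapi_windows_cp_set')) {
        // PHP >= 7.1 CLI on Windows: set the code page directly.
        return sapi_windows_cp_set(65001) ? 'sapi_windows_cp_set' : 'failed';
    }

    // PHP 7.0 fallback: run chcp in a child shell, as in the answer.
    shell_exec('chcp 65001');
    return 'chcp';
}

echo 'Code page handled via: ', use_utf8_console(), PHP_EOL;
echo "Hello, पीएचपी", PHP_EOL;
```

Calling the helper once at the top of a CLI script, before the first echo, should be enough; the function_exists() check keeps the same script usable on both PHP 7.0 and newer versions.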

3
Shila Mosammami On

The issue you're seeing might be related to the terminal's encoding settings, rather than PHP itself. Your terminal needs to support and be set to use UTF-8 to correctly display the emoji and foreign text. The mb_detect_encoding function is detecting the encoding of the string as UTF-8, which is correct.

To verify that PHP is correctly handling the UTF-8 encoded string, you could write the string to a file and then open that file in a text editor that you know supports UTF-8. If the text displays correctly in the text editor, then PHP is handling the UTF-8 encoding correctly, and the issue is likely with your terminal's settings.
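As a rough sketch of that check, the snippet below writes a string to a file and prints the bytes it reads back as hex, so you can confirm the data without depending on the terminal at all. The helper name write_and_dump() and the file name utf8_check.txt are just placeholders I picked for the example.

```php
<?php
// Hypothetical helper: write $text to $path, read it back and return the
// raw bytes as space-separated hex pairs.
function write_and_dump($text, $path)
{
    file_put_contents($path, $text);
    $bytes = file_get_contents($path);

    return implode(' ', str_split(bin2hex($bytes), 2));
}

// Placeholder location; any writable path works.
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'utf8_check.txt';

// Hex uses only ASCII characters, so it prints correctly in any code page.
echo write_and_dump("पी", $path), PHP_EOL;
```

If the script file itself is saved as UTF-8, the two characters should come out as e0 a4 aa e0 a5 80 (three bytes each). Seeing those bytes means PHP is passing the string through unchanged, which points at the console rather than PHP.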

0
fat penguin On

It seems that the problem is not the encoding but the font you are using. While the encoding is correct, the font used by your Windows terminal (cmd / PowerShell) may not contain the glyphs needed for these characters.

Do you have the Arabic language pack installed? It may be helpful as well.

Just as a point of reference, my output of your script looks like this: [image: script output]

However, it looks perfectly normal when I copy and paste it into a browser: Detected encoding[पीएचपी]: UTF-8

Sorry I can't be of more help; I hope this points you in the right direction.

1
Olivier On

PHP 7.1 introduced a number of changes related to code pages on Windows (see here for the details). One of those changes is the call to php_win32_cp_cli_setup() in the CLI SAPI. That function ultimately calls the SetConsoleOutputCP() Win32 API to set the code page associated with the console.

The code page is set according to the default_charset PHP option. By default, the value of that option is UTF-8, so the code page is set to 65001:

C:\Users\Olivier>C:\php\php.exe -r "echo sapi_windows_cp_get();"
65001

If I set default_charset = "windows-1252" in php.ini, I get:

C:\Users\Olivier>C:\php\php.exe -r "echo sapi_windows_cp_get();"
1252

You mentioned in a comment that you were using PHP 7.0. With that version, the CLI runs with the default OEM code page, which causes your encoding mismatch.
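To see which situation a given installation is in, a small diagnostic along these lines may help. It is only a sketch: console_codepage_report() is a name I made up, and the function_exists() guard is there because sapi_windows_cp_get() is only defined in the Windows CLI from PHP 7.1 onwards.

```php
<?php
// Hypothetical diagnostic: collect what PHP knows about the console code pages.
function console_codepage_report()
{
    $report = array(
        'php_version'     => PHP_VERSION,
        'default_charset' => ini_get('default_charset'),
    );

    if (function_exists('sapi_windows_cp_get')) {
        // No argument: the current code page of the process.
        $report['current'] = sapi_windows_cp_get();
        // 'oem' and 'ansi' return the operating system's default code pages.
        $report['oem']  = sapi_windows_cp_get('oem');
        $report['ansi'] = sapi_windows_cp_get('ansi');
    } else {
        $report['current'] = 'unavailable (needs PHP >= 7.1 CLI on Windows)';
    }

    return $report;
}

foreach (console_codepage_report() as $key => $value) {
    echo str_pad($key, 16), $value, PHP_EOL;
}
```

On PHP 7.1+ with the default default_charset, I would expect the current entry to show 65001; on PHP 7.0 the function is missing, so the report only shows the version and charset setting.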