Azure CLI and PowerShell 7.3 OutputEncoding issue


I run two different commands in PowerShell 7.3 within Windows Terminal.

  1. Run the az devops user show command directly.

    az devops user show --user $UserPrincipalName -o json
    


  2. Run the same command enclosed in $(...).

    $(az devops user show --user $UserPrincipalName -o json)
    


The difference arises because my user info contains Chinese characters. The first command displays the Chinese characters correctly, whereas the second command shows garbled text on the screen.

If I run the same commands in Windows PowerShell 5.1, the Chinese characters display perfectly. Why?

What's the difference between these two commands? How can I display the Chinese characters correctly on the screen when running the second command in PowerShell 7.3?

Edit

I tried running the following before running my second command, and the output then displayed the Chinese characters correctly:

[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding('big5')

Is there any way to force the az command to produce UTF-8-encoded output?


There are 2 answers

mklement0 On BEST ANSWER

tl;dr

  • If both the OEM and ANSI code pages associated with your computer's legacy system locale (aka language for non-Unicode programs) are 950 - as they would be with Chinese (Traditional, Taiwan), for instance - no extra effort should be needed to properly decode az's output - except if you went out of your way to use a different code page for your PowerShell (Core) 7.3 sessions (see next section).

    • If code page 950, "Microsoft's implementation of the de facto standard Big5 character encoding." is sufficient for your needs, i.e. if it can encode all characters you need, there is no need for UTF-8.

    • To query the code pages in effect, run:

      Get-ItemPropertyValue registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage OEMCP, ACP
      
    • To see the name of the system locale in effect or change the system locale (requires administrative privileges), run intl.cpl, select the Administrative tab, and click on Change system locale...

  • If you do need UTF-8 output:

    • There is seemingly no direct way to instruct az to produce UTF-8 output (see next section for details).

    • There are only two - suboptimal - workarounds:

      • Either: Change your system to use UTF-8 system-wide, in which case both the OEM and ANSI code page default to 65001, i.e. UTF-8. However, this has far-reaching consequences and should only be done after careful consideration - see this answer.

        • That said, with this one-time workaround in place, no extra effort is needed: az (as well as any other CLI that uses the system locale's code pages) will output UTF-8, and PowerShell will decode it as such.
      • Or: Create a modified entry point for the az CLI:

        • Define the following az function, which emulates what az.cmd does while also requesting UTF-8 output via the Python command line and ensuring that PowerShell decodes it properly.
          If you place it in your $PROFILE file, it'll be available in future sessions (except those started with -NoProfile), and because functions have higher command precedence than external applications (including batch files), it will execute instead of az.cmd:
# Custom az CLI entry point that requests UTF-8 output and decodes it as such.
# Based on the az.cmd batch file that comes with v2.57.0
# (but this batch file's content rarely changes).
function az {

  # Determine the full az.cmd path and derive the location of the
  # bundled Python executable.
  $azCliPath = (Get-Command -ErrorAction Ignore az.cmd).Path
  if (-not $azCliPath) { throw "'az.cmd' cannot be located via the system's path." }
  $bundledPythonExe = Convert-Path -ErrorAction Ignore -LiteralPath "$azCliPath\..\..\python.exe"
  if (-not $bundledPythonExe) { throw "Failed to load Python executable." }

  # Prepare the environment and temporarily instruct
  # PowerShell to decode external-program output as UTF-8.
  $prevValue = $env:AZ_INSTALLER; $env:AZ_INSTALLER = 'MSI'
  $prevEncoding = [Console]::OutputEncoding; [Console]::OutputEncoding = [Text.UTF8Encoding]::new()

  try {
    # Call the actual CLI via the bundled Python, requesting UTF-8 output (-X utf8),
    # and passing all arguments as well as the output through.
    & $bundledPythonExe -X utf8 -IBm azure.cli @args
  }
  finally {
    # Restore previous settings, even if the call is interrupted
    # (e.g. via Ctrl+C or a downstream command that stops the pipeline early).
    [Console]::OutputEncoding = $prevEncoding
    $env:AZ_INSTALLER = $prevValue
  }

}
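The save / set / restore pattern used in the function above is not specific to az or to UTF-8. As a minimal sketch, it could be factored into a general helper; Invoke-WithConsoleEncoding is a hypothetical name chosen for illustration, not a built-in command, and the helper assumes the session has a console whose output encoding can be changed.

```powershell
# Hypothetical helper (not part of PowerShell or the Azure CLI):
# runs a script block while [Console]::OutputEncoding is temporarily set to
# the given encoding, so that captured external-program output is decoded with it.
function Invoke-WithConsoleEncoding {
  param(
    [Parameter(Mandatory)] [System.Text.Encoding] $Encoding,
    [Parameter(Mandatory)] [scriptblock] $ScriptBlock
  )

  $previous = [Console]::OutputEncoding
  [Console]::OutputEncoding = $Encoding
  try {
    # Output of the script block is passed through to the caller.
    & $ScriptBlock
  }
  finally {
    # Always restore the caller's encoding, even on errors or Ctrl+C.
    [Console]::OutputEncoding = $previous
  }
}
```

With the stock az.cmd on a system whose ANSI code page is 950, a call might look like this: Invoke-WithConsoleEncoding -Encoding ([System.Text.Encoding]::GetEncoding(950)) -ScriptBlock { az devops user show --user $UserPrincipalName -o json }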

Background information

  • Fundamental PowerShell behavior:

    • PowerShell decodes the output from external programs (executables) into .NET strings whenever it captures such output, which happens when storing output in a variable, sending it on through the pipeline to another command, sending it to a file,[1] or enclosing it in the following operators to allow it to participate in a larger statement: $(...), @(...), or (...)

    • When no capturing and therefore no decoding is involved, i.e. when an external program is allowed to print directly to the console, the display output often appears fine; conversely, character-encoding mismatch problems only arise when capturing is involved, which explains the difference between your two commands (direct output vs. enclosure in $(...)).

    • When capturing is involved, the output is decoded based on the character encoding stored in [Console]::OutputEncoding.

      • By default, this encoding reflects the current console window's output code page, as reported by chcp.com, e.g. 950, which is the equivalent of [System.Text.Encoding]::GetEncoding('big5'), but it can be changed in-session (in-process).
    • In turn, the console code pages (both the in- and output code page, which are usually set together, to the same value) default to the OEM code page associated with the legacy system locale (aka language for non-Unicode programs).

    • The two PowerShell editions - the legacy, Windows-only Windows PowerShell and the modern, cross-platform PowerShell (Core) 7+ - do not differ with respect to this behavior.

    • Therefore, if your code behaved as expected in Windows PowerShell, but not in PowerShell 7.3 on a given machine, there are only two possible explanations:

      • In your PowerShell 7.3 session, you modified [Console]::OutputEncoding after session startup.

      • You used the registry to preconfigure a different console code page for PowerShell (Core) sessions (by defining a CodePage DWORD value at HKEY_CURRENT_USER\Console\<full-exe-path-with-backslashes-replaced-with-underscores>, but note that launching from a shortcut file bypasses that).

      • In either case, you must (at least temporarily) set [Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding('big5') to match the encoding used by az in your environment.

  • Python's role:

    • The Azure CLI's entry point, az.cmd, is a batch file that uses a bundled Python version to call the actual implementation, which is a Python module. As of CLI version v2.57.0, the bundled Python version is v3.11.7.

    • Python's command-line behavior is unusual, in that it uses the system locale's ANSI code page to encode its output by default; however, in a locale associated with code page 950, the OEM and the ANSI code page have the same value, which is why by default no extra effort should be required for proper decoding.

    • While Python normally allows the use of environment variables to request UTF-8 output instead ($env:PYTHONUTF8=1 in Python v3.7+), the python.exe CLI call performed in az.cmd explicitly prevents that (option -I), necessitating the custom function approach above, so that UTF-8 can be requested via a CLI option (-X utf8).
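The decoding mismatch described above can be reproduced without calling az at all. The following is a minimal sketch, assuming that code page 950 is available via [System.Text.Encoding]::GetEncoding() in your session (as the Big5 workaround in the question suggests); the sample string and the variable names are purely illustrative.

```powershell
# Build the sample string from code points (U+4E2D, U+6587) so that the
# snippet does not depend on the encoding of the script file itself.
$text = [string]::new([char[]] (0x4E2D, 0x6587))

$big5 = [System.Text.Encoding]::GetEncoding(950)   # assumed to be available
$utf8 = [System.Text.UTF8Encoding]::new()

# Bytes as a program using code page 950 would emit them.
$bytes = $big5.GetBytes($text)

# What PowerShell does when capturing: decode the bytes with
# [Console]::OutputEncoding. Here both possibilities are simulated.
$decodedAsUtf8 = $utf8.GetString($bytes)   # mismatch: garbled text
$decodedAsBig5 = $big5.GetString($bytes)   # match: original text

"Decoded as UTF-8 equals original: $($decodedAsUtf8 -ceq $text)"
"Decoded as Big5 equals original:  $($decodedAsBig5 -ceq $text)"
```

Only the second comparison yields True: the bytes themselves are fine, and the garbling comes solely from decoding them with an encoding other than the one the program used to produce them.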


[1] As a selective exception in PowerShell 7.4+, > and >>, the redirection operators - unlike calls to cmdlets such as Set-Content and Out-File - now save the raw byte output from external programs to the target file. See this answer for background information.

Will Huang On

Based on the answer provided by @mklement0, here is the final solution I chose:

  1. Open the C:\Program Files\Microsoft SDKs\Azure\CLI2\wbin\az.cmd file for editing. Its original content is:

    ::
    :: Microsoft Azure CLI - Windows Installer - Author file components script
    :: Copyright (C) Microsoft Corporation. All Rights Reserved.
    ::
    
    @IF EXIST "%~dp0\..\python.exe" (
    SET AZ_INSTALLER=MSI
    "%~dp0\..\python.exe" -IBm azure.cli %*
    ) ELSE (
    echo Failed to load python executable.
    exit /b 1
    )
    
  2. Add the -X utf8 argument to the python.exe call on line 8:

    ::
    :: Microsoft Azure CLI - Windows Installer - Author file components script
    :: Copyright (C) Microsoft Corporation. All Rights Reserved.
    ::
    
    @IF EXIST "%~dp0\..\python.exe" (
    SET AZ_INSTALLER=MSI
    "%~dp0\..\python.exe" -X utf8 -IBm azure.cli %*
    ) ELSE (
    echo Failed to load python executable.
    exit /b 1
    )
    

az.cmd will then always produce UTF-8-encoded output. That solved all my issues.
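
Since an update or reinstallation of the Azure CLI will presumably replace az.cmd with the stock version, the manual edit may need to be repeated. As a rough sketch, the change could be scripted as follows; Add-AzCmdUtf8Option is a hypothetical helper name, the function assumes the python.exe call looks exactly like the one shown above, and writing to the installation folder requires an elevated session.

```powershell
# Hypothetical helper: inserts '-X utf8' into the python.exe call of az.cmd.
# Returns $true if the file was patched, $false if it already contained the option.
function Add-AzCmdUtf8Option {
  param([Parameter(Mandatory)] [string] $LiteralPath)

  $content = Get-Content -Raw -LiteralPath $LiteralPath
  if ($content -match '-X utf8') { return $false }   # already patched

  # Assumes the call has the exact form shown in the original az.cmd above.
  $patched = $content.Replace('python.exe" -IBm azure.cli', 'python.exe" -X utf8 -IBm azure.cli')
  if ($patched -ceq $content) {
    throw "Expected python.exe call not found in '$LiteralPath'."
  }

  # Keep a backup of the original file next to it, then write the patched content.
  Copy-Item -LiteralPath $LiteralPath -Destination "$LiteralPath.bak" -Force
  Set-Content -NoNewline -LiteralPath $LiteralPath -Value $patched
  return $true
}
```

For example, from an elevated session: Add-AzCmdUtf8Option -LiteralPath 'C:\Program Files\Microsoft SDKs\Azure\CLI2\wbin\az.cmd'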