I have a batch script that runs a curl command through PowerShell:
powershell -Command "curl.exe -k -S -X GET -H 'Version: 16.0' -H 'Accept: application/json' 'https://IntendedURL' > result.json"
I then use result.json as the input file for another program.
The problem is that running the batch script produces a result.json file encoded as UTF-16 LE with a BOM. In that encoding, the .json file cannot even be opened and inspected directly in a normal browser; it fails with "SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data".
From what I found while searching, curl itself should not add a BOM to the output it saves. Nonetheless, my result.json has one. If the batch script can be changed so that it writes the file without a BOM, I would appreciate the help.
As for the second part of my issue: I can manually copy the content of result.json into a new text document and explicitly save it as UTF-8. I searched for this as well and found a partial solution here: "Batch script remove BOM () from file", but the script given there for removing the BOM seems rather complicated.
So any help with removing the BOM without manually copying the content and re-saving it in the desired format is appreciated as well.
Best regards, cyborg1234
Note: see the bottom section for calling `curl.exe` directly from your batch file.

In PowerShell, `>` is (in effect) an alias of the `Out-File` cmdlet, whose default in Windows PowerShell is UTF-16LE (in PowerShell (Core) 7+, it is now the more sensible (BOM-less) UTF-8, across all cmdlets).

You have two options (spoiler alert: choose the 2nd):
In principle, you can use `Out-File` explicitly (or preferably, with text input, `Set-Content`), in which case you can use its `-Encoding` parameter to control the output character encoding. However:

In Windows PowerShell, `-Encoding UTF8` invariably creates a UTF-8 file with a BOM. Short of using .NET APIs directly, the only way to create BOM-less UTF-8 files in Windows PowerShell is to use `New-Item` - see this answer.

Even clearing that hurdle is typically not enough, and the following also affects PowerShell (Core) up to v7.3.x (but is no longer a problem in v7.4+ - see this answer):
When PowerShell captures output from external programs, it currently invariably decodes it into .NET strings first - even when redirecting to a file - using the character encoding stored in `[Console]::OutputEncoding`, and then re-encodes on output, based on the specified or default character encoding of the cmdlet used.

If that encoding doesn't match the character encoding of what `curl.exe` outputs, the output will be misinterpreted - and given that `[Console]::OutputEncoding` defaults to the legacy system locale's OEM code page, misinterpretation will occur if the output is UTF-8-encoded, so - unless you've explicitly configured your system to use UTF-8 system-wide (see this answer) - you'll need to (temporarily) set `[Console]::OutputEncoding = [Text.UTF8Encoding]::new()` before the `curl.exe` call.

Preferably, let `curl.exe` itself write the output file, i.e. replace `> result.json` with `-o result.json`: this will save the output as-is to the target file.

Taking a step back:
You can make your `curl.exe` call directly from your batch file, which avoids the character-encoding problems. `cmd.exe`'s `>` operator, unlike PowerShell's, is a raw byte conduit, so no data corruption should occur - that said, you're free to use `-o result.json` instead.

When calling from `cmd.exe` (a batch file), only `"..."` quoting is supported, so the `'...'` strings from the PowerShell command have to be turned into `"..."` ones.
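
A minimal sketch of that direct call, reusing the options from the question with the quoting switched to `"..."`. `https://IntendedURL` is the question's placeholder rather than a real endpoint, and `-o result.json` is used here in place of `>` (either should work from `cmd.exe`):

```batch
@echo off
rem Sketch only: call curl.exe directly from the batch file, with "..." quoting throughout.
rem "https://IntendedURL" is the placeholder URL from the question.
rem -o makes curl.exe write the response bytes to result.json itself,
rem so no shell decodes and re-encodes them on the way to the file.
curl.exe -k -S -X GET -H "Version: 16.0" -H "Accept: application/json" "https://IntendedURL" -o result.json
```

If the PowerShell wrapper has to stay for other reasons, the same change should apply inside the quoted command, with only the redirection replaced: `powershell -Command "curl.exe -k -S -X GET -H 'Version: 16.0' -H 'Accept: application/json' 'https://IntendedURL' -o result.json"`.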