PHP - How to save text file with ANSI encoding?


I'm doing:

file_put_contents("txt/myfile.txt", $fileContents);

I have tried many ways to force my text file to be ANSI, like:

$fileContents = mb_convert_encoding($fileContents , mb_detect_encoding($fileContents , mb_detect_order(), true), 'WINDOWS-1252');

I have also tried:

$fileContents = iconv("ISO-8859-1", "WINDOWS-1252", $fileContents );

I need ANSI because the text file should display cleanly when I print it with the "type" command in the Windows command prompt (cmd.exe in Windows 7).

If I open my current file I can see the UTF-8 BOM:

C:\Users\XXX>type C:\myfile.txt

´╗┐V017666999 00000000000000005350005122013

If I open the file with Notepad++ and apply "Convert to ANSI" I get (what I need):

C:\Users\XXX>type C:\myfile.txt

V017666999 00000000000000005350005122013

Is there any way I can fix this? Thanks in advance.


There are 3 answers

DuckN'Bear (BEST ANSWER)

Now I know what happened: the file is created correctly, but the unwanted BOM is added when I download it.

The problem was in the download script. I just had to change this:

/* Bad code */
header('Content-disposition: attachment; filename='.$_GET['filename']);
header('Content-type: application/txt');
readfile($_GET['filename']);

to this (download it as a binary file so it remains intact):

/* Good code */
$filename = basename($_GET['filename']); // keep only the file name, never a user-supplied path
header('Content-Description: File Transfer');
header('Content-type: application/txt');
header('Content-disposition: attachment; filename="'.$filename.'"');
header('Content-Transfer-Encoding: binary');
header('Cache-Control: must-revalidate');
if (ob_get_level() > 0) {
    ob_clean(); // discard anything already buffered, such as a stray BOM
}
flush();
readfile('txt/'.$filename);

(This was originally posted as an edit on the question, but @Daniel suggested posting an answer for clarification).
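
Because this answer separates a BOM in the saved file from one that appears during download, it can help to inspect the first bytes directly. The following is a minimal sketch; `hasUtf8Bom` is a hypothetical helper name, not something from the original post. The `´╗┐` printed by `type` in the question is how the three BOM bytes (EF BB BF) render in an OEM console code page such as 850.

```php
<?php
// Hypothetical helper: reports whether a byte string starts with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Usage, assuming the path from the question:
// var_dump(hasUtf8Bom(file_get_contents('txt/myfile.txt')));
```

Running the check on the file on disk and again on the downloaded copy shows which step introduces the BOM.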

Vladislav Rastrusny

If the non-ASCII characters are written in your PHP file itself, you first need to convert the encoding of that PHP file.

If the non-ASCII characters come from an external source, you need to convert them with iconv. The source encoding you pass must be the actual encoding of that external source, not ISO-8859-1 by default.
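
As a minimal sketch of that advice, the helper below assumes the external source delivers UTF-8 (the BOM in the question suggests this, but it is an assumption). `toWindows1252` is a hypothetical name introduced here for illustration.

```php
<?php
// Hypothetical helper, not from the answer: converts UTF-8 text to Windows-1252.
// Assumes the input really is UTF-8; substitute the actual source encoding otherwise.
function toWindows1252(string $utf8): string
{
    // Drop a leading UTF-8 BOM (EF BB BF) so it is not carried into the output.
    if (strncmp($utf8, "\xEF\xBB\xBF", 3) === 0) {
        $utf8 = (string) substr($utf8, 3);
    }

    // iconv() returns false when a character cannot be converted.
    $converted = @iconv('UTF-8', 'WINDOWS-1252', $utf8);
    if ($converted === false) {
        throw new RuntimeException('Input is not valid UTF-8 or has no Windows-1252 equivalent.');
    }
    return $converted;
}

// Usage with the path from the question; $fileContents is assumed to hold UTF-8 text.
// file_put_contents('txt/myfile.txt', toWindows1252($fileContents));
```

If mbstring is preferred, note that mb_convert_encoding() takes the target encoding as its second argument and the source encoding as its third.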

NVRM

The following regex strips common ANSI escape sequences, namely colors (including RGB colors), erase-display and erase-line codes, and cursor save/restore codes, so that only the UTF-8 text remains.

(Image: example screen capture of the colored console output.)

(Image: the raw input with its escape codes visible, shown as an image because HTML pages do not display escape codes.)

Strip ANSI colors to get raw UTF-8 text with PHP:

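<?php
// Each "\e" is the ESC byte (0x1B) that starts an ANSI escape sequence.
$ansii = " |\e[0m \e[34m▓▓▓▓▓\e[0m |\e[0m\e[2m.\e[0m\e[34m▓▓▓  \e[0m\e[2m.\e[0m|\e[0m \e[34m ▓▓▓ \e[0m |\e[0m\e[2m.\e[0m\e[34m▓ ▓ ▓\e[0m\e[2m.\e[0m|\e[0m \e[34m  ▓  \e[0m |\e[0m\e[2m.\e[0m\e[34m ▓▓▓ \e[0m\e[2m.\e[0m|\e[0m \e[34m▓▓▓  \e[0m |\e[0m\e[2m.\e[0m\e[34m▓▓▓▓▓\e[0m\e[2m.\e[0m| \e[2m\e[37m♞ [ENGINES] ♘♘♘♘♘♘♘♘♘♘♘♘♘♘ ♞\e[0m";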

echo preg_replace("/\x1B\[[0-9;]*[JKmsu]/","",$ansii);

/* OUTPUT ----*/
/* | ▓▓▓▓▓ |.▓▓▓  .|  ▓▓▓  |.▓ ▓ ▓.|   ▓   |. ▓▓▓ .| ▓▓▓   |.▓▓▓▓▓.| ♞ [ENGINES] ♘♘♘♘♘♘♘♘♘♘♘♘♘♘ ♞*/

By contrast, even the Linux command-line utility iconv -f "ASCII" -t "UTF-8" fails on input containing true-color (RGB) ANSI escape codes.

Once the escape sequences are stripped, length functions no longer count them, which can remove the need for part of the php-mbstring package (not included in the PHP core). Keep in mind that strlen() still counts bytes, so it returns the same value as mb_strlen() only when the remaining text is single-byte (ASCII).
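
The pattern above matches only sequences that end in J, K, m, s, or u, so cursor-movement codes such as `ESC[10;5H` pass through. A broader sketch, assuming the goal is to remove any CSI sequence (ESC `[`, optional parameter and intermediate bytes, then one final byte), might look like this. `stripAnsi` is a hypothetical name, not part of the original answer.

```php
<?php
// Hypothetical helper: removes CSI escape sequences of the form
// ESC [ <parameter bytes 0x30-0x3F> <intermediate bytes 0x20-0x2F> <final byte 0x40-0x7E>.
function stripAnsi(string $text): string
{
    return preg_replace('/\x1B\[[0-?]*[ -\/]*[@-~]/', '', $text);
}

// Colors, an erase-line code, and a cursor-position code are all removed.
echo stripAnsi("\e[1;31mred\e[0m \e[2Kline\e[10;5Hhome"), PHP_EOL;
```

This covers CSI sequences only; other escape types, such as OSC title sequences, would need additional patterns.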

Run online: https://3v4l.org/vRScD