When I cat a file in bash I get the following:
$ cat /tmp/file
microsoft
When I view the same file in vim I get the following:
^@m^@i^@c^@r^@o^@s^@o^@f^@t^@
How can I identify and remove these "non-printable" characters? What does '^@' mean in vim?
(Just a piece of background information: the file was created by base64-decoding and cutting from the pssh header of an mpd file for Microsoft PlayReady.)
What you see is Vim's visual representation of unprintable characters. It is explained at
:help 'isprint'
Therefore, ^@ stands for a null byte (0x00). These (and other non-printable characters) can come from various sources, but in your case it's an encoding issue.
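One way to confirm that the ^@ characters really are null bytes is to look at the raw bytes outside Vim. The following is a minimal sketch, assuming a POSIX shell with od, tr, wc, and mktemp available; the sample file is a hypothetical stand-in for /tmp/file, built with printf as big-endian UTF-16 without a BOM.

```shell
# Hypothetical sample: "microsoft" as UTF-16 big endian, no BOM
# (a stand-in for /tmp/file from the question).
sample=$(mktemp)
printf '\000m\000i\000c\000r\000o\000s\000o\000f\000t' > "$sample"

# Dump every byte; null bytes show up as \0 between the letters.
od -c "$sample"

# Count the null bytes and the total number of bytes.
# The $(( ... )) wrapper strips the padding some wc versions add.
nul_count=$(( $(tr -cd '\000' < "$sample" | wc -c) ))
total=$(( $(wc -c < "$sample") ))
echo "null bytes: $nul_count of $total"
```

For this sample, the last line reports 9 null bytes out of 18, one in front of each letter.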
If you look closely at your output in Vim, every second byte is a null byte; in between are the expected characters. This is a clear indication that the file uses a multibyte encoding (utf-16, big endian, no byte order mark, to be precise), and that Vim did not detect this and instead opened the file as latin1 or so (whereas things worked out properly in the terminal).
To fix this, you can either explicitly specify the encoding:
:edit ++enc=utf-16 /tmp/file
Or tweak the 'fileencodings' option so that Vim can detect this automatically. However, be aware that ambiguities (as in your case) make this prone to fail.
That's why a byte order mark (BOM) is recommended for 16-bit encodings, but that assumes that you have control over the output encoding.
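If the goal is to get rid of the null bytes in the file itself rather than only in Vim's display, converting the file to UTF-8 outside Vim is one option. The following is a sketch, assuming iconv is installed and the data really is big-endian UTF-16 without a BOM; the file names are hypothetical temporary files standing in for /tmp/file.

```shell
# Hypothetical sample standing in for /tmp/file:
# "microsoft" as UTF-16 big endian, no BOM.
src=$(mktemp)
dst=$(mktemp)
printf '\000m\000i\000c\000r\000o\000s\000o\000f\000t' > "$src"

# Decode the input as UTF-16 big endian and re-encode it as UTF-8.
iconv -f UTF-16BE -t UTF-8 "$src" > "$dst"

# The converted file now holds one byte per ASCII character.
converted=$(cat "$dst")
echo "$converted"
```

On the real file, the equivalent call would be iconv -f UTF-16BE -t UTF-8 /tmp/file. If the result looks wrong, the data may be little endian instead (UTF-16LE). Simply deleting the null bytes with tr -d '\000' gives the same result for pure ASCII text, but it is not a real conversion and would mangle characters outside that range.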