Print non-ascii/unicode characters in shell


I am using the following command to search for and print non-ASCII characters:

grep --color -R -C 2 -P -n "[\x80-\xFF]" .

The output shows the lines that contain non-ASCII characters, but it does not show the actual Unicode characters themselves.

Is there a way to print the Unicode characters that were matched?

Output:

./test.yml-35-
./test.yml-36-- name: Flush Handlers
./test.yml:37:  meta: flush_handlers
./test.yml-38-
--

There is 1 answer

Thomas Dickey (best answer)

This was answered in Searching for non-ascii characters. The real issue, as shown in Filtering invalid utf8, is that the regular expression you are using matches single bytes, while UTF-8 is a multibyte encoding, so the pattern must cover multi-byte sequences.
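To make the multibyte point concrete, here is a small sketch that dumps the bytes of a single character. The helper name `utf8_bytes` is made up for this illustration, and the example assumes the character is encoded as UTF-8.

```shell
# Hypothetical helper: print the bytes of a string in hex.
utf8_bytes() {
    printf '%s' "$1" | od -An -tx1
}

# U+00E9 (e with acute accent), written as its two UTF-8 bytes in octal
# so the example does not depend on the encoding of this script.
utf8_bytes "$(printf '\303\251')"
# prints the two bytes c3 a9: one character, two bytes
```

A pattern such as `[\x80-\xFF]` can match only one of those bytes at a time, never the character as a whole.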

The extensive answer by @Peter O in the latter Q/A appears to be the best one, using Perl. grep is the wrong tool.