Removing extended ASCII characters (128-255) in a Linux script


I want to remove from my text every character whose code is in the interval [128-255].

This is how I remove everything except lowercase letters:

gsub(/[^a-z]/, "", $0)

This is how I remove some extended characters, but not all of them:

gsub(/ē|é|ě|è|ū|ú|ǔ|ù|ǖ|ǘ|ǚ|ǜ|ü|ō|ó|ǒ|ò|ī|í|ǐ|ì|ā|á|ǎ|à|å|ä|â/, "", $0)

This is what I am trying now, but it gives me an "invalid interval" error:

gsub(/"[\128-\255]"/, "", $0)

Can anybody please help with this problem? Thanks in advance.


There are 2 answers

1
Ignacio Vazquez-Abrams On BEST ANSWER

Backslash escapes must be written in octal, or prefixed with an x and written in hexadecimal:

\200-\377
\x80-\xff

Or you could just use strings.

0
reece On

The \nnn syntax is octal (where n is 0-7), so:

\128 = invalid octal
\200 = 128
\255 = 173
\377 = 255
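If you want to double-check conversions like these, the shell's printf can do it, since it reads a numeric argument with a leading 0 as octal. A minimal sketch; the function name octal_to_decimal is just made up for illustration:

```shell
# Hypothetical helper: print the decimal value of an octal number.
# Prepending "0" makes printf parse the argument as an octal constant.
octal_to_decimal() {
    printf '%d\n' "0$1"
}

octal_to_decimal 200   # prints 128
octal_to_decimal 255   # prints 173
octal_to_decimal 377   # prints 255
```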

So you want:

\200-\377
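Put into a full command, that could look something like the sketch below. The function name strip_high_bytes is made up for illustration, and I am assuming you want plain byte-wise matching, which is why awk is run with LC_ALL=C (in a UTF-8 locale some awk implementations treat the input as multibyte characters and the range may not behave as expected). Note that the double quotes from the original attempt are dropped, since inside /.../ they would be matched literally.

```shell
# Hypothetical helper: remove every byte with a value in 128-255 from stdin.
strip_high_bytes() {
    # LC_ALL=C asks awk to treat the input as single bytes.
    # \200-\377 is the octal form of the range 128-255.
    LC_ALL=C awk '{ gsub(/[\200-\377]/, ""); print }'
}

# "caf\303\251" is "café" in UTF-8; é is the two bytes 0xC3 0xA9.
printf 'caf\303\251 na\303\257ve\n' | strip_high_bytes
# prints: caf nave
```

Keep in mind that in UTF-8 text each accented letter is several bytes, all of them in the 128-255 range, so the whole character is removed rather than replaced by its unaccented form.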