I'm trying to use a regex to format some binary output from xxd -b, but to demonstrate this simply I'll show you what I expect to happen:
Regex to delete: /1x|1.*/
Text: 1x21y3333333313333
-> 2
Here all occurrences of 1x are deleted, and then everything from the first remaining 1 onward is deleted. It should be immediately obvious what's going on, but if it's not, play with this. The key is that if 1x is matched, the rest of the pattern should be abandoned.
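The simplified case can be reproduced from a shell. This is only a sketch: it assumes a POSIX-style sed (GNU sed was used here) and perl on the PATH, and the function names are made up for illustration. The difference between the two tools comes down to how each picks between alternatives that can both match at the same position.

```shell
# Hypothetical helper names, used only for this demonstration.

# POSIX-style matching (sed): among the alternatives that match at the
# leftmost position, the longest one wins, so 1.* beats 1x.
with_sed() {
  sed -E 's/1x|1.*//g'
}

# Perl tries the alternatives left to right and takes the first one that
# matches, so 1x is consumed before 1.* is considered.
with_perl() {
  perl -pe 's/1x|1.*//g'
}

printf '%s\n' '1x21y3333333313333' | with_sed    # prints an empty line
printf '%s\n' '1x21y3333333313333' | with_perl   # prints 2
```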
Here is the output from echo "AA" | xxd -b (the binary dump of AA\n):
0000000: 01000001 01000001 00001010 AA.
My goal is to (1) delete the leading 0 of every byte (ASCII is 7 bits) and (2) delete the rest of the line so that only the actual binary is kept. So I piped it into sed 's/ 0//g':
0000000:100000110000010001010 AA.
Adding the second step, sed -E 's/ 0| .*//g':
0000000:
Obviously, I expect to instead get:
0000000:100000110000010001010
Things I've tried that haven't done the job:
xxd can take -g0 to merge the columns, but it retains the first zero in every byte (characters each take up a byte, not 7 bits)
-r
I will use perl instead in the meantime, but this behaviour baffles me and maybe there's a reason (lesson) here?
If I understand your question correctly, this produces what you want:
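A sketch of such a command, assuming the real xxd -b output separates the binary columns from the ASCII column with at least two blanks (the sample line quoted in the question shows them collapsed to one). The function name strip_xxd is hypothetical, and the sample line is fed in with printf so the example does not depend on xxd being installed:

```shell
# Hypothetical helper: drop each byte's leading 0 (" 0") and everything
# from the first run of two blanks onward (the ASCII column).
strip_xxd() {
  sed -E 's/ 0|  .*//g'
}

# Sample line with two blanks before the ASCII column.
printf '0000000: 01000001 01000001 00001010  AA.\n' | strip_xxd
# prints 0000000:100000110000010001010
```

In practice the input would come from something like echo "AA" | xxd -b | strip_xxd.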
The key change here is the use of two blanks in front of .* so that this only matches the part that you want to remove.

Alternatively, we can remove blank-zero first:
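One way to sketch this two-step variant, assuming that after the blank-zero pairs are gone the first remaining blank marks the start of the ASCII column. The function name strip_xxd_two_step is hypothetical, and the sample line is supplied with printf rather than a live xxd call:

```shell
# Hypothetical helper: two separate substitutions, applied in order.
#   1. remove every blank followed by 0 (the leading 0 of each byte)
#   2. remove everything from the first remaining blank to end of line
strip_xxd_two_step() {
  sed -e 's/ 0//g' -e 's/ .*//'
}

printf '0000000: 01000001 01000001 00001010  AA.\n' | strip_xxd_two_step
# prints 0000000:100000110000010001010
```

Because the two substitutions run one after the other, the alternation never has to choose between a short and a long match at the same position, so this form does not depend on how many blanks precede the ASCII column.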