Using sed to replace delimited lists inside a larger body of text


I have a large file with many variable-length lists of numbers in square brackets. There is at most one list per line, and a list is never empty, e.g.:

[1, 45, 54, 78] or [32]

I want to get rid of the square brackets and the commas, e.g.:

1 45 54 78 or 32

I can successfully match them with this regex in sed:

\[\([0-9]*\)\(, \([0-9]*\)\)*\]

but I don't know how to use group numbers to refer to the groups I want, e.g. doing:

sed 's/\t\[\([0-9]*\)\(, \([0-9]*\)\)*\]/\t\1 \3/g'

will only result in the destination file getting the first and the last numbers in the list.
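
A quick way to see this in isolation: in POSIX regular expressions, a capture group under `*` keeps only the text from its final repetition, so `\3` can never hold more than one number. A minimal sketch, with the leading tabs dropped from the pattern (the function name `show_last_group` is made up for illustration):

```shell
# Hypothetical helper: runs the substitution from the question
# (without the \t prefix) on standard input.
show_last_group() {
  sed 's/\[\([0-9]*\)\(, \([0-9]*\)\)*\]/\1 \3/g'
}

printf '[1, 45, 54, 78]\n' | show_last_group
# prints: 1 78
```

The middle numbers are matched but then overwritten each time the group repeats, which is why only the first and last survive.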

(I did solve my problem using awk, but am wondering if it can be done using sed)

Is there any way to refer to a variable number of groups in sed?


There are 3 answers

potong On

This might work for you (GNU sed):

sed -r ':a;/\[([0-9]+(, )*)+\]/!b;s//\n&\n/;h;s/[][,]//g;G;s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/;ba' file

This finds the list, marks it with a newline on either side, and copies the entire line to the hold space. It then deletes the brackets and commas in the pattern space, appends the saved original, and keeps the cleaned list together with the original text before and after it. The loop repeats until no further lists are found on the line.
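
The same one-liner can be spread over several lines with a comment per step, which may make the hold-space shuffle easier to follow. This is only a sketch: it assumes GNU sed (for `\n` in the replacement), and the wrapper name `strip_lists` is made up for illustration.

```shell
# Hypothetical wrapper around the one-liner above (GNU sed assumed).
strip_lists() {
  sed -E '
    :a
    # No bracketed list left on this line: print it and start the next one.
    /\[([0-9]+(, )*)+\]/!b
    # Fence the matched list with newlines (an empty regex reuses the last one).
    s//\n&\n/
    # Save the fenced original in the hold space.
    h
    # Strip brackets and commas everywhere in the pattern space.
    s/[][,]//g
    # Append the saved original after the cleaned copy.
    G
    # Keep the original prefix (\2) and the cleaned list (\1);
    # the original suffix is the unmatched tail and stays in place.
    s/.*\n(.*)\n.*\n(.*)\n.*\n/\2\1/
    ba
  ' "$@"
}

printf 'x, y [1, 45, 54, 78] z\n' | strip_lists
# prints: x, y 1 45 54 78 z
```

Because the text before and after the list is taken from the saved original, commas outside the brackets are left alone.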

Floris On

How about:

sed -E 's/\[([0-9 ,]+)\]/\1/g' | sed 's/,//g'

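Two separate commands: the first extracts the "stuff inside square brackets", the second strips commas. Note that the second command removes every comma in the input, not only those inside the lists.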
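
If commas elsewhere in the text need to survive, one possible variation is a single sed call that loops with `t`, replacing one in-list comma per pass before dropping the brackets. This is a sketch rather than a tested drop-in: it assumes lists are always written with ", " between numbers, and the name `strip_lists_loop` is made up for illustration.

```shell
# Hypothetical alternative: only touches commas inside [ ... ] lists.
strip_lists_loop() {
  sed -E '
    :a
    # Replace one ", " that sits inside a closed bracketed list of numbers.
    s/(\[[0-9 ]+), ([0-9, ]+\])/\1 \2/
    # If a substitution happened, try again for the next comma.
    ta
    # No in-list commas remain: drop the brackets.
    s/\[([0-9 ]+)\]/\1/g
  ' "$@"
}

printf 'a, b [1, 45, 54, 78] or [32], c\n' | strip_lists_loop
# prints: a, b 1 45 54 78 or 32, c
```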

Jotne On

This awk should do:

awk '{gsub(/[][,]/,x)}1' file
1 45 54 78 or 32