sed wildcard matches too large a range

I am using sed on Mac OS X 10.6.8, in a .command file created with Text Editor that will be executed by GeekTool. I have a string, MYSTRING, and I am trying to remove the link elements (the tags and their contents) from it. However, when I use a wildcard, sed seems to match too large a range.

MYSTRING="<link>part_1</link>This part must remain.<link>part_x</link> Like this part."
echo $MYSTRING |
sed s/"<link>".*"<\/link>"//g

I had expected this result:

This part must remain. Like this part.

But the actual result is:

 Like this part.

It seems that sed takes the first <link> as the start of the match and the last </link> as the end, so everything in between is deleted. How do I make sed stop at the first </link> after each <link> rather than at the last one?

There are 2 answers

kev

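This happens because .* is greedy: it matches as much as it can, so it runs up to the last </link> on the line. Use a negated character class that cannot run past the next tag instead: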

sed 's@<link>[^<]*</link>@@g'
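
For illustration, here is a minimal sketch that applies this substitution to the string from the question. The RESULT variable is only a name chosen for this example, and printf is used instead of echo so that the quoted string is passed through unchanged.

```shell
MYSTRING="<link>part_1</link>This part must remain.<link>part_x</link> Like this part."

# [^<]* cannot cross a "<", so each match stops at the nearest </link>.
RESULT=$(printf '%s\n' "$MYSTRING" | sed 's@<link>[^<]*</link>@@g')

printf '%s\n' "$RESULT"
# prints: This part must remain. Like this part.
```

Note that this only works as long as the text between the tags never contains a "<" itself.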
tripleee

The regex repetition operator * matches the leftmost, longest possible string. Unfortunately, sed does not support non-greedy (lazy) matching, but Perl does:

perl -pe 's%<link>.*?</link>%%g'

Alternatively, you can formulate a regex for which greedy matching is unproblematic, for example by excluding angle brackets from the link text:

sed 's%<link>[^<>]*</link>%%g'