Deleting leading spaces is not working in Bash

I have a string in Bash which may or may not start with any number of leading spaces, e.g.

"  foo bar baz"
" foo bar baz"
"foo bar baz"

I want to delete the first instance of "foo" from the string, and any leading spaces (there may not be any).

Following the advice from this question, I have tried the following:

str=" foo bar baz"
regex="[[:space:]]*foo"
echo "${str#$regex}"
echo "${str#[[:space:]]*foo}"

If str has one or more leading spaces, then it will return the result I want, which is _bar baz (underscore = leading space). If the string has no leading spaces, it won't do anything and will return foo bar baz. Both 'echoes' return the same results here.

My understanding is that using * after [[:space:]] should match zero or more instances of [[:space:]], not one or more. What am I missing or doing wrong here?

EDITS

@Raman - I've tried the following, and they also don't work:

echo "${str#[[:space:]]?foo}"
echo "${str#?([[:space:]])foo}"
echo "${str#*([[:space:]])foo}"

None of these three attempts deletes 'foo', whether or not there is a leading space. The only solution that partly works is the one I posted with the asterisk: it deletes 'foo' when there is a leading space, but not when there isn't.

There are 3 answers

8
gniourf_gniourf (BEST ANSWER)

The best thing to do is to use parameter expansions (with extended globs) as follows:

# Make sure extglob is enabled
shopt -s extglob

str=" foo bar baz"
echo "${str##*([[:space:]])}"

This uses the extended glob *([[:space:]]), and the ## parameter expansion (greedy match).

Edit. Since your pattern has the suffix foo, you don't need to use greedy match:

echo "${str#*([[:space:]])foo}"

is enough.
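
As a quick check, here is a minimal sketch that applies this expansion to the three sample strings from the question. The function name strip_foo is a hypothetical helper introduced only for illustration.

```bash
#!/usr/bin/env bash
# extglob must be on for *( ... ) to mean "zero or more of".
shopt -s extglob

# strip_foo: hypothetical helper; removes optional leading whitespace
# followed by the first "foo" from its argument.
strip_foo() {
    local s=$1
    printf '%s\n' "${s#*([[:space:]])foo}"
}

for s in "  foo bar baz" " foo bar baz" "foo bar baz"; do
    printf '[%s]\n' "$(strip_foo "$s")"
done
# Each iteration prints: [ bar baz]
```

The space that remains is the one that followed foo in the input; the pattern only consumes whitespace before foo.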

Note: you can put foo in a variable too, but you'll have to quote it:

pattern=foo
echo "${str#*([[:space:]])"$pattern"}"

will work. You have to quote it in case the expansion of pattern contains glob characters, for example when pattern="foo[1]".
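
To see why the quoting matters, here is a small sketch comparing the unquoted and quoted forms. The sample string containing a literal "foo[1]" is an assumption made up for this illustration.

```bash
#!/usr/bin/env bash
shopt -s extglob

str=" foo[1] bar baz"   # assumed sample input containing a literal "[1]"
pattern="foo[1]"

# Unquoted: [1] is read as a bracket expression, so the pattern means "foo1"
# and does not match the literal text "foo[1]".
unquoted=${str#*([[:space:]])$pattern}

# Quoted: the expansion of pattern is matched literally.
quoted=${str#*([[:space:]])"$pattern"}

printf '[%s]\n' "$unquoted"   # prints: [ foo[1] bar baz]
printf '[%s]\n' "$quoted"     # prints: [ bar baz]
```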

2
Paul Hodges

If what you want is a real regex match, you should be using a real regex match:

$: [[ "$str" =~ [[:space:]]*(.*) ]]
$: echo "[${BASH_REMATCH[1]}]"
[foo  bar       baz]

A more pedestrian approach would be to skip the quotes.

$: echo "[$str]"
[ foo bar baz]
$: new=$( echo $str )
$: echo "[$new]"
[foo bar baz]

Be aware that this opens you up to all sorts of messes in any more complex situation. It breaks if you want to preserve more than a single consecutive space between values, or a tab instead of just a space, etc.

$: str=' foo  bar'$'\t''baz';
$: echo "[$str]"
[ foo  bar      baz]
$: new=$( echo $str )
$: echo "[$new]"
[foo bar baz]

It can cause other sorts of havoc too, but it's good to know the trick for the cases when it's appropriate.

3
KamilCuk

My understanding is that using * after [[:space:]] should match zero or more instances of [[:space:]], not one or more

That's wrong.

What am I missing

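That a glob is not a regex. In a regex, * matches zero or more repetitions of the preceding character or group. In a glob, * on its own matches any string, including the empty one, so [[:space:]]*foo means: exactly one whitespace character, then anything, then foo. It's the same as for filename expansion; think of ls [[:space:]]*foo.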
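
The difference can be seen by testing the same text once as a glob and once as a regex. This is a minimal sketch; glob_match and regex_match are hypothetical helper names, and the trailing * in the glob is added only so that [[ == ]] (which matches the whole string) behaves like a prefix test.

```bash
#!/usr/bin/env bash

# As a glob: one whitespace character, then any string, then "foo", then anything.
glob_match() { [[ $1 == [[:space:]]*foo* ]]; }

# As a regex: zero or more whitespace characters at the start, then "foo".
regex_match() { [[ $1 =~ ^[[:space:]]*foo ]]; }

for s in " foo bar baz" "foo bar baz"; do
    glob_match "$s"  && g=yes || g=no
    regex_match "$s" && r=yes || r=no
    printf '[%s] glob=%s regex=%s\n' "$s" "$g" "$r"
done
# prints:
# [ foo bar baz] glob=yes regex=yes
# [foo bar baz] glob=no regex=yes
```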

You can use extended bash glob and do:

shopt -s extglob
str=' foo bar baz'
echo "${str#*([[:space:]])foo}"

To do anything more complicated, actually use a regex.

str=' foo bar baz';
[[ $str =~ ^[[:space:]]*foo(.*) ]];
echo "${BASH_REMATCH[1]}"
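
One thing the snippet above does not do is check whether the match succeeded. A possible sketch that falls back to the unchanged string when the regex does not match follows; the function name strip_foo_regex and the fallback behaviour are assumptions, not part of the original answer.

```bash
#!/usr/bin/env bash

# strip_foo_regex: hypothetical helper. Prints the text after optional
# leading whitespace plus "foo", or the input unchanged if there is no match.
strip_foo_regex() {
    local s=$1
    if [[ $s =~ ^[[:space:]]*foo(.*) ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        printf '%s\n' "$s"
    fi
}

strip_foo_regex "  foo bar baz"   # prints " bar baz"
strip_foo_regex "foo bar baz"     # prints " bar baz"
strip_foo_regex "qux"             # prints "qux"
```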