Parsing a delimited string into an array in Bash - why is "$var" different from $var though $var has no spaces?

I am running Bash version 4.2.25. Here is my code:

#!/usr/bin/env bash

string="one:two:three:four"

# without quotes
IFS=: read -ra array_1 <<< $string
for i in "${array_1[@]}"; do printf 'i = [%s]\n' "$i"; done
# output:
# i = [one two three four]

# with quotes
IFS=: read -ra array_2 <<< "$string"
for i in "${array_2[@]}"; do printf 'i = [%s]\n' "$i"; done
# output:
# i = [one]
# i = [two]
# i = [three]
# i = [four]

What explains the difference in behavior?

There are 2 answers

0
that other guy On BEST ANSWER

I'm unable to reproduce your problem on Linux with bash 4.2.46 and bash 4.3.30. However, here's an adapted version that does show the described behavior:

string="one:two:three:four"
IFS=:

read -ra array_1 <<< $string
for i in "${array_1[@]}"; do printf 'i = [%s]\n' "$i"; done
# i = [one two three four]

read -ra array_2 <<< "$string"
for i in "${array_2[@]}"; do printf 'i = [%s]\n' "$i"; done
# i = [one]
# i = [two]
# i = [three]
# i = [four]

This happens because unquoted variable expansions are not split on spaces as such; they are split on the characters in $IFS (which defaults to space, tab, and linefeed).
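
To see that splitting follows $IFS rather than spaces, here is a minimal sketch. The count_args function is a hypothetical helper introduced only for this illustration; it reports how many words an expansion produced.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

string="one:two:three:four"

# Default IFS (space, tab, linefeed): the value has none of these, so it stays one word.
default_count=$(count_args $string)

old_ifs=$IFS   # assumes IFS is set, as it is by default in a fresh shell
IFS=:
# The unquoted expansion is now split on colons.
colon_count=$(count_args $string)
# Quoting suppresses word splitting regardless of IFS.
quoted_count=$(count_args "$string")
IFS=$old_ifs

echo "default IFS, unquoted: $default_count"
echo "IFS=:, unquoted:       $colon_count"
echo "IFS=:, quoted:         $quoted_count"
```

This should report 1, 4, and 1 respectively: only the unquoted expansion under IFS=: is broken into separate words.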

Since we've overridden $IFS, it is now values containing colons that we have to be careful about quoting. Spaces no longer trigger splitting.

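The source code shows that Bash hardcodes a space in string_list, called through write_here_string: when the unquoted expansion splits into several words, they are joined back together with single spaces before being written to the here-string. When IFS does not include a space, read cannot split that joined string back into the same words, so the whole line ends up as a single element, which makes the difference more pronounced.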
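
The effect described above can be imitated by hand. This sketch does not call Bash's internals; it only mimics the join-with-a-space step, and the names words and joined are made up for illustration.

```bash
#!/usr/bin/env bash
string="one:two:three:four"

# Step 1: split the unquoted expansion on colons, as word splitting does when IFS=:
old_ifs=$IFS   # assumes IFS starts out at its default value
IFS=:
words=($string)
IFS=$old_ifs

# Step 2: join the words with a single space.
# "${words[*]}" joins on the first character of IFS, which is a space by default.
joined="${words[*]}"

# Step 3: read the joined line with IFS=: (there is no colon left to split on).
IFS=: read -ra array <<< "$joined"

printf 'words after splitting: %d\n' "${#words[@]}"
printf 'elements after read:   %d\n' "${#array[@]}"
printf 'i = [%s]\n' "${array[@]}"
```

Splitting yields four words, but after the join read sees "one two three four" and stores it as one element, matching the unquoted output shown earlier.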

PS: This is a good example of why we should always quote our variables, even when we know what they contain.
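
As a small illustration of that advice, here is a sketch using a sample value (chosen for this example, not taken from the question) that contains both a space and a glob character. With the here-string quoted, read receives the value unchanged.

```bash
#!/usr/bin/env bash
# Sample value chosen for illustration: it contains a space and a glob character.
string="a b:*:c"

# Quoting the here-string passes the value to read unchanged;
# the IFS=: prefix applies only to this read command.
IFS=: read -ra fields <<< "$string"

printf 'count = %d\n' "${#fields[@]}"
for f in "${fields[@]}"; do printf 'field = [%s]\n' "$f"; done
```

The array should hold three fields, "a b", "*", and "c", with the space preserved and the asterisk left unexpanded.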

1
cxw On

It looks like a bug. I looked back through CHANGES and couldn't find anything specific, but on Cygwin bash 4.3.48(8), both the quoted and unquoted forms give the expected output (four lines). When I have bandwidth to burn, I'll clone the repo and blame redir.c to see if I can find some relevant commits.