Honoring quotes while reading shell arguments from a file


In bash, I can pass quoted arguments to a command like this:

$ printf '[%s]\n' 'hello world'
[hello world]

But I can't get it to work right if the argument is coming from a subshell:

$ cat junk
'hello world'
$ printf '[%s]\n' $(cat junk)
['hello]
[world']

Or:

$ cat junk
hello world
$ printf '[%s]\n' $(cat junk)
[hello]
[world]

Or:

$ cat junk
hello\ world
$ printf '[%s]\n' $(cat junk)
[hello\]
[world]

How do I do this correctly?

EDIT: The solution also needs to handle this case:

$ printf '[%s]\n' abc 'hello world'
[abc]
[hello world]

So this solution doesn't work:

$ cat junk
abc 'hello world'
$ printf '[%s]\n' "$(cat junk)"
[abc 'hello world']

The question at Bash quoting issue has been suggested as a duplicate. However, it isn't clear how to apply its accepted answer; the following fails:

$ cat junk
abc 'hello world'
$ FOO=($(cat junk))
$ printf '[%s]\n' "${FOO[@]}"
[abc]
['hello]
[world']

1 Answer

Accepted answer, by Charles Duffy:

There's no one good solution here, but you can choose between bad ones.


This answer requires changing the file format:

Using a NUL-delimited stream for the file is the safest approach; literally any C string (thus, any string bash can store as an array element) can be written and read in this manner.

# write file as a NUL-delimited stream
printf '%s\0' abc 'hello world' >junk

# read file as an array
foo=( )
while IFS= read -r -d '' entry; do
  foo+=( "$entry" )
done <junk

If valid arguments can't contain newlines, you may wish to leave out the -d '' on the reading side and change the \0 on the writing side to \n to use newlines instead of NULs. Note that UNIX filenames can contain newlines, so if your possible arguments include filenames, this approach would be unwise.
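As a minimal sketch, the newline-delimited variant of the same write/read pair looks like this (it assumes no argument contains a newline; the arguments and the filename junk are taken from the question):

```shell
# write file as a newline-delimited stream
# (assumption: no argument contains a newline)
printf '%s\n' abc 'hello world' >junk

# read file back into an array, one line per element
foo=( )
while IFS= read -r entry; do
  foo+=( "$entry" )
done <junk

printf '[%s]\n' "${foo[@]}"
```

Note that the only differences from the NUL-delimited version are `\n` instead of `\0` on the writing side and the absence of `-d ''` on the reading side.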


This answer almost implements shell-like parsing semantics:

foo=( )
while IFS= read -r -d '' entry; do
  foo+=( "$entry" )
done < <(xargs printf '%s\0' <junk)

xargs has some corner cases surrounding multi-line strings where its parsing isn't quite identical to a shell's. It's a 99% solution, however.
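For instance, feeding the question's own sample file through this loop produces the desired split (a sketch; it assumes a GNU or BSD xargs that honors quotes in its default input-parsing mode):

```shell
# recreate the problem input from the question
printf "abc 'hello world'\n" >junk

# let xargs do the quote-aware tokenizing, then re-emit NUL-delimited
foo=( )
while IFS= read -r -d '' entry; do
  foo+=( "$entry" )
done < <(xargs printf '%s\0' <junk)

printf '[%s]\n' "${foo[@]}"
# [abc]
# [hello world]
```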


This answer requires a Python interpreter:

The Python standard library shlex module supports POSIX-compliant string tokenization which is more true to the standard than that implemented by xargs. Note that bash/ksh extensions such as $'foo' are not honored.

shlex_split() {
  # tokenize stdin using POSIX shell quoting rules; emit NUL-delimited tokens
  # (substitute python3 if your system has no "python" executable)
  python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
    sys.stdout.write(item + "\0")
'
}
foo=( )
while IFS= read -r -d '' entry; do
  foo+=( "$entry" )
done < <(shlex_split <junk)

These answers pose a security risk:

...specifically, if the contents of junk can be written to contain shell-sensitive code (like $(rm -rf /)), you don't want to use either of them:

# use declare
declare "foo=($(cat junk))"

# ...or use eval directly
eval "foo=( $(cat junk) )"

If you control the code that writes junk and want to be sure its contents are safe to read this way, consider:

# write foo array to junk in an eval-safe way, if it contains at least one element
{ printf '%q ' "${foo[@]}" && printf '\n'; } >junk
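A sketch of the full round trip: because `%q` escapes every shell metacharacter on the way out, the `eval` on the way back in sees only literal words. The sample elements here (including one containing `$(date)`) are illustrative, not from the question:

```shell
# write: each element is %q-quoted, so metacharacters survive as literals
foo=( abc 'hello world' '$(date)' )
{ printf '%q ' "${foo[@]}" && printf '\n'; } >junk

# read: safe only because the writer used %q
eval "bar=( $(cat junk) )"
printf '[%s]\n' "${bar[@]}"
```

The element containing `$(date)` comes back as the literal six-character string, not as a command substitution.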

Alternately, you could use:

# write a command which, when evaluated, will recreate the variable foo
declare -p foo >junk

and:

# run all commands in the file junk
source junk
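Putting those two halves together, the round trip looks like this (a sketch; `declare -p` emits a `declare` command that recreates the variable, array type included, and sourcing it at top level restores the global):

```shell
# write a command which, when sourced, recreates foo
foo=( abc 'hello world' )
declare -p foo >junk

# later, possibly in another script: recreate the variable
unset foo
source junk
printf '[%s]\n' "${foo[@]}"
```

As with the `eval` approach, this is only safe when you trust whatever wrote junk, since `source` will run any command the file contains.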