does a publicly available partial solution exist (in any language) to parse *nix command-line options into a data structure when the option keys are not known in advance?

basically, parse something like

my-script -x ex --y=why zebra

and get

{'x': 'ex', 'y': 'why'}

without knowing that the option keys will be x and y before parsing.

similar questions have been asked regarding Perl and Java, but they received no positive responses.

i understand that "command line options" are not a well-defined syntax and that any such solution will not produce the desired output for all inputs, but am asking if any such partial solution is known.

Best answer, by Charles Duffy:

Parsing UNIX command-line options in general is not possible in a schemaless manner, especially when supporting GNU conventions (i.e. allowing options and arguments to be intermixed).

Consider the usage you gave here:

my-script -x ex --y=why zebra

Now, should this be:

{options: {x: "ex", y: "why"}, arguments: ["zebra"]}

...or should it be...

{options: {x: True, y: "why"}, arguments: ["ex", "zebra"]}

The answer is that you don't know without knowing whether x accepts an argument -- meaning that you need a schema.


Consider also:

nice -n -1

Is -1 an argument to -n, or is 1 an option key in its own right? Again, you can't tell.
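
To make the ambiguity concrete, here is a minimal sketch using bash's builtin getopts, where the optstring plays the role of the schema. The helper name parse_with and the globals opts and rest are made up for this illustration, and since getopts only understands short options, the --y=why part of the example is left out. The same argument list parses differently depending only on the schema supplied:

```bash
#!/usr/bin/env bash
# Requires bash 4.0+ for associative arrays.
declare -A opts   # option key -> value (1 for flags that take no argument)
rest=( )          # whatever is left after the options

# parse_with SCHEMA ARGS...  (hypothetical helper, not a standard command)
# Parses ARGS with getopts, using SCHEMA as the optstring.
parse_with() {
  local schema=$1 opt OPTIND=1 OPTARG
  shift
  opts=( )
  while getopts "$schema" opt; do
    case $opt in
      '?') return 1 ;;                   # option not present in the schema
      *) opts[$opt]=${OPTARG:-1} ;;      # flags without an argument become 1
    esac
  done
  shift "$((OPTIND - 1))"
  rest=( "$@" )
}

parse_with 'x:' -x ex zebra   # schema: -x takes an argument
declare -p opts rest          # x is "ex"; rest holds only zebra

parse_with 'x' -x ex zebra    # schema: -x is a plain flag
declare -p opts rest          # x is 1; rest holds ex and zebra

parse_with 'n:' -n -1         # schema: -n takes an argument
declare -p opts rest          # n is "-1"; rest is empty

parse_with 'n1' -n -1         # schema: -n and -1 are both flags
declare -p opts rest          # n and 1 are both set to 1; rest is empty
```

Nothing in the arguments themselves distinguishes the first call from the second, or the third from the fourth; only the optstring does.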


Thus: Schemaless command-line parsers exist, but do not cover enough cases to be widely useful -- and thus are typically isolated within the programs that use them, rather than being made into a library.

A typical schemaless command-line parser in bash (4.0 or newer), by the way, might look like the following:

# put key/value pairs into kwargs, other arguments into args
declare -A kwargs=( )
args=( )
args_only=0
for arg; do
  if (( args_only )); then
    args+=( "$arg" )
    continue
  fi
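  case $arg in
    --) args_only=1 ;;               # everything after -- is positional
    -) args+=( "$arg" ) ;;           # a lone - (conventionally stdin) is positional
    --*=*)                           # --key=value
      arg=${arg#--}
      kwargs[${arg%%=*}]=${arg#*=}
      ;;
    --*) kwargs[${arg#--}]=1 ;;      # --flag, recorded as 1
    -*) kwargs[${arg#-}]=1 ;;        # -f (or -abc as one key), recorded as 1
    *) args+=( "$arg" ) ;;           # anything else is positional
  esac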
done

This would work with...

my-script --x=ex --y=why zebra

...resulting in the values:

args=( zebra )
kwargs=( [x]=ex [y]=why )

...and also with some more interesting cases, but it would still be a long way from handling the general case.
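
As an illustration of where it falls short, here is a lightly adapted copy of the loop above, wrapped in a function (the name parse_schemaless is made up for this sketch) and with a lone - treated as positional. It is run against the exact command line from the question:

```bash
#!/usr/bin/env bash
# Requires bash 4.0+ for associative arrays.
declare -A kwargs
args=( )

# parse_schemaless ARGS...  (hypothetical wrapper around the loop above)
parse_schemaless() {
  local arg args_only=0
  kwargs=( )
  args=( )
  for arg; do
    if (( args_only )); then
      args+=( "$arg" )
      continue
    fi
    case $arg in
      --) args_only=1 ;;
      -) args+=( "$arg" ) ;;
      --*=*)
        arg=${arg#--}
        kwargs[${arg%%=*}]=${arg#*=}
        ;;
      --*) kwargs[${arg#--}]=1 ;;
      -*) kwargs[${arg#-}]=1 ;;
      *) args+=( "$arg" ) ;;
    esac
  done
}

parse_schemaless -x ex --y=why zebra
declare -p kwargs args   # x is 1 (not "ex"); y is "why"; args holds ex and zebra
```

Because the parser has no way to know that -x takes a value, it records x as a bare flag and leaves ex among the positional arguments, which is the second of the two interpretations discussed earlier rather than the one the question asks for.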