How to expand environment variables in Python as Bash does?


With os.path.expandvars I can expand environment variables in a string, but with the caveat: "Malformed variable names and references to non-existing variables are left unchanged" (emphasis mine). And besides, os.path.expandvars expands escaped \$ too.

I would like to expand the variables in a bash-like fashion, at least in these two points. Compare:

import os

os.environ['MyVar'] = 'my_var'
# Make sure 'unknown' is not defined.
os.environ.pop('unknown', None)
# Raw string, so the backslash reaches expandvars unchanged.
print(os.path.expandvars(r"$MyVar$unknown\$MyVar"))

which gives my_var$unknown\my_var with:

unset unknown
MyVar=my_var
echo $MyVar$unknown\$MyVar

which gives my_var$MyVar, and this is what I want.

There are 7 answers

9
fferri On

Try this:

re.sub(r'\$[A-Za-z_][A-Za-z0-9_]*', '', os.path.expandvars(path))

The regular expression should match any valid variable name, as per this answer, and every match will be substituted with the empty string.

Edit: if you don't want to replace escaped vars (i.e. \$VAR), use a negative lookbehind assertion in the regex:

re.sub(r'(?<!\\)\$[A-Za-z_][A-Za-z0-9_]*', '', os.path.expandvars(path))

(which says the match should not be preceded by \).

Edit 2: let's make this a function:

import os
import re
def expandvars2(path):
    return re.sub(r'(?<!\\)\$[A-Za-z_][A-Za-z0-9_]*', '', os.path.expandvars(path))

check the result:

>>> print(expandvars2('$TERM$FOO\$BAR'))
xterm-256color\$BAR

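The variable $TERM is expanded to its value, the nonexistent variable $FOO is expanded to the empty string, and \$BAR is left untouched. Note that \$BAR survives only because BAR is unset here: os.path.expandvars ignores the backslash, so an escaped reference to a defined variable would still be expanded.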
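The expand-then-strip idea above can also be folded into a single pass, which avoids handing escaped references to os.path.expandvars at all. The following is only a sketch, not part of the original answer: the name expand_single_pass and its environ parameter are made up for illustration, and it assumes Bash's double-quote rules, where a backslash is dropped only in front of $ or another backslash.

```python
import os
import re

# Either a backslash followed by any character, or $NAME / ${NAME}.
_TOKEN = re.compile(
    r'\\(.)|\$(?:([A-Za-z_][A-Za-z0-9_]*)|\{([A-Za-z_][A-Za-z0-9_]*)\})',
    re.DOTALL,
)

def expand_single_pass(text, environ=None):
    """Hypothetical helper: expand $NAME and ${NAME}, honouring \\$ and \\\\."""
    env = os.environ if environ is None else environ

    def repl(match):
        escaped = match.group(1)
        if escaped is not None:
            # Assumption: only \$ and \\ lose their backslash;
            # every other backslash sequence is returned unchanged.
            return escaped if escaped in '$\\' else match.group(0)
        # Unknown variables become the empty string, as in Bash.
        return env.get(match.group(2) or match.group(3), '')

    return _TOKEN.sub(repl, text)

print(expand_single_pass(r'$MyVar$unknown\$MyVar', {'MyVar': 'my_var'}))
# my_var$MyVar
```

Because escapes and variable references are matched by the same pattern, a backslash consumes the dollar sign before the variable branch can see it, so no lookbehind is needed.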

1
fferri On

An alternative solution, as pointed out by @HuStmpHrrr, is to let bash evaluate your string, so that you don't have to replicate all the bash functionality you want in Python.

Not as efficient as the other solution I gave, but it is very simple, which is also a nice feature :)

>>> from subprocess import check_output
>>> s = '$TERM$FOO\$TERM'
>>> check_output(["bash","-c","echo \"{}\"".format(s)])
b'xterm-256color$TERM\n'

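P.S. beware of escaping of " and \: you may want to replace \ with \\ and " with \" in s before calling check_output. Also note that bash will run any command substitution ($(...) or backticks) contained in s, so only use this with trusted input.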

1
davidedb On

The following implementation maintains full compatibility with os.path.expandvars, yet allows greater flexibility through optional parameters:

import os
import re

def expandvars(path, default=None, skip_escaped=False):
    """Expand environment variables of form $var and ${var}.
       If parameter 'skip_escaped' is True, all escaped variable references
       (i.e. preceded by backslashes) are skipped.
       Unknown variables are set to 'default'. If 'default' is None,
       they are left unchanged.
    """
    def replace_var(m):
        return os.environ.get(m.group(2) or m.group(1), m.group(0) if default is None else default)
    reVar = (r'(?<!\\)' if skip_escaped else '') + r'\$(\w+|\{([^}]*)\})'
    return re.sub(reVar, replace_var, path)

Below are some invocation examples:

>>> expandvars("$SHELL$unknown\$SHELL")
'/bin/bash$unknown\\/bin/bash'

>>> expandvars("$SHELL$unknown\$SHELL", '')
'/bin/bash\\/bin/bash'

>>> expandvars("$SHELL$unknown\$SHELL", '', True)
'/bin/bash\\$SHELL'
0
slhck On

There is a pip package called expandvars which does exactly that.

pip3 install expandvars

from expandvars import expandvars

print(expandvars("$PATH:${HOME:?}/bin:${SOME_UNDEFINED_PATH:-/default/path}"))
# /bin:/sbin:/usr/bin:/usr/sbin:/home/you/bin:/default/path

It has the benefit of implementing default value syntax (i.e., ${VARNAME:-default}).
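To make the default value syntax concrete, here is a rough standard-library sketch of what ${VARNAME:-default} means. It is not how the expandvars package is implemented; the function name expand_with_defaults and its environ parameter are invented for this illustration, and it only handles $NAME, ${NAME} and ${NAME:-default}.

```python
import os
import re

# $NAME, ${NAME} or ${NAME:-fallback}
_PATTERN = re.compile(r'\$(?:(\w+)|\{(\w+)(?::-([^}]*))?\})')

def expand_with_defaults(text, environ=None):
    """Hypothetical helper illustrating the ${NAME:-fallback} form."""
    env = os.environ if environ is None else environ

    def repl(match):
        name = match.group(1) or match.group(2)
        fallback = match.group(3) or ''
        value = env.get(name)
        # As in Bash, ":-" uses the fallback when the variable is unset or empty.
        return value if value else fallback

    return _PATTERN.sub(repl, text)

print(expand_with_defaults('${A:-x}/$B/${C}', {'B': 'b'}))
# x/b/
```

Anything beyond these three forms (for example ${NAME:?} or nested expansions) is left to the package.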

2
Seth Robertson On

I was unhappy with the various answers: I needed a little more sophistication to handle more edge cases, such as arbitrary numbers of backslashes and ${} style variables, but did not want to pay the cost of a bash eval. Here is my regex-based solution:

#!/usr/bin/env python

import re
import os

def expandvars(data,environ=os.environ):
    out = ""
    regex = r'''
             ( (?:.*?(?<!\\))                   # Match non-variable ending in non-slash
               (?:\\\\)* )                      # Match 0 or even number of backslash
             (?:$|\$ (?: (\w+)|\{(\w+)\} ) )    # Match variable or END
        '''

    for m in re.finditer(regex, data, re.VERBOSE|re.DOTALL):
        this = re.sub(r'\\(.)',lambda x: x.group(1),m.group(1))
        v = m.group(2) if m.group(2) else m.group(3)
        if v and v in environ:
            this += environ[v]
        out += this
    return out


# Replace with os.environ as desired
envars = { "foo":"bar", "baz":"$Baz" }

tests = { r"foo": r"foo",
          r"$foo": r"bar",
          r"$$": r"$$",                 # This could be considered a bug
          r"$$foo": r"$bar",            # This could be considered a bug
          r"\n$foo\r": r"nbarr",        # This could be considered a bug
          r"$bar": r"",
          r"$baz": r"$Baz",
          r"bar$foo": r"barbar",
          r"$foo$foo": r"barbar",
          r"$foobar": r"",
          r"$foo bar": r"bar bar",
          r"$foo-Bar": r"bar-Bar",
          r"$foo_Bar": r"",
          r"${foo}bar": r"barbar",
          r"baz${foo}bar": r"bazbarbar",
          r"foo\$baz": r"foo$baz",
          r"foo\\$baz": r"foo\$Baz",
          r"\$baz": r"$baz",
          r"\\$foo": r"\bar",
          r"\\\$foo": r"\$foo",
          r"\\\\$foo": r"\\bar",
          r"\\\\\$foo": r"\\$foo" }

# Report every test case whose expansion differs from the expected value.
for t, v in tests.items():
    g = expandvars(t, envars)
    if v != g:
        print("%s -> '%s' != '%s'" % (t, g, v))
        print("\n\n")
0
Artyer On

Here's a solution that uses the original expandvars logic: Temporarily replace os.environ with a proxy object that makes unknown variables empty strings. Note that a defaultdict wouldn't work because os.environ

For your escape issue, you can replace r'\$' with some value that is guaranteed not to be in the string and will not be expanded, then replace it back.

import os


class EnvironProxy(object):
    __slots__ = ('_original_environ',)

    def __init__(self):
        self._original_environ = os.environ

    def __enter__(self):
        self._original_environ = os.environ
        os.environ = self
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        os.environ = self._original_environ

    def __getitem__(self, item):
        try:
            return self._original_environ[item]
        except KeyError:
            return ''


def expandvars(path):
    replacer = '\0'  # NUL shouldn't be in a file path anyways.
    while replacer in path:
        replacer *= 2

    path = path.replace('\\$', replacer)

    with EnvironProxy():
        return os.path.expandvars(path).replace(replacer, '$')
3
fralau On

I have run across the same issue, but I would propose a different and very simple approach.

If we look at the basic meaning of "escape character" (as they started in printer devices), the purpose is to tell the device "do something different with whatever comes next". It is a sort of clutch. In our particular case, the only problem we have is when we have the two characters '\' and '$' in a sequence.

Unfortunately, we do not have control of the standard os.path.expandvars, so that the string is passed lock, stock and barrel. What we can do, however, is to fool the function so that it fails to recognize the '$' in that case! The best way is to replace the $ with some arbitrary "entity" and then to transform it back.

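import os

def expandvars(value):
    r"""
    Expand the env variables in a string, respecting the escape sequence \$
    """
    # Placeholder that os.path.expandvars will not mistake for a variable.
    DOLLAR = r"\&#36;"
    # Hide every escaped dollar sign behind the placeholder.
    escaped = value.replace(r"\$", r"\%s" % DOLLAR)
    # Expand, then restore the dollar sign; the escaping backslash is kept.
    return os.path.expandvars(escaped).replace(DOLLAR, "$")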

I used the HTML entity, but any reasonably improbable sequence would do (a random sequence might be even better). We might imagine cases where this method would have an unwanted side effect, but they should be so unlikely as to be negligible.
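The random sequence mentioned above could look roughly like the sketch below. The helper name expandvars_random_placeholder is made up for illustration. Unlike the function above, this variant drops the escaping backslash, so \$VAR comes back as $VAR, which is what bash prints. The placeholder is wrapped in NUL characters on the assumption that they never appear in a variable name, so the token cannot merge with a preceding reference such as $FOO.

```python
import os
import uuid

def expandvars_random_placeholder(value):
    """Hypothetical helper: protect \\$ with a random token while expanding."""
    # NUL delimiters keep the token from being read as part of a variable name.
    placeholder = "\0" + uuid.uuid4().hex + "\0"
    # Extremely unlikely, but regenerate if the token already occurs in the input.
    while placeholder in value:
        placeholder = "\0" + uuid.uuid4().hex + "\0"
    # Replace backslash + dollar with the token, expand, then put a bare $ back.
    protected = value.replace("\\$", placeholder)
    return os.path.expandvars(protected).replace(placeholder, "$")
```

A variable whose value happens to contain the token would still be affected, but with a fresh random token per call that should be rarer than with a fixed string.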