How to implement locking in a multi-process system?


We are running lots of Jenkins projects in parallel. We are using Python, and we have chosen to manage the virtual environments with pyenv. Unfortunately, pyenv has a well-known race condition. To work around the problem, I would like to implement locking at the process level. What I want to do is:

lock some resource (a file?)
do my pyenv stuff
unlock the resource

My scripts are written in bash. How can I implement resource locking / unlocking in bash?


There are 2 answers

2
codeforester On

I would consider using symlinks instead of regular files, because symlink creation is an atomic operation. So we could do this:

lockfile=/path/to/lock.file
# lockfile is a symlink whose target is the PID of the locking process
if ln -s "$$" "$lockfile" 2>/dev/null; then
    # got the lock - ln -s will fail if the symlink exists already
    :   # placeholder; bash does not allow a branch that contains only comments
else
    otherprocess=$(readlink "$lockfile")
    if [[ $otherprocess != "$$" ]]; then
        if ! ps -p "$otherprocess" >/dev/null 2>&1; then
            # stale lock; remove and lock again
            # this can result in race conditions
            # one option: make the lock procedure a function shared by the concurrent bash scripts, with a random sleep before removing the stale lock and proceeding
            :   # placeholder for the stale-lock handling
        fi
    fi
fi
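The comments above suggest wrapping the lock procedure in a shared function with a random sleep before removing a stale lock. Here is a minimal sketch of that idea. The function names acquire_lock and release_lock and the temporary lock path are hypothetical, and the stale-lock branch is still racy; the sleep and the re-check only narrow the window.

```shell
#!/usr/bin/env bash
# Hypothetical helpers sketching the symlink lock described above.

# acquire_lock LOCKFILE: returns 0 if this process holds the lock, 1 otherwise.
acquire_lock() {
    local lockfile=$1 owner
    # ln -s fails if the symlink already exists, so only one process can win
    if ln -s "$$" "$lockfile" 2>/dev/null; then
        return 0
    fi
    owner=$(readlink "$lockfile" 2>/dev/null) || return 1
    # We already hold the lock
    [[ $owner == "$$" ]] && return 0
    if ! ps -p "$owner" >/dev/null 2>&1; then
        # Stale lock: the owner is gone. Removing it is still racy, because
        # another process may reach this point at the same time.
        sleep $((RANDOM % 3))
        # Re-check that the lock still names the dead owner before removing it
        if [[ $(readlink "$lockfile" 2>/dev/null) == "$owner" ]]; then
            rm -f "$lockfile"
        fi
        ln -s "$$" "$lockfile" 2>/dev/null && return 0
    fi
    return 1
}

# release_lock LOCKFILE: removes the lock only if this process owns it.
release_lock() {
    local lockfile=$1
    [[ $(readlink "$lockfile" 2>/dev/null) == "$$" ]] && rm -f "$lockfile"
}

# Example use with a throwaway lock path
lockdir=$(mktemp -d)
if acquire_lock "$lockdir/pyenv.lock"; then
    echo "lock held by $$"
    release_lock "$lockdir/pyenv.lock"
fi
```

Note that $$ is the PID of the main script even inside a subshell, so every subshell of one script counts as the same lock owner.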
8
2ps On

Your friend in the Unix world when you want a cross-process lock is a command called flock. Acquiring the lock is an atomic operation at the OS level, which makes it extremely useful for this sort of thing. You can read more about it here. Here is how you can use it:

  # Wait up to 10 seconds for a lock on /path/to/lockfile (fd 222)
  (flock -w 10 222 || exit 1

  {
      # Do the operations you want to here
      :   # placeholder; bash does not allow an empty {} group
  }) 222>/path/to/lockfile

There are several tricks here. The redirection 222>/path/to/lockfile applies to the whole subshell in the (), so bash opens the lock file (creating it if necessary) on file descriptor 222 before any command inside the parentheses runs. The first command in the subshell is flock, which waits up to 10 seconds to obtain a lock on file descriptor 222; if it cannot get the lock in time, the subshell exits with status 1. Once the lock is held, the code in the {} is run. Nothing is written to the lock file; it exists only to be locked. When the subshell exits, file descriptor 222 is closed and the lock is released. This is just like C, where closing a file releases a lock. Of course, no one explains it better than the illustrious @CharlesDuffy (hat tip @codeforester), who explains what is going on here.
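If several scripts need the same lock, the pattern above can be wrapped in a small function. This is only a sketch: with_lock is a hypothetical name, the lock file is a throwaway created by mktemp, and it assumes the flock utility from util-linux is installed.

```shell
#!/usr/bin/env bash
# with_lock is a hypothetical helper, not part of flock itself.
# Usage: with_lock LOCKFILE COMMAND [ARGS...]
with_lock() {
    local lockfile=$1
    shift
    (
        # Wait up to 10 seconds for an exclusive lock on fd 222
        flock -w 10 222 || exit 1
        # Run the protected command; its exit status becomes the subshell's
        "$@"
    ) 222>"$lockfile"   # bash opens the lock file on fd 222 for the subshell
}

# Example use with a throwaway lock file
lockfile=$(mktemp)
with_lock "$lockfile" echo "pyenv work would run here"
```

The lock is released when the subshell exits, so callers never have to unlock explicitly, even if the protected command fails.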