Spawning a process with {create_group=True} / set_pgid hangs when starting Docker


Given a Linux system, in Haskell GHCi 8.8.3, I can run a Docker command with:

System.Process> withCreateProcess (shell "docker run -it alpine sh -c \"echo hello\""){create_group=False} $ \_ _ _ pid -> waitForProcess pid
hello
ExitSuccess

However, when I switch to create_group=True the process hangs. The effect of create_group is to call set_pgid with 0 in the child, and pid in the parent. Why does that change cause a hang? Is this a bug in Docker? A bug in System.Process? Or an unfortunate but necessary interaction?

Best answer, by Joseph Sible-Reinstate Monica

This isn't a bug in Haskell or a bug in Docker, but rather just the way that process groups work. Consider this C program:

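#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Move this process into a new process group of its own,
       which is what create_group=True does in the child. */
    if(setpgid(0, 0)) {
        perror("setpgid");
        return 1;
    }
    /* Replace this process with docker; execlp only returns on failure. */
    execlp("docker", "docker", "run", "-it", "alpine", "echo", "hello", (char*)NULL);
    perror("execlp");
    return 1;
}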

If you compile that and run ./a.out directly from your interactive shell, it will print "hello" as you'd expect. This is unsurprising, since the shell will have already put it in its own process group, so its setpgid is a no-op. If you run it with an intermediary program that forks a child to run it (sh -c ./a.out, \time ./a.out - note the backslash, strace ./a.out, etc.), then the setpgid will put it in a new process group, and it will hang like it does in Haskell.

The reason for the hang is explained in "Job Control Signals" in the glibc manual:

Macro: int SIGTTIN

A process cannot read from the user’s terminal while it is running as a background job. When any process in a background job tries to read from the terminal, all of the processes in the job are sent a SIGTTIN signal. The default action for this signal is to stop the process. For more information about how this interacts with the terminal driver, see Access to the Terminal.

Macro: int SIGTTOU

This is similar to SIGTTIN, but is generated when a process in a background job attempts to write to the terminal or set its modes. Again, the default action is to stop the process. SIGTTOU is only generated for an attempt to write to the terminal if the TOSTOP output mode is set; see Output Modes.

When you docker run -it something, Docker will attempt to read from stdin even if the command inside the container doesn't. Since you just created a new process group, and you didn't set it to be in the foreground, it counts as a background job. As such, Docker is getting stopped with SIGTTIN, which causes it to appear to hang.

Here's a list of options to fix this:

  1. Redirect the process's standard input to somewhere other than the TTY
  2. Use signal or sigaction to make the process ignore the SIGTTIN signal
  3. Use sigprocmask to block the process from receiving the SIGTTIN signal
  4. Call tcsetpgrp(0, getpid()) to make your new process group be the foreground process group (note: this is the most complicated, since it will itself cause SIGTTOU, so you'd have to ignore that signal at least temporarily anyway)

Options 2 and 3 will also only work if the program doesn't actually need stdin, which is the case with Docker. When SIGTTIN doesn't stop the process, reads from stdin will still fail with EIO, so if there's actually data you want to read, then you need to go with option 4 (and remember to set it back once the child exits).

If you have TOSTOP set (which is not the default), then you'd also have to apply the same fix to SIGTTOU (for options 2 and 3) or to standard output and standard error (for option 1). Option 4 wouldn't need to be repeated at all.