Conventional practice for signal handling/child processes


EDIT: If managing child processes for a shell script is really purely a matter of "opinion"......no wonder there are so many terrible shell scripts. Thanks for continuing that.


I'm having trouble understanding how SIGTERM is conventionally handled in relation to child processes on Linux.


I am writing a command line utility in Bash.

It looks like

command1
command2
command3

Very simple, right?

However, if my program is sent a SIGTERM signal, the Bash script will end but the current child process (e.g. command2) will continue running.

But with some more code, I can write my program like this

trap 'jobs -p | xargs -r kill' TERM

command1 &
wait
command2 &
wait
command3 &
wait

That will propagate SIGTERM to the currently running child process. I haven't often seen Bash scripts written like that, but that's what it would take.


Should I:

  1. Write my program in the second style each time I create a child process?
  2. Or expect users to launch my program in a process group if they want to send SIGTERM?

What are the best practices/conventions for process management responsibilities with respect to SIGTERM for children?

Paul Draper On

tl;dr

The first way.

If a process starts a child process and waits for it to finish (the example), nothing special is necessary.

If a process starts a child process and may prematurely terminate it, it should start that child in a new process group and send signals to the group.

Details

Oddly, given how often this applies (practically every shell script), I can't find a good answer about convention/best practice.

Some deduction:

Creating and signaling process groups is very common. In particular, interactive shells do this. So (unless it takes extra steps to prevent it) a process's children can receive SIGINT signals at any time, in very normal circumstances.

In the interest of supporting as few paradigms as possible, it seems to make sense to rely on that always.

That means the first style is okay, and the burden of process management is placed on processes that deliberately terminate their children during regular operation (which is relatively less common).

See also "Case study: timeout" below for further evidence.
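The deduction above rests on children inheriting their parent's process group unless someone changes it. A minimal Python sketch to check that on a POSIX system follows; the function name child_pgid_matches_parent is made up for this illustration and is not part of any library.

```python
import os
import subprocess
import sys


def child_pgid_matches_parent() -> bool:
    """Spawn a child with no special options and compare process group IDs.

    Hypothetical helper for illustration only.
    """
    # The child prints its own process group ID and exits.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpgrp())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout) == os.getpgrp()


if __name__ == "__main__":
    print(child_pgid_matches_parent())
```

Because the child stays in the caller's group, a signal sent to that group (for example by an interactive shell on Ctrl+C) should reach both processes without any forwarding code in the parent.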

How to do it

The question was asked from the perspective of a vanilla callee program, but this answer raises a follow-up question: in the non-vanilla case, where a caller wants to prematurely interrupt a process, how does it start that process in a new process group?

This is easy in some languages and difficult in others. I've created a utility run-pgrp to assist in the latter case.

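#!/usr/bin/env python3
# Run the command in a new process group, and forward signals.
import os
import signal
import sys

pid = os.fork()
if not pid:
    # Child: become the leader of a new process group, then exec the command.
    os.setpgid(0, 0)
    try:
        os.execvp(sys.argv[1], sys.argv[1:])
    except OSError as e:
        # exec failed; exit here so the child never runs the parent's code.
        print(f"run-pgrp: {e}", file=sys.stderr)
        os._exit(127)

def receiveSignal(sig, frame):
    # The child's process group ID equals its PID.
    try:
        os.killpg(pid, sig)
    except ProcessLookupError:
        pass  # The group no longer exists.
signal.signal(signal.SIGINT, receiveSignal)
signal.signal(signal.SIGTERM, receiveSignal)

_, status = os.waitpid(pid, 0)
# waitpid returns a raw wait status; convert it to a shell-style exit code.
if os.WIFSIGNALED(status):
    sys.exit(128 + os.WTERMSIG(status))
sys.exit(os.WEXITSTATUS(status))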

The caller can use that to wrap any process that it may prematurely terminate.

Node.js example:

const childProcess = require("child_process");
(async () => {
  // Named `child` rather than `process` to avoid shadowing the Node.js global.
  const child = childProcess.spawn(
    "run-pgrp",
    ["bash", "-c", "echo start; sleep 600; echo done"],
    { stdio: "inherit" }
  );
  /* Without run-pgrp, this leaves an orphaned sleep process:
  const child = childProcess.spawn(
    "bash",
    ["-c", "echo start; sleep 600; echo done"],
    { stdio: "inherit" }
  );
  */
  await new Promise(res => setTimeout(res, /* 1s */ 1000));
  child.kill(); // sends SIGTERM by default
  if (child.exitCode == null) {
    await new Promise(res => child.on("exit", res));
  }
})();

At the end of this program, the sleep process is terminated. If the command is invoked directly without run-pgrp, the sleep process continues to run.
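For comparison, Python is one of the languages where this is easy without a wrapper. The following is a rough sketch, not a drop-in replacement for run-pgrp: the function name run_with_deadline is hypothetical, and start_new_session=True creates a whole new session (via setsid), which implies a new process group but also detaches the child from the controlling terminal. On Python 3.11+, the process_group=0 argument to Popen should be a narrower alternative.

```python
import os
import signal
import subprocess


def run_with_deadline(argv, seconds):
    """Run argv in its own process group; SIGTERM the whole group on timeout.

    Hypothetical helper for illustration. Returns the child's return code,
    which is the negative signal number if the child was killed by a signal.
    """
    # setsid() in the child makes it the leader of a new process group
    # whose ID equals the child's PID.
    child = subprocess.Popen(argv, start_new_session=True)
    try:
        return child.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Signal the group, not just the child, so grandchildren get it too.
        os.killpg(child.pid, signal.SIGTERM)
        return child.wait()
```

Calling run_with_deadline(["bash", "-c", "echo start; sleep 600; echo done"], 1) should terminate both bash and sleep, since both are in the signaled group.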

Case study: timeout

The GNU timeout utility is a program that may terminate its child process.

Notably, it runs the child in a new process group. This supports the conclusion that potential interruptions should be preceded by creating a new process group.

Interestingly, however, timeout puts itself in that process group as well. This avoids the complexity of forwarding signals, but it causes some strange behavior. https://unix.stackexchange.com/a/57692/56781

For example, in an interactive shell, run

bash -c "echo start; timeout 600 sleep 600; echo done"

Try to interrupt this (Ctrl+C). It doesn't respond, because timeout never gets the signal!

In contrast, my run-pgrp utility keeps itself in the original process group and forwards SIGINT/SIGTERM to the child group.