I am running into an issue spawning a large number of processes (200) under MacOS X Mountain Lion (though I'm sure this issue is not version specific). What I am doing is launching processes (in my test it is /bin/cat) which have their STDIN, STDOUT, and STDERR connected to pipes -- the other end of which is the spawning (parent) process.
The parent process writes data into the STDIN of the processes, which is piped to the [/bin/cat] child processes, which in turn spit the data back out of STDOUT and is read by the parent process. /bin/cat is just used for testing.
I am actually using kqueue to be notified when there is space available in the STDIN pipe. When kqueue notifies you with a EVFILT_WRITE event that space is available, the event also tells you exactly how many bytes can be written without blocking.
This all works well, and I can easily spawn 100 child (/bin/cat) processes, and everything flows through the pipes without blocking (all day long). However, when I crank up the number of processes to 200 everything grinds to a halt when the single kqueue service thread blocks on a write() call to one of the STDIN pipes. The event says that there is 16384 bytes available (basically an empty pipe) but when the program tries to write exactly 16384 bytes into the pipe, the write() blocks anyway.
Initially I thought I was running into a max. open files issue, but I've jacked up the ulimit for open files to 8192, so that is not the issue. What I have discovered from some googling is that on OS X, STDIN/STDOUT/STDERR are not in fact "files" (or "pipes") but are actually devices. When the process is hung, running lsof on the command-line also hangs with a warning about not being able to stat() the file system:
lsof: WARNING: can't stat() hfs file system /
Output information may be incomplete.
assuming "dev=1000008" from mount table
As soon as I kill the process, lsof completes. This reinforces the notion that STDIN/OUT/ERR are in fact devices and I'm running into some kind of limit.
Does anyone have an idea of what limit I am running into, for example is there a limit on the number of "device" that can be created? Can this be increased somehow?
Just to answer my own question here. This appears to be related to how MacOS X will dynamically expand a pipe from 16K to 32K to 64K (and so on). My basic workaround was to prevent the pipe from expanding. It appears that whenever you fill the pipe completely the OS will expand it. So, when the kqueue triggers that I can write into the pipe, and indicates that I have 16384 bytes available to write, I simply write 16384 - 1 bytes. Basically, whatever it tells me I have available, I write at most (available - 1) bytes. This prevents the pipe from expanding, and is preventing my code from encountering the condition where a write() to the pipe would block (even though the pipe is non-blocking).