Scenario
I have multiple Node.js scripts running under forever on Ubuntu. One of these files (start.js) imports a file that starts a ZMQ publisher by binding it to a specified port. When I start start.js on its own under forever, it binds and starts the publisher, and I am able to fetch the data it publishes through a ZMQ subscriber that connects to this port.
I close the publisher gracefully by handling the exit, SIGINT, SIGTERM and SIGUSR2 events.
Whenever I restart start.js alone using forever restart, the publisher binds and starts successfully. It also works fine if I stop it manually (using forever stop) and start it again using forever start. It works as well when I manually stop everything (using forever stopall) and then start all the forever scripts one by one.
NOTE: All the forever stop and restart commands are run with CLI option --killSignal=SIGINT.
Problem
But the publisher fails to bind when I run forever restartall --killSignal=SIGINT. It says that the address is already in use (I have checked this using netstat and there is no TCP socket on that port). When I stop all the scripts and start them one by one, it binds normally and starts successfully.
I have checked that these kill signals are caught by the publisher script and that it is closing the publisher socket before exiting.
Failed Attempts:
Lowered the TIME_WAIT state of the tcp sockets.
Enabled reuse of TIME_WAIT sockets.
I thought that the TCP socket was taking time to get released from the TIME_WAIT state, so I tried to re-bind the publisher 1000 ms after every failed bind, but the script fails every time it retries.
Tried restarting all the scripts with forever using the SIGINT and SIGUSR1 kill signals, and handled them in the script that binds the publisher socket.
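The retry attempt from the list above can be sketched roughly as follows. This is only an illustration of the approach, not the exact code from my script: `bindWithRetry`, `sleep`, `retries` and `delayMs` are hypothetical names, and `bindFn` stands in for whatever call actually binds the publisher socket.

```javascript
// Hypothetical helper: resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: call `bindFn` until it succeeds or `retries` attempts are used up.
// `bindFn` is an assumption; it stands in for the real bind call on the publisher socket.
async function bindWithRetry(bindFn, { retries = 5, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      await bindFn();   // handles both a synchronous throw and a rejected promise
      return attempt;   // number of attempts it took to bind
    } catch (err) {
      lastError = err;
      console.log(`bind attempt ${attempt} failed: ${err.message}`);
      if (attempt < retries) await sleep(delayMs); // wait before the next try
    }
  }
  throw lastError;      // every attempt failed
}
```

With a socket that exposes a synchronous bind (for example `bindSync` in the zeromq v5-style API), this might be called as `bindWithRetry(() => socket.bindSync('tcp://*:5555'))`, where the address is a placeholder. In my case every retry still failed with "address already in use".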
This is how I am handling the SIG* events in the publisher:
process.stdin.resume(); // keep the process alive so the handlers below can run

function exitHandler(options, err) {
    if (options.cleanup) console.log('pub-clean');
    // Signal handlers receive the signal name, not an Error, so only print real stacks
    if (err instanceof Error) console.log("pub--" + err.stack);
    if (options.exit) {
        socket.close(); // close the ZMQ publisher socket before exiting
        console.log("Publisher Closed");
        process.exit();
    }
}

process.on('exit', exitHandler.bind(null, {cleanup: true}));
process.on('SIGINT', exitHandler.bind(null, {exit: true}));
process.on('uncaughtException', exitHandler.bind(null, {exit: true}));
process.on('SIGUSR2', exitHandler.bind(null, {exit: true}));
process.on('SIGTERM', exitHandler.bind(null, {exit: true}));
Why does restarting all the scripts with forever restartall cause the publisher script to fail to bind?
What can be done to make the publisher bind successfully after a forever restartall?