0MQ publisher not binding after forever restartall


Scenario

I have multiple Node.js scripts running under forever on Ubuntu. One of them (start.js) imports a file that starts a ZMQ publisher by binding it to a specified port. When I start start.js under forever on its own, the publisher binds and starts, and I can fetch the published data through a ZMQ subscriber that connects to this port.

I close the publisher gracefully by handling the exit, SIGINT and SIGUSR events.

Whenever I restart start.js alone using forever restart, the publisher binds and starts successfully. It also works if I stop it manually (using forever stop) and start it again using forever start. The same holds when I stop everything with forever stopall and then start all the forever scripts one by one.

NOTE: All the forever stop and restart commands are run with CLI option --killSignal=SIGINT.

Problem

But the publisher fails to bind when I run forever restartall --killSignal=SIGINT. It reports that the address is already in use, although netstat shows no TCP socket on that port. When I stop all the scripts and start them one by one, it binds normally and starts successfully.
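As a cross-check on the netstat result, the port can also be probed from inside Node just before the publisher tries to bind. The following is only a minimal sketch: isPortFree is a hypothetical helper name, it uses Node's built-in net module rather than ZMQ, and the default host is an assumption that should match the interface the publisher binds to.

```javascript
const net = require('net');

// Hypothetical helper: resolves to true if a TCP listener can be opened on
// host:port right now, false if the OS reports EADDRINUSE.
// Any other error (for example EACCES) is passed through as a rejection.
function isPortFree(port, host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once('error', (err) => {
      if (err.code === 'EADDRINUSE') resolve(false);
      else reject(err);
    });
    probe.once('listening', () => {
      // Release the port again before reporting it as free.
      probe.close(() => resolve(true));
    });
    probe.listen(port, host);
  });
}

// Example use (the port number is a placeholder):
// isPortFree(5555).then((free) => console.log(process.pid, 'port free:', free));
```

Logging the result together with process.pid at startup may help show whether the port is still held at the moment the restarted script runs.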

I have checked that these kill signals are caught by the publisher script and that it closes the publisher socket before exiting.

Failed Attempts:

  • Lowered the TIME_WAIT duration of the TCP sockets.

  • Enabled reuse of TIME_WAIT sockets.

  • I thought that the TCP socket was taking time to be released from the TIME_WAIT state, so I tried to bind the publisher again 1000 ms after every failed bind, but the script fails on every attempt.

  • Tried restarting all the scripts with forever using SIGINT and SIGUSR1 as kill signals, and handled them in the script that binds the publisher socket.
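The retry attempt from the list above can be sketched roughly as follows. This is an illustration under assumptions, not the actual publisher code: bindWithRetry and bindFn are hypothetical names, and bindFn stands for a function that performs one bind attempt and returns a promise (it would wrap whatever bind call the ZMQ binding in use exposes).

```javascript
// Hypothetical helper: calls bindFn up to `retries` times, waiting `delayMs`
// between failed attempts. Resolves with the number of the attempt that
// succeeded, or rejects with the last error if every attempt fails.
async function bindWithRetry(bindFn, { retries = 5, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      await bindFn();
      return attempt;
    } catch (err) {
      lastError = err;
      console.log(`bind attempt ${attempt} failed: ${err.message}`);
      if (attempt < retries) {
        // Wait before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Logging each attempt makes it easier to tell a short delay in releasing the port from a port that stays held for as long as the script keeps retrying.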

This is how I am handling the SIG* events in the publisher:

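// Keep the process alive so the handlers below get a chance to run.
process.stdin.resume();

function exitHandler(options, err) {
    if (options.cleanup) console.log('pub-clean');
    // Signal handlers receive the signal name and 'exit' receives the exit code,
    // so only print a stack trace when an actual Error was passed in.
    if (err instanceof Error) console.log("pub--" + err.stack);
    if (options.exit) {
        // socket is the ZMQ publisher socket created elsewhere in this file.
        socket.close();
        console.log("Publisher Closed");
        process.exit();
    }
}

// Log on normal exit; close the publisher and exit on signals and uncaught errors.
process.on('exit', exitHandler.bind(null, {cleanup: true}));
process.on('SIGINT', exitHandler.bind(null, {exit: true}));
process.on('uncaughtException', exitHandler.bind(null, {exit: true}));
process.on('SIGUSR2', exitHandler.bind(null, {exit: true}));
process.on('SIGTERM', exitHandler.bind(null, {exit: true}));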
Why does restarting all the scripts with forever restartall cause the publisher script to fail to bind?

What can be done to make the publisher bind successfully after forever restartall?
