ZeroMQ publisher not binding after forever restartall

Question

419 views Asked by saikiranboga At 12 November 2014 at 18:18

Scenario

I have multiple Node.js scripts running under forever on Ubuntu. One of these files (start.js) imports a file that starts a ZMQ publisher by binding it to a specified port. When I start start.js under forever on its own, it binds and starts the publisher, and I am able to fetch the data it publishes through a ZMQ subscriber that connects to this port.

I am closing the publisher gracefully by checking for exit, SIGINT and SIGUSR events.

Whenever I restart start.js alone using forever restart, the publisher binds and starts successfully. It also works if I stop it manually (using forever stop) and start it again using forever start. The same holds if I stop everything manually (using forever stopall) and then start all the forever scripts one by one.

NOTE: All the forever stop and restart commands are run with CLI option --killSignal=SIGINT.

Problem

The publisher fails to bind, however, when I run forever restartall --killSignal=SIGINT. The error says that the address is already in use, although netstat shows no TCP socket on that port. When I stop all the scripts and start them one by one, the publisher binds normally and starts successfully.

I have checked that these kill signals are caught by the publisher script and that it closes the publisher socket before exiting.

Failed Attempts:

Lowered the TIME_WAIT timeout of the TCP sockets.
Enabled reuse of TIME_WAIT sockets.
Assumed that the TCP socket was slow to leave the TIME_WAIT state and retried the bind 1000 ms after every failure, but every retry fails as well.
Restarted all the scripts with forever using the SIGINT and SIGUSR1 kill signals, and handled them in the script that binds the publisher socket.

This is how I am handling the SIG* events in the publisher:

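process.stdin.resume(); // keep the process alive so the signal handlers can run
function exitHandler(options, err){
    if (options.cleanup) console.log('pub-clean');
    // signal handlers may receive a non-Error argument, so check for a stack first
    if (err && err.stack) console.log("pub--" + err.stack);
    if (options.exit){
        socket.close(); // release the publisher socket before exiting
        console.log("Publisher Closed");
        process.exit();
    }
}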

process.on('exit', exitHandler.bind(null,{cleanup:true}));
process.on('SIGINT', exitHandler.bind(null, {exit:true}));
process.on('uncaughtException', exitHandler.bind(null,{exit:true}));
process.on('SIGUSR2', exitHandler.bind(null, {exit:true}));
process.on('SIGTERM', exitHandler.bind(null, {exit:true}));

Why does restarting all the scripts with forever cause the publisher script to fail to bind?

What can be done to make the publisher script bind after a forever restartall?

Original Q&A

There is 1 answer

**user3666197** · Answer 1 · 2014-11-14T15:15:14+00:00

ZeroMQ-resources are recommended to be released in a controlled way

As discussed in the comments above, a truly graceful release of ZeroMQ resources is not done via system-level SIG* / *KILL, but by executing the ZeroMQ-recommended graceful-release steps.

The code posted so far does none of this, so the ZeroMQ resources may, and most probably do, remain hanging (at least the I/O thread seems to).

Check the ZeroMQ socket settings used in your (not yet posted) setup code (the .setsockopt() calls made in the setup phase) and add the following:

1. ensure the settings for a non-blocking .close() are applied to every socket that was set up (whether it is used or not)
2. then execute .close(), but only once step 1 is in place
3. finally, execute an explicit .term() on the ZeroMQ Context instance

This is considered a guaranteed ZeroMQ-graceful-release of all ( internally handled ) resources.
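
For the Node.js publisher in the question, the first two steps above might look like the sketch below. It is a minimal illustration, not the asker's actual code: `releaseSocket` and `makeExitHandler` are hypothetical helpers, and the sketch assumes the binding in use offers `setsockopt()` and `close()` on the socket plus a `ZMQ_LINGER` constant, as the legacy `zmq` npm module does. The third step is left out because whether context termination is exposed depends on the binding and its version.

```javascript
// Hypothetical helper: releases one ZeroMQ socket without blocking.
// `lingerOption` is the binding's ZMQ_LINGER constant (assumed to be passed
// in by the caller, e.g. zmq.ZMQ_LINGER with the legacy `zmq` module).
function releaseSocket(socket, lingerOption, log = console.log) {
  // Step 1: a linger of 0 discards unsent messages, so close() cannot block.
  socket.setsockopt(lingerOption, 0);
  // Step 2: close only once the linger setting is in place.
  socket.close();
  log('Publisher closed');
}

// Wraps releaseSocket so that repeated signals do not close the socket twice.
// `exit` is injectable so the handler can be exercised without ending the process.
function makeExitHandler(socket, lingerOption, exit = () => process.exit(0)) {
  let done = false;
  return function onExit() {
    if (done) return; // a second signal arrives while shutting down: ignore it
    done = true;
    releaseSocket(socket, lingerOption);
    exit();
  };
}
```

With a real publisher this could be registered as `process.on('SIGINT', makeExitHandler(socket, zmq.ZMQ_LINGER))`, where `socket` and `zmq` come from the publisher's own setup code.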

On a sample code request:

A graceful release

void  msLIB.deinit() {
      aComment.ADD( "msLIB.INFO: msLIB.deinit() TracePOINT.<BoPROC>|", False );

   // --------------------------------------------------------------------------------------<THANKS>
   // DO NOT EDIT: the below  IS TRULY NEEDED or else one might get some nice memory leaks!
   // ------------            |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||   


   // -----------------------------------------------------------------------
   // ZMQ-IMPERATIVE ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
   // _______________________________________________________________________ msMOD: ZMQ_Safe&CleanCODE_IMPERATIVE
      zmq_setsockopt(   zmqSpeaker,  ZMQ_LINGER, 0 );                         // no Sending QUEUE .... on ZMQ_PUBLISHER end
      zmq_close(        zmqSpeaker  );                                        // Protect against memory leaks on shutdown.
                        aComment.ADD( "<1>", False );
                        aComment.ADD( " [[[ZMQ]]]<speaker_socket>.set( ZMQ_LINGER ) / .close()-ed ", True );

      zmq_setsockopt(   zmqListener, ZMQ_LINGER, 0 );                         // aKBD.PUB 
      zmq_close(        zmqListener );                                        // Protect against memory leaks on shutdown.
                        aComment.ADD( "<2>", False );
                        aComment.ADD( " [[[ZMQ]]]<listener_socket>.set( ZMQ_LINGER ) / .close()-ed ", True );      

      zmq_term(         zmqContext  );                                        // Protect against memory leaks on shutdown.
                        aComment.ADD( "<3>", False );
                        aComment.ADD( " [[[ZMQ]]]<context>.term()-ed ", True );

   // _______________________________________________________________________ msMOD: ZMQ_Safe&CleanCODE_IMPERATIVE

   // ------------
   // DO NOT EDIT: the above  IS TRULY NEEDED or else one might get some nice memory leaks!               
   // --------------------------------------------------------------------------------------<THANKS>

      aComment.ADD( "|<EoPROC>", False );
      msLIB.aSnapshot.MAKE();
      aComment.ADD( "|aSnapshot.MAKE()-<DONE>", True );
      return;
   }

On missing "in-built" controls

One may extend the architecture to contain one's own soft-signalling code for all the situations that need to be handled more softly than via SIGKILL et al. The original answer links to two examples here: a simpler case and a more complex case.
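
A soft-signalling layer of this kind can be sketched in Node.js as a single shutdown routine that several triggers feed into. Everything named here (`createShutdown`, `wireTriggers`, the `control` emitter and its 'stop' event) is hypothetical and only illustrates the idea. A real control channel might be a dedicated subscriber socket or an IPC message rather than an in-process EventEmitter.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: wraps a cleanup function so that it runs at most once,
// no matter how many triggers (signals, control messages) fire.
function createShutdown(cleanup, exit = (code) => process.exit(code)) {
  let started = false;
  return function shutdown(reason) {
    if (started) return false; // a later trigger is ignored
    started = true;
    let code = 0;
    try {
      cleanup(reason); // e.g. set linger, close sockets
    } catch (err) {
      console.error('cleanup failed: ' + err.stack);
      code = 1;
    }
    exit(code);
    return true;
  };
}

// Routes both the application-level 'stop' message and the listed OS signals
// into the same shutdown routine.
function wireTriggers(shutdown, control, signals = ['SIGINT', 'SIGTERM']) {
  control.on('stop', () => shutdown('control:stop'));
  signals.forEach((sig) => process.on(sig, () => shutdown(sig)));
}

// Hypothetical in-process control channel used for the soft 'stop' request.
const control = new EventEmitter();
```

The publisher would call `wireTriggers(createShutdown(cleanup), control)` once at startup, and any part of the application could then request a soft stop with `control.emit('stop')` instead of relying on a kill signal.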
