How to restart a node in an Akka multi-node test?

I want to write an Akka multi-node test that restarts one node when a certain barrier is reached. Something like:

runOn(nodeA) {
  // do something while both nodes are up and running

  enterBarrier("nodeBCrashes")

  // do something while I'm the only node up and running

  enterBarrier("bothNodesUp")

  // do something with both nodes up and running again
}

runOn(nodeB) {
  // do something while both nodes are up and running

  crash()
  enterBarrier("nodeBCrashes")

  // do nothing because I'm out

  enterBarrier("bothNodesUp")
  start()

  // do something with both nodes up and running again

}

If this can't be done, I at least need a way to shut down nodeB and start another node, nodeC, with the same akka.remote.netty.tcp.port (this is strictly necessary). Something like this:

runOn(nodeA) {
  // do something while both nodes are up and running

  enterBarrier("nodeBCrashes")

  // do something while I'm the only node up and running

  enterBarrier("bothNodesUp")

  // do something with both nodes up and running again
}

runOn(nodeB) {
  // do something while both nodes are up and running

  enterBarrier("nodeBCrashes")
  shutdown()

}

// How can I delay nodeC's start until nodeA reaches the bothNodesUp barrier?
runOn(nodeC) {
  // do something when both nodes are up and running
}

The question can be summarized as:

Can we recreate a situation where one node crashes and then restarts?

  1. Can we restart a node?
  2. If not, can we start a node when the rest of them reach a barrier?
  3. Can we assign the same akka.remote.netty.tcp.port to different nodes (ones that never run at the same time)? I've tried using *.opts files without success. Is that the right approach?

There is 1 answer

Answer by Stefano Bonetti:

You should be able to start a new ActorSystem that reuses the port of the crashed one. Akka's own multi-node tests do something along these lines:

  // ActorSystem(...) takes the system name as a String, so pass system.name
  lazy val restartedSecondSystem = ActorSystem(
    system.name,
    ConfigFactory.parseString("akka.remote.netty.tcp.port=" + secondUniqueAddress.address.port.get).
      withFallback(system.settings.config))

  ...      

  runOn(nodeB) {        
    shutdown(secondSystem)
  }

  enterBarrier("second-shutdown")

  runOn(nodeB) {
    Cluster(restartedSecondSystem).joinSeedNodes(seedNodes)
  }
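
To make the port-reuse step above more concrete, here is a minimal sketch of a helper that builds the override configuration and starts the replacement system. RestartSupport, restartConfig and restartOnPort are hypothetical names chosen for illustration; they are not part of Akka or its test kit. The sketch assumes classic remoting (akka.remote.netty.tcp.port, as in the question). With Artery the relevant setting would be akka.remote.artery.canonical.port instead.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.{ Config, ConfigFactory }

object RestartSupport {

  // Hypothetical helper (not an Akka API): overlay a fixed classic-remoting
  // port on top of an existing configuration. Values parsed from the string
  // take precedence; everything else falls back to `base`.
  def restartConfig(base: Config, port: Int): Config =
    ConfigFactory
      .parseString(s"akka.remote.netty.tcp.port = $port")
      .withFallback(base)

  // Hypothetical helper: start a new ActorSystem with the same name and
  // settings as `old`, bound to `port`. Call this only after `old` has
  // terminated, otherwise the port is still in use.
  def restartOnPort(old: ActorSystem, port: Int): ActorSystem =
    ActorSystem(old.name, restartConfig(old.settings.config, port))
}
```

On nodeB, one way to get the port is to read Cluster(secondSystem).selfAddress.port.get before shutting secondSystem down, then pass it to restartOnPort once the "second-shutdown" barrier has been passed.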

Check out the following tests in Akka's source code for more cues.

https://github.com/akka/akka/blob/master/akka-cluster/src/multi-jvm/scala/akka/cluster/RestartNodeSpec.scala

https://github.com/akka/akka/blob/master/akka-cluster/src/multi-jvm/scala/akka/cluster/RestartNode2Spec.scala

https://github.com/akka/akka/blob/master/akka-cluster/src/multi-jvm/scala/akka/cluster/RestartNode3Spec.scala