I m working with servicemix and i have create a zookeeper server bundle that start correctly my zkserver. I run 3 servicemix with each one have a zk server bundle so i have 3 zkserver.
I have done : echo stat | nc localhost 2181 2182 and 2183 and i have one leader and 2 followers
If i stop the leader, an election is done and another one become the leader with one follower, if i restart the stopped one, it become a follower.
I thinkZookeeper part is ok.
Now, let see how the embedded servicemix activemq broker works.
I have my 3 zkserver up, and one master broker elected correctly. If i execute karaf command activemq:bstat, i see on 2 servicemix
service@localhost>bstat
Uptime = 26 minutes
and on the master broker:
service@localhost>bstat
destinationName = Consumer.ZOOKEEPER.VirtualTopic.application.downward
destinationType = Queue
destinationName = ActiveMQ.Advisory.TempQueue_ActiveMQ.Advisory.TempTopic
destinationType = Topic
connectorName = openwire
...
...
many other things...
...
Its look like ok.
Now, if i stop my zkserver leader, i see my master broker stopped too and bstat don't show me anything! and no other broker start one the 2 others servicemix.
LOG when zkserver leader stop
in this log we can see the zkserver stop but i can't see when the broker is stopped and why it's stop because when i stop the zkserver leader, another election is done and another one become the leader, so the broker don't need to stop...
2016-12-19 10:23:18,208 | INFO | l Console Thread | ZookeeperStarter | 284 - zookeeper-server - 1.0.0.SNAPSHOT | Shutting down ZooKeeper server
2016-12-19 10:23:18,208 | INFO | l Console Thread | Leader | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,209 | INFO | l Console Thread | Leader | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutdown called
java.lang.Exception: shutdown Leader! reason: quorum Peer shutdown
at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:499)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.QuorumPeer.shutdown(QuorumPeer.java:828)[94:org.apache.hadoop.zookeeper:3.4.6]
at com.zookeeper.provider.ZookeeperStarter$MyQuorumPeerMain.shutdown(ZookeeperStarter.java:185)[284:zookeeper-server:1.0.0.SNAPSHOT]
at com.zookeeper.provider.ZookeeperStarter.shutdown(ZookeeperStarter.java:73)[284:zookeeper-server:1.0.0.SNAPSHOT]
at com.zookeeper.provider.Activator.stop(Activator.java:71)[284:zookeeper-server:1.0.0.SNAPSHOT]
at org.apache.felix.framework.util.SecureAction.stopActivator(SecureAction.java:667)[org.apache.felix.framework-4.4.1.jar:]
at org.apache.felix.framework.Felix.stopBundle(Felix.java:2538)[org.apache.felix.framework-4.4.1.jar:]
at org.apache.felix.framework.BundleImpl.stop(BundleImpl.java:1012)[org.apache.felix.framework-4.4.1.jar:]
at org.apache.felix.framework.BundleImpl.stop(BundleImpl.java:999)[org.apache.felix.framework-4.4.1.jar:]
at org.apache.karaf.shell.osgi.StopBundle.doExecute(StopBundle.java:44)[25:org.apache.karaf.shell.osgi:2.4.3]
at org.apache.karaf.shell.osgi.BundlesCommand.doExecute(BundlesCommand.java:37)[25:org.apache.karaf.shell.osgi:2.4.3]
at org.apache.karaf.shell.console.OsgiCommandSupport.execute(OsgiCommandSupport.java:38)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.commands.basic.AbstractCommand.execute(AbstractCommand.java:35)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.runtime.CommandProxy.execute(CommandProxy.java:78)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.runtime.Closure.executeCmd(Closure.java:477)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.runtime.Closure.executeStatement(Closure.java:403)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.runtime.Pipe.run(Pipe.java:108)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:183)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:120)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.felix.gogo.runtime.CommandSessionImpl.execute(CommandSessionImpl.java:92)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.karaf.shell.console.jline.Console.run(Console.java:195)[23:org.apache.karaf.shell.console:2.4.3]
at org.apache.karaf.shell.console.jline.DelayedStarted.run(DelayedStarted.java:79)[23:org.apache.karaf.shell.console:2.4.3]
2016-12-19 10:23:18,210 | INFO | l Console Thread | NIOServerCnxn | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Closed socket connection for client /10.149.58.103:45947 which had sessionid 0x15916627e620001
2016-12-19 10:23:18,210 | INFO | Thread-68 | Leader | 94 - org.apache.hadoop.zookeeper - 3.4.6 | exception while shutting down acceptor: java.net.SocketException: Socket closed
2016-12-19 10:23:18,211 | INFO | it.sncf.fr:2181) | ClientCnxn | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Unable to read additional data from server sessionid 0x15916627e620001, likely server has closed socket, closing socket connection and attempting reconnect
2016-12-19 10:23:18,211 | INFO | l Console Thread | NIOServerCnxn | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Closed socket connection for client /10.149.58.103:45946 which had sessionid 0x15916627e620000
2016-12-19 10:23:18,212 | INFO | l Console Thread | ZooKeeperServer | 94 - org.apache.hadoop.zookeeper - 3.4.6 | shutting down
2016-12-19 10:23:18,212 | INFO | l Console Thread | SessionTrackerImpl | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,212 | INFO | l Console Thread | PrepRequestProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,213 | INFO | l Console Thread | ProposalRequestProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,213 | INFO | l Console Thread | CommitProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,213 | INFO | sid:1 cport:-1): | PrepRequestProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | PrepRequestProcessor exited loop!
2016-12-19 10:23:18,213 | INFO | l Console Thread | Leader | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,214 | INFO | l Console Thread | FinalRequestProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | shutdown of request processor complete
2016-12-19 10:23:18,214 | INFO | ommitProcessor:1 | CommitProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | CommitProcessor exited loop!
2016-12-19 10:23:18,214 | INFO | l Console Thread | SyncRequestProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,214 | INFO | SyncThread:1 | SyncRequestProcessor | 94 - org.apache.hadoop.zookeeper - 3.4.6 | SyncRequestProcessor exited!
2016-12-19 10:23:18,225 | WARN | 149.58.103:36413 | LearnerHandler | 94 - org.apache.hadoop.zookeeper - 3.4.6 | ******* GOODBYE /10.149.58.103:36413 ********
2016-12-19 10:23:18,226 | WARN | 149.58.103:36413 | LearnerHandler | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Ignoring unexpected exception
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)[:1.7.0_71]
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)[:1.7.0_71]
at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:656)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:649)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,229 | WARN | 149.58.103:36408 | LearnerHandler | 94 - org.apache.hadoop.zookeeper - 3.4.6 | ******* GOODBYE /10.149.58.103:36408 ********
2016-12-19 10:23:18,229 | WARN | 149.58.103:36408 | LearnerHandler | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Ignoring unexpected exception
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)[:1.7.0_71]
at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)[:1.7.0_71]
at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:656)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:649)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,232 | INFO | 0.0/0.0.0.0:2181 | NIOServerCnxnFactory | 94 - org.apache.hadoop.zookeeper - 3.4.6 | NIOServerCnxn factory exited run method
2016-12-19 10:23:18,233 | WARN | 0:0:0:0:0:0:2181 | QuorumPeer | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Unexpected exception
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:453)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:799)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,233 | INFO | 0:0:0:0:0:0:2181 | Leader | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Shutting down
2016-12-19 10:23:18,234 | WARN | 0:0:0:0:0:0:2181 | QuorumPeer | 94 - org.apache.hadoop.zookeeper - 3.4.6 | QuorumPeer main thread exited
2016-12-19 10:23:18,235 | ERROR | .149.58.103:3888 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Exception while listening
java.net.SocketException: Socket closed
at java.net.PlainSocketImpl.socketAccept(Native Method)[:1.7.0_71]
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)[:1.7.0_71]
at java.net.ServerSocket.implAccept(ServerSocket.java:530)[:1.7.0_71]
at java.net.ServerSocket.accept(ServerSocket.java:498)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:509)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,237 | WARN | RecvWorker:3 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Connection broken for id 3, my id = 1, error =
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)[:1.7.0_71]
at java.net.SocketInputStream.read(SocketInputStream.java:152)[:1.7.0_71]
at java.net.SocketInputStream.read(SocketInputStream.java:122)[:1.7.0_71]
at java.net.SocketInputStream.read(SocketInputStream.java:210)[:1.7.0_71]
at java.io.DataInputStream.readInt(DataInputStream.java:387)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:765)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,237 | WARN | RecvWorker:3 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Interrupting SendWorker
2016-12-19 10:23:18,238 | WARN | SendWorker:2 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)[:1.7.0_71]
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)[:1.7.0_71]
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:849)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:64)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:685)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,239 | WARN | RecvWorker:2 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Connection broken for id 2, my id = 1, error =
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)[:1.7.0_71]
at java.net.SocketInputStream.read(SocketInputStream.java:152)[:1.7.0_71]
at java.net.SocketInputStream.read(SocketInputStream.java:122)[:1.7.0_71]
at java.net.SocketInputStream.read(SocketInputStream.java:210)[:1.7.0_71]
at java.io.DataInputStream.readInt(DataInputStream.java:387)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:765)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,239 | WARN | RecvWorker:2 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Interrupting SendWorker
2016-12-19 10:23:18,240 | WARN | SendWorker:2 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Send worker leaving thread
2016-12-19 10:23:18,241 | WARN | SendWorker:3 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Interrupted while waiting for message on queue
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)[:1.7.0_71]
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2095)[:1.7.0_71]
at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:389)[:1.7.0_71]
at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:849)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:64)[94:org.apache.hadoop.zookeeper:3.4.6]
at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:685)[94:org.apache.hadoop.zookeeper:3.4.6]
2016-12-19 10:23:18,241 | WARN | SendWorker:3 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Send worker leaving thread
2016-12-19 10:23:18,321 | WARN | spatch-DEFAULT-2 | MasterLevelDBStore | 96 - org.apache.activemq.activemq-osgi - 5.12.1 | Unexpected session error: java.io.IOException: Connection reset by peer
2016-12-19 10:23:18,330 | WARN | spatch-DEFAULT-3 | MasterLevelDBStore | 96 - org.apache.activemq.activemq-osgi - 5.12.1 | Unexpected session error: java.io.IOException: Connection reset by peer
2016-12-19 10:23:18,583 | INFO | -broker] Task-14 | ZooKeeper | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Session: 0x15916627e620001 closed
2016-12-19 10:23:18,584 | INFO | er])-EventThread | ClientCnxn | 94 - org.apache.hadoop.zookeeper - 3.4.6 | EventThread shut down
2016-12-19 10:23:19,236 | INFO | .149.58.103:3888 | QuorumCnxManager | 94 - org.apache.hadoop.zookeeper - 3.4.6 | Leaving listener
2016-12-19 10:23:19,803 | INFO | Receiver[myid=1] | FastLeaderElection | 94 - org.apache.hadoop.zookeeper - 3.4.6 | WorkerReceiver is down
2016-12-19 10:23:19,803 | INFO | erSender[myid=1] | FastLeaderElection | 94 - org.apache.hadoop.zookeeper - 3.4.6 | WorkerSender is down
2016-12-19 10:23:20,000 | INFO | SessionTracker | SessionTrackerImpl | 94 - org.apache.hadoop.zookeeper - 3.4.6 | SessionTrackerImpl exited loop!
Zk server configuration
Only myid change to 1 / 2 / 3 and karaf.data is different because myid file in dataDir is unique and contain 1 or 2 or 3
tickTime=2000
initLimit=10
syncLimit=10
clientPort=2183
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
myId=3
dataDir=${karaf.data}/zookeeper/
Activemq replicated leveldb configuration
<persistenceAdapter>
<replicatedLevelDB
directory="${data}"
replicas="3"
bind="tcp://0.0.0.0:61622"
zkAddress="localhost:2181,localhost:2182,localhost:2183"
zkPassword="password"
zkPath="/activemq/leveldb-stores"
hostname="localhost"
/>
</persistenceAdapter>
version
Servicemix 5.5.4 ActiveMQ : bundle osgi : activemq-osgi (5.12.1) Zookeeper : bundle osgi : ZooKeeper Bundle (3.4.6)
thanks for help to understand why my master broker stop and why no other broker start after zk server leader election is done.