I have an embedded Neo4j v2.2.3 setup with three identical servers, and I'm trying to turn a single database into an HA setup. I've tried starting the HA process with every combination of databases (all empty, all but one empty, and all using the same database), but to no avail. As far as I can tell, the Neo4j instances can't connect to each other for some reason. I have verified that the IP addresses are correct, and port 5001 should be open. I've also opened port 6001.
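For anyone who wants to double-check the "port should be open" part from each machine, a plain TCP probe is one way to do it. This is only a minimal sketch: the class name PortCheck, the method isReachable and the 2-second timeout are my own choices and have nothing to do with the Neo4j API. Note that it can only succeed while something is actually listening on the target port, so a "false" does not by itself prove a firewall problem.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Neo4j: probes a TCP port from this machine.
final class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: all count as unreachable.
            return false;
        }
    }

    // Usage: java PortCheck host1:5001 host2:5001 host3:5001
    public static void main(String[] args) {
        for (String target : args) {
            int colon = target.lastIndexOf(':');
            String host = target.substring(0, colon);
            int port = Integer.parseInt(target.substring(colon + 1));
            System.out.println(target + " reachable: " + isReachable(host, port, 2000));
        }
    }
}
```

Running it on each server against the other two, for both the cluster port and the HA port, shows whether the connection works in every direction.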
Here is my messages.log.
2015-06-25 20:37:16.461+0000 INFO [o.n.k.i.DiagnosticsManager]: --- INITIALIZED diagnostics START ---
2015-06-25 20:37:16.462+0000 INFO [o.n.k.i.DiagnosticsManager]: Neo4j Kernel properties:
2015-06-25 20:37:16.467+0000 INFO [o.n.k.i.DiagnosticsManager]: ha.server_id=1
2015-06-25 20:37:16.467+0000 INFO [o.n.k.i.DiagnosticsManager]: ha.server=:6001
2015-06-25 20:37:16.467+0000 INFO [o.n.k.i.DiagnosticsManager]: online_backup_server=0.0.0.0:6362
2015-06-25 20:37:16.467+0000 INFO [o.n.k.i.DiagnosticsManager]: ephemeral=false
2015-06-25 20:37:16.467+0000 INFO [o.n.k.i.DiagnosticsManager]: ha.initial_hosts=[IP1]:5001,[IP2]:5001,[IP3]:5001
2015-06-25 20:37:16.467+0000 INFO [o.n.k.i.DiagnosticsManager]: online_backup_enabled=true
2015-06-25 20:37:16.468+0000 INFO [o.n.k.i.DiagnosticsManager]: ha.cluster_server=:5001
2015-06-25 20:37:16.468+0000 INFO [o.n.k.i.DiagnosticsManager]: store_dir=/var/neo4j
2015-06-25 20:37:16.468+0000 INFO [o.n.k.i.DiagnosticsManager]: org.neo4j.server.webserver.address=0.0.0.0
2015-06-25 20:37:16.468+0000 INFO [o.n.k.i.DiagnosticsManager]: org.neo4j.server.database.mode=HA
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: Diagnostics providers:
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: org.neo4j.kernel.configuration.Config
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: org.neo4j.kernel.info.DiagnosticsManager
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: SYSTEM_MEMORY
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: JAVA_MEMORY
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: OPERATING_SYSTEM
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: JAVA_VIRTUAL_MACHINE
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: CLASSPATH
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: LIBRARY_PATH
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: SYSTEM_PROPERTIES
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: LINUX_SCHEDULERS
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: NETWORK
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: NodeCache
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: RelationshipCache
2015-06-25 20:37:16.469+0000 INFO [o.n.k.i.DiagnosticsManager]: HighAvailabilityDiagnostics
....
2015-06-25 21:55:36.502+0000 INFO [o.n.k.i.DiagnosticsManager]: High Availability diagnostics
Member state:PENDING
State machines:
AtomicBroadcastMessage:start
AcceptorMessage:start
ProposerMessage:start
LearnerMessage:start
HeartbeatMessage:start
ElectionMessage:start
SnapshotMessage:start
ClusterMessage:start
Current timeouts:
Eventually after two minutes I get a transaction exception:
Caused by: org.neo4j.graphdb.TransactionFailureException: Timeout waiting for database to become available and allow new transactions. Waited 2m. 2 reasons for blocking: Database is stopped, Cluster state is 'PENDING'.
I create the factory with
graphDatabaseFactory = new HighlyAvailableGraphDatabaseFactory()
which is then used to create the database service:
DatabaseServiceImpl(
    graphDatabaseFactory
        .newEmbeddedDatabaseBuilder(neo4jStoreDir)
        .loadPropertiesFromFile(configFileLocation)
        .newGraphDatabase())
This is what my neo4j.properties looks like:
online_backup_enabled=true
online_backup_server=0.0.0.0:6362
org.neo4j.server.webserver.address=0.0.0.0
org.neo4j.server.database.mode=HA
ha.server_id=1
ha.cluster_server=0.0.0.0:5001
ha.server=0.0.0.0:6001
ha.initial_hosts=[IP1]:5001,[IP2]:5001,[IP3]:5001
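Since ha.server_id has to be unique for each instance in the cluster, while ha.initial_hosts is normally the same on all of them, it can help to sanity-check each server's copy of the file before starting the database. The following is only a rough sketch using java.util.Properties; the class HaConfigCheck and its methods are hypothetical helpers of mine, not part of Neo4j, and the checks are deliberately shallow. Uniqueness across servers still has to be compared by hand.

```java
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Neo4j: shallow checks on the HA settings.
final class HaConfigCheck {

    // Splits ha.initial_hosts ("host:port,host:port,...") into trimmed entries.
    static List<String> initialHosts(Properties props) {
        List<String> hosts = new ArrayList<>();
        String raw = props.getProperty("ha.initial_hosts", "");
        for (String entry : raw.split(",")) {
            if (!entry.trim().isEmpty()) {
                hosts.add(entry.trim());
            }
        }
        return hosts;
    }

    // Returns readable problems; an empty list means nothing obvious was found.
    static List<String> problems(Properties props) {
        List<String> found = new ArrayList<>();
        String id = props.getProperty("ha.server_id");
        if (id == null || !id.trim().matches("\\d+")) {
            found.add("ha.server_id must be an integer that is unique per instance");
        }
        List<String> hosts = initialHosts(props);
        if (hosts.isEmpty()) {
            found.add("ha.initial_hosts is empty");
        }
        for (String host : hosts) {
            if (!host.matches(".+:\\d+")) {
                found.add("ha.initial_hosts entry lacks a port: " + host);
            }
        }
        return found;
    }

    // Usage: java HaConfigCheck /path/to/neo4j.properties
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        try (FileReader reader = new FileReader(args[0])) {
            props.load(reader);
        }
        for (String problem : problems(props)) {
            System.out.println(problem);
        }
    }
}
```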
I've tried a lot of different combinations for the properties and also added the suggested values from neo4j-server.properties, but nothing helps. Where should I put neo4j-server.properties in embedded mode, or is it not needed (that's my initial guess)?
What might be wrong? Is it even possible to set up an HA cluster using embedded Neo4j anymore?
EDIT. I made sure every server is on the same subnet and the servers can connect to each other without obstructions.
So the problem turned out to be that I'm using kernel extensions set with
new HighlyAvailableGraphDatabaseFactory().addKernelExtensions(myKernelExtensionsArray)
The addKernelExtensions method is deprecated, but these extensions work great with a single-server setup. However, on this HA setup they fail for some reason. I was able to reuse my kernel extensions by replacing the call to addKernelExtensions with registerTransactionEventHandler.