I have a data set of 3.8 million nodes and I'm trying to load all of them into Neo4j Spatial. The nodes are going into a simple point layer, so they have the required latitude and longitude properties. I've tried:
MATCH (d:pointnode)
WITH collect(d) as pn
CALL spatial.addNodes("point_geom", pn) yield count return count
But this just keeps spinning without anything happening. I've also tried the following (I run it all on one line; it is split up here only for ease of reading):
CALL apoc.periodic.iterate("MATCH (d:pointnode)
WITH collect(d) AS pnodes return pnodes",
"CALL spatial.addNodes('point_geom', pnodes) YIELD count return count",
{batchSize:10000, parallel:false, listIterate:true})
But again there is a lot of spinning and the occasional Java heap error.
The final approach I tried was to use FME with the HTTP caller. This works, but it is exceptionally slow, so it doesn't scale well to millions of nodes.
Any advice or suggestions would be much appreciated. Would apoc.periodic.commit or apoc.periodic.rock_n_roll be a better choice than apoc.periodic.iterate?
After a bit of trial and error, apoc.periodic.commit has led to a relatively quick solution (it is still going to take 2-3 hours).
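A minimal sketch of what such a call could look like, not necessarily the exact query that was run. It assumes the layer name 'point_geom' and the label pointnode from the question, that nodes already in the layer can be recognised by an RTREE_REFERENCE relationship (an assumption about how Neo4j Spatial links indexed nodes, worth checking against your version), and a batch size of 1000, which is purely illustrative. apoc.periodic.commit re-runs its statement in separate transactions until the statement returns 0, so the statement has to skip nodes handled on earlier passes:

```cypher
// Each pass: take up to $limit pointnode nodes that are not yet in the layer
// (assumed marker: an RTREE_REFERENCE relationship), add them to 'point_geom',
// and return how many were added. The loop stops when that count reaches 0.
CALL apoc.periodic.commit(
  "MATCH (d:pointnode) WHERE NOT (d)-[:RTREE_REFERENCE]-() " +
  "WITH d LIMIT $limit " +
  "WITH collect(d) AS pnodes " +
  "CALL spatial.addNodes('point_geom', pnodes) YIELD count " +
  "RETURN count",
  {limit: 1000}  // illustrative batch size, tune to your heap
)
```

Unlike the collect(d) in the earlier attempts, the list here never holds more than one batch, which is probably why the heap errors go away. On older Neo4j 3.x releases the parameter may need to be written {limit} instead of $limit.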
It may be quicker with larger batch sizes.
EDIT: With a batch size of 5000 it takes 45 minutes.