I am using py2neo with transactions. This means I am using the Cypher language. I am appending the textual Cypher statements to a transaction queue and submitting the contents of the queue in one shot with commit.
It works fine. However, it is slow. I am getting about 100 nodes per second, and as the transaction queue gets larger the inserts take longer. My app times out if a transaction has more than about 6,000 nodes (and a similar number of relationships).
For now I want to focus on my Cypher. My app generates a lot of this:
CREATE (n:METHOD {version: 6995, unique: 682, return_type: 0, fully_qualified_name: 0, name: "method4", accessibility: 0})
CREATE (n:PARAMETER {version: 6995, unique: 687, fully_qualified_name: 0, param_type: 1, name: "param4", accessibility: 0})
MATCH (a:METHOD), (b:PARAMETER) WHERE a.unique=682 AND a.version=6995 AND b.unique=687 AND b.version=6995 CREATE (a)-[r:INVOKED_WITH]->(b)
So I create a METHOD node, create a PARAMETER node, then relate them. What bothers me is that I create the two nodes, throw away the fact that I just created them, and then find them again with a lookup so I can connect them. This irks me. The previous version did not use transactions; when I created a node, I got a native neo4j ID back and used that when creating relationships. Now I can't do that, since the textual statements are submitted en masse to the neo4j server.
Am I allowed to put RETURN statements in there the way I can do in the neo4j web interface? Is there better Cypher to use?
EDIT - I have indexes on the "unique" property of all relevant node types.
I am not using parameters in my Python code because the code is using transactions. Therefore I have to use py2neo's mechanism to talk to neo4j directly. This involves creating the textual commands you see above.
py2neo supports transactions, and of course you can use parameters in the Cypher queries. Here is a simple piece of code I just tested:
The elapsed time is around 80 ms.
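As a rough sketch of what parameterized statements in a transaction queue could look like for the nodes in the question: the helper names (`build_queue`, `queue_all`, `RecordingTx`) are made up for illustration, and the sketch assumes a transaction object with an `append(statement, parameters)` method, as in py2neo 2.0's Cypher transactions. Other py2neo versions name things differently. It also uses the `{name}` parameter syntax of Neo4j 2.x; newer servers expect `$name` instead.

```python
# Statement texts stay constant; only the parameter dicts change per node.
# Assumption: Neo4j 2.x style "{name}" placeholders (newer servers use "$name").
CREATE_METHOD = "CREATE (n:METHOD {props})"
CREATE_PARAMETER = "CREATE (n:PARAMETER {props})"
LINK = (
    "MATCH (a:METHOD), (b:PARAMETER) "
    "WHERE a.unique = {a_unique} AND a.version = {version} "
    "AND b.unique = {b_unique} AND b.version = {version} "
    "CREATE (a)-[:INVOKED_WITH]->(b)"
)


def build_queue(method, param):
    """Return (statement, parameters) pairs for one METHOD/PARAMETER pair."""
    return [
        (CREATE_METHOD, {"props": dict(method)}),
        (CREATE_PARAMETER, {"props": dict(param)}),
        (LINK, {
            "a_unique": method["unique"],
            "b_unique": param["unique"],
            "version": method["version"],
        }),
    ]


def queue_all(tx, pairs):
    """Append every pair to a transaction-like object.

    Assumption: tx exposes append(statement, parameters).
    """
    for statement, parameters in pairs:
        tx.append(statement, parameters)


class RecordingTx:
    """Hypothetical stand-in for a real transaction; it only records calls."""

    def __init__(self):
        self.queued = []

    def append(self, statement, parameters=None):
        self.queued.append((statement, parameters))


method = {"version": 6995, "unique": 682, "name": "method4"}
param = {"version": 6995, "unique": 687, "name": "param4"}
tx = RecordingTx()
queue_all(tx, build_queue(method, param))
```

With a real transaction in place of `RecordingTx`, a commit would follow the last `append`. Because the statement text never changes, the server can reuse its query plan rather than parsing a new string for every node.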
Update, you can also do:
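One possibility, as a sketch only: create both nodes and the relationship in a single `CREATE`, so no lookup is needed afterwards, and add a `RETURN` to get the native IDs back. The function name `build_pair` is hypothetical, and the `{name}` placeholders assume Neo4j 2.x parameter syntax (newer servers use `$name`). How the returned rows are exposed after a commit depends on the py2neo version.

```python
# One statement creates METHOD, PARAMETER and the relationship between them.
# Assumption: Neo4j 2.x "{name}" placeholders; each is bound to a property map.
CREATE_PAIR = (
    "CREATE (a:METHOD {method})-[:INVOKED_WITH]->(b:PARAMETER {param}) "
    "RETURN id(a) AS method_id, id(b) AS param_id"
)


def build_pair(method, param):
    """Return the statement and its parameters for one connected pair."""
    return CREATE_PAIR, {"method": dict(method), "param": dict(param)}


statement, parameters = build_pair(
    {"version": 6995, "unique": 682, "name": "method4"},
    {"version": 6995, "unique": 687, "name": "param4"},
)
```

This replaces the three statements from the question with one, and the variables `a` and `b` stay in scope for the relationship, so the `MATCH` on `unique` and `version` is no longer needed for freshly created nodes.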