Are my Cypher statements 'overdone'?


I am using py2neo with transactions. This means I am using the Cypher language. I am appending the textual Cypher statements to a transaction queue and submitting the contents of the queue in one shot with commit.

It works fine. However, it is slow. I am getting about 100 nodes per second, and as the transaction queue gets larger the inserts take longer. My app times out if a transaction has more than about 6,000 nodes (and a similar number of relationships).

For now I want to focus on my Cypher. My app generates a lot of this:

CREATE (n:METHOD {version: 6995, unique: 682, return_type: 0, fully_qualified_name: 0, name: "method4", accessibility: 0})
CREATE (n:PARAMETER {version: 6995, unique: 687, fully_qualified_name: 0, param_type: 1, name: "param4", accessibility: 0})
MATCH (a:METHOD), (b:PARAMETER) WHERE a.unique=682 AND a.version=6995 AND b.unique=687 AND b.version=6995 CREATE (a)-[r:INVOKED_WITH]->(b)

So I create a METHOD node, create a PARAMETER node, then relate them. What bothers me is that I create the two nodes, throw away the fact that I just created them, and then find them again with a lookup so I can connect them. The previous version did not use transactions; when I created a node, I got a native Neo4j ID back and used that when creating relationships. Now I can't do that, since the textual statements are submitted en masse to the Neo4j server.

Am I allowed to put RETURN statements in there the way I can do in the neo4j web interface? Is there better Cypher to use?

EDIT - I have indexes on the "unique" property of all relevant node types.

I am not using parameters in my Python code because the code is using transactions. Therefore I have to use py2neo's mechanism to talk to neo4j directly. This involves creating the textual commands you see above.

Christophe Willemsen On BEST ANSWER

Py2neo supports transactions, and you can use parameters in Cypher queries inside them. Here is a simple example I just tested:

from py2neo import Graph
import time

graph = Graph("http://neo4j:password@localhost:7474/db/data/")

# Queue parameterized statements on a single transaction
tx = graph.cypher.begin()
for x in range(0, 100):
    tx.append("CREATE (m:Method {id:{id}})", {"id": x})
    tx.append("CREATE (p:Parameter {id:{id}})", {"id": x})
    tx.append("MATCH (m:Method {id:{mid}}), (p:Parameter {id: {pid}}) CREATE (m)-[:RELATES]->(p)", {"mid": x, "pid": x})

# Time the commit in milliseconds
mstart = int(round(time.time() * 1000))
tx.commit()
mend = int(round(time.time() * 1000))
diff = mend - mstart
print(diff)

The diff time is around 80ms
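
The question's worry about creating two nodes and then looking them up again can be avoided by keeping all three CREATE clauses in one statement, because identifiers such as `m` and `p` stay bound for the rest of that statement. The following is only a sketch under assumptions: the helper name `method_with_param` is hypothetical, the property maps are a trimmed version of the ones in the question, and it relies on the same `{name}` parameter syntax used above, with a whole map passed as the node's properties.

```python
# One statement creates both nodes and the relationship, so no MATCH lookup is needed.
# {method} and {param} are map parameters that supply each node's properties.
STATEMENT = (
    "CREATE (m:METHOD {method}) "
    "CREATE (p:PARAMETER {param}) "
    "CREATE (m)-[:INVOKED_WITH]->(p)"
)


def method_with_param(method_props, param_props):
    """Hypothetical helper: return (statement, parameters) for tx.append()."""
    return STATEMENT, {"method": dict(method_props), "param": dict(param_props)}


# Sample values taken from the question (property list shortened for brevity).
statement, params = method_with_param(
    {"version": 6995, "unique": 682, "name": "method4"},
    {"version": 6995, "unique": 687, "name": "param4"},
)

# With a transaction opened as in the example above, this would be queued with:
# tx.append(statement, params)
```

Because the statement text never changes, only the parameter dictionary differs between calls, which should also let the server reuse the query plan.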

Update: you can also create a method and all of its parameters in a single statement with UNWIND:

    tx.append("""CREATE (m:Method {id:{method_id}}) WITH m
                 UNWIND {parameter_ids} AS p_id
                 CREATE (p:Parameter {id:p_id})
                 CREATE (m)-[:RELATES]->(p)""",
              {"method_id": 1234, "parameter_ids": list(range(0, 100))})
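
The UNWIND form expects one list of parameter ids per method, so flat input has to be grouped first. A minimal sketch of that grouping step, assuming the application starts from (method id, parameter id) pairs; the function name `group_parameters` and the sample ids are hypothetical and not part of py2neo:

```python
from collections import OrderedDict


def group_parameters(pairs):
    """Group (method_id, parameter_id) pairs into one parameter dict per method."""
    grouped = OrderedDict()
    for method_id, parameter_id in pairs:
        grouped.setdefault(method_id, []).append(parameter_id)
    return [
        {"method_id": method_id, "parameter_ids": ids}
        for method_id, ids in grouped.items()
    ]


# Hypothetical sample data: method 682 has two parameters, method 683 has one.
pairs = [(682, 687), (682, 688), (683, 689)]
batches = group_parameters(pairs)

# Each dict in `batches` can then be passed as the second argument of
# tx.append(), together with the UNWIND statement text shown above.
```

This turns one queued statement per node and relationship into one queued statement per method.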