I'm using Property Graph and Cypher from Neo4j. As described in the title, I'm trying to delete a number of nodes which can be reached from another node without passing other nodes and only have 1 incoming relationship. Here is the example of this case:
Each node has its label (big, bold character) and a property called nodeId
, which is unique among nodes. The labels of relationships are omitted because we cannot rely on it for some reasons. The nodeId
property is already indexed with a unique constraint.
Now, starting from node A {nodeId: 1}
, I want to delete it and all other nodes which:
- can be reached from
A {nodeId: 1}
without passing another A-label node. - only has 1 incoming relationship
So, the nodes will be deleted are: A {nodeId: 1}
, B {nodeId: 3}
, C {nodeId: 4}
, and C {nodeId: 8}
.
Below is my Cypher code:
MATCH p = (s:A {nodeId: 1 }) -[*1..10]-> (e)
WHERE NONE (x in NODES(p) WHERE x:A AND NOT x.nodeId = 1)
WITH s, e
MATCH (e) <-[r]- ()
WITH count(r) AS num_r, s, e
WHERE num_r < 2
DETACH DELETE e
DETACH DELETE s
The code works fine but as my graph grows, it becomes slower and slower. In the beginning, it takes less than 10 ms. But now, when I have around 1 million of nodes and 2 million of relationships, it takes more than 1 second.
What should I do to improve the performance of that code?
Thank you for your help.
Tezra has the right idea for the Cypher version of this query (but without usage of shortestPath()).
An alternate approach that might work better in a more complex graph is to use APOC Procedures, which has path expansion procs that will work well for your use case, finding only single paths to each distinct node, and filtering efficiently on labels.
Here's how you might use this, using
apoc.path.subgraphNodes()
The labelFilter in the procedure call ensures that no node in the expansion has an
:A
label (the filter doesn't apply to the starting node of the expansion by default, which works in your case here, though this is configurable).EDIT
One flaw in this approach, however, is that this expands any relationship, in any direction.
While the relationshipFilter can filter by direction, there's currently a bug which won't allow us to specify only relationship direction without a type.
UPDATE
As of the APOC Summer releases in 2018 (3.3.0.4 along the 3.3.x line, and 3.4.0.2 along the 3.4.x line), you can now specify typeless, direction-only in the relationshipFilter:
relationshipFilter:'>'