Neo4j: Avoiding circular results in a Cypher query

I have the Unified Medical Language System (UMLS) ontologies loaded into Neo4j, with relationships between SNOMED concept nodes. There can be multiple types of relationships between any two nodes. The 2015AA release of UMLS, with my selection of options, produced 1,256,982 SNOMED nodes and 2,258,642 relationships between them. This query returns the expected 21 child nodes of the SNOMED root node:

MATCH (n:MRCONSO {AUI:'A3684559'}) MATCH (n)<-[*..1]-(x) RETURN count(*)

Increasing the depth of the query causes problems. This query returns 3,338 rows:

MATCH (n:MRCONSO {AUI:'A3684559'}) MATCH (n)<-[*..2]-(x) RETURN id(x)

In 11 of those rows the id is not unique. This also shows up in the following query, which returns 3,327 rows (3,338 - 11):

MATCH (n:MRCONSO {AUI:'A3684559'}) MATCH p = shortestPath((n)<-[*..2]-(x)) RETURN id(x)

Thus, I can get the unique child node IDs using shortestPath. However, the query times are 52 ms and 61,745 ms for the 2nd and 3rd queries, respectively. Both deteriorate as the query depth increases.

Is there a way to avoid the circularity in the query and thereby reduce the query time?

1 answer

Michael Hunger

Which version of Neo4j are you using? Try updating to 2.2.2.

Were you able to determine why you get those duplicate ids? It could be that a child is reachable on two levels.

Also, your query will output both the level 1 and the level 2 children.
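
If only the second level is wanted, a fixed-length pattern is one option. This is a minimal sketch, assuming the same label and AUI value as in the question; a node that is reachable at both depths will still be returned.

// [*2] matches paths of exactly two relationships, so children that are
// only one hop away from n are left out
MATCH (n:MRCONSO {AUI:'A3684559'})
MATCH (n)<-[*2]-(x)
RETURN DISTINCT id(x)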

Can you output all paths of a certain duplicate id?
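
Something along these lines might do that. It is only a sketch: the names pathCount and paths are made up for illustration. It groups the paths by end node and keeps the nodes that were reached more than once.

MATCH (n:MRCONSO {AUI:'A3684559'})
MATCH p = (n)<-[*..2]-(x)
// group the paths by end node; pathCount > 1 means x was reached more than once
WITH x, count(p) AS pathCount, collect(p) AS paths
WHERE pathCount > 1
RETURN id(x), pathCount, paths

Since there can be several relationship types between the same two nodes, the paths for a duplicate id may differ only in the relationships they traverse.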

Would it be good enough to just get the unique ids? Then you can use DISTINCT:

MATCH (n:MRCONSO{AUI:'A3684559'})
MATCH (n)<-[*..2]-(x)
RETURN DISTINCT id(x)

Do you have an index or constraint on :MRCONSO(AUI)?
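
If not, something like the following should let the first MATCH find the start node through an index lookup rather than a scan over all :MRCONSO nodes. This assumes the Neo4j 2.x schema syntax (later releases changed it), and the constraint variant only applies if every AUI value is unique in your data.

// schema index on the lookup property (Neo4j 2.x syntax)
CREATE INDEX ON :MRCONSO(AUI)

// alternative, assuming AUI values are unique; the constraint also creates an index
// CREATE CONSTRAINT ON (m:MRCONSO) ASSERT m.AUI IS UNIQUE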