Exporting data from Neo4j to Gephi is not showing all nodes

I am having trouble exporting data (nodes and relationships) from Neo4j to Gephi. I am using the apoc.gephi.add procedure to do that, but it exports only a seemingly random subset of the nodes and relationships returned by the Neo4j query.

Below is the Cypher query I use to export from Neo4j to Gephi:

match (t1:tag)<-[:has]-(vid1:video)-[:recommends]->(vid2:video)-[:has]->(t2:tag)
where (t1.title contains 'ukraine' or t1.title contains 'russia') and not (t2.title contains 'ukraine' or t2.title contains 'russia')
match (u1:user)-[:author]->(vid1)
match path = (u2:user)-[:author]->(vid2)
where u2.verified = 0 and size(u2.nickname) < 4 and u2.commerceUserInfo_commerceUser = 1
CALL apoc.gephi.add(null,'workspace1',path,'weight',['title', 'diggCount','followerCount','followingCount','commentCount','heartCount','playCount','shareCount','videoQuality','uniqueId','verified']) yield nodes, relationships, time
return *

Neo4j shows 300 nodes and their relationships out of the 6100 nodes returned by this query, as shown below:

Neo4j Results

However, Gephi shows only 61 nodes and 32 relationships!

Gephi Results

Why is this happening, and how can I export all nodes from Neo4j to Gephi? Thank you.

Answer by Giuseppe Villani:

I'm fairly sure this is not an apoc.gephi.add problem but a problem with the query.

You run several match clauses, but you pass only the path variable to apoc.gephi.add, and that variable covers just the (u2:user)-[:author]->(vid2) part. Your final return *, on the other hand, returns every variable from the other matches as well, which is why Neo4j shows more than Gephi.

If you run return path instead of return *, you should see the same results as in the attached Gephi image.
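As a rough way to check this, the sketch below keeps the matches from the question, drops the Gephi call, and counts only what the path variable contains. The aliases nodesInPath and relsInPath are names made up for this example. If the diagnosis is right, the counts should be much closer to what Gephi shows than to what return * displays.

```cypher
// Same matches as in the question, without the call to apoc.gephi.add.
match (t1:tag)<-[:has]-(vid1:video)-[:recommends]->(vid2:video)-[:has]->(t2:tag)
where (t1.title contains 'ukraine' or t1.title contains 'russia')
  and not (t2.title contains 'ukraine' or t2.title contains 'russia')
match (u1:user)-[:author]->(vid1)
match path = (u2:user)-[:author]->(vid2)
where u2.verified = 0 and size(u2.nickname) < 4 and u2.commerceUserInfo_commerceUser = 1
// Look only at the nodes and relationships inside path,
// i.e. the data that was actually handed to the Gephi procedure.
unwind nodes(path) as n
unwind relationships(path) as r
return count(distinct n) as nodesInPath, count(distinct r) as relsInPath
```

Replacing the last three lines with return path gives the same subset as a graph view in the Neo4j browser.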

So I suggest changing the query so that everything you need is passed to the Gephi procedure.

Something like this should work (though it depends on your dataset):

// Bind every pattern to its own path variable so all of them can be sent to Gephi.
match p1 = (t1:tag)<-[:has]-(vid1:video)-[:recommends]->(vid2:video)-[:has]->(t2:tag)
where (t1.title contains 'ukraine' or t1.title contains 'russia') and not (t2.title contains 'ukraine' or t2.title contains 'russia')
match p2 = (u1)-[:author]->(vid1)
match p3 = (u2)-[:author]->(vid2)
where u2.verified = 0 and size(u2.nickname) < 4 and u2.commerceUserInfo_commerceUser = 1
// Pass all three paths to the procedure, not only the last one.
with [p1, p2, p3] as dataToAdd
CALL apoc.gephi.add(null,'workspace1', dataToAdd,'weight',['title', 'diggCount','followerCount','followingCount','commentCount','heartCount','playCount','shareCount','videoQuality','uniqueId','verified']) yield nodes, relationships, time
return dataToAdd, nodes, relationships, time