Defining neo4j relationships


I am trying to define relationships between nodes. Each type of node has its own CSV file, and the relationships are defined in a CSV file as well. Each node has a unique node id, and the relationships file refers to nodes by those ids. So far I have been unable to create the relationships in Neo4j without creating new nodes for each relationship. This code snippet might give some insight into what I am trying to do, although it is not correct:

# create relationships between entities and addresses
for i, row in dfRelationships.iterrows():
    entity_node = graph.nodes.match("Entity", name=row['node_id_start']).first()
    address_node = graph.nodes.match("Address", address=row['node_id_end']).first()
    if entity_node and address_node:
        rel_properties = {
            "start_date": row['start_date'],
            "end_date": row['end_date'],
            "sourceID": row['sourceID'],
            "status": row['status'],
            "rel_type": row["rel_type"],
            "link": row["link"]
        }
        rel = Relationship(entity_node, "Registered Address", address_node, **rel_properties)
        graph.create(rel)

The CSV file of the relationships contains the following columns: node_id_start, node_id_end, rel_type, link, status, start_date, end_date, sourceID.

See the following picture for the instances: https://i.stack.imgur.com/88cVJ.png

Sorry for the messy post, it's my first one :)

1 answer

cybersam

Some immediate issues are evident:

  1. A relationship type cannot contain space characters unless it is escaped with backticks. Use something like REGISTERED_ADDRESS instead of Registered Address. (By convention, relationship types are written in upper case with underscores.)
  2. Is the relationship type actually supposed to come from row["rel_type"]? If so, use that instead of Registered Address, and do not store it in the relationship as a property.
  3. The relationship map does not contain a sourceID property, so row['sourceID'] will cause an error. If the property is actually supposed to be node_id_start, then don't store it in the relationship (i.e., the relationship would have the corresponding start node as an endpoint, so storing its id in the relationship would be redundant).
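
As a rough illustration of points 1 and 2, here is a minimal sketch that builds one Cypher statement per CSV row. It matches the two existing nodes by id and merges a relationship between them, so no new nodes are created. It rests on several assumptions that are not confirmed by the question: that both node types store their id in a property called node_id, that the labels are Entity and Address as in the snippet above, and that the remaining CSV columns should become relationship properties. The helper names normalize_rel_type and build_relationship_query are made up for this example. Cypher does not allow the relationship type to be passed as a parameter, so it is cleaned up and interpolated into the query text instead.

```python
import re

# Columns from the relationships CSV that are kept as relationship properties.
# rel_type is left out on purpose: it becomes the relationship type itself.
PROP_COLUMNS = ("link", "status", "start_date", "end_date", "sourceID")

_VALID_TYPE = re.compile(r"[A-Z_][A-Z0-9_]*")


def normalize_rel_type(raw):
    """Turn e.g. 'Registered Address' into 'REGISTERED_ADDRESS'."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", str(raw).strip()).strip("_").upper()
    if not _VALID_TYPE.fullmatch(cleaned):
        raise ValueError(f"cannot derive a relationship type from {raw!r}")
    return cleaned


def build_relationship_query(row, start_label="Entity", end_label="Address"):
    """Return (query, params) for one row of the relationships CSV.

    Assumption: nodes carry their CSV id in a property named node_id.
    """
    rel_type = normalize_rel_type(row["rel_type"])
    query = (
        f"MATCH (a:{start_label} {{node_id: $start_id}}) "
        f"MATCH (b:{end_label} {{node_id: $end_id}}) "
        f"MERGE (a)-[r:{rel_type}]->(b) "
        "SET r += $props"
    )
    params = {
        "start_id": row["node_id_start"],
        "end_id": row["node_id_end"],
        "props": {col: row[col] for col in PROP_COLUMNS},
    }
    return query, params
```

With py2neo, each row from dfRelationships.iterrows() could then be sent with something like graph.run(query, params). If a MATCH finds no node for an id, the statement simply creates nothing, which replaces the explicit "if entity_node and address_node" check.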