I'm using straightforward Cypher to load data from a CSV and just create nodes.
The code is as follows:
:auto LOAD CSV WITH HEADERS FROM 'file:///registrants.csv' AS row
CALL {
WITH row
MERGE (r:Registrant {row_wid: toInteger(row.ROW_WID)})
ON CREATE SET
r.w_insert_dt = row.W_INSERT_DT,
r.w_update_dt = row.W_UPDATE_DT,
r.email_address = row.EMAIL_ADDRESS,
r.attendee_contact_wid = toInteger(row.ATTENDEE_CONTACT_WID),
r.attendee_account_wid = toInteger(row.ATTENDEE_ACCOUNT_WID),
r.reg_contact_wid = toInteger(row.REG_CONTACT_WID),
r.reg_account_wid = toInteger(row.REG_ACCOUNT_WID),
r.event_wid = toInteger(row.EVENT_WID),
r.tkt1_wid = toInteger(row.TKT1_WID),
r.tkt2_wid = toInteger(row.TKT2_WID),
r.tkt3_wid = toInteger(row.TKT3_WID),
r.tkt4_wid = toInteger(row.TKT4_WID),
r.tkt5_wid = toInteger(row.TKT5_WID),
r.tkt6_wid = toInteger(row.TKT6_WID),
r.current_flg = row.CURRENT_FLG,
r.delete_flg = row.DELETE_FLG,
r.created_on_dt = row.CREATED_ON_DT,
r.updated_on_dt = row.UPDATED_ON_DT,
r.reg_dt = row.REG_DT,
r.attend_dt = row.ATTEND_DT,
r.cancel_dt = row.CANCEL_DT,
r.alumni = row.ALUMNI,
r.reg_channel = row.REG_CHANNEL
} IN TRANSACTIONS OF 1000 ROWS
I tested this with 100 rows and it worked seamlessly. Now I'm trying to load 700K rows, and the query has been running for over 12 hours.
I also created an index in the DB to support creating this node.
I'm a newbie, so please excuse me if I'm doing something wrong.
From my research this looks right.
I'm not getting any errors.
Insights appreciated.
Thank you.
1. Make sure you have a uniqueness constraint on Registrant.row_wid (see the sketch after this list).
2. Examine the query plan to make sure the index is being used by prepending EXPLAIN to the query, and make sure there are no "eager" operations that would prevent batching (given the query, I wouldn't expect there to be). See the EXPLAIN example below.
3. Increase the number of rows per transaction. It depends on how much memory is allocated for Neo4j transactions, but typically around 100k is what I use; an example follows.
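A minimal sketch of the constraint, assuming Neo4j 4.4+ syntax (older 4.x versions use CREATE CONSTRAINT ... ON (r:Registrant) ASSERT r.row_wid IS UNIQUE instead); the constraint name registrant_row_wid is just an illustrative choice. Note that a uniqueness constraint also creates the backing index, so a separate index on the same property isn't needed:

// Uniqueness constraint on :Registrant(row_wid); also creates the backing index
CREATE CONSTRAINT registrant_row_wid IF NOT EXISTS
FOR (r:Registrant)
REQUIRE r.row_wid IS UNIQUE;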
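For the plan check, a simplified sketch (EXPLAIN only compiles the plan; it does not run the load). With the constraint in place, the plan should show an index seek on :Registrant(row_wid) rather than a NodeByLabelScan, and no Eager operator should appear:

// Compile the plan only; look for an index seek and the absence of Eager
EXPLAIN
LOAD CSV WITH HEADERS FROM 'file:///registrants.csv' AS row
MERGE (r:Registrant {row_wid: toInteger(row.ROW_WID)});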
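To raise the batch size, only the last line of the original query changes; 100k rows per transaction is a reasonable starting point, assuming default or larger transaction memory settings:

:auto LOAD CSV WITH HEADERS FROM 'file:///registrants.csv' AS row
CALL {
  WITH row
  MERGE (r:Registrant {row_wid: toInteger(row.ROW_WID)})
  // ... same ON CREATE SET clause as in the original query ...
} IN TRANSACTIONS OF 100000 ROWS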