dsbulk unload fails with an OOM issue after running for a couple of hours


dsbulk unload is failing because the Cassandra pod restarts; the pod shows it was killed due to an OOM issue.

2024-03-26 21:54:14 INFO  Operation directory: /cassandra_data/dsbulk/dsbulk-1.11.0/bin/logs/LOAD_20240326-215414-519088
2024-03-26 21:54:16 ERROR Operation LOAD_20240326-215414-519088 failed: java.io.IOException: Error creating CSV parser for file:/cassandra_data/dsbulk/dsbulk.csv.
Caused by: Error creating CSV parser for file:/cassandra_data/dsbulk/dsbulk.csv.
Caused by: File not found: /cassandra_data/dsbulk/dsbulk.csv (No such file or directory).
reactor.core.Exceptions$ReactiveException: java.io.IOException: Error creating CSV parser for file:/cassandra_data/dsbulk/dsbulk.csv
  at com.datastax.oss.dsbulk.workflow.load.LoadWorkflow.execute(LoadWorkflow.java:242)
  [3 skipped] com.datastax.oss.dsbulk.io.CompressedIOUtils.newBufferedReader(CompressedIOUtils.java:96)
2024-03-26 21:54:18 INFO  Final stats:

We're using the command below to export the data:

export DSBULK_JAVA_OPTS="-Xmx10G"
dsbulk unload -url /cassandra_data/dsbulk/export2.csv -delim "|" \
  --executor.continuousPaging.enabled false -cl LOCAL_QUORUM \
  --driver.basic.request.timeout="30 minutes" \
  --datastax-java-driver.basic.request.timeout="30 minutes" \
  -maxErrors 1000000 --schema.splits=12C -maxConcurrentQueries 1 \
  -maxConcurrentFiles 10 -header true -k -t
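Since the OOM kill happens on the Cassandra pod rather than in the dsbulk JVM, raising -Xmx on the client presumably does not relieve the server. As a rough sketch of what we are considering instead (untested; <keyspace> and <table> are placeholders for our real names, and the throttle value is a guess for our workload, not a recommendation):

export DSBULK_JAVA_OPTS="-Xmx10G"
# Throttle reads and compress the output to reduce pressure on the
# cluster. maxPerSecond and connector.csv.compression are documented
# dsbulk settings; the numbers here are assumptions on our part.
dsbulk unload -k <keyspace> -t <table> \
  -url /cassandra_data/dsbulk/export2 \
  -delim "|" -header true -cl LOCAL_QUORUM \
  --executor.continuousPaging.enabled false \
  --connector.csv.compression gzip \
  --executor.maxPerSecond 5000 \
  -maxConcurrentQueries 1 -maxConcurrentFiles 10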

The table holds about 1.6 TB. Are there any recommended options or approaches for exporting such a huge Cassandra table?
