My dask script runs well until the last step, which concatenates thousands of dataframes and writes the result to CSV. Memory use immediately jumps from 6 GB to over 15 GB and I get an error like "95% memory exceeded, restarting workers", even though my machine has plenty of memory. I have two questions: (1) How can I increase the memory available to the workers, or to this last step specifically? (2) Would intermediate concat steps help, and if so, how should I add them? The problematic code is below:
# concat every partial dataframe on a single worker, pull the combined
# result back to the client, then write the CSV locally
future = client.submit(pd.concat, tasks)
future.result().to_csv(path)
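For question (2), this is roughly the batched concat I was considering. `tasks` and `path` are the names from my existing script; `BATCH_SIZE` and the `memory_limit` value are placeholders I made up, and I'm not sure this is the idiomatic way to do it in dask:

import pandas as pd
from dask.distributed import Client

# For question (1): I believe memory_limit sets the per-worker limit when the
# client starts a local cluster, but I'm not certain this is the right knob.
client = Client(n_workers=4, memory_limit="12GB")

BATCH_SIZE = 100  # made-up number; I would tune this

# tasks is my existing list of futures, each resolving to a pandas DataFrame.
# Concat each batch on a worker first, then concat the batch results and write.
partial_futures = [
    client.submit(pd.concat, tasks[i:i + BATCH_SIZE])
    for i in range(0, len(tasks), BATCH_SIZE)
]
final_future = client.submit(pd.concat, partial_futures)
final_future.result().to_csv(path)

Is this roughly the right shape, or does the final concat still end up holding everything on one worker anyway?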