I have a CSV file that I ingested into NiFi using the GetFile processor, then passed through a ConvertRecord processor without any issue. When I dragged an ExecuteStreamCommand processor onto the canvas to run a Python script, I ran into issues writing the flowfile.
This is the configuration I used:
The Python script:

```python
import pandas as pd
import sys
from io import StringIO

try:
    # Read the input data from stdin:
    path_file = sys.stdin.read()
    input_df = pd.read_csv(StringIO(path_file))
    # Drop rows with missing values
    input_df.dropna(inplace=True)
    # Write the processed data to stdout:
    sys.stdout.write(input_df.to_csv(path_file, index=False))
except Exception as e:
    sys.stderr.write("An error occurred: {}".format(str(e)))
```
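Note the trap in the last line of the `try` block: `DataFrame.to_csv` only returns the CSV as a string when it is called with no target; when its first argument is a path or buffer, it writes there and returns `None`, so `sys.stdout.write(...)` ends up receiving `None`. A minimal demonstration:

```python
import pandas as pd
from io import StringIO

df = pd.read_csv(StringIO("a,b\n1,2\n"))

# With no target, to_csv() returns the CSV text, which is
# safe to pass to sys.stdout.write
as_text = df.to_csv(index=False)

# With a target (a path or buffer) as the first positional argument,
# to_csv() writes there and returns None, so sys.stdout.write(...)
# would be handed None
buf = StringIO()
returned = df.to_csv(buf, index=False)
```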
As you can see in the image, the processor reads the flowfile but is unable to write to the next processor, and I keep getting the error:
Unable to write flowfile content to content repository container default due to archive file size constraints; waiting for archive cleanup. Total number of files currently archived = 13
For your information, I have already tried disabling flow archiving by setting nifi.flow.configuration.archive.enabled=false and resized the max storage size to 10 GB, but to no avail; I keep getting the same error.
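One thing worth noting: `nifi.flow.configuration.archive.enabled` only controls archiving of the flow definition (flow.xml.gz), while the error message refers to the content repository archive, which is governed by a different set of `nifi.properties` entries. A sketch of the relevant properties (the values shown are illustrative, not recommendations):

```properties
# nifi.properties -- content repository archiving (example values)
nifi.content.repository.archive.enabled=true
# How long archived content is retained before cleanup
nifi.content.repository.archive.max.retention.period=12 hours
# NiFi refuses to write new content and waits for archive cleanup
# when disk usage of the content repository exceeds this threshold
nifi.content.repository.archive.max.usage.percentage=50%
```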
What should I do to fix this issue?
As a heads-up, I found the cause of my problem. In the Python script, the path_file argument should be removed from to_csv, and this command used instead:

sys.stdout.write(input_df.to_csv(index=False))
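With that change applied, the full script would look like the sketch below (I have factored the transformation into a `process` helper for clarity; the helper name is my own, not part of the original script):

```python
import sys
from io import StringIO

import pandas as pd


def process(csv_text):
    """Drop rows with missing values and return the result as CSV text."""
    df = pd.read_csv(StringIO(csv_text))
    df.dropna(inplace=True)
    # With no path argument, to_csv() returns the CSV as a string,
    # which is exactly what sys.stdout.write expects.
    return df.to_csv(index=False)


if __name__ == "__main__":
    try:
        sys.stdout.write(process(sys.stdin.read()))
    except Exception as e:
        sys.stderr.write("An error occurred: {}".format(e))
```

ExecuteStreamCommand pipes the incoming flowfile content to the script's stdin and captures stdout as the outgoing flowfile content, so writing the CSV string to stdout is all that is needed.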