TFX. Properties for CsvCoder in CsvExampleGen: 'Columns do not match specified csv headers'


I am working with TensorFlow Extended (TFX) and am stuck on loading a .csv file. The file uses ; as its separator and can't be read by the default TFX generator CsvExampleGen(). It throws the following error: ValueError: Columns do not match specified csv headers

I found that this problem is related to internal dependencies such as tft.coders.CsvCoder(), which needs non-default parameters to parse this .csv file.

My question is:

  • How can I pass parameters to tft.coders.CsvCoder() from tfx.components.CsvExampleGen?
from tfx.components import CsvExampleGen
from tfx.utils.dsl_utils import external_input

data_path = './data'
intro_component = CsvExampleGen(input=external_input(data_path))
...

There is 1 answer

AudioBubble

From the comments

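The current workaround is to rewrite the data file with pandas so that it uses the default comma separator:

import pandas as pd

df = pd.read_csv(_file_path, sep=';')
df.to_csv(_file_path, index=False)  # index=False avoids writing an extra index column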

(paraphrased from Oleks).
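
If you would rather not overwrite the original file or depend on pandas, the same idea can be sketched with only the standard library. This is a minimal sketch, not part of TFX: the function name convert_delimiter and the directory names are hypothetical, and it assumes the files are UTF-8 and that every .csv in the source directory uses ; as its separator.

```python
import csv
from pathlib import Path


def convert_delimiter(src_dir, dst_dir, src_delimiter=";"):
    """Copy every .csv in src_dir into dst_dir, rewritten as comma-separated.

    Hypothetical helper for illustration; src_dir and dst_dir must differ,
    otherwise a file would be truncated while it is still being read.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    if src_dir.resolve() == dst_dir.resolve():
        raise ValueError("src_dir and dst_dir must be different directories")
    dst_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sorted(src_dir.glob("*.csv")):
        dst = dst_dir / src.name
        # newline="" lets the csv module handle line endings itself.
        with src.open(newline="", encoding="utf-8") as fin, \
                dst.open("w", newline="", encoding="utf-8") as fout:
            reader = csv.reader(fin, delimiter=src_delimiter)
            writer = csv.writer(fout)  # csv.writer defaults to "," as delimiter
            writer.writerows(reader)
        written.append(dst)
    return written
```

With the files converted into, say, ./data_comma, the CsvExampleGen call from the question would then point at that directory instead of ./data.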