Error while defining a string variable in tf.data.experimental.CsvDataset


I build my dataset (ds) using this function:

import tensorflow as tf

def build_dataset(path_list, index_variables_retenus, header=True, field_delim=','):
    # Prepend column 0 (the ID column) so that samples mixing 2 catchments (bvs) can be filtered out later
    variables_a_extraire = [0] + index_variables_retenus
    # The ID column is read as a string, all other selected columns as float32
    record_defaults = [tf.string] + [tf.float32] * (len(variables_a_extraire) - 1)

    def csv_loader(path):
        return tf.data.experimental.CsvDataset(
            path,
            record_defaults=record_defaults,
            header=header,
            field_delim=field_delim,
            select_cols=variables_a_extraire
            )

    # Create a dataset of file paths
    tf_list = tf.data.Dataset.list_files(path_list, shuffle=True)

    return tf_list.interleave(csv_loader, cycle_length=1)

"record_defaults" is defined with tf.string for the first column because I add an ID to my data for later processing. As the next step, I need to run this line:

ds = ds.map(lambda *items: tf.stack(items))

and I get this error:

    TypeError: Tensors in list passed to 'values' of 'Pack' Op have types [string, float32, float32, float32, float32] that don't all match.
    ValueError: values_1: Tensor conversion requested dtype string for Tensor with dtype float32: <tf.Tensor 'args_1:0' shape=() dtype=float32>

My understanding of the error is that the column defined as tf.string is the problem: if I remove it, everything works fine.
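
That reading matches the error message: tf.stack packs its inputs into a single tensor, so every input must have the same dtype, and a string ID cannot be stacked with float32 values. One possible workaround, shown here only as a minimal sketch, is to stack just the float columns and carry the ID alongside them as a separate tensor. The in-memory dataset, the IDs "station_A" and "station_B", and the helper name split_id_and_features are hypothetical stand-ins for the real CSV pipeline, not part of the original code.

```python
import tensorflow as tf

# Hypothetical stand-in for the CSV rows: one string ID column
# followed by two float32 feature columns.
ids = tf.constant(["station_A", "station_B"])
col_a = tf.constant([1.0, 2.0], dtype=tf.float32)
col_b = tf.constant([3.0, 4.0], dtype=tf.float32)
ds = tf.data.Dataset.from_tensor_slices((ids, col_a, col_b))

def split_id_and_features(sample_id, *features):
    # Stack only the float32 columns; the string ID stays a separate tensor.
    return sample_id, tf.stack(features)

ds = ds.map(split_id_and_features)

# Each element is now a pair (ID string, float32 vector of features).
rows = [(i.numpy().decode("utf-8"), f.numpy().tolist()) for i, f in ds]
```

With this layout the ID is still available for filtering (for example in a later ds.filter call), and it can be dropped with another map once it is no longer needed.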

I have searched the web but cannot find why it doesn't work. The values of this column in the original CSV look like camelsaus_102101A...

Thank you all for your help.
