I'm adapting the following script from the Keras documentation (https://keras.io/examples/audio/ctc_asr/). (Mainly, I changed the first Conv2D to a Conv1D to fit my data.)
My dataset consists of a list of numeric arrays (of variable length) and strings of characters (of variable length) that should be predicted from the arrays. This is similar to a speech-recognition (audio-to-sentence) task.
Since every row of my dataset has variable length, I thought the best way to implement this was to use ragged tensors (as an alternative to padding either column with "0" or whitespace):
# My data
ints_feature = tf.ragged.constant(new_df.Numbers.tolist(), dtype=tf.float32) # Lists of numbers
strings_feature = tf.ragged.constant(new_df.Strings.tolist()) # Strings
# Create a dataset from the two features
dataset = tf.data.Dataset.from_tensor_slices((ints_feature, strings_feature))
An example could be:
import tensorflow as tf
# Generate sample data
data_numbers = [[1, 2, 3], [4, 5, 6, 7], [8, 9], [5,6,2,7], [1,9,3,4,5]]
data_strings = ['apple', 'orange','banana', 'grape', 'kiwi']
# Convert the data to ragged tensors
ragged_numbers = tf.ragged.constant(data_numbers, dtype=tf.float32)
ragged_strings = tf.ragged.constant(data_strings)
# Create a dataset from the ragged tensors and labels
dataset = tf.data.Dataset.from_tensor_slices((ragged_numbers, ragged_strings))
# Print the dataset
for numbers, strings in dataset:
    print("Numbers:", numbers.numpy(), "Strings:", strings.numpy())
I structured the model similarly to the Keras example, but I'm having trouble with the CTCLoss function:
def CTCLoss(y_true, y_pred):
    print(tf.cast(tf.shape(y_pred)[0], dtype="int64"))
    print(tf.cast(tf.shape(y_true)[0], dtype="int64"))
    print(tf.shape(y_pred))
    print(tf.shape(y_true))
    print(y_pred)
    print(y_true)
    # Compute the training-time loss value
    batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
    input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
    label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")  # RESOLVER!!!
    input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
    label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")
    loss = keras.backend.ctc_batch_cost(y_true, y_pred, input_length, label_length)
    return loss
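From the docs I understand that a ragged tensor already carries its per-row lengths via row_lengths(), so I suspect the fix involves something like this sketch (toy labels, reshaped to the (batch, 1) shape that ctc_batch_cost expects), though I'm not sure how to wire it into the loss:

```python
import tensorflow as tf

# Sketch: per-row lengths of a ragged label tensor, instead of
# tf.shape(y_true)[1], which assumes a uniform second dimension.
labels = tf.ragged.constant([[1, 2, 3], [4, 5]], dtype=tf.int64)
label_length = tf.reshape(labels.row_lengths(), (-1, 1))  # shape (batch, 1)
print(label_length.numpy())  # [[3], [2]]
```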
I added the first 6 prints to get some insight into what's happening, and I get the following:
Tensor("CTCLoss/Cast:0", shape=(), dtype=int64)
Tensor("CTCLoss/Cast_1:0", shape=(), dtype=int64)
Tensor("CTCLoss/Shape_2:0", shape=(3,), dtype=int32)
<DynamicRaggedShape lengths=[None, None] num_row_partitions=1>
Tensor("DeepSpeech_2/dense/Softmax:0", shape=(None, 1, 6), dtype=float32)
tf.RaggedTensor(values=Tensor("RaggedFromVariant/RaggedTensorFromVariant:1", shape=(None,), dtype=int64), row_splits=Tensor("RaggedFromVariant/RaggedTensorFromVariant:0", shape=(None,), dtype=int64))
When training the model I get this error:
ValueError: in user code:
File "/.../python3.9/site-packages/keras/src/engine/training.py", line 1377, in train_function *
return step_function(self, iterator)
File "/.../1562801909.py", line 14, in CTCLoss *
label_length = tf.cast(tf.shape(y_true)[1], dtype="int64") #RESOLVER!!!
ValueError: Index 1 is not uniform
I'm guessing the error is associated with the different lengths of y_pred and y_true, but I can't figure out how to arrange the shapes and dimensions (mainly because I'm using ragged tensors). I would greatly appreciate some help, or perhaps a suggestion for another approach to this problem.
I also tried trimming the number arrays and the strings to the shortest one in the dataset so every row has the same length; this works perfectly with regular tensors.
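For reference, the zero-padding alternative I mentioned earlier would look roughly like this (just a sketch with toy data, assuming 0 is reserved for padding so the true lengths can still be recovered afterwards):

```python
import tensorflow as tf

ragged_numbers = tf.ragged.constant(
    [[1, 2, 3], [4, 5, 6, 7], [8, 9]], dtype=tf.float32
)
# Pad every row with zeros up to the longest one -> a regular dense tensor
padded = ragged_numbers.to_tensor(default_value=0.0)  # shape (3, 4)
# The original lengths can still be recovered, assuming 0 only marks padding
lengths = tf.math.count_nonzero(padded, axis=1, keepdims=True)
print(padded.shape)     # (3, 4)
print(lengths.numpy())  # [[3], [4], [2]]
```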