Manipulate inputs/features supplied to TensorFlow Recommender Subclass Model


I'm building a TensorFlow recommender model. Right now, I set up the embedding layers like this:


import numpy as np
import tensorflow as tf

# integer vocabularies for the IntegerLookup layers
unique_user_fat = tf.range(1, 200)
unique_user_carbs = tf.range(1, 200)  # assumed: built the same way as unique_user_fat

######################

class UserModel(tf.keras.Model):
   
  def __init__(self):
    super().__init__()

    self.fat_embedding = tf.keras.Sequential([
        tf.keras.layers.IntegerLookup(
            vocabulary=unique_user_fat, mask_token=None),
        tf.keras.layers.Embedding(len(unique_user_fat) + 1, 32),
    ])

    self.carbs_embedding = tf.keras.Sequential([
        tf.keras.layers.IntegerLookup(
            vocabulary=unique_user_carbs, mask_token=None),
        tf.keras.layers.Embedding(len(unique_user_carbs) + 1, 32),
    ])
 
  def call(self, inputs):
    # Take the input dictionary, pass it through each input layer,
    # and concatenate the result.
    return tf.concat([
        self.fat_embedding(inputs["fat_value"]),
        self.carbs_embedding(inputs["carbohydrates_value"])
    ], axis=1)
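
For what it's worth, calling the model on a batch of raw features works and gives the concatenated embedding I expect. A quick smoke test (assuming unique_user_carbs mirrors unique_user_fat, as above):

model = UserModel()
batch = {
    "fat_value": tf.constant([10, 3]),
    "carbohydrates_value": tf.constant([2, 40]),
}
print(model(batch).shape)  # (2, 64): two 32-dim embeddings, concatenated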

BUT, that's not quite what I'm looking for. My question: when subclassing, is it possible to manipulate the inputs in def call(self, inputs), or the features in def compute_loss? What I'd like to do is take the fat + carbs as the initial input, then run a function on them based on their values.

So the input to the model is: [10 g fat] & [2 g carbs]

then run a function inside the model that produces a new input, e.g.: if fat >= 5 and carbs <= 12, add a new column: inputs['diet_type'] = 'keto'

then, pass that new input to my embedding layer:

self.diet_type_embedding = tf.keras.Sequential([
    tf.keras.layers.StringLookup(
        vocabulary=unique_diet_types, mask_token=None),
    tf.keras.layers.Embedding(len(unique_diet_types) + 1, 32),
])
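
Putting those pieces together, I imagine call ending up roughly like this (untested sketch; add_diet_type is a hypothetical helper built from tensor ops, sketched further down after the pandas snippet):

def call(self, inputs):
    # derive the diet type from the raw fat/carb tensors, then embed it
    diet_type = add_diet_type(inputs["fat_value"], inputs["carbohydrates_value"])
    return tf.concat([
        self.fat_embedding(inputs["fat_value"]),
        self.carbs_embedding(inputs["carbohydrates_value"]),
        self.diet_type_embedding(diet_type),
    ], axis=1)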

This is a partial snippet of how I set up my training datasets, just to give an idea of how I'm establishing the relationship between fat + carbs:

conditions = {
    'high_carb': (barcodes_df['carbohydrates_value'] >= 30),
    'keto': (barcodes_df['fat_value'] >= 5) & (barcodes_df['carbohydrates_value'] <= 12),
    'free_food': (barcodes_df['fat_value'] <= 1) & (barcodes_df['carbohydrates_value'] <= 1),
    'low_fat': (barcodes_df['fat_value'] <= 3)
}
barcodes_df['food_diet_type'] = np.select(conditions.values(), conditions.keys(), default='standard')
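
An element-wise TF version of that np.select logic, which works on batched tensors inside a graph, would presumably be nested tf.where calls. My untested sketch, with the same thresholds and the same first-match-wins ordering:

def add_diet_type(fat, carbs):
    # mirrors np.select: conditions are checked in order, first match wins
    return tf.where(
        carbs >= 30, 'high_carb',
        tf.where(
            (fat >= 5) & (carbs <= 12), 'keto',
            tf.where(
                (fat <= 1) & (carbs <= 1), 'free_food',
                tf.where(fat <= 3, 'low_fat', 'standard'))))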

I'd like to do something like this in my combined model. First, pass in the fat + carbs. Then:

@tf.function
def fn1(x):
    # copy the incoming feature dict and add the derived diet-type column
    y = {}
    y.update(x)

    z = add_diet_type(x['fat_value'], x['carbohydrates_value'])
    y['food_diet_type'] = z
    return y

def compute_loss(self, features, training=False):

    features = features.map(self.fn1)  # <-- this is the line that fails

    query_embeddings = self.query_model({
        "food_diet_type": features["food_diet_type"]
    })
    barcode_embeddings = self.candidate_model(features["code"])

    return self.task(
        query_embeddings, barcode_embeddings, compute_metrics=not training)

...where add_diet_type is a function that returns the diet type based on the fat + carb input. But this isn't possible, because in the subclassed model features is not a tf.data.Dataset (so there's no .map); each value in it is a symbolic tensor coming off the dataset iterator:

<tf.Tensor 'IteratorGetNext:1' shape=(None,) dtype=int64>,

I'm not sure how to work with features in this form (a dict of graph-mode tensors). Is there a way to manipulate the features inside compute_loss, so I can compute the diet type for each example?
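
One idea I'm toying with: since features arrives as a plain dict of batched tensors rather than a dataset, maybe I can apply the transformation directly instead of mapping it. An untested sketch, assuming fn1/add_diet_type are written with pure TF ops like the tf.where version above:

def compute_loss(self, features, training=False):
    # features is a dict of batched tensors, so transform it directly
    features = fn1(features)

    query_embeddings = self.query_model({
        "food_diet_type": features["food_diet_type"]
    })
    barcode_embeddings = self.candidate_model(features["code"])

    return self.task(
        query_embeddings, barcode_embeddings, compute_metrics=not training)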

I looked into TensorFlow feature columns, e.g. bucketized and crossed columns, but those seem to be deprecated?
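
(My understanding is that tf.feature_column has been superseded by the Keras preprocessing layers, so if bucketing the raw grams is the better route, the modern equivalent of a bucketized column would be something like Discretization. Untested sketch with made-up boundaries:)

fat_buckets = tf.keras.Sequential([
    tf.keras.layers.Discretization(bin_boundaries=[1.0, 3.0, 5.0]),  # 3 boundaries -> 4 buckets
    tf.keras.layers.Embedding(4, 32),
])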

Any help is appreciated - thanks.
