Extracting Item Latent Vectors from Trained AWS Factorization Machines Model


I have successfully trained and tested an AWS Factorization Machines model on a training dataset of interactions merged with item attributes and user attributes for a recommendation engine problem.

Now I would also like to use the item latent vectors generated by the model to build item embeddings and find similarities between items. I have managed to extract the latent vectors from the model, as shown in the code below, but I want to map them to the item ids in my dataset so that I can concatenate them with other item attributes and interpret item similarities (i.e., map a given latent vector to a specific item in my portfolio).

Is it possible to understand the order in which these vectors are created, so that I can map them to the item ids in my training data? How can I do this mapping to ensure that each vector is linked to its corresponding item?

Also, I noticed that the number of item latent vectors created exceeds the number of unique item ids in the training data. Any clue why this would happen, and what do these extra vectors represent?

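import io
import os

import boto3
import mxnet as mx
import numpy as np
import pandas as pd

# fm (the trained estimator) and bucket are assumed to be defined earlier in the notebook
s3 = boto3.client("s3")
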
model_file_name = "model.tar.gz"
model_full_path = fm.output_path + "/" + fm.latest_training_job.job_name + "/output/" + model_file_name
print("Model Path: ", model_full_path)

#Download FM model 
os.system("aws s3 cp "+model_full_path+ " .")
#Extract model file for loading to MXNet
os.system("tar xzvf "+model_file_name)
os.system("unzip -o model_algo-1")
os.system("mv symbol.json model-symbol.json")
os.system("mv params model-0000.params")

obj = s3.get_object(Bucket=bucket, Key="Factorization_Machines/data/df_train.csv")
df = pd.read_csv(io.BytesIO(obj['Body'].read()))
nb_users = df.user_id.nunique()
nb_items = df.item_id.nunique() 

#Extract model data
m = mx.module.Module.load('./model', 0, False, label_names=['out_label'])
V = m._arg_params['v'].asnumpy()
w = m._arg_params['w1_weight'].asnumpy()
b = m._arg_params['w0_weight'].asnumpy()

# item latent matrix - concat(V[i], w[i]).  
knn_item_matrix = np.concatenate((V[nb_users:], w[nb_users:]), axis=1)
knn_train_label = np.arange(1,nb_items+1)

#user latent matrix - concat (V[u], 1) 
ones = np.ones(nb_users).reshape((nb_users, 1))
knn_user_matrix = np.concatenate((V[:nb_users], ones), axis=1)
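
One possible reading of the slicing above, offered as a sketch rather than a confirmed answer: in a factorization machine, V has one row per input feature column, so the row order follows the column order of the training matrix. If that matrix was laid out as [user one-hot | item one-hot | attribute columns], then V[nb_users:] would also pick up the attribute rows, which could account for the extra vectors. The example below uses a random stand-in for V, and every name in it (user_ids, item_ids, n_attr, item_to_col) is hypothetical. The layout itself is an assumption and has to match how the features were actually encoded.

```python
import numpy as np

# Hypothetical toy setup: 3 users, 4 items, 2 extra attribute columns, 5 factors.
user_ids = ["u1", "u2", "u3"]
item_ids = ["i10", "i20", "i30", "i40"]  # order used when one-hot encoding items
n_attr = 2
num_factors = 5

# ASSUMED feature layout (must match how the training matrix was built):
# [user one-hot | item one-hot | attribute columns]
nb_users, nb_items = len(user_ids), len(item_ids)
feature_dim = nb_users + nb_items + n_attr

# Random stand-in for m._arg_params['v'].asnumpy(): one row per feature column.
rng = np.random.default_rng(0)
V = rng.normal(size=(feature_dim, num_factors))

# Column index of each item under the assumed layout.
item_to_col = {item: nb_users + k for k, item in enumerate(item_ids)}

# Slice exactly nb_items rows so the attribute rows are excluded.
item_vectors = V[nb_users:nb_users + nb_items]

# Look up each item's latent vector by its column index.
item_embedding = {item: V[col] for item, col in item_to_col.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Similarity between two items, looked up by item id.
sim = cosine_similarity(item_embedding["i10"], item_embedding["i20"])
```

In this toy layout, V[nb_users:] has nb_items + n_attr rows, while the bounded slice has exactly nb_items. Comparing V.shape[0] with the number of columns in the training matrix is a quick way to check whether the same thing is happening with the real model.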
