How to save a list of concatenated tensors using pgvector


I want to build a recommendation system, and for that I am embedding the following product data:

 """
        In this function, each element in the product is tokenized and transformed into an embed.
        :param product: ProductDataForEmbedsSchema
        :return: None
        """
        product_url_embedding = self.embedding(tokenize_string(string=product.product_url))
        categories_embedding = self.embedding(tokenize_string(string=product.categories))
        price_embedding = self.embedding((torch.FloatTensor([product.price])).to(torch.int64))
        
        price_with_discount_embedding = None
        
        attributes_values_embeds, attributes_keys_embeds = None, None
        
        if product.price_with_discount is not None:
            price_with_discount_embedding = self.embedding(
                (torch.FloatTensor([product.price_with_discount])).to(torch.int64))
        
        """
        Considering that the attributes are json and we cannot provide unique numbers, we train a model for text.
        Each key and value from the product attributes are transformed into embeds, which are unique
        """
        if product.attributes is not None and len(product.attributes) > 0:
            attributes_values_embeds, attributes_keys_embeds = generate_indices_for_attributes(
                attributes=product.attributes)
        
        embeds = [product_url_embedding, categories_embedding, price_embedding]
        
        if price_with_discount_embedding is not None:
            embeds.append(price_with_discount_embedding)

        if attributes_keys_embeds is not None and attributes_values_embeds is not None:
            for embed in attributes_keys_embeds:
                embeds.append(self.embedding(torch.from_numpy(embed).to(torch.int64)))

            for embed in attributes_values_embeds:
                embeds.append(self.embedding(torch.from_numpy(embed).to(torch.int64)))
                
        max_dim_0 = max(embed.size(0) for embed in embeds)
        
        embeds = [torch.cat([embed, torch.zeros(max_dim_0 - embed.size(0), embed.size(1))]) for embed in embeds]

        product_embedding = torch.stack(embeds, dim=0)
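Note the shape this produces: each padded embedding is 2-D with shape `(max_dim_0, embed_dim)`, so `torch.stack` yields a 3-D tensor of shape `(num_embeds, max_dim_0, embed_dim)`. A plain-Python sketch of the pad-and-stack step (lists standing in for tensors, so it runs without torch; the values are made up for illustration):

```python
# Each "embedding" is a list of rows, each row a list of floats.
embeds = [
    [[0.1, 0.2]],                      # shape 1 x 2
    [[0.3, 0.4], [0.5, 0.6]],          # shape 2 x 2
]

max_dim_0 = max(len(e) for e in embeds)
dim_1 = len(embeds[0][0])

# Zero-pad every embedding to max_dim_0 rows, mirroring
# torch.cat([embed, torch.zeros(max_dim_0 - embed.size(0), embed.size(1))]).
padded = [e + [[0.0] * dim_1] * (max_dim_0 - len(e)) for e in embeds]

# "Stacking" the padded embeddings gives a 3-D structure:
# (num_embeds, max_dim_0, dim_1). pgvector's vector column, by contrast,
# only accepts a flat 1-D sequence of floats.
print(len(padded), len(padded[0]), len(padded[0][0]))  # 2 2 2
```

This shape mismatch is what the traceback below is complaining about.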

I would like to save that final embedding in the database for each product and then compute a similarity between products. How can I do this? I always get the following error:

    return '[' + ','.join([str(float(v)) for v in value]) + ']'
                               ^^^^^^^^
sqlalchemy.exc.StatementError: (builtins.ValueError) only one element tensors can be converted to Python scalars
[SQL: UPDATE products SET embedding=%(embedding)s, "updatedAt"=now() WHERE products.id = %(id_1)s]
[parameters: [{}]]
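The line in the traceback shows how pgvector serializes a vector: it iterates over the value and calls `float()` on every element, so each element must be a single number. A multi-element tensor (one row of the stacked 3-D tensor) cannot be converted that way, which is what raises the `ValueError`. A minimal sketch of the failure and the usual fix, assuming a simplified stand-in `to_vector_literal` for pgvector's serializer (plain lists instead of tensors, made-up values):

```python
# Simplified version of the serializer from the traceback: it assumes
# every element of `value` is a single number.
def to_vector_literal(value):
    return '[' + ','.join(str(float(v)) for v in value) + ']'

# A flat 1-D list of floats works fine:
flat = [0.1, 0.2, 0.3]
print(to_vector_literal(flat))  # [0.1,0.2,0.3]

# A nested (2-D/3-D) structure does not: float() of a whole row fails,
# the plain-Python analogue of torch's "only one element tensors can be
# converted to Python scalars" ValueError.
stacked = [[0.1, 0.2], [0.3, 0.4]]
try:
    to_vector_literal(stacked)
except (TypeError, ValueError):
    pass  # this is the failure mode behind the StatementError

# The usual fix (an assumption, not stated in the question): flatten the
# tensor to 1-D before handing it to pgvector, e.g. in torch:
#     product_embedding.flatten().tolist()
flat_again = [x for row in stacked for x in row]
print(to_vector_literal(flat_again))  # [0.1,0.2,0.3,0.4]
```

The caveat with flattening is that the resulting vector length must match the dimension declared on the `vector(n)` column, so every product has to be padded to the same total size.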
