Best practices for tracking impressions for real-time feedback in Amazon Personalize


I trained a recommender using the user personalization recipe in Amazon Personalize. A list of recommended products is retrieved using get_recommendations() in a Flask app that serves the front end. In addition, user interactions, encoded as Segment events, are passed back to the Personalize event tracker using an AWS Lambda following the guidance here. I want to use the put_events() API to write impression data back to the event tracker in order to avoid users seeing the same items over and over again even when they don't interact with them. My problem is that I can't figure out how to combine the output of get_recommendations() (called in the Flask app) with the information about user interactions with items (handled in the AWS Lambda, since these interactions are encoded as Segment events).

I can call put_events() from the Flask app right after calling get_recommendations(), and use the output of the latter to specify which products were seen, but it seems that the intended use of put_events() is to provide one itemId that the user interacted with, and then a list of itemIds that the user saw but did NOT interact with. However, the information about which item the user interacted with is captured in the Lambda, not the Flask app. Similarly, the AWS Lambda knows what action the user took, but does not know the full list of recommended items that were shown, and does not have access to the recommendationId.

I feel like this scenario isn't too exotic and I suspect I'm overlooking something fairly obvious. I'm wondering if other people have run into a similar situation and how they manage the flow of information through such a system.


There is 1 answer

Answered by James J

Impression data for Personalize is used to inform exploration.

Amazon Personalize uses impressions data to determine what items to include in exploration. With exploration, recommendations include some items that would be typically less likely to be recommended for the user, such as new items, items with few interactions, or items less relevant for the user based on their previous behavior. The more frequently an item occurs in impressions data, the less likely it is that Amazon Personalize includes the item in exploration.
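For reference, here is a minimal sketch of what an event carrying impressions data could look like when sent with the boto3 `personalize-events` client's `put_events` call. The helper names (`build_event`, `send_event`), the default event type, and all IDs are hypothetical, and the client is passed in as a parameter (it is assumed to come from `boto3.client("personalize-events")`), so treat this as an illustration of the payload shape rather than a drop-in implementation.

```python
from datetime import datetime, timezone


def build_event(item_id, shown_item_ids, event_type="click",
                recommendation_id=None, sent_at=None):
    """Build one entry for the eventList argument of put_events.

    item_id is the item the user interacted with; shown_item_ids is the
    list of items that were displayed to the user (the impressions).
    """
    event = {
        "eventType": event_type,  # hypothetical event type name
        "itemId": item_id,
        "sentAt": sent_at or datetime.now(timezone.utc),
    }
    shown = list(shown_item_ids)
    if shown:
        # Only attach impressions when there is at least one item to report.
        event["impression"] = shown
    if recommendation_id is not None:
        # recommendationId is returned in the get_recommendations response.
        event["recommendationId"] = recommendation_id
    return event


def send_event(events_client, tracking_id, user_id, session_id, event):
    """Send a single event through the event tracker.

    events_client is assumed to be boto3.client("personalize-events").
    """
    return events_client.put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[event],
    )
```

One way to bridge the Flask app and the Lambda with a payload like this is for the front end to include the shown item IDs and the recommendationId as properties on the Segment event, so the Lambda has everything it needs; that is a design suggestion, not something Personalize requires.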

Impression data is not used to exclude items that a user has already seen from subsequent calls to GetRecommendations. Exploration will vary the cold items that are recommended to a user from call to call, but for users with long interaction histories, more relevant warm items will likely be more prevalent in recommendations.

If you want to forcibly exclude items that a user has recently seen, you could use a Personalize filter to exclude those items by their item ID. To do this, you'd have to add a column to the items dataset (e.g., Items.ITEM_ID_FOR_FILTERING) and then use a dynamic filter to pass in the item IDs to exclude.

EXCLUDE ItemID WHERE Items.ITEM_ID_FOR_FILTERING IN ($ITEM_IDS)

There is a maximum length of 1000 characters for the filter value ITEM_IDS, so you'll only be able to send as many item IDs as will fit in 1000 characters (including commas).
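A rough sketch of how the dynamic filter could be applied from the Flask app is shown below. It assumes the filter above has already been created and that its ARN is available; the helper names, the `campaign_arn` parameter (a domain recommender would use `recommenderArn` instead), and the choice to count the surrounding quotes toward the 1000-character limit are assumptions made for illustration. The runtime client is passed in and is assumed to come from `boto3.client("personalize-runtime")`.

```python
MAX_FILTER_VALUE_CHARS = 1000  # limit on the filter value, per the answer above


def build_item_ids_filter_value(item_ids, max_chars=MAX_FILTER_VALUE_CHARS):
    """Join item IDs into a quoted, comma-separated filter value.

    IDs are added in order until the next one would push the string past
    max_chars, so callers should put the most recently seen items first.
    """
    parts = []
    length = 0
    for item_id in item_ids:
        quoted = '"' + str(item_id) + '"'
        extra = len(quoted) + (1 if parts else 0)  # +1 for the comma
        if length + extra > max_chars:
            break
        parts.append(quoted)
        length += extra
    return ",".join(parts)


def get_filtered_recommendations(runtime_client, campaign_arn, filter_arn,
                                 user_id, seen_item_ids, num_results=10):
    """Call get_recommendations, excluding recently seen items if any.

    runtime_client is assumed to be boto3.client("personalize-runtime").
    """
    kwargs = {
        "campaignArn": campaign_arn,
        "userId": user_id,
        "numResults": num_results,
    }
    value = build_item_ids_filter_value(seen_item_ids)
    if value:
        # Only attach the filter when there is something to exclude.
        kwargs["filterArn"] = filter_arn
        kwargs["filterValues"] = {"ITEM_IDS": value}
    return runtime_client.get_recommendations(**kwargs)
```

Truncating at the limit means older seen items silently drop out of the exclusion list, which may be acceptable if the goal is only to avoid repeating what the user saw most recently.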

You should also exclude the Items.ITEM_ID_FOR_FILTERING column from training since it's only needed for filtering and doesn't have any value for training.

Excluding all items that a user has seen is not that common since you will be excluding the most relevant items for the user. Eventually, the recommendations will be dominated by the least relevant items to the user.