Filter 30 unique product IDs based on score and rank using Databricks PySpark


I am trying to filter 30 distinct product IDs based on the score and rank within each combination of country and gender (US:W, US:M, CA:W, CA:M). My code produces the right output, except for one issue: when a product ID repeats anywhere in the dataset for a combination, all of its rows are displayed.

For example:

product_id_1: score 10, rank 1

product_id_1: score 8, rank 3

product_id_1: score 0.1, rank 40

In this example, I should display only the product_id_1 rows with rank 1 and rank 3, since I am fetching only 30 distinct product IDs, but I am getting the product_id_1 row with rank 40 as well.
