I have data in format:
user_id | user_name | movie_rating | movie_name |
---|---|---|---|
1 | x | 1 | asd1 |
1 | x | 3 | asd2 |
1 | x | 5 | asd3 |
1 | x | 5 | asd4 |
2 | y | 2 | asd4 |
2 | y | 3 | asd5 |
2 | y | 4 | asd6 |
I want to extract the best and the worst rated movie per user in a format where best_rated_movie and worst_rated_movie is array (because user can have multiple worst movies etc):
user_id | user_name | best_rated_movie | worst_rated_movie |
---|---|---|---|
1 | x | (asd3,asd4) | asd1 |
2 | y | asd6 | asd4 |
I managed to extract table to output where I have only the best and the worst rated movies:
user | user_name | movie_rating | movie_name |
---|---|---|---|
1 | x | 1 | asd1 |
1 | x | 5 | asd3 |
1 | x | 5 | asd5 |
2 | y | 2 | asd4 |
2 | y | 4 | asd6 |
Here is a code which I used for that:
with best_movie_rating as (
select
user_id,
max(movie_rating) as max_rating
from source_table
group by user_id
),
worst_movie_rating as (
select
user_id,
min(movie_rating) as min_rating
from source_table
group by user_id
),
columns_final as (
select
t1.user_id,
t1.user_name,
t1.company_name,
t1.movie_rating,
t1.movie_name
from source_table t1
inner join best_movie_rating t2
on t1.user_id = t2.user_id
inner join worst_movie_rating t3
on t1.user_id = t3.user_id
where (t1.movie_rating= t2.max_rating and t1.user_id = t2.user_id)
or (t1.movie_rating= t3.min_rating and t1.user_id = t3.user_id)
)
select * from columns_final
Unfortunately I don't know how to move forward, I've tried aggregating with no success (especially with arrays in play). I'd appreciate at least a suggestion how to deal with this.
Consider below
If applied to sample data in your question - output is
If (as other answer imply) you want result as a string of comma separated movies - you can use below
with output