Create an edge list that groups films by genre, i.e. join two films of the same genre


I've only just started using Python, and I want to build an edge list that links the titles of movies that have a genre in common. I have this dictionary:

dictionary_title_withonegenere=
{28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
16: ['Puss in Boots: The Last Wish', 'Strange World']}

Here 28, 12 and 16 are the genre IDs of the movies. I want to create an edge list that groups movies by genre, i.e. one that joins every two movies of the same genre:

source                         target
Avatar: The Way of Water       Violent Night
Avatar: The Way of Water       Puss in Boots: The Last Wish
Violent Night                  Puss in Boots: The Last Wish
Avatar: The Way of Water       The Chronicles of Narnia: The Lion, the Witch and the Wardrobe
Puss in Boots: The Last Wish   Strange World

This is my idea:

edges=[]
genres=[28,12,16]

    for i in range(0,len(genres)):
            for genres[i] in dictionary_title_withonegenere[genres[i]]:
                for genres[i] in dictionary_title_withonegenere[genres[i]][1:]:
                    edges.append({"sorce":dictionary_title_withonegenere[genres[i]][0],"target":dictionary_title_withonegenere[genres[i]][y]})

    print((edges))

My code doesn't work. How can I do this?


There are 2 answers

Albin Paul On BEST ANSWER

You can check whether two movies share a genre by building an intermediate data structure: a mapping from each movie to the set of its genres. With that mapping, you can iterate over all pairs of movies, check whether they have any genre in common, and create an edge between them if they do.

from pprint import pprint
dictionary_title_withonegenere= {28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
16: ['Puss in Boots: The Last Wish', 'Strange World']}

movies_with_genre = {}
movies_set = set()
for genre, movies in dictionary_title_withonegenere.items():
    for movie in movies:
        movies_with_genre.setdefault(movie, set()).add(genre)
        movies_set.add(movie)
    
pprint(movies_with_genre)
movie_list = list(movies_set)
edges = []
# movie_list comes from a set, so the order of the edges can vary between runs.
# Compare every unordered pair of movies exactly once (j starts at i + 1).
for i in range(len(movie_list)):
    source_movie = movie_list[i]
    for j in range(i + 1, len(movie_list)):
        target_movie = movie_list[j]
        # A non-empty set intersection means the two movies share a genre.
        if movies_with_genre[source_movie] & movies_with_genre[target_movie]:
            edges.append({"source": source_movie, "target": target_movie})
pprint(edges)

OUTPUT

{'Avatar: The Way of Water': {28, 12},
 'Puss in Boots: The Last Wish': {16, 28},
 'Strange World': {16},
 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe': {12},
 'Violent Night': {28}}
[{'source': 'Strange World', 'target': 'Puss in Boots: The Last Wish'},
 {'source': 'Avatar: The Way of Water',
  'target': 'Puss in Boots: The Last Wish'},
 {'source': 'Avatar: The Way of Water', 'target': 'Violent Night'},
 {'source': 'Avatar: The Way of Water',
  'target': 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'},
 {'source': 'Puss in Boots: The Last Wish', 'target': 'Violent Night'}]
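
The same pairwise check can probably be written more compactly with the standard library. What follows is only a minimal sketch, not part of the original answer: it assumes the dictionary from the question, uses itertools.combinations to visit each unordered pair once, and sorts the titles first so that the edge order is reproducible. The names genre_to_titles and build_edges are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Same data as in the question, under a shorter (hypothetical) name.
genre_to_titles = {
    28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
    12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
    16: ['Puss in Boots: The Last Wish', 'Strange World'],
}


def build_edges(genre_to_titles):
    """Return one edge per pair of titles that share at least one genre."""
    # Invert the mapping: title -> set of genre ids.
    title_to_genres = defaultdict(set)
    for genre, titles in genre_to_titles.items():
        for title in titles:
            title_to_genres[title].add(genre)

    edges = []
    # Sorting the titles makes the resulting edge order deterministic.
    for source, target in combinations(sorted(title_to_genres), 2):
        # A non-empty intersection means the pair has a genre in common.
        if title_to_genres[source] & title_to_genres[target]:
            edges.append({"source": source, "target": target})
    return edges


edges = build_edges(genre_to_titles)
for edge in edges:
    print(f'{edge["source"]: <30}{edge["target"]}')
```

Because every pair is examined only once, two movies that share several genres still produce a single edge.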
Marcus.Aurelianus On

Try this:

edges=[]
genres=[28,12,16,35,80,99,18,10751,14,36,27,10402,9648,10749,878,10770,53,10752,37]
dictionary_title_withonegenere = {28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
16: ['Puss in Boots: The Last Wish', 'Strange World']}

for genre in genres:
    # Skip genre IDs that have no movies in the dictionary.
    if genre not in dictionary_title_withonegenere:
        continue
    genres_list = dictionary_title_withonegenere[genre]
    # Pair each title with every later title of the same genre.
    for j in range(len(genres_list)):
        for k in range(j + 1, len(genres_list)):
            edges.append({"name_movies": genres_list[j], "onegres_movies": genres_list[k]})

for edge in edges:
    print(f'{edge["name_movies"]: <40}{edge["onegres_movies"]}')

output

Avatar: The Way of Water                Violent Night
Avatar: The Way of Water                Puss in Boots: The Last Wish
Violent Night                           Puss in Boots: The Last Wish
Avatar: The Way of Water                The Chronicles of Narnia: The Lion, the Witch and the Wardrobe
Puss in Boots: The Last Wish            Strange World
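
One thing to watch with a per-genre loop like this: if two movies share more than one genre, the pair is appended once for each shared genre. A hedged sketch of one way to avoid that, assuming the edges are undirected, is to remember the pairs already emitted. The helper name unique_edges is hypothetical, and the sample data below is invented purely to trigger a duplicate; it is not the data from the question.

```python
from itertools import combinations


def unique_edges(genre_to_titles):
    """Pair titles within each genre, emitting each unordered pair only once."""
    seen = set()
    edges = []
    for titles in genre_to_titles.values():
        for source, target in combinations(titles, 2):
            # frozenset ignores direction, so (a, b) and (b, a) count as the same pair.
            key = frozenset((source, target))
            if key not in seen:
                seen.add(key)
                edges.append((source, target))
    return edges


# Invented sample: the first two titles appear together under both genres.
sample = {
    28: ['Avatar: The Way of Water', 'Violent Night'],
    12: ['Avatar: The Way of Water', 'Violent Night', 'Strange World'],
}

edges = unique_edges(sample)
for source, target in edges:
    print(f'{source: <30}{target}')
```

With the data from the question no pair shares two genres, so the original loop and this variant should give the same five edges.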