A more efficient Django (SQL) query

117 views Asked by At

Working on a Django project, I have a database of users and artworks, the latter of which can be liked. I would like to query, to get how many likes on all their pictures a singular user has together. I am able to write that over two for loops in views.py, but that is slow. I've also written it as a separate (terrible) SQL query, but I'm not certain how to then use it properly, since it's not a query-set anymore (I guess I'd have to query in a way where I get all the required data?).

The end idea is to simply have a table consisting of users, their emails, the number of likes their pictures have received, and the number of images posted. Here are the relevant models (I'm using the default Django auth_user table for users) and the SQL query.

class Arts(models.Model):
user_id = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE, unique=False)
title = models.CharField(max_length=100)
description = models.TextField(unique=False, null=False, blank=True)
timestamp = models.DateTimeField(default=timezone.now)
url = models.URLField()
likes = models.IntegerField(default=0)

Artwork liked by user:

class Like(models.Model):                                  
artwork = models.ForeignKey(Arts, on_delete=models.CASCADE)
user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)

my views.py for loop:

def all_users(request):
...
liked = defaultdict()
for user in users_all:
    liked[user.id] = 0
    for artwork in Arts.objects.filter(user_id=user.id):
        liked[user.id] += artwork.likes
...

SQL query

    with connection.cursor() as cursor:
    sql = """
WITH lajki AS 
(SELECT user_id_id, likes FROM artists_arts ORDER BY likes) 
SELECT user_id_id, SUM(likes) AS suma 
FROM lajki 
GROUP BY user_id_id 
ORDER BY suma DESC
"""
    cursor.execute(sql)
    def namedtuplefetchall(cursor):
        desc = cursor.description
        nt_result = namedtuple('Result', [col[0] for col in desc])
        return [nt_result(*row) for row in cursor.fetchall()]
    liked = namedtuplefetchall(cursor)

Is there a way to do this more efficiently?

1

There are 1 answers

1
willeM_ Van Onsem On BEST ANSWER

There is no need to use raw queries at all. You can query with:

from django.contrib.auth import get_user_model

def all_users(request):
    users = get_user_model().objects.annotate(
        suma=Sum('arts__likes')
    ).order_by('-suma')
    # …

The user objects that arise from this queryset will have an extra attribute .suma that will sum the likes of the related Arts objects of that user. This will normally avoid the temporary table, and just make one query.

I'm however not sure if it is a good idea to store the likes in the Arts object. This is basically data duplication (you store an aggregated form of data in the object). It turns out that keeping data in sync, even on the same database is harder. It thus might be better to count the likes for the Arts objects by the Like object.