Complex Django ORM Annotations & Aggregations

228 views Asked by At

I'm currently preparing some logged items into a JSON serialized format.

I am attempting to do this via Django's built-in ORM, utilising annotations and aggregations.

I'm trying to replicate the following structure per each of these "logged items":

...
{
    "count": 20,
    "metric-type": "total-dataset-requests",
    "access-method": "regular",
    "country-counts": {
        "au": 6,
        "jp": 1,
        "us": 13
    }
}
...

I've currently built this queryset from my own knowledge:

metrics = LoggedItem.objects.filter(
    identifier=record['identifier'],
    hit_type='investigations',
    is_machine=False,
    is_robot=False
).values(
    'identifier', 'country'
).annotate(
    count=models.Count('identifier'),
    metric_type=models.Value("total-dataset-requests", output_field=models.CharField()),
    access_method=models.Value("regular", output_field=models.CharField())
)

This gives me a <Queryset []> as follows:

<QuerySet [{'identifier': '10.80519/954e-4db4', 'country': 'fr', 'count': 1, 'metric_type': 'total-dataset-requests', 'access_method': 'regular'}, {'identifier': '10.80519/954e-4db4', 'country': 'gb', 'count': 5, 'metric_type': 'total-dataset-requests', 'access_method': 'regular'}]>

As you can see, I have all of the data I need for mimicking the above data stucture. But in a sightly obscure format ... I can go in a clean this up with Python, from a values_list() or iterator(), but I'd like to do most of the heavy lifting on the database layer via the ORM.

So I guess my question is, how could I replicate the kid of aggregation as shown in the JSON structure as an ORM query...?

FYI: It would likely be useful to know The full LogItem from models.py:

class LogItem(models.Model):
    ....

    id = models.AutoField(_('ID'), primary_key=True)

    session_id = models.TextField(_('Session ID'), default='')

    doubleclick_id = models.TextField(_('Doubleclick ID'), default='')

    event = models.DateTimeField(_('Event Recorded At'), default=now, blank=True)

    client_ip = models.CharField(_('Client IP'), max_length=15, null=True, blank=True)

    session_cookie_id = models.CharField(_('Session Cookie ID'), max_length=120, null=True, blank=True)

    user_cookie_id = models.CharField(_('User Cookie ID'), max_length=120, null=True, blank=True)

    user_id = models.CharField(_('User ID'), max_length=100, null=True, blank=True)

    request_url = models.TextField(_('Request URL'), default='')

    identifier = models.CharField(_('Identifier'), max_length=120, null=True, blank=True)

    filename = models.TextField(_('Filename'), null=True)

    size = models.PositiveBigIntegerField(_('Size'), null=True)

    user_agent = models.TextField(_('User Agent'), default='')

    # Alpha-2 ISO 3166 Codes: [Reference: Country Codes Alpha-2 & Alpha-3](https://www.iban.com/country-codes)
    country = models.CharField(_('Country'), max_length=10, default='gb')

    hit_type = models.CharField(_('Hit Type'), max_length=60, null=True, blank=True)

    is_robot = models.BooleanField(_('Is Robot'), default=False)

    is_machine = models.BooleanField(_('Is Machine'), default=False)
1

There are 1 answers

0
George Messiha On

I think you can only do this in python and not QuerySet, as Queryset is a database "SELECT" representation, and as far as i know you can not include dictionary "country-counts" in your queryset rather than build the result Either in a custom model Serializer Method or in python code:

Serializer Example:

    class LogItemSerializer(serializers.ModelSerializer):
        country_counts = serializers.SerializerMethodField()

        class Meta:
            model = LogItem

        def get_country_counts(self, obj):
            # create the dic you want
            result = {
                  "au": 6,
                  "jp": 1,
                  "us": 13
                  }
            return result