Query takes 30-40 sec (both via ORM and raw SQL). 200kk rows

91 views Asked by At

I have a problem with a query which takes 30-40 secs.

I do next 2 queries:

country = CountriesTranslations.objects.filter(name='United Kingdom').first().country # Takes 1-8 secs
city = Cities.objects.filter(citiestranslations__name='London', country=country).first() # Takes 30-40 secs

Via connection.queries checked raw version of the 2nd query in order to view what Django trying to execute and then the same code launched via pgAdmin in order to avoid influence of Python\ORM - execution time is pretty similar

Raw query:

SELECT * FROM cities
INNER JOIN cities_translations ON (cities.id = cities_translations.city_id)
WHERE (cities_translations.name = 'London' AND cities.country_id = 32) ORDER BY cities.id ASC LIMIT 1

Here is the 'EXPLAIN ANALYZE':

"Limit  (cost=0.43..217162.78 rows=1 width=182) (actual time=35674.343..35674.344 rows=1 loops=1)"
"  ->  Nested Loop  (cost=0.43..10206631.04 rows=47 width=182) (actual time=35674.341..35674.341 rows=1 loops=1)"
"        Join Filter: (cities.id = cities_translations.city_id)"
"        Rows Removed by Join Filter: 33684"
"        ->  Index Scan using city_pkey on cities  (cost=0.43..5202957.08 rows=20514 width=168) (actual time=83.479..92.091 rows=16843 loops=1)"
"              Filter: (country_id = 32)"
"              Rows Removed by Filter: 164074"
"        ->  Materialize  (cost=0.00..3210046.95 rows=5829 width=14) (actual time=0.193..2.112 rows=2 loops=16843)"
"              ->  Seq Scan on cities_translations  (cost=0.00..3210017.80 rows=5829 width=14) (actual time=3247.887..35572.948 rows=2 loops=1)"
"                    Filter: ((name)::text = 'London'::text)"
"                    Rows Removed by Filter: 178754098"
"Planning time: 2.122 ms"
"Execution time: 35674.519 ms"

Models:

class Countries(models.Model):
    iso2 = models.CharField('ISO 2', max_length=2, blank=True)
    iso3 = models.CharField('ISO 3', max_length=3, blank=True)
    # ... lots of other stuff

    class Meta:
        db_table = 'countries'
        verbose_name = 'Country'
        verbose_name_plural = 'Countries'
        ordering = ['iso2']

class CountriesTranslations(models.Model):
    name = models.CharField("Translation", max_length=60, db_index=True, blank=True)
    is_short_name = models.IntegerField()
    description = models.TextField(blank=True)

    country = models.ForeignKey(Countries, on_delete=models.CASCADE, verbose_name='Country')
    lang = models.ForeignKey(Languages, on_delete=models.CASCADE, null=True, verbose_name='Language')

    class Meta:
        db_table = 'countries_translations'
        verbose_name = 'Country Translation'
        verbose_name_plural = 'Countries Translations'
        ordering = ['name']


class Cities(models.Model):
    city_type = models.ForeignKey('CitiesType', on_delete=models.CASCADE, verbose_name='City type', null=True)
    country = models.ForeignKey('Countries', on_delete=models.CASCADE, verbose_name='Country')
    # ... lots of other stuff

    class Meta:
        db_table = 'cities'
        verbose_name = 'City'
        verbose_name_plural = 'Cities'

class CitiesTranslations(models.Model):
    name = models.CharField(max_length=150, db_index=True, blank=True)
    description = models.TextField(blank=True)

    city = models.ForeignKey(Cities, on_delete=models.CASCADE, verbose_name='Details of City')
    lang = models.ForeignKey('Languages', on_delete=models.CASCADE, verbose_name='Language of city', null=True)

    class Meta:
        db_table = 'cities_translations'
        verbose_name = 'City Translation'
        verbose_name_plural = 'Cities Translations'
        ordering = ['name']

System:

  • Django 1.9
  • PostgreSQL 9.4.6
  • Ubuntu 15.10
  • SSD (not HDD)

Rows:

  • CountriesTranslations ~ 11k
  • Cities ~ 2.5kk
  • CitiesTranslations ~ 180kk

Also tried to change default conf of Postgresql according to this article: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

Could someone suggest me where to dig?

1

There are 1 answers

0
Gordon Linoff On BEST ANSWER

For this query:

SELECT *
FROM cities c INNER JOIN
     cities_translations ct
     ON c.id = ct.city_id
WHERE ct.name = 'London' AND c.country_id = 32
ORDER BY c.id ASC
LIMIT 1;

You want to try some indexes. The first I would recommend are cities(country_id, id) and cities_translations(city_id, name).