Poor performance of postgresql KNN queries

90 views Asked by Liviu Nastase At 20 April 2021 at 18:19

There are two DB machines identical in structure and configurations with the characteristics as below. The databases are also identical in structure and contents. The performance of almost 99% of queryes is better on the new machine, only on few is is drastically degraded. Trying to narrow the investigation, I finally found that only one KNN (K Nearest Neighbour) query is the actual offender. Here is the relevant info about the table "maps" (about 820k rows) :

\d maps
                     Table "maps"
   Column   |       Type        | Collation | Nullable |                       
------------+-------------------+-----------+----------
 id         | integer           |           | not null | 
 strada     | character varying |           |          | 
(...)

 the_geom   | geography         |           |          | 
 concat_all | character varying |           |          | 
(...)

Indexes:
    "pk_maps" PRIMARY KEY, btree (id), tablespace "maps_idx"
    "ggx_maps_the_geom" gist (the_geom) WITH (fillfactor='100', buffering='on'), tablespace "maps_idx"
    "ix_maps_strada" hash (strada), tablespace "maps_idx"

Tablespace: "maps_data"

In the following "EXPLAIN ANALYZE" investigation I used a 'point' for test (in reality nothing else but a long irelevant string, for the sake of lizibility just replaced by 'point' and transformed by st_buffer in 'buffer')

EXPLAIN ANALYZE
  SELECT *
  FROM maps m
  WHERE st_buffer('point'::GEOGRAPHY,100::DOUBLE PRECISION) && m.the_geom
  ORDER BY m.the_geom<->'point'::GEOGRAPHY
  LIMIT 1;

The following are the actual machine responses :

(Machine#1 : 16CPU | 64GB RAM | Linux Ubuntu 16.04.7 LTS | PostgreSQL 9.6.21)

 Limit  (cost=0.29..1.96 rows=1 width=711) (actual time=1.306..1.306 rows=1 loops=1)
   ->  Index Scan using ggx_maps_the_geom on maps m  (cost=0.29..1.96 rows=1 width=711) (actual time=1.305..1.305 rows=1 loops=1)
         Index Cond: ('buffer'::geography && the_geom)
         Order By: (the_geom <-> 'point'::geography)
 Planning time: 2.977 ms
 Execution time: 1.480 ms
(6 rows)
Time: **6.885 ms**

(Machine#2 : 16CPU | 64GB RAM | Linux Ubuntu 20.04.2 LTS | PostgreSQL 13.2)

 Limit  (cost=0.29..1.96 rows=1 width=692) (actual time=66.421..66.422 rows=1 loops=1)
   ->  Index Scan using ggx_maps_the_geom on maps m  (cost=0.29..1.96 rows=1 width=692) (actual time=66.420..66.420 rows=1 loops=1)
         Index Cond: (the_geom && 'buffer'::geography)
         Order By: (the_geom <-> 'point'::geography)
 Planning Time: 20.289 ms
 Execution Time: 66.477 ms
(6 rows)
Time: **88.116 ms** (!)

As can be seen, a drastical performance degradation on the machine#2, despite the fact that the two machines are cvasi-identical in all of the aspects, including configurations as well. Does anyone have any idea how to deal with this ? What to check to get some clues ?

Original Q&A

TechQA.

Poor performance of postgresql KNN queries

There are 0 answers

Related Questions in POSTGRESQL

Related Questions in QUERY-OPTIMIZATION

Related Questions in POSTGIS

Related Questions in KNN

Related Questions in GIST-INDEX

Popular Questions

Popular Tags

Trending Questions