There are two DB machines identical in structure and configurations with the characteristics as below. The databases are also identical in structure and contents. The performance of almost 99% of queryes is better on the new machine, only on few is is drastically degraded. Trying to narrow the investigation, I finally found that only one KNN (K Nearest Neighbour) query is the actual offender. Here is the relevant info about the table "maps" (about 820k rows) :
\d maps
Table "maps"
Column | Type | Collation | Nullable |
------------+-------------------+-----------+----------
id | integer | | not null |
strada | character varying | | |
(...)
the_geom | geography | | |
concat_all | character varying | | |
(...)
Indexes:
"pk_maps" PRIMARY KEY, btree (id), tablespace "maps_idx"
"ggx_maps_the_geom" gist (the_geom) WITH (fillfactor='100', buffering='on'), tablespace "maps_idx"
"ix_maps_strada" hash (strada), tablespace "maps_idx"
Tablespace: "maps_data"
In the following "EXPLAIN ANALYZE" investigation I used a 'point' for test (in reality nothing else but a long irelevant string, for the sake of lizibility just replaced by 'point' and transformed by st_buffer in 'buffer')
EXPLAIN ANALYZE
SELECT *
FROM maps m
WHERE st_buffer('point'::GEOGRAPHY,100::DOUBLE PRECISION) && m.the_geom
ORDER BY m.the_geom<->'point'::GEOGRAPHY
LIMIT 1;
The following are the actual machine responses :
(Machine#1 : 16CPU | 64GB RAM | Linux Ubuntu 16.04.7 LTS | PostgreSQL 9.6.21)
Limit (cost=0.29..1.96 rows=1 width=711) (actual time=1.306..1.306 rows=1 loops=1)
-> Index Scan using ggx_maps_the_geom on maps m (cost=0.29..1.96 rows=1 width=711) (actual time=1.305..1.305 rows=1 loops=1)
Index Cond: ('buffer'::geography && the_geom)
Order By: (the_geom <-> 'point'::geography)
Planning time: 2.977 ms
Execution time: 1.480 ms
(6 rows)
Time: **6.885 ms**
(Machine#2 : 16CPU | 64GB RAM | Linux Ubuntu 20.04.2 LTS | PostgreSQL 13.2)
Limit (cost=0.29..1.96 rows=1 width=692) (actual time=66.421..66.422 rows=1 loops=1)
-> Index Scan using ggx_maps_the_geom on maps m (cost=0.29..1.96 rows=1 width=692) (actual time=66.420..66.420 rows=1 loops=1)
Index Cond: (the_geom && 'buffer'::geography)
Order By: (the_geom <-> 'point'::geography)
Planning Time: 20.289 ms
Execution Time: 66.477 ms
(6 rows)
Time: **88.116 ms** (!)
As can be seen, a drastical performance degradation on the machine#2, despite the fact that the two machines are cvasi-identical in all of the aspects, including configurations as well. Does anyone have any idea how to deal with this ? What to check to get some clues ?