Postgis query over 2 tables explodes in runtime

41 views Asked by At

Two small postgis queries alone work fine:

SELECT a.name, a.population, a.place 
   FROM SH_POINT a 
   WHERE a.place IN ('hamlet', 'city', 'town', 'village');

Returns 3700 rows in only 0.26s.

SELECT b.name,b.population 
   FROM SH_POLY b 
   WHERE b.boundary='administrative' AND b.admin_level='8';

Return 1161 rows in only 0.35s.

Now I want to combine these results simply based on the name. With such small datasets it should be no problem but

SELECT a.name, a.population, a.place, b.population 
   FROM SH_POINT a, SH_POLY b 
   WHERE a.place in ('hamlet', 'city', 'town', 'village') 
      AND b.boundary='administrative' AND b.admin_level='8' 
      AND a.name = b.name;

takes over 16 minutes for only 1273 rows!

Other variants like

WITH 
   a AS (SELECT name, population, place 
            FROM SH_POINT 
            WHERE place IN ('hamlet', 'city', 'town', 'village') ), 
   b AS (SELECT name, population 
            FROM SH_POLY 
            WHERE boundary='administrative' AND admin_level='8' ) 

SELECT a.name, a.population, a.place, b.population 
   FROM a, b 
   WHERE a.name = b.name;

fail, too: Time: 990295,242 ms (16:30,295)

SH_POINT and SH_POLY are views to a normal planet_osm_* postgis table to reduce the interesting data to a specific area. (As you can see in the first two single queries this works fine.)

CREATE VIEW SH_POINT AS SELECT * FROM planet_osm_point WHERE 
   way @ (SELECT ST_Collect(way) FROM planet_osm_polygon 
             WHERE name='Schleswig-Holstein' AND admin_level='4') 
   AND ST_Within(way, (SELECT ST_COLLECT(way) FROM planet_osm_polygon WHERE 
                          name='Schleswig-Holstein' AND admin_level='4') );

A query analysis for another variant:

EXPLAIN (ANALYZE, BUFFERS) SELECT a.name, a.population, a.place, b.population from SH_POINT a FULL OUTER JOIN SH_POLY b ON a.name = b.name WHERE a.place IN ('hamlet', 'city', 'town', 'village') AND b.boundary='administrative' AND b.admin_level='8';
Time: 1008286,628 ms (16:48,287)

                                                                          QUERY PLAN                                                                      
       
-------------------------------------------------------------------------------------------------------------------------------------------------------------
-------
 Nested Loop  (cost=242.04..304.10 rows=1 width=34) (actual time=524.970..1008283.783 rows=1273 loops=1)
   Buffers: shared hit=811729 read=193326311
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=57.88..57.89 rows=1 width=32) (actual time=0.289..0.291 rows=1 loops=1)
           Buffers: shared hit=3 read=22
           ->  Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_1  (cost=52.49..56.51 rows=1 width=189) (actual time=0.084..0.089 rows=2 loops=1)
                 Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
                 Heap Blocks: exact=2
                 Buffers: shared hit=3 read=6
                 ->  BitmapAnd  (cost=52.49..52.49 rows=1 width=0) (actual time=0.073..0.075 rows=0 loops=1)
                       Buffers: shared hit=2 read=5
                       ->  Bitmap Index Scan on planet_osm_polygon_name_idx  (cost=0.00..5.07 rows=67 width=0) (actual time=0.047..0.047 rows=2 loops=1)
                             Index Cond: (name = 'Schleswig-Holstein'::text)
                             Buffers: shared hit=2 read=2
                       ->  Bitmap Index Scan on planet_osm_polygon_adminlevel_idx  (cost=0.00..47.17 rows=4098 width=0) (actual time=0.023..0.023 rows=106 lo
ops=1)
                             Index Cond: (admin_level = '4'::text)
                             Buffers: shared read=3
   InitPlan 2 (returns $1)
     ->  Aggregate  (cost=57.88..57.89 rows=1 width=32) (actual time=0.119..0.121 rows=1 loops=1)
           Buffers: shared hit=25
           ->  Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_2  (cost=52.49..56.51 rows=1 width=189) (actual time=0.030..0.033 rows=2 loops=1)
                 Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
                 Heap Blocks: exact=2
                 Buffers: shared hit=9
                 ->  BitmapAnd  (cost=52.49..52.49 rows=1 width=0) (actual time=0.027..0.028 rows=0 loops=1)
                       Buffers: shared hit=7
                       ->  Bitmap Index Scan on planet_osm_polygon_name_idx  (cost=0.00..5.07 rows=67 width=0) (actual time=0.015..0.015 rows=2 loops=1)
                             Index Cond: (name = 'Schleswig-Holstein'::text)
                             Buffers: shared hit=4
                       ->  Bitmap Index Scan on planet_osm_polygon_adminlevel_idx  (cost=0.00..47.17 rows=4098 width=0) (actual time=0.010..0.011 rows=106 lo
ops=1)
                             Index Cond: (admin_level = '4'::text)
                             Buffers: shared hit=3
   InitPlan 3 (returns $2)
     ->  Aggregate  (cost=57.88..57.89 rows=1 width=32) (actual time=0.122..0.124 rows=1 loops=1)
           Buffers: shared hit=25
           ->  Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_3  (cost=52.49..56.51 rows=1 width=189) (actual time=0.032..0.035 rows=2 loops=1)
                 Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
                 Heap Blocks: exact=2
                 Buffers: shared hit=9
                 ->  BitmapAnd  (cost=52.49..52.49 rows=1 width=0) (actual time=0.029..0.030 rows=0 loops=1)
                       Buffers: shared hit=7
                       ->  Bitmap Index Scan on planet_osm_polygon_name_idx  (cost=0.00..5.07 rows=67 width=0) (actual time=0.017..0.017 rows=2 loops=1)
                             Index Cond: (name = 'Schleswig-Holstein'::text)
                             Buffers: shared hit=4
                       ->  Bitmap Index Scan on planet_osm_polygon_adminlevel_idx  (cost=0.00..47.17 rows=4098 width=0) (actual time=0.010..0.010 rows=106 loops=1)
                             Index Cond: (admin_level = '4'::text)
                             Buffers: shared hit=3
   InitPlan 4 (returns $3)
     ->  Aggregate  (cost=57.88..57.89 rows=1 width=32) (actual time=0.192..0.193 rows=1 loops=1)
           Buffers: shared hit=25
           ->  Bitmap Heap Scan on planet_osm_polygon planet_osm_polygon_4  (cost=52.49..56.51 rows=1 width=189) (actual time=0.026..0.028 rows=2 loops=1)
                 Recheck Cond: ((name = 'Schleswig-Holstein'::text) AND (admin_level = '4'::text))
                 Heap Blocks: exact=2

                 
                 Buffers: shared hit=9
                 ->  BitmapAnd  (cost=52.49..52.49 rows=1 width=0) (actual time=0.023..0.024 rows=0 loops=1)
                       Buffers: shared hit=7
                       ->  Bitmap Index Scan on planet_osm_polygon_name_idx  (cost=0.00..5.07 rows=67 width=0) (actual time=0.013..0.013 rows=2 loops=1)
                             Index Cond: (name = 'Schleswig-Holstein'::text)
                             Buffers: shared hit=4
                       ->  Bitmap Index Scan on planet_osm_polygon_adminlevel_idx  (cost=0.00..47.17 rows=4098 width=0) (actual time=0.009..0.009 rows=106 loops=1)
                             Index Cond: (admin_level = '4'::text)
                             Buffers: shared hit=3
   ->  Index Scan using planet_osm_point_place on planet_osm_point  (cost=0.28..33.31 rows=1 width=29) (actual time=3.592..85.382 rows=3699 loops=1)
         Index Cond: ((way @ $0) AND (way @ $1))
         Filter: ((place = ANY ('{hamlet,city,town,village}'::text[])) AND st_within(way, $1))
         Rows Removed by Filter: 4951
         Buffers: shared hit=49 read=4234
   ->  Bitmap Heap Scan on planet_osm_polygon  (cost=10.19..39.21 rows=1 width=24) (actual time=272.552..272.555 rows=0 loops=3699)
         Recheck Cond: ((way @ $2) AND (name = planet_osm_point.name))
         Rows Removed by Index Recheck: 5
         Filter: ((boundary = 'administrative'::text) AND (admin_level = '8'::text) AND st_within(way, $3))
         Rows Removed by Filter: 0
         Heap Blocks: exact=1369 lossy=912
         Buffers: shared hit=811680 read=193322077
         ->  BitmapAnd  (cost=10.19..10.19 rows=1 width=0) (actual time=272.290..272.290 rows=0 loops=3699)
               Buffers: shared hit=811480 read=193319438
               ->  Bitmap Index Scan on planet_osm_polygon_way_idx  (cost=0.00..4.87 rows=45 width=0) (actual time=271.210..271.210 rows=2789969 loops=3699)
                     Index Cond: ((way @ $2) AND (way @ $3))
                     Buffers: shared hit=811430 read=193304692
               ->  Bitmap Index Scan on planet_osm_polygon_name_idx  (cost=0.00..5.07 rows=67 width=0) (actual time=0.018..0.018 rows=2 loops=3699)
                     Index Cond: (name = planet_osm_point.name)
                     Buffers: shared hit=50 read=14746
 Planning:
   Buffers: shared hit=10
 Planning Time: 1.283 ms
 Execution Time: 1008284.465 ms
                 

The only workaround I have found so far is to store both queries in new tables:

CREATE TABLE ppoint AS SELECT a.name, a.population, a.place 
   FROM SH_POINT a 
   WHERE a.place IN ('hamlet', 'city', 'town', 'village') ;

CREATE TABLE ppoly AS SELECT b.name, b.population 
   FROM SH_POLY b 
   WHERE b.boundary='administrative' AND b.admin_level='8';

SELECT a.name, a.population, a.place, b.population 
   FROM ppoint a, ppoly b 
   WHERE a.name=b.name;

This works in less than 3ms but is not a desired option here.

  • What is the root cause for such a long runtime?
  • Can the query be modified to give results in acceptable times?

Thanks for your help!

1

There are 1 answers

0
JGH On BEST ANSWER

The planner has rewritten the query embedding the view code in an inefficient way.

Using the CTE, you can force it to first compute (materialize) the selection from the views:

WITH 
   a AS MATERIALIZED  (SELECT name, population, place 
            FROM SH_POINT 
            WHERE place IN ('hamlet', 'city', 'town', 'village') ), 
   b AS MATERIALIZED  (SELECT name, population 
            FROM SH_POLY 
            WHERE boundary='administrative' AND admin_level='8' ) 

SELECT a.name, a.population, a.place, b.population 
   FROM a, b 
   WHERE a.name = b.name;