Query Optimization: Select polygons that intersect points

535 views Asked by At

I have two Oracle 18c tables:

  • ply — Polygons: 4970 rows.
  • pnt — Points: 3500 rows.
  • The shapes are stored in SDO_GEOMETRY columns.
  • Edit: I realize now that most of the points are duplicates (spatially). But I don't think that’s a problem.

The data can be viewed here: db<>fiddle.


enter image description here


I have a query that selects polygons that intersect at least one point.

  SELECT objectid 
    FROM (SELECT ply.objectid,
                 row_number() over(partition by ply.objectid order by null) rn
            FROM ply                                       --ORDER BY NULL is intentional.
      CROSS JOIN pnt                                       --It doesn't matter what polygon row per OBJECTID gets selected.
           WHERE sdo_anyinteract(ply.shape, pnt.shape) = 'TRUE'
         )
  WHERE rn = 1

   OBJECTID
   ----------
            1
            2
            3
            4
            5
          ...

   1443 rows selected.
--------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |           |  1457 | 37882 |       | 12851   (1)| 00:00:01 |
|*  1 |  VIEW                              |           |  1457 | 37882 |       | 12851   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK          |           |  1457 |    10M|    11M| 12851   (1)| 00:00:01 |
|   3 |    NESTED LOOPS                    |           |  1457 |    10M|       | 10522   (1)| 00:00:01 |
|   4 |     TABLE ACCESS FULL              | PNT       |  3500 |    12M|       |    11   (0)| 00:00:01 |
|   5 |     TABLE ACCESS BY INDEX ROWID    | PLY       |     1 |  3848 |       | 10522   (1)| 00:00:01 |
|*  6 |      DOMAIN INDEX (SEL: 0.010000 %)| PLY_SHAPE |       |       |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "PLY"."OBJECTID" ORDER BY  NULL )<=1)
   6 - access("MDSYS"."SDO_ANYINTERACT"("PLY"."SHAPE","PNT"."SHAPE")='TRUE')
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

The query is designed to only select one row per polygon. Reason: I want a unique list of polygons that intersect at least one point.


The query is relatively slow.

  • It takes 5 seconds to run in my on-prem Oracle 18c database.
  • And it takes 5+ seconds to run in db<>fiddle too.
  • I believe the statistics are up-to-date since the tables are newly created in my on-prem database and in db<>fiddle.
  • The explain plan indicates that the PLY_SHAPE spatial/domain index is being used, which I think is appropriate.

In comparison, desktop mapping software like ArcGIS Pro can perform the same operation in RAM in 0.5 seconds. Screenshot.


Question:

Can the query performance be improved?

As a novice, it seems surprising to me that my mediocre office computer with mapping software is out-performing our enterprise Oracle database.

2

There are 2 answers

2
kpatenge On

You can start using SDO_JOIN in conjunction with SDO_ANYINTERACT and go forward from there.

SELECT /*+ ordered use_nl (a,b) use_nl (a,c) */ 
    b.objectid as point_object_id, 
    c.objectid as polygon_object_id
FROM 
    TABLE(sdo_join('PNT','SHAPE','PLY','SHAPE')) a,
    PNT b,
    PLY c
WHERE 
    a.rowid1 = b.rowid
AND a.rowid2 = c.rowid
AND SDO_GEOM.RELATE (b.shape, 'ANYINTERACT', c.shape, 1.0) = 'TRUE';
2
User1974 On

I took @kpatenge's excellent answer and added the last small step: only select one row per polygon, using row_number() and WHERE rn = 1.

It runs instantly: 0.3 seconds.

select objectid from (
    select /*+ ordered use_nl (a,b) use_nl (a,c) */    --Try removing this hint. Removing it seemed to reduce the cost without impacting performance.
        c.objectid as objectid,
        row_number() over(partition by c.objectid order by null) rn
    from 
        sdo_join('PNT','SHAPE','PLY','SHAPE') a,       --I removed table(), since that's no longer necessary. https://stackoverflow.com/q/72590279/5576771
        pnt b,
        ply c
    where 
        a.rowid1 = b.rowid
        and a.rowid2 = c.rowid
        and sdo_geom.relate (b.shape, 'ANYINTERACT', c.shape, 1.0) = 'TRUE'   --1.0 is the tolerance: https://docs.oracle.com/en/database/oracle/oracle-database/18/spatl/spatial-concepts.html#GUID-7469388B-6D23-4294-904F-78CA3B7191D3
) where rn = 1;   
----------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |          |   142K|  3607K|       |   297K  (1)| 00:00:12 |
|*  1 |  VIEW                                 |          |   142K|  3607K|       |   297K  (1)| 00:00:12 |
|*  2 |   WINDOW SORT PUSHED RANK             |          |   142K|    13M|    17M|   297K  (1)| 00:00:12 |
|   3 |    NESTED LOOPS                       |          |   142K|    13M|       |   294K  (1)| 00:00:12 |
|   4 |     NESTED LOOPS                      |          |   285K|    14M|       |  8204   (1)| 00:00:01 |
|   5 |      COLLECTION ITERATOR PICKLER FETCH| SDO_JOIN |  8168 | 16336 |       |    29   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS BY USER ROWID       | PNT      |    35 |  1820 |       |     1   (0)| 00:00:01 |
|*  7 |     TABLE ACCESS BY USER ROWID        | PLY      |     1 |    46 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("RN"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "C"."OBJECTID" ORDER BY  NULL )<=1)
   6 - access(CHARTOROWID(VALUE(KOKBF$)))
   7 - access(CHARTOROWID(VALUE(KOKBF$)))
PLAN_TABLE_OUTPUT                                                                                                                                --------------------------------------------------------
       filter("SDO_GEOM"."RELATE"("B"."SHAPE",'ANYINTERACT',"C"."SHAPE",1.0)='TRUE')

Related:


Edit: (a simplified query suggested by @AlbertGodfrind)

Time: 0.2 - 0.3 seconds.

    select distinct
        c.objectid as objectid
    from 
        sdo_join('PNT','SHAPE','PLY','SHAPE','MASK=anyinteract') a,
        pnt b,
        ply c
    where 
        a.rowid1 = b.rowid
        and a.rowid2 = c.rowid
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |  4970 |   145K|  1872  (99)| 00:00:01 |
|   1 |  HASH UNIQUE                         |                      |  4970 |   145K|  1872  (99)| 00:00:01 |
|*  2 |   HASH JOIN                          |                      |    14M|   406M|   197  (82)| 00:00:01 |
|   3 |    INDEX FAST FULL SCAN              | R202801_SDE_ROWID_UK |  4970 | 79520 |     4   (0)| 00:00:01 |
|*  4 |    HASH JOIN                         |                      |   285K|  3908K|    37  (11)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN             | R202802_SDE_ROWID_UK |  3500 | 42000 |     4   (0)| 00:00:01 |
|   6 |     COLLECTION ITERATOR PICKLER FETCH| SDO_JOIN             |  8168 | 16336 |    29   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("C".ROWID=CHARTOROWID(VALUE(KOKBF$)))
   4 - access("B".ROWID=CHARTOROWID(VALUE(KOKBF$)))