PostgreSQL stable functions in query

744 views Asked by At

I'm have some data model which consists of couple tables and I need to filter them.

It is two functions funcFast and funcList. funcFast can return fast result is table need to be filtered by funcList or not. funcList return list of allowed ids. I marked functions as STABLE but they run not as fast as I expect:)

I create couple of example functions:

CREATE OR REPLACE FUNCTION funcFastPlPgSql(res boolean)
returns boolean as $$
    begin return res; end
$$ language plpgsql stable;

CREATE OR REPLACE FUNCTION funcList(cnt int)
returns setof integer as $$
    select generate_series(1, cnt)
$$ language sql stable;

And tests.

Case 1. Filter only by fast function work OK:

explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs 
where funcFastPlPgSql(true)

Query plan is:

Aggregate  (cost=27.76..27.77 rows=1 width=8) (actual time=573.258..573.259 rows=1 loops=1)
  CTE obs
    ->  Result  (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..114.327 rows=1000000 loops=1)
  ->  Result  (cost=0.25..20.25 rows=1000 width=0) (actual time=0.038..489.942 rows=1000000 loops=1)
        One-Time Filter: funcfastplpgsql(true)
        ->  CTE Scan on obs  (cost=0.25..20.25 rows=1000 width=0) (actual time=0.012..392.504 rows=1000000 loops=1)
Planning time: 0.184 ms
Execution time: 576.177 ms

Case 2. Filter only by slow function work OK too:

explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs 
where  id in (select funcList(1000))

Query plan is:

Aggregate  (cost=62.26..62.27 rows=1 width=8) (actual time=469.344..469.344 rows=1 loops=1)
  CTE obs
    ->  Result  (cost=0.00..5.01 rows=1000 width=4) (actual time=0.006..106.144 rows=1000000 loops=1)
  ->  Hash Join  (cost=22.25..56.00 rows=500 width=0) (actual time=1.566..469.202 rows=1000 loops=1)
        Hash Cond: (obs.id = (funclist(1000)))
        ->  CTE Scan on obs  (cost=0.00..20.00 rows=1000 width=4) (actual time=0.009..359.580 rows=1000000 loops=1)
        ->  Hash  (cost=19.75..19.75 rows=200 width=4) (actual time=1.548..1.548 rows=1000 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 44kB
              ->  HashAggregate  (cost=17.75..19.75 rows=200 width=4) (actual time=1.101..1.312 rows=1000 loops=1)
                    Group Key: funclist(1000)
                    ->  Result  (cost=0.00..5.25 rows=1000 width=4) (actual time=0.058..0.706 rows=1000 loops=1)
Planning time: 0.141 ms
Execution time: 472.183 ms

Case 3. But then two function combined I expect what the best case should be close to [case 1] and worst case should be close to [case 2], but:

explain analyze
with obs as (select generate_series(1, 1000000) as id)
select count(*) from obs
where funcFastPlPgSql(true) or id in (select funcList(1000))

Query plan is:

Aggregate  (cost=286.93..286.94 rows=1 width=8) (actual time=1575.775..1575.775 rows=1 loops=1)
  CTE obs
    ->  Result  (cost=0.00..5.01 rows=1000 width=4) (actual time=0.008..131.372 rows=1000000 loops=1)
  ->  CTE Scan on obs  (cost=7.75..280.25 rows=667 width=0) (actual time=0.035..1468.007 rows=1000000 loops=1)
        Filter: (funcfastplpgsql(true) OR (hashed SubPlan 2))
        SubPlan 2
          ->  Result  (cost=0.00..5.25 rows=1000 width=4) (never executed)
Planning time: 0.100 ms
Execution time: 1578.624 ms

What I am missing here? Why query with two together functions runs much longer and how to fix it?

0

There are 0 answers