Big performance difference between using "|"(or), "&"(and) in PostgreSQL full text search

95 views Asked by At

I created a table with about 100,000 data records in the database that includes "id"(serial4), "name"(text), "description"(text) and "vector"(tsvector) columns. The "vector" column is calculated and stored by the setweight of "name" and "description".

When I search using the following queries, query using "or" | Is much slower than queries done with "and" &.

A. Query using "&":

SELECT * FROM "document"
WHERE "vector" @@ to_tsquery('Software & Appliction & Windows & Security & 8.1')
ORDER BY ts_rank_cd("vector", to_tsquery('Software & Appliction & Windows & Security & 8.1')) DESC

Explain query using "&":

Sort  (cost=80.78..80.78 rows=1 width=1284) (actual time=0.168..0.169 rows=0 loops=1)
  Sort Key: (ts_rank_cd(vector, to_tsquery('Software & Appliction & Windows & Security & 8.1'::text))) DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Bitmap Heap Scan on document  (cost=76.25..80.77 rows=1 width=1284) (actual time=0.152..0.153 rows=0 loops=1)
        Recheck Cond: (vector @@ to_tsquery('Software & Appliction & Windows & Security & 8.1'::text))
        ->  Bitmap Index Scan on index_document_vector  (cost=0.00..76.25 rows=1 width=0) (actual time=0.149..0.150 rows=0 loops=1)
              Index Cond: (vector @@ to_tsquery('Software & Appliction & Windows & Security & 8.1'::text))
Planning Time: 0.464 ms
Execution Time: 0.236 ms

B. Query using "|":

SELECT * FROM "document"
WHERE "vector" @@ to_tsquery('Software | Appliction | Windows | Security | 8.1')
ORDER BY ts_rank_cd("vector", to_tsquery('Software | Appliction | Windows | Security | 8.1')) DESC

Explain query using "|":

Gather Merge  (cost=56865.65..64207.06 rows=62922 width=1284) (actual time=1575.781..1717.312 rows=66263 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Sort  (cost=55865.63..55944.28 rows=31461 width=1284) (actual time=1497.434..1525.802 rows=22088 loops=3)
        Sort Key: (ts_rank_cd(vector, to_tsquery('Software | Appliction | Windows | Security | 8.1'::text))) DESC
        Sort Method: external merge  Disk: 29624kB
        Worker 0:  Sort Method: external merge  Disk: 26440kB
        Worker 1:  Sort Method: external merge  Disk: 25752kB
        ->  Parallel Bitmap Heap Scan on document  (cost=769.42..35878.79 rows=31461 width=1284) (actual time=17.608..1374.920 rows=22088 loops=3)
              Recheck Cond: (vector @@ to_tsquery('Software | Appliction | Windows | Security | 8.1'::text))
              Heap Blocks: exact=6763
              ->  Bitmap Index Scan on index_document_vector  (cost=0.00..750.54 rows=75506 width=0) (actual time=34.811..34.811 rows=66329 loops=1)
                    Index Cond: (vector @@ to_tsquery('Software | Appliction | Windows | Security | 8.1'::text))
Planning Time: 0.668 ms
Execution Time: 1726.138 ms

As you can see, a query with "&" is 7314 times faster than a query with "|".

It is also clear from the results of the Explain that the methods of obtaining query results are different from each other.

  1. Why are these two queries so different in terms of method and execution speed?
  2. How can the query with "|" be optimized?
0

There are 0 answers