My table has integer columns "a", "b". There are only few (<30) different values for "a", but for each "a", huge number of different "b" exists (>10**7). To speed things up, I created composite index (a,b). I observe that
select count(*) from tab where a=1;
runs fast, also
select count(*) from tab where a=2;
runs fast, but
select count(*) from tab where a=1 or a=2;
runs (oh-my-god-so) slow, the same is true for
select count(*) from tab;
running explain explained, that the fast queries use
-> Index Only Scan using idx on tab
but for the slow queries, sequential scan is used.
Why is this so? Why does Postgresql not use the same index for the latter two queries? Is it just imperfection of the query planner, or are there more profound reasons why Index Only Scan cannot be use?
Because for the conditions like a=1 there is no need to recheck the condition for each row. So, it just traverses the b-tree and counts all leafs that have a=1. No need for eliciting the actual data in this case. I'd try union for two queries with a=1 and a=2 or maybe the condition like a<=2 will work.