This the minimal query statement I want to execute.
select count(*) from temper_300_1 group by onegid;
I do have "where" clauses to go along as well though. What I am trying to do is build a histogram query and determine the number of elements with a particular "onegid". the query takes about 7 seconds on 800 million rows. Could someone suggest a faster alternative or optimization.
I was to actually trying plot a heatmap from a spatial data consisting of latitudes and longitudes, I have assigned a grid id to each elements, but the "group by aggregation" is coming out to be pretty costly in terms of time.
You're not going to get much faster than
group by
, though your current query won't display which group item is associated with each count.Make sure that the table is properly distributed with
The counts should be roughly equal. If they're not, your DBA needs to redistribute the table on a better distribution key.
If it is, you could ask your DBA to create a materialized view on that specific column, ordered by that column. You may see some performance gains.