Unique Count on Multiple Columns

I am on PostgreSQL/Greenplum and I have a query result like this:

         e1          |         e2          
 ------------------- | -------------------
 2017-01-01 00:00:00 | 2017-01-02 00:00:00
 2017-01-01 00:00:00 | 
 2017-01-03 00:00:00 | 2017-01-03 00:00:00

I want to transform this into the following:

      timestamp      | e1 | e2 
---------------------+----+----
 2017-01-01 00:00:00 |  2 |  0
 2017-01-02 00:00:00 |  0 |  1
 2017-01-03 00:00:00 |  1 |  1

Here the first column contains all the unique timestamps, and each remaining column holds the number of times that timestamp occurs for the corresponding event (e1, e2, ...).

NOTE: The dataset is very large, so doing the transformation on the application side is very costly.

There is 1 answer

Mureinik On

You could have two aggregate queries, one for each column, and use a full outer join to combine the two:

SELECT          COALESCE(e1_timestamp, e2_timestamp) AS "timestamp",
                COALESCE(e1_count, 0) AS e1,
                COALESCE(e2_count, 0) AS e2
FROM            (SELECT   e1 AS e1_timestamp, COUNT(*) AS e1_count
                 FROM     mytable
                 WHERE    e1 IS NOT NULL  -- skip empty cells so they do not form a NULL group
                 GROUP BY e1) e1
FULL OUTER JOIN (SELECT   e2 AS e2_timestamp, COUNT(*) AS e2_count
                 FROM     mytable
                 WHERE    e2 IS NOT NULL  -- same for e2
                 GROUP BY e2) e2 ON e1_timestamp = e2_timestamp
ORDER BY        1
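
Since the question mentions more event columns (e1, e2 ...), a variant that may be easier to extend is to unpivot the columns with UNION ALL and count with conditional aggregation. This is only a sketch: it assumes the same table name mytable as above, and the names ts, event and unpivoted are made up for illustration. It uses COUNT(CASE ...) rather than the FILTER clause because older Greenplum releases may not support FILTER.

```sql
-- One row per non-empty cell, tagged with the column it came from.
-- "ts", "event" and "unpivoted" are illustrative names, not from the question.
SELECT   ts AS "timestamp",
         COUNT(CASE WHEN event = 'e1' THEN 1 END) AS e1,
         COUNT(CASE WHEN event = 'e2' THEN 1 END) AS e2
FROM     (SELECT e1 AS ts, CAST('e1' AS text) AS event
          FROM   mytable
          WHERE  e1 IS NOT NULL
          UNION ALL
          SELECT e2 AS ts, CAST('e2' AS text) AS event
          FROM   mytable
          WHERE  e2 IS NOT NULL) unpivoted
GROUP BY ts
ORDER BY ts;
```

For the three sample rows in the question this should give the requested result (2/0, 0/1 and 1/1). Each extra event column needs one more UNION ALL branch and one more COUNT(CASE ...) line. It still reads mytable once per column, so whether it beats the join on a very large table is worth checking with EXPLAIN ANALYZE.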