Aggregating data by month using SQL in metabase from redshift

140 views Asked by At

I hope you are doing well,

I am working on aggregating data by month,

Note: I am querying data from Redhsift using Metabase.

This is my query:

(SELECT 
     dataset.value AS value, 
     TO_CHAR(dataset.time ::date,'MM/YYYY') AS time, 
     COUNT(id) AS count
 FROM  
     dataset
 GROUP BY 
     dataset.value, TO_CHAR(dataset.time ::date,'MM/YYYY')
 ORDER BY 
     dataset.value ASC, TO_CHAR(dataset.time ::date,'MM/YYYY') ASC)

UNION ALL

(SELECT 
     dataset_2.value_2 AS x, 
     TO_CHAR(dataset_2.time ::date,'MM/YYYY') AS time, 
     COUNT(id) AS count
 FROM 
     dataset_2
 GROUP BY 
     dataset_2.value_2, TO_CHAR(dataset_2.time ::date,'MM/YYYY')
 ORDER BY 
     dataset_2.value_2in ASC, TO_CHAR(dataset_2.time ::date,'MM/YYYY') ASC)

However this query converts date into string. Hence the output is not in proper order. Eg: Jan 22, Jan 23, Feb 22, March 23 instead of Jan 22, Feb 22, March 22 and so on.

Below are the screenshot of two sample dataset.

enter image description here

enter image description here

Screenshot of the Desired Output:

enter image description here

Requesting your help improve the query so the date aggregated by month is in ascending order. Thanks in advance.

2

There are 2 answers

0
DataWrangler On

You need to just switch the ORDER clause

SELECT 
     D.value AS value, 
     TO_CHAR(D.time ::date,'MM/YYYY') AS time, 
     COUNT(event_id) AS count
 FROM  
     dategrouping D
 GROUP BY 
     D.value, TO_CHAR(D.time ::date,'MM/YYYY')
 ORDER BY 
     TO_CHAR(D.time ::date,'MM/YYYY') asc, D.value ASC

Result

value time count
5/5 - Better than before 05/2022 2
Skipped check-out 05/2022 1
1/5 - Not too great 06/2022 2
3/5 - About the same 06/2022 1
5/5 - Better than before 06/2022 2
3/5 - About the same 07/2022 1
4/5 - Between About the same and Better than before 07/2022 2
5/5 - Better than before 07/2022 2
Skipped check-out 07/2022 1
4/5 - Between About the same and Better than before 08/2022 2
Skipped check-out 08/2022 1
4/5 - Between About the same and Better than before 09/2022 2
5/5 - Better than before 09/2022 1
0
Gabriel Jacobina On

Thumbs up to be using metabase

I would recomend you to use the DATE_TRUNC() function instead of TO_CHAR() so the date records will all be moved to the first day of each month and therefore still be stored as a date, which is able to be ordered.

Also, I would first use a CTE to union the date and aggregate it afterwards. The looks of the query would be something like this:

WITH data AS 
(
  SELECT 
      dataset.value AS value, 
      dataset.time AS time
  FROM  
      dataset
  UNION ALL
     (
     SELECT 
          dataset_2.value_2 AS value, 
          dataset_2.time AS time
     FROM 
          dataset_2
     )
 )

SELECT
    data.value,
    DATE_TRUNC('month',data.time) AS mth
FROM 
    data
GROUP BY 
    value,mth
ORDER BY 
    mth