Create a row for every month between 2 dates in PostgreSQL

6.2k views Asked by At

I need to create a row for every month (result should be first day of the month) between 2 dates for each person in my table. For example, if I have the following data in my source table:

rowID | person      | startdate   | enddate
1     | 12345       | 2014-04-01  | 2014-11-30
2     | 67890       | 2014-03-01  | 2014-05-01

I want the results in my destination table to be:

person | month
12345  | 2014-04-01
12345  | 2014-05-01
12345  | 2014-06-01
12345  | 2014-07-01
12345  | 2014-08-01
12345  | 2014-09-01
12345  | 2014-10-01
12345  | 2014-11-01
67890  | 2014-03-01
67890  | 2014-04-01
67890  | 2014-05-01

Thanks so much for the help.

2

There are 2 answers

3
Clodoaldo Neto On BEST ANSWER

No need for a CTE or a lateral join:

select
    person,
    generate_series(
        date_trunc('month', startdate), 
        enddate, '1 month'
    )::date as month
from rfem
order by 1, 2
;
 person |   month    
--------+------------
  12345 | 2014-04-01
  12345 | 2014-05-01
  12345 | 2014-06-01
  12345 | 2014-07-01
  12345 | 2014-08-01
  12345 | 2014-09-01
  12345 | 2014-10-01
  12345 | 2014-11-01
  67890 | 2014-03-01
  67890 | 2014-04-01
  67890 | 2014-05-01
0
Kamil Gosciminski On

Calculate minimum and maximum dates for each person with first days of months then generate monthly-based ranges between those dates using generate_series:

WITH date_ranges AS (
SELECT 
  person,
  min(date_trunc('month', startdate))::timestamptz AS min_start,
  max(date_trunc('month', enddate))::timestamptz AS max_end
FROM person_table
GROUP BY 1
)
SELECT 
  dr.person,
  ser.month::DATE as month
FROM date_ranges AS dr,
     generate_series(min_start, max_end, '1 month') AS ser(month)

Output

 person |   month
--------+------------
  12345 | 2014-04-01
  12345 | 2014-05-01
  12345 | 2014-06-01
  12345 | 2014-07-01
  12345 | 2014-08-01
  12345 | 2014-09-01
  12345 | 2014-10-01
  12345 | 2014-11-01
  67890 | 2014-03-01
  67890 | 2014-04-01
  67890 | 2014-05-01

How it works? Implicit LATERAL JOIN for function call forces the calculation for every row from the input.

This solution takes into account that you may have more than 1 row per each person with dates and it takes the highest possible range there is.