Adding Missing Dates To Each Location - MySQL 5.7

53 views Asked by At

So my SQL output is:

DUNF    2021-04-01  18
DUNF    2021-04-02  17
DUNF    2021-04-03  7
DUNF    2021-04-04  10
DUNF    2021-04-05  18
DUNF    2021-04-06  20
DUNF    2021-04-07  17
DUNF    2021-04-08  14
LEEDS   2021-04-01  4
LEEDS   2021-04-02  4
LEEDS   2021-04-03  5
LEEDS   2021-04-05  9
LEEDS   2021-04-06  3
LEEDS   2021-04-07  1
LEEDS   2021-04-08  3
etc.

But I need to have an entry for each day, even if the final field is a volume (sometimes nothing happens in that location on that day).

I've tried everything, but my SQL 5.7 knowledge is weak. I can generate a calendar table, but it doesn't fill the gaps in each site (you can see that Leeds is missing a day, but there are other sites that are missing dates too).

The code I'm using so far is:

SELECT location, begin_date, count(*) as volume
  FROM abs_raw_data
  WHERE begin_date >= (SELECT date_sub(max(begin_date), interval 7 day) from abs_raw_data)
  GROUP BY location, begin_date
  ORDER BY location asc, begin_date asc

How would I even go about doing this?

2

There are 2 answers

2
Akina On BEST ANSWER

It seems that you need

SELECT locations.location, 
       calendar.begin_date, 
       COALESCE(COUNT(abs_raw_data.location), 0) as volume
FROM ( {calendar generating subquery} ) AS calendar
CROSS JOIN ( SELECT DISTINCT location
             FROM abs_raw_data ) locations
CROSS JOIN ( SELECT MAX(begin_date) - INTERVAL 7 DAY AS begin_date 
             FROM abs_raw_data ) maxdate
LEFT JOIN abs_raw_data ON calendar.begin_date = abs_raw_data.begin_date 
                      AND locations.location = abs_raw_data.location
                      AND abs_raw_data.begin_date >= maxdate.begin_date 
GROUP BY locations.location, calendar.begin_date
ORDER BY locations.location ASC, calendar.begin_date ASC

The calendar generating subquery may be, for example,

SELECT abs_raw_data.begin_date - INTERVAL nums.n DAY
FROM ( SELECT MAX(begin_date) begin_date
       FROM abs_raw_data) abs_raw_data 
CROSS JOIN ( SELECT 0 n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
             UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 ) nums
-- https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=49ef261eaf01956973d4ce99f6203a91
0
Lajos Arpad On

The other answer suggest the generation of the dates. However, I would create a table called years as

years(year)

another called months as

months(month)

and another called days as

days(month, day)

fill all of them properly, including the 29th of February. So, if you want to generate days and months this way, then you will need a join, like:

select years.year, months.month, days.day
from years
join months
on 1
join days
on months.month = days.month and
   ((days.day <> 29) or (months.month <> 2) or (years.year % 4 > 0))
where (
      years.year > 2020 and
      years.year < 2025 and
      ) or 
      (
      years.year = 2020 and
      years.year = 2025 and
      months.month < 11 and
      months.month > 2 and
      ) or
      (
      years.year = 2020 and
      years.year = 2025 and
      months.month = 11 and
      months.month = 2 and
      days.day >= 2 and
      days.day <= 21 and
      )

Make sure that you insert any year that you may use, you insert all months of the year and all days of the month and then use the above as a subquery.