Recursive Queries with Window functions

67 views Asked by At

So, I have a table:

   DATE     STORE   PRODUCT
01.01.2020  Store1  Product1
01.01.2020  Store1  Product2
01.01.2020  Store1  Product3
01.01.2020  Store1  Product4
01.02.2020  Store1  Product5
01.02.2020  Store1  Product6
01.02.2020  Store1  Product7
01.02.2020  Store1  Product8
01.01.2020  Store2  Product1
01.01.2020  Store2  Product2
01.01.2020  Store2  Product3
01.01.2020  Store2  Product4
03.01.2020  Store2  Product1
01.03.2020  Store2  Product2
02.03.2020  Store2  Product8
03.03.2020  Store2  Product10
06.08.2020  Store2  Product6
08.11.2020  Store2  Product7

I need to to Count Distinct stores. Moreover, if more than 20 days have passed between previous rows, then the value of store is unique

What I expect:

   date Store   Product Product_x
01.01.2020  Store1  Product1    1
01.01.2020  Store1  Product2    1
01.01.2020  Store1  Product3    1
01.01.2020  Store1  Product4    1
01.02.2020  Store1  Product5    1e
01.02.2020  Store1  Product6    1e
01.02.2020  Store1  Product7    1e
01.02.2020  Store1  Product8    1e
01.01.2020  Store2  Product1    2
01.01.2020  Store2  Product2    2
01.01.2020  Store2  Product3    2
01.01.2020  Store2  Product4    2
03.01.2020  Store2  Product1    2
01.03.2020  Store2  Product2    2e
02.03.2020  Store2  Product8    2e
03.03.2020  Store2  Product10   2e
06.08.2020  Store2  Product6    2ee
08.11.2020  Store2  Product7    2eee

What I have and my query:

    DATE_   STORE   PRODUCT TT
01.01.2020  Store1  Product1    1
01.01.2020  Store1  Product2    1
01.01.2020  Store1  Product3    1
01.01.2020  Store1  Product4    1
01.02.2020  Store1  Product8    1e
01.02.2020  Store1  Product7    1
01.02.2020  Store1  Product5    1
01.02.2020  Store1  Product6    1
01.01.2020  Store2  Product4    2
01.01.2020  Store2  Product3    2
01.01.2020  Store2  Product2    2
01.01.2020  Store2  Product1    2
03.01.2020  Store2  Product1    2
01.03.2020  Store2  Product2    2e
02.03.2020  Store2  Product8    2
03.03.2020  Store2  Product10   2
06.08.2020  Store2  Product6    2e
08.11.2020  Store2  Product7    2e

Query:

 select p.*, case
                  when p.date_ - lag(p.date_, 1) over ( partition by p.store order by p.date_)>20
                  then to_char(dense_rank() over (order by p.store)) || 'e'
                  else to_char(dense_rank() over (order by p.store)) end tt 
from table p;

How is it possible to solve this problem? And how to apply a recursive function to add "e" if the difference between the dates is more than 20 days?

1

There are 1 answers

0
MT0 On

How is it possible to solve this problem?

From Oracle 12, you can use MATCH_RECOGNIZE for row-by-row pattern matching:

SELECT dt,
       store,
       product,
       DENSE_RANK() OVER (ORDER BY store) || LPAD('e', mn - 1, 'e') AS tt
FROM   table_name
       MATCH_RECOGNIZE(
         PARTITION BY store
         ORDER BY dt, product
         MEASURES
           MATCH_NUMBER() AS mn
         ALL ROWS PER MATCH
         PATTERN (first_row within_20_day* )
         DEFINE
           within_20_day AS dt <= PREV(dt) + INTERVAL '20' DAY
       )

Or, in earlier versions, using the LAG and SUM analytic functions:

SELECT dt,
       store,
       product,
       DENSE_RANK() OVER (ORDER BY store)
       || LPAD(
            'e',
            SUM(has_changed_grp) OVER (PARTITION BY store ORDER BY dt, product) - 1,
            'e'
          ) AS tt
FROM   (
  SELECT dt,
         store,
         product,
         CASE
         WHEN dt <= LAG(dt) OVER (PARTITION BY store ORDER BY dt, product)
                    + INTERVAL '20' DAY
         THEN 0
         ELSE 1
         END AS has_changed_grp
  FROM   table_name
)

Which, for the sample data:

CREATE TABLE table_name (dt, STORE, PRODUCT) AS
SELECT DATE '2020-01-01', 'Store1', 'Product1' FROM DUAL UNION ALL
SELECT DATE '2020-01-01', 'Store1', 'Product2' FROM DUAL UNION ALL
SELECT DATE '2020-01-01', 'Store1', 'Product3' FROM DUAL UNION ALL
SELECT DATE '2020-01-01', 'Store1', 'Product4' FROM DUAL UNION ALL
SELECT DATE '2020-02-01', 'Store1', 'Product5' FROM DUAL UNION ALL
SELECT DATE '2020-02-01', 'Store1', 'Product6' FROM DUAL UNION ALL
SELECT DATE '2020-02-01', 'Store1', 'Product7' FROM DUAL UNION ALL
SELECT DATE '2020-02-01', 'Store1', 'Product8' FROM DUAL UNION ALL
SELECT DATE '2020-01-01', 'Store2', 'Product1' FROM DUAL UNION ALL
SELECT DATE '2020-01-01', 'Store2', 'Product2' FROM DUAL UNION ALL
SELECT DATE '2020-01-01', 'Store2', 'Product3' FROM DUAL UNION ALL
SELECT DATE '2020-01-01', 'Store2', 'Product4' FROM DUAL UNION ALL
SELECT DATE '2020-01-03', 'Store2', 'Product1' FROM DUAL UNION ALL
SELECT DATE '2020-03-01', 'Store2', 'Product2' FROM DUAL UNION ALL
SELECT DATE '2020-03-02', 'Store2', 'Product8' FROM DUAL UNION ALL
SELECT DATE '2020-03-03', 'Store2', 'Product10' FROM DUAL UNION ALL
SELECT DATE '2020-08-06', 'Store2', 'Product6' FROM DUAL UNION ALL
SELECT DATE '2020-11-08', 'Store2', 'Product7' FROM DUAL;

Both output:

DT STORE PRODUCT TT
2020-01-01 00:00:00 Store1 Product1 1
2020-01-01 00:00:00 Store1 Product2 1
2020-01-01 00:00:00 Store1 Product3 1
2020-01-01 00:00:00 Store1 Product4 1
2020-02-01 00:00:00 Store1 Product5 1e
2020-02-01 00:00:00 Store1 Product6 1e
2020-02-01 00:00:00 Store1 Product7 1e
2020-02-01 00:00:00 Store1 Product8 1e
2020-01-01 00:00:00 Store2 Product1 2
2020-01-01 00:00:00 Store2 Product2 2
2020-01-01 00:00:00 Store2 Product3 2
2020-01-01 00:00:00 Store2 Product4 2
2020-01-03 00:00:00 Store2 Product1 2
2020-03-01 00:00:00 Store2 Product2 2e
2020-03-02 00:00:00 Store2 Product8 2e
2020-03-03 00:00:00 Store2 Product10 2e
2020-08-06 00:00:00 Store2 Product6 2ee
2020-11-08 00:00:00 Store2 Product7 2eee

fiddle

And how to apply a recursive function to add "e" if the difference between the dates is more than 20 days?

You probably could do it with recursion but it is likely going to be harder than using MATCH_RECOGNIZE or alternatively, LAG/LEAD and analytic functions.