Extracting data of recent n active days of every user from dataframe of gaming activity

111 views Asked by At

So, first of all, data looks like this :

enter image description here

Sample Data if you would like to work

Every record in the table represents a specific user playing a specific type of game with some amount of money. So, it is possible that a user might play one day and not the other day.

What I would like to extract is ...activity of users in their last active n days(let's say n is 15). To be clear, I'll take an example: Let's say a user with account_id 12345 has played some games on Sep 16,17,18,19,25,26 2020, and on Oct 8 2020(no of the games he played doesn't matter, he has played at least one game on those days ). What I had done till now is I've extracted the activity of that player from the recent date he played(Oct 8 2020 to 15 days back, i.e, 24 Sep 2020). So, I'll only have the gaming activity of the player on 25,26 Sep 2020, and Oct 8 2020. But what I want is the activity of the past 15 active days which includes Oct 8 2020 and Sep 16,17,18,19,25,26 and 8 active days before that(since I want 15 days and the dates I've mentioned count to 7)

My approach for the activity of the last 15 days(I might be active or Inactive) is

WITH BASE AS( 
SELECT 
MAX(date) AS LastDate, 
account_id 
FROM aug 
GROUP BY account_id
) 
SELECT 
ga.account_id, 
ga.date
FROM aug GA 
JOIN BASE B ON b.account_id = ga.account_id 
WHERE ga.date >= DATE_SUB(b.LastDate, INTERVAL 15 DAY) 
  AND ga.date <= b.LastDate  

I'm unable to convert these last 15 days into the last active 15 days. Pls, guide me on this. Thanks in Advance :)

2

There are 2 answers

9
forpas On BEST ANSWER

There is no need for joins.
You have all the data you need in the table aug.
Use DENSE_RANK() window function to select the rows for the last 15 active days:

select account_id, date, Real_money, table_name
from (
  select *, dense_rank() over (order by date desc) dr
  from aug 
  where account_id = ?
) t
where dr <= 15 

Replace ? with the account_id that you want to search for.

If you want results for all the account_ids:

select account_id, date, Real_money, table_name
from (
  select *, dense_rank() over (partition by account_id order by date desc) dr
  from aug 
) t
where dr <= 15 

For versions of SQLite prior to 3.25.0, without window functions, create this index:

CREATE INDEX aug_account_id_date ON aug(account_id, date);

and try:

select a.* from aug a
where a.date >= coalesce(
  (
    select distinct date 
    from aug
    where account_id = a.account_id
    order by date desc limit 14, 1
  ), '0000-00-00'
)

or:

select a.* from aug a
where (select count(distinct date) from aug where account_id = a.account_id and date >= a.date) <= 15

See the demo.

2
Gordon Linoff On

If I understand correctly:

SELECT gab.*
FROM (SELECT ga.account_id, ga.date,
             DENSE_RANK() OVER (ORDER BY b.LastDate DESC) as seqnum
      FROM aug GA JOIN
           BASE b
           ON b.account_id = ga.account_id 
     ) gab
WHERE seqnum >= 15;