Mysql: update to create group of visit entries separated by no more than maxmium minutes

63 views Asked by At

I have the following mysql table which records user visit times:

enter image description here

I'd like to update visit_id so that a "visit" is a group of entries for the same user where no entry is more than 20 minutes after the previous, and the visit_id for each entry is the visit_id of the first entry of that visit.

I hope that this example helps make it clear. After the update, the example table above should become:

enter image description here

I'm able to create an update which sets the visit_id to the visit_id of the previous entry for that user if the previous entry occurred less than 20 minutes before.

But I can't solve how to create a sql-only update to deal with the "cascading" effect, in other words, that the visit_id needs to find the earliest entry for that user_id which occurs before a 20-minute gap. It might be the visit_id of that entry (if it's the first of that visit), or the visit_id of previous entry, or of the previous-previous, or of the previous-previous-previous-previous, etc.

How could I write this update?

1

There are 1 answers

3
krokodilko On BEST ANSWER

Try this query (but don't try it on production data, but rather on a copy of this data):

update tabb_after_update tabb, 
(
   select t.*,
       ( SELECT min( visit_id )
         FROM tabb_after_update t1
         WHERE t1.user_id = t.user_id
         AND t1.time_of_visit <= t.time_of_visit
         AND t1.time_of_visit >= subtime( t.time_of_visit, '00:20' )
        ) new_id
    from tabb_after_update t
) tabb1
SET tabb.visit_id = tabb1.new_id
WHERE tabb.user_id = tabb1.user_id
  AND tabb.visit_id = tabb1.visit_id
;

SQLFiddle demo --> http://www.sqlfiddle.com/#!2/caa08/1




------ EDIT -----

Another version that "joins gaps" into one group if gaps are <= 20 minutes.

set @last_uid = 0;
set @last_tm = '00:00';
set @last_vid = 0;
update tabb_after_update tabb, 
(
        select t.* ,
               case when @last_uid = user_id 
                         AND cast( @last_tm as time) >= subtime( time_of_visit, '00:20' )
                    then if( (@last_tm := time_of_visit ), @last_vid, @last_vid )
                    else 
                         if( (@last_uid := user_id) + 
                             (@last_vid := visit_id ) +
                             (@last_tm := time_of_visit ),
                             @last_vid, @last_vid )
               end new_id
        from tabb_after_update t
        order by user_id, time_of_visit
) tabb1
SET tabb.visit_id = tabb1.new_id
WHERE tabb.user_id = tabb1.user_id
  AND tabb.visit_id = tabb1.visit_id
;

SQLFiddle demo --> http://www.sqlfiddle.com/#!9/39f03/1
In this demo the user 1 has entries from 17:10 to 17:50 with "gaps" betwen records less than 20 minutes, and the query "combines" all these records into one group.