MySQL - Keep one row per minute

1.2k views Asked by At

I have a very big table in MySQL which contains rows which are timestamped up to milliseconds..

I have a date column which contains dates (called date)

2011-01-13  

and another column which contains the time, in hours, minutes, seconds, milliseconds.. (called time)

01:11:43.381

since I have too many records per minute, is it possible that I keep one entry per minute and delete the other entries in the same minute ? (I could keep the first entry for every minute for example)

How can I do that ? I couldn't find anything online although I searched a bit :(

Thanks a lot !

3

There are 3 answers

4
Gordon Linoff On BEST ANSWER

You can do what you want with delete and aggregation:

delete bt
    from bigtable bt join
         (select date, min(time) as time
          from bigtable
          group by date, hour(time), minute(time)
         ) btt
         on btt.date = bt.date and
            hour(bt.time) = hour(btt.time) and
            minute(bt.time) = minute(btt.time) and
            bt.time <> btt.mintime;

I am not promising you that this is efficient, but it should do what you want.

0
kamoor On

You should be able to delete every rows except first one meet the condition you stated using a simple delete statement below.

DELETE FROM TABLE1 WHERE ID NOT IN (
        SELECT I FROM (
               SELECT  MIN(PRIMARY_KEY) I, 
                       CONCAT( DATE_FIELD ,' ', SUBSTR(TIME_FIELD, 1, 5)) DT  
               FROM TABLE1 GROUP BY DT) AS X)

Basically I are trying to concatenate Date and "Time until minute" and filter the minimum identifier in inner query, Then delete everything except the minimum ID selected.

Important Note: Not inserting unwanted records multiple times will be the best way to solve this issue but the person who asked the question already mentioned thats not an option for him.

4
tadman On

Having your date and time split up into two columns can prove exceptionally annoying, so I'd avoid doing that unless you had a compelling technical reason.

As for your minute problem, what you need is a column you can enforce a UNIQUE constraint on. The simplest approach is to use a minute serial, or the epoch time in seconds divided by 60.

Add a column called epoch_minute of type INT, give it a UNIQUE index constraint, and populate it with the equivalent of that value. Then you can use an INSERT IGNORE or INSERT ... ON DUPLICATE KEY type trigger to handle repetition per minute. This depends on if you want to keep the first record, or the last record for each minute.

You can migrate your old values using the UNIX_TIMESTAMP() function:

UPDATE table_name SET epoch_minute = UNIX_TIMESTAMP(date_column) / 60

As an example, given a table called table_name, prepare a new table to receive the values that has a unique column constraint:

CREATE TABLE _table_name LIKE table_name;
ALTER TABLE _table_name ADD COLUMN epoch_minute INT;
CREATE UNIQUE INDEX index_table_name_epoch_minute ON _table_name (epoch_minute);

Then you can use INSERT IGNORE to skip duplicates:

INSERT IGNORE INTO _table_name (column_a, column_b, epoch_minute)
  SELECT column_a, column_b, UNIX_TIMESTAMP(date_column) / 60
    FROM table_name;

Lastly, swap tables:

RENAME TABLE table_name TO table_name_old, _table_name to table_name;