Delete all records except the max value of a group

84 views Asked by At

I've got this table:

CREATE TABLE tblTest
(
    Id INT PRIMARY KEY NOT NULL,
    Weight INT NULL,
    Batch INT NULL,
    t_stamp DATETIME NULL
);

Let's pretend it was filled with values:

INSERT INTO BatchDataWillieNelson (WT401_Raw, Batch_Done_Signal,
                                   Batch_Number, t_stamp)
VALUES (31, 3994, 2, '2023-12-05 06:19:58.990'),
       (32, 4052, 2, '2023-12-05 06:37:24.883'),
       (33, 4000, 5, '2023-12-05 08:44:09.780'),
       (34, 4058, 5, '2023-12-05 08:58:59.683'),
       (35, 4032, 8, '2023-12-05 11:18:11.727'),
       (36, 3983, 11, '2023-12-05 13:20:35.167'),
       (37, 4013, 11, '2023-12-05 13:33:54.877'),
       (38, 3993, 14, '2023-12-05 15:29:50.060'),
       (39, 3470, 14, '2023-12-05 15:48:51.053'),
       (40, 3996, 14, '2023-12-05 17:20:09.893'),
       (41, 3348, 14, '2023-12-05 17:39:22.477'),
       (42, 3993, 2, '2023-12-05 19:48:42.117'),
       (43, 4054, 2, '2023-12-05 20:09:12.217'),
       (44, 3991, 23, '2023-12-05 22:55:45.020'),
       (45, 4065, 23, '2023-12-05 23:16:12.077'),
       (46, 3993, 26, '2023-12-06 02:56:29.713'),
       (47, 4033, 26, '2023-12-06 03:12:24.483'),
       (48, 3982, 26, '2023-12-06 06:06:45.720'),
       (49, 4075, 26, '2023-12-06 06:35:05.197'),
       (50, 3994, 2, '2023-12-06 08:02:58.887'),
       (51, 3029, 2, '2023-12-06 08:28:21.260'),
       (52, 3982, 2, '2023-12-06 09:49:23.497'),
       (53, 4038, 7, '2023-12-06 10:12:41.707')

I am using SQL Server 2008. I have a table that has extra records that were written into it by mistake. I need to remove these.

I need to evaluate the Weight column for the max value based on grouping of Batch_Number. The same Batch_Number can be used multiple times. So, without ordering the table, I was trying to look for the change in Batch_Number and treat those rows as a group to evaluate without having to cursor through every record.

Expected results:

Id Weight Batch t_stamp
32 4052 2 '2023-12-05 06:37:24.883'
34 4058 5 '2023-12-05 08:58:59.683'
35 4032 8 '2023-12-05 11:18:11.727'
37 4013 11 '2023-12-05 13:33:54.877'
40 3996 14 '2023-12-05 17:20:09.893'
43 4054 2 '2023-12-05 20:09:12.217'
49 4075 26 '2023-12-06 06:35:05.197'
50 3994 2 '2023-12-06 08:02:58.887'
53 4038 7 '2023-12-06 10:12:41.707'

Ok, sorry for the confusion in what I am asking. Look at the Batch "2"s. There are 3 sets of them. I don't want all three of them grouped together. They each have to be considered their own groups since they are separated by other batch numbers.

Id Weight Batch t_stamp expect Result
31 3994 2 '2023-12-05 06:19:58.990' grp 1 - Remove
32 4052 2 '2023-12-05 06:37:24.883' grp 1 - Keep
33 4000 5 '2023-12-05 08:44:09.780' grp 2 - Remove
34 4058 5 '2023-12-05 08:58:59.683' grp 2 - Keep
35 4032 8 '2023-12-05 11:18:11.727' grp 3 - Keep
36 3983 11 '2023-12-05 13:20:35.167' grp 4 - Remove
37 4013 11 '2023-12-05 13:33:54.877' grp 4 - Keep
38 3993 14 '2023-12-05 15:29:50.060' grp 5 - Remove
39 3470 14 '2023-12-05 15:48:51.053' grp 5 - Remove
40 3996 14 '2023-12-05 17:20:09.893' grp 5 - Keep
41 3348 14 '2023-12-05 17:39:22.477' grp 5 - Remove
42 3993 2 '2023-12-05 19:48:42.117' grp 6 - Remove
43 4054 2 '2023-12-05 20:09:12.217' grp 6 - Keep
44 3991 23 '2023-12-05 22:55:45.020' grp 7 - Remove
45 4065 23 '2023-12-05 23:16:12.077' grp 7 - Keep
46 3993 26 '2023-12-06 02:56:29.713' grp 8 - Remove
47 4033 26 '2023-12-06 03:12:24.483' grp 8 - Remove
48 3982 26 '2023-12-06 06:06:45.720' grp 8 - Remove
49 4075 26 '2023-12-06 06:35:05.197' grp 8 - Keep
50 3994 2 '2023-12-06 08:02:58.887' grp 9 - Keep
51 3029 2 '2023-12-06 08:28:21.260' grp 9 - Remove
52 3982 2 '2023-12-06 09:49:23.497' grp 9 - Remove
53 4038 7 '2023-12-06 10:12:41.707' grp 10 - Keep
3

There are 3 answers

3
Stu On BEST ANSWER

If your RDBMS supports operating directly on a table expression, such as SQL Server, the following is another simple approach:

delete from d
from (
    select *, Row_Number() over(partition by batch order by weight desc) rn
    from t
)d
where rn > 1;

See Fiddle demo

2
Beulah Evanjalin On
WITH MaxWeightPerBatch AS (
    SELECT Batch_Number, MAX(Weight) AS MaxWeight
    FROM tblTest
    GROUP BY Batch_Number
)
DELETE FROM tblTest
WHERE (Batch_Number, Weight) NOT IN (
    SELECT Batch_Number, MaxWeight
    FROM MaxWeightPerBatch
);

Find the maximum Weight for each Batch_Number group, and then DELETE the rows from tblTest that do not have the maximum Weight within their respective Batch_Number groups.

5
Thorsten Kettner On

Delete all rows for which exists another row with the same batch number and a higher weight/signal.

delete from batchdatawillienelson
where exists
(
  select null
  from batchdatawillienelson other
  where other.batch_number = batchdatawillienelson.batch_number
  and other.batch_done_signal > batchdatawillienelson.batch_done_signal
);