I am using the code below to keep only the most recent timestamp per id in a table that has ~4,200 observations and ~4,000 unique ids. The table has ~2,700 columns, and some of the ids have multiple timestamps. The script takes 306 seconds to finish.

DT[, .SD[which.max(`timestamp`)], id]

However, it takes only a fraction of a second when I use .I to determine the rows that have the most recent timestamp per id.

DT[, .I[which.max(`timestamp`)], id]
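
The .I call above only returns row numbers, not the rows themselves. As a minimal sketch of how those row numbers could be used to subset the full table in one step: the toy table, its `value` column, and the names `idx` and `latest` are made up for illustration and are not part of my real data.

```r
library(data.table)

# Hypothetical toy table; the real DT has ~4,200 rows and ~2,700 columns.
DT <- data.table(
  id        = c(1L, 1L, 2L, 3L, 3L),
  timestamp = as.POSIXct(c("2020-01-01", "2020-02-01", "2020-01-15",
                           "2020-03-01", "2020-01-10"), tz = "UTC"),
  value     = c(10, 20, 30, 40, 50)
)

# Step 1: for each id, find the row number (within the whole table)
# of the latest timestamp. The unnamed result column defaults to V1.
idx <- DT[, .I[which.max(timestamp)], by = id]$V1

# Step 2: subset the whole table once using those row numbers.
latest <- DT[idx]
print(latest)
```

On this toy table, `latest` holds one row per id. I have not benchmarked this form on the full table.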

Why is .SD so slow here, and what is the best way to retain only the most recent observation per id?
