When calculating the statistical mode of a vector, there is often more than one mode:
c(1, 1, 2, 2, 3, 4) # mode is both 1 and 2
In such scenarios, when I need to decide between two (or more) possible values, I use fmode() from the {collapse} package. Through its ties argument, it offers 3 possible methods for deciding:
ties: an integer or character string specifying the method to resolve ties between multiple possible modes, i.e. multiple values with the maximum frequency or sum of weights:

Int.  String  Description
1     first   take the first occurring mode.
2     min     take the smallest of the possible modes.
3     max     take the largest of the possible modes.
Example of fmode()
library(collapse)
my_vec <- c(1, 1, 3, 4, 5, 5, -6, -6, 2, 2) # 4 modes here: 1, 5, -6, 2
fmode(my_vec, ties = "first")
#> [1] 1
fmode(my_vec, ties = "min")
#> [1] -6
fmode(my_vec, ties = "max")
#> [1] 5
My Question
I'm looking for a "last" method — i.e., whenever there's more than one mode, return the "last" mode. But unfortunately, fmode() doesn't have a "last" method.
So, returning to my example, given the vector:
my_vec <- c(1, 1, 3, 4, 5, 5, -6, -6, 2, 2)
I want a function that does
custom_mode_func(my_vec, method = "last")
## [1] 2
The only option you have with collapse is to sort the data beforehand, e.g.
The reason rev doesn't work is that collapse grouping doesn't split the data. It only determines which group each row belongs to, and then computes statistics on all groups simultaneously using running algorithms in C++ (e.g. the grouped computation is done by fmode itself). So in your code, rev is actually executed before the grouping and reverses the entire vector. In this case, a native data.table implementation calling fmode.default directly (to optimize on method dispatch) would probably be the fastest solution. I can think about adding a "last" mode if I find time for that.
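For the ungrouped case in the question, a minimal base R sketch of the requested behaviour might look like the following. It is not part of collapse: custom_mode_func and its method argument are the hypothetical names taken from the question, and "last" is assumed to mean "the tied mode that you meet first when scanning the vector from the end", which matches the expected result of 2 for my_vec. It will also be slower than fmode() on large vectors.

```r
# Hypothetical helper (name taken from the question), base R only.
# Assumption: "last" = the "first" rule applied to the reversed vector.
custom_mode_func <- function(x, method = c("first", "last", "min", "max")) {
  method <- match.arg(method)
  x <- x[!is.na(x)]                      # ignore missing values
  if (length(x) == 0L) return(NA)

  # Reduce "last" to "first" by scanning the vector from the end
  if (method == "last") {
    x <- rev(x)
    method <- "first"
  }

  ux <- unique(x)                        # distinct values, in order of first appearance
  counts <- tabulate(match(x, ux), nbins = length(ux))
  modes <- ux[counts == max(counts)]     # all values tied for the highest frequency

  switch(method,
         first = modes[1L],
         min   = min(modes),
         max   = max(modes))
}

my_vec <- c(1, 1, 3, 4, 5, 5, -6, -6, 2, 2)
custom_mode_func(my_vec, method = "last")
#> [1] 2
```

Because each call receives only the values it is given, reversing inside the function affects just that one vector, unlike a rev() that runs before grouped computation. Applying it per group, for example through tapply(), would therefore reverse within each group, at the cost of splitting the data.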