How to find all values which only appear less than X times in a vector


I have a vector, in this case a character vector. I want all the elements which only appear once in the vector, but the solution should be generalizable for limits other than 1.

I can pick them manually if I use the table function. I thought that the solution would look something like

frequencies <- table(myVector)
myVector[??@frequencies <= 1] 

But first of all, I don't know the slot name that has to go in place of ??, and searching for documentation on the table object led me nowhere.

Second, while the documentation for table() says that it returns 'an object of class "table"', trying the above with some random word used instead of ??, I didn't get a "no such slot" error, but

Error: trying to get slot "frequencies" from an object of a basic class ("function") with no slots

which seems to indicate that the above won't work even if I knew the slot name.

So what is the correct solution, and how do I get at the separate columns in a table when I need them?


There are 2 answers

1
Konrad Rudolph On BEST ANSWER

You don’t need table for this:

vector <- c(1, 0, 2, 2, 3, 2, 1, 4)
threshold <- 1
Filter(function (elem) length(which(vector == elem)) <= threshold, vector)
# [1] 0 3 4
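
One caveat when generalizing: with a threshold above 1, the Filter() call returns a value once per occurrence, so a value that occurs twice is returned twice. A minimal sketch of a variant that lists each qualifying value once, assuming that is what is wanted (the names distinct, counts and rare are only illustrative):

```r
vector <- c(1, 0, 2, 2, 3, 2, 1, 4)
threshold <- 2

# Work on the distinct values so each one is reported at most once
distinct <- unique(vector)

# Count how often each distinct value occurs in the original vector
counts <- vapply(distinct, function(elem) sum(vector == elem), integer(1))

# Keep the values at or below the threshold; they stay numeric
rare <- distinct[counts <= threshold]
rare
# [1] 1 0 3 4
```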

You can use table, but then you get the result as character strings rather than numbers. You can convert them back, of course, but it’s somewhat less elegant:

tab <- table(vector)
names(tab)[tab <= threshold]
# [1] "0" "3" "4"
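
Converting the names back might look like the sketch below. It assumes the original values were numeric, and rare_values is just an illustrative name:

```r
vector <- c(1, 0, 2, 2, 3, 2, 1, 4)
threshold <- 1

tab <- table(vector)

# names(tab) holds the distinct values as character strings;
# as.numeric() turns them back into numbers
rare_values <- as.numeric(names(tab)[tab <= threshold])
rare_values
# [1] 0 3 4
```

For a character vector, as in the question, the as.numeric() step is unnecessary because the names already are the values.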
0
rumtscho On

D'oh, the documentation of the table function led me on a merry chase of imaginary object slots.

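The object returned by table() is an S3 object of class "table", essentially a named array of integer counts. It has no slots, so @ does not apply, and it can be indexed like an ordinary named vector. So my solution idea works when written as: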

threshold <- 1
frequencies <- table(myVector)
frequencies[frequencies <= threshold]
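
Note that the last line returns the counts, labelled with the values, rather than the values themselves. A small sketch of how to get at both, and how to subset the original vector, using made-up data (myVector's contents and the rare_* names are assumptions for illustration only):

```r
myVector <- c("a", "b", "a", "c", "d", "d", "d")  # made-up example data
threshold <- 1
frequencies <- table(myVector)

# A table is not an S4 object, which is why `@` cannot be used on it
isS4(frequencies)
# [1] FALSE

# Subsetting keeps the names alongside the counts
rare_counts <- frequencies[frequencies <= threshold]

# The values themselves are the names of the table
rare_names <- names(rare_counts)

# Elements of the original vector that occur at most `threshold` times
rare_elements <- myVector[myVector %in% rare_names]
rare_elements
# [1] "b" "c"
```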