I would like to compare two sets efficiently using the setdiff and intersect functions, but they are not working the way I want. I would like to compare the elements of two sets and see which elements are different.
For example:
Aset = c("AAAAA", "AABBB", "AAABB", "BBBBB")
Bset = c("AAABB" ,"AAABB", "BBBAA", "BBBBB")
# present in Aset but not in Bset
setdiff(Aset, Bset)
[1] "AAAAA" "AABBB"
#present in Bset but not in Aset
setdiff(Bset, Aset)
[1] "BBBAA"
# present in both Aset and Bset
intersect(Aset, Bset)
[1] "AAABB" "BBBBB"
However, when the sets contain repeated values, these functions treat each repeated value as a single element (which is mathematically correct), but I want duplicates to be counted as separate elements when matching.
Cset = c("AAAAA", "BBBBB", "AAABB", "BBBBB")
Dset = c("AAABB" ,"AAABB", "ABBBB", "BBBBB")
# present in Cset but not in Dset
setdiff(Cset, Dset)
[1] "AAAAA"
Cset contains one more "BBBBB" than Dset, so I want an alternative function that takes duplicated values into account and gives something like this:
[1] "AAAAA" "BBBBB"
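To make the behavior I have in mind concrete, here is a minimal sketch of a duplicate-aware difference. The name `multiset_diff` is hypothetical (it is not a base R function), and numbering the repeated values with `ave` and comparing them against counts from `table` is only one possible way to do this. It assumes character vectors without NA values.

```r
# Hypothetical helper (not part of base R): duplicate-aware setdiff.
# Keeps the copies of each value in x that exceed the number of copies in y.
multiset_diff <- function(x, y) {
  # Occurrence index of each element among equal values in x (1st, 2nd, ... copy)
  x_occ <- ave(seq_along(x), x, FUN = seq_along)
  # Number of copies of each value in y
  y_counts <- table(y)
  # Copies available in y for each element of x (0 when the value is absent)
  n_in_y <- as.vector(y_counts[match(x, names(y_counts))])
  n_in_y[is.na(n_in_y)] <- 0L
  # A copy in x is unmatched when its occurrence index exceeds the count in y
  x[x_occ > n_in_y]
}

Cset <- c("AAAAA", "BBBBB", "AAABB", "BBBBB")
Dset <- c("AAABB", "AAABB", "ABBBB", "BBBBB")
multiset_diff(Cset, Dset)
# [1] "AAAAA" "BBBBB"
```

With this sketch the second "BBBBB" in Cset is reported because Dset only has one copy to match against.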
The intersect function shows similar behavior (which is correct by definition).
Eset = c("AAAAA", "BBBBB", "AAAAA", "BBBBB")
Fset = c("BBBBB" ,"AAAAA", "BBBBB", "AAAAA")
intersect(Eset, Fset)
[1] "AAAAA" "BBBBB"
What I would like to see is that all four elements match:
[1] "AAAAA" "BBBBB" "AAAAA" "BBBBB"
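The duplicate-aware intersection I am describing could be sketched in the same spirit. Again, `multiset_intersect` is a hypothetical name rather than an existing base R function, and the sketch assumes character vectors without NA values: each copy in the first vector is kept only while the second vector still has a copy left to pair it with.

```r
# Hypothetical helper (not part of base R): duplicate-aware intersect.
# Keeps each copy in x that can be paired with a distinct copy in y.
multiset_intersect <- function(x, y) {
  # Occurrence index of each element among equal values in x (1st, 2nd, ... copy)
  x_occ <- ave(seq_along(x), x, FUN = seq_along)
  # Number of copies of each value in y
  y_counts <- table(y)
  # Copies available in y for each element of x (0 when the value is absent)
  n_in_y <- as.vector(y_counts[match(x, names(y_counts))])
  n_in_y[is.na(n_in_y)] <- 0L
  # A copy in x is matched when its occurrence index does not exceed the count in y
  x[x_occ <= n_in_y]
}

Eset <- c("AAAAA", "BBBBB", "AAAAA", "BBBBB")
Fset <- c("BBBBB", "AAAAA", "BBBBB", "AAAAA")
multiset_intersect(Eset, Fset)
# [1] "AAAAA" "BBBBB" "AAAAA" "BBBBB"
```

The result follows the order of the first argument, so swapping the arguments can change the order but not the number of matched elements.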
I am looking for an alternative function that fits my needs.