phyloseq: Discrepancies in otu counts before and after using tax_glom


Maybe I missed something about how tax_glom works, but I could not find any information here or elsewhere on the web, so maybe someone here can help. I do not provide data, but I can on request. Here is the code highlighting the issue:

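library(phyloseq)
library(magrittr)  # provides %>% (dplyr would also work)

# Per-sample totals before agglomeration (assumes taxa are rows)
colSums(CYANO %>% otu_table())

# Agglomerate at the Genus rank, then recompute the per-sample totals
CYANO_gen <- CYANO %>%
  tax_glom(taxrank = "Genus")
colSums(CYANO_gen %>% otu_table())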

CYANO is a phyloseq object that I wanted to agglomerate at the Genus rank, but I noticed that one sample (named 100) was missing from a data visualization, which led me to check where the problem arose. 7 of the 54 samples show discrepancies, as shown in the last line of the attached image. Weird, isn't it?

Image: results given by the code above, plus 2 additional lines that show how large the discrepancies are and that not every sample is affected.

Thanks, Guillaume


There is 1 answer

Zina

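The NArm argument of tax_glom is TRUE by default, so taxa whose value at the chosen rank (here Genus) is NA are removed before agglomeration. Any sample containing reads from such taxa therefore loses counts, which explains why only some samples show a discrepancy. To avoid losing these observations, set NArm = FALSE. Cheers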
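To make the effect concrete, here is a minimal sketch on a made-up phyloseq object (the taxa names, sample names, and counts are hypothetical and not taken from the CYANO data). One taxon has no Genus assignment, so the default tax_glom call drops its reads, while NArm = FALSE keeps them.

```r
library(phyloseq)

# Hypothetical toy data: 4 taxa x 2 samples; taxon "t4" has Genus = NA
otu <- otu_table(
  matrix(c(10, 5,
           20, 0,
            3, 7,
            4, 8),
         nrow = 4, byrow = TRUE,
         dimnames = list(paste0("t", 1:4), c("s1", "s2"))),
  taxa_are_rows = TRUE
)
tax <- tax_table(
  matrix(c("Cyanobacteria", "Nostoc",
           "Cyanobacteria", "Nostoc",
           "Cyanobacteria", "Anabaena",
           "Cyanobacteria", NA),
         nrow = 4, byrow = TRUE,
         dimnames = list(paste0("t", 1:4), c("Phylum", "Genus")))
)
ps <- phyloseq(otu, tax)

# Per-sample totals before agglomeration
before <- sample_sums(ps)

# Default NArm = TRUE: t4 (Genus = NA) is pruned, so totals shrink
glom_default <- sample_sums(tax_glom(ps, taxrank = "Genus"))

# NArm = FALSE: t4 is kept as its own group, so totals are preserved
glom_keep_na <- sample_sums(tax_glom(ps, taxrank = "Genus", NArm = FALSE))

print(rbind(before, glom_default, glom_keep_na))
```

sample_sums is used here instead of colSums because it gives per-sample totals regardless of whether taxa are stored as rows or columns.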