How to assess the relative abundance of specific taxa in phyloseq?


I have computed relative abundances for all samples (X). I have also used subset_taxa on the overall object to pull out specific genera (Y). How do I now measure the relative abundance of Y in X?

ps <- phyloseq(ASV, TAX, mapfile, tree)
relabun <- transform_sample_counts(ps, function(x) x / sum(x))
ps_genusP <- subset_taxa(ps, Genus == "D_5__Flavisolibacter" |
                             Genus == "D_5__Halomonas" |
                             Genus == "D_5__Thiobacillus" |
                             Genus == "D_5__Sphingomonas" |
                             Genus == "D_5__Bacillus" |
                             Genus == "D_5__uncultured Acidobacterium sp." |
                             Genus == "D_5__Bradyrhizobium" |
                             Genus == "D_5__Ohtaekwangia" |
                             Genus == "D_5__Steroidobacter")

How do I find the relative abundance of ps_genusP genera in the ps (total community)?

Thanks


There is 1 answer

Zina On

First of all, you created your new phyloseq object (ps_genusP) from ps rather than from relabun, so ps_genusP holds raw counts instead of relative abundances. You might also want to agglomerate your data at the genus level, so that each genus is represented by a single row rather than by several ASVs.
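To see why the order of operations matters, here is a minimal base R sketch on a made-up count matrix (the genus names, sample names, and counts are all hypothetical, and no phyloseq object is involved). It normalizes first and subsets second, which keeps each proportion relative to the whole community, and contrasts that with normalizing after subsetting.

```r
# Toy count table: rows are genera, columns are samples (hypothetical values).
counts <- matrix(
  c(10, 30,
    20, 10,
    70, 60),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GenusA", "GenusB", "GenusC"), c("S1", "S2"))
)

# Normalize each sample (column) to proportions of the whole community.
relabun <- sweep(counts, 2, colSums(counts), "/")

targets <- c("GenusA", "GenusB")

# Subsetting after normalizing keeps proportions relative to the full community.
rel_in_community <- relabun[targets, , drop = FALSE]

# Per-sample share of the community held by the target genera combined.
target_share <- colSums(rel_in_community)

# Normalizing after subsetting instead forces the subset to sum to 1 per sample.
sub_counts <- counts[targets, , drop = FALSE]
rel_within_subset <- sweep(sub_counts, 2, colSums(sub_counts), "/")

print(target_share)
print(colSums(rel_within_subset))
```

With a real phyloseq object, calling sample_sums() on a subset taken from the relative-abundance object should give the analogous per-sample total.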

As for your question, my favorite way is to transform my phyloseq object into a dataframe and then use the tidyverse functions to access whatever information I need.

So, in your shoes, I would do:

library(phyloseq)
library(dplyr)
library(tidyr)

relabun.ps <- transform_sample_counts(ps,function(x) x / sum(x))
ps_genus <- tax_glom(relabun.ps, taxrank = "Genus", NArm = FALSE)
ps_genusP <- subset_taxa(ps_genus, Genus %in% c("D_5__Flavisolibacter", "D_5__Halomonas",
                                                "D_5__Thiobacillus", "D_5__Sphingomonas",
                                                "D_5__Bacillus", "D_5__uncultured Acidobacterium sp.",
                                                "D_5__Bradyrhizobium", "D_5__Ohtaekwangia",
                                                "D_5__Steroidobacter"))
genus.df <- psmelt(ps_genusP)
head(genus.df)
names(genus.df)  # list the columns, to choose the factors you want to group by

# Factor1 and Factor2 are placeholders for sample variables in your own data
MySummary <- genus.df %>%
  group_by(Factor1, Factor2, Genus) %>%
  summarize(mean_abund = mean(Abundance, na.rm = TRUE))
head(MySummary)

Now you will have a dataframe that looks like this:

# A tibble: 6 x 4
# Groups: Factor1, Factor2

  Factor1 Factor2 Genus  mean_abund
1 Level1  LevelA  Genus1    0.107
2 Level1  LevelA  Genus2    8.83
3 Level1  LevelB  Genus1    0.0101
4 Level1  LevelB  Genus2   17.2
5 Level2  LevelA  Genus1    0.533
6 Level2  LevelA  Genus2    0.00121

You can then use it to extract the information you need:

MySummary[MySummary$Genus == "Genus1" & MySummary$Factor1 == "Level1" , ]
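If what you are after is the combined share of the selected genera in each sample, rather than a mean per genus, one possible approach is to sum Abundance per sample in the melted data frame. The sketch below uses a small hypothetical data frame shaped like psmelt() output (only the Sample, Genus, and Abundance columns, with invented values) and assumes Abundance was already converted to proportions of the whole community before subsetting.

```r
library(dplyr)

# Hypothetical stand-in for psmelt(ps_genusP): one row per sample and genus,
# with Abundance already expressed as a proportion of the whole community.
genus.df <- data.frame(
  Sample = c("S1", "S1", "S2", "S2"),
  Genus = c("Genus1", "Genus2", "Genus1", "Genus2"),
  Abundance = c(0.10, 0.20, 0.30, 0.10)
)

# Sum the selected genera within each sample to get their combined share.
per_sample <- genus.df %>%
  group_by(Sample) %>%
  summarize(total_abund = sum(Abundance), .groups = "drop")

print(per_sample)
```

In your own data you would replace the toy data frame with the result of psmelt(ps_genusP) and could add sample variables to group_by() to compare groups.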

I hope this can be helpful. Cheers