I am trying to make a stacked barplot using R. The main sticking point is using the colors from the color column in the plot appropriately.
Requirements of the plot:
- Each bar (x axis) should represent a time.
- Each species should be drawn in its own color (given by the color column), with its height within the bar reflecting its abundance (y axis).
- Within each bar, species in the same phylum should be grouped together.
- Setting the width of the bars would be really cool, but not necessary.
Characteristics of the dataset:
- Each species has its own color, and the colors of species within a phylum form a gradient.
- The abundances of species within a time sum to 100.
- Not every species is present at every time.
- There are 7 times, 8 phyla, 132 species
Other ideas on how to represent these data are welcome.
Representative data:
phyla species abundance color time
Actinobacteria Bifidobacterium_adolescentis 18.73529 #F7FBFF D30
Firmicutes Faecalibacterium_prausnitzii 14.118 #F7FCF5 D30
Firmicutes Catenibacterium_mitsuokai 12.51944 #F3F9F2 D30
Bacteroidetes Bacteroides_ovatus 7.52241 #FFF5EB D30
Firmicutes Faecalibacterium_prausnitzii 21.11866 #F7FCF5 D7
Firmicutes Ruminococcus_sp_5_1_39BFAA 13.54397 #92B09C D7
Actinobacteria Bifidobacterium_adolescentis 10.21989 #F7FBFF D7
Actinobacteria Bifidobacterium_adolescentis 38.17028 #F7FBFF D90
Firmicutes Catenibacterium_mitsuokai 11.04982 #F3F9F2 D90
Firmicutes Faecalibacterium_prausnitzii 9.82507 #F7FCF5 D90
Actinobacteria Collinsella_aerofaciens 5.2334 #D4DEE9 D90
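In case it helps with reproducing this, here is a minimal sketch of reading the sample rows above into a data frame. The names sample_text and df are just placeholders for this example. One thing to watch for: read.table treats '#' as a comment character by default, so the hex colors need comment.char = "" or every row is cut off at the color column.

```r
# Sample rows copied from the table above. This is only a subset of the real
# data, so the abundances within a time do not sum to 100 here.
sample_text <- "phyla species abundance color time
Actinobacteria Bifidobacterium_adolescentis 18.73529 #F7FBFF D30
Firmicutes Faecalibacterium_prausnitzii 14.118 #F7FCF5 D30
Firmicutes Catenibacterium_mitsuokai 12.51944 #F3F9F2 D30
Bacteroidetes Bacteroides_ovatus 7.52241 #FFF5EB D30
Firmicutes Faecalibacterium_prausnitzii 21.11866 #F7FCF5 D7
Firmicutes Ruminococcus_sp_5_1_39BFAA 13.54397 #92B09C D7
Actinobacteria Bifidobacterium_adolescentis 10.21989 #F7FBFF D7
Actinobacteria Bifidobacterium_adolescentis 38.17028 #F7FBFF D90
Firmicutes Catenibacterium_mitsuokai 11.04982 #F3F9F2 D90
Firmicutes Faecalibacterium_prausnitzii 9.82507 #F7FCF5 D90
Actinobacteria Collinsella_aerofaciens 5.2334 #D4DEE9 D90
"

# comment.char = "" keeps the '#' of the hex colors from being read as a comment.
df <- read.table(text = sample_text, header = TRUE, comment.char = "",
                 stringsAsFactors = FALSE)

# Quick checks: column types and total abundance per time.
str(df)
totals <- aggregate(abundance ~ time, df, sum)
print(totals)
```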
Thank you in advance; I am banging my head against the wall with this.
Code thanks to Robert.
# For each data frame in dfs:
#   1. reshape it into a matrix (species as row names, times as columns, abundances as values)
#   2. put the time columns in the correct order
#   3. draw a stacked barplot whose bar widths reflect the Shannon index
#   4. save the plot to a file named after the matching entry of lhs
for (i in 1:n) {
  # sum abundances, then sort by phylum so species of one phylum sit next to each other
  phyl <- aggregate(abundance ~ phyla + species + color + time, dfs[[i]], sum)
  phyl <- phyl[with(phyl, order(phyla, species, time)), ]
  # one row per species, one abundance.<time> column per time
  wide <- reshape(phyl, idvar = c("phyla", "species", "color"),
                  timevar = "time", direction = "wide")
  wide[is.na(wide)] <- 0  # species absent at a time get abundance 0
  # drop the three id columns and strip the "abundance." prefix from the column names
  res1 <- as.matrix(wide[, -(1:3)])
  colnames(res1) <- sub("abundance.", "", colnames(res1), fixed = TRUE)
  rownames(res1) <- wide$species
  res1 <- res1[, c('E', 'FMT', 'PA', 'PF', 'D7', 'D30', 'D90')]
  bar.width <- as.matrix(div.dfs[[i]]['frac'])
  mypath <- file.path(output.path, paste(project.name, "_", lhs[i], ".tiff", sep = ""))
  tiff(file = mypath)
  mytitle <- paste(project.name, lhs[i])
  # legend.text = FALSE suppresses the legend here (args.legend is then unused);
  # the legend is exported separately
  barplot(res1, col = wide$color, beside = FALSE, width = c(bar.width), main = mytitle,
          legend.text = FALSE, args.legend = list(x = "top", bty = "n", cex = .6, ncol = 2))
  dev.off()
  rm(res1)
}
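For anyone who wants to try the idea without my full data, here is a self-contained sketch of the same steps on a few of the sample rows. It uses tapply instead of reshape, which is a swap on my part and not what the loop above does. The Shannon index is computed as H = -sum(p * log(p)) per time, and scaling the widths by max(H) is only an assumption for illustration; it is not how the 'frac' column in div.dfs was produced. The names key, mat, shannon, widths and out_file are made up for this example.

```r
# A few rows from the sample table (two times only).
df <- data.frame(
  phyla = c("Actinobacteria", "Firmicutes", "Firmicutes", "Bacteroidetes",
            "Firmicutes", "Firmicutes", "Actinobacteria"),
  species = c("Bifidobacterium_adolescentis", "Faecalibacterium_prausnitzii",
              "Catenibacterium_mitsuokai", "Bacteroides_ovatus",
              "Faecalibacterium_prausnitzii", "Ruminococcus_sp_5_1_39BFAA",
              "Bifidobacterium_adolescentis"),
  abundance = c(18.73529, 14.118, 12.51944, 7.52241, 21.11866, 13.54397, 10.21989),
  color = c("#F7FBFF", "#F7FCF5", "#F3F9F2", "#FFF5EB", "#F7FCF5", "#92B09C", "#F7FBFF"),
  time = c("D30", "D30", "D30", "D30", "D7", "D7", "D7"),
  stringsAsFactors = FALSE
)

# One row per species, sorted by phylum so related species stack together.
key <- unique(df[, c("phyla", "species", "color")])
key <- key[order(key$phyla, key$species), ]

# Species-by-time matrix of summed abundances; missing combinations become 0.
time_order <- c("D7", "D30")
mat <- tapply(df$abundance, list(df$species, df$time), sum)
mat[is.na(mat)] <- 0
mat <- mat[key$species, time_order, drop = FALSE]

# Shannon index per time, computed from the relative abundances.
shannon <- apply(mat, 2, function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
})
widths <- shannon / max(shannon)  # assumed scaling, for illustration only

# Stacked barplot: one color per row (species), one bar per column (time).
out_file <- file.path(tempdir(), "stacked_sketch.pdf")
pdf(out_file)
barplot(mat, col = key$color, width = widths, beside = FALSE,
        main = "Stacked abundance sketch", ylab = "abundance")
dev.off()
```

Because the rows of mat follow key, col = key$color lines up with the species, and the sort by phyla keeps each phylum's gradient together within a bar.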
# make the legend and export it as an EPS file
setwd(output.path)
plot_colors <- database$color
legend_text <- database$species
setEPS()
postscript('legend.eps')
plot.new()
par(xpd = TRUE)
legend("center", legend = legend_text, text.width = max(strwidth(legend_text)),
       col = plot_colors, lwd = 1, cex = .2, horiz = FALSE, ncol = 2, bty = 'n')
par(xpd = FALSE)
dev.off()
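A possible variation on the legend, as a sketch only: the code above draws each entry as a thin line (col with lwd = 1), and very pale colors such as #F7FBFF may be hard to see that way. Using fill draws a bordered box per species instead. The species_key data frame and out_file below are made up from the sample rows for this example and stand in for my real database object.

```r
# Hypothetical lookup table built from the sample rows: one row per species.
species_key <- data.frame(
  phyla = c("Firmicutes", "Actinobacteria", "Bacteroidetes",
            "Firmicutes", "Actinobacteria", "Firmicutes"),
  species = c("Faecalibacterium_prausnitzii", "Bifidobacterium_adolescentis",
              "Bacteroides_ovatus", "Catenibacterium_mitsuokai",
              "Collinsella_aerofaciens", "Ruminococcus_sp_5_1_39BFAA"),
  color = c("#F7FCF5", "#F7FBFF", "#FFF5EB", "#F3F9F2", "#D4DEE9", "#92B09C"),
  stringsAsFactors = FALSE
)

# Sort by phylum so the legend order matches the stacking order in the bars.
species_key <- species_key[order(species_key$phyla, species_key$species), ]

# Draw only the legend on an empty page, with a filled box per species.
out_file <- file.path(tempdir(), "legend_sketch.pdf")
pdf(out_file)
plot.new()
legend("center", legend = species_key$species, fill = species_key$color,
       border = "grey50", cex = 0.6, ncol = 2, bty = "n")
dev.off()
```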
[Image: the barplot without grouping by phyla]
[Image: the barplot with species grouped by phyla]