Align barplot with boxplot in R


I would like to plot a distribution of counts using the barplot function in R and place a boxplot beneath it to show the median, quartiles, and outliers. A not-too-elegant solution exists for combining a histogram with a boxplot: http://rgraphgallery.blogspot.com/2013/04/rg-plotting-boxplot-and-histogram.html.

Many sources argue that numerical data should be plotted with histograms and categorical data with bar plots. My data are numerical, and in fact on a ratio scale (they are counts), but because they are discrete I want columns with gaps between them, not columns that touch, which seems to be the only option with hist().

I currently have the following, but the barplot and the boxplot do not quite align:

set.seed(476372)
counts1 <- rpois(10000, 3)
nf <- layout(mat = matrix(c(1, 2), 2, 1, byrow = TRUE), heights = c(3, 1))
par(mar = c(3.1, 3.1, 1.1, 2.1))
barplot(prop.table(table(counts1)))
boxplot(counts1, horizontal = TRUE, outline = TRUE, ylim = c(0, 12), frame = FALSE, width = 10)

My question: how can I make them align?


There are 2 answers

r2evans On

Another option, similar in spirit but a little more work, is to draw the bars yourself with rect() on the same x scale as the boxplot. This preserves the gaps between the bars:

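tbl <- prop.table(table(counts1))
vals <- as.numeric(names(tbl))          # count values that actually occur
left <- vals - 0.4                      # each bar is 0.8 wide, centred on its value
right <- vals + 0.4
bottom <- rep(0, length(left))
top <- as.numeric(tbl)
xlim <- c(-0.5, 0.5) + range(counts1)   # shared x range for both panels

nf <- layout(mat = matrix(c(1, 2), 2, 1, byrow = TRUE), heights = c(3, 1))
par(mar = c(3.1, 3.1, 1.1, 2.1))
plot(NA, xlim = xlim, ylim = c(0, max(tbl)), xlab = "", ylab = "")
rect(left, bottom, right, top, col = "gray")
# for a horizontal boxplot, ylim sets the range of the value (horizontal) axis
boxplot(counts1, horizontal = TRUE, outline = TRUE, ylim = xlim, frame = FALSE, width = 10)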
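For context on why drawing the bars in data coordinates lines up while the original barplot() call does not, here is a small check (a sketch, not part of the original answer). barplot() returns the x positions of the bar centres, and with its default bar width of 1 and spacing of 0.2 these are not the count values themselves. The names vals and mids are introduced here for illustration.

```r
set.seed(476372)
counts1 <- rpois(10000, 3)
tbl <- prop.table(table(counts1))

# plot = FALSE returns the bar centres without drawing anything
mids <- barplot(tbl, plot = FALSE)
vals <- as.numeric(names(tbl))

# compare each count value with the x position barplot() assigns to its bar
print(head(cbind(value = vals, bar_centre = as.numeric(mids)), 3))
```

With the defaults, the first three centres come out as 0.7, 1.9 and 3.1 for the counts 0, 1 and 2, so the bars drift to the right of the boxplot, which is drawn in data coordinates.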


Robert On

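Another possibility is a "fake" histogram: compute a histogram object without plotting it, then overwrite its components with the tabulated counts. The bars will touch, as in any histogram: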

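tbl <- table(counts1)
vals <- as.numeric(names(tbl))          # count values that actually occur
ht <- hist(counts1, breaks = 12, plot = FALSE)
ht$counts <- as.numeric(tbl)
ht$density <- as.numeric(prop.table(tbl))
# one unit-wide bin centred on each value, so there is one more break than counts;
# this assumes the observed values are consecutive integers
ht$breaks <- c(vals - 0.5, max(vals) + 0.5)
ht$mids <- vals

layout(mat = matrix(c(1, 2), 2, 1, byrow = TRUE), heights = c(3, 1))
par(mar = c(3.1, 3.1, 1.1, 2.1))
plot(ht, freq = FALSE, col = 3, main = "")
boxplot(counts1, horizontal = TRUE, outline = TRUE, ylim = range(ht$breaks), frame = FALSE, col = "green1", width = 10)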
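A related shortcut, offered as a sketch and not as Robert's method: hist() accepts explicit breaks, so half-integer breaks give one unit-wide bin centred on each count without editing the histogram object by hand. The names brks and h are introduced here for illustration, and the bars still touch.

```r
set.seed(476372)
counts1 <- rpois(10000, 3)

# half-integer breaks: one bin per integer in the observed range, centred on it
brks <- seq(min(counts1) - 0.5, max(counts1) + 0.5, by = 1)
h <- hist(counts1, breaks = brks, plot = FALSE)

# same two-panel layout and margins as above, same x range in both panels
layout(matrix(c(1, 2), 2, 1), heights = c(3, 1))
par(mar = c(3.1, 3.1, 1.1, 2.1))
plot(h, freq = FALSE, main = "", col = "gray", xlim = range(brks))
boxplot(counts1, horizontal = TRUE, ylim = range(brks), frame = FALSE)
```

Because both panels share the same margins and the same axis range, the bin centred on each count sits directly above that value on the boxplot axis.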
