Adding group mean lines to geom_bar plot and including in legend


I want to create a bar graph that also shows the mean value for the bars in each group, and that shows the mean line in the legend.

I have been able to produce a bar chart with mean lines using the code below, which is fine, but I would like the mean lines to appear in the legend as well.

## The data to be graphed are the proportion of persons receiving a treatment
## (num = numerator) in each population (denom = denominator). The population is
## split into two age groups (Age) and further divided by a categorical
## variable, V1.

###SET UP DATAFRAME###
library(ggplot2)
df <- data.frame(V1 = c(rep(c("S1","S2","S3","S4","S5"),2)), 
               Age= c(rep(70,5),rep(80,5)), 
               num=c(5280,6570,5307,4894,4119,3377,4244,2999,2971,2322),
               denom=c(9984,12600,9425,8206,7227,7290,8808,6386,6206,5227))

df$prop<-df$num/df$denom*100

PopMean<-sum(df$num)/sum(df$denom)*100

df70<-df[df$Age==70,]
group70mean<-sum(df70$num)/sum(df70$denom)*100

df80<-df[df$Age==80,]
group80mean<-sum(df80$num)/sum(df80$denom)*100

df$PopMean<-c(rep(PopMean,10))
df$groupmeans<-c(rep(group70mean,5),rep(group80mean,5))
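The per-group means above are built by subsetting each age group by hand. As a minimal sketch (not part of the original post), the same pooled proportion can probably be computed in one step with base R's ave(), which avoids hard-coding the number of rows per group. The data frame is rebuilt here only so the block runs on its own.

```r
# Rebuild the question's data so this sketch is self-contained
df <- data.frame(V1 = rep(c("S1", "S2", "S3", "S4", "S5"), 2),
                 Age = c(rep(70, 5), rep(80, 5)),
                 num = c(5280, 6570, 5307, 4894, 4119, 3377, 4244, 2999, 2971, 2322),
                 denom = c(9984, 12600, 9425, 8206, 7227, 7290, 8808, 6386, 6206, 5227))

# Pooled proportion per age group: sum(num) / sum(denom) within each Age,
# repeated on every row belonging to that group
df$groupmeans <- ave(df$num, df$Age, FUN = sum) /
  ave(df$denom, df$Age, FUN = sum) * 100
```

This should give the same values as group70mean and group80mean, and it keeps working if the number of V1 categories per age group changes.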

I want the plot to look like this, but want the lines in the legend too, to be labelled as 'mean of group' or similar.

 #basic plot
 P<-ggplot(df, aes(x=factor(Age), y=prop, fill=factor(V1))) +
   geom_bar(position=position_dodge(), colour='black',stat="identity")    

 P
#### add mean lines
P + geom_errorbar(aes(ymax = groupmeans, ymin = groupmeans),
                  col = "red", lwd = 2)

Adding show.legend=TRUE overlays the error bars onto the factor legend, rather than separately. If there is a way of showing geom_errorbar separately in the legend this is probably the simplest solution.

I have also tried various things with geom_line. The syntax below produces a line for the population mean, but it runs from the centre of each group rather than covering the full width of the bars. It does produce a legend, but one showing a bar of colour rather than a line.

P + geom_line(aes(y = PopMean, group = PopMean, color = PopMean), lwd = 1)

If I try to draw lines for the group means, the lines are not visible (because each one consists of only a single point).

P + geom_line(aes(y = groupmeans, group = groupmeans, color = groupmeans))

I also tried to get round this with a facet plot, although this requires me to pretend my categorical variable is numeric to get it to work.

###set up new df
df2<-df
df2$V1<-c(rep(c(1,2,3,4,5),2))

P <- ggplot(df2, aes(x = factor(V1), y = prop, fill = factor(V1))) +
  geom_bar(position = position_dodge(),
           colour = 'black', stat = "identity", width = 1)

P + facet_grid(. ~ Age)

P + facet_grid(. ~ Age) +
  geom_line(aes(y = groupmeans, group = groupmeans, color = groupmeans))

Facetplot

This lets me show the mean lines using geom_line, so a legend does appear (although it doesn't look right: it shows a colour gradient rather than coloured lines). However, the lines still do not span the full width of the bars. Also, my x-axis now needs relabelling to show S1, S2, etc. rather than the numbers 1, 2, 3.

To sum up - is there a way of showing error bar lines separately in the legend?

If not, and I use faceting instead, how do I correct the legend appearance and relabel the axis with my categorical variable, and is it possible to get the line to span the full width of the plot?

Or is there an alternative solution that I am missing?

Thanks

Best answer, by timat

To get a legend entry for geom_errorbar, you need to map the colour aesthetic inside aes(). As you want only one category (here, red), I've created a dummy variable first:

df$mean <- "Mean"  # dummy column: a single level gives a single legend key
ggplot(df, aes(x=factor(Age), y=prop, fill=factor(V1))) +
  geom_bar(position=position_dodge(), colour='black', stat="identity") +
  geom_errorbar(aes(ymax=groupmeans,
                    ymin=groupmeans, colour=mean), lwd=2) +
  scale_colour_manual(name="", values="#ff0000")
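For the faceted variant raised in the question, one possible sketch (an assumption on my part, not something tested against the asker's setup) is to keep V1 categorical, so the axis shows S1 to S5 without relabelling, and to draw the group mean with geom_hline from a small per-group data frame. A horizontal line spans the whole panel, and mapping colour to a constant string inside aes() should give it its own legend key, as in the approach above. The names means and "Mean of group" are illustrative.

```r
library(ggplot2)

# Rebuild the question's data so this sketch is self-contained
df <- data.frame(V1 = rep(c("S1", "S2", "S3", "S4", "S5"), 2),
                 Age = c(rep(70, 5), rep(80, 5)),
                 num = c(5280, 6570, 5307, 4894, 4119, 3377, 4244, 2999, 2971, 2322),
                 denom = c(9984, 12600, 9425, 8206, 7227, 7290, 8808, 6386, 6206, 5227))
df$prop <- df$num / df$denom * 100

# One row per age group holding the pooled group mean (hypothetical helper table)
means <- aggregate(cbind(num, denom) ~ Age, data = df, FUN = sum)
means$groupmean <- means$num / means$denom * 100

p <- ggplot(df, aes(x = V1, y = prop, fill = V1)) +
  geom_bar(stat = "identity", colour = "black", width = 1) +
  # geom_hline does not inherit the plot aesthetics, so only the mappings
  # given here apply; the constant colour label creates a separate legend
  geom_hline(data = means,
             aes(yintercept = groupmean, colour = "Mean of group"),
             lwd = 1.5) +
  facet_grid(. ~ Age) +
  scale_colour_manual(name = "", values = c("Mean of group" = "red"))
p
```

Because means has an Age column, each facet gets only its own line. The trade-off is that the line extends slightly beyond the outer bars to the panel edges, which may or may not be the look you want.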
