Reordering and colouring specific group data in boxplot, using R

The question may look similar to this one: Colouring different group data in boxplot using r, but I need to highlight specific columns and found this: http://www.r-graph-gallery.com/23-add-colors-to-specific-groups-of-a-boxplot/

Furthermore, I'm sorting the graphs by mean, similarly to this: Sorting a boxplot based on median value

The final result should be something like this:

bymean <- with(data, reorder(sample, trait, mean, na.rm = TRUE))
boxplot(trait~bymean, data=data,  
         col=ifelse(levels(data$sample)=="cpt2", "red",
             ifelse(levels(data$sample)=="cpt12", "blue",
             ifelse(levels(data$sample)=="cpt13", "green",
             ifelse(levels(data$sample)=="cpt30", "yellow", "grey")))))

Now, when I change "trait", I expect the data to be reordered and the colours to be reordered as well, staying paired with the data. But it simply doesn't work. The colours are assigned according to the alphabetical order of the samples: blue (cpt12), green (cpt13), red (cpt2) and yellow (cpt30), no matter where the samples end up on the x axis after reordering.

A smaller version of the original file is available here: https://drive.google.com/file/d/0B1kEh3I4podcaUd5NWJaNkhPS0E/view

There is 1 answer

Dave2e

The order of the color vector must match the order of the boxes plotted. Therefore, if the order of the boxes in the boxplot changes, the colors must be rearranged too. In your case, reorder() in the first line of code changes the order of the levels, but the colors are still computed from levels(data$sample), which stays in alphabetical order.
In this solution I create a data frame that matches each expected level with its desired color and includes an extra default row. Then, using the match function on levels(bymean), I create the vector of colors in the proper order.

Try this:

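# Reorder the factor levels by the mean of the trait
bymean <- with(data, reorder(sample, trait, mean, na.rm = TRUE))

# Lookup table: highlighted samples plus a default (NA row -> "grey").
# stringsAsFactors = FALSE keeps the colors as character strings on R < 4.0.
colordf <- data.frame(fac   = c("cpt2", "cpt12", "cpt13", "cpt30", NA),
                      color = c("red", "blue", "green", "yellow", "grey"),
                      stringsAsFactors = FALSE)

# Look up each level in plotting order; unmatched levels fall back to row 5
plotcolor <- colordf$color[match(levels(bymean), colordf$fac, nomatch = 5)]

boxplot(trait ~ bymean, data = data,
        col = plotcolor)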
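
If it helps to see the idea in isolation, here is a minimal sketch of the same lookup using a named vector instead of a data frame. The toy data below is made up purely for illustration (it is not the asker's file), and the names highlight and plotcolor are just placeholders. The point it shows is that the colors are derived from levels(bymean), so they follow the boxes wherever the reordering puts them.

```r
# Toy data (hypothetical): five samples whose means are exactly 5, 1, 3, 2, 4
data <- data.frame(
  sample = factor(rep(c("cpt12", "cpt13", "cpt2", "cpt30", "cpt5"), each = 3)),
  trait  = rep(c(5, 1, 3, 2, 4), each = 3) + rep(c(-0.5, 0, 0.5), times = 5)
)

# Reorder the levels by mean trait value (ascending)
bymean <- with(data, reorder(sample, trait, mean, na.rm = TRUE))

# Named vector: sample name -> highlight color
highlight <- c(cpt2 = "red", cpt12 = "blue", cpt13 = "green", cpt30 = "yellow")

# Index by the reordered levels; samples without an entry come back as NA
plotcolor <- unname(highlight[levels(bymean)])
plotcolor[is.na(plotcolor)] <- "grey"

# levels(bymean): cpt13 cpt30 cpt2 cpt5 cpt12
# plotcolor:      green yellow red grey blue
boxplot(trait ~ bymean, data = data, col = plotcolor)
```

Changing the trait column and re-running from the reorder() line onwards should recompute both the box order and the colors together, since both now depend on bymean.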