Overlay two differently formatted qplots in ggplot2


I have two scatterplots, based on different but related data, created using qplot() from ggplot2. (Learning ggplot hasn't been a priority because qplot has been sufficient for my needs up to now.) What I want to do is superimpose/overlay the two charts so that the x,y data for each are plotted in the same plot space. The complication is that I want each plot to retain its formatting/aesthetics.

The data in question are row and column scores from correspondence analysis (corresp() from MASS), so the number of data rows (i.e. samples or taxa) differs between the two datasets. I can plot the two score sets together easily, either by combining the two datasets or, even easier, by using the biplot() function.

However, I have been using qplot to get the plots looking exactly as I need them, with samples plotted as colour-coded symbols and taxa as labels:

PlotSample <- qplot(DataCorresp$rscore[,1], DataCorresp$rscore[,2], 
                    colour=factor(DataAll$ColourCode)) + 
  scale_colour_manual(values = c("black","darkgoldenrod2",
                                 "deepskyblue2","deeppink2"))

and

PlotTaxa <- qplot(DataCorresp$cscore[,1], DataCorresp$cscore[,2], 
                  label=colnames(DataCorresp), size=10, geom="text")

Can anyone suggest either

  • a way to superimpose the two plots (PlotSample and PlotTaxa) on each other,
  • a way to plot the two datasets (DataCorresp$rscore and DataCorresp$cscore) together while formatting each differently, or
  • another function (e.g. biplot()) that could be used to achieve my aim?

Example of the workflow using an extremely simplified, made-up dataset:

> require(MASS)
> require(ggplot2)
> alldata<-read.csv("Fake data.csv",header=TRUE,row.names=1)
> selectdata<-alldata[,2:10]
> alldata
          Period Species.1 Species.2 Species.3 Species.4 Species.5 Species.6
Sample-1   Early        50        87        97        12        60        49
Sample-2   Early        41        90        36        52        36        27
Sample-3   Early        87        56        82        45        56        13
Sample-4   Early        37        47        78        29        53        34
Sample-5   Early        58        70        34        35         8        21
Sample-6   Early        94        82        48        16        27        26
Sample-7   Early        91        69        50        57        24        13
Sample-8   Early        63        38        86        20        28        11
Sample-9  Middle         4        19        55        99        86        38
Sample-10 Middle        29        25        10        93        37        54
Sample-11 Middle        48        12        59        73        39        92
Sample-12 Middle        31         6        34        81        39        54
Sample-13 Middle        29        40        26        52        34        84
Sample-14 Middle         1        46        15        97        67        41
Sample-15   Late        43        47        30        18        60        23
Sample-16   Late        45        10        49         2         2        45
Sample-17   Late        14         8        51        36        58        51
Sample-18   Late        41        51        32        47        23        43
Sample-19   Late        43        17         6        54         4        12
Sample-20   Late        20        25         1        29        35         2
          Species.7 Species.8 Species.9
Sample-1         41        39        57
Sample-2         59         4        45
Sample-3         10        56         5
Sample-4         59        30        39
Sample-5          9        29        57
Sample-6         29        24        35
Sample-7         22         4        42
Sample-8         31        19        40
Sample-9         17         7        57
Sample-10         6         9        29
Sample-11        34        20         0
Sample-12        56        41        59
Sample-13         6        31        13
Sample-14        25        12        28
Sample-15        60        75        84
Sample-16        32        69        34
Sample-17        48        53        56
Sample-18        80        86        46
Sample-19        50        70        82
Sample-20        57        84        70
> selectca<-corresp(selectdata,nf=5)
> biplot(selectca,cex=c(0.6,0.6))
> PlotSample <- qplot(selectca$rscore[,1], selectca$rscore[,2], colour=factor(alldata$Period) )
> PlotTaxa<-qplot(selectca$cscore[,1], selectca$cscore[,2], label=colnames(selectdata), size=10, geom="text") 

The biplot will produce this plot: /r/10wk1a8/5

The PlotSample appears as such: /r/i29cba/5

The PlotTaxa appears as such: /r/245bl9d/5

EDIT: I don't have enough rep to post pictures, and tinypic links are not accepted (despite https://meta.stackexchange.com/questions/60563/how-to-upload-images-on-stack-overflow). If you add tinypic's URL to the start of the codes above, you'll get there.

Essentially, I want to create the biplot plot but with samples colour-coded as they are in PlotSample.

There is 1 answer

EDi

Have a look at Gavin Simpson's ggvegan package!

require(vegan)
require(ggvegan)
# some data
data(dune)

# CA
mod <- cca(dune)

# plot
autoplot(mod, geom = 'text')

For finer control (or if you want to stick with corresp()), you may also want to look at the code of the two functions involved: fortify.cca (which wraps the data in the cca object into a format usable by ggplot) and autoplot.cca (which creates the plot).

If you want to do it from scratch, you'll have to wrap both sets of scores (sites and species) into one data.frame (see how fortify.cca does this, and extract the corresponding values from the corresp() object) and use it to build the plot.
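
A minimal sketch of that from-scratch route, in case it helps. It is not how fortify.cca is implemented, only one way to get a similar result with corresp(). The data are randomly generated stand-ins for the question's "Fake data.csv", and the names counts, period, ca and scores are made up for illustration. Each layer receives its own subset of the combined data.frame, so samples are drawn as colour-coded points and taxa as text labels in the same plot space:

```r
library(MASS)     # corresp()
library(ggplot2)

# Made-up stand-in for the question's data: 20 samples x 9 species
set.seed(1)
counts <- matrix(sample(1:99, 20 * 9, replace = TRUE), nrow = 20,
                 dimnames = list(paste0("Sample-", 1:20),
                                 paste0("Species.", 1:9)))
period <- rep(c("Early", "Middle", "Late"), times = c(8, 6, 6))

# Correspondence analysis, keeping the first two axes
ca <- corresp(counts, nf = 2)

# Wrap row (sample) and column (taxon) scores into one data.frame
scores <- rbind(
  data.frame(label = rownames(counts), type = "sample", period = period,
             dim1 = ca$rscore[, 1], dim2 = ca$rscore[, 2],
             row.names = NULL, stringsAsFactors = FALSE),
  data.frame(label = colnames(counts), type = "taxon", period = NA_character_,
             dim1 = ca$cscore[, 1], dim2 = ca$cscore[, 2],
             row.names = NULL, stringsAsFactors = FALSE)
)
scores$period <- factor(scores$period, levels = c("Early", "Middle", "Late"))

# One layer per score type, each with its own aesthetics
p <- ggplot(scores, aes(x = dim1, y = dim2)) +
  geom_point(data = subset(scores, type == "sample"), aes(colour = period)) +
  geom_text(data = subset(scores, type == "taxon"), aes(label = label),
            size = 3) +
  scale_colour_manual(values = c(Early = "black",
                                 Middle = "darkgoldenrod2",
                                 Late = "deepskyblue2")) +
  labs(x = "CA axis 1", y = "CA axis 2", colour = "Period")
print(p)
```

With the real data, counts would be replaced by selectdata and period by alldata$Period. Note that a plain scatterplot like this does not apply any rescaling that biplot() may perform, so the relative spread of points and labels can differ from the biplot.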