Specify point shape and colour of individuals in fviz_pca_biplot (R)

Can the fviz_pca_biplot function be used to assign a shape and a colour to the individuals (Ind) in a biplot based on two different grouping factors?

I've got the following df, where the individuals (Ind) are the rows labelled 73 to 78, the variables for the PCA are X1 to X6, and tr and Treat are the factors I want to use to group Ind in the PCA biplot:

     tr  Treat      X1     X2      X3     X4     X5     X6
73    0      A  13.953  1.190  25.713  0.136  1.052  0.023
74   10      B  12.392  1.296  25.908  0.142  1.059  0.020
75    5      C  13.378  1.072  25.021  0.137  0.893  0.425
76   10      B   9.111  1.100  25.081  0.158  0.900  0.027
77    1      C   8.391  1.076  26.538  0.150  0.942  0.018
78    5      C  10.528  1.148  30.567  0.168  0.990  0.038

To run the principal component analysis I use:

library("FactoMineR")
library("factoextra")
pca <- PCA(df[, -(1:2)], graph = FALSE)  # drop the two grouping columns (tr, Treat)

I plot the data in a Biplot using the following:

fviz_pca_biplot(pca, 
            col.var = "black", repel = TRUE,
            col.ind = df$`Treat`, palette = "lancet",
            addEllipses = TRUE, label = "var", mean.point = FALSE,
            ellipse.type = 'confidence', ellipse.level=0.98,
            legend.title = "Treat",
            ggtheme = theme_minimal()
)

The biplot legend uses one combined shape and colour per Treat group, e.g., Treat "A" is a blue dot, "B" is a red square, and so on.

I would like the shape to group Ind by the tr column and the colour to group Ind by the Treat column. I have tried many things without success.

With the following code I could map the point fill to Treat and the point outline to tr:

fviz_pca_biplot(pca,
            geom.ind = "point",
            pointshape = 21,
            pointsize = 2.5,
            fill.ind = df$Treat,
            col.ind = df$tr,
            label = "var", mean.point = FALSE,
            col.var = "black",
            legend.title = list(fill = "Treat", 
                                color = "tr"),
            repel = TRUE
)

However, the groups are not easy to tell apart visually, which is why I would like to use different shapes and colours. Can this be done at all? I can't find any examples online.

Cheers people

Answer by Allan Cameron

There doesn't seem to be a direct way of doing this, though you can edit the ggplot produced by fviz_pca_biplot to make the necessary changes.

First we make the plot and store it.

library(FactoMineR)
library(factoextra)

pca <- PCA(df[,-(1:2)], graph = FALSE)

p <- fviz_pca_biplot(pca, 
                col.var = "black", repel = TRUE,
                col.ind = df$`Treat`,
                palette = "lancet",
                addEllipses = TRUE, label = "var", mean.point = FALSE,
                ellipse.type = 'confidence', ellipse.level=0.98,
                legend.title = "Treat",
                ggtheme = theme_minimal()
)
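
A note on the column selection used above: it is written `-(1:2)` with parentheses because, in R, unary minus binds more tightly than `:`. The short base R sketch below (the matrix `m` is only an illustrative stand-in for the data frame) shows the difference between the two spellings.

```r
# Unary minus is applied before `:`, so these are different vectors
a <- -1:2      # -1  0  1  2
b <- -(1:2)    # -1 -2

# Illustrative stand-in for a table with three columns
m <- matrix(1:6, nrow = 2)

# Negative indices drop columns 1 and 2, leaving only column 3
kept <- m[, -(1:2), drop = FALSE]

# Mixing negative and positive indices is an error in R
mixed <- tryCatch(m[, -1:2], error = function(e) "error")
```

So `df[, -(1:2)]` drops the tr and Treat columns, while `df[, -1:2]` fails with a subscript error.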

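Now we add the tr column to the data of the first layer (the points for the individuals) and change that layer's aesthetic mapping so that colour still follows Treat while shape follows tr: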

p$layers[[1]]$data$tr <- factor(df$tr)
p$layers[[1]]$mapping <- aes(x, y, colour = Col., shape = tr)
p$layers[[1]]$aes_params$size <- 3
p <- p + labs(shape = 'tr')

So now we have

p
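
If editing the layers of the fviz object feels fragile, a possible alternative is to build the score plot directly with ggplot2 from the coordinates that FactoMineR stores in `pca$ind$coord`. This is only a minimal sketch, not a replacement for the biplot: it draws the individuals only (no variable arrows or ellipses), and the names `scores` and `p2` are made up for illustration.

```r
library(FactoMineR)
library(ggplot2)

# Data from the question
df <- data.frame(
  tr    = c(0L, 10L, 5L, 10L, 1L, 5L),
  Treat = c("A", "B", "C", "B", "C", "C"),
  X1 = c(13.953, 12.392, 13.378, 9.111, 8.391, 10.528),
  X2 = c(1.19, 1.296, 1.072, 1.1, 1.076, 1.148),
  X3 = c(25.713, 25.908, 25.021, 25.081, 26.538, 30.567),
  X4 = c(0.136, 0.142, 0.137, 0.158, 0.15, 0.168),
  X5 = c(1.052, 1.059, 0.893, 0.9, 0.942, 0.99),
  X6 = c(0.023, 0.02, 0.425, 0.027, 0.018, 0.038),
  row.names = c("73", "74", "75", "76", "77", "78")
)

pca <- PCA(df[, -(1:2)], graph = FALSE)

# Coordinates of the individuals on the principal components
# (columns are named Dim.1, Dim.2, ...)
scores <- as.data.frame(pca$ind$coord)

# Attach both grouping factors; rows are in the same order as df
scores$Treat <- factor(df$Treat)
scores$tr    <- factor(df$tr)

# Colour follows Treat, shape follows tr
p2 <- ggplot(scores, aes(Dim.1, Dim.2, colour = Treat, shape = tr)) +
  geom_point(size = 3) +
  theme_minimal()
p2
```

Because colour and shape are mapped to different columns here, ggplot2 draws two separate legends, one for each factor.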

Data from question in reproducible format

df <- structure(list(tr = c(0L, 10L, 5L, 10L, 1L, 5L), Treat = c("A", 
"B", "C", "B", "C", "C"), X1 = c(13.953, 12.392, 13.378, 9.111, 
8.391, 10.528), X2 = c(1.19, 1.296, 1.072, 1.1, 1.076, 1.148), 
    X3 = c(25.713, 25.908, 25.021, 25.081, 26.538, 30.567), X4 = c(0.136, 
    0.142, 0.137, 0.158, 0.15, 0.168), X5 = c(1.052, 1.059, 0.893, 
    0.9, 0.942, 0.99), X6 = c(0.023, 0.02, 0.425, 0.027, 0.018, 
    0.038)), class = "data.frame", row.names = c("73", "74", 
"75", "76", "77", "78"))