Can the fviz_pca_biplot function be used to assign a shape and a colour to Ind in a biplot based on two different grouping factors?
I've got the following df, where Ind is the column with numbers from 73 to 78, X1 to X6 are the variables for the PCA, and tr and Treat are the factors that I want to use to group Ind in the PCA biplot:
tr Treat X1 X2 X3 X4 X5 X6
73 0 A 13.953 1.190 25.713 0.136 1.052 0.023
74 10 B 12.392 1.296 25.908 0.142 1.059 0.020
75 5 C 13.378 1.072 25.021 0.137 0.893 0.425
76 10 B 9.111 1.100 25.081 0.158 0.900 0.027
77 1 C 8.391 1.076 26.538 0.150 0.942 0.018
78 5 C 10.528 1.148 30.567 0.168 0.990 0.038
To run the principal component analysis I use:
library("FactoMineR")
library("factoextra")
# Drop the two grouping columns (tr, Treat); note that -1:2 parses as (-1):2 and fails
pca <- PCA(df[, -(1:2)], graph = FALSE)
I plot the data in a Biplot using the following:
fviz_pca_biplot(pca,
col.var = "black", repel = TRUE,
col.ind = df$`Treat`, palette = "lancet",
addEllipses = TRUE, label = "var", mean.point = FALSE,
ellipse.type = 'confidence', ellipse.level=0.98,
legend.title = "Treat",
ggtheme = theme_minimal()
)
The biplot legend shows a unique shape and colour to group Ind by Treat, e.g., Treat "A" is a blue dot, "B" a red square, and so on.
I would like to use the shape to group Ind by the tr column and the colour to group Ind by the Treat column. I have tried many things without success.
With the following code I could map the point fill to Treat and the point outline to tr:
fviz_pca_biplot(pca,
geom.ind = "point",
pointshape = 21,
pointsize = 2.5,
fill.ind = df$Treat,
col.ind = df$tr,
label = "var", mean.point = FALSE,
col.var = "black",
legend.title = list(fill = "Treat",
color = "tr"),
repel = TRUE
)
However, the result is not easy to tell apart visually, which is why I would like to have different shapes and colours. I wonder if this can be done at all? I can't find any examples online.
Cheers people
There doesn't seem to be a direct way of doing this, though you can edit the ggplot produced by fviz_pca_biplot to make the necessary changes.
First we make the plot and store it.
Now we add data to the first layer and change its aesthetic mapping.
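The two steps above might look something like the sketch below. It is not a definitive implementation: it assumes that the individuals' point layer carries its own data frame with columns x and y in the same row order as df, which is an assumption about how factoextra builds the plot rather than documented behaviour. The helper variable i is introduced here for illustration; it looks up the point layer rather than hard-coding position 1. The data are retyped from the question, with tr and Treat stored as factors.

```r
library(FactoMineR)
library(factoextra)
library(ggplot2)

# Data from the question; row names 73..78 identify the individuals (Ind)
df <- data.frame(
  tr    = factor(c(0, 10, 5, 10, 1, 5)),
  Treat = factor(c("A", "B", "C", "B", "C", "C")),
  X1 = c(13.953, 12.392, 13.378, 9.111, 8.391, 10.528),
  X2 = c(1.190, 1.296, 1.072, 1.100, 1.076, 1.148),
  X3 = c(25.713, 25.908, 25.021, 25.081, 26.538, 30.567),
  X4 = c(0.136, 0.142, 0.137, 0.158, 0.150, 0.168),
  X5 = c(1.052, 1.059, 0.893, 0.900, 0.942, 0.990),
  X6 = c(0.023, 0.020, 0.425, 0.027, 0.018, 0.038),
  row.names = 73:78
)

# PCA on the numeric columns only
pca <- PCA(df[, -(1:2)], graph = FALSE)

# Step 1: make the plot and store it instead of printing it
p <- fviz_pca_biplot(pca,
  geom.ind = "point", pointsize = 2.5,
  col.ind = df$Treat,
  label = "var", col.var = "black",
  mean.point = FALSE, repel = TRUE
)

# Step 2: find the point layer (assumed to be the individuals' layer)
is_point <- vapply(p$layers, function(l) inherits(l$geom, "GeomPoint"), logical(1))
i <- which(is_point)[1]

# Add both grouping factors to that layer's data (assumes same row order as df)
p$layers[[i]]$data$Treat <- df$Treat
p$layers[[i]]$data$tr    <- df$tr

# Remove any fixed shape so that the mapped shape is used, then remap
p$layers[[i]]$aes_params$shape <- NULL
p$layers[[i]]$mapping <- aes(x = x, y = y, colour = Treat, shape = tr)

# Give the two legends readable titles
p <- p + labs(colour = "Treat", shape = "tr")
p
```

If the layer structure differs in your version of factoextra, inspecting p$layers and names(p$layers[[i]]$data) should show which layer and columns to adjust.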
So now we have
Data from question in reproducible format
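One way to write the data down reproducibly is shown below. The values are retyped from the table in the question; reading the individual IDs in as row names and converting tr and Treat to factors are assumptions based on the question's description of the data.

```r
# Read the table from the question; the first column (Ind) becomes the row names
df <- read.table(header = TRUE, row.names = 1, text = "
Ind tr Treat     X1    X2     X3    X4    X5    X6
73   0     A 13.953 1.190 25.713 0.136 1.052 0.023
74  10     B 12.392 1.296 25.908 0.142 1.059 0.020
75   5     C 13.378 1.072 25.021 0.137 0.893 0.425
76  10     B  9.111 1.100 25.081 0.158 0.900 0.027
77   1     C  8.391 1.076 26.538 0.150 0.942 0.018
78   5     C 10.528 1.148 30.567 0.168 0.990 0.038
")

# Treat both grouping columns as factors, as described in the question
df$tr    <- factor(df$tr)
df$Treat <- factor(df$Treat)

str(df)
```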