PCA in R is not giving contributions for each principal component

I have run a simple PCA with the package FactoMineR.

The PCA runs fine and I get 15 dimensions (I have 15 variables).

When I try to get the contribution of each variable to each principal component, I only get results for 5 dimensions, not all 15.

My code:

library(FactoMineR)
library(factoextra)

set.seed(123)
PCA_data <- matrix(rnorm(675), ncol = 15)

PCA_scaled <- scale(PCA_data)
pca_result <- PCA(PCA_scaled, graph = TRUE)
eigenvalues <- get_eigenvalue(pca_result)

# Percentage of variance explained per dimension (taken from the eigenvalue table)
variance_explained <- eigenvalues$variance.percent

contributions <- pca_result$var$contrib 
contributions

get_pca_var(pca_result)$contrib
SDT_scaled <- scale(PCA_data)

fviz_eig(pca_result, choice = "eigenvalue", addlabels = TRUE)

# Biplot
fviz_pca_biplot(pca_result, repel = TRUE, col.var = "contrib", 
                gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"))

# Create a data frame for plotting
plot_data <- data.frame(
  Principal_Component = rep(1:ncol(contributions), each = nrow(contributions)),
  Variable = rep(rownames(contributions), ncol(contributions)),
  Contribution = as.vector(contributions)
)

# Create a stacked bar plot
ggplot(plot_data, aes(x = Principal_Component, y = Contribution, fill = Variable)) +
  geom_bar(stat = "identity") +
  labs(title = "Variable Contributions to Principal Components",
       x = "Principal Component",
       y = "Contribution") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_viridis_d() 

#I also tried: 
summary(pca_result, nbelements=Inf)

None of these gave me all 15 dimensions or contributions of every variable.

Answer by Ben Bolker:

tl;dr To retain all of the components, specify ncp=15 when you run the PCA ...

From ?PCA:

PCA(X, scale.unit = TRUE, ncp = 5, ind.sup = NULL, 
         quanti.sup = NULL, quali.sup = NULL, row.w = NULL, 
         col.w = NULL, graph = TRUE, axes = c(1,2))

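(emphasis added on ncp = 5: only the first five dimensions are kept in the result, which is why pca_result$var$contrib has five columns).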

...

ncp: number of dimensions kept in the results (by default 5)
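
As a minimal sketch of the fix, reusing the simulated data from the question: passing ncp equal to the number of variables should keep every dimension in the result. The names pca_full and contrib_all are just illustrative, and graph = FALSE is only there to suppress the plots.

```r
library(FactoMineR)

# Same simulated data as in the question: 45 rows, 15 variables
set.seed(123)
PCA_data <- matrix(rnorm(675), ncol = 15)

# Keep all dimensions by setting ncp to the number of variables.
# scale.unit = TRUE (the default) already standardises the columns,
# so a separate scale() call is not needed here.
pca_full <- PCA(PCA_data, scale.unit = TRUE, ncp = ncol(PCA_data), graph = FALSE)

# Contributions of every variable to every dimension
contrib_all <- pca_full$var$contrib
dim(contrib_all)      # one row per variable, one column per retained dimension

# Contributions are percentages, so each column should sum to roughly 100
colSums(contrib_all)
```

With all 15 columns present in contrib_all, the plotting code from the question should then cover every principal component rather than only the first five.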