How to format R2 and RMSE table added to a ggplot?

I am trying to add R2 and RMSE to a faceted ggplot. I am using the following code:

library(caret)
library(tidyverse)
library(ggpmisc)

summ <- iris %>% 
  group_by(Species) %>% 
  summarise(Rsq = R2(Sepal.Length, Petal.Length),
            RMSE = RMSE(Sepal.Length, Petal.Length)) %>% 
  mutate_if(is.numeric, round, digits=2) 

my.formula <- y ~ x

p <- ggplot(data=iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(color="blue",alpha = 1/3) + 
  facet_wrap(Species ~ ., scales="free") +
  geom_smooth(method=lm, fill="black", formula = my.formula) +
  xlab("Sepal Length") +
  ylab("Petal Length") + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) 

p + geom_table_npc(data = summ,label = split(summ, summ$Species),
                   npcx = 0.00, npcy = 1, hjust = 0, vjust = 1, size=3,
                   table.theme = ttheme_gtlight)

which gives me the following plot:

[image: plot produced by the code above]

As the plot shows, the Species column in the geom_table_npc table is unnecessary. How can I get a plot like the following instead?

[image: desired plot]

There is 1 answer.

MacOS (best answer)

My answer has two parts. The first keeps geom_table_npc and shows how to drop the unwanted column; the second shows how to get the output you asked for.

First, you can simply remove the Species column from each data frame before passing the list as label.

p + geom_table_npc(data = summ, label = lapply(split(summ, summ$Species),
                                               FUN = function(entry) {subset(entry, select = -Species)}),
                   npcx = 0.00, npcy = 1, hjust = 0, vjust = 1, size=3,
                   table.theme = ttheme_gtlight)
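To see what the lapply() call actually hands to geom_table_npc, it can help to build the list on its own. The following is only a sketch in base R: summ_demo and label_list are illustrative names, and the summary is recomputed with cor()^2 and sqrt(mean(diff^2)), which is what caret's R2() and RMSE() compute with their default arguments.

```r
# Base-R stand-in for `summ` (no caret or dplyr needed), one row per species.
parts <- split(iris, iris$Species)
summ_demo <- data.frame(
  Species = factor(names(parts)),
  Rsq  = sapply(parts, function(d) round(cor(d$Sepal.Length, d$Petal.Length)^2, 2)),
  RMSE = sapply(parts, function(d) round(sqrt(mean((d$Sepal.Length - d$Petal.Length)^2)), 2)),
  row.names = NULL
)

# One single-row data frame per facet, still carrying the Species column.
with_species <- split(summ_demo, summ_demo$Species)

# The same list with the Species column dropped from every element.
label_list <- lapply(with_species, function(entry) subset(entry, select = -Species))

# Inspect the structure: a named list of data frames with columns Rsq and RMSE.
str(label_list)
```

The list keeps the order of the factor levels, so its elements line up with the rows of the summary data frame passed as data.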

So, if I run this code

library(caret)
library(tidyverse)
library(ggpmisc)

summ <- iris %>% 
  group_by(Species) %>% 
  summarise(Rsq = R2(Sepal.Length, Petal.Length),
            RMSE = RMSE(Sepal.Length, Petal.Length)) %>% 
  mutate_if(is.numeric, round, digits=2) 

p <- ggplot(data=iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(color="blue",alpha = 1/3) + 
  facet_wrap(Species ~ ., scales="free") +
  geom_smooth(method=lm, fill="black", formula = y ~ x) +
  xlab("Sepal Length") +
  ylab("Petal Length") + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) 

# The key is here. By using lapply I remove the Species column from each
# data frame in the list.
p + geom_table_npc(data = summ, label = lapply(split(summ, summ$Species),
                                               FUN = function(entry) {subset(entry, select = -Species)}),
                   npcx = 0.00, npcy = 1, hjust = 0, vjust = 1, size=3,
                   table.theme = ttheme_gtlight)

I get this output.

[image: geom_table_npc result]
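One aside that may be worth checking: caret's RMSE(pred, obs) computes sqrt(mean((pred - obs)^2)), so RMSE(Sepal.Length, Petal.Length) measures the distance between the two raw variables, not the error of the regression line drawn by geom_smooth(). If the intent is to report the fit of that line (an assumption on my part), a base-R sketch could look like this. fit_stats and model_labels are hypothetical names used only for illustration.

```r
# Hypothetical helper: fit Petal.Length ~ Sepal.Length for one species and
# return R-squared together with the RMSE of the residuals.
fit_stats <- function(d) {
  fit <- lm(Petal.Length ~ Sepal.Length, data = d)
  data.frame(
    Rsq  = round(summary(fit)$r.squared, 2),
    RMSE = round(sqrt(mean(residuals(fit)^2)), 2)
  )
}

# One single-row data frame per species, in factor-level order and without
# a Species column.
model_labels <- lapply(split(iris, iris$Species), fit_stats)
model_labels
```

For a simple linear regression, the R-squared of the fit equals the squared correlation, so the Rsq column should match what R2() reports by default; only the RMSE column changes.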

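Second, to get the plain-text output you asked for, you can add the labels with geom_text(). annotate() also draws text, but it repeats the same annotation in every panel, so geom_text() with a data frame that contains the faceting variable is the better fit here.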

# ... the other code - plot creation - from above.

# Build the annotations data frame: one row per label and facet.
# The column holding the facet variable must be named Species, exactly as
# in the plot data, so that each label lands in the matching panel.
df.annotations <- data.frame(
  Species = rep(as.character(summ$Species), times = 2),
  label = c(paste("Rsq", summ$Rsq, sep = " = "),
            paste("RMSE", summ$RMSE, sep = " = "))
)

# Label positions in data coordinates, chosen by hand for each facet
# (setosa, versicolor, virginica): first the Rsq rows, then the RMSE rows.
df.annotations$x <- rep.int(c(4.5, 5.5, 5.5), times = 2)
df.annotations$y <- c(1.75, 5.0, 6.8,
                      1.7, 4.9, 6.7)


p + geom_text(data = df.annotations,
              mapping = aes(x = x, y = y, label = label))

This gives the following plot.

[image: plot with the geom_text annotations]
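The hand-picked x and y positions have to be re-tuned whenever the data or the axis limits change. A possible alternative, sketched here under the assumption that a top-left placement is acceptable, is to pin the text to the panel corner with infinite coordinates and nudge it inward with hjust and vjust. corner_labels and p_corner are illustrative names, and the statistics are recomputed in base R so the block runs with ggplot2 alone.

```r
library(ggplot2)

parts <- split(iris, iris$Species)

# One row per facet; the Species column tells ggplot2 which panel gets
# which label. Both statistics go into one two-line label.
corner_labels <- data.frame(
  Species = factor(names(parts), levels = levels(iris$Species)),
  label = sapply(parts, function(d) {
    rsq  <- round(cor(d$Sepal.Length, d$Petal.Length)^2, 2)
    rmse <- round(sqrt(mean((d$Sepal.Length - d$Petal.Length)^2)), 2)
    paste0("Rsq = ", rsq, "\nRMSE = ", rmse)
  }),
  row.names = NULL
)

p_corner <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(color = "blue", alpha = 1/3) +
  geom_smooth(method = lm, formula = y ~ x) +
  facet_wrap(~ Species, scales = "free") +
  # x = -Inf and y = Inf pin the text to the top-left corner of each panel;
  # hjust and vjust then move it inside the panel border.
  geom_text(data = corner_labels,
            mapping = aes(x = -Inf, y = Inf, label = label),
            hjust = -0.1, vjust = 1.2, size = 3,
            inherit.aes = FALSE)

p_corner
```

Because the position no longer depends on the data range, the same layer works unchanged with free or fixed scales; the hjust and vjust values are a matter of taste and may need adjusting for longer labels.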

HTH!