Previous Post: I am trying to remove NAs from a set of violin plots, and the error keeps changing. With the following code, I get the message "! Aesthetics must be either length 1 or the same as the data (61)". I have looked into this error, but I am using a package called viridis for the colors, so I'm not sure how to adapt the fixes I found. I also don't really understand where the error is coming from, but the code works when I pass "dataframe" instead of "withoutNAs" as the data argument to ggplot. So I think it's something about "withoutNAs" or viridis.
dataframe$KnownAgeCategories <- as.factor(KnownBehAge)
KnownAgeCategories
withoutNAs <- dataframe[!is.na(dataframe$KnownAgeCategories),]
# Plot of the relationship between age and deuterium
ggplot(withoutNAs, aes(KnownAgeCategories, Deut)) +
  geom_violin(
    mapping = aes(
      x = KnownAgeCategories,
      y = Deut,
      fill = KnownAgeCategories
    )
  ) +
  theme_classic() +
  stat_summary(fun.data = "mean_cl_boot", geom = "pointrange") +
  stat_summary(fun.data = n_fun, geom = "text", vjust = -1) +
  labs(y = "Deuterium", x = "") +
  scale_fill_brewer(palette = "BuPu") +
  scale_y_continuous(limits = c(-70, -40))
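One possible explanation, offered as a sketch rather than a diagnosis: when a variable named in aes() is not a column of the data frame passed to ggplot(), ggplot2 looks it up in the calling environment instead. If Deut (or another plotted variable) exists only as a free-standing vector, it keeps its full length after the data frame is filtered, which would be consistent with the code working on "dataframe" but failing on "withoutNAs". The values and the missing Deut column below are hypothetical, chosen only to reproduce that kind of length mismatch.

```r
library(ggplot2)

# Hypothetical data: Deut exists only as a free-standing vector,
# NOT as a column of the data frame (this is the assumption being illustrated).
Deut <- c(-60, -59, -53, -54, -60)
Age  <- c(NA, 3, 4, NA, 5)
dataframe <- data.frame(Age = Age)
dataframe$AgeCategories <- as.factor(dataframe$Age)

# Filtering shortens the data frame but leaves the global Deut untouched
withoutNAs <- dataframe[!is.na(dataframe$AgeCategories), ]
nrow(withoutNAs)  # 3 rows remain
length(Deut)      # the free-standing vector still has 5 values

# Deut is not found in withoutNAs, so aes() falls back to the global vector
p <- ggplot(withoutNAs, aes(AgeCategories, Deut)) + geom_violin()

# Aesthetic lengths are checked when the plot is built;
# capture the error message instead of stopping the script
result <- tryCatch(
  ggplot_build(p),
  error = function(e) conditionMessage(e)
)
result
```

With the unfiltered dataframe the same plot builds, because the global vector happens to have as many values as the data frame has rows.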
Added post: Apologies, I tried to create a reprex (thank you for the link), and of course the problem goes away now, but I can't figure out why. I suspect there is something wrong with the way I am referring to my original dataframe, but I am lost. The following reprex works and produces an (admittedly ugly) violin plot without NAs:
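# Packages needed to run this reprex (mean_cl_boot also requires Hmisc to be installed)
library(ggplot2)
# Building a minimal dataframe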
Deut <- c(-60, -59, -53, -54, -60, -55, -60, -59, -60, -59, -60, -59)
Age <- c(NA, NA, 3, 4, 5, NA, 5, 7, NA, NA, NA, NA)
dataframe <- data.frame(Deut, Age)
# turning the column Age, which contains 7 NAs, into a factor stored as AgeCategories
dataframe$AgeCategories <- as.factor(dataframe$Age)
# removing the rows where Age is NA
dataframewithoutNAs <- dataframe[!is.na(dataframe$Age),]
dataframewithoutNAs
# Creating a function that labels each group with its sample size, placed at the group maximum (less relevant to this example)
n_fun <- function(x) {
  data.frame(y = max(x), label = paste0("n = ", length(x)))
}
# Attempting to plot the dataframe and get AgeCategories on x axis, Deut values on y axis, and no NAs:
ggplot(dataframewithoutNAs, aes(AgeCategories, Deut)) +
  geom_violin(
    mapping = aes(
      x = AgeCategories,
      y = Deut,
      fill = AgeCategories
    )
  ) +
  theme_classic() +
  stat_summary(fun.data = "mean_cl_boot", geom = "pointrange") +
  stat_summary(fun.data = n_fun, geom = "text", vjust = -1) +
  labs(y = "Deuterium", x = "") +
  scale_fill_brewer(palette = "BuPu") +
  scale_y_continuous(limits = c(-70, -40))
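Since the reprex works, one way to check the original session is to confirm that every variable named in aes() is an actual column of the filtered data frame, and to build the factor from a column rather than from a free-standing vector. This is a minimal sketch in base R, assuming (hypothetically) that KnownBehAge and Deut are meant to be columns of the original dataframe; the values are made up.

```r
# Hypothetical stand-in for the original data, with both variables as columns
dataframe <- data.frame(
  Deut        = c(-60, -59, -53, -54, -60),
  KnownBehAge = c(NA, 3, 4, NA, 5)
)

# Build the factor from the column itself, not from a global vector
dataframe$KnownAgeCategories <- as.factor(dataframe$KnownBehAge)

# Drop the rows where the age category is missing
withoutNAs <- dataframe[!is.na(dataframe$KnownAgeCategories), ]

# Every variable named in aes() should be a column of the plotted data;
# anything listed here would be looked up outside the data frame instead
aes_vars <- c("KnownAgeCategories", "Deut")
missing_cols <- setdiff(aes_vars, names(withoutNAs))
missing_cols
```

If missing_cols is not empty in the real session, the names it lists are the ones most worth checking.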
