I am having trouble with haven and ggplot2.
I have converted all the columns to factors or integers as appropriate, but I am still getting the error message that ggplot can't cope with labelled data.
For the purposes of this example, I am interested in the percentage of people with a given number of cats who do or don't have a specific product. The data are responses to part of a large survey that was imported from SPSS, which is why the columns are labelled (and why the code handles NAs). I want to make a percentage stacked bar chart to show the results.
library(haven)
library(labelled)
library(tidyverse)
library(ggplot2)
#Generating some random data for the example.
number_cats <- labelled(floor(runif(50, 1, 10)))
has_prod <- labelled(rbinom(n=50, size=1, prob=0.45), c(Yes = 1, No = 0))
pets_3 <- tibble(number_cats, has_prod)
pets_3 %>%
  group_by(number_cats, has_prod) %>%
  drop_na() %>% #This is here because the real dataset has NAs
  summarize(count = n())
pets_3 %>%
  mutate(has_prod = haven::as_factor(has_prod, levels = "labels"),
         number_cats = haven::as_factor(number_cats),
         count = as.integer(count))
ggplot(pets_3, aes(x = number_cats, y = count, fill = has_prod)) +
  geom_bar(position = "fill", stat = "identity")
#I also get the error if I make count a factor.
Error in UseMethod("rescale") :
no applicable method for 'rescale' applied to an object of class "c('haven_labelled', 'vctrs_vctr', 'double')"
I can't see why ggplot is still having trouble when I made the data into 'normal' types rather than labelled data. I have successfully used as_factor on other parts of the dataset before to make graphs.
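One possible explanation, offered as a sketch rather than a confirmed diagnosis: neither pipeline above assigns its result, so the pets_3 passed to ggplot() still holds the original haven_labelled columns. The version below stores the converted, summarized data in a new object and plots that instead. The name pets_counts, the fixed seed, the use of count() in place of group_by() plus summarize(), zap_labels() for the unlabelled column, and geom_col() in place of geom_bar(stat = "identity") are all assumptions made for illustration, not part of the original code.

```r
library(haven)
library(tidyverse)

set.seed(1)  # assumption: fixed seed so the random example is reproducible

# Same example data as in the question.
number_cats <- labelled(floor(runif(50, 1, 10)))
has_prod <- labelled(rbinom(n = 50, size = 1, prob = 0.45), c(Yes = 1, No = 0))
pets_3 <- tibble(number_cats, has_prod)

# Convert the labelled columns first, then count, and keep the result.
# pets_counts is a hypothetical name; the key point is the assignment.
pets_counts <- pets_3 %>%
  drop_na() %>%  # the real dataset has NAs
  mutate(
    has_prod = haven::as_factor(has_prod, levels = "labels"),
    # number_cats has no value labels, so strip the class and use factor()
    number_cats = factor(haven::zap_labels(number_cats))
  ) %>%
  count(number_cats, has_prod, name = "count")

# Plot the summarized object rather than the original pets_3.
ggplot(pets_counts, aes(x = number_cats, y = count, fill = has_prod)) +
  geom_col(position = "fill")
```

Running sapply(pets_counts, class) before plotting is a quick way to confirm that no haven_labelled columns remain.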