How to create a ggplot when the answers are FALSE or TRUE?


How can I create a plot with ggplot when my answers are TRUE or FALSE?

This is my code:

t.obese<-master1%>%
  filter(Income>0,obese==TRUE)%>%
  select(Income,obese)

> head(t.obese)
  Income obese
1  21600    TRUE
2   4000    TRUE
3  12720    TRUE
4  26772    TRUE

When I try to create a plot, R tells me: "Don't know how to automatically pick scale for object of type haven_labelled/vctrs_vctr/double. Defaulting to continuous. Error: stat_count() can only have an x or y aesthetic."

Thank you!

> dput(t.obese[1:10, ])
structure(list(Income = structure(c(1944, 4000, 16000, 19200, 
22800, 21600, 18000, 18000, 2000, 18000), label = "Wages,Salary from                    main job", format.stata = "%42.0g", labels = c(`[-5] in Fragebogenversion    nicht enthalten` = -5, 
 `[-2] trifft nicht zu` = -2), class = c("haven_labelled",      "vctrs_vctr", 
 "double")), obese = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, 
TRUE, TRUE, TRUE)), row.names = c(NA, 10L), class = "data.frame")

There are 2 answers

Answer by Lefkios Paikousis (best answer)

If you want to compare the Income distribution across obesity status, you need rows with both obese = TRUE and obese = FALSE; with only one group there is nothing to compare.

I randomly created a non_obese dataset just to make the comparison possible. I also removed the haven_labelled class from Income (using haven::zap_labels()), since it was causing issues when rendering the reprex.

I hope the following helps you get started.

library(dplyr)
library(ggplot2)
library(haven)

obese <- 
structure(list(Income = structure(c(1944, 4000, 16000, 19200, 
                                    22800, 21600, 18000, 18000, 2000, 18000), 
                                  label = "Wages,Salary from main job", 
                                  format.stata = "%42.0g", 
                                  labels = c(`[-5] in Fragebogenversion nicht enthalten` = -5,
                                             `[-2] trifft nicht zu` = -2), 
                                  class = c("haven_labelled", "vctrs_vctr","double")), 
               obese = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,TRUE, TRUE, TRUE)), 
          row.names = c(NA, 10L), class = "data.frame"
          )


# remove the haven/labelled class of the income variable
obese <- 
  obese %>% 
  haven::zap_labels() 

non_obese <- 
  obese %>% 
  mutate(
    # subtract an independent random amount from each row (n() draws, not 1)
    Income = Income - rnorm(n(), mean = 1000, sd = 50),
    obese  = !obese
  )



full_data <- 
  bind_rows(obese, non_obese)


# Box plot 
full_data %>% 
  ggplot(
    aes(obese, Income)
  )+
  geom_boxplot(width = 0.5)+
  geom_point(position = position_jitter(width  = 0.05))

# Density plot
full_data %>% 
  ggplot(
    aes(Income,fill = obese)
  )+
  geom_density(alpha = 0.5)

Created on 2020-12-03 by the reprex package (v0.3.0)
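The same idea can be sketched against the asker's own pipeline: dropping obese == TRUE from the filter keeps both groups, so there is something to compare. The master1_toy data frame below is invented for illustration (the real master1 is not shown in the question), and its Income column is built with haven::labelled() to mimic the Stata import.

```r
library(dplyr)
library(ggplot2)
library(haven)

# Hypothetical stand-in for master1; all values are made up
master1_toy <- tibble(
  Income = labelled(
    c(1944, 4000, 16000, -2, 22800, 21600, 18000, 2500),
    labels = c("[-2] trifft nicht zu" = -2),
    label  = "Wages, salary from main job"
  ),
  obese = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
)

plot_data <- master1_toy %>%
  zap_labels() %>%          # Income becomes a plain numeric column
  filter(Income > 0) %>%    # drop only the negative missing-value codes
  select(Income, obese)     # both TRUE and FALSE rows are kept

# Box plot of Income by obesity status
p <- ggplot(plot_data, aes(x = obese, y = Income)) +
  geom_boxplot(width = 0.5)
p
```

Calling zap_labels() before plotting also avoids the "Don't know how to automatically pick scale" message, because ggplot2 then sees an ordinary numeric column.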

Answer by Duck

With the minimal data you shared, I tried this:

library(ggplot2)
#Code1
ggplot(as.data.frame(t.obese), aes(x=factor(obese), y=Income)) +
  geom_bar(stat='identity')+
  xlab('Obese')+
  scale_y_continuous(labels = scales::comma)
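Some context on the error in the question: geom_bar() uses stat_count() by default, which counts rows and therefore accepts only one of x or y. Mapping both triggers the "can only have an x or y aesthetic" error. Using stat = 'identity' as in Code 1, or its shorthand geom_col(), plots the supplied y values instead. The sketch below uses made-up data and assumes the asker's original call was a plain geom_bar() with both aesthetics mapped, which the question does not show.

```r
library(ggplot2)

# Made-up data for illustration only
toy <- data.frame(
  obese  = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  Income = c(4000, 16000, 18000, 21600, 2000)
)

# Counts per group: only x is mapped, stat_count() supplies the bar heights
p_count <- ggplot(toy, aes(x = factor(obese))) +
  geom_bar() +
  xlab("Obese")

# Supplied values: geom_col() is equivalent to geom_bar(stat = "identity");
# rows in the same group are stacked, so each bar shows the group sum
p_sum <- ggplot(toy, aes(x = factor(obese), y = Income)) +
  geom_col() +
  xlab("Obese")

# Mapping both x and y with the default stat fails when the plot is built
p_bad <- ggplot(toy, aes(x = factor(obese), y = Income)) + geom_bar()
res <- try(ggplot_build(p_bad), silent = TRUE)
inherits(res, "try-error")
```

Note that a stacked bar of incomes shows a total per group, which is rarely what a comparison of distributions needs; a box plot or density plot is usually more informative here.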


And this:

#Code 2
ggplot(as.data.frame(t.obese), aes(x=factor(obese), y=Income)) +
  # draw the box first, hide its outlier markers, then overlay jittered points
  geom_boxplot(outlier.shape = NA)+
  geom_jitter(width = 0.1)+
  xlab('Obese')
