R likert graph error when grouping and all responses are negative or positive


likert plot result

I am using R's likert package to graph survey responses. When I use the grouping option and at least one group's responses are all negative or all positive, the graph is rendered incorrectly (see picture). Is there a workaround, or is this a bug?

library(likert)
grouplabel <- c(rep("A", 6), rep("B", 6), rep("C", 6))
surveyresult <- factor(c(4, 5, 7, 2, 3, 4,
                         5, 6, 7, 5, 6, 7,
                         1, 2, 1, 2, 3, 1))
df <- data.frame(grouplabel, surveyresult)

mylikert <- likert(df[,2, drop=FALSE], grouping = df[,1], nlevels=7)
plot(mylikert)

I have tried rearranging the levels of 'grouplabel' and several different data files; the issue persists in every case.

Answer by DaveArmstrong

This does seem to be a bug in the likert package. When you call plot(mylikert), the function calls likert:::likert.bar.plot(). If you follow that through, it builds a number of intermediate data frames and scalars. However, three of the objects it creates (results, results.low and results.high) are enough to replicate the problem.

results <- structure(list(Group = c("A", "B", "C", "A", "B", "C", "A", "B", 
"C", "A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B", "C", 
"A", "B", "C"), Item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L), levels = "surveyresult", class = c("ordered", "factor")), 
    variable = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 
    4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 4L, 4L, 4L
    ), levels = c("1", "2", "3", "4", "5", "6", "7"), class = c("ordered", 
    "factor")), value = c(0, 0, -50, -16.6666666666667, 0, -33.3333333333333, 
    -16.6666666666667, 0, -16.6666666666667, 16.6666666666667, 
    0, 0, 16.6666666666667, 33.3333333333333, 0, 0, 33.3333333333333, 
    0, 16.6666666666667, 33.3333333333333, 0, -16.6666666666667, 
    0, 0)), row.names = c("1", "2", "3", "4", "5", "6", "7", 
"8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", 
"19", "20", "21", "101", "111", "121"), class = "data.frame")
results.low <- structure(list(Group = c("C", "A", "C", "A", "C", "A"), 
    Item = c("surveyresult", "surveyresult", "surveyresult", 
    "surveyresult", "surveyresult", "surveyresult"
    ), variable = structure(c(1L, 2L, 2L, 3L, 3L, 4L), levels = c("1", 
    "2", "3", "4", "5", "6", "7"), class = c("ordered", "factor"
    )), value = c(-50, -16.6666666666667, -33.3333333333333, 
    -16.6666666666667, -16.6666666666667, -16.6666666666667)), 
    row.names = c("3", "4", "6", "7", "9", "101"
), class = "data.frame")
results.high <- structure(list(Group = c("A", "A", "B", "B", "A", "B"), Item = structure(c(1L, 
1L, 1L, 1L, 1L, 1L), levels = "surveyresult", class = c("ordered", 
"factor")), variable = structure(c(4L, 3L, 3L, 2L, 1L, 1L), levels = c("7", 
"6", "5", "4", "3", "2", "1"), class = "factor"), value = c(16.6666666666667, 
16.6666666666667, 33.3333333333333, 33.3333333333333, 16.6666666666667, 
33.3333333333333)), row.names = c("10", "13", "14", "17", "19", 
"20"), class = "data.frame")

The code below generates the base graph. All the other graphical code in the function is really styling, so this is enough to replicate the problem. The issue is that the function strips out all the zero entries, so the results.low object has no rows for group B. The same kind of gap doesn't seem to cause a problem for the results.high object, which has no rows for group C. Nonetheless, the output shows each of the results.low bars drawn across part of B's position on the axis.

library(ggplot2)
ggplot(results, 
       aes(y = value, 
           x = Group, 
           group = variable)) + 
  geom_bar(data = results.high, 
           aes(fill = variable), 
           stat = "identity") +
  geom_bar(data = results.low[nrow(results.low):1, ], 
           aes(fill = variable), 
           stat = "identity") + 
  geom_hline(yintercept = 0)
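
To check which groups were dropped from each half of the plot, you can compare the Group columns against the full set of groups. This is a minimal base-R sketch; the vectors below are stand-ins copied from the Group columns of results.low and results.high above, and the names all_groups, low_groups and high_groups are made up for illustration.

```r
# Stand-ins for the Group columns of the objects above
# (all_groups, low_groups, high_groups are illustrative names).
all_groups  <- c("A", "B", "C")
low_groups  <- c("C", "A", "C", "A", "C", "A")  # results.low$Group
high_groups <- c("A", "A", "B", "B", "A", "B")  # results.high$Group

# Groups with no rows left after the zero entries were stripped
missing_low  <- setdiff(all_groups, low_groups)
missing_high <- setdiff(all_groups, high_groups)

print(missing_low)
print(missing_high)
```

With the real objects you would use unique(results$Group), results.low$Group and results.high$Group in place of the stand-in vectors.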

If you add a zero-valued row for B to results.low, you get the desired result. Unfortunately, you don't usually get to intervene at this stage, since all of this happens internally to the function. You may want to file a bug report on the package's GitHub repo.

library(dplyr)
results.low <- bind_rows(results.low, 
                         data.frame(Group = "B", 
                                    Item = "surveyresult", 
                                    variable = ordered(4, levels=1:7), 
                                    value=0))

ggplot(results, 
       aes(y = value, 
           x = Group, 
           group = variable)) + 
  geom_bar(data = results.high, 
           aes(fill = variable), 
           stat = "identity") +
  geom_bar(data = results.low[nrow(results.low):1, ], 
           aes(fill = variable), 
           stat = "identity") + 
  geom_hline(yintercept = 0)
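
If you end up rebuilding the plot yourself from these intermediate data frames, the same padding can be generalized so that any missing group gets a zero row. The following is only a sketch in base R: pad_missing_groups() is a hypothetical helper, not part of likert, and the toy data frame is a cut-down stand-in for results.low.

```r
# Hypothetical helper (not part of the likert package): append one
# zero-valued row for every group that has no rows in `d`.
pad_missing_groups <- function(d, all_groups, item, fill_level, levels) {
  missing <- setdiff(all_groups, d$Group)
  if (length(missing) == 0) {
    return(d)
  }
  n <- length(missing)
  pad <- data.frame(
    Group = missing,
    Item = rep(item, n),
    variable = factor(rep(fill_level, n), levels = levels, ordered = TRUE),
    value = rep(0, n),
    stringsAsFactors = FALSE
  )
  rbind(d, pad)
}

# Toy stand-in for results.low: group B is absent
lvls <- as.character(1:7)
low <- data.frame(
  Group = c("C", "A", "C"),
  Item = "surveyresult",
  variable = factor(c("1", "2", "2"), levels = lvls, ordered = TRUE),
  value = c(-50, -16.7, -33.3),
  stringsAsFactors = FALSE
)

low_padded <- pad_missing_groups(low, c("A", "B", "C"),
                                 item = "surveyresult",
                                 fill_level = "4", levels = lvls)
print(low_padded)
```

The padded row uses level "4" only to mirror the manual fix above; since its value is zero, the choice of level should not affect the drawn bars.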

Created on 2023-10-13 with reprex v2.0.2