Imagine a data frame (this is an illustrative sample):
s <- c("January", "February", "March", "January", "March", "April")
t <- c(5, 3, 2, 3, 3, 7)
df1 <- as.data.frame(s)
df1[ , 2] <- t
Now, for graphing purposes, I want to consolidate by month. If I convert the month column to a factor and then summarize:
library(dplyr)
df1$s <- as.factor(df1$s)
summary <- df1 %>% group_by(s) %>% summarize(Sales = sum(V2))
The outputs are correct but out of order:
April 7
February 3
January 8
March 5
However, if I do the following:
df1$s <- as.factor(df1$s)
levels(df1$s) <- c("January", "February", "March", "April")
Summary <- df1 %>% group_by(s) %>% summarize(Sales = sum(V2))
The output is:
January 7
February 3
March 8
April 5
The sums are wrong but the order is correct. Why would this be?
It's as if it sorts the months alphabetically and then relabels the month column without changing the other values.
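If you want to reorder the levels of a factor, you can use the forcats package. As you can see at the end of this post, your factor levels were not in month order. Assigning to levels() only renames the existing, alphabetically sorted levels by position (April becomes January, and so on); it does not reorder them, which is why the sums end up attached to the wrong months. So, I used fct_relevel() to change the level order and then did the calculation.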
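A minimal sketch of that approach, assuming the sample data from the question. For readability the data frame is rebuilt here with data.frame(), and the names relabelled and monthly are illustrative, not from the original post:

```r
library(dplyr)
library(forcats)

# Same sample data as in the question; V2 is the name R gives the second column
df1 <- data.frame(
  s  = c("January", "February", "March", "January", "March", "April"),
  V2 = c(5, 3, 2, 3, 3, 7)
)

# as.factor() stores the levels in alphabetical order
df1$s <- as.factor(df1$s)
levels(df1$s)
# "April" "February" "January" "March"

# Assigning to levels() renames by position, so every April row becomes
# January, every January row becomes March, and every March row becomes April
relabelled <- df1$s
levels(relabelled) <- c("January", "February", "March", "April")
as.character(relabelled)
# "March" "February" "April" "March" "April" "January"

# fct_relevel() changes only the order of the levels; the row values stay put
df1$s <- fct_relevel(df1$s, "January", "February", "March", "April")

monthly <- df1 %>%
  group_by(s) %>%
  summarize(Sales = sum(V2))
# January 8, February 3, March 5, April 7
```

If you would rather not load forcats, factor(df1$s, levels = month.name) in base R also sets the level order without relabeling any rows.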