First time asking a question here and very new to R, so please bear with me if I format/phrase something incorrectly. Thanks in advance for your help!
I have multiple columns stored as factors in my dataframe. However, after running a code chunk of dplyr functions with pipes, the tibble output is not retaining the factor levels from the dataframe. The data is out of order.
Here are the factor levels (I eliminated some of the preceding code because it's long):
...
POST_600_WOBA_GRADE = factor(POST_600_WOBA_GRADE, levels = c("Elite", "Great", "Good", "Average", "Below Average", "Poor", "Awful", "DNQ")))
When I run this to confirm, the correct levels show:
levels(all_batter_career_data_by_ID$POST_600_WOBA_GRADE)
[1] "Elite" "Great" "Good" "Average" "Below Average"
[6] "Poor" "Awful" "DNQ"
However, when I run this below, DNQ and Awful get flipped. DNQ (Did Not Qualify) should be on the far right.
all_batter_career_data_by_ID %>%
filter(BAT_DEBUT < "2015-01-01") %>%
group_by(FIRST_600_WOBA_GRADE, POST_600_WOBA_GRADE) %>%
summarize(n = n()) %>%
pivot_wider(names_from = POST_600_WOBA_GRADE, values_from = n)
# A tibble: 8 × 9
# Groups: FIRST_600_WOBA_GRADE [8]
FIRST_600_WOBA_GRADE Elite Great Good Average `Below Average` Poor DNQ Awful
<fct> <int> <int> <int> <int> <int> <int> <int> <int>
1 Elite 1 1 4 1 NA NA NA NA
2 Great 5 9 14 7 3 1 8 NA
3 Good 1 21 62 56 20 12 44 1
4 Average 1 10 45 77 29 29 78 6
5 Below Average NA 5 24 36 14 26 56 7
6 Poor NA 5 22 35 14 33 94 20
7 Awful NA 1 7 21 9 38 132 22
8 DNQ NA NA NA NA NA NA 1126 NA
I am running the exact same code for another variable with the same factor levels, and the tibble is in the correct order, so I'm not sure why this one is flipped. Thanks again for your help!
To understand what is going on, here's some sample data:
Pivoting this data wider gives us the right answer, but once we randomly sort the rows, we get the wrong answer:
As Darren has pointed out, the solution is to use the
names_sort = TRUEargument forpivot_wider()(sorting the column you're wanting to pivot wider would work too):Output: