Can an ANOVA be carried out using a dataframe looking like this?
category_1 | category_2 | category_4 | category_5 |
---|---|---|---|
0.75 | 0.82 | 0.91 | 0.32 |
0.71 | 0.39 | 0.21 | 0.76 |
0.17 | 0.10 | 0.43 | 0.37 |
I already tried using `unlist` to transform the data into a long format. However, in that case the column names end up in a column without a name and have an extra number appended to them, so it does not seem possible to run an ANOVA on the result. Is there another way?
"category_x" is the grouping variable, and I want to check whether some categories are used more often than others (higher category score = used more often).
Let us recreate your data frame and call it `df`.
To get these data into a suitable format for ANOVA, we can pivot to long format. This puts all the values in one column and creates another column that labels each value according to its original column. We can use `pivot_longer` from the tidyverse to do this. After pivoting, the data frame has one row per value (12 rows here), each labelled with the column it came from.
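A minimal sketch of both steps, assuming the values shown in the question and that the tidyr package (part of the tidyverse) is installed. The names `long_df`, `category`, and `value` are my own choices for illustration:

```r
library(tidyr)

# Recreate the data frame from the question
df <- data.frame(
  category_1 = c(0.75, 0.71, 0.17),
  category_2 = c(0.82, 0.39, 0.10),
  category_4 = c(0.91, 0.21, 0.43),
  category_5 = c(0.32, 0.76, 0.37)
)

# Pivot every column: column names go to `category`, cell values to `value`
long_df <- pivot_longer(df, cols = everything(),
                        names_to = "category", values_to = "value")

long_df
```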
We can now create a linear model of the values according to category and review the summary:
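A sketch of the model fit, assuming a long-format data frame with a `category` column and a `value` column. It is rebuilt here with base R so that the snippet runs on its own, and the names `long_df` and `model` are illustrative:

```r
# Long-format data: one row per value, labelled by its original column
long_df <- data.frame(
  category = rep(c("category_1", "category_2", "category_4", "category_5"), each = 3),
  value = c(0.75, 0.71, 0.17,  0.82, 0.39, 0.10,  0.91, 0.21, 0.43,  0.32, 0.76, 0.37)
)

# Model the values as a function of category;
# lm() treats the character column as a factor
model <- lm(value ~ category, data = long_df)

summary(model)
```

With R's default treatment contrasts, the intercept is the mean of `category_1` and the remaining coefficients are differences from that mean.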
Finally, we can run our model through `anova`.
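A sketch of this last step, again assuming the long-format layout described above; `long_df`, `model`, and `anova_table` are illustrative names, and the data are rebuilt so the snippet is runnable by itself:

```r
# Long-format data: one row per value, labelled by its original column
long_df <- data.frame(
  category = rep(c("category_1", "category_2", "category_4", "category_5"), each = 3),
  value = c(0.75, 0.71, 0.17,  0.82, 0.39, 0.10,  0.91, 0.21, 0.43,  0.32, 0.76, 0.37)
)

model <- lm(value ~ category, data = long_df)

# One-way ANOVA table: F test for whether the category means differ
anova_table <- anova(model)
anova_table
```

With four categories and twelve values, the table has 3 degrees of freedom for `category` and 8 for the residuals.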
Created on 2022-06-12 by the reprex package (v2.0.1)