Long to wide format using dcast in r

284 views Asked by At

can anyone help me figure out the code for transforming the following data in long format (Figure 1) to wide format (Figure 2)? I would like to use the wide format to make a heatmap. Thank you so much.

Figure 1

Figure 2

I've tried

dat_wide <- dcast(dat1, 
                  id1+id2 ~ grade + year + category,
                  value.var = c("%level1", "%level2"))

Then an error message showed up saying that: ! Can't extract column with value.var. ✖ Subscript value.var must be size 1, not 2.

1

There are 1 answers

2
akrun On

The main issue here would be calling the dcast function i.e. dcast is available in data.table and reshape2 packages. If the data is data.table, it calls the data.table::dcast instead of the reshape2::dcast, but if it is a data.frame, it may call reshape2::dcast (assuming both are loaded). With data.table::dcast,

value.var - ..., Cast multiple value.var columns simultaneously by passing their names as a character vector.

whereas in ?reshape2::dcast

value.var - name of column which stores values, see guess_value for default strategies to figure this out. i.e. it can take only a single column in value.var

-testing

> reshape2::dcast(dat1, 
                   id1+id2  ~ grade + year + category,
                   value.var = c("%level1", "%level2"))
Error in if (!(value.var %in% names(data))) { : 
  the condition has length > 1

whereas

dcast(as.data.table(dat1), 
                    id1+id2  ~ grade + year + category,
                    value.var = c("%level1", "%level2"))
Key: <id1, id2>
     id1   id2 %level1_3_2016_A %level1_3_2016_B %level1_3_2017_A %level1_3_2017_B %level2_3_2016_A %level2_3_2016_B %level2_3_2017_A %level2_3_2017_B
   <num> <num>            <num>            <num>            <num>            <num>            <num>            <num>            <num>            <num>
1:     1   101               58               56               57               53               33               20               34               19
2:     2   102               54               49               58               43               32               19               32               20

data

dat1 <- structure(list(id1 = c(1, 1, 1, 1, 2, 2, 2, 2), id2 = c(101, 
101, 101, 101, 102, 102, 102, 102), grade = c(3, 3, 3, 3, 3, 
3, 3, 3), year = c(2016, 2017, 2016, 2017, 2016, 2017, 2016, 
2017), category = c("A", "A", "B", "B", "A", "A", "B", "B"), 
    `%level1` = c(58, 57, 56, 53, 54, 58, 49, 43), `%level2` = c(33, 
    34, 20, 19, 32, 32, 19, 20)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -8L))