I am starting with this table:
| Gene | SampleID | Timepoint | log2_CPM |
|---|---|---|---|
| krt19 | 7578 | 14 | 123 |
| sftpd1 | 7578 | 14 | 32 |
| Cxcl13 | 7578 | 14 | 365 |
| krt19 | 7458 | 14 | 125 |
| sftpd1 | 7458 | 14 | 36 |
| Cxcl13 | 7458 | 14 | 330 |
and I need to convert it to something like this:
| SampleID | Timepoint | krt19 | sftpd1 | Cxcl13 |
|---|---|---|---|---|
| 7578 | 14 | 123 | 32 | 365 |
| 7458 | 14 | 125 | 36 | 330 |
Can someone please help me write R code for this? I tried doing it manually and with R code, but neither worked. It is a very large dataset, so I would appreciate the help. Thanks.
This is what I tried:
formattedDATA <-dcast(setDT(rawDATA), formula = c("Timepoint","SampleID") ~ Gene)
Error:
> formattedDATA <-dcast(setDT(rawDATA), formula = c("Timepoint","SampleID") ~ Gene)
Using 'log2_CPM' as value column. Use 'value.var' to override
Error in vapply(X = x, FUN = fun, ..., FUN.VALUE = NA_character_, USE.NAMES = use.names) :
values must be length 1,
but FUN(X[[1]]) result is length 0
Answer
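The left-hand side of the formula should list the ID columns as unquoted names joined with `+`, rather than as a quoted character vector built with `c()`. Setting `value.var` explicitly also avoids the "Using 'log2_CPM' as value column" message:
formattedDATA <- dcast(setDT(rawDATA), formula = SampleID + Timepoint ~ Gene, value.var = "log2_CPM")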
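To check the reshape before running it on the full dataset, here is a minimal sketch that rebuilds the six sample rows from the question and casts them to wide format. It assumes the `data.table` package is installed; the toy `rawDATA` below is typed in by hand from the question's table and is only a stand-in for the real data.

```r
library(data.table)

# Toy data typed in from the long-format table in the question (assumption:
# the real rawDATA has the same four columns)
rawDATA <- data.table(
  Gene      = rep(c("krt19", "sftpd1", "Cxcl13"), times = 2),
  SampleID  = rep(c(7578, 7458), each = 3),
  Timepoint = 14,
  log2_CPM  = c(123, 32, 365, 125, 36, 330)
)

# One row per SampleID/Timepoint pair, one column per Gene,
# filled with the log2_CPM values
formattedDATA <- dcast(
  rawDATA,
  SampleID + Timepoint ~ Gene,
  value.var = "log2_CPM"
)

print(formattedDATA)
```

The rows and gene columns may come out sorted rather than in the order shown in the question. Also, if the real data contains more than one row for the same Gene, SampleID, and Timepoint, `dcast` will fall back to counting rows unless `fun.aggregate` is given (for example `fun.aggregate = mean`), so it is worth checking for duplicates first.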