How do I convert this data from long to wide using dcast in R?


I'm trying to reshape my data from long to wide.

Starting out with:

head(rna.dat)
     sample_barcode gene_name HTSeq__Counts
1: TCGA-CS-4938-01B     KRT17             8
2: TCGA-CS-4938-01B     SMAD2          6123
3: TCGA-CS-4938-01B      BRAF          1512
4: TCGA-CS-4938-01B   ANKRD61            16
5: TCGA-CS-4938-01B      MTOR          6474
6: TCGA-CS-4938-01B      EXOG           582

Then I run:

rna.wide=dcast.data.table(rna.dat,sample_barcode~gene_name,value.var="HTSeq__Counts")

Aggregate function missing, defaulting to 'length'

...which outputs:

     sample_barcode A1BG A1CF A2M A2ML1
1: TCGA-CS-4938-01B    1    1   1     1
2: TCGA-CS-4942-01A    1    1   1     1
3: TCGA-CS-4943-01A    1    1   1     1
4: TCGA-CS-5393-01A    1    1   1     1
5: TCGA-CS-5396-01A    1    1   1     1

... i.e. every cell is filled with 1 instead of the actual count value.

For example, the value in position [1,1] should be 36:

rna.dat[rna.dat$sample_barcode=="TCGA-CS-4938-01B" & rna.dat$gene_name=="A1BG",]
     sample_barcode gene_name HTSeq__Counts
1: TCGA-CS-4938-01B      A1BG            36
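
A possible way to narrow this down: data.table prints the "Aggregate function missing" message when at least one sample_barcode/gene_name pair occurs more than once, so dcast has to aggregate and falls back to counting rows. The sketch below uses a small made-up table (toy, with hypothetical values, not the real rna.dat) to show how such duplicates could be listed and how the cast could then be done either on de-duplicated rows or with an explicit aggregate. Whether dropping duplicates or summing them is appropriate for this data is an assumption that would need checking.

```r
library(data.table)

# Toy stand-in for rna.dat (hypothetical values) with one duplicated pair
toy <- data.table(
  sample_barcode = c("S1", "S1", "S1", "S2", "S2"),
  gene_name      = c("A1BG", "A1BG", "BRAF", "A1BG", "BRAF"),
  HTSeq__Counts  = c(36L, 36L, 1512L, 40L, 900L)
)

# 1. List the sample/gene pairs that occur more than once
dups <- toy[, .N, by = .(sample_barcode, gene_name)][N > 1]
print(dups)

# 2a. If the duplicates are exact repeated rows, drop them before casting
toy.unique  <- unique(toy)
wide.unique <- dcast(toy.unique, sample_barcode ~ gene_name,
                     value.var = "HTSeq__Counts")

# 2b. Otherwise pick the aggregate explicitly (sum is only an example)
wide.sum <- dcast(toy, sample_barcode ~ gene_name,
                  value.var = "HTSeq__Counts", fun.aggregate = sum)
```

On the real data the same check would be rna.dat[, .N, by = .(sample_barcode, gene_name)][N > 1]; if it returns no rows, the cause lies elsewhere.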

In the past I've used 'reshape' with success, but this data has some 4 million rows and reshape never seems to finish. dcast, by contrast, runs in a matter of seconds.

Thanks in advance!
