I have tick data for the equities in the Dow Jones Industrial Average over the course of the day. Here is a sample of the data:
> head(df)
TS Sym Ask
1: 2015-03-24 14:00:00.000 YMM5 17956.00
2: 2015-03-24 14:00:00.002 AAPL 126.91
3: 2015-03-24 14:00:00.005 UNH 118.35
4: 2015-03-24 14:00:00.009 XOM 84.64
5: 2015-03-24 14:00:00.014 AXP 81.35
6: 2015-03-24 14:00:00.016 PG 84.19
I am trying to use the dcast() function of reshape2 to transform the data into wide format, so it will look like:
TS AAPL AXP PG UNH XOM
1: 2015-03-24 14:00:00.000 126.91 81.35 84.19 118.35 84.64
When I try the following set of commands though, here is what I get:
tick <- data.table(read.csv("2015-3-24.csv"))
df<- data.table(TS = tick$DateTime, Sym = tick$Symbol, Ask = tick$Ask, Bid = tick$Bid)
tmp <- dcast(data = df, formula = TS ~ Sym)
head(tmp)
TS AAPL AXP BA CAT CSCO CVX DD DIS GE GS HD IBM INTC JNJ JPM KO MCD MMM MRK MSFT NKE PFE PG TRV UNH UTX V VZ WMT XOM YMM5
1 2015-03-24 14:00:00.000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2 2015-03-24 14:00:00.002 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 2015-03-24 14:00:00.005 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
I know I am getting the formula wrong or something, but no matter what I have though to try I am getting the same result. Thanks in advance.
Several issues with your post, I'd like to get clarified: (too long for comment, so this is a guess at best as of now.)
data.table
, but the version is not shown.data.table
provides efficient implementations ofmelt
anddcast
. In versions <= 1.9.4, you might want to usedcast.data.table
. From versions1.9.5+
, you can directly usedcast()
without loadingreshape2
. So I'm not sure if you're usingdcast
fromreshape2
or from the devel version ofdata.table
.You've not shown the entire result from
dcast
, I believe. If it complains something aboutthen your id and measure vars in formula do not uniquely identify cells. With the minimal info provided, I can only guess that this is indeed your issue.